Async vs. Sync Scraping: When to Use Each

FastWebScraper Team · 2 min read


FastWebScraper offers two scraping modes — synchronous and asynchronous. Choosing the right one affects your application's performance, reliability, and complexity.

The Two Modes

Sync: POST /v1/scrape/sync

The sync endpoint blocks until the scrape is complete and returns the result in a single response. Simple to use, but the HTTP connection stays open while the page is being fetched and rendered.

Response time: 5-30 seconds, depending on the complexity of the target page.

Async: POST /v1/scrape/async

The async endpoint returns a job ID immediately (within milliseconds). You then poll the GET /v1/jobs/:id endpoint to check the status and retrieve results once complete.

Initial response time: < 200ms (just the job ID).

When to Use Sync

Sync scraping is the simpler approach. Use it when:

  • Testing and prototyping: You want quick results without extra polling logic
  • Low-volume scraping: A few requests at a time, not hundreds
  • Simple scripts: One-off data collection tasks
  • Webhook-driven workflows: Where a single request-response cycle is expected

Sync Example

```typescript
// Simple and direct — one request, one response
const response = await fetch('https://api.fastwebscraper.com/v1/scrape/sync', {
  method: 'POST',
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://example.com/product/123',
    mode: 'auto',
    waitForSelector: '.price',
  }),
});

const { data } = await response.json();
console.log('HTML:', data.html);
```
```python
import requests

response = requests.post(
    'https://api.fastwebscraper.com/v1/scrape/sync',
    headers={
        'X-API-Key': 'YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'url': 'https://example.com/product/123',
        'mode': 'auto',
        'waitForSelector': '.price',
    },
    timeout=60  # Sync can take 30s or more for complex pages
)
data = response.json()
print('HTML:', data['data']['html'])
```
```csharp
using System.Net.Http.Json;
using System.Text.Json;

using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(60) };
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

var request = new
{
    url = "https://example.com/product/123",
    mode = "auto",
    waitForSelector = ".price"
};

var response = await client.PostAsJsonAsync(
    "https://api.fastwebscraper.com/v1/scrape/sync", request);
var result = await response.Content.ReadFromJsonAsync<JsonElement>();
var html = result.GetProperty("data").GetProperty("html").GetString();
Console.WriteLine($"HTML length: {html?.Length}");
```

When to Use Async

Async scraping adds complexity but provides significant benefits at scale:

  • Batch scraping: Processing hundreds or thousands of URLs
  • Production systems: Where you can't afford blocking HTTP connections for 30 seconds
  • Queue-based architectures: Submit jobs, process results from a queue
  • Rate-limited workflows: Submit jobs at a controlled pace and collect results later
  • Long-running scrapes: Complex pages that take 30+ seconds to render

Async Example: Submit and Poll

```typescript
// Step 1: Submit jobs
const urls = [
  'https://example.com/product/1',
  'https://example.com/product/2',
  'https://example.com/product/3',
];

const jobIds: string[] = [];
for (const url of urls) {
  const response = await fetch('https://api.fastwebscraper.com/v1/scrape/async', {
    method: 'POST',
    headers: {
      'X-API-Key': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, mode: 'auto' }),
  });
  const { data } = await response.json();
  jobIds.push(data.jobId);
  console.log(`Submitted ${url} -> ${data.jobId}`);
}

// Step 2: Poll for results
for (const jobId of jobIds) {
  let status = 'PENDING';
  let result;
  while (status === 'PENDING' || status === 'IN_PROGRESS') {
    await new Promise(r => setTimeout(r, 3000)); // Wait 3 seconds
    const response = await fetch(
      `https://api.fastwebscraper.com/v1/jobs/${jobId}`,
      { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
    );
    result = await response.json();
    status = result.data.status;
  }
  if (status === 'COMPLETED') {
    console.log(`Job ${jobId}: ${result.data.html.length} chars`);
  } else {
    console.error(`Job ${jobId} failed: ${result.data.error}`);
  }
}
```
```python
import requests
import time

# Step 1: Submit jobs
urls = [
    'https://example.com/product/1',
    'https://example.com/product/2',
    'https://example.com/product/3',
]

job_ids = []
for url in urls:
    response = requests.post(
        'https://api.fastwebscraper.com/v1/scrape/async',
        headers={
            'X-API-Key': 'YOUR_API_KEY',
            'Content-Type': 'application/json',
        },
        json={'url': url, 'mode': 'auto'}
    )
    job_id = response.json()['data']['jobId']
    job_ids.append(job_id)
    print(f'Submitted {url} -> {job_id}')

# Step 2: Poll for results
for job_id in job_ids:
    while True:
        time.sleep(3)
        response = requests.get(
            f'https://api.fastwebscraper.com/v1/jobs/{job_id}',
            headers={'X-API-Key': 'YOUR_API_KEY'}
        )
        result = response.json()
        status = result['data']['status']
        if status in ('COMPLETED', 'FAILED'):
            break

    if status == 'COMPLETED':
        print(f'Job {job_id}: {len(result["data"]["html"])} chars')
    else:
        print(f'Job {job_id} failed: {result["data"].get("error")}')
```
```csharp
using System.Net.Http.Json;
using System.Text.Json;

using var client = new HttpClient();
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

// Step 1: Submit jobs
var urls = new[]
{
    "https://example.com/product/1",
    "https://example.com/product/2",
    "https://example.com/product/3",
};

var jobIds = new List<string>();
foreach (var url in urls)
{
    var request = new { url, mode = "auto" };
    var response = await client.PostAsJsonAsync(
        "https://api.fastwebscraper.com/v1/scrape/async", request);
    var result = await response.Content.ReadFromJsonAsync<JsonElement>();
    var jobId = result.GetProperty("data").GetProperty("jobId").GetString()!;
    jobIds.Add(jobId);
    Console.WriteLine($"Submitted {url} -> {jobId}");
}

// Step 2: Poll for results
foreach (var jobId in jobIds)
{
    string status;
    JsonElement result;
    do
    {
        await Task.Delay(3000);
        result = await client.GetFromJsonAsync<JsonElement>(
            $"https://api.fastwebscraper.com/v1/jobs/{jobId}");
        status = result.GetProperty("data").GetProperty("status").GetString()!;
    } while (status is "PENDING" or "IN_PROGRESS");

    if (status == "COMPLETED")
    {
        var html = result.GetProperty("data").GetProperty("html").GetString()!;
        Console.WriteLine($"Job {jobId}: {html.Length} chars");
    }
    else
    {
        Console.WriteLine($"Job {jobId} failed");
    }
}
```
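The examples above poll on a fixed 3-second interval, which is fine for short jobs. For pages that take 30+ seconds to render, exponential backoff wastes fewer requests. A minimal sketch of the idea (the `backoff_delays` helper and its defaults are our own illustration, not part of the API):

```python
import itertools

def backoff_delays(base=1.0, factor=2.0, cap=15.0):
    """Yield poll-wait times that grow exponentially up to a cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor

# In a polling loop you would time.sleep(next(delays)) before each
# status check instead of sleeping a fixed 3 seconds.
print(list(itertools.islice(backoff_delays(), 6)))
# → [1.0, 2.0, 4.0, 8.0, 15.0, 15.0]
```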

Comparison Table

| Feature | Sync | Async |
| --- | --- | --- |
| Response time | 5-30s (full page) | < 200ms (job ID only) |
| Complexity | Simple — one request | More complex — submit + poll |
| Best for | Testing, low volume | Production, batch processing |
| Connection handling | Blocks until complete | Non-blocking |
| Timeout risk | Higher (long-lived connections) | Lower (short requests) |
| Scalability | Limited by connection pool | Scales to thousands of jobs |
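The scalability difference follows from the response times: because async submissions return in milliseconds, hundreds of them can be fanned out concurrently without holding long-lived connections. A sketch of that fan-out with `asyncio` (`submit_job` here is a hypothetical stand-in for the real HTTP call, which you would make with an async HTTP client such as aiohttp or httpx):

```python
import asyncio

# Hypothetical stand-in for POST /v1/scrape/async; a real client
# would perform the HTTP request here and return data['jobId'].
async def submit_job(url: str) -> str:
    await asyncio.sleep(0)  # yields control, like non-blocking I/O would
    return f"job-for-{url}"

async def submit_batch(urls: list[str]) -> list[str]:
    # Each submission is a sub-200ms round trip, so running a
    # hundred of them concurrently is cheap.
    return await asyncio.gather(*(submit_job(u) for u in urls))

urls = [f"https://example.com/product/{i}" for i in range(100)]
job_ids = asyncio.run(submit_batch(urls))
print(len(job_ids))  # → 100
```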

Hybrid Approach

Many production systems use both:

  1. Sync for on-demand user-triggered scrapes (e.g., "preview this URL")
  2. Async for scheduled batch jobs (e.g., "scrape all competitor prices nightly")
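One way to wire up the hybrid is a single helper that picks the endpoint at each call site; `choose_endpoint` below is our own illustration, not part of any client library:

```python
BASE = 'https://api.fastwebscraper.com'

def choose_endpoint(interactive: bool) -> str:
    """Interactive, user-facing scrapes go sync; batch work goes async."""
    return f'{BASE}/v1/scrape/sync' if interactive else f'{BASE}/v1/scrape/async'

# "Preview this URL" button → block and return the page directly
print(choose_endpoint(interactive=True))
# Nightly competitor-price crawl → submit jobs, poll later
print(choose_endpoint(interactive=False))
```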

Choose based on the specific use case within your application, not as a one-size-fits-all decision.

Next Steps