Async vs. Sync Scraping: When to Use Each

FastWebScraper Team · 2 min read


FastWebScraper offers two scraping modes — synchronous and asynchronous. Choosing the right one affects your application's performance, reliability, and complexity.

The Two Modes

Sync: POST /v1/scrape/sync

The sync endpoint blocks until the scrape is complete and returns the result in a single response. Simple to use, but the HTTP connection stays open while the page is being fetched and rendered.

Response time: 5-30 seconds, depending on the complexity of the target page.

Async: POST /v1/scrape/async

The async endpoint returns a job ID immediately (within milliseconds). You then poll the GET /v1/jobs/:id endpoint to check the status and retrieve results once complete.

Initial response time: < 200ms (just the job ID).

When to Use Sync

Sync scraping is the simpler approach. Use it when:

  • Testing and prototyping: You want quick results without extra polling logic
  • Low-volume scraping: A few requests at a time, not hundreds
  • Simple scripts: One-off data collection tasks
  • Webhook-driven workflows: Where a single request-response cycle is expected

Sync Example

```typescript
// Simple and direct — one request, one response
const response = await fetch('https://api.fastwebscraper.com/v1/scrape/sync', {
  method: 'POST',
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://example.com/product/123',
    mode: 'auto',
    waitForSelector: '.price',
  }),
});

const { data } = await response.json();
console.log('HTML:', data.html);
```
```python
import requests

response = requests.post(
    'https://api.fastwebscraper.com/v1/scrape/sync',
    headers={
        'X-API-Key': 'YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'url': 'https://example.com/product/123',
        'mode': 'auto',
        'waitForSelector': '.price',
    },
    timeout=60  # Sync can take 30s or more for complex pages
)
data = response.json()
print('HTML:', data['data']['html'])
```
```csharp
using System.Net.Http.Json;
using System.Text.Json;

using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(60) };
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

var request = new
{
    url = "https://example.com/product/123",
    mode = "auto",
    waitForSelector = ".price"
};

var response = await client.PostAsJsonAsync(
    "https://api.fastwebscraper.com/v1/scrape/sync", request);
var result = await response.Content.ReadFromJsonAsync<JsonElement>();
var html = result.GetProperty("data").GetProperty("html").GetString();
Console.WriteLine($"HTML length: {html?.Length}");
```

When to Use Async

Async scraping adds complexity but provides significant benefits at scale:

  • Batch scraping: Processing hundreds or thousands of URLs
  • Production systems: Where you can't afford blocking HTTP connections for 30 seconds
  • Queue-based architectures: Submit jobs, process results from a queue
  • Rate-limited workflows: Submit jobs at a controlled pace and collect results later
  • Long-running scrapes: Complex pages that take 30+ seconds to render

Async Example: Submit and Poll

```typescript
// Step 1: Submit jobs
const urls = [
  'https://example.com/product/1',
  'https://example.com/product/2',
  'https://example.com/product/3',
];

const jobIds: string[] = [];
for (const url of urls) {
  const response = await fetch('https://api.fastwebscraper.com/v1/scrape/async', {
    method: 'POST',
    headers: {
      'X-API-Key': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, mode: 'auto' }),
  });
  const { data } = await response.json();
  jobIds.push(data.jobId);
  console.log(`Submitted ${url} -> ${data.jobId}`);
}

// Step 2: Poll for results
for (const jobId of jobIds) {
  let status = 'PENDING';
  let result;
  while (status === 'PENDING' || status === 'IN_PROGRESS') {
    await new Promise(r => setTimeout(r, 3000)); // Wait 3 seconds
    const response = await fetch(
      `https://api.fastwebscraper.com/v1/jobs/${jobId}`,
      { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
    );
    result = await response.json();
    status = result.data.status;
  }
  if (status === 'COMPLETED') {
    console.log(`Job ${jobId}: ${result.data.html.length} chars`);
  } else {
    console.error(`Job ${jobId} failed: ${result.data.error}`);
  }
}
```
```python
import requests
import time

# Step 1: Submit jobs
urls = [
    'https://example.com/product/1',
    'https://example.com/product/2',
    'https://example.com/product/3',
]

job_ids = []
for url in urls:
    response = requests.post(
        'https://api.fastwebscraper.com/v1/scrape/async',
        headers={
            'X-API-Key': 'YOUR_API_KEY',
            'Content-Type': 'application/json',
        },
        json={'url': url, 'mode': 'auto'}
    )
    job_id = response.json()['data']['jobId']
    job_ids.append(job_id)
    print(f'Submitted {url} -> {job_id}')

# Step 2: Poll for results
for job_id in job_ids:
    while True:
        time.sleep(3)
        response = requests.get(
            f'https://api.fastwebscraper.com/v1/jobs/{job_id}',
            headers={'X-API-Key': 'YOUR_API_KEY'}
        )
        result = response.json()
        status = result['data']['status']
        if status in ('COMPLETED', 'FAILED'):
            break

    if status == 'COMPLETED':
        print(f'Job {job_id}: {len(result["data"]["html"])} chars')
    else:
        print(f'Job {job_id} failed: {result["data"].get("error")}')
```
```csharp
using System.Net.Http.Json;
using System.Text.Json;

using var client = new HttpClient();
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

// Step 1: Submit jobs
var urls = new[]
{
    "https://example.com/product/1",
    "https://example.com/product/2",
    "https://example.com/product/3",
};

var jobIds = new List<string>();
foreach (var url in urls)
{
    var request = new { url, mode = "auto" };
    var response = await client.PostAsJsonAsync(
        "https://api.fastwebscraper.com/v1/scrape/async", request);
    var result = await response.Content.ReadFromJsonAsync<JsonElement>();
    var jobId = result.GetProperty("data").GetProperty("jobId").GetString()!;
    jobIds.Add(jobId);
    Console.WriteLine($"Submitted {url} -> {jobId}");
}

// Step 2: Poll for results
foreach (var jobId in jobIds)
{
    string status;
    JsonElement result;
    do
    {
        await Task.Delay(3000);
        result = await client.GetFromJsonAsync<JsonElement>(
            $"https://api.fastwebscraper.com/v1/jobs/{jobId}");
        status = result.GetProperty("data").GetProperty("status").GetString()!;
    } while (status is "PENDING" or "IN_PROGRESS");

    if (status == "COMPLETED")
    {
        var html = result.GetProperty("data").GetProperty("html").GetString()!;
        Console.WriteLine($"Job {jobId}: {html.Length} chars");
    }
    else
    {
        Console.WriteLine($"Job {jobId} failed");
    }
}
```
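The examples above poll on a fixed 3-second interval, which is fine for short jobs. For pages that take 30+ seconds to render, exponential backoff wastes fewer requests. A minimal sketch of the idea (the `backoff_delays` helper and its defaults are our own illustration, not part of the API):

```python
import itertools

def backoff_delays(base=1.0, factor=2.0, cap=15.0):
    """Yield poll-wait times that grow exponentially up to a cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor

# In a polling loop you would time.sleep(next(delays)) before each
# status check instead of sleeping a fixed 3 seconds.
print(list(itertools.islice(backoff_delays(), 6)))
# → [1.0, 2.0, 4.0, 8.0, 15.0, 15.0]
```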

Comparison Table

| Feature | Sync | Async |
| --- | --- | --- |
| Response time | 5-30s (full page) | < 200ms (job ID only) |
| Complexity | Simple — one request | More complex — submit + poll |
| Best for | Testing, low volume | Production, batch processing |
| Connection handling | Blocks until complete | Non-blocking |
| Timeout risk | Higher (long-lived connections) | Lower (short requests) |
| Scalability | Limited by connection pool | Scales to thousands of jobs |
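The scalability difference follows from the response times: because async submissions return in milliseconds, hundreds of them can be fanned out concurrently without holding long-lived connections. A sketch of that fan-out with `asyncio` (`submit_job` here is a hypothetical stand-in for the real HTTP call, which you would make with an async HTTP client such as aiohttp or httpx):

```python
import asyncio

# Hypothetical stand-in for POST /v1/scrape/async; a real client
# would perform the HTTP request here and return data['jobId'].
async def submit_job(url: str) -> str:
    await asyncio.sleep(0)  # yields control, like non-blocking I/O would
    return f"job-for-{url}"

async def submit_batch(urls: list[str]) -> list[str]:
    # Each submission is a sub-200ms round trip, so running a
    # hundred of them concurrently is cheap.
    return await asyncio.gather(*(submit_job(u) for u in urls))

urls = [f"https://example.com/product/{i}" for i in range(100)]
job_ids = asyncio.run(submit_batch(urls))
print(len(job_ids))  # → 100
```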

Hybrid Approach

Many production systems use both:

  1. Sync for on-demand user-triggered scrapes (e.g., "preview this URL")
  2. Async for scheduled batch jobs (e.g., "scrape all competitor prices nightly")
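One way to wire up the hybrid is a single helper that picks the endpoint at each call site; `choose_endpoint` below is our own illustration, not part of any client library:

```python
BASE = 'https://api.fastwebscraper.com'

def choose_endpoint(interactive: bool) -> str:
    """Interactive, user-facing scrapes go sync; batch work goes async."""
    return f'{BASE}/v1/scrape/sync' if interactive else f'{BASE}/v1/scrape/async'

# "Preview this URL" button → block and return the page directly
print(choose_endpoint(interactive=True))
# Nightly competitor-price crawl → submit jobs, poll later
print(choose_endpoint(interactive=False))
```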

Choose based on the specific use case within your application, not as a one-size-fits-all decision.

Next Steps