Async vs. Sync Scraping: When to Use Each
FastWebScraper offers two scraping modes — synchronous and asynchronous. Choosing the right one affects your application's performance, reliability, and complexity.
The Two Modes
Sync: POST /v1/scrape/sync
The sync endpoint blocks until the scrape is complete and returns the result in a single response. Simple to use, but the HTTP connection stays open while the page is being fetched and rendered.
Response time: 5-30 seconds, depending on the complexity of the target page.
Async: POST /v1/scrape/async
The async endpoint returns a job ID immediately (within milliseconds). You then poll the GET /v1/jobs/:id endpoint to check the status and retrieve results once complete.
Initial response time: < 200ms (just the job ID).
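An async job moves through PENDING and IN_PROGRESS before settling at COMPLETED or FAILED, and polling should stop only at one of the two terminal states. A minimal sketch of that check in Python (the helper name and constant are ours, not part of any official SDK; the status values match the polling examples later in this guide):

```python
# Job statuses returned by GET /v1/jobs/:id (per the polling examples below).
TERMINAL_STATUSES = {'COMPLETED', 'FAILED'}

def is_terminal(status: str) -> bool:
    """Return True once a job has finished and polling can stop."""
    return status in TERMINAL_STATUSES
```

For example, `is_terminal('IN_PROGRESS')` is False, so a poll loop keeps waiting; `is_terminal('FAILED')` is True, so the loop exits and the error can be handled.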
When to Use Sync
Sync scraping is the simpler approach. Use it when:
- Testing and prototyping: You want quick results without extra polling logic
- Low-volume scraping: A few requests at a time, not hundreds
- Simple scripts: One-off data collection tasks
- Webhook-driven workflows: Where a single request-response cycle is expected
Sync Example
TypeScript:

// Simple and direct: one request, one response
const response = await fetch('https://api.fastwebscraper.com/v1/scrape/sync', {
  method: 'POST',
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://example.com/product/123',
    mode: 'auto',
    waitForSelector: '.price',
  }),
});
const { data } = await response.json();
console.log('HTML:', data.html);

Python:

import requests

response = requests.post(
    'https://api.fastwebscraper.com/v1/scrape/sync',
    headers={
        'X-API-Key': 'YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'url': 'https://example.com/product/123',
        'mode': 'auto',
        'waitForSelector': '.price',
    },
    timeout=60,  # Sync can take 30+ seconds for complex pages
)
data = response.json()
print('HTML:', data['data']['html'])

C#:

using System.Net.Http.Json;
using System.Text.Json;

using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(60) };
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

var request = new
{
    url = "https://example.com/product/123",
    mode = "auto",
    waitForSelector = ".price"
};
var response = await client.PostAsJsonAsync(
    "https://api.fastwebscraper.com/v1/scrape/sync", request);
var result = await response.Content.ReadFromJsonAsync<JsonElement>();
var html = result.GetProperty("data").GetProperty("html").GetString();
Console.WriteLine($"HTML length: {html?.Length}");

When to Use Async
Async scraping adds complexity but provides significant benefits at scale:
- Batch scraping: Processing hundreds or thousands of URLs
- Production systems: Where you can't afford blocking HTTP connections for 30 seconds
- Queue-based architectures: Submit jobs, process results from a queue
- Rate-limited workflows: Submit jobs at a controlled pace and collect results later
- Long-running scrapes: Complex pages that take 30+ seconds to render
Async Example: Submit and Poll
TypeScript:

// Step 1: Submit jobs
const urls = [
  'https://example.com/product/1',
  'https://example.com/product/2',
  'https://example.com/product/3',
];
const jobIds: string[] = [];
for (const url of urls) {
  const response = await fetch('https://api.fastwebscraper.com/v1/scrape/async', {
    method: 'POST',
    headers: {
      'X-API-Key': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, mode: 'auto' }),
  });
  const { data } = await response.json();
  jobIds.push(data.jobId);
  console.log(`Submitted ${url} -> ${data.jobId}`);
}

// Step 2: Poll for results
for (const jobId of jobIds) {
  let status = 'PENDING';
  let result;
  while (status === 'PENDING' || status === 'IN_PROGRESS') {
    await new Promise(r => setTimeout(r, 3000)); // Wait 3 seconds between polls
    const response = await fetch(
      `https://api.fastwebscraper.com/v1/jobs/${jobId}`,
      { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
    );
    result = await response.json();
    status = result.data.status;
  }
  if (status === 'COMPLETED') {
    console.log(`Job ${jobId}: ${result.data.html.length} chars`);
  } else {
    console.error(`Job ${jobId} failed: ${result.data.error}`);
  }
}

Python:

import requests
import time

# Step 1: Submit jobs
urls = [
    'https://example.com/product/1',
    'https://example.com/product/2',
    'https://example.com/product/3',
]
job_ids = []
for url in urls:
    response = requests.post(
        'https://api.fastwebscraper.com/v1/scrape/async',
        headers={
            'X-API-Key': 'YOUR_API_KEY',
            'Content-Type': 'application/json',
        },
        json={'url': url, 'mode': 'auto'},
    )
    job_id = response.json()['data']['jobId']
    job_ids.append(job_id)
    print(f'Submitted {url} -> {job_id}')

# Step 2: Poll for results
for job_id in job_ids:
    while True:
        time.sleep(3)  # Wait 3 seconds between polls
        response = requests.get(
            f'https://api.fastwebscraper.com/v1/jobs/{job_id}',
            headers={'X-API-Key': 'YOUR_API_KEY'},
        )
        result = response.json()
        status = result['data']['status']
        if status in ('COMPLETED', 'FAILED'):
            break
    if status == 'COMPLETED':
        print(f'Job {job_id}: {len(result["data"]["html"])} chars')
    else:
        print(f'Job {job_id} failed: {result["data"].get("error")}')

C#:

using System.Net.Http.Json;
using System.Text.Json;
using var client = new HttpClient();
client.DefaultRequestHeaders.Add("X-API-Key", "YOUR_API_KEY");

// Step 1: Submit jobs
var urls = new[]
{
    "https://example.com/product/1",
    "https://example.com/product/2",
    "https://example.com/product/3",
};
var jobIds = new List<string>();
foreach (var url in urls)
{
    var request = new { url, mode = "auto" };
    var response = await client.PostAsJsonAsync(
        "https://api.fastwebscraper.com/v1/scrape/async", request);
    var result = await response.Content.ReadFromJsonAsync<JsonElement>();
    var jobId = result.GetProperty("data").GetProperty("jobId").GetString()!;
    jobIds.Add(jobId);
    Console.WriteLine($"Submitted {url} -> {jobId}");
}

// Step 2: Poll for results
foreach (var jobId in jobIds)
{
    string status;
    JsonElement result;
    do
    {
        await Task.Delay(3000); // Wait 3 seconds between polls
        result = await client.GetFromJsonAsync<JsonElement>(
            $"https://api.fastwebscraper.com/v1/jobs/{jobId}");
        status = result.GetProperty("data").GetProperty("status").GetString()!;
    } while (status is "PENDING" or "IN_PROGRESS");

    if (status == "COMPLETED")
    {
        var html = result.GetProperty("data").GetProperty("html").GetString()!;
        Console.WriteLine($"Job {jobId}: {html.Length} chars");
    }
    else
    {
        Console.WriteLine($"Job {jobId} failed");
    }
}

Comparison Table
| Feature | Sync | Async |
|---|---|---|
| Response time | 5-30s (full page) | < 200ms (job ID only) |
| Complexity | Simple — one request | More complex — submit + poll |
| Best for | Testing, low volume | Production, batch processing |
| Connection handling | Blocks until complete | Non-blocking |
| Timeout risk | Higher (long-lived connections) | Lower (short requests) |
| Scalability | Limited by connection pool | Scales to thousands of jobs |
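The polling examples above wait a fixed 3 seconds between checks, which is fine for a handful of jobs. At higher volumes, a capped exponential backoff cuts the number of status requests while still picking up fast jobs quickly. A small sketch (the helper and its default values are ours, not part of the API):

```python
def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=6):
    """Yield polling delays that grow geometrically up to a cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor

# list(backoff_delays()) -> [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

In a poll loop you would `time.sleep(d)` for each yielded delay and break as soon as the job reaches a terminal status, falling back to an error or retry once the schedule is exhausted.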
Hybrid Approach
Many production systems use both:
- Sync for on-demand user-triggered scrapes (e.g., "preview this URL")
- Async for scheduled batch jobs (e.g., "scrape all competitor prices nightly")
Choose based on the specific use case within your application, not as a one-size-fits-all decision.
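The split described above can live behind one small routing function so the rest of the application never hardcodes an endpoint. A sketch in Python, assuming only the two endpoints documented here (the function name and task labels are ours, purely illustrative):

```python
BASE = 'https://api.fastwebscraper.com/v1'

def scrape_endpoint(task: str) -> str:
    """Route user-facing previews to sync; everything else submits an async job."""
    if task == 'preview':
        return f'{BASE}/scrape/sync'   # a user is waiting on the result
    return f'{BASE}/scrape/async'      # batch/scheduled work polls later
```

For example, a nightly price-collection task would get the async URL, while an interactive "preview this URL" feature would get the sync one.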
Next Steps
- Read about proxy types for optimizing success rates
- See the API Reference for complete endpoint documentation
- Explore use cases for production scraping patterns