Scraping Modes and Best Practices
FastWebScraper offers five scraping modes, each optimized for different scenarios. Choosing the right mode balances success rate, speed, and cost.
Why Modes Matter
Websites identify and block scrapers through several signals:
- IP reputation: Datacenter IPs are flagged as non-human by default
- Browser fingerprinting: Missing or inconsistent browser signatures trigger detection
- Request patterns: Too many requests from one IP triggers rate limiting
- JavaScript execution: SPAs require real browser rendering to see content
FastWebScraper's modes combine different scraping methods and proxy types to handle these challenges.
Available Modes
auto (Recommended)
What it does: Smart selection based on domain history. FastWebScraper learns which mode works best for each domain and automatically uses it.
Best for:
- Most scraping tasks — let the system optimize for you
- New projects where you don't know what protection sites use
- Mixed workloads targeting many different domains
Credits: Varies based on the mode selected (starts with cheapest, escalates if needed)
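The escalation idea can be sketched as follows. This is a hypothetical client-side illustration, not the actual server implementation: the mode names come from this page, but the exact escalation order and the `scrape_fn` helper are assumptions.

```python
# Hypothetical sketch of auto-mode escalation: try the cheapest mode first,
# then fall back to progressively more expensive, more robust modes.
ESCALATION_ORDER = ["http", "browser", "browser_stealth", "http_stealth"]

def auto_scrape(url, scrape_fn):
    """Try each mode in order until one succeeds.

    scrape_fn(url, mode) should return the scraped content, or None on failure.
    """
    for mode in ESCALATION_ORDER:
        result = scrape_fn(url, mode)
        if result is not None:
            return mode, result
    raise RuntimeError(f"All modes failed for {url}")
```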
```json
{
  "url": "https://example.com",
  "mode": "auto"
}
```

http (1 credit)
What it does: Fast HTTP requests with datacenter proxies. No browser rendering.
Best for:
- Sites with no or minimal bot protection
- APIs and JSON endpoints
- High-speed bulk data collection
- Static HTML pages
Limitations:
- Won't execute JavaScript — SPAs will return empty shells
- Easily detected by sophisticated anti-bot systems
- Lower success rates on protected sites
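One way to catch the empty-shell failure mode in practice is a simple heuristic check on the returned HTML: if there is almost no visible text, the page probably needs JavaScript rendering and is worth retrying in a browser mode. This helper is an assumption for illustration, not part of the FastWebScraper API.

```python
import re

def looks_like_empty_shell(html: str, min_words: int = 30) -> bool:
    """Return True if the HTML has almost no visible text (a typical SPA shell)."""
    # Drop script/style blocks, then strip the remaining tags.
    no_scripts = re.sub(r"<(script|style).*?</\1>", "", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", no_scripts)
    return len(text.split()) < min_words
```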
```json
{
  "url": "https://public-data-site.gov/records",
  "mode": "http"
}
```

browser (7 credits)
What it does: Full browser rendering (Chromium) with datacenter proxies. Executes JavaScript and waits for dynamic content.
Best for:
- Single-page applications (React, Vue, Angular)
- Sites that load content via JavaScript
- Pages that require interaction before showing content
Limitations:
- Slower than HTTP requests
- Browser fingerprint may still be detected on heavily protected sites
```json
{
  "url": "https://spa-site.com/products",
  "mode": "browser",
  "waitForSelector": ".product-card"
}
```

browser_stealth (10 credits)
What it does: Browser rendering with residential proxies and anti-detection measures. Mimics real user behavior.
Best for:
- E-commerce sites (Amazon, Walmart, Target)
- Social media platforms
- Sites with Cloudflare, DataDome, or PerimeterX protection
- Any site that blocks the browser mode
```json
{
  "url": "https://protected-ecommerce.com/product/123",
  "mode": "browser_stealth",
  "country": "US"
}
```

http_stealth (15 credits)
What it does: HTTP requests through premium unlocker proxies with anti-bot bypass. No browser overhead but high success rates.
Best for:
- Protected non-SPA sites where browser rendering isn't needed
- High-volume scraping of protected sites
- When you need speed but the http mode is being blocked
```json
{
  "url": "https://protected-site.com/data",
  "mode": "http_stealth"
}
```

Choosing the Right Mode
| Scenario | Recommended Mode | Why |
|---|---|---|
| Don't know yet | auto | Let the system learn and optimize |
| Government/public data | http | No bot protection, fastest speed |
| E-commerce price monitoring | browser_stealth | Strong anti-bot protection |
| Job board scraping | browser_stealth | Most job sites have bot detection |
| News article collection | http or auto | Low protection, speed matters |
| Real estate listings | browser_stealth | Strong protection, geo-targeting needed |
| API endpoint scraping | http | No browser fingerprinting |
| Social media monitoring | browser_stealth | Aggressive bot detection |
| SPAs without protection | browser | Need JS execution, no anti-bot |
Geo-Targeting
Many websites serve different content based on visitor location — prices, product availability, language, or entirely different page layouts.
Use the country parameter to route requests through proxies in a specific country:
```javascript
// Get US pricing
const usResponse = await fetch('https://api.fastwebscraper.com/v1/scrape/sync', {
  method: 'POST',
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://store.example.com/product/widget',
    mode: 'auto',
    country: 'US',
  }),
});

// Get UK pricing for the same product
const ukResponse = await fetch('https://api.fastwebscraper.com/v1/scrape/sync', {
  method: 'POST',
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://store.example.com/product/widget',
    mode: 'auto',
    country: 'GB',
  }),
});
```

Best Practices
1. Start with auto, Optimize Later
Don't guess which mode you need. Use auto mode and let the system learn from success/failure patterns. Check your job results to see which mode was actually used — if you see consistent patterns, you can hardcode the mode to save the auto-selection overhead.
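Spotting those patterns can be as simple as tallying the mode field across completed job results. The snippet below assumes each result is a dict with a "mode" field; the exact response schema is an assumption, so adapt the key to what your job results actually contain.

```python
from collections import Counter

def mode_breakdown(results):
    """Count how often each mode was actually used across job results.

    Assumes each result is a dict with a "mode" field (schema is an assumption).
    """
    return Counter(r["mode"] for r in results)
```

If the breakdown for a domain is dominated by one mode (say, 95% browser_stealth), hardcoding that mode skips the auto-selection overhead.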
2. Use Geo-Targeting Intentionally
Only set the country parameter when the content you need is location-specific. Unnecessary geo-targeting limits the proxy pool and may reduce performance.
3. Spread Requests Over Time
Even with proxy rotation, sending hundreds of requests to the same domain in seconds looks suspicious. Space out your requests:
```python
import time

urls = get_urls_to_scrape()

for url in urls:
    scrape(url)
    time.sleep(1.5)  # 1.5 seconds between requests
```

4. Monitor Success Rates
Track the success rate per domain. If a domain's success rate drops below 85%, consider:
- Switching to a stealth mode (browser_stealth or http_stealth)
- Reducing request frequency
- Adding waitForSelector for JavaScript-heavy pages
- Checking if the site's structure has changed
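A minimal per-domain tracker for this is sketched below. The 85% threshold comes from the text above; the class itself is an illustrative assumption, not part of any SDK.

```python
from collections import defaultdict
from urllib.parse import urlparse

class SuccessTracker:
    """Track per-domain success rates and flag domains below a threshold."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold
        self.stats = defaultdict(lambda: [0, 0])  # domain -> [successes, total]

    def record(self, url: str, success: bool) -> None:
        domain = urlparse(url).netloc
        self.stats[domain][1] += 1
        if success:
            self.stats[domain][0] += 1

    def needs_attention(self, domain: str) -> bool:
        successes, total = self.stats[domain]
        return total > 0 and successes / total < self.threshold
```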
5. Handle Failures Gracefully
Not every request will succeed, even with stealth modes. Build retry logic with exponential backoff:
```typescript
async function scrapeWithRetry(url: string, retries = 3) {
  for (let i = 0; i < retries; i++) {
    try {
      const result = await scrape(url);
      if (result.success) return result;
    } catch (error) {
      // Retry on failure
    }
    await new Promise(r => setTimeout(r, 2 ** i * 1000)); // exponential backoff
  }
  throw new Error(`Failed after ${retries} retries: ${url}`);
}
```

Mode Selection Summary
| Mode | Method | Proxy | Credits | Speed | Success Rate |
|---|---|---|---|---|---|
| auto | varies | varies | varies | varies | highest |
| http | HTTP | datacenter | 1 | fastest | low on protected sites |
| browser | browser | datacenter | 7 | moderate | medium |
| browser_stealth | browser | residential | 10 | slower | high |
| http_stealth | HTTP | unlocker | 15 | fast | high |
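The per-request credit costs in the table make it easy to estimate a batch job's cost before running it. A quick sketch (the costs are from the table above; the helper itself is illustrative):

```python
# Credit cost per request for each explicit mode, from the summary table.
CREDITS_PER_REQUEST = {
    "http": 1,
    "browser": 7,
    "browser_stealth": 10,
    "http_stealth": 15,
}

def estimate_credits(mode: str, num_requests: int) -> int:
    """Estimate total credits for a batch of requests in a given mode."""
    return CREDITS_PER_REQUEST[mode] * num_requests
```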
The right mode depends on the target site's protection level. Start with auto and let the system optimize, or choose explicitly based on your knowledge of the target site. For more on the scraping API, see the API Reference.