What HTTP headers do I need for web scraping?
Essential HTTP headers make your scraper look like a legitimate browser and help prevent blocks.
Minimum required headers:
User-Agent (most important): Identifies your browser/client:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Without this, many sites block requests immediately.
Accept: Tells the server what content types you can handle:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: Preferred language for responses:
Accept-Language: en-US,en;q=0.9
Accept-Encoding: Compression formats you support:
Accept-Encoding: gzip, deflate, br
Additional important headers:
Referer: Previous page URL (important for navigation):
Referer: https://example.com/previous-page
Connection: Keeps the TCP connection open so it can be reused for subsequent requests:
Connection: keep-alive
Upgrade-Insecure-Requests: Signals that the client prefers an HTTPS version of the page:
Upgrade-Insecure-Requests: 1
Complete Python example:
import requests

url = 'https://example.com'  # target page (placeholder)

# Browser-like headers so the request does not stand out as automated
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
}

response = requests.get(url, headers=headers)
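If you request several pages in a row, a requests.Session reuses the underlying connection (which matches the Connection: keep-alive header) and makes it easy to send the previous page as the Referer. A minimal sketch, using placeholder example.com URLs:

import requests

# Base headers from the example above (shortened here for brevity)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept-Language': 'en-US,en;q=0.9',
}

session = requests.Session()
session.headers.update(headers)

# First request: the landing page
start_url = 'https://example.com/'
response = session.get(start_url)

# Follow-up request: send the page you came from as the Referer
next_url = 'https://example.com/products'
response = session.get(next_url, headers={'Referer': start_url})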
Headers to avoid:
Don't send headers that reveal automation:
X-Automated-Tool
"Bot" or "Scraper" in the User-Agent
Mismatched header combinations
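Note that Python's requests library identifies itself by default (a User-Agent like python-requests/2.x), which many sites treat as an automation signal. A quick way to check what your client actually sends, using httpbin.org, which echoes request headers:

import requests

# Without custom headers, the default User-Agent advertises the library
response = requests.get('https://httpbin.org/headers')
print(response.request.headers.get('User-Agent'))  # e.g. 'python-requests/2.31.0'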
When to add more headers:
For tougher sites, add:
Sec-Fetch-* headers (Chrome-specific)
DNT (Do Not Track)
Cache-Control
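A sketch of what those extra headers might look like for a Chrome-style top-level navigation; the exact Sec-Fetch-* values depend on the navigation context, so treat these as illustrative:

# Illustrative values for a Chrome-style page navigation
extra_headers = {
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'DNT': '1',
    'Cache-Control': 'max-age=0',
}
headers.update(extra_headers)  # merge into the base headers from the example above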
Best practice:
Use a header generator to get realistic, matched header sets for your target browser. Mismatched headers (e.g., Safari User-Agent with Chrome-specific headers) can trigger detection.
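One simple way to keep headers matched is to maintain complete, internally consistent per-browser profiles and use a single profile for the whole session instead of mixing values. A minimal sketch with illustrative (not exhaustive) profiles:

import random
import requests

# Each profile is a complete, internally consistent header set (illustrative values)
HEADER_PROFILES = [
    {  # Chrome on Windows
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
    },
    {  # Firefox on Windows
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.5',
    },
]

session = requests.Session()
session.headers.update(random.choice(HEADER_PROFILES))  # one consistent profile per session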