How do I scrape dynamic content loaded with JavaScript in Python?
Dynamic JavaScript content requires different approaches than static HTML scraping.
Strategy 1: Find the API endpoint (best):
Often, JavaScript-rendered content comes from API calls:
- Open browser DevTools Network tab
- Reload the page and filter for XHR/Fetch requests
- Find the API endpoint returning JSON data
- Make direct requests to that endpoint with the requests library
This is faster and more reliable than browser automation.
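The steps above can be sketched with requests. Note that the endpoint URL and the JSON shape (`{"items": [...]}`) are assumptions for illustration — substitute whatever you actually see in the Network tab:

```python
import requests

# Hypothetical endpoint discovered in the DevTools Network tab
API_URL = "https://example.com/api/products"

def fetch_items(url=API_URL, session=None):
    """Call the JSON API directly and return its list of items."""
    s = session or requests.Session()
    # Send the headers/params the browser sent, so the server responds the same way
    resp = s.get(url, headers={"Accept": "application/json"},
                 params={"page": 1}, timeout=10)
    resp.raise_for_status()
    return resp.json().get("items", [])
```

Passing a Session lets you reuse cookies and connections across calls, and makes the function easy to test with a stub.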
Strategy 2: Use Selenium for browser automation:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
url = "https://example.com"  # replace with the page you are scraping

driver = webdriver.Chrome()
driver.get(url)

# Wait up to 10 seconds for the dynamic content to appear
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, "dynamic-content"))
)

html = driver.page_source
driver.quit()
Strategy 3: Use Playwright (modern alternative):
Playwright tends to be faster and more stable than Selenium, and its auto-waiting API requires less boilerplate:
from playwright.sync_api import sync_playwright

url = "https://example.com"  # replace with the page you are scraping

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(url)
    # Blocks until an element matching the selector is in the DOM
    page.wait_for_selector('.dynamic-content')
    html = page.content()
    browser.close()
When to use each approach:
- API endpoint: Always try this first - fastest and most reliable
- Selenium: When you need to interact (click buttons, fill forms)
- Playwright: Better than Selenium for most modern sites
Performance impact:
Browser automation is typically 10-100x slower than direct HTTP requests, because it has to launch a browser, download every asset, and execute the page's JavaScript. Always prefer API endpoints when available.