How I Got Here
I needed to scrape a React app protected by Cloudflare. Thought it would be easy - just install playwright-stealth, configure it, and done. That's not what happened.
After weeks of trial and error, going through GitHub issues, trying different proxies, and failing countless times, I finally figured out what actually works. Spoiler: it's not just playwright-stealth.
This is what I learned, the problems I hit, and the solutions that actually worked in production.
Problem 1: Playwright-stealth Doesn't Install Properly
First problem I hit - couldn't even get playwright-stealth working. The package on PyPI is outdated. There's a notice on GitHub:
⚠️ GitHub Issue #40
"This repo is not the source for the PyPi package (playwright-stealth) anymore"
What happened: I ran pip install playwright-stealth, then tried to import it:
from playwright_stealth import Stealth
Got ImportError. The package is broken. Multiple issues report this (#16, #22, #31).
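Before reinstalling anything, it can help to check which API the installed package actually exposes. This is a minimal sketch, not part of any library: `detect_stealth_api` is a hypothetical helper, and it only assumes that one build exports `Stealth` while another exports `stealth_sync`, which is what the two import lines in this post suggest.

```python
import importlib


def detect_stealth_api(module_name="playwright_stealth"):
    """Report which entry point the installed module exposes.

    Hypothetical helper: it only looks for the two names used in this post.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both "package missing" and "package present but broken"
        return "not installed"
    if hasattr(module, "stealth_sync"):
        return "function API (stealth_sync)"
    if hasattr(module, "Stealth"):
        return "class API (Stealth)"
    return "unknown API"
```

Calling `print(detect_stealth_api())` in the same virtualenv as the scraper shows which import line to use before you touch any scraping code.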
What Actually Worked
I had to install from the original repo directly:
pip install git+https://github.com/Granitosaurus/playwright-stealth.git
Then use it like this:
from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    stealth_sync(page)  # This hides webdriver properties
    page.goto("https://example.com")
💡 Tip
Even with stealth installed, Cloudflare still detects you. More on that next.
Problem 2: Cloudflare Still Detects the Bot
Got stealth working, but Cloudflare still blocked me. Got the 5-second shield, then a 1020 error (access denied).
This is what everyone complains about on StackOverflow and Reddit. Even with:
- Stealth mode enabled
- Custom user-agent
- Real browser cookies
- Human-like mouse movements
Cloudflare still blocked the request. Why?
What Cloudflare Actually Checks
After digging through forums and GitHub issues, I found that Cloudflare checks:
- TLS fingerprint - Python's requests library has a different TLS fingerprint than Chrome
- HTTP/2 fingerprint - The way the browser orders headers
- Browser fingerprint - Canvas, WebGL, audio context
- IP reputation - Datacenter IPs vs residential IPs
- Behavior patterns - Too perfect timing = bot
Playwright-stealth only patches what page JavaScript can see, such as the navigator.webdriver property. It doesn't fix TLS fingerprints or IP reputation.
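The "too perfect timing" item in the list above is the one check you can address with a few lines of plain Python. A rough sketch follows; the function name and every number in it (base delay, jitter, chance of a long pause) are my own assumptions, and nothing here is known to satisfy Cloudflare's behavioral checks.

```python
import random


def human_delays(count, base=2.0, jitter=3.0, long_pause_chance=0.1, seed=None):
    """Return `count` delays in seconds with random jitter.

    All parameter values are illustrative guesses, not measured thresholds.
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(count):
        delay = base + rng.uniform(0, jitter)
        if rng.random() < long_pause_chance:
            # Occasionally add a longer pause, like someone reading the page
            delay += rng.uniform(5, 15)
        delays.append(round(delay, 2))
    return delays
```

Each value can be passed to `time.sleep()` between page actions. Passing a `seed` makes a run reproducible while debugging.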
Problem 3: Residential Proxies Are Expensive
Everyone said "use residential proxies." Yeah, great. They cost $500/month for decent ones.
⚠️ Reality Check
Free proxies are either slow, already blacklisted by Cloudflare, or dead within 5 minutes. Residential proxy services cost serious money.
I tried:
- Bright Data - $500+/month
- Smartproxy - $75-$200/month
- Oxylabs - $300+/month
- Free proxy lists - All blocked by Cloudflare
What Actually Worked (Budget Solution)
For low-volume scraping, I used:
import random

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

# Use datacenter proxies with rotation
proxies = [
    "http://proxy1.example.com:8000",
    "http://proxy2.example.com:8000",
    # ... more proxies
]

with sync_playwright() as p:
    for proxy in proxies:
        browser = None
        try:
            browser = p.chromium.launch(
                headless=True,
                proxy={"server": proxy}
            )
            page = browser.new_page()
            stealth_sync(page)
            # Add random delays to mimic human behavior
            page.wait_for_timeout(2000 + random.randint(0, 3000))
            page.goto("https://target-site.com")
            # If we get here, proxy works
            break
        except Exception:
            # Close the failed browser, then try next proxy
            if browser:
                browser.close()
            continue
This isn't perfect. Cloudflare still blocks some datacenter IPs. But with enough rotation, some get through.
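One small refinement to the rotation idea: shuffle the list and use every proxy once before reusing any of them, so retries don't keep landing on the same IP. This is a sketch under my own assumptions; `rotation_order` is a hypothetical helper, not something from Playwright or a proxy library.

```python
import random


def rotation_order(proxies, attempts, seed=None):
    """Return `attempts` proxies, built from shuffled passes over the list.

    Every proxy is used once per pass before any proxy is reused.
    With no proxies configured, returns None entries (direct connection).
    """
    if not proxies:
        return [None] * attempts
    rng = random.Random(seed)
    order = []
    while len(order) < attempts:
        batch = list(proxies)
        rng.shuffle(batch)
        order.extend(batch)
    return order[:attempts]
```

You would loop over `rotation_order(proxies, 10)` in place of looping over `proxies` directly.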
Problem 4: Headless Mode Gets Detected
GitHub Issues #30 and #21 - headless browsers fail detection tests. Cloudflare checks:
- Screen size (headless often has 0x0 or weird dimensions)
- WebGL renderer (headless shows "Google SwiftShader" instead of real GPU)
- Missing browser plugins
- Automation indicators
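To see what your own browser reports for the items above, you can run a small probe in the page and look at the values yourself. This is a sketch: the probe script, `collect_fingerprint`, and `suspicious_signals` are my own illustrative names, and the checks only mirror the list above. They are not Cloudflare's actual rules.

```python
FINGERPRINT_PROBE = """
() => {
  const canvas = document.createElement('canvas');
  const gl = canvas.getContext('webgl');
  let renderer = null;
  if (gl) {
    const info = gl.getExtension('WEBGL_debug_renderer_info');
    if (info) renderer = gl.getParameter(info.UNMASKED_RENDERER_WEBGL);
  }
  return {
    webdriver: navigator.webdriver,
    pluginCount: navigator.plugins.length,
    screen: [screen.width, screen.height],
    webglRenderer: renderer,
  };
}
""".strip()


def collect_fingerprint(page):
    """Run the probe in a Playwright page and return the result as a dict."""
    return page.evaluate(FINGERPRINT_PROBE)


def suspicious_signals(fp):
    """List the values that match the headless giveaways named above."""
    problems = []
    if fp.get("webdriver"):
        problems.append("navigator.webdriver is true")
    if fp.get("pluginCount", 0) == 0:
        problems.append("no browser plugins")
    width, height = fp.get("screen") or [0, 0]
    if width == 0 or height == 0:
        problems.append("zero screen size")
    if "SwiftShader" in (fp.get("webglRenderer") or ""):
        problems.append("software WebGL renderer")
    return problems
```

Run `suspicious_signals(collect_fingerprint(page))` once in headless mode and once in headed mode to compare what changes.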
What Actually Worked
I had to run in headed mode on a remote server:
browser = p.chromium.launch(
    headless=False,  # headed mode
    args=[
        '--start-maximized',
        '--disable-blink-features=AutomationControlled'
    ]
)
# Set viewport size
page = browser.new_page(
    viewport={'width': 1920, 'height': 1080}
)
For servers without a display, use Xvfb:
# Install Xvfb
# sudo apt-get install xvfb
# Run Playwright with virtual display
xvfb-run python script.py
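A headed launch fails outright on a Linux box with no display, so a quick check up front saves a confusing stack trace. This is a minimal sketch with a hypothetical helper name. It assumes X11 or Wayland sessions announce themselves through the `DISPLAY` or `WAYLAND_DISPLAY` environment variables (`xvfb-run` sets `DISPLAY` for the script it wraps), and that non-Linux machines have a desktop session.

```python
import os
import sys


def has_display(env=None, platform=None):
    """Return True if a headed browser has somewhere to draw.

    Assumption: on non-Linux platforms a desktop session is available.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        return True
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
```

If `has_display()` returns False, either run the script under `xvfb-run` or fall back to `headless=True` and accept the higher detection risk.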
💡 Note
Modern Cloudflare checks are more sophisticated. Even headed mode can be detected through behavioral analysis.
Problem 5: reCAPTCHA Doesn't Show
GitHub Issue #38 - recaptcha not showing. Sometimes Cloudflare serves a CAPTCHA, but it never appears in Playwright.
What happens: The page loads, you see the "Checking your browser" message, then... nothing. Page stays blank.
What Actually Worked
This is usually because:
- JavaScript is blocked
- The iframe containing the CAPTCHA is blocked by privacy settings
- Cloudflare decided to block instead of challenge
I tried:
# Wait for the page to have a <title>; the element is never "visible", so wait for it to be attached
page.wait_for_selector("title", state="attached", timeout=30000)
# Or wait for specific element that appears after challenge
page.wait_for_selector("#main-content", timeout=30000)
Sometimes though, the site just doesn't want automated traffic. No workaround for that.
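When the page stays blank, it helps to know whether you are still sitting on the challenge or have already been blocked. Here is a rough classifier. The marker strings are assumptions based on what these pages showed me ("Checking your browser", a "Just a moment" title, an "Error code: 1020" body); Cloudflare can change the wording at any time, and "access denied" will also match some ordinary pages.

```python
def classify_page(title, html):
    """Guess whether a page is a block page, a challenge page, or real content.

    Heuristic only: the marker strings below are assumptions, not a stable API.
    """
    text = f"{title}\n{html}".lower()
    if "error code: 1020" in text or "error 1020" in text or "access denied" in text:
        return "blocked"
    if "just a moment" in text or "checking your browser" in text:
        return "challenge"
    return "ok"
```

With Playwright you would call `classify_page(page.title(), page.content())` after the wait, and only retry with a new proxy when the result is "blocked".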
⚠️ Unsolved Problem
Some sites use Cloudflare's "Managed Challenge" which is invisible. If it fails, you just get blocked. No CAPTCHA to solve.
What Actually Works (in 2026)
After all this trial and error, here's my setup that usually works:
from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync
import random
import time

def load_proxies():
    """Placeholder: return your own list of proxy URLs"""
    return []

def create_stealth_context(p, proxy_url):
    """Launch a browser and open a page with anti-detection measures"""
    browser = p.chromium.launch(
        headless=False,  # headed mode
        proxy={"server": proxy_url} if proxy_url else None,
        args=[
            '--disable-blink-features=AutomationControlled',
            '--disable-dev-shm-usage',
            '--no-sandbox',
        ]
    )
    context = browser.new_context(
        viewport={'width': 1920, 'height': 1080},
        # Keep this in sync with a current Chrome release
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36',
        locale='en-US',
        timezone_id='America/New_York',
    )
    page = context.new_page()
    stealth_sync(page)
    return browser, page

def scrape_with_retry(url, max_retries=5):
    """Scrape with proxy rotation and retries"""
    proxies = load_proxies()  # Your proxy list
    # Keep Playwright running across attempts: leaving the `with` block
    # shuts down every browser it launched
    with sync_playwright() as p:
        for attempt in range(max_retries):
            proxy = random.choice(proxies) if proxies else None
            browser = None
            try:
                browser, page = create_stealth_context(p, proxy)
                # Random delay before request
                time.sleep(random.uniform(2, 5))
                # Navigate to page
                response = page.goto(url, wait_until="domcontentloaded", timeout=30000)
                # Wait for Cloudflare challenge
                time.sleep(5)
                # Check if we got blocked (Cloudflare's error 1020 arrives as HTTP 403)
                if response is None or response.status == 403:
                    continue
                # Wait for content
                page.wait_for_selector("body", timeout=15000)
                # Extract data
                return page.content()
            except Exception:
                continue
            finally:
                # Runs on success, block, and error alike
                if browser:
                    browser.close()
    raise RuntimeError("All retries failed")
Key Points
- Headed mode - Less suspicious than headless
- Proxy rotation - Don't hit from same IP repeatedly
- Random delays - Humans don't make instant requests
- Real user agent - Use current Chrome version
- Geolocation match - Match timezone to proxy location
- Retry logic - Some proxies will fail, try others
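The "Geolocation match" point is easy to get wrong when proxies rotate across countries, because the hardcoded `en-US` / `America/New_York` pair stops matching the exit IP. A small lookup keeps them aligned. This is a sketch: the table, the helper name, and the assumption that you know each proxy's country code are all mine, and a single timezone per country is a simplification.

```python
# Hypothetical lookup table; extend it with the countries your proxies use
PROXY_REGION_SETTINGS = {
    "US": {"locale": "en-US", "timezone_id": "America/New_York"},
    "GB": {"locale": "en-GB", "timezone_id": "Europe/London"},
    "DE": {"locale": "de-DE", "timezone_id": "Europe/Berlin"},
}


def context_options(country_code, default="US"):
    """Build new_context() keyword arguments that match the proxy's country."""
    settings = PROXY_REGION_SETTINGS.get(
        country_code.upper(), PROXY_REGION_SETTINGS[default]
    )
    return {"viewport": {"width": 1920, "height": 1080}, **settings}
```

The result can be unpacked straight into Playwright, for example `browser.new_context(**context_options("DE"))`.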
⚠️ Reality Check
Even with all this, some sites will still block you. Cloudflare gets smarter every month. What works today might not work next month.
When to Just Give Up
Sometimes the best solution is to accept defeat. I've learned to move on when:
- Site uses Enterprise Cloudflare with bot management (budget: $0, success rate: 0%)
- Requires phone verification or 2FA
- Session checks are too strict (fingerprinting + IP + behavior pattern)
- Legal gray area or explicit terms prohibiting scraping
Alternatives I use:
- Official API (if available)
- Manual export/download (for one-time data needs)
- Third-party data providers (sometimes cheaper than building own scraper)
- Different target site with similar data but weaker protection
Final Thoughts
Playwright-stealth helps, but it's not a magic bullet. Cloudflare is sophisticated, and they actively work to detect automated browsers.
The combination that works for me:
- Stealth mode (hide webdriver flags)
- Headed mode (avoid headless detection)
- Proxy rotation (spread requests across IPs)
- Human-like delays (don't request too fast)
- Accept that some sites just can't be scraped
Not claiming this is the "right" way or that it works everywhere. It's just what I've learned through months of failing and trying again.
If you're dealing with Cloudflare in 2026, expect to spend time debugging and adjusting. The cat-and-mouse game never ends.
Sources & Further Reading
- playwright-stealth GitHub Issues - Real problems people report
- Cloudflare Bypass Examples - Working code examples
- StackOverflow: Bypass Cloudflare with Playwright
- Reddit: Cloudflare Bypass Discussion