Enhancement Request Template
Is your enhancement request related to a problem? Please describe.
Our current scraping implementation uses fetch() + cheerio inside a Next.js API route deployed on Vercel. This method previously worked but is now consistently failing because websites like Peerlist and Medium have introduced stronger bot protections such as Cloudflare Browser Challenges.
These protections return:
- 403 Forbidden
- cf-mitigated: challenge
- Unrendered HTML shells with no usable content
Because fetch() only downloads the response and never executes the page's JavaScript in a real browser environment, our Vercel serverless/edge function cannot solve Cloudflare challenges, so scraping fails.
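To make these failures easy to spot in logs, the signals listed above can be checked explicitly. This is a minimal sketch, not our current code: `classifyResponse` is a hypothetical helper name, and treating a bare 403 as "blocked" is an assumption about how we would want to report it.

```javascript
// Classifies a fetch() result using the signals described in this issue.
// `headers` only needs a get(name) method, which the fetch() Headers object provides.
function classifyResponse(status, headers) {
  const mitigated = (headers.get("cf-mitigated") || "").toLowerCase();
  if (mitigated === "challenge") {
    return "challenge"; // Cloudflare served a challenge page instead of content
  }
  if (status === 403) {
    return "blocked"; // forbidden, but without the challenge header
  }
  return "ok";
}
```

In the API route this would be called as `classifyResponse(res.status, res.headers)` right after `fetch()`, before handing the body to cheerio.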
Describe the enhancement you'd like
Introduce a dedicated scraping service using AWS Lambda + Playwright (or Chromium). Lambda can run a real headless browser, allowing it to:
- Execute JavaScript
- Solve Cloudflare JS challenges
- Load fully-rendered HTML
- Scrape dynamic websites reliably
Our Next.js API routes will call this Lambda function instead of attempting to scrape directly from Vercel.
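A rough sketch of what the call from the API route could look like. It assumes the Lambda is reachable over HTTPS (for example through a function URL or API Gateway), accepts a JSON body of the form `{ "url": ... }`, and answers with `{ "html": ... }`. None of that is decided yet, and `fetchRenderedHtml` and its parameters are hypothetical names.

```javascript
// Hypothetical helper for the Next.js API route: asks the scraping Lambda for rendered HTML.
// `endpoint` is assumed to be an HTTPS URL that fronts the Lambda.
// `fetchImpl` defaults to the global fetch and can be swapped out in tests.
async function fetchRenderedHtml(targetUrl, endpoint, fetchImpl = fetch) {
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ url: targetUrl }),
  });
  if (!response.ok) {
    throw new Error(`Scraper returned HTTP ${response.status}`);
  }
  const payload = await response.json();
  if (typeof payload.html !== "string") {
    throw new Error("Scraper response did not include an html string");
  }
  return payload.html;
}
```

The returned string can be passed to the existing cheerio parsing code, so the extraction logic would not need to change.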
Describe alternatives you've considered
- Adding browser-like headers
  Attempted custom User-Agent, cookies, and Accept headers. Cloudflare still blocks the requests.
- Scraping directly on Vercel
  Not practical for us: the Edge runtime cannot launch a browser process, and a full Playwright/Chromium bundle is difficult to fit within Vercel's serverless function size limits.
- Using proxies
  Rotating proxies do not solve Cloudflare's JavaScript challenge.
- Using third-party scraping APIs
  They work but add ongoing costs and provide less control. A Lambda-based scraper is more flexible and scalable.
Possible Implementation Details
- Create an AWS Lambda function using:
  - playwright-core
  - @sparticuz/chromium (Lambda-optimized Chromium build)
- Lambda loads the target URL in a real browser:
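```js
// Runs inside the async Lambda handler; `url` is the page to scrape.
const chromium = require("@sparticuz/chromium");
const playwright = require("playwright-core");

// Launch the Lambda-compatible Chromium build through Playwright.
const browser = await playwright.chromium.launch({
  args: chromium.args,
  executablePath: await chromium.executablePath(),
  headless: chromium.headless,
});
const page = await browser.newPage();
// Wait for network activity to settle so client-rendered content is present.
await page.goto(url, { waitUntil: "networkidle" });
const html = await page.content();
await browser.close(); // release the browser before the function returns
```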
- Lambda returns the fully rendered HTML to our Next.js API.
- Next.js extracts text using cheerio as before, but now from complete, rendered HTML.
This should let requests pass Cloudflare's JavaScript challenge and restore consistent scraping.
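One possible shape for the Lambda handler around the browser snippet, as a sketch only. It assumes an HTTP-style event whose `body` is a plain (not base64-encoded) JSON string containing `url`, and a `{ statusCode, headers, body }` response. `createScrapeHandler`, `renderPage`, and `respond` are hypothetical names, and the renderer is injected so the request handling can be tested without launching Chromium.

```javascript
// Builds a Lambda-style handler around a `renderPage(url)` function.
// In the real service, renderPage would wrap the Playwright launch/goto/content steps;
// here it is injected so the handler logic can be exercised without a browser.
function createScrapeHandler(renderPage) {
  return async function handler(event) {
    let targetUrl;
    try {
      const body = JSON.parse(event.body || "{}");
      targetUrl = new URL(body.url); // throws if url is missing or malformed
    } catch {
      return respond(400, { error: "Body must be JSON with a valid url" });
    }
    if (targetUrl.protocol !== "https:" && targetUrl.protocol !== "http:") {
      return respond(400, { error: "Only http(s) URLs are supported" });
    }
    try {
      const html = await renderPage(targetUrl.href);
      return respond(200, { html });
    } catch (err) {
      // Surface browser failures (timeouts, crashes) as a gateway-style error.
      return respond(502, { error: `Render failed: ${err.message}` });
    }
  };
}

// Formats a JSON response in the shape assumed for the HTTP integration.
function respond(statusCode, payload) {
  return {
    statusCode,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

Keeping the Playwright code behind `renderPage` also leaves room to swap the rendering strategy later without touching the request validation.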
Additional context
Peerlist and Medium recently updated their bot protection systems.
Cloudflare challenges require JavaScript execution and browser fingerprinting.
AWS Lambda supports full browser automation, making it a suitable scraping backend.
This change will significantly improve scraping reliability and reduce API failures.
Optional Sections
Priority: High
Are you willing to submit a PR for this enhancement? Yes
Does this enhancement require changes in documentation? Yes — the scraping architecture must be updated to reflect the new Lambda-based workflow.