147 changes: 147 additions & 0 deletions RENDER_DEPLOYMENT_GUIDE.md
@@ -0,0 +1,147 @@
# Render Deployment Guide - Fixing Playwright Issues

## Problem
When deploying to Render, you may encounter this error:
```
Failed to detect dynamic content for https://www.imdb.com/search/title/?groups=top_100&sort=user_rating,desc:
BrowserType.launch: Executable doesn't exist at /opt/render/.cache/ms-playwright/chromium_headless_shell-1181/chrome-linux/headless_shell
```

## Root Cause
Playwright requires browser binaries to be installed separately from the Python package. On Render, these binaries are not automatically installed during deployment.
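
One way to confirm this root cause from inside the service is to look for a Chromium build in the browser cache directory. The sketch below is a minimal check, assuming the cache lives at the path shown in the error message (or wherever `PLAYWRIGHT_BROWSERS_PATH` points). The helper name `find_chromium_builds` is hypothetical and not part of Playwright.

```python
from __future__ import annotations

import os
from pathlib import Path

# Default taken from the error message above; PLAYWRIGHT_BROWSERS_PATH overrides it.
DEFAULT_CACHE = "/opt/render/.cache/ms-playwright"


def find_chromium_builds(cache_dir: str | None = None) -> list[str]:
    """Return the names of Chromium build directories found in the browser cache.

    Hypothetical helper: it only inspects directory names (for example
    ``chromium_headless_shell-1181``) and never launches a browser.
    """
    root = Path(cache_dir or os.environ.get("PLAYWRIGHT_BROWSERS_PATH", DEFAULT_CACHE))
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith("chromium")
    )


if __name__ == "__main__":
    builds = find_chromium_builds()
    print("Chromium builds found:" if builds else "No Chromium builds found.", builds)
```

An empty result at runtime suggests that the build step never downloaded the browser, or that it was downloaded to a different location than the one the service reads from.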

## Solutions

### Solution 1: Using render.yaml (Recommended)

1. **Create `render.yaml` in your project root:**
```yaml
services:
  - type: web
    name: nexus-platform-backend
    env: python
    plan: starter
    buildCommand: |
      cd backend
      pip install -r requirements.txt
      playwright install chromium
    startCommand: |
      cd backend
      uvicorn main:app --host 0.0.0.0 --port $PORT
    envVars:
      - key: PYTHON_VERSION
        value: 3.11.7
      - key: OPENAI_API_KEY
        sync: false
      - key: DATABASE_URL
        sync: false
      - key: ANTHROPIC_API_KEY
        sync: false
```

2. **Deploy using the render.yaml file:**
- Connect your repository to Render
- Render will automatically detect and use the `render.yaml` configuration

### Solution 2: Manual Render Dashboard Configuration

1. **In your Render dashboard, set the build command to:**
```bash
cd backend && pip install -r requirements.txt && playwright install chromium
```

2. **Set the start command to:**
```bash
cd backend && uvicorn main:app --host 0.0.0.0 --port $PORT
```

### Solution 3: Using a Build Script

1. **Create `backend/install_playwright_minimal.sh`:**
```bash
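#!/bin/bash
set -e  # stop the build on the first failing command
echo "🔧 Installing Playwright with minimal approach..."
pip install -r requirements.txt
playwright install chromium
# These exports apply only to this build shell and its child processes;
# set them in the Render dashboard if the running service needs them.
export PLAYWRIGHT_BROWSERS_PATH=/opt/render/.cache/ms-playwright
export DISPLAY=:99
echo "✅ Playwright installation completed!"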
```

2. **Make it executable:**
```bash
chmod +x backend/install_playwright_minimal.sh
```

3. **Set the build command in Render to:**
```bash
cd backend && ./install_playwright_minimal.sh
```

## Environment Variables

Make sure to set these environment variables in your Render dashboard:

- `OPENAI_API_KEY` - Your OpenAI API key
- `DATABASE_URL` - Your database connection string
- `ANTHROPIC_API_KEY` - Your Anthropic API key (if using Claude)
- `PYTHON_VERSION` - Set to 3.11.7 to match `backend/.python-version`
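
A missing variable is easier to diagnose at startup than in the middle of a request. The sketch below shows one way to check, assuming the variable names from the list above and treating `ANTHROPIC_API_KEY` as optional, since it is only needed when Claude is used. The function name `missing_env_vars` is hypothetical.

```python
import os

# Names come from the list above. ANTHROPIC_API_KEY is left out on purpose
# because the guide marks it as needed only when Claude is used.
REQUIRED_VARS = ("OPENAI_API_KEY", "DATABASE_URL")


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    print("Missing variables:", ", ".join(missing) if missing else "none")
```

Calling this once when the application starts, and logging the result, makes a misconfigured dashboard visible in the first lines of the service log.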

## Verification

After deployment, you can verify that Playwright is working by:

1. **Checking the build logs** - With the build script from Solution 3, you should see:
```
playwright install chromium
✅ Playwright installation completed!
```

2. **Testing the API endpoint** - Make a request to your dynamic scraping endpoint

## Troubleshooting

### If you still get the error:

1. **Check the build logs** in Render dashboard
2. **Verify the build command** includes `playwright install chromium`
3. **Ensure you're using Python 3.11** (not 3.13, which may have compatibility issues)
4. **Check that the start command** is correct and points to the right directory

### Common Issues:

1. **Wrong working directory** - Make sure to `cd backend` before running commands
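2. **Missing browser binaries** - Ensure `playwright install chromium` is included. Avoid `--with-deps`: it tries to install system packages, which needs root access that Render's native build environment typically does not provide.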
3. **Python version** - Use Python 3.11 for best compatibility

## Alternative: Disable Dynamic Scraping

If you don't need dynamic content scraping, you can modify the code to fall back to static scraping:

```python
# In your scraping logic, add a fallback
try:
    # Try dynamic scraping with Playwright
    content = await dynamic_scraper.get_content(url)
except Exception as e:
    logger.warning(f"Dynamic scraping failed: {e}")
    # Fall back to static scraping
    content = await static_scraper.get_content(url)
```
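
The fallback can also be wrapped in a small helper so that callers know which path produced the content. This is a self-contained sketch, not the project's implementation: `get_content_with_fallback`, `broken_dynamic`, and `simple_static` are hypothetical names, and the two stand-in fetchers only simulate a failing Playwright scraper and a working static one.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def get_content_with_fallback(url, dynamic_fetch, static_fetch):
    """Try ``dynamic_fetch(url)`` first; on any error, log it and use ``static_fetch(url)``.

    Returns a ``(content, source)`` pair so callers can tell which path was used.
    """
    try:
        return await dynamic_fetch(url), "dynamic"
    except Exception as exc:  # broad on purpose: a missing browser surfaces as a generic error
        logger.warning("Dynamic scraping failed for %s: %s", url, exc)
        return await static_fetch(url), "static"


# Stand-in fetchers for illustration only (hypothetical, not the project's scrapers).
async def broken_dynamic(url):
    raise RuntimeError("Executable doesn't exist")


async def simple_static(url):
    return f"<html>static copy of {url}</html>"


if __name__ == "__main__":
    result = asyncio.run(
        get_content_with_fallback("https://example.com", broken_dynamic, simple_static)
    )
    print(result)
```

Returning the source alongside the content lets the API report when a page was scraped statically, which matters for pages whose data only appears after JavaScript runs.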

## Files Modified

- `render.yaml` - Render configuration file
- `backend/install_playwright_minimal.sh` - Minimal Playwright installation script
- `backend/app/core/dynamic_scraper.py` - Added error handling and auto-installation
- `RENDER_DEPLOYMENT_GUIDE.md` - This guide

## Next Steps

1. Choose one of the solutions above
2. Deploy to Render using the chosen method
3. Monitor the build logs to ensure Playwright browsers are installed
4. Test your dynamic scraping functionality

The key is ensuring that `playwright install chromium` runs during the build process on Render.
2 changes: 1 addition & 1 deletion backend/.python-version
@@ -1 +1 @@
-3.12.7
+3.11.7
62 changes: 46 additions & 16 deletions backend/app/core/dynamic_scraper.py
@@ -21,22 +21,52 @@ def __init__(self, timeout: int = 60, headless: bool = True):

     async def __aenter__(self):
         """Async context manager entry"""
-        self.playwright = await async_playwright().start()
-        self.browser = await self.playwright.chromium.launch(
-            headless=self.headless,
-            args=[
-                '--no-sandbox',
-                '--disable-setuid-sandbox',
-                '--disable-dev-shm-usage',
-                '--disable-gpu',
-                '--no-first-run',
-                '--no-default-browser-check',
-                '--disable-background-timer-throttling',
-                '--disable-backgrounding-occluded-windows',
-                '--disable-renderer-backgrounding'
-            ]
-        )
-        return self
+        try:
+            self.playwright = await async_playwright().start()
+            self.browser = await self.playwright.chromium.launch(
+                headless=self.headless,
+                args=[
+                    '--no-sandbox',
+                    '--disable-setuid-sandbox',
+                    '--disable-dev-shm-usage',
+                    '--disable-gpu',
+                    '--no-first-run',
+                    '--no-default-browser-check',
+                    '--disable-background-timer-throttling',
+                    '--disable-backgrounding-occluded-windows',
+                    '--disable-renderer-backgrounding'
+                ]
+            )
+            return self
+        except Exception as e:
+            logger.error(f"Failed to initialize Playwright: {e}")
+            # Try to install browsers if they're missing
+            import subprocess
+            import sys
+            try:
+                logger.info("Attempting to install Playwright browsers...")
+                subprocess.run([sys.executable, "-m", "playwright", "install", "chromium"],
+                               check=True, capture_output=True)
+                logger.info("Playwright browsers installed successfully, retrying...")
+                self.playwright = await async_playwright().start()
+                self.browser = await self.playwright.chromium.launch(
+                    headless=self.headless,
+                    args=[
+                        '--no-sandbox',
+                        '--disable-setuid-sandbox',
+                        '--disable-dev-shm-usage',
+                        '--disable-gpu',
+                        '--no-first-run',
+                        '--no-default-browser-check',
+                        '--disable-background-timer-throttling',
+                        '--disable-backgrounding-occluded-windows',
+                        '--disable-renderer-backgrounding'
+                    ]
+                )
+                return self
+            except Exception as install_error:
+                logger.error(f"Failed to install Playwright browsers: {install_error}")
+                raise

     async def __aexit__(self, exc_type, exc_val, exc_tb):
         """Async context manager exit"""
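The retry logic in `__aenter__` repeats the launch arguments and runs a blocking `subprocess.run` inside a coroutine. The sketch below shows one possible way to factor out the "launch, install on failure, retry once" pattern. It is not the code in this PR: `CHROMIUM_ARGS`, `install_chromium`, and `launch_with_install_retry` are hypothetical names, and the launch step is passed in as a callable so the example runs without Playwright installed.

```python
import asyncio
import subprocess
import sys

# Same flags as in dynamic_scraper.py, defined once so both attempts share them.
CHROMIUM_ARGS = [
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--no-first-run",
    "--no-default-browser-check",
    "--disable-background-timer-throttling",
    "--disable-backgrounding-occluded-windows",
    "--disable-renderer-backgrounding",
]


def install_chromium() -> None:
    """Blocking install through the Playwright CLI, using the same command as the diff."""
    subprocess.run(
        [sys.executable, "-m", "playwright", "install", "chromium"],
        check=True,
        capture_output=True,
    )


async def launch_with_install_retry(launch, install=install_chromium):
    """Await ``launch()``; if it raises, run ``install`` in a worker thread and retry once.

    Like the original code, this retries on any exception, not only on a
    missing browser executable.
    """
    try:
        return await launch()
    except Exception:
        # asyncio.to_thread keeps the blocking install from stalling the event loop.
        await asyncio.to_thread(install)
        return await launch()
```

Inside `__aenter__`, the `launch` argument could be `lambda: self.playwright.chromium.launch(headless=self.headless, args=CHROMIUM_ARGS)`, called after `async_playwright().start()` has succeeded. A runtime install is still only a safety net. Downloading the browser at build time stays the primary fix.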
19 changes: 19 additions & 0 deletions backend/build.sh
@@ -0,0 +1,19 @@
#!/bin/bash

# Render Build Script for Playwright Support

echo "🔧 Starting build process..."

# Install Python dependencies
echo "📦 Installing Python dependencies..."
pip install -r requirements.txt

# Install Playwright browsers
echo "🌐 Installing Playwright browsers..."
playwright install chromium

# Verify installation
echo "✅ Verifying Playwright installation..."
python -c "from playwright.async_api import async_playwright; print('Playwright installed successfully')"

echo "🎉 Build completed successfully!"
39 changes: 39 additions & 0 deletions backend/install_playwright.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
#!/bin/bash

echo "🔧 Installing Playwright with custom approach..."

# Install Python dependencies
pip install -r requirements.txt

# Install Playwright browsers without system dependencies
echo "🌐 Installing Playwright browsers..."
playwright install chromium

# Browser cache location (matches the path in the Render error message).
# Note: these exports affect only this build shell, not the running service.
export PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=0
export PLAYWRIGHT_BROWSERS_PATH=/opt/render/.cache/ms-playwright

# Verify installation
echo "✅ Verifying Playwright installation..."
python -c "
from playwright.async_api import async_playwright
import asyncio

async def test_playwright():
    try:
        playwright = await async_playwright().start()
        browser = await playwright.chromium.launch(headless=True)
        await browser.close()
        await playwright.stop()
        print('✅ Playwright installed and working successfully!')
        return True
    except Exception as e:
        print(f'❌ Playwright test failed: {e}')
        return False

result = asyncio.run(test_playwright())
if not result:
    exit(1)
"

echo "🎉 Playwright installation completed!"
16 changes: 16 additions & 0 deletions backend/install_playwright_minimal.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
#!/bin/bash
set -e  # stop the build on the first failing command

echo "🔧 Installing Playwright with minimal approach..."

# Install Python dependencies
pip install -r requirements.txt

# Install only Chromium browser without system dependencies
echo "🌐 Installing Chromium browser only..."
playwright install chromium

# Set environment variables for headless operation.
# These exports apply only to this build shell and its child processes;
# set them in the Render dashboard if the running service needs them.
export PLAYWRIGHT_BROWSERS_PATH=/opt/render/.cache/ms-playwright
export DISPLAY=:99

echo "✅ Playwright installation completed!"
20 changes: 20 additions & 0 deletions render.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
services:
  - type: web
    name: nexus-platform-backend
    env: python
    plan: starter
    buildCommand: |
      cd backend
      ./install_playwright_minimal.sh
    startCommand: |
      cd backend
      uvicorn main:app --host 0.0.0.0 --port $PORT
    envVars:
      - key: PYTHON_VERSION
        value: 3.11.7
      - key: OPENAI_API_KEY
        sync: false
      - key: DATABASE_URL
        sync: false
      - key: ANTHROPIC_API_KEY
        sync: false