[Bug]: Critical Bug: Incomplete Redirect Chain Detection Breaks Internal URL Extraction

### crawl4ai version

 0.7.4

### Expected Behavior

# Critical Bug: Incomplete Redirect Chain Detection Breaks Internal URL Extraction

## Summary
The `redirected_url` attribute in the CrawlResult object only captures the first redirect in a redirect chain, not the final destination URL after multiple redirects. This causes **severe issues** with internal URL extraction, relative path resolution, and all URL-based processing operations.

## Environment
- **crawl4ai version**: 0.7.4
- **Python version**: 3.12
- **Browser**: Chromium (via Playwright)

## Steps to Reproduce

1. Use crawl4ai to fetch a URL that has multiple redirects
2. Check the `redirected_url` attribute in the result
3. Compare with the actual final URL that the browser reaches

### Test Code
```python
import asyncio
from crawl4ai import AsyncWebCrawler

async def test_redirect():
    url = 'https://zhaopin.sgcc.com.cn'
    
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        
        if result:
            print(f"Original URL: {url}")
            print(f"redirected_url: {result.redirected_url}")
            print(f"status_code: {result.status_code}")
            
            # The actual final URL should be: https://zhaopin.sgcc.com.cn/sgcchr/static/home.html
            # But redirected_url only shows: https://zhaopin.sgcc.com.cn/

asyncio.run(test_redirect())
```

## Expected Behavior
The `redirected_url` attribute should contain the final destination URL after all redirects have been followed, which in this case should be:
`https://zhaopin.sgcc.com.cn/sgcchr/static/home.html`

## Actual Behavior
The `redirected_url` attribute only contains the first redirect:
`https://zhaopin.sgcc.com.cn/`

## Redirect Chain Analysis
Using Playwright directly, the complete redirect chain appears to be:
1. `https://zhaopin.sgcc.com.cn` → `https://zhaopin.sgcc.com.cn/` (reported as an HTTP redirect; it may instead be the browser normalizing the empty path to `/`)
2. `https://zhaopin.sgcc.com.cn/` → `https://zhaopin.sgcc.com.cn/sgcchr/static/home.html` (JavaScript redirect or additional HTTP redirect)
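
A chain like the one above can be inspected with Playwright directly. The following is a minimal sketch under stated assumptions, not crawl4ai code: `http_redirect_chain` and `trace_redirects` are hypothetical helper names, and whether a client-side redirect has already completed when `page.url` is read depends on the site (an extra wait may be needed).

```python
def http_redirect_chain(request):
    """Return the URLs of an HTTP redirect chain, oldest first.

    `request` is expected to behave like a Playwright Request: its
    `redirected_from` attribute is the previous request in the chain,
    or None for the first request.
    """
    urls = []
    while request is not None:
        urls.append(request.url)
        request = request.redirected_from
    urls.reverse()
    return urls


async def trace_redirects(url):
    """Hypothetical helper: return (http_chain, final_url) for `url`."""
    # Imported lazily so http_redirect_chain works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            page = await browser.new_page()
            # goto() resolves with the response of the last HTTP redirect.
            response = await page.goto(url, wait_until="networkidle")
            chain = http_redirect_chain(response.request) if response else []
            # page.url is the address the page currently shows, which also
            # reflects navigations triggered by JavaScript.
            return chain, page.url
        finally:
            await browser.close()
```

Running `asyncio.run(trace_redirects("https://zhaopin.sgcc.com.cn"))` returns the HTTP hops and the URL the page ended on. If the two disagree, the last hop was most likely performed on the client side.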

## Impact
This bug affects applications that need to:
- Track the complete redirect chain
- Get the actual final URL after all redirects
- Perform URL-based analysis or caching based on the final destination

## Suggested Solution
Consider adding one or more of the following:
1. A `final_url` attribute that contains the actual final destination
2. A `redirect_chain` attribute that contains the complete list of redirects
3. Update the existing `redirected_url` to contain the final destination instead of just the first redirect
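
To make options 1 and 2 concrete, here is one possible shape for the data. This is only an illustration: `RedirectInfo` and its fields are hypothetical and are not part of the crawl4ai API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RedirectInfo:
    """Hypothetical container for redirect data (not a crawl4ai class)."""

    requested_url: str
    # Every URL visited after the requested one, in order (HTTP and client-side).
    redirect_chain: List[str] = field(default_factory=list)

    @property
    def final_url(self) -> str:
        """Last URL in the chain, or the requested URL if nothing redirected."""
        return self.redirect_chain[-1] if self.redirect_chain else self.requested_url


info = RedirectInfo(
    requested_url="https://zhaopin.sgcc.com.cn",
    redirect_chain=[
        "https://zhaopin.sgcc.com.cn/",
        "https://zhaopin.sgcc.com.cn/sgcchr/static/home.html",
    ],
)
print(info.final_url)
```

Deriving `final_url` from `redirect_chain` keeps the two values from drifting apart, which is one argument for exposing the chain rather than only a single URL.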

## Additional Context
This appears to be a common pattern with many websites that use multiple redirects (HTTP + JavaScript) to reach their final destination. The current implementation only captures the first HTTP redirect but misses subsequent redirects that may be handled by JavaScript or additional HTTP redirects.

## Critical Impact on Internal URL Extraction

This bug has severe consequences for applications that extract and process internal links:

### Problem Analysis
When `redirected_url` only contains the first redirect (`https://zhaopin.sgcc.com.cn/`) instead of the final URL (`https://zhaopin.sgcc.com.cn/sgcchr/static/home.html`), all subsequent URL processing becomes incorrect:

1. **Wrong Base URL for Link Extraction**: Internal link extraction uses the incomplete `redirected_url` as the base URL
2. **Incorrect Relative Path Resolution**: Document-relative links (those without a leading `/`) are resolved against the wrong base URL
3. **Wrong Internal Link Classification**: When a redirect crosses domains, links are classified as internal or external against the wrong domain
4. **Broken URL Normalization**: Downstream URL processing treats an intermediate URL as if it were the final one
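
The effect of the base URL can be shown with the standard library alone. In the sketch below the `href` values and the `example.com` hosts are made up for illustration, and `is_internal` is a simplified same-host check, not crawl4ai's own classifier.

```python
from urllib.parse import urljoin, urlparse

FIRST_HOP = "https://zhaopin.sgcc.com.cn/"
FINAL_URL = "https://zhaopin.sgcc.com.cn/sgcchr/static/home.html"


def is_internal(base_url, href):
    """Simplified check: a link is internal when its host equals the base host."""
    return urlparse(urljoin(base_url, href)).netloc == urlparse(base_url).netloc


# Hypothetical hrefs: two document-relative, one root-relative.
for href in ("list.html", "./img/logo.png", "/sgcchr/static/list.html"):
    print(f"{href!r}")
    print(f"  against first hop: {urljoin(FIRST_HOP, href)}")
    print(f"  against final URL: {urljoin(FINAL_URL, href)}")

# Classification only changes when the redirect crosses hosts (hypothetical hosts).
link = "https://www.example.com/about"
print(is_internal("https://old.example.com/", link))      # base = first hop
print(is_internal("https://www.example.com/app/", link))  # base = final URL
```

The document-relative links resolve to different URLs depending on the base, while the root-relative one resolves identically for both. For this site the host is the same at every hop, so the classification problem would only appear on redirects that change the host.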

### Code Example of the Problem
```python
# Current behavior with crawl4ai (runs inside an async function with an open AsyncWebCrawler)
result = await crawler.arun('https://zhaopin.sgcc.com.cn')
final_url = result.redirected_url  # Only gets: https://zhaopin.sgcc.com.cn/
# But the actual final URL should be: https://zhaopin.sgcc.com.cn/sgcchr/static/home.html

# This causes incorrect internal link extraction.
# extract_internal_links is a placeholder for the application's own link extractor.
internal_links = extract_internal_links(result.html, final_url)  # Uses wrong base URL!
```

### Real-World Impact
- **Web Scraping**: Extracted internal links point to wrong URLs
- **SEO Analysis**: Incorrect internal link structure analysis
- **Content Analysis**: Wrong base URL for relative resource resolution
- **Caching Systems**: Cache keys based on incorrect final URLs
- **Security Analysis**: Wrong domain-based security checks

## Workaround
Currently, developers need to manually track redirects or use additional tools like Playwright to get the complete redirect chain, which defeats the purpose of using crawl4ai for this functionality.

## Additional Test Case
```python
import asyncio
from crawl4ai import AsyncWebCrawler
from urllib.parse import urljoin

async def test_internal_link_extraction():
    url = 'https://zhaopin.sgcc.com.cn'
    
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        
        if result:
            # This is WRONG - redirected_url holds only the first redirect
            base_url = result.redirected_url or url
            print(f"Base URL used for link extraction: {base_url}")
            
            # Example: if HTML contains <a href="some-page"> (document-relative)
            # It gets resolved to: https://zhaopin.sgcc.com.cn/some-page
            # But should be: https://zhaopin.sgcc.com.cn/sgcchr/static/some-page
            # (root-relative links such as "/some-page" resolve the same either way)
            
            # This affects every document-relative URL resolved in the application
            relative_link = "some-page"
            resolved_url = urljoin(base_url, relative_link)
            print(f"Resolved relative link: {resolved_url}")
            print("This URL is likely INCORRECT due to wrong base URL!")

asyncio.run(test_internal_link_extraction())
```

## Priority
This should be considered a **HIGH PRIORITY** bug because:
- It affects core functionality (URL processing)
- It causes silent failures in many use cases
- It breaks fundamental web scraping operations
- The impact is not immediately obvious to developers



### Current Behavior

The `redirected_url` attribute only contains the first redirect:
`https://zhaopin.sgcc.com.cn/`

### Is this reproducible?

Yes

### Inputs Causing the Bug

```bash

```

### Steps to Reproduce

```bash

```

### Code snippets

```python

```

### OS

macOS

### Python version

 3.12

### Browser

Chrome

### Browser version

_No response_

### Error logs & Screenshots (if applicable)

_No response_


[Bug]: Critical Bug: Incomplete Redirect Chain Detection Breaks Internal URL Extraction #1472
