[repo-monitor] Medium: concurrent_requests_per_domain setting silently ignored — no per-domain rate limiting enforced

## Summary
The `Spider` trait exposes a `concurrent_requests_per_domain()` method that users can override to cap simultaneous requests to any single domain. The value is stored in `CrawlStats`, but `CrawlerEngine` never enforces it: all requests contend on a single global `Semaphore` regardless of their target domain. A spider that overrides this method appears to limit per-domain concurrency, but the override has no effect on scheduling.

## Location
- **File**: `src/spiders/engine.rs`
- **Line(s)**: 95 (stored in stats), entire `process_request` / `crawl` loop (no per-domain semaphore)
- **File**: `src/spiders/spider.rs` — Line 17 (trait method declaration)
- **File**: `src/spiders/result.rs` — Line 68 (stored in `CrawlStats` but unused for control flow)

## Severity
**Medium**

## Details
`CrawlerEngine` creates one `global_limiter: Arc<Semaphore>` with capacity `spider.concurrent_requests().max(1)`. The per-domain value from `spider.concurrent_requests_per_domain()` is copied into `CrawlStats` for reporting purposes only — no `HashMap<String, Arc<Semaphore>>` keyed on domain is ever created or consulted.

Consequence: a spider that sets `concurrent_requests_per_domain` to `1` (one-at-a-time per host) can still hammer the same origin with as many parallel requests as `concurrent_requests` allows. This may trigger bans, violate politeness policies, or cause unintended load on the target.

```rust
// engine.rs line 95 — value recorded but never used to throttle
stats.concurrent_requests_per_domain = self.spider.concurrent_requests_per_domain();
```

No code path checks `concurrent_requests_per_domain` before acquiring a semaphore permit.

## Suggested Fix
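Introduce a `HashMap<String, Arc<Semaphore>>` keyed on the request's domain, with each entry created lazily on first encounter. Before dispatching a request, acquire a permit from the global semaphore and, when `concurrent_requests_per_domain > 0`, also from that domain's semaphore. Both permits must stay alive until the fetch completes, so the per-domain permit should be bound outside the `if` block. Example sketch:

```rust
// Shared across tasks; one semaphore per domain, created lazily.
let domain_limiters: Arc<Mutex<HashMap<String, Arc<Semaphore>>>> = ...;

// in process_request:
// Bind the permit outside the `if` so it is held for the whole fetch.
let _domain_permit = if per_domain > 0 {
    let domain = request.domain().unwrap_or_default();
    // The map lock is released at the end of this statement, before awaiting a permit.
    let sem = domain_limiters.lock().await
        .entry(domain)
        .or_insert_with(|| Arc::new(Semaphore::new(per_domain as usize)))
        .clone();
    Some(sem.acquire_owned().await?)
} else {
    None // 0 means no per-domain cap
};
// proceed with fetch; the permit is released when `_domain_permit` is dropped
```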
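
To make the intended behaviour concrete, here is a minimal, std-only sketch of a per-domain cap that can be run without the crawler. It is not the engine's code: `DomainLimiter`, `DomainPermit`, and `peak_in_flight` are hypothetical names, and a blocking `Mutex` plus `Condvar` stands in for the async `Semaphore` the real fix would presumably use. It also clamps a limit of `0` to `1`, whereas the sketch above treats `0` as "no cap".

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical std-only limiter: caps in-flight work per domain.
pub struct DomainLimiter {
    limit: usize,
    in_flight: Mutex<HashMap<String, usize>>,
    released: Condvar,
}

/// RAII guard; dropping it frees one slot for its domain.
pub struct DomainPermit {
    limiter: Arc<DomainLimiter>,
    domain: String,
}

impl DomainLimiter {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self {
            limit: limit.max(1), // assumption: clamp so a zero limit cannot block forever
            in_flight: Mutex::new(HashMap::new()),
            released: Condvar::new(),
        })
    }

    /// Blocks until `domain` has fewer than `limit` requests in flight.
    pub fn acquire(self: &Arc<Self>, domain: &str) -> DomainPermit {
        let mut counts = self.in_flight.lock().unwrap();
        while counts.get(domain).copied().unwrap_or(0) >= self.limit {
            counts = self.released.wait(counts).unwrap();
        }
        *counts.entry(domain.to_string()).or_insert(0) += 1;
        DomainPermit { limiter: Arc::clone(self), domain: domain.to_string() }
    }
}

impl Drop for DomainPermit {
    fn drop(&mut self) {
        let mut counts = self.limiter.in_flight.lock().unwrap();
        if let Some(n) = counts.get_mut(&self.domain) {
            *n -= 1;
        }
        drop(counts); // release the map lock before waking waiters
        self.limiter.released.notify_all();
    }
}

/// Runs `workers` simulated fetches against one domain and returns the peak
/// number that were in flight at the same time.
pub fn peak_in_flight(limit: usize, workers: usize) -> usize {
    let limiter = DomainLimiter::new(limit);
    let current = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let limiter = Arc::clone(&limiter);
            let current = Arc::clone(&current);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                // Held until the end of the closure, i.e. for the whole "fetch".
                let _permit = limiter.acquire("example.com");
                let now = current.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // simulated fetch
                current.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `peak_in_flight(1, 8)` launches eight workers against the same domain and should never observe more than one of them inside the guarded section at once. A regression test for the engine could assert the same property against a mock server once per-domain limiting is wired in.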

---
*Automated finding by repo-monitor*
