Summary
When request.domain() returns None (for example for a malformed URL, a data: URI, or a file:// URL), the allowed_domains whitelist check is silently skipped and the request is allowed through.
Location
- File: src/spiders/engine.rs
- Line(s): 270–278
Severity
High
Details
    if !allowed.is_empty() {
        if let Some(domain) = request.domain() {
            if !allowed.contains(&domain) { /* rejected */ }
        }
        // If domain() is None, request is silently ALLOWED!
    }
Because the None case falls through, a spider configured with allowed_domains could fetch local files (file://) or internal network resources whenever the URLs it follows include non-HTTP schemes.
Suggested Fix
Reject requests when the domain cannot be parsed:
    if !allowed.is_empty() {
        match request.domain() {
            Some(domain) if allowed.contains(&domain) => {} // allowed
            _ => {
                stats.lock().await.offsite_requests_count += 1;
                return; // reject unknown or unparseable domains
            }
        }
    }
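The fail-closed rule in the fix above can be isolated into a small pure function, which makes the None case easy to cover with a unit test. The sketch below is only illustrative: is_allowed is a hypothetical helper, not a function in engine.rs, and it assumes the allow-list is a HashSet<String> and that the domain is available as an Option<&str>. The real types in the engine may differ.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of engine.rs): decides whether a request
/// may proceed, given the configured allow-list and the request's domain.
///
/// Assumption: the allow-list is a HashSet<String> and the domain is an
/// Option<&str>; adapt the types to whatever the engine actually uses.
fn is_allowed(allowed: &HashSet<String>, domain: Option<&str>) -> bool {
    // An empty allow-list means no restriction is configured.
    if allowed.is_empty() {
        return true;
    }
    match domain {
        // Known domain: allowed only if it is on the list.
        Some(d) => allowed.contains(d),
        // Unparseable or missing domain: fail closed instead of falling through.
        None => false,
    }
}
```

With this shape, the call site only has to increment offsite_requests_count and return when is_allowed yields false, and the missing-domain case is rejected by the same branch as an off-list domain.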
Automated finding by repo-monitor