Implement consent forwarding pipeline#380
Conversation
Wire consent signals into OpenRTB bid requests, add per-partner forwarding modes, and persist consent to KV Store for returning users. Phase 2 - OpenRTB integration: populate regs/user consent fields with dual-placement (top-level 2.6 + ext), add EID consent gating, AC string forwarding, and new Eid/Uid/ConsentedProvidersSettings structs. Phase 3 - Configuration + observability: add [consent] config section with jurisdiction detection, expiration checking, GPC-to-US-Privacy construction, and structured logging. Phase 4 - Partner integrations: cookie stripping via ConsentForwardingMode, Prebid/Lockr consent cookie filtering, APS consent fields, adserver mock consent summary. Phase 5 - KV Store persistence: consent/kv.rs with KvConsentEntry and ConsentKvMetadata types, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store API.
b4dfdde to
3e8e3c5
Compare
|
This has a minimal TCF Decoding implementation. |
|
Also maybe #390 should be merged first, there will be conflicts. |
|
@ChristianPavilonis Please resolve conflicts |
aram356
left a comment
There was a problem hiding this comment.
Summary
Comprehensive consent forwarding pipeline spanning signal extraction, TCF/GPP/USP decoding, jurisdiction detection, KV persistence, and OpenRTB integration. The architecture is clean and well-decomposed. Four blocking issues around edge-case correctness and input validation need attention before merge.
Blocking
🔧 wrench
- Expired TCF leaks through GPP-embedded fallback: when both standalone TC and GPP-embedded TCF are present and expired, only one is cleared —
effective_tcf()still returns the GPP copy (crates/common/src/consent/mod.rs:171) is_empty()misclassifies__gpp_sid-only requests:gpp_section_idsis not checked, so requests with only__gpp_sid=2,6are treated as empty — skipping logging and KV writes (crates/common/src/consent/types.rs:155)- KV Store consent persistence disabled on
/auctionpath:synthetic_id: Nonecauses bothtry_kv_fallbackandtry_kv_writeto early-return (crates/common/src/auction/endpoints.rs:57) - No input length validation on consent strings before decoding: unbounded base64 decode from cookie input could allocate large heap buffers from malicious input (
crates/common/src/consent/tcf.rs:57)
Non-blocking
🤔 thinking
- KV fingerprint ignores
gpp_section_ids: SID-only changes are invisible to the fingerprint, skipping KV writes (crates/common/src/consent/kv.rs:178) regs.gdpr = Some(0)false negative in ambiguous cases: a GDPR-jurisdiction user without a TCF cookie getsregs.gdpr=0, signaling "GDPR does not apply" (crates/common/src/integrations/prebid.rs:644)
♻️ refactor
apply_tcf_conflict_resolutionclones eagerly: bothTcfConsentstructs are cloned before determining if a conflict exists (crates/common/src/consent/mod.rs:312)now_deciseconds()usesas u64truncation fromu128: safe in practice but avoidable withu64-native arithmetic (crates/common/src/consent/mod.rs:377)
🌱 seedling
extract_and_log_consentis declared but never called: public function with zero callers — either document the intent or remove (crates/common/src/consent/mod.rs:203)- No end-to-end test for consent → OpenRTB pipeline: individual consent modules are well-tested, but there's no integration test that starts from a
Requestwith consent cookies and asserts the finalOpenRtbRequestJSON contains correctregs.gdpr,regs.gpp,user.consent, etc. This cross-module path (endpoints.rs→formats.rs→prebid.rs) is where mismatches are most likely. Consider adding one in a follow-up. gate_eids_by_consentis defined but not wired in:gate_eids_by_consent(mod.rs:430) is implemented and unit-tested but never called in the actual bid request path. The EIDs field is alwaysNonein prebid.rs. Document whether this is deferred to a future phase or wire it in.
👍 praise
- Clean pipeline architecture: the extract → decode → normalize → gate pipeline is well-decomposed. Each stage is independently testable, proxy mode is a clean escape hatch, and the KV fingerprint-based write deduplication is smart. The conflict resolution strategies (Restrictive/Newest/Permissive) and config-driven jurisdiction lists are particularly well thought out.
- Dual-placement for Prebid compatibility: populating both OpenRTB 2.6 top-level fields and
regs.ext.*/user.ext.consentfor Prebid compatibility is the right call. TheRegsExtmirroring pattern with corresponding tests shows attention to real-world integration needs.
CI Status
- Analyze (javascript-typescript): PASS
crates/common/src/consent/mod.rs
Outdated
| ); | ||
| ctx.expired = true; | ||
|
|
||
| if ctx.tcf.is_some() { |
There was a problem hiding this comment.
🔧 wrench — Expired TCF leaks through GPP-embedded fallback.
effective_tcf() at line 298 falls back from ctx.tcf to ctx.gpp.eu_tcf. But the if/else if here only clears one of them: when both standalone TC and GPP-embedded TCF are present, clearing ctx.tcf leaves ctx.gpp.eu_tcf accessible through effective_tcf().
Scenario: request has both euconsent-v2 cookie and __gpp with section 2, both expired. After this code runs:
ctx.tcf = None(cleared at line 172)ctx.gpp.eu_tcf— stillSome(expired_tcf)(theelse ifdoesn't fire)effective_tcf(ctx)now returns the GPP-embedded expired consent
Fix — clear both decoded TCF holders on expiration:
ctx.tcf = None;
if let Some(gpp) = &mut ctx.gpp {
gpp.eu_tcf = None;
}Add a test with both sources present and expired to prevent regression.
| impl ConsentContext { | ||
| /// Returns `true` when no consent signals are present. | ||
| #[must_use] | ||
| pub fn is_empty(&self) -> bool { |
There was a problem hiding this comment.
🔧 wrench — is_empty() ignores gpp_section_ids, causing valid GDPR hints to be treated as empty.
When a request has only __gpp_sid=2,6 (no __gpp string), build_context_from_signals at mod.rs:253 populates gpp_section_ids = Some([2, 6]) and gdpr_applies = true. But this method returns true because it doesn't check gpp_section_ids.
This causes:
log_consent_context(mod.rs:545) to skip loggingsave_consent_to_kv(kv.rs:306) to skip writing
Fix — include gpp_section_ids in the check:
pub fn is_empty(&self) -> bool {
self.raw_tc_string.is_none()
&& self.raw_gpp_string.is_none()
&& self.gpp_section_ids.is_none()
&& self.raw_us_privacy.is_none()
&& self.raw_ac_string.is_none()
&& self.tcf.is_none()
&& self.gpp.is_none()
&& self.us_privacy.is_none()
&& !self.gpc
}| geo: geo.as_ref(), | ||
| synthetic_id: None, // Auction requests don't carry a Synthetic ID yet. | ||
| }); | ||
|
|
There was a problem hiding this comment.
🔧 wrench — KV fallback/write is effectively disabled because synthetic_id: None.
The synthetic ID is generated later in convert_tsjs_to_auction_request (formats.rs:86), but the consent pipeline needs it here for KV Store operations. Both try_kv_fallback (mod.rs:479) and try_kv_write (mod.rs:501) early-return on None.
Fix — generate the synthetic ID before building consent context:
let cookie_jar = handle_request_cookies(&req)?;
let synthetic_id = get_or_generate_synthetic_id(settings, &req)?;
let geo = GeoInfo::from_request(&req);
let consent_context = consent::build_consent_context(&consent::ConsentPipelineInput {
jar: cookie_jar.as_ref(),
req: &req,
config: &settings.consent,
geo: geo.as_ref(),
synthetic_id: Some(&synthetic_id),
});Then pass the already-generated ID through to convert_tsjs_to_auction_request to avoid regenerating it.
| /// | ||
| /// - [`ConsentDecodeError::InvalidTcString`] if base64 decoding fails, the | ||
| /// version is not 2, or the bitfield is too short. | ||
| pub fn decode_tc_string(tc_string: &str) -> Result<TcfConsent, Report<ConsentDecodeError>> { |
There was a problem hiding this comment.
🔧 wrench — No length limit on consent strings before base64 decoding.
A malicious euconsent-v2 cookie could contain megabytes of data that gets base64-decoded into a large heap buffer. The same applies to GPP strings. Real TC Strings are typically < 1KB.
Fix — add a max-length check before decoding:
const MAX_TC_STRING_LEN: usize = 4096;
pub fn decode_tc_string(tc_string: &str) -> Result<TcfConsent, Report<ConsentDecodeError>> {
if tc_string.len() > MAX_TC_STRING_LEN {
return Err(Report::new(ConsentDecodeError::InvalidTcString {
reason: format!("TC string too long: {} chars (max {})", tc_string.len(), MAX_TC_STRING_LEN),
}));
}
// ...existing code...Apply similar limits to decode_gpp_string and decode_us_privacy.
|
|
||
| // Feed each signal into the hash, separated by a sentinel byte to | ||
| // prevent ambiguity (e.g., None+Some("x") vs Some("x")+None). | ||
| hash_optional(&mut hasher, ctx.raw_tc_string.as_deref()); |
There was a problem hiding this comment.
🤔 thinking — Fingerprint does not include gpp_section_ids.
When only __gpp_sid changes (no __gpp string present), the fingerprint won't change and the KV write is skipped. This compounds with the is_empty() issue (finding 2), but even after fixing is_empty(), SID-only changes would still be missed.
Consider adding normalized section IDs to the hash:
if let Some(ids) = &ctx.gpp_section_ids {
for id in ids {
hasher.update(&id.to_le_bytes());
}
}| } | ||
|
|
||
| let gdpr = if ctx.gdpr_applies { Some(1) } else { Some(0) }; | ||
|
|
There was a problem hiding this comment.
🤔 thinking — Emitting regs.gdpr = 0 may falsely signal "GDPR does not apply."
Currently gdpr_applies means "has TCF signal" not "user is in GDPR jurisdiction." A German user with only Sec-GPC: 1 (no TCF cookie) gets gdpr_applies=false → regs.gdpr=0, which tells Prebid Server "GDPR does NOT apply." That's a false negative.
Consider:
Some(1)whengdpr_appliesis trueNonewhengdpr_appliesis false (unknown, not a definitive "no")- Or use
ctx.jurisdiction == Jurisdiction::Gdprto inform this field
let gdpr = if ctx.gdpr_applies {
Some(1)
} else if ctx.jurisdiction == Jurisdiction::Gdpr {
Some(1) // GDPR applies by geo even without TCF signal
} else {
None // Don't assert "not applicable" — leave it ambiguous
};
crates/common/src/consent/mod.rs
Outdated
|
|
||
| /// Resolves conflicts between standalone TC and GPP EU TCF consents. | ||
| fn apply_tcf_conflict_resolution(ctx: &mut ConsentContext, config: &ConsentConfig) { | ||
| let Some(standalone_tcf) = ctx.tcf.clone() else { |
There was a problem hiding this comment.
♻️ refactor — Both TcfConsent structs are cloned before checking if they actually conflict (line 322 early-return).
TcfConsent contains Vec<u16> vendor lists. The conflict check at line 322 (standalone_allows == gpp_allows) only needs references. Clone only the winner after the decision:
let Some(standalone_tcf) = ctx.tcf.as_ref() else { return; };
let Some(gpp_tcf) = ctx.gpp.as_ref().and_then(|g| g.eu_tcf.as_ref()) else { return; };
// ... check on references ...
if standalone_allows == gpp_allows { return; }
// Only clone the winner
ctx.tcf = Some(if select_gpp { gpp_tcf.clone() } else { standalone_tcf.clone() });
crates/common/src/consent/mod.rs
Outdated
| SystemTime::now() | ||
| .duration_since(UNIX_EPOCH) | ||
| .unwrap_or_default() | ||
| .as_millis() as u64 |
There was a problem hiding this comment.
♻️ refactor — as_millis() returns u128; the as u64 cast silently truncates (safe for centuries, but avoidable).
Clearer alternative that stays in u64 throughout:
let dur = SystemTime::now().duration_since(UNIX_EPOCH).unwrap_or_default();
dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100
crates/common/src/consent/mod.rs
Outdated
| /// | ||
| /// Use this when you need the raw signals but don't need decoded data. | ||
| /// Prefer [`build_consent_context`] for the full pipeline. | ||
| pub fn extract_and_log_consent(jar: Option<&CookieJar>, req: &Request) -> RawConsentSignals { |
There was a problem hiding this comment.
🌱 seedling — extract_and_log_consent is public but has zero callers. If it's planned for a future use case, document the intent. Otherwise consider removing it to avoid dead-code drift.
Fix expired TCF leaking through GPP fallback by clearing both sources. Add gpp_section_ids to is_empty() check and KV fingerprint hash. Generate synthetic ID before consent pipeline in auction endpoint so KV fallback and write operations work correctly. Add max-length guard on TC strings before base64 decoding. Use jurisdiction to inform regs.gdpr instead of falsely emitting 0. Defer cloning in TCF conflict resolution and remove dead code.
Review feedback addressed in 798e4a2All 9 review comments addressed in a single commit. Here's how each was resolved: Bug fixes
Design concerns
Cleanup
|
Summary
[consent]section with jurisdiction detection, per-partner forwarding modes, expiration checking, and GPC-to-US-Privacy construction.Changes
OpenRTB integration
crates/common/src/openrtb.rsregs/userconsent fields with dual-placement (top-level 2.6 +extfor older exchanges); addEid,Uid,ConsentedProvidersSettingsstructsConfiguration & observability
crates/common/src/consent_config.rs[consent]config section:ConsentConfig,ConsentMode,ConsentForwardingMode,GdprConfig(31 countries),UsStatesConfig(20 states), conflict resolution, expiration checkingcrates/common/src/consent/jurisdiction.rsJurisdictionenum (Gdpr, UsState, NonRegulated, Unknown) +detect_jurisdiction()from geo + configcrates/common/src/consent/mod.rsbuild_consent_context(),ConsentPipelineInput, KV fallback/write, expiration checking, GPC-to-US-Privacy, EID gatingcrates/common/src/consent/types.rsTcfConsenthelper methods (has_purpose_consent,has_storage_consent, etc.)crates/common/src/settings.rsconsent: ConsentConfigfieldcrates/common/src/lib.rsconsent_configcrates/common/build.rsconsent_config.rsin build inputsPartner integrations
crates/common/src/cookies.rsstrip_cookies,forward_cookie_header,CONSENT_COOKIE_NAMES)crates/common/src/integrations/prebid.rsConsentForwardingModesupport, consent cookie stripping inOpenrtbOnlymodecrates/common/src/integrations/lockr.rsforward_cookie_headercrates/common/src/integrations/aps.rsApsGdprConsentstruct, consent fields inApsBidRequestcrates/common/src/integrations/adserver_mock.rsextKV Store persistence
crates/common/src/consent/kv.rsKvConsentEntryandConsentKvMetadatatypes, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store APIWiring & config
crates/common/src/auction/endpoints.rs/auctionendpointcrates/common/src/publisher.rssynthetic_idinto publisher handlerfastly.tomlconsent_storeKV store for local devtrusted-server.toml[consent]config section with all optionsKey design decisions
extfor backward compatibility with older exchanges.ConsentForwardingModecontrols whether consent travels via OpenRTB body only (OpenrtbOnlystrips cookies) or both cookies and body (CookiesAndBody).How to enable
[consent]section intrusted-server.tomlconsent_storeinfastly.toml(already added for local dev)mode = "proxy"ormode = "interpreter"depending on desired consent processing depthTest plan
cargo fmt --all -- --checkcargo clippy --all-targets --all-features -- -D warningscargo test --workspace— 460 tests passingnpx vitest run— 111 JS tests passingnpm run format(js + docs) — cleanChecklist
Closes #312