Implement consent forwarding pipeline #380

Draft
ChristianPavilonis wants to merge 6 commits into main from feature/consent-management

@ChristianPavilonis ChristianPavilonis commented Feb 26, 2026

Summary

  • Wire CMP consent signals end-to-end from cookie/header extraction through OpenRTB bid requests, partner integrations, and KV Store persistence so publishers can comply with GDPR and US state privacy laws without additional integration work.
  • Add a configurable [consent] section with jurisdiction detection, per-partner forwarding modes, expiration checking, and GPC-to-US-Privacy construction.
  • Grow test coverage from 431 → 460 tests, with unit tests for every new module.

Changes

OpenRTB integration

| File | Change |
| --- | --- |
| crates/common/src/openrtb.rs | Populate regs/user consent fields with dual-placement (top-level 2.6 + ext for older exchanges); add Eid, Uid, ConsentedProvidersSettings structs |

Configuration & observability

| File | Change |
| --- | --- |
| crates/common/src/consent_config.rs | Full [consent] config section: ConsentConfig, ConsentMode, ConsentForwardingMode, GdprConfig (31 countries), UsStatesConfig (20 states), conflict resolution, expiration checking |
| crates/common/src/consent/jurisdiction.rs | Jurisdiction enum (Gdpr, UsState, NonRegulated, Unknown) + detect_jurisdiction() from geo + config |
| crates/common/src/consent/mod.rs | Pipeline orchestrator: build_consent_context(), ConsentPipelineInput, KV fallback/write, expiration checking, GPC-to-US-Privacy, EID gating |
| crates/common/src/consent/types.rs | TcfConsent helper methods (has_purpose_consent, has_storage_consent, etc.) |
| crates/common/src/settings.rs | Added consent: ConsentConfig field |
| crates/common/src/lib.rs | Module declaration for consent_config |
| crates/common/build.rs | Include consent_config.rs in build inputs |

Partner integrations

| File | Change |
| --- | --- |
| crates/common/src/cookies.rs | Cookie stripping utilities (strip_cookies, forward_cookie_header, CONSENT_COOKIE_NAMES) |
| crates/common/src/integrations/prebid.rs | ConsentForwardingMode support, consent cookie stripping in OpenrtbOnly mode |
| crates/common/src/integrations/lockr.rs | Always strips consent cookies via forward_cookie_header |
| crates/common/src/integrations/aps.rs | ApsGdprConsent struct, consent fields in ApsBidRequest |
| crates/common/src/integrations/adserver_mock.rs | Consent summary in mediation request ext |

KV Store persistence

| File | Change |
| --- | --- |
| crates/common/src/consent/kv.rs | KvConsentEntry and ConsentKvMetadata types, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store API |

Wiring & config

| File | Change |
| --- | --- |
| crates/common/src/auction/endpoints.rs | Wire consent pipeline into /auction endpoint |
| crates/common/src/publisher.rs | Wire consent pipeline with synthetic_id into publisher handler |
| fastly.toml | Added consent_store KV store for local dev |
| trusted-server.toml | Added commented [consent] config section with all options |

Key design decisions

  • Dual-placement OpenRTB fields: consent values placed both at top-level (2.6 spec) and in ext for backward compatibility with older exchanges.
  • Consent cookie stripping: per-partner ConsentForwardingMode controls whether consent travels via OpenRTB body only (OpenrtbOnly strips cookies) or both cookies and body (CookiesAndBody).
  • Write-on-change KV persistence: SHA-256 fingerprint of consent signals avoids redundant KV writes; KV read used as fallback when cookies are absent (e.g., Safari ITP).
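
The cookie-stripping decision above can be sketched roughly as follows. This is a minimal illustration, not the code in cookies.rs: the function name `filter_cookie_header` and its signature are hypothetical, and the cookie-name list contains only the names mentioned in this thread, so the real CONSENT_COOKIE_NAMES may differ.

```rust
/// Consent cookie names used by this sketch (assumed; taken from names
/// mentioned in the review discussion, not from the real constant).
const CONSENT_COOKIE_NAMES: &[&str] = &["euconsent-v2", "__gpp", "__gpp_sid"];

#[derive(Clone, Copy, PartialEq, Eq)]
enum ConsentForwardingMode {
    /// Consent travels in the OpenRTB body only; consent cookies are stripped.
    OpenrtbOnly,
    /// Consent travels in both the cookies and the OpenRTB body.
    CookiesAndBody,
}

/// Hypothetical helper: returns the `Cookie` header value to forward to a
/// partner, or `None` when no cookies remain after filtering.
fn filter_cookie_header(header: &str, mode: ConsentForwardingMode) -> Option<String> {
    let kept: Vec<&str> = header
        .split(';')
        .map(str::trim)
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            // The cookie name is everything before the first '='.
            let name = pair.split('=').next().unwrap_or("").trim();
            mode == ConsentForwardingMode::CookiesAndBody
                || !CONSENT_COOKIE_NAMES.contains(&name)
        })
        .collect();

    if kept.is_empty() {
        None
    } else {
        Some(kept.join("; "))
    }
}
```

In this sketch, `OpenrtbOnly` drops only the listed consent cookies and leaves every other cookie untouched, so a partner that reads consent from the bid request body still receives its own session cookies.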

How to enable

  1. Uncomment the [consent] section in trusted-server.toml
  2. For KV persistence, configure consent_store in fastly.toml (already added for local dev)
  3. Optionally set mode = "proxy" or mode = "interpreter" depending on desired consent processing depth

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --workspace — 460 tests passing
  • npx vitest run — 111 JS tests passing
  • npm run format (js + docs) — clean

Checklist

  • Code compiles without warnings
  • All existing tests pass
  • New tests added for all new modules (29 new tests)
  • No secrets or credentials committed
  • Configuration is opt-in (commented out by default)

Closes #312

@ChristianPavilonis ChristianPavilonis self-assigned this Feb 26, 2026
@ChristianPavilonis ChristianPavilonis marked this pull request as draft February 26, 2026 00:41
Wire consent signals into OpenRTB bid requests, add per-partner
forwarding modes, and persist consent to KV Store for returning users.

Phase 2 - OpenRTB integration: populate regs/user consent fields with
dual-placement (top-level 2.6 + ext), add EID consent gating, AC string
forwarding, and new Eid/Uid/ConsentedProvidersSettings structs.

Phase 3 - Configuration + observability: add [consent] config section
with jurisdiction detection, expiration checking, GPC-to-US-Privacy
construction, and structured logging.

Phase 4 - Partner integrations: cookie stripping via ConsentForwardingMode,
Prebid/Lockr consent cookie filtering, APS consent fields, adserver mock
consent summary.

Phase 5 - KV Store persistence: consent/kv.rs with KvConsentEntry and
ConsentKvMetadata types, SHA-256 fingerprint change detection, read
fallback when cookies absent, write-on-change via Fastly KV Store API.
@ChristianPavilonis

ChristianPavilonis commented Mar 2, 2026

This PR includes only a minimal TCF decoding implementation.
Do we want to build a full implementation as a separate crate?

@ChristianPavilonis ChristianPavilonis marked this pull request as ready for review March 2, 2026 13:59
@ChristianPavilonis

Also, #390 should probably be merged first; otherwise there will be conflicts.

@aram356

aram356 commented Mar 5, 2026

@ChristianPavilonis Please resolve conflicts

@aram356 aram356 left a comment

Summary

Comprehensive consent forwarding pipeline spanning signal extraction, TCF/GPP/USP decoding, jurisdiction detection, KV persistence, and OpenRTB integration. The architecture is clean and well-decomposed. Four blocking issues around edge-case correctness and input validation need attention before merge.

Blocking

🔧 wrench

  • Expired TCF leaks through GPP-embedded fallback: when both standalone TC and GPP-embedded TCF are present and expired, only one is cleared — effective_tcf() still returns the GPP copy (crates/common/src/consent/mod.rs:171)
  • is_empty() misclassifies __gpp_sid-only requests: gpp_section_ids is not checked, so requests with only __gpp_sid=2,6 are treated as empty — skipping logging and KV writes (crates/common/src/consent/types.rs:155)
  • KV Store consent persistence disabled on /auction path: synthetic_id: None causes both try_kv_fallback and try_kv_write to early-return (crates/common/src/auction/endpoints.rs:57)
  • No input length validation on consent strings before decoding: unbounded base64 decode from cookie input could allocate large heap buffers from malicious input (crates/common/src/consent/tcf.rs:57)

Non-blocking

🤔 thinking

  • KV fingerprint ignores gpp_section_ids: SID-only changes are invisible to the fingerprint, skipping KV writes (crates/common/src/consent/kv.rs:178)
  • regs.gdpr = Some(0) false negative in ambiguous cases: a GDPR-jurisdiction user without a TCF cookie gets regs.gdpr=0, signaling "GDPR does not apply" (crates/common/src/integrations/prebid.rs:644)

♻️ refactor

  • apply_tcf_conflict_resolution clones eagerly: both TcfConsent structs are cloned before determining if a conflict exists (crates/common/src/consent/mod.rs:312)
  • now_deciseconds() uses as u64 truncation from u128: safe in practice but avoidable with u64-native arithmetic (crates/common/src/consent/mod.rs:377)

🌱 seedling

  • extract_and_log_consent is declared but never called: public function with zero callers — either document the intent or remove (crates/common/src/consent/mod.rs:203)
  • No end-to-end test for consent → OpenRTB pipeline: individual consent modules are well-tested, but there's no integration test that starts from a Request with consent cookies and asserts the final OpenRtbRequest JSON contains correct regs.gdpr, regs.gpp, user.consent, etc. This cross-module path (endpoints.rs → formats.rs → prebid.rs) is where mismatches are most likely. Consider adding one in a follow-up.
  • gate_eids_by_consent is defined but not wired in: gate_eids_by_consent (mod.rs:430) is implemented and unit-tested but never called in the actual bid request path. The EIDs field is always None in prebid.rs. Document whether this is deferred to a future phase or wire it in.

👍 praise

  • Clean pipeline architecture: the extract → decode → normalize → gate pipeline is well-decomposed. Each stage is independently testable, proxy mode is a clean escape hatch, and the KV fingerprint-based write deduplication is smart. The conflict resolution strategies (Restrictive/Newest/Permissive) and config-driven jurisdiction lists are particularly well thought out.
  • Dual-placement for Prebid compatibility: populating both OpenRTB 2.6 top-level fields and regs.ext.* / user.ext.consent for Prebid compatibility is the right call. The RegsExt mirroring pattern with corresponding tests shows attention to real-world integration needs.

CI Status

  • Analyze (javascript-typescript): PASS

);
ctx.expired = true;

if ctx.tcf.is_some() {

🔧 wrench — Expired TCF leaks through GPP-embedded fallback.

effective_tcf() at line 298 falls back from ctx.tcf to ctx.gpp.eu_tcf. But the if/else if here only clears one of them: when both standalone TC and GPP-embedded TCF are present, clearing ctx.tcf leaves ctx.gpp.eu_tcf accessible through effective_tcf().

Scenario: request has both euconsent-v2 cookie and __gpp with section 2, both expired. After this code runs:

  1. ctx.tcf = None (cleared at line 172)
  2. ctx.gpp.eu_tcf — still Some(expired_tcf) (the else if doesn't fire)
  3. effective_tcf(ctx) now returns the GPP-embedded expired consent

Fix — clear both decoded TCF holders on expiration:

ctx.tcf = None;
if let Some(gpp) = &mut ctx.gpp {
    gpp.eu_tcf = None;
}

Add a test with both sources present and expired to prevent regression.
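
A regression test for this scenario could look roughly like the sketch below. The structs are heavily simplified stand-ins for the real ConsentContext, GppConsent, and TcfConsent (the `created_ds` field and the `clear_expired_tcf` helper are hypothetical); only the fallback order of effective_tcf() and the clear-both fix are taken from this comment.

```rust
// Simplified stand-ins for the real types; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct TcfConsent {
    created_ds: u64,
}

#[derive(Debug, Default)]
struct GppConsent {
    eu_tcf: Option<TcfConsent>,
}

#[derive(Debug, Default)]
struct ConsentContext {
    tcf: Option<TcfConsent>,
    gpp: Option<GppConsent>,
    expired: bool,
}

/// Mirrors the fallback described above: standalone TC first, then the
/// TCF section embedded in GPP.
fn effective_tcf(ctx: &ConsentContext) -> Option<&TcfConsent> {
    ctx.tcf
        .as_ref()
        .or_else(|| ctx.gpp.as_ref().and_then(|g| g.eu_tcf.as_ref()))
}

/// Hypothetical helper: on expiration, clear both decoded TCF holders so
/// neither can be returned by `effective_tcf`.
fn clear_expired_tcf(ctx: &mut ConsentContext) {
    ctx.expired = true;
    ctx.tcf = None;
    if let Some(gpp) = &mut ctx.gpp {
        gpp.eu_tcf = None;
    }
}
```

The assertion that matters is that `effective_tcf` returns `None` after clearing when both sources started out populated, while the rest of the GPP data stays in place.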

impl ConsentContext {
/// Returns `true` when no consent signals are present.
#[must_use]
pub fn is_empty(&self) -> bool {

🔧 wrench — is_empty() ignores gpp_section_ids, causing valid GDPR hints to be treated as empty.

When a request has only __gpp_sid=2,6 (no __gpp string), build_context_from_signals at mod.rs:253 populates gpp_section_ids = Some([2, 6]) and gdpr_applies = true. But this method returns true because it doesn't check gpp_section_ids.

This causes:

  • log_consent_context (mod.rs:545) to skip logging
  • save_consent_to_kv (kv.rs:306) to skip writing

Fix — include gpp_section_ids in the check:

pub fn is_empty(&self) -> bool {
    self.raw_tc_string.is_none()
        && self.raw_gpp_string.is_none()
        && self.gpp_section_ids.is_none()
        && self.raw_us_privacy.is_none()
        && self.raw_ac_string.is_none()
        && self.tcf.is_none()
        && self.gpp.is_none()
        && self.us_privacy.is_none()
        && !self.gpc
}
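
To pin this down, a SID-only test case might be sketched as below. The struct is a reduced stand-in that keeps only the raw-signal fields from the check above (the decoded tcf, gpp, and us_privacy fields are omitted), and the field types are assumptions.

```rust
// Reduced stand-in for the real ConsentContext; field types are assumed.
#[derive(Debug, Default)]
struct ConsentContext {
    raw_tc_string: Option<String>,
    raw_gpp_string: Option<String>,
    gpp_section_ids: Option<Vec<u16>>,
    raw_us_privacy: Option<String>,
    raw_ac_string: Option<String>,
    gpc: bool,
}

impl ConsentContext {
    /// Returns `true` only when no consent signal of any kind is present,
    /// including a bare `__gpp_sid` list.
    #[must_use]
    fn is_empty(&self) -> bool {
        self.raw_tc_string.is_none()
            && self.raw_gpp_string.is_none()
            && self.gpp_section_ids.is_none()
            && self.raw_us_privacy.is_none()
            && self.raw_ac_string.is_none()
            && !self.gpc
    }
}
```

A context built from only `__gpp_sid=2,6` should then report as non-empty, which keeps logging and KV writes active for that request.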

geo: geo.as_ref(),
synthetic_id: None, // Auction requests don't carry a Synthetic ID yet.
});


🔧 wrench — KV fallback/write is effectively disabled because synthetic_id: None.

The synthetic ID is generated later in convert_tsjs_to_auction_request (formats.rs:86), but the consent pipeline needs it here for KV Store operations. Both try_kv_fallback (mod.rs:479) and try_kv_write (mod.rs:501) early-return on None.

Fix — generate the synthetic ID before building consent context:

let cookie_jar = handle_request_cookies(&req)?;
let synthetic_id = get_or_generate_synthetic_id(settings, &req)?;
let geo = GeoInfo::from_request(&req);
let consent_context = consent::build_consent_context(&consent::ConsentPipelineInput {
    jar: cookie_jar.as_ref(),
    req: &req,
    config: &settings.consent,
    geo: geo.as_ref(),
    synthetic_id: Some(&synthetic_id),
});

Then pass the already-generated ID through to convert_tsjs_to_auction_request to avoid regenerating it.
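
Why a missing synthetic ID silently disables persistence can be illustrated with a toy model. This is a sketch under stated assumptions: a `HashMap` stands in for the Fastly KV Store, the stored value is just a fingerprint string, and the function signature is hypothetical rather than the real try_kv_write.

```rust
use std::collections::HashMap;

/// Toy write-on-change helper. Returns `true` only when a write happened.
/// `store` is an in-memory stand-in for the KV Store (assumption).
fn try_kv_write(
    store: &mut HashMap<String, String>,
    synthetic_id: Option<&str>,
    fingerprint: &str,
) -> bool {
    // Without a synthetic ID there is no key to write under, so the
    // function returns early, which is the behavior flagged above.
    let Some(id) = synthetic_id else {
        return false;
    };

    // Skip the write when the stored fingerprint is unchanged.
    if store.get(id).map(String::as_str) == Some(fingerprint) {
        return false;
    }

    store.insert(id.to_owned(), fingerprint.to_owned());
    true
}
```

With `synthetic_id: None` every call is a no-op regardless of the consent signals, so generating the ID before the pipeline runs is what makes both the fallback read and the write reachable.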

///
/// - [`ConsentDecodeError::InvalidTcString`] if base64 decoding fails, the
/// version is not 2, or the bitfield is too short.
pub fn decode_tc_string(tc_string: &str) -> Result<TcfConsent, Report<ConsentDecodeError>> {

🔧 wrench — No length limit on consent strings before base64 decoding.

A malicious euconsent-v2 cookie could contain megabytes of data that gets base64-decoded into a large heap buffer. The same applies to GPP strings. Real TC Strings are typically < 1KB.

Fix — add a max-length check before decoding:

const MAX_TC_STRING_LEN: usize = 4096;

pub fn decode_tc_string(tc_string: &str) -> Result<TcfConsent, Report<ConsentDecodeError>> {
    if tc_string.len() > MAX_TC_STRING_LEN {
        return Err(Report::new(ConsentDecodeError::InvalidTcString {
            reason: format!("TC string too long: {} chars (max {})", tc_string.len(), MAX_TC_STRING_LEN),
        }));
    }
    // ...existing code...

Apply similar limits to decode_gpp_string and decode_us_privacy.
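
One way to share the guard across decoders is a small helper that runs before any decoding work. The sketch below uses a plain error enum instead of the project's Report<ConsentDecodeError>; `check_consent_len` and `DecodeError` are hypothetical names. Note that `str::len` counts bytes, which is the quantity that bounds the decode buffer.

```rust
const MAX_TC_STRING_LEN: usize = 4096;

// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
enum DecodeError {
    Empty,
    TooLong { len: usize, max: usize },
}

/// Rejects empty or oversized input before any base64 decoding happens.
/// Returns the input unchanged when it is within bounds.
fn check_consent_len(raw: &str, max: usize) -> Result<&str, DecodeError> {
    if raw.is_empty() {
        return Err(DecodeError::Empty);
    }
    if raw.len() > max {
        return Err(DecodeError::TooLong { len: raw.len(), max });
    }
    Ok(raw)
}
```

Each decoder could call this with its own limit as its first step, so an oversized cookie is rejected without allocating a decode buffer.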


// Feed each signal into the hash, separated by a sentinel byte to
// prevent ambiguity (e.g., None+Some("x") vs Some("x")+None).
hash_optional(&mut hasher, ctx.raw_tc_string.as_deref());

🤔 thinking — Fingerprint does not include gpp_section_ids.

When only __gpp_sid changes (no __gpp string present), the fingerprint won't change and the KV write is skipped. This compounds with the is_empty() issue (finding 2), but even after fixing is_empty(), SID-only changes would still be missed.

Consider adding normalized section IDs to the hash:

if let Some(ids) = &ctx.gpp_section_ids {
    for id in ids {
        hasher.update(&id.to_le_bytes());
    }
}
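
A fuller sketch of the idea, with presence tags, sentinel bytes, and sorted section IDs, is shown below. To stay dependency-free it swaps SHA-256 for the standard library's `DefaultHasher`, which is not a cryptographic hash and whose output is not guaranteed stable across Rust releases, so treat it purely as a stand-in. The `fingerprint` signature is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Feeds an optional string into the hasher with a presence tag and a
/// trailing sentinel, so None+Some("x") differs from Some("x")+None.
fn hash_optional(hasher: &mut DefaultHasher, value: Option<&str>) {
    match value {
        Some(v) => {
            hasher.write_u8(1);
            hasher.write(v.as_bytes());
        }
        None => hasher.write_u8(0),
    }
    hasher.write_u8(0xFF); // sentinel between fields
}

/// Stand-in fingerprint over a subset of consent signals (assumed subset).
fn fingerprint(tc: Option<&str>, gpp: Option<&str>, section_ids: Option<&[u16]>) -> u64 {
    let mut hasher = DefaultHasher::new();
    hash_optional(&mut hasher, tc);
    hash_optional(&mut hasher, gpp);

    match section_ids {
        Some(ids) => {
            // Sort a copy so "2,6" and "6,2" produce the same fingerprint.
            let mut sorted = ids.to_vec();
            sorted.sort_unstable();
            hasher.write_u8(1);
            for id in sorted {
                hasher.write(&id.to_le_bytes());
            }
        }
        None => hasher.write_u8(0),
    }
    hasher.write_u8(0xFF);

    hasher.finish()
}
```

Sorting before hashing is a design choice: it treats section-ID order as insignificant, which avoids a KV write when a CMP merely reorders the list.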

}

let gdpr = if ctx.gdpr_applies { Some(1) } else { Some(0) };


🤔 thinking — Emitting regs.gdpr = 0 may falsely signal "GDPR does not apply."

Currently gdpr_applies means "has TCF signal", not "user is in GDPR jurisdiction." A German user with only Sec-GPC: 1 (no TCF cookie) gets gdpr_applies=false → regs.gdpr=0, which tells Prebid Server "GDPR does NOT apply." That's a false negative.

Consider:

  • Some(1) when gdpr_applies is true
  • None when gdpr_applies is false (unknown, not a definitive "no")
  • Or use ctx.jurisdiction == Jurisdiction::Gdpr to inform this field

let gdpr = if ctx.gdpr_applies || ctx.jurisdiction == Jurisdiction::Gdpr {
    Some(1) // GDPR applies by TCF signal, or by geo even without one
} else {
    None // Don't assert "not applicable" — leave it ambiguous
};


/// Resolves conflicts between standalone TC and GPP EU TCF consents.
fn apply_tcf_conflict_resolution(ctx: &mut ConsentContext, config: &ConsentConfig) {
let Some(standalone_tcf) = ctx.tcf.clone() else {

♻️ refactor — Both TcfConsent structs are cloned before checking if they actually conflict (line 322 early-return).

TcfConsent contains Vec<u16> vendor lists. The conflict check at line 322 (standalone_allows == gpp_allows) only needs references. Clone only the winner after the decision:

let Some(standalone_tcf) = ctx.tcf.as_ref() else { return; };
let Some(gpp_tcf) = ctx.gpp.as_ref().and_then(|g| g.eu_tcf.as_ref()) else { return; };
// ... check on references ...
if standalone_allows == gpp_allows { return; }
// Only clone the winner
ctx.tcf = Some(if select_gpp { gpp_tcf.clone() } else { standalone_tcf.clone() });
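
A compilable version of this borrow-then-clone pattern might look like the following. It is a sketch for a single "restrictive" strategy only; the `allows_storage` field and the simplified structs are assumptions standing in for the real TcfConsent and its helper methods.

```rust
// Simplified stand-ins; `allows_storage` is a hypothetical summary flag.
#[derive(Debug, Clone, PartialEq)]
struct TcfConsent {
    allows_storage: bool,
    vendors: Vec<u16>,
}

#[derive(Debug, Default)]
struct GppConsent {
    eu_tcf: Option<TcfConsent>,
}

#[derive(Debug, Default)]
struct ConsentContext {
    tcf: Option<TcfConsent>,
    gpp: Option<GppConsent>,
}

/// Restrictive resolution: when the two sources disagree, the one that
/// denies storage wins. Nothing is cloned unless the GPP copy wins.
fn resolve_restrictive(ctx: &mut ConsentContext) {
    let Some(standalone) = ctx.tcf.as_ref() else {
        return;
    };
    let Some(gpp_tcf) = ctx.gpp.as_ref().and_then(|g| g.eu_tcf.as_ref()) else {
        return;
    };

    // No conflict: keep whatever is already in `ctx.tcf`.
    if standalone.allows_storage == gpp_tcf.allows_storage {
        return;
    }

    // Standalone allows but GPP denies, so GPP is the restrictive winner.
    // In the opposite case the standalone value is already in place.
    if standalone.allows_storage {
        let winner = gpp_tcf.clone();
        ctx.tcf = Some(winner);
    }
}
```

Because the comparison runs on references, the vendor list is copied at most once, and only on the path where the standalone value is actually replaced.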

SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_default()
.as_millis() as u64

♻️ refactor — as_millis() returns u128; the as u64 cast silently truncates (safe for centuries, but avoidable).

Clearer alternative that stays in u64 throughout:

let dur = SystemTime::now().duration_since(UNIX_EPOCH).unwrap_or_default();
dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100
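
Splitting the arithmetic into a pure helper makes it easy to check against known durations. In this sketch `to_deciseconds` is a hypothetical name; `now_deciseconds` simply wraps it with the current system time.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Converts a duration to whole deciseconds (tenths of a second),
/// truncating any remainder, using only u64 arithmetic.
fn to_deciseconds(dur: Duration) -> u64 {
    dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100
}

/// Deciseconds since the Unix epoch; falls back to 0 if the system
/// clock reports a time before the epoch.
fn now_deciseconds() -> u64 {
    let dur = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default();
    to_deciseconds(dur)
}
```

For example, a duration of 1,234 ms maps to 12 deciseconds and 99 ms maps to 0, matching what a truncating division of the millisecond count by 100 would give.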

///
/// Use this when you need the raw signals but don't need decoded data.
/// Prefer [`build_consent_context`] for the full pipeline.
pub fn extract_and_log_consent(jar: Option<&CookieJar>, req: &Request) -> RawConsentSignals {

🌱 seedling — extract_and_log_consent is public but has zero callers. If it's planned for a future use case, document the intent. Otherwise consider removing it to avoid dead-code drift.

@ChristianPavilonis ChristianPavilonis marked this pull request as draft March 6, 2026 17:33
Fix expired TCF leaking through GPP fallback by clearing both sources.
Add gpp_section_ids to is_empty() check and KV fingerprint hash.
Generate synthetic ID before consent pipeline in auction endpoint so
KV fallback and write operations work correctly.
Add max-length guard on TC strings before base64 decoding.
Use jurisdiction to inform regs.gdpr instead of falsely emitting 0.
Defer cloning in TCF conflict resolution and remove dead code.
@ChristianPavilonis

Review feedback addressed in 798e4a2

All 9 review comments addressed in a single commit. Here's how each was resolved:

Bug fixes

  1. Expired TCF leaks through GPP fallback (consent/mod.rs:171) — Changed the if/else if to unconditionally clear both ctx.tcf and ctx.gpp.eu_tcf on expiration, so effective_tcf() can no longer fall back to stale GPP-embedded consent.

  2. is_empty() ignores gpp_section_ids (consent/types.rs:155) — Added && self.gpp_section_ids.is_none() to the check so requests with only __gpp_sid are no longer treated as empty.

  3. KV fallback disabled due to synthetic_id: None (auction/endpoints.rs:55) — Moved synthetic ID generation before build_consent_context in the auction endpoint. Updated convert_tsjs_to_auction_request to accept the pre-generated ID as a &str parameter instead of generating its own.

  4. No length limit on TC string before base64 decode (consent/tcf.rs:57) — Added MAX_TC_STRING_LEN = 4096 guard before decoding. US Privacy already validates exact length (4 chars) and GPP delegates to iab_gpp which handles its own parsing, so no changes needed there.

Design concerns

  1. Fingerprint omits gpp_section_ids (consent/kv.rs:178) — Added sorted section IDs to the SHA-256 hash with sentinel byte separators, so SID-only changes now trigger KV writes.

  2. regs.gdpr = Some(0) false negative (prebid.rs:643) — Now uses jurisdiction from ConsentContext: GDPR jurisdiction sets gdpr=1 even without a TCF string; unknown jurisdiction emits None instead of Some(0).

Cleanup

  1. Unnecessary cloning in apply_tcf_conflict_resolution (consent/mod.rs:312) — Switched to working with references for the conflict check. Only clones the GPP winner when selected; early-returns when standalone wins since it's already in ctx.tcf.

  2. as_millis() as u64 truncation (consent/mod.rs:377) — Replaced with dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100 to stay in u64 throughout.

  3. Dead extract_and_log_consent (consent/mod.rs:203) — Removed. Zero callers in the codebase.

Development

Successfully merging this pull request may close these issues.

As publisher I want CMP consent passed to downstream providers

2 participants