fix(webapp): cap AI SDK OTel attribute size so ClickHouse JSON parse doesn't drop spans #3620
0ski wants to merge 1 commit into
Conversation
Walkthrough

This PR implements multi-level OpenTelemetry attribute controls and batch resilience: three env vars (default per-attribute 8KB, AI-content per-attribute 1KB, and total-per-span 32KB) are added and used to build a SpanAttributeLimits object. A new otlpAttributeLimits module provides surrogate-safe truncateAttributes and capAssembledAttributesSize (which drops AI-content keys by priority until the assembled JSON fits). The OTLP exporter is refactored to accept and apply these limits across traces, logs, and metrics. DynamicFlushScheduler detects ClickHouse JSON parse errors and uses bounded recursive split-and-retry (preserving split depth) to isolate bad rows; irrecoverable rows are logged with a sampled 1KB snippet and counted in droppedRows. Tests validate truncation, overrides, and priority-based dropping.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
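The priority-based dropping described in the walkthrough can be sketched as follows. This is illustrative code under assumed names (`capTotalSize`, `dropOrder`), not the actual `capAssembledAttributesSize` implementation:

```typescript
// Sketch: drop AI-content keys in priority order (most droppable first)
// until the serialized attribute JSON fits under the total byte budget.
type Attributes = Record<string, unknown>;

function capTotalSize(
  attrs: Attributes,
  maxBytes: number,
  dropOrder: string[] // keys eligible for dropping, most droppable first
): Attributes {
  const result: Attributes = { ...attrs };
  const byteSize = () =>
    new TextEncoder().encode(JSON.stringify(result)).length;
  for (const key of dropOrder) {
    if (byteSize() <= maxBytes) break;
    // Keys not listed in dropOrder (e.g. cost/token metadata) are never dropped.
    delete result[key];
  }
  return result;
}
```

Note the design choice implied by the walkthrough: metadata keys stay intact even when the span is over budget, so only the bulky AI content is sacrificed.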
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/webapp/app/v3/dynamicFlushScheduler.server.ts`:
- Around line 21-23: At split exhaustion the code currently falls through and
retries unparseable payloads without decrementing totalQueuedItems; update the
batching logic around MAX_SPLIT_DEPTH/splitDepth in
dynamicFlushScheduler.server.ts so that when splitDepth === MAX_SPLIT_DEPTH and
subBatchSize > 1 you treat the leaf like the singleton-drop branch: decrement
totalQueuedItems by subBatchSize, increment failedBatches, log the dropped
sub-batch, and return (i.e., mirror the behavior in the subBatchSize === 1
branch) to avoid leaking queue count and wasted retries; also correct the
MAX_SPLIT_DEPTH comment to state it is the maximum split depth (yielding up to
2^MAX_SPLIT_DEPTH-way splits) rather than claiming it isolates single rows.
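The suggested fix can be sketched as a small decision helper. The names here (`subBatchSize`, `splitDepth`, `totalQueuedItems`, `failedBatches`) mirror the review comment and are assumptions about the scheduler's internals, not the actual code:

```typescript
// Sketch of the review's suggested control flow: a leaf that cannot be
// narrowed further (single row, or max split depth reached) is dropped
// and accounted for, instead of being retried forever.
type SchedulerState = { totalQueuedItems: number; failedBatches: number };

function onParseFailure(
  subBatchSize: number,
  splitDepth: number,
  maxSplitDepth: number,
  state: SchedulerState
): "drop" | "split" {
  if (subBatchSize === 1 || splitDepth >= maxSplitDepth) {
    // Mirror the singleton-drop branch: keep the queue count honest.
    state.totalQueuedItems -= subBatchSize;
    state.failedBatches += 1;
    return "drop";
  }
  return "split"; // halve the batch and retry with splitDepth + 1
}
```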
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 21d321ea-89c2-45ba-869b-3a1788c27904
📒 Files selected for processing (6)
.server-changes/otel-ai-sdk-attribute-truncation.md, apps/webapp/app/env.server.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e-webapp / 🧪 E2E Tests: Webapp
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (13)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
**/*.{ts,tsx}: Import from `@trigger.dev/core` subpaths only, never from the root. Subpath imports must be used to maintain proper module boundaries.
When writing Trigger.dev tasks, always import from `@trigger.dev/sdk`. Never use `@trigger.dev/sdk/v3` or the deprecated `client.defineJob`.
Prisma is version 6.14.0. Use the Prisma client from `internal-packages/database` for all database operations.
For ClickHouse client, schema migrations, and analytics queries, use `internal-packages/clickhouse`.
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Add crumbs as you write code — not just when debugging. Mark lines with `// @crumbs` or wrap blocks in `// #region @crumbs`. They stay on the branch throughout development and are stripped by `agentcrumbs strip` before merge.
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: Access environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env`
Use subpath exports from the `@trigger.dev/core` package instead of importing from the root `@trigger.dev/core` path
Use named constants for sentinel/placeholder values (e.g. `const UNSET_VALUE = '__unset__'`) instead of raw string literals scattered across comparisons
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
apps/webapp/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
apps/webapp/**/*.server.ts: Never use `request.signal` for detecting client disconnects. Use `getRequestAbortSignal()` from `app/services/httpAsyncStorage.server.ts` instead, which is wired directly to Express `res.on('close')` and fires reliably
Access environment variables via the `env` export from `app/env.server.ts`. Never use `process.env` directly
Always use `findFirst` instead of `findUnique` in Prisma queries. `findUnique` has an implicit DataLoader that batches concurrent calls and has active bugs even in Prisma 6.x (uppercase UUIDs returning null, composite key SQL correctness issues, 5-10x worse performance). `findFirst` is never batched and avoids this entire class of issues
Files:
apps/webapp/app/env.server.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpExporter.server.ts
**/*.{ts,tsx,js,jsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Code formatting is enforced using Prettier. Run `pnpm run format` before committing
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
apps/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
When modifying only server components (`apps/webapp/`, `apps/supervisor/`, etc.) with no package changes, add a `.server-changes/` file instead of a changeset. See `.server-changes/README.md` for format and documentation.
Files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
**/*.{test,spec}.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use vitest for all tests in the Trigger.dev repository
Files:
apps/webapp/test/otlpAttributeLimits.test.ts
apps/webapp/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Do not import `env.server.ts` directly or indirectly into test files; instead pass environment-dependent values through options/parameters to make code testable
For testable code, never import `env.server.ts` in test files. Pass configuration as options instead (e.g., `realtimeClient.server.ts` takes config as a constructor arg, `realtimeClientGlobal.server.ts` creates a singleton with env config)
Files:
apps/webapp/test/otlpAttributeLimits.test.ts
**/*.test.{ts,tsx,js}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.test.{ts,tsx,js}: Use vitest for unit testing and run tests with `pnpm run test`
Test files should live beside the files under test with descriptive `describe` and `it` blocks
Tests should avoid mocks or stubs and use helpers from `@internal/testcontainers` when Redis or Postgres are needed
Files:
apps/webapp/test/otlpAttributeLimits.test.ts
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.test.{ts,tsx,js,jsx}: Use vitest exclusively for testing and never mock anything; use testcontainers instead.
Place test files next to source files (e.g., `MyService.ts` -> `MyService.test.ts`).
Files:
apps/webapp/test/otlpAttributeLimits.test.ts
apps/webapp/app/v3/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
In the `apps/webapp/app/v3/` directory, only modify V2 code paths. All new work uses Run Engine 2.0 (`@internal/run-engine`) and redis-worker. The directory name is misleading: most code is actively used by V2, not legacy V1. Refer to `apps/webapp/CLAUDE.md` for the exact list of V1-only legacy files.
Files:
apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
🧠 Learnings (7)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).
Applied to files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.
Applied to files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
📚 Learning: 2026-05-05T09:38:02.512Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3523
File: apps/webapp/app/routes/api.v3.batches.ts:178-181
Timestamp: 2026-05-05T09:38:02.512Z
Learning: When reviewing code that catches `ServiceValidationError` in `*.server.ts` files, do not blindly forward `error.status` to HTTP responses, because SVEs may be thrown with non-default statuses (e.g., 400/500) and forwarding them can cause client-visible behavioral regressions (e.g., surfacing 500s to clients). Prefer a safe default response status of `error.status ?? 422`, but only after confirming via the reachable call graph that the caught `ServiceValidationError` instances are expected to carry those non-default statuses; otherwise, normalize to `422` to avoid unexpected client-visible 5xx behavior.
Applied to files:
apps/webapp/app/env.server.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpExporter.server.ts
📚 Learning: 2026-05-12T21:04:05.815Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3542
File: apps/webapp/app/components/sessions/v1/SessionStatus.tsx:1-3
Timestamp: 2026-05-12T21:04:05.815Z
Learning: In this Remix + TypeScript codebase, do not flag a server/client boundary violation when a file imports only types from a module matching `*.server`.
Specifically, it’s safe to import types using `import type { Foo } from "*.server"` or `import { type Foo } from "*.server"` because TypeScript erases type-only imports at compile time and they emit no JavaScript, so they won’t cross the Remix server/client bundle boundary.
Only raise the boundary concern for value imports (e.g., `import { Foo }` without `type`, or `import Foo`), since those produce JavaScript output.
Applied to files:
apps/webapp/app/env.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
📚 Learning: 2026-05-07T12:25:18.271Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3531
File: apps/webapp/test/sentryTraceContext.server.test.ts:9-47
Timestamp: 2026-05-07T12:25:18.271Z
Learning: In the triggerdotdev/trigger.dev webapp test suite, it is acceptable to leave `createInMemoryTracing()` calls that register a global `NodeTracerProvider` without `afterEach`/`afterAll` teardown. Do not flag this as a test-ordering risk when the code follows the established pattern used across webapp tests (e.g., replication service/benchmark/backfiller tests). This is considered safe because `trace.getActiveSpan()` when called outside a `context.with(...)` block reads `AsyncLocalStorage.getStore()` (undefined when no `run()` scope exists), so it falls back to `ROOT_CONTEXT` with no attached span—regardless of which provider is registered.
Applied to files:
apps/webapp/test/otlpAttributeLimits.test.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.
Applied to files:
apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts
📚 Learning: 2026-05-14T08:21:07.614Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3614
File: apps/webapp/app/v3/mollifier/mollifierGate.server.ts:48-52
Timestamp: 2026-05-14T08:21:07.614Z
Learning: When using Trigger.dev v3 feature flags in the webapp, prefer the existing per-org gating mechanism supported by `flag()` via the `overrides` argument. Pass `Organization.featureFlags` (from `environment.organization.featureFlags`) as the `overrides` value; overrides must take precedence over the global `featureFlag` row. Do not require schema changes or add an `orgId` field to `FlagsOptions` for per-org gating—use the overrides pattern consistently (e.g., in gate flows like `resolveOrgFlag` and any server code that threads `environment.organization.featureFlags` into the gate call).
Applied to files:
apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpExporter.server.ts
🔇 Additional comments (9)
.server-changes/otel-ai-sdk-attribute-truncation.md (1)
- 1-20: LGTM!

apps/webapp/app/env.server.ts (1)
- 501-510: LGTM!

apps/webapp/app/v3/otlpAttributeLimits.ts (1)
- 1-207: LGTM!

apps/webapp/app/v3/otlpExporter.server.ts (5)
- 44-49: LGTM! Also applies to: 60-60, 71-71, 91-91, 112-112
- 259-259: LGTM! Also applies to: 265-281, 302-316
- 364-364: LGTM! Also applies to: 370-386, 409-423
- 486-486: LGTM!
- 1131-1143: LGTM!

apps/webapp/test/otlpAttributeLimits.test.ts (1)
- 1-194: LGTM!
…doesn't drop spans

Vercel AI SDK spans carry tens of KB of prompt/response content per attribute, producing an assembled attributes JSON that ClickHouse rejects with "Cannot parse JSON object" and silently drops the whole batch.

- Add otlpAttributeLimits module with per-key overrides for ai.* / gen_ai.* content keys (tighter 1KB cap) plus a 32KB total-attributes backstop that drops AI content keys in priority order when exceeded.
- Wire SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT / SERVER_OTEL_AI_CONTENT_ATTRIBUTE_VALUE_LENGTH_LIMIT / SERVER_OTEL_SPAN_TOTAL_ATTRIBUTES_LENGTH_LIMIT env vars through the OTLP exporter for spans and logs.
- DynamicFlushScheduler now recognises ClickHouse JSON parse errors and recursively splits the failing batch (up to depth 8, 256-way isolation) to narrow the bad row instead of poisoning the whole 5-10k-row batch. Leaves that can't be salvaged — single rows ClickHouse still rejects, or split-exhausted chunks — are counted in a new droppedRows metric and removed from the queue so totalQueuedItems doesn't leak.
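The split-and-retry described in the commit message can be sketched like this. It is a simplification under assumed names; the real scheduler also handles logging, sampling, and metrics:

```typescript
// Sketch: on a parse error, halve the batch and retry each half with an
// incremented depth. A depth limit of N gives at most 2^N-way isolation.
// Returns the rows that could not be inserted (to count as droppedRows).
async function flushWithSplit<T>(
  rows: T[],
  insert: (rows: T[]) => Promise<void>,
  depth = 0,
  maxDepth = 8
): Promise<T[]> {
  try {
    await insert(rows);
    return [];
  } catch {
    // Irrecoverable leaf: a single row still rejected, or depth exhausted.
    if (rows.length === 1 || depth >= maxDepth) return rows;
    const mid = Math.ceil(rows.length / 2);
    const droppedLeft = await flushWithSplit(rows.slice(0, mid), insert, depth + 1, maxDepth);
    const droppedRight = await flushWithSplit(rows.slice(mid), insert, depth + 1, maxDepth);
    return [...droppedLeft, ...droppedRight];
  }
}
```

With one bad row in a batch, this isolates it in O(depth) retries while every good sibling sub-batch is inserted exactly once.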
e08d939 to adcff4d
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.server-changes/otel-ai-sdk-attribute-truncation.md:
- Around line 19-20: Update the changelog sentence that currently reads "Leaves
that still fail — whether a single row or a split-exhausted chunk — are
logged..." by replacing "Leaves" with "Rows" so it reads "Rows that still fail —
whether a single row or a split-exhausted chunk — are logged..."; locate the
exact phrase "Leaves that still fail" in the
.server-changes/otel-ai-sdk-attribute-truncation.md diff and perform the
single-word correction to fix the wording typo.
- Around line 17-18: Update the changelog text that currently reads "up to 8
split levels / 256-way isolation" to the correct behavior "up to 4 split levels
/ 16-way isolation" so the wording matches the scheduler limits; locate and
replace that exact phrase in the .md content (search for the string "up to 8
split levels / 256-way isolation") and ensure the surrounding sentence remains
grammatically correct.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ecf3c36d-a7a1-47e0-94d0-f723a7bd5162
📒 Files selected for processing (6)
.server-changes/otel-ai-sdk-attribute-truncation.md, apps/webapp/app/env.server.ts, apps/webapp/app/v3/dynamicFlushScheduler.server.ts, apps/webapp/app/v3/otlpAttributeLimits.ts, apps/webapp/app/v3/otlpExporter.server.ts, apps/webapp/test/otlpAttributeLimits.test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- apps/webapp/test/otlpAttributeLimits.test.ts
- apps/webapp/app/v3/dynamicFlushScheduler.server.ts
- apps/webapp/app/env.server.ts
- apps/webapp/app/v3/otlpExporter.server.ts
- apps/webapp/app/v3/otlpAttributeLimits.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: typecheck / typecheck
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-05-14T14:54:39.095Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3545
File: .server-changes/agent-view-sessions.md:10-10
Timestamp: 2026-05-14T14:54:39.095Z
Learning: In the `trigger.dev` repository, do not flag inconsistent dot vs slash notation in route/path strings inside `.server-changes/*.md` files. These markdown files are consumed verbatim into the changelog, so the mixed notation (e.g., `resources.orgs.../runs.$runParam/...`) is intentional and should be preserved as-is.
Applied to files:
.server-changes/otel-ai-sdk-attribute-truncation.md
batch is split in half and retried (up to 8 split levels / 256-way
isolation) instead of failing all 5–10k rows at once. Leaves that
Correct split-depth numbers to match implemented behavior.
Line 17 says up to 8 split levels / 256-way isolation, but this PR’s described behavior is up to 4 split levels / 16-way isolation. Please align this changelog text with the actual scheduler limits to avoid operational confusion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.server-changes/otel-ai-sdk-attribute-truncation.md around lines 17 - 18,
Update the changelog text that currently reads "up to 8 split levels / 256-way
isolation" to the correct behavior "up to 4 split levels / 16-way isolation" so
the wording matches the scheduler limits; locate and replace that exact phrase
in the .md content (search for the string "up to 8 split levels / 256-way
isolation") and ensure the surrounding sentence remains grammatically correct.
still fail — whether a single row or a split-exhausted chunk — are
logged with a 1KB sample, counted in `droppedRows`, and removed from
Fix wording typo in failure-row sentence.
Line 19 should read Rows that still fail ... (not Leaves that still fail ...) for clear changelog wording.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.server-changes/otel-ai-sdk-attribute-truncation.md around lines 19 - 20,
Update the changelog sentence that currently reads "Leaves that still fail —
whether a single row or a split-exhausted chunk — are logged..." by replacing
"Leaves" with "Rows" so it reads "Rows that still fail — whether a single row or
a split-exhausted chunk — are logged..."; locate the exact phrase "Leaves that
still fail" in the .server-changes/otel-ai-sdk-attribute-truncation.md diff and
perform the single-word correction to fix the wording typo.
Tighten OTel span attribute truncation for Vercel AI SDK content keys (`ai.prompt*`, `ai.response.text/object/toolCalls/reasoning*`, `gen_ai.prompt`, `gen_ai.completion`, `gen_ai.request.messages`, `gen_ai.response.text`) to a 1KB per-attribute cap, plus a 32KB per-span backstop that drops these content keys in priority order if the assembled attributes JSON still exceeds it. Cost/token metadata (`ai.usage.*`, `ai.model.*`, `gen_ai.usage.*`, `gen_ai.response.model`, etc.) keeps the default 8KB cap so LLM enrichment continues to work.

Adds a parse-error-aware safety net in `DynamicFlushScheduler`: when ClickHouse rejects a batch with `Cannot parse JSON object here`, the batch is split in half and retried (up to 4 split levels / 16-way isolation) instead of failing all 5–10k rows at once. Singleton rows that still fail are logged with a 1KB sample and dropped so the rest of the queue keeps flowing.
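As a sketch of what "surrogate-safe" truncation means for the per-attribute caps above (illustrative code, not the module's actual implementation): cutting a UTF-16 string mid-way can split an emoji's surrogate pair, and the resulting lone surrogate breaks JSON encoding downstream.

```typescript
// Sketch: truncate to at most maxLength UTF-16 code units without
// leaving an unpaired high surrogate at the cut point.
function truncateSurrogateSafe(value: string, maxLength: number): string {
  if (value.length <= maxLength) return value;
  let end = maxLength;
  const last = value.charCodeAt(end - 1);
  if (last >= 0xd800 && last <= 0xdbff) {
    end -= 1; // cut would split a surrogate pair; back up one unit
  }
  return value.slice(0, end);
}
```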