chore: Pin models and increase max tokens in e2e/canary tests #1763

Merged
Luca Forstner (lforst) merged 4 commits into main from lforst/more-robust-e2e on Apr 8, 2026

Conversation

@lforst (Member)

Closes #1760

@lforst (Member Author)

Just realized that the e2e tests continue to be relatively flaky even with retry, because retry doesn't apply to beforeAll, where we run the LLM calls.
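For context: per-test retry in runners like Vitest only re-runs the test body, not `beforeAll` hooks, so flaky setup calls need their own retry. A minimal sketch of one workaround, wrapping the setup call itself; the helper name `retryAsync` and the attempt count are illustrative assumptions, not code from this PR:

```typescript
// Hypothetical retry helper for flaky setup steps (e.g. LLM calls made in a
// beforeAll hook). Retries the async function up to `attempts` times.
async function retryAsync<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Usage sketch: a setup call that fails twice before succeeding.
let calls = 0;
const flakySetup = async () => {
  calls += 1;
  if (calls < 3) throw new Error("transient model error");
  return "ok";
};

retryAsync(flakySetup).then((result) => {
  console.log(result); // "ok" after two retried failures
});
```

The drawback, as noted above, is that this has to be applied to each flaky call individually rather than once at the test level.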

Member

I assume we'll do a follow up to add a retry to withScenarioHarness? Then we can remove per assertion retries in favour of just retrying the entire scenario.

# Conflicts:
#	e2e/scenarios/ai-sdk-instrumentation/scenario.impl.mjs
@lforst (Member Author)

> I assume we'll do a follow up to add a retry to withScenarioHarness? Then we can remove per assertion retries in favour of just retrying the entire scenario.

Yeah, I am vibing on a more proper but elaborate fix here where we migrate to more granular testing, putting everything into the tests rather than the beforeAll hook.

A secondary reason is that the tests are starting to take 20 minutes, and I would like to shard them. That only really makes sense if we do the whole scenario shebang in the tests and not the beforeAll hook.
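The harness-level retry suggested above could look roughly like this. A hedged sketch only: the name `withScenarioHarness`, its signature, and the retry count are assumptions for illustration, not the repo's actual API.

```typescript
// Hypothetical scenario harness with built-in retry: each attempt re-runs the
// whole scenario (including its LLM calls), not just the assertions.
async function withScenarioHarness<T>(
  scenario: () => Promise<T>,
  assertions: (result: T) => void,
  retries = 2,
): Promise<void> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const result = await scenario(); // full scenario re-runs on each attempt
      assertions(result);
      return;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Usage sketch: a scenario that fails once, then passes its assertions.
let runs = 0;
const scenario = async () => {
  runs += 1;
  if (runs === 1) throw new Error("flaky LLM call");
  return { spans: 3 };
};

withScenarioHarness(scenario, (r) => {
  if (r.spans < 1) throw new Error("expected at least one span");
}).then(() => console.log(`passed after ${runs} runs`)); // "passed after 2 runs"
```

Running scenarios inside the tests (instead of a shared beforeAll) would also let a runner-level shard option split them across CI jobs, since each test is then self-contained.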

Luca Forstner (lforst) merged commit d040a2e into main Apr 8, 2026
49 checks passed
Luca Forstner (lforst) deleted the lforst/more-robust-e2e branch April 8, 2026 14:05


Development

Successfully merging this pull request may close these issues.

[canary] e2e canary failures

2 participants