[BLOOM] — TRP1 FDE Programme, April 2026 #36

Open
IbnuEyni wants to merge 1 commit into ucbepic:main from IbnuEyni:main

@IbnuEyni

BLOOM Team — Oracle Forge

Programme: TRP1 FDE April 2026
Model: google/gemini-3.1-pro-preview
Overall Score: 19/54 = 35.2% pass@1
Repo: https://github.com/IbnuEyni/oracle-forge

Architecture

Three-layer knowledge-base (KB) context injection (~16 KB of context per query):

  • Layer 1: Schema hints via --use_hints
  • Layer 2: Domain knowledge (join keys, field inventory)
  • Layer 3: Corrections memory (auto-written after every failure)
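
The layering above could be assembled roughly as follows. This is a minimal sketch, not the team's implementation: the `KnowledgeBase` class, its method names, the section headings, and the byte-budget truncation are all hypothetical, and the 16 KB limit is taken only from the "~16 KB per query" figure.

```python
from dataclasses import dataclass, field

# Assumed budget, based only on the "~16 KB per query" figure above.
MAX_CONTEXT_BYTES = 16 * 1024


@dataclass
class KnowledgeBase:
    """Hypothetical container for the three context layers."""

    schema_hints: str = ""        # Layer 1
    domain_knowledge: str = ""    # Layer 2
    corrections: list[str] = field(default_factory=list)  # Layer 3

    def record_failure(self, query: str, note: str) -> None:
        # Layer 3: append a correction note after a failed query.
        self.corrections.append(f"{query}: {note}")

    def build_context(self) -> str:
        # Concatenate the non-empty layers in a fixed order.
        sections = [
            ("Schema hints", self.schema_hints),
            ("Domain knowledge", self.domain_knowledge),
            ("Corrections", "\n".join(self.corrections)),
        ]
        text = "\n\n".join(
            f"## {title}\n{body}" for title, body in sections if body
        )
        # Enforce the byte budget; drop any partial trailing character.
        clipped = text.encode("utf-8")[:MAX_CONTEXT_BYTES]
        return clipped.decode("utf-8", errors="ignore")
```

Truncating from the end means the corrections layer is the first to be cut when the budget is exceeded; a real system might prefer to trim the oldest corrections instead.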

Results

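| Dataset | Passed | Total | Score |
|---|---|---|---|
| bookreview | 3 | 3 | 100% |
| stockindex | 2 | 3 | 66.7% |
| crmarenapro | 6 | 13 | 46.2% |
| yelp | 4 | 7 | 57.1% |
| music_brainz_20k | 1 | 3 | 33.3% |
| googlelocal | 1 | 4 | 25.0% |
| agnews | 1 | 4 | 25.0% |
| stockmarket | 1 | 5 | 20.0% |
| GITHUB_REPOS | 0 | 4 | 0.0% |
| PANCANCER_ATLAS | 0 | 3 | 0.0% |
| PATENTS | 0 | 3 | 0.0% |
| DEPS_DEV_V1 | 0 | 2 | 0.0% |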
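
As a cross-check, the overall score can be recomputed from the per-dataset counts. The sketch below assumes each query counts as a single attempt, so the score is simply passed / total; the `results` dictionary is a transcription of the table, and the variable names are illustrative.

```python
# (passed, total) per dataset, transcribed from the results table.
results = {
    "bookreview": (3, 3),
    "stockindex": (2, 3),
    "crmarenapro": (6, 13),
    "yelp": (4, 7),
    "music_brainz_20k": (1, 3),
    "googlelocal": (1, 4),
    "agnews": (1, 4),
    "stockmarket": (1, 5),
    "GITHUB_REPOS": (0, 4),
    "PANCANCER_ATLAS": (0, 3),
    "PATENTS": (0, 3),
    "DEPS_DEV_V1": (0, 2),
}

passed = sum(p for p, _ in results.values())
total = sum(t for _, t in results.values())
overall_pct = 100 * passed / total

for name, (p, t) in results.items():
    print(f"{name:<18} {p:>2}/{t:<2} {100 * p / t:5.1f}%")
print(f"overall            {passed}/{total} {overall_pct:.1f}%")
```

The totals come out to 19/54, which matches the reported 35.2%.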

IbnuEyni added a commit to IbnuEyni/oracle-forge that referenced this pull request Apr 18, 2026
@shreyashankar
Collaborator

Hi @IbnuEyni — we can't re-validate this submission as-is because the JSON has only aggregate counts (overall, per-dataset {passed, total}). Please re-emit according to the instructions in the README — an array of {"dataset", "query", "run", "answer"} entries, one per run, covering every query across all 12 datasets with at least 5 runs each. Once the file is in I'll verify and post the Pass@1 here.
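
For illustration, a per-run results file of the shape described above might be built and sanity-checked like this. Only the four key names come from the comment; the value types, the `query1` identifier, the placeholder answers, and the `under_covered` helper are assumptions, and the README remains the authority on the exact format.

```python
import json
from collections import defaultdict

REQUIRED_KEYS = {"dataset", "query", "run", "answer"}
MIN_RUNS = 5  # "at least 5 runs each", per the comment above


def under_covered(entries, min_runs=MIN_RUNS):
    """Return (dataset, query) pairs that have fewer than min_runs distinct runs."""
    runs = defaultdict(set)
    for entry in entries:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry missing keys: {sorted(missing)}")
        runs[(entry["dataset"], entry["query"])].add(entry["run"])
    return sorted(key for key, seen in runs.items() if len(seen) < min_runs)


# Hypothetical entries: five runs for one query, a single run for another.
entries = [
    {"dataset": "bookreview", "query": "query1", "run": r, "answer": "placeholder"}
    for r in range(5)
]
entries.append(
    {"dataset": "yelp", "query": "query1", "run": 0, "answer": "placeholder"}
)

payload = json.dumps(entries, indent=2)  # one array, one object per run
print(under_covered(entries))
```

This check only flags queries that appear with too few runs; it cannot detect queries that are absent from the file altogether, so a full check would also need the benchmark's query list.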
