I noticed a few annotation inconsistencies in the ground truth files, mostly in page_10.json, which appears to follow CSL-JSON conventions, while the other pages (and the Pydantic schema in dataclass.py) use underscored keys and a different set of type values.
1. Hyphenated vs underscored keys (page_10)
page_10.json uses CSL-JSON hyphenated keys:
"publisher-place": "London",
"container-title": "..."
Pages 2–5 and the Entry model in dataclass.py use underscored keys instead:
"publisher_place": "London",
"container_title": "..."
2. Entry type values (page_10, page_2)
page_10.json uses CSL-JSON type values:
"article-journal" instead of "journal-article" (used in pages 2–5)
"chapter" (not present in the EntryType enum, which defines book, journal-article, other)
page_2.json also uses "review" as a type value, which is not in the EntryType enum either.
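A quick check along these lines could surface the type mismatches automatically. This is a sketch under assumptions: `ALLOWED_TYPES` is a stand-in for the EntryType enum with the three values listed above, and `CSL_TO_SCHEMA` and `check_type` are hypothetical names. Only the "article-journal" mapping is taken from this report; how "chapter" and "review" should be handled is left open, so they are simply reported as invalid.

```python
# Stand-in for the EntryType enum values named in this report (assumption:
# the enum contains exactly these three values).
ALLOWED_TYPES = {"book", "journal-article", "other"}

# The only mapping this report supports; "chapter" and "review" are
# deliberately not mapped because the intended target type is unknown.
CSL_TO_SCHEMA = {"article-journal": "journal-article"}


def check_type(raw_type: str) -> tuple[str, bool]:
    """Return the (possibly mapped) type and whether the schema accepts it."""
    mapped = CSL_TO_SCHEMA.get(raw_type, raw_type)
    return mapped, mapped in ALLOWED_TYPES


for raw in ("article-journal", "chapter", "review"):
    mapped, ok = check_type(raw)
    print(f"{raw!r} -> {mapped!r} ({'valid' if ok else 'not in EntryType'})")
```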
3. Nested entries inside related field (page_10, entry 145)
Entry 145 in page_10.json is a conference proceedings volume whose three sub-entries (146–148) are nested as full objects inside its related field:
{
  "id": "145",
  "related": [
    {"id": "146", "type": "article-journal", "title": "Storia e materialismo storico", ...},
    {"id": "147", "type": "article-journal", "title": "Critica del giudizio storico", ...},
    {"id": "148", "type": "article-journal", "title": "Forza e spirito nella storia", ...}
  ]
}
This makes the ground truth structurally different from what models typically produce (a flat list of entries) and from how related is used in other pages, where it is either absent or contains simple ID references. It also means that automated scoring with get_all_keys / get_nested_value traverses into these nested objects and creates key paths such as entries[5].related[0].title, which have no counterpart in a flat prediction.
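If the nested form were converted to the flat form, the transformation could look roughly like the sketch below. `flatten_related` is a hypothetical helper, and it assumes that every nested object carries an "id" and that the "simple ID references" used on other pages are plain ID strings.

```python
from typing import Any


def flatten_related(entries: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Lift entries nested inside "related" to the top level.

    Each nested object is replaced by its ID in the parent's "related"
    list and appended to the flat list right after the parent.
    Assumes every nested object has an "id" key.
    """
    flat: list[dict[str, Any]] = []
    for entry in entries:
        entry = dict(entry)  # shallow copy so the input is not mutated
        nested: list[dict[str, Any]] = []
        refs: list[Any] = []
        for item in entry.get("related", []):
            if isinstance(item, dict):
                refs.append(item["id"])
                nested.append(item)
            else:
                refs.append(item)  # already a simple reference
        if "related" in entry:
            entry["related"] = refs
        flat.append(entry)
        flat.extend(flatten_related(nested))
    return flat


# Reduced version of entry 145; fields elided in this report are omitted.
ground_truth = [
    {
        "id": "145",
        "related": [
            {"id": "146", "type": "article-journal", "title": "Storia e materialismo storico"},
            {"id": "147", "type": "article-journal", "title": "Critica del giudizio storico"},
            {"id": "148", "type": "article-journal", "title": "Forza e spirito nella storia"},
        ],
    }
]
flat_entries = flatten_related(ground_truth)
```

With this shape, entry 145 keeps only the IDs of its sub-entries, and entries 146–148 sit at the top level where a flat prediction can be compared against them.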