Commit 94d4cb1

shane and claude committed
Fix issuer resolution, fractional denominations, search strategy, and confidence scoring
- Fractional denominations (2 1/2 Shillings): fixed mixed-number parsing in parseNumericValue and Unicode vulgar fraction handling in transformValueNumber
- Historical sub-issuer search (South Africa 1896): replaced 4-strategy search with 3-strategy (S3: no-issuer fallback with country in q)
- Alias-resolved issuers now score +20 country points via issuerAliases code comparison; deleted dead calculateMatchConfidence from numista-api.js
- Mandatory Palestine and East Africa Protectorate issuer aliases added and verified
- Parent-aware tie-breaking for same-name issuers (Lesson 31)
- ISSUER_ALIASES exposed to renderer via preload.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent b7fa4c3 commit 94d4cb1

11 files changed

Lines changed: 209 additions & 238 deletions

claude.md

Lines changed: 4 additions & 0 deletions
@@ -98,6 +98,10 @@
 
 20. **Numista issuers have a parent/child hierarchy: always match most specific** - The `/issuers` endpoint returns issuers with `level` (1-5) and `parent` fields. Section-level issuers (lower level) group territories under a country (e.g., "United Kingdom" section includes Falkland Islands, Gibraltar). The specific country issuer has a higher level number. When resolving issuer codes, always prefer the most specific (highest level) match. Using a section-level code causes the search API to return coins from all grouped territories, polluting results with irrelevant coins and pushing the correct coin out of view.
 
+31. **Issuer name collisions require parent-aware tie-breaking** - Two Numista issuers can share the identical name (e.g. both called "East Africa") but exist in completely different parent hierarchies (one standalone, one under "Islamic states"). The old tie-break of "higher level wins" was designed for parent/child pairs within the same hierarchy (e.g. "United Kingdom" section vs. "United Kingdom" country) but silently picks the wrong issuer when the collision crosses hierarchies. Fix: when Dice scores tie, score the parent name against the query: prefer the issuer whose parent is most relevant; prefer parentless issuers over those with a zero-scoring parent. Level number remains the final tiebreaker within the same context. Also add explicit alias entries to `issuer-aliases.json` for known historical issuers whose name doesn't exactly match their Numista entry (e.g. "East Africa Protectorate" → `afrique_de_l_est`, verified 2026-02-21).
+
+32. **Automatic search strategy: issuer + country-in-q is contradictory; no-issuer fallback is the correct last resort** - `searchForMatches()` uses three strategies in sequence. S1: full structured query with `issuer` + `q="value unit"` + `date`; handles the vast majority of coins. S2: same but with alternate denomination forms from `getAlternateSearchForms()`; handles language-variant denominations (e.g., Czech "haléřů" vs English "heller"). S3: no-issuer fallback, where `issuer` is dropped entirely and the country name is moved into `q` (e.g., `q="South Africa 1 shilling"`) while `date` and `category` are kept; handles historical issuer mismatches where a modern country label (e.g., "South Africa" → `afrique_du_sud`) doesn't cover pre-Union sub-issuers. **Year (`date`) must always be a separate param, never in `q`**: Numista type titles don't contain years, so putting year in `q` returns 0 results. **Do not add a strategy that combines issuer + country-in-q**: the issuer param already scopes to the country, so requiring the country name to also appear in the coin's title is contradictory and strictly more restrictive than S1 (always returns a subset of S1's results, usually the same 0). See PROJECT-REFERENCE.md "Automatic Search Strategy" section for full table and rationale.
+
 30. **Never hardcode external API identifiers without live verification: treat guessed values as destructive** - `issuer-aliases.json` was created with Numista issuer codes guessed by naming convention (e.g., "korea-south", "germany-federal-republic"). These codes did not exist in the Numista API at all, causing silent 400 errors on every Korean and German coin search. Two others ("united-states", "united-kingdom") existed but were section-level codes that polluted results with unrelated territories. The file caused more harm than good because half its values were fabricated. Rule: any file that stores external system identifiers (API codes, IDs, keys) must be populated exclusively from live API responses, never by pattern-guessing. When adding a new entry, call the relevant endpoint, inspect the actual response, and record the verified value. Document the verification source (endpoint + date) in a comment. Verified codes as of 2026-02-19: `united-states`→`etats-unis`, `united-kingdom`→`royaume-uni`, `west-germany`→`allemagne`, `east-germany`→`ddr`, `south-korea`→`coree_du_sud`, `north-korea`→`coree_du_nord`.
 
 ### Database & Data Persistence
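
The tie-breaking order described in Lesson 31 could look roughly like the sketch below. It is not the project's `resolveIssuerCode`: `wordDice`, `pickIssuer`, and the `{ code, name, level, parentName }` issuer shape are hypothetical, the similarity is a word-level Dice stand-in for whatever helper the app really uses, and ranking a standalone issuer below an issuer with a relevant parent is an assumption the lesson does not spell out.

```javascript
// Word-level Dice similarity: 2 * |A ∩ B| / (|A| + |B|) over lowercase word sets.
// Assumption: the app's own similarity helper may differ (e.g. character bigrams).
function wordDice(a, b) {
  const words = (s) => new Set(s.toLowerCase().split(/\s+/).filter(Boolean));
  const A = words(a);
  const B = words(b);
  if (A.size + B.size === 0) return 0;
  let shared = 0;
  for (const w of A) if (B.has(w)) shared++;
  return (2 * shared) / (A.size + B.size);
}

// Hypothetical helper. Issuer shape assumed: { code, name, level, parentName },
// with parentName === null for standalone issuers.
function pickIssuer(query, issuers) {
  const ranked = issuers.map((issuer) => ({
    issuer,
    nameScore: wordDice(query, issuer.name),
    // Standalone issuers get a tiny positive rank: above a zero-scoring parent,
    // below any parent that shares a word with the query (assumed ordering).
    parentRank: issuer.parentName === null
      ? Number.MIN_VALUE
      : wordDice(query, issuer.parentName),
  }));
  ranked.sort((x, y) =>
    y.nameScore - x.nameScore ||     // 1. best name match
    y.parentRank - x.parentRank ||   // 2. most relevant parent
    y.issuer.level - x.issuer.level  // 3. most specific (highest) level
  );
  return ranked.length > 0 ? ranked[0].issuer : null;
}

// The collision from Lesson 31: two issuers both named "East Africa".
const eastAfricaCandidates = [
  { code: 'east_africa_islamic', name: 'East Africa', level: 2, parentName: 'Islamic states' },
  { code: 'afrique_de_l_est', name: 'East Africa', level: 1, parentName: null },
];
console.log(pickIssuer('East Africa Protectorate', eastAfricaCandidates).code);
// prints "afrique_de_l_est"
```

With "higher level wins" as the only tie-break, the level-2 entry would be chosen here; checking the parent first is what keeps the standalone issuer on top.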

docs/CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -8,6 +8,19 @@ All notable changes to NumiSync Wizard for OpenNumismat.
 
 ---
 
+## v1.1.3
+
+| Date | Type | Files | Summary |
+|------|------|-------|---------|
+| Feb 22 | Fix | src/renderer/app.js | **Match confidence penalised -20 for fractional denominations (e.g. "2 1/2 Shillings" scores 43% instead of ~63%)**: `parseNumericValue` handled pure ASCII fractions (`"1/2"`) and Unicode fractions (`"2½"`) but not mixed-number notation (`"2 1/2"` or `"21/2"`). `parseFloat("2 1/2")` stops at the space and returns `2`, while Numista's `value.text` of `"2½ Shillings"` correctly parsed to `2.5` via the Unicode path, causing a false mismatch and a -20 point penalty. Fix: added a mixed-number pattern `/^(\d+?)\s*(\d+)\/(\d+)$/` (non-greedy first group) inserted before the pure-fraction check. Handles both `"2 1/2"` (with space) and `"21/2"` (no space) → `2.5`; pure fractions like `"1/2"` still fall through to the existing check. |
+| Feb 22 | Fix | src/modules/default-field-mapping.js | **Value field truncated for fractional denominations (e.g. "2½ Shillings" → "2")**: Numista returns fractional face values using Unicode vulgar fraction characters (e.g. `½` U+00BD) in `value.text`. The `transformValueNumber` regex `/^[\d\/\.]+/` is ASCII-only and stopped at the Unicode character, returning only the integer prefix. Fix: added a `UNICODE_FRACTIONS_TO_ASCII` map and a pre-pass in `transformValueNumber` that detects any Unicode fraction, extracts the integer prefix, and returns catalog-standard ASCII notation (e.g. `"2½ Shillings"` → `"2 1/2"`). All 13 common vulgar fraction characters covered. Notation follows Heritage Auctions / KM catalog convention; matches what OpenNumismat already stores for fractional denominations. |
+| Feb 22 | Fix | src/renderer/app.js, docs/reference/PROJECT-REFERENCE.md, CLAUDE.md | **Historical sub-issuer coins return no matches (South Africa 1896 Shilling)**: Coins labeled with a modern country name (e.g., "South Africa") resolve to a modern Numista issuer code (`afrique_du_sud`) that only covers post-Union coins. Pre-Union issues (e.g., 1896 ZAR Shilling under "South African Republic") live in a completely different Numista issuer hierarchy and returned 0 results from all issuer-constrained searches. Fix: restructured `searchForMatches()` from 4 strategies to 3. Removed "core query" strategy (dead code: `buildCoreQuery` produced the same `q` as S1 whenever `coin.value` was present, and the `coreQuery !== baseParams.q` guard prevented it ever firing) and "minimal query" strategy (contradictory: passing `issuer=afrique_du_sud` while also putting "South Africa" into `q` required the country name to appear in the Numista coin title, which it never does; was strictly more restrictive than S1 and always returned a subset of S1's results). Replaced both with a single no-issuer fallback (new S3): drops `issuer` entirely, moves country name into `q` alongside the denomination (e.g., `q="South Africa 1 shilling"`), keeps `date` and `category`, mirroring how the Numista website's own full-text search finds coins regardless of issuer hierarchy. Documented the 3-strategy structure, reasoning, and removed-strategy rationale in PROJECT-REFERENCE.md and CLAUDE.md (Lesson 32) to prevent future drift. |
+| Feb 22 | Fix | src/renderer/app.js, src/main/preload.js, src/modules/numista-api.js, docs/reference/PROJECT-REFERENCE.md | **Alias-resolved issuers score 0 for country match / dead scoring function removed**: Match confidence scoring was split across two functions: `calculateConfidence` in `app.js` (active, called by `renderMatches()`) and `calculateMatchConfidence` in `numista-api.js` (dead code, never called). Both were maintained in parallel for months, causing fixes to land in the wrong place. Root fix: (1) deleted `calculateMatchConfidence` from `numista-api.js` entirely; (2) exposed `ISSUER_ALIASES` to the renderer by loading `issuer-aliases.json` in `preload.js` and adding `issuerAliases` to `window.stringSimilarity`; (3) updated `calculateConfidence` in `app.js` to fall back to alias-code comparison (`issuerAliases[coinCountry] === match.issuer.code`) when raw string match fails, so "Mandatory Palestine" now correctly scores +20 country points against "British Palestine" (code `palestine`); (4) documented the single-owner architecture in `PROJECT-REFERENCE.md` to prevent future drift. |
+| Feb 22 | Fix | src/data/issuer-aliases.json | **Mandatory Palestine returns no search results**: "Mandatory Palestine" (the official name for the British Mandate territory 1920–1948) had no alias entry. Numista catalogs this issuer as "British Palestine" with code `palestine` (level 2, parent: Israel section). Without an alias, fuzzy matching failed to bridge "Mandatory"↔"British", returning zero results. Added alias family mapping "palestine", "mandatory palestine", "british palestine", "british mandate of palestine", "british mandate palestine", "palestine mandate", and "mandate of palestine" → `palestine`. Code verified via live /issuers API call 2026-02-22. |
+| Feb 21 | Fix | src/data/issuer-aliases.json, src/modules/numista-api.js | **East Africa Protectorate returns no search results**: "East Africa Protectorate" had no alias entry, so fuzzy matching ran against all 11,756 Numista issuers. Two issuers share the identical name "East Africa": `afrique_de_l_est` (level 1, British colonial) and `east_africa_islamic` (level 2, child of "Islamic states"). Both scored 0.6061 Dice; the old tie-breaker (higher level wins) picked `east_africa_islamic`, the wrong issuer, returning zero results. Fix 1: added alias entries for "east africa", "east africa protectorate", "british east africa", and "east africa colony" → `afrique_de_l_est` (verified via live /issuers API call 2026-02-21), bypassing fuzzy matching for known historical names. Fix 2: improved `resolveIssuerCode` tie-breaking to score the parent name against the query: an issuer with a relevant parent (e.g. parent="United Kingdom") is preferred over one with an irrelevant parent (e.g. parent="Islamic states"); parentless standalone issuers beat those with zero-scoring parents. Level number remains the final tiebreaker within the same hierarchy. |
+
+---
+
 ## v1.1.2
 
 | Date | Type | Files | Summary |
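
The two fractional-denomination rows above describe a round trip: Numista's `"2½ Shillings"` is rewritten to `"2 1/2"`, and `"2 1/2"` must then parse to `2.5`. A minimal sketch of both halves follows. The function names `unicodeFractionToAscii` and `parseNumeric` are hypothetical stand-ins for `transformValueNumber` and `parseNumericValue`, the map lists only six of the 13 characters the changelog mentions, and only the mixed-number regex is taken verbatim from the changelog.

```javascript
// Partial map of Unicode vulgar fractions to ASCII (the commit covers 13; six shown here).
const UNICODE_FRACTIONS_TO_ASCII = {
  '\u00BD': '1/2', // ½
  '\u00BC': '1/4', // ¼
  '\u00BE': '3/4', // ¾
  '\u2153': '1/3', // ⅓
  '\u2154': '2/3', // ⅔
  '\u215B': '1/8', // ⅛
};
const FRACTION_CHARS = Object.keys(UNICODE_FRACTIONS_TO_ASCII).join('');
const LEADING_FRACTION = new RegExp('^\\s*(\\d*)\\s*([' + FRACTION_CHARS + '])');

// "2½ Shillings" -> "2 1/2"; "½ Penny" -> "1/2"; null when no Unicode fraction leads the text.
function unicodeFractionToAscii(text) {
  const m = LEADING_FRACTION.exec(text);
  if (!m) return null;
  const ascii = UNICODE_FRACTIONS_TO_ASCII[m[2]];
  return m[1] ? m[1] + ' ' + ascii : ascii;
}

// Parses "2 1/2", "21/2", "1/2", "2.5", or "2" into a number.
function parseNumeric(text) {
  const s = String(text).trim();
  // Mixed-number pattern quoted in the changelog; it must run before the pure-fraction check.
  const mixed = /^(\d+?)\s*(\d+)\/(\d+)$/.exec(s);
  if (mixed) return Number(mixed[1]) + Number(mixed[2]) / Number(mixed[3]);
  const fraction = /^(\d+)\/(\d+)$/.exec(s);
  if (fraction) return Number(fraction[1]) / Number(fraction[2]);
  return parseFloat(s);
}

console.log(parseFloat('2 1/2'));                                        // prints 2 (the original bug)
console.log(parseNumeric(unicodeFractionToAscii('2\u00BD Shillings')));  // prints 2.5
```

A pure fraction such as `"1/2"` cannot satisfy the mixed pattern, because after the first group takes the `1` there is no digit left before the slash, so it falls through to the second check as the changelog states.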

docs/reference/IPC-HANDLERS-QUICK-REF.md

Lines changed: 1 addition & 0 deletions
@@ -185,6 +185,7 @@
 | `open-manual` | *(none)* | `{ success }` | `openUserManual()` (local helper) |
 | `export-log-file` | *(none)* | `{ success, filePath? }` | `logger.js`, Electron dialog |
 | `get-denomination-aliases` | *(none)* | `{ aliasMap, pluralMap, allCanonicalsMap, issuerOverrides, subunitMap }` | `denomination-normalizer.js` |
+| `get-issuer-aliases` | *(none)* | `{ [alias: string]: code }` flat map | `issuer-aliases.json` (built in index.js at startup); sync IPC for preload sandbox |
 
 ---
 
docs/reference/PROJECT-REFERENCE.md

Lines changed: 61 additions & 1 deletion
@@ -160,7 +160,6 @@ numismat-enrichment/
 | `getIssuePricing(typeId, issueId, currency)` | Get pricing for specific issue |
 | `fetchCoinData(typeId, coin, fetchSettings)` | Main orchestration - conditional fetch |
 | `matchIssue(coin, issuesResponse)` | Auto-match logic (year/gregorian_year+mintmark+type) |
-| `calculateMatchConfidence(coin, type)` | Scoring with denomination normalization via `denomination-normalizer.js` (alias + plural/singular) |
 | `getIssuers()` | Fetch and cache full issuer list |
 | `resolveIssuerCode(countryName)` | Resolve country to issuer code (aliases loaded from `issuer-aliases.json`) |
 
@@ -197,6 +196,67 @@ numismat-enrichment/
 
 ---
 
+## Automatic Search Strategy
+
+**Owner: `searchForMatches()` in `src/renderer/app.js`**
+
+All strategies share a single `baseParams` object built by `buildSearchParams(coin)`, which contains:
+- `issuer`: resolved Numista issuer code (e.g., `afrique_du_sud`), absent if resolution fails
+- `q`: denomination string (e.g., `"1 shilling"`), built from structured `value`+`unit` fields; falls back to stripped title only when both are absent
+- `date`: Gregorian year string (e.g., `"1896"`); **never placed in `q`**, because Numista type titles don't contain years and putting the year in `q` returns 0 results
+- `category`: from fetch settings (`coin`, `banknote`, `exonumia`, or absent for all)
+- `page`: always 1 for initial call; pagination handled by `fetchAllSearchPages()`
+
+Strategies fire in sequence; each is skipped if the previous one found results.
+
+| # | `issuer` | `q` | `date` | Purpose |
+|---|----------|-----|--------|---------|
+| S1 | resolved code | `"1 shilling"` | `"1896"` | Exact structured query (the common case) |
+| S2 | resolved code | `"1 haléřů"` (alt form) | `"1896"` | Alternate denomination spelling (e.g., Czech "haléřů" vs English "heller"); issuer kept, only `q` varies |
+| S3 | *(omitted)* | `"South Africa 1 shilling"` | `"1896"` | No-issuer fallback: country name moves into `q`; handles coins whose country label maps to a modern issuer that doesn't cover historical sub-issuers |
+
+### Why this structure
+
+**S1** handles the vast majority of coins. The issuer parameter is the primary precision tool: it constrains results to the correct country without requiring the country name to appear in the Numista coin title (titles are just the denomination, e.g. "1 Shilling", never "South Africa 1 Shilling").
+
+**S2** handles denominations with language variants. When `denomination-aliases.json` has cross-referenced entries (e.g., "heller" ↔ "haléřů"), `getAlternateSearchForms()` returns the alternate forms and S2 retries with each, still keeping the issuer filter for precision.
+
+**S3** handles the historical issuer mismatch problem. Some coins in OpenNumismat are labeled with a modern country name (e.g., "South Africa") that resolves to a modern Numista issuer code (`afrique_du_sud`) that only covers post-Union coins. Pre-Union coins (e.g., 1896 ZAR Shilling) are cataloged under a completely different Numista sub-issuer ("South African Republic"). S1 and S2 both return 0 for these. S3 drops the `issuer` param entirely and puts the country name into `q`, mirroring how the Numista website's own full-text search finds coins regardless of issuer hierarchy. `date` and `category` are retained for precision.
+
+### What was removed and why (do not re-add)
+
+Two strategies and their builder functions were removed in Feb 2026 after analysis showed they were either dead code or architecturally contradictory:
+
+- **"Core query" (removed)**: `buildCoreQuery()` produced `value + normalizedUnit`, identical to what `buildSearchParams()` already produces when `coin.value` is present. The guard `coreQuery !== baseParams.q` prevented it from ever firing. Dead code; deleted.
+
+- **"Minimal query" (removed)**: `buildMinimalQuery()` produced `country + denominationUnit` (no value) and was passed to the API **with the issuer param still set**. This was contradictory: the issuer param already scopes results to the correct country, so adding the country name to `q` required it to appear in the Numista coin title too, which it never does. The combination was strictly more restrictive than S1 and always returned a subset of S1's results (usually 0 when S1 also returned 0). The "country in q" concept was correct but belongs only in S3 where the issuer is absent.
+
+---
+
+## Match Confidence Scoring
+
+**Single owner: `calculateConfidence(coin, match)` in `src/renderer/app.js`**
+
+Match confidence scoring lives entirely in the renderer. The main process has no scoring role.
+
+| Component | Points | Notes |
+|-----------|--------|-------|
+| Title (Dice) | 0–30 | `window.stringSimilarity.diceCoefficient` |
+| Year in range | +25 / −15 | Penalty if coin year outside `min_year`–`max_year` |
+| Country match | +20 | String inclusion OR alias-code match via `window.stringSimilarity.issuerAliases` |
+| Denomination | +25 / −20 | Value + unit match; partial credit when unit unknown |
+| Category | +10 / −10 | Boost for standard circulation; penalty for proof/pattern/specimen |
+
+**Country match logic** (in order of precedence):
+1. Exact or substring string match (`"British Palestine".includes("British Palestine")`)
+2. Alias-code match: `issuerAliases[coinCountry] === match.issuer.code`; handles cases where OpenNumismat country name differs from Numista catalog name (e.g. "Mandatory Palestine" → code `palestine` = `match.issuer.code`)
+
+**`window.stringSimilarity.issuerAliases`** is built in `preload.js` at startup by reading `src/data/issuer-aliases.json` and flattening all alias arrays into a single `alias → code` map. It is exposed via `contextBridge` alongside the denomination utilities.
+
+**Do not add scoring logic to `numista-api.js` or `index.js`**: the renderer cannot call main-process functions synchronously during UI rendering, so any scoring placed there is unreachable from the display path.
+
+---
+
 ## Field Mapping System
 
 **Key Files:**
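
The S1 → S2 → S3 sequence documented in the "Automatic Search Strategy" section can be sketched as a small fallback chain. This is an illustration under stated assumptions, not the body of `searchForMatches()`: `searchWithFallback` is a hypothetical name, `search` is an injected async function that takes a params object and resolves to an array, and `alternateForms` stands in for the output of `getAlternateSearchForms()`.

```javascript
// Hypothetical sketch of the three-strategy fallback.
// baseParams: { issuer?, q, date?, category?, page? }
// search(params): injected async function resolving to an array of matches (assumption).
async function searchWithFallback(baseParams, country, alternateForms, search) {
  // S1: issuer + denomination in q + date as its own parameter.
  let results = await search(baseParams);
  if (results.length > 0) return { strategy: 'S1', results };

  // S2: same issuer and date, only q changes to each alternate denomination form.
  for (const q of alternateForms) {
    results = await search({ ...baseParams, q });
    if (results.length > 0) return { strategy: 'S2', results };
  }

  // S3: drop issuer entirely and move the country name into q.
  // date and category stay as separate params; the year never goes into q.
  const { issuer, ...withoutIssuer } = baseParams;
  results = await search({ ...withoutIssuer, q: `${country} ${baseParams.q}` });
  return { strategy: results.length > 0 ? 'S3' : null, results };
}
```

Note that no branch sends `issuer` together with the country name in `q`, which is the combination the section warns against re-adding.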

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "numisync-wizard",
-  "version": "1.1.2",
+  "version": "1.1.3",
   "description": "NumiSync Wizard - Enrich your OpenNumismat collection with data from Numista",
   "homepage": "https://numisync.com",
   "main": "src/main/index.js",

src/data/issuer-aliases.json

Lines changed: 8 additions & 0 deletions
@@ -33,5 +33,13 @@
   "bohemia-and-moravia": {
     "aliases": ["bohemia and moravia", "protectorate of bohemia and moravia", "böhmen und mähren"],
     "code": "boheme_moravie"
+  },
+  "east-africa": {
+    "aliases": ["east africa", "east africa protectorate", "british east africa", "east africa colony"],
+    "code": "afrique_de_l_est"
+  },
+  "palestine": {
+    "aliases": ["palestine", "mandatory palestine", "british palestine", "british mandate of palestine", "british mandate palestine", "palestine mandate", "mandate of palestine"],
+    "code": "palestine"
   }
 }

src/main/index.js

Lines changed: 17 additions & 0 deletions
@@ -948,6 +948,23 @@ ipcMain.on('get-denomination-aliases', (event) => {
   event.returnValue = { aliasMap: DENOMINATION_ALIASES, pluralMap: DENOMINATION_PLURALS, allCanonicalsMap: ALL_CANONICALS, issuerOverrides: ISSUER_DENOMINATION_OVERRIDES, subunitMap: SUBUNIT_MAP };
 });
 
+// Flat issuer alias map for renderer-side confidence scoring (used by preload.js).
+// Built once at startup from issuer-aliases.json; returned synchronously so preload
+// can expose it via window.stringSimilarity.issuerAliases without requiring fs/path.
+const _issuerAliasRaw = JSON.parse(fs.readFileSync(
+  path.join(__dirname, '..', 'data', 'issuer-aliases.json'), 'utf8'
+));
+const ISSUER_ALIASES_FLAT = {};
+for (const [key, value] of Object.entries(_issuerAliasRaw)) {
+  if (key.startsWith('_')) continue;
+  for (const alias of value.aliases) {
+    ISSUER_ALIASES_FLAT[alias.toLowerCase()] = value.code;
+  }
+}
+ipcMain.on('get-issuer-aliases', (event) => {
+  event.returnValue = ISSUER_ALIASES_FLAT;
+});
+
 // ============================================================================
 // IPC HANDLERS - File Operations
 // ============================================================================
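
To show how the flat map built in the `index.js` hunk above feeds the country-match rule, here is a self-contained sketch. `flattenIssuerAliases` repeats the loop from the diff as a pure function; `countryPoints` is a hypothetical helper, not the project's `calculateConfidence`, and it assumes the coin's country is lowercased before lookup and that string inclusion is checked in both directions.

```javascript
// Flatten { key: { aliases: [...], code } } into { alias: code }, skipping
// metadata keys that start with "_" (same rule as the loop in index.js).
function flattenIssuerAliases(raw) {
  const flat = {};
  for (const [key, value] of Object.entries(raw)) {
    if (key.startsWith('_')) continue;
    for (const alias of value.aliases) {
      flat[alias.toLowerCase()] = value.code;
    }
  }
  return flat;
}

// Hypothetical helper: 20 points when the coin's country matches the candidate's
// issuer, otherwise 0. matchIssuer is assumed to look like { name, code }.
function countryPoints(coinCountry, matchIssuer, issuerAliases) {
  const coin = coinCountry.trim().toLowerCase();
  const name = matchIssuer.name.trim().toLowerCase();
  if (coin === '' || name === '') return 0;
  // 1. Plain string inclusion, checked in both directions (assumption).
  if (name.includes(coin) || coin.includes(name)) return 20;
  // 2. Alias-code comparison for names that differ between the two catalogs.
  const hasAlias = Object.prototype.hasOwnProperty.call(issuerAliases, coin);
  return hasAlias && issuerAliases[coin] === matchIssuer.code ? 20 : 0;
}

const sampleAliases = flattenIssuerAliases({
  _comment: 'metadata keys are skipped',
  palestine: {
    aliases: ['palestine', 'mandatory palestine', 'british palestine'],
    code: 'palestine',
  },
});
console.log(countryPoints('Mandatory Palestine', { name: 'British Palestine', code: 'palestine' }, sampleAliases));
// prints 20
```

Without the alias map, "Mandatory Palestine" and "British Palestine" fail the inclusion test in both directions, which is the 0-point case the commit fixes.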
