Introduces CachingRocksDatabase, a RocksDatabase subclass that layers a per-instance WeakLRUCache on top of the process-wide VerificationTable. Cache freshness is checked via entry.version (Harper's localTime timestamp) rather than a txnId, so reads hitting a fresh VT slot skip disk entirely. Primary RocksDB stores now use CachingRocksDatabase instead of RocksDatabase.

DatabaseTransaction creates transactions with coordinatedRetry: true, so IsBusy conflicts are signalled as RETRY_NOW_VALUE (immediate retry, no backoff) rather than an ERR_BUSY rejection. The old quadratic-backoff retry handler is removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tests cover:
- primaryStore is a CachingRocksDatabase (and RocksDatabase) instance
- Basic read/write correctness
- VT slot population after two reads enabling the fast-path
- Third read returns value without DB access (VT fast-path)
- Write clears VT slot so stale version cannot verify
- Remove clears cache; subsequent read returns undefined
- Transaction-scoped reads bypass cache and return correct value
- Concurrent writes complete without error via coordinatedRetry
- Write-then-read returns updated value, not stale cache

Tests are skipped when HARPER_STORAGE_ENGINE=lmdb.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
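The "third read without DB access" case boils down to counting real DB reads around a version-keyed cache. A stand-alone sketch of that pattern, using plain Maps as illustrative stand-ins for Harper's store, cache, and verification table (simplified: here the VT slot is populated on the first read, whereas the real flow needs two reads before the slot verifies):

```typescript
// Illustrative model of the VT fast-path; not Harper's real classes.
let dbReads = 0;
const db = new Map<string, { value: string; version: number }>();
const vt = new Map<string, number>(); // verification-table stand-in
const cache = new Map<string, { value: string; version: number }>();

function readDb(id: string) {
	dbReads++;
	return db.get(id);
}

function getSync(id: string): string | undefined {
	const cached = cache.get(id);
	// Fast path: cached version matches the VT slot, so skip the DB.
	if (cached && vt.get(id) === cached.version) return cached.value;
	const entry = readDb(id);
	if (!entry) return undefined;
	cache.set(id, entry);
	vt.set(id, entry.version); // populate the slot for later reads
	return entry.value;
}

db.set('a', { value: 'hello', version: 1 });
getSync('a'); // cold: one DB read, cache + VT populated
getSync('a'); // warm: VT matches, no DB access
```

A test can then assert that a further read still returns the value while `dbReads` has not moved past the initial cold read.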
```ts
getSync(id: any, options?: any): any {
	if (options?.transaction) {
		return super.getSync(id, options);
```
Untested branch (blocker): The if (options?.transaction) bypass is never exercised by the test suite. The test labelled "Read with a transaction bypasses the cache" passes context = {}, which has no transaction property — options?.transaction is falsy and the cache path is taken, not the bypass. Both legs of this production-vs-fallback split need coverage. An actual RocksTransaction (or a Harper DatabaseTransaction-backed context) needs to be passed to verify the bypass reaches super.getSync without touching the cache.
```ts
if (commitResult === RETRY_NOW_VALUE) {
	this.retries++;
	harperLogger.debug?.('coordinated retry', transaction.id, this.retries);
	return this.commit({ transaction });
```
No application-level retry bound: this.retries++ is incremented on each RETRY_NOW_VALUE but never checked against a limit. The old ERR_BUSY handler had MAX_RETRIES = 40 (now dead code at line 23) that threw ServerError after 40 attempts. If @harperfast/rocksdb-js guarantees it will eventually stop returning RETRY_NOW_VALUE under any contention scenario, please document that guarantee. Otherwise, reinstate an application-level cap so sustained conflicts don't produce unbounded recursion.
Review: Record caching (PR #410)

1. Benchmarks six scenarios with 2,000 records (~80-byte values each):
1. getSync cold — first read, cache miss (both go to DB)
2. getSync soft VT miss — 2nd read: cache warm, expectedVersion passed,
DB read happens but slot is populated + FRESH returned
3. getSync VT fast-path — 3rd+ read: VT slot matches, no DB access
4. getSync hot key — repeated single-key reads
5. get (async) VT hit — async path with warm cache
6. putSync — write throughput (cache invalidation cost)
Each scenario reports ops/sec for plain RocksDatabase and CachingRocksDatabase
with a speedup ratio, printed as a formatted table.
Adds `npm run bench` script. Also fixes `test:unit:resources` to exclude
*.bench.js files so they don't run as part of the normal test suite.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
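Each scenario's measurement reduces to a timed loop plus a ratio. A hedged sketch of that shape (the `Store` interface and the row formatting are illustrative stand-ins, not the real benchmark file):

```typescript
// Minimal ops/sec harness in the spirit of the benchmark above.
// `Store` is an illustrative interface, not Harper's classes.
interface Store { getSync(id: number): unknown }

function opsPerSec(store: Store, iterations = 2000): number {
	const start = process.hrtime.bigint();
	for (let i = 0; i < iterations; i++) store.getSync(i % 2000);
	const elapsedSec = Number(process.hrtime.bigint() - start) / 1e9;
	return iterations / elapsedSec;
}

// One row of the formatted table: plain vs. caching plus speedup ratio.
function row(label: string, plain: Store, caching: Store): string {
	const p = opsPerSec(plain);
	const c = opsPerSec(caching);
	return `${label.padEnd(24)} ${p.toFixed(0).padStart(12)} ${c.toFixed(0).padStart(12)} ${(c / p).toFixed(2)}x`;
}
```

Printing one `row(...)` per scenario yields the formatted comparison table the commit describes.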
…integration

Renames CachingRocksDatabase to PrimaryRocksDatabase and moves all RocksDB-specific entry handling (previously done by handleLocalTimeForGets instance patches) into real class methods. Caching now operates at the getEntry level using full Entry objects with `version`, using verifyVersion/populateVersion for VT-based freshness checks rather than expectedVersion. handleLocalTimeForGets delegates to initStore() for PrimaryRocksDatabase instances via the isPrimaryRocksDatabase marker.

All 10 new tests and 1720 existing tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rsion) call

The rocksdb-js store.getSync/get natively accept expectedVersion: on a VT hit they return FRESH_VERSION_FLAG without touching disk; on a VT miss they read from the DB and auto-populate the VT slot. Replacing the two-step verifyVersion+populateVersion pattern with a single native call removes one round-trip per read on the warm cache path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
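A hedged sketch of the collapsed read path. The option name (expectedVersion) and flag come from the commit message above; both are modeled here with local stand-ins, not the real @harperfast/rocksdb-js API:

```typescript
// Stand-in for the native constant; the real flag lives in rocksdb-js.
const FRESH_VERSION_FLAG = Symbol('fresh');

interface Entry { value: unknown; version: number }

interface NativeStore {
	getSync(id: unknown, options?: { expectedVersion?: number }): unknown;
}

// One native call replaces verifyVersion + populateVersion: a VT hit
// returns the flag with no disk read; a miss reads the DB (and, in the
// real store, repopulates the VT slot as a side effect).
function getEntryValue(store: NativeStore, cache: Map<unknown, Entry>, id: unknown): unknown {
	const cached = cache.get(id);
	if (cached) {
		const result = store.getSync(id, { expectedVersion: cached.version });
		if (result === FRESH_VERSION_FLAG) return cached.value;
		return result;
	}
	return store.getSync(id); // cold read, no version to verify
}
```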
…nchmark

- PrimaryRocksDatabase.getEntry now uses populateVersion option for cold reads (single native call, no separate populateVersion()) and merges options when passing expectedVersion so transaction snapshot is preserved
- handleLocalTimeForGets: remove all isRocksDB branches — RocksDB stores now use the isPrimaryRocksDatabase early-return path exclusively; the remaining code is LMDB-only, so isRocksDB is always false there
- Benchmark: use RecordEncoder + recordUpdater with Date.now()-range versions so values carry encoded version bytes and caching works correctly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```ts
if (commitResult === RETRY_NOW_VALUE) {
	this.retries++;
	harperLogger.debug?.('coordinated retry', transaction.id, this.retries);
	return this.commit({ transaction });
```
No retry cap on the coordinatedRetry path.
MAX_RETRIES = 40 is still defined at the top of the file but is now unreachable dead code — nothing checks it. If RocksDB keeps returning RETRY_NOW_VALUE (sustained write contention, edge-case in the native layer, etc.) this recurses without bound until stack exhaustion.
Suggested replacement (reinstating the cap around `return this.commit({ transaction });`):

```ts
if (commitResult === RETRY_NOW_VALUE) {
	this.retries++;
	if (this.retries > MAX_RETRIES) {
		throw new ServerError(
			`After ${MAX_RETRIES} retries, unable to commit transaction, transaction is in conflict with ongoing writes`
		);
	}
	harperLogger.debug?.('coordinated retry', transaction.id, this.retries);
	return this.commit({ transaction });
}
```
```diff
@@ -196,6 +197,7 @@
   "jsonata": "1.8.7",
   "jsonwebtoken": "9.0.3",
   "lmdb": "3.5.4",
+  "weak-lru-cache": "^1.2.2",
```
weak-lru-cache is not documented in dependencies.md.
Per this repo's own policy, every new runtime dependency requires an entry in dependencies.md answering the standard checklist (size, security track record, overlap, memory cost, removal plan, etc.). Please add the entry before merging.
Review: Record caching (PR #410)

2 blockers found.

1. No retry cap on the coordinatedRetry path.
```ts
attributes: [
	{ name: 'id', isPrimaryKey: true },
	{ name: 'embedding', indexed: { type: 'HNSW' }, type: 'Array' },
],
```
Blocker — missing worker script
vectorIndex-thread.js does not exist anywhere in this repository (confirmed with git log and directory listing). Every Worker instantiation here will throw ENOENT immediately, causing the entire before() hook to fail and the test suite to error out without running.
Please add the missing worker script (it should live at unitTests/resources/vectorIndex-thread.js) that receives the { type: 'insert', start, count, dims } message and posts back a result/error reply.
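A hedged sketch of what that worker could look like, shown in TypeScript for consistency with the other snippets here (the real file would be plain JS at the path above). Everything in it is hypothetical: the real script would insert the records into the Harper table, whereas this stub only builds them and exercises the message protocol the comment describes:

```typescript
// Hypothetical sketch of unitTests/resources/vectorIndex-thread.js.
import { parentPort, isMainThread } from 'node:worker_threads';

interface InsertMessage { type: 'insert'; start: number; count: number; dims: number }

// Deterministic stand-in embeddings; the real script would build
// records and await the table insert for each one.
function buildRecords({ start, count, dims }: InsertMessage) {
	return Array.from({ length: count }, (_, i) => ({
		id: start + i,
		embedding: Array.from({ length: dims }, (_, d) => ((start + i + d) % 7) / 7),
	}));
}

if (!isMainThread && parentPort) {
	parentPort.on('message', (msg: InsertMessage) => {
		if (msg.type !== 'insert') return;
		try {
			const records = buildRecords(msg);
			// Real script: insert `records` into the table here.
			parentPort?.postMessage({ type: 'result', inserted: records.length });
		} catch (error) {
			parentPort?.postMessage({ type: 'error', message: String(error) });
		}
	});
}
```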
```diff
  "jsonata": "1.8.7",
  "jsonwebtoken": "9.0.3",
  "lmdb": "3.5.4",
+ "weak-lru-cache": "^1.2.2",
```
Blocker — runtime dependency not documented in dependencies.md
weak-lru-cache is a new runtime dependency but has no entry in dependencies.md. Per repo policy, every runtime dependency addition must answer the questions listed there (size, security track record, overlap with existing packages, memory cost, removal plan, etc.) so the full engineering team can review the decision alongside the code.
Please add an entry for weak-lru-cache to dependencies.md before merging.
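A possible starting point for that entry, sketched against the checklist the comment lists. The repo's actual dependencies.md format may differ, and the package details below are hedged claims to verify (size, advisories, version) rather than confirmed facts:

```markdown
## weak-lru-cache

- **What it does:** LRU cache whose entries can also be held weakly, so the GC can
  reclaim values under memory pressure while recently used entries stay pinned.
- **Why added:** backs the per-instance record cache in PrimaryRocksDatabase (PR #410).
- **Size / dependencies:** small single-purpose package; verify the exact unpacked
  size and dependency tree on npm before merging.
- **Security track record:** re-check npm audit / advisories at merge time.
- **Overlap:** no existing dependency provides a weak-referencing LRU.
- **Memory cost:** bounded by the LRU size; weakly held entries are GC-reclaimable.
- **Removal plan:** swap for a plain Map-based LRU if weak-reference behavior proves
  unnecessary.
```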
```diff
 import { setTimeout as delay } from 'node:timers/promises';
-import { Transaction as RocksTransaction, type Store as RocksStore } from '@harperfast/rocksdb-js';
+import { Transaction as RocksTransaction, type Store as RocksStore, constants } from '@harperfast/rocksdb-js';
+const RETRY_NOW_VALUE = constants.RETRY_NOW_VALUE;
```
Nit — MAX_RETRIES (line 23) is now dead code
The constant MAX_RETRIES = 40 is still defined but no longer used — the ERR_BUSY backoff guard that consumed it was removed when switching to coordinatedRetry. Consider removing it, or reinstate it as a cap inside the new RETRY_NOW_VALUE branch to preserve the JS-side safety net if the native layer ever loops unexpectedly.
Review: Record caching (PR #410)

What I traced:
1. The commit introducing CachingRocksDatabase.