
feat(kvp): add store trait and Hyper-V KVP storage crate #288

Open
peytonr18 wants to merge 10 commits into Azure:main from
peytonr18:probertson/kvp-store-trait

Conversation


@peytonr18 peytonr18 commented Mar 9, 2026

What this PR introduces

  • Adds workspace crate libazureinit-kvp.
  • Adds libazureinit-kvp/src/lib.rs and libazureinit-kvp/src/kvp_pool.rs.
  • Adds crate dependencies: libc, serde, serde_json, sysinfo.

KVP design

  • Public trait: KvpStore (pure interface; no default method implementations).
  • Concrete implementation: KvpPoolStore (file-backed KVP pool).
  • Error type: KvpError.

KvpStore API:

  • max_key_size() -> usize
  • max_value_size() -> usize
  • insert(key, value) -> Result<(), KvpError>
  • append(key, value) -> Result<(), KvpError>
  • read(key) -> Result<Option<String>, KvpError>
  • delete(key) -> Result<bool, KvpError>
  • clear() -> Result<(), KvpError>
  • entries() -> Result<HashMap<String, String>, KvpError>
  • len() -> Result<usize, KvpError>
  • is_empty() -> Result<bool, KvpError>
  • is_stale() -> Result<bool, KvpError>
  • dump(path) -> Result<(), KvpError>
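The API above can be sketched as a Rust trait. This is a minimal illustration pairing a subset of the listed methods with a toy in-memory backend; the real `KvpPoolStore` is file-backed, the real trait covers all the methods listed, and (per the quoted diff later in this thread) its methods take `&self` rather than `&mut self`:

```rust
use std::collections::HashMap;

// Stand-in error type; the real KvpError variants are listed under
// "Error model" below.
#[derive(Debug)]
pub enum KvpError {
    KeyTooLarge { max: usize, actual: usize },
}

// Subset of the KvpStore surface described above (receivers simplified
// to &mut self for this in-memory sketch).
pub trait KvpStore {
    fn max_key_size(&self) -> usize;
    fn insert(&mut self, key: &str, value: &str) -> Result<(), KvpError>;
    fn read(&self, key: &str) -> Result<Option<String>, KvpError>;
    fn delete(&mut self, key: &str) -> Result<bool, KvpError>;
    fn len(&self) -> Result<usize, KvpError>;
}

// Toy in-memory store for illustration only.
pub struct MemStore {
    map: HashMap<String, String>,
}

impl KvpStore for MemStore {
    fn max_key_size(&self) -> usize {
        512
    }
    fn insert(&mut self, key: &str, value: &str) -> Result<(), KvpError> {
        if key.len() > self.max_key_size() {
            return Err(KvpError::KeyTooLarge {
                max: self.max_key_size(),
                actual: key.len(),
            });
        }
        self.map.insert(key.to_string(), value.to_string());
        Ok(())
    }
    fn read(&self, key: &str) -> Result<Option<String>, KvpError> {
        Ok(self.map.get(key).cloned())
    }
    fn delete(&mut self, key: &str) -> Result<bool, KvpError> {
        Ok(self.map.remove(key).is_some())
    }
    fn len(&self) -> Result<usize, KvpError> {
        Ok(self.map.len())
    }
}
```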

On-disk format and limits

  • Hyper-V wire format: fixed-size records.
    • key field: 512 bytes
    • value field: 2048 bytes
    • record size: 2560 bytes
  • Maximum unique keys: 1024.
  • PoolMode controls write limits:
    • Restricted: key <= 254 bytes, value <= 1022 bytes
    • Full: key <= 512 bytes, value <= 2048 bytes
  • Reads accept keys up to 512 bytes in both modes.
  • Default path when path == None: /var/lib/hyperv/.kvp_pool_1.
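The fixed-size record layout above (512-byte key field plus 2048-byte value field, zero-padded to 2560 bytes) can be sketched with a round-trip encoder/decoder. This is an illustration of the stated sizes, not the crate's actual encoding code:

```rust
// Sizes taken from the wire-format description above.
const KEY_SIZE: usize = 512;
const VALUE_SIZE: usize = 2048;
const RECORD_SIZE: usize = KEY_SIZE + VALUE_SIZE; // 2560 bytes

// Zero-pad key and value into one fixed-size record.
fn encode_record(key: &str, value: &str) -> Vec<u8> {
    assert!(key.len() <= KEY_SIZE && value.len() <= VALUE_SIZE);
    let mut rec = vec![0u8; RECORD_SIZE];
    rec[..key.len()].copy_from_slice(key.as_bytes());
    rec[KEY_SIZE..KEY_SIZE + value.len()].copy_from_slice(value.as_bytes());
    rec
}

// Fields are NUL-padded; take bytes up to the first zero.
fn decode_field(buf: &[u8]) -> String {
    let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    String::from_utf8_lossy(&buf[..end]).into_owned()
}

fn decode_record(rec: &[u8]) -> (String, String) {
    (
        decode_field(&rec[..KEY_SIZE]),
        decode_field(&rec[KEY_SIZE..RECORD_SIZE]),
    )
}
```

With fixed-size records, record N always starts at byte offset N * 2560, which is what makes the in-place overwrite and swap-with-last delete described below cheap.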

Locking change (flock -> fcntl)

  • Replaces fs2 locks with libc::fcntl locks in libazureinit-kvp.
  • New lock behavior:
    • shared read lock: fcntl(F_OFD_SETLKW, F_RDLCK)
    • exclusive write lock: fcntl(F_OFD_SETLKW, F_WRLCK)
    • unlock: fcntl(F_OFD_SETLK, F_UNLCK)
  • Why: Linux flock and fcntl locks are separate lock domains and do not coordinate with each other.
  • Result: the KVP store now coordinates correctly with Linux components using fcntl-style locks (for example hv_kvp_daemon/cloud-init workflows).

File I/O behavior

  • Uses regular file I/O (OpenOptions, read_exact, write_all, set_len, seek, flush).
  • open_for_read_write_create() explicitly sets .truncate(false) to avoid implicit truncation on open.
  • insert flow:
    1. open file read+write+create,
    2. take exclusive lock,
    3. iterate records, collecting unique keys into a set: if key found, overwrite value in-place; else check unique-key cap and append new record at EOF,
    4. flush,
    5. unlock (on iterator drop).
  • append flow:
    1. open file read+write+create,
    2. take exclusive lock,
    3. seek to EOF and write a new record (no duplicate check, no unique-key cap),
    4. flush,
    5. unlock (on iterator drop).
  • delete flow:
    1. open file read+write,
    2. take exclusive lock,
    3. iterate records: if key found, swap with last record and truncate,
    4. flush,
    5. unlock (on iterator drop).
  • read/entries flow:
    1. open file read-only,
    2. take shared lock,
    3. read/decode all records,
    4. unlock (on iterator drop).
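The insert flow above can be sketched as a scan over fixed-size records, with locking elided (the real crate takes the exclusive fcntl lock before scanning). Function and path names here are illustrative:

```rust
use std::fs::OpenOptions;
use std::io::{Read, Seek, SeekFrom, Write};

const KEY_SIZE: usize = 512;
const RECORD_SIZE: usize = 2560;

// Overwrite the value in place if the key exists; otherwise append a
// new zero-padded record at EOF. (Locking and the unique-key cap check
// are omitted from this sketch.)
fn insert(path: &str, key: &str, value: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false) // avoid implicit truncation on open
        .open(path)?;
    let mut rec = [0u8; RECORD_SIZE];
    let mut offset = 0u64;
    // Step 3: iterate records looking for the key.
    while file.read_exact(&mut rec).is_ok() {
        let klen = rec
            .iter()
            .position(|&b| b == 0)
            .unwrap_or(KEY_SIZE)
            .min(KEY_SIZE);
        if &rec[..klen] == key.as_bytes() {
            // Key found: overwrite only the value field in place.
            file.seek(SeekFrom::Start(offset + KEY_SIZE as u64))?;
            let mut val = vec![0u8; RECORD_SIZE - KEY_SIZE];
            val[..value.len()].copy_from_slice(value.as_bytes());
            file.write_all(&val)?;
            return file.flush();
        }
        offset += RECORD_SIZE as u64;
    }
    // Not found: append a new record at EOF.
    file.seek(SeekFrom::End(0))?;
    let mut new_rec = vec![0u8; RECORD_SIZE];
    new_rec[..key.len()].copy_from_slice(key.as_bytes());
    new_rec[KEY_SIZE..KEY_SIZE + value.len()].copy_from_slice(value.as_bytes());
    file.write_all(&new_rec)?;
    file.flush()
}
```

The in-place overwrite keeps the file size constant when a key already exists, which matches the file-size invariants exercised by the tests.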

Stale-data behavior

  • Constructor: KvpPoolStore::new(path, mode, truncate_on_stale).
  • If truncate_on_stale == true and file mtime predates current boot time, the pool is truncated.
  • Stale check is explicit and caller-controlled via constructor flag.
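The mtime-versus-boot-time comparison can be sketched in std-only Rust. This version derives boot time from Linux's `/proc/uptime` for self-containedness; the crate itself reportedly uses the sysinfo dependency, and the function names here are illustrative:

```rust
use std::fs;
use std::io::ErrorKind;
use std::time::{SystemTime, UNIX_EPOCH};

// Boot time in seconds since the Unix epoch: now minus system uptime.
fn boot_time_secs() -> std::io::Result<u64> {
    let uptime = fs::read_to_string("/proc/uptime")?;
    let secs: f64 = uptime.split_whitespace().next().unwrap().parse().unwrap();
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    Ok(now - secs as u64)
}

// Stale when the pool file's mtime predates the current boot.
// A missing file is not considered stale.
fn pool_is_stale(path: &str) -> std::io::Result<bool> {
    let meta = match fs::metadata(path) {
        Ok(m) => m,
        Err(ref e) if e.kind() == ErrorKind::NotFound => return Ok(false),
        Err(e) => return Err(e),
    };
    let mtime = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs();
    Ok(mtime < boot_time_secs()?)
}
```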

Error model

KvpError variants:

  • EmptyKey
  • KeyContainsNull
  • KeyTooLarge { max, actual }
  • ValueTooLarge { max, actual }
  • MaxUniqueKeysExceeded { max }
  • Io(io::Error)
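The variant list above maps naturally onto a Rust enum. The variant names follow the PR description; the `Display` messages and `From` impl are illustrative:

```rust
use std::fmt;
use std::io;

// Error model as listed above.
#[derive(Debug)]
pub enum KvpError {
    EmptyKey,
    KeyContainsNull,
    KeyTooLarge { max: usize, actual: usize },
    ValueTooLarge { max: usize, actual: usize },
    MaxUniqueKeysExceeded { max: usize },
    Io(io::Error),
}

impl fmt::Display for KvpError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            KvpError::EmptyKey => write!(f, "key is empty"),
            KvpError::KeyContainsNull => write!(f, "key contains a NUL byte"),
            KvpError::KeyTooLarge { max, actual } => {
                write!(f, "key is {actual} bytes (max {max})")
            }
            KvpError::ValueTooLarge { max, actual } => {
                write!(f, "value is {actual} bytes (max {max})")
            }
            KvpError::MaxUniqueKeysExceeded { max } => {
                write!(f, "unique-key cap of {max} exceeded")
            }
            KvpError::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}

// Lets file operations use `?` to convert io::Error automatically.
impl From<io::Error> for KvpError {
    fn from(e: io::Error) -> Self {
        KvpError::Io(e)
    }
}
```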

Test coverage

43 tests cover:

  • key validation rules (empty, null, size boundaries in both modes),
  • path/mode behavior,
  • overwrite semantics, unique-key cap (insert, delete-then-insert, overwrite-at-limit),
  • append semantics (no cap, duplicate-key first-wins on read, size validation),
  • delete, clear, len, is_empty, stale checks,
  • JSON dump behavior,
  • iterator correctness (yield order, empty file, malformed file, mid-iteration drop releases lock),
  • in-place overwrite (adjacent records untouched, multiple overwrites in one pass),
  • remove/swap-and-continue, remove-last, remove-only-record,
  • file-size invariants (unchanged on overwrite, grows on new key, preserves other records),
  • file integrity after mixed operations,
  • concurrent reader/writer, writer/writer, append, and mixed insert+append scenarios.

@peytonr18 peytonr18 force-pushed the probertson/kvp-store-trait branch from b134a98 to f435f70 on March 9, 2026 19:19
@cadejacobson cadejacobson self-requested a review March 9, 2026 20:42

Copilot AI left a comment


Pull request overview

Adds a new standalone workspace crate (libazureinit-kvp) that defines a storage abstraction (KvpStore) and a production Hyper-V pool-file implementation (HyperVKvpStore), along with updated technical reference documentation.

Changes:

  • Introduces libazureinit-kvp crate with KvpStore and KvpLimits (Hyper-V/Azure presets).
  • Implements Hyper-V binary pool-file backend with flock-based concurrency and stale-file truncation support.
  • Rewrites doc/libazurekvp.md as a technical reference for the new crate/API.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
Cargo.toml Adds libazureinit-kvp to the workspace members.
libazureinit-kvp/Cargo.toml Defines the new crate and its dependencies (fs2, sysinfo).
libazureinit-kvp/src/lib.rs Defines KvpStore, KvpLimits, and exports HyperVKvpStore plus size constants.
libazureinit-kvp/src/hyperv.rs Implements Hyper-V pool-file encoding/decoding and the KvpStore backend, plus unit tests.
doc/libazurekvp.md Updates documentation to describe the new crate’s record format, semantics, truncation behavior, and limits.


peytonr18 and others added 3 commits March 19, 2026 11:56
…ics and JSON dump

Overhaul the KvpStore trait into a pure customer-facing interface with
no default implementations or backend_* methods. All logic (validation,
file I/O, locking) now lives in the KvpPoolStore implementation.

Key changes:
- Replace append-based write with upsert (insert-or-update) so each key
  has exactly one record in the file. Eliminates entries_raw and
  last-write-wins deduplication.
- Move validate_key/validate_value from trait defaults to private
  KvpPoolStore methods.
- Decouple read validation from PoolMode: reads accept keys up to the
  full wire-format max (512 bytes) regardless of mode; only writes are
  restricted by PoolMode.
- Add len(), is_empty(), and dump() (JSON via serde_json) to the trait.
- Add 4 multi-threaded concurrency tests covering concurrent upserts to
  different keys, same key, mixed readers/writers, and key cap boundary.
… locks

Use libc fcntl(F_OFD_SETLKW/F_OFD_SETLK) for shared/exclusive/unlock operations so pool file locking interoperates with Linux KVP consumers that use fcntl record locks. Also replace fs2 with libc and keep existing pool behavior/tests intact.
@peytonr18 peytonr18 requested a review from cadejacobson March 27, 2026 19:25
use crate::{KvpError, KvpStore};

/// Default KVP pool path when none is provided.
pub const DEFAULT_KVP_POOL_PATH: &str = "/var/lib/hyperv/.kvp_pool_1";

Is there any chance for multiple KVP pools where filename would need to be distinct? (ex. kvp_pool_2)

Contributor Author

Technically other filenames exist (.kvp_pool_2 does sometimes exist), but the only one Hyper-V looks at is .kvp_pool_1. I gave the option to accept a different pool path in case the user wants to skip Hyper-V and do their own thing, but otherwise the only filename we would want to target for Hyper-V ingestion is pool_1.

Let me know if that answers your question!!


Makes complete sense, thanks!

Err(e) => return Err(e.into()),
};
let boot = Self::boot_time()?;
Ok(metadata.mtime() < boot)

Is this function intended to return true when metadata timestamp is less than time since boot in order to consider the pool stale?

Contributor Author

Yes!
The boot timestamp is compared to the metadata.mtime: if mtime >= boot_time, then the file is treated as current. If not, it's stale!

Err(ref e) if e.kind() == ErrorKind::NotFound => return Ok(false),
Err(e) => return Err(e.into()),
};
Ok(metadata.mtime() < boot_time)

What's the functional difference between this function and the above function if both return on the evaluation of metadata time vs boot time? Are both needed?

Contributor Author

boot_is_stale(&self) is the production path that computes the boot time internally via Self::boot_time() and then compares it to the metadata.

pool_is_stale_at_boot(&self, boot_time: i64) is a test-only helper that takes the boot time as an argument so that tests can inject a fixed value and avoid depending on a real system clock/uptime.

So they functionally do the same comparison; one is just for the testing env. But if this is confusing, I could extract the shared logic into a helper function!!


Makes sense to me -- I noticed the separate var for boot time in the latter, but just needed clarification on intent. Maybe a comment/docstring would be helpful but I wouldn't ask for functional changes here.

}

map.insert(key.to_string(), value.to_string());
rewrite_file(&mut file, &map)?;

If the map contains all records in this case, the whole file would be rewritten. Is this necessary/cause any unneeded latency?

Contributor Author

@peytonr18 peytonr18 Mar 30, 2026

Great question!!! Chris and I talked about this at length offline, and I ran some small tests to evaluate other approaches (memory mapping, indexed in-place inserts or key updates, etc.), but based on those tests, I thought this was the best approach when weighing speed and safety.

The good news is that because there is a hard cap of 1024 records, the rewrite cost is bounded. The idea was to start with this, and if we start seeing latency issues, we can move to a more complex approach such as memory mapping.


Great, as long as it was discussed/tested -- It's not a whole lot to ever write given the record cap plus k/v size limits, but since this is involved in provisioning, I wanted to be sure we had considered even minor latency factors.


@bpryan99 bpryan99 left a comment


My comments were addressed, LGTM! Great work.

Introduce explicit insert (overwrite existing key) and append (always add record) APIs, refactor pool operations to iterator-based in-place updates/deletes, and document record-count vs unique-key behavior with expanded concurrency/integrity test coverage.
/// Append a new record at the end of the file, zero-padded to the
/// fixed-width wire format.
fn append(&mut self, key: &str, value: &str) -> io::Result<()> {
self.file.seek(io::SeekFrom::End(0))?;
Contributor

we could do an assertion here to ensure we are properly aligned


let mut iter = self.iter_mut()?;
iter.append(key, value)?;
iter.flush()?;
Contributor

the flush story needs to be worked out. if kvpd will read from this file while we hold the lock, then every write should probably be followed up by flush. if not, then maybe we only flush before we drop the lock

@@ -0,0 +1,74 @@
# PR: Add `libazureinit-kvp` with unified KVP pool store
Contributor

file seems out of date - is it going to be deleted?

use crate::{KvpError, KvpStore};

/// Default KVP pool path when none is provided.
pub const DEFAULT_KVP_POOL_PATH: &str = "/var/lib/hyperv/.kvp_pool_1";

/// Policy mode controlling key/value size limits for writes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PoolMode {
/// Conservative limits for Azure compatibility
Contributor

the limits are for linux kernel compatibility

pub enum PoolMode {
/// Conservative limits for Azure compatibility
/// (key <= 254 bytes, value <= 1022 bytes).
Restricted,
Contributor

let's call this "Safe"

Contributor

we might be able to figure out how to detect the kernel's limits and set behavior for Safe accordingly

Restricted,
/// Full Hyper-V wire-format limits
/// (key <= 512 bytes, value <= 2048 bytes).
Full,
Contributor

let's call this "Unsafe"

fn append(&self, key: &str, value: &str) -> Result<(), KvpError>;

/// Read the value for a key. Returns `Ok(None)` when absent.
fn read(&self, key: &str) -> Result<Option<String>, KvpError>;
Contributor

this should document behavior if multiple matches for "key" are found. (last one wins)

fn read(&self, key: &str) -> Result<Option<String>, KvpError>;

/// Remove a key. Returns `true` if the key was present.
fn delete(&self, key: &str) -> Result<bool, KvpError>;
Contributor

similarly document behavior if multiple matches are found (all are deleted)

fn clear(&self) -> Result<(), KvpError>;

/// Return all key-value pairs.
fn entries(&self) -> Result<HashMap<String, String>, KvpError>;
Contributor

can we get an interface that provides a list of key-values, so we can validate when multiple keys are present?

it'd be a useful interface to test with.

e.g.

append("key", "val1")
append("key", "val2")
append("key", "val3")
assert entries() == {"key": "val3"}
assert dump() == [("key", "val1"), ("key", "val2"), ("key", "val3")]

}

fn is_stale(&self) -> Result<bool, KvpError> {
self.pool_is_stale()
Contributor

i think pool_is_stale() can be moved into here. no need for the jump and "pool" is implied

}

fn is_empty(&self) -> Result<bool, KvpError> {
match std::fs::metadata(&self.path) {
Contributor

@cjp256 cjp256 Apr 3, 2026

would it make sense to check len() == 0?

Ok(metadata.mtime() < boot_time)
}

fn truncate_pool(&self) -> Result<(), KvpError> {
Contributor

"_pool" is implied, maybe just remove it

Contributor

actually maybe just move all this into clear

