folknor/protohoggr

protohoggr

Zero-copy protobuf wire-format primitives for Rust. No external dependencies.

Extracted from pbfhogg, an OpenStreetMap PBF reader/writer.

Built with LLMs. See LLM.md.

Design philosophy

This crate is optimized for speed on well-formed protobuf data, not for adversarial input validation. It was extracted from an OpenStreetMap PBF pipeline where throughput matters and the wire data is trusted.

Concretely: varint/tag decoding validates overflow and rejects malformed sequences, but packed iterators silently end on truncated data (rather than returning Result per element), 32-bit packed iterators truncate without range checks, skip_varint has no length limit, and read_tag passes through reserved wire types without filtering. These are intentional trade-offs documented in the source — see the doc comments on PackedIter::next, zigzag_decode_32, skip_varint, and read_tag.
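To make the packed-iterator trade-off concrete, here is a minimal standalone sketch of an iterator that ends silently on truncated data. It does not use protohoggr and is not the crate's code; read_varint_sketch and PackedSketch are hypothetical names chosen for illustration.

```rust
/// Decode one LEB128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or the encoding would overflow a u64.
fn read_varint_sketch(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The 10th byte may only contribute the single top bit of a u64.
        if i == 9 && byte > 1 {
            return None;
        }
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of bytes before a terminating byte
}

/// Iterator over a packed varint payload (hypothetical illustration).
struct PackedSketch<'a> {
    rest: &'a [u8],
}

impl<'a> Iterator for PackedSketch<'a> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        // A truncated or malformed tail simply ends iteration:
        // the caller sees `None`, not an error.
        let (value, used) = read_varint_sketch(self.rest)?;
        self.rest = &self.rest[used..];
        Some(value)
    }
}
```

With this shape, the payload [0x01, 0xac] yields the single value 1 and then stops, because the second varint is missing its final byte. The per-element cost is one Option check rather than a Result, which is the speed-for-validation trade described above.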

What's in the box

  • Cursor: zero-copy reader over a byte slice: varints (LEB128), zigzag-decoded sint32/sint64, tags, length-delimited fields, fixed-width 32/64, float/double, field skipping, raw field extraction, read_varint_unchecked for validated regions
  • Packed iterators: PackedIter, PackedSint64Iter, PackedSint32Iter, PackedInt64Iter, PackedInt32Iter, PackedUint32Iter, PackedBoolIter
  • Batch packed operations: count_packed_varints (SSE2 SIMD on x86-64), decode_packed_sint64_cumulative (batch decode with cumulative sum)
  • Varint/zigzag encoding: encode_varint, encode_varint_to_slice (unsafe, slice-based with branchless 1-2 byte fast path), zigzag_encode_64, zigzag_encode_32
  • Field encoders: varint, int64, int32, uint32, bool, bytes, sint64, sint32, fixed32, fixed64, float, double, each with a skip-zero default and an _always variant
  • Packed repeated field encoders: encode_packed_uint32, encode_packed_int32, encode_packed_sint64, encode_packed_sint32, encode_packed_bool
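
The batch operations rest on two properties of the wire format: every varint ends with exactly one byte whose high bit is clear, and delta-coded sint64 fields are zigzag values whose running sum recovers the original sequence. The sketch below is a scalar illustration of those ideas only. The function names are hypothetical, there is no SIMD, and the overflow handling (wrapping addition) is an assumption rather than the crate's documented behaviour.

```rust
/// Count varints in a packed payload. Each complete varint has exactly one
/// terminating byte (high bit clear), so counting those bytes counts values.
fn count_varints_scalar(payload: &[u8]) -> usize {
    payload.iter().filter(|&&b| b & 0x80 == 0).count()
}

/// Map a zigzag-encoded value back to a signed integer:
/// 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
fn zigzag_decode(n: u64) -> i64 {
    ((n >> 1) as i64) ^ -((n & 1) as i64)
}

/// Decode zigzag varints and emit their running sum (delta decoding).
/// Stops silently at an incomplete or over-long varint.
fn decode_deltas(payload: &[u8]) -> Vec<i64> {
    // Pre-size the output using the cheap count pass.
    let mut out = Vec::with_capacity(count_varints_scalar(payload));
    let mut acc = 0i64;
    let mut value = 0u64;
    let mut shift = 0u32;
    for &b in payload {
        if shift >= 64 {
            break; // more than 10 bytes: not a valid 64-bit varint
        }
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            acc = acc.wrapping_add(zigzag_decode(value));
            out.push(acc);
            value = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}
```

For example, the deltas 5, -2, 10 are zigzag-encoded as 10, 3, 20, so the payload [0x0a, 0x03, 0x14] decodes to the running sums 5, 3, 13. Counting first lets the decoder allocate the output once, which is one plausible reason to pair a fast counting routine with a batch decoder.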

All encoding functions skip zero/empty/false values by default (matching protobuf conventions), with _always variants for fields that must always be present.
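
The skip-zero convention can be sketched as follows. This is a standalone illustration with hypothetical names and signatures (put_varint, put_uint64_field, put_uint64_field_always), not the crate's API. The default encoder writes nothing for a zero value, since a proto3 reader treats an absent scalar field as its default, while the always variant emits the tag and value unconditionally.

```rust
/// Append `v` to `buf` as a LEB128 varint.
fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        // Low 7 bits plus the continuation bit.
        buf.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Always write the field, even when the value is zero.
fn put_uint64_field_always(buf: &mut Vec<u8>, field: u32, v: u64) {
    // Tag = (field number << 3) | wire type; wire type 0 is varint.
    put_varint(buf, u64::from(field) << 3);
    put_varint(buf, v);
}

/// Skip-zero default: a zero value produces no bytes at all.
fn put_uint64_field(buf: &mut Vec<u8>, field: u32, v: u64) {
    if v != 0 {
        put_uint64_field_always(buf, field, v);
    }
}
```

Under these assumptions, writing field 1 with value 0 through the default encoder leaves the buffer empty, while the always variant produces the two bytes 0x08 0x00. The always form matters for fields whose presence carries meaning, such as required fields in proto2 schemas.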

Usage

use protohoggr::{Cursor, encode_varint, encode_bytes_field};

// Decode
let data = [0x08, 0xac, 0x02]; // field 1, varint 300
let mut cursor = Cursor::new(&data);
let (field, _wire_type) = cursor.read_tag().unwrap().unwrap(); // wire type 0 (varint), unused here
let value = cursor.read_varint().unwrap();
assert_eq!((field, value), (1, 300));

// Encode
let mut buf = Vec::new();
encode_bytes_field(&mut buf, 1, b"hello");
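
For readers new to the wire format, the bytes in the example above can be unpacked by hand. The sketch below is standalone and uses hypothetical helper names (split_tag, make_tag, bytes_field_layout). It shows the standard protobuf layout, and it assumes that encode_bytes_field follows that layout, which the README does not state explicitly.

```rust
/// Split a decoded tag into (field number, wire type).
fn split_tag(tag: u32) -> (u32, u8) {
    (tag >> 3, (tag & 0x7) as u8)
}

/// Build a tag from a field number and wire type.
fn make_tag(field: u32, wire_type: u8) -> u32 {
    (field << 3) | u32::from(wire_type)
}

/// Standard layout of a length-delimited field: tag, length, payload.
/// Sketch only: handles tags and lengths that fit in a single varint byte.
fn bytes_field_layout(field: u32, payload: &[u8]) -> Vec<u8> {
    assert!(
        field < 16 && payload.len() < 128,
        "sketch handles single-byte tag and length only"
    );
    // Wire type 2 marks a length-delimited field.
    let mut out = vec![make_tag(field, 2) as u8, payload.len() as u8];
    out.extend_from_slice(payload);
    out
}
```

The tag byte 0x08 splits into field 1 and wire type 0 (varint), and the following bytes 0xac 0x02 carry 0x2c plus 2 shifted left by 7, which is 300. For the encode half, field 1 with the payload "hello" is laid out in the standard format as 0x0a, 0x05, then the five ASCII bytes.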

License

Apache-2.0 or MIT.
