Add negation flag to SubfieldPatternQuery #63

@dchud

Motivation

A user scanning ~250k records needed to find those where a subfield did NOT match a regex. Currently this requires Python-side filtering (not re.match(...)), which means:

  • Extra string copies across the Rust/Python FFI boundary for every record
  • GIL overhead on each field value that crosses into Python
  • Inability to short-circuit inside Rust iteration

Adding a negate flag pushes the inversion into Rust, so fields that fail the negated query never cross the FFI boundary.
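
For reference, the Python-side workaround described above looks roughly like the sketch below. It runs over plain (tag, subfield code, value) tuples rather than real mrrc records; the sample data and the helper name `values_not_matching` are invented for illustration and are not part of the library.

```python
import re

# Hypothetical stand-in data: (tag, subfield_code, value) triples.
# Real values would come from records read through mrrc.
SAMPLE = [
    ("020", "a", "9780131103627"),
    ("020", "a", "not-an-isbn"),
    ("245", "a", "The C programming language"),
]


def values_not_matching(rows, tag, code, pattern):
    """Python-side inversion: every candidate value is already a Python str."""
    compiled = re.compile(pattern)
    return [
        value
        for (row_tag, row_code, value) in rows
        if row_tag == tag and row_code == code and not compiled.match(value)
    ]


result = values_not_matching(SAMPLE, "020", "a", r"\d{13}$")
```

Every value with the right tag and code has to be materialized as a Python string before the `not compiled.match(...)` test can run, which is the cost the negate flag is meant to avoid.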

Design

Follow the existing partial flag pattern on SubfieldValueQuery — add a negate: bool field defaulting to false.

Rust core (src/field_query.rs)

  • Add pub negate: bool to SubfieldPatternQuery
  • Update new() to set negate: false
  • Add pub fn negated(tag, subfield_code, pattern) -> Result<Self> constructor (sets negate: true)
  • Update matches(): compare the result of is_match with != self.negate (a boolean XOR), so that when negated, a match means "subfield exists but does NOT match the pattern"

PyO3 bindings (src-python/src/query.rs)

  • Add negate as an optional keyword argument to PySubfieldPatternQuery::new() (default False)
  • Add #[getter] fn negate(&self) -> bool
  • Update __repr__ to include negate=True when set

Python stubs (mrrc/_mrrc.pyi)

  • Update SubfieldPatternQuery.__init__ signature: add negate: bool = False
  • Add negate property
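
The Python-facing surface described in the two lists above could look something like the following pure-Python stand-in. It is not the real PyO3 class: the private attribute names and the exact `__repr__` layout are assumptions, and `re.compile` only stands in for the Rust-side pattern validation (Python's `re` and Rust regex syntax are not identical).

```python
import re


class SubfieldPatternQuery:
    """Pure-Python stand-in for the proposed binding surface (not the real class)."""

    def __init__(self, tag, subfield_code, pattern, negate=False):
        # Fail early on an invalid pattern, loosely mirroring the
        # Result-returning Rust constructor.
        re.compile(pattern)
        self._tag = tag
        self._subfield_code = subfield_code
        self._pattern = pattern
        self._negate = negate

    @property
    def negate(self):
        """Read-only view of the flag, like the proposed #[getter]."""
        return self._negate

    def __repr__(self):
        # Assumed layout: negate is only shown when it is set.
        parts = [
            f"tag={self._tag!r}",
            f"subfield_code={self._subfield_code!r}",
            f"pattern={self._pattern!r}",
        ]
        if self._negate:
            parts.append("negate=True")
        return f"SubfieldPatternQuery({', '.join(parts)})"
```

Because `negate` defaults to `False`, existing callers that pass only tag, subfield code, and pattern keep their current behavior.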

Tests

  • Rust unit tests in src/field_query.rs: negated match, negated non-match, negated with missing subfield (should return false — subfield must exist)
  • Python tests in tests/python/test_query_dsl.py: negated pattern query via kwarg, verify negate property, integration with fields_matching_pattern()

Documentation

  • Update docs/guides/query-dsl.md with negation examples
  • Update docs/tutorials/python/querying-fields.md if SubfieldPatternQuery is covered there
  • Update API reference docs if they document SubfieldPatternQuery parameters

Edge cases

  • Missing subfield: When negate=true and the subfield doesn't exist on a field, matches() returns false. Negation means "the subfield exists but its value doesn't match" — not "the subfield is absent."
  • Repeating subfields: get_subfield() returns only the first subfield with the given code. With negate=true, only the first value is checked. This is consistent with existing non-negated behavior but should be documented.
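
Both edge cases can be illustrated with a small Python model. The field representation (an ordered list of (code, value) pairs) and the free functions `get_subfield` and `matches` are stand-ins written for this sketch, not the mrrc API; `re.search` approximates the Rust pattern test.

```python
import re


def get_subfield(field, code):
    """Return the first value with the given code, or None if absent."""
    for sub_code, value in field:
        if sub_code == code:
            return value
    return None


def matches(field, code, pattern, negate=False):
    value = get_subfield(field, code)
    if value is None:
        # An absent subfield never matches, negated or not.
        return False
    return (re.search(pattern, value) is not None) != negate


# Field with no $z at all.
missing = [("a", "Title only")]
# Field with two $z values; only the first one is ever inspected.
repeated = [("z", "12345"), ("z", "not digits")]

missing_negated = matches(missing, "z", r"^\d+$", negate=True)
repeated_negated = matches(repeated, "z", r"^\d+$", negate=True)
```

In this model `repeated_negated` is false even though the second `$z` value fails the pattern, because the first value matches and later values are never examined.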

matches() implementation

field.get_subfield(self.subfield_code)
    .is_some_and(|value| self.pattern.is_match(value) != self.negate)
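
The comparison in the closure is a boolean XOR, which the short Python sketch below spells out as a truth table. The helper names are invented for illustration. The sketch also assumes `self.pattern` is a `regex::Regex`, whose `is_match` is unanchored, so `re.search` rather than `re.match` is the closer Python analogue.

```python
import re
from itertools import product


def outcome(is_match, negate):
    """Same comparison as the Rust closure: is_match != negate."""
    return is_match != negate


# All four combinations of (is_match, negate) and the resulting answer.
TRUTH_TABLE = {
    (is_match, negate): outcome(is_match, negate)
    for is_match, negate in product([False, True], repeat=2)
}


def value_matches(pattern, value, negate=False):
    # re.search finds a match anywhere in the value (unanchored).
    return outcome(re.search(pattern, value) is not None, negate)
```

With `negate` false the result is simply the match result; with `negate` true it is flipped. The missing-subfield case never reaches this comparison, because `is_some_and` returns false for `None` first.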

Scope

This issue covers the negate flag on SubfieldPatternQuery only. SubfieldValueQuery negation is tracked separately. A broader query composition framework (NotQuery, AndQuery, OrQuery) is a separate concern.

Bead: bd-w9cm
