Optimize BPMN parser XML lookups#459

Merged: essweine merged 2 commits into main from parser on Apr 24, 2026
@jbirddog (Contributor)

Collapse repeated BPMN parser XPath scans into document- and process-level indexes while preserving existing parser behavior. This adds one-time indexes for messages, signals, errors, escalations, correlations, outgoing flows, boundary events, task nodes, data references, and BPMN DI lane/position metadata, and avoids fallback root scans when indexed absence is already known.
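
A minimal sketch of the document-level indexing idea described above, for illustration only. It uses the standard library's `xml.etree.ElementTree` rather than whatever XML backend the parser actually uses, and `DocumentIndex`, `INDEXED_TAGS`, and the `scans` counter are hypothetical names that do not come from this PR. The point it shows: the root is scanned once, and a miss in the index is treated as a known absence instead of triggering another root scan.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

# Hypothetical subset of the element kinds the PR says are indexed.
INDEXED_TAGS = ("message", "signal", "error", "escalation")

DOC = f"""<bpmn:definitions xmlns:bpmn="{BPMN_NS}">
  <bpmn:message id="msg_1" name="order_placed"/>
  <bpmn:signal id="sig_1" name="alarm"/>
  <bpmn:process id="proc"/>
</bpmn:definitions>"""


class DocumentIndex:
    """Builds id lookups for top-level definitions in a single pass."""

    def __init__(self, root):
        self.root = root
        self.scans = 0        # number of full passes over the root's children
        self._by_tag = None   # built lazily on first lookup

    def _build(self):
        self.scans += 1
        by_tag = {tag: {} for tag in INDEXED_TAGS}
        for child in self.root:
            # ElementTree tags look like "{namespace}localname".
            local_name = child.tag.rpartition("}")[2]
            element_id = child.get("id")
            if local_name in by_tag and element_id is not None:
                by_tag[local_name][element_id] = child
        self._by_tag = by_tag

    def find(self, tag, element_id):
        if self._by_tag is None:
            self._build()
        # A miss means "known absent": no fallback scan of the root.
        return self._by_tag[tag].get(element_id)


index = DocumentIndex(ET.fromstring(DOC))
signal = index.find("signal", "sig_1")
missing = index.find("error", "no_such_error")
print(signal.get("name"), missing, index.scans)
```

Both lookups, including the miss, are served from the same single pass, so `index.scans` stays at 1.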

Focused BPMN parser tests pass, and the full suite passed in both serial and parallel runs (681 tests, 1 skipped). On a large production workflow set of about 1.4 MB of BPMN/DMN XML, specs_from_xml improved from roughly 1.0s before this indexing work to a 10-run median of 0.195s, with a 0.161s minimum and 0.208s mean.

Refactor ProcessParser.start_messages() to build a message-id lookup table once instead of rescanning all BPMN message nodes for each message start event.

This preserves the existing behavior while reducing the lookup path from O(m*n) to O(m+n), where m is the number of messages and n is the number of message start events. Add a focused regression test that counts id lookups so the improvement is verified without relying on noisy timing assertions.
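
The O(m*n) to O(m+n) change can be sketched as follows. This is an illustrative example under stated assumptions, not the code from this PR: it uses the standard library's `xml.etree.ElementTree`, and `start_message_names` is a hypothetical stand-in for `ProcessParser.start_messages()`, whose real signature and return value are not shown here.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
NS = {"bpmn": BPMN_NS}

SAMPLE = f"""<bpmn:definitions xmlns:bpmn="{BPMN_NS}">
  <bpmn:message id="msg_order" name="order_placed"/>
  <bpmn:message id="msg_cancel" name="order_cancelled"/>
  <bpmn:process id="proc">
    <bpmn:startEvent id="start_1">
      <bpmn:messageEventDefinition messageRef="msg_order"/>
    </bpmn:startEvent>
    <bpmn:startEvent id="start_2">
      <bpmn:messageEventDefinition messageRef="msg_cancel"/>
    </bpmn:startEvent>
    <bpmn:startEvent id="start_3"/>
  </bpmn:process>
</bpmn:definitions>"""


def start_message_names(root):
    """Resolve message start events to message names in O(m + n)."""
    # One pass over the m message nodes builds the lookup table.
    names_by_id = {
        message.get("id"): message.get("name")
        for message in root.findall("bpmn:message", NS)
    }
    # One pass over the n message start events; each lookup is a dict hit
    # instead of a rescan of every message node.
    names = []
    for definition in root.findall(
        ".//bpmn:startEvent/bpmn:messageEventDefinition", NS
    ):
        name = names_by_id.get(definition.get("messageRef"))
        if name is not None:
            names.append(name)
    return names


print(start_message_names(ET.fromstring(SAMPLE)))
# → ['order_placed', 'order_cancelled']
```

The plain start event `start_3` has no message event definition, so it contributes nothing to the result.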
@essweine essweine merged commit e077a5e into main Apr 24, 2026
6 checks passed
@essweine essweine deleted the parser branch April 24, 2026 14:57