Performance regression: 37-79% slower parsing between v5.3.7 and v5.5.9 #816

@scolladon

Description

After upgrading from fast-xml-parser@5.3.7 to 5.5.9, we observed a consistent 37–79% performance regression across all our benchmarks. The regression is most severe for parse-only operations (64–79% slower) and also affects full merge pipelines (37–51% slower).

Benchmark Results

| Benchmark | v5.3.7 (ops/sec) | v5.5.9 (ops/sec) | Ratio |
|---|---|---|---|
| parse-small | 698 ±0.97% | 390 ±0.58% | 1.79x slower |
| parse-medium | 80 ±1.23% | 46 ±2.00% | 1.74x slower |
| parse-large | 18 ±1.64% | 11 ±6.66% | 1.64x slower |
| merge-small-no-conflict | 317 ±3.88% | 231 ±3.06% | 1.37x slower |
| merge-small-with-conflict | 353 ±1.86% | 248 ±1.02% | 1.42x slower |
| merge-medium-no-conflict | 42 ±0.80% | 30 ±0.79% | 1.40x slower |
| merge-medium-with-conflict | 43 ±3.65% | 31 ±1.05% | 1.39x slower |
| merge-large-no-conflict | 10 ±1.78% | 7 ±0.92% | 1.43x slower |
| merge-ordered-globalvalueset | 513 ±1.80% | 347 ±1.82% | 1.48x slower |
| merge-picklist-customfield | 624 ±2.27% | 413 ±2.34% | 1.51x slower |

Benchmarks were run on the same CI runner (Ubuntu, Node 20) with the same code; only the fast-xml-parser version changed.
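For readers checking the numbers: the "x slower" ratio is the old throughput divided by the new one, and the "% slower" figures in this report are that ratio minus one (extra time per operation), not the drop in ops/sec. A small sketch, where `slowdown` is a hypothetical helper and not part of our benchmark suite:

```javascript
// Hypothetical helper: turns two throughput figures (ops/sec) into the
// slowdown numbers used in this report.
function slowdown(opsBefore, opsAfter) {
  const ratio = opsBefore / opsAfter; // time-per-operation multiplier
  return {
    ratio: Number(ratio.toFixed(2)),
    // "% slower" as used above: extra time per operation
    extraTimePct: Math.round((ratio - 1) * 100),
    // the same regression expressed as lost throughput
    throughputDropPct: Math.round((1 - opsAfter / opsBefore) * 100),
  };
}

console.log(slowdown(698, 390)); // parse-small
console.log(slowdown(317, 231)); // merge-small-no-conflict
```

So a "1.79x slower" parse-small run takes about 79% more time per operation, which corresponds to roughly 44% fewer operations per second.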

Parser Options Used

const parserOptions = {
  cdataPropName: '#cdata',
  commentPropName: '#comment',
  ignoreAttributes: false,
  processEntities: false,
  ignoreDeclaration: true,
  numberParseOptions: { leadingZeros: false, hex: false },
  parseAttributeValue: false,
  parseTagValue: false,
  preserveOrder: true,
  trimValues: false,
}

Root Cause Analysis

After reviewing the code diff between v5.3.7 and v5.5.9, we identified several contributing factors:

1. jPath string replaced by Matcher object (highest impact)

The simple jPath string concatenation (jPath += "." + tagName) was replaced with a Matcher class from path-expression-matcher. This introduces:

  • Matcher.push() on every opening tag: creates a new object with { tag, position, counter, namespace, values }, iterates over a Map to calculate position, updates a Map for sibling tracking. Previously: single string concatenation.
  • Matcher.pop() on every closing tag: pops an array, truncates sibling stacks, returns node object. Previously: jPath.substring(0, jPath.lastIndexOf(".")).
  • Matcher.toString() called 6+ times per tag in the hot path (in parseTextData, buildAttributesMap, addChild, replaceEntitiesValue, saveTextToParentTag). Each call does this.path.map(n => n.tag).join(sep) — allocating a new array and string every time. Previously the string was already available.
  • readonlyMatcher Proxy: every property access goes through a Proxy get trap, checking MUTATING_METHODS.has(prop), then Reflect.get(). For .path and .siblingStacks, it creates frozen copies on every access.
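To make the cost difference concrete, here is a minimal sketch of the two path-tracking styles described above. `StringPath`, `StackPath`, and `walk` are simplified stand-ins written for this issue; they are not the actual fast-xml-parser or path-expression-matcher code, and the six reads per tag only mirror the "6+ times per tag" observation.

```javascript
// Simplified stand-ins; NOT the library's real implementation.

// Style 1: the path is a string that is extended and truncated in place.
class StringPath {
  constructor() { this.path = ""; }
  push(tag) { this.path = this.path === "" ? tag : this.path + "." + tag; }
  pop() {
    const i = this.path.lastIndexOf(".");
    this.path = i === -1 ? "" : this.path.substring(0, i);
  }
  toString() { return this.path; } // already materialized, no extra work
}

// Style 2: the path is a stack of node objects, serialized on demand.
class StackPath {
  constructor() { this.nodes = []; this.joins = 0; }
  push(tag) { this.nodes.push({ tag }); }
  pop() { return this.nodes.pop(); }
  toString() {
    this.joins++; // counts how often a fresh array and string are built
    return this.nodes.map((n) => n.tag).join(".");
  }
}

// Opens each tag in turn and reads the path several times per tag,
// the way callbacks in a parser hot path might.
function walk(tracker, tags, readsPerTag) {
  let last = "";
  for (const tag of tags) {
    tracker.push(tag);
    for (let i = 0; i < readsPerTag; i++) last = tracker.toString();
  }
  return last;
}

const tags = ["root", "items", "item", "name"];
const stringPath = new StringPath();
const stackPath = new StackPath();
const pathA = walk(stringPath, tags, 6);
const pathB = walk(stackPath, tags, 6);
console.log(pathA === pathB, stackPath.joins);
```

Both trackers yield the same path, but the stack-based one rebuilds it on every read, so the number of map/join allocations grows with tags times reads rather than with tags alone.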

2. Two-pass attribute parsing

buildAttributesMap() now processes attributes in two passes: first to build rawAttrsForMatcher, then again with full matcher context. Each attribute goes through resolveNameSpace(), value extraction, and replaceEntitiesValue() twice.

3. Removed indexOf('&') early-exit in replaceEntitiesValue

v5.3.7 had an early return at the top of replaceEntitiesValue:

if (val.indexOf('&') === -1) return val;

This was removed. Now every text value enters the function and evaluates the config, even when processEntities: false. The vast majority of XML text content has no &, so this early exit was highly effective.
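As an illustration of why the guard is cheap and safe, the sketch below puts it in front of a toy entity replacer. `replaceEntitiesSketch` and `ENTITIES` are hypothetical names for this example and cover only the five predefined XML entities; the real function handles more cases.

```javascript
// Hypothetical entity table and function, for illustration only.
const ENTITIES = {
  "&lt;": "<",
  "&gt;": ">",
  "&amp;": "&",
  "&quot;": '"',
  "&apos;": "'",
};

function replaceEntitiesSketch(val, options) {
  // Fast path: a value without '&' cannot contain an entity reference,
  // so it can be returned untouched regardless of configuration.
  if (val.indexOf("&") === -1) return val;
  // Entity processing disabled: leave the value as-is.
  if (!options.processEntities) return val;
  // Single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return val.replace(/&(?:lt|gt|amp|quot|apos);/g, (m) => ENTITIES[m]);
}

console.log(replaceEntitiesSketch("plain text", { processEntities: true }));
console.log(replaceEntitiesSketch("a &lt; b &amp; c", { processEntities: true }));
console.log(replaceEntitiesSketch("a &lt; b", { processEntities: false }));
```

The guard is a single linear scan with no allocation, and its result does not depend on any option, which is why it should be compatible with every entity configuration.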

4. Per-tag security validation

Every tag now goes through sanitizeName() (checks against criticalProperties and DANGEROUS_PROPERTY_NAMES using .includes()), transformTagName(), strictReservedNames check, extractNamespace(), and maxNestedTags depth check. Individually cheap, but they accumulate across thousands of tags.

Suggested Optimizations

  1. Cache Matcher.toString() result — compute once per push()/pop(), not on every callback invocation. This would eliminate the biggest hot-path allocation.
  2. Restore indexOf('&') early-exit in replaceEntitiesValue — no reason to remove this; it's compatible with all entity configurations.
  3. Make two-pass attribute parsing conditional — only do the second pass when PEM features (path expressions with attribute matchers) are actually in use.
  4. Avoid Proxy for readonlyMatcher when no callbacks use it — or cache the frozen copies instead of recreating them on every access.
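A rough sketch of what suggestion 1 could look like: the serialized path is memoized and invalidated only when the stack changes. `CachedPath` is a hypothetical class written for this issue, not the Matcher API; the `builds` counter exists only to show how often the string is actually rebuilt.

```javascript
// Hypothetical sketch of a path stack with a memoized toString().
class CachedPath {
  constructor(sep = ".") {
    this.nodes = [];
    this.sep = sep;
    this.cached = null; // null means "needs rebuilding"
    this.builds = 0;    // instrumentation for this example only
  }
  push(tag) {
    this.nodes.push({ tag });
    this.cached = null; // path changed, drop the memoized string
  }
  pop() {
    this.cached = null;
    return this.nodes.pop();
  }
  toString() {
    if (this.cached === null) {
      this.builds++;
      this.cached = this.nodes.map((n) => n.tag).join(this.sep);
    }
    return this.cached;
  }
}

const path = new CachedPath();
path.push("root");
path.push("item");
for (let i = 0; i < 6; i++) path.toString(); // six reads, one build
console.log(path.toString(), path.builds);
path.pop();
console.log(path.toString(), path.builds);
```

With this shape, the number of map/join allocations is bounded by the number of push/pop calls instead of the number of reads. The same idea (compute once, invalidate on mutation) would apply to the frozen copies in suggestion 4.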

Environment

  • Node.js: v20.20.1
  • OS: Ubuntu 24.04 (GitHub Actions runner)
  • Benchmarks: Vitest bench (powered by tinybench)

Reproducing

The benchmarks are from sf-git-merge-driver CI. The regression is 100% reproducible by swapping fast-xml-parser versions.

We understand the changes were motivated by important security fixes and the #793 O(n²) bug fix. We've accepted the regression on our side for now but wanted to report it in case optimizations can be applied without reverting the security/correctness improvements.
