Skip to content

O(n²) performance degradation when tag names contain dots #793

@dollebrolle

Description

@dollebrolle

Description

The jPath variable in OrderedObjParser.js uses dots (.) as a separator to track tag hierarchy (e.g., root.parent.child). When closing tags, it uses lastIndexOf(".") to pop back one level. However, when tag names themselves contain dots (e.g., <applink.ios_url>), lastIndexOf(".") finds the dot within the tag name instead of the hierarchy separator, causing incorrect truncation.

This leaves residue in the jPath string that accumulates with every dotted tag processed:

Open  <promotions>:           jPath = "promotions"
Open  <promotion>:            jPath = "promotions.promotion"
Open  <applink.ios_url>:      jPath = "promotions.promotion.applink.ios_url"
Close </applink.ios_url>:     jPath = "promotions.promotion.applink"        ← WRONG
Open  <applink.ios_app_name>: jPath = "promotions.promotion.applink.applink.ios_app_name"
Close </applink.ios_app_name>:jPath = "promotions.promotion.applink.applink" ← accumulates

The parsed JSON output is still correct because tree structure is maintained by tagsNodeStack, but the ever-growing jPath string makes all string operations progressively slower, resulting in O(n²) behavior.

Affected lines in OrderedObjParser.js

All uses of lastIndexOf(".") to manipulate jPath:

  • Line 241: jPath.substring(jPath.lastIndexOf(".")+1)
  • Line 247: jPath.lastIndexOf('.', jPath.lastIndexOf('.')-1)
  • Line 250: jPath.lastIndexOf(".")
  • Line 345: jPath.lastIndexOf(".")
  • Line 387: jPath.lastIndexOf(".")
  • Line 415: jPath.lastIndexOf(".")
  • Line 423: jPath.lastIndexOf(".")

Reproduction

const { XMLParser } = require('fast-xml-parser');

// With dots - triggers O(n²)
const xml = '<root>' + '<item><a.b>x</a.b></item>'.repeat(10000) + '</root>';
console.time('with dots');
new XMLParser().parse(xml);
console.timeEnd('with dots');

// Without dots - normal O(n)
const xml2 = '<root>' + '<item><a_b>x</a_b></item>'.repeat(10000) + '</root>';
console.time('no dots');
new XMLParser().parse(xml2);
console.timeEnd('no dots');

Real-world impact: a 55MB XML feed (26K items, ~8 dotted tags each) takes >5 minutes to parse instead of ~3 seconds.

Proposed fix

Replace all lastIndexOf(".")-based jPath truncation with exact arithmetic using the known tag name length. Since jPath is always built by appending "." + tagName, it can be reversed by subtracting tagName.length + 1 (or just tagName.length at root level).

Versions affected

Tested on v4.5.3, v5.3.4, and v5.4.1.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions