saxen

nikku, 445.9k, MIT, 10.0.0

A tiny, super fast, namespace aware sax-style XML parser written in plain JavaScript

xml, sax, parser, pure

readme

/saxen/ parser

A tiny, super fast, namespace aware sax-style XML parser written in plain JavaScript.

Features

  • (optional) entity decoding and attribute parsing
  • (optional) namespace aware
  • element / attribute normalization in namespaced mode
  • tiny (2.6 kB minified + gzipped)
  • pretty damn fast

Usage

// saxen ships as an ES module (since 10.0.0)
import {
  Parser
} from 'saxen';

var parser = new Parser();

// enable namespace parsing: element prefixes will be
// automatically adjusted to the ones configured here;
// elements in other namespaces will still be processed
parser.ns({
  'http://foo': 'foo',
  'http://bar': 'bar'
});

parser.on('openTag', function(elementName, attrGetter, decodeEntities, selfClosing, getContext) {

  elementName;
  // with prefix, i.e. foo:blub

  var attrs = attrGetter();
  // { 'bar:aa': 'A', ... }
});

parser.parse('<blub xmlns="http://foo" xmlns:bar="http://bar" bar:aa="A" />');

Supported Hooks

We support the following parse hooks:

  • openTag(elementName, attrGetter, decodeEntities, selfClosing, contextGetter)
  • closeTag(elementName, decodeEntities, selfClosing, contextGetter)
  • error(err, contextGetter)
  • warn(warning, contextGetter)
  • text(value, decodeEntities, contextGetter)
  • cdata(value, contextGetter)
  • comment(value, decodeEntities, contextGetter)
  • attention(str, decodeEntities, contextGetter)
  • question(str, contextGetter)

In contrast to error, warn receives recoverable errors, such as malformed attributes.
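
The split between the two hooks can be sketched as follows. This is a minimal example, assuming a parser instance created as in the usage section. The helper name attachDiagnostics is hypothetical and not part of saxen, and the value returned by contextGetter is stored as-is because its shape is not described here.

```javascript
// Hypothetical helper (not part of saxen): collects problems reported
// through the `error` and `warn` hooks in two separate lists.
function attachDiagnostics(parser) {
  var diagnostics = { errors: [], warnings: [] };

  // error(err, contextGetter): problems the parser does not recover from
  parser.on('error', function(err, getContext) {
    diagnostics.errors.push({ problem: err, context: getContext() });
  });

  // warn(warning, contextGetter): recoverable issues, e.g. malformed attributes
  parser.on('warn', function(warning, getContext) {
    diagnostics.warnings.push({ problem: warning, context: getContext() });
  });

  return diagnostics;
}
```

Call attachDiagnostics(parser) before parser.parse(...) and inspect the returned lists afterwards.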

In proxy mode, openTag and closeTag receive a view of the current element instead of the raw element name. In addition, element attributes are not passed as a getter to openTag. Instead, they are exposed via element.attrs:

  • openTag(element, decodeEntities, selfClosing, contextGetter)
  • closeTag(element, selfClosing, contextGetter)

Namespace Handling

In namespace mode, the parser will adjust tag and attribute namespace prefixes before passing the element's name to openTag or closeTag. To do that, you need to configure default prefixes for well-known namespaces:

parser.ns({
  'http://foo': 'foo',
  'http://bar': 'bar'
});

To skip the adjustment and still process namespace information:

parser.ns();
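
The idea behind prefix adjustment can be illustrated with a small stand-alone function. This is a conceptual sketch only, not saxen's implementation: the function name normalizeElementName and both of its map parameters are assumptions made for illustration, and it covers element names only.

```javascript
// Conceptual sketch only; this is not how saxen is implemented.
// `declared` maps prefixes declared in the document to namespace URIs
// ('' stands for the default namespace); `wellKnown` maps namespace URIs
// to preferred prefixes, in the same shape that is passed to parser.ns().
function normalizeElementName(qname, declared, wellKnown) {
  var has = function(obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
  };

  var idx = qname.indexOf(':');
  var prefix = idx === -1 ? '' : qname.slice(0, idx);
  var localName = idx === -1 ? qname : qname.slice(idx + 1);

  // element is not bound to any declared namespace: leave the name untouched
  if (!has(declared, prefix)) {
    return qname;
  }

  var uri = declared[prefix];

  // use the configured prefix for a well-known namespace,
  // otherwise keep the prefix used in the document
  var target = has(wellKnown, uri) ? wellKnown[uri] : prefix;

  return target ? target + ':' + localName : localName;
}

var wellKnown = { 'http://foo': 'foo', 'http://bar': 'bar' };

console.log(normalizeElementName('blub', { '': 'http://foo' }, wellKnown)); // foo:blub
console.log(normalizeElementName('x:thing', { x: 'http://bar' }, wellKnown)); // bar:thing
console.log(normalizeElementName('y:other', { y: 'http://other' }, wellKnown)); // y:other
```

The first call mirrors the usage example: an element in the default namespace http://foo is reported with the configured foo prefix.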

Proxy Mode

In this mode, the first argument passed to openTag and closeTag is an object that exposes more internal XML parse state. This needs to be explicitly enabled by instantiating the parser with { proxy: true }.

// instantiate parser with proxy=true
var parser = new Parser({ proxy: true });

parser.ns({
  'http://foo-ns': 'foo'
});

parser.on('openTag', function(el, decodeEntities, selfClosing, getContext) {
  el.originalName; // root
  el.name; // foo:root
  el.attrs; // { 'xmlns:foo': ..., id: '1' }
  el.ns; // { xmlns: 'foo', foo: 'foo', foo$uri: 'http://foo-ns' }
});

parser.parse('<root xmlns:foo="http://foo-ns" id="1" />')

Proxy mode comes with a performance penalty of roughly five percent.

Caution! For performance reasons the exposed element is a simple view into the current parser state. Because of that, it will change with the parser advancing and cannot be cached. If you would like to retain a persistent copy of the values, create a shallow clone:

parser.on('openTag', function(el) {
  var copy = Object.assign({}, el);
  // copy, ready to keep around
});
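
Why a cached view goes stale can be shown with a simulation. The emitter below is a stand-in that reuses a single view object, similar in spirit to proxy mode; createViewEmitter is a hypothetical name and nothing in this sketch is saxen API.

```javascript
// Simulation only: a stand-in for a parser that hands the same view
// object to every handler call. Nothing here is saxen API.
function createViewEmitter(names) {
  var current = null;

  // single view object; its getter always reads the *current* state
  var view = {
    get name() { return current; }
  };

  return function emit(handler) {
    names.forEach(function(name) {
      current = name;
      handler(view);
    });
  };
}

var cached = [];
var cloned = [];

createViewEmitter([ 'a', 'b', 'c' ])(function(el) {
  cached.push(el);                    // keeps a reference to the live view
  cloned.push(Object.assign({}, el)); // evaluates the getter, copies the value
});

var cachedNames = cached.map(function(el) { return el.name; });
var clonedNames = cloned.map(function(el) { return el.name; });

console.log(cachedNames); // [ 'c', 'c', 'c' ]
console.log(clonedNames); // [ 'a', 'b', 'c' ]
```

Object.assign copies one level only. Whether nested values such as attrs need their own copy depends on how the parser produces them, which this sketch does not cover.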

Non-Features

/saxen/ lacks some features known in other XML parsers such as sax-js:

  • no support for parsing loose documents, such as arbitrary HTML snippets
  • no support for text trimming
  • no automatic entity decoding
  • no automatic attribute parsing

...and that is ok ❤.
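
Decoding and trimming are therefore left to the handlers. A minimal sketch of a text handler that does both, assuming a parser instance created as in the usage section; collectText is a hypothetical helper and not part of saxen:

```javascript
// Hypothetical helper (not part of saxen): collects text content and
// applies the entity decoding and trimming that the parser leaves to you.
function collectText(parser) {
  var chunks = [];

  // text(value, decodeEntities, contextGetter)
  parser.on('text', function(value, decodeEntities) {
    var text = decodeEntities(value).trim();

    // skip whitespace-only text, e.g. indentation between elements
    if (text) {
      chunks.push(text);
    }
  });

  return chunks;
}
```

Attributes follow the same opt-in pattern: they are only parsed when the openTag handler calls attrGetter().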

Credits

We build on the awesome work done by easysax.

/saxen/ is named after Sachsen, a federal state of Germany. So geht sächsisch!

LICENSE

MIT

changelog

Changelog

All notable changes to saxen are documented here. We use semantic versioning for releases.

Unreleased

_Note: Yet to be released changes appear here._

10.0.0

  • FEAT: turn into ES module
  • CHORE: require Node >= 18
  • CHORE: drop UMD distribution

Breaking Changes

  • No longer ships UMD distribution
  • Requires Node >= 18

9.0.0

  • FEAT: do not transform xsi:type attribute contents (#23)
  • FEAT: do not hardcode xsi namespace prefix (#23)

Breaking Changes

  • Moved xsi:type attribute manipulation and xsi prefix binding out of this library. Downstream libraries must handle both, as saxen is a low-level parser (#23)

8.1.2

  • FIX: correct skipping of > in body tag (#22)

8.1.1

  • FIX: parse > in attribute names (#17, #20)
  • CHORE: drop leftover console.log statement

8.1.0

  • FEAT: warn on non-whitespace outside root node (#11, #12)
  • FEAT: allow dots in tag names

8.0.0

Breaking Changes

  • CHORE: rename ES module to dist/index.esm.js for improved bundler compatibility
  • FIX: drop browser field for better interoperability with module bundlers

7.0.1

  • FIX: allow . in attribute name part

7.0.0

Breaking Changes

  • FEAT: expose Parser and decode via single export only. Use import or destructuring to access it:

    var { Parser } = require('saxen');

Other Improvements

  • FEAT: generate pre-built distributions for CommonJS and Browser targets
  • FEAT: generate UMD bundle
  • CHORE: Migrate code base to ES6

6.0.1

This is a re-publish of the broken v6.0.0 version.

  • FEAT: recover from attribute parse errors (#13)

6.0.0

Unpublished; Use v6.0.1 instead.

5.7.0

  • FEAT: detect and gracefully handle local attribute re-declarations, which the XML spec forbids: we now emit a warning and ignore the offending attribute (7d0c8629)

5.6.0

This release accidentally introduced backwards incompatible changes; use >= 5.7.0 instead.

5.5.0

  • FEAT: expose getContext on all hooks (634857b0)

5.4.1

  • FIX: bundle decode.js with published package (528cd1c0)

5.4.0

  • CHORE: configure hooks only if actually used (5ab3e2ee)

5.3.1

  • FIX: properly handle missing open tags </a>

5.3.0

  • CHORE: simplify and speed up entity decoding (066e712d)

5.2.0

5.1.0

  • FEAT: proxy mode exposes clonable view (73c6c44a)

5.0.1

  • FIX: return {} on non-existing attributes, too

5.0.0

  • CHORE: don't return true on empty attrs (f7360b11)

4.0.1

  • DOCS: improve readme

4.0.0

  • FEAT: fully support anonymous elements in namespace mode (2f48744a)
  • FEAT: emit <warn> for all attribute parse issues (a5014b25)

3.1.0

  • FEAT: keep non-decodeable entities as is
  • FEAT: decode only required sub-set of named entities

3.0.1

  • CHORE: add license field to package.json

3.0.0

  • FEAT: throw on handler errors (4b0ebb1)
  • FEAT: expose current namespaces in proxy mode
  • FEAT: normalize xsi:type attribute values (#4)
  • FEAT: add warn event, informing about recoverable errors (7fce2151)

2.0.0

  • FEAT: rename events

    • textNode -> text
    • startNode -> openTag
    • endNode -> closeTag

1.1.0

  • FEAT: handle non-xml input

1.0.4

  • DOCS: better @type annotations
  • CHORE: save a few bytes in decoding logic

1.0.3

  • DOCS: correct @type and @return annotations in parser

1.0.2

  • FIX: properly handle namespace prefix collisions (#1)

1.0.1

  • CHORE: improve test coverage and documentation

1.0.0

  • FEAT: don't skip unknown namespace nodes
  • FEAT: expose parse context in startNode, endNode and error
  • FEAT: introduce parser object mode
  • FEAT: pipe handler errors to error handler
  • FEAT: allow non-args #ns call
  • FIX: various namespace handling errors
  • STYLE: unify code style
  • CHORE: rename library to saxen
  • CHORE: improve test coverage
  • CHORE: add linting
  • DOCS: move to English for documentation and README

...

Check git log for earlier history.