NAME
html/treebuilder - HTML tree construction framework.
SYNOPSIS
from html/treebuilder import HTMLTreeBuilder;
let result := new HTMLTreeBuilder(
    _input: "<!doctype html><title>Example</title>",
).parse();
NOTE
This module is not normally useful to end users; use html/parser instead.
DESCRIPTION
This module implements the tree-builder layer for html/parser. It connects the tokenizer to the html/dom classes and covers the initial, before html, before head, in head, text, after head, in body, table, select, template, frameset, after body, and after after body insertion modes, as well as insertion-mode setup for fragment parsing. It also handles SVG and MathML foreign content: namespace-aware insertion, adjusted SVG/MathML names, foreign XLink/XML/XMLNS attributes, HTML/MathML integration points, and foreign CDATA sections.
It deliberately does not implement script execution, file load/dump helpers, or the html5lib .dat harness.
EXPORTS
Classes
HTMLTreeBuilder
Tree-construction engine. Most applications should use HTML.parse or HTMLParser; this class is exported for tests, diagnostics, and tools which need direct access to the tree-building layer.
Construct with _input to provide source text. parse() returns an HTMLTreeConstructionResult for a full document. parseFragment parses a context-sensitive fragment and returns an HTMLTreeConstructionResult with both a staging document and a fragment.
Useful public accessors are tokenizer, document, fragment, errors, parseErrors, insertionMode, and currentNode. errors() returns tokenizer and tree-construction parse errors collected during the latest parse.
Lower-level stack, scope, insertion, and mode methods are exposed by the class because the implementation is Pure ZuzuScript, but they are not part of the stable application API. Prefer the parser facade unless a test or tool needs exact tree-builder state.
HTMLTreeConstructionResult
Result object returned by HTMLTreeBuilder.parse and HTMLTreeBuilder.parseFragment. document() returns the parsed or staging HTMLDocument. fragment() returns the HTMLDocumentFragment for fragment parses and null for full documents. errors() returns a copy of parse errors, and parseErrors() is an alias for errors().
HTMLTreeTestSerializer
Serializer for html5lib tree-construction tests. The static serialize(node) method returns the tree-test representation used by tests/html/tree-construction.zzs. It serializes document and fragment children, element namespaces, sorted attributes, comments, doctypes, text nodes, and template content in the shape expected by the vendored fixtures.
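The classes above might be combined roughly as follows. This is a minimal sketch, not a tested example: it reuses only the import and constructor forms shown in the SYNOPSIS and the methods named in this section. It assumes that HTMLTreeTestSerializer can be imported in the same way as HTMLTreeBuilder and that its static method is called as HTMLTreeTestSerializer.serialize(...). The variable names (result, doc, problems, dump) are illustrative only.

```
from html/treebuilder import HTMLTreeBuilder;
from html/treebuilder import HTMLTreeTestSerializer;
let result := new HTMLTreeBuilder(
    _input: "<!doctype html><title>Example</title>",
).parse();
let doc := result.document();
let problems := result.errors();
let dump := HTMLTreeTestSerializer.serialize(doc);
```

Under these assumptions, doc would hold the parsed HTMLDocument, problems a copy of the parse errors collected during the parse, and dump the tree-test representation of the document. Because this is a full-document parse, result.fragment() is documented to return null.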
LIMITATIONS
This module implements the tree-construction behaviour claimed by the distribution tests, not every edge case in the WHATWG algorithm. Known html5lib expected failures are tracked in tests/html/tree-construction-xfails.zzm and summarized in the distribution README.
Script execution during parsing is not implemented. The scripting flag affects noscript parsing decisions but does not run scripts or allow parser-time script DOM mutation.
COPYRIGHT AND LICENCE
html/treebuilder is copyright Toby Inkster.
It is free software; you may redistribute it and/or modify it under the terms of either the Artistic License 1.0 or the GNU General Public License version 2.