What Is XML and Why Format It
XML is a markup language for encoding hierarchical, document-oriented data. Each piece of information lives inside a tag, tags can contain other tags, and attributes attached to a tag carry metadata that doesn't fit into the main content. Where JSON is optimized for terse machine-to-machine data exchange, XML is built for documents that mix structured data with prose. That is why it underpins office documents, SVG graphics, and the configuration formats of many enterprise platforms that predate JSON's rise.
Real-world XML often arrives without indentation: minified, single-line responses from a SOAP endpoint, log entries crammed into one column of a database, or copy-pasted blobs from a vendor's documentation. Reading nested XML without formatting is painful, because you quickly lose track of which closing tag matches which opening tag a few hundred characters earlier. Pretty-printing restores the visual hierarchy and makes the structure readable at a glance.
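As a rough illustration of what pretty-printing does, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree (Python 3.9 or later for ET.indent) as a stand-in for the browser parser, so it is not this tool's implementation; the function name pretty_print and the sample order document are made up for the example.

```python
import xml.etree.ElementTree as ET


def pretty_print(xml_text: str) -> str:
    """Parse a single-line XML string and return it with two-space indentation."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed input
    ET.indent(root, space="  ")     # inserts newlines and indentation in place
    return ET.tostring(root, encoding="unicode")


# Hypothetical minified input, as it might arrive from an API response.
minified = "<order><id>7</id><items><item/></items></order>"
print(pretty_print(minified))
```

Note that ElementTree writes empty elements as `<item />` with a space before the slash, and it drops comments by default, so the output style differs slightly from what this tool emits.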
Validation matters too. The browser's XML parser rejects malformed input with an error pointing at the first problem it finds: usually a mismatched tag, an unescaped ampersand or angle bracket, or a missing closing quote on an attribute. Catching these in a formatter, before the data reaches production, saves the much more painful experience of debugging an XML error from a server log. This tool uses the browser's built-in XML parser, so input that formats here is well-formed and should parse in any conforming XML parser, although a downstream consumer may still impose its own schema or encoding requirements.
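The same kind of well-formedness check can be sketched outside the browser. The example below assumes Python's xml.etree.ElementTree rather than DOMParser, so the error wording will differ from what this tool shows; first_error is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def first_error(xml_text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first parse error, or None if well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # location reported by the underlying expat parser
        return line, column, str(err)
    return None


# Three typical mistakes: mismatched tag, unescaped ampersand, missing closing quote.
for sample in ("<a><b></a>", "<a>Tom & Jerry</a>", '<a href="x></a>'):
    print(sample, "->", first_error(sample))
```

Like the browser parser, this stops at the first problem, so a document with several mistakes has to be fixed and re-checked one error at a time.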
Frequently Asked Questions
What is XML and where is it still used?
XML (eXtensible Markup Language) is a text-based markup format that uses nested tags to represent hierarchical data. It was the dominant data-interchange format before JSON and is still the backbone of many enterprise systems and document formats: SOAP web services, RSS and Atom feeds, SVG graphics, Office Open XML (.docx, .xlsx), Android resource files, Maven build files (pom.xml), Java Spring XML configuration, and many B2B exchange formats (HL7 v3 and CDA, FpML, ISO 20022). New systems usually pick JSON, but XML is far from gone, and any developer integrating with banks, healthcare, government, or legacy enterprise software ends up reading and writing it.
How is this XML formatter different from copying into an IDE?
Some IDE formatters re-serialize the parsed document, which, depending on settings, can add whitespace inside text content or reorder attributes. This tool uses your browser's DOMParser, then walks the tree and emits each element on its own line with consistent two-space indentation. Self-closing tags stay self-closing, attribute order is preserved, and CDATA sections are kept verbatim. For one-off cleanup of XML you pasted from a log file or API response, this is faster than opening an IDE.
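The parse-then-walk approach can be sketched as follows. This is an illustration in Python with xml.etree.ElementTree, not this tool's code, and it makes simplifying assumptions that are labeled in the comments: namespaces, comments, and CDATA are not handled, and an empty element is always written self-closing because ElementTree does not record how it was written in the source. The names format_element and format_xml are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def format_element(elem: ET.Element, depth: int = 0) -> list:
    """Return the lines for one element, indented two spaces per nesting level."""
    pad = "  " * depth
    # attrib keeps document order, so attributes come out as they went in.
    attrs = "".join(f" {name}={quoteattr(value)}" for name, value in elem.attrib.items())
    children = list(elem)
    if not children:
        if elem.text:
            # Leaf text is emitted verbatim (escaped), with no added whitespace.
            return [f"{pad}<{elem.tag}{attrs}>{escape(elem.text)}</{elem.tag}>"]
        return [f"{pad}<{elem.tag}{attrs}/>"]
    lines = [f"{pad}<{elem.tag}{attrs}>"]
    # Simplification: text that sits beside child elements gets its own trimmed line.
    if elem.text and elem.text.strip():
        lines.append(f"{pad}  {escape(elem.text.strip())}")
    for child in children:
        lines.extend(format_element(child, depth + 1))
        if child.tail and child.tail.strip():
            lines.append(f"{pad}  {escape(child.tail.strip())}")
    lines.append(f"{pad}</{elem.tag}>")
    return lines


def format_xml(xml_text: str) -> str:
    return "\n".join(format_element(ET.fromstring(xml_text)))


print(format_xml('<r b="1" a="2"><x>a &amp; b</x><y/></r>'))
```

Writing the walker by hand, rather than relying on a library serializer, is what gives control over details such as attribute order and the exact self-closing syntax.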
What is the difference between formatting and minifying XML?
Formatting (pretty-printing) adds line breaks and indentation so the structure is easy to read — ideal for debugging, code review, or documentation. Minifying strips whitespace between elements and collapses indentation, producing the most compact representation suitable for storage or network transmission. Whitespace inside text content is preserved by both modes because changing it would alter the document's meaning.
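A minimal sketch of the minifying direction, again assuming Python's xml.etree.ElementTree rather than this tool's own code: it removes whitespace-only text between elements and leaves text inside leaf elements alone. The helper name minify_xml is made up for the example.

```python
import xml.etree.ElementTree as ET


def minify_xml(xml_text: str) -> str:
    """Drop whitespace-only text between elements; keep text inside leaf elements."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Whitespace before the first child of a parent element is indentation.
        if len(elem) and elem.text is not None and not elem.text.strip():
            elem.text = None
        # Whitespace-only text following an element is indentation as well.
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


indented = "<a>\n  <b> x y </b>\n  <c/>\n</a>"
print(minify_xml(indented))
```

One caveat with this sketch: in mixed content such as `<p><b>bold</b> <i>italic</i></p>`, the single space between the two inline elements is whitespace-only and would be removed, so document-style XML needs more care than data-style XML.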
Does this tool validate XML against a schema (XSD or DTD)?
No. It checks that the XML is well-formed (parseable) but does not validate it against an XSD or DTD schema. Browsers do not expose an XML Schema validator, so a tool that offered schema validation would have to ship a large validator of its own, for example libxml2 (the library behind xmllint) compiled to WebAssembly. For schema validation, use xmllint on the command line, an IDE plugin, or your application's own validator. Well-formedness checking, which is what this tool does, catches the syntax errors that most often break XML in practice: mismatched tags, unescaped characters, and unterminated attribute values.
Why does the formatter sometimes add a missing XML declaration?
It does not add anything. If your input lacks the optional declaration, the output also lacks it. XML 1.0 makes the declaration optional, so the browser's DOMParser accepts documents without one. The document must still have exactly one root element, though: a fragment with several top-level elements, or none, is not well-formed and will be rejected. Add the declaration manually if your downstream consumer requires it.
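If a downstream consumer does require the declaration, adding it is a one-line string operation. The sketch below is an assumption-laden example in Python, not a feature of this tool: ensure_declaration is a hypothetical helper, and it assumes the text will be saved or sent as UTF-8.

```python
import re
import xml.etree.ElementTree as ET


def ensure_declaration(xml_text: str) -> str:
    """Prepend an XML declaration unless the text already starts with one."""
    # Require whitespace after "<?xml" so "<?xml-stylesheet ...?>" is not mistaken for it.
    if re.match(r"<\?xml\s", xml_text):
        return xml_text
    # Assumption: the document is encoded as UTF-8 when written out.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_text


formatted = "<note>\n  <to>Ada</to>\n</note>"
with_declaration = ensure_declaration(formatted)
print(with_declaration)

# Both versions parse to the same root element.
assert ET.fromstring(formatted).tag == ET.fromstring(with_declaration.encode("utf-8")).tag
```

The declaration has to be the very first thing in the document, with no whitespace or byte-order confusion before it, which is why the helper prepends it rather than inserting it elsewhere.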