TypeScript/JavaScript implementation of DocLang v0.6: lossless XML parsing, typed AST model, serialization, validation (XSD- + Schematron-equivalent) and CLI — plus converters Markdown / HTML / PDF → DocLang.
- Reference specification: doclang-project/doclang (v0.6).
- Document examples: examples/.

✅ Core MVP (`core`, `xml`, `validator`, `cli`) and converters (`markdown`, `html`,

| Package | Role | Status |
|---|---|---|
| `@doclith/core` | Pure domain: lossless AST, constants, errors, traversal | ✅ |
| `@doclith/xml` | XML parsing (anti-XXE) + serialization (round-trip) | ✅ |
| `@doclith/validator` | XSD-equivalent + Schematron-equivalent validation (13 rules) | ✅ |
| `@doclith/cli` | CLI `doclang validate` (text/json, exit codes 0/1/2) | ✅ |
| `@doclith/markdown` | Markdown (GFM) → DocLang | ✅ |
| `@doclith/html` | HTML → DocLang (parse5) | ✅ |
| `@doclith/pdf` | PDF → DocLang (pdfjs, text+structure, no OCR) | ✅ |

```typescript
import { readFileSync } from "node:fs";
import { parseDocLang, serializeDocLang } from "@doclith/xml";
import { validateDocLang } from "@doclith/validator";

const xml = readFileSync("document.dclg.xml", "utf8"); // any DocLang XML string works here

const doc = parseDocLang(xml); // lossless AST (node order + mixed text preserved)
const result = validateDocLang(doc, { allowEmptyNamespace: false });
if (!result.valid) console.error(result.issues); // structured issues (code, source, path, …)
const out = serializeDocLang(doc, { pretty: true }); // lossless round-trip
```

```sh
doclang validate document.dclg.xml            # ✓/✗ + issues, exit 0/1
doclang validate document.dclg.xml -f json    # machine-readable JSON output
doclang validate document.dclg.xml --xsd-only # structure only
doclang validate document.dclg.xml -n         # tolerate missing namespace
```

Exit codes: 0 valid · 1 invalid · 2 usage/file error.

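
The exit codes lend themselves to scripting. The following is a minimal sketch, not part of the packages, of how a Node script might classify a `doclang validate` run. The names `ValidationOutcome`, `outcomeFromExitCode`, and `runValidate` are hypothetical, and the sketch assumes the `doclang` binary is on the `PATH`.

```typescript
import { spawnSync } from "node:child_process";

// Outcome labels are this sketch's own naming; only the numeric codes
// (0 valid, 1 invalid, 2 usage/file error) come from the CLI description.
type ValidationOutcome = "valid" | "invalid" | "usage-or-file-error" | "unknown";

// Map a `doclang validate` exit code to an outcome label.
function outcomeFromExitCode(code: number | null): ValidationOutcome {
  switch (code) {
    case 0:
      return "valid";
    case 1:
      return "invalid";
    case 2:
      return "usage-or-file-error";
    default:
      // Any other code, or null when the process could not be started
      // or was terminated by a signal.
      return "unknown";
  }
}

// Hypothetical wrapper: run the CLI on one file and classify the result.
// CLI output is passed straight through to the parent's stdout/stderr.
function runValidate(file: string): ValidationOutcome {
  const result = spawnSync("doclang", ["validate", file], { stdio: "inherit" });
  return outcomeFromExitCode(result.status);
}
```

Keeping the mapping in a pure function separate from the process call makes it easy to unit-test without the CLI installed.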
```typescript
import { readFileSync } from "node:fs";
import { markdownToDocLangXml } from "@doclith/markdown";
import { htmlToDocLangXml } from "@doclith/html";
import { pdfToDocLang } from "@doclith/pdf"; // async; text + structure, no OCR

const xml1 = markdownToDocLangXml("# Title\n\n- a\n- b");
const xml2 = htmlToDocLangXml("<h1>Title</h1><ul><li>a</li></ul>");

const pdfBytes = readFileSync("input.pdf"); // placeholder path; any PDF byte source works
const doc = await pdfToDocLang(new Uint8Array(pdfBytes));
```

Each converter produces a valid document (verified in tests against XSD + Schematron).

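
Validity of converter output can also be re-checked at runtime by feeding it back through the parser and validator. The sketch below shows one way to wire that up. The helper `convertChecked` and the `Deps` interface are hypothetical, and the `Issue` shape assumes `code`, `source`, and `path` are strings, which the packages' real types may not match. The functions are injected so the sketch stays independent of the actual exports.

```typescript
// Minimal local types for this sketch; the real packages export their own.
// Assumption: the three issue fields named in the docs are strings.
interface Issue {
  code: string;
  source: string;
  path: string;
}

interface ValidationResult {
  valid: boolean;
  issues: Issue[];
}

// The three steps are injected so any synchronous converter can be plugged in.
interface Deps<Doc> {
  convert: (input: string) => string; // e.g. markdownToDocLangXml
  parse: (xml: string) => Doc; // e.g. parseDocLang
  validate: (doc: Doc) => ValidationResult; // e.g. a wrapper around validateDocLang
}

// Convert the input, then parse and validate the produced XML.
// Returns the XML when valid; throws with a summary of issues otherwise.
function convertChecked<Doc>(input: string, deps: Deps<Doc>): string {
  const xml = deps.convert(input);
  const result = deps.validate(deps.parse(xml));
  if (!result.valid) {
    const summary = result.issues
      .map((issue) => `${issue.code} at ${issue.path} (${issue.source})`)
      .join("; ");
    throw new Error(`Converter output failed validation: ${summary}`);
  }
  return xml;
}
```

With the real packages, `deps` would presumably be `{ convert: markdownToDocLangXml, parse: parseDocLang, validate: (doc) => validateDocLang(doc, { allowEmptyNamespace: false }) }`. The asynchronous PDF converter would need an `async` variant of the same helper.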
```sh
pnpm install
pnpm typecheck && pnpm lint && pnpm test && pnpm build
```

pnpm monorepo, strict TypeScript (ESM, NodeNext), Vitest. CI: lint + typecheck + build + test. License: Apache-2.0.