HTParse Manual Page

Availability

htparse is available in the MMM distribution. It is an Objective Caml program, using the same HTML parser as MMM.

Purpose

htparse parses HTML files according to the HTML 3.2 DTD and displays either the errors in the structure of the document or the structure itself.

Command line options

Usage: htparse <opts> file1.html ... filen.html
  -v		Verbose mode
  -struct n	(n=0,1,2) Displays a "parse tree"
  -nesting	Check nesting only
  -dtd		Displays DTD
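
When checking many files from a script, it can help to build the command line programmatically. The following is a minimal sketch, not part of htparse: the helper names htparse_command and run_htparse are hypothetical, it assumes htparse is on the PATH, and it assumes options come before file names as in the usage line above. Only the options listed above are used.

```python
import subprocess


def htparse_command(files, verbose=False, struct=None, nesting=False, dtd=False):
    """Build an htparse argument list from the documented options (hypothetical helper)."""
    if struct is not None and struct not in (0, 1, 2):
        raise ValueError("-struct takes 0, 1 or 2")
    cmd = ["htparse"]
    if verbose:
        cmd.append("-v")
    if struct is not None:
        cmd += ["-struct", str(struct)]
    if nesting:
        cmd.append("-nesting")
    if dtd:
        cmd.append("-dtd")
    # Options first, then the files, as in the usage line.
    return cmd + list(files)


def run_htparse(files, **options):
    """Run htparse (assumed to be on PATH) and return the completed process.

    Both output streams are captured unparsed, since this sketch makes no
    assumption about which stream htparse writes to or about its message format.
    """
    return subprocess.run(
        htparse_command(files, **options), capture_output=True, text=True
    )
```

For example, htparse_command(["foo.html"], nesting=True) yields the argument list for "htparse -nesting foo.html".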

Error detection

htparse detects only lexical errors and some structural errors. It knows nothing about tag attributes. The structure is not completely checked: for example, htparse checks only the nesting of tags (bracketing, according to the tag minimisation rules) and the legality of elements in the current context.
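
To illustrate the kind of nesting check described above, here is a small sketch in Python. It is not htparse's implementation (htparse is written in Objective Caml and driven by the DTD). It uses Python's standard html.parser tokenizer, and the EMPTY and OPTIONAL_END tables are small illustrative subsets of HTML 3.2 chosen for this example. It checks only that end tags match open elements, silently closing elements whose end tag may be omitted. It does not check which elements are legal in a given context, and it does not report elements left open at the end of the input.

```python
from html.parser import HTMLParser

# Illustrative subsets only, not the full HTML 3.2 DTD.
EMPTY = {"br", "hr", "img", "input", "meta", "link"}                 # no end tag at all
OPTIONAL_END = {"p", "li", "dt", "dd", "tr", "td", "th", "option"}   # end tag may be omitted


class NestingChecker(HTMLParser):
    """Stack-based nesting check that tolerates omitted optional end tags."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements, innermost last
        self.errors = []   # (line, column, message) tuples

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored; empty elements never go on the stack.
        if tag not in EMPTY:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY:
            return
        line, col = self.getpos()  # position reported by the tokenizer
        if tag not in self.stack:
            self.errors.append((line, col, f"unmatched </{tag}>"))
            return
        # Pop back to the matching element. Elements with an optional end
        # tag are closed silently; any other open element is an error.
        while True:
            top = self.stack.pop()
            if top == tag:
                break
            if top not in OPTIONAL_END:
                self.errors.append(
                    (line, col, f"<{top}> not closed before </{tag}>")
                )


def check_nesting(text):
    """Return the list of nesting errors found in an HTML string."""
    checker = NestingChecker()
    checker.feed(text)
    checker.close()
    return checker.errors
```

With these tables, check_nesting("<ul><li>one<li>two</ul>") reports nothing, because the LI end tags may be omitted, whereas check_nesting("<b><i>text</b></i>") reports two errors: the I element is still open when the B end tag arrives, and the later I end tag no longer matches anything.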

Output

htparse displays each error with its line number and its character position within the line. For each error, it prints the nature of the error and the corrective action taken by the parser.

Emacs mode

You can load html-error.el as a complement to your usual html-mode.el. Then, edit an HTML file (e.g. foo.html); the buffer should be in HTML mode.
Type: M-x compile,
Compile command: htparse foo.html,
and then C-x ` to go from error to error as usual.

Caveats

If the HTML file contains complex server-side includes that break the comment syntax, htparse will of course report errors.

Some knowledge of HTML syntax (e.g. the DTD) is recommended.

This is not an HTML validator. Do not under any circumstances claim that your HTML is correct because it passes htparse.