Module Xml_parser

Xml Light Parser

While basic parsing functions are available in the Xml module, this module provides a way to create, configure and run an Xml parser.

type xml = Xml_datatype.xml

An Xml node is either an Element (tag-name, attributes, children) or PCData text.
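As an illustration of this shape, the following minimal sketch concatenates all the text found below a node. It assumes that the constructors are exposed by Xml_datatype under the names Element and PCData, and that PCData carries a string; text_of is a hypothetical helper, not part of the library.

```ocaml
(* Hypothetical helper: collect every PCData string below a node. *)
let rec text_of (node : Xml_datatype.xml) : string =
  match node with
  | Xml_datatype.Element (_tag, _attributes, children) ->
    (* Recurse into the children and join their text in document order. *)
    String.concat "" (List.map text_of children)
  | Xml_datatype.PCData text -> text
```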

type t

Abstract type for an Xml parser.

Xml Exceptions

Several exceptions can be raised when parsing an Xml document:

type error_pos
type error_msg =
| UnterminatedComment
| UnterminatedString
| UnterminatedEntity
| IdentExpected
| CloseExpected
| NodeExpected
| AttributeNameExpected
| AttributeValueExpected
| EndOfTagExpected of string
| EOFExpected
| Empty
type error = error_msg * error_pos
exception Error of error
exception File_not_found of string
val error : error -> string

Get a full error message from an Xml error.

val error_msg : error_msg -> string

Get the Xml error message as a string.

val line : error_pos -> int

Get the line the error occurred at.

val range : error_pos -> int * int

Get the relative character range (in current line) the error occurred at.

val abs_range : error_pos -> int * int

Get the absolute character range the error occurred at.
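The error function already produces a full message. When a custom layout is wanted, the accessors above can be combined by hand, as in this sketch; describe_error and its output format are hypothetical and only rely on error being a pair of error_msg and error_pos, as declared above.

```ocaml
(* Hypothetical helper: render an error using the individual accessors. *)
let describe_error ((msg, pos) : Xml_parser.error) : string =
  (* Character range relative to the line on which the error occurred. *)
  let first, last = Xml_parser.range pos in
  Printf.sprintf "line %d, characters %d-%d: %s"
    (Xml_parser.line pos) first last (Xml_parser.error_msg msg)
```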

val pos : Stdlib.Lexing.lexbuf -> error_pos
type source =
| SChannel of Stdlib.in_channel
| SString of string
| SLexbuf of Stdlib.Lexing.lexbuf

Several kinds of resources can contain Xml documents.

val make : source -> t

This function returns a new parser with default options.
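A short sketch of building a parser from each kind of source follows. The three wrapper names are hypothetical; only make and the source constructors come from this interface.

```ocaml
(* Parser reading from an in-memory string. *)
let parser_of_string (s : string) : Xml_parser.t =
  Xml_parser.make (Xml_parser.SString s)

(* Parser reading from an already opened channel; in this sketch the
   caller stays responsible for closing the channel after parsing. *)
let parser_of_channel (ic : in_channel) : Xml_parser.t =
  Xml_parser.make (Xml_parser.SChannel ic)

(* Parser reading from an existing lexing buffer. *)
let parser_of_lexbuf (lb : Lexing.lexbuf) : Xml_parser.t =
  Xml_parser.make (Xml_parser.SLexbuf lb)
```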

val check_eof : t -> bool -> unit

When an Xml document is parsed, the parser may check that the end of the document has been reached, so that, for example, parsing "<A/><B/>" fails instead of returning only the A element. You can turn on this check by setting check_eof to true (by default, check_eof is false, unlike in the original Xmllight).
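The effect of the check can be sketched as follows. The helper is_single_document is hypothetical, and the sketch assumes that the failure caused by trailing content is reported through the Error exception declared above.

```ocaml
(* Hypothetical helper: true when the whole input parses as one document,
   false when the parser raises Xml_parser.Error (for instance because
   content remains after the first element). *)
let is_single_document (s : string) : bool =
  let p = Xml_parser.make (Xml_parser.SString s) in
  (* Enable the end-of-document check, which is off by default. *)
  Xml_parser.check_eof p true;
  match Xml_parser.parse p with
  | _ -> true
  | exception Xml_parser.Error _ -> false
```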

val parse : ?canonicalize:bool -> t -> xml

Once the parser is configured, you can run it on any kind of Xml document source to parse its contents into an Xml data structure.

When canonicalize is set, the parser tries to remove blank PCDATA elements.
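Putting the pieces together, a minimal end-to-end sketch might look like the following. The function parse_string is hypothetical; it parses a string with canonicalize set and turns a parsing failure into a readable message through error, assuming such failures are raised as the Error exception.

```ocaml
(* Hypothetical helper: parse a string, returning either the document or
   the full error message produced by Xml_parser.error. *)
let parse_string (s : string) : (Xml_datatype.xml, string) result =
  let p = Xml_parser.make (Xml_parser.SString s) in
  (* Ok and Error below are the constructors of the result type. *)
  try Ok (Xml_parser.parse ~canonicalize:true p)
  with Xml_parser.Error e -> Error (Xml_parser.error e)
```

Returning a result keeps the exception local to the helper, so callers can pattern match on the outcome instead of installing their own handler.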