Module Unicode
Unicode utilities
val classify : int -> status
Classify a Unicode character into one of three classes, or as unknown.
val ident_refutation : string -> (bool * string) option
Returns None if the given string can be used as a (Coq) identifier. Otherwise returns Some (b, s), where s is an explanation and b is the severity.
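As an illustration of how a caller might consume a result of this shape, here is a minimal sketch. The helper report_ident_check is hypothetical and not part of this module. It also assumes that b = true marks a fatal problem and b = false a milder one; the description above only says that b is a severity.

```ocaml
(* Hypothetical helper, not part of the Unicode module: renders a result of
   type (bool * string) option as a one-line message.
   ASSUMPTION: [true] marks a fatal problem, [false] a milder one. *)
let report_ident_check (r : (bool * string) option) : string =
  match r with
  | None -> "valid identifier"
  | Some (true, why) -> "error: " ^ why
  | Some (false, why) -> "warning: " ^ why

let () =
  (* Hand-written sample values standing in for ident_refutation results. *)
  print_endline (report_ident_check None);
  print_endline (report_ident_check (Some (false, "sample explanation")))
```

With the real module, the argument would be the value returned by ident_refutation for the string being checked.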
val is_valid_ident_initial : status -> bool
Tells whether the given status is that of a valid initial character for an identifier.
val is_valid_ident_trailing : status -> bool
Tells whether the given status is that of a valid non-initial character for an identifier.
val is_letter : status -> bool
Tells whether the given status is that of a letter.
val is_unknown : status -> bool
Tells if a character is unclassified
val lowercase_first_char : string -> string
Returns the first character of a string, converted to lowercase.
- raises Assert_failure if the input string is empty.
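The contract above (only the first character is returned, and the empty string trips an assertion) can be sketched as follows. This is an ASCII-only stand-in written for illustration: the name lowercase_first_char_ascii and the restriction to single-byte characters are assumptions, and it does not handle multi-byte UTF-8 characters.

```ocaml
(* ASCII-only stand-in for illustration; the name and the ASCII restriction
   are assumptions, not the module's behavior on multi-byte characters. *)
let lowercase_first_char_ascii (s : string) : string =
  assert (s <> "");  (* raises Assert_failure on the empty string *)
  String.make 1 (Char.lowercase_ascii s.[0])

let () =
  (* Only the first character is returned, not the whole string. *)
  print_endline (lowercase_first_char_ascii "Foo")  (* prints "f" *)
```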
val split_at_first_letter : string -> (string * string) option
Splits a string, assumed to be an identifier, at its first letter. As an optimization, returns None if the first character is already a letter.
val is_basic_ascii : string -> bool
Returns true if all UTF-8 characters in the input string are plain ASCII characters. Returns false otherwise.
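A check of this kind can be done byte by byte, because every byte of a multi-byte UTF-8 sequence has its high bit set. The following is a minimal sketch under that observation; is_basic_ascii_sketch is a hypothetical name and is not claimed to match the module's implementation.

```ocaml
(* Hypothetical re-implementation for illustration: a string is plain ASCII
   exactly when every byte is below 128. *)
let is_basic_ascii_sketch (s : string) : bool =
  let len = String.length s in
  let rec check i = i >= len || (Char.code s.[i] < 128 && check (i + 1)) in
  check 0

let () =
  Printf.printf "%b\n" (is_basic_ascii_sketch "nat_ind");   (* prints true *)
  Printf.printf "%b\n" (is_basic_ascii_sketch "\xce\xb1")   (* U+03B1; prints false *)
```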
val ascii_of_ident : string -> string
ascii_of_ident s maps a UTF-8 string to a string composed solely of ASCII characters. Non-ASCII characters are translated to "_UUxxxx_", where xxxx is the Unicode index of the character in hexadecimal (four to six hex digits). To avoid potential name clashes, any preexisting substring "_UU" is turned into "_UUU".
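The encoding described above can be sketched as follows. This is an illustrative re-implementation, not the module's code: the names decode_utf8 and ascii_of_ident_sketch are hypothetical, the input is assumed to be well-formed UTF-8, and lowercase hex digits are an assumption since the description does not specify the case.

```ocaml
(* Decode the UTF-8 code point starting at byte [i] of [s]; returns the code
   point and its length in bytes. ASSUMPTION: [s] is well-formed UTF-8. *)
let decode_utf8 (s : string) (i : int) : int * int =
  let byte k = Char.code s.[i + k] in
  let cont k = (byte k) land 0x3F in
  let b0 = byte 0 in
  if b0 < 0x80 then (b0, 1)
  else if b0 < 0xE0 then (((b0 land 0x1F) lsl 6) lor (cont 1), 2)
  else if b0 < 0xF0 then
    (((b0 land 0x0F) lsl 12) lor ((cont 1) lsl 6) lor (cont 2), 3)
  else
    (((b0 land 0x07) lsl 18) lor ((cont 1) lsl 12)
     lor ((cont 2) lsl 6) lor (cont 3), 4)

(* Hypothetical sketch of the described mapping. ASSUMPTION: lowercase hex. *)
let ascii_of_ident_sketch (s : string) : string =
  let len = String.length s in
  let buf = Buffer.create (2 * len) in
  let has_uu i = i + 3 <= len && String.sub s i 3 = "_UU" in
  let rec go i =
    if i < len then
      if has_uu i then begin
        (* Escape a preexisting "_UU" so it cannot clash with an encoding. *)
        Buffer.add_string buf "_UUU";
        go (i + 3)
      end else begin
        let code, n = decode_utf8 s i in
        if code < 128 then Buffer.add_char buf s.[i]
        else Printf.bprintf buf "_UU%04x_" code;  (* at least four hex digits *)
        go (i + n)
      end
  in
  go 0;
  Buffer.contents buf

let () =
  print_endline (ascii_of_ident_sketch "\xce\xb1");  (* U+03B1; prints _UU03b1_ *)
  print_endline (ascii_of_ident_sketch "a_UUb")      (* prints a_UUUb *)
```

Escaping "_UU" as "_UUU" keeps the mapping unambiguous: in the output, "_UU" followed by a hex digit can only come from a translated character.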