Method validate_utf8()

bool validate_utf8(utf8_string s)
bool validate_utf8(utf8_string s, int extended)

Description

Checks whether a string is a valid UTF-8 byte-stream.

Parameter s

String of UTF-8 encoded data to validate.

Parameter extended

Bitmask with extension options.

1

Accept the extension used by string_to_utf8(), including lone UTF-16 surrogates.

2

Accept UTF-8 encoded UTF-16 (i.e. accept valid surrogate pairs).

Returns

Returns 0 (zero) if the stream is not a legal UTF-8 byte-stream, and 1 if it is.
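Example

A brief usage sketch of the two option bits. The byte strings are hand-encoded for illustration and are assumptions, not part of the reference. The comments restate what the parameter descriptions above imply and are not recorded output.

```pike
int main()
{
  // Hand-encoded illustrative inputs (assumptions, not from the reference).
  string ascii = "hello";                     // plain ASCII
  string lone  = "\xed\xa0\x80";              // U+D800, a lone high surrogate
  string pair  = "\xed\xa0\xbd\xed\xb8\x80";  // U+D83D U+DE00, a surrogate pair

  // Strict mode: per the descriptions above, only the ASCII string should pass.
  write("strict: %d %d %d\n",
        validate_utf8(ascii), validate_utf8(lone), validate_utf8(pair));

  // Bit 1 is documented to accept lone UTF-16 surrogates.
  write("extended 1, lone surrogate: %d\n", validate_utf8(lone, 1));

  // Bit 2 is documented to accept valid surrogate pairs.
  write("extended 2, surrogate pair: %d\n", validate_utf8(pair, 2));

  return 0;
}
```

Because extended is a bitmask, passing 3 should enable both extensions at once.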

Note

In conformance with RFC 3629 and Unicode 3.1 and later, non-shortest forms are considered invalid.
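
To make the non-shortest-form rule concrete, the sketch below compares the one-byte encoding of U+002F ('/') with a two-byte encoding of the same code point. The inputs are illustrative assumptions. The comments describe the behaviour the note implies, not captured output.

```pike
int main()
{
  // U+002F ('/') in its shortest, one-byte form.
  string shortest = "/";

  // The same code point padded out to two bytes (0xC0 0xAF):
  // a non-shortest (overlong) form.
  string overlong = "\xc0\xaf";

  write("%d\n", validate_utf8(shortest));  // well-formed UTF-8
  write("%d\n", validate_utf8(overlong));  // invalid according to the note above

  return 0;
}
```

Rejecting such forms matters when validation precedes filtering: an overlong '/' could otherwise slip past a check that only looks for the single byte 0x2F.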

See also

Charset.encoder(), string_to_unicode(), string_to_utf8(), unicode_to_string(), utf8_to_string()