For example, the rendered documentation for wp_is_valid_utf8() shows this example text:
true === wp_is_valid_utf8( '' );
true === wp_is_valid_utf8( 'just a test' );
true === wp_is_valid_utf8( "\xE2\x9C\x8F" ); // Pencil, U+270F.
true === wp_is_valid_utf8( "\u{270F}" ); // Pencil, U+270F.
true === wp_is_valid_utf8( '' ); // Pencil, U+270F.
false === wp_is_valid_utf8( "just xC0 test" ); // Invalid bytes.
false === wp_is_valid_utf8( "\xE2\x9C" ); // Invalid/incomplete sequences.
false === wp_is_valid_utf8( "\xC1\xBF" ); // Overlong sequences.
false === wp_is_valid_utf8( "\xED\xB0\x80" ); // Surrogate halves.
false === wp_is_valid_utf8( "B\xFCch" ); // ISO-8859-1 high-bytes.
// E.g. The “ü” in ISO-8859-1 is a single byte 0xFC,
// but in UTF-8 is the two-byte sequence 0xC3 0xBC.
However, the example in the source docblock is this:
true === wp_is_valid_utf8( '' );
true === wp_is_valid_utf8( 'just a test' );
true === wp_is_valid_utf8( "\xE2\x9C\x8F" ); // Pencil, U+270F.
true === wp_is_valid_utf8( "\u{270F}" ); // Pencil, U+270F.
true === wp_is_valid_utf8( '✏' ); // Pencil, U+270F.
false === wp_is_valid_utf8( "just \xC0 test" ); // Invalid bytes.
false === wp_is_valid_utf8( "\xE2\x9C" ); // Invalid/incomplete sequences.
false === wp_is_valid_utf8( "\xC1\xBF" ); // Overlong sequences.
false === wp_is_valid_utf8( "\xED\xB0\x80" ); // Surrogate halves.
false === wp_is_valid_utf8( "B\xFCch" ); // ISO-8859-1 high-bytes.
// E.g. The “ü” in ISO-8859-1 is a single byte 0xFC,
// but in UTF-8 is the two-byte sequence 0xC3 0xBC.
Differences:
"just \xC0 test" becomes "just xC0 test" (the backslash is lost)
'✏' becomes '' (the pencil character is removed)
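To illustrate why these two differences matter, here is a minimal sketch that runs outside WordPress. It assumes the mbstring extension is available and uses mb_check_encoding() as a stand-in for wp_is_valid_utf8(); the helper name is_valid_utf8_sketch() is hypothetical and not part of WordPress.

```php
<?php
// Hypothetical stand-in for wp_is_valid_utf8(), so the snippet runs
// without WordPress. Assumption: the mbstring extension is loaded.
function is_valid_utf8_sketch( string $bytes ): bool {
	return mb_check_encoding( $bytes, 'UTF-8' );
}

$with_backslash    = "just \xC0 test"; // Contains the raw byte 0xC0.
$without_backslash = "just xC0 test";  // Plain ASCII: the letters "x", "C", "0".

var_dump( is_valid_utf8_sketch( $with_backslash ) );    // bool(false)
var_dump( is_valid_utf8_sketch( $without_backslash ) ); // bool(true)

// With the pencil removed, the line only repeats the empty-string check.
var_dump( '' === '✏' ); // bool(false)
```

Without the backslash the string is ordinary ASCII, so the "Invalid bytes" example no longer demonstrates an invalid byte, and the "Pencil, U+270F" comment ends up attached to an empty string.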