Removing non-UTF8 characters from strings with PHP

[sourcecode lang="php"]
//reject overly long 2 byte sequences,
//as well as characters above U+10000
//and replace with ?
$some_string = preg_replace('/'.
'?', $some_string );

//reject overly long 3 byte sequences
//and UTF-16 surrogates and replace with ?
$some_string = preg_replace('/'.
'?', $some_string );
[/sourcecode]
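
The regular expressions in the listing above are incomplete as shown, so here is a minimal sketch of one way to get the behaviour its comments describe. It is an assumption, not the original patterns. The helper name `replace_invalid_utf8` and its `$replacement` parameter are hypothetical. Instead of listing every bad sequence, it whitelists well-formed 1 to 3 byte sequences and replaces any other byte with `?`, which rejects overlong 2 and 3 byte forms, UTF-16 surrogates, and everything at or above U+10000.

```php
<?php
/**
 * Replace every byte that is not part of a well-formed 1 to 3 byte
 * UTF-8 sequence with $replacement.
 * Hypothetical helper written for illustration only.
 */
function replace_invalid_utf8(string $input, string $replacement = '?'): string
{
    // No "u" modifier on purpose: the pattern must work on raw bytes.
    $pattern = '/
        (                                        # group 1: one well-formed sequence
            [\x00-\x7F]                          # ASCII
          | [\xC2-\xDF][\x80-\xBF]               # 2 bytes, overlong C0 and C1 excluded
          | \xE0[\xA0-\xBF][\x80-\xBF]           # 3 bytes, overlong E0 forms excluded
          | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}    # 3 bytes, general case
          | \xED[\x80-\x9F][\x80-\xBF]           # 3 bytes, UTF-16 surrogates excluded
        )
      | .                                        # any other single byte
    /xs';

    $result = preg_replace_callback(
        $pattern,
        static function (array $m) use ($replacement): string {
            // Group 1 is only set when a well-formed sequence matched.
            return isset($m[1]) ? $m[1] : $replacement;
        },
        $input
    );

    if ($result === null) {
        throw new RuntimeException('preg_replace_callback failed: ' . preg_last_error());
    }

    return $result;
}
```

With this sketch, `replace_invalid_utf8("a\x80b")` returns `a?b`, and a 4 byte sequence such as `"\xF0\x9F\x98\x80"` comes back as `????`, because each offending byte is replaced individually rather than the whole sequence at once.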

Via Remove non-UTF8 characters from string with PHP

