In sanitize_user() there is a line of code which is commented "Kill octets":
// Kill octets.
$username = preg_replace( '|%([a-fA-F0-9][a-fA-F0-9])|', '', $username );
I thought that octets were essentially bytes, which are used to encode Unicode characters (although I know many Unicode characters take more than one byte), so I do not understand why they need to be ‘killed’.
>Solution :
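sanitize_user() removes characters and sequences that aren’t allowed in WordPress user names. In that comment, "octet" means a percent-encoded byte as used in URLs: a % followed by two hexadecimal digits, which is exactly what the regular expression matches. User names like Mickey%20Mouse aren’t allowed. That user name attempts to include a space by including the %20 octet, the percent-encoded form of a space, so the sequence is removed.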
In general, sanitize operations strip out disallowed data.
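To see what that one line does on its own, here is a minimal sketch that runs outside WordPress. The helper name kill_octets() is made up for illustration. It wraps only the preg_replace() call quoted in the question and does not reproduce the rest of sanitize_user().

```php
<?php
// Hypothetical helper (not a WordPress function): applies only the
// "Kill octets" regex from the question to a string.
function kill_octets( string $username ): string {
    // Remove every "%" that is followed by exactly two hex digits.
    return (string) preg_replace( '|%([a-fA-F0-9][a-fA-F0-9])|', '', $username );
}

echo kill_octets( 'Mickey%20Mouse' ), "\n"; // MickeyMouse (%20 removed)
echo kill_octets( 'caf%C3%A9' ), "\n";      // caf (both octets of the encoded character removed)
echo kill_octets( '100%' ), "\n";           // 100% (no two hex digits after %, so unchanged)
```

Note that the octets are deleted rather than decoded, so an encoded space never turns into a real space in the user name.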