Comparison of character encoding detection engines

encoding lang universalchardet icu libguess
ISO-8859-1
ISO-8859-2 polish
ISO-8859-6
ISO-2022-CN chinese
BIG5 chinese/taiwanese
GB2312 chinese
EUC-TW taiwanese
GB18030 chinese
HZ-GB-2312 [1] chinese
ISO-8859-5 russian(Cyrillic)
KOI8-R russian(Cyrillic)
KOI8-U russian
WINDOWS-1251 russian(Cyrillic)
MACCYRILLIC russian(Cyrillic)
IBM866 russian(Cyrillic)
IBM855 russian(Cyrillic)
ISO-8859-7 greek
WINDOWS-1253 greek
ISO-8859-8 hebrew
ISO-8859-9 turkish
WINDOWS-1255 hebrew
ISO-8859-6 arabic
WINDOWS-1256 arabic
ISO-2022-JP japanese
SHIFT_JIS japanese
EUC-JP japanese
ISO-2022-KR korean
EUC-KR korean
johab korean
cp1250 polish
cp1254 turkish
ISO-8859-13 baltic
cp1257 baltic
UTF-8 *
UTF-16BE *
UTF-16LE
UTF-32BE *
UTF-32LE *
X-ISO-10646-UCS-4-3412 [1]
X-ISO-10646-UCS-4-2143 [1]
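
The Unicode rows at the end of the table (UTF-8, UTF-16, UTF-32 and the two UCS-4 variants with unusual 3412 and 2143 octet orders) are the ones a detector can often settle from a byte-order mark alone, before any statistical analysis. The following is a minimal sketch of that BOM check, not the API of universalchardet, icu or libguess; the function name `sniff_bom` and the table name `BOMS` are hypothetical and exist only for illustration.

```python
from typing import Optional

# Byte-order marks, longest signatures first so that UTF-32LE (FF FE 00 00)
# is not mistaken for UTF-16LE (FF FE), and UCS-4-3412 (FE FF 00 00)
# is not mistaken for UTF-16BE (FE FF).
BOMS = [
    (b"\x00\x00\xfe\xff", "UTF-32BE"),
    (b"\xff\xfe\x00\x00", "UTF-32LE"),
    (b"\xfe\xff\x00\x00", "X-ISO-10646-UCS-4-3412"),  # unusual octet order
    (b"\x00\x00\xff\xfe", "X-ISO-10646-UCS-4-2143"),  # unusual octet order
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding name implied by a leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    print(sniff_bom(b"\xef\xbb\xbfhello"))
    print(sniff_bom(b"no byte-order mark here"))
```

A BOM is only a hint: UTF-16LE text that begins with a NUL character also starts with FF FE 00 00, so this check would report it as UTF-32LE. Input without a BOM, which includes every legacy encoding in the table, still needs the statistical or state-machine analysis that the engines provide.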