Supported Languages

The Perl extension Lingua::Lid implements an interface to lid, a C/C++ library that currently supports 42 languages and transliterations (26 languages plus 16 transliterated variants) in a variety of modern, common and legacy character encodings.

Support for additional languages and character encodings is added regularly as development of the underlying lid library proceeds. If you need support for a specific language or character encoding, feel free to contact us so we can add it quickly.

Language | ISO 639-3 Code | Character Encodings
Bulgarian | bul | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-5, Windows-1251, MacCyrillic, CP 855, CP 866, KOI8-R
Bulgarian (DIN 1460 transliteration) | bul | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, Windows-1250, ASCII
Bulgarian (ISO 9 transliteration) | bul | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ASCII
Bulgarian (Streamlined System transliteration) | bul | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, Windows-1250, ASCII
Czech | ces | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, Windows-1250, MacCentralEuropean, CP 852
Czech (Common transliteration) | ces | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Danish | dan | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, Windows-1252, MacRoman, CP 850, ASCII
Dutch | nld | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
English | eng | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, Windows-1252, MacRoman, CP 850, ASCII
Estonian | est | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-4, Windows-1257, MacCentralEuropean, CP 775, ASCII
Finnish | fin | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
French | fra | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
German | deu | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
German (Common transliteration) | deu | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Greek | ell | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-7, Windows-1253, MacGreek, CP 737
Greek (DIN 31634 transliteration) | ell | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ASCII
Greek (Greeklish transliteration) | ell | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Greek (ISO 843 transliteration) | ell | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ASCII
Hungarian | hun | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, ISO-8859-16, Windows-1250, CP 852, MacCentralEuropean
Irish (Gaelic) | gle | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, Windows-1252, MacRoman, CP 850, ASCII
Italian | ita | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-16, Windows-1252, MacRoman, CP 850, ASCII
Latvian | lav | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-4, Windows-1257, MacCentralEuropean, CP 775, ASCII
Lithuanian | lit | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-4, Windows-1257, MacCentralEuropean, CP 775, ASCII
Maltese | mlt | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-3
Mandarin (Chinese) | cmn | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, Big5, GB2312
Polish | pol | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, ISO-8859-16, Windows-1250, MacCentralEuropean, CP 852
Polish (Common transliteration) | pol | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Portuguese | por | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
Romanian | ron | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, Windows-1250, MacRomanian, CP 852
Romanian (Common transliteration) | ron | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Russian | rus | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-5, Windows-1251, MacCyrillic, CP 855, CP 866, KOI8-R
Russian (DIN 1460 transliteration) | rus | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8
Russian (ISO 9 transliteration) | rus | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8
Slovak | slk | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, Windows-1250, MacCentralEuropean, CP 852
Slovak (Common transliteration) | slk | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Slovenian | slv | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-2, ISO-8859-16, Windows-1250, MacCentralEuropean, CP 852, ASCII
Slovenian (Common transliteration) | slv | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ASCII
Spanish | spa | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, ISO-8859-15, Windows-1252, MacRoman, CP 850, ASCII
Swedish | swe | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, ISO-8859-1, Windows-1252, MacRoman, CP 850, ASCII
Ukrainian | ukr | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8, Windows-1251, MacUkrainian, KOI8-U
Ukrainian (DIN 1460 transliteration) | ukr | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8
Ukrainian (ISO 9 transliteration) | ukr | UTF-32BE, UTF-32LE, UTF-16BE, UTF-16LE, UTF-8
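
The table pairs each language with a list of character encodings because the same text produces different byte sequences in each encoding. As a rough illustration, the Python sketch below encodes one Russian word in every encoding listed for Russian above. It does not use lid or Lingua::Lid; the mapping from the table's encoding names to Python codec names, and the helper name encode_all, are assumptions of this sketch.

```python
# Encoding names as listed for Russian in the table, mapped to the codec
# names used by Python's standard library. This mapping is an assumption
# of the sketch and is not part of lid or Lingua::Lid.
RUSSIAN_ENCODINGS = {
    "UTF-32BE": "utf_32_be",
    "UTF-32LE": "utf_32_le",
    "UTF-16BE": "utf_16_be",
    "UTF-16LE": "utf_16_le",
    "UTF-8": "utf_8",
    "ISO-8859-5": "iso8859_5",
    "Windows-1251": "cp1251",
    "MacCyrillic": "mac_cyrillic",
    "CP 855": "cp855",
    "CP 866": "cp866",
    "KOI8-R": "koi8_r",
}


def encode_all(text, encodings=RUSSIAN_ENCODINGS):
    """Return {table name: bytes} with text encoded by every listed codec."""
    return {name: text.encode(codec) for name, codec in encodings.items()}


if __name__ == "__main__":
    # Print the byte sequence of one word in each encoding, as hex.
    for name, data in encode_all("Привет").items():
        print(f"{name:13} {data.hex()}")
```

Running the script prints one hex string per encoding; the single-byte Cyrillic encodings all use six bytes for the word but disagree on the byte values, which is why a detector has to consider the encoding together with the language.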