Language Identifier / Encoding Identifier
Lingua-Systems' language identifiers provide two important pieces of
information: the language and character encoding a text is written in.
Whenever you process text, this information lets you adapt every subsequent
step: you can handle the data in a language-specific way, take the
peculiarities of each language and encoding into account, and work more
systematically, which improves the quality of your applications.
Identify the language and character encoding of your text and use the result
to open up new possibilities for yourself and your customers.
The areas of application for a language identifier are numerous!
To cover a great number of applications, Lingua-Systems' language identifiers are provided in different variants:
- lid: C/C++ library to identify language and character encoding
- lidc: application to identify language and character encoding
- Lingua::Lid (Open Source): Perl interface to the lid library
A more detailed introduction to all available language identifiers is available here. Not what you were looking for? Feel free to contact us if you are interested in another variant.
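To illustrate how the two pieces of information can steer later processing steps, here is a minimal Python sketch. It does not call lid, lidc or Lingua::Lid; the identification result is supplied by hand, and the names `Identification`, `STOP_WORDS` and `process` are hypothetical and exist only for this example.

```python
from typing import List, NamedTuple


class Identification(NamedTuple):
    """Hypothetical result record: a language code plus an encoding name."""
    language: str
    encoding: str


# Hypothetical per-language settings; here just a tiny stop-word list.
STOP_WORDS = {
    "en": {"the", "and", "of"},
    "de": {"der", "und", "die"},
}


def process(raw: bytes, ident: Identification) -> List[str]:
    """Decode with the identified encoding, then drop language-specific stop words."""
    text = raw.decode(ident.encoding)
    stop = STOP_WORDS.get(ident.language, set())
    return [word for word in text.lower().split() if word not in stop]


# The bytes are ISO-8859-1; decoding them as UTF-8 would fail on the umlaut.
raw = "Der B\u00e4r und die Katze".encode("iso-8859-1")
print(process(raw, Identification(language="de", encoding="iso-8859-1")))
```

Knowing the encoding makes the decoding step reliable, and knowing the language selects the matching stop-word list; in a real application both values would come from the identifier rather than being hard-coded.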
Transliteration
Transliteration is the mapping of letters from one alphabet or writing system to another, for example from Greek to Latin letters. In everyday usage the term covers considerably more: any conversion of text into a different form may be called transliteration, for example dropping the special characters found in many languages.
Our software allows you to convert text from one writing system to another automatically. It already includes many transliteration tables that follow national and international standards. Additional tools allow you to write your own rule sets in an XML format and integrate them easily.
- Lingua::Translit (Open Source): Perl module that transliterates text between various writing systems
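The two senses of transliteration described above, mapping letters between writing systems and dropping special characters, can be sketched in a few lines of Python. This is an illustration only, not the Lingua::Translit implementation: the table `GREEK_TO_LATIN` is a hypothetical, deliberately incomplete excerpt that does not reproduce any published standard, and the function names are made up for this example.

```python
import unicodedata
from typing import Mapping

# Hypothetical, incomplete Greek-to-Latin table for illustration only.
GREEK_TO_LATIN = {
    "α": "a", "δ": "d", "ε": "e", "λ": "l", "τ": "t",
    "θ": "th",  # one source letter may map to several target letters
}


def transliterate(text: str, table: Mapping[str, str]) -> str:
    """Replace every character found in the table; keep all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(transliterate("δελτα", GREEK_TO_LATIN))
print(strip_diacritics("Caf\u00e9"))
```

A plain character-by-character lookup like this cannot express context-dependent rules, which is one reason a dedicated rule set format is useful for real transliteration tables.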








