Lingua-Systems' Software Products
Text Classifier
The text classifier Textweiser automatically
assigns text documents to user-defined categories.
Categorizing texts eases the automatic processing of large amounts of
data and helps to maintain information.
Most information is only available as unstructured data.
Text classification gives textual data a structure and thereby supports
knowledge management.
Read more about Textweiser...
Language Identifier / Encoding Identifier
Lingua-Systems' language identifiers
provide two important pieces of information: the language and character
encoding a text is written in.
If you process text in any way, this information allows you
to adjust every subsequent step: you can process the data
language-specifically, take the particularities of each language into
account, proceed systematically and thereby improve the quality of your applications.
Identify language and character encoding and use this information to
expand the possibilities, for yourself and your customers.
The areas of application for a language identifier are numerous!
A more detailed introduction to all available language identifiers can be found here.
Automatic Unicode Converter
The Unicode converter enables you
to convert texts from various character sets to Unicode
automatically, because the converter identifies the input's charset itself.
Differing, wrong or unspecified charsets may complicate the processing
of text. Misinterpreted characters (for example "Ã¶" instead of the
German umlaut "ö") are not only distracting for humans - they may even
cause data processing to fail.
Therefore it is useful to convert data from different charsets to a
single one to get a uniform basis. Unicode is the most suitable
choice for unifying different charsets and languages.
Find out more about it here.
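The effect described above can be reproduced in a few lines. The following Python sketch is purely illustrative and is not the interface of the Lingua-Systems converter: the function names demonstrate_mojibake and to_unicode and the list of candidate charsets are assumptions made for this example. It first shows how "ö" turns into "Ã¶" when UTF-8 bytes are read as Latin-1, and then converts bytes to Unicode by trying a fixed list of charsets in order, which is far simpler than real charset identification.

```python
from typing import Sequence, Tuple


def demonstrate_mojibake(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def to_unicode(
    raw: bytes,
    candidates: Sequence[str] = ("utf-8", "cp1252", "latin-1"),
) -> Tuple[str, str]:
    """Return (decoded text, charset used).

    Assumption for this sketch: the input is in one of the candidate
    charsets. The candidates are tried in order and the first one that
    decodes without error wins. Latin-1 accepts every byte sequence,
    so it only makes sense as the last fallback.
    """
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate charsets could decode the input")


if __name__ == "__main__":
    print(demonstrate_mojibake("ö"))
    print(to_unicode("schön".encode("utf-8")))
    print(to_unicode("schön".encode("latin-1")))
```

Trying strict UTF-8 first works reasonably well because text in legacy single-byte charsets rarely forms valid UTF-8 sequences by accident. Telling apart two single-byte charsets, however, requires looking at the content itself, which this sketch does not attempt.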
Transliteration
Transliteration is the mapping of letters from one alphabet or writing system to another, such as Greek to Latin letters. In everyday usage the term covers a lot more: any conversion of text into a different form may be called transliteration, for example omitting the special characters found in many languages.
Our software allows you to convert text from one writing system to another automatically. Many transliteration tables following national and international standards are already included. Additional tools allow you to write your own rulesets in an XML format and integrate them easily.
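Both senses of transliteration mentioned above can be illustrated with a minimal Python sketch. It does not show the Lingua-Systems software or Lingua::Translit: the table GREEK_TO_LATIN is a hypothetical, heavily shortened mapping that follows no particular standard, and the function names are assumptions made for this example.

```python
import unicodedata

# Hypothetical, incomplete Greek-to-Latin table for illustration only.
# It does not implement any national or international standard.
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e",
    "ι": "i", "κ": "k", "λ": "l", "μ": "m", "ν": "n",
    "ο": "o", "π": "p", "ρ": "r", "σ": "s", "ς": "s",
    "τ": "t",
}


def transliterate(text: str, table: dict) -> str:
    """Replace each character found in the table; keep all others as-is."""
    return "".join(table.get(ch, ch) for ch in text)


def strip_diacritics(text: str) -> str:
    """Drop accents by decomposing characters and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(transliterate("λογος", GREEK_TO_LATIN))
    print(strip_diacritics("café"))
    # Combining both steps handles accented Greek input as well.
    print(transliterate(strip_diacritics("λόγος"), GREEK_TO_LATIN))
```

A plain character-by-character table like this cannot express context-dependent rules, for example letter combinations that are rendered differently than their parts. Standardized transliteration tables usually need such rules, which is one reason for describing them in a dedicated ruleset format.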
Lingua::Translit (Open Source)
