Lingua-Systems' Software Products

Text Classifier

Textweiser text classifier

The text classifier Textweiser automatically assigns text documents to user-defined categories.

Categorizing texts eases the automatic processing of large amounts of data and helps to keep information organized. Most information is contained in unstructured data; text classification structures textual data and thereby supports knowledge management.

Read more about Textweiser...

Language Identifier / Encoding Identifier

lid language identifier

Lingua-Systems' language identifiers provide two important pieces of information: the language a text is written in and its character encoding.

If you process text in any way, this information lets you adjust every subsequent step: you can process the data in a language-specific way, take the particularities of each language into account, proceed systematically and thereby improve the quality of your applications.
Identify the language and character encoding and use this information to expand the possibilities for yourself and your customers. The areas of application for a language identifier are numerous!
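
To illustrate the general idea of language identification, here is a minimal sketch that counts stopword matches. It is not how Lingua-Systems' identifiers work; the function name `guess_language` and the tiny word lists are assumptions made purely for illustration.

```python
# Tiny illustrative stopword lists (assumption); real identifiers
# rely on far richer statistical models.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "de": {"der", "die", "und", "ist", "das", "nicht"},
    "fr": {"le", "la", "et", "est", "les", "des"},
}


def guess_language(text):
    """Return the language code with the most stopword hits, or None."""
    words = text.lower().split()
    # Count how many words of the text appear in each stopword list.
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A later processing step could then branch on the result, for example choosing a German tokenizer when `guess_language(text)` returns `"de"`.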

A more detailed introduction to all language identifiers is available here.

Automatic Unicode Converter

AutoUniConv - Automatic Unicode Converter

The Unicode converter converts texts from various character sets to Unicode automatically, because it identifies the input's charset itself.

Different, incorrect or unspecified charsets can complicate text processing. Misinterpreted characters (for example "Ã¶" instead of the German umlaut "ö") are not only distracting for humans; they can even cause data processing to fail.
It is therefore useful to convert data from different charsets to a single one to obtain a uniform basis. Unicode is the most suitable choice for unifying different charsets and languages.
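
The following sketch shows how a misinterpreted umlaut arises and how bytes in an unknown charset can be brought to a single Unicode representation. It is a naive trial-and-error approach, not AutoUniConv's detection method; the helper name `to_unicode` and the candidate list are assumptions.

```python
def mojibake_demo():
    """Decode the UTF-8 bytes of "ö" with the wrong charset."""
    return "ö".encode("utf-8").decode("latin-1")


def to_unicode(raw, candidates=("utf-8", "cp1252", "latin-1")):
    """Try candidate charsets in order (hypothetical helper).

    Returns the decoded text and the charset that succeeded.
    Strict UTF-8 goes first because it rejects most non-UTF-8 input.
    """
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate charset could decode the input")


if __name__ == "__main__":
    print(mojibake_demo())
    text, charset = to_unicode("Größe".encode("cp1252"))
    # Re-encode to UTF-8 so all data shares one encoding.
    print(charset, text.encode("utf-8"))
```

Trying charsets in a fixed order only works when the candidates are few and distinguishable; a real converter needs a proper charset identifier.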

Find out more about it here.

Transliteration

Transliterate Greek to Latin

Transliteration is the mapping of letters from one alphabet or writing system to another, for example from Greek to Latin letters. In everyday usage the term covers a lot more: any conversion into a different form may be called transliteration, for example omitting the special characters found in many languages.

Our software converts text from one writing system to another automatically. Many transliteration tables that follow national and international standards are already included. Additional tools let you write your own rulesets in an XML format and integrate them easily.
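
As a rough illustration of table-based transliteration, the sketch below maps lowercase Greek letters to Latin ones after stripping diacritics. The mapping is a simplified, illustrative table that follows no particular standard, and the names `strip_diacritics` and `transliterate` are assumptions; it does not show the product's API or its XML ruleset format.

```python
import unicodedata

# Simplified Greek-to-Latin table (assumption, not a standard).
GREEK_TO_LATIN = str.maketrans({
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "e", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "ph", "χ": "ch", "ψ": "ps",
    "ω": "o",
})


def strip_diacritics(text):
    """Decompose characters and drop combining marks (accents etc.)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(text):
    """Strip diacritics, then apply the table; other characters pass through."""
    return strip_diacritics(text).translate(GREEK_TO_LATIN)
```

With this table, `transliterate("λόγος")` yields `"logos"`. Uppercase letters and context-dependent rules are left out to keep the sketch short.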

Lingua::Translit (Open Source)