Software Tests and Benchmarks
We aim to develop the best possible solutions for the challenging task of natural language processing. We therefore focus on meticulous software testing and benchmarking: our software has to meet our high demands on robustness, security, performance and accuracy. Only after all tests have confirmed the product's quality do we release the software to the market, so that we can offer our customers solid, high-quality products that are easy to integrate.
We test every release candidate on thousands of documents that we consider representative of the respective languages. We also include documents that contain common attack patterns. This way we can ensure that our software products are of high quality and meet our security standards.
Below we present our series of benchmarks for the current version of lid, to give you an impression of how we work and of the results we observed.
The following presentation is based on 3425 language documents. The results are evaluated regularly and additional tests are constantly added during development and release cycles.
Although we choose the documents carefully and assume them to be representative, this presentation should not be taken to mean that these results are guaranteed for every set of documents. The results below are specific to our quality management; other sets of documents may produce different results.
Accuracy
The diagram to the right sums up the results of our accuracy tests for the current version of lid.
On average lid identified the languages with an accuracy of 99.55% and determined the character encoding with 99.48% accuracy.
The results range from 96.32% to 100% for languages and from 97.87% to 100% for character encodings. We continuously improve these results through ongoing linguistic analysis and knowledge engineering.
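As an illustration of how figures of this kind can be derived, the following minimal Python sketch computes an accuracy value per language and then the lowest, highest and mean value. It is not our test harness: the function names and the toy labels are hypothetical, and the sketch assumes the average is an unweighted mean over languages, which may differ from how the published average is computed.

```python
from collections import defaultdict


def per_language_accuracy(expected, predicted):
    """Return {language: share of its documents that were identified correctly}."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    totals = defaultdict(int)
    correct = defaultdict(int)
    for want, got in zip(expected, predicted):
        totals[want] += 1
        if want == got:
            correct[want] += 1
    return {lang: correct[lang] / totals[lang] for lang in totals}


def summarize(accuracies):
    """Return the lowest, highest and unweighted mean accuracy."""
    values = list(accuracies.values())
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Made-up toy labels for illustration only, NOT taken from the lid benchmark.
expected = ["en", "en", "de", "de", "de", "fr"]
predicted = ["en", "en", "de", "nl", "de", "fr"]

accuracies = per_language_accuracy(expected, predicted)
summary = summarize(accuracies)

for lang, acc in sorted(accuracies.items()):
    print(f"{lang}: {acc:.2%}")
print(f"range {summary['min']:.2%} to {summary['max']:.2%}, mean {summary['mean']:.2%}")
```

The same calculation applies to character encodings by passing encoding labels instead of language labels.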
Performance
We also tested lid's performance. Running the tests on an up-to-date machine (Core2Duo, 3.33 GHz, 4 GB RAM, Linux 2.6), lid processed 357.38 documents per second using a single thread and 693.25 documents per second using 4 threads. That is about 21442.80 and 41595.00 documents per minute, respectively!
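The per-minute figures follow directly from the per-second measurements. The short Python sketch below redoes that arithmetic and also derives the speedup of the 4-thread run over the single-thread run; the constant and function names are illustrative and not part of lid.

```python
# Measured throughput from the benchmark above, in documents per second.
SINGLE_THREAD_DOCS_PER_SEC = 357.38
FOUR_THREAD_DOCS_PER_SEC = 693.25


def docs_per_minute(docs_per_second):
    """Convert a documents-per-second rate to documents per minute."""
    return docs_per_second * 60


def speedup(multi_thread_rate, single_thread_rate):
    """Return how many times faster the multi-threaded run was."""
    return multi_thread_rate / single_thread_rate


single_per_min = docs_per_minute(SINGLE_THREAD_DOCS_PER_SEC)
four_per_min = docs_per_minute(FOUR_THREAD_DOCS_PER_SEC)
gain = speedup(FOUR_THREAD_DOCS_PER_SEC, SINGLE_THREAD_DOCS_PER_SEC)

print(f"1 thread:  {single_per_min:.2f} documents per minute")
print(f"4 threads: {four_per_min:.2f} documents per minute")
print(f"speedup:   {gain:.2f}x")
```

The speedup comes out at roughly 1.94, which is plausible for four threads on a dual-core processor such as the Core2Duo, where at most two threads can run at the same time.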



