Monday, November 9, 2009

OpenOCR Freeware Utility from Cognitive Technologies

In our modern world, simply retyping documents into a computer is inefficient: it takes valuable time that you could spend on more important tasks. Besides, when a paper document includes images and other decorative elements, you cannot easily copy them into a computer file without scanning. OpenOCR (CuneiForm) offers an advanced, automatic way to convert a paper document, including its images, tables, columns, paragraphs, indents, font styles and sizes, into an identical soft copy.

OpenOCR was developed by Cognitive Technologies, a well-known Russian software development company, and offers serious competition to the commercial product ABBYY FineReader. It combines the broad experience of Russian scientists with advanced achievements in optical recognition, such as cognitive analysis algorithms, adaptive character recognition, meridian segmentation of tables, neural networks, etc. The software was released to users as freeware in December 2007. In September 2008, part of CuneiForm was released as open source software.

CuneiForm is an omnifont system. Its algorithms are based on the rules by which letters are written and on their topology, and they do not require pattern definitions or training. CuneiForm recognizes any printed font (scanned books, newspapers, magazines, output from laser and dot-matrix printers, text from typewriters, etc.). It does not recognize handwritten or pseudo-handwritten text, nor does it recognize decorative fonts (e.g. Gothic). There are special settings in CuneiForm for recognizing text from dot-matrix printers and from 200x100 DPI faxes. CuneiForm can preserve text formatting and recognizes complicated tables of any structure.

Other Features

1. Support for 20 languages: English, German, French, Spanish, Italian, Portuguese, Dutch, Russian, Mixed Russian-English, Ukrainian, Danish, Swedish, Finnish, Serbian, Croatian, Polish and others. Every language comes with a dictionary, which enables a context check of recognized characters and improves the recognition results.

2. Using the built-in text editor, you can easily work with images, tables, columns, various fonts, headers and footers if manual intervention is needed. Built-in wizards guide you through all stages of scanning and recognition and help you reach the final result quickly and accurately.

3. Recognition of tables of different structures, even those whose cells are not separated by lines.

4. Improved automatic and semiautomatic detection of text, tables and images, which makes working with documents of complex structure highly flexible. It also has powerful tools for manual fragmentation.

5. Scanning from a remote scanner on a local network. An office may have only one scanner, yet any user in the organization can use it.
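
To give a rough idea of what the dictionary-based context check from feature 1 might look like, here is a minimal Python sketch. It is not CuneiForm's actual algorithm, which is not described here; the tiny word list, the `context_check` function name and the similarity cutoff are all assumptions made for illustration, and the matching uses Python's standard `difflib` module.

```python
import difflib

# Hypothetical mini-dictionary (an assumption for this sketch);
# a real OCR dictionary would hold many thousands of words.
DICTIONARY = ["document", "recognition", "scanner", "table", "text"]


def context_check(word, dictionary=DICTIONARY, cutoff=0.8):
    """Return the closest dictionary word, or the word unchanged
    if it is already known or nothing is similar enough."""
    lowered = word.lower()
    if lowered in dictionary:
        return word
    # get_close_matches ranks candidates by similarity ratio
    # and drops those scoring below the cutoff.
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Typical OCR confusions: "rn" read instead of "m", "0" instead of "o".
recognized = ["docurnent", "rec0gnition", "scanner", "xyz"]
corrected = [context_check(w) for w in recognized]
print(corrected)
```

Words that are far from every dictionary entry are left as they are, so the check never replaces a name or an abbreviation with an unrelated word.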




