Data Capture | ABBYY FlexiCapture

ABBYY FlexiCapture is highly accurate and scalable document imaging and data extraction software that automatically transforms documents of any structure, language or content into usable and accessible business-ready data.

Intelligent self-learning classification and state-of-the-art recognition technologies enable FlexiCapture to replace error-prone manual processes with automatic document classification and processing.

Flexible and customizable, FlexiCapture can handle virtually all document processing scenarios and can be tailored to any company’s workflows and regulations.

Why ABBYY FlexiCapture?

Software for Document-driven Business Processes

One system for processing all kinds of paper documents in any industry

Intelligent Auto-learning Technology Makes Setup Easy

Interactive training technology simplifies system implementation and setup.

Mobile Document Capture

FlexiCapture’s mobile capture client provides an alternative entry point for documents, usable at any time and from anywhere.

Take the data. Leave the paper.

Product Highlights

Auto Document Classification

  • Automatically separate and classify documents regardless of how they were imported into the system
  • No limitation on classification rules
  • Advanced scalability for high-volume data and document capture across enterprises

Accurate Data Extraction with Table Extraction Capability

  • Automatic identification of columns and rows, with data extraction
  • Simple setup through point-and-click and auto-regex
  • Ability to span pages and extract multi-page table or invoice data
  • Ability to extract line data that is not in a table format
  • Tolerance for movement of the table on the page due to differing DPI and/or page registration issues on scanning
  • Ability to fine-tune extraction methods using custom regex for value pattern matching
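
To illustrate what value pattern matching with a custom regex might look like, here is a minimal Python sketch. It does not use FlexiCapture's own syntax or API; the pattern, the field names and the sample text are assumptions made up for this example.

```python
import re

# Hypothetical pattern for one invoice line: quantity, description,
# unit price, line total. The layout is an assumption for illustration.
LINE_PATTERN = re.compile(
    r"^\s*(?P<qty>\d+)\s+(?P<description>.+?)\s+"
    r"(?P<unit_price>\d+\.\d{2})\s+(?P<total>\d+\.\d{2})\s*$"
)


def extract_line_items(text):
    """Return one dict of named groups for every line that matches LINE_PATTERN."""
    items = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            items.append(match.groupdict())
    return items


sample = """Invoice 1001
2  Paper A4 500 sheets   4.50   9.00
1  Toner cartridge      59.90  59.90
Thank you for your business"""

line_items = extract_line_items(sample)
```

Because the pattern anchors on the shape of the values rather than on fixed column positions, lines that are not laid out as a ruled table can still be picked up, while header and footer lines are ignored.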

Workflow Auto-Processing

  • Automatic processing of documents, including import, document classification, recognition, data extraction, and export
  • Flexible workflow that can be easily adjusted to specific business processes
  • Support for double verification by two independent operators

Easy Integration with Existing Business Applications

Import Options

FlexiCapture provides import from:

  • Scanning device (TWAIN, ISIS, WIA)
  • Watched folder (local or LAN)
  • FTP server
  • E-mail attachments from MS Exchange or POP3 mail servers


Export Options

FlexiCapture provides export to:

  • Files
  • SharePoint 2003/2007/2010/2013
  • ODBC-compatible databases
  • Any ERP system and invoice approval workflow
  • Any external application by using custom script modules

Supported file formats on export:

Data Output Formats: .XML, .TXT, .XLS, .DBF, .CSV.
Image Output Formats: PDF (Image only, text under image), PDF/A (Image only, text under image), TIFF, JPEG, JPEG2000, PCX, BMP, PNG, DCX.

A Single Solution for All Document Types

Speed up business processes by using automated data entry software and eliminating time- and resource-consuming manual data entry. The intelligent capture algorithms enable the system to process any kind of document: invoices, contracts, registration forms and more.


Capture data from any documents, from structured forms to unstructured text-heavy papers.

How It Works

ABBYY FlexiCapture supports a wide range of input channels and ensures easy processing regardless of the document type and source. It accommodates scanning as well as image loads from watched folders, FTP servers, or e-mail.

1. Flexible Import Options

  • Scanning device (TWAIN, ISIS, WIA)
  • Watched folder (local or LAN)
  • FTP server
  • E-mail attachments from MS Exchange or POP3 mail servers

2. Supported File Formats on Import


3. Scanning Station

FlexiCapture Scanning Station enables easy scanning with any TWAIN-, ISIS-, or WIA-enabled device. It is available in thick and thin client versions.


4. Scanning Profiles

Scanning Station features scanning profiles, which enable pre-defined settings for applications to be applied to specific batches. When scanning a new set of documents, the user needs only to choose the right profile from a drop-down menu.


5. Image Improvement

Pre-loaded or scanned images can be improved before processing using features that include rotation, deskewing, hiding of sensitive data, and more.


The recognition stage includes assembly of documents, classification, text and data extraction, and automatic validation. These stages are executed simultaneously in unattended mode.

1. Automatic Assembly of Multi-page Documents from a Mix of Pages

This may rely on separators (blank pages inserted between two documents), page counters, or advanced ABBYY classification algorithms that automatically detect pages belonging to different documents.
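
The separator-based approach can be sketched in a few lines of Python. This is only an illustration of the general idea, not FlexiCapture code: pages are represented by their recognized text, and the function names and the blank-page test are assumptions for this example.

```python
def assemble_documents(pages, is_separator):
    """Group a flat sequence of pages into documents, splitting on separator pages.

    Separator pages themselves are dropped from the output.
    """
    documents = []
    current = []
    for page in pages:
        if is_separator(page):
            if current:  # close the document collected so far
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:  # last document has no trailing separator
        documents.append(current)
    return documents


def is_blank(page_text):
    """Hypothetical separator test: a page with no recognized text."""
    return not page_text.strip()


scanned = ["Invoice p1", "Invoice p2", "", "Contract p1", "   ", "Form p1"]
documents = assemble_documents(scanned, is_blank)
```

Passing the separator test in as a function keeps the grouping logic the same if the separator is a barcode sheet or a page-counter check instead of a blank page.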

2. Automated Image-based Classification

  • Content-based classification
  • Rule-based classification
  • Any combination of the above
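
As a rough illustration of the rule-based option, the Python sketch below assigns a document type from keywords found in the recognized text. The rules, labels and function name are assumptions invented for this example; they do not reflect how FlexiCapture defines or evaluates its rules, and content-based classification in practice relies on trained models rather than keyword lists.

```python
# Hypothetical rules: the first label whose keyword list matches wins.
RULES = [
    ("invoice", ("invoice", "amount due")),
    ("contract", ("agreement", "hereinafter")),
]


def classify(text, rules=RULES, default="unknown"):
    """Return the label of the first rule with a keyword present in text."""
    lowered = text.lower()
    for label, keywords in rules:
        if any(keyword in lowered for keyword in keywords):
            return label
    return default


label_a = classify("INVOICE No. 42, amount due: 68.90")
label_b = classify("This Agreement is made between the parties")
label_c = classify("Meeting notes from Tuesday")
```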

3. Highly Accurate OCR/ICR/OMR and Barcode Recognition

  • Optical character recognition of printed text in up to 190 languages
  • Intelligent character recognition for hand-printed text in over 110 languages
  • Barcode recognition for a variety of 1D and 2D barcodes
  • Optical mark recognition for a wide range of checkmarks


4. Automatic Validation

  • Comparison against databases
  • Conformity with built-in validation rules
  • Compliance with format
  • Data normalization
  • Application of other user-defined checks
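
The kinds of checks listed above can be pictured with a short Python sketch. It is a generic illustration under stated assumptions, not FlexiCapture's validation engine: the field names, the invoice number format, and the in-memory set standing in for a database are all hypothetical.

```python
import re
from datetime import datetime

# Stand-in for a database of known vendors (assumption for this example).
KNOWN_VENDOR_IDS = {"V-1001", "V-2002"}


def normalize_amount(raw):
    """Normalize strings such as '1,234.50' to a float with two decimals."""
    return round(float(raw.replace(",", "").strip()), 2)


def validate_invoice(fields):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    # Comparison against reference data
    if fields.get("vendor_id") not in KNOWN_VENDOR_IDS:
        errors.append("vendor_id not found in reference data")
    # Compliance with format
    if not re.fullmatch(r"INV-\d{6}", fields.get("invoice_number", "")):
        errors.append("invoice_number does not match INV-NNNNNN")
    try:
        datetime.strptime(fields.get("date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("date is not in YYYY-MM-DD format")
    # Normalization plus a user-defined cross-field check
    try:
        total = normalize_amount(fields.get("total", ""))
        lines = sum(normalize_amount(a) for a in fields.get("line_totals", []))
        if abs(total - lines) > 0.005:
            errors.append("total does not equal the sum of line totals")
    except ValueError:
        errors.append("an amount is not a valid number")
    return errors
```

A document that returns an empty list could pass straight to export, while any reported error would route it to an operator for review.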

5. Support for Many Recognition Languages

  • 43 main languages with dictionary support: Arabic (Saudi Arabia), Armenian (Eastern), Armenian (Grabar), Armenian (Western), Azeri (Latin), Bashkir, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, Dutch (Belgian), English, Estonian, Finnish, French, German, German (new spelling), Greek, Hebrew, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Polish, Portuguese, Portuguese (Brazilian), Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tatar, Thai, Turkish, Ukrainian, Vietnamese;
  • 133 additional languages without dictionary support: Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altai, Avar, Aymara, Azerbaijani (Cyrillic), Basque, Belarusian, Bemba, Blackfoot, Breton, Bugotu, Buryat, Cebuano, Chamorro, Chechen, Chukchee, Chuvash, Corsican, Crimean Tatar, Crow, Dargwa, Dungan, Eskimo (Cyrillic), Eskimo (Latin), Even, Evenki, Faroese, Fijian, Frisian, Friulian, Gagauz, Galician, Ganda, German (Luxembourg), Guarani, Hani, Hausa, Hawaiian, Icelandic, Indonesian, Ingush, Irish, Jingpo, Kabardian, Kalmyk, Karachay-balkar, Karakalpak, Kasub, Kawa, Kazakh, Khakass, Khanty, Kikuyu, Kirghiz, Kongo, Koryak, Kpelle, Kumyk, Kurdish, Lak, Latin, Lezgi, Luba, Macedonian, Malagasy, Malay (Malaysian), Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minangkabau, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nenets, Nivkh, Nogay, Nyanja, Ojibway, Ossetian, Papiamento, Provencal, Quechua, Rhaeto-Romanic, Romany, Rundi, Russian (Old Spelling), Rwanda, Sami (Lappish), Samoan, Scottish Gaelic, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux (Dakota), Somali, Sorbian, Sotho, Sunda, Swahili, Swazi, Tabasaran, Tagalog, Tahitian, Tajik, Tok Pisin, Tongan, Tswana, Tun, Turkmen, Tuvinian, Udmurt, Uigur (Cyrillic), Uigur (Latin), Uzbek (Cyrillic), Uzbek (Latin), Welsh, Wolof, Xhosa, Yakut, Yiddish, Zapotec, and Zulu;
  • 5 East Asian languages: Chinese (Traditional, Simplified), Japanese, Korean and Hangul (Korean);
  • 6 languages for recognition of old European documents and Gothic fonts in books printed in the 18th-20th centuries:
    • English,
    • French,
    • German,
    • Italian,
    • Spanish,
    • Latvian;
  • 4 artificial languages: Esperanto, Ido, Interlingua, and Occidental;
  • Digits
  • 1D Barcodes
    • Code 39, Check Code 39, Interleaved 25, Check Interleaved 25, EAN 13, EAN 8, Code 128, Codabar, Code 93, IATA 25, UCC-128, UPC-A, UPC-E, Matrix 2 of 5, Industrial 2 of 5, PostNet, Patch code (1, 2, 3, 4, T/Transfer, 6)
  • 2D Barcodes
    • PDF 417, Aztec, Datamatrix, QR code
  • Multiple Text Types
    • Typographic, Handprinted, Typewriter, Matrix printer, Index, OCR-A, OCR-B, MICR (E13B), MICR (CMC7)

For manual verification of recognition results, ABBYY FlexiCapture offers several verification modes that enable speedy verification and greater convenience.

1. Group Verification

Group verification for checkmarks and digits is applied across documents in form recognition projects. Identical figures (signs) from an entire document batch are displayed together.


2. Field Verification

Field verification checks data fields one by one.


3. Verification in Document Window

Recognition results of all required data fields are viewed simultaneously and compared with the original image. Information that is not successfully recognized, such as handwritten text or notes, can be typed manually into the fields.


Verification is available in thick and thin client versions. Verification is an optional stage and can be skipped.

1. Multi-Export Destinations

ABBYY FlexiCapture enables multiple destinations for data and images as well as generation of searchable PDFs.

2. Flexible Export Options

FlexiCapture provides export to:

  • Files
  • SharePoint 2003/2007/2010/2013
  • ODBC-compatible databases
  • Any ERP system and invoice approval workflow
  • Any external application by using custom script modules

FlexiCapture supports export to the following file formats:
Data Output Formats: .XML, .TXT, .XLS, .DBF, .CSV.
Image Output Formats: PDF (Image only, text under image), PDF/A (Image only, text under image), TIFF, JPEG, JPEG2000, PCX, BMP, PNG, DCX.
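
To show what writing the same extracted data to two of the data output formats could look like, here is a minimal Python sketch using only the standard library. It is not FlexiCapture's export module; the record layout, tag names and function names are assumptions for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, fieldnames):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records, root_tag="documents", item_tag="document"):
    """Serialize a list of dicts as XML text, one child element per field."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for name, value in record.items():
            ET.SubElement(item, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical extracted fields from one document
records = [{"invoice_number": "INV-000123", "total": "68.90"}]
csv_text = to_csv(records, ["invoice_number", "total"])
xml_text = to_xml(records)
```

Keeping the records as plain dictionaries means one extraction result can feed several export destinations without being re-read.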

1. Web-based Administration and Monitoring Console

FlexiCapture includes a web-based Administration and Monitoring Console that enables 24/7 supervision from any location. An administrator can easily manage user rights, check event logs, view standard reports or generate custom performance reports.


2. E-mail Alerts

Administrators can opt to receive e-mail alerts for important events like errors, license expiration and page count limits. Administrators can also be notified about imminent database overflow, running out of disk space, requests for access rights, or failed attempts to log in.


ABBYY document imaging and recognition software can be tailored to companies’ workflows and processing scenarios. Scripts that run between processing stages enable FlexiCapture to modify document processing to suit virtually any need. Scripting makes it possible to extend the default workflow by enabling:

  • Custom processing stages
  • Connection to additional OCR/ICR engines
  • Use of third-party image enhancement tools
  • Use of custom verification clients
  • Connection to signature comparison and other external modules

Plus, the Web Service API ensures easy integration of FlexiCapture into many business applications and workflows as an automatic document classification and data capture service.