- Extract any number of required data fields from PDF or scanned PDF documents into an Excel file.
- Supports any kind of structured document.
- Data extraction is driven by a generic XML configuration.
- Get values through key and value search criteria.
- If no key is available for a value, you can retrieve it by row number.
- Check whether a given 'Key' exists anywhere in the document.
- The current bot supports structured PDF documents, scanned documents, and scanned image documents.
- OCR is provided by Tesseract.
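The three lookup modes listed above (key search, row-number fallback, and key existence check) can be illustrated with a minimal Python sketch. It assumes the document has already been converted to plain text; the function names and the sample text are hypothetical and are not taken from the bot itself.

```python
from typing import Optional


def find_value(text: str, key: str) -> Optional[str]:
    """Return the text that follows `key` on the first line containing it."""
    for line in text.splitlines():
        idx = line.find(key)
        if idx != -1:
            # Drop the key itself, then any colon/whitespace separator.
            return line[idx + len(key):].strip(" :\t")
    return None


def value_at_row(text: str, row: int) -> Optional[str]:
    """Return the 1-based `row` of the text, or None if it is out of range."""
    lines = text.splitlines()
    return lines[row - 1].strip() if 1 <= row <= len(lines) else None


def key_exists(text: str, key: str) -> bool:
    """Report whether `key` appears anywhere in the document text."""
    return key in text


# Hypothetical text, as it might look after OCR of an account statement.
sample = "ABC Bank\nAccount No: 12345\nName: Jane Doe\n"
account = find_value(sample, "Account No")   # key-based lookup
bank = value_at_row(sample, 1)               # row-based fallback
has_ifsc = key_exists(sample, "IFSC")        # existence check
```

Row-based lookup is more brittle than key-based lookup, since it depends on the document layout staying fixed, so it is best kept as a fallback.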
The majority of the documents held by the bank are structured and can be digitised easily with a good-quality OCR engine and PDF integration technologies. The solution combines two parts: OCR and PDF integration, which convert the scanned documents to a digital format, and a configuration module, which extracts the required information. An XML configuration file identifies the digitised information to extract. The file is easy to configure and can be tuned to the information required.
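As an illustration of what such a configuration might look like, the sketch below parses a small XML file into a list of extraction rules. The bot's actual schema is not shown here, so the element and attribute names (`extraction`, `field`, `name`, `key`, `row`) are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: one <field> per value to extract.
# A field is located either by a key or by a row number.
CONFIG_XML = """
<extraction>
  <field name="AccountNumber" key="Account No" />
  <field name="BankName" row="1" />
</extraction>
"""


def load_rules(xml_text: str) -> list:
    """Parse the configuration into a list of rule dictionaries."""
    root = ET.fromstring(xml_text)
    rules = []
    for field in root.findall("field"):
        row = field.get("row")
        rules.append({
            "name": field.get("name"),          # column name in the output
            "key": field.get("key"),            # None when located by row
            "row": int(row) if row else None,   # None when located by key
        })
    return rules


rules = load_rules(CONFIG_XML)
```

Keeping the rules in a separate file means a new document type can be supported by editing the configuration rather than the bot logic.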
Key Use Case:
1. Get the list of structured PDF documents.
2. Convert them into text files using OCR/PDF integration.
3. Configure the XML file with the data fields to be extracted.
4. Run the bot to write the extracted data to an Excel file.
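The last steps of this flow can be sketched in Python under some loud assumptions: the text files from step 2 already exist in a folder, the fields are given as a plain list of keys rather than read from the XML file, and the output is written as CSV (which Excel can open) instead of a native Excel workbook. The names `extract_fields` and `run` are hypothetical.

```python
import csv
from pathlib import Path


def extract_fields(text: str, keys: list) -> dict:
    """Map each key to the text following it on the first line containing it."""
    row = {}
    for key in keys:
        row[key] = ""  # left empty when the key is not found
        for line in text.splitlines():
            if key in line:
                row[key] = line.split(key, 1)[1].strip(" :\t")
                break
    return row


def run(txt_dir, keys: list, out_csv) -> int:
    """Write one CSV row per .txt file in txt_dir; return the row count."""
    files = sorted(Path(txt_dir).glob("*.txt"))
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["file", *keys])
        writer.writeheader()
        for path in files:
            fields = extract_fields(path.read_text(encoding="utf-8"), keys)
            writer.writerow({"file": path.name, **fields})
    return len(files)
```

A call such as `run("converted_txt", ["Account No", "Name"], "output.csv")` would produce one row per converted document, with one column per key; the folder and file names here are placeholders.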
See the Bot in Action
Download the Bot and follow the instructions to install it in your AAE Control Room.
Open the Bot to configure your username and the other settings the Bot needs (see the Installation Guide or ReadMe for details).
That's it - now the Bot is ready to get going!
Requirements and Inputs
- AAE 10.7 SP2 or higher, Microsoft Excel
Inputs: PDF documents, scanned structured PDF documents, or image PDF documents such as account statements, loan disbursement letters, and ID cards.