
Extract Data From Scanned PDF Document

This bot extracts relevant information from structured PDF documents, scanned documents, and image-based PDF documents. This bot is provided by Automation Anywhere.

Top Benefits

  • The bot improves adaptability, configurability, and simplicity. When a new document format arises, business users can configure the XML file themselves, which drastically reduces the involvement of technical teams and helps the business respond to changes more quickly.

1. Extracts any number of required data fields from scanned PDF or PDF documents into an Excel file.
2. Supports any kind of structured document.
3. Data extraction is driven by a generic XML configuration.
4. Easy for anyone to configure, with no dependence on the technical team.
5. No code changes are required even if the document template changes; only the configuration changes.
6. Values are retrieved through key-and-value search criteria.
7. If no key is available, a value can be retrieved by row/line number, as a single line or multiple lines (e.g., an address).
8. Can check whether a given key exists anywhere in the document and report the status as Yes/No.
9. Supports structured PDF documents, scanned documents, and scanned image documents.
10. OCR support through Tesseract.
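The three lookup styles listed above (key-and-value search, line-number retrieval, and key existence check) can be illustrated with a minimal Python sketch. This is not the bot's implementation, which is not documented in this listing; the function names, the separator handling, and the sample text are all assumptions made for illustration, and the input is taken to be text already produced by OCR.

```python
import re
from typing import Optional


def find_value_by_key(text: str, key: str) -> Optional[str]:
    """Return the text that follows `key` on the same line, or None."""
    # Assumption: the key and value are separated by optional ":" or "-".
    pattern = re.compile(re.escape(key) + r"\s*[:\-]?\s*(.+)", re.IGNORECASE)
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            return match.group(1).strip()
    return None


def get_lines(text: str, first: int, last: Optional[int] = None) -> str:
    """Return 1-based lines first..last (inclusive), joined by a space."""
    lines = text.splitlines()
    last = first if last is None else last
    return " ".join(line.strip() for line in lines[first - 1:last])


def key_exists(text: str, key: str) -> str:
    """Report "Yes" if `key` occurs anywhere in the text, else "No"."""
    return "Yes" if key.lower() in text.lower() else "No"


# Hypothetical OCR output for a bank statement.
sample = (
    "ACME Bank\n"
    "12 High Street\n"
    "Springfield\n"
    "Account Number: 00123456\n"
    "Statement Date - 04/11/2019\n"
)
print(find_value_by_key(sample, "Account Number"))  # key-and-value search
print(get_lines(sample, 2, 3))                      # multi-line value such as an address
print(key_exists(sample, "Overdraft"))              # key existence status
```

Line-number retrieval is brittle when OCR inserts or drops lines, which is presumably why key-based search is listed first and line numbers are offered as a fallback.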

The majority of the documents held by the bank are structured and can be easily digitised using a good-quality OCR engine together with PDF integration technologies. The solution uses OCR and PDF integration technology to convert scanned documents to digital format, and a configuration module to extract the required information. An XML configuration file is used to identify and extract the digitised information. This file is easily configurable and can be tuned to the information to be extracted.
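To make the idea of an XML-driven configuration concrete, here is a small Python sketch that reads field definitions from an XML string. The schema shown (the `document` and `field` elements and the `mode`, `key`, `from`, and `to` attributes) is entirely hypothetical; the bot's real configuration format is described in its ReadMe, not in this listing.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical schema, invented for illustration only.
CONFIG_XML = """
<document name="AccountStatement">
  <field name="AccountNumber" mode="key" key="Account Number"/>
  <field name="Address" mode="lines" from="2" to="3"/>
  <field name="HasOverdraft" mode="exists" key="Overdraft"/>
</document>
"""


@dataclass
class FieldSpec:
    name: str
    mode: str                         # "key", "lines" or "exists"
    key: Optional[str] = None         # search key for "key" and "exists" modes
    first_line: Optional[int] = None  # 1-based start line for "lines" mode
    last_line: Optional[int] = None   # 1-based end line for "lines" mode


def load_field_specs(xml_text: str) -> List[FieldSpec]:
    """Parse the configuration into a list of field specifications."""
    root = ET.fromstring(xml_text.strip())
    specs = []
    for node in root.findall("field"):
        first = node.get("from")
        last = node.get("to") or first  # a single line when "to" is omitted
        specs.append(FieldSpec(
            name=node.get("name"),
            mode=node.get("mode"),
            key=node.get("key"),
            first_line=int(first) if first else None,
            last_line=int(last) if last else None,
        ))
    return specs


for spec in load_field_specs(CONFIG_XML):
    print(spec)
```

Keeping the field definitions in data rather than code is what would let a new document template be supported by editing the XML alone.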

Key Use Case:
1. Get the list of structured PDF documents.
2. Convert them into txt files using OCR/PDF integration.
3. Configure the XML based on the data required.
4. Run the bot and get the data in an Excel file.
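The final step of this workflow, turning a folder of converted text files into tabular output, could look roughly like the Python sketch below. It assumes the OCR/PDF conversion has already produced `.txt` files, writes CSV (which Excel can open) rather than a native Excel workbook, and takes the extraction logic as a caller-supplied function; `run_batch` and its parameters are hypothetical names, not part of the bot.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List


def run_batch(text_dir: Path, fields: List[str],
              extract: Callable[[str], Dict[str, str]], out_csv: Path) -> int:
    """Extract `fields` from every .txt file in text_dir; write one CSV row per file.

    Returns the number of files processed.
    """
    rows = 0
    with out_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file"] + fields)
        writer.writeheader()
        for txt_path in sorted(text_dir.glob("*.txt")):
            text = txt_path.read_text(encoding="utf-8")
            values = extract(text)
            row = {"file": txt_path.name}
            # Fields the extractor did not find are written as empty cells.
            row.update({name: values.get(name, "") for name in fields})
            writer.writerow(row)
            rows += 1
    return rows
```

A call such as `run_batch(Path("converted"), ["AccountNumber"], my_extract, Path("output.csv"))` would then produce one row per document, where `my_extract` is whatever function applies the configured lookups to a document's text.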



Business Process
Passed third-party anti-virus scan conducted by Automation Anywhere.
Automation Type
Last Updated: November 4, 2019
First Published: November 19, 2018
Enterprise Version
Community Version: Not Supported



Setup Process


Download the Bot and follow the instructions to install it in your AAE Control Room.


Open the Bot to configure your username and other settings the Bot will need (see the Installation Guide or ReadMe for details).


That's it - now the Bot is ready to get going!

Requirements and Inputs

  • AAE 10.7 SP2 or higher, Microsoft Excel
  • Inputs: PDF documents, scanned structured PDF documents, or image PDF documents, such as account statements, loan disbursement letters, and ID cards.