Extract Data From Scanned PDF Document

This bot extracts relevant information from structured PDF documents, including scanned documents and image-based PDF documents.

Top Benefits

  • Improves adaptability, configurability, and simplicity.
  • Easy for anyone to configure, with no need to depend on the technical team.
  • No code changes are required when the document template changes; only the configuration changes.


  • Extracts any number of required fields from scanned or native PDF documents into an Excel file.
  • Supports any kind of structured document.
  • Data extraction is driven by a generic XML configuration.
  • Retrieves values using key and value search criteria.
  • If no key is available, values can be retrieved by row number.
  • Can check whether a given 'Key' exists anywhere in the document.
  • Supports structured PDF documents, scanned documents, and scanned image documents.
  • Uses 'Tesseract' for OCR.
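
The three lookup styles listed above (value by key, value by row number, and key existence check) can be illustrated with a minimal Python sketch. It is not the bot's own code: the function names and the sample text are hypothetical, and it assumes the document has already been converted to plain text lines.

```python
from typing import Optional


def find_value_by_key(lines: list[str], key: str) -> Optional[str]:
    """Return the text following the first occurrence of `key`, or None."""
    for line in lines:
        pos = line.find(key)
        if pos != -1:
            # Take everything after the key, dropping separators such as ':'.
            return line[pos + len(key):].strip(" :\t")
    return None


def value_by_row(lines: list[str], row: int) -> Optional[str]:
    """Return the 1-based row `row` of the text, or None if out of range."""
    if 1 <= row <= len(lines):
        return lines[row - 1].strip()
    return None


def key_exists(lines: list[str], key: str) -> bool:
    """Report whether `key` appears anywhere in the document text."""
    return any(key in line for line in lines)


# Hypothetical sample text standing in for an OCR result.
sample = [
    "ACME Bank Account Statement",
    "Account Number: 12345678",
    "Closing Balance: 1,250.00",
]
print(find_value_by_key(sample, "Account Number"))
print(value_by_row(sample, 1))
print(key_exists(sample, "Loan Amount"))
```

The row-based lookup is the fallback for documents where a field has no label next to it but always appears on the same line.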

The majority of the documents available with the bank are structured and can be easily digitised using good-quality OCR and PDF integration technologies. The solution uses OCR and PDF integration technology to convert scanned documents to digital format, and a configuration module to extract the required information. An XML configuration file identifies which digitised information to extract. The file is easy to configure and can be tuned to the information required.
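
To give a feel for configuration-driven extraction, here is a rough Python sketch. The XML schema shown (`<extraction>`, `<field>`, and the `name`, `key`, and `row` attributes) is purely hypothetical, since the bot's actual configuration format is not described here; see the bot's ReadMe for the real one.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the bot's real configuration format is not shown here.
CONFIG_XML = """
<extraction>
  <field name="AccountNumber" key="Account Number"/>
  <field name="Balance" key="Closing Balance"/>
  <field name="Title" row="1"/>
</extraction>
"""


def load_rules(xml_text: str) -> list[dict]:
    """Parse each <field> element into a plain dict of its attributes."""
    root = ET.fromstring(xml_text)
    return [dict(field.attrib) for field in root.findall("field")]


def apply_rules(rules: list[dict], lines: list[str]) -> dict:
    """Apply key-based or row-based rules to the digitised text lines."""
    result = {}
    for rule in rules:
        value = None
        if "key" in rule:
            # Key-based rule: take the text after the first matching key.
            for line in lines:
                if rule["key"] in line:
                    value = line.split(rule["key"], 1)[1].strip(" :\t")
                    break
        elif "row" in rule:
            # Row-based rule: take the whole line at the 1-based row number.
            row = int(rule["row"])
            if 1 <= row <= len(lines):
                value = lines[row - 1].strip()
        result[rule["name"]] = value
    return result


text = [
    "ACME Bank Account Statement",
    "Account Number: 12345678",
    "Closing Balance: 1,250.00",
]
print(apply_rules(load_rules(CONFIG_XML), text))
```

Because the rules live in the XML rather than in the code, a change to the document template only requires editing the `key` or `row` attributes.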

Key Use Case:

1. Get the list of structured PDF documents.

2. Convert them into txt files using OCR/PDF integration.

3. Configure the XML based on the data required.

4. Run the bot to write the extracted data into an Excel file.
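
The steps above can be sketched in Python as a small batch run. This is an illustration under stated assumptions, not the bot's implementation: it starts from txt files already produced by the OCR step, uses a hypothetical `FIELDS` mapping in place of the XML configuration, and writes a CSV file (which Excel can open) as a stand-in for the Excel output.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical field-to-key mapping standing in for the XML configuration.
FIELDS = {"AccountNumber": "Account Number", "Balance": "Closing Balance"}


def extract_fields(text: str, fields: dict[str, str]) -> dict[str, str]:
    """Look up each configured key and return the text that follows it."""
    row = {}
    for name, key in fields.items():
        row[name] = ""  # Left empty when the key is not found.
        for line in text.splitlines():
            if key in line:
                row[name] = line.split(key, 1)[1].strip(" :\t")
                break
    return row


def run_batch(txt_dir: Path, out_csv: Path, fields: dict[str, str]) -> int:
    """Extract one row per .txt file and write all rows to a CSV file."""
    rows = []
    for txt_file in sorted(txt_dir.glob("*.txt")):
        row = {"Document": txt_file.name}
        row.update(extract_fields(txt_file.read_text(encoding="utf-8"), fields))
        rows.append(row)
    with out_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Document", *fields])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Demo with a temporary folder holding one already-converted document.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "statement1.txt").write_text(
        "Account Number: 12345678\nClosing Balance: 1,250.00\n", encoding="utf-8"
    )
    count = run_batch(folder, folder / "output.csv", FIELDS)
    output_text = (folder / "output.csv").read_text(encoding="utf-8")
    print(count)
    print(output_text)
```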

Bot Security Program: Level 1
Last Updated: February 7, 2020
First Published: November 19, 2018

Setup Process


Download the Bot and follow the instructions to install it in your AAE Control Room.


Open the Bot to configure your username and other settings the Bot will need (see the Installation Guide or ReadMe for details).


That's it - now the Bot is ready to get going!

Requirements and Inputs

  • AAE 10.7 SP2 or higher, Microsoft Excel
  • Inputs: PDF documents, scanned structured PDF documents, or image PDF documents such as account statements, loan disbursement letters, ID cards, etc.