Insights from DCS
Knowledge for the information generation

Is Cognitive Document Automation the missing link for RPA?

By Pola Zafra-Davis - 16 Aug 2019

I’m not setting out to be controversial, but in my opinion many RPA (Robotic Process Automation) solutions are missing an essential element: the ability to automatically process the documents needed for a business process. Cognitive Document Automation (CDA) helps with just that.

The fact is that RPA is not simply about replacing the user’s keyboard activity with conveniently structured and verified information. You may have heard that RPA can be used in diverse contexts like applications, databases, PDFs or spreadsheets. But people can forget that a significant percentage of business processes rely on information that’s contained inside business documents and messages.

A Thought Exercise: Robotic Process Automation and HR Process Examples


Just consider the following classic example: An RPA process to streamline employee on-boarding.

Conventional wisdom says that this includes integration and data replication between HR, payroll, pension admin, employee benefits, IT admin systems and external insurance portals. But unless the organisation has a streamlined process and has already virtualised many of the traditional steps, I’m also willing to bet that there are documents that still need to be coded properly before a robot can productively automate the process. Examples include references, P60 information, Right to Work documents, driver’s licence information and, in highly regulated industries, proof of education and professional accreditations.

To gain true automation, a CDA system would automatically recognise documents, extract the information and verify it against external data before passing it on to the next task in the process.
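To make the recognise, extract, verify and hand-off sequence a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the keyword rules and the reference register are hypothetical, and a real CDA product would use OCR and trained models rather than keyword checks.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword rules standing in for a trained document classifier.
DOCUMENT_KEYWORDS = {
    "p60": ["p60", "end of year certificate"],
    "driving_licence": ["driving licence", "driver number"],
}

# Hypothetical external reference data used for verification.
KNOWN_NI_NUMBERS = {"QQ123456C"}


@dataclass
class ProcessedDocument:
    doc_type: str
    fields: dict = field(default_factory=dict)
    verified: bool = False


def classify_document(text: str) -> str:
    """Recognise the document type from its already OCR'd text."""
    lowered = text.lower()
    for doc_type, keywords in DOCUMENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_type
    return "unknown"


def extract_fields(doc_type: str, text: str) -> dict:
    """Pull out the fields the downstream process needs."""
    fields = {}
    if doc_type == "p60":
        # UK National Insurance number pattern: two letters, six digits, A-D.
        match = re.search(r"\b[A-Z]{2}\d{6}[A-D]\b", text)
        if match:
            fields["ni_number"] = match.group(0)
    return fields


def verify_against_reference(fields: dict) -> bool:
    """Check the extracted data against the (hypothetical) external register."""
    return fields.get("ni_number") in KNOWN_NI_NUMBERS


def process(text: str) -> ProcessedDocument:
    """Run recognise -> extract -> verify and return the result for hand-off."""
    doc_type = classify_document(text)
    fields = extract_fields(doc_type, text)
    return ProcessedDocument(doc_type, fields, verify_against_reference(fields))
```

In a setup like this, only documents that come back with `verified` set to `True` would be passed straight to the RPA robot; anything else would be routed to a person for checking.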

Now, you may think that if your target organisation is not a large-scale enterprise, this HR RPA example is probably a marginal one. However, for transactional items such as AP invoices or insurance claims, processing becomes a greater burden on an operations team! Essentially, the RPA robots are delayed while people process the documents and/or messages. In typical cases, employees have to re-key already digitised information because the information is still unstructured.

How does Cognitive Document Automation help?

Cognitive and Document Automation: the components of this term are key to getting to grips with why RPA needs CDA. A quick diagram sums up the gist of the argument. I’ll be breaking down this diagram further below…

Document Automation

Firstly, the “DA” (Document Automation) part of Cognitive Document Automation is well-established technology. DA has its roots in the OCR days of old: scanned documents would be OCR’d and the data then processed by specialist algorithms. These algorithms are tuned to extract a dataset based on various rules depending on the location of the data on each page. Effective DA systems lived or died on the strength, ease of setup and reliability of their algorithms. Data extraction was a tough task because of the infinitely variable nature of the documents and messages, especially those coming from “Joe or Jane Public”.
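A location-based rule of this kind can be pictured as a template that maps each field to a region of the page. The Python sketch below is only an illustration under assumptions: the `Word` and `Zone` types, the invoice template and its coordinates are all hypothetical, and the OCR step is assumed to have already produced words with page positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge, as a fraction of page width
    y: float  # top edge, as a fraction of page height


@dataclass(frozen=True)
class Zone:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


# Hypothetical template for one invoice layout: field name -> page region.
INVOICE_TEMPLATE = {
    "invoice_number": Zone(0.60, 0.05, 0.95, 0.10),
    "total": Zone(0.60, 0.85, 0.95, 0.95),
}


def extract_by_zone(words: list[Word], template: dict[str, Zone]) -> dict[str, str]:
    """Collect the OCR'd words that fall inside each field's zone."""
    result = {}
    for field_name, zone in template.items():
        in_zone = [w for w in words if zone.contains(w)]
        in_zone.sort(key=lambda w: (w.y, w.x))  # reading order: top-down, left-right
        result[field_name] = " ".join(w.text for w in in_zone)
    return result
```

The weakness is easy to see: if a sender moves the invoice number to another corner of the page, the zone comes back empty and someone has to adjust the template by hand.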

It’s reasonable to expect that in the majority of cases, a contemporary document automation system should reduce manual work by a factor of 3x to 4x, i.e. an operator will be able to process three to four times more documents than with an entirely manual system. As a bonus, those documents will also contain fewer errors, as an automated system can incorporate more rigorous verification.
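To see what a 3x to 4x factor could mean for staffing, here is a small back-of-the-envelope calculation in Python. The figures are purely hypothetical and chosen for round numbers: 100 documents per operator per day when working manually, and a volume of 1,200 documents per day.

```python
import math


def documents_per_day(manual_rate: float, speedup: float) -> float:
    """Documents one operator can handle per day at a given speed-up factor."""
    return manual_rate * speedup


def operators_needed(daily_volume: float, per_operator_rate: float) -> int:
    """Whole number of operators required to clear the daily volume."""
    return math.ceil(daily_volume / per_operator_rate)


# Hypothetical figures, for illustration only.
MANUAL_RATE = 100    # documents per operator per day, fully manual
DAILY_VOLUME = 1200  # documents arriving per day

for speedup in (1, 3, 4):
    rate = documents_per_day(MANUAL_RATE, speedup)
    print(f"{speedup}x: {operators_needed(DAILY_VOLUME, rate)} operators")
```

With these assumed numbers, the same volume that needs 12 operators manually needs 4 at 3x and 3 at 4x.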


Secondly, the “Cognitive” (machine-learning) part removes the need to manually keep up with the inevitable, constant changes in documents. Cognitive automation can also refine the rules regarding the location of data in order to maintain acceptable levels of automation.

The dirty little secret here is that in most applications, many of those documents change over time. If a document automation system is left at its deployed “day one” configuration, then data capture accuracy can be severely impacted.

A cognitive or machine-learning approach allows these incremental changes to be accommodated as a part of the production process. Under a cognitive approach, validation operators will make a change and the machine learning element will “learn” that change. The system, now smarter, will remember to flag the next examples of the newly learned document layout for validation.
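The feedback loop can be sketched in a few lines of Python. This is a toy stand-in, not real machine learning: `LayoutMemory` is a hypothetical lookup table that remembers where an operator found a value, and keeps flagging that layout for validation until it has been confirmed a set number of times (the threshold of two is an arbitrary assumption).

```python
from dataclasses import dataclass, field


@dataclass
class LearnedLayout:
    # field name -> (x, y) position where the operator last found the value
    positions: dict = field(default_factory=dict)
    confirmations: int = 0


class LayoutMemory:
    """Toy stand-in for the machine-learning component."""

    def __init__(self, confirmations_required: int = 2):
        self.layouts: dict[str, LearnedLayout] = {}
        self.confirmations_required = confirmations_required

    def learn_correction(self, layout_id: str, field_name: str, position: tuple) -> None:
        """Store the operator's correction and reset trust in this layout."""
        layout = self.layouts.setdefault(layout_id, LearnedLayout())
        layout.positions[field_name] = position
        layout.confirmations = 0

    def confirm(self, layout_id: str) -> None:
        """An operator validated a document of this layout without changes."""
        self.layouts[layout_id].confirmations += 1

    def needs_validation(self, layout_id: str) -> bool:
        """Unknown or newly learned layouts are flagged for a human check."""
        layout = self.layouts.get(layout_id)
        if layout is None:
            return True
        return layout.confirmations < self.confirmations_required

    def position_for(self, layout_id: str, field_name: str):
        """Return the learned position for a field, or None if nothing is known."""
        layout = self.layouts.get(layout_id)
        return layout.positions.get(field_name) if layout else None
```

The point of the design is that learning happens inside the normal validation workflow: nobody has to stop production and retune templates when a layout changes.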

The Benefits of CDA

A Cognitive Document Automation approach delivers two main benefits:

  1. Simplicity. A simpler, less risky setup as the system learns from inputs of expert users rather than relying on the configuration skill of a technician tasked with handling every document layout.
  2. Resilience. The system remains in tune as it learns and adapts to change; this compares well with the somewhat hit-and-miss approach of periodic tuning when things appear to be taking longer, i.e. more documents needing correction and less automation.

Conclusion: The Big Automation Picture

Hopefully by now you will have begun to appreciate how end-to-end processes that involve the processing of documents and messages can benefit from CDA (Cognitive Document Automation). It’s critical that the integration between the RPA solution and the CDA components is tight, robust and well supported.

But it’s also important to start from the basis that an advanced Document Automation system will have a lot of sophistication even before Cognitive is introduced. OCR errors are still commonplace, and the new flood of images from mobile devices simply increases data capture challenges. An advanced Document Automation system will include specific capabilities to address these challenges, such as image enhancement and fuzzy matching of data. Fuzzy matching is an important advanced feature because, again, there are no guarantees that information contained on a document will be an exact match with data in Master Files or applications.
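Fuzzy matching is easy to demonstrate with Python’s standard `difflib` module. The sketch below is a minimal illustration, not how any particular product does it: the supplier list, the `match_supplier` helper and the 0.8 similarity cutoff are all assumptions.

```python
import difflib

# Hypothetical Master File of supplier names.
MASTER_SUPPLIERS = [
    "Acme Industrial Supplies Ltd",
    "Brightwater Logistics",
    "Northern Paper Co",
]


def match_supplier(ocr_name, master=MASTER_SUPPLIERS, cutoff=0.8):
    """Return the closest Master File entry, or None if nothing is similar enough."""
    # Compare in lower case, but return the name as stored in the Master File.
    lookup = {name.lower(): name for name in master}
    hits = difflib.get_close_matches(
        ocr_name.lower().strip(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


# A typical OCR slip: a lower-case "l" read in place of a capital "I".
print(match_supplier("Acme lndustrial Supplies Ltd"))
print(match_supplier("Zenith Holdings"))
```

The first call resolves to the Master File entry despite the misread character, while the second returns `None` because no supplier is close enough. Where to set the cutoff is a judgment call: too low and the wrong supplier gets matched, too high and operators end up correcting trivial OCR slips.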

Ready to explore Cognitive Document Automation in real time? With all the talk about RPA, OCR, machine learning and document management, I admit it can get confusing to see where it all fits in. When factoring in your own unique business processes or “ways of doing”, you will need something more substantial: a use case.

If you’re looking for some one-to-one advice, DCS has an experienced team that can review and analyse requirements for RPA. DCS is especially skilled at building RPA use cases with requirements involving documents, messages or other unstructured data. Drop us a note!
