Amazon Textract

Amazon Textract is a machine learning service that automatically extracts text, handwriting, layout elements, and data from scanned documents. Unlike traditional optical character recognition (OCR), it can accurately process forms, tables, images, and other structured documents and reproduce them digitally, without manual configuration.

Eliminate manual effort

Where OCR often requires significant manual work to build templates for each possible document layout, Textract can adapt automatically and provide accurate digitisation of data while preserving its context. This eliminates the need for extensive manual checks, even with novel document structures.

Confidently extract data of all types

Textract provides confidence ratings for all its extractions, allowing you to flag any ambiguous entries and giving increased certainty that your data is error-free. It’s relied on for handling sensitive data by customers in the financial, health, and public sectors.
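
As a rough illustration of flagging ambiguous entries, the sketch below walks a Textract-style response and collects every block whose confidence falls below a threshold. It assumes the response shape returned by Textract's document analysis calls (a `Blocks` list whose items carry `Id`, `BlockType`, `Text`, and a `Confidence` value from 0 to 100). The function name `flag_low_confidence` and the default threshold of 90 are our own illustrative choices, not part of the service.

```python
from typing import Any, Dict, List


def flag_low_confidence(
    response: Dict[str, Any], threshold: float = 90.0
) -> List[Dict[str, Any]]:
    """Return a summary of blocks whose confidence is below `threshold`.

    `response` is assumed to look like a Textract analysis result:
    {"Blocks": [{"Id": ..., "BlockType": ..., "Text": ..., "Confidence": ...}]}
    The function name and threshold are hypothetical, chosen for illustration.
    """
    flagged = []
    for block in response.get("Blocks", []):
        confidence = block.get("Confidence")
        # Some blocks may carry no confidence value; skip those.
        if confidence is None:
            continue
        if confidence < threshold:
            flagged.append(
                {
                    "Id": block.get("Id"),
                    "BlockType": block.get("BlockType"),
                    "Text": block.get("Text", ""),
                    "Confidence": confidence,
                }
            )
    return flagged


if __name__ == "__main__":
    # Hand-written sample data, not real Textract output.
    sample = {
        "Blocks": [
            {"Id": "a", "BlockType": "LINE", "Text": "Jane Doe", "Confidence": 99.2},
            {"Id": "b", "BlockType": "LINE", "Text": "J0hn Sm1th", "Confidence": 62.5},
        ]
    }
    for entry in flag_low_confidence(sample):
        print(entry["Id"], entry["Text"], entry["Confidence"])
```

In practice the flagged entries could be routed to a human reviewer while everything above the threshold passes straight through. The right threshold depends on how costly an error is for your data.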

Powerful pretrained features and customisability

Textract contains ready-made features for recognising common document formats such as tables, invoices, identity documents, and signatures. Extensive customisation options allow tailoring it to the structure of your documents, letting it identify missing or incomplete data.
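
To give a sense of how the ready-made features are selected, here is a minimal sketch that builds the arguments for Textract's `analyze_document` call in boto3, which takes a `Document` and a list of `FeatureTypes`. The helper `build_analyze_request` and the `SUPPORTED_FEATURES` set are hypothetical names of our own. The sketch leaves out the `QUERIES` feature, which needs additional configuration, and does not cover invoices or identity documents, which Textract handles through separate operations.

```python
from typing import Any, Dict, Iterable

# Feature types accepted by analyze_document that need no extra configuration.
# QUERIES is deliberately left out of this sketch.
SUPPORTED_FEATURES = {"TABLES", "FORMS", "SIGNATURES", "LAYOUT"}


def build_analyze_request(
    document_bytes: bytes, features: Iterable[str]
) -> Dict[str, Any]:
    """Build keyword arguments for a Textract analyze_document call.

    Hypothetical helper: validates the requested features and returns a
    dict that could be passed as client.analyze_document(**request).
    """
    requested = {feature.upper() for feature in features}
    if not requested:
        raise ValueError("At least one feature type is required")
    unknown = requested - SUPPORTED_FEATURES
    if unknown:
        raise ValueError(f"Unsupported feature types: {sorted(unknown)}")
    return {
        "Document": {"Bytes": document_bytes},
        "FeatureTypes": sorted(requested),
    }


if __name__ == "__main__":
    request = build_analyze_request(b"<scanned page bytes>", ["tables", "forms"])
    print(request["FeatureTypes"])
    # With AWS credentials configured, the request could be sent like this:
    #   import boto3
    #   response = boto3.client("textract").analyze_document(**request)
```

Keeping request construction separate from the network call makes the feature selection easy to test without AWS access.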

Our work with Textract

We rebuilt several parts of the UK government’s Register to Vote service, including the system used to process paper applications. This service experiences a huge surge in demand during the run-up to elections, so it needs to be robust and must not rely on significant manual corrections.

Talk 1-1 with a consultant

Book a call with one of our consultants to discuss your challenges.