Document Understanding User Guide
Hardware requirements
Running the Document Understanding™ ML Packages on a GPU includes an optimization meant to accelerate the training process.
As a result, training on GPU is five times faster than on CPU (previously it was 10-20 times faster). This also makes it possible to train models on CPU with up to 5000 pages (previously it was 500 maximum).
Training Document Understanding models on GPU requires a GPU with at least 11 GB of video RAM.
Use the table below to check the compatibility between the ML Packages, CUDA version, and GPU driver version.
ML Packages version | CUDA version | cuDNN version | NVIDIA driver (lowest compatible version) | Hardware generation |
---|---|---|---|---|
2023.10 | CUDA 11.8 or later | cuDNN 8.2.0 or later | R450.80.04 | Ampere, Turing, Volta, Pascal, Maxwell, Kepler |
CUDA is backward compatible, meaning that existing CUDA applications can continue to be used with newer CUDA versions.
More information about compatibility can be found here.
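The driver and video RAM requirements above can be checked programmatically. The following is a minimal sketch, not part of the product: the function names are hypothetical, it assumes `nvidia-smi` is installed and on the PATH, and it treats 11 GB as 11,000 MiB (an assumption, since cards may report their total memory slightly differently). It does not check the CUDA or cuDNN versions.

```python
import subprocess

# Minimum values taken from the requirements above.
MIN_DRIVER = (450, 80, 4)   # NVIDIA driver R450.80.04
MIN_VRAM_MIB = 11_000       # assumption: 11 GB treated as 11,000 MiB


def parse_driver(version: str) -> tuple:
    """Turn a driver string such as '535.104.05' into (535, 104, 5)."""
    return tuple(int(part) for part in version.strip().split("."))


def check_gpu_line(line: str) -> tuple:
    """Check one 'driver_version, memory.total' CSV line.

    Returns (driver_ok, vram_ok).
    """
    driver, memory_mib = [field.strip() for field in line.split(",")]
    driver_ok = parse_driver(driver) >= MIN_DRIVER
    vram_ok = int(memory_mib) >= MIN_VRAM_MIB
    return driver_ok, vram_ok


def query_gpus() -> list:
    """Return one CSV line per GPU, as reported by nvidia-smi."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=driver_version,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

On a machine with an NVIDIA driver installed, passing each line returned by `query_gpus()` to `check_gpu_line` indicates whether that GPU meets the driver and memory minimums.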
You can use the Document Understanding framework to read text using an OCR engine, classify documents, and extract information from them. Classification and extraction tasks run on CPU, while running the OCR on GPU is recommended (though a CPU version is also provided in case a GPU is not available).
On-premises deployment is done through Automation Suite and is subject to its hardware requirements.
You can use the same type of VM for both extractors and classifiers; the only difference is the infrastructure size. We recommend using the OCR engine with a GPU VM. The compatibility between the ML Packages, CUDA version, and GPU driver version is described in the Compatibility Matrix section.
The following example illustrates the hardware requirements.
ML Package | Hardware requirement | Capability |
---|---|---|
Extractor packages (Invoices, Receipts, Purchase Orders, etc.) | Use a VM with a minimum of 2 CPU cores and 8 GB RAM | Can process 25,000 pages/day or 5 million pages/year, assuming perfectly constant traffic (no spikes). |
Classifier packages (DocumentClassifier) | Use a VM with a minimum of 2 CPU cores and 8 GB RAM | Can process 40,000 documents/day or 8 million documents/year, assuming perfectly constant traffic (no spikes). |
OCR | Requires a minimum of 8 GB RAM if running on CPU. No requirement if running on GPU. | Can process 50,000 pages/day. |
OCR_CPU | Requires a minimum of 4 GB RAM. | Can process 50,000 pages/day. |
Example: If you process 10 million pages/year, you need a VM with 4 CPU cores and 16 GB RAM for the extractor, another VM of the same size for the classifier, and a third VM with an NVIDIA GPU for the OCR engine.
You can also choose to use only one VM for both extractor and classifier, meaning that you need a single VM with 8 CPU cores and 32 GB RAM.
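The sizing in this example can be reproduced with a short calculation. This is a rough sketch under stated assumptions: it assumes capacity scales linearly in units of 2 CPU cores and 8 GB RAM (the example implies this but the guide does not state it), and it treats the classifier volume as 10 million documents/year. The function and constant names are hypothetical.

```python
import math

# Per-unit yearly capacities from the table above.
EXTRACTOR_PAGES_PER_UNIT = 5_000_000
CLASSIFIER_DOCS_PER_UNIT = 8_000_000

# One unit is the minimum VM from the table: 2 CPU cores and 8 GB RAM.
UNIT_CORES = 2
UNIT_RAM_GB = 8


def size_vm(volume_per_year: int, capacity_per_unit: int) -> tuple:
    """Return (cpu_cores, ram_gb), assuming linear scaling in whole units."""
    units = max(1, math.ceil(volume_per_year / capacity_per_unit))
    return units * UNIT_CORES, units * UNIT_RAM_GB


yearly_volume = 10_000_000

extractor = size_vm(yearly_volume, EXTRACTOR_PAGES_PER_UNIT)
classifier = size_vm(yearly_volume, CLASSIFIER_DOCS_PER_UNIT)

# A single shared VM needs the sum of both workloads.
combined = (extractor[0] + classifier[0], extractor[1] + classifier[1])

print("extractor:", extractor)    # (4, 16)
print("classifier:", classifier)  # (4, 16)
print("combined:", combined)      # (8, 32)
```

The results match the figures in the example: 4 cores and 16 GB RAM each for the extractor and the classifier, or 8 cores and 32 GB RAM for a single shared VM. Real traffic with spikes may need more headroom than this linear estimate suggests.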