Document Understanding Modern Projects User Guide
UiPath® DocPath
The DocPath large language model (LLM) is our latest data extraction model technology, designed to replace the current generation of models used within UiPath® Document Understanding™. While DocPath operates similarly to previous models, it was trained on a wide variety of documents, which enables it to process common document types with little to no training. What sets DocPath LLM apart is its generative architecture, which significantly improves accuracy and simplifies extraction. You can also fine-tune the model with your own datasets.
To learn more about the DocPath architecture and the techniques used for training, see the DocPath page on our AI blog.
Currently, UiPath DocPath is only available for US-based tenants in Document Understanding modern projects.
- Extraction models in the West Europe region are based on DocPath, except for Invoices Japan and Receipts Japan.
- Public endpoints for extraction models in West Europe are based on DocPath, except for the following:
  - 9465
  - Financial Statements
  - Invoices China
  - Invoices Hebrew
  - Invoices Japan
  - Receipts Japan
- The following public endpoints are based on DocPath in the Japan region:
  - Invoices China
  - Invoices Japan
  - Receipts Japan
DocPath LLM offers numerous enhancements over previous models. It improves accuracy, especially with tables, adapts to various document layouts to reduce annotation efforts, and boosts automation rates.
- Improved accuracy: DocPath LLM delivers a higher accuracy rate and superior F1 score for semi-structured documents such as invoices, receipts, and purchase orders. This ensures precise and consistent data extraction.
- Effortless annotation: The model reduces manual work by only requiring one annotation per document, eliminating the need to annotate each field instance on every page.
- Enhanced automation: With a greater correlation between confidence level and accuracy, DocPath LLM enhances automation rates while reducing the number of documents sent to Action Center for the same accuracy level.
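To make the link between confidence and automation rate concrete, here is a minimal Python sketch of confidence-based routing. It is not part of the product: the `ExtractedField` class, the `needs_review` and `automation_rate` helpers, the 0.90 threshold, and the sample values are all hypothetical and chosen purely for illustration. The sketch assumes a document is processed without human validation only when every field meets the threshold.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    # Hypothetical container for one extracted value; not a UiPath type.
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def needs_review(fields: list[ExtractedField], threshold: float) -> bool:
    """Return True if any field falls below the confidence threshold."""
    return any(f.confidence < threshold for f in fields)


def automation_rate(documents: list[list[ExtractedField]], threshold: float) -> float:
    """Fraction of documents that pass through without human review."""
    if not documents:
        return 0.0
    automated = sum(1 for doc in documents if not needs_review(doc, threshold))
    return automated / len(documents)


# Illustrative, made-up values (not measured results).
docs = [
    [ExtractedField("Invoice Number", "INV-001", 0.98), ExtractedField("Total", "120.00", 0.95)],
    [ExtractedField("Invoice Number", "INV-002", 0.99), ExtractedField("Total", "87.50", 0.62)],
    [ExtractedField("Invoice Number", "INV-003", 0.91), ExtractedField("Total", "15.00", 0.93)],
    [ExtractedField("Invoice Number", "INV-004", 0.97), ExtractedField("Total", "230.10", 0.96)],
]

print(automation_rate(docs, threshold=0.90))  # 0.75
```

In this toy data set, only the second document has a field below the threshold, so three of the four documents skip manual validation. The closer confidence tracks actual correctness, the fewer correct documents are held back at a given threshold.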
In our internal tests, DocPath outperformed its predecessor: the false positive rate fell by around 15%, and the false negative rate fell by nearly 17%.
The DocPath LLM is available exclusively for Document Understanding modern projects. Despite the introduction of DocPath, all existing project versions will still use current model versions. This ensures a seamless transition without any disruption to ongoing production workflows.
To start training an existing document type on DocPath, unconfirm and then confirm all fields in a few documents.
The field names you choose can greatly impact the performance of the model. To ensure optimal results, use natural language and proper grammar for field names. Use only widely recognized abbreviations, such as No (Number), Acct (Account), Addr (Address), and Apt (Apartment). Currently, only West European languages are supported, so make sure that the field names you choose are written in one of these languages. Refrain from using non-descriptive names, such as "Column 3", unless the document specifically uses that terminology.
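The naming guidance above can be approximated with a small pre-check that you run over a list of planned field names before creating them. The following Python sketch is only an illustration: `field_name_warnings`, `is_latin_text`, and the `GENERIC_NAME` pattern are hypothetical helpers, not part of Document Understanding. The Latin-script test is a rough proxy for "West European language" and will also accept other Latin-script languages.

```python
import re
import unicodedata

# Assumed pattern for placeholder-style names such as "Column 3" or "Field 12".
GENERIC_NAME = re.compile(r"^(column|col|field)\s*\d+$", re.IGNORECASE)


def is_latin_text(text: str) -> bool:
    """True if every alphabetic character has a Unicode name containing LATIN."""
    return all(
        "LATIN" in unicodedata.name(ch, "")
        for ch in text
        if ch.isalpha()
    )


def field_name_warnings(name: str) -> list[str]:
    """Return human-readable warnings for a proposed field name."""
    stripped = name.strip()
    if not stripped:
        return ["empty field name"]
    warnings = []
    if GENERIC_NAME.match(stripped):
        warnings.append("non-descriptive name")
    if not is_latin_text(stripped):
        warnings.append("contains non-Latin characters")
    return warnings


for candidate in ["Invoice Number", "Column 3", "Numéro de facture", "発行日"]:
    print(candidate, "->", field_name_warnings(candidate))
```

A flagged name is not necessarily wrong. For example, "Column 3" is acceptable when the document itself uses that label, so treat the output as a prompt for review rather than a hard rule.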
UiPath DocPath currently supports only Latin script languages. If you need to train a model in non-Latin script languages, choose the legacy model type. If the legacy model is selected, choose the appropriate base model for your document type.
To choose between the DocPath and legacy model types, navigate to the Settings tab in Document Type Manager and select the required model type from the Model type drop-down list.
- The extracted fields must match exactly with the text in the documents. This process does not include summarization or other types of text analysis.
- The following document types are not currently based on DocPath and still work on the previous generation:
  - Financial Statements
  - Invoices China
  - Invoices Hebrew
  - Invoices Japan
The document type will be trained using the legacy model.
UiPath DocPath does not currently support non-Latin script languages.
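The exact-match limitation listed above means that an extracted value should be findable as literal text in the source document. As a hedged illustration, the Python sketch below checks this for plain text. The `is_verbatim` and `normalize_whitespace` helpers are hypothetical, the sample document text is invented, and collapsing whitespace before comparing is an assumption made here so that line breaks in OCR output do not cause false mismatches.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def is_verbatim(extracted_value: str, document_text: str) -> bool:
    """True if the extracted value occurs literally in the document text."""
    value = normalize_whitespace(extracted_value)
    return bool(value) and value in normalize_whitespace(document_text)


# Invented sample text, for illustration only.
document_text = """Invoice No: INV-2041
Bill to: Contoso Ltd.
Total due: 1,250.00 EUR"""

print(is_verbatim("INV-2041", document_text))          # True
print(is_verbatim("1,250.00 EUR", document_text))      # True
print(is_verbatim("about 1250 euros", document_text))  # False
```

The last call returns False because the value is a paraphrase rather than text that appears in the document, which is the kind of output the model is not designed to produce.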