Document Understanding User Guide
Last updated Sep 17, 2024

Intelligent OCR activities

With Intelligent OCR activities, you can process documents end to end: you can digitize, classify, extract, and validate documents, and also train your extractors and classifiers on your own data so that they become faster and more accurate. The steps involved in creating Document Understanding™ processes using Intelligent OCR activities are:

  1. Create the Taxonomy: Define document types and convert them into a Document Object Model variable using the Load Taxonomy activity.
  2. Digitize documents: Prepare documents so robots can process them using an OCR engine, by storing their text inside a String variable, and basic information about them inside a Document Object Model file.
  3. Classify documents: Run documents through one or more classifiers, so robots can identify what types of files they're processing.
  4. Validate the classification of documents: Verify and validate that the documents have been correctly classified.
  5. Train your classifiers: Configure your classifiers based on input received while validating the classification.
  6. Extract data from documents: Identify and extract specific information from your documents using various extractors to send it for validation.
  7. Validate the extracted data: Verify and validate the data extracted from the documents you processed and classified, using the input of your team members within Action Center.
  8. Train your extractors: Configure your extractors based on input received while validating the extraction.
  9. Consume exported data: Once you validate the extracted data, you can use it as is or export it as a DataSet variable using the Export Extraction Results activity.
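
The order of these steps, and the way each step hands its output to the next, can be sketched in plain Python. This is only a conceptual illustration under stated assumptions, not the IntelligentOCR.Activities API: every name below (Taxonomy, Document, digitize, classify, extract, process, training_log) is hypothetical, the OCR step is a stub, and the classifier and extractor are toy keyword and "Field: value" matchers standing in for the real ones.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Taxonomy:
    """Hypothetical stand-in for step 1: document types and their expected fields."""
    fields_by_type: dict


@dataclass
class Document:
    """Hypothetical stand-in for the digitized document."""
    text: str


def digitize(raw_text: str) -> Document:
    # Step 2: a real workflow would call an OCR engine; this stub only wraps the text.
    return Document(text=raw_text)


def classify(doc: Document, taxonomy: Taxonomy, keywords: dict) -> Optional[str]:
    # Step 3: toy classifier, returns the first type whose keyword appears in the text.
    lowered = doc.text.lower()
    for doc_type in taxonomy.fields_by_type:
        if any(word in lowered for word in keywords.get(doc_type, [])):
            return doc_type
    return None


def extract(doc: Document, taxonomy: Taxonomy, doc_type: str) -> dict:
    # Step 6: toy extractor, reads "Field: value" lines for fields the taxonomy defines.
    wanted = set(taxonomy.fields_by_type[doc_type])
    found = {}
    for line in doc.text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in wanted:
            found[name.strip()] = value.strip()
    return found


def process(raw_text: str, taxonomy: Taxonomy, keywords: dict,
            confirm_type: Callable, confirm_fields: Callable,
            training_log: list) -> list:
    doc = digitize(raw_text)                                  # step 2
    predicted = classify(doc, taxonomy, keywords)             # step 3
    doc_type = confirm_type(predicted)                        # step 4: human review
    if doc_type is None:
        return []                                             # nothing to extract
    training_log.append(("classifier", doc.text, doc_type))   # step 5: feedback
    extracted = extract(doc, taxonomy, doc_type)              # step 6
    validated = confirm_fields(extracted)                     # step 7: human review
    training_log.append(("extractor", doc.text, validated))   # step 8: feedback
    # Step 9: flatten into rows, loosely analogous to exporting a table.
    return [{"DocumentType": doc_type, "Field": k, "Value": v}
            for k, v in validated.items()]


if __name__ == "__main__":
    taxonomy = Taxonomy({"Invoice": ["Vendor", "Total"]})     # step 1
    log = []
    rows = process("INVOICE 001\nVendor: Acme\nTotal: 12.50",
                   taxonomy, {"Invoice": ["invoice"]},
                   confirm_type=lambda t: t, confirm_fields=lambda f: f,
                   training_log=log)
    for row in rows:
        print(row)
```

In this sketch the two confirm callbacks stand in for human validation, and the training log simply records what the reviewer confirmed. A real trainer would use that feedback to update the classifier or extractor.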

Before you begin

Before you begin using IntelligentOCR.Activities, be aware of the following characteristics:

  • High configurability, which also involves a steep learning curve.
  • Multiple objects and activities, designed to provide flexibility.
  • Reduced reusability, due to the following complexities:
    • You need to configure numerous settings inside the workflow.
    • You need to pass explicit arguments from one activity to the other repeatedly, such as:
      • Taxonomy
      • Document Object Model
      • Text
      • Classification results
      • Extraction results
