document-understanding
2021.10
Machine Learning Extractor Trainer
Document Understanding User Guide
Last updated Oct 17, 2024
The Machine Learning Extractor Trainer collects human feedback in a directory of your choice. Once you have collected data and want to retrain an ML model, zip the contents of that directory and upload the archive to Data Manager for curation.
Follow these steps to use the Machine Learning Extractor Trainer activity:
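The zipping step can be done with any archive tool. As a minimal sketch in Python, assuming the feedback sits in an ordinary local folder, the helper below archives the folder's contents (not the folder itself). The function name `zip_feedback_folder` and the paths in the usage note are hypothetical, and the upload to Data Manager happens in its own interface and is not shown.

```python
import zipfile
from pathlib import Path


def zip_feedback_folder(output_folder: str, archive_path: str) -> int:
    """Zip the contents of output_folder and return the number of files archived.

    Hypothetical helper for illustration; it is not part of any UiPath package.
    """
    source = Path(output_folder)
    if not source.is_dir():
        raise NotADirectoryError(f"{output_folder} is not a directory")

    target = Path(archive_path).resolve()
    count = 0
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives inside the folder.
            if path.is_file() and path.resolve() != target:
                # Store paths relative to the folder so the zip holds its contents only.
                archive.write(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

For example, `zip_feedback_folder("C:/Feedback/Invoices", "C:/Feedback/invoices_export.zip")` would produce a single archive to upload, where both paths are placeholders for your own Output Folder and destination.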
- Use the Taxonomy Manager Wizard to define your document types and fields.
- Drag a Machine Learning Extractor Trainer activity into a Train Extractors Scope activity.
- In the Machine Learning Extractor wizard that opens automatically, add the Endpoint information.
- Select the Update activity arguments checkbox if you also want the entered values to be used as input arguments for the activity, specifically for the Endpoint.
- Click the Get Capabilities button. The wizard closes after this operation.
- Enter a value for Output Folder.
- Select the Configure Extractors option of the Train Extractors Scope. A wizard is displayed.
- The Machine Learning Extractor Trainer is now ready for configuration. Expand the document type that you want to apply it to, and select the checkboxes next to the fields you want to train.
- Fill in the textboxes either manually or by selecting, from the available drop-down list, the correct data you wish to map to each field. The drop-down list contains all fields that the Machine Learning Extractor Trainer, using the endpoint entered in the Machine Learning Extractor wizard, declares as extraction capabilities.
Note: If you select the checkbox but leave the textbox empty, the textbox is automatically filled in with the Document Type ID from the local taxonomy. The changes apply after saving. To avoid using a long string for the field ID, we recommend entering a value manually if you do not have access to the internal taxonomy of the extractor.
- To check whether you are using the latest capabilities of the extractor, click Get or refresh extractor capabilities, which opens the Machine Learning Extractor wizard.
- Selecting one of the options from a drop-down list automatically confirms that field.
- To train an extractor based on its extraction result, set the Framework Alias field to the exact alphanumeric value previously used for that extractor.
- Select the Save button once all fields are configured properly.
Important: You cannot choose the same option for two distinct fields.
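The uniqueness rule above can be checked before you start filling in the wizard, for instance when you plan the mapping in a spreadsheet or script. The following is a minimal sketch in Python, not part of the wizard or any UiPath API: the function name and the field names in the usage note are hypothetical, and the mapping is assumed to be a plain dictionary from taxonomy field to the selected extractor option.

```python
from collections import defaultdict


def find_duplicate_mappings(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Return every extractor option that was chosen for more than one taxonomy field.

    Hypothetical planning helper; the keys are taxonomy fields and the values
    are the options you intend to select in the drop-down lists.
    """
    fields_by_option = defaultdict(list)
    for taxonomy_field, option in mapping.items():
        fields_by_option[option].append(taxonomy_field)
    # Keep only the options that are shared by two or more fields.
    return {
        option: fields
        for option, fields in fields_by_option.items()
        if len(fields) > 1
    }
```

Calling it with a planned mapping such as `{"Invoice No": "invoice-no", "Reference": "invoice-no", "Total": "total"}` reports that `invoice-no` is assigned to both `Invoice No` and `Reference`, which the wizard would not accept. An empty result means no option is reused.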