Communications Mining User Guide
Last updated Nov 7, 2024

Overview of setting up your extraction fields

Note: First set up your labels and decide which processes you want to automate. Then set up your extractions in one of the following ways, weighing the advantages of each. At this stage, it is very important to decide which data points you need to extract to facilitate end-to-end automation.

Explore page

  • At any point during the model training process, you can set up a new extraction, modify your schema, or add any additional fields to your existing schema in Explore.​
  • By setting up your extractions in Explore, you can:
    • base your fields on data from your messages.
    • add new fields to extractions as you see them.

Settings page

  • At any point during the model training process, you can set up a new extraction, modify your schema, or add any additional fields to your existing schema in Settings.
  • If you know what fields you want to extract upfront, set up your extractions in bulk, in Settings.

Train page

If you train your model through the Train tab, you can set up any new extractions. You can also annotate both labels and field extractions, as you go through the guided training experience.

General guidance

Note: The platform’s LLM generative capabilities create the extractions. Predictions are based on the trained label and the name of the field.
  • To set up your extractions, set up your fields. Each field requires a name and a field type. It is recommended to do this at the lowest child-level label.
  • Be descriptive and concise. Choose field names that accurately describe the data they represent, balancing brevity and clarity. An accurate, descriptive name gives the model the necessary context on the role of the field.
  • For example, for an address change where you only want to extract the new address, it helps to configure field names such as: new street address, new town, new postcode, and new city.
  • Avoid ambiguous field names. Ensure that field names cannot be easily confused with other fields or concepts in your project. For example, instead of using Value, use a more specific name like Sales Amount or Account Balance.
  • Multiple extraction fields can share the same field type, but multiple general fields cannot. For general fields, work around this by creating another field type with the same settings.
Note: If you have a Date Change label and want to capture the Date Before and Date After fields, you cannot have the same data type tied to both these fields (e.g., a Date data type used as the underlying field type for both these form definitions).

You need to create two different field types (one for Date Before and one for Date After) and map them to the respective form definitions.

Field name best practice

A field name is used to prompt the model. If your extractions are not performing as expected, making the field name more specific to your use case may improve performance.

The field names below are just examples. How you name your fields depends on your use case and on the context of what you are trying to extract.

Use case: As part of an address change request, you want to extract the details of the new address to input into your system downstream.
  • Not recommended field names:
    • Address Line
    • Postcode
    • City
  • Better performing field names:
    • New Address Line
    • New Postcode
    • New City

Use case: As part of a logistics shipping request, you want to identify the total tax breakdown (both the VAT amount and the VAT rate) on each of your goods to input into SAP.
  • Not recommended field names:
    • Item ID
    • Tax value
  • Better performing field names:
    • Item ID
    • VAT amount
    • VAT percentage

Use case: As part of an invoice change request, you want to identify what the old invoice number was and what it needs to be changed to, in order to cancel the old invoice and re-issue a new one.
  • Not recommended field names:
    • Invoice number
  • Better performing field names:
    • Old invoice number
    • New invoice number
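
The naming guidance above can be turned into a rough pre-check before you create fields. The following is a minimal sketch, not a platform feature: the helper names and the list of generic terms are assumptions made for illustration, and you would adjust the list for your own project.

```python
# Hypothetical helper: not part of Communications Mining.
# Flags field names that are likely too generic to give the model context.

# Assumed list of generic terms; adjust it for your own project.
GENERIC_NAMES = {"value", "number", "date", "amount", "id", "name"}


def is_ambiguous(field_name: str) -> bool:
    """Return True if the name is empty or is a single generic word."""
    words = field_name.strip().lower().split()
    return len(words) == 0 or (len(words) == 1 and words[0] in GENERIC_NAMES)


def review_field_names(field_names: list[str]) -> list[str]:
    """Return the names that should probably be made more specific."""
    return [name for name in field_names if is_ambiguous(name)]


proposed = ["Value", "Sales Amount", "New Postcode", "Date"]
flagged = review_field_names(proposed)
print(flagged)  # ['Value', 'Date']
```

A check like this only catches the most obvious cases. Whether a name such as Invoice number is specific enough still depends on the label it belongs to.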

General vs. extraction fields

There are two different types of fields that help facilitate end-to-end automation:

  1. General fields
  2. Extraction fields.

It is important to understand the different types of fields available in Communications Mining, and when to use each one.

GENERAL FIELDS

General fields are fields that you may want to extract and that can be found across multiple different topics/labels in a dataset.
  • Formerly known as entities.
  • Generally applicable to messages across a dataset, and not tied to a specific label.
  • Typically useful for triaging, and should be limited to data points used as identifiers (e.g., policy numbers).

EXTRACTION FIELDS

Extraction fields are fields conditioned (and created) on a specific label. In other words, they are tied to a specific label that you want to automate.
  • Created and trained at a message level, and tied to a specific label.
Note: When you set up your extraction schema, you need to decide what process (i.e., label) you want to automate. Your extraction schema should ALWAYS contain each of the fields needed to automatically process the request.​

The following table captures the key distinctions between general fields and extraction fields. Review the differences carefully, because the two field kinds are predicted by two completely different models.

Field type: General Fields
  • Predicted: Automatically across dataset
  • Reviewed at: A paragraph level
  • Spanless* vs. Spanful*: Only spanful
  • Overlap spans: No
  • Share field types between fields of same kind: No (for now)
  • Supported Data Types**:
    • String
    • Date
    • Monetary Quantity
    • RegEx
    • Template

Field type: Extraction Fields
  • Predicted: Only on demand (currently)
  • Reviewed at: A message level (in context of label)
  • Spanless* vs. Spanful*: Both spanful and spanless
  • Overlap spans: Yes
  • Share field types between fields of same kind: Yes
  • Supported Data Types**:
    • String
    • Date
    • Monetary Quantity
    • Number

* For details on spanful and spanless fields, check the Spanful vs. Spanless Fields page of this guide.

** For the data types supported by each field kind, check the Data Types page of this guide.
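
To make the distinctions in the table easier to check when planning a schema, the sketch below encodes them as plain data. It is an illustration only: `FieldTypeSpec` and `validate` are hypothetical names, not platform objects or API calls, and the supported data types are copied from the table above.

```python
from dataclasses import dataclass

# Data types per field kind, copied from the table above.
SUPPORTED_DATA_TYPES = {
    "general": {"String", "Date", "Monetary Quantity", "RegEx", "Template"},
    "extraction": {"String", "Date", "Monetary Quantity", "Number"},
}


@dataclass(frozen=True)
class FieldTypeSpec:
    """Hypothetical planning record for a field type; not a platform object."""
    name: str
    kind: str            # "general" or "extraction"
    data_type: str
    spanless: bool = False


def validate(spec: FieldTypeSpec) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    if spec.kind not in SUPPORTED_DATA_TYPES:
        return [f"unknown field kind: {spec.kind}"]
    problems = []
    if spec.data_type not in SUPPORTED_DATA_TYPES[spec.kind]:
        problems.append(f"{spec.data_type} is not listed for {spec.kind} fields")
    if spec.spanless and spec.kind == "general":
        problems.append("general fields are spanful only")
    return problems


print(validate(FieldTypeSpec("Policy Number", "general", "RegEx")))  # []
print(validate(FieldTypeSpec("Item Count", "general", "Number")))
```

The second call reports that Number is not listed for general fields, which is the kind of mismatch worth catching before you create the field type.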

Extraction Fields Example​

In this example, the platform identifies the extraction fields that are relevant to the end-to-end automation of these two labels.



General Fields Example

In this example, the platform isn’t confident enough that a certain label in the taxonomy applies to this message. The platform can still extract certain fields from the message itself. When you set up general fields, the platform can pick up these fields, irrespective of a label prediction.



Set up your fields via Explore

You can set up or modify both your general fields and extraction fields through the Explore page by following the steps below.

  1. In Explore, on a communication that has the label for which you want to define your extraction field, select Annotate Fields.
  2. If you are setting up an extraction field, hover next to the label name in the Field annotations bar on the right, and select Manage fields. If you are setting up a general field, hover next to General fields and manage your fields there.


  3. Select New extraction field to add a new extraction field. You can add more than one field.​
  4. Fill in the field name(s) and select the field type for the data that you want to extract. You can select an existing field type, or create a new one if the type you need is not configured yet.


  5. Select Save in the bottom right to save the extraction fields.

Set up your fields via Settings

Set up or modify both your general fields and extraction fields through the Settings page by following the steps below.

Note: If you set up your fields in Train, you are redirected to Settings to configure them.

These steps also apply when you configure fields via Train:

  1. Go to Settings, then Taxonomy.
  2. To create an extraction field, go to the Labels and fields tab.
  3. On the specific label that you want to create an extraction field on, select the drop-down menu. This expands the list of all the fields on that label.
  4. To add a new extraction field, select Extraction field at the bottom.​
  5. Fill out the Field name, as well as the Extraction field type to configure your new extraction field.


  6. To create a new general field, go to the General fields tab. Select New field in the top right corner.​
  7. Fill out the Field name, and General field type to configure your new General field(s).​


Setting up field types

When you set up your fields, you have to select the specific data type.

The following table details when to use each type.

Field Types
Data type: String
  • General field: Yes
  • Extraction field: Yes
  • Description: Strings can include any characters (letters, numbers, etc.). Strings can also have input values that are explicitly present (spanful) in the message or inferred (spanless). Check below for more details.
  • Examples:
    • Organization name
    • First name
    • Address line

Data type: Date*
  • General field: Yes
  • Extraction field: Yes
  • Description: Dates come in varying unstructured formats and use UiPath’s® pre-trained date field.
  • Examples:
    • Start dates
    • Expiration dates

Data type: Number
  • General field: Yes
  • Extraction field: Yes
  • Description: Quantities come in varying unstructured formats and use UiPath’s® pre-trained quantity field to interpret numbers.
  • Examples:
    • Number of items
    • Change in %

Data type: Monetary Quantity*
  • General field: Yes
  • Extraction field: Yes
  • Description: Similarly, monetary quantities typically come in varying unstructured formats and use UiPath’s® pre-trained monetary quantity model.
  • Examples:
    • Total premium value
    • Fees due

Data type: Regex
  • General field: Yes
  • Extraction field: No
  • Description: If a specific field always needs to be extracted in a specific format, the rules can be configured with RegEx. For more details, check the official UiPath® documentation.
  • Examples:
    • A policy number that must always start with 3 letters, and end in 6 numbers

Data type: Template
  • General field: Yes
  • Extraction field: No
  • Description: Check the official UiPath® documentation for a list of supported templates.
  • Examples:
    • SEDOL
    • BIC
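
As an illustration of the policy number example above, the sketch below tests a candidate pattern in Python before you configure a RegEx field type. It assumes the rule means exactly 3 letters followed by exactly 6 digits, and it uses Python's `re` syntax, which may differ from the syntax the platform accepts. The sample message is invented.

```python
import re

# Assumed reading of the rule: exactly 3 letters followed by exactly 6 digits.
# \b keeps the pattern from matching inside a longer token.
POLICY_NUMBER = re.compile(r"\b[A-Za-z]{3}\d{6}\b")

# Invented sample text for illustration.
message = "Please update policy ABC123456 and ignore reference AB1234567."

matches = POLICY_NUMBER.findall(message)
print(matches)  # ['ABC123456']
```

Trying the pattern against a few real messages, including near misses such as the second reference here, helps confirm the rule is neither too strict nor too loose.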

Note:

Many fields may need to be normalized into a structured data format for downstream processes. ​

Within the platform, monetary quantities and dates are general field types that are automatically normalized. For more details, check the official UiPath® documentation on field normalization.
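
To show what normalizing into a structured format means in practice, here is a minimal sketch. It is not the platform's normalization logic, which is pre-trained and handles far more variation. The function names, the accepted date formats (day-first), and the "CODE amount" money format are all assumptions chosen for the example.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumed input formats for illustration only (day-first where ambiguous).
DATE_FORMATS = ("%d/%m/%Y", "%d.%m.%Y", "%Y-%m-%d")


def normalize_date(text: str) -> Optional[date]:
    """Try each assumed format in turn; return None if none of them fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def normalize_money(text: str) -> Optional[tuple[Decimal, str]]:
    """Parse strings such as 'GBP 1,250.50' into (amount, currency code)."""
    parts = text.strip().split()
    if len(parts) != 2:
        return None
    currency, amount = parts
    try:
        return Decimal(amount.replace(",", "")), currency.upper()
    except InvalidOperation:
        return None


print(normalize_date("07/11/2024"))     # 2024-11-07
print(normalize_money("GBP 1,250.50"))  # (Decimal('1250.50'), 'GBP')
```

The point of the structured result is that downstream systems receive one predictable representation, regardless of how the value was written in the message.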

What is a spanful field?

A spanful field is a data point that is explicitly stated in the text (e.g., a Trade ID, Policy Number).

What is a spanless field?

A spanless field is a data point that might not be explicitly stated in the text but needs to be extracted from the message (i.e., can be inferred from the message). In other words, the span of text you want to extract might not necessarily be present in the message.

When setting up general fields, specify whether the input value must be present in the message (i.e., it needs to be extracted exactly as-is from the text) or whether it can be inferred from the message.

Some examples of fields that may need to be spanless:

  • Values that need to be normalized (e.g., a date).​
  • Values that need to be concatenated across different areas in an email​.
  • Values that are not present anywhere in an email, but are implied through the nature of the email ​
  • Values that span across multiple paragraphs, lines, or columns (i.e., do not appear in a continuous span).
Note: Spanless fields are only available when the data type is configured as a string on an extraction field.
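
The distinction can be sketched as a simple check: a value is spanful if it appears verbatim in the message, so a start and end offset can be given for it. The helper `find_span` and the sample message below are hypothetical and only illustrate the idea. They do not reflect how the platform locates or predicts fields.

```python
from typing import Optional


def find_span(message: str, value: str) -> Optional[tuple[int, int]]:
    """Return (start, end) character offsets if value appears verbatim, else None."""
    start = message.find(value)
    if start == -1:
        return None
    return start, start + len(value)


# Invented sample message for illustration.
message = "Hi, please cancel trade TR-88213. Settlement is next Friday."

trade_span = find_span(message, "TR-88213")
date_span = find_span(message, "2024-11-15")

# The trade ID is stated explicitly, so a span exists (spanful).
print(trade_span is not None)  # True
# The settlement date is only implied by "next Friday", so no span exists;
# capturing it as a value would need a spanless field.
print(date_span)  # None
```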


Creating a new field type​

A field type is the initial state of your new field. If you do not have a field type to use, follow these steps to set up a new field type. You can set up the new field type from the drop-down when creating a field, but also on the field type page itself if needed​.

Start with the broadest field type possible, then fine-tune it to be more specific.

  1. A - Give your field type a name.​
    Note: The field type name is NOT used by the model for context the same way that field names are.
  2. B - Define whether you are setting up a new field type for an extraction field, or a general field. ​
  3. C - When setting up your general fields or extraction fields, you have to select the specific data type for the field type.
    Note: Depending on whether you set up a new field type for an extraction field or for a general field, the data types that you can configure may vary. Additional configurations are also applicable, depending on the data type that you select.


Note: The steps below contain a list (with details) of all the pre-defined field types available to you in the platform, and when you should use each one.

Creating a new field type via Settings​

Note: Creating a new field type can also be done on the Field type page if needed. Doing it from Field pages pre-selects what it is defined for, and immediately assigns it to that field.​

You can set up a new field type either through the Explore page, or the Settings page, via the Train tab.

Once the data type has been configured on a field type, you cannot change it, so select the correct data type when creating a field type. If you select the wrong data type, you have to delete the field type and re-create it with the correct data type.

You can set up a new field type for both Extraction fields and General fields through the Settings page.​

To set up a new field type in the Settings page, follow the steps below.

(1) Settings > (2) Taxonomy > (3) Field Types > (4) New Field type > (5) Set up your field type.



Creating a new field type via Explore

Note: Creating a new field type via Explore uses the same mechanism as in Train.

To set up your field types via the Explore page, follow the steps below.

Note: You must set up the field type that corresponds to a general field or extraction field in its respective section, in the Field annotations pane.

(1) Explore > (2) Annotate Fields > (3) Select the three dots next to either the general field or extraction field section (you can only create a new field type in its respective section) > (4) Manage fields > (5) Select the field type drop-down, then New field type, and set up your field type.


