Communications Mining User Guide
Generating your extractions
- The extraction validation process is required so that you can understand how these extractions perform, as reported in Validation.
Decide on the extraction that you want to train out. We use Report > Statement of Accounts as an example of a schema we want to train out.
To automate this process, extract the following data points to input into a downstream system:
Note: This is only applicable if you are training in Explore. In Train, clicking into an extraction training batch pre-loads the extractions. Use this training mode as required to increase the number of training examples for each extraction (that is, a set of fields assigned to a label) to at least 25, so that the model can accurately estimate the performance of the extraction.
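The 25-example threshold in the note above can be tracked with a small tally outside the platform. The following is a minimal sketch, assuming you keep your own count of validated examples per extraction; the function name and the second label are hypothetical and are not part of Communications Mining.

```python
MIN_EXAMPLES = 25  # threshold stated in the note above


def extractions_needing_examples(counts, minimum=MIN_EXAMPLES):
    """Return {extraction name: examples still needed} for extractions below the minimum."""
    return {
        name: minimum - count
        for name, count in counts.items()
        if count < minimum
    }


# Hypothetical tally of validated training examples per extraction.
counts = {
    "Report > Statement of Accounts": 18,
    "Report > Invoice (hypothetical label)": 25,
}

for name, missing in extractions_needing_examples(counts).items():
    print(f"{name}: {missing} more example(s) needed")
```

Only extractions that are still short of the threshold are returned, so an empty result means every tracked extraction has at least 25 examples.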
- Go to Explore, then Label, and select the label that you want to generate extractions on.
- Select Predict extractions. This generates extractions on a per-page basis in Explore (that is, it applies predictions to all the comments on a given page).
Note: Each time you go to the next page, you need to select Predict extractions again.
You can also generate extractions at the individual comment level. Select Annotate Fields, then select the Predict extractions icon.
- The model uses generative models to map each of the data points that you previously defined in the extraction schema to an intent (label).
- It extracts and returns them in a structured schema, for an SME to go through and confirm.
- The structured schema is intended to enable more complex automations, and is structured in JSON format in the API for consumption by any downstream automations.
- After making the extraction predictions, if the model picked up field extractions on the comment, it highlights the relevant span in the text (if applicable) and displays the extracted value on the right-hand side. Check the Validating and annotating extractions page to learn how to validate the predicted values.
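Because the structured schema is exposed as JSON through the API, a downstream automation typically flattens it into records before passing it on. The following is a minimal sketch of that step. The payload shape, the field names (`customer_id`, `statement_date`), and the `flatten_extractions` helper are all assumptions made for illustration; they do not reproduce the actual API response format, which you should take from the API reference.

```python
import json

# Hypothetical payload: the keys and nesting below are illustrative only,
# not the actual Communications Mining API response format.
payload = """
{
  "label": "Report > Statement of Accounts",
  "extractions": [
    {
      "fields": [
        {"name": "customer_id", "value": "C-1001", "span": {"start": 12, "end": 18}},
        {"name": "statement_date", "value": "2024-01-31", "span": null}
      ]
    }
  ]
}
"""


def flatten_extractions(raw):
    """Turn one JSON document into a list of flat records, one per extraction."""
    doc = json.loads(raw)
    rows = []
    for extraction in doc.get("extractions", []):
        row = {"label": doc["label"]}
        for field in extraction.get("fields", []):
            # A field may have no span when the value is not tied to a text range.
            row[field["name"]] = field["value"]
        rows.append(row)
    return rows


for record in flatten_extractions(payload):
    print(record)
```

Each record carries the label together with its extracted field values, which is the form most downstream systems expect as input.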