Communications Mining User Guide
Labels (predictions, confidence levels, hierarchy, etc.)
A label is a structured summary of an intent or concept expressed within a message. A single message is often summarised by several labels, so a label is not a mutually exclusive classification of the message.
As an example, in a dataset monitoring the customer experience, we might create a label called ‘Incorrect Invoice Notification’, which applies when a customer informs the business that they have received an invoice they believe is incorrect.
Pinned vs. Predicted
A user creates a label by applying it to a relevant message. Users then continue applying it to build up training examples for the model, and the platform starts to predict the label automatically across the dataset wherever it is relevant.
A label that a user has applied to a message is considered 'pinned', whereas labels that the platform assigns to messages are known as label predictions. For more detail, see the documentation on reviewed and unreviewed messages.
Confidence levels
When the platform predicts whether a label applies to a message that has not been reviewed by a user, it provides a confidence level (%) for that label prediction. The higher the confidence level, the more confident the platform is that the label applies.
Predicted labels are shaded according to the platform's confidence in them: the more opaque the label, the higher the platform's confidence that it applies.
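As an illustration of how confidence levels might be used downstream, the sketch below keeps only the predicted labels whose confidence meets a chosen threshold. The `LabelPrediction` class, the `labels_above` function, the 'Request for Information' label and all of the confidence values are hypothetical and are not part of the platform's API; confidence is assumed to be expressed as a fraction between 0 and 1 rather than a percentage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelPrediction:
    """Hypothetical record for one predicted label on an unreviewed message."""
    label: str
    confidence: float  # assumed to be a fraction between 0.0 and 1.0


def labels_above(predictions, threshold):
    """Return the names of predictions whose confidence is at least `threshold`."""
    return [p.label for p in predictions if p.confidence >= threshold]


# Made-up example predictions for a single message.
predictions = [
    LabelPrediction("Incorrect Invoice Notification", 0.92),
    LabelPrediction("Request for Information", 0.35),
]

print(labels_above(predictions, 0.5))
```

Raising the threshold keeps fewer, more certain predictions, while lowering it keeps more predictions at the cost of including less certain ones.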
Label hierarchy
Labels can be organised in a hierarchical structure to help you organise and train new concepts more quickly.
This hierarchy takes a format like this: [Parent label] > [Branch label 1] > [Branch label n] > [Child label]
A label can be a standalone parent label, or have branch and child labels (separated by '>') that form subsets of the previous labels in the hierarchy.
Any time a child label or branch label is pinned or predicted, the model considers the previous levels in the hierarchy to have been pinned or predicted too. Predictions for parent labels will typically have higher confidence levels than the lower levels of the hierarchy, as they're often easier to identify.
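The rule that a lower-level label implies every level above it can be sketched as follows. This is a minimal illustration, assuming label names are plain strings with levels separated by '>'; the `implied_labels` function is a hypothetical helper, not a platform function.

```python
def implied_labels(label, separator=">"):
    """Expand a hierarchical label name into itself plus every level above it.

    For example, 'A > B > C' implies 'A', 'A > B' and 'A > B > C'.
    """
    # Split on the separator and trim the whitespace around each level.
    levels = [level.strip() for level in label.split(separator)]
    # Rebuild each prefix of the hierarchy, from the parent down to the full label.
    return [" > ".join(levels[: depth + 1]) for depth in range(len(levels))]


print(implied_labels("Property > Location"))
```

A standalone parent label such as 'Property' expands only to itself, since there are no higher levels to imply.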
For more detail on label hierarchies, see the documentation on label hierarchy and best practice.
Label sentiment
For datasets with sentiment analysis enabled, every label (both pinned and predicted) has an associated positive or negative sentiment, indicated by a green or red colour respectively.
Different levels of a label hierarchy can have different sentiment predictions. For example, a review could be positive overall about a 'Property' but negative about 'Property > Location'.