Communications Mining User Guide
Generative Annotation (NEW)
Generative Annotation uses Microsoft’s Azure OpenAI endpoint to generate AI-suggested labels. This accelerates taxonomy design and the early phases of model training, and reduces time-to-value for all Communications Mining use cases.
It includes:
- Cluster Suggestions: Suggested new or existing labels for clusters based on their identified theme(s).
- Assisted Annotating: Automatic predictions for labels based on the label names or descriptions.
Generative Annotation features will be automatically enabled on datasets – you don't need to do anything to start using them.
Once a dataset is created, cluster suggestions will automatically be generated within a short period of time. If a taxonomy has been uploaded (highly recommended), Communications Mining will suggest both existing and new labels for clusters.
Uploading a taxonomy to a dataset also automatically triggers training of an initial model that uses no training data, only the label names and descriptions. This may take a few minutes from when you upload the taxonomy.
- For Cluster Suggestions: go to the Train tab and select a clusters batch or go to the Discover tab and select the Cluster mode to start annotating.
- For Assisted Annotating: go to the Train tab and follow the recommended actions, or go to the Explore tab and select Shuffle or Teach Label mode to start annotating.
Prerequisite: ‘Review and Annotate’ permission.
Cluster Suggestions appear at the top of each Cluster page (white shading with blue border). Each cluster can have one or more suggested labels.
If you have Label sentiment analysis enabled, Cluster Suggestions will have either positive or negative sentiment (white shading with green or red border).
You can tell it’s an AI-suggested label by the red sparkle icon next to the label name.
Model trainers should review each Cluster Suggestion and:
- Accept it by clicking on it, or
- Assign a new label if you don’t agree with the given suggestion.
Cluster Suggestions can significantly speed up the first phase of the model training process by automatically generating suggested labels for each cluster.
They can also help with taxonomy design if users are struggling to define the concepts they want to train.
Cluster Suggestions are generated based on the identified theme shared across the messages within a cluster.
Clusters are created and label suggestions are generated automatically, in a completely unsupervised process that requires no human input.
Label suggestions on clusters are generated with or without a pre-defined taxonomy, but imported or existing labels influence the suggestions and typically make them more helpful.
Prerequisite 1: ‘Review and Annotate’ permission.
Prerequisite 2: Imported list of label names.
Optional but highly recommended: Imported list of label descriptions.
Once the initial model has automatically trained using label names and descriptions as its training input, predictions will appear for many of the messages in the dataset.
These predictions work exactly as they did previously; they are simply generated with no training data.
If you have Label sentiment analysis enabled, initial predictions will have either positive or negative sentiment (different shades of green / red based on their confidence level).
Assisted Annotating works in any training batch or mode, but it is most effective in ‘Shuffle’ and ‘Teach Label’ modes (follow the regular annotating steps in each training batch in Train or Explore).
Assisted Annotating can significantly speed up the second phase of the model training process by automatically generating predictions for each label with sufficient context, with no training examples required.
Initial predictions will be driven by the quality of the label names and natural language descriptions (i.e. vague names might lead to vague or minimal predictions). Detailed label descriptions can boost the initial model’s performance.
As you train your dataset further, the platform will use both the label names and descriptions and your pinned examples to generate relevant label predictions.
These will keep improving with more training and ultimately rely only on annotated training examples when enough have been provided.
Assisted Annotating still requires supervised learning, in which you accept or reject the predictions, but it accelerates the most time-consuming part of model training by providing better predictions with zero or very few pinned examples.
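Because initial predictions depend on the quality of label names and descriptions, it can help to sanity-check a taxonomy list before importing it. The following is a minimal sketch of such a check, not part of Communications Mining and not its import format: the `Label` class, the `review_taxonomy` function, the five-word threshold and the issue messages are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical representation of one taxonomy entry; this is NOT the
# platform's import format, just a convenient structure for the check.
@dataclass
class Label:
    name: str
    description: str = ""

# Assumed threshold: descriptions shorter than this are flagged for review.
MIN_DESCRIPTION_WORDS = 5


def review_taxonomy(labels):
    """Return a list of (label name, issue) tuples worth reviewing before import."""
    issues = []
    seen = set()
    for label in labels:
        name = label.name.strip()
        if not name:
            issues.append((label.name, "empty label name"))
            continue
        # Compare names case-insensitively to catch accidental duplicates.
        key = name.lower()
        if key in seen:
            issues.append((name, "duplicate label name"))
        seen.add(key)
        words = label.description.split()
        if not words:
            issues.append((name, "missing description"))
        elif len(words) < MIN_DESCRIPTION_WORDS:
            issues.append((name, "description may be too short to be useful"))
    return issues
```

A label such as `Label("Other")` would be flagged for a missing description, which is a prompt to write a fuller natural language description before uploading. The threshold is arbitrary; adjust it to suit your own taxonomy.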