Communications Mining User Guide
Model training and annotating best practice
Before you begin training your model, read the following tips so that you can avoid the common pitfalls. Following them helps keep training time shorter and improves the performance of your model.
The three most important things to remember whenever you are training a Communications Mining model are:
Add all labels that apply: Add every label that applies to a message. A common pitfall for new users is to partially annotate a message, applying only the label they are focusing on and forgetting the others that apply. Not applying a label is as powerful as applying one: you are telling the model what the message isn't as well as what it is. Partial annotation can therefore confuse the model later, potentially leading to poorer performance.
Apply labels consistently: Be consistent when adding labels. For example, if you add the label ‘Room > Size’ to one message but forget to add it to another where it also applies, you will confuse the model. As with the previous tip, not applying a label is as powerful as applying one.
Annotate what you see in front of you: Don’t make assumptions based on your business knowledge. If nothing in the subject or body of the message indicates that a label should apply, don't apply it; otherwise the model won't be able to learn why it applies.
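The first two tips can be illustrated with a minimal sketch. It assumes a generic multi-label setup in which every reviewed message yields one yes/no value per label in the taxonomy; this is a common way to frame the problem, not a description of how Communications Mining works internally. The function name and the ‘Staff > Friendliness’ label are hypothetical.

```python
# Illustrative only: a generic multi-label encoding, not the platform's internals.
# 'Staff > Friendliness' is a made-up label added to fill out the example taxonomy.
TAXONOMY = ["Room > Size", "Room > Cleanliness", "Staff > Friendliness"]


def to_training_target(applied_labels, taxonomy=TAXONOMY):
    """Return one 0/1 value per taxonomy label for a reviewed message."""
    applied = set(applied_labels)
    unknown = applied - set(taxonomy)
    if unknown:
        raise ValueError(f"Labels not in taxonomy: {sorted(unknown)}")
    # Every label that was NOT applied becomes an explicit 0 (a negative example).
    return [1 if label in applied else 0 for label in taxonomy]


# A message that mentions both the size and the cleanliness of the room.
fully_annotated = to_training_target(["Room > Size", "Room > Cleanliness"])

# The same message, annotated only with the label the user was focusing on.
partially_annotated = to_training_target(["Room > Cleanliness"])

print(fully_annotated)      # [1, 1, 0]
print(partially_annotated)  # [0, 1, 0]
```

Under this assumption, the partially annotated message asserts that the text is not about ‘Room > Size’, even though it is. That is the sense in which leaving a label off is as powerful as applying it.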
Additional tips:
Don't spend ages deciding label names: You can rename a label at any point during the training process, so don’t spend too long thinking about the correct name.
Be specific when naming a label: Be as specific as possible when naming a label, and keep the taxonomy as flat as possible initially. You can always change and restructure the hierarchy later.
For example, to describe the cleanliness of a room you could apply the label ‘Room cleanliness’. If you later decide that cleanliness should be a sub-label, you can rename it to ‘Room > Cleanliness’. At this stage, add as many labels as apply to a message, as you can always go back and merge them later.
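The rename described above can be sketched in a few lines. This is only an illustration of why renaming is cheap: it assumes that hierarchy levels in a label name are separated by ‘>’, as in the ‘Room > Cleanliness’ example, and the helper functions and sample annotations are hypothetical. In the product itself, renaming is done in the platform rather than in code.

```python
def split_label(name):
    """Split a label name into its hierarchy levels, assuming '>' separates levels."""
    return [part.strip() for part in name.split(">")]


def rename_label(annotations, old, new):
    """Return a copy of the annotations with every occurrence of `old` replaced by `new`."""
    return [[new if label == old else label for label in labels] for labels in annotations]


# Hypothetical annotations: one list of labels per reviewed message.
annotations = [
    ["Room cleanliness", "Room > Size"],
    ["Room cleanliness"],
]

# Start flat, then move the label under 'Room' once the hierarchy is clearer.
renamed = rename_label(annotations, "Room cleanliness", "Room > Cleanliness")

print(split_label("Room cleanliness"))    # ['Room cleanliness']
print(split_label("Room > Cleanliness"))  # ['Room', 'Cleanliness']
print(renamed)
```

Because the rename touches every message that carries the old label, none of the earlier annotation work is lost when the flat label becomes a sub-label.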