- API docs
- CLI
- Integration guides
- Blog
- How machines learn to understand words: a guide to embeddings in NLP
- Prompt-based learning with Transformers
- Efficient Transformers II: knowledge distillation & fine-tuning
- Efficient Transformers I: attention mechanisms
- Deep hierarchical unsupervised intent modelling: getting value without training data
- Fixing annotating bias with Communications Mining
- Active learning: better ML models in less time
- It's all in the numbers - assessing model performance with metrics
- Why model validation is important
- Comparing Communications Mining and Google AutoML for conversational data intelligence
Communications Mining Developer Guide
Get results from stream
Permissions required: Consume streams, View labels, View sources.
/results
route is the new way of fetching comments and their
predictions from a stream, replacing the existing /fetch
route
(Streams - legacy). We maintain the /fetch route for legacy support, but we
recommend that all the new use cases use the /results
route, as it
supports all the possible use cases, including those using generative
extraction.
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, params={"max_results": 5, "max_filtered": 15}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, params={"max_results": 5, "max_filtered": 15}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "status": "ok", "results": [ { "comment": { "uid": "18ba5ce699f8da1f.0123456789abcdef", "id": "0123456789abcdef", "timestamp": "2018-09-17T09:54:56.332000Z", "user_properties": { "number:Messages": 1, "string:Folder": "Sent (/ Sent)", "string:Has Signature": "Yes", "string:Message ID": "<abcdef@abc.company.com>", "string:Sender": "alice@company.com", "string:Sender Domain": "company.com", "string:Thread": "<abcdef@abc.company.com>" }, "messages": [ { "from": "alice@company.com", "to": [ "bob@organisation.org" ], "sent_at": "2018-09-17T09:54:56.332000Z", "body": { "text": "Hi Bob,\n\nCould you send me today's figures?" }, "subject": { "text": "Today's figures" }, "signature": { "text": "Thanks,\nAlice" } } ], "text_format": "plain", "attachments": [], "source_id": "18ba5ce699f8da1f", "last_modified": "2024-07-03T13:30:53.991000Z", "created_at": "2020-12-14T15:07:03.718000Z", "context": "1", "has_annotations": true }, "prediction": { "taxonomies": [ { "name": "default", "labels": [ { "name": "Margin Call", "occurrence_confidence": { "value": 0.9905891418457031, "thresholds": ["stream"] }, "extraction_confidence": { "value": 0.4712367373372217, "thresholds": [] }, "fields": [ { "name": "Notification Date", "value": null } ] }, { "name": "Margin Call > Interest Accrual", "occurrence_confidence": { "value": 0.9905891418457031, "thresholds": [] }, "extraction_confidence": { "value": 0.9905891418457031, "thresholds": [] }, "fields": [ { "name": "Amount", "value": { "formatted": "636,000.00" } }, { "name": "Broker number", "value": null }, { "name": "Client name", "value": null }, { "name": "Currency", "value": { "formatted": "AUD" } } ] } ], "general_fields": [ { "name": "monetary-quantity", "value": { "formatted": "636,000.00 GBP" } }, { "name": "MarginCallDateType", "value": { "formatted": "2018-09-21 00:00 UTC" } }, { "name": "client-name", "value": { "formatted": "Big Client Example Bank" } } ] } ] }, "continuation": "pmjKYXYBAAADqHUvPkQf1ypNCZFR37vu" } ], "num_filtered": 0, "more_results": true, "continuation": "pmjKYXYBAAAsXghZ2niXPNP6tOIJtL_8" }
{ "status": "ok", "results": [ { "comment": { "uid": "18ba5ce699f8da1f.0123456789abcdef", "id": "0123456789abcdef", "timestamp": "2018-09-17T09:54:56.332000Z", "user_properties": { "number:Messages": 1, "string:Folder": "Sent (/ Sent)", "string:Has Signature": "Yes", "string:Message ID": "<abcdef@abc.company.com>", "string:Sender": "alice@company.com", "string:Sender Domain": "company.com", "string:Thread": "<abcdef@abc.company.com>" }, "messages": [ { "from": "alice@company.com", "to": [ "bob@organisation.org" ], "sent_at": "2018-09-17T09:54:56.332000Z", "body": { "text": "Hi Bob,\n\nCould you send me today's figures?" }, "subject": { "text": "Today's figures" }, "signature": { "text": "Thanks,\nAlice" } } ], "text_format": "plain", "attachments": [], "source_id": "18ba5ce699f8da1f", "last_modified": "2024-07-03T13:30:53.991000Z", "created_at": "2020-12-14T15:07:03.718000Z", "context": "1", "has_annotations": true }, "prediction": { "taxonomies": [ { "name": "default", "labels": [ { "name": "Margin Call", "occurrence_confidence": { "value": 0.9905891418457031, "thresholds": ["stream"] }, "extraction_confidence": { "value": 0.4712367373372217, "thresholds": [] }, "fields": [ { "name": "Notification Date", "value": null } ] }, { "name": "Margin Call > Interest Accrual", "occurrence_confidence": { "value": 0.9905891418457031, "thresholds": [] }, "extraction_confidence": { "value": 0.9905891418457031, "thresholds": [] }, "fields": [ { "name": "Amount", "value": { "formatted": "636,000.00" } }, { "name": "Broker number", "value": null }, { "name": "Client name", "value": null }, { "name": "Currency", "value": { "formatted": "AUD" } } ] } ], "general_fields": [ { "name": "monetary-quantity", "value": { "formatted": "636,000.00 GBP" } }, { "name": "MarginCallDateType", "value": { "formatted": "2018-09-21 00:00 UTC" } }, { "name": "client-name", "value": { "formatted": "Big Client Example Bank" } } ] } ] }, "continuation": "pmjKYXYBAAADqHUvPkQf1ypNCZFR37vu" } ], "num_filtered": 0, "more_results": true, "continuation": "pmjKYXYBAAAsXghZ2niXPNP6tOIJtL_8" }
Once you create a stream, you can query it to fetch comments and their predictions. This includes labels, general fields, and label extractions, containing a set of extraction fields for each instance of that label occurring.
When you create a stream, you set its initial position to be equal to its creation time. If needed, you can set the stream to a different position (either forwards or backwards in time), using the reset endpoint. The stream returns comments starting from its current position. You determine the position of the comment in the comment queue by the order in which you uploaded the comments.
Depending on your application design, you can choose between:
- advancing the stream once, for
the whole batch. Use the batch's
continuation
contained in the response. - advancing the stream for each
individual comment. Use the comment's
continuation
, contained in the response.
comment_filter
when creating the stream, the
results don't include comments not matching the filter, but still count towards the
requested max_filtered
. You can see responses where all of
max_filtered
comments are filtered out, leading to an empty
results
array. In the example below, you request a batch of 8
comments, all of which are filtered out.
{
"filtered": 8,
"results": [],
"sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
"status": "ok"
}
{
"filtered": 8,
"results": [],
"sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
"status": "ok"
}
max_filtered
parameter, to prevent filtered
comments from counting towards the requested max_results
.
/fetch
route does not return comments with predictions that did not meet the confidence
threshold.
/results
route, you return all the
predictions for a comment, and the
confidence
value
as well. You also indicate
which type(s) of threshold it meets.
"occurrence_confidence": {
"value": 0.9905891418457031,
"thresholds": ["stream"]
}
"occurrence_confidence": {
"value": 0.9905891418457031,
"thresholds": ["stream"]
}
confidence
for a prediction 0.9905..
and the
thresholds
value indicates that the prediction meets the
configured threshold for the stream
.
stream
value, to confirm
that the prediction meets the threshold you configured in the stream.
For more information about generated extractions, and how to work with thresholds, check the Understanding validation on extractions and extraction performance page.
NAME | TYPE | REQUIRED | DESCRIPTION |
---|---|---|---|
max_results | number | no | The number of comments to fetch for this stream. Returns fewer comments if it reaches the end of the batch, or if you filter out comments according to the comment filter.. Max value is 32. Default is 16. |
max_filtered | number | no | Convenience parameter for streams with a comment filter. When
you provide them, up to max_filtered filtered
comments do not count towards the requested
max_results . This is useful if you expect a
large number of comments to not match the filter. Has no effect
on streams without a comment filter. Max value is 1024. Default
is null.
|
NAME | TYPE | DESCRIPTION |
---|---|---|
status | string | ok if the request is successful, or
error , in case of an error. To learn more
about error responses, check the Overview page.
|
num_filtered | number | Number of comments that were filtered out based on a comment
filter. If you created the stream without a filter, this number
is always 0 .
|
continuation | string | The batch continuation token. Use it to acknowledge the processing of this batch, and advance the stream to the next batch. |
more_results | bool | True if there were no additional results in the stream, when you made the request. False otherwise. |
results | array<Result> | An array containing result objects. |
Result
has the following format:
NAME | TYPE | DESCRIPTION |
---|---|---|
comment | Comment | Comment data. For a detailed explanation, see the Comment Reference. |
continuation | string | The comment's continuation token. Used to acknowledge processing of this comment and advance stream to the next comment. |
prediction | array<Prediction> | The prediction for this comment. Is available only if the stream specifies a model version. For more information about generative predictions, check the: Communications Mining - Understanding validation on extractions and extraction performance page. |
Prediction
has the following format:
NAME | TYPE | DESCRIPTION |
---|---|---|
taxonomies | array<TaxonomyPrediction> | List of taxonomy predictions. You currently define only one taxonomy per dataset, but you provide it as a list, for future compatibility. |
TaxonomyPrediction
has the following format:
NAME | TYPE | DESCRIPTION |
---|---|---|
name | string | Name of the taxonomy. The only value is currently
default .
|
labels | array<LabelPrediction> | A list of extracted label predictions with their
occurrence_confidence ,
extraction_confidence and extracted
fields . For more information about
generative predictions, check the Communications Mining -
Understanding validation on extractions and extraction
performance page.
|
general_fields | array<FieldPrediction> | A list of extracted general field predictions with their
name and extracted value .
For more information about generative predictions, check the
Communications Mining -
Understanding validation on extractions and extraction
performance page.
|