About pipelines
A Pipeline is a description of a machine learning workflow, including all of the functions in the workflow and the order in which they are executed. The pipeline definition also specifies the inputs required to run it and the outputs it produces.
A Pipeline Run is an execution of a pipeline based on code provided by the user. Once completed, a pipeline run has associated outputs and logs.
There are three types of pipelines:
- Training Pipeline - takes as input a package and a dataset, and produces a new package version.
- Evaluation Pipeline - takes as input a package version and a dataset, and produces a set of metrics and logs.
- Full Pipeline - runs a processing function, a training pipeline, and, immediately after, an evaluation pipeline (see the sketch after the tip below).
Tip: The examples used to explain these concepts are based on a sample package, tutorialpackage.zip, that you can download by clicking the button below. We recommend uploading this sample package if this is your first time working with pipelines. Make sure you enable it for training.
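The sketch below illustrates the kind of interface a training-enabled package exposes so that training, evaluation, and full pipelines have something to call into: a processing step, a training step, an evaluation step that returns a score, and a save step that produces the artifacts for the new package version. The Main class and the process_data, train, evaluate, and save method names follow the convention used by the sample tutorialpackage; treat the exact names and signatures as assumptions and verify them against the sample package itself.

```python
# train.py - minimal sketch of a training-enabled ML package.
# The class and method names below are assumptions based on the sample
# tutorialpackage convention; verify them against the downloaded sample.


class Main:
    def __init__(self):
        # Load or initialize the model that the pipeline will (re)train.
        self.model = None

    def process_data(self, input_directory):
        # Full pipeline only: clean/split the raw dataset found in
        # input_directory before training starts.
        pass

    def train(self, training_directory):
        # Training and full pipelines: fit the model on the data in
        # training_directory.
        pass

    def evaluate(self, evaluation_directory):
        # Evaluation and full pipelines: score the model on the data in
        # evaluation_directory and return a numeric score.
        return 0.0

    def save(self):
        # Persist the trained model artifacts so that a new package
        # version can be built from them.
        pass
```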
The Pipelines page, accessible from the Pipelines menu after selecting a project, enables you to view all the pipelines within that project, along with information about their type, associated package and package version, status, creation time, duration, and score. Here you can create new pipelines, access existing pipelines' details, or remove pipelines.
A pipeline run can be in one of the following statuses (see the sketch after this list):
- Scheduled – A pipeline that has been scheduled to start in the future (for example, at 1 AM every Monday). When the date and time set for the pipeline to start running are reached, the pipeline is queued to run.
- Packaging – A pipeline that has started building the Docker image on which the job itself will be executed. If this is the first time you are training this specific version of the ML Package, this can take up to 20 minutes.
- Running – A pipeline that has started and is executing.
- Failed – A pipeline that failed during execution.
Note: Pipelines can fail if the dataset size exceeds the 50 GB limit.
- Killed – A pipeline whose execution was explicitly terminated by the user.
- Successful – A pipeline that completed execution.
Note: Pipelines are automatically killed after seven days to prevent them from being stuck for long periods and consuming licenses.
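If you monitor pipeline runs from your own scripts, it helps to distinguish the in-progress statuses (Scheduled, Packaging, Running) from the terminal ones (Failed, Killed, Successful). The sketch below only models the status values listed above; the PipelineStatus enum and the is_finished helper are illustrative names and not part of any AI Center API.

```python
from enum import Enum


class PipelineStatus(Enum):
    """Statuses a pipeline run can report, mirroring the list above."""
    SCHEDULED = "Scheduled"
    PACKAGING = "Packaging"
    RUNNING = "Running"
    FAILED = "Failed"
    KILLED = "Killed"
    SUCCESSFUL = "Successful"


# Failed, Killed, and Successful are terminal: the run will not change again.
TERMINAL_STATUSES = {
    PipelineStatus.FAILED,
    PipelineStatus.KILLED,
    PipelineStatus.SUCCESSFUL,
}


def is_finished(status: PipelineStatus) -> bool:
    """Return True once a pipeline run has reached a terminal status."""
    return status in TERMINAL_STATUSES


# A Running pipeline is still in progress; a Successful one is done.
assert not is_finished(PipelineStatus.RUNNING)
assert is_finished(PipelineStatus.SUCCESSFUL)
```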