AI Center

Last updated Jan 9, 2025

Object Detection

OS Packages > Image Analysis > ObjectDetection

This is a generic, retrainable deep learning model for object detection. The ML Package is pretrained on the COCO dataset, so you can directly create an ML Skill that identifies the 80 classes of the COCO dataset.

You can also train the package on your own data and create an ML Skill that performs object detection on the classes in your dataset.

This deep learning model uses You Only Look Once (YOLO), a state-of-the-art object detection algorithm that is among the most effective available and incorporates many of the most innovative ideas from the field of computer vision.

Important: This model is currently not supported on GPU, for either pipelines or ML Skills.

Model details

Input type

FILE

Input description

Full path of the image file in which you want to detect objects.

Output description

JSON containing a byte array representation of the image with boxes drawn around the identified objects, plus the class name and score (between 0 and 1) of each identified object.

Example:

{
  "Predicted ByteArray":
    "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAIBAQEBAQIBAQECAgICAgQDAgI…TD",
  "Predicted Class":
     "[{'class': 'book', 'score': ' 0.31'}, {'class': 'dog', 'score': ' 0.53'}, {'class': 'chair', 'score': ' 0.79'}]"
}
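The output above can be unpacked in a few lines. The following is a minimal sketch, not an official client: it assumes that "Predicted ByteArray" is a Base64-encoded image and that "Predicted Class" is a string holding a Python-style list, as the example suggests. The function name parse_detection_response and the sample payload are hypothetical.

```python
import ast
import base64
import json


def parse_detection_response(raw_json: str):
    """Return (image_bytes, detections) from the ML Skill's JSON output."""
    response = json.loads(raw_json)

    # "Predicted Class" is a string holding a Python-style list of dicts
    # (single quotes), so json.loads cannot read it; ast.literal_eval can.
    raw_detections = ast.literal_eval(response["Predicted Class"])

    # Scores arrive as strings with a leading space, e.g. ' 0.79'.
    detections = [
        {"class": d["class"], "score": float(d["score"])}
        for d in raw_detections
    ]

    # Assumption: "Predicted ByteArray" is a Base64-encoded image.
    image_bytes = base64.b64decode(response["Predicted ByteArray"])
    return image_bytes, detections


# Hypothetical response: the image payload is a placeholder, not a real image.
sample = json.dumps({
    "Predicted ByteArray": base64.b64encode(b"placeholder-bytes").decode("ascii"),
    "Predicted Class": "[{'class': 'book', 'score': ' 0.31'}, "
                       "{'class': 'dog', 'score': ' 0.53'}, "
                       "{'class': 'chair', 'score': ' 0.79'}]",
})

image_bytes, detections = parse_detection_response(sample)

# Keep only detections at or above a chosen score threshold.
confident = [d for d in detections if d["score"] >= 0.5]
print(confident)
```

Converting the scores to floats up front makes it straightforward to filter out low-confidence detections; the 0.5 threshold here is only an illustration.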

Pipelines

This package only supports full pipeline runs.

Dataset format

Image file

By default, this model reads images in .jpg and .jpeg format. Consider the following for the input images:

  • Use the same format for all images.
  • Use the same size for all images, preferably 800×600.
  • Provide at least 100 images for each category of object.

XML annotation file

For each uploaded image, there must be a corresponding .xml annotation file that contains the bounding box details for the image. The required format for the .xml file is Pascal VOC.
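For orientation, a Pascal VOC annotation stores the image size and one object element per bounding box. The sketch below parses a hand-written, hypothetical annotation (the file name, class, and coordinates are made up) using Python's standard library; files produced by an annotation tool may contain additional elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for an 800x600 image containing one object.
VOC_XML = """\
<annotation>
    <folder>dataset</folder>
    <filename>cat_001.jpg</filename>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <object>
        <name>cat</name>
        <bndbox>
            <xmin>120</xmin>
            <ymin>85</ymin>
            <xmax>430</xmax>
            <ymax>510</ymax>
        </bndbox>
    </object>
</annotation>
"""


def read_voc_boxes(xml_text: str):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Some tools write coordinates as decimals, so parse via float first.
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), *coords))
    return boxes


boxes = read_voc_boxes(VOC_XML)
print(boxes)
```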

To annotate the images, you can use an open-source annotation tool such as Label Studio, or any other tool of your preference.

Consider the following when creating the .xml files:

  • It is preferred to have a single class in each .xml file.
  • Give each class a meaningful name.
  • Avoid making any alterations to the .xml file.
For example, a dataset folder with five classes (cat, dog, giraffe, horse, and zebra) contains the images for each class together with their corresponding .xml files. Your own dataset folder will contain more images and .xml files; this example only illustrates the folder structure.
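Before uploading, the recommendations in this section can be checked with a short script. This is a rough sketch under assumptions that the page does not confirm: each image and its annotation sit in the same folder and share a base name (for example, cat_001.jpg and cat_001.xml), and the image size is taken from the size element of the annotation rather than from the pixels. The function name check_dataset is hypothetical.

```python
from collections import Counter
from pathlib import Path
import xml.etree.ElementTree as ET

IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def check_dataset(folder, min_per_class=100):
    """Return a list of human-readable problems found in a dataset folder."""
    folder = Path(folder)
    problems = []
    class_counts = Counter()
    sizes = set()

    images = sorted(
        p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if len({p.suffix.lower() for p in images}) > 1:
        problems.append("images use more than one file extension")

    for image in images:
        # Assumption: the annotation shares the image's base name.
        annotation = image.with_suffix(".xml")
        if not annotation.exists():
            problems.append(f"{image.name}: missing {annotation.name}")
            continue
        root = ET.parse(annotation).getroot()
        # Image size as recorded in the annotation, not read from the pixels.
        sizes.add((root.findtext("size/width"), root.findtext("size/height")))
        # Count each image once per class it contains.
        class_counts.update(
            {obj.findtext("name", default="<unnamed>") for obj in root.iter("object")}
        )

    if len(sizes) > 1:
        problems.append(f"annotations record {len(sizes)} different image sizes")
    for name, count in sorted(class_counts.items()):
        if count < min_per_class:
            problems.append(
                f"class {name!r}: {count} images, fewer than {min_per_class}"
            )
    return problems
```

Calling check_dataset("path/to/dataset") returns an empty list when no problems are found; the path is a placeholder for your own dataset folder.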

Environment variables

  • learning_rate: Adjusts the learning rate. The default value is 0.0001.
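How the package consumes this variable internally is not documented here. As an illustration only, a training script could read such a variable with a fallback to the documented default as follows; the function name read_learning_rate and the validation rule are assumptions.

```python
import os

# Default taken from the documentation above.
DEFAULT_LEARNING_RATE = 0.0001


def read_learning_rate(env=os.environ):
    """Parse learning_rate from the environment, falling back to the default."""
    raw = env.get("learning_rate")
    if raw is None or not raw.strip():
        return DEFAULT_LEARNING_RATE
    value = float(raw)
    # Assumption: only positive learning rates make sense for training.
    if value <= 0:
        raise ValueError(f"learning_rate must be positive, got {value}")
    return value


print(read_learning_rate({"learning_rate": "0.001"}))
```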

Artifacts

The Evaluate function produces one artifact, which reports the model's performance as mean average precision (mAP):

  • result.txt – A report summarizing how the model performed, listing the mAP value for each class and the total mAP value.
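In the standard definition, the total mAP is the arithmetic mean of the per-class average precision values. The snippet below illustrates that relationship with hypothetical numbers; it does not read result.txt, whose exact layout is not described here, and the evaluation settings the package uses (such as the overlap threshold) are not stated on this page.

```python
from statistics import mean

# Hypothetical per-class average precision values, not real results.
ap_per_class = {"cat": 0.82, "dog": 0.76, "giraffe": 0.91}

# Total mAP: the mean of the per-class values.
mean_ap = mean(ap_per_class.values())
print(f"mAP = {mean_ap:.4f}")
```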

Sample workflow

You can use this sample workflow to try the model. First deploy the model in your own tenant, then run the workflow with any of your images to automatically identify the objects in that image.

Dependencies

  • UiPath.MLServices.Activities v1.1.3
  • UiPath.Web.Activities v1.4.5

Paper

YOLOv3: An Incremental Improvement by Joseph Redmon, Ali Farhadi
