Introduction
AI Computer Vision is a machine-learning-based method for visually identifying the UI elements on a computer screen and interacting with them through UiPath Robots, simulating human interaction. It does not require or use the underlying properties of applications; it relies only on the appearance of screen elements and the relationships between them.
Rather than relying on selectors, AI Computer Vision uses AI (Object Detection, OCR, fuzzy text-matching, image-matching for icons) and an anchoring system to tie it all together. More precisely, to visually locate elements on the screen, AI Computer Vision runs element detection (on the machine-learning server) and text detection (OCR), then combines the two results into a full understanding of the UI. The relationships between the elements detected with these two methods are then encoded into a multi-anchor descriptor, which uniquely identifies the targeted element.
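The locating process described above can be illustrated with a minimal sketch. It is not the UiPath implementation or API: the `Box`, `Element`, and `TextRegion` types, the `locate` function, and the `min_similarity` threshold are all hypothetical names introduced for illustration. The sketch assumes that element detection and OCR have already produced bounding boxes, uses Python's standard `difflib` as a stand-in for fuzzy text-matching, and replaces the multi-anchor descriptor with a simple "closest to all anchors" rule.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in screen pixels (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


@dataclass(frozen=True)
class Element:
    """Assumed output of element detection: a control type and its box."""
    kind: str
    box: Box


@dataclass(frozen=True)
class TextRegion:
    """Assumed output of OCR: recognized text and its box."""
    text: str
    box: Box


def similarity(a, b):
    """Case-insensitive fuzzy similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance(a, b):
    """Euclidean distance between the centers of two boxes."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, ay - by)


def locate(kind, anchor_labels, elements, texts, min_similarity=0.8):
    """Return the element of `kind` closest to the text regions that best
    match every anchor label, or None if an anchor or candidate is missing."""
    anchors = []
    for label in anchor_labels:
        best = max(texts, key=lambda t: similarity(t.text, label), default=None)
        if best is None or similarity(best.text, label) < min_similarity:
            return None  # the anchor text is not on screen
        anchors.append(best.box)
    candidates = [e for e in elements if e.kind == kind]
    if not candidates:
        return None
    # Pick the candidate with the smallest total distance to all anchors.
    return min(candidates, key=lambda e: sum(distance(e.box, a) for a in anchors))


# Made-up detections for a small login form.
elements = [
    Element("input", Box(120, 40, 200, 24)),   # field next to "Username"
    Element("input", Box(120, 80, 200, 24)),   # field next to "Password"
    Element("button", Box(120, 120, 80, 28)),
]
texts = [
    TextRegion("Usernarne", Box(20, 40, 80, 24)),  # simulated OCR misread
    TextRegion("Password", Box(20, 80, 80, 24)),
    TextRegion("Sign in", Box(130, 124, 60, 20)),
]

target = locate("input", ["Username"], elements, texts)  # first input field
```

Fuzzy matching lets the anchor still be found when OCR misreads a character, and tying the target to nearby text rather than to absolute coordinates is what keeps this kind of lookup stable when the window moves or is resized.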
AI Computer Vision is composed of a set of activities that are part of the UI Automation activity package, as well as a server (which can be cloud, on-premises, or local) hosting an AI model, which is needed to perform the actual analysis of the UI you're automating. By default, the UiPath cloud server is used, and it is also the recommended option for all AI Computer Vision and UI Automation activities. You can use AI Computer Vision cloud regardless of your deployment type. For instance, whether you have Orchestrator on-premises or Orchestrator cloud, you can run Computer Vision cloud with no special configuration required.
Alternatively, you can host and manage your own on-premises AI Computer Vision server and use it to run the AI Computer Vision activities. When using this type of server, you need to have your own hardware infrastructure (GPUs) or cloud environment. You also need to deploy, update, and maintain your own environment locally. Compared to the UiPath cloud server, you might also run into issues with backwards compatibility when upgrading the AI model. For further details on how to avoid these kinds of issues, go to Model update resilience.
A local server is another option. It runs on the local CPU and is the most portable version. However, it is slower and has slightly lower detection accuracy.
Here are some features of AI Computer Vision you can benefit from:
- Automation beyond selectors - Enable robots to recognize and interact with more on-screen fields and components - even Flash, Silverlight, PDFs, and images.
- Reliable on VDIs and desktops - Relieves issues with failure-prone image automation techniques and with selector-based targeting on desktops. Start by creating automations within Citrix, VMware, or Microsoft’s Remote Desktop.
- Broad range of interface types - Includes VDI environments (Citrix, VMware, Microsoft RDP, VNC, and others) for desktop and web applications. Save time by having UI elements identified and added to the Object Repository for you.
- Intelligent, intuitive capabilities - Provides details, validation, and notifications about on-screen selections via an on-screen wizard. Uses the recorder to easily generate full vision-based automations.
- Run-time auto-scroll support - Easily automate scrollable content in webpages or apps using AI Computer Vision activities.
- Cross-platform capabilities - Automate for Windows, Linux, Android and other operating systems through remote desktops.
- Automation between VDI & non-VDI - Simplifies VDI-to-desktop automation by reducing necessary modifications.
- Multiple deployment options - Deploys via SaaS; available on-premises for Linux and Windows, or right from your desktop.
- Dynamic UI elements - Enables automations that include tables, drop-down lists, and checkbox elements. This increases the resilience of your automations, enabling them to adapt to small changes to the UI and interact with these dynamic elements.
- Available in UI Automation as part of Unified Target - Reduces the complexity of building UI-based automations when you need both selectors and AI Computer Vision descriptors.
For a side-by-side comparison of the available AI Computer Vision deployment options, see the AI Computer Vision differences section in the Overview guide.