Integration Service Activities
Last updated Sep 9, 2024

Index and Ingest (Public Preview)

Description

Index and ingest data from data sources, creating embeddings that support Retrieval Augmented Generation (RAG) within UiPath GenAI Activities.

Project compatibility

Windows | Cross-platform

Configuration

  • Connection ID - The connection established in Integration Service. Access the dropdown menu to choose, add, or manage connections.

  • Orchestrator folder - The Orchestrator folder that contains data you'd like to query with Context Grounding. This must be a Shared folder. Search by name or select from the dropdown list of available/permissioned Orchestrator folders in that tenant. This field supports String type input.
  • Orchestrator bucket - The Orchestrator bucket in the shared folder. Search by name or select from the dropdown list of buckets in that folder. This field is displayed after you select the Orchestrator folder. This field supports String type input.
  • Index name - If you have previously created an index, select one from the available options in the dropdown list. If you have not, create a new index. This field supports String type input.
  • Data type - Define the specific data type in the Orchestrator bucket being ingested: PDF, JSON, or CSV. Only one file type can be ingested at a time. If you have multiple types, run the activity once for each file type.
  • File glob pattern - If the Orchestrator bucket contains multiple data types, set this pattern to match the file type selected in the Data type field. Select:
    • *. - if you are ingesting the same file type as the previous request.
    • *.pdf - for PDFs.
    • *.csv - for CSVs.
    • *.json - for JSONs.
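
To illustrate how a glob pattern narrows a mixed bucket down to one file type, here is a minimal Python sketch. It uses standard shell-style glob matching from Python's `fnmatch` module. The file names and the `select_files` helper are hypothetical, and the sketch assumes the activity follows conventional glob semantics; it does not reproduce the activity's internal matching logic.

```python
from fnmatch import fnmatchcase

# Hypothetical bucket contents, for illustration only.
bucket_files = ["policy.pdf", "faq.pdf", "orders.csv", "catalog.json"]


def select_files(files, pattern):
    """Return the file names that match a shell-style glob pattern."""
    # fnmatchcase compares case-sensitively on every operating system.
    return [name for name in files if fnmatchcase(name, pattern)]


# One pattern per run, mirroring "one file type at a time".
for pattern in ("*.pdf", "*.csv", "*.json"):
    print(pattern, select_files(bucket_files, pattern))
```

Each pattern selects only the files with that extension, which is why a bucket holding several file types needs one run per type.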

Additional properties

Output

  • Index ID - The unique identifier for the schema used in indexing. Automatically generated output variable.
  • Datasource ID - A unique identifier for the data source. Automatically generated output variable.
  • Index and Ingest - Automatically generated output variable.

How to use Index and Ingest

The Index and Ingest activity makes your datasets available at runtime for querying and Retrieval Augmented Generation (RAG) by LLMs. Note that Orchestrator buckets and indices are separate entities. Context Grounding uses Orchestrator buckets where you can upload and store files to create indices. These indices can then be referenced when searching for semantically similar context to insert into an LLM prompt.

  • Index: In the UiPath-managed vector database, create an organized location (e.g. a folder) in which embeddings are stored and referenced at runtime.
  • Ingest: Convert business data stored in Orchestrator buckets into representative embeddings, that is, vectors that can be searched, with the results presented in a form that LLMs can interpret well.
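
As a conceptual illustration of the two steps above, the following Python sketch stores a few hand-written vectors in a dictionary (a stand-in for an index) and ranks them against a query vector by cosine similarity. Everything here is an assumption made for illustration: the three-dimensional vectors, the `index` dictionary, and the `search` helper are hypothetical, and the sketch does not describe how the UiPath-managed vector database computes or stores embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "index": text chunks mapped to made-up embedding vectors.
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "office hours": [0.0, 0.2, 0.9],
}


def search(index, query_vector, top_k=1):
    """Return the top_k chunks most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(item[1], query_vector),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# A made-up query vector that lies closest to "refund policy".
query = [0.8, 0.2, 0.1]
print(search(index, query, top_k=2))
```

The chunks returned by such a search are the "semantically similar context" that is then inserted into the LLM prompt.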

To use Index and Ingest, you must upload data into shared UiPath Orchestrator buckets via direct upload, API, or an activity. The activity uses this data to:

  • Create new indices (e.g. just getting started, adding new data you’d like organized in different folders).
  • Re-ingest and re-index data (e.g. making sure you have the most relevant results; deleting a dataset, adding new ones).

Table 1. Terminology
Term | Definition
Orchestrator folder | General storage for data to be used in the UiPath platform.
Orchestrator bucket | The specific location of the data within the folder for which you’d like to create an index. There is typically a 1:1 relationship between buckets and indices.
Index name | The unique name of the index you’d like to create or update. Once created, this appears under the Index name field dropdown list in the Index and Ingest activity, and in the Index field dropdown list in the Content Generation activity.

We recommend using this activity asynchronously, because ingestion can take additional time. That way, create/record/update/delete actions can be managed effectively, and potential errors do not affect downstream activities. You can accomplish this using a separate process or a Delay activity within the same process. For best results, use two separate processes.
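
The waiting pattern behind this recommendation can be sketched generically in Python: check a readiness condition at fixed intervals until it succeeds or a timeout expires, and only then continue with downstream work. The `wait_until` helper and the `check_ready` callable are hypothetical. The source does not describe a status-check API for ingestion, so how readiness is determined is left as an assumption for the caller to supply.

```python
import time


def wait_until(check_ready, timeout_s=60.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check_ready() until it returns True or timeout_s elapses.

    check_ready is a hypothetical caller-supplied function; this sketch
    does not assume any particular ingestion status API.
    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example with a stand-in condition that becomes true on the third check.
attempts = {"count": 0}


def fake_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(fake_ready, timeout_s=5.0, interval_s=0.0))
```

A fixed Delay activity corresponds to a single long wait, whereas a separate process keeps a slow or failed ingestion from blocking the activities that query the index.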

For more information, see Managing the ingestion pipeline.
