Orchestrator User Guide
Last updated Oct 17, 2024

Process Data Retention Policy

Overview

Executing processes generates large amounts of job data, which may crowd your Orchestrator database rapidly. A retention policy helps you free up the database in an organized manner.

What is a retention policy? It is a built-in data offloading capability: you set an action that removes data from your database after a given period of time. What to expect? With a lighter database, Orchestrator performs better.

Job conditions

For the specified process, the retention policy you configure applies to all jobs that simultaneously meet the following conditions:

  • they have a final status, such as Faulted, Successful, or Stopped
  • they have ended more than X days ago, X being the retention duration

Determining when a Job is deleted

The retention is calculated based on calendar days. Therefore, qualified jobs are deleted on the X+1 calendar day, X being the retention duration, and +1 representing the deletion on the following calendar day.

Note that the deletion may execute at the very beginning of the following calendar day, hence a couple of hours apart from the moment the retention duration ends.

For example, say you set a retention duration of one day:

If the end time of a job is either June 6 2022 00:01:00 (the first minute in the calendar day) or June 6 2022 23:59:00 (the last minute in the calendar day), it qualifies for the deletion that runs on June 8th (June 6th + one-day retention duration + one day after = June 8th).

Therefore:

  • we ensure your job data is kept for at least one calendar day (the retention duration) by deleting or archiving it no earlier than the next calendar day, and
  • we aim to ensure your jobs are deleted or archived by the end of the next calendar day.
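
The calendar-day rule above can be sketched in a few lines of Python. This is an illustration only, not Orchestrator's implementation: the function names are hypothetical, and the set of final states contains just the three examples named in this guide.

```python
from datetime import date, datetime, timedelta

# Assumption: these are only the example final states listed in this guide;
# the real set of final states may be larger.
FINAL_STATES = {"Faulted", "Successful", "Stopped"}


def deletion_date(end_time: datetime, retention_days: int) -> date:
    """Calendar day on which a job that ended at end_time becomes due.

    The time of day is ignored: the job is kept for retention_days
    calendar days and handled on the following calendar day (X + 1).
    """
    return end_time.date() + timedelta(days=retention_days + 1)


def is_due(state: str, end_time: datetime, retention_days: int, today: date) -> bool:
    """True if the job is in a final state and its deletion day has arrived."""
    return state in FINAL_STATES and today >= deletion_date(end_time, retention_days)


# The example from the text: a one-day retention duration.
for end in (datetime(2022, 6, 6, 0, 1), datetime(2022, 6, 6, 23, 59)):
    print(end, "->", deletion_date(end, retention_days=1))
```

Both end times map to June 8, 2022, because only the calendar day of the end time matters.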

Policy types

There are three types of retention policy:

  • the default policy for newly created processes - all jobs that are created from new processes are deleted after 30 days, without the possibility to undo their deletion. This is the built-in option.
  • the custom policy - all jobs are deleted or archived after a retention duration of your choosing, up to a maximum of 180 days. This option can be configured as instructed in the Configuring a custom retention policy section.
  • the keep policy - pre-existing processes and jobs have no defined initial retention policy, meaning their data is kept indefinitely until you set a default or custom policy.

Important:

The default policy of 30 days applies to:

  • jobs without an associated process
  • jobs whose associated process was deleted
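
As a rough way to reason about which policy ends up applying to a job, the rules above can be written as a small decision function. This is a sketch under stated assumptions, not product code: `Policy`, `effective_policy`, and the `predates_feature` flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETENTION_DAYS = 180      # upper bound for a custom policy, per this guide
DEFAULT_RETENTION_DAYS = 30   # built-in default, per this guide


@dataclass(frozen=True)
class Policy:
    action: str            # "Delete", "Archive", or "Keep"
    days: Optional[int]    # None means the data is kept indefinitely


def effective_policy(custom: Optional[Policy],
                     process_exists: bool,
                     predates_feature: bool) -> Policy:
    """Pick the policy that applies to a job's data.

    custom           -- the policy configured on the process, if any
    process_exists   -- False for jobs without a process or whose process was deleted
    predates_feature -- True for processes created before retention policies existed
    """
    default = Policy("Delete", DEFAULT_RETENTION_DAYS)
    if not process_exists:
        return default                      # orphaned jobs get the 30-day default
    if custom is not None:
        if custom.action != "Keep" and not (1 <= (custom.days or 0) <= MAX_RETENTION_DAYS):
            raise ValueError("retention duration must be between 1 and 180 days")
        return custom
    # No configured policy: pre-existing processes keep data, new ones use the default.
    return Policy("Keep", None) if predates_feature else default


print(effective_policy(None, process_exists=True, predates_feature=False))
print(effective_policy(Policy("Archive", 45), process_exists=True, predates_feature=True))
```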

Policy outcomes

A custom retention policy has one of the following outcomes, depending on the action you select:

  • it permanently deletes the qualifying jobs that are older than the specified duration.
  • it deletes the qualifying jobs that are older than the specified duration, but archives their data into an existing storage bucket, for future reference. This way, you offload your Orchestrator database without losing the information.

    Note:

    Insights dashboards containing deleted job information will continue to display the correct data.

    The deletion in Orchestrator will not be propagated towards Insights.

    Note: We preserve the unique references of deleted jobs, so adding a new job does not create a duplicate unique reference.

Implementation phases

We acknowledge the impact that this functionality may have on your data, so we roll out the retention policy option in three phases. This aims to give you enough time to assess and determine which policy best suits your business needs. Be aware that even if you do not configure a custom retention policy, the default one still applies and deletes all existing process items older than 120 days.

  • Phase 0 - This is an informing phase, which notifies all organizations about the upcoming policy, its impact on accounts, the feature behavior, and the rollout mechanism.

    At the end of phase 0, the feature UI and functionality are deployed to all environments, but no policy is activated.

  • Phase 1 - This is a six-week period between the feature deployment and the first policy activation, allowing you to adjust and prepare your processes.

    An Application Information counter displays the remaining days until the retention policy starts, so you do not overlook preparing your account before the policies' live date.

    At the end of phase 1, all policies, either the default ones or the ones you configured, are applied.

  • Phase 2 - All policies become active and your account data is offloaded based on the policy configuration.

    Phase 2 has no end date. This means that if you configure a new policy, it applies immediately.

Offloading mechanism

A background job runs daily at a time your server is not busy and performs the actions necessary for all retention policies.

Initially, a large volume of data needs to be handled. To avoid any operational performance impact, the job may take about one month to parse its data backlog and become accurate to the day.

Therefore, policies may not apply immediately, but they will catch up in about one month.

For example, say you configure a deletion policy of 45 days for a process. The policy becomes active at the end of phase 1, but it takes about one month to guarantee that all your 45-day-old jobs are handled. This is a first-time exception, which allows the background job to work through the data backlog.

Configuring a custom retention policy

To configure a custom retention policy:

  1. In Orchestrator, navigate to the desired folder in your tenant.
  2. Open the Processes page.
  3. To add a new process, click Add Process. To edit an existing process, click More Actions > Edit for the desired process. The Create/Update Process page opens.
  4. In the Retention policy section, select the outcome of your policy from the Action dropdown menu.

    To delete jobs, but keep their information, read the steps in the Archiving jobs section.

    To permanently delete jobs, read the steps in the Deleting jobs section.

Archiving Jobs

If you do not want to lose your job data, but you need to offload this information from the Orchestrator database, archive your jobs.

Prerequisite: You need a storage bucket to store your archived jobs.

  1. Select Archive from the Action dropdown menu.
  2. Select a Retention duration. Input a value between 1 and 180. The default value is 30.

    At the end of this duration, all final state jobs (including job events and execution media) that have not been updated in the meantime are deleted, and their information is stored in a Target bucket.

  3. Select a Target bucket to store your archived items.

To retrieve the archived information, access the archive files from the associated storage bucket.

Note:

Note 1: You can either use an Orchestrator storage bucket, or link an external storage bucket.

Note 2: The storage bucket you use must not be read-only, so that the archiving operation can add items to it.

Note 3: You can use the same storage bucket to archive process items from different processes.

Note 4: The Target bucket field is only available for the Archive option.

Note 5: A successful archiving operation is logged on the Tenant > Audit page, identifiable by the Archive action type.

Note 6: If an error interrupts the archiving operation, an alert informs you so that you can fix the error. The archiving operation is retried the next time the deletion runs (the next calendar day). Until the archiving is successfully retried, the affected jobs cannot be viewed or accessed.

Deleting Jobs

If you decide that processed job data is no longer useful, you can remove all that information from your Orchestrator database.

  1. Select Delete from the Action dropdown menu.
  2. Select a Retention duration. Input a value between 1 and 180. The default value is 30.

    At the end of this duration, all final state jobs (including job events and execution media) that have not been updated in the meantime are permanently deleted.

Keeping Jobs

If you want to keep the processed job data for an indefinite time, select Keep from the Action dropdown menu.

All final state jobs (including job events and execution media) are kept indefinitely in your configured database.

Archive output

The .zip file

When you archive your jobs, a .zip file is created at the end of the retention duration with the path:

"Archive/Processes/Process-{process_key}/{archiving_operation_date}-{archiving_operation_timestamp}.zip", in which:

  • {process_key} - the unique identifier of the process containing the jobs
  • {archiving_operation_date} - the UTC date when the archive was generated, in the yyyy-MM-dd format
  • {archiving_operation_timestamp} - the UTC time when the archive was generated, in the HH-mm-ss-fff format
    For example, an archive file could be named Archive/Processes/Process-1d1ad84a-a06c-437e-974d-696ae66e47c2/2022-05-26-03-00-08-496.zip.
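
If you need to locate or sort archives programmatically, the naming scheme above can be built and parsed as follows. This is a minimal sketch; `archive_path` and `parse_archive_path` are hypothetical helper names, not part of any Orchestrator library.

```python
import re
from datetime import datetime, timezone


def archive_path(process_key: str, generated_at: datetime) -> str:
    """Build the archive path for a timezone-aware generation time."""
    utc = generated_at.astimezone(timezone.utc)
    date_part = utc.strftime("%Y-%m-%d")                                      # yyyy-MM-dd
    time_part = utc.strftime("%H-%M-%S") + f"-{utc.microsecond // 1000:03d}"  # HH-mm-ss-fff
    return f"Archive/Processes/Process-{process_key}/{date_part}-{time_part}.zip"


_ARCHIVE_RE = re.compile(
    r"^Archive/Processes/Process-(?P<key>[^/]+)/"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-\d{3})\.zip$"
)


def parse_archive_path(path: str) -> tuple[str, datetime]:
    """Return (process_key, UTC generation time) for an archive path."""
    match = _ARCHIVE_RE.match(path)
    if match is None:
        raise ValueError(f"not an archive path: {path!r}")
    # %f accepts one to six digits, so the millisecond part parses directly.
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H-%M-%S-%f")
    return match.group("key"), stamp.replace(tzinfo=timezone.utc)


path = archive_path(
    "1d1ad84a-a06c-437e-974d-696ae66e47c2",
    datetime(2022, 5, 26, 3, 0, 8, 496000, tzinfo=timezone.utc),
)
print(path)
print(parse_archive_path(path))
```

The first line printed matches the example file name given above.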

The .csv file

Once extracted, the .zip file contains a .csv file with the same name syntax:

"Process-{process_key}-{archiving_operation_date}-{archiving_operation_timestamp}.csv".

The Metadata.json file

The .json file contains details about the container process, to help you identify it more easily.
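
To inspect an archive after downloading it from the storage bucket, something along these lines may help. It is a sketch that rests on several assumptions not confirmed by this guide: that Metadata.json sits inside the same .zip, that the .csv has a header row, and that both files are UTF-8 encoded. `read_archive` is a hypothetical helper name.

```python
import csv
import io
import json
import zipfile


def read_archive(source):
    """Read the job rows and the process metadata from one archive.

    source -- a path to the downloaded .zip, or a binary file-like object.
    Returns (rows, metadata): rows is a list of dicts keyed by the .csv
    header, metadata is the parsed Metadata.json content or None.
    """
    rows, metadata = [], None
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            lowered = name.lower()
            if lowered.endswith(".csv"):
                with archive.open(name) as raw:
                    # utf-8-sig tolerates an optional byte order mark.
                    text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    rows.extend(csv.DictReader(text))
            elif lowered.endswith("metadata.json"):
                metadata = json.loads(archive.read(name).decode("utf-8-sig"))
    return rows, metadata
```

Call it as `rows, metadata = read_archive("downloaded-archive.zip")`, where the file name is a placeholder for the archive you retrieved. The column names in `rows` come from whatever header the .csv actually carries.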

Large data volumes

For processes that ran a large number of jobs, the jobs are archived in batches. In this case, the .zip file of each batch has a different {archiving_operation_timestamp}, depending on the time the batch archive was created.

Process retention policy APIs

To incorporate the retention policy in your client, use the dedicated endpoints of the ReleaseRetention API in your Swagger file:

  • GET /odata/ReleaseRetention - returns the list of all active policies, containing information such as the policy action, the retention duration in days, the ID of the process the policy applies to.
  • GET /odata/ReleaseRetention({key}) - returns the policy information about the specified process.
  • PUT /odata/ReleaseRetention({key}) - updates the policy information about the specified process.
  • DELETE /odata/ReleaseRetention({key}) - resets the specified process policy to the default one of 30-day retention + deletion.

Note: If you call the DELETE endpoint for processes created before the introduction of the retention policy feature, the built-in retention policy of 30 days + deletion applies.

See an example in our reference guide.
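
As a starting point for a client, the four endpoints listed above can be wrapped with Python's standard library. Treat this as a sketch under assumptions: the base URL, the bearer token header, the integer key, and the payload field names in the example are all hypothetical, so check the Swagger file of your instance for the actual request schema and authentication requirements.

```python
import json
from typing import Optional
from urllib.request import Request


def retention_request(base_url: str,
                      method: str,
                      process_id: Optional[int] = None,
                      body: Optional[dict] = None,
                      token: str = "<access-token>") -> Request:
    """Build (but do not send) a request to the ReleaseRetention endpoints.

    With no process_id the collection endpoint is addressed; otherwise the
    OData key form /odata/ReleaseRetention({key}) is used.
    """
    url = base_url.rstrip("/") + "/odata/ReleaseRetention"
    if process_id is not None:
        url += f"({process_id})"
    # Assumption: bearer-token authentication; adjust to your setup.
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


BASE = "https://orchestrator.example.com"  # hypothetical host

list_policies = retention_request(BASE, "GET")
get_policy = retention_request(BASE, "GET", process_id=42)
# Hypothetical payload: the field names are placeholders, not the documented schema.
update_policy = retention_request(BASE, "PUT", process_id=42,
                                  body={"action": "Archive", "period": 45, "bucketId": 7})
reset_policy = retention_request(BASE, "DELETE", process_id=42)

for req in (list_policies, get_policy, update_policy, reset_policy):
    print(req.get_method(), req.full_url)
```

To send one of these requests, pass it to `urllib.request.urlopen` and decode the JSON response. The requests are only constructed here so that the example runs without a live Orchestrator instance.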

Policy Tracking Columns and Audit

To easily identify which processes have a custom retention policy in place, enable the Retention action and Retention (days) columns on the Processes page, by selecting the corresponding checkboxes from the Columns dropdown.

The Retention action column displays the policy outcome, while the Retention (days) column displays the remaining time until the policy applies.



As mentioned, a 30-day retention policy applies to newly created processes. However, you cannot always rely on this value to identify the processes which have a default policy in place. For example, if you set a custom retention duration of 55 days and you later update it to 30 days, the resulting policy is not the default one. To see whether these scenarios represent default policies or not, check the Audit page.

Whenever the background job performs cleanup actions related to a retention policy (archive + delete, or just delete), a corresponding entry is created in the audit on behalf of the administrator.

1 represents the Archive action type. 0 represents the Delete action type.

© 2005-2024 UiPath. All rights reserved.