Skip to content

Workflow Archive

v2.5 and after

If you want to keep completed workflows for a long time, you can use the workflow archive to save them in a Postgres (>=9.4) or MySQL (>= 5.7.8) database. The workflow archive stores the status of the workflow, which pods have been executed, what was the result etc. The job logs of the workflow pods will not be archived. If you need to save the logs of the pods, you must setup an artifact repository according to this doc.

The quick-start deployment includes a Postgres database server. In this case the workflow archive is already enabled. Such a deployment is convenient for test environments, but in a production environment you must use a production quality database service.

Enabling Workflow Archive

To enable archiving of the workflows, you must configure database parameters in the persistence section of your configuration and set archive: to true.

Example:

persistence:
  archive: true
  postgresql:
    host: localhost
    port: 5432
    database: postgres
    tableName: argo_workflows
    userNameSecret:
      name: argo-postgres-config
      key: username
    passwordSecret:
      name: argo-postgres-config
      key: password

You must also create the secret with database user and password in the namespace of the workflow controller.

Example:

kubectl create secret generic argo-postgres-config -n argo --from-literal=password=mypassword --from-literal=username=argodbuser

Note that IAM-based authentication is not currently supported. However, you can start your database proxy as a sidecar (e.g. via CloudSQL Proxy on GCP) and then specify your local proxy address, IAM username, and an empty string as your password in the persistence configuration to connect to it.

The following tables will be created in the database when you start the workflow controller with enabled archive:

  • argo_workflows
  • argo_archived_workflows
  • argo_archived_workflows_labels
  • schema_history

Automatic Database Migration

Every time the Argo workflow-controller starts with persistence enabled, it tries to migrate the database to the correct version. If the database migration fails, the workflow-controller will also fail to start. In this case you can delete all the above tables and restart the workflow-controller.

If you know what are you doing you also have an option to skip migration:

persistence:
  skipMigration: true

Required database permissions

Postgres

The database user/role must have CREATE and USAGE permissions on the public schema of the database so that the tables can be created during the migration.

Archive TTL

You can configure the time period to keep archived workflows before they will be deleted by the archived workflow garbage collection function. The default is forever.

Example:

persistence:
  archiveTTL: 10d

The ARCHIVED_WORKFLOW_GC_PERIOD variable defines the periodicity of running the garbage collection function. The default value is documented here. When the workflow controller starts, it sets the ticker to run every ARCHIVED_WORKFLOW_GC_PERIOD. It does not run the garbage collection function immediately and the first garbage collection happens only after the period defined in the ARCHIVED_WORKFLOW_GC_PERIOD variable.

Cluster Name

Optionally you can set a unique name of your Kubernetes cluster. This name will populate the clustername field in the argo_archived_workflows table.

Example:

persistence:
  clusterName: dev-cluster

Disabling Workflow Archive

To disable archiving of the workflows, set archive: to false in the persistence section of your configuration.

Example:

persistence:
  archive: false

Have a question?

Search on GitHub Discussions and Slack.