Upgrading Guide¶
Breaking changes typically have "!" in the commit message, as per Conventional Commits, although sometimes we don't realise that a change is breaking.
Upgrading to v3.6¶
See also the list of new features in 3.6.
Deprecations¶
The following features are deprecated and will be removed in a future version of Argo Workflows:
- The Python SDK is deprecated; we recommend migrating to Hera.
- schedule in CronWorkflows, and podPriority, mutex and semaphore in Workflows and WorkflowTemplates.
For more information on how to migrate these, see deprecations.
Fixed Server --basehref inconsistency¶
For consistency, the Server now uses --base-href and ARGO_BASE_HREF.
Previously it was --basehref (no dash in between) and ARGO_BASEHREF (no underscore in between).
Removed redundant Server environment variables¶
ALLOWED_LINK_PROTOCOL and BASE_HREF have been removed as redundant.
Use ARGO_ALLOWED_LINK_PROTOCOL and ARGO_BASE_HREF instead.
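As a rough aid for updating deployment scripts, the renames in the two sections above can be captured in a small lookup. This is only a sketch: the helper names `migrate_env` and `migrate_args` are hypothetical and not part of Argo, and the code assumes your Server environment and arguments are available as a plain dict and list.

```python
# Old -> new names, taken from the upgrade notes above.
RENAMED_ENV_VARS = {
    "ARGO_BASEHREF": "ARGO_BASE_HREF",
    "BASE_HREF": "ARGO_BASE_HREF",
    "ALLOWED_LINK_PROTOCOL": "ARGO_ALLOWED_LINK_PROTOCOL",
}
RENAMED_FLAGS = {"--basehref": "--base-href"}


def migrate_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with pre-3.6 Server variable names replaced."""
    migrated: dict[str, str] = {}
    for name, value in env.items():
        new_name = RENAMED_ENV_VARS.get(name)
        if new_name is None:
            # Name is unchanged in 3.6 (or is already the new name).
            migrated[name] = value
        elif new_name not in env:
            # Only carry the old value over if the new name is not set explicitly.
            migrated.setdefault(new_name, value)
    return migrated


def migrate_args(args: list[str]) -> list[str]:
    """Rewrite `--basehref` and `--basehref=value` to use `--base-href`."""
    result = []
    for arg in args:
        flag, sep, value = arg.partition("=")
        result.append(RENAMED_FLAGS.get(flag, flag) + sep + value)
    return result
```

When both an old and a new name are set, the sketch keeps the value under the new name and drops the old one; adjust this if your setup needs different precedence.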
Legacy insecure pod patch fallback removed. (#13100)¶
For the Emissary executor to work properly, you must set up RBAC. See workflow RBAC.
Archived Workflows on PostgreSQL¶
To improve performance, this upgrade will automatically transform the column used to store archived workflows from type json to type jsonb on controller start-up.
This requires PostgreSQL version 9.4 or higher.
The migration involves obtaining an ACCESS EXCLUSIVE lock on the argo_archived_workflows table, which blocks all reads and writes until it has finished.
For the vast majority of users, we anticipate this will take less than a minute, but it could take much longer if you have a large number of workflows (100,000+), or the average workflow size is high (100KB+).
If you don't fall into one of those two categories, or if minimizing downtime isn't important to you, then you don't need to read any further.
Otherwise, you have a few options to keep downtime to a minimum:
- If you don't actually need the archived workflows anymore, simply delete them with delete from argo_archived_workflows and the migration will complete almost instantly.
- Using a variation of Altering a Postgres Column with Minimal Downtime, it's possible to manually perform this migration with nearly no downtime. This is a two-step process:
  - Before the upgrade, run the following queries to create a temporary workflowjsonb column and populate it with the existing data. This is safe to do whilst running version 3.5 because the column types are compatible.
    -- Add temporary workflowjsonb column
    ALTER TABLE argo_archived_workflows ADD COLUMN workflowjsonb JSONB NULL;
    -- Add trigger to update workflowjsonb for each insert
    CREATE OR REPLACE FUNCTION update_workflow_jsonb() RETURNS TRIGGER AS $BODY$
    BEGIN
      NEW.workflowjsonb=NEW.workflow;
      RETURN NEW;
    END
    $BODY$ LANGUAGE PLPGSQL;
    CREATE TRIGGER argo_archived_workflows_update_workflow_jsonb
    BEFORE INSERT ON argo_archived_workflows
    FOR EACH ROW EXECUTE PROCEDURE update_workflow_jsonb();
    -- Backfill existing rows
    UPDATE argo_archived_workflows SET workflowjsonb = workflow WHERE workflowjsonb IS NULL;
  - Once the above has completed and you're ready to proceed with the upgrade, run the following queries before starting the controller:
    BEGIN;
    LOCK TABLE argo_archived_workflows IN SHARE ROW EXCLUSIVE MODE;
    DROP TRIGGER argo_archived_workflows_update_workflow_jsonb ON argo_archived_workflows;
    ALTER TABLE argo_archived_workflows DROP COLUMN workflow;
    ALTER TABLE argo_archived_workflows RENAME COLUMN workflowjsonb TO workflow;
    ALTER TABLE argo_archived_workflows ADD CONSTRAINT workflow CHECK (workflow IS NOT NULL) NOT VALID;
    COMMIT;
- Version 3.6 retains compatibility with workflows stored as type json. Therefore, it's currently safe to skip the migration by setting skipMigration: true. This should only be used as an emergency stop-gap, as future versions may drop support for json without notice.
Metrics changes¶
You can now retrieve metrics using the OpenTelemetry Protocol via the OpenTelemetry collector, and this is the recommended mechanism.
These notes explain the differences in using the Prometheus /metrics endpoint to scrape metrics, for a minimal-effort upgrade. We do not recommend following this guide blindly: the new metrics were introduced because they add value, so they should be worth collecting and using.
New metrics¶
The following are new metrics:
cronworkflows_concurrencypolicy_triggered
cronworkflows_triggered_total
deprecated_feature
is_leader
k8s_request_duration
pod_pending_count
pods_total_count
queue_duration
queue_longest_running
queue_retries
queue_unfinished_work
total_count
version
workflowtemplate_runtime
workflowtemplate_triggered_total
and can be disabled with:
metricsConfig: |
  modifiers:
    build_info:
      disable: true
...
Renamed metrics¶
If you are using these metrics in your recording rules, dashboards, or alerts, you will need to update their names after the upgrade:
Old name | New name |
---|---|
argo_workflows_count | argo_workflows_gauge |
argo_workflows_pods_count | argo_workflows_pods_gauge |
argo_workflows_queue_depth_count | argo_workflows_queue_depth_gauge |
log_messages | argo_workflows_log_messages |
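If you keep many recording rules or dashboard queries as text, the renames in the table above can be applied mechanically. The following is a minimal sketch, not an official migration tool: `rename_metrics` is a hypothetical helper, and it treats a query as a plain string rather than parsing PromQL.

```python
import re

# Old -> new metric names, copied from the table above.
RENAMED_METRICS = {
    "argo_workflows_count": "argo_workflows_gauge",
    "argo_workflows_pods_count": "argo_workflows_pods_gauge",
    "argo_workflows_queue_depth_count": "argo_workflows_queue_depth_gauge",
    "log_messages": "argo_workflows_log_messages",
}

# Match an old name only when it is a whole identifier, so that the
# "log_messages" inside "argo_workflows_log_messages" is left alone.
_OLD_NAME = re.compile(
    r"(?<![A-Za-z0-9_:])("
    + "|".join(re.escape(name) for name in sorted(RENAMED_METRICS, key=len, reverse=True))
    + r")(?![A-Za-z0-9_:])"
)


def rename_metrics(expression: str) -> str:
    """Replace pre-3.6 metric names in a query string with their 3.6 names."""
    return _OLD_NAME.sub(lambda match: RENAMED_METRICS[match.group(1)], expression)
```

Because the replacement happens in a single pass and only on whole identifiers, running it twice over the same query leaves the result unchanged.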
Custom metrics¶
Custom metric names and labels must be valid Prometheus and OpenTelemetry names now. This prevents the use of ":", which was usable in earlier versions of workflows.
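To find custom metric names that may need changing before the upgrade, a simple pre-check can help. The sketch below is an approximation and not the controller's actual validation: it assumes a name is acceptable to both systems if it starts with a letter, continues with letters, digits or underscores, and is at most 255 characters long. The helper name `is_portable_metric_name` is hypothetical.

```python
import re

# Assumed rule (an approximation, not Argo's own check): a leading letter,
# then letters, digits or underscores, 255 characters at most.
_PORTABLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,254}")


def is_portable_metric_name(name: str) -> bool:
    """Return True if name fits the assumed Prometheus/OpenTelemetry subset."""
    return _PORTABLE_NAME.fullmatch(name) is not None
```

For example, `build_duration_seconds` passes this check, while `team:build_duration` does not because of the colon.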
Custom metrics, as defined by a workflow, could be defined as one type (say counter) in one workflow, and then as a histogram of the same name in a different workflow. This would work in 3.5 if the first usage of the metric had reached TTL and been deleted. This will no longer work in 3.6, and custom metrics may not be redefined. It doesn't really make sense to change a metric in this way, and the OpenTelemetry SDK prevents you from doing so.
metricsTTL for histogram metrics is not functional, as OpenTelemetry doesn't allow deletion of metrics. For the other metric types, this is faked via asynchronous meters.
TLS¶
The Prometheus /metrics endpoint now has TLS enabled by default.
To disable this, set metricsConfig.secure to false.
Removed Swagger UI¶
The Swagger UI has been removed from the /apidocs page.
It has been replaced with a link to the Swagger UI in the versioned documentation and download links for the OpenAPI spec and JSON schema.
JSON templating fix¶
When returning a map or array in an expression, you would get a Golang representation. This now returns plain JSON.
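To make the difference concrete, the sketch below contrasts the two renderings for one value. It is an illustration only: `go_style` is a hypothetical helper that mimics how Go's fmt package prints maps and slices with `%v`, and both the exact strings earlier versions produced and the whitespace of the new JSON output are assumptions.

```python
import json


def go_style(value) -> str:
    """Mimic Go's default `%v` formatting of maps and slices (illustration only)."""
    if isinstance(value, dict):
        items = " ".join(f"{key}:{go_style(val)}" for key, val in sorted(value.items()))
        return f"map[{items}]"
    if isinstance(value, list):
        return "[" + " ".join(go_style(item) for item in value) + "]"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


value = {"name": "build", "retries": [1, 2, 3]}

# Roughly what a Go-style rendering of the value looks like.
before = go_style(value)  # map[name:build retries:[1 2 3]]

# Plain JSON, which downstream steps can parse with any JSON library.
after = json.dumps(value, separators=(",", ":"))  # {"name":"build","retries":[1,2,3]}
```

If a later step in your workflow parsed the old Go-style string by hand, it will need to switch to a JSON parser after the upgrade.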
ARGO_TEMPLATE removed from main container¶
The environment variable ARGO_TEMPLATE, which is an internal implementation detail, is no longer available inside the main container of your workflow pods.
This is documented here as we are aware that some users of Argo Workflows use this.
Upgrading to v3.5¶
There are no known breaking changes in this release. Please file an issue if you encounter any unexpected problems after upgrading.
Unified Workflows List API and UI¶
The Workflows List in the UI now shows Archived Workflows in the same page. As such, the previously separate Archived Workflows page in the UI has been removed.
The List API /api/v1/workflows also returns both types of Workflows now.
This is not breaking as the Archived API still exists and was not removed, so this is an addition.
Upgrading to v3.4¶
Non-Emissary executors are removed. (#7829)¶
Emissary executor is now the only supported executor. If you are using other executors, e.g. docker, k8sapi, pns, and kubelet, you need to remove your containerRuntimeExecutors and containerRuntimeExecutor from your controller's configmap. If you have workflows that use different executors with the label workflows.argoproj.io/container-runtime-executor, this is no longer supported and will not be effective.
chore!: Remove dataflow pipelines from codebase. (#9071)¶
You are affected if you are using dataflow pipelines in the UI or via the /pipelines endpoint.
We no longer support dataflow pipelines and all relevant code has been removed.
feat!: Add entrypoint lookup. Fixes #8344¶
Affected if:
- Using the Emissary executor.
- Used the args field for any entry in images.
This PR automatically looks up the command and entrypoint. The implementation for config look-up was incorrect (it allowed you to specify args but not entrypoint). args has been removed to correct the behaviour.
If you are incorrectly configured, the workflow controller will error on start-up.
Actions¶
You don't need to configure images that use v2 manifests anymore, such as argoproj/argosay:v2.
You can remove them:
% docker manifest inspect argoproj/argosay:v2
# ...
"schemaVersion": 2,
# ...
For v1 manifests, such as docker/whalesay:latest:
% docker image inspect -f '{{.Config.Entrypoint}} {{.Config.Cmd}}' docker/whalesay:latest
[] [/bin/bash]
images:
  docker/whalesay:latest:
    cmd: [/bin/bash]
feat: Fail on invalid config. (#8295)¶
The workflow controller will error on start-up if incorrectly configured, rather than silently ignoring mis-configuration.
Failed to register watch for controller config map: error unmarshaling JSON: while decoding JSON: json: unknown field \"args\"
feat: add indexes for improve archived workflow performance. (#8860)¶
This PR adds indexes to archived workflow tables. The upgrade may take a long time if these tables are large.
feat: enhance artifact visualization (#8655)¶
For AWS users using S3: visualizing artifacts in the UI and downloading them now requires an additional "Action" to be configured in your S3 bucket policy: "ListBucket".
Upgrading to v3.3¶
662a7295b feat: Replace patch pod with create workflowtaskresult. Fixes #3961 (#8000)¶
The PR changes the permissions that can be used by a workflow to remove the pod patch permission.
See workflow RBAC and #8013.
06d4bf76f fix: Reduce agent permissions. Fixes #7986 (#7987)¶
The PR changes the permissions used by the agent to report back the outcome of HTTP template requests. The permission patch workflowtasksets/status replaces patch workflowtasksets, for example:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: agent
rules:
  - apiGroups:
      - argoproj.io
    resources:
      - workflowtasksets/status
    verbs:
      - patch
Workflows running during any upgrade should be given both permissions.
See #8013.
feat!: Remove deprecated config flags¶
This PR removes the following configmap items:
- executorImage (use executor.image in configmap instead). For example, a workflow controller configmap similar to the following is no longer valid:
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  ...
  executorImage: argoproj/argocli:latest
  ...
From now on, provide the executor image under executor.image, as shown below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  ...
  executor: |
    image: argoproj/argocli:latest
  ...
- executorImagePullPolicy (use executor.imagePullPolicy in configmap instead). For example, a workflow controller configmap similar to the following is no longer valid:
data:
  ...
  executorImagePullPolicy: IfNotPresent
  ...
Change it as shown below:
data:
  ...
  executor: |
    imagePullPolicy: IfNotPresent
  ...
- executorResources (use executor.resources in configmap instead). For example, a workflow controller configmap similar to the following is no longer valid:
data:
  ...
  executorResources:
    requests:
      cpu: 0.1
      memory: 64Mi
    limits:
      cpu: 0.5
      memory: 512Mi
  ...
Change it as shown below:
data:
  ...
  executor: |
    resources:
      requests:
        cpu: 0.1
        memory: 64Mi
      limits:
        cpu: 0.5
        memory: 512Mi
  ...
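The three moves above follow the same pattern, so they can be sketched as one transformation. This is a hypothetical helper, not part of Argo: it assumes the configmap's data has already been parsed into nested Python dicts (in the real configmap, executor is a block of YAML text), and the function name `migrate_executor_config` is made up for illustration.

```python
import copy

# Removed top-level key -> key under `executor`, as listed above.
MOVED_KEYS = {
    "executorImage": "image",
    "executorImagePullPolicy": "imagePullPolicy",
    "executorResources": "resources",
}


def migrate_executor_config(config: dict) -> dict:
    """Return a copy of config with the removed keys moved under `executor`."""
    migrated = copy.deepcopy(config)
    for old_key, new_key in MOVED_KEYS.items():
        if old_key not in migrated:
            continue
        value = migrated.pop(old_key)
        executor = migrated.setdefault("executor", {})
        # Keep an existing executor.<key> value rather than overwriting it.
        executor.setdefault(new_key, value)
    return migrated
```

The input is left untouched, so you can compare the before and after dicts when reviewing the change.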
fce82d572 feat: Remove pod workers (#7837)¶
This PR removes pod workers from the code; the pod informer now writes directly into the workflow queue. As a result, the --pod-workers flag has been removed.
93c11a24ff feat: Add TLS to Metrics and Telemetry servers (#7041)¶
This PR adds the ability to send metrics over TLS with a self-signed certificate. In v3.5 this will be enabled by default, so it is recommended that users enable this functionality now.
0758eab11 feat(server)!: Sync dispatch of webhook events by default¶
This is not expected to impact users.
Event dispatch in the Argo Server has been changed from async to sync by default. This is so that errors are surfaced to the client, rather than only appearing as logs or Kubernetes events. It is possible that response times under load are too long for your client and you may prefer to revert this behaviour.
To revert this behaviour, restart Argo Server with ARGO_EVENT_ASYNC_DISPATCH=true. Make sure that asyncDispatch=true is logged.
bd49c6303 fix(artifact)!: default https to any URL missing a scheme. Fixes #6973¶
An HTTPArtifact URL without a scheme now defaults to https instead of http.
Users need to explicitly include an http:// prefix if they want to retrieve an HTTPArtifact over http.
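The rule can be pictured with a few lines of code. This is a sketch of the behaviour described above, not Argo's implementation; `with_default_scheme` is a hypothetical name, and the check for "://" is a simplification.

```python
def with_default_scheme(url: str, default: str = "https") -> str:
    """Prefix url with the default scheme when it does not specify one."""
    if "://" in url:
        return url  # An explicit scheme, such as http://, is kept as-is.
    return f"{default}://{url}"
```

Under this rule, `example.com/data.tgz` is fetched as `https://example.com/data.tgz`, while `http://example.com/data.tgz` is left unchanged.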
chore!: Remove the hidden flag --verify from argo submit¶
The hidden flag --verify has been removed from argo submit. This is an internal testing flag we don't need anymore.
Upgrading to v3.2¶
e5b131a33 feat: Add template node to pod name. Fixes #1319 (#6712)¶
This adds the template name to the pod name, to make it easier to understand which pod ran which step. This behaviour can be reverted by setting POD_NAMES=v1 on the workflow controller.
be63efe89 feat(executor)!: Change argoexec base image to alpine. Closes #5720 (#6006)¶
Changing from Debian to Alpine reduces the size of the argoexec image, resulting in faster-starting workflow pods, and it also reduces the risk of security issues. There is no such thing as a free lunch: there may be other behaviour changes we don't know of yet.
Some users found this change prevented workflows with very large parameters from running. See #7586.
48d7ad3 chore: Remove onExit naming transition scaffolding code (#6297)¶
When upgrading from <v2.12 to >v3.2, workflows that are running at the time of the upgrade and have onExit steps may experience the onExit step running twice. This is only applicable for workflows that began running before a workflow-controller upgrade and are still running after the upgrade is complete. This is only applicable for upgrading from v2.12 or earlier directly to v3.2 or later. Even under these conditions, duplicate work may not be experienced.
Upgrading to v3.1¶
3fff791e4 build!: Automatically add manifests to v* tags (#5880)¶
The manifests in the repository on the tag will no longer contain the image tag; instead they will contain :latest.
- You must not get your manifests from the Git repository, you must get them from the release notes.
- You must not use the stable tag. This is defunct, and will be removed in v3.1.
ab361667a feat(controller) Emissary executor. (#4925)¶
The Emissary executor is not a breaking change per se, but it is brand new, so we would not recommend you use it by default yet. Instead, we recommend you test it out on some workflows using a workflow-controller-configmap configuration.
# Specifies the executor to use.
#
# You can use this to:
# * Tailor your executor based on your preference for security or performance.
# * Test out an executor without committing yourself to use it for every workflow.
#
# To find out which executor was actually used, see the `wait` container logs.
#
# The list is in order of precedence; the first matching executor is used.
# This has precedence over `containerRuntimeExecutor`.
containerRuntimeExecutors: |
  - name: emissary
    selector:
      matchLabels:
        workflows.argoproj.io/container-runtime-executor: emissary
be63efe89 feat(controller): Expression template tags. Resolves #4548 & #1293 (#5115)¶
This PR introduced a new expression syntax known as "expression tag template". A user has reported that this does not always play nicely with the when condition syntax (govaluate).
This can be resolved using a single quote in your when expression:
when: "'{{inputs.parameters.should-print}}' != '2021-01-01'"
Upgrading to v3.0¶
defbd600e fix: Default ARGO_SECURE=true. Fixes #5607 (#5626)¶
The server now starts with TLS enabled by default if a key is available. The original behaviour can be configured with --secure=false.
If you have an ingress, you may need to add the appropriate annotations (varies by ingress):
alb.ingress.kubernetes.io/backend-protocol: HTTPS
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
01d310235 chore(server)!: Required authentication by default. Resolves #5206 (#5211)¶
To log in to the user interface, you must provide a login token. The original behaviour can be configured with --auth-mode=server.
f31e0c6f9 chore!: Remove deprecated fields (#5035)¶
Some fields that were deprecated in early 2020 have been removed.
Field | Action |
---|---|
template.template and template.templateRef | The workflow spec must be changed to use steps or DAG, otherwise the workflow will error. |
spec.ttlSecondsAfterFinished | Change to spec.ttlStrategy.secondsAfterCompletion, otherwise the workflow will not be garbage collected as expected. |
To find impacted workflows:
kubectl get wf --all-namespaces -o yaml | grep templateRef
kubectl get wf --all-namespaces -o yaml | grep ttlSecondsAfterFinished
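The grep commands above also match templateRef where it is still valid, for example inside steps. A more targeted check can be sketched as follows. It is a hypothetical helper, and it assumes that "template.template" and "template.templateRef" refer to fields set directly on an entry of spec.templates, with each workflow loaded as a Python dict (for example from the JSON output of kubectl).

```python
def find_removed_fields(workflow: dict) -> list[str]:
    """List uses of the removed fields in a single workflow manifest."""
    spec = workflow.get("spec", {})
    found = []
    if "ttlSecondsAfterFinished" in spec:
        found.append("spec.ttlSecondsAfterFinished")
    for tmpl in spec.get("templates", []):
        # Only fields set directly on the template are reported; a templateRef
        # nested inside steps or a DAG task is not touched by this check.
        for field in ("template", "templateRef"):
            if field in tmpl:
                found.append(f"spec.templates[{tmpl.get('name', '?')}].{field}")
    return found
```

An empty list means the workflow uses none of the removed fields under these assumptions.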
c8215f972 feat(controller)!: Key-only artifacts. Fixes #3184 (#4618)¶
This change is not breaking per se, but many users do not appear to be aware of artifact repository ref, so check your usage of that feature if you have problems.