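Passing Artifacts in Argo Workflows with OBS

Nearly every general-purpose workflow needs to pass files between steps. Argo Workflows supports this by connecting to external storage. This post uses Huawei Cloud as an example and walks through integrating Argo Workflows with its object storage service, OBS.

OBS-side configuration

First, create a bucket in the OBS service. Then, in the console under User -> My Credentials -> Access Keys, create an access key and download the credentials file. The file looks roughly like this: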

User Name,Access Key Id,Secret Access Key
"myusername",Y9C3WCABCDEFG,6bHX5eHIJKLMN

Argo Workflows configuration

Using the Access Key Id and Secret Access Key from the file, create a Kubernetes Secret in the namespace where the workflow runs. For example:

$ kubectl create secret generic s3-secret \
    --from-literal accessKey=Y9C3WCABCDEFG \
    --from-literal secretKey=6bHX5eHIJKLMN
...
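
If you prefer to keep the credentials in a manifest rather than create them imperatively, the same Secret can be sketched declaratively. This is a minimal sketch: the namespace default is an assumption taken from the CLI output shown later in this post, and the key values are the placeholder values from the credentials file above.

```yaml
# Declarative counterpart of the kubectl command above.
apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  # Assumption: replace with the namespace your workflows actually run in.
  namespace: default
type: Opaque
# stringData accepts plain text; Kubernetes stores it base64-encoded under data.
stringData:
  accessKey: Y9C3WCABCDEFG
  secretKey: 6bHX5eHIJKLMN
```

Apply it with kubectl apply -f, and avoid committing real key values to version control.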

Next, modify the Argo Workflows configuration to add artifact support:

  artifactRepository: |
    archiveLogs: true
    s3:
      endpoint: obs.[Region ID].myhuaweicloud.com
      bucket: [Bucket Name]
      region: cn-north-4
      insecure: false
      keyFormat: "my-artifacts\
        /{{workflow.creationTimestamp.Y}}\
        /{{workflow.creationTimestamp.m}}\
        /{{workflow.creationTimestamp.d}}\
        /{{workflow.name}}\
        /{{pod.name}}"

      accessKeySecret:
        name: s3-secret
        key: accessKey
      secretKeySecret:
        name: s3-secret
        key: secretKey
      useSDKCreds: false

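A brief explanation of the configuration above:

Pod logs are archived to OBS (archiveLogs: true).
The endpoint is the OBS endpoint of Huawei Cloud's CN North-Beijing4 region (cn-north-4).
The bucket field must reference the bucket created earlier.
Access goes over an encrypted connection (insecure: false).
The artifact key template is: my-artifacts/<workflow creation date as year/month/day>/<workflow name>/<name of the Pod running the step>/
The Access Key references the accessKey field of the Kubernetes Secret named s3-secret.
The Secret Key references the secretKey field of the Kubernetes Secret named s3-secret.

Add the content above to the workflow-controller-configmap in the namespace where Argo Workflows is installed.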
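
To show where the fragment sits, here is a sketch of the whole ConfigMap. The namespace argo and the bucket name my-argo-artifacts are assumptions for illustration, and the endpoint assumes the region ID is cn-north-4, matching the region field; substitute your own values.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  # Assumption: the namespace where the workflow controller is installed.
  namespace: argo
data:
  # The artifact repository settings are a YAML string under this key.
  artifactRepository: |
    archiveLogs: true
    s3:
      endpoint: obs.cn-north-4.myhuaweicloud.com
      # Hypothetical bucket name; use the bucket created in OBS.
      bucket: my-argo-artifacts
      region: cn-north-4
      insecure: false
      keyFormat: "my-artifacts/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{workflow.name}}/{{pod.name}}"
      accessKeySecret:
        name: s3-secret
        key: accessKey
      secretKeySecret:
        name: s3-secret
        key: secretKey
      useSDKCreds: false
```

The keyFormat here is written on a single line, which is equivalent to the backslash-continued form shown earlier.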

Starting a workflow

Try starting a workflow that uses artifacts. The manifest comes from https://argo-workflows.readthedocs.io/en/latest/walk-through/artifacts/.

The workflow defines two steps:

Generating the artifact

...
outputs:
  artifacts:
  # generate hello-art artifact from /tmp/hello_world.txt
  # artifacts can be directories as well as files
  - name: hello-art
    path: /tmp/hello_world.txt

The snippet above takes the contents of /tmp/hello_world.txt as an artifact and names it hello-art.

Reading the artifact

inputs:
  artifacts:
  # unpack the message input artifact
  # and put it at /tmp/message
  - name: message
    path: /tmp/message

This snippet takes the input artifact named message and unpacks it to /tmp/message.

At run time, the generated artifact is referenced as {{steps.generate-artifact.outputs.artifacts.hello-art}}.
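
The two fragments and the reference expression fit together roughly as follows. This is a sketch in the spirit of the linked walkthrough, not a verbatim copy: the step and template names follow the ones used in this post, while the container images and commands are assumptions. Consult the linked page for the authoritative manifest.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  entrypoint: artifact-example
  templates:
  - name: artifact-example
    steps:
    - - name: generate-artifact
        template: hello-world-to-file
    - - name: consume-artifact
        template: print-message-from-file
        arguments:
          artifacts:
          # Bind the input "message" to the output "hello-art" of the first step.
          - name: message
            from: "{{steps.generate-artifact.outputs.artifacts.hello-art}}"

  - name: hello-world-to-file
    container:
      image: busybox            # assumption: any image with a shell works
      command: [sh, -c]
      args: ["echo hello world | tee /tmp/hello_world.txt"]
    outputs:
      artifacts:
      - name: hello-art
        path: /tmp/hello_world.txt

  - name: print-message-from-file
    inputs:
      artifacts:
      - name: message
        path: /tmp/message
    container:
      image: busybox            # assumption
      command: [sh, -c]
      args: ["cat /tmp/message"]
```

The first step uploads hello-art to the configured repository, and the executor downloads it to /tmp/message before the second step's container starts.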

Running

After starting the workflow with the Argo CLI, you will see output similar to the following:

Name:                artifact-passing-mkn57
Namespace:           default
ServiceAccount:      argo-executor
Status:              Succeeded
...
STEP                       TEMPLATE                 PODNAME                                                    DURATION  MESSAGE
 ✔ artifact-passing-mkn57  artifact-example
 ├───✔ generate-artifact   hello-world-to-file      artifact-passing-mkn57-hello-world-to-file-551171166       8s
 └───✔ consume-artifact    print-message-from-file  artifact-passing-mkn57-print-message-from-file-1735545326  8s

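If you now go back to the OBS console, you will see that the bucket contains the artifact files, stored according to the keyFormat rule above, along with the related logs (*.log).

Other artifact-related features

Overriding the repository configuration

The configuration we put in the Workflow Controller config suits single-tenant scenarios. In multi-tenant scenarios, artifactRepositoryRef lets each workflow use its own artifact repository configuration (https://argo-workflows.readthedocs.io/en/latest/artifact-repository-ref/).

First, define the parameters for multiple storage backends in a ConfigMap, for example: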

apiVersion: v1
kind: ConfigMap
metadata:
  # If you want to use this config map by default, name it "artifact-repositories". Otherwise, you can provide a reference to a
  # different config map in `artifactRepositoryRef.configMap`.
  name: my-artifact-repository
  annotations:
    # v3.0 and after - if you want to use a specific key, put that key into this annotation.
    workflows.argoproj.io/default-artifact-repository: default-v1-s3-artifact-repository
data:
  default-v1-s3-artifact-repository: |
    s3:
...
  v2-s3-artifact-repository: |
...

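This YAML conveys several things:

To use this ConfigMap as the default source of artifact repository definitions, name it artifact-repositories.
Otherwise, the ConfigMap name must be set explicitly in artifactRepositoryRef.configMap.
From v3.0 onward, the workflows.argoproj.io/default-artifact-repository annotation selects which key in this ConfigMap is the default repository.
The data field defines two artifact repositories.

Then reference it in the Workflow: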

spec:
  artifactRepositoryRef:
    configMap: my-artifact-repository
    key: v2-s3-artifact-repository
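
To make the two-repository idea concrete, here is a hypothetical filled-in version. It is not the elided content of the ConfigMap above: the bucket names team-a-artifacts and team-b-artifacts, the endpoint region, and the reuse of s3-secret for both entries are all assumptions for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-artifact-repository
  annotations:
    # Key used when a workflow references this ConfigMap without naming a key.
    workflows.argoproj.io/default-artifact-repository: default-v1-s3-artifact-repository
data:
  default-v1-s3-artifact-repository: |
    s3:
      endpoint: obs.cn-north-4.myhuaweicloud.com
      bucket: team-a-artifacts      # hypothetical bucket
      accessKeySecret:
        name: s3-secret
        key: accessKey
      secretKeySecret:
        name: s3-secret
        key: secretKey
  v2-s3-artifact-repository: |
    s3:
      endpoint: obs.cn-north-4.myhuaweicloud.com
      bucket: team-b-artifacts      # hypothetical bucket
      accessKeySecret:
        name: s3-secret
        key: accessKey
      secretKeySecret:
        name: s3-secret
        key: secretKey
```

A workflow that sets key: v2-s3-artifact-repository would then write to the second bucket, while one that omits key falls back to the entry named in the annotation.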

Garbage collection

The garbage collection strategy for artifacts can be defined in the Workflow's spec.artifactGC. The available strategies include OnWorkflowCompletion and OnWorkflowDeletion.
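
A minimal sketch of how the strategies might be combined: a workflow-level default plus a per-artifact override. The workflow name, template name, image, and file paths are hypothetical, and this assumes an Argo Workflows version that supports artifact garbage collection and a storage driver that supports it (see the capability table below).

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-gc-
spec:
  entrypoint: main
  # Default for every artifact in this workflow: delete when the workflow is deleted.
  artifactGC:
    strategy: OnWorkflowDeletion
  templates:
  - name: main
    container:
      image: busybox            # assumption
      command: [sh, -c]
      args: ["echo temp > /tmp/temp.txt; echo keep > /tmp/keep.txt"]
    outputs:
      artifacts:
      - name: temp-file
        path: /tmp/temp.txt
        # Per-artifact override: delete as soon as the workflow completes.
        artifactGC:
          strategy: OnWorkflowCompletion
      - name: keep-file
        path: /tmp/keep.txt     # inherits OnWorkflowDeletion
```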

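Storage driver capabilities

S3 is not the only option. The storage drivers currently supported by Argo Workflows, and their capabilities, are listed below: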

(https://argo-workflows.readthedocs.io/en/latest/configure-artifact-repository/)

Name	Inputs	Outputs	Garbage Collection	Usage (Feb 2020)
Artifactory	Yes	Yes	No	11%
Azure Blob	Yes	Yes	Yes	-
GCS	Yes	Yes	Yes	-
Git	Yes	No	No	-
HDFS	Yes	Yes	No	3%
HTTP	Yes	Yes	No	2%
OSS	Yes	Yes	No	-
Raw	Yes	No	No	5%
S3	Yes	Yes	Yes	86%
