Migrate From Data Injections

This guide explains how to migrate your Zarf deployment from the Data Injections feature to OCI image-based data delivery. Data Injections are deprecated in the current ZarfPackageConfig schema and will not be part of the next API version of the schema, v1beta1. Existing ZarfPackageConfig files that use Data Injections will remain valid after v1beta1 is introduced, but the v1beta1 schema itself will not include Data Injections.

There are several reasons for the deprecation of Data Injections:

Poor user experience: Many users have struggled to figure out how to adopt Data Injections.
Host dependency: Data Injections shell out to tar. This makes testing difficult and introduces differences across environments.
Ephemeral storage: Because Data Injections only load data during zarf package deploy, the data must be saved to persistent storage. Otherwise, any injected data is lost when a pod restarts.
Better alternatives: OCI images provide a Kubernetes-native solution for data delivery and fit neatly into the Zarf delivery paradigm.

Migration Guide

There are two recommended approaches to replacing Data Injections.

The first, and preferred, approach is the Kubernetes feature for read-only OCI image volumes. This feature is generally available as of Kubernetes 1.35 and is compatible with the Zarf agent as of v0.70.0. It provides a direct path for consuming container images as read-only data sources.

The second approach uses init containers to copy data into volumes. This approach has been widely adopted among Kubernetes users for years. It runs an init container from an OCI image that holds the required data. The init container copies the data to a shared volume, which the long-running container(s) in the pod can use.

Both approaches require packaging your data in an OCI image. They differ in the pod specification.

Step 1: Package Your Data in an OCI Image

First, create a container image containing your data:

# The init container approach requires a shell. For read-only OCI volumes, "FROM scratch" is sufficient.
FROM alpine:3.18
COPY your-data-file /your-data/your-data-file

Build and push this image:

docker build -t your-registry/your-data:tag .
docker push your-registry/your-data:tag

Step 2: Update zarf.yaml

Before (Data Injections):

kind: ZarfPackageConfig
metadata:
  name: data-injections
components:
  - name: my-app
    required: true
    images:
      - ghcr.io/my-app:1.0.0
      - alpine:3.18
    dataInjections:
      - source: my-folder
        target:
          namespace: my-app
          selector: app=my-app
          container: data-loader
          path: /data
        compress: true

After:

kind: ZarfPackageConfig
metadata:
  name: init-data-loading
components:
  - name: my-app
    required: true
    images:
      - ghcr.io/my-app:1.0.0
      - your-registry/your-data:tag  # Your container with your data files

Rather than storing the data in the package through data injections, the data is stored in the your-registry/your-data:tag image.
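For the package to also deploy the workload that consumes the data image, the component would typically reference the manifest updated in Step 3. The following is a minimal sketch, not a definitive layout. It assumes the manifest is saved as deployment.yaml next to zarf.yaml; that file name and the manifest entry name are hypothetical and do not come from the examples above.

```yaml
kind: ZarfPackageConfig
metadata:
  name: init-data-loading
components:
  - name: my-app
    required: true
    manifests:
      - name: my-app              # hypothetical manifest entry name
        namespace: my-app
        files:
          - deployment.yaml       # hypothetical path to the Step 3 manifest
    images:
      - ghcr.io/my-app:1.0.0
      - your-registry/your-data:tag  # Your container with your data files
```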

Step 3: Update Deployment Manifest

Before (Data Injections):

spec:
  template:
    spec:
      initContainers:
        - name: data-loader # Must match target.container in the dataInjections entry
          image: alpine:3.18
          command: ["sh", "-c"]
          args:
            - 'while [ ! -f /data/###ZARF_DATA_INJECTION_MARKER### ]; do echo "waiting for zarf data sync" && sleep 1; done; echo "we are done waiting!"'
          volumeMounts:
            - mountPath: /data
              name: data
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /data", # This will list all files copied by data injections
            ]
          volumeMounts:
            - mountPath: /data
              name: data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-data

After: if your cluster supports OCI image volumes (generally available as of Kubernetes 1.35 and compatible with the Zarf agent as of v0.70.0), the read-only volume approach is recommended. Otherwise, use the init container approach.

Read-Only Volume Approach

spec:
  template:
    spec:
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /mount-path", # This will list all files in the base directory of the your-registry/your-data:tag image.
            ]
          # ...
          volumeMounts:
            - name: data
              mountPath: /mount-path
      volumes:
        - name: data
          image:
            reference: your-registry/your-data:tag

OCI image volumes reduce the surface area of image execution, because the data image is mounted rather than run as a container. They are also more efficient, since the data is mounted directly into the pod instead of being copied to a volume by an init container.
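The Kubernetes image volume source also accepts an optional pullPolicy field alongside reference. The sketch below shows only the volume-related fields. The IfNotPresent value and the explicit readOnly flag on the mount are illustrative assumptions, not Zarf requirements.

```yaml
spec:
  template:
    spec:
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          volumeMounts:
            - name: data
              mountPath: /mount-path
              readOnly: true # Image volumes are mounted read-only; stated here for clarity
      volumes:
        - name: data
          image:
            reference: your-registry/your-data:tag
            pullPolicy: IfNotPresent # Assumed value; Always and Never are the other options
```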

Init Container Approach

spec:
  template:
    spec:
      initContainers:
        - name: data-loader
          image: your-registry/your-data:tag
          command: ["sh", "-c"]
          args:
            - |
              cp /your-data/your-data-file /data/my-app-data-location
          volumeMounts:
            - mountPath: /data
              name: data
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /data/my-app-data-location", # This will list the data file copied by the init container
            ]
          volumeMounts:
            - mountPath: /data
              name: data
      volumes:
        - name: data
          emptyDir: {}

With the init container approach, the data can live on ephemeral storage, such as an emptyDir volume, because the init container repopulates it from the container image each time the pod starts.
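The Data Injections example delivered a whole folder (my-folder) rather than a single file. A sketch of an init container that copies a directory instead, assuming the data image stores the folder contents under /your-data and provides a shell with cp (as the alpine base image does):

```yaml
initContainers:
  - name: data-loader
    image: your-registry/your-data:tag
    command: ["sh", "-c"]
    args:
      # Copy the directory contents, including hidden files, into the shared volume.
      - cp -a /your-data/. /data/
    volumeMounts:
      - mountPath: /data
        name: data
```

The trailing /. on the source path copies the contents of /your-data rather than creating a nested your-data directory inside /data.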

Need Help?

If these methods do not work for you for any reason, please comment on issue #3926.