Migrate From Data Injections

This guide explains how to migrate your Zarf deployment from the Data Injections feature to OCI image-based data delivery methods. Data Injections are deprecated in the current ZarfPackageConfig schema and will not be part of the next API version of the schema, v1beta1. Existing ZarfPackageConfig files that use Data Injections will remain valid after v1beta1 is introduced.

There are several reasons for the deprecation of Data Injections:

  • Poor User Experience: Many users have struggled to figure out how to adopt Data Injections.
  • Host dependency: Data Injections shell out to tar. This makes testing difficult and introduces differences across environments.
  • Ephemeral Storage: Because Data Injections only load data during zarf package deploy, the data must be saved to persistent storage. Otherwise, any injected data is lost when a pod restarts.
  • Better alternatives: OCI images provide a Kubernetes-native solution for data delivery and fit neatly into the Zarf delivery paradigm.

There are two recommended approaches to replacing Data Injections.

The first and preferred approach is the Kubernetes read-only OCI image volume feature. This feature is generally available as of Kubernetes 1.35 and compatible with the Zarf agent as of v0.70.0. It provides a direct path for consuming container images as read-only data sources.

The second approach uses init containers to copy data into volumes. This approach has been widely adopted among Kubernetes users for years. It runs an init container from an OCI image that holds the required data. The init container copies the data to a shared volume that the long-running container(s) in the pod can use.

Migration for both of these approaches will require packaging an OCI image. Their implementation will differ in the pod specification.

First, create a container image containing your data:

# The init container approach requires a shell; for read-only OCI volumes, "FROM scratch" is sufficient.
FROM alpine:3.18
COPY your-data-file /your-data/your-data-file

Build and push this image:

docker build -t your-registry/your-data:tag .
docker push your-registry/your-data:tag

Before (Data Injections):

kind: ZarfPackageConfig
metadata:
  name: data-injections
components:
  - name: my-app
    required: true
    images:
      - ghcr.io/my-app:1.0.0
      - alpine:3.18
    dataInjections:
      - source: my-folder
        target:
          namespace: my-app
          selector: app=my-app
          container: data-loader
          path: /data
        compress: true

After:

kind: ZarfPackageConfig
metadata:
  name: init-data-loading
components:
  - name: my-app
    required: true
    images:
      - ghcr.io/my-app:1.0.0
      - your-registry/your-data:tag # Your container with your data files

Rather than storing the data in the package through data injections, the data is stored in the your-registry/your-data:tag image.
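
The workload manifest that consumes the data image can ship in the same component. The following is a minimal sketch, assuming the Deployment is saved next to the package definition in a hypothetical file named deployment.yaml and is deployed into the my-app namespace; neither the file name nor the manifests entry appears in the original example.

```yaml
kind: ZarfPackageConfig
metadata:
  name: init-data-loading
components:
  - name: my-app
    required: true
    manifests:
      - name: my-app
        namespace: my-app
        # Hypothetical file name; point this at your own Deployment manifest.
        files:
          - deployment.yaml
    images:
      - ghcr.io/my-app:1.0.0
      - your-registry/your-data:tag # Data image referenced by the manifest
```

Listing the data image under images means it is bundled into the package like any other image, so the manifest can reference it by the same name.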

Before (Data Injections):

spec:
  template:
    spec:
      initContainers:
        - name: data-injector
          image: alpine:3.18
          command: ["sh", "-c"]
          args:
            - 'while [ ! -f /data/###ZARF_DATA_INJECTION_MARKER### ]; do echo "waiting for zarf data sync" && sleep 1; done; echo "we are done waiting!"'
          volumeMounts:
            - mountPath: /data
              name: data
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /data", # This will list all files copied by data injections
            ]
          volumeMounts:
            - mountPath: /data
              name: data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-data

If your cluster supports OCI volume sources (GA in Kubernetes 1.35+, Zarf v0.70.0+), the read-only volume approach is recommended. Otherwise, use the init container approach.

After (read-only OCI volume):

spec:
  template:
    spec:
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /mount-path", # This will list all files in the base directory of the your-registry/your-data:tag image.
            ]
          ...
          volumeMounts:
            - name: data
              mountPath: /mount-path
      volumes:
        - name: data
          image:
            reference: your-registry/your-data:tag

OCI image volumes reduce the surface area of image execution and are more efficient, since the data is mounted directly into the pod rather than copied to a volume by an init container.
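
For reference, here is a sketch of a complete Deployment that uses an image volume. The metadata, labels, replica count, and pullPolicy value are assumptions added for illustration; only the volumes and volumeMounts entries come from the read-only volume example above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          volumeMounts:
            # The root of the data image appears under this path.
            - name: data
              mountPath: /mount-path
      volumes:
        - name: data
          image:
            reference: your-registry/your-data:tag
            # Assumed policy; Always and Never are the other accepted values.
            pullPolicy: IfNotPresent
```

With the Dockerfile shown earlier, the file is available to the application at /mount-path/your-data/your-data-file.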

After (init container):

spec:
  template:
    spec:
      initContainers:
        - name: data-loader
          image: your-registry/your-data:tag
          command: ["sh", "-c"]
          args:
            - |
              cp /your-data/your-data-file /data/my-app-data-location
          volumeMounts:
            - mountPath: /data
              name: data
      containers:
        - name: my-app
          image: "ghcr.io/my-app:1.0.0"
          command:
            [
              "sh",
              "-c",
              "ls -la /data/my-app-data-location", # This will list all files copied during the init container run
            ]
          volumeMounts:
            - mountPath: /data
              name: data
      volumes:
        - name: data
          emptyDir: {}

With the init container approach, the data can live in ephemeral storage, such as emptyDir, because the init container repopulates it from the container image on each pod restart.
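
If the application modifies the data and those changes must survive restarts, the init container can write to a PersistentVolumeClaim instead of emptyDir. The following is a minimal sketch, assuming the my-data claim from the Data Injections example still exists and that an existing copy should not be overwritten; the guard around cp is an illustration, not part of the original example.

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: data-loader
          image: your-registry/your-data:tag
          command: ["sh", "-c"]
          args:
            - |
              # Copy only when the target is missing, so later restarts keep any changes.
              if [ ! -e /data/my-app-data-location ]; then
                cp /your-data/your-data-file /data/my-app-data-location
              fi
          volumeMounts:
            - mountPath: /data
              name: data
      # containers: unchanged from the init container example
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-data
```

The trade-off is that a newer data image will not replace data already on the volume unless the guard is removed or the existing copy is deleted.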

If these methods do not work for you, please comment on issue #3926.