Migrate From Data Injections
This guide explains how to migrate your Zarf deployment from the Data Injections feature to OCI image-based data delivery methods. Data Injections are deprecated in the current ZarfPackageConfig schema and will not be part of the next API version of the schema, v1beta1. Existing ZarfPackageConfig files that use Data Injections will remain valid after v1beta1 is introduced.
There are several reasons for the deprecation of Data Injections:
- Poor User Experience: Many users have struggled to figure out how to adopt Data Injections.
- Host dependency: Data Injections shell out to tar. This makes testing difficult and introduces differences across environments.
- Ephemeral Storage: Because Data Injections only load data during zarf package deploy, the data must be saved to persistent storage. Otherwise, when a pod restarts any data injected will be lost.
- Better alternatives: OCI images provide a Kubernetes native solution for data delivery, and neatly fit into the Zarf delivery paradigm.
There are two recommended approaches to replacing Data Injections.
The first, and preferred, approach uses the Kubernetes feature for read-only OCI-based volumes. This feature is generally available as of Kubernetes 1.35 and compatible with the Zarf agent as of v0.70.0. It provides a direct path for consuming container images as read-only data sources.
The second approach uses init containers to migrate data to volumes. This approach has been widely adopted among Kubernetes users for years. It uses an init container from an OCI image that holds the required data. This data is then persisted to a common volume which the long running container(s) in the pod can use.
Migration for both of these approaches will require packaging an OCI image. Their implementation will differ in the pod specification.
First, create a container image containing your data:

    # The init containers approach requires a shell, for read only OCI volumes "FROM scratch" is sufficient.
    FROM alpine:3.18
    COPY your-data-file /your-data/your-data-file

Build and push this image:

    docker build -t your-registry/your-data:tag .
    docker push your-registry/your-data:tag

Before (Data Injections):

    kind: ZarfPackageConfig
    metadata:
      name: data-injections
    components:
      - name: my-app
        required: true
        images:
          - ghcr.io/my-app:1.0.0
          - alpine:3.18
        dataInjections:
          - source: my-folder
            target:
              namespace: my-app
              selector: app=my-app
              container: data-loader
              path: /data
            compress: true

After:

    kind: ZarfPackageConfig
    metadata:
      name: init-data-loading
    components:
      - name: my-app
        required: true
        images:
          - ghcr.io/my-app:1.0.0
          - your-registry/your-data:tag # Your container with your data files

Rather than storing the data in the package through data injections, the data is stored in the your-registry/your-data:tag image.
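As a rough sketch of moving an existing injection source into an image, the helper below writes a Dockerfile that copies a data folder (my-folder in the example above) into /your-data/. The function name write_data_dockerfile and its arguments are hypothetical and not part of Zarf or Docker; adjust the base image and target path to match your pod specification.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a Zarf or Docker command):
# write a Dockerfile that copies a data folder into an image.
#   $1 - build context directory (must contain the data folder)
#   $2 - data folder name inside the context, e.g. my-folder
#   $3 - base image; defaults to alpine:3.18, which provides the shell
#        needed by the init container approach. "scratch" is enough
#        for read-only OCI volumes.
write_data_dockerfile() {
  context_dir="$1"
  data_folder="$2"
  base_image="${3:-alpine:3.18}"

  # Refuse to write a Dockerfile that would fail at build time.
  if [ ! -d "$context_dir/$data_folder" ]; then
    echo "data folder not found: $context_dir/$data_folder" >&2
    return 1
  fi

  # COPY with a directory source copies the directory's contents.
  printf 'FROM %s\nCOPY %s/ /your-data/\n' "$base_image" "$data_folder" \
    > "$context_dir/Dockerfile"
}

# Example usage (not run here; requires Docker and registry access):
#   write_data_dockerfile . my-folder alpine:3.18
#   docker build -t your-registry/your-data:tag .
#   docker push your-registry/your-data:tag
```

Generating the Dockerfile from the same folder name used in dataInjections.source keeps the image contents aligned with what the injection previously delivered.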
Before (Data Injections):

    spec:
      template:
        spec:
          initContainers:
            - name: data-injector
              image: alpine:3.18
              command: ["sh", "-c"]
              args:
                - 'while [ ! -f /data/###ZARF_DATA_INJECTION_MARKER### ]; do echo "waiting for zarf data sync" && sleep 1; done; echo "we are done waiting!"'
              volumeMounts:
                - mountPath: /data
                  name: data
          containers:
            - name: my-app
              image: "ghcr.io/my-app:1.0.0"
              command:
                [
                  "sh",
                  "-c",
                  "ls -la /data", # This will list all files copied by data injections
                ]
              volumeMounts:
                - mountPath: /data
                  name: data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: my-data

If your cluster supports OCI volume sources (GA in Kubernetes 1.35+, Zarf v0.70.0+), the read-only volume approach is recommended. Otherwise, use the init container approach.
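One way to make this choice in a script is to compare the cluster's reported version against the Kubernetes 1.35 threshold stated above. The function below is a hypothetical helper, not a Zarf or kubectl feature. It inspects only the version string and does not confirm the Zarf version or container runtime support, so treat it as a first check rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of Zarf or kubectl):
# succeed if a Kubernetes version string is at least 1.35.
# Accepts forms such as "v1.35.0", "1.36.2", or "v1.35.1+k3s1".
supports_oci_volumes() {
  version="${1#v}"           # drop a leading "v"
  major="${version%%.*}"     # text before the first dot
  rest="${version#*.}"       # text after the first dot
  minor="${rest%%.*}"        # text before the next dot
  minor="${minor%%[!0-9]*}"  # strip a non-numeric suffix such as "+"

  # Reject anything that did not parse into two numbers.
  case "$major" in ''|*[!0-9]*) return 2 ;; esac
  case "$minor" in ''|*[!0-9]*) return 2 ;; esac

  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 35 ]; }
}

# Example usage (not run here; assumes kubectl and jq are installed):
#   server_version="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
#   if supports_oci_volumes "$server_version"; then
#     echo "use the read-only OCI volume approach"
#   else
#     echo "use the init container approach"
#   fi
```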
After (read-only OCI volume):

    spec:
      template:
        spec:
          containers:
            - name: my-app
              image: "ghcr.io/my-app:1.0.0"
              command:
                [
                  "sh",
                  "-c",
                  "ls -la /mount-path", # This will list all files in the base directory of the your-registry/your-data:tag image.
                ]
              # ...
              volumeMounts:
                - name: data
                  mountPath: /mount-path
          volumes:
            - name: data
              image:
                reference: your-registry/your-data:tag

OCI image volumes reduce the surface area of image execution and are more efficient, since the data is mounted directly into the pod rather than copied to a volume by an init container.
After (init container):

    spec:
      template:
        spec:
          initContainers:
            - name: data-loader
              image: your-registry/your-data:tag
              command: ["sh", "-c"]
              args:
                - |
                  cp /your-data/your-data-file /data/my-app-data-location
              volumeMounts:
                - mountPath: /data
                  name: data
          containers:
            - name: my-app
              image: "ghcr.io/my-app:1.0.0"
              command:
                [
                  "sh",
                  "-c",
                  "ls -la /data/my-app-data-location", # This will list all files copied during the init container run
                ]
              volumeMounts:
                - mountPath: /data
                  name: data
          volumes:
            - name: data
              emptyDir: {}

With the image-based approach, data can use ephemeral storage, such as emptyDir, since the init container repopulates the data from the container image on each pod restart.
If these methods do not work for you for any reason, please comment on issue #3926.