This Helm chart deploys the Platforma application to a Kubernetes cluster.
There are two recommended methods for installing the Platforma Helm chart.
Option 1 (recommended): install from the OCI registry. This is the preferred method for modern Helm versions (3.8+). It pulls the chart directly from the GitHub Container Registry.
# Replace <version> with the specific chart version you want to install
# Replace <namespace> with the target namespace
# Provide your custom values file with -f
helm install my-platforma oci://ghcr.io/milaboratory/platforma-helm-charts/platforma \
--version <version> \
--namespace <namespace> \
-f my-values.yaml
Option 2: install from the traditional Helm repository hosted on GitHub Pages.
1. Add the Helm Repository:
helm repo add platforma https://milaboratory.github.io/platforma-helm-charts
helm repo update
2. Install the Chart:
# You can search for available versions
helm search repo platforma/platforma --versions
# Install the chart (replace <version> with the desired chart version)
# Replace <namespace> with the target namespace
# Provide your custom values file with -f
helm install my-platforma platforma/platforma \
--version <version> \
--namespace <namespace> \
-f my-values.yaml
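Both commands pass a custom values file with -f. As a rough starting point, a minimal my-values.yaml might look like the sketch below; it assumes S3 primary storage and simply mirrors the storage and secret keys shown later in this document, with placeholder bucket, region, and secret names that you must replace.

# my-values.yaml -- minimal sketch; verify key paths against the chart's values.yaml
primaryStorage:
  s3:
    enabled: true
    url: "s3://my-bucket/primary/"   # placeholder bucket and prefix
    region: "eu-central-1"           # placeholder region
    secretRef:
      enabled: true
      name: my-s3-secret             # Kubernetes Secret holding the credentials
      keyKey: access-key
      secretKey: secret-key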
At a glance, the chart exposes the following configuration surface (a brief values sketch follows the list):
- Image: repository and tag (the tag defaults to the chart appVersion), pullPolicy, and imagePullSecrets.
- Service: the gRPC port comes from .listenOptions.port (default 6345); an optional HTTP Service is created only when primaryStorage.fs.enabled is true.
- Ingress: configurable host; the gRPC path is always added when the ingress is enabled, the HTTP path only if primaryStorage.fs.enabled is true.
- Probes: httpGet, tcpSocket, or grpc, separately configurable for liveness and readiness.
- Persistence: a single mainRoot PVC or split dbDir/workDir/packagesDir PVCs; optional logging PVC; optional FS data libraries; optional FS primary storage PVC.
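For orientation, the fragment below sketches how a couple of these settings might appear in a values file. The nesting of image and listenOptions shown here is an assumption (standard Helm conventions plus the key names mentioned above) and should be verified against the chart's values.yaml.

# Hypothetical values fragment -- key paths are assumptions, check values.yaml
image:
  tag: ""                  # empty tag falls back to the chart appVersion (assumed convention)
  pullPolicy: IfNotPresent
imagePullSecrets: []
listenOptions:
  port: 6345               # gRPC port exposed by the Service (default)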
Version 2.0.0 of this Helm chart introduces significant structural changes and is not backward-compatible with 1.x versions. A manual migration is required to upgrade existing releases while preserving data. The key change is the refactoring of the values.yaml file for better organization and clarity.
Backup Your Data: Before starting the migration, ensure you have a backup of your persistent volumes.
Prepare a Migration values.yaml:
You will need to create a new values file (migration-values.yaml) that maps your old configuration to the new structure. The primary goal is to reuse your existing PersistentVolumeClaims (PVCs) to avoid data loss.
Your existing PVCs typically follow this naming pattern:
<release-name>-platforma-database
<release-name>-platforma-work
<release-name>-platforma-softwareloader
Map Old Values to New Structure:
Here is an example of how to configure the persistence section in your migration-values.yaml to reuse your existing volumes:
# migration-values.yaml
persistence:
  dbDir:
    enabled: true
    createPvc: false  # Important: Set to false to use existing PVC
    existingClaim: "<release-name>-platforma-database"
    mountPath: "/db"
  workDir:
    enabled: true
    createPvc: false  # Important: Set to false to use existing PVC
    existingClaim: "<release-name>-platforma-work"
    mountPath: "/data/work"
  packagesDir:
    enabled: true
    createPvc: false  # Important: Set to false to use existing PVC
    existingClaim: "<release-name>-platforma-softwareloader"
    mountPath: "/storage/controllers/software-loader"
You must also port other custom configurations from your old values.yaml (e.g., image.tag, ingress, resources, primaryStorage, authOptions) to their new locations in the platforma structure.
Perform the Upgrade:
Run helm upgrade with your release name, the new chart version, and your migration values file.
helm upgrade <release-name> platforma/platforma --version 2.0.0 -f migration-values.yaml
You can pass licenses for Platforma (PL_LICENSE) and other integrated tools (MI_LICENSE) securely using Kubernetes Secrets and environment variables.
1. Create the Secret Resources
Create Kubernetes secrets to hold your license keys.
Using kubectl:
kubectl create secret generic pl-license-secret --from-literal=pl-license-key='your_pl_license_key_here'
kubectl create secret generic mi-license-secret --from-literal=mi-license-key='your_mi_license_key_here'
2. Reference the Secrets in values.yaml
Modify your values.yaml to reference these secrets. The chart will inject them as environment variables into the application container.
env:
  secretVariables:
    - name: PL_LICENSE
      secretKeyRef:
        name: pl-license-secret
        key: pl-license-key
    - name: MI_LICENSE
      secretKeyRef:
        name: mi-license-secret
        key: mi-license-key
Persistence is enabled by default and controlled under persistence:
- globalEnabled is the overall switch; behavior now depends on mainRoot.enabled vs split volumes.
- mainRoot: a single volume mounted at persistence.mainRoot.mountPath (default /data/platforma-data). When mainRoot.enabled: true, the split volumes below are ignored (a sketch follows this list).
- With mainRoot.enabled: false, three split volumes are used:
  - dbDir: RocksDB state
  - workDir: working directory
  - packagesDir: software packages
  For each, either set existingClaim or createPvc: true (plus size and, optionally, storageClass).
- When logging.destination is a dir:// path and logging.persistence.enabled is true, the chart mounts a PVC at logging.persistence.mountPath.
- dataLibrary.fs can create or reuse a PVC and is mounted at its path.
- Tip: set existingClaim to reuse an existing volume; otherwise set createPvc: true and specify size (and storageClass if needed).
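As a sketch of the single-volume mode, the fragment below enables mainRoot and lets the chart create its PVC. The sub-keys mirror the ones listed above for the split volumes (createPvc, size, storageClass, existingClaim); treat that mirroring as an assumption and confirm it against the chart's values.yaml.

persistence:
  mainRoot:
    enabled: true
    mountPath: "/data/platforma-data"  # default shown explicitly
    createPvc: true                    # let the chart create the PVC
    size: 100Gi                        # placeholder, size to your workload
    storageClass: "standard"           # optional, placeholder
    # existingClaim: "my-existing-pvc" # alternatively, reuse an existing volume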
For sensitive files like TLS certificates, S3 credentials, or the Platforma license file, this chart uses a secure mounting mechanism.
You can create secrets from files or literal values.
kubectl create secret generic ldap-cert-secret \
--from-file=tls.crt=./tls.crt \
--from-file=tls.key=./tls.key \
--from-file=ca.crt=./ca.crt
kubectl create secret generic platforma-license \
--from-file=license=./license.txt
kubectl create secret generic my-s3-secret \
--from-literal=access-key=AKIA... \
--from-literal=secret-key=abcd1234...
Reference the secrets in values.yaml under the appropriate section (e.g., authOptions.ldap.secretRef, mainOptions.licenseFile.secretRef, primaryStorage.s3.secretRef). The chart mounts the referenced secret as files into the pod (e.g., at /etc/platforma/secrets/ldap/), and the application is automatically configured to use these file paths.
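For illustration, a Platforma license file stored in the platforma-license secret created above might be referenced as follows. The sub-keys of mainOptions.licenseFile.secretRef (enabled, name, key) are assumptions modeled on the primaryStorage.s3.secretRef example later in this document; verify them against the chart's values.yaml.

mainOptions:
  licenseFile:
    secretRef:
      enabled: true            # assumed flag, mirrors primaryStorage.s3.secretRef
      name: platforma-license  # the Secret created above
      key: license             # the key inside that Secret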
This Helm chart provides flexible options for both primary and data library storage, allowing you to use S3, GCS, or a local filesystem (via PersistentVolumeClaims).
Primary storage is used for long-term storage of analysis results. Only one primary storage provider can be enabled at a time.
- S3: configure the primaryStorage.s3 section. You can provide credentials directly or reference a Kubernetes secret.
- GCS: configure primaryStorage.gcs, specifying the bucket URL, project ID, and service account.
- Filesystem: enable primaryStorage.fs (see examples/fs-primary.yaml for a complete example).

For example, to use GCS as primary storage:
primaryStorage:
  gcs:
    enabled: true
    url: "gs://my-gcs-bucket/primary-storage/"
    projectId: "my-gcp-project-id"
    serviceAccount: "my-gcs-service-account@my-gcp-project-id.iam.gserviceaccount.com"
Exactly one of primaryStorage.s3.enabled, primaryStorage.fs.enabled, or primaryStorage.gcs.enabled must be true. The chart validates this at render time and will fail if none or multiple are enabled.
Data libraries allow you to mount additional datasets into the application. You can configure multiple libraries of different types.
- S3 libraries: configure dataLibrary.s3 (an example appears after the Google Batch section below).
- GCS libraries: configure dataLibrary.gcs.
- Filesystem libraries: configure dataLibrary.fs, which will be provisioned using PVCs (a sketch follows this list).
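For a filesystem library, a sketch might look like the following. It assumes dataLibrary.fs uses the same list shape as dataLibrary.s3 and the same PVC options as the persistence section; all key names here are assumptions to check against the chart's values.yaml.

dataLibrary:
  fs:
    - id: "my-fs-library"                      # assumed, mirrors dataLibrary.s3 entries
      enabled: true
      path: "/data/libraries/my-fs-library"    # mount path inside the container
      createPvc: true                          # or set existingClaim to reuse a volume
      size: 50Gi                               # placeholder
      # storageClass: "standard"               # optional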
This chart supports integration with Google Batch for offloading job execution. This is useful for large-scale data processing tasks. To enable this, you need a shared filesystem (like NFS) that is accessible by both the Platforma pod and the Google Batch jobs. Google Cloud Filestore is a common choice for this.
Configuration:
The googleBatch section in values.yaml controls this integration.
- enabled: Set to true to enable Google Batch integration.
- storage: Specifies the mapping between a local path in the container and the shared NFS volume. The format is <local-path>=<nfs-uri>.
- project: Your Google Cloud Project ID.
- region: The GCP region where Batch jobs will run.
- serviceAccount: The email of the GCP service account that Google Batch jobs will use. This service account needs appropriate permissions for Batch and storage access.
- network / subnetwork: The VPC network and subnetwork for the Batch jobs.
- volumes: Configures the shared NFS volume. Provide EITHER existingClaim (reuse an existing PVC) OR storageClass + size (let the chart create a PVC). Set accessMode as needed (default ReadWriteMany).

Example Configuration:
googleBatch:
  enabled: true
  storage: "/data/platforma-data=nfs://10.0.0.2/fileshare"
  project: "my-gcp-project-id"
  region: "us-central1"
  serviceAccount: "batch-executor@my-gcp-project-id.iam.gserviceaccount.com"
  network: "projects/my-gcp-project-id/global/networks/default"
  subnetwork: "projects/my-gcp-project-id/regions/us-central1/subnetworks/default"
  volumes:
    enabled: true
    existingClaim: "my-filestore-pvc"  # or omit and set storageClass + size for dynamic provisioning
    accessMode: "ReadWriteMany"
    # storageClass: "filestore-rwx"
    # size: "1Ti"
This configuration assumes you have already created a Google Cloud Filestore instance and a corresponding PersistentVolumeClaim (my-filestore-pvc) in your Kubernetes cluster.
For reference, here is the S3 data library example mentioned above:

dataLibrary:
  s3:
    - id: "my-s3-library"
      enabled: true
      url: "s3://my-s3-bucket/path/to/library/"
      region: "us-east-1"
The chart offers flexible logging options configured via the logging.destination parameter in values.yaml.
- stream://stdout: Logs are sent to standard output (recommended for Kubernetes).
- stream://stderr: Logs are sent to standard error.
- dir:///path/to/logs: Logs are written to files in the specified directory. To persist logs, enable logging.persistence in values.yaml, which will create a PersistentVolumeClaim (PVC) to store the log files.

Example configuration:

logging:
  destination: "dir:///var/log/platforma"
  persistence:
    enabled: true
    size: 10Gi
    storageClass: "standard"
When deploying to a production environment, consider the following:
- Resources: set requests and limits in the resources section to ensure stable performance. For example:
resources:
  # Default (sane for small clusters/testing)
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 1000m
    memory: 2Gi
For production, consider increasing resources as needed, e.g.:
resources:
  limits:
    cpu: 8000m
    memory: 16Gi
  requests:
    cpu: 4000m
    memory: 8Gi
- Create a dedicated serviceAccount and link it to a cloud IAM role for secure access to cloud resources.
- Configure deployment.securityContext and podSecurityContext to run the application with the least required privileges (a hardening sketch follows this list).
- Enable ingress with a real TLS certificate.
- Define a networkPolicy to restrict traffic between pods for a more secure network posture.
- The -http Service exists only when primaryStorage.fs.enabled is true. The Ingress HTTP path is added only in that case. gRPC access is always via the main Service.
- When exposing gRPC through Traefik, the Service can be annotated to use h2c:
  service:
    annotations:
      traefik.ingress.kubernetes.io/service.serversscheme: "h2c"
- Adjust networkPolicy rules if your cluster enforces them.
- The container may run as root by default (runAsUser: 0). Consider hardening via deployment.securityContext and deployment.podSecurityContext to comply with cluster policies.
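As a sketch of such hardening, the fragment below drops root and privilege escalation. The UID/GID values are placeholders, the exact shape of deployment.securityContext and deployment.podSecurityContext is an assumption to validate against the chart's values.yaml, and some application features may still require root, so test before enforcing.

deployment:
  podSecurityContext:
    runAsNonRoot: true
    runAsUser: 10001                  # placeholder non-root UID
    runAsGroup: 10001
    fsGroup: 10001
  securityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: false     # keep writable until verified
    capabilities:
      drop: ["ALL"]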
Ready-to-use example values are provided under the examples/ directory:
examples/hetzner-s3.yaml
examples/aws-s3.yaml
examples/gke-gcs.yaml
examples/fs-primary.yaml
Important: Always review and adapt example files before deployment. Replace placeholders (bucket names, domains, storageClass, regions, service account emails, credentials) with values that match your environment and security policies.
To use S3 primary storage with credentials stored in a Kubernetes Secret, first create the secret:
kubectl create secret generic my-s3-secret \
--from-literal=access-key=AKIA... \
--from-literal=secret-key=abcd1234...
Then reference it in values.yaml:
primaryStorage:
  s3:
    enabled: true
    url: "s3://my-bucket/primary/"
    region: "eu-central-1"
    secretRef:
      enabled: true
      name: my-s3-secret
      keyKey: access-key
      secretKey: secret-key
IAM Integration for AWS EKS and GCP GKE:
When running on managed Kubernetes services like AWS EKS or GCP GKE, it is common practice to associate Kubernetes service accounts with cloud IAM roles for fine-grained access control. You can add the necessary annotations to the ServiceAccount created by this chart using the serviceAccount.annotations value.
AWS EKS Example (IAM Roles for Service Accounts - IRSA):
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/MyPlatformaIAMRole"
GCP GKE Example (Workload Identity):
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: "my-gcp-sa@my-gcp-project-id.iam.gserviceaccount.com"
When running on GKE with GCS/Batch or on EKS with S3, grant at least the following permissions to the cloud identity used by the chart.
Assign these roles to the GCP service account mapped via Workload Identity:
Attach an IAM policy similar to the following to the role mapped via IRSA. Substitute placeholders with your own values:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListEntireBucketAndMultipartActions",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::example-bucket-name"
    },
    {
      "Sid": "FullAccessUserSpecific",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:GetObjectAttributes",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket-name/user-demo",
        "arn:aws:s3:::example-bucket-name/user-demo/*"
      ]
    },
    {
      "Sid": "GetObjectCommonPrefixes",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectAttributes"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket-name/corp-library/*",
        "arn:aws:s3:::example-bucket-name/test-assets/*"
      ]
    }
  ]
}