Kubernetes#
Introduction#
A Helm chart for deploying recruIT on a Kubernetes cluster is available in the main repository's OCI registry and in the MIRACUM charts repository. The chart can be used to deploy the application as well as all dependencies required for it to run (OHDSI WebAPI, OHDSI Atlas, HAPI FHIR server). The chart also includes MailHog, a mock mail server for testing email notifications.
Using the default values provided with the chart, all dependencies are installed and all services are configured to use them.
Setup#
- Set up a Kubernetes cluster using your cloud provider of choice or in a local environment using minikube, KinD, or k3d.
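If you just want a quick local sandbox, a single-node KinD cluster is enough; a minimal sketch, assuming KinD and kubectl are installed (the cluster name is arbitrary). A more complete multi-node setup with ingress support is shown further below.
# create a throwaway single-node cluster for experimentation
kind create cluster --name recruit-sandbox
# confirm the node is Ready before installing the chart
kubectl get nodes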
Installation#
Deploy recruIT to a namespace called recruit by running:
helm install -n recruit \
--create-namespace \
--render-subchart-notes \
--set ohdsi.cdmInitJob.enabled=true \
recruit oci://ghcr.io/miracum/recruit/charts/recruit
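Before running the test suite, it can be useful to watch the pods start up; all pods should eventually reach the Running or Completed state (the CDM initialization job in particular can take a while):
kubectl get pods -n recruit --watch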
As a quick check to make sure everything is running correctly, you can use the following command to verify the readiness of all services:
$ helm test -n recruit recruit
NAME: recruit
LAST DEPLOYED: Wed May 4 21:45:06 2022
NAMESPACE: recruit
STATUS: deployed
REVISION: 1
TEST SUITE: recruit-fhirserver-test-endpoints
Last Started: Wed May 4 22:14:23 2022
Last Completed: Wed May 4 22:14:39 2022
Phase: Succeeded
TEST SUITE: recruit-ohdsi-test-connection
Last Started: Wed May 4 22:14:39 2022
Last Completed: Wed May 4 22:14:43 2022
Phase: Succeeded
TEST SUITE: recruit-test-health-probes
Last Started: Wed May 4 22:14:43 2022
Last Completed: Wed May 4 22:14:49 2022
Phase: Succeeded
NOTES:
1. Get the screening list URL by running these commands:
http://recruit-list.127.0.0.1.nip.io/
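If no ingress controller is available in your cluster, a port-forward is a quick alternative for reaching the screening list UI; a sketch assuming the release name recruit, a service named recruit-list, and a container port of 8080 (both are assumptions, check kubectl get svc -n recruit for the actual names and ports):
kubectl port-forward -n recruit svc/recruit-list 8080:8080
# then open http://localhost:8080/ in your browser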
Example installation of the recruIT chart with ingress support using KinD#
This section demonstrates how to install recruIT on your local machine using KinD, showcasing the following advanced features:
- create a multi-node Kubernetes cluster to demonstrate topology-zone aware pod spreading for high-availability deployments
- expose all user-facing services behind the NGINX ingress controller on nip.io domains that resolve to localhost
- enable and enforce the restricted Pod Security Standard to demonstrate security best practices followed by all components
- pre-load the OMOP CDM database with SynPUF-based sample data
First, create a new cluster with Ingress support:
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
PodSecurity: true
nodes:
- role: control-plane
image: docker.io/kindest/node:v1.26.0@sha256:45aa9ecb5f3800932e9e35e9a45c61324d656cf5bc5dd0d6adfc1b0f8168ec5f
kubeadmConfigPatches:
- |
kind: InitConfiguration
nodeRegistration:
kubeletExtraArgs:
node-labels: "ingress-ready=true"
extraPortMappings:
- containerPort: 80
hostPort: 80
protocol: TCP
- containerPort: 443
hostPort: 443
protocol: TCP
labels:
topology.kubernetes.io/zone: a
- role: worker
image: docker.io/kindest/node:v1.26.0@sha256:45aa9ecb5f3800932e9e35e9a45c61324d656cf5bc5dd0d6adfc1b0f8168ec5f
labels:
topology.kubernetes.io/zone: b
- role: worker
image: docker.io/kindest/node:v1.26.0@sha256:45aa9ecb5f3800932e9e35e9a45c61324d656cf5bc5dd0d6adfc1b0f8168ec5f
labels:
topology.kubernetes.io/zone: c
EOF
Install the NGINX ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.5.1/deploy/static/provider/kind/deploy.yaml
Wait until it's ready to process requests by running
kubectl wait --namespace ingress-nginx \
--for=condition=ready pod \
--selector=app.kubernetes.io/component=controller \
--timeout=90s
Create a namespace for the new installation and label it to enable and enforce the restricted Pod Security Standard:
kubectl create namespace recruit
kubectl label namespace recruit pod-security.kubernetes.io/enforce=restricted
kubectl label namespace recruit pod-security.kubernetes.io/enforce-version=v1.26
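You can verify that the labels were applied as intended:
kubectl get namespace recruit --show-labels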
Save the following as values-kind-recruit.yaml, or clone this repository and reference the file as -f docs/_snippets/values-kind-recruit.yaml.
The ohdsi.cdmInitJob.extraEnv option SETUP_SYNPUF=true means that the OMOP database will be initialized with SynPUF 1K sample patient data.
Documentation for all available chart options
You can find a complete description of all available chart configuration options here: https://github.com/miracum/charts/blob/master/charts/recruit/README.md#configuration
list:
resources:
requests:
memory: "128Mi"
cpu: "250m"
limits:
memory: "128Mi"
ingress:
enabled: true
hosts:
- host: recruit-list.127.0.0.1.nip.io
paths: ["/"]
fhirserver:
resources:
requests:
memory: "3Gi"
cpu: "2500m"
limits:
memory: "3Gi"
postgresql:
auth:
postgresPassword: fhir
ingress:
enabled: true
hosts:
- host: recruit-fhir-server.127.0.0.1.nip.io
paths: ["/"]
query:
resources:
requests:
memory: "1Gi"
cpu: "1000m"
limits:
memory: "1Gi"
webAPI:
dataSource: "SynPUF-CDMV5"
omop:
resultsSchema: synpuf_results
cdmSchema: synpuf_cdm
cohortSelectorLabels:
- "recruIT"
notify:
resources:
requests:
memory: "1Gi"
cpu: "1000m"
limits:
memory: "1Gi"
rules:
schedules:
everyMorning: "0 0 8 1/1 * ? *"
trials:
- acronym: "*"
subscriptions:
- email: "everything@example.com"
- acronym: "SAMPLE"
accessibleBy:
users:
- "user1"
- "user.two@example.com"
subscriptions:
- email: "everyMorning@example.com"
notify: "everyMorning"
mailhog:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "64Mi"
ingress:
enabled: true
hosts:
- host: recruit-mailhog.127.0.0.1.nip.io
paths:
- path: "/"
pathType: Prefix
ohdsi:
atlas:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "64Mi"
webApi:
resources:
requests:
memory: "4Gi"
cpu: "250m"
limits:
memory: "4Gi"
postgresql:
auth:
postgresPassword: ohdsi
primary:
resources:
limits:
memory: 4Gi
cpu: 2500m
requests:
memory: 256Mi
cpu: 250m
ingress:
enabled: true
hosts:
- host: recruit-ohdsi.127.0.0.1.nip.io
cdmInitJob:
enabled: false
ttlSecondsAfterFinished: ""
extraEnv:
- name: SETUP_SYNPUF
value: "true"
achilles:
schemas:
cdm: "synpuf_cdm"
vocab: "synpuf_cdm"
res: "synpuf_results"
sourceName: "SynPUF-CDMV5"
loadCohortDefinitionsJob:
enabled: false
cohortDefinitions:
- |
{
"name": "A sample cohort",
"description": "[acronym=SAMPLE] [recruIT] Sample Cohort containing only female patients older than 90 years.",
"expressionType": "SIMPLE_EXPRESSION",
"expression": {
"ConceptSets": [],
"PrimaryCriteria": {
"CriteriaList": [
{
"ObservationPeriod": {
"First": true
}
}
],
"ObservationWindow": {
"PriorDays": 0,
"PostDays": 0
},
"PrimaryCriteriaLimit": {
"Type": "First"
}
},
"QualifiedLimit": {
"Type": "First"
},
"ExpressionLimit": {
"Type": "First"
},
"InclusionRules": [
{
"name": "Older than 18",
"expression": {
"Type": "ALL",
"CriteriaList": [],
"DemographicCriteriaList": [
{
"Age": {
"Value": 90,
"Op": "gt"
},
"Gender": [
{
"CONCEPT_CODE": "F",
"CONCEPT_ID": 8532,
"CONCEPT_NAME": "FEMALE",
"DOMAIN_ID": "Gender",
"INVALID_REASON_CAPTION": "Unknown",
"STANDARD_CONCEPT_CAPTION": "Unknown",
"VOCABULARY_ID": "Gender"
}
]
}
],
"Groups": []
}
}
],
"CensoringCriteria": [],
"CollapseSettings": {
"CollapseType": "ERA",
"EraPad": 0
},
"CensorWindow": {},
"cdmVersionRange": ">=5.0.0"
}
}
And finally, run
helm install -n recruit \
--render-subchart-notes \
-f values-kind-recruit.yaml \
--set ohdsi.cdmInitJob.enabled=true \
--set ohdsi.loadCohortDefinitionsJob.enabled=true \
recruit oci://ghcr.io/miracum/recruit/charts/recruit
CDM init job
The included CDM initialization job is currently not idempotent and may cause problems if run multiple times. Once the job has completed, set ohdsi.cdmInitJob.enabled=false for any subsequent changes to the chart configuration. Similarly, set ohdsi.loadCohortDefinitionsJob.enabled=false to avoid creating duplicate cohort definitions.
The application stack is now deployed. You can wait for the OMOP CDM init job to finish by running the following command; note that it may take quite some time to complete.
kubectl wait job \
--namespace=recruit \
--for=condition=Complete \
--selector=app.kubernetes.io/component=cdm-init \
--timeout=1h
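Once the job has finished, you can also confirm that the ingress resources were created with the expected hostnames:
kubectl get ingress -n recruit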
At this point, all externally exposed services should be accessible:
| Service | Ingress URL |
| --- | --- |
| OHDSI Atlas | http://recruit-ohdsi.127.0.0.1.nip.io/atlas/ |
| recruIT Screening List | http://recruit-list.127.0.0.1.nip.io/ |
| HAPI FHIR Server | http://recruit-fhir-server.127.0.0.1.nip.io/ |
| MailHog | http://recruit-mailhog.127.0.0.1.nip.io/ |
The values-kind-recruit.yaml used to install the chart automatically loaded a sample cohort defined in the ohdsi.loadCohortDefinitionsJob.cohortDefinitions setting. If the CDM init job has completed and the query module has run at least once, you should see a notification email at http://recruit-mailhog.127.0.0.1.nip.io/, and the corresponding screening list is accessible at http://recruit-list.127.0.0.1.nip.io/.
To create additional studies, follow the Creating your first study guide using Atlas at http://recruit-ohdsi.127.0.0.1.nip.io/atlas/. Be sure to use [recruIT] as the special label instead of [UC1], as the values above override query.cohortSelectorLabels[0]=recruIT.
Metrics#
All modules expose metrics in Prometheus format (see Observability). The chart makes it easy to scrape these metrics by integrating with the widely used Prometheus Operator:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install --create-namespace -n monitoring kube-prometheus-stack prometheus-community/kube-prometheus-stack
You can now update your release by combining the values-kind-recruit.yaml from above with the following, saved as values-kind-recruit-enable-servicemonitors.yaml:
list:
metrics:
serviceMonitor:
enabled: true
additionalLabels:
release: kube-prometheus-stack
query:
metrics:
serviceMonitor:
enabled: true
additionalLabels:
release: kube-prometheus-stack
notify:
metrics:
serviceMonitor:
enabled: true
additionalLabels:
release: kube-prometheus-stack
fhirserver:
metrics:
serviceMonitor:
enabled: true
additionalLabels:
release: kube-prometheus-stack
ohdsi:
webApi:
metrics:
serviceMonitor:
enabled: true
additionalLabels:
release: kube-prometheus-stack
helm upgrade -n recruit \
-f values-kind-recruit.yaml \
-f values-kind-recruit-enable-servicemonitors.yaml \
recruit oci://ghcr.io/miracum/recruit/charts/recruit
Opening the Grafana instance included with the kube-prometheus-stack chart will allow you to query the exposed metrics.
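If Grafana is not exposed via an ingress, a port-forward is sufficient for a quick look; a sketch assuming the default service name and port created by the kube-prometheus-stack release installed above:
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
# then open http://localhost:3000/ (the chart's default credentials are admin / prom-operator)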
High-Availability#
The FHIR server, the screening list, and the notification module support running with multiple replicas to ensure high availability in case of individual component failures.
Scaling up the notification module requires a backend database for persistence to avoid sending duplicate emails. Setting notify.ha.enabled=true and postgresql.enabled=true in the values will deploy an integrated PostgreSQL database for the notification module. See the options under the notify.ha.database key for specifying a custom database to use.
The snippet below configures the release to run multiple replicas of every service that supports it, enables pod disruption budget resources, and uses pod topology spread constraints to spread the pods across node topology zones.
For information on setting up recruIT with highly-available PostgreSQL clusters provided by CloudNativePG, see below.
notify:
replicaCount: 2
podDisruptionBudget:
enabled: true
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: recruit
# note that this label depends on the name of the chart release
# this assumes the chart is deployed with a name of `recruit`
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: notify
ha:
enabled: true
list:
replicaCount: 2
podDisruptionBudget:
enabled: true
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: recruit
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: list
postgresql:
enabled: true
auth:
postgresPassword: recruit-notify-ha
ohdsi:
atlas:
replicaCount: 2
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: ohdsi
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: atlas
fhirserver:
replicaCount: 2
podDisruptionBudget:
enabled: true
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: fhirserver
app.kubernetes.io/instance: recruit
fhir-pseudonymizer:
replicaCount: 2
podDisruptionBudget:
enabled: true
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: fhir-pseudonymizer
app.kubernetes.io/instance: recruit
vfps:
enabled: true
replicaCount: 2
podDisruptionBudget:
enabled: true
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: vfps
app.kubernetes.io/instance: recruit
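After upgrading the release with these values, you can verify that the replicas were actually spread across the three zones by listing the pods together with the nodes they were scheduled on:
kubectl get pods -n recruit -o wide
kubectl get nodes --label-columns=topology.kubernetes.io/zone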
Service mesh integration#
The application can be integrated with a service mesh, both for observability and to secure service-to-service communication via mTLS.
Linkerd#
The following values-kind-recruit-linkerd.yaml shows how to configure the chart release to add Linkerd's linkerd.io/inject: enabled annotation to all service pods (excluding pods created by Jobs):
podAnnotations:
linkerd.io/inject: "enabled"
postgresql:
primary:
service:
annotations:
config.linkerd.io/opaque-ports: "5432"
ohdsi:
postgresql:
primary:
service:
annotations:
config.linkerd.io/opaque-ports: "5432"
atlas:
podAnnotations:
linkerd.io/inject: "enabled"
webApi:
podAnnotations:
linkerd.io/inject: "enabled"
fhirserver:
postgresql:
primary:
service:
annotations:
config.linkerd.io/opaque-ports: "5432"
podAnnotations:
linkerd.io/inject: "enabled"
mailhog:
automountServiceAccountToken: true
podAnnotations:
linkerd.io/inject: "enabled"
service:
annotations:
config.linkerd.io/opaque-ports: "1025"
You can also set the linkerd.io/inject: enabled annotation on the recruit namespace (see https://linkerd.io/2.11/features/proxy-injection/), but you will then have to manually add a linkerd.io/inject: disabled annotation to the OHDSI Achilles CronJob and the init job.
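A sketch of the namespace-level approach, assuming the Linkerd control plane is already installed; the opt-out annotations can be set via the ohdsi.achilles.podAnnotations and ohdsi.cdmInitJob.podAnnotations values shown in the Istio section below:
kubectl annotate namespace recruit linkerd.io/inject=enabled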
Istio#
Add a namespace label to instruct Istio to automatically inject Envoy sidecar proxies when you deploy your application later:
kubectl label namespace recruit istio-injection=enabled
The following values.yaml configures the ingress resources to use the Istio ingress class; the commented-out section shows how to disable sidecar proxy injection for the Achilles CronJob and the OMOP CDM init job:
# ohdsi:
# cdmInitJob:
# podAnnotations:
# sidecar.istio.io/inject: "false"
# achilles:
# podAnnotations:
# sidecar.istio.io/inject: "false"
mailhog:
automountServiceAccountToken: true
ingress:
annotations:
kubernetes.io/ingress.class: istio
list:
ingress:
annotations:
kubernetes.io/ingress.class: istio
fhirserver:
ingress:
annotations:
kubernetes.io/ingress.class: istio
ohdsi:
ingress:
annotations:
kubernetes.io/ingress.class: istio
hosts:
- host: recruit-ohdsi.127.0.0.1.nip.io
pathType: Prefix
Zero-trust networking#
To limit the communication between the components, you can deploy Kubernetes NetworkPolicy resources. Because the details of a deployment can differ significantly (external databases, dependencies spread across several namespaces, etc.), no generic NetworkPolicy resources are included in the Helm chart. Instead, the following policies and explanations should provide a starting point for customization.
The policies are based on these assumptions:
- the recruit application is deployed in a namespace called recruit
- the OHDSI stack is deployed in a namespace called ohdsi
- the SMTP server is running on a host outside the cluster at IP 192.0.2.1 and port 1025
- the Prometheus monitoring stack is deployed in a namespace called monitoring
You can use https://editor.cilium.io/ to visualize and edit individual policies, or https://orca.tufin.io/netpol/# to get an explanation of the complete set of policies.
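Pods that are not selected by any of the policies below remain unrestricted, so a namespace-wide default-deny baseline is a common companion; note that every other workload in the namespace (Atlas, MailHog, the databases, etc.) then also needs matching allow rules. A minimal sketch:
cat <<EOF | kubectl apply -n recruit -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  # an empty podSelector matches every pod in the namespace
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF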
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: fhir-server-policy
spec:
podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/name: fhirserver
ingress:
# all modules are allowed to communicate with
# the FHIR server
- from:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: list
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: notify
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: query
ports:
- port: http
- from:
# allow the FHIR server to be scraped by the Prometheus stack
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
podSelector:
matchLabels:
app.kubernetes.io/instance: kube-prometheus-stack-prometheus
ports:
- port: metrics
# allow the FHIR server to be accessed via the NGINX Ingress
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ingress-nginx
podSelector:
matchLabels:
app.kubernetes.io/name: ingress-nginx
ports:
- port: http
egress:
# for subscriptions to work, the FHIR server must be allowed to
# initiate connections to the notify module
- to:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: notify
ports:
- port: http
# allow the server access to its own database
- to:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: primary
app.kubernetes.io/name: fhir-server-postgres
ports:
- port: tcp-postgresql
# allow DNS lookups
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- port: 53
protocol: UDP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: list-policy
spec:
podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: list
ingress:
- from:
# allow the list module to be scraped by the Prometheus stack
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
podSelector:
matchLabels:
app.kubernetes.io/instance: kube-prometheus-stack-prometheus
# allow the list module to be accessed via the NGINX Ingress
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ingress-nginx
podSelector:
matchLabels:
app.kubernetes.io/name: ingress-nginx
ports:
- port: http
egress:
# allow the list module to initiate connections to the FHIR server
# for querying screening lists
- to:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/name: fhirserver
ports:
- port: http
# allow DNS lookups
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- port: 53
protocol: UDP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: query-policy
spec:
podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: query
ingress:
# allow the query module to be scraped by the Prometheus stack
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
podSelector:
matchLabels:
app.kubernetes.io/instance: kube-prometheus-stack-prometheus
ports:
- port: http-metrics
egress:
# allow the query module to initiate connections to the FHIR server
# to transmit FHIR resources
- to:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/name: fhirserver
ports:
- port: http
# allow the query module to initiate connections to the OHDSI WebAPI
# in the ohdsi namespace
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ohdsi
podSelector:
matchLabels:
app.kubernetes.io/instance: ohdsi
app.kubernetes.io/component: webapi
ports:
- port: http
# allow the query module to initiate connections to the OHDSI PostgreSQL DB
# in the ohdsi namespace
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: ohdsi
podSelector:
matchLabels:
app.kubernetes.io/name: postgresql
app.kubernetes.io/instance: ohdsi
app.kubernetes.io/component: primary
ports:
- port: tcp-postgresql
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- port: 53
protocol: UDP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: notify-policy
spec:
podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: notify
ingress:
# allow the notify module to be scraped by the Prometheus stack
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
podSelector:
matchLabels:
app.kubernetes.io/instance: kube-prometheus-stack-prometheus
ports:
- port: http-metrics
# allow the notify module to receive subscription invocations from the FHIR server
- from:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/name: fhirserver
ports:
- port: http
egress:
# allow the notify module to initiate connections to the FHIR server
- to:
- podSelector:
matchLabels:
app.kubernetes.io/instance: recruit
app.kubernetes.io/name: fhirserver
ports:
- port: http
# allow the notify module to access the SMTP server at
# 192.0.2.1. The `32` subnet prefix length limits egress
# to just this one address
- to:
- ipBlock:
cidr: 192.0.2.1/32
ports:
- protocol: TCP
port: 1025
# allow the notify module to initiate connections to its PostgreSQL db
# in case of HA
- to:
- podSelector:
matchLabels:
app.kubernetes.io/name: recruit-postgres
app.kubernetes.io/instance: recruit
app.kubernetes.io/component: primary
ports:
- port: tcp-postgresql
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- port: 53
protocol: UDP
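Assuming you saved all of the policies above to a single file, for example recruit-network-policies.yaml (the name is arbitrary), apply them to the application namespace:
kubectl apply -n recruit -f recruit-network-policies.yaml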
Distributed Tracing#
All services support distributed tracing based on OpenTelemetry.
For testing, you can install the Jaeger operator to prepare your cluster for tracing.
# Cert-Manager is required by the Jaeger Operator
# See <https://cert-manager.io/docs/installation/> for details.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.9.1/cert-manager.yaml
kubectl wait --namespace cert-manager \
--for=condition=ready pod \
--selector=app.kubernetes.io/instance=cert-manager \
--timeout=5m
kubectl create namespace observability
kubectl create -n observability -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.38.0/jaeger-operator.yaml
kubectl wait --namespace observability \
--for=condition=ready pod \
--selector=name=jaeger-operator \
--timeout=5m
cat <<EOF | kubectl apply -n observability -f -
# simple, all-in-one Jaeger installation. Not suitable for production use.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simplest
EOF
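To inspect the collected traces, you can port-forward the Jaeger query UI; a sketch assuming the all-in-one instance above is named simplest, for which the operator creates a simplest-query service:
kubectl port-forward -n observability svc/simplest-query 16686:16686
# then open http://localhost:16686/ to browse traces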
The following values enable tracing for the query, list, and notify module, the HAPI FHIR server and the OHDSI WebAPI:
query:
extraEnv:
- name: JAVA_TOOL_OPTIONS
value: "-javaagent:/app/opentelemetry-javaagent.jar"
- name: OTEL_METRICS_EXPORTER
value: "none"
- name: OTEL_LOGS_EXPORTER
value: "none"
- name: OTEL_TRACES_EXPORTER
value: "jaeger"
- name: OTEL_SERVICE_NAME
value: "recruit-query"
- name: OTEL_EXPORTER_JAEGER_ENDPOINT
value: "http://simplest-collector.observability.svc:14250"
list:
extraEnv:
- name: TRACING_ENABLED
value: "true"
- name: OTEL_TRACES_EXPORTER
value: "jaeger"
- name: OTEL_SERVICE_NAME
value: "recruit-list"
- name: OTEL_EXPORTER_JAEGER_AGENT_HOST
value: "simplest-agent.observability.svc"
notify:
extraEnv:
- name: JAVA_TOOL_OPTIONS
value: "-javaagent:/app/opentelemetry-javaagent.jar"
- name: OTEL_METRICS_EXPORTER
value: "none"
- name: OTEL_LOGS_EXPORTER
value: "none"
- name: OTEL_TRACES_EXPORTER
value: "jaeger"
- name: OTEL_SERVICE_NAME
value: "recruit-notify"
- name: OTEL_EXPORTER_JAEGER_ENDPOINT
value: "http://simplest-collector.observability.svc:14250"
fhirserver:
extraEnv:
# the recruit tool relies on the FHIR server subscription mechanism to create notifications.
# if you overwrite `fhirserver.extraEnv`, make sure to keep this setting enabled.
- name: HAPI_FHIR_SUBSCRIPTION_RESTHOOK_ENABLED
value: "true"
- name: SPRING_FLYWAY_BASELINE_ON_MIGRATE
value: "true"
# OTel options
- name: JAVA_TOOL_OPTIONS
value: "-javaagent:/app/opentelemetry-javaagent.jar"
- name: OTEL_METRICS_EXPORTER
value: "none"
- name: OTEL_LOGS_EXPORTER
value: "none"
- name: OTEL_TRACES_EXPORTER
value: "jaeger"
- name: OTEL_SERVICE_NAME
value: "recruit-hapi-fhir-server"
- name: OTEL_EXPORTER_JAEGER_ENDPOINT
value: "http://simplest-collector.observability.svc:14250"
fhir-pseudonymizer:
extraEnv:
- name: Tracing__Enabled
value: "true"
- name: Tracing__ServiceName
value: "recruit-fhir-pseudonymizer"
- name: Tracing__Jaeger__AgentHost
value: "simplest-agent.observability.svc"
vfps:
extraEnv:
- name: Tracing__IsEnabled
value: "true"
- name: Tracing__ServiceName
value: "recruit-vfps"
- name: Tracing__Jaeger__AgentHost
value: "simplest-agent.observability.svc"
ohdsi:
webApi:
tracing:
enabled: true
jaeger:
protocol: "grpc"
endpoint: http://simplest-collector.observability.svc:14250
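Assuming these values are saved as values-tracing.yaml (any file name works), apply them on top of the existing configuration with a regular upgrade:
helm upgrade -n recruit \
-f values-kind-recruit.yaml \
-f values-tracing.yaml \
recruit oci://ghcr.io/miracum/recruit/charts/recruit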
Screening List De-Pseudonymization#
Info
Requires version 9.3.0 or later of the recruIT Helm chart.
You can optionally deploy both the FHIR Pseudonymizer and Vfps as a pseudonym service backend to allow for de-pseudonymizing patient and visit identifiers stored in OMOP or the FHIR server prior to displaying them on the screening list.
The background is detailed in De-Pseudonymization.
The following values.yaml enables the included FHIR Pseudonymizer and Vfps as a pseudonym service. When Vfps is installed, it uses another PostgreSQL database which is initially empty and does not contain any pre-defined namespaces or pseudonyms. It is up to the user to pseudonymize the resources stored inside the FHIR server used by the screening list.
list:
dePseudonymization:
enabled: true
fhir-pseudonymizer:
enabled: true
auth:
apiKey:
# enable requiring an API key placed in the `x-api-key` header to
# authenticate against the fhir-pseudonymizer's `/fhir/$de-pseudonymize`
# endpoint.
enabled: true
# the API key required to be set when the list module invokes
# the FHIR Pseudonymizer's `$de-pseudonymize` endpoint.
# Note: instead of storing the key in plaintext in the values.yaml,
# you might want to leverage the `existingSecret` option instead.
key: "demo-secret-api-key"
# the values below are the default values defined in <https://github.com/miracum/charts/blob/master/charts/recruit/values.yaml>
pseudonymizationService: Vfps
vfps:
enabled: true
postgresql:
enabled: true
auth:
database: vfps
postgresPassword: vfps
CloudNativePG for HA databases#
Install the CloudNativePG operator first, following the official documentation:
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.18/releases/cnpg-1.18.0.yaml
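Before creating any Cluster resources, wait for the operator deployment to become available; a sketch assuming the default namespace and deployment name used by the CloudNativePG manifests:
kubectl wait --namespace cnpg-system \
--for=condition=Available deployment/cnpg-controller-manager \
--timeout=5m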
Next, create PostgreSQL clusters and pre-configured users for OHDSI, the HAPI FHIR server, the Vfps pseudonymization service, and the notify module. Save the following as cnpg-clusters.yaml:
---
apiVersion: v1
kind: Secret
metadata:
name: recruit-ohdsi-db-app-user
type: kubernetes.io/basic-auth
stringData:
password: recruit-ohdsi
username: ohdsi
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: recruit-ohdsi-db
spec:
instances: 3
primaryUpdateStrategy: unsupervised
replicationSlots:
highAvailability:
enabled: true
storage:
size: 64Gi
bootstrap:
initdb:
database: ohdsi
owner: ohdsi
secret:
name: recruit-ohdsi-db-app-user
---
apiVersion: v1
kind: Secret
metadata:
name: recruit-fhir-server-db-app-user
type: kubernetes.io/basic-auth
stringData:
password: recruit-fhir-server
username: fhir_server_user
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: recruit-fhir-server-db
spec:
instances: 3
primaryUpdateStrategy: unsupervised
replicationSlots:
highAvailability:
enabled: true
storage:
size: 64Gi
bootstrap:
initdb:
database: fhir_server
owner: fhir_server_user
secret:
name: recruit-fhir-server-db-app-user
---
apiVersion: v1
kind: Secret
metadata:
name: vfps-db-app-user
type: kubernetes.io/basic-auth
stringData:
password: vfps
username: vfps_user
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: vfps-db
spec:
instances: 3
primaryUpdateStrategy: unsupervised
replicationSlots:
highAvailability:
enabled: true
storage:
size: 64Gi
bootstrap:
initdb:
database: vfps
owner: vfps_user
secret:
name: vfps-db-app-user
---
apiVersion: v1
kind: Secret
metadata:
name: recruit-notify-db-app-user
type: kubernetes.io/basic-auth
stringData:
password: notify
username: notify_user
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: recruit-notify-db
spec:
instances: 3
primaryUpdateStrategy: unsupervised
replicationSlots:
highAvailability:
enabled: true
storage:
size: 64Gi
bootstrap:
initdb:
database: notify_jobstore
owner: notify_user
secret:
name: recruit-notify-db-app-user
kubectl apply -f cnpg-clusters.yaml
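The clusters take a few minutes to initialize; you can watch their status until all instances report ready (this assumes the clusters were created in the same namespace as the chart release, so that hostnames such as recruit-ohdsi-db-rw resolve from the application pods):
kubectl get clusters.postgresql.cnpg.io -n recruit --watch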
Finally, install the recruIT chart using the following updated values.yaml:
ohdsi:
postgresql:
enabled: false
webApi:
db:
host: "recruit-ohdsi-db-rw"
port: 5432
database: "ohdsi"
username: "ohdsi"
password: ""
existingSecret: "recruit-ohdsi-db-app-user"
existingSecretKey: "password"
schema: "ohdsi"
fhirserver:
postgresql:
enabled: false
externalDatabase:
host: "recruit-fhir-server-db-rw"
port: 5432
database: "fhir_server"
user: "fhir_server_user"
password: ""
existingSecret: "recruit-fhir-server-db-app-user"
existingSecretKey: "password"
notify:
ha:
enabled: true
database:
host: "recruit-notify-db-rw"
port: 5432
username: "notify_user"
password: ""
name: "notify_jobstore"
existingSecret:
name: "recruit-notify-db-app-user"
key: "password"
postgresql:
enabled: false
fhir-pseudonymizer:
enabled: true
vfps:
postgresql:
enabled: false
database:
host: "vfps-db-rw"
port: 5432
database: "vfps"
username: "vfps_user"
password: ""
existingSecret: "vfps-db-app-user"
existingSecretKey: "password"
schema: "vfps"
Running the query module using Argo Workflows#
By default, the query module runs on a dedicated schedule. As of version 10.1.0, the module can also be configured to run as a one-shot container. This is useful when integrating with existing containerized workflows, e.g. using Airflow or Argo Workflows.
Below you can find an example of running the query module as part of a larger workflow:
# yaml-language-server: $schema=https://raw.githubusercontent.com/argoproj/argo-workflows/v3.4.3/api/jsonschema/schema.json
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: recruit-query-workflow-
spec:
entrypoint: full-run
templates:
- name: omop-cdm-etl
container:
image: docker.io/docker/whalesay@sha256:178598e51a26abbc958b8a2e48825c90bc22e641de3d31e18aaf55f3258ba93b
command: [cowsay]
args: ["Running ETL Job from source to the OMOP CDM database"]
securityContext:
readOnlyRootFilesystem: true
runAsUser: 65532
runAsGroup: 65532
seccompProfile:
type: RuntimeDefault
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
privileged: false
runAsNonRoot: true
- name: ohdsi-achilles
# run for at most 1 hour before timing out to make sure the query module will run eventually
activeDeadlineSeconds: "3600"
container:
image: docker.io/ohdsi/broadsea-achilles:sha-bccd396@sha256:a881063aff6200d0d368ec30eb633381465fb8aa15e7d7138b7d48b6256a6feb
env:
- name: ACHILLES_DB_URI
value: >-
postgresql://broadsea-atlasdb:5432/postgres?ApplicationName=recruit-ohdsi-achilles
- name: ACHILLES_DB_USERNAME
value: postgres
- name: ACHILLES_DB_PASSWORD
valueFrom:
secretKeyRef:
name: recruit-ohdsi-webapi-db-secret
key: postgres-password
- name: ACHILLES_CDM_SCHEMA
value: demo_cdm
- name: ACHILLES_VOCAB_SCHEMA
value: demo_cdm
- name: ACHILLES_RES_SCHEMA
value: demo_cdm_results
- name: ACHILLES_CDM_VERSION
value: "5.3"
- name: ACHILLES_SOURCE
value: EUNOMIA
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
privileged: false
runAsNonRoot: true
runAsUser: 10001
runAsGroup: 10001
readOnlyRootFilesystem: true
seccompProfile:
type: RuntimeDefault
volumeMounts:
- name: achilles-workspace-volume
mountPath: /opt/achilles/workspace
- name: r-tempdir-volume
mountPath: /tmp
volumes:
- name: achilles-workspace-volume
emptyDir: {}
- name: r-tempdir-volume
emptyDir: {}
- name: recruit-query
container:
image: ghcr.io/miracum/recruit/query:v10.1.12 # x-release-please-version
env:
- name: QUERY_RUN_ONCE_AND_EXIT
value: "true"
- name: QUERY_SCHEDULE_ENABLED
value: "false"
- name: QUERY_SELECTOR_MATCHLABELS
value: ""
- name: FHIR_URL
value: http://recruit-fhirserver:8080/fhir
- name: OMOP_JDBCURL
value: >-
jdbc:postgresql://broadsea-atlasdb:5432/postgres?ApplicationName=recruit-query
- name: OMOP_USERNAME
value: postgres
- name: OMOP_PASSWORD
valueFrom:
secretKeyRef:
name: recruit-ohdsi-webapi-db-secret
key: postgres-password
- name: OMOP_CDMSCHEMA
value: demo_cdm
- name: OMOP_RESULTSSCHEMA
value: demo_cdm_results
- name: QUERY_WEBAPI_BASE_URL
value: http://recruit-ohdsi-webapi:8080/WebAPI
- name: ATLAS_DATASOURCE
value: EUNOMIA
- name: MANAGEMENT_ENDPOINT_HEALTH_PROBES_ADD_ADDITIONAL_PATHS
value: "true"
- name: MANAGEMENT_SERVER_PORT
value: "8081"
- name: CAMEL_HEALTH_ENABLED
value: "false"
- name: QUERY_WEBAPI_COHORT_CACHE_SCHEMA
value: webapi
securityContext:
privileged: false
capabilities:
drop:
- ALL
runAsNonRoot: true
runAsUser: 65532
runAsGroup: 65532
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
seccompProfile:
type: RuntimeDefault
volumeMounts:
- name: tmp-volume
mountPath: /tmp
volumes:
- name: tmp-volume
emptyDir: {}
- name: full-run
dag:
tasks:
- name: run-omop-cdm-etl
template: omop-cdm-etl
- name: run-ohdsi-achilles
depends: run-omop-cdm-etl
template: ohdsi-achilles
- name: run-recruit-query
# doesn't really matter whether the achilles job failed or succeeded
depends: "run-omop-cdm-etl && (run-ohdsi-achilles.Succeeded || run-ohdsi-achilles.Failed)"
template: recruit-query
You can run this workflow against the integration test setup of the recruIT Helm chart:
kubectl create namespace recruit
helm repo add argo https://argoproj.github.io/argo-helm
helm upgrade --install \
--create-namespace \
--namespace=argo-workflows \
-f tests/chaos/argo-workflows-values.yaml \
argo-workflows argo/argo-workflows
helm upgrade --install \
--namespace=recruit \
-f charts/recruit/values-integrationtest.yaml \
--set query.enabled=false \
recruit charts/recruit/
argo submit -n recruit --wait --log docs/_snippets/k8s/query-argo-workflow.yaml