This guide walks through the full deployment of the EKB EKS infrastructure on AWS using Terragrunt. It covers tool installation, environment setup, and a phased deployment sequence designed to ensure proper dependency ordering across all infrastructure components.
Deployments are organized into nine phases:
- State Management — Bootstraps the S3 bucket used to store Terraform state for the environment.
- EKS Infrastructure — Provisions the VPC, subnets, NAT gateways, IAM roles, and the EKS cluster and managed node groups.
- Storage & Load Balancing — Deploys the EBS CSI driver for persistent volumes and the AWS Load Balancer Controller for ALB ingress.
- Karpenter Autoscaling — Sets up dynamic node provisioning with Spot instance support and interruption handling via SQS and EventBridge.
- KEDA Autoscaling — Deploys KEDA for pod-level autoscaling based on CPU and memory thresholds.
- Data Services — Provisions Supabase (self-hosted or Cloud), ElastiCache Redis, and Amazon MQ RabbitMQ.
- Odin Services — Deploys the EKB application stack (Web, FastAPI, Celery, Automator) via Helm.
- SigNoz Observability — Deploys distributed tracing, metrics, and log aggregation via SigNoz and the k8s-infra agent.
- Final Deployment — Runs a full terragrunt apply to reconcile any remaining resources.
Before starting, complete the prerequisites checklist with the customer and ensure all <YOUR_*> placeholders in the environment template are filled in. Several values — including the VPC ID, EKS cluster endpoint, and Redis and RabbitMQ endpoints — are only available after specific phases complete, so the guide flags exactly when to capture and apply them.
Prerequisites
- AWS CLI configured with appropriate permissions
- Terraform (>= 1.0)
- Terragrunt (latest version)
- kubectl for Kubernetes management
- helm for Helm chart management
Installation Guide
Installing Terragrunt
macOS (Homebrew)
brew install terragrunt
Linux (apt)
# Add HashiCorp GPG key
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
# Add HashiCorp repository
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
# Update and install
sudo apt update
sudo apt install terragrunt
Windows (Chocolatey)
choco install terragrunt
Installing kubectl
macOS (Homebrew)
brew install kubectl
Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
Windows (Chocolatey)
choco install kubernetes-cli
Installing Helm
macOS (Homebrew)
brew install helm
Linux
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Windows (Chocolatey)
choco install kubernetes-helm
Verifying Installation
terragrunt --version
terraform --version
kubectl version --client
helm version
AWS CLI Configuration
# Install AWS CLI
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure AWS credentials
aws configure
# Verify configuration
aws sts get-caller-identity
Creating a New Environment
Step 1: Copy the Environment Template
The env-template-folder contains pre-structured files with <YOUR_*> placeholders ready to be filled in. Copy it entirely to create your new environment folder.
# Navigate to the terragrunt environments directory
cd terragrunt/environments
# Copy the full template folder to a new environment (replace 'your-env-name')
cp -r env-template-folder your-env-name
# The folder structure is ready:
# your-env-name/
# ├── terragrunt.hcl # Core cluster configuration
# ├── state/
# │ └── terragrunt.hcl # S3 state bucket configuration
# └── values/
# ├── infrastructure.yaml # AWS Load Balancer Controller
# ├── karpenter-values.yaml # Karpenter controller settings
# ├── karpenter-nodeclasses.yaml # EC2NodeClass definitions
# ├── karpenter.yaml # Karpenter NodePool definitions
# ├── keda.yaml # KEDA autoscaler
# ├── aws-ebs-csi-driver.yaml # EBS CSI driver
# ├── odin-services.yaml # Odin application services
# ├── supabase.yaml # Supabase (if self-hosting)
# ├── ha-supabase-db.yaml # Supabase HA DB (if self-hosting)
# ├── cloudnative-pg.yaml # CloudNativePG operator (if self-hosting)
# ├── signoz.yaml # SigNoz observability (optional)
# └── signoz-k8s-infra.yaml # SigNoz k8s metrics (optional)
Step 2: Verify All Placeholders Are Present
cd your-env-name
# List all placeholders that need to be filled in
grep -r "<YOUR_" . --include="*.hcl" --include="*.yaml" | sort
All placeholders follow the <YOUR_*> convention. The steps below walk through filling them in file by file.
Step 3: Provision SSL Certificates (AWS ACM)
Before setting environment variables you need the certificate ARNs. Use the AWS Console to request SSL certificates in AWS Certificate Manager (ACM) for all domains your environment will serve.
Option A: Single wildcard certificate (recommended)
A single wildcard certificate covers all the service hostnames with one ARN. Note that a wildcard matches exactly one DNS label, so for the hostnames below the certificate must be *.example.com (a *.app.example.com certificate would not cover api-app.example.com). A single *.example.com certificate covers:
| Service | Domain |
|---|---|
| Web | app.example.com |
| FastAPI | api-app.example.com |
| Automator | automations-app.example.com |
| Supabase | supabase-app.example.com |
| SigNoz | signoz-app.example.com |
Option B: Per-service certificates
Request one certificate per domain if you cannot use a wildcard. Repeat the steps below for each domain: <YOUR_WEB_DOMAIN>, <YOUR_API_DOMAIN>, <YOUR_AUTOMATOR_DOMAIN>, <YOUR_SUPABASE_DOMAIN> (only if ENABLE_SUPABASE=true), <YOUR_SIGNOZ_DOMAIN> (only if ENABLE_SIGNOZ=true).
Requesting a certificate in the AWS Console
- Open the AWS Certificate Manager console
- Switch to the correct region (top-right) — must match <YOUR_AWS_REGION>
- Click Request a certificate → Request a public certificate → Next
- Under Fully qualified domain name, enter the wildcard (e.g., *.example.com) or a specific domain
- Set Validation method to DNS validation
- Click Request — the certificate is created in the Pending validation state
Adding the DNS CNAME validation record
ACM generates a CNAME record that you must add to your DNS provider to prove domain ownership. Get the values from the ACM Console by opening the certificate and expanding the domain under Domains.
| DNS Field | Value |
|---|---|
| Record type | CNAME |
| Name / Host | e.g., _fa187f22ac17bce6f508bf3c56439c61.signoz-app.example.com. |
| Value / Points to | e.g., _c7c97325fe38061e168e232d122c7ff3.jkddzztszm.acm-validations.aws. |
Include the trailing dot (.) at the end of the CNAME values if your DNS provider requires it.
Cloudflare
- Log in to Cloudflare → select your domain → go to DNS → Records → Add record
- Set Type to CNAME
- Paste the ACM CNAME name into Name and the ACM CNAME value into Target
- Set Proxy status to DNS only (grey cloud icon) — the certificate will not validate through the Cloudflare proxy
- Click Save
Route 53
- Open the Route 53 console → Hosted zones → select your zone → Create record
- Set Record type to CNAME
- Paste the ACM CNAME name into Record name (subdomain portion only) and the value into Value
- Set TTL to 300 and click Create records
In ACM you can also click Create records in Route 53 to have ACM add the record automatically if the hosted zone is in the same account.
Once DNS propagates (typically 1–5 minutes), the certificate status changes to Issued. Copy the ARN from the top of the certificate — it looks like arn:aws:acm:<region>:<account-id>:certificate/<uuid>. Keep the ARN(s) handy for the next step.
Step 4: Set Environment Variables
Set these shell environment variables before running any Terragrunt commands. They are read directly by terragrunt.hcl via get_env().
export AWS_REGION="<YOUR_AWS_REGION>" # e.g., "eu-west-2", "us-east-2"
export CLUSTER_NAME="<env_folder_name>" # must match your environment folder name, e.g., "your-env-name"
# Domain configuration
export WEB_DOMAIN="<YOUR_WEB_DOMAIN>" # e.g., "app.example.com"
export FASTAPI_DOMAIN="<YOUR_API_DOMAIN>" # e.g., "api-app.example.com"
export AUTOMATOR_DOMAIN="<YOUR_AUTOMATOR_DOMAIN>" # e.g., "automations-app.example.com"
export SUPABASE_DOMAIN="<YOUR_SUPABASE_DOMAIN>" # e.g., "supabase-app.example.com"
export SIGNOZ_DOMAIN="<YOUR_SIGNOZ_DOMAIN>" # e.g., "signoz-app.example.com"
# SSL Certificate ARNs — Option A: Single wildcard certificate (recommended)
export WILDCARD_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_WILDCARD_CERT_ID>"
# SSL Certificate ARNs — Option B: Per-service certificates
export WEB_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_WEB_CERT_ID>"
export FASTAPI_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_API_CERT_ID>"
export AUTOMATOR_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_AUTOMATOR_CERT_ID>"
export SUPABASE_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_SUPABASE_CERT_ID>"
export SIGNOZ_CERTIFICATE_ARN="arn:aws:acm:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:certificate/<YOUR_SIGNOZ_CERT_ID>"
# Service enablement flags
export ENABLE_ALB_CONTROLLER="true"
export ENABLE_AWS_SERVICES="true" # Set to "true" to enable ElastiCache and AmazonMQ
# Supabase stack (self-hosted) — enable all three together if self-hosting Supabase
export ENABLE_CNPG="true" # CloudNativePG operator (namespace: cnpg-system)
export ENABLE_HA_SUPABASE_DB="true" # Supabase HA database (namespace: ha-supabase-db)
export ENABLE_SUPABASE="true" # Supabase application (namespace: supabase)
export ENABLE_SIGNOZ="true" # Set to "true" to enable SigNoz observability
export SSL_TERMINATION="alb"
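Before running any Terragrunt commands, it is worth sanity-checking the shell environment. A minimal preflight sketch (the check_env helper is illustrative, not part of the repo; the exported values are examples):

```shell
# Preflight sketch: fail fast if a required variable is unset or still
# contains a <YOUR_*> placeholder before invoking terragrunt.
check_env() {
  missing=""
  for v in "$@"; do
    # Look up the variable named in $v (POSIX-safe indirect expansion)
    val=$(eval "printf '%s' \"\${$v:-}\"")
    case "$val" in
      ""|"<YOUR_"*) missing="$missing $v" ;;
    esac
  done
  if [ -n "$missing" ]; then
    echo "unset or placeholder:$missing"
    return 1
  fi
  echo "preflight ok"
}

# Example usage (values illustrative)
export AWS_REGION="eu-west-2"
export CLUSTER_NAME="your-env-name"
check_env AWS_REGION CLUSTER_NAME
```

Extend the argument list with every variable your deployment actually reads via get_env().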
Spot Instances & Stateful Workloads
Spot instances are configured per NodePool in values/karpenter.yaml, not via environment variables. Each NodePool declares its own capacity strategy:
| NodePool | workload-type label | Capacity Type | Rationale |
|---|---|---|---|
| general | general | Spot → On-Demand fallback | Cost-optimised for stateless batch/background workloads |
| compute-intensive | compute-intensive | Spot → On-Demand fallback | Cost-optimised for CPU-bound workloads |
| memory-intensive | memory-intensive | On-Demand → Spot fallback | Stability prioritised for high-memory pods |
| gpu | gpu | Spot → On-Demand fallback | Cost-optimised for AI/ML batch workloads |
| application | application | On-Demand only | Stable user-facing services (Supabase, Kong, etc.) — no Spot interruptions |
| database | database / node-type: database-dedicated | On-Demand only | Stateful — Spot interruption is unsafe for databases |
The application NodePool uses m/c instance families (generation 5+) with On-Demand only. Supabase service pods are pinned here via nodeSelector: workload-type: "application" to guarantee they are never interrupted by a Spot reclamation event.
The database-dedicated NodePool never uses Spot. It uses consolidationPolicy: WhenEmpty so Karpenter will not evict a node that still has a running pod, making it safe for stateful workloads such as PostgreSQL and CloudNativePG replicas.
Guidelines for stateful applications on Spot:
- Do not schedule databases, persistent queues, or any pod with a PersistentVolumeClaim on Spot NodePools.
- Use a nodeSelector targeting node-type: database-dedicated with the matching database-workload: "true" toleration for database pods.
- Use nodeSelector: workload-type: "application" for user-facing stateless services that must remain available without interruption.
- For background workloads (Web, API, Celery, Automator), the general Spot NodePool is appropriate — Karpenter’s SQS interruption handler drains Spot nodes gracefully before AWS reclaims them, and KEDA’s minimum replica count (≥ 2) ensures availability during node replacement.
- To disable Spot globally, remove "spot" from the values list in every NodePool inside values/karpenter.yaml.
How Karpenter handles Spot interruption warnings:
AWS gives a 2-minute interruption notice before terminating a Spot instance. Karpenter uses EventBridge and SQS to act on this automatically:
AWS Spot Interruption Event
│
▼
Amazon EventBridge (CloudWatch Events)
Rule: EC2 Spot Instance Interruption Warning
│
▼
SQS Queue (Karpenter interruption queue)
│
▼
Karpenter Controller (polls SQS continuously)
│
├── Cordons the node (no new pods scheduled)
├── Drains existing pods (respects PodDisruptionBudgets)
├── Provisions a replacement node in parallel
└── Pods reschedule onto the new node before the 2-min window closes
This is configured in the karpenter block in terragrunt.hcl:
karpenter = {
spot_interruption_handling = true # creates the SQS queue and EventBridge rule
enable_spot_instances = true # allows Spot in NodePool capacity requirements
}
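Because the drain step respects PodDisruptionBudgets, stateless services benefit from an explicit PDB so at least one replica survives any node drain. A minimal sketch (names and labels illustrative, not from the repo):

```yaml
# Sketch: keep at least one replica of a worker available during drains
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: celery-worker-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: celery-worker
```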
Step 5: Update Environment-Specific File Values
Do a find-and-replace across all files in your new env folder for the following placeholders:
| Placeholder | Description | Example |
|---|---|---|
| <YOUR_ENV_NAME> | Unique environment identifier | app-eks-prod |
| <YOUR_AWS_REGION> | AWS region of the cluster | eu-west-2, us-east-2 |
| <YOUR_AWS_ACCOUNT_ID> | 12-digit AWS account ID | 123456789012 |
| <YOUR_ENVIRONMENT> | Environment tag value | prod, staging, dev |
| <YOUR_PROJECT> | Project tag value | odin, ekb |
# Run from your new env folder to find all remaining placeholders
grep -r "<YOUR_" .
5.1 terragrunt.hcl — Core cluster configuration
| Field | Placeholder | Notes |
|---|---|---|
| cluster_name | <YOUR_ENV_NAME> | Must match EKS cluster name |
| cluster_region | <YOUR_AWS_REGION> | AWS region |
| aws_account_id | <YOUR_AWS_ACCOUNT_ID> | 12-digit account ID |
| vpc_cidr | <YOUR_VPC_CIDR> | e.g., 192.168.0.0/16 |
| availability_zones | <YOUR_REGION>a/b/c | 3 AZs in your region |
| tags.Environment | <YOUR_ENVIRONMENT> | e.g., prod |
| tags.Project | <YOUR_PROJECT> | e.g., odin |
| aws_services.amazon_mq.rabbitmq.username | <YOUR_RABBITMQ_USERNAME> | RabbitMQ admin username (only when ENABLE_AWS_SERVICES=true) |
| aws_services.amazon_mq.rabbitmq.password | <YOUR_RABBITMQ_PASSWORD> | Min 12 chars; must include uppercase, lowercase, digits, and special characters |
5.2 state/terragrunt.hcl — S3 state bucket
nano state/terragrunt.hcl
| Field | Placeholder | Notes |
|---|---|---|
| bucket_name | odin-terraform-state-<YOUR_ENV_NAME> | Must be globally unique |
| region | <YOUR_AWS_REGION> | Same region as cluster |
5.3 values/infrastructure.yaml — AWS Load Balancer Controller
The VPC ID only exists after the EKS cluster infrastructure is created (Phase 2), so capture it then and fill it in before deploying the AWS Load Balancer Controller.
# Get VPC ID after EKS cluster is created
aws eks describe-cluster --name <YOUR_ENV_NAME> \
--query "cluster.resourcesVpcConfig.vpcId" --output text
nano values/infrastructure.yaml
| Field | Placeholder | Notes |
|---|---|---|
| clusterName | <YOUR_ENV_NAME> | EKS cluster name |
| region | <YOUR_AWS_REGION> | AWS region |
| vpcId | <YOUR_VPC_ID> | Required before ALB deploy |
| serviceAccount.annotations.eks.amazonaws.com/role-arn | <YOUR_AWS_ACCOUNT_ID>, <YOUR_ENV_NAME> | IAM role for ALB controller |
5.4 values/karpenter-values.yaml — Karpenter controller
The EKS cluster endpoint only exists after the cluster is created, so capture it then and fill it in before deploying Karpenter.
# Get cluster endpoint after EKS cluster is created
aws eks describe-cluster --name <YOUR_ENV_NAME> \
--query "cluster.endpoint" --output text
nano values/karpenter-values.yaml
| Field | Placeholder | Notes |
|---|---|---|
| serviceAccount.annotations.eks.amazonaws.com/role-arn | <YOUR_AWS_ACCOUNT_ID>, <YOUR_ENV_NAME> | IAM role for Karpenter |
| env.CLUSTER_NAME | <YOUR_ENV_NAME> | EKS cluster name |
| env.CLUSTER_ENDPOINT | <YOUR_EKS_CLUSTER_ENDPOINT> | Required before Karpenter deploy |
| settings.aws.defaultInstanceProfile | <YOUR_ENV_NAME> | Karpenter node instance profile |
5.5 values/karpenter-nodeclasses.yaml — Karpenter node classes
nano values/karpenter-nodeclasses.yaml
| Field | Placeholder | Notes |
|---|---|---|
| All kubernetes.io/cluster/<YOUR_ENV_NAME> tags | <YOUR_ENV_NAME> | Cluster tag for subnet/SG selectors |
| user_data bootstrap cluster name | <YOUR_ENV_NAME> | Node bootstrap script |
| tags.Environment | <YOUR_ENVIRONMENT> | e.g., prod |
| tags.Project | <YOUR_PROJECT> | e.g., odin |
5.6 values/aws-ebs-csi-driver.yaml — EBS CSI Driver
nano values/aws-ebs-csi-driver.yaml
| Field | Placeholder | Notes |
|---|---|---|
| controller.serviceAccount.annotations.eks.amazonaws.com/role-arn | <YOUR_AWS_ACCOUNT_ID>, <YOUR_ENV_NAME> | IAM role for EBS CSI controller |
| node.serviceAccount.annotations.eks.amazonaws.com/role-arn | <YOUR_AWS_ACCOUNT_ID>, <YOUR_ENV_NAME> | IAM role for EBS CSI node |
| controller.env.AWS_DEFAULT_REGION | <YOUR_AWS_REGION> | AWS region |
| controller.env.AWS_REGION | <YOUR_AWS_REGION> | AWS region |
| node.env.AWS_DEFAULT_REGION | <YOUR_AWS_REGION> | AWS region |
| node.env.AWS_REGION | <YOUR_AWS_REGION> | AWS region |
5.7 values/karpenter.yaml — Karpenter NodePools
nano values/karpenter.yaml
| Field | Placeholder | Notes |
|---|---|---|
| *.labels.Environment | <YOUR_ENVIRONMENT> | Applied to all NodePool labels |
| *.requirements topology.kubernetes.io/zone | ["<YOUR_REGION>a", "<YOUR_REGION>b", "<YOUR_REGION>c"] | AZs for all NodePools |
Node class names (general, compute-intensive, memory-intensive, gpu, database) must match entries in karpenter-nodeclasses.yaml.
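As a reference for that matching, each NodePool refers to its EC2NodeClass by name. A minimal sketch (the apiVersion shown is Karpenter's v1beta1 shape and may differ from the chart version in use):

```yaml
# Sketch: the nodeClassRef name must match an EC2NodeClass
# defined in karpenter-nodeclasses.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      nodeClassRef:
        name: general   # must exist in karpenter-nodeclasses.yaml
```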
5.8 values/keda.yaml — KEDA Autoscaler
No environment-specific placeholders required. Resource limits and replica counts are pre-configured with sensible defaults. Review and adjust if needed.
5.9 values/supabase.yaml — Supabase application (only if ENABLE_SUPABASE=true)
All keys below must be generated consistently and shared with ha-supabase-db.yaml. Generate them once and use the same values in both files.
# Generate JWT secret
openssl rand -hex 32
# Generate anon/service role JWTs (requires Supabase CLI)
brew install supabase/tap/supabase
supabase gen-keys
# Generate passwords and tokens
openssl rand -hex 24 # for passwords
openssl rand -base64 64 # for secretKeyBase
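The generated secrets can be sanity-checked for shape before pasting them into supabase.yaml and ha-supabase-db.yaml. A sketch (variable names illustrative); note that openssl rand -base64 64 wraps its output at 76 characters, so strip the newline:

```shell
# Generate and length-check the shared secrets
JWT_SECRET=$(openssl rand -hex 32)                     # 64 hex characters
DB_PASSWORD=$(openssl rand -hex 24)                    # 48 hex characters
SECRET_KEY_BASE=$(openssl rand -base64 64 | tr -d '\n')  # 88 base64 characters, one line

echo "jwt=${#JWT_SECRET} db=${#DB_PASSWORD} skb=${#SECRET_KEY_BASE}"
```

Generate each value once and reuse it everywhere the tables below say "must match".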
nano values/supabase.yaml
| Field | Placeholder | Notes |
|---|---|---|
| secret.jwt.anonKey | <YOUR_SUPABASE_ANON_KEY> | Must match ha-supabase-db.yaml anonKey |
| secret.jwt.serviceKey | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Must match ha-supabase-db.yaml serviceRoleKey |
| secret.jwt.secret | <YOUR_SUPABASE_JWT_SECRET> | Must match ha-supabase-db.yaml jwtSecret |
| secret.db.password | <YOUR_SUPABASE_DB_PASSWORD> | Must match ha-supabase-db.yaml postgresPassword |
| secret.analytics.publicAccessToken | <YOUR_SUPABASE_ANALYTICS_PUBLIC_TOKEN> | Internal Logflare token |
| secret.analytics.privateAccessToken | <YOUR_SUPABASE_ANALYTICS_PRIVATE_TOKEN> | Internal Logflare token |
| secret.dashboard.username | <YOUR_SUPABASE_DASHBOARD_USERNAME> | Studio UI login |
| secret.dashboard.password | <YOUR_SUPABASE_DASHBOARD_PASSWORD> | Studio UI login |
| secret.realtime.secretKeyBase | <YOUR_SUPABASE_REALTIME_SECRET_KEY_BASE> | Phoenix secret key |
| secret.meta.cryptoKey | <YOUR_SUPABASE_META_CRYPTO_KEY> | openssl rand -hex 32 |
| secret.s3.keyId | <YOUR_MINIO_KEY_ID> | Must match secret.minio.user (openssl rand -hex 16) |
| secret.s3.accessKey | <YOUR_MINIO_ACCESS_KEY> | Must match secret.minio.password (openssl rand -hex 32) |
| secret.minio.user | <YOUR_MINIO_KEY_ID> | Same value as secret.s3.keyId |
| secret.minio.password | <YOUR_MINIO_ACCESS_KEY> | Same value as secret.s3.accessKey |
5.10 values/ha-supabase-db.yaml — Supabase HA Database (only if ENABLE_HA_SUPABASE_DB=true)
Secrets here must match supabase.yaml. Use the same generated values for postgresPassword, jwtSecret, anonKey, and serviceRoleKey.
nano values/ha-supabase-db.yaml
| Field | Placeholder | Notes |
|---|---|---|
| secrets.inline.postgresPassword | <YOUR_SUPABASE_DB_PASSWORD> | Must match supabase.yaml secret.db.password |
| secrets.inline.authenticatorPassword | <YOUR_SUPABASE_DB_PASSWORD> | Must be identical to postgresPassword |
| secrets.inline.pgbouncerPassword | <YOUR_SUPABASE_DB_PASSWORD> | Must be identical to postgresPassword |
| secrets.inline.jwtSecret | <YOUR_SUPABASE_JWT_SECRET> | Must match supabase.yaml secret.jwt.secret |
| secrets.inline.anonKey | <YOUR_SUPABASE_ANON_KEY> | Must match supabase.yaml secret.jwt.anonKey |
| secrets.inline.serviceRoleKey | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Must match supabase.yaml secret.jwt.serviceKey |
Storage class (ebs-csi-gp2), instance counts, and resource limits are pre-configured. Adjust postgres.storage.size and postgres.walStorage.size for your expected data volume.
5.11 values/cloudnative-pg.yaml — CloudNativePG Operator (only if ENABLE_CNPG=true)
No environment-specific placeholders required. This deploys the CNPG operator controller only. Default settings (3 replicas, resource limits) are suitable for most environments.
5.12 values/odin-services.yaml — Odin application services
Redis and RabbitMQ endpoints are only available after Terraform creates those AWS resources. Certificate ARNs must be provisioned in ACM before deployment.
nano values/odin-services.yaml
General settings:
| Field | Placeholder | Notes |
|---|---|---|
| server | <YOUR_WEB_DOMAIN> | Main web domain |
| toolkitEncryptionKey | <YOUR_TOOLKIT_ENCRYPTION_KEY> | Generate: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" |
Supabase (dataServiceConfig) — self-hosted (ENABLE_SUPABASE=true):
| Field | Placeholder | Source |
|---|---|---|
| supabase.projectUrl | http://supabase-kong:8000 | Fixed — internal Supabase Kong |
| supabase.key | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Same as secret.jwt.serviceKey in supabase.yaml |
| supabase.postgres.user | postgres | Fixed for self-hosted |
| supabase.postgres.host | ha-supabase-db-postgres-pooler-rw.ha-supabase-db.svc.cluster.local | Fixed — DB pooler service within the cluster |
| supabase.postgres.password | <YOUR_SUPABASE_DB_PASSWORD> | Same as secret.db.password in supabase.yaml |
| supabase.projectId | (leave empty) | Not used in self-hosted mode |
Supabase (dataServiceConfig) — Supabase Cloud (ENABLE_SUPABASE=false):
| Field | Placeholder | Source |
|---|---|---|
| supabase.projectUrl | <YOUR_SUPABASE_PROJECT_URL> | Supabase dashboard → Project Settings → API |
| supabase.key | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Supabase dashboard → API → service_role key |
| supabase.postgres.user | <YOUR_SUPABASE_DB_USER> | Supabase dashboard → Project Settings → Database |
| supabase.postgres.host | <YOUR_SUPABASE_DB_HOST> | Supabase dashboard → Database (e.g., aws-0-eu-west-2.pooler.supabase.com) |
| supabase.postgres.password | <YOUR_SUPABASE_DB_PASSWORD> | Supabase dashboard → Project Settings → Database |
| supabase.projectId | <YOUR_SUPABASE_PROJECT_ID> | From your Supabase project URL |
Redis:
| Field | Placeholder | Notes |
|---|---|---|
| redis.url | rediss://<YOUR_REDIS_HOST>:6379?ssl_cert_reqs=none | After Terraform creates ElastiCache |
| redis.host | <YOUR_REDIS_HOST> | ElastiCache primary endpoint |
# Get Redis endpoint after Terraform apply
aws elasticache describe-cache-clusters \
--show-cache-node-info \
--query "CacheClusters[?starts_with(CacheClusterId,'<YOUR_ENV_NAME>')].CacheNodes[0].Endpoint.Address" \
--output text
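Once the endpoint is known, the redis.url value is composed from it directly. A sketch with an illustrative host:

```shell
# Compose the redis.url value from the ElastiCache endpoint returned above
REDIS_HOST="master.your-env-name.abc123.euw2.cache.amazonaws.com"  # illustrative
REDIS_URL="rediss://${REDIS_HOST}:6379?ssl_cert_reqs=none"
echo "$REDIS_URL"
```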
RabbitMQ:
| Field | Placeholder | Notes |
|---|---|---|
| rabbitmq.url | amqps://<YOUR_RABBITMQ_USERNAME>:<YOUR_RABBITMQ_PASSWORD>@<YOUR_RABBITMQ_HOST>:5671 | After Terraform creates AmazonMQ |
| rabbitmq.host | <YOUR_RABBITMQ_HOST> | AmazonMQ broker endpoint |
| rabbitmq.username | <YOUR_RABBITMQ_USERNAME> | Set in terragrunt.hcl |
| rabbitmq.password | <YOUR_RABBITMQ_PASSWORD> | Set in terragrunt.hcl |
# Get RabbitMQ endpoint after Terraform apply
aws mq list-brokers \
--query "BrokerSummaries[?BrokerName=='odin-rabbitmq'].BrokerId" --output text | \
xargs -I{} aws mq describe-broker --broker-id {} \
--query "BrokerInstances[0].Endpoints[0]" --output text
SSL / Certificate ARNs:
| Field | Placeholder | Notes |
|---|---|---|
| ssl.services.web.domain | <YOUR_WEB_DOMAIN> | e.g., app.example.com |
| ssl.services.web.certificateArn | <YOUR_WEB_CERTIFICATE_ARN> | ACM certificate ARN |
| ssl.services.fastapiBackend.domain | <YOUR_API_DOMAIN> | e.g., api-app.example.com |
| ssl.services.fastapiBackend.certificateArn | <YOUR_API_CERTIFICATE_ARN> | ACM certificate ARN |
| ssl.services.automator.domain | <YOUR_AUTOMATOR_DOMAIN> | e.g., automations-app.example.com |
| ssl.services.automator.certificateArn | <YOUR_AUTOMATOR_CERTIFICATE_ARN> | ACM certificate ARN |
| ssl.services.supabase.domain | <YOUR_SUPABASE_DOMAIN> | e.g., supabase-app.example.com |
| ssl.services.supabase.certificateArn | <YOUR_SUPABASE_CERTIFICATE_ARN> | ACM certificate ARN |
# List ACM certificates in your region
aws acm list-certificates --region <YOUR_AWS_REGION> \
--query "CertificateSummaryList[*].[DomainName,CertificateArn]" --output table
Web frontend Supabase keys — self-hosted (ENABLE_SUPABASE=true):
| Field | Placeholder | Source |
|---|---|---|
| web.supabase.url | https://<YOUR_SUPABASE_DOMAIN> | External URL routed via ALB ingress |
| web.supabase.anonKey | <YOUR_SUPABASE_ANON_KEY> | Same as secret.jwt.anonKey in supabase.yaml |
| web.supabase.serviceRoleKey | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Same as secret.jwt.serviceKey in supabase.yaml |
| web.supabase.clientanonKey | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Same as secret.jwt.serviceKey in supabase.yaml |
Web frontend Supabase keys — Supabase Cloud (ENABLE_SUPABASE=false):
| Field | Placeholder | Source |
|---|---|---|
| web.supabase.url | <YOUR_SUPABASE_PROJECT_URL> | Supabase dashboard → Project Settings → API |
| web.supabase.anonKey | <YOUR_SUPABASE_ANON_KEY> | Supabase dashboard → API → anon key |
| web.supabase.serviceRoleKey | <YOUR_SUPABASE_SERVICE_ROLE_KEY> | Supabase dashboard → API → service_role key |
| web.supabase.clientanonKey | <YOUR_SUPABASE_CLIENT_ANON_KEY> | Same as service_role key |
5.13 values/signoz.yaml — SigNoz Observability (only if ENABLE_SIGNOZ=true)
| Field | Placeholder | Notes |
|---|---|---|
| global.clusterName | <YOUR_ENV_NAME> | EKS cluster name |
| signoz.ingress.annotations.alb.ingress.kubernetes.io/certificate-arn | <YOUR_AWS_REGION>, <YOUR_AWS_ACCOUNT_ID>, <YOUR_SIGNOZ_CERTIFICATE_ID> | ACM certificate for SigNoz |
| signoz.ingress.hosts[0].host | <YOUR_SIGNOZ_DOMAIN> | e.g., signoz-app.example.com |
5.14 values/signoz-k8s-infra.yaml — SigNoz K8s Metrics (only if ENABLE_SIGNOZ=true)
nano values/signoz-k8s-infra.yaml
| Field | Placeholder | Notes |
|---|---|---|
| global.clusterName | <YOUR_ENV_NAME> | EKS cluster name for metric labeling |
The OTel collector endpoint (signoz-otel-collector.monitoring.svc.cluster.local:4317) is pre-configured assuming both SigNoz and k8s-infra are deployed in the monitoring namespace. No change needed unless you use a custom release name.
Deployment Ordering Reminder
Some values are only available after certain infrastructure has been deployed. Follow this order:
- Before any deployment — set <YOUR_ENV_NAME>, <YOUR_AWS_REGION>, <YOUR_AWS_ACCOUNT_ID>, <YOUR_ENVIRONMENT>, <YOUR_PROJECT>, <YOUR_VPC_CIDR>, all domain names, all certificate ARNs, all Supabase values, <YOUR_TOOLKIT_ENCRYPTION_KEY>, and the RabbitMQ username/password
- After the EKS cluster is created — set <YOUR_VPC_ID> (infrastructure.yaml) and <YOUR_EKS_CLUSTER_ENDPOINT> (karpenter-values.yaml)
- After terraform apply for the AWS services — set <YOUR_REDIS_HOST> and <YOUR_RABBITMQ_HOST> (odin-services.yaml)
Step 6: Verify No Placeholders Remain
grep -r "<YOUR_" . --include="*.hcl" --include="*.yaml"
The output should be empty, or contain only placeholders whose values come from resources that have not been created yet (VPC ID, Redis host, RabbitMQ host, EKS endpoint). If any other placeholders remain, revisit the Step 5 sub-sections above.
Files checklist:
| File | Step | Required |
|---|---|---|
| terragrunt.hcl | 5.1 | Always |
| state/terragrunt.hcl | 5.2 | Always |
| values/infrastructure.yaml | 5.3 | Always |
| values/karpenter-values.yaml | 5.4 | Always |
| values/karpenter-nodeclasses.yaml | 5.5 | Always |
| values/aws-ebs-csi-driver.yaml | 5.6 | Always |
| values/karpenter.yaml | 5.7 | Always |
| values/keda.yaml | 5.8 | Always |
| values/odin-services.yaml | 5.12 | Always |
| values/supabase.yaml | 5.9 | Only if ENABLE_SUPABASE=true |
| values/ha-supabase-db.yaml | 5.10 | Only if ENABLE_HA_SUPABASE_DB=true |
| values/cloudnative-pg.yaml | 5.11 | Only if ENABLE_CNPG=true |
| values/signoz.yaml | 5.13 | Only if ENABLE_SIGNOZ=true |
| values/signoz-k8s-infra.yaml | 5.14 | Only if ENABLE_SIGNOZ=true |
Phase 1: State Management Setup
Purpose: S3 bucket creation for Terraform state.
Each environment’s state management module creates an S3 bucket with the pattern odin-terraform-state-{environment-name}, configures encryption, versioning, and public access blocking, and uses local state for the state module itself (bootstrap pattern).
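The bootstrap pattern is commonly expressed in state/terragrunt.hcl roughly like this. This is a sketch of the usual Terragrunt idiom only; the repo's actual file may differ:

```hcl
# Sketch: the state module keeps its own state locally, because the
# S3 bucket it creates cannot hold state before it exists.
remote_state {
  backend = "local"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    path = "terraform.tfstate"
  }
}
```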
cd terragrunt/environments/{your-env-name}/state
terragrunt init
terragrunt plan
terragrunt apply
Phase 2: EKS Infrastructure Deployment
Purpose: Core networking (VPC, subnets, NAT gateway), IAM roles and policies, EKS cluster and managed node groups.
2.1 Dry Run — EKS Infrastructure
Core Infrastructure
cd terragrunt/environments/your-env-name
terragrunt plan -target="aws_vpc.main" \
-target="aws_internet_gateway.main" \
-target="aws_subnet.public" \
-target="aws_subnet.private" \
-target="aws_eip.nat" \
-target="aws_nat_gateway.main" \
-target="aws_route_table.public" \
-target="aws_route_table.private" \
-target="aws_route_table_association.public" \
-target="aws_route_table_association.private"
IAM Roles and Policies
terragrunt plan -target="aws_iam_role.cluster" \
-target="aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy" \
-target="aws_iam_openid_connect_provider.eks" \
-target="aws_iam_role.node" \
-target="aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly"
EKS Cluster and Node Groups
terragrunt plan -target="aws_eks_cluster.main" \
-target="aws_eks_node_group.main" \
-target="kubernetes_secret.regcred"
Using a Custom / Private Docker Registry
By default, EKB images are pulled from Docker Hub using a secret named regcred. If the customer hosts images in a different registry, follow these steps before deploying odin-services.
Step 1 — Create the imagePullSecret in the target namespace
# Generic private registry (Docker Hub, Quay, self-hosted, etc.)
kubectl create secret docker-registry regcred \
--namespace default \
--docker-server=<YOUR_REGISTRY_HOST> \
--docker-username=<YOUR_REGISTRY_USERNAME> \
--docker-password=<YOUR_REGISTRY_PASSWORD> \
--docker-email=<YOUR_EMAIL>
# AWS ECR — the auth token expires every 12 hours; refresh via a CronJob or use an ECR pull-through cache.
# Note: kubectl create secret docker-registry has no --docker-password-stdin flag,
# so pass the token via command substitution instead of a pipe.
kubectl create secret docker-registry regcred \
--namespace default \
--docker-server=<YOUR_AWS_ACCOUNT_ID>.dkr.ecr.<YOUR_AWS_REGION>.amazonaws.com \
--docker-username=AWS \
--docker-password="$(aws ecr get-login-password --region <YOUR_AWS_REGION>)"
Step 2 — Set the secret name in values/odin-services.yaml
# values/odin-services.yaml
imagePullSecrets:
- name: regcred # must match the secret name created above
# - name: customer-registry-secret # add additional registries if needed
Step 3 — Update image references
web:
image: <YOUR_REGISTRY_HOST>/<YOUR_ORG>/web:<TAG>
fastapiBackend:
image: <YOUR_REGISTRY_HOST>/<YOUR_ORG>/server:<TAG>
Step 4 — Verify pull access before full deployment
kubectl run registry-test \
--image=<YOUR_REGISTRY_HOST>/<YOUR_ORG>/web:<TAG> \
--overrides='{"spec":{"imagePullSecrets":[{"name":"regcred"}]}}' \
--restart=Never --rm -it -- echo "Pull successful"
2.2 Deploy EKS Infrastructure
Step 1: Core Infrastructure
cd terragrunt/environments/your-env-name
terragrunt apply -target="aws_vpc.main" \
-target="aws_internet_gateway.main" \
-target="aws_subnet.public" \
-target="aws_subnet.private" \
-target="aws_eip.nat" \
-target="aws_nat_gateway.main" \
-target="aws_route_table.public" \
-target="aws_route_table.private" \
-target="aws_route_table_association.public" \
-target="aws_route_table_association.private"
After this step, update vpcId in values/infrastructure.yaml before deploying the AWS Load Balancer Controller.
Step 2: EKS Cluster and IAM Roles and Policies
terragrunt apply -target="aws_iam_role.cluster" \
-target="aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy" \
-target="aws_iam_openid_connect_provider.eks" \
-target="aws_iam_role.node" \
-target="aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly"
Step 3: Node Groups and Addons
terragrunt apply -target="aws_eks_cluster.main" \
-target="aws_eks_node_group.main" \
-target="kubernetes_secret.regcred"
After this step, update CLUSTER_ENDPOINT in values/karpenter-values.yaml before deploying Karpenter.
Check EKS Cluster Connectivity
aws eks update-kubeconfig --region $AWS_REGION --name $CLUSTER_NAME
kubectl cluster-info
kubectl get nodes
kubectl get secret regcred -n default
Phase 3: Storage and Load Balancing
Purpose: EBS CSI driver for persistent volumes and the AWS Load Balancer Controller, both running on the managed node group.
3.1 Dry Run — Storage and Load Balancing
EBS CSI Driver
cd terragrunt/environments/your-env-name
terragrunt plan -target="aws_iam_role.ebs_csi_driver" \
-target="aws_iam_role_policy_attachment.ebs_csi_driver" \
-target="helm_release.ebs_csi_driver"
AWS Load Balancer Controller
terragrunt plan -target="aws_iam_role.aws_load_balancer_controller" \
-target="aws_iam_role_policy_attachment.aws_load_balancer_controller" \
-target="aws_iam_policy.aws_load_balancer_controller" \
-target="helm_release.infrastructure"
3.2 Deploy Storage and Load Balancing
Step 1: EBS CSI Driver
cd terragrunt/environments/your-env-name
terragrunt apply -target="aws_iam_role.ebs_csi_driver" \
-target="aws_iam_role_policy_attachment.ebs_csi_driver" \
-target="helm_release.ebs_csi_driver"
Verification
helm list -n kube-system | grep ebs
kubectl get pods -n kube-system | grep ebs-csi
kubectl get storageclass
aws iam get-role --role-name $CLUSTER_NAME-ebs-csi-driver-role --region $AWS_REGION
kubectl get sa -n kube-system | grep ebs-csi
kubectl describe sa ebs-csi-controller-sa -n kube-system
Step 2: AWS Load Balancer Controller
terragrunt apply -target="aws_iam_role.aws_load_balancer_controller" \
-target="aws_iam_role_policy_attachment.aws_load_balancer_controller" \
-target="aws_iam_policy.aws_load_balancer_controller" \
-target="helm_release.infrastructure"
Verification
helm list -n infrastructure
kubectl get pods -n infrastructure | grep aws-load-balancer-controller
kubectl get sa -n infrastructure
kubectl describe sa aws-load-balancer-controller -n infrastructure
aws iam get-role --role-name $CLUSTER_NAME-aws-load-balancer-controller --region $AWS_REGION
kubectl logs -n infrastructure -l app.kubernetes.io/name=aws-load-balancer-controller
kubectl get ingressclass
Phase 4: Karpenter Autoscaling
Purpose: IAM roles for Karpenter, Spot interruption handling, Karpenter controller and node pools.
4.1 Dry Run — Karpenter
Karpenter IAM Resources
cd terragrunt/environments/your-env-name
terragrunt plan -target="aws_iam_role.karpenter_controller" \
-target="aws_iam_policy.karpenter_controller" \
-target="aws_iam_role_policy_attachment.karpenter_controller" \
-target="aws_iam_role.karpenter_node" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEC2ContainerRegistryReadOnly" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEBSCSIDriverPolicy" \
-target="aws_iam_instance_profile.karpenter_node"
EC2 Spot Service-Linked Role (if spot instances are enabled)
terragrunt plan -target="aws_iam_service_linked_role.ec2_spot[0]"
Karpenter Spot Interruption (if enabled in terragrunt.hcl)
terragrunt plan -target="aws_sqs_queue.karpenter_interruption_queue" \
-target="aws_sqs_queue_policy.karpenter_interruption_queue" \
-target="aws_cloudwatch_event_rule.karpenter_interruption" \
-target="aws_cloudwatch_event_target.karpenter_interruption"
Karpenter Helm Charts
terragrunt plan -target="helm_release.karpenter"
Karpenter NodePools and EC2NodeClasses
terragrunt plan -target="kubernetes_manifest.karpenter_nodepool" \
-target="kubernetes_manifest.karpenter_nodeclass" \
-target="kubernetes_config_map.aws_auth"
An expected error may appear during the plan: API did not recognize GroupVersionKind from manifest (CRD may not be installed). This is safe to ignore — the Terraform kubernetes provider validates kubernetes_manifest resources against the live cluster API at plan time, and the Karpenter CRDs are only installed by the Helm chart in the preceding step.
4.2 Deploy Karpenter
Step 1: Karpenter IAM Resources
cd terragrunt/environments/your-env-name
terragrunt apply -target="aws_iam_role.karpenter_controller" \
-target="aws_iam_policy.karpenter_controller" \
-target="aws_iam_role_policy_attachment.karpenter_controller" \
-target="aws_iam_role.karpenter_node" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEC2ContainerRegistryReadOnly" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEBSCSIDriverPolicy" \
-target="aws_iam_instance_profile.karpenter_node"
Verification
aws iam get-role --role-name $CLUSTER_NAME-karpenter-controller --region $AWS_REGION
aws iam get-role --role-name $CLUSTER_NAME-karpenter-node --region $AWS_REGION
aws iam get-instance-profile --instance-profile-name $CLUSTER_NAME-karpenter-node --region $AWS_REGION
aws iam list-attached-role-policies --role-name $CLUSTER_NAME-karpenter-node --region $AWS_REGION
Step 2: EC2 Spot Service-Linked Role (if spot instances are enabled)
The EC2 Spot service-linked role is account-wide (only one per AWS account) and must exist before Karpenter can launch Spot instances.
Option A: Let Terraform create it (recommended for new deployments)
terragrunt apply -target="aws_iam_service_linked_role.ec2_spot[0]"
Option B: Import if the role already exists
# Check if the role exists
aws iam get-role --role-name AWSServiceRoleForEC2Spot --region $AWS_REGION
# If it doesn't exist, create it manually
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com --region $AWS_REGION
# Import into Terraform (replace ACCOUNT_ID with your 12-digit AWS account ID)
terragrunt import 'aws_iam_service_linked_role.ec2_spot[0]' \
arn:aws:iam::ACCOUNT_ID:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot
Step 3: Karpenter Spot Interruption (if enabled in terragrunt.hcl)
terragrunt apply -target="aws_sqs_queue.karpenter_interruption_queue" \
-target="aws_sqs_queue_policy.karpenter_interruption_queue" \
-target="aws_cloudwatch_event_rule.karpenter_interruption" \
-target="aws_cloudwatch_event_target.karpenter_interruption"
Verification
aws events describe-rule --name $CLUSTER_NAME-karpenter-interruption --region $AWS_REGION
aws events list-targets-by-rule --rule $CLUSTER_NAME-karpenter-interruption --region $AWS_REGION
aws sqs get-queue-url --queue-name $CLUSTER_NAME-karpenter-interruption-queue --region $AWS_REGION
Step 4: Karpenter Helm Chart
terragrunt apply -target="helm_release.karpenter"
Verification
helm list -n kube-system | grep karpenter
kubectl get pods -n kube-system | grep karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
kubectl describe sa karpenter -n kube-system
Step 5: Karpenter Kubernetes Manifests
terragrunt apply -target="kubernetes_manifest.karpenter_nodepool" \
-target="kubernetes_manifest.karpenter_nodeclass"
# Import the existing aws-auth ConfigMap
# Note: Use quotes to prevent zsh from interpreting brackets as glob patterns
terragrunt import 'kubernetes_config_map.aws_auth[0]' kube-system/aws-auth
# Then apply
terragrunt apply -target='kubernetes_config_map.aws_auth[0]'
Verification
kubectl get nodepools -o wide
kubectl describe nodepool general
kubectl describe nodepool application
kubectl describe nodepool database
kubectl get ec2nodeclasses -o wide
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' | grep karpenter-node
kubectl get nodepools -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
kubectl get nodes -l karpenter.sh/nodepool --show-labels
kubectl get events -n kube-system --field-selector involvedObject.name=karpenter --sort-by='.lastTimestamp'
Phase 5: KEDA Autoscaling
Purpose: KEDA for application-level autoscaling.
5.1 Dry Run — KEDA
cd terragrunt/environments/your-env-name
terragrunt plan -target="helm_release.keda"
5.2 Deploy KEDA
cd terragrunt/environments/your-env-name
terragrunt apply -target="helm_release.keda"
Verification
helm list -n keda
kubectl get pods -n keda
kubectl get deployment -n keda
kubectl get crd | grep keda
kubectl get validatingwebhookconfigurations | grep keda
kubectl get svc -n keda
Phase 6: Data Services
Purpose: Supabase (database), ElastiCache (Redis), RabbitMQ (message queue).
Deploy CloudNativePG operator first, then the HA Supabase DB cluster, then the Supabase application. The DB cluster must be ready before Supabase starts.
6.1 Dry Run — Data Services
Step 1: CloudNativePG operator (if enabled)
cd terragrunt/environments/your-env-name
ENABLE_CNPG=true terragrunt plan \
--target='helm_release.additional_charts["cloudnative-pg"]'
Step 2: HA Supabase DB (if enabled)
ENABLE_HA_SUPABASE_DB=true terragrunt plan \
--target='helm_release.additional_charts["ha-supabase-db"]'
Step 3: Supabase application (if enabled)
if [ "${ENABLE_SUPABASE:-false}" = "true" ]; then
ENABLE_SUPABASE=true terragrunt plan \
--target='helm_release.supabase[0]'
fi
Step 4: AWS Services — ElastiCache and RabbitMQ (if enabled)
if [ "${ENABLE_AWS_SERVICES:-false}" = "true" ]; then
terragrunt plan -target="aws_elasticache_subnet_group.redis" \
-target="aws_security_group.redis" \
-target="aws_elasticache_replication_group.redis" \
-target="aws_security_group.rabbitmq" \
-target="aws_mq_broker.rabbitmq"
fi
6.2 Deploy Data Services
Step 1: CloudNativePG operator (if enabled)
cd terragrunt/environments/your-env-name
ENABLE_CNPG=true terragrunt apply --auto-approve \
--target='helm_release.additional_charts["cloudnative-pg"]'
Step 2: HA Supabase DB (if enabled)
ENABLE_HA_SUPABASE_DB=true terragrunt apply --auto-approve \
--target='helm_release.additional_charts["ha-supabase-db"]'
Verify PgBouncer pooler and credentials after deployment:
kubectl get svc -n ha-supabase-db | grep pooler
kubectl get secrets -n ha-supabase-db
kubectl get secret ha-supabase-db-authenticator-credentials -n ha-supabase-db \
-o jsonpath='{.data.username}' | base64 -d && echo ""
Use the pooler ClusterIP (or EXTERNAL-IP if LoadBalancer) as the SUPABASE_POSTGRES_HOST value in values/odin-services.yaml and as secret.db.postgresHost in values/supabase.yaml.
Step 3: Supabase application (if enabled)
All Supabase service pods run exclusively on the Karpenter application NodePool (On-Demand only) to prevent Spot interruptions.
if [ "${ENABLE_SUPABASE:-false}" = "true" ]; then
ENABLE_SUPABASE=true terragrunt apply --auto-approve \
--target='helm_release.supabase[0]'
fi
Step 4: AWS Services — ElastiCache and RabbitMQ (if enabled)
if [ "${ENABLE_AWS_SERVICES:-false}" = "true" ]; then
terragrunt apply \
-target="aws_elasticache_subnet_group.redis" \
-target="aws_security_group.redis" \
-target="aws_elasticache_replication_group.redis" \
-target="aws_security_group.rabbitmq" \
-target="aws_mq_broker.rabbitmq"
fi
Verification
# Get connection details from Terraform outputs
terragrunt output elasticache_endpoint
terragrunt output elasticache_port
terragrunt output rabbitmq_endpoint
terragrunt output rabbitmq_port
# Test Redis connectivity from EKS cluster
kubectl run redis-test --image=redis:7-alpine --restart=Never -- \
sh -c "redis-cli -h <redis-endpoint> -p 6379 --tls --insecure ping && echo 'Redis connection successful'"
kubectl logs redis-test
kubectl delete pod redis-test
# Check Redis encryption status
aws elasticache describe-replication-groups \
--replication-group-id $CLUSTER_NAME-redis \
--region $AWS_REGION \
--query 'ReplicationGroups[0].{AtRestEncryption:AtRestEncryptionEnabled,TransitEncryption:TransitEncryptionEnabled}'
Before deploying Odin Services, update values/odin-services.yaml with the Redis endpoint, RabbitMQ endpoint, and all certificate ARNs obtained in this phase.
Phase 7: Odin Services
Purpose: Application deployment via Helm.
Before deploying, temporarily scale down fastapiBackend to a single replica for the initial database migration run — set replicaCount: 1, workers: 1, and keda.minReplicas: 1. Once the migration completes successfully, revert these values to their production defaults before re-deploying.
7.1 Dry Run — Odin Services
cd terragrunt/environments/your-env-name
terragrunt plan -target="helm_release.odin_services"
7.2 Deploy Odin Services
cd terragrunt/environments/your-env-name
terragrunt apply -target="helm_release.odin_services"
Verification
kubectl get pods
kubectl get ingress # Add the ALB endpoints to your DNS provider
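To confirm every deployment actually finished rolling out rather than just listing pods, loop over the namespace (the default namespace is assumed from the checks above; the timeout is a suggestion):

```shell
# Wait for each Odin deployment to complete its rollout, failing fast on stalls
for d in $(kubectl get deploy -n default -o name); do
  kubectl rollout status "$d" -n default --timeout=300s
done
```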
Phase 8: SigNoz Observability
Purpose: Logs and metrics monitoring.
8.1 Dry Run — SigNoz Charts
cd terragrunt/environments/your-env-name
terragrunt plan -target='helm_release.additional_charts["signoz"]'
terragrunt plan -target='helm_release.additional_charts["k8s-infra"]'
8.2 Deploy SigNoz Charts
cd terragrunt/environments/your-env-name
terragrunt apply -target='helm_release.additional_charts["signoz"]'
terragrunt apply -target='helm_release.additional_charts["k8s-infra"]'
Verification
kubectl get pods -n monitoring
kubectl get ingress -n monitoring # Add the ALB endpoints to your DNS provider
Phase 9: Final Deployment
9.1 Complete Deployment
cd terragrunt/environments/your-env-name
terragrunt apply
This final apply handles any remaining resources not explicitly targeted in previous phases.
9.2 Verify Deployment
# Update kubeconfig
aws eks update-kubeconfig --region $AWS_REGION --name $CLUSTER_NAME
# Check cluster status
kubectl get nodes
kubectl get pods --all-namespaces
# Check Karpenter
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
# Check AWS Load Balancer Controller (deployed to the infrastructure namespace in Phase 3)
kubectl get pods -n infrastructure -l app.kubernetes.io/name=aws-load-balancer-controller
# Check KEDA
kubectl get pods -n keda
# Check Odin Services
kubectl get pods -n default
kubectl get services -n default
kubectl get ingress -n default
# Check all Helm releases
helm list --all-namespaces
Troubleshooting
State lock issues
terragrunt force-unlock <lock-id>
Karpenter not working
kubectl describe nodes
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
Load Balancer issues
kubectl describe ingress -n default
kubectl logs -n infrastructure -l app.kubernetes.io/name=aws-load-balancer-controller
Helm chart issues
helm status <release-name> -n <namespace>
helm rollback <release-name> <revision> -n <namespace>
Cleanup
# Destroy infrastructure
cd terragrunt/environments/your-env-name
terragrunt destroy -auto-approve
# Destroy state bucket (use with caution)
cd terragrunt/environments/your-env-name/state
terragrunt destroy -auto-approve
Monitoring and Logging
# AWS Resources
aws eks describe-cluster --name $CLUSTER_NAME --region $AWS_REGION
aws ec2 describe-instances --filters "Name=tag:kubernetes.io/cluster/$CLUSTER_NAME,Values=owned"
# Kubernetes Resources
kubectl top nodes
kubectl top pods --all-namespaces
kubectl get events --sort-by=.metadata.creationTimestamp
Quick Reference — All Deployment Commands
# Phase 1: State Management
cd terragrunt/environments/your-env-name/state
terragrunt apply
# Phase 2: EKS Infrastructure
cd terragrunt/environments/your-env-name
terragrunt apply -target="aws_vpc.main" -target="aws_internet_gateway.main" \
-target="aws_subnet.public" -target="aws_subnet.private" -target="aws_eip.nat" \
-target="aws_nat_gateway.main" -target="aws_route_table.public" \
-target="aws_route_table.private" -target="aws_route_table_association.public" \
-target="aws_route_table_association.private" -auto-approve
terragrunt apply -target="aws_iam_role.cluster" \
-target="aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy" \
-target="aws_iam_openid_connect_provider.eks" -target="aws_iam_role.node" \
-target="aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly" \
-auto-approve
terragrunt apply -target="aws_eks_cluster.main" \
-target="aws_eks_node_group.main" -target="kubernetes_secret.regcred" -auto-approve
# Phase 3: Storage and Load Balancing
terragrunt apply -target="aws_iam_role.ebs_csi_driver" \
-target="aws_iam_role_policy_attachment.ebs_csi_driver" \
-target="helm_release.ebs_csi_driver" -auto-approve
terragrunt apply -target="aws_iam_role.aws_load_balancer_controller" \
-target="aws_iam_role_policy_attachment.aws_load_balancer_controller" \
-target="aws_iam_policy.aws_load_balancer_controller" \
-target="helm_release.infrastructure" -auto-approve
# Phase 4: Karpenter Autoscaling
terragrunt apply -target="aws_iam_role.karpenter_controller" \
-target="aws_iam_policy.karpenter_controller" \
-target="aws_iam_role_policy_attachment.karpenter_controller" \
-target="aws_iam_role.karpenter_node" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKSWorkerNodePolicy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEKS_CNI_Policy" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEC2ContainerRegistryReadOnly" \
-target="aws_iam_role_policy_attachment.karpenter_node_AmazonEBSCSIDriverPolicy" \
-target="aws_iam_instance_profile.karpenter_node" -auto-approve
# Spot interruption handling (if spot_interruption_handling = true)
terragrunt apply -target="aws_sqs_queue.karpenter_interruption_queue" \
-target="aws_sqs_queue_policy.karpenter_interruption_queue" \
-target="aws_cloudwatch_event_rule.karpenter_interruption" \
-target="aws_cloudwatch_event_target.karpenter_interruption" -auto-approve
terragrunt apply -target="helm_release.karpenter" -auto-approve
terragrunt apply -target="kubernetes_manifest.karpenter_nodepool" \
-target="kubernetes_manifest.karpenter_nodeclass" \
-target='kubernetes_config_map.aws_auth[0]' -auto-approve
# Phase 5: KEDA Autoscaling
terragrunt apply -target="helm_release.keda" -auto-approve
# Phase 6: Data Services
ENABLE_CNPG=true terragrunt apply --target='helm_release.additional_charts["cloudnative-pg"]' -auto-approve
ENABLE_HA_SUPABASE_DB=true terragrunt apply --target='helm_release.additional_charts["ha-supabase-db"]' -auto-approve
if [ "${ENABLE_SUPABASE:-false}" = "true" ]; then
ENABLE_SUPABASE=true terragrunt apply --target='helm_release.supabase[0]' -auto-approve
fi
if [ "${ENABLE_AWS_SERVICES:-false}" = "true" ]; then
terragrunt apply -target="aws_elasticache_subnet_group.redis" \
-target="aws_security_group.redis" \
-target="aws_elasticache_replication_group.redis" \
-target="aws_security_group.rabbitmq" \
-target="aws_mq_broker.rabbitmq" -auto-approve
fi
# Phase 7: Odin Services
terragrunt apply -target="helm_release.odin_services" -auto-approve
# Phase 8: SigNoz (if enabled)
terragrunt apply -target='helm_release.additional_charts["signoz"]' -auto-approve
terragrunt apply -target='helm_release.additional_charts["k8s-infra"]' -auto-approve
# Phase 9: Final Deployment
terragrunt apply -auto-approve
Replace your-env-name with your actual environment name throughout. Always run dry runs (terragrunt plan) first to validate your configuration before applying changes.