
Testing Helm Charts: Catch Kubernetes Configuration Bugs Before They Reach Production

Infrastructure code is production code. Untested Helm charts, Terraform modules, and Kubernetes manifests are defects waiting to surface during the worst possible deployment. This guide covers how to test IaC with helm unittest, Conftest policies, Terraform test, and integration tests in real clusters.

6 min read

A Helm chart is code. A Terraform module is code. A Kubernetes manifest is code. Yet most engineering teams who would never ship application code without tests deploy infrastructure changes with nothing more than a manual helm upgrade and a visual inspection of kubectl get pods.

The consequences are predictable: a values.yaml typo disables autoscaling in production, a resource limit misconfiguration causes OOM kills under load, a missing readinessProbe causes traffic to hit unready pods during rolling deployments. All of these are testable and preventable.
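The autoscaling failure mode is typical because Helm renders unknown values keys without complaint. A sketch of how a one-letter typo slips through (chart and key names are illustrative):

```yaml
# values.yaml for a hypothetical chart whose HPA template reads
# .Values.autoscaling.enabled
autoscaling:
  enable: true      # typo: should be "enabled" -- Helm ignores the unknown key,
                    # the condition evaluates false, and no HPA is rendered
  minReplicas: 3
```

Nothing fails at deploy time; the cluster simply runs without autoscaling until someone notices under load.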


The IaC Testing Pyramid

flowchart TD
    A[Unit: Static Analysis\nhelm lint, terraform validate, conftest] --> B
    B[Integration: Cluster Test\nhelm unittest, kind/k3s] --> C
    C[End-to-End: Deploy + Verify\nActual deploy + smoke tests]

    style A fill:#22c55e,color:#fff
    style B fill:#3b82f6,color:#fff
    style C fill:#f59e0b,color:#fff
| Layer | Tools | What It Tests | Speed |
|---|---|---|---|
| Static analysis | helm lint, kubeval, Conftest | Syntax, schema, policies | Seconds |
| Unit tests | helm unittest | Template rendering, values | Seconds |
| Cluster integration | kind + helm install | Real K8s behavior | Minutes |
| E2E deploy test | Staging deploy + test suite | Full system behavior | Minutes |

Layer 1: Helm Lint and Schema Validation

Start with the free, fast checks:

# Lint the chart for syntax errors and best practices
helm lint ./helm/scanly/

# Validate rendered templates against Kubernetes API schema
helm template ./helm/scanly/ | kubeval --strict

# Validate against multiple K8s versions
# (kubeconform is a maintained drop-in replacement for kubeval)
helm template ./helm/scanly/ | kubeval --kubernetes-version 1.28.0
helm template ./helm/scanly/ | kubeval --kubernetes-version 1.29.0

# In CI:
helm template ./helm/scanly/ \
  --set image.tag=test \
  --set environment=staging \
  | kubeval \
  --strict \
  --ignore-missing-schemas
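Helm can also reject bad values at render time: if the chart ships a values.schema.json, helm lint, helm template, and helm install all validate the supplied values against it. A minimal sketch (property names are illustrative, not taken from the actual chart):

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "image": {
      "type": "object",
      "properties": { "tag": { "type": "string" } },
      "required": ["tag"]
    },
    "autoscaling": {
      "type": "object",
      "properties": {
        "enabled": { "type": "boolean" },
        "minReplicas": { "type": "integer", "minimum": 1 }
      },
      "additionalProperties": false
    }
  }
}
```

With `additionalProperties: false`, a misspelled key like `autoscaling.enable` fails the lint instead of being silently ignored.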

Layer 2: Helm Unit Tests

helm unittest lets you assert on rendered template output without deploying:

# Install the plugin
helm plugin install https://github.com/helm-unittest/helm-unittest

# Run tests
helm unittest ./helm/scanly/

# helm/scanly/tests/deployment_test.yaml
suite: Deployment tests
tests:
  - it: should use the specified image tag
    values:
      - values-test.yaml
    set:
      image.tag: 'v1.2.3'
    asserts:
      - equal:
          path: spec.template.spec.containers[0].image
          value: registry.example.com/scanly:v1.2.3
        template: templates/deployment.yaml

  - it: should have readinessProbe configured
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].readinessProbe
        template: templates/deployment.yaml
      - equal:
          path: spec.template.spec.containers[0].readinessProbe.httpGet.path
          value: /api/health
        template: templates/deployment.yaml

  - it: should have resource limits set
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].resources.limits
        template: templates/deployment.yaml
      - isNotNull:
          path: spec.template.spec.containers[0].resources.requests
        template: templates/deployment.yaml

  - it: should not expose secrets as environment variables directly
    asserts:
      - notContains:
          path: spec.template.spec.containers[0].env
          content:
            name: DATABASE_PASSWORD
          any: true  # match on name alone; the value must come via secretKeyRef
        template: templates/deployment.yaml

  - it: should propagate autoscaling.minReplicas to the HPA
    set:
      autoscaling.enabled: true
      autoscaling.minReplicas: 3
    asserts:
      - equal:
          path: spec.minReplicas
          value: 3
        template: templates/hpa.yaml
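The ConfigMap coverage goal from the table below can be expressed the same way: assert that every key the application reads at startup is actually rendered. A sketch, assuming a templates/configmap.yaml and illustrative key names:

```yaml
# helm/scanly/tests/configmap_test.yaml (key names are hypothetical)
suite: ConfigMap tests
tests:
  - it: should expose the keys the application reads at startup
    asserts:
      - isNotNull:
          path: data.LOG_LEVEL
        template: templates/configmap.yaml
      - equal:
          path: data.PORT
          value: "8080"
        template: templates/configmap.yaml
```

This catches the class of bug where a template refactor renames a key and the application falls back to a default nobody intended.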

Layer 3: Policy Testing with Conftest

Conftest uses OPA (Open Policy Agent) Rego policies to enforce organizational standards across all Kubernetes manifests:

# policies/kubernetes/deny_latest_tag.rego
package kubernetes.deployment

deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  contains(container.image, ":latest")
  msg := sprintf("Container '%v' uses ':latest' tag", [container.name])
}

deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not contains(container.image, ":")
  msg := sprintf("Container '%v' has no image tag", [container.name])
}

# policies/kubernetes/require_resource_limits.rego
package kubernetes.resources

deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.resources.limits
  msg := sprintf("Container '%v' is missing resource limits", [container.name])
}

deny[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  not container.resources.requests
  msg := sprintf("Container '%v' is missing resource requests", [container.name])
}

warn[msg] {
  input.kind == "Deployment"
  container := input.spec.template.spec.containers[_]
  # memory limits are strings like "512Mi" or "4Gi"; units.parse_bytes converts
  # them (to_number("4Gi") is undefined, so the rule would silently never fire)
  units.parse_bytes(container.resources.limits.memory) > units.parse_bytes("4Gi")
  msg := sprintf("Container '%v' has memory limit exceeding 4Gi", [container.name])
}

# Run Conftest against rendered helm templates
helm template ./helm/scanly/ --set image.tag=v1.0.0 \
  | conftest test --policy policies/kubernetes/ -

# Example output (conftest prints a line per failure/warning plus a summary):
# FAIL - - kubernetes.deployment - Container 'frontend' uses ':latest' tag
#
# 6 tests, 5 passed, 0 warnings, 1 failure, 0 exceptions
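The policies themselves deserve tests: conftest verify runs Rego unit tests (files ending in _test.rego) against the policy package, so a refactor can't silently stop a rule from firing. A sketch for the :latest rule:

```rego
# policies/kubernetes/deny_latest_tag_test.rego
package kubernetes.deployment

test_latest_tag_is_denied {
  deny[_] with input as {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
      {"name": "web", "image": "registry.example.com/scanly:latest"}
    ]}}}
  }
}

test_pinned_tag_is_allowed {
  count(deny) == 0 with input as {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
      {"name": "web", "image": "registry.example.com/scanly:v1.2.3"}
    ]}}}
  }
}
```

Run with `conftest verify --policy policies/kubernetes/` alongside the main policy check in CI.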

Layer 4: Terraform Testing

# modules/scanly-vpc/tests/vpc_test.tftest.hcl
variables {
  environment = "test"
  region      = "us-east-1"
  cidr_block  = "10.0.0.0/16"
}

run "verify_vpc_created" {
  command = plan

  assert {
    condition     = aws_vpc.main.cidr_block == var.cidr_block
    error_message = "VPC CIDR block does not match input variable"
  }

  assert {
    condition     = aws_vpc.main.enable_dns_hostnames == true
    error_message = "DNS hostnames must be enabled"
  }
}

run "verify_subnets_in_multiple_azs" {
  command = plan

  assert {
    condition     = length(distinct(aws_subnet.private[*].availability_zone)) >= 2
    error_message = "Private subnets must span at least 2 availability zones"
  }
}

# Run Terraform tests
terraform test

# With specific test directory
terraform test -test-directory=./tests
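Some assertions belong in the module itself rather than the test file: Terraform's variable validation blocks fail at plan time for every caller, not just in CI. A sketch for the CIDR input used above:

```hcl
variable "cidr_block" {
  type        = string
  description = "CIDR range for the VPC"

  validation {
    # can() turns the cidrhost() error on malformed input into false
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "cidr_block must be a valid IPv4 CIDR, e.g. 10.0.0.0/16."
  }
}
```

Test files then cover cross-resource behavior, while validation blocks guard each input at its source.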

CI/CD Pipeline Integration

# .github/workflows/iac-tests.yml
name: Infrastructure Tests
on:
  pull_request:
    paths:
      - 'helm/**'
      - 'terraform/**'
      - 'deploy/**'

jobs:
  helm-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Helm
        uses: azure/setup-helm@v3
        with:
          version: v3.13.0

      - name: Helm lint
        run: helm lint ./helm/scanly/

      - name: Install helm-unittest
        run: helm plugin install https://github.com/helm-unittest/helm-unittest

      - name: Run helm unit tests
        run: helm unittest ./helm/scanly/

      - name: Install conftest
        run: |
          curl -L https://github.com/open-policy-agent/conftest/releases/latest/download/conftest_Linux_x86_64.tar.gz \
            -o conftest.tar.gz
          tar xzf conftest.tar.gz conftest
          sudo mv conftest /usr/local/bin/

      - name: Run policy tests
        run: |
          helm template ./helm/scanly/ --set image.tag=ci \
            | conftest test --policy policies/kubernetes/ -

  terraform-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3

      - name: Terraform format check
        run: terraform fmt -check -recursive

      - name: Terraform validate
        run: |
          terraform init -backend=false
          terraform validate

      - name: Terraform test
        run: terraform test
        env:
          AWS_DEFAULT_REGION: us-east-1
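The cluster-integration layer from the pyramid has no job above. A sketch using the helm/kind-action GitHub Action to deploy into a throwaway kind cluster (the service name and health path are assumptions, not taken from the actual chart):

```yaml
  kind-integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Create kind cluster
        uses: helm/kind-action@v1

      - name: Install chart and wait for rollout
        run: |
          helm install scanly ./helm/scanly/ --set image.tag=ci \
            --wait --timeout 5m

      - name: Smoke test the deployed service
        run: |
          kubectl port-forward svc/scanly 8080:80 &
          sleep 5
          curl -fsS http://localhost:8080/api/health
```

This is slower than the unit layers, so it typically runs only on pull requests that touch the chart, not on every push.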

Related articles: Also see ephemeral Kubernetes environments as the target for your Helm deployments, testing Terraform and Pulumi alongside Helm for full IaC coverage, and Docker as the runtime underpinning both Helm charts and test containers.


Helm Testing Coverage Goals

| Test | Tool | Pass Criterion |
|---|---|---|
| All templates render without error | helm lint | Zero errors |
| Image tags are never ":latest" | Conftest | Zero violations |
| Resource limits set on all containers | Conftest | Zero violations |
| readinessProbe on all deployments | helm unittest | Test passes |
| Secrets use secretKeyRef | Conftest | Zero violations |
| HPA min replicas >= 2 in prod | helm unittest | Test passes |
| ConfigMap keys match application expectations | helm unittest | Test passes |

Testing infrastructure code is one of the highest-leverage activities in a platform engineering practice. The cost of an untested Helm chart bug in production is orders of magnitude higher than the cost of five minutes of unit tests in CI.

Verify your application layer is healthy after every infrastructure change: Try ScanlyApp free and run automated functional checks after each infrastructure deployment.
