What is tf.libsonnet?
This document walks through an overview of Jsonnet and tf.libsonnet, as well as some of the reasons you might consider using it for your next Terraform project.
What is Jsonnet?
Jsonnet is a data templating language originally created by Google. It is a superset of JSON that adds programming constructs to the language. Think of it as JSON enhanced with constructs such as variables, functions, conditionals, arithmetic, and imports.
Jsonnet is designed and optimized for the use case of writing configuration files. Check out their design rationale document for more details.
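As a rough illustration (a hypothetical snippet, not tied to tf.libsonnet or any particular library), here are a few of those constructs in action:

// Hypothetical example: plain Jsonnet that evaluates to ordinary JSON.
local env = 'stage';                   // variables
local mkTags(name) = { Name: name };   // functions
{
  bucket: 'my-%s-bucket' % env,        // string formatting
  tags: mkTags(env) + { Env: env },    // object merging
}

Running this through jsonnet yields plain JSON: { "bucket": "my-stage-bucket", "tags": { "Env": "stage", "Name": "stage" } }.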
What is tf.libsonnet?
tf.libsonnet is a collection of pure Jsonnet libraries that provide functions and utilities for generating Terraform code (as JSON configuration syntax). The libraries allow you to streamline the experience of writing static Terraform JSON configuration.
Here is an example:
local tf = import 'github.com/tf-libsonnet/core/main.libsonnet';
local tfnull = import 'github.com/tf-libsonnet/hashicorp-null/main.libsonnet';
local o =
  tf.withProvider('null', {}, src='hashicorp/null', version='~>3.0')
  + tf.withVariable('trigger_id')
  + tfnull.resource.new(
    'this',
    triggers={
      trigger_id: '${var.trigger_id}',
    },
  )
  + tf.withOutput(
    'this_id',
    o._ref.null_resource.this.get('id'),
  );

o
This is equivalent to the following Terraform JSON code:
{
  "terraform": {
    "required_providers": {
      "null": {
        "source": "hashicorp/null",
        "version": "~>3.0"
      }
    }
  },
  "provider": {
    "null": [
      { }
    ]
  },
  "variable": {
    "trigger_id": { }
  },
  "resource": {
    "null_resource": {
      "this": {
        "triggers": {
          "trigger_id": "${var.trigger_id}"
        }
      }
    }
  },
  "output": {
    "this_id": {
      "value": "${null_resource.this.id}"
    }
  }
}
Comparison to Terraform HCL
Limitations of HCL
As a configuration language, Jsonnet is very similar to HCL. Both languages strive to enhance the experience of writing static declarative configuration files in YAML or JSON.
That same example above can be written in Terraform HCL in a more terse and readable format:
terraform {
  required_providers {
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
  }
}

variable "trigger_id" {}

resource "null_resource" "this" {
  triggers = {
    trigger_id = var.trigger_id
  }
}

output "this_id" {
  value = null_resource.this.id
}
HCL provides many of the same constructs as Jsonnet, which means that there isn’t much to gain from using Jsonnet for a single Terraform module.
However, using Jsonnet can greatly increase the maintainability of large-scale Terraform deployments that span multiple modules and state files. When expanding a Terraform project to many smaller modules, you run into many limitations of the Terraform HCL language that are caused by its tight coupling to the Terraform runtime.
To name a few:
- Code reusability is limited to a single module, and thus a single state file.
- Certain constructs cannot be interpolated dynamically (e.g., lifecycle and backend).
- Certain blocks cannot be reused across modules (e.g., provider).
- A single module cannot write to multiple state files.
Motivating example
Consider a Terraform deployment where you have a single database in a VPC, across two environments (stage and prod).
To simplify the example, we will also assume we have defined submodules for a canonical VPC and a MySQL database. This reduces the root modules in our examples to a single module block.
In this example, we will assume that we want to follow best practices and isolate the state files for the components. This will result in four state files:
- VPC in Stage
- VPC in Prod
- MySQL database in Stage
- MySQL database in Prod
To achieve this, we need to define the components with four root modules, one for each state file above. We will have a project structure like below:
.
├── stage
│   ├── mysql
│   │   ├── backend.tf
│   │   ├── main.tf
│   │   └── provider.tf
│   └── vpc
│       ├── backend.tf
│       ├── main.tf
│       └── provider.tf
└── prod
    ├── mysql
    │   ├── backend.tf
    │   ├── main.tf
    │   └── provider.tf
    └── vpc
        ├── backend.tf
        ├── main.tf
        └── provider.tf
The main.tf file for each component contains a single module call to define the underlying component infrastructure with some hardcoded parameters as the inputs. For example, the stage/vpc/main.tf might look like:
module "vpc" {
source = "github.com/myorg/my-vpc?ref=v1.0.8"
name = "stage"
cidr_block = "10.0.0.0/16"
}
output "vpc_id" {
value = module.vpc.vpc_id
}
The provider.tf file will contain the provider configuration with the required_providers block to specify the version:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}
Finally, the backend.tf file will contain the backend configuration for storing the state file in S3. For example, for stage/vpc:
terraform {
  backend "s3" {
    bucket = "my-stage-bucket"
    key    = "vpc/terraform.tfstate"
    region = "us-west-2"
  }
}
This setup works well for a small-scale deployment like the one above, or for a simple provider and backend configuration. But what if you have hundreds of components distributed across tens of environments? And what if you need to adjust your provider configurations to use different AWS IAM Roles, or you want to enhance them with allowed_account_ids?
The challenge is that you can’t modularize the provider and backend configurations due to the aforementioned limitations. This results in the provider.tf and backend.tf files being mostly duplicated across all the modules and maintained manually, leading to a painful refactoring process if you ever need to change a pattern in one of these files.
We can address this by using Jsonnet.
Using Jsonnet to DRY multi-component, multi-environment Terraform projects
In the previous example we discussed the limitations of Terraform HCL around DRY-ing certain components. Now let’s see what we can accomplish with Jsonnet.
INFO: All the code is available at https://github.com/tf-libsonnet/infrastructure-live-example.
First, since Jsonnet is not bound by Terraform, we can essentially modularize any component using functions. Even though Terraform may not support this, in Jsonnet we can create a function that generates the provider and backend blocks and reuse it wherever we need it:
provider.libsonnet
local const = import './constants.json';
local aws = import 'github.com/tf-libsonnet/hashicorp-aws/main.libsonnet';
aws.provider.new(region=const.region)
backend.libsonnet
local const = import './constants.json';
{
  getBackend:: function(envName, component) {
    // NOTE: the + syntax in the keys. This indicates a merge operation, which
    // means deep merge the nested object as opposed to replacing the
    // contents.
    terraform+: {
      backend+: {
        s3: {
          bucket: 'my-' + envName + '-bucket',
          key: component + '/terraform.tfstate',
          region: const.region,
        },
      },
    },
  },
}
NOTE: We use the .libsonnet extension here. Conventionally, Jsonnet libraries intended to be imported use the .libsonnet extension, while root files that are meant to be processed by jsonnet use the .jsonnet extension.
In addition to modularizing these blocks, we can also interpolate values in the backend block to dynamically generate the content based on the given parameters. This is possible because the values are static by the time Terraform sees them, post-compilation. In this way, we can construct the backend block for each environment and component without repeating all the boilerplate.
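For example, calling the function above as getBackend('stage', 'vpc') compiles down to roughly the same static backend block we wrote by hand earlier (the argument values here are just for illustration):

{
  "terraform": {
    "backend": {
      "s3": {
        "bucket": "my-stage-bucket",
        "key": "vpc/terraform.tfstate",
        "region": "us-west-2"
      }
    }
  }
}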
We also extracted the hardcoded region into a reusable JSON file to promote more DRY code. Since Jsonnet is a superset of JSON, we can import plain JSON like normal Jsonnet. The contents of constants.json look as follows:
{"region": "us-west-2"}
Moving on, let’s modularize the VPC and MySQL components by defining a function for generating the relevant blocks:
vpc.libsonnet
local tf = import 'github.com/tf-libsonnet/core/main.libsonnet';
{
  // Given the environment name and cidr block, return the VPC module call with
  // output blocks.
  getVPC:: function(envName, cidrBlock) (
    // Bind the resulting object to a reference so we can refer to self.
    local o =
      tf.withModule(
        'vpc',
        'github.com/myorg/my-vpc?ref=v1.0.8',
        {
          name: envName,
          cidr_block: cidrBlock,
        },
      )
      + tf.withOutput('vpc_id', o._ref.module.vpc.get('vpc_id'));
    o
  ),
}
mysql.libsonnet
local tf = import 'github.com/tf-libsonnet/core/main.libsonnet';
local aws = import 'github.com/tf-libsonnet/hashicorp-aws/main.libsonnet';
{
  // Given the environment name, return the VPC data source lookup with mysql
  // module call blocks.
  getMySQL:: function(envName) (
    // Bind the resulting object to a reference so we can refer to self.
    local o =
      aws.data.vpc.new(
        'vpc',
        tags={
          Name: envName,
        },
      )
      + tf.withModule(
        'mysql',
        'github.com/myorg/my-mysql?ref=v1.0.8',
        {
          name: envName,
          vpc_id: o._ref.data.aws_vpc.vpc.get('id'),
        },
      )
      + tf.withOutput('fqdn', o._ref.module.mysql.get('fqdn'));
    o
  ),
}
At this point, your folder structure should look like the following:
.
├── jsonnetfile.json
├── jsonnetfile.lock.json
└── stack
    ├── backend.libsonnet
    ├── constants.json
    ├── mysql.libsonnet
    ├── provider.libsonnet
    └── vpc.libsonnet
Here, the jsonnetfile.json and jsonnetfile.lock.json files are the jsonnet-bundler configurations for installing tf.libsonnet. You can generate them using the following commands:
jb init
jb install github.com/tf-libsonnet/hashicorp-aws@main
# NOTE: we don't need to explicitly install tf-libsonnet/core because
# it will be pulled in as a transitive dependency of the hashicorp-aws
# library.
With these building blocks, we can define what a single environment should look like:
stack/main.libsonnet
local backend = import './backend.libsonnet';
local mysql = import './mysql.libsonnet';
local provider = import './provider.libsonnet';
local vpc = import './vpc.libsonnet';
{
  // getStack returns the Terraform code to deploy the full application stack
  // for a single environment. To facilitate this, it returns the resulting
  // main.tf.json for each folder in the environment stack, which should be
  // extracted by the -m flag of jsonnet.
  getStack:: function(envName, cidrBlock) {
    [envName + '/vpc/main.tf.json']: (
      backend.getBackend(envName, 'vpc')
      + provider
      + vpc.getVPC(envName, cidrBlock)
    ),
    [envName + '/mysql/main.tf.json']: (
      backend.getBackend(envName, 'mysql')
      + provider
      + mysql.getMySQL(envName)
    ),
  },
}
Note how the getStack function returns an object with keys that look like folder/file paths. This uses a very useful feature of jsonnet that allows outputting multiple documents from a single jsonnet program. Using this, we can generate all of our environment/component combinations with two function calls:
main.jsonnet
local stack = import './stack/main.libsonnet';
stack.getStack('stage', '10.0.0.0/16')
+ stack.getStack('prod', '10.1.0.0/16')
The final folder structure will look something like this:
.
├── jsonnetfile.json
├── jsonnetfile.lock.json
├── main.jsonnet
└── stack
    ├── backend.libsonnet
    ├── constants.json
    ├── main.libsonnet
    ├── mysql.libsonnet
    ├── provider.libsonnet
    └── vpc.libsonnet
You can run this through jsonnet to generate the Terraform folder structure:
$ jb install
$ jsonnet -J ./vendor -c -m out main.jsonnet
$ tree ./out
./out
├── prod
│   ├── mysql
│   │   └── main.tf.json
│   └── vpc
│       └── main.tf.json
└── stage
    ├── mysql
    │   └── main.tf.json
    └── vpc
        └── main.tf.json
As a sample, here is what stage/vpc/main.tf.json looks like:
{
  "module": {
    "vpc": {
      "cidr_block": "10.0.0.0/16",
      "name": "stage",
      "source": "github.com/myorg/my-vpc?ref=v1.0.8"
    }
  },
  "output": {
    "vpc_id": {
      "value": "${module.vpc.vpc_id}"
    }
  },
  "provider": {
    "aws": [
      {
        "region": "us-west-2"
      }
    ]
  },
  "terraform": {
    "backend": {
      "s3": {
        "bucket": "my-stage-bucket",
        "key": "vpc/terraform.tfstate",
        "region": "us-west-2"
      }
    }
  }
}
What is neat about this approach is that adding a new environment only requires modifying main.jsonnet. E.g., adding a new dev env:
local stack = import './stack/main.libsonnet';
stack.getStack('dev', '10.2.0.0/16')
+ stack.getStack('stage', '10.0.0.0/16')
+ stack.getStack('prod', '10.1.0.0/16')
Additionally, since the resulting code is plain Terraform, the runtime can be handled by anything that supports Terraform natively. This means that you should be able to use your existing Terraform pipelines without modification to deploy the resulting changes (e.g., Terraform Cloud / Terraform Enterprise, Atlantis, Spacelift, env0, etc.). This can greatly help with incrementally transitioning to using Jsonnet.
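For instance, as a quick local sanity check (assuming the S3 backend bucket and your AWS credentials are already in place), you could run plain terraform directly against one of the generated folders:

$ cd out/stage/vpc
$ terraform init
$ terraform plan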
Summary
To summarize the comparison:
- Terraform HCL has limitations on what you can modularize and reuse.
- Terraform HCL is limited to managing a single state file.
- Jsonnet is not bound by the Terraform runtime, giving you (mostly) free rein over what you can reuse and generate. This includes provider, lifecycle, and backend blocks.
- Jsonnet can be made to generate multiple folders, and thus automatically generate multi-state Terraform project structures.
Comparison to other tools
If you have run across these limitations of Terraform HCL, you might have been introduced to other tools in this space that attempt to solve these problems. To name a few:
- Terragrunt
- Terramate
- CDKTF (CDK for Terraform)
- Pulumi
Each of these tools attempts to solve the problem with a similar approach: a templating abstraction (Terragrunt and Terramate use an HCL abstraction, while CDKTF and Pulumi use general purpose programming languages).
However, in addition to being a templating abstraction, these tools also attempt to manage the lifecycle of the resources, and thus the Terraform runtime. For example, instead of running terraform plan and terraform apply, you might run:
- terragrunt plan and terragrunt apply
- pulumi up
- cdktf deploy
- terramate run
This control gives each of these tools the extensibility to implement feature enhancements that are not provided by terraform natively (e.g., terragrunt run-all and dependency blocks).
However, there are two significant disadvantages that you trade for this power:
- Leaky abstractions. The tools try to abstract away Terraform, but because of the law of leaky abstractions, it is impossible to completely hide it. Bugs and issues in Terraform frequently bubble up to the abstraction layer, which can lead to major confusion around debuggability. To most users, it is not clear whether an issue is at the Terraform layer or the abstraction layer. In some cases, dropping down to Terraform is necessary to debug the issue. Admittedly, most of these tools provide a way to debug such issues, but it may not feel natural (see terragrunt debugging for an example).
- Poor integration support. Runtime services need to natively support the runtime and outputs of these tools because these tools depend on being invoked directly. This adds a barrier to entry, as each runtime needs to decide whether it should expend time on developing a native integration. Typically, many services start with native Terraform support and only later add these tools as an afterthought, or may not support them at all. For example, Terraform Cloud / Terraform Enterprise does not natively support calling anything other than terraform. This may prevent your team from adopting these tools.
Using Jsonnet addresses these issues by adopting a different paradigm: the Terraform Compiler Pattern. In this approach, the tool is purely a templating abstraction and does not control any part of actually running terraform. This results in a clean separation between the Jsonnet world and the terraform world.
Here is how that addresses the above disadvantages:
- Clear separation of boundaries. Note that using Jsonnet does not preclude you from learning Terraform. You must still be aware of Terraform and the associated runtime. However, by explicitly separating out the steps, we embrace the leaky abstraction instead of hiding it. This gives you clear entrypoints for addressing an issue. Need to drop down to terraform? Interact directly with the “compiled” code. Found an issue in the generated terraform output? Focus on the Jsonnet code to compare the generated outputs.
- Any runtime/tool that natively supports Terraform is supported. Because the output is pure Terraform JSON code, any tool that works with Terraform can work with the Jsonnet output. The Jsonnet is only a preprocessing step for generating the Terraform code, and does not care about how the resulting code is run or deployed. For example, you can integrate with Terraform Cloud / Terraform Enterprise by having a secondary repository from your Jsonnet code that stores the compiled Terraform code.
Note that some of the aforementioned tools support a similar pattern by focusing on just the templating feature. For example, with cdktf, you can use cdktf synth to achieve a similar effect.
In this case, the line between Jsonnet and cdktf comes down to the implementation language: whether you want to write Terraform using Jsonnet, HCL, or a general purpose programming language. This will mostly come down to personal preference, but Jsonnet has a few notable differentiating factors:
- Hermeticity. Jsonnet code is hermetic, meaning that the same JSON output is always generated regardless of the runtime environment (e.g., in CI or locally).
- Imposed limitations. Jsonnet is not a free-form general purpose language. This means that what you can do in Jsonnet is limited (e.g., you cannot make a network call to retrieve resource information). Whether you see this as a disadvantage or an advantage will depend on your specific use case and environment. In some situations, this is a desirable factor, as you can ensure your code is kept simple with limited points of failure. It also increases the safety of consuming third party libraries, since you can be sure they won’t be able to do certain things, like install and execute a foreign binary.
- Simple. Jsonnet is a relatively simple language. The official tutorial covers everything there is to know about Jsonnet, and can be completed in a day at most.
If you are convinced that Jsonnet will work for you, then tf.libsonnet should greatly enhance the Terraform writing experience.