Cache your Terraform providers to save space and time

TL;DR:

  • Set the “TF_PLUGIN_CACHE_DIR” environment variable to an existing (initially empty) directory, then rerun “terraform init” to switch to a shared providers directory. This lets Terraform cache providers and reuse them across projects.

Dynamic Terraform providers

Have you noticed that when you run “terraform init”, Terraform fetches additional binaries from the Internet? These binaries are called “dynamic providers” and they are (mainly) interfaces to the infrastructure platforms and services you’re using in your Terraform code.

So, for example, when you’re using AWS as an infrastructure provider, and you also use some templating to generate config files, you might run “terraform version” and see something like this:

$ terraform version
Terraform v0.11.7
+ provider.aws v1.20.0
+ provider.null v1.0.0
+ provider.template v1.0.0

In the example above, three providers are in use: the “null” and “template” providers and a specialized provider for talking to the AWS API. You can find the projects developing these providers on GitHub.

But where does Terraform keep these additional provider binaries? Each directory used as the root of a Terraform project (the one where you run “terraform init”) has a hidden “.terraform” directory. There you can find the downloaded providers:

$ ll -h .terraform/plugins/linux_amd64
-rwxrwxr-x 1 piontec piontec 239 May 24 06:28 lock.json
-rwxr-xr-x 1 piontec piontec 71M May 24 06:28 terraform-provider-aws_v1.20.0_x4
-rwxr-xr-x 1 piontec piontec 12M Apr 19 07:29 terraform-provider-null_v1.0.0_x4
-rwxr-xr-x 1 piontec piontec 12M Apr 19 07:29 terraform-provider-template_v1.0.0_x4

This approach is nice, as all your projects are independent and self-contained: you can copy the whole directory to some other place or even another machine and everything will still work. However, it also has a drawback. If you have multiple Terraform working directories, you have to download and store your providers for every one of them. As you can see, the basic set of providers listed above takes around 100 MB. That is not much by today’s standards, but still. It gets even worse when multiple Terraform projects are used by a group of people: whenever one of them runs “terraform init” in a fresh working directory, they have to wait for the binaries to be downloaded.
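To get a feel for how much space this duplication costs on your own machine, a rough sketch like the one below can help. The function name “tf_dirs_usage” and the default “~/projects” root are made up for illustration; the sketch relies only on the standard “find” and “du” tools.

```shell
# Print the disk usage of every ".terraform" directory below a root directory.
# "$HOME/projects" is a hypothetical default; pass your own root as $1.
tf_dirs_usage() {
    root="${1:-$HOME/projects}"
    # -prune keeps find from descending into a matched .terraform directory,
    # so each working directory is reported exactly once.
    find "$root" -type d -name .terraform -prune -exec du -sh {} +
}
```

Calling “tf_dirs_usage ~/work” prints one line per Terraform working directory found under “~/work”.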

A cache directory for providers to the rescue

The solution to the above problem is actually very simple and already included in Terraform, although a little bit hard to find. You can create a single directory for storing your dynamic providers’ binaries, then set the environment variable “TF_PLUGIN_CACHE_DIR” to its path and you’re done! Every “terraform init” run while this variable is set will use the shared directory to look up and download dynamic providers. That way you can share them across projects and users, as long as they have sufficient access permissions to that directory.
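The setup described above can be sketched as a small shell function. This is only an illustration: the function name “tf_enable_plugin_cache” and the default path are assumptions, not something Terraform ships. The “mkdir -p” step is there because Terraform is documented to use the cache directory only if it already exists.

```shell
# Create the cache directory and point Terraform at it for the current shell.
# The default path is just an example location, not a Terraform default.
tf_enable_plugin_cache() {
    cache_dir="${1:-$HOME/.terraform.d/plugin-cache}"
    # Terraform does not create the cache directory itself, so do it here.
    mkdir -p "$cache_dir" || return 1
    export TF_PLUGIN_CACHE_DIR="$cache_dir"
    echo "TF_PLUGIN_CACHE_DIR=$TF_PLUGIN_CACHE_DIR"
}
```

To keep the setting across sessions, you could define the function in your shell profile and call it there, for example “tf_enable_plugin_cache /opt/terraform-plugin-dir”. Terraform’s CLI configuration file is also documented to accept a “plugin_cache_dir” setting, which may suit you better than an environment variable.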

Let’s see how it works.

$ rm -rf .terraform
$ export TF_PLUGIN_CACHE_DIR=/opt/terraform-plugin-dir
$ terraform init -upgrade=true
$ ll -h /opt/terraform-plugin-dir/linux_amd64
-rwxr-xr-x 1 piontec piontec 71M May 24 06:28 terraform-provider-aws_v1.20.0_x4
-rwxr-xr-x 1 piontec piontec 12M Apr 19 07:29 terraform-provider-null_v1.0.0_x4
-rwxr-xr-x 1 piontec piontec 12M Apr 19 07:29 terraform-provider-template_v1.0.0_x4
$ ll .terraform/plugins/linux_amd64
-rwxrwxr-x 1 piontec piontec 239 May 24 06:28 lock.json
lrwxrwxrwx 1 piontec piontec 71 May 24 06:28 terraform-provider-aws_v1.20.0_x4 -> /opt/terraform-plugin-dir/linux_amd64/terraform-provider-aws_v1.20.0_x4
lrwxrwxrwx 1 piontec piontec 71 May 24 06:28 terraform-provider-null_v1.0.0_x4 -> /opt/terraform-plugin-dir/linux_amd64/terraform-provider-null_v1.0.0_x4
lrwxrwxrwx 1 piontec piontec 75 May 24 06:28 terraform-provider-template_v1.0.0_x4 -> /opt/terraform-plugin-dir/linux_amd64/terraform-provider-template_v1.0.0_x4
$ du -hs .terraform
336K .terraform

So, after setting the “TF_PLUGIN_CACHE_DIR” variable, Terraform downloads providers to that directory and just creates symbolic links from your project’s “.terraform” directory to the shared directory. And that’s it! Now you keep just a single copy of each provider version.
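If you have older working directories that were initialized before the variable was set, they still hold full copies of the binaries. The sketch below lists such files by checking which plugins are not symbolic links into the cache. The function name “tf_uncached_plugins” is hypothetical, and the sketch assumes the “.terraform/plugins/<os>_<arch>” layout shown above.

```shell
# List provider files in a plugin directory that are NOT symlinks into the cache.
# $1: plugin directory, e.g. .terraform/plugins/linux_amd64
# $2: cache directory, e.g. /opt/terraform-plugin-dir
tf_uncached_plugins() {
    plugin_dir="$1"
    cache_dir="$2"
    for f in "$plugin_dir"/terraform-provider-*; do
        # Skip the unexpanded pattern when the glob matched nothing.
        [ -e "$f" ] || [ -L "$f" ] || continue
        if [ -L "$f" ]; then
            target=$(readlink "$f")
            case "$target" in
                "$cache_dir"/*) continue ;;   # linked into the cache: nothing to do
            esac
        fi
        echo "$f"
    done
}
```

For any directory it reports, removing “.terraform” and rerunning “terraform init” with the variable set, as in the transcript above, should replace the copies with links.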

Please keep in mind that each project still selects its providers’ versions on its own: the fact that one Terraform directory downloaded a new provider version to the cache directory doesn’t mean that all other projects will use the new version as well. So, normal versioning restrictions apply. If you want to be sure which version is locked for use with your current project, you can inspect the SHA256 hashes recorded in the “lock.json” file in the “.terraform” directory and compare them with the binaries in the cache:

$ cat .terraform/plugins/linux_amd64/lock.json
{
"aws": "3c78df7e116ed60bf917e4cd5cda5999ada163ebffd9142ea41aa7992252cda8",
"null": "d45c0f02cbc08b3915e143434e575298d4c638a2c93598c12f4ea551cd821abd",
"template": "f1d8e373d9f89d21fade8858a562bed75b463c814a2cf8fb750eb017083f1e88"
}

$ ll /opt/terraform-plugin-dir/linux_amd64/
-rwxr-xr-x 1 piontec piontec 72686016 Apr 19 07:29 terraform-provider-aws_v1.15.0_x4
-rwxr-xr-x 1 piontec piontec 73263584 May 7 08:52 terraform-provider-aws_v1.17.0_x4
-rwxr-xr-x 1 piontec piontec 74119744 May 21 14:05 terraform-provider-aws_v1.19.0_x4
-rwxr-xr-x 1 piontec piontec 74242624 May 24 06:28 terraform-provider-aws_v1.20.0_x4
-rwxr-xr-x 1 piontec piontec 11621440 Apr 19 07:29 terraform-provider-null_v1.0.0_x4
-rwxr-xr-x 1 piontec piontec 11711744 Apr 19 07:29 terraform-provider-template_v1.0.0_x4

$ sha256sum /opt/terraform-plugin-dir/linux_amd64/terraform-provider-aws_v1.20.0_x4
3c78df7e116ed60bf917e4cd5cda5999ada163ebffd9142ea41aa7992252cda8 /opt/terraform-plugin-dir/linux_amd64/terraform-provider-aws_v1.20.0_x4

As you can see, the SHA256 hash for the AWS provider saved in the “lock.json” file matches the hash of the AWS provider v1.20.0 binary saved in the cache directory.
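The manual comparison above can be automated for all providers at once. The following is a minimal sketch, not an official tool: the function name “tf_verify_lock” is made up, it assumes the flat name-to-hash layout of “lock.json” shown above, and it needs “sha256sum” on the PATH.

```shell
# Compare each hash in lock.json with the provider binaries next to it.
# $1: plugin directory containing lock.json, e.g. .terraform/plugins/linux_amd64
# Prints "OK <name>" or "MISMATCH <name>"; returns non-zero on any mismatch.
tf_verify_lock() {
    plugin_dir="$1"
    # Split entries onto separate lines, then turn  "aws": "3c78..."  into  aws 3c78...
    tr ',' '\n' < "$plugin_dir/lock.json" |
    sed -n 's/^[^"]*"\([^"]*\)"[[:space:]]*:[[:space:]]*"\([0-9a-f]*\)".*/\1 \2/p' |
    {
        failed=0
        while read -r name expected; do
            result=MISMATCH
            for f in "$plugin_dir"/terraform-provider-"$name"_v*; do
                [ -e "$f" ] || continue
                # sha256sum follows symlinks, so cached binaries are hashed too.
                actual=$(sha256sum "$f" | cut -d ' ' -f 1)
                [ "$actual" = "$expected" ] && result=OK
            done
            [ "$result" = OK ] || failed=1
            echo "$result $name"
        done
        [ "$failed" -eq 0 ]
    }
}
```

Run from a project root, “tf_verify_lock .terraform/plugins/linux_amd64” prints one line per provider listed in “lock.json”.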