May 19, 2021 | DevOps, Hashicorp, Open Source, Terraform Provider
Developing a Terraform provider is a great thing for a company to do as it allows customers to quickly integrate a product with their existing systems with very little friction. During development, occasionally there might be bugs and issues to fix, and it can be quite difficult to work out what is causing them. In this post, I outline how you can attach a debugger such as Delve to a Terraform provider to save time when solving these issues.
Recently, we have been working with a large CDN and edge cloud platform on improving and upgrading their Terraform provider. In doing so, I came across a quick tip for debugging that I’d like to share.
There are a few things you can use for debugging when working with Terraform providers: acceptance tests, the logs, and a debugger.
Which one of these to use depends on the situation, and each has its benefits. For example, using acceptance tests has the benefit of being able to commit the tests afterwards with little extra effort and having the assurance that there’s a test case to catch the bug in future. Using the logs is probably the quickest of the three methods and can be sufficient for simpler issues. However, using a debugger can give the best visibility into what’s happening in the provider and allow you to precisely trace the execution of the provider. This can also be beneficial if you are unfamiliar with the provider or the Terraform plugin SDK in general, as it will essentially take you on a guided tour of the code.
I also went through this quick tip in a recent video, if you prefer to watch things instead of reading.
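The high-level steps are to make sure that main.go supports running the plugin in debug mode, compile the plugin without optimisations, run it under a debugger with the --debug flag, and point Terraform at the running instance using the TF_REATTACH_PROVIDERS environment variable.
To demonstrate the process, I will use the HashiCups demo provider that HashiCorp use in their tutorials, but it should be easy to adapt this to any other provider using the Terraform plugin SDK.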
Firstly, we need to make sure that the provider is set up for debugging. This means that in the main function, you can use a flag to optionally call plugin.Debug instead of plugin.Serve.
When a Terraform command is run normally, Terraform Core runs the plugin in a subprocess and communicates with it via gRPC, with Terraform Core as the client and the plugin as the server. When it starts the plugin, Terraform Core passes it some data that allows the two to communicate over TLS. This is what plugin.Serve expects, and the plugin will fail if run directly:
$ ./bin/terraform-provider-hashicups
This binary is a plugin. These are not meant to be executed directly. Please execute the program that consumes these plugins, which will load any plugins automatically
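The message above comes from a startup check: before serving, the plugin looks for a value that the host process places in its environment, and refuses to run if that value is missing. The sketch below only illustrates that idea. The variable name EXAMPLE_PLUGIN_COOKIE and its value are hypothetical and are not the ones Terraform uses.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Hypothetical names for illustration only; not the values Terraform uses.
const (
	cookieKey   = "EXAMPLE_PLUGIN_COOKIE"
	cookieValue = "example-secret"
)

var errNotLaunchedByHost = errors.New("this binary is a plugin and must be launched by its host")

// checkHandshake returns an error unless the expected cookie is present
// in the environment. getenv is injected so the check is easy to test.
func checkHandshake(getenv func(string) string) error {
	if getenv(cookieKey) != cookieValue {
		return errNotLaunchedByHost
	}
	return nil
}

func main() {
	// Run directly from a shell, the cookie is normally absent.
	if err := checkHandshake(os.Getenv); err != nil {
		fmt.Println("refusing to serve:", err)
		return
	}
	fmt.Println("handshake ok, a real plugin would start its gRPC server here")
}
```

A debug mode has to bypass a check of this kind, which is why a separate entry point is needed.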
To avoid this, we need to be able to pass a --debug flag and run plugin.Debug instead. This will start up the gRPC server without expecting the data to be passed immediately and print out information needed to tell Terraform to use this instance instead. Your provider may already have this support, but if not, here is an example main.go that can be adapted:
package main

import (
	"context"
	"flag"
	"log"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
	"github.com/hashicorp/terraform-plugin-sdk/v2/plugin"

	"github.com/hashicorp/terraform-provider-hashicups/hashicups"
)

func main() {
	// --debug switches the provider from the normal plugin handshake
	// to a standalone server that a debugger can be attached to.
	var debugMode bool
	flag.BoolVar(&debugMode, "debug", false, "set to true to run the provider with support for debuggers like delve")
	flag.Parse()

	opts := &plugin.ServeOpts{
		ProviderFunc: func() *schema.Provider {
			return hashicups.Provider()
		},
	}

	if debugMode {
		// The address must match the provider source used in the Terraform configuration.
		err := plugin.Debug(context.Background(), "hashicorp.com/edu/hashicups", opts)
		if err != nil {
			log.Fatal(err.Error())
		}
		return
	}

	// Normal mode: expects to be launched by Terraform Core.
	plugin.Serve(opts)
}
Secondly, we need to make sure that the plugin is compiled in “debug mode”, that is, without optimisations enabled. This ensures that the debugger can properly map between the compiled instructions and the source code. Depending on your debugging tool, you might be able to compile and run in one command. For example, Delve has a dlv debug command, and IDEs such as GoLand have a Debug button which takes care of this. The command below shows the required compiler options if invoking the Go compiler directly.
go build -gcflags="all=-N -l"
go tool compile --help
...
-N    Disable optimizations.
-l    Disable inlining.
Whichever compilation method is used, when running the plugin, we need to pass the --debug flag that we added support for earlier. Here is how to do this with Delve:
dlv debug . -- --debug
Similarly, in an IDE such as GoLand, this can be done in the Run Configuration by adding --debug to the program arguments.
Using whichever debugger you prefer, you can now run the provider. For Delve, you will need to type continue at the debugger prompt after running dlv debug. For GoLand, click the Debug button in the Run menu.
$ dlv debug . -- --debug
Type 'help' for list of commands.
(dlv) continue
If everything was set up correctly, the plugin will output a message telling you to set the TF_REATTACH_PROVIDERS environment variable to the value shown. This environment variable instructs Terraform Core not to start a new subprocess for the given provider address, but to attach to the existing plugin process and send its requests to the socket address instead.
Provider started, to attach Terraform set the TF_REATTACH_PROVIDERS env var:
TF_REATTACH_PROVIDERS='{"hashicorp.com/edu/hashicups":{"Protocol":"grpc","Pid":19658,"Test":true,"Addr":{"Network":"unix","String":"/var/folders/qm/swg2hb354d5wp1g6ff6h8w7a2000gn/T/plugin909236753"}}}'
Copy and paste this to another shell, from which you will run Terraform. Make sure to export the variable so that it is passed to Terraform, or alternatively prepend the TF_REATTACH_PROVIDERS=... expression to every invocation of Terraform, for example:
$ export TF_REATTACH_PROVIDERS='{"hashicorp.com/edu/hashicups":{"Protocol":"grpc","Pid":19658,"Test":true,"Addr":{"Network":"unix","String":"/var/folders/qm/swg2hb354d5wp1g6ff6h8w7a2000gn/T/plugin909236753"}}}'
$ terraform apply
$ TF_REATTACH_PROVIDERS='{"hashicorp.com/edu/hashicups":{"Protocol":"grpc","Pid":19658,"Test":true,"Addr":{"Network":"unix","String":"/var/folders/qm/swg2hb354d5wp1g6ff6h8w7a2000gn/T/plugin909236753"}}}' terraform apply
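If Terraform does not seem to pick up the running provider, it can help to look at what the variable actually contains. The following is a minimal sketch that decodes a TF_REATTACH_PROVIDERS value using only the JSON fields visible in the output above. The type and function names are my own, and the socket path in the example is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reattachConfig mirrors the JSON fields visible in the provider's output.
type reattachConfig struct {
	Protocol string
	Pid      int
	Test     bool
	Addr     struct {
		Network string
		String  string
	}
}

// parseReattach decodes a TF_REATTACH_PROVIDERS value into a map
// keyed by provider address.
func parseReattach(value string) (map[string]reattachConfig, error) {
	var out map[string]reattachConfig
	if err := json.Unmarshal([]byte(value), &out); err != nil {
		return nil, fmt.Errorf("decoding TF_REATTACH_PROVIDERS: %w", err)
	}
	return out, nil
}

func main() {
	// Made-up socket path; use the value printed by your own provider.
	const example = `{"hashicorp.com/edu/hashicups":{"Protocol":"grpc","Pid":19658,"Test":true,"Addr":{"Network":"unix","String":"/tmp/plugin-example"}}}`

	providers, err := parseReattach(example)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for addr, cfg := range providers {
		fmt.Printf("%s -> %s over %s://%s (pid %d)\n",
			addr, cfg.Protocol, cfg.Addr.Network, cfg.Addr.String, cfg.Pid)
	}
}
```

The map key is the provider address, so it needs to match the source used in your Terraform configuration for the reattach to apply.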
To verify that this worked, you can observe the log output in the debugger, or even better, set a breakpoint somewhere in the provider that you know will be triggered. For example, in the HashiCups provider, the resourceOrderRead function in hashicups/resource_order.go could be a good option if your HCL uses the hashicups_order resource. See the snippet below for an example that does.
terraform {
  required_providers {
    hashicups = {
      version = "0.3"
      source  = "hashicorp.com/edu/hashicups"
    }
  }
}

provider "hashicups" {
  username = "education"
  password = "test123"
}

resource "hashicups_order" "new" {
  items {
    coffee {
      id = 3
    }
    quantity = 2
  }
  items {
    coffee {
      id = 2
    }
    quantity = 2
  }
}
If you set a breakpoint in resourceOrderRead, then run terraform apply, you will see the breakpoint being triggered in the debugger. You can then step through line by line, see the value of variables, and evaluate expressions. Consult the documentation for your debugger for more details on what you can do.
When the Terraform command completes, you will notice that the plugin stays running. This instance will be reused for all subsequent Terraform commands which have the environment variable set. If you make some changes to the plugin code, be sure to kill the running instance with Ctrl-C or a stop button in your debugger, recompile, rerun, and set the environment variable again to the newly printed value, so that the new changes take effect.
Something that has helped me to better understand the plugin protocol and how the SDK works is to set a breakpoint in the plugin SDK’s implementation of the gRPC interface (e.g. vendor/github.com/hashicorp/terraform-plugin-sdk/v2/plugin/grpc_provider.go), and to watch the communication between Terraform Core and the plugin. Doing this can help you understand when things get called, and also help to diagnose why certain diffs and behaviour occur.
In particular, the PlanResourceChange and ApplyResourceChange functions show exactly the contents of the requests and responses to Terraform Core during a terraform apply, and how the Plugin SDK converts these into the various type systems used internally. For more information on these functions, see the excellent documentation in the Terraform repository, or attach the debugger and walk through it yourself!
Hopefully this tip has been useful. Please get in contact if you have any questions, or if you have any other Terraform debugging tips. If you have a Terraform provider of your own that you would like help to expand or add to, then OpenCredo has a lot of experience with this so feel free to have a chat about that too. Most of all, good luck debugging and see you on the next quick tip!