Terraform Action Patterns and Guidelines

Disclaimer: I am working at HashiCorp (now IBM) as part of the Terraform Core team. The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
Since I am involved in Terraform my opinions can sometimes be (unconsciously) biased. I hope you enjoy the post anyway.

Introduction

In this post, we will cover common patterns and guidelines for writing Terraform Actions. If you have not heard about Terraform Actions yet, please check out my post Introduction to Terraform Actions. If you want to learn how to write an Action, please take a look at Writing a Terraform Action.

Guidelines

Here are some helpful guidelines when thinking about actions:

There is an exception to every rule so take these guidelines with a grain of salt.

Actions are designed for side effects, not CRUD operations

Before you create an action, make sure the content of the action is a side effect. My mental checklist is:

- Does it have state? If it does, it should not be an action.
- If it creates a remote object, does it require state changes / deletion? If it does, maybe it makes sense to model it as a resource instead.
- Does it temporarily create something for the course of the run? If it does, you might want to look into ephemeral resources.
- Does it change the state of a resource? If it does, please be aware that there might be a new type of action or additional features added to actions later on that help model this properly. Ideally, limit the action to only changing computed attributes so that the action does not contradict the resource’s state.

Actions should not require state changes

If your action changes the remote state of a resource, the next run might either undo the changes (if the attributes the action changed are configured) or update the state of the resource to match the remote state (if the attributes the action changed are computed and not configured). This can lead to unexpected behavior and should be avoided. There might be features down the line for actions that interfere with resource state, so it might make sense to hold off on implementing this until those features are available. Overwriting configured values or values from other providers’ resources, in particular, is a recipe for disaster.

If possible, actions should be retryable

If an action fails due to upstream issues or misconfiguration, the user should be able to retry it. The user can run terraform apply -invoke action.type.name to retry the action. Your action should be designed so that, in case of failure, it can be retried without problems.
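One way to make an action safe to retry is to check whether the work has already been done before doing it again. The following is a minimal sketch of that idea, not code from any real provider: backupStore, memoryStore and ensureBackup are hypothetical names, and the in-memory store only simulates a remote API that fails once.

```go
package main

import (
	"errors"
	"fmt"
)

// backupStore is a hypothetical stand-in for the remote API an action talks to.
type backupStore interface {
	Exists(id string) (bool, error)
	Create(id string) error
}

// ensureBackup creates the backup only if it does not exist yet, so calling it
// again after a failed or interrupted invocation does not duplicate work.
// The returned bool reports whether this call created the backup.
func ensureBackup(store backupStore, id string) (bool, error) {
	exists, err := store.Exists(id)
	if err != nil {
		return false, fmt.Errorf("checking backup %q: %w", id, err)
	}
	if exists {
		return false, nil
	}
	if err := store.Create(id); err != nil {
		return false, fmt.Errorf("creating backup %q: %w", id, err)
	}
	return true, nil
}

// memoryStore is an in-memory fake; failNext simulates one upstream failure.
type memoryStore struct {
	backups  map[string]bool
	failNext bool
}

func (m *memoryStore) Exists(id string) (bool, error) { return m.backups[id], nil }

func (m *memoryStore) Create(id string) error {
	if m.failNext {
		m.failNext = false
		return errors.New("upstream unavailable")
	}
	m.backups[id] = true
	return nil
}

func main() {
	store := &memoryStore{backups: map[string]bool{}, failNext: true}
	// The first attempt fails, the second creates the backup, the third is a no-op.
	for attempt := 1; attempt <= 3; attempt++ {
		created, err := ensureBackup(store, "nightly")
		fmt.Printf("attempt %d: created=%v err=%v\n", attempt, created, err)
	}
}
```

In a real action, the body of the invoke method would play the role of ensureBackup, and the failed first attempt corresponds to the run the user retries.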

Not all actions need to be triggered by `action_trigger` blocks

Some actions might just be one-off operations that are not triggered by any event. For example, an action that creates a backup of a database or a file could be hooked into the lifecycle of a resource, but it is also okay to just run it manually when needed. You don’t need to constrain your action ideas to actions that make sense in the context of the plan & apply lifecycle.

Limit the scope

Actions should be designed to perform a single task well. They should be specialized tooling and (ideally) not general purpose. I am certain that we will have some general purpose actions over time, but whenever possible the action should be somewhat atomic and specific.

Since actions can be used as part of a resource’s lifecycle, they should not try to boil the ocean and do enormous tasks. Ideally, actions should be aimed at one or a few specific resource instances (depending on the task you might need more than one).

Please also be aware that the resource triggering the action can but does not need to be the one the action is acting on. For example, you might want to clear the load balancer cache after updating a different resource that is served by the load balancer.

Patterns

There are a couple of patterns that I ran into when developing / looking at the first actions, so this is my recipe book for them.

When and how to use `resp.SendProgress()`

resp.SendProgress() is used to send progress updates to the Terraform CLI during the execution of an action. It is typically used when an action is performing a long-running task, such as a file upload or a database query.

resp.SendProgress(action.InvokeProgressEvent{
	Message: fmt.Sprintf("Publishing message to SNS topic %s...", topicArn),
})

The code above sends a progress update to the Terraform CLI so that the user can see the progress of the action. Unlike normal logs in Terraform, these progress updates are always visible to the user. This means you should:

- use them sparingly, when it is really important to inform the user
- add a configuration option to control the verbosity of progress updates for less important updates
  - if you are writing an action for a CLI tool that already has verbosity options, I would recommend using those options to control the verbosity of progress updates.
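A verbosity option like this can be implemented as a thin wrapper around the progress call. The sketch below is only an illustration under assumptions: leveledProgress and progressFunc are hypothetical names, and the send function stands in for a call to resp.SendProgress so the example runs without the plugin framework.

```go
package main

import "fmt"

// progressFunc stands in for a call to resp.SendProgress; in a real action it
// would wrap the message in an action.InvokeProgressEvent.
type progressFunc func(message string)

// leveledProgress forwards a message only if its level is within the
// user-configured verbosity. Level 0 messages are always sent.
type leveledProgress struct {
	verbosity int
	send      progressFunc
}

// Report formats the message and sends it if the level is low enough.
func (p leveledProgress) Report(level int, format string, args ...any) {
	if level > p.verbosity {
		return
	}
	p.send(fmt.Sprintf(format, args...))
}

func main() {
	progress := leveledProgress{
		verbosity: 1, // would come from the action's configuration
		send:      func(m string) { fmt.Println(m) },
	}
	progress.Report(0, "Publishing message to SNS topic %s...", "example-topic")
	progress.Report(1, "Request payload has %d bytes", 42)
	progress.Report(2, "Raw response: %s", "{}") // dropped: level 2 > verbosity 1
}
```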

Adding custom validations to an action

You don’t want your action to fail during its invocation since that is in the apply phase, and by then everything should (hopefully) just work. This means we should validate as much as possible as early as possible.

Offline Validations

The earliest way to do this is during the validation phase, meaning when the provider is still offline (not yet configured). This is a great point to check the general shape of things. To hook into the validation workflow, your action needs to implement the interface action.ActionWithValidateConfig (action being imported from "github.com/hashicorp/terraform-plugin-framework/action").

Then you can implement a validation function like I did here for the ansible/ansible provider:

// Source: https://github.com/DanielMSchmidt/terraform-provider-ansible/blob/7ee0264aff1aa9dec2f2b40fbb3d5ce967b9a5f6/framework/action_playbook.go#L205-L227
func (a *runPlaybookAction) ValidateConfig(ctx context.Context, req action.ValidateConfigRequest, resp *action.ValidateConfigResponse) {
	var config runPlaybookActionModel

	resp.Diagnostics.Append(req.Config.Get(ctx, &config)...)
	if resp.Diagnostics.HasError() {
		return
	}

	if !config.VaultFiles.IsUnknown() && !config.VaultPasswordFile.IsUnknown() {
		var vaultFiles []types.String
		resp.Diagnostics.Append(config.VaultFiles.ElementsAs(ctx, &vaultFiles, false)...)
		// We can already do some validations here during plan
		if len(vaultFiles) != 0 && config.VaultPasswordFile.ValueString() == "" {
			resp.Diagnostics.AddAttributeError(path.Root("vault_password_file"), "vault_password_file is not found", "Can not access vault_files without passing the vault_password_file")
		}
	}

	if !config.SSHPrivateKeyFile.IsUnknown() && config.SSHPrivateKeyFile.ValueString() != "" {
		if _, err := os.Stat(config.SSHPrivateKeyFile.ValueString()); os.IsNotExist(err) {
			resp.Diagnostics.AddAttributeError(path.Root("ssh_private_key_file"), "ssh_private_key_file not found", fmt.Sprintf("The SSH private key file %q does not exist: %s", config.SSHPrivateKeyFile.ValueString(), err.Error()))
		}
	}
}

I first parse the config to see if any issues arise there already, then I check that a couple of files are present when the configuration references them. What you validate here depends on your action’s goals.
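If the rules grow, it can help to keep them in a plain function that can be unit tested without the plugin framework. This is a sketch of that design choice, not how the ansible provider is structured: playbookSettings, validationIssue and validateSettings are hypothetical names, and only the vault rule from the example above is mirrored.

```go
package main

import "fmt"

// playbookSettings is a hypothetical plain-Go view of the config values that
// are already known during validation.
type playbookSettings struct {
	VaultFiles        []string
	VaultPasswordFile string
}

// validationIssue pairs an attribute name with a summary and a detail text,
// which is roughly what an attribute diagnostic needs.
type validationIssue struct {
	Attribute string
	Summary   string
	Detail    string
}

// validateSettings checks the shape of the configuration without touching any
// remote system, so it is safe to run while the provider is unconfigured.
func validateSettings(s playbookSettings) []validationIssue {
	var issues []validationIssue
	if len(s.VaultFiles) > 0 && s.VaultPasswordFile == "" {
		issues = append(issues, validationIssue{
			Attribute: "vault_password_file",
			Summary:   "vault_password_file is not set",
			Detail:    "vault_files cannot be used without vault_password_file",
		})
	}
	return issues
}

func main() {
	issues := validateSettings(playbookSettings{VaultFiles: []string{"secrets.yml"}})
	for _, issue := range issues {
		fmt.Printf("%s: %s (%s)\n", issue.Attribute, issue.Summary, issue.Detail)
	}
}
```

ValidateConfig would then only translate each issue into a call like resp.Diagnostics.AddAttributeError with path.Root(issue.Attribute).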

Online Validations (aka planning)

If you need your provider to be online (meaning configured and ready to interact with e.g. an external API) you can make your action plan. To hook into the plan workflow, your action needs to implement the interface action.ActionWithModifyPlan (action being imported from "github.com/hashicorp/terraform-plugin-framework/action").

This is the ModifyPlan method of the action I added to the ansible/ansible provider (and removed again once I remembered validation exists):

// Source: https://github.com/DanielMSchmidt/terraform-provider-ansible/blob/a481fd1a548a1da9cecf3a85be5c751f5ccfd750/framework/action_playbook.go#L205-L211
func (a *runPlaybookAction) ModifyPlan(ctx context.Context, req action.ModifyPlanRequest, resp *action.ModifyPlanResponse) {
	var config runPlaybookActionModel

	resp.Diagnostics.Append(req.Config.Get(ctx, &config)...)
	if resp.Diagnostics.HasError() {
		return
	}

	// The source excerpt ends here; checks that need the configured provider would follow.
}
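What an online check looks like depends entirely on the API behind the provider. As a rough sketch under assumptions, the following verifies during planning that a remote object exists: topicLookup, fakeTopics and checkTopic are hypothetical names, and a real client method would most likely also take a context.Context.

```go
package main

import "fmt"

// topicLookup is a hypothetical slice of a configured provider client.
type topicLookup interface {
	TopicExists(name string) (bool, error)
}

// checkTopic returns an error that ModifyPlan could turn into a diagnostic,
// so the problem surfaces during plan rather than during apply.
func checkTopic(client topicLookup, name string) error {
	exists, err := client.TopicExists(name)
	if err != nil {
		return fmt.Errorf("looking up topic %q: %w", name, err)
	}
	if !exists {
		return fmt.Errorf("topic %q does not exist", name)
	}
	return nil
}

// fakeTopics is an in-memory stand-in for the remote API.
type fakeTopics map[string]bool

func (f fakeTopics) TopicExists(name string) (bool, error) { return f[name], nil }

func main() {
	client := fakeTopics{"orders": true}
	fmt.Println(checkTopic(client, "orders"))
	fmt.Println(checkTopic(client, "missing"))
}
```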

Actions wrapping a local command

Aside from connecting with non-CRUD APIs, another common use case for Terraform actions is wrapping a local command. I’m taking the ansible/ansible provider with the ansible_playbook action as an example again.

The hardest part when writing this action was sending Ansible’s output through resp.SendProgress(). Sending it all in one go when the binary is done is easy, but sending it in chunks is a bit harder.

First, I define a helper for sending the stdout periodically. This helper is an io.WriteCloser that calls the send function at most once per second.

// Source: https://github.com/DanielMSchmidt/terraform-provider-ansible/blob/7ee0264aff1aa9dec2f2b40fbb3d5ce967b9a5f6/framework/action_playbook.go#L506-L545
type TerraformUiWriter struct {
	send      func(s string)
	buffer    string
	closed    bool
	lastFlush time.Time
}

func (t *TerraformUiWriter) Write(p []byte) (n int, err error) {
	if t.closed {
		return 0, errors.New("Writing on closed writer")
	}
	t.buffer += string(p)

	now := time.Now()
	shouldFlush := false

	if t.lastFlush.IsZero() || now.Sub(t.lastFlush) >= time.Second {
		shouldFlush = true
	}

	if shouldFlush && len(t.buffer) > 0 {
		t.send(t.buffer)
		t.buffer = ""
		t.lastFlush = now
	}

	return len(p), nil
}

func (t *TerraformUiWriter) Close() error {
	if t.closed {
		return errors.New("Closing closed writer")
	}
	t.closed = true
	if t.buffer != "" {
		t.send(t.buffer)
		t.buffer = ""
	}
	return nil
}

And this is how it is used:

// Source: https://github.com/DanielMSchmidt/terraform-provider-ansible/blob/7ee0264aff1aa9dec2f2b40fbb3d5ce967b9a5f6/framework/action_playbook.go#L439-L464
	cmd := exec.CommandContext(ctx, ansiblePlaybookBinary, args...)
	if config.SSHDisableHostKeyChecking.ValueBool() {
		cmd.Env = append(cmd.Env, "ANSIBLE_HOST_KEY_CHECKING=false")
	}

	var stderr strings.Builder
	cmd.Stderr = &stderr
	cmd.Stdout = &TerraformUiWriter{
		send: func(s string) {
			if verbosityLevel > 0 {
				resp.SendProgress(action.InvokeProgressEvent{
					Message: fmt.Sprintf("ansible-playbook: %s", s),
				})
			}
		},
	}

	if verbosityLevel > 0 {
		resp.SendProgress(action.InvokeProgressEvent{
			Message: fmt.Sprintf("Running %s", cmd.String()),
		})
	}

	err := cmd.Run()

	stderrStr := stderr.String()

We use exec.CommandContext to create the command so that it is stopped when the context is cancelled, and we set cmd.Stdout to our own writer. Then cmd.Run() starts the command and waits for it to finish.
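A time-based flush can cut a message in the middle of a line. If that matters for your tool, one alternative is to forward only complete lines. The sketch below is a hypothetical variant, not the writer used in the ansible provider: lineWriter is an assumed name, and fmt.Printf stands in for resp.SendProgress.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// lineWriter is a hypothetical variant of TerraformUiWriter: instead of
// flushing on a timer it forwards complete lines, so a progress message never
// ends in the middle of a line.
type lineWriter struct {
	send   func(line string)
	buffer []byte
}

// Write buffers p and sends every complete line found in the buffer.
func (w *lineWriter) Write(p []byte) (int, error) {
	w.buffer = append(w.buffer, p...)
	for {
		i := bytes.IndexByte(w.buffer, '\n')
		if i < 0 {
			break
		}
		w.send(string(w.buffer[:i]))
		w.buffer = w.buffer[i+1:]
	}
	return len(p), nil
}

// Close sends whatever is left without a trailing newline.
func (w *lineWriter) Close() error {
	if len(w.buffer) > 0 {
		w.send(string(w.buffer))
		w.buffer = nil
	}
	return nil
}

// Compile-time check that lineWriter satisfies io.WriteCloser.
var _ io.WriteCloser = (*lineWriter)(nil)

func main() {
	w := &lineWriter{send: func(l string) { fmt.Printf("progress: %q\n", l) }}
	defer w.Close() // flushes the unterminated last line
	// Simulate a command writing output in arbitrary chunks.
	io.WriteString(w, "TASK [ping]\nok: ")
	io.WriteString(w, "[host1]\nPLAY RE")
	io.WriteString(w, "CAP")
}
```

With either writer, keep in mind that os/exec does not close a custom cmd.Stdout writer for you, so Close has to be called after cmd.Run() returns, otherwise the last buffered chunk is never sent.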

Please leave a comment

Do you have feedback around actions or do you want to hear about a specific topic around Terraform / Software Development / Language Design / Infrastructure as Code? Please let me know in the comments below.