Terraform uses AWS IAM user credentials to manage resources in the AWS cloud. It does so by using the access key and secret key of the IAM user. Hence, Terraform’s ability to manage (create/update/delete) resources depends on the permissions associated with the AWS IAM user.
When I started working with Terraform to manage resources in the AWS cloud, I used the credentials of an IAM user that had Administrative privileges on that account. Using such credentials ensured that I had no issues with permissions and could focus on making the Terraform configuration manage all types of resources. The approach worked great for me because the Terraform configurations never failed with an access-related error. It was also fast and easy to set up because, by granting Administrative permissions, I covered all present and future Terraform configuration use cases. However, from a security standpoint, do we want such an IAM user credential to manage Terraform configurations?
In this note, I discuss how to manage security early in an organization’s IaC journey. When I started working in the AWS cloud, using an Administrator IAM user allowed me to manage whatever cloud resources I needed using Terraform configuration. E.g., I could manage a VPC, Lambda, EC2, ECR, Route 53, etc., using the same IAM user credential. That was cool because I did not need to focus on separate user credentials to manage different resources. The question to ask at that time was: is that secure? Should a user who needs access to rooms A, B, and C also be given access to rooms D, E, and F? The answer could be (a) no, or (b) it depends: is the user going to need access to rooms D, E, and F in the future?
If the answer is (a) no, and the user should not have access to resources that it does not manage, I want to share how to achieve that. But, before that, I want to take a short detour towards planning an infrastructure deployment.
At a high level, I can think of two IaC approaches: (i) a single Terraform configuration to deploy the entire IaC stack, and (ii) multiple smaller Terraform configurations to deploy the IaC stack, which is modularized based on persistence.
Deploying the entire IaC stack via one Terraform configuration is doable, but it tends to get challenging to manage as the configuration grows and newer infrastructure components are added. Therefore, it is better to follow the second approach and divide the Terraform configuration into smaller configurations that can be deployed in layers. You can read more about that in my previous note on the layered IaC approach. The gist of the idea is that you build your infrastructure by deploying separate Terraform configuration projects one after the other. While creating these individual Terraform configuration projects, ensure that (a) they manage resources that have the same lifecycle and persistence and (b) there is high cohesion within a Terraform configuration and low coupling between separate Terraform configuration projects.
The benefits of such an approach are:
-easy to manage: the Terraform configuration is smaller and hence expected to be relatively simple to understand
-quick to implement: deploying a smaller Terraform configuration would take less time, and the feedback (success/failure) would be faster
-secure: permissions can be scoped more narrowly than with a single IaC Terraform configuration.
I mention that it is secure, but with a caveat: it is secure only if it is managed accordingly. As I stated earlier, Terraform uses IAM user credentials to manage resources in the AWS cloud. Following the layered deployment approach, someone could restrict an AWS IAM user’s permissions to only the resources listed in the Terraform configuration project. This approach also adheres to the principle of least privilege, where an entity is given the minimum level of access or permission needed to carry out its task. You can read more about this in the Wikipedia article on the principle of least privilege.
I now discuss achieving the principle of least privilege while working with Terraform to manage AWS cloud resources.
There are broadly three types of access that any Terraform configuration project may require:
-permission to store the state file in an S3 bucket and work with a DynamoDB table to manage concurrent access,
-(optional) read permission to access the outputs of an existing Terraform configuration, and
-permission to manage resources as mentioned in the Terraform configuration.
Each of these permissions should be a separate IAM policy, defined in its own JSON policy file.
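To illustrate the first permission type, here is a minimal Python sketch that generates a JSON policy file for the remote state. The bucket name, state key, table name, region, and account ID are hypothetical placeholders. The action list reflects my understanding of what the Terraform S3 backend needs for state storage with DynamoDB locking, so treat it as a starting point and verify it against the documentation for your Terraform version.

```python
import json


def state_backend_policy(bucket, state_key, table_arn):
    """Build an IAM policy document for Terraform remote state in S3 with DynamoDB locking."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing the bucket lets Terraform locate the state object.
                "Sid": "ListStateBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Read and write access is limited to this project's state key.
                "Sid": "ReadWriteStateObject",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{state_key}",
            },
            {
                # The lock table is used to manage concurrent access.
                "Sid": "StateLocking",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeTable",
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:DeleteItem",
                ],
                "Resource": table_arn,
            },
        ],
    }


# Hypothetical names, for illustration only.
state_policy = state_backend_policy(
    bucket="example-terraform-state",
    state_key="network/terraform.tfstate",
    table_arn="arn:aws:dynamodb:us-east-1:111111111111:table/example-terraform-locks",
)
print(json.dumps(state_policy, indent=2))
```

Scoping the object statement to a single state key, rather than the whole bucket, keeps one Terraform configuration project from modifying another project's state.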
As Terraform uses an IAM user’s credentials to communicate with the AWS cloud, here is a list of a few IAM best practices:
-attach permission policies to groups instead of individual IAM users,
-add IAM users to IAM groups to reduce management overhead,
-use IAM roles where possible, and
-rotate credentials frequently on a schedule.
Let me now demonstrate how I adopted the above IAM best practices and applied them in a use case to manage resources in a specific AWS account.
One of the IAM best practices is to use roles. Hence, I created a trusted-trusting account relationship between two AWS accounts.
In the AWS (trusting) account where the cloud resources existed, I created:
-a policy to manage the remote state and concurrent access,
-a policy to manage the specific AWS cloud resources, ideally listing the exact actions in the policy file, and
-a role that a user in the trusted AWS account could assume, with the above two policies attached.
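For a role to be assumable from another account, it also needs a trust policy that names who may assume it. The following Python sketch generates such a trust policy document. The account ID and user name are hypothetical, and naming a single user as the principal is my assumption about how narrowly to scope the trust; naming the trusted account itself is a common alternative.

```python
import json


def role_trust_policy(trusted_principal_arn):
    """Build the trust policy for a role in the trusting account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only this principal from the trusted account may assume the role.
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical trusted-account user, for illustration only.
trust_policy = role_trust_policy(
    "arn:aws:iam::111111111111:user/example-terraform-user"
)
print(json.dumps(trust_policy, indent=2))
```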
Then, in the AWS (trusted) account, whose user credentials a Terraform configuration project used, I created:
-a policy that allowed “assume-role” on the role in the trusting account,
-an IAM group, and attached the policy to that group, and
-a user, whom I added to the IAM group.
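The policy attached to the group in the trusted account can be very small, since all it needs to grant is permission to assume one specific role in the trusting account. Here is a minimal Python sketch that generates it; the account ID and role name are hypothetical placeholders.

```python
import json


def assume_role_policy(role_arn):
    """Build a policy that allows assuming exactly one role in the trusting account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                # Restrict the permission to a single role rather than "*".
                "Resource": role_arn,
            }
        ],
    }


# Hypothetical role in the trusting account, for illustration only.
group_policy = assume_role_policy(
    "arn:aws:iam::222222222222:role/example-terraform-role"
)
print(json.dumps(group_policy, indent=2))
```

Because the user has no other permissions in the trusted account, exposed keys are of limited use on their own: everything the user can do is defined by the role in the trusting account.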
In the Terraform configuration project, I used the access key and secret key of the user in the trusted account to assume the role and manage AWS cloud resources in the trusting account. I also rotated the user credentials frequently.
If a particular Terraform configuration had a dependency and needed to read outputs from an existing Terraform remote state, I attached a policy that allowed that to the role in the trusting AWS account.
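A read-only policy for another project's remote state might look like the output of the following Python sketch. The bucket name and state key are hypothetical, and the sketch assumes that the state lives in S3 and that listing the bucket and reading the state object are all that reading its outputs requires.

```python
import json


def remote_state_read_policy(bucket, state_key):
    """Build a read-only policy for one existing Terraform state object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListStateBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # No PutObject here: the dependent project can read outputs
                # but cannot change the upstream project's state.
                "Sid": "ReadUpstreamState",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{state_key}",
            },
        ],
    }


# Hypothetical upstream state location, for illustration only.
read_policy = remote_state_read_policy(
    bucket="example-terraform-state",
    state_key="network/terraform.tfstate",
)
print(json.dumps(read_policy, indent=2))
```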
The only drawback to the above approach was that I ended up with many individual user accounts and policy files. I was, however, able to manage that by tagging them consistently.
There are several different approaches that an organization can take to strengthen its security posture. Therefore, it is necessary to agree on a secure process that adds little management overhead. Terraform configuration projects grow over time, and so should the permissions (in JSON policy files) associated with the IAM user/role that the project uses. The process must be reviewed and audited regularly to identify deviations and correct them.
The idea is not to slow down an organization’s IaC adoption but to reduce the blast radius if a malicious actor exploits a security vulnerability. The idea is to ask what could lead to the exposure of a security key and how to prevent that, and to agree on how to remediate it.
If you are using Terraform to manage AWS cloud resources, is your approach secure? Are there areas of improvement? How are you addressing them? Would you please share them in the comments section?
I hope this note was helpful to you. In my experience working with AWS cloud resources and Terraform, this is the most secure approach I have come across that involves only minor management overhead.