Rolling out infrastructure for complex environments is not easy. It requires consistency and standards in order to be reliable and to scale. Infrastructure-as-Code (IaC) is one approach that simplifies the process.
IaC tools like Terraform and CloudFormation let developers write code that describes how infrastructure should be configured, then automatically configure the infrastructure to match that definition. This automates a process that would otherwise be tedious, time-consuming and prone to human error.
But with the great power that IaC offers comes great responsibility, because a misconfiguration written into a template is deployed just as automatically as a correct one. This article outlines five of the most common risks in IaC templates, and how to fix them.
Let’s get started…
1. Hard Coding Secrets or Resource References
Information that is sensitive or can change over time (such as secret keys, codes, IP addresses, domain names, aliases or account names) should be assigned to a variable with an appropriate name. As a rule of thumb, this kind of information should never be hardcoded in version control, for security reasons. Hardcoding is also inconvenient: rotating a key or secret requires a new commit and review phase, which could be blocked or delayed for hours or days after a bad deployment. Instead, all such data should be stored in specialized services that inject the required secrets or context variables on demand.
A typical setup uses a service such as AWS Secrets Manager or Vault to retrieve those values when the infrastructure plan is created and submitted for deployment. That allows for safer handling of secrets, with proper access controls and auditing.
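One way to enforce this rule is to scan templates for values that look like hardcoded secrets before they are committed. The following is a minimal sketch, not a complete scanner: the function name find_hardcoded_secrets and the two patterns are assumptions made for illustration, and a real check would need to cover many more secret formats.

```python
import re

# Illustrative patterns only; real scanners cover many more formats.
SECRET_PATTERNS = {
    # Long-term AWS access key IDs start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Heuristic (an assumption of this sketch): a name containing
    # password/secret/token assigned a quoted literal. Values containing "$"
    # are skipped because they are likely variable interpolations.
    "quoted_literal": re.compile(
        r'(?i)(password|secret|token)\w*\s*[=:]\s*"[^"$]{4,}"'
    ),
}


def find_hardcoded_secrets(template_text):
    """Return (line_number, pattern_name) pairs for suspicious lines."""
    findings = []
    for number, line in enumerate(template_text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

A line such as password = var.db_password passes the check because the value is a variable reference, while a quoted literal assigned to the same name is reported.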
2. Committing State Files to Version Control
For users of Terraform, state files are created when an IaC plan is applied. They contain metadata and configuration values for the specific infrastructure, so they often hold sensitive values that would end up in version control if the file were committed. A committed state file also creates problems when someone else checks out the code: the state would be stale or incomplete, and that person could end up deploying wrong or insecure infrastructure components that are difficult to roll back safely.
The best way to share and reuse state files is to keep them in a remote state location (typically a remote storage service such as Amazon S3) with proper permissions in place.
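A simple guard is to reject commits that contain state files at all. The sketch below assumes Terraform's default local file names (terraform.tfstate, its backup, and the .terraform working directory); the function name tracked_state_files is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Name patterns for Terraform local state files and their backups.
STATE_PATTERNS = ("*.tfstate", "*.tfstate.*")


def tracked_state_files(paths):
    """Return the paths that look like Terraform state or working-directory files."""
    flagged = []
    for raw in paths:
        path = PurePosixPath(raw)
        # Anything under a ".terraform" directory is local working data.
        in_working_dir = ".terraform" in path.parts
        is_state = any(fnmatchcase(path.name, p) for p in STATE_PATTERNS)
        if in_working_dir or is_state:
            flagged.append(raw)
    return flagged
```

The input could be the list of files printed by git ls-files, so the check can run in a pre-commit hook or a CI job and fail when the returned list is not empty.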
3. Not Performing Sanity and Security Checks Before or After Deployment
Before you apply a template or a plan, you have the option to validate it against the current infrastructure deployment. This helps catch syntax errors and, more importantly, any unintended destructive changes that would be applied in the process.
After the deployment is completed, there should be additional sanity and security checks that answer the most common questions: Is the deployed infrastructure secure? Did the deployment leave ports open? Did the deployment fail to destroy unused resources?
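For Terraform, one way to look for destructive changes is to inspect the machine-readable form of a saved plan. The sketch below assumes the JSON produced by terraform show -json for a plan file, where each entry in resource_changes carries an address and a list of planned actions; the function name destructive_changes is hypothetical.

```python
import json


def destructive_changes(plan_json):
    """List resource addresses whose planned actions include a delete."""
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement shows up as a delete paired with a create,
        # so it is flagged as well.
        if "delete" in actions:
            flagged.append(change.get("address", "<unknown>"))
    return flagged
```

A pipeline could stop and ask for manual approval whenever this list is not empty, rather than applying the plan automatically.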
Writing acceptance tests to verify common security assumptions after deployment is the first step (see TaskCat and Terratest). Additionally, there should be an automated system, such as Prisma Cloud, that performs periodic checks against environments to catch any deviations and escalate security issues.
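As a small illustration of a post-deployment check, the sketch below probes a host for TCP ports that accept connections and reports any that are not on an allow list. It is not how TaskCat, Terratest or Prisma Cloud work; the function names and the allow-list approach are assumptions made for this example.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the subset of ports on host that accept a TCP connection."""
    reachable = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            # Refused, timed out or unreachable: treat the port as closed.
            pass
    return reachable


def unexpected_open_ports(host, ports_to_probe, allowed):
    """Return open ports that are not in the allowed set."""
    return [p for p in open_ports(host, ports_to_probe) if p not in allowed]
```

An acceptance test could call unexpected_open_ports with the ports the service is supposed to expose and fail the deployment if anything else is returned.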
4. Using Untrusted Images or Plugins
Using images and instances that are old or come from unknown sources poses a security risk, because they may contain vulnerabilities. The same problem occurs if you use IaC plugins from third parties (a typical case when using Terraform, for example). Just because they are open source and public does not mean that they are trusted and reliable. In fact, unless a thorough security check has been performed, they pose a serious security risk, as they may exfiltrate data or perform unsafe deployments at runtime.
Images and plugins should go through a thorough peer review and an evaluation of their capabilities before real-world usage.
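Once an image or plugin has been reviewed, one way to make sure the same artifact is used every time is to pin its checksum and verify it before use. The sketch below is an illustration under that assumption; the function names are hypothetical, and the pinned value would come from the review itself.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_checksum(path, pinned_hex):
    """Return True only if the file's digest equals the reviewed, pinned digest."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())
```

A deployment script could refuse to continue when matches_pinned_checksum returns False, which catches an artifact that was replaced or modified after the review.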
5. Not Reusing Code and Putting Everything in One File
Putting all configurations and templates in one file is a recipe for disaster. Configurations for different environments and resources rarely fit together cleanly in a single file, and as the number of configurations grows, so does the amount of duplicated code. Duplicated code makes templates difficult to understand, which in turn lets configuration drift slip into production. IaC templates should be organized by environment and by logical boundaries (for example: production, development and staging, each with its own templates for databases, VPCs, permissions and IAM policies).
Using common references and shared modules can help deploy infrastructure resources more confidently and consistently, every time.
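The idea behind a shared module can be shown in a few lines of ordinary code. This is an analogy written in Python rather than an IaC template, and every name and value in it (network_stack, the CIDR ranges, the tags) is made up for illustration: one definition describes the shape of the resources, and each environment supplies only what differs.

```python
def network_stack(environment, cidr_block, instance_count=1):
    """Shared definition: every environment gets the same shape of resources."""
    return {
        "vpc": {"name": f"{environment}-vpc", "cidr_block": cidr_block},
        "database": {"name": f"{environment}-db", "instances": instance_count},
        "tags": {"environment": environment, "managed_by": "iac"},
    }


# Each environment reuses the shared definition and overrides only its inputs.
ENVIRONMENTS = {
    "development": network_stack("development", "10.0.0.0/16"),
    "staging": network_stack("staging", "10.1.0.0/16"),
    "production": network_stack("production", "10.2.0.0/16", instance_count=3),
}
```

A change to the shared definition, such as a new mandatory tag, then reaches every environment at once instead of being copied into three separate files.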
From the above scenarios, it is clear that IaC templates are source code and need to be treated as source code. Before committing templates to a version control system, format and validate them, and have more than one person review them for quality, every time.
A handful of organizational policies should be established early on. Only then can we avoid a whole class of errors and the risk of unchecked code entering production. Adhering to good engineering practices and keeping a close eye on what gets deployed helps avoid those risks in the first place.
If you’re interested in real-world research on IaC, including actual user data, take a look at the Unit 42 Cloud Threat Report from Palo Alto Networks focused on IaC template vulnerabilities.