An Azure DevOps Pipeline to manage Azure resources using Terraform

Harinderjit Singh
Published in ITNEXT
13 min read · Jul 23, 2023

GitOps Intro

GitOps uses Git repositories as a single source of truth to deliver infrastructure as code. Submitted code initiates the CI process, while the CD process checks and enforces requirements for things like security, infrastructure as code, or any other boundaries set for the application framework. All changes to code are tracked, making updates easy while also providing version control should a rollback be needed.

The core idea of GitOps is to have a Git repository that always contains declarative descriptions of the infrastructure currently desired in the production environment, and an automated process to make the production environment match the described state in the repository.

FYI: MR = Merge Requests

The biggest benefit of combining Git and Infrastructure as Code (IaC) is that you can now use continuous integration and deployment. With tools like Azure DevOps, you can automatically deploy infrastructure code and apply it to your cloud environment. Resources that are added to the infrastructure code are provisioned automatically and made ready for use. Resources that are modified in the infrastructure code or configuration are updated in your cloud environment, and resources that are removed are automatically spun down and deleted. This lets you write code, commit it to your Git repository, and take full advantage of all the benefits of the DevOps process, but for your infrastructure.

Objective

In this post, I want to build a practical example showing how GitOps can be used to provision Azure infrastructure using Terraform and Azure DevOps pipeline. It’s not going to be a production-ready example but nonetheless, it’s going to be an interesting proof of concept, so let’s try it.

Overview

We are going to build Azure infrastructure using Terraform as IaC. Whenever there is a code commit on a specific branch of the Terraform code repository, the pipeline gets triggered. The pipeline has steps to build, test, scan, and deploy the Terraform code. Infrastructure is created, updated, or deleted in the “deploy” steps. For the code repository and pipeline, we will use Azure DevOps.

Using the Terraform (Terragrunt) GitOps pipeline, the following will be provisioned in their respective resource groups:

  1. Storage Account with a Blob container
  2. Azure SQL Server (in US East)
  3. Azure SQL Database (in US East as primary)
  4. Azure Container Registry (in US East)
  5. Azure Kubernetes Service (in US East and US East 2)

The diagram below describes what we are going to do.

Briefly, this is how the workflow works:

1. The cloud engineer creates a new terraform module for an Azure resource or updates the YAML configuration and pushes the source code into an Azure DevOps repository.

2. The pushed code triggers an Azure Pipeline run that builds, tests, and scans.

3. At the deployment step, the pipeline asks for approval and then creates, updates, or deletes Azure infrastructure according to the code changes or pipeline variables.

Implementation

Implementation steps are as follows:

Prerequisites

  • An Azure subscription with resource providers for AKS, Storage Account, ACR, and Azure SQL registered. I am using a Free Tier Subscription.
  • An Azure DevOps Organization.

Let’s begin building the workflow.

Create an App Registration

  • Your GitOps pipeline will use this Service Principal to connect to Azure APIs to create, update, or delete resources.
  • On the Azure portal, navigate to Azure Active Directory > App registrations.
  • Click on New registration and name it “pipelineaccess”.
  • Create a client secret for this app registration.
  • Navigate to the subscription in the Azure portal and click on “Access control (IAM)”.
  • Assign the Owner role on the subscription to the ‘pipelineaccess’ Service Principal.

Consider the principle of least privilege: work out which narrower role would suffice instead of Owner and assign that one.

  • With that, we are all set as far as the Service Principal is concerned.
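The portal steps above can also be scripted with the Azure CLI. Below is a hedged sketch: `az ad sp create-for-rbac` bundles the app registration, client secret, and role assignment into one call. The subscription ID is a placeholder, and the command is echoed for review rather than executed, since a role assignment on a real subscription deserves a second look first.

```shell
# Sketch of the app-registration steps via the Azure CLI (assumes `az login` was done).
app_name="pipelineaccess"
sub_id="00000000-0000-0000-0000-000000000000"   # placeholder: replace with your subscription ID

# create-for-rbac registers the app, creates a secret, and assigns the role in one shot
create_cmd="az ad sp create-for-rbac --name $app_name --role Owner --scopes /subscriptions/$sub_id"

echo "$create_cmd"   # echoed for review; run it yourself once the scope and role look right
```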

Create a project

  • Now we need to create a project in Azure DevOps organization.
  • This project will host the Code repository and the GitOps Pipeline.
  • Navigate to the Azure DevOps portal and click on “New Project”.
  • Select the options as per the snapshot below; I named the project “gitops”.
(The snapshot shows “gitops1”, but in the demo I am using “gitops” as the project name.)

Add SSH Key

  • We need authentication set up for the laptop from which you will commit code. For that, we will create an SSH key (or use an existing one) and add the public key to the security settings in Azure DevOps.
  • Navigate to user settings > Security > SSH Public Keys
  • Click on New Key and paste the public key from your Laptop
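If you don’t have a key yet, one can be generated from a shell. A minimal sketch follows; the temp-directory path is just for illustration (in practice use a path under ~/.ssh and a non-empty passphrase), and RSA is chosen because Azure DevOps SSH authentication expects an RSA key.

```shell
# Generate an RSA key pair; the temp path is for illustration only.
keyfile="$(mktemp -d)/azdevops_key"
ssh-keygen -t rsa -b 4096 -C "azure-devops" -f "$keyfile" -N "" -q

# This public key is what gets pasted into User settings > SSH public keys.
cat "${keyfile}.pub"
```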

Import IaC repository

  • I am sharing the Terraform code, along with the Azure DevOps pipeline YAML definition, through this public GitHub repository.
  • You can clone the GitHub repository to your machine and then add the new remote origin which points to the Azure DevOps repository.
cd
org_name=<your Azure DevOps organization>
git clone https://github.com/harinderjits-git/gitops-azdevops.git
cd gitops-azdevops
git remote remove origin
git remote add origin git@ssh.dev.azure.com:v3/${org_name}/gitops/gitops
git push -u origin --all

Edit the config YAML file

  • Update the gitops-azdevops/terraform/azure/terragrunt/orchestration/config_env_sampleapp.yaml.
  • There are 6 config items to update.
  • You need to update your Azure subscription ID, tenant ID, your Laptop’s public key, DB password, Storage Account’s name, and Storage Account’s Resource Group Name.
  • Commit this change and push it to the Azure DevOps git repository.
  • The Azure resources’ configuration is controlled by this YAML file. If and when someone needs to change some configuration, this is the file to update.
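The exact keys are defined in the file in the repository; purely as an illustration, the six items map to a shape like the following (all key names here are hypothetical and the values are placeholders):

```yaml
# Hypothetical shape of config_env_sampleapp.yaml; key names are illustrative only.
subscription_id: "<your-subscription-id>"
tenant_id: "<your-tenant-id>"
ssh_public_key: "ssh-rsa AAAA... your-laptop-key"
db_password: "<never-commit-a-real-password>"
state_storage_account: "<storage-account-name>"
state_resource_group: "<storage-account-resource-group>"
```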

Never push secrets to Git repositories (this is a lab environment); you should use Azure Key Vault for secret management instead.

Create Pipeline

  • Now we need to create the GitOps Pipeline.
  • Navigate to your GitOps project on Azure DevOps Portal.
  • Navigate to Pipelines and click on New pipeline.
  • Select “Azure Repos Git” as the pipeline code source.
  • Select the repository to be used.
  • Choose “Existing Azure Pipelines YAML file”.
  • Select the path “azure-pipelines.yml”.
  • Notice that the trigger for this pipeline is simply a merge to the master branch.
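A trigger like this is expressed in azure-pipelines.yml with the standard Azure Pipelines CI trigger block (check the file in the repository for the exact branch name):

```yaml
trigger:
  branches:
    include:
      - master
```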

I have not set up any security/RBAC on the pipeline because I am the only one working on my Azure DevOps project. That would be done via the “Manage security” section of the pipeline in the Azure DevOps portal.

Add Variables in Pipeline

  • We define variables for attributes that may change, and for secrets.
  • We will define 7 variables for this pipeline. They are used in the pipeline to retrieve environment variables or secrets.
  • Some variables are used in conditional expressions to make decisions in the pipeline. We will discuss this in more detail later.

We can also define variables in the YAML definition of the pipeline. If a variable appears in the variables block of a YAML file, its value is fixed and can't be overridden at queue time. The best practice is to define your variables in a YAML file but there are times when this doesn't make sense. For example, you may want to define a secret variable and not have the variable exposed in your YAML. Or, you may need to manually set a variable value during the pipeline run.
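For comparison, a variable pinned in the YAML looks like the following; a value set this way cannot be changed at queue time, which is exactly why queue-time values such as terragrunt_action are defined in the UI instead:

```yaml
variables:
  - name: terragrunt_action   # fixed here; cannot be overridden at queue time
    value: apply
```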

  • On “Review your pipeline YAML”, click on variables.
  • Click on “New Variable”.
  • Add the variable “terragrunt_action”.
  • Similarly, we will define more variables, such as ARM_CLIENT_SECRET; since it is a secret, we can mark the value as secret so it won’t be visible in pipeline logs.
  • Here is the complete list of variables: local_state_file, first_iteration, terragrunt_action, ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID.
  • Each variable has a default value, which is used unless it is overridden.
  • Some variables can be overridden on manual execution.
  • Click Save.

You may create a variable group in the Library and use that instead of defining variables individually.

Create Environment

  • The environment is used to define the approvals and checks.
  • Navigate to Environments > “test-approval” > Approvals and checks
  • Add Check and select “Approvals”
  • Add Approvers
  • In the control options, set the timeout to 1 hour and click Create.

Create an agent pool

  • Azure DevOps comes with the Microsoft-hosted “Azure Pipelines” pool as the default agent pool.
  • However, on a free tier the Microsoft-hosted pool has zero parallelism, meaning it won’t execute anything until you fill out a form and request a parallelism grant. You may do that, or you may use a self-hosted agent.

I am using a self-hosted agent since this is a learning environment and I am not patient enough to wait a few days for the parallelism grant to come through.

  • Click on your user settings, create a Personal Access Token, and save it. It will be used for authentication when configuring the agent on your machine.
  • Navigate to project settings.
  • Under Pipelines click on Agent Pools and click on Add Pool.
  • Select the options as per the below snapshot
  • Go to “NewLocal” and add a new agent.
  • Choose the Appropriate OS and follow the instructions on your machine.
hsingh@DESKTOP-424H6PI:~/adevopsagent/agent1$ ./config.sh

[Azure Pipelines agent ASCII banner]
agent v3.220.5 (commit 5faff7c)


>> End User License Agreements:

Building sources from a TFVC repository requires accepting the Team Explorer Everywhere End User License Agreement. This step is not required for building sources from Git repositories.

A copy of the Team Explorer Everywhere license agreement can be found at:
/home/hsingh/adevopsagent/agent1/license.html

Enter (Y/N) Accept the Team Explorer Everywhere license agreement now? (press enter for N) > Y

>> Connect:

Enter server URL > https://dev.azure.com/harinderjitss/
Enter authentication type (press enter for PAT) >
Enter personal access token > ****************************************************
Connecting to server ...

>> Register Agent:

Enter agent pool (press enter for default) > Newlocal
Enter agent name (press enter for DESKTOP-424H6PI) >
Scanning for tool capabilities.
Connecting to the server.
Successfully added the agent
Testing agent connection.
Enter work folder (press enter for _work) >
2023-07-21 20:47:39Z: Settings Saved.
hsingh@DESKTOP-424H6PI:~/adevopsagent/agent1$ ./run.sh
  • After following the instructions, the agent is running on your machine.

First Execution of the Pipeline

  • The first execution is a manual execution.
  • The first execution is different because you need to create a storage account to serve as the backend for Terraform state files.
  • For the first run, set the variables “first_iteration” and “local_state_file” to “true” and click on “Run”.

The build_and_test stage

  • This is the first stage, and it has 4 jobs.
  • Job 1 verifies the subscription after logging in to Azure using the ‘pipelineaccess’ Service Principal we created earlier.
  • Job 2 installs prerequisites such as jq, make, Terraform, and Terragrunt.
  • Job 3 runs ‘terragrunt init’ and ‘terragrunt validate’. It runs against different Terragrunt directories depending on whether this is the first execution or a subsequent one. That's a bit of a hack, because validate would fail on dependencies during the first iteration.
  • Job 4 runs ‘terragrunt plan’, again against different Terragrunt directories depending on whether this is the first or a subsequent execution.

In this stage you could also add Terratest, which makes it easier to write automated tests for your infrastructure code.
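The directory-selection hack in Job 3 can be sketched as a small shell function. The directory names here are hypothetical; the real paths live in the repository's Terragrunt tree.

```shell
# Sketch of build_and_test's directory selection; paths are hypothetical.
# On the first run only the backend bootstrap directory can pass validate,
# because the other modules depend on state that does not exist yet.
pick_terragrunt_dir() {
  first_iteration="$1"
  if [ "$first_iteration" = "true" ]; then
    echo "orchestration/bootstrap"
  else
    echo "orchestration"
  fi
}

pick_terragrunt_dir true    # first execution
pick_terragrunt_dir false   # subsequent executions
```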

The “scan_using_checkov” stage

  • This stage has just one job.
  • It installs Checkov and runs the scan on the Terraform modules directory.
  • Checkov scans the code and generates a report on security or policy violations.
  • I haven’t resolved the violations, so I am not making the deployment depend on this stage, which ideally should be the case.
  • Violations therefore produce warnings rather than failures. You may change that and make subsequent stages depend on this one.
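The scan job boils down to a couple of commands. A sketch, echoed for review rather than executed here: Checkov's `--soft-fail` flag is what turns violations into warnings instead of a failing exit code, and the module path is a guess at the repository layout.

```shell
# Sketch of the scan_using_checkov stage; commands echoed for review.
modules_dir="terraform/azure/modules"   # hypothetical path within the repo
scan_cmd="checkov -d $modules_dir --soft-fail"

echo "pip3 install checkov"
echo "$scan_cmd"
```

Dropping `--soft-fail` (or gating later stages on this one) is how you would make violations block the deployment.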

The “publish_artifact” stage

  • This stage runs only if the “build_and_test” stage succeeded.
  • We publish an artifact built from the repository so that later stages can use it.
  • We are not really using it here; it is included just for the proof of concept.

The ‘start_deploy’ stage

  • This stage depends on the ‘publish_artifact’ stage.
  • It has 3 jobs.
  • The first job exists purely for the check-and-approval step.
  • You will be prompted to grant the pipeline access to the environment “test-approval”; click on “Permit”.
  • At the “start_deploy” stage you will be prompted for approval. Click Approve and the pipeline continues to the next job in the same stage.
  • If approved, it proceeds to the second job, which installs all the prerequisites.
  • The third job has conditional tasks. If the variable local_state_file is true, ‘terragrunt apply’ runs using a local state file to store the Terraform state, and the next task then migrates that state to the Azure storage account. That is done only for the first iteration. For subsequent iterations, local_state_file must be false, so no local state file is used and no migration happens; the storage account is used directly as the backend for the Terraform state.
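The first-run state handling described above can be sketched as shell logic. The variable name comes from the pipeline; the echoed command descriptions are illustrative, not the literal task contents.

```shell
# Sketch of the third job's conditional state handling.
deploy_step() {
  local_state_file="$1"
  if [ "$local_state_file" = "true" ]; then
    # first iteration: apply against a local state file, then migrate
    echo "terragrunt apply (local state); migrate state to storage account"
  else
    # subsequent iterations: the azurerm backend is used directly
    echo "terragrunt apply (remote state in storage account)"
  fi
}

deploy_step true    # first iteration
deploy_step false   # subsequent iterations
```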

The ‘finalize_deployment’ stage

  • This stage depends on the ‘start_deploy’ stage.
  • After “start_deploy” completes, the pipeline proceeds to the finalize_deployment stage, where it creates the infrastructure defined in the respective Terragrunt files in the IaC repository.
  • There are two tasks: terragrunt_apply and terragrunt_destroy.
  • Only one of the two is executed, depending on the value of the variable ‘terragrunt_action’.
  • If terragrunt_action equals apply, the terragrunt_apply task runs, which creates or updates infrastructure as per the configuration.
  • If terragrunt_action equals destroy, the terragrunt_destroy task runs, which deletes the whole infrastructure.
  • After the “finalize_deployment” stage succeeds, you will notice that the infrastructure has been created in your Azure subscription.
  • All resources are created in their respective resource groups, as per the configuration in your Git repository.
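In Azure Pipelines YAML, this kind of either/or task selection is expressed with step-level conditions on the variable. A sketch, with the task contents simplified to script steps (the Terragrunt flags shown are illustrative):

```yaml
# Sketch: only one of these steps runs, based on terragrunt_action.
steps:
  - script: terragrunt run-all apply --terragrunt-non-interactive
    displayName: terragrunt_apply
    condition: eq(variables['terragrunt_action'], 'apply')
  - script: terragrunt run-all destroy --terragrunt-non-interactive
    displayName: terragrunt_destroy
    condition: eq(variables['terragrunt_action'], 'destroy')
```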

Subsequent Execution

  • Whenever someone from the cloud engineering team commits a change to the IaC code and pushes it to the repository, the pipeline is triggered.
  • Since it executes automatically, it uses the default variable values.
  • By default, local_state_file=false, terragrunt_action=apply, and first_iteration=false.
  • For example, if you just edited the configuration to add a new tag and committed and pushed the change to the master branch of the repository, the pipeline will get executed.
  • The rest is the same as the first execution.
  • Here is what a ‘terragrunt apply’ will look like.
  • Here is a snapshot of successful pipeline execution.
  • Added tag “type: demo”.
  • The same reflects for the AKS as shown below.

Destroy

  • If you want to destroy all the infrastructure, set terragrunt_action=destroy.
  • The destroy execution has to be started manually, because you need to override the variable terragrunt_action to destroy.
  • Only the last stage differs: the terragrunt_destroy task is executed instead of terragrunt_apply.
  • Terragrunt destroy would look like below.
  • Successful pipeline execution.
  • All resources in the concerned subscription have been destroyed except for the Terraform backend storage account.

Conclusion

  • Azure DevOps is a great platform for DevOps needs.
  • It is easy to build pipelines using YAML and gives us an opportunity to use “pipeline as code”.
  • Using Microsoft’s hosted agent pool requires a separate request.
  • We didn’t use release pipelines for deployment, because release pipelines can’t be defined in YAML; you have to use the Azure DevOps portal for them.
  • Azure DevOps offers extensions for many tools, including Terraform. I didn’t use one because I wanted to use Terragrunt, which wasn’t available in the extension marketplace.
  • Azure DevOps pricing includes a good amount of free allowance for learners.

As an improvement on this, I have written a post about how to create Azure DevOps pipelines and other resources using Terraform, which will help you implement pure “pipeline as code”.

Please read my other articles as well and share your feedback. If you like the content shared please like, comment, and subscribe for new articles.


Technical Solutions Developer (GCP). Writes about significant learnings and experiences at work.