Infrastructure with actual code - AWS Cloud Development Kit (CDK)

Written on May 2, 2019

Disclaimer: at the time of writing (May 2019) the AWS CDK is still in preview and updates (including plenty of breaking changes) are coming on a fairly consistent basis! I’ve already found and reported a bug with the `dotnet` implementation so tread carefully, as the CDK site states this is “public beta” and not supported for production workloads at this time.

The problem with Infrastructure as Code

The concept of declaring resources using code is nothing new, we have been doing this to deploy software stacks on VMs when that was all we had. I’ve used Chef, Puppet, Ansible and SaltStack to get packages deployed and bring machines into a “desired state”.

With cloud computing we also looked for tools that could help us provision resources, AWS has AWS CloudFormation, Azure has ARM, GCP has Cloud Deployment Manager and there are also tools like Terraform from Hashicorp which I have used and blogged about in the past. This is commonly referred to as Infrastructure as Code (IaC)

I’ve spent a fair bit of time with CloudFormation, ARM and Terraform … but its been fairly inconsistent between projects and I feel I have “relearn” the syntax each time I come back to them. Then I have to catch myself up with the new features and the latest best practice.

For me personally, I spend way more time writing code in .NET, TypeScript and recently started learning Go. So what if there was a way to use this to do IaC, move away from the declarative models using JSON and YAML and move to a more imperative programming model.

Welcome to AWS Cloud Development Kit (CDK)

Now the CDK is certainly not the only (or first) tool out there, I’ve been meaning to check out pulumi.io for a while now and there is also Troposhere which is Pyhton based. Essentially what these tools are doing are enabling you to write code in a number of different languages which then spit out CloudFormation templates!

Taking the CDK for a spin

Now there are heaps of great resources already so I not going to attempt to regurgitate them here, if you are looking for a simple starter then check out these:

There are also some great other posts leveraging CDK, like this from Nathan Peck.

My aim is to build a CI/CD workflow to deploy a Serverless backend, using the following services:

Source - GitHub
Pipeline - AWS CodePipeline
Build - AWS CodeBuild
API - AWS API Gateway
Compute - AWS Lambda
Queue - AWS SQS
Storage - AWS DynamoDB

I want to get MVP going and refine from there, as to be expected the documentation and guidance on CDK is still limited so this will be a bit of an experiment to get things correct, this could be a journey so stick with me.

The “Hello World” examples I have seen are pretty basic, so I wanted to think about how the code would be structured. Mono repo? Separate repo for application vs infrastructure code? I struggled with this decision a fair bit, in a perfect world we would be doing microservices where every service is responsible for it’s own data ect ect. But given the preview nature of the CDK, how about we create a single repo and scale out from there, this is also a bit tricky as at the time of writing (May 2019) CodePipeline doesn’t support mono repo out of the box, so having everything in the same repo will trigger builds on every commit, so watch out!

I initially started out using .NET, I thought it would be nice to have the Lambda and CDK code in the same language, but I fount this bug with Lambda and CDK so I decided to pivot and use TypeScript for CDK and Golang for the Lambda code.

For the API backend I am using a simple Lambda function to take an incoming API call and dump the message body on to SQS, then another Lambda picking up these messages and dumping them into Dynamo.

So what I have is:

API Gateway - the is the entry point for the API call to Lambda
Lambda Handler - this takes the API call and dumps the message to SQS
SQS queue - this is the queue that will sit between two Lambda Functions
Lambda Worker - this function takes the SQS message and persists to Dynamo
DynamoDB - data store for the Lambda API to serve up
CodePipeline - we have a pipeline for each Lambda to build and deploy changes to Lambda stack or Go code

So what does that look like in CDK? It feels fairly straight forward once you get the hang of it, here is the link to the CDK TypeScript file.

As far as the Lambda function goes, it’s really nice. I am using a new feature of v0.29.0 which allows the code location to be passed in as a CloudFormation parameter, this means I can use it in the CodePipeline and pass it the new version of the build to be deployed as a parameter override.

const lambdaCode = lambda.Code.cfnParameters();
const StarterFunc = new lambda.Function(lambdaStarterStack, 'Lambda', {
  code: lambdaCode,
  handler: 'main',
  runtime: lambda.Runtime.Go1x,
  environment: {
    SQS_QUEUE_NAME: sqsQueue.queueUrl
  }
});

I also like how easy it was to integrate with other services like Secret Manager, here I can store the GitHub access token and a simple ref to get the pipeline hooked up.

const secret = secretsmanager.Secret.import(stack, 'GitHubAccessToken', {
    secretArn: secretArnParam
});

By far my favorite feature is how simple it was to create roles and assign the correct permissions, here we give our Lambda function the ability to send to SQS in a single line, this is a entire custom role definition in YAML!

sqsQueue.grantSendMessages(starterFunc);

All up we have ~200 lines of code in CDK, which spits outs the following templates:

Shared - 48
Starter Lambda - 311
Starter Pipeline - 600
Worker Lambda - 101
Worker Pipeline - 600
Total: 1,660

So that is a 800% return right there… ok not the best measure but that is thousands of lines of YAML I don’t have to manage and I get all the benefits of making modules and construct libraries to make this easier.

Where to from here

I had heaps of fun with this one…getting back into AWS land, picking up TypeScript again and using Golang for something more than Hello World!

On the whole I really like the CDK and moving forward I can see this being THE way to start out with IaC. It’s still early days and there are some things I need to spend some more time figuring out, much of which is related to my out of date knowledge of AWS products rather than the CDK.

One area I would like to see more info on is around bootstrapping a CI/CD workflow. The initial deployment needs to be done via the CLI and I found myself pushing updates to the code in this way. There needs to be a way to update your CDK code via a pipeline. I did take a look at the @aws-cdk/app-package as this has a option to “self update” the CDK deployment, but at this time doesn’t support “Assets” which I was using to deploy the Lambda code from S3. I’ll be taking a look at this in more detail to see if I can get CI/CD going for the shared and pipeline stacks.

Written on May 2, 2019