Infrastructure with actual code - AWS CDK

Written on May 2, 2019

Disclaimer: at the time of writing (May 2019) the AWS CDK is still in preview, and updates (including plenty of breaking changes) are arriving regularly. I’ve already found and reported a bug in the .NET implementation, so tread carefully: as the CDK site states, this is a “public beta” and is not supported for production workloads at this time.

The problem with Infrastructure as Code

The concept of declaring resources using code is nothing new; we were doing this to deploy software stacks on VMs back when that was all we had. I’ve used Chef, Puppet, Ansible and SaltStack to get packages deployed and bring machines into a “desired state”.

With cloud computing we also looked for tools that could help us provision resources. AWS has CloudFormation, Azure has ARM, GCP has Cloud Deployment Manager, and there are also tools like Terraform from HashiCorp, which I have used and blogged about in the past. This is commonly referred to as Infrastructure as Code (IaC).

I’ve spent a fair bit of time with CloudFormation (CFN), ARM and Terraform … but my use of them has been fairly inconsistent between projects, and I feel I have to “relearn” the syntax each time I come back to them. Then I have to catch up on the new features and the latest best practice.

Personally, I spend way more time writing code in .NET and TypeScript, and I recently started learning Go. So what if there was a way to use these languages to do IaC, moving away from the declarative models of JSON and YAML to a more imperative programming model?

Welcome to AWS Cloud Development Kit (CDK)

Now the CDK is certainly not the only (or first) tool out there. I’ve been meaning to check out pulumi.io for a while now, and there is also Troposphere, which is Python based. Essentially these tools let you write infrastructure in a general-purpose programming language; the CDK and Troposphere then spit out CloudFormation templates, while Pulumi uses its own deployment engine.
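To make the "code in, template out" idea concrete, here is a minimal sketch in plain TypeScript with no CDK involved. The `buildTemplate` helper and the queue names are made up for illustration; only the `AWS::SQS::Queue` resource type and its `QueueName` property come from CloudFormation itself. This is not how the CDK works internally, just the general shape of the approach.

```typescript
// Minimal sketch: generate a CloudFormation-style template from ordinary code.
// buildTemplate and the queue names are hypothetical, for illustration only.

interface Resource {
  Type: string;
  Properties: Record<string, unknown>;
}

interface Template {
  Resources: Record<string, Resource>;
}

function buildTemplate(queueNames: string[]): Template {
  const template: Template = { Resources: {} };
  // A plain loop replaces copy-pasted blocks of YAML/JSON.
  for (const queueName of queueNames) {
    template.Resources[`${queueName}Queue`] = {
      Type: 'AWS::SQS::Queue',
      Properties: { QueueName: queueName },
    };
  }
  return template;
}

const generated = buildTemplate(['Orders', 'Invoices']);
console.log(JSON.stringify(generated, null, 2));
```

Adding a third queue is a one-word change to the array rather than another block of template text, which is most of the appeal.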

Taking the CDK for a spin

Now there are heaps of great resources already, so I’m not going to attempt to regurgitate them here. If you are looking for a simple starter then check out these:

There are also some other great posts leveraging the CDK, like this one from Nathan Peck.

My aim is to build a CI/CD workflow to deploy a Serverless backend, using the following services:

  • Source - GitHub
  • Pipeline - AWS CodePipeline
  • Build - AWS CodeBuild
  • API - AWS API Gateway
  • Compute - AWS Lambda
  • Queue - AWS SQS
  • Storage - AWS DynamoDB

I want to get an MVP going and refine from there. As is to be expected, the documentation and guidance on the CDK is still limited, so this will be a bit of an experiment to get things correct. This could be a journey, so stick with me.

The “Hello World” examples I have seen are pretty basic, so I wanted to think about how the code would be structured. Mono repo? Separate repos for application and infrastructure code? I struggled with this decision a fair bit. In a perfect world we would be doing microservices where every service is responsible for its own data, etc. But given the preview nature of the CDK, how about we create a single repo and scale out from there? This is also a bit tricky: at the time of writing (May 2019) CodePipeline doesn’t support mono repos out of the box, so having everything in the same repo will trigger every build on every commit. Watch out!

I initially started out using .NET, as I thought it would be nice to have the Lambda and CDK code in the same language, but I found this bug with Lambda and CDK, so I decided to pivot and use TypeScript for the CDK and Golang for the Lambda code.

For the API backend I am using a simple Lambda function to take an incoming API call and dump the message body onto SQS, with another Lambda picking up these messages and dumping them into DynamoDB.

So what I have is:

  • API Gateway - this is the entry point for the API call to Lambda
  • Lambda Handler - this takes the API call and dumps the message to SQS
  • SQS queue - this is the queue that will sit between two Lambda Functions
  • Lambda Worker - this function takes the SQS message and persists it to DynamoDB
  • DynamoDB - data store for the Lambda API to serve up
  • CodePipeline - one pipeline per Lambda, which builds and deploys changes to either the Lambda stack or the Go code
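The request flow in the list above can be modelled as a toy, in-memory sketch. The real functions are written in Go and talk to SQS and DynamoDB; everything here (`Message`, `handler`, `worker`, the array standing in for the queue and the object standing in for the table) is a hypothetical stand-in used only to show the hand-off between the two Lambdas.

```typescript
// Toy in-memory model of the flow: API call -> handler -> queue -> worker -> table.
// All names are hypothetical stand-ins; no AWS services are involved.

interface Message {
  id: string;
  body: string;
}

const queue: Message[] = [];                 // stands in for the SQS queue
const table: Record<string, string> = {};    // stands in for the DynamoDB table

// "Lambda Handler": accepts the API call and puts the body on the queue.
function handler(id: string, body: string): { statusCode: number } {
  queue.push({ id, body });
  return { statusCode: 202 };
}

// "Lambda Worker": drains the queue and persists each message.
function worker(): number {
  let processed = 0;
  let message = queue.shift();
  while (message !== undefined) {
    table[message.id] = message.body;
    processed += 1;
    message = queue.shift();
  }
  return processed;
}

handler('1', '{"hello":"world"}');
handler('2', '{"hello":"cdk"}');
const processedCount = worker();
```

The point of the queue is that the handler can return straight away while the worker persists messages at its own pace.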

So what does that look like in CDK? It feels fairly straightforward once you get the hang of it; here is the link to the CDK TypeScript file.

As far as the Lambda function goes, it’s really nice. I am using a new feature of v0.29.0 that allows the code location to be passed in as a CloudFormation parameter. This means I can use it in the CodePipeline and pass the new version of the build to be deployed as a parameter override.

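    // The code location is supplied at deploy time via CloudFormation parameters
    const lambdaCode = lambda.Code.cfnParameters();
    const starterFunc = new lambda.Function(lambdaStarterStack, 'Lambda', {
      code: lambdaCode,
      handler: 'main',
      runtime: lambda.Runtime.Go1x,
      environment: {
        // Despite the variable name, this holds the queue URL
        SQS_QUEUE_NAME: sqsQueue.queueUrl
      }
    });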
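As a rough illustration of what passing the code location as a parameter means at the CloudFormation level, here is a plain TypeScript sketch with no CDK involved. The parameter names `CodeBucket` and `CodeKey`, the bucket and key values, and both helper functions are assumptions for illustration; the CDK generates its own parameter names.

```typescript
// Sketch of a template whose Lambda code location comes from parameters.
// CodeBucket, CodeKey and both helpers are hypothetical names.

function lambdaTemplate() {
  return {
    Parameters: {
      CodeBucket: { Type: 'String' },
      CodeKey: { Type: 'String' },
    },
    Resources: {
      Lambda: {
        Type: 'AWS::Lambda::Function',
        Properties: {
          Handler: 'main',
          Runtime: 'go1.x',
          // The code location is resolved at deploy time, not when the
          // template is generated. (Role and other properties omitted.)
          Code: {
            S3Bucket: { Ref: 'CodeBucket' },
            S3Key: { Ref: 'CodeKey' },
          },
        },
      },
    },
  };
}

// What a pipeline deploy step would hand to CloudFormation for a new build.
function parameterOverrides(bucket: string, key: string): Record<string, string> {
  return { CodeBucket: bucket, CodeKey: key };
}

const overrides = parameterOverrides('my-artifact-bucket', 'builds/starter-42.zip');
```

Because the template itself never changes between builds, the pipeline only has to supply a new bucket and key for each deployment.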

I also like how easy it was to integrate with other services like Secrets Manager. I store the GitHub access token there and use a simple reference to get the pipeline hooked up.

    const secret = secretsmanager.Secret.import(stack, 'GitHubAccessToken', {
        secretArn: secretArnParam
    });

By far my favorite feature is how simple it was to create roles and assign the correct permissions. Here we give our Lambda function the ability to send to SQS in a single line; in YAML this is an entire custom role definition!

    sqsQueue.grantSendMessages(starterFunc);
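For a sense of what that one-line grant stands for, here is a rough sketch of the kind of IAM policy statement involved, written as plain TypeScript. `sendMessageStatement` and the example ARN are hypothetical; the statement the CDK actually generates may include additional actions, and the CDK also attaches it to the function's execution role for you.

```typescript
// Rough sketch of an IAM policy statement allowing sends to a single queue.
// sendMessageStatement is a hypothetical helper, not part of the CDK.

interface PolicyStatement {
  Effect: 'Allow' | 'Deny';
  Action: string[];
  Resource: string;
}

function sendMessageStatement(queueArn: string): PolicyStatement {
  return {
    Effect: 'Allow',
    Action: ['sqs:SendMessage'],
    // Scoped to one queue rather than "*", which is easy to get lazy about by hand.
    Resource: queueArn,
  };
}

const statement = sendMessageStatement('arn:aws:sqs:us-east-1:123456789012:starter-queue');
```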

All up we have ~200 lines of code in CDK, which spits out the following templates (line counts):

  • Shared - 48
  • Starter Lambda - 311
  • Starter Pipeline - 600
  • Worker Lambda - 101
  • Worker Pipeline - 600
  • Total: 1,660

So that is roughly an 800% return right there… OK, not the best measure, but that is more than 1,600 lines of YAML I don’t have to manage, and I get all the benefits of building modules and construct libraries to make this easier.
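For anyone checking the maths, the figure comes from dividing the generated line count by the hand-written one. The snippet below just redoes that arithmetic; the key names are my own labels, and 200 is the approximate CDK line count quoted above.

```typescript
// Line counts of the generated templates, as listed above.
const templateLines: Record<string, number> = {
  shared: 48,
  starterLambda: 311,
  starterPipeline: 600,
  workerLambda: 101,
  workerPipeline: 600,
};
const cdkLines = 200; // approximate ("~200")

let totalLines = 0;
for (const key of Object.keys(templateLines)) {
  totalLines += templateLines[key];
}

// Generated lines per hand-written line of CDK code.
const linesRatio = totalLines / cdkLines;
```

That works out to 1,660 generated lines, or about 8.3 lines of template for every line of CDK code.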

Where to from here

I had heaps of frustration, er, fun with this one… getting back into AWS land, picking up TypeScript again and using Golang for something more than Hello World!

On the whole I really like the CDK, and moving forward I can see this being THE way to start out with IaC. It’s still early days and there are some things I need to spend more time figuring out. Some of these relate to AWS services in general and are not limited to the CDK:

  • Bootstrapping CI/CD - the initial deployment needs to be done via the CLI, and I found myself pushing updates to the code in this way too. There needs to be a way to update your CDK code via a pipeline. I did take a look at @aws-cdk/app-delivery, as this has an option to “self update” the CDK deployment, but at this time it doesn’t support “Assets”, which I was using to deploy the Lambda code from S3. I’ll be taking a look at this in more detail to see if I can get CI/CD going for the shared and pipeline stacks.

  • “cdk init” vs a single file - to start off I created my first project using cdk init. This was fine for some of the simple stuff, but I struggled to get it working with CodePipeline and with passing data between stacks. Many of the current docs and examples use a single JS/TS file rather than the multiple files that cdk init creates. That said, I think this is more down to my lack of TypeScript skills.
