Build Streaming App Using AWS CDK and AWS CloudFormation – Part #1

You can construct complicated architectures, share reference architectures and solution implementations with your coworkers, with people from different business lines, or with the broader public using Infrastructure as Code to speed up your time to completion. By offering services, tools, and multilingual support, AWS meets infrastructure engineers and developers where they are, allowing you to get started with the knowledge you already have.

This tutorial will teach you how to accelerate your solutions by using AWS CloudFormation and the AWS Cloud Development Kit (CDK). By the end of this tutorial, you will have a streaming analytics pipeline constructed in minutes that you can publish as a shareable CDK module library and install automatically, repeatably, and reliably, whether you are new to AWS or just to these particular services.

Although the account we’ve provided already has everything you need to get started, this course assumes a basic understanding of key AWS services and some coding experience.

Architecture

Consider being tasked with developing or moving a containerized application that generates streaming data to the cloud, along with the infrastructure necessary to ingest that streaming data for further analysis. You lack expertise in containers and have no knowledge of AWS’s real-time streaming services.

You could take your time getting to know each service, investigating and experimenting with the connections between them, and then build the application and data ingestion pipeline from scratch. However, since this is a typical cloud use case, you should employ infrastructure as code to speed up the solution while also making sure it complies with AWS best practices.

With the help of CloudFormation and the CDK, we can construct this architecture in less than an hour. We’ll start by deploying a streaming data ingestion pipeline from the AWS Solution Implementation library using a CloudFormation template. Then, using the CDK, we’ll build an application that produces streaming data on top of that base. Finally, we’ll package the entire solution into a publicly accessible CDK module that you can distribute to other teams inside your organisation, or to the wider world.

Installing Cloud9

We will be using AWS Cloud9 IDE to make changes to files in the environment. This will be easier since Cloud9 already has the packages required to make those changes.

Choose the instance type. I will go with t2.micro since it is included in the Free-tier.

Given that we will be accessing the environment using SSH directly, we can choose the Secure Shell (SSH) option in the Network settings.

Once the environment is created, you can find it in the Environments tab and launch it by clicking Open.

Create the AWS CDK Application

An unsupported version of Node.js is installed by default on the Cloud9 instance. Run the command below to update it.

nvm install v14.17.6

To set up a project for our CDK application, create a new directory called “streaming-app” and initialise the project with the AWS CDK CLI.

The project is initialised with the CLI command cdk init. We’ll choose TypeScript as our language and the “app” template, since we are starting a CDK application.

Although CDK v2 is deployed by default on Cloud9, this tutorial was made with CDK v1. Therefore, for the time being, all of our CDK instructions will be pinned to CDK v1.

In the Cloud9 terminal, run the following command:

$ mkdir streaming-app && cd streaming-app
$ npx cdk@1.125.0 init --language=typescript app

Install AWS Service Dependencies

Using npm again, install the AWS dependencies that we’ll need to deploy the application. If you studied the architecture diagram earlier, these services should be recognisable.

First, edit the package.json file to include the dependencies with the versions pinned to 1.125.0.

{
  "name": "streaming-app",
  "version": "0.1.0",
  "bin": {
    "streaming-app": "bin/streaming-app.js"
  },
  "scripts": {
    "build": "tsc",
    "watch": "tsc -w",
    "test": "jest",
    "cdk": "cdk"
  },
  "devDependencies": {
    "@aws-cdk/assert": "1.125.0",
    "@types/jest": "^26.0.10",
    "@types/node": "10.17.27",
    "jest": "^26.4.2",
    "ts-jest": "^26.2.0",
    "aws-cdk": "1.125.0",
    "ts-node": "^9.0.0",
    "typescript": "~3.9.7"
  },
  "dependencies": {
    "@aws-cdk/aws-ec2": "1.125.0",
    "@aws-cdk/aws-ecs": "1.125.0",
    "@aws-cdk/aws-iam": "1.125.0",
    "@aws-cdk/aws-kinesis": "1.125.0",
    "@aws-cdk/aws-kinesisfirehose": "1.125.0",
    "@aws-cdk/aws-kinesisfirehose-destinations": "1.125.0",
    "@aws-cdk/cloudformation-include": "1.125.0",
    "@aws-cdk/core": "1.125.0",
    "source-map-support": "^0.5.16"
  }
}

Next, run the install command in the Cloud9 terminal:

npm install

For some dependencies, you may see a warning asking you to run “npm audit fix”. Although you would want to do this for a production application, we will skip it for this tutorial because it could introduce breaking changes later on.

Configure the AWS CDK Contexts

Finally, to use the modern CDK bootstrap, edit the cdk.json file and set newStyleStackSynthesis to true.

{
  "app": "npx ts-node --prefer-ts-exts bin/streaming-app.ts",
  "context": {
    "@aws-cdk/core:newStyleStackSynthesis": true,
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": true,
    "@aws-cdk/core:enableStackNameDuplicates": "true",
    "aws-cdk:enableDiffNoFail": "true",
    "@aws-cdk/core:stackRelativeExports": "true",
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
    "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
    "@aws-cdk/aws-kms:defaultKeyPolicies": true,
    "@aws-cdk/aws-s3:grantWriteWithoutAcl": true,
    "@aws-cdk/aws-ecs-patterns:removeDefaultDesiredCount": true,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": true,
    "@aws-cdk/aws-efs:defaultEncryptionAtRest": true,
    "@aws-cdk/aws-lambda:recognizeVersionProps": true,
    "@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": true
  }
}

Bootstrap the CDK environment

The cdk bootstrap command in the AWS CDK command-line interface provisions the resources the CDK needs to make deployments into a specified environment, which is a combination of an AWS account and Region. The bootstrap command creates a CloudFormation stack in the environment specified on the command line. At the moment, the sole resource in that stack is an S3 bucket that stores file assets and the CloudFormation templates that result from them.

npx cdk@1.125.0 bootstrap

Our environment is now ready to build in and deploy to.

Deploy the Ingestion Pipeline Using AWS CloudFormation

The AWS Solution Implementation library offers several options for Streaming Data for Amazon Kinesis, and Option 3 is the one that fits our use case.

AWS CloudFormation template using Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon S3

Amazon Kinesis Data Streams stores the incoming streaming data, while Amazon Kinesis Data Firehose buffers it before delivering the output to an Amazon S3 bucket. The solution is fully managed, scales automatically to match the data traffic, and needs no ongoing administration. Data input and buffering are monitored through an Amazon CloudWatch dashboard, and CloudWatch alarms are set on crucial Kinesis Data Firehose metrics.
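To make that flow concrete, here is a minimal sketch of the kind of record a producer might send into the pipeline. The helper name, stream name, and event shape are illustrative assumptions, not part of the solution; the parameter shape follows the input of a Kinesis PutRecord call (StreamName, Data, PartitionKey).

```typescript
// Illustrative sketch only: the SensorEvent shape and helper name are assumptions.
interface SensorEvent {
  deviceId: string;
  temperature: number;
  timestamp: string;
}

// Builds the input for a Kinesis PutRecord call. Records that share a
// partition key land on the same shard, preserving per-device ordering.
function buildPutRecordParams(streamName: string, event: SensorEvent) {
  return {
    StreamName: streamName,
    // Kinesis accepts the record body as a string or binary blob;
    // here we serialise the event as JSON.
    Data: JSON.stringify(event),
    PartitionKey: event.deviceId,
  };
}

const params = buildPutRecordParams('my-ingest-stream', {
  deviceId: 'sensor-001',
  temperature: 21.5,
  timestamp: '2022-01-01T00:00:00Z',
});
console.log(params.PartitionKey); // sensor-001
```

Passing parameters like these to the SDK’s Kinesis client would put the record onto the stream; Firehose then handles buffering and delivery to S3 without any further code on our side.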

Using CloudFormation and the CDK, you can deploy this solution without knowing anything about Kinesis Data Streams, Kinesis Data Firehose, CloudWatch, or S3, or how to configure any of them. You can then use the deployed services confident that they have been set up in accordance with AWS best practices, and iterate on their configuration in response to the throughput and data being consumed.

If you haven’t used the CDK or CloudFormation before, it’s natural to wonder how they operate and what is necessary to deploy this Solution Implementation into your account.

What is AWS CloudFormation?

By treating infrastructure as code, AWS CloudFormation offers you a simple approach to model a group of connected AWS and third-party resources, provision them quickly and consistently, and manage them throughout their life cycles. Using a CloudFormation template, a text file that lists your chosen resources and their dependencies, you can launch and configure them together as a stack. Rather than handling resources individually, you can use a template to create, update, and delete an entire stack as a single unit, as often as necessary. Stacks can be created and managed across AWS accounts and AWS Regions, and the resources that make up a stack can just as easily be de-provisioned or destroyed together.

How Does CloudFormation Work?

You can author your CloudFormation template in YAML or JSON format and submit it to the CloudFormation service using the AWS Console, the AWS Command Line Interface (CLI), or another method, or import it into the CDK. When CloudFormation deploys the template as a stack, it calls the same AWS APIs that are used when you create resources manually in the console or with the AWS CLI, and your cloud infrastructure is provisioned.

CloudFormation’s many features, which include handling dependencies and letting you connect service configuration through referenced parameters, mean that even customers who aren’t developers can take advantage of the power of infrastructure as code without having to learn how to code.
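As a small illustration of those features, here is a hand-written fragment (not taken from the solution template; the names and values are assumptions) showing a Parameter connected to a resource property through the Ref intrinsic function:

```json
{
  "Parameters": {
    "RetentionHours": {
      "Type": "Number",
      "Default": 24,
      "Description": "How long the stream retains records"
    }
  },
  "Resources": {
    "DataStream": {
      "Type": "AWS::Kinesis::Stream",
      "Properties": {
        "ShardCount": 2,
        "RetentionPeriodHours": { "Ref": "RetentionHours" }
      }
    }
  }
}
```

CloudFormation resolves the Ref at deploy time and works out the resulting dependency ordering for you.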

Review the AWS CloudFormation Template

First, let’s download the CloudFormation template for Option 3 so we can review the contents before deploying. Make sure you are in the streaming-app directory that we created earlier, then run the following commands in the Cloud9 terminal:

mkdir templates

curl https://solutions-reference.s3.amazonaws.com/aws-streaming-data-solution-for-amazon-kinesis/v1.5.0/aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3.template -o templates/aws-kinesis-streaming-solution.json

Open the aws-kinesis-streaming-solution.json file you just downloaded. The template creates 20+ AWS resources and has more than 1200 lines of CloudFormation code.

CloudFormation supports both JSON and YAML templates. Each format has its own benefits and drawbacks, and every developer has a preference. Since the Solution Implementation template is in JSON, we’ll use JSON for this session.

JSON, also known as JavaScript Object Notation, is an open data exchange format that can be read by both machines and people. It represents data as key-value pairs, grouped into objects (collections of attributes enclosed in curly braces {}) and arrays (lists of objects or attributes enclosed in square brackets []).
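As a quick sketch of that structure, the following TypeScript snippet parses a cut-down, hand-written fragment in the style of the solution template (not the real file) and walks its objects and arrays:

```typescript
// A hand-written fragment in the style of a CloudFormation template:
// objects of key-value pairs in braces, arrays of objects in brackets.
const templateFragment = `{
  "Resources": {
    "OutputBucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "LifecycleConfiguration": {
          "Rules": [
            { "Id": "multipart-upload-rule", "Status": "Enabled" }
          ]
        }
      }
    }
  }
}`;

const parsed = JSON.parse(templateFragment);

// Objects are traversed by key, arrays by index.
const firstRule = parsed.Resources.OutputBucket.Properties
  .LifecycleConfiguration.Rules[0];
console.log(firstRule.Id); // multipart-upload-rule
```

The same key-and-index navigation applies to the full 1200-line template we just downloaded; it is simply a much larger tree of the same two shapes.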

The Resources Section

The Resources section is required. It contains the complete list of resources this template will generate, together with their underlying configuration. Let’s evaluate that configuration by taking a look at the OutputBucket… S3 bucket.

Each resource has a Type, in this case AWS::S3::Bucket, and a list of Properties that specify the configuration for that resource type (e.g. BucketEncryption, LifecycleConfiguration, PublicAccessBlockConfiguration). These Properties correspond to the settings you would choose in the AWS Console if you were creating the resource through the graphical user interface. In reality, both the AWS Console and CloudFormation call the same AWS APIs to provision infrastructure on your behalf; CloudFormation is simply another way of passing those configuration options to the API, using a text file rather than radio buttons and drop-down menus.

  "Resources": {
    ...

    "OutputBucketB1E245A7": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "BucketEncryption": {
          "ServerSideEncryptionConfiguration": [
            {
              "ServerSideEncryptionByDefault": {
                "SSEAlgorithm": "AES256"
              }
            }
          ]
        },
        "LifecycleConfiguration": {
          "Rules": [
            {
              "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
              },
              "Id": "multipart-upload-rule",
              "Status": "Enabled"
            },
            {
              "Id": "intelligent-tiering-rule",
              "Status": "Enabled",
              "Transitions": [
                {
                  "StorageClass": "INTELLIGENT_TIERING",
                  "TransitionInDays": 1
                }
              ]
            }
          ]
        },
        "LoggingConfiguration": {
          "DestinationBucketName": {
            "Ref": "OutputAccessLogsBucket8BE3FC5F"
          }
        },
        "PublicAccessBlockConfiguration": {
          "BlockPublicAcls": true,
          "BlockPublicPolicy": true,
          "IgnorePublicAcls": true,
          "RestrictPublicBuckets": true
        }
      },
      "UpdateReplacePolicy": "Retain",
      "DeletionPolicy": "Retain",
      "Metadata": {
        "aws:cdk:path": "aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3/Output/Bucket/Resource"
      }
    },

    ...

Create the Resources to Deploy

CloudFormation Include

We have obtained the solution’s CloudFormation template, but we still need a way to deploy it as a component of our CDK application. It would also be convenient if we could pull specific resources out of the template and use them elsewhere in our CDK application.

We will use the cloudformation-include CDK module for this. It reads a CloudFormation template file and loads all of the resources, parameters, outputs, and other elements it discovers into your CDK application. You can then modify any of the template-defined objects directly in your CDK code, and use the template’s existing resources when building new CDK constructs. Visit this blog article for more details.

If constructs are highly specialised for a given application, it makes little sense to generalise and package them. If, on the other hand, they are used by several applications, they should be moved into a separate package with its own lifecycle and testing strategy.

For our ingestion pipeline architecture, make a new file.

touch lib/ingestion-pipeline.ts

Edit the lib/ingestion-pipeline.ts file and add the initial construct code.

import * as cdk from '@aws-cdk/core';
import * as include from '@aws-cdk/cloudformation-include';

import * as path from 'path';

export class IngestionPipeline extends cdk.Construct {
  constructor(scope: cdk.Construct, id: string) {
    super(scope, id);

    // The code that defines your construct goes here
  }
}

Next let’s use the cloudformation-include module to include the solution template.

export class IngestionPipeline extends cdk.Construct {
  constructor(scope: cdk.Construct, id: string) {
    super(scope, id);

    const template = new include.CfnInclude(this, 'Template', {
      templateFile: path.join(__dirname, '../templates/aws-kinesis-streaming-solution.json'),
    });
  }
}

Set Parameters

The template contains numerous CloudFormation Parameters. We could deploy it as-is, since all of the parameters have default values, but the cloudformation-include module lets us override any of them with build-time values if we prefer.

We add a parameters property to the CfnInclude definition. We will leave the default values in place for now, but later we might change them to accept values from input props.

export class IngestionPipeline extends cdk.Construct {
  constructor(scope: cdk.Construct, id: string) {
    super(scope, id);

    const template = new include.CfnInclude(this, 'Template', {
      templateFile: path.join(__dirname, '../templates/aws-kinesis-streaming-solution.json'),
      parameters: {
        ShardCount: 2,
        RetentionHours: 24,
        EnableEnhancedMonitoring: 'false',
        BufferingSize: 5,
        CompressionFormat: 'GZIP',
      },
    });
  }
}

This reflects a difference in how environments are handled between CloudFormation and the CDK. With plain CloudFormation, each environment shares a single template whose parameters are changed at deployment time. With the CDK, a separate template is generated for each environment instead.

In conventional AWS CloudFormation scenarios, your objective is to create a single parameterised artefact that can be configured for, and deployed to, numerous target environments. With the CDK, you should instead incorporate that configuration directly into your source code: create one stack for your production environment and a separate one for each of your other stages, and put each stage’s configuration parameters directly in the code. For sensitive values that you don’t want to check into source control, use services such as AWS Secrets Manager and AWS Systems Manager Parameter Store, referencing those resources by name or ARN.
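A minimal sketch of that idea, with hypothetical stage names and values: each stage’s configuration lives in source, and each stack instance is synthesised with its values already fixed.

```typescript
// Illustrative only: the stage names and values here are assumptions.
interface PipelineConfig {
  shardCount: number;
  retentionHours: number;
}

// Per-stage configuration checked into source control, rather than
// deploy-time template parameters.
const stageConfig: Record<string, PipelineConfig> = {
  dev:  { shardCount: 1, retentionHours: 24 },
  prod: { shardCount: 4, retentionHours: 168 },
};

// Each stage would get its own stack instance, and therefore its own
// synthesised template, with these values baked in at synth time.
function configFor(stage: string): PipelineConfig {
  const config = stageConfig[stage];
  if (!config) throw new Error(`Unknown stage: ${stage}`);
  return config;
}

console.log(configFor('prod').shardCount); // 4
```

Sensitive values would not live in this map; they would be looked up from Secrets Manager or Parameter Store by name or ARN.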

Preserve Logical IDs

The last attribute we’ll set is preserveLogicalIds. When this is false, the CDK renames the logical IDs of all the template’s elements using its own scheme, ensuring they are unique within your application. Because we intend to publish this as a CDK Construct, this is particularly important for us.

export class IngestionPipeline extends cdk.Construct {
  constructor(scope: cdk.Construct, id: string) {
    super(scope, id);

    const template = new include.CfnInclude(this, 'Template', {
      templateFile: path.join(__dirname, '../templates/aws-kinesis-streaming-solution.json'),
      parameters: {
        ShardCount: 2,
        RetentionHours: 24,
        EnableEnhancedMonitoring: 'false',
        BufferingSize: 5,
        CompressionFormat: 'GZIP',
      },
      preserveLogicalIds: false,
    });
  }
}

Create an Ingestion Pipeline Stack

Now that our construct is written, let’s create a CDK stack so we can deploy the infrastructure.

Create lib/ingestion-pipeline-stack.ts as a new file.

touch lib/ingestion-pipeline-stack.ts

Add the following code:

import * as cdk from '@aws-cdk/core';
import { IngestionPipeline } from './ingestion-pipeline';

export class IngestionPipelineStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new IngestionPipeline(this, 'IngestionPipeline');
  }
}

Now add the stack to our entry point, and make sure to update the env parameter for both stacks.

Edit bin/streaming-app.ts

#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from '@aws-cdk/core';
import { StreamingAppStack } from '../lib/streaming-app-stack';
import { IngestionPipelineStack } from '../lib/ingestion-pipeline-stack';

const app = new cdk.App();
new IngestionPipelineStack(app, 'IngestionPipelineStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION,
  },
});

new StreamingAppStack(app, 'StreamingAppStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION,
  },
});

Deploying the AWS CDK Ingestion Pipeline

Before deploying, let’s run cdk diff to see what will be created.

npx cdk@1.125.0 diff

You should see output similar to the following screenshot.

Now run cdk deploy to create the ingestion pipeline.
Choose yes on any confirmation prompts.
npx cdk@1.125.0 deploy --all

Review Deployment in the AWS Console

To get our bearings, let’s review what we deployed using the AWS Console. Open the CloudFormation console.

You should see something similar to the below Screenshot:

The mod-ee… stack is the group of resources that your AWS team set up for you in Event Engine.

The CDK Toolkit stack is made up of the set of roles and the S3 bucket generated when you bootstrapped the CDK.

The Ingestion Pipeline Stack and Streaming App Stack are the two stacks we just launched with the CDK and the code we wrote.

Since we haven’t built our streaming app yet, the Streaming App Stack isn’t very interesting.

Click the Events tab after entering the Ingestion Pipeline Stack.

The Events tab gives a play-by-play view of the resources’ status and creation order. If you run into difficulties while deploying a stack, this is the first place to look when troubleshooting.

Each resource is identified by a Logical ID, a Physical ID, a Type (which corresponds to the AWS service it belongs to), and some other status details.

Spend some time clicking through some of the Physical ID links, which take you to each deployed resource’s page in its service console. Many of these are managed services that AWS runs on your behalf, some with their own Infrastructure as Code architectures underneath.

The quantity and complexity of the resources supporting what appears to be a fairly straightforward architecture shows how effective Infrastructure as Code is at hiding the complexities of service options and interactions.

This post was getting long, so I’ve decided to create a Part #2, where I will go through deploying the second half of the solution and building the Streaming Application with the AWS CDK.
