Building This Site

Published July 22, 2023

#security #devops

This site is built on AWS and uses Hugo.

The infrastructure is managed with Terraform and consists of:

  • An S3 bucket, configured as a static website
  • A Route53 Hosted Zone (DNS)
  • Certificates (ACM)
  • A CloudFront distribution for content caching
    • A CloudFront Function for rewriting URLs
    • An Origin Access Identity for CloudFront access to the S3 bucket
  • GitHub Actions (eventually)

Hugo

Hugo is a static site generator written in Go. All content is defined with templates and rendered at build time, before being pushed to the S3 origin. Articles are written in Markdown and translated into HTML by the Goldmark processor. I use Bootstrap for most of the CSS and JavaScript.

I chose to use Hugo over other options like Jekyll and Gatsby mostly because I am interested in learning Go at some point. I’m happy with the choice, but it didn’t do much to introduce me to Golang.

Terraform

A static website doesn’t require much to work, so the necessary Terraform configuration is pretty light. The CloudFront and ACM sections include some interesting pieces, but the rest is fairly standard. Really, Terraform is a bit of overkill for a project like this, but even a small bump in ease of maintenance is worth it in my opinion. Plus, it is fun to do and definitely nicer than clicking around the AWS console.

CloudFront

Using CloudFront allows content to be cached at edge locations closer to users, which results in faster load times and better site performance. A CloudFront distribution provides many options for defining cache behavior, but for a static site, not much is needed. As long as content doesn’t change very often once it is published, the default TTL values won’t be problematic. If objects do need to be cleared from the cache, it is easy to kick off an invalidation with an AWS CLI command:

aws cloudfront create-invalidation \
    --distribution-id $DISTRIBUTION_ID \
    --paths "/index.html" "/index.xml" <additional-paths>

Typically, I only perform invalidations for pages that are modified regularly, and this is part of the deployment process when new content is added.
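Put together, publishing new content from my machine boils down to a couple of commands, followed by the invalidation above. This is a sketch, not a canonical script; the bucket name is a placeholder for the real one:

```shell
# Render the site into ./public, Hugo's default output directory
hugo --minify

# Upload the rendered files to the S3 origin, deleting anything stale
aws s3 sync public/ s3://example-site-bucket --delete
```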

Using Hugo with CloudFront poses one minor issue: CloudFront doesn’t natively handle the pretty URLs that Hugo uses by default. Since every page gets its own index.html file, the objects that end up in S3 are individual index files with paths like /posts/post-name/index.html, while Hugo populates all of the site’s links in the pretty format /posts/post-name/. CloudFront only applies its default root object to the root URL, so every request for a pretty URL below the root will fail unless users manually append index.html to it. Fortunately, there are some solutions for this.

In the past, I have used Lambda@Edge functions to essentially hijack request events and rewrite the requested path to point at the full index.html object; the object is still returned to the user under the originally requested path. Lambda@Edge functions are fine for this, but for this site I have switched to CloudFront Functions. CloudFront Functions have some limitations that Lambda@Edge doesn’t, but none that matter for this simple use case. The benefits are simpler configuration and better performance, since CloudFront Functions run at the edge locations themselves while Lambda@Edge runs in regional edge caches. The code ends up being functionally the same:

// Viewer-request handler: rewrite directory-style URIs to their index object
function handler(event) {
  var request = event.request;

  // Hugo's pretty URLs end with a trailing slash, e.g. /posts/post-name/
  if (/\/$/.test(request.uri)) {
    // Point the request at the index file that actually exists in S3
    request.uri = request.uri + "index.html";
  }

  return request;
}
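Since the handler is plain JavaScript, the same rewrite logic can be sanity-checked locally with Node.js before publishing. The stub event below is a minimal subset of the real CloudFront Functions event object:

```javascript
// Same rewrite logic as the deployed function, exercised with stub events
function handler(event) {
  var request = event.request;

  // Directory-style URIs end with a slash and need index.html appended
  if (/\/$/.test(request.uri)) {
    request.uri = request.uri + "index.html";
  }

  return request;
}

console.log(handler({ request: { uri: "/posts/post-name/" } }).uri);
// → /posts/post-name/index.html

console.log(handler({ request: { uri: "/css/main.css" } }).uri);
// → /css/main.css (file requests pass through untouched)
```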

Terraform’s aws_cloudfront_distribution resource includes an option to set the function to be called as part of the default_cache_behavior:

resource "aws_cloudfront_function" "pretty_urls" {
  name    = "pretty_urls"
  runtime = "cloudfront-js-1.0"
  publish = true
  code    = file("${path.module}/pretty_urls.js")
}

resource "aws_cloudfront_distribution" "s3_distribution" {
  ...
  default_cache_behavior {
    ...
    function_association {
      event_type   = "viewer-request"
      function_arn = aws_cloudfront_function.pretty_urls.arn
    }
  }
}

ACM

To request an ACM cert, ownership of the domain the cert is being requested for must be proven. This can be done by email, which is not ideal because it requires an additional manual step, but it can also be done through DNS. By creating a DNS CNAME record with pre-determined values, the certificate issuer can be sure that the requester controls the domain. This would also be a manual process, but Terraform provides the aws_acm_certificate_validation resource specifically for handling this situation. Used in combination with aws_route53_record, Terraform can automatically create the DNS records that aws_acm_certificate asks for and wait for validation to complete before the newly-minted certificate is used in other places.

resource "aws_acm_certificate" "cert" {
  provider = aws.virginia

  domain_name       = var.domain_name
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# DNS record to prove ownership of domain the cert is requested for
resource "aws_route53_record" "domain_validation" {
  provider = aws.virginia

  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options : dvo.domain_name => {
      name    = dvo.resource_record_name
      record  = dvo.resource_record_value
      type    = dvo.resource_record_type
      zone_id = data.aws_route53_zone.static_site.zone_id
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = data.aws_route53_zone.static_site.zone_id
}
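The aws_acm_certificate_validation resource mentioned above is the piece that actually waits for the issuer to see the DNS records. A minimal version wired to the resources in this post looks like:

```hcl
# Blocks until ACM has validated the cert via the Route53 records above
resource "aws_acm_certificate_validation" "cert" {
  provider = aws.virginia

  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for record in aws_route53_record.domain_validation : record.fqdn]
}
```

Other resources, such as the CloudFront distribution’s viewer_certificate block, can then reference aws_acm_certificate_validation.cert.certificate_arn so they are only created after validation succeeds.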

One other thing worth noting is that these resources all need to use a provider configuration with the AWS region set to us-east-1. CloudFront is a global service, but it only accepts ACM certificates issued in us-east-1. Provider configurations are selected using the alias aws.virginia in each block. Resources without a provider argument use the default provider, which is simply the provider with no alias. Providers are defined in terraform.tf:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

provider "aws" {
  alias  = "virginia"
  region = "us-east-1"
}

GitHub Actions

I haven’t set this up yet, but I plan to write a GitHub Actions workflow that will check out the code in the site repo, install Hugo on a GitHub runner, generate the site content, and sync it to S3. Automating this will eliminate a few steps from my workflow. I’ll also go into the security considerations for this, since Actions on public repositories can pose some risks if implemented incorrectly, particularly when they interact with a hosting environment.
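As a rough sketch, the workflow would look something like the following. The action names are real, but the versions, secrets, and role are placeholders I haven’t tested; OIDC federation lets the runner assume an AWS role without storing long-lived keys:

```yaml
name: deploy

on:
  push:
    branches: [main]

# id-token is required for OIDC auth to AWS
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: peaceiris/actions-hugo@v2
        with:
          hugo-version: 'latest'

      - run: hugo --minify

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-west-2

      - run: aws s3 sync public/ "s3://${{ secrets.BUCKET_NAME }}" --delete
```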

Reference Materials

In building this site, I specifically referenced the following resources: