Posts

View Categories

May 31, 2025
Strategies for querying periodic S3 data snapshots

A common AWS analytics use case is making aggregate queries across multiple data sets stored in S3. For example, one partner team may store product metadata, while another team stores purchase order metadata, and we may want to join these data sets to determine which products are most popular across each marketplace.
In this post I'll cover a few options for syncing data to S3 and retrieving data snapshots for use in aggregate queries, and specifically discuss the use case where we need to maintain access patterns to complete, recent data snapshots without partial unavailability during data syncs.

Continue reading...
May 10, 2025
Granting AWS Organization member accounts access to Cost Explorer

By default, adding accounts to an AWS Organization results in consolidated billing and cost management in the Organization management account, and Organization member accounts lose access to Cost Explorer, Billing, and other cost management services.
This post walks through how to allow member accounts to access cost management services, to enable each team to review and manage their AWS spending.

Continue reading...
May 7, 2025
Using AWS Organizations to standardize security controls across AWS accounts

AWS Organizations provide a helpful, zero-cost way to centralize management of AWS accounts, with support for consolidating billing, role-based permission sets, service and resource control policies, and AWS service configurations.

Continue reading...
Jul 14, 2024
Reducing Lambda latency by 76% with AWS Lambda Power Tuning

Optimizing AWS Lambda memory capacity can decrease customer-facing latencies by up to 2-5 times without significantly increasing hardware costs, but can take some trial and error.
The AWS Lambda Power Tuning tool can be used to determine the optimal memory capacity for any Lambda function within minutes, and can be set up in a few button clicks.

Continue reading...
May 18, 2024
Serializing and deserializing DynamoDB pagination tokens to support paginated APIs

When using AWS's Java 2.x SDK, DynamoDB scan and query responses provide pagination tokens in a String to AttributeValue Map object, which represents the primary key of the last processed DynamoDB item. You can then pass this value as the "exclusive start key" for the next query to get the next page of results.
When your service retrieves all pages of results locally, this isn't a problem. However, when you want to provide a paginated API backed by DynamoDB, you'll need to convert this attribute value map into a format that can be passed over HTTP, AKA "serialize" the object into a string.

Continue reading...
Dec 1, 2023
Concurrency from single host applications up to massively distributed services

This post will describe several levels of concurrency, how they're commonly applied, and pros and cons of each approach.
Many distributed services now start with multi-host clusters for reliability and scalability reasons, so that any given host can be replaced without impacting customer service, and additional hosts can be added as needed.

Continue reading...
May 31, 2023
Process for designing distributed systems

In this post I'll step through my process for designing distributed systems, with example questions and artifacts associated with each step.
We will take the following steps to design a new service:
1. Validate whether this service needs to exist
2. Clarify business requirements
3. Estimate scale
4. Define system interfaces and data models
5. Define data flow and storage
6. Define high-level system components
7. Design individual components
The artifacts of each step can be validated with stakeholders to ensure we're on the right track before continuing. They collectively add to a design document that can be referred to both while building the service, and afterwards to understand its inner workings.

Continue reading...
Apr 28, 2023
Choosing between AWS compute services

When building a new service in AWS, it can be difficult to decide between all the available compute services. In this post I’ll give a brief overview of the main options and describe how I compare and choose between them for a given project.

Continue reading...

« Newer 1 2 3 4 5 6 7 8 Older »

Posts

Strategies for querying periodic S3 data snapshots

Granting AWS Organization member accounts access to Cost Explorer

Using AWS Organizations to standardize security controls across AWS accounts

Reducing Lambda latency by 76% with AWS Lambda Power Tuning

Serializing and deserializing DynamoDB pagination tokens to support paginated APIs

Concurrency from single host applications up to massively distributed services

Process for designing distributed systems

Choosing between AWS compute services