Real-Time ALB Log Analysis for Proactive Integration Recovery via Datadog Monitors, Workflows and AWS Lambda
Source: Dev.to
The Problem
- The Drama: Google Calendar events were giving us major side‑eye, and we had zero visibility into which users were feeling the pain.
- The Cause: Errors were mostly tied to channels that were ghosting us or events that couldn’t reach the API.
- The Worst Part: Our ELB/ALB didn’t have access logs enabled, so we were totally in the dark about it.
The Light of Reasoning
Why not add access logs? Since we already use Datadog, I decided to add the ALB/ELB access logs there.
This project’s infrastructure was built with Pulumi, so that’s the tool we’ll use to add the logs. If you haven’t worked with Pulumi before, check out their documentation. Pulumi lets you manage IaC for many providers with a single tool, similar to Terraform, but using general-purpose languages such as TypeScript.
Adding ALB/ELB Access Logs
What are Access Logs?
Access logs are a paper trail of everything that happens at the load balancer. Each log entry records:
- When the request arrived (timestamp)
- The client’s IP address
- The requested URL
- The response status code (e.g., 200, 404)
- The request latency
- The target instance ID that handled the request
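To make those fields concrete, here is a minimal TypeScript sketch that pulls the fields listed above out of a single ALB log line. This is a hypothetical helper, not part of any AWS SDK; the field positions follow AWS’s documented ALB access-log format (space-separated, with the request and user agent quoted):

```typescript
interface AlbLogEntry {
  timestamp: string;
  clientIp: string;
  statusCode: number;
  latencySeconds: number;
  request: string;
}

// Parse one ALB access-log line. Field order (per the AWS docs):
// type time elb client:port target:port request_processing_time
// target_processing_time response_processing_time elb_status_code ... "request" ...
function parseAlbLogLine(line: string): AlbLogEntry {
  // Split on whitespace, but keep quoted segments (like "GET ... HTTP/1.1") intact
  const fields = line.match(/"[^"]*"|\S+/g) ?? [];
  return {
    timestamp: fields[1],
    clientIp: fields[3].split(":")[0],
    statusCode: Number(fields[8]), // elb_status_code
    // Total latency = request + target + response processing times
    latencySeconds: Number(fields[5]) + Number(fields[6]) + Number(fields[7]),
    request: fields[12].replace(/"/g, ""),
  };
}
```

This is also roughly what you would do in a custom Lambda if you ever needed to pre-filter log lines before shipping them anywhere.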
Pulumi Code Example for a Fresh ALB/ELB
Below is a minimal Pulumi setup that creates an ALB, target group, listener, and listener rules. (In the original project the ALB already existed; this example shows how to create it from scratch.)
Create the ALB
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as awsx from "@pulumi/awsx";
// `vpc`, `securityGroup`, `lbLogs` (the S3 bucket for the logs) and
// `environment` are assumed to be defined elsewhere in the stack.
const mainAlb = new awsx.lb.ApplicationLoadBalancer("main-alb", {
name: "main-alb",
subnetIds: vpc.publicSubnetIds,
securityGroups: [securityGroup.id],
internal: false,
accessLogs: {
bucket: lbLogs.id,
prefix: "access-logs-lb",
enabled: true,
},
tags: {
stage: environment,
managed: "true",
},
});
Create the Target Group
const webhooksTargetGroup = new aws.lb.TargetGroup("webhooks-tg", {
port: 80,
protocol: "HTTP",
targetType: "ip",
vpcId: vpc.vpcId,
healthCheck: {
enabled: true,
healthyThreshold: 2,
unhealthyThreshold: 2,
interval: 10,
path: "/api/v1/health-check",
port: "traffic-port",
},
});
HTTPS Listener
// HTTPS listener
const httpsListener = new aws.lb.Listener("httpsListener", {
loadBalancerArn: mainAlb.loadBalancer.arn,
port: 443,
protocol: "HTTPS",
defaultActions: [{
type: "fixed-response",
fixedResponse: {
contentType: "text/plain",
statusCode: "404",
messageBody: "Not Found",
},
}],
sslPolicy: "ELBSecurityPolicy-2016-08",
certificateArn: myCertificateArn,
});
// Additional certificates
new aws.lb.ListenerCertificate("webhooks-attachment", {
listenerArn: httpsListener.arn,
certificateArn: webhooksCertificateArn,
});
Listener Rules
new aws.lb.ListenerRule("rule-webhooks", {
listenerArn: httpsListener.arn,
priority: 3,
actions: [{ type: "forward", targetGroupArn: webhooksTargetGroup.arn }],
conditions: [{ hostHeader: { values: [`webhooks.${route53ZoneName}`] } }],
});
Pulumi Code Example for an Already‑Created ALB/ELB
If the load balancer already exists, you only need to provision the S3 bucket for logs and enable logging on the ALB.
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
// 1. Create S3 bucket for ALB access logs
const albAccessLogsBucket = new aws.s3.Bucket("alb-access-logs", {
bucket: `alb-access-logs.${environment}.us-east-1.careops`,
acl: "private",
});
// Bucket policy to allow ELB to write logs
new aws.s3.BucketPolicy("albLogsBucketPolicy", {
bucket: albAccessLogsBucket.id,
policy: albAccessLogsBucket.arn.apply(bucketArn => JSON.stringify({
Version: "2012-10-17",
Statement: [
{
// Allow the ELB service account to write log objects
Effect: "Allow",
Principal: { AWS: "arn:aws:iam::127311923021:root" }, // ELB service account for us-east-1
Action: "s3:PutObject",
Resource: `${bucketArn}/*`,
},
{
// GetBucketAcl is evaluated against the bucket itself, not its objects
Effect: "Allow",
Principal: { AWS: "arn:aws:iam::127311923021:root" },
Action: "s3:GetBucketAcl",
Resource: bucketArn,
},
],
})),
});
// 2. Enable access logs on ALB
// 2. Enable access logs on the ALB. There is no standalone "attributes"
// resource in the AWS provider; instead, set the accessLogs property on the
// existing aws.lb.LoadBalancer resource:
const mainAlb = new aws.lb.LoadBalancer("main-alb", {
// ...existing ALB configuration...
accessLogs: {
bucket: albAccessLogsBucket.bucket,
prefix: "main-alb",
enabled: true,
},
});
Now you have a bucket that stores your ALB/ELB traffic logs.
Note: ELB publishes a log file for each load balancer node every 5 minutes. Delivery is eventually consistent, and multiple logs may be generated for the same period under high traffic.
The log file naming pattern is:
bucket[/prefix]/AWSLogs/aws-account-id/elasticloadbalancing/region/yyyy/mm/dd/
aws-account-id_elasticloadbalancing_region_app.load-balancer-id_end-time_ip-address_random-string.log.gz
Example:
s3://amzn-s3-demo-logging-bucket/logging-prefix/AWSLogs/123456789012/elasticloadbalancing/us-east-2/2022/05/01/
123456789012_elasticloadbalancing_us-east-2_app.my-loadbalancer.1234567890abcdef_20220215T2340Z_172.160.001.192_20sg8hgm.log.gz
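When processing these objects later (for example, inside a Lambda), the metadata embedded in the file name can be recovered by splitting on underscores. A minimal sketch of a hypothetical helper, assuming the documented naming pattern above:

```typescript
// Extract the account ID, region and end time from an ALB access-log object key.
function parseAlbLogKey(key: string): { accountId: string; region: string; endTime: string } {
  const fileName = key.split("/").pop() ?? "";
  // fileName parts: account-id, "elasticloadbalancing", region,
  // "app.<lb-id>", end-time, ip-address, random-string + ".log.gz"
  const parts = fileName.split("_");
  return { accountId: parts[0], region: parts[2], endTime: parts[4] };
}
```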
Forwarding the Logs to Datadog
To ship the logs to Datadog you need a Lambda function that processes new objects in the S3 bucket and forwards them to the Datadog API.
// 3. Deploy the Datadog Forwarder Lambda. Datadog's recommended path is its
// CloudFormation template (which you can wrap in an aws.cloudformation.Stack);
// the sketch below deploys the forwarder code directly instead.
// `datadogForwarderRole` must be able to read the log bucket and the API-key
// secret; `datadogApiKeySecret` is a Secrets Manager secret defined elsewhere.
const datadogForwarder = new aws.lambda.Function("datadogForwarder", {
runtime: "python3.11",
handler: "lambda_function.lambda_handler",
role: datadogForwarderRole.arn,
code: new pulumi.asset.AssetArchive({
".": new pulumi.asset.FileArchive("./datadog-forwarder"),
}),
environment: {
variables: {
DD_API_KEY_SECRET_ARN: datadogApiKeySecret.arn,
DD_SITE: "datadoghq.com",
DD_TAGS: `env:${environment}`,
},
},
});
// 4. Allow S3 to invoke the forwarder, then configure the bucket notification
// so every new log object triggers the Lambda
const allowS3Invoke = new aws.lambda.Permission("allowS3Invoke", {
action: "lambda:InvokeFunction",
function: datadogForwarder.name,
principal: "s3.amazonaws.com",
sourceArn: albAccessLogsBucket.arn,
});
new aws.s3.BucketNotification("albLogsNotification", {
bucket: albAccessLogsBucket.id,
lambdaFunctions: [{
lambdaFunctionArn: datadogForwarder.arn,
events: ["s3:ObjectCreated:*"],
filterPrefix: "main-alb/",
}],
}, { dependsOn: [allowS3Invoke] });
With this setup, every time a new access‑log file lands in the S3 bucket, the Lambda forwards it to Datadog, where you can create monitors, dashboards, and automated recovery workflows.
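As a starting point for the monitoring side, a Datadog log monitor on ALB 5xx responses can be managed from the same Pulumi project via the @pulumi/datadog provider. A sketch, in which the threshold, tags, and notification handle are illustrative assumptions:

```typescript
import * as datadog from "@pulumi/datadog";

// Alert when the ALB logs more than 20 5xx responses within 5 minutes.
const albErrorMonitor = new datadog.Monitor("alb-5xx-monitor", {
  name: "ALB 5xx spike",
  type: "log alert",
  query: 'logs("source:elb @http.status_code:[500 TO 599]").index("*").rollup("count").last("5m") > 20',
  message: "High 5xx rate on the ALB. Check webhook channels. @slack-ops",
  monitorThresholds: { critical: 20 },
  tags: ["env:production", "service:webhooks"],
});
```

A monitor like this can then trigger a Datadog Workflow that calls a recovery Lambda, closing the loop described in the title.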