Troubleshoot Your API Faster: Using ELB Logs and Athena
Nic Lasdoce
Sep 24, 2023
This step-by-step guide is a startup's shortcut to robust logging and analytics with AWS ELB Access Logs and Athena. With minimal setup, you'll gain actionable insights to mitigate security and performance issues, freeing you to focus on business growth. This guide is your quick path to a more secure and optimized cloud infrastructure.
Introduction
Navigating logs and metrics has always been hard for developers. When issues arise, we need to be able to investigate deeper, ideally by querying the data with SQL, a language almost every developer knows. Problems like security vulnerabilities, performance bottlenecks, and unexpected user behavior often manifest in subtle ways that may go unnoticed until it's too late. This is where pairing AWS Elastic Load Balancer (ELB) Access Logs with AWS Athena comes into play. ELB Access Logs give you a detailed record of every request sent to your load balancer, capturing data points such as client IP, processing times, request URL, and status code that are invaluable for diagnostics and analytics. However, raw logs are just the beginning. The real power comes when you pair them with AWS Athena and analyze them using SQL.
Setting Up AWS ELB Access Logs
Create an S3 Bucket to store the Logs
- Navigate to the Amazon S3 dashboard by visiting https://console.aws.amazon.com/s3/.
- Click on "Create bucket."
- While on the "Create bucket" interface, carry out the following steps:
  - For the "Bucket name" field, enter a name that is unique across all existing S3 bucket names. Keep in mind that some regions might impose extra constraints on naming conventions. For additional details, consult the Amazon Simple Storage Service User Guide's section on bucket limitations.
  - In "AWS Region," choose the region where your load balancer is deployed.
  - Under "Default encryption," choose the Amazon S3-managed keys option (SSE-S3).
  - Click "Create bucket" to finalize the setup.
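- Attach the following policy to your bucket. Replace elb-account-id with the Elastic Load Balancing account ID for your load balancer's region (this is not your own account ID), and point "Resource" at the location where the logs will be written. If you don't use a prefix, leave "prefix/" out of the ARN.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::elb-account-id:root"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::bucket-name/prefix/AWSLogs/your-aws-account-id/*"
    }
  ]
}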
By following these instructions, you'll have a secure and region-specific S3 bucket ready to capture and store ELB logs, a critical step in establishing a robust analytics pipeline.
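If you prefer to generate the policy rather than edit it by hand, the following is a minimal Python sketch. The function name `build_elb_log_policy` and all argument values are illustrative assumptions, not part of any AWS SDK. It assumes the logs land under `[prefix/]AWSLogs/<your-account-id>/` in the bucket and that your region uses an ELB account ID as the principal.

```python
import json


def build_elb_log_policy(bucket: str, account_id: str, elb_account_id: str,
                         prefix: str = "") -> str:
    """Return bucket policy JSON that lets ELB write access logs (sketch)."""
    # Access logs are written under [prefix/]AWSLogs/<your-account-id>/,
    # so PutObject is only allowed on that part of the bucket.
    clean = prefix.strip("/")
    key_prefix = f"{clean}/" if clean else ""
    resource = f"arn:aws:s3:::{bucket}/{key_prefix}AWSLogs/{account_id}/*"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # elb_account_id is the regional ELB account, not your own.
                "Principal": {"AWS": f"arn:aws:iam::{elb_account_id}:root"},
                "Action": "s3:PutObject",
                "Resource": resource,
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket, prefix, and account IDs.
    print(build_elb_log_policy("bucket-name", "your-aws-account-id",
                               "elb-account-id", prefix="prefix"))
```

The output can be pasted into the bucket's "Permissions" tab as the bucket policy.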
Enabling Logging
- Navigate to the Amazon EC2 dashboard by visiting https://console.aws.amazon.com/ec2/.
- On the sidebar, click on "Load Balancers."
- Find and click on your load balancer's name to view its information page.
- Head over to the "Attributes" tab and select "Edit."
- In the "Monitoring" section, enable "Access logs."
- For the "S3 URI" field, input the appropriate URI where you want your logs stored. The format of the URI will depend on whether you're using a prefix or not.
  - If you're using a prefix: s3://bucket-name/prefix
  - Without a prefix: s3://bucket-name
- Click on "Save changes" to update your settings.
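The two URI forms above, and the folder the load balancer then writes into, can be sketched as small Python helpers. Both function names are hypothetical. The folder layout is the one this guide uses for the Athena LOCATION later on.

```python
def access_log_s3_uri(bucket: str, prefix: str = "") -> str:
    """Build the value for the "S3 URI" field, with or without a prefix."""
    prefix = prefix.strip("/")
    return f"s3://{bucket}/{prefix}" if prefix else f"s3://{bucket}"


def access_log_root(bucket: str, account_id: str, region: str,
                    prefix: str = "") -> str:
    """Return the S3 location under which the log objects are stored."""
    base = access_log_s3_uri(bucket, prefix)
    return f"{base}/AWSLogs/{account_id}/elasticloadbalancing/{region}/"


if __name__ == "__main__":
    print(access_log_s3_uri("bucket-name", "prefix"))
    print(access_log_root("bucket-name", "<ACCOUNT-ID>", "<REGION>"))
```

Keeping the second value handy saves a lookup when you configure the Athena table.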
Configuring AWS Athena for Log Analysis
- Open the Athena console and create a new database.
- Paste the following into your query editor.
- Replace the S3 path used in both LOCATION and "storage.location.template" (s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/) with the path to the S3 bucket you selected for ELB Access Logs, then run the query.
CREATE EXTERNAL TABLE IF NOT EXISTS alb_logs (
  type string,
  time string,
  elb string,
  client_ip string,
  client_port int,
  target_ip string,
  target_port int,
  request_processing_time double,
  target_processing_time double,
  response_processing_time double,
  elb_status_code int,
  target_status_code string,
  received_bytes bigint,
  sent_bytes bigint,
  request_verb string,
  request_url string,
  request_proto string,
  user_agent string,
  ssl_cipher string,
  ssl_protocol string,
  target_group_arn string,
  trace_id string,
  domain_name string,
  chosen_cert_arn string,
  matched_rule_priority string,
  request_creation_time string,
  actions_executed string,
  redirect_url string,
  lambda_error_reason string,
  target_port_list string,
  target_status_code_list string,
  classification string,
  classification_reason string
)
PARTITIONED BY ( day STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '1',
  'input.regex' = '([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \"([^ ]*) (.*) (- |[^ ]*)\" \"([^\"]*)\" ([A-Z0-9-_]+) ([A-Za-z0-9.-]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\" ([-.0-9]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^ ]*)\" \"([^\s]+?)\" \"([^\s]+)\" \"([^ ]*)\" \"([^ ]*)\"'
)
LOCATION 's3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/'
TBLPROPERTIES (
  "projection.enabled" = "true",
  "projection.day.type" = "date",
  "projection.day.range" = "2022/01/01,NOW",
  "projection.day.format" = "yyyy/MM/dd",
  "projection.day.interval" = "1",
  "projection.day.interval.unit" = "DAYS",
  "storage.location.template" = "s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/${day}"
)
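To get a feel for what the table definition extracts, here is a rough Python sketch that splits one log entry into the same column names. It is not the RegexSerDe itself. It relies on the simpler assumption that fields are space-separated and that double quotes wrap fields containing spaces. The sample line is an illustrative entry in the Application Load Balancer log format, not real traffic, and `parse_alb_line` is a hypothetical helper.

```python
import csv

# Order of the space-delimited fields in an entry. "client", "target", and
# "request" are split further below to match the table's columns.
RAW_FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol", "target_group_arn", "trace_id",
    "domain_name", "chosen_cert_arn", "matched_rule_priority",
    "request_creation_time", "actions_executed", "redirect_url",
    "lambda_error_reason", "target_port_list", "target_status_code_list",
    "classification", "classification_reason",
]

# Illustrative sample entry (assumption: made up for this example).
SAMPLE_LINE = (
    'http 2018-07-02T22:23:00.186641Z app/my-loadbalancer/50dc6c495c0c9188 '
    '192.168.131.39:2817 10.0.0.1:80 0.000 0.001 0.000 200 200 34 366 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.46.0" - - '
    'arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 '
    '"Root=1-58337262-36d228ad5d99923122bbe354" "-" "-" 0 '
    '2018-07-02T22:22:48.364000Z "forward" "-" "-" "10.0.0.1:80" "200" "-" "-"'
)


def parse_alb_line(line: str) -> dict:
    """Split one log entry into a dict keyed by the table's column names."""
    # csv handles the double-quoted fields; any extra trailing fields
    # beyond RAW_FIELDS are ignored by zip().
    tokens = next(csv.reader([line], delimiter=" ", quotechar='"'))
    row = dict(zip(RAW_FIELDS, tokens))
    # "ip:port" pairs become two columns each (a bare "-" leaves ip empty).
    row["client_ip"], _, row["client_port"] = row.pop("client").rpartition(":")
    row["target_ip"], _, row["target_port"] = row.pop("target").rpartition(":")
    # The request field is "VERB URL PROTOCOL".
    verb, _, rest = row.pop("request").partition(" ")
    url, _, proto = rest.rpartition(" ")
    row["request_verb"], row["request_url"], row["request_proto"] = verb, url, proto
    row["elb_status_code"] = int(row["elb_status_code"])
    return row


if __name__ == "__main__":
    parsed = parse_alb_line(SAMPLE_LINE)
    print(parsed["client_ip"], parsed["request_verb"], parsed["elb_status_code"])
```

A local parser like this can be handy for spot-checking a downloaded log file before you query it in Athena. The regular expression in the table definition remains the source of truth for what Athena actually sees.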
Analyzing Logs With Athena
Identify Specific Error Codes
Find out how many 4xx or 5xx error codes were encountered.
SELECT elb_status_code, count(*) FROM alb_logs WHERE elb_status_code >= 400 GROUP BY elb_status_code;
Monitoring Endpoints
Find out which endpoints are accessed most often.
SELECT request_url, count(*) FROM alb_logs GROUP BY request_url ORDER BY count(*) DESC;
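Filtering By Day
Identify the IPs that made the most requests on a given day. Filtering on the day partition also limits how much data Athena has to scan.
SELECT client_ip, count(*) FROM alb_logs WHERE day = '2022/02/12' GROUP BY client_ip ORDER BY count(*) DESC;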
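The reason the `day` filter is cheap is that the table's "storage.location.template" maps each day value to one S3 folder. The following Python sketch illustrates that mapping. `partition_location` is a hypothetical helper, and the template is the placeholder path from the table definition.

```python
from datetime import date

# Placeholder template copied from the table definition above.
TEMPLATE = (
    "s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/"
    "elasticloadbalancing/<REGION>/${day}"
)


def partition_location(template: str, day: date) -> str:
    """Expand ${day} the way the yyyy/MM/dd projection format implies."""
    # "yyyy/MM/dd" in projection.day.format corresponds to %Y/%m/%d here.
    return template.replace("${day}", day.strftime("%Y/%m/%d"))


if __name__ == "__main__":
    # WHERE day = '2022/02/12' resolves to this single folder.
    print(partition_location(TEMPLATE, date(2022, 2, 12)))
```

A query without a `day` filter has to consider every day in the projected range, so adding one is usually worthwhile.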
Filtering By Request Type
Segment logs by the type of HTTP request.
SELECT request_verb, count(*) FROM alb_logs GROUP BY request_verb;
Use Cases to Consider
- Security Audits: Unusual access patterns can be indicative of a security breach.
- Performance Tuning: Identifying endpoints with the most errors or longest response times helps in targeted optimizations.
- Client Behavior Analysis: Knowing which endpoints are accessed the most can aid in UX/UI decisions.
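As a sketch of the performance-tuning use case, the query below ranks endpoints by average target processing time and counts their errors. To keep it runnable without an AWS account, it is wrapped in Python and executed against an in-memory SQLite table with made-up rows. SQLite is only a local stand-in here, and the same SQL is intended for the alb_logs table in Athena but has only been exercised locally in this form.

```python
import sqlite3

# Made-up rows: (request_url, elb_status_code, target_processing_time)
ROWS = [
    ("https://api.example.com/orders", 200, 0.120),
    ("https://api.example.com/orders", 500, 2.400),
    ("https://api.example.com/login", 200, 0.050),
    ("https://api.example.com/login", 401, 0.040),
    ("https://api.example.com/reports", 200, 3.100),
]

QUERY = """
SELECT request_url,
       COUNT(*) AS requests,
       SUM(CASE WHEN elb_status_code >= 400 THEN 1 ELSE 0 END) AS errors,
       AVG(target_processing_time) AS avg_target_seconds
FROM alb_logs
GROUP BY request_url
ORDER BY avg_target_seconds DESC
"""


def slowest_endpoints(rows):
    """Load rows into a throwaway table and return endpoints, slowest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alb_logs (request_url TEXT, "
        "elb_status_code INTEGER, target_processing_time REAL)"
    )
    conn.executemany("INSERT INTO alb_logs VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for url, requests, errors, avg_seconds in slowest_endpoints(ROWS):
        print(f"{url}: {requests} requests, {errors} errors, {avg_seconds:.3f}s avg")
```

On real logs, target_processing_time is -1 when the load balancer could not dispatch the request to a target, so you may want to exclude those rows before averaging.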
Conclusion
Harnessing the power of AWS ELB logs and Athena can transform your approach to analytics and monitoring, allowing you to be more proactive rather than reactive. This setup is not just an architectural decision; it's a strategic move to better understand your system's inner workings.
So, next time you think about skipping logging and analytics, don't. The insights you'll gain are well worth the initial setup investment.