Building a Serverless Data Analytics Pipeline with AWS: Premier League Dashboard

Published: December 16, 2025 at 10:51 AM EST

Source: Dev.to

Inspired by AWS Cookbook by John Culkin & Mike Zazon – Chapter 7: Big Data

My Journey into Data Analytics

While exploring AWS AI/ML services, I realized that artificial intelligence and machine learning are built on quality data. That insight led me to step back and master the data analytics fundamentals first. What better way than to build a complete serverless pipeline?

I created a scalable analytics solution using Amazon S3, Athena, and QuickSight to analyze Premier League data.

Why Premier League data?

As a passionate football enthusiast, I’m fascinated by the rich statistical narratives that unfold each season. Every match generates meaningful data points—from goals and assists to tactical formations and player‑performance metrics. This abundance of structured, real‑world data makes football analytics an ideal playground for learning data‑engineering concepts while working with something I genuinely care about.

What we’ll build

Serverless data storage with Amazon S3
SQL querying with Amazon Athena
Interactive dashboards with Amazon QuickSight

⚠️ Disclaimer: The Premier League data used in this project is completely fictional and for demonstration purposes only. If you see Manchester City with 150 points or Tottenham actually winning something, that’s just my creative data generation at work! 😄 Please don’t use this for your fantasy‑football decisions—you’ve been warned! For real Premier League data, check the official sources (and prepare for more realistic disappointment).

Architecture Overview

This serverless architecture eliminates infrastructure complexity while providing:

Scalability – automatic scaling without server management
Cost‑efficiency – pay‑per‑query pricing model
Speed – query results in seconds

Implementation Highlights

1️⃣ S3 Data Lake Setup

I stored Premier League CSV files in S3, creating a scalable data foundation:

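# Create a bucket with a random suffix and keep its name for the upload.
# Outside us-east-1, create-bucket also needs:
#   --create-bucket-configuration LocationConstraint=<region>
BUCKET="premier-league-data-$(openssl rand -hex 3)"
aws s3api create-bucket --bucket "$BUCKET"
aws s3 cp data/ "s3://$BUCKET/raw-data/" --recursive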

2️⃣ Athena SQL Querying

Created External Tables

Athena’s power lies in querying data directly from S3 without moving it. Here’s how I created the tables:

-- Standings table
-- Athena reads every file under LOCATION, so each table needs its own prefix.
-- This assumes the standings CSV was uploaded under raw-data/standings/.
CREATE EXTERNAL TABLE IF NOT EXISTS standings (
    team_name        STRING,
    matches_played   INT,
    wins             INT,
    draws            INT,
    losses           INT,
    goals_for        INT,
    goals_against    INT,
    goal_difference  INT,
    points           INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://your-bucket/raw-data/standings/'
TBLPROPERTIES ('skip.header.line.count'='1');

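-- Match results table
-- Uses its own prefix so Athena does not also read the standings files.
-- This assumes the match results CSV was uploaded under raw-data/match_results/.
CREATE EXTERNAL TABLE IF NOT EXISTS match_results (
    team_name   STRING,
    result_type STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://your-bucket/raw-data/match_results/'
TBLPROPERTIES ('skip.header.line.count'='1');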
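Since the data in this project is fictional, a file matching the standings columns can be generated locally before uploading. The following is a minimal sketch, not the generator used for this project: the function names, the team names, and the value ranges are all made up for illustration. It keeps each row internally consistent (3 points per win, 1 per draw) and writes a single header line, which `skip.header.line.count` then skips.

```python
import csv
import io
import random

# Column order mirrors the standings table definition.
COLUMNS = [
    "team_name", "matches_played", "wins", "draws", "losses",
    "goals_for", "goals_against", "goal_difference", "points",
]


def make_row(team, rng, matches=38):
    """Build one internally consistent, entirely fictional standings row."""
    wins = rng.randint(0, matches)
    draws = rng.randint(0, matches - wins)
    losses = matches - wins - draws
    goals_for = rng.randint(20, 100)       # arbitrary illustrative range
    goals_against = rng.randint(20, 100)   # arbitrary illustrative range
    return {
        "team_name": team,
        "matches_played": matches,
        "wins": wins,
        "draws": draws,
        "losses": losses,
        "goals_for": goals_for,
        "goals_against": goals_against,
        "goal_difference": goals_for - goals_against,
        "points": 3 * wins + draws,  # 3 points per win, 1 per draw
    }


def standings_csv(teams, seed=0):
    """Return CSV text: one header line followed by one row per team."""
    rng = random.Random(seed)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for team in teams:
        writer.writerow(make_row(team, rng))
    return buf.getvalue()


if __name__ == "__main__":
    print(standings_csv(["Team A", "Team B", "Team C"]))
```

One caveat: team names should not contain commas, because a table declared with `FIELDS TERMINATED BY ','` splits on every comma and does not interpret quoted fields.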

Sample Analytics Queries

-- Top 6 teams analysis
SELECT team_name, points, goal_difference
FROM   standings
ORDER BY points DESC
LIMIT 6;

-- Win‑percentage calculation
SELECT team_name,
       ROUND((wins * 100.0 / matches_played), 2) AS win_percentage
FROM   standings
ORDER BY win_percentage DESC;

-- Verify data integrity
SELECT * FROM standings;
SELECT * FROM match_results;
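Queries like these can be tried locally before running them against S3. The sketch below is an assumption-laden stand-in, not part of the original pipeline: it loads three invented rows into an in-memory SQLite database, runs the same win-percentage query, and adds a stricter integrity check than `SELECT *`. SQLite's dialect differs from Athena's, but these statements use only portable SQL.

```python
import sqlite3

# Entirely fictional rows, in the standings column order.
ROWS = [
    ("Team A", 38, 28, 6, 4, 90, 30, 60, 90),
    ("Team B", 38, 19, 10, 9, 60, 45, 15, 67),
    ("Team C", 38, 10, 8, 20, 35, 62, -27, 38),
]


def run_checks(rows):
    """Return (win percentages, names of rows whose derived columns disagree)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE standings (
               team_name TEXT, matches_played INTEGER, wins INTEGER,
               draws INTEGER, losses INTEGER, goals_for INTEGER,
               goals_against INTEGER, goal_difference INTEGER, points INTEGER)"""
    )
    conn.executemany("INSERT INTO standings VALUES (?,?,?,?,?,?,?,?,?)", rows)

    # Same calculation as the win-percentage query.
    win_pct = conn.execute(
        """SELECT team_name,
                  ROUND(wins * 100.0 / matches_played, 2) AS win_percentage
           FROM standings
           ORDER BY win_percentage DESC"""
    ).fetchall()

    # Rows where totals, points, or goal difference do not add up.
    inconsistent = conn.execute(
        """SELECT team_name FROM standings
           WHERE wins + draws + losses <> matches_played
              OR points <> 3 * wins + draws
              OR goal_difference <> goals_for - goals_against"""
    ).fetchall()
    conn.close()
    return win_pct, [name for (name,) in inconsistent]


if __name__ == "__main__":
    percentages, bad_rows = run_checks(ROWS)
    for team, pct in percentages:
        print(team, pct)
    print("inconsistent rows:", bad_rows)
```

The integrity query in the second half should also be usable in Athena as written, since it relies only on arithmetic and comparison operators.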

3️⃣ QuickSight Dashboard (Brief Overview)

Connect QuickSight to the Athena data source.
Import the standings and match_results tables.
Build visualisations such as:
- Bar chart of points per team.
- Heat‑map of win percentages.
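- Line chart showing points progression over the season (this needs a per-matchweek dataset; the standings table above holds only season totals).

Athena reads whatever files sit under each table's S3 location at query time, so new CSV files appear in query results without a metadata refresh (these tables are not partitioned). The dashboard picks them up on the next query if the QuickSight dataset uses direct query; a dataset imported into SPICE needs a scheduled or manual refresh.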

Takeaways

Serverless services (S3 + Athena + QuickSight) let you focus on data, not infrastructure.
Athena returns results in seconds and bills per query, based on the data it scans in S3.
QuickSight turns query results into interactive visualisations with virtually no operational overhead.

Give it a try: swap the football data for any CSV-based dataset, and you'll have a fully managed analytics pipeline ready in minutes!

Key Athena Benefits

No data movement required
Standard SQL interface
Pay only for data scanned ($5/TB)
Results in seconds
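To make the pay-per-scan model concrete, here is a rough estimator. It is a sketch under stated assumptions rather than an official formula: it uses the $5/TB figure above, assumes scanned bytes are rounded up to the next megabyte with a 10 MB minimum per query (the rule on the Athena pricing page at the time of writing), and treats 1 TB as 1024^4 bytes. The function name and parameters are hypothetical.

```python
import math

MB = 1024 ** 2
TB = 1024 ** 4  # assumption: binary terabyte


def athena_query_cost(bytes_scanned, price_per_tb=5.0, min_mb=10):
    """Estimate the charge for a single query in USD.

    Assumes scanned bytes are rounded up to the next megabyte and that
    each query is billed for at least `min_mb` megabytes. Check current
    pricing for your region before relying on the numbers.
    """
    billed_mb = max(math.ceil(bytes_scanned / MB), min_mb)
    return billed_mb * MB / TB * price_per_tb


if __name__ == "__main__":
    # A small CSV falls under the per-query minimum.
    print(f"50 KB scan: ${athena_query_cost(50_000):.6f}")
    print(f"1 TB scan:  ${athena_query_cost(TB):.2f}")
```

For CSV files of the size used here, every query is billed at the minimum, which is why the monthly Athena cost stays at a few cents.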

QuickSight Dashboards

Built interactive visualizations, including:

League standings table
Points comparison charts
Goal‑difference analysis
Team performance metrics

Business Value for Management

QuickSight delivers immediate ROI through:

Decision Speed – Real‑time dashboards eliminate waiting for IT reports
Cost Savings – $9/user/month vs $70+/user/month for traditional BI tools (e.g., Tableau)
Self‑Service Analytics – Business users create their own insights without technical dependencies
Mobile Access – Executive dashboards available anywhere, anytime
Scalability – Handles 10 users or 10 000 users with the same architecture
Security – Enterprise‑grade AWS security and compliance built‑in

Management Benefits

Reduce reporting cycle from weeks to minutes
Democratize data access across all departments
Lower total cost of ownership by 60‑80 % vs traditional solutions
Eliminate server maintenance and upgrade costs

Results & Insights

Cost Breakdown

S3 Storage: ~$0.05/month
Athena Queries: ~$0.25/month
QuickSight: $9/user/month

Total: ~$9.30/month with a single QuickSight user for enterprise‑grade analytics!
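The total is the sum of the three line items for one QuickSight user. As a small worked sketch, the function below reproduces it and shows how the figure moves with the number of users. It assumes, as a simplification, that the S3 and Athena costs stay flat while only the QuickSight seat count changes; the names are illustrative.

```python
# Monthly figures quoted in the cost breakdown (USD).
S3_STORAGE = 0.05
ATHENA_QUERIES = 0.25
QUICKSIGHT_PER_USER = 9.00


def monthly_total(users=1):
    """Total monthly cost for a given number of QuickSight users.

    Simplification: storage and query costs are held constant, so only
    the per-user QuickSight charge scales.
    """
    return round(S3_STORAGE + ATHENA_QUERIES + QUICKSIGHT_PER_USER * users, 2)


if __name__ == "__main__":
    for users in (1, 5, 20):
        print(f"{users:>2} user(s): ${monthly_total(users):.2f}/month")
```

At this scale the QuickSight subscription dominates the bill, so the per-user price is the number to watch as a team grows.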

Key Learnings

✅ Setup completed in under 2 hours
✅ Serverless = zero infrastructure management
✅ SQL familiarity accelerated development
⚠️ QuickSight permissions needed some troubleshooting at first

Next Steps

This foundation opens doors to:

Real‑time data integration
Machine‑learning predictions
Advanced ETL pipelines with AWS Glue

Final Reflections

Starting with data fundamentals before diving into AI/ML proved invaluable. This serverless analytics pipeline demonstrates that powerful data solutions don’t require complex infrastructure—just the right AWS services working together.

The S3 + Athena + QuickSight combination delivers enterprise‑grade analytics at startup costs, making it perfect for both learning and production use cases.

Resources

GitHub Repository: AWS‑Analytics Project
AWS Cookbook – Chapter 7 (Big Data)
AI‑Powered BI Tool: Amazon QuickSight
Athena SQL Documentation: Amazon Athena
Serverless S3: Amazon S3

Building your own data pipeline? Connect with me on LinkedIn – I’d love to hear about your experience!