Building a Serverless Data Analytics Pipeline with AWS: Premier League Dashboard
Source: Dev.to
Inspired by AWS Cookbook by John Culkin & Mike Zazon – Chapter 7: Big Data
My Journey into Data Analytics
While exploring AWS AI/ML services, I realized that artificial intelligence and machine learning are built on quality data. That insight led me to step back and master the data analytics fundamentals first. What better way than to build a complete serverless pipeline?
I created a scalable analytics solution using AWS S3, Athena, and QuickSight to analyze Premier League data.
Why Premier League data?
As a passionate football enthusiast, I’m fascinated by the rich statistical narratives that unfold each season. Every match generates meaningful data points—from goals and assists to tactical formations and player‑performance metrics. This abundance of structured, real‑world data makes football analytics an ideal playground for learning data‑engineering concepts while working with something I genuinely care about.
What we’ll build
- Serverless data storage with Amazon S3
- SQL querying with Amazon Athena
- Interactive dashboards with Amazon QuickSight
⚠️ Disclaimer: The Premier League data used in this project is completely fictional and for demonstration purposes only. If you see Manchester City with 150 points or Tottenham actually winning something, that’s just my creative data generation at work! 😄 Please don’t use this for your fantasy‑football decisions—you’ve been warned! For real Premier League data, check the official sources (and prepare for more realistic disappointment).
Architecture Overview
This serverless architecture eliminates infrastructure complexity while providing:
- Scalability – automatic scaling without server management
- Cost‑efficiency – pay‑per‑query pricing model
- Speed – query results in seconds
Implementation Highlights
1️⃣ S3 Data Lake Setup
I stored Premier League CSV files in S3, creating a scalable data foundation:
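# Create the bucket once and reuse its name for the upload
# (substitute it for "your-bucket" in the table definitions).
# Outside us-east-1, create-bucket also needs --create-bucket-configuration LocationConstraint=<region>.
BUCKET="premier-league-data-$(openssl rand -hex 3)"
aws s3api create-bucket --bucket "$BUCKET"
aws s3 cp data/ "s3://$BUCKET/raw-data/" --recursive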
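Athena reads every file under a table's LOCATION prefix, so it can help to keep each CSV in its own folder before uploading. The sketch below assumes a hypothetical local layout (`data/standings/` and `data/match_results/`) and a few made-up rows whose columns match the table definitions in this post. The file names and values are illustrative only, not the actual dataset.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical local layout: one folder per table.
mkdir -p data/standings data/match_results

# Header plus two made-up rows; columns follow the standings table schema.
cat > data/standings/standings.csv <<'EOF'
team_name,matches_played,wins,draws,losses,goals_for,goals_against,goal_difference,points
Arsenal,38,26,6,6,88,43,45,84
Liverpool,38,24,10,4,86,41,45,82
EOF

# Header plus two made-up rows; columns follow the match_results table schema.
cat > data/match_results/match_results.csv <<'EOF'
team_name,result_type
Arsenal,Win
Liverpool,Draw
EOF

# Sanity check before uploading: every row must have as many fields as the header.
check_csv() {
  awk -F',' 'NR == 1 { n = NF } NF != n { bad = 1 } END { exit bad + 0 }' "$1"
}

check_csv data/standings/standings.csv && echo "standings.csv OK"
check_csv data/match_results/match_results.csv && echo "match_results.csv OK"
```

A recursive `aws s3 cp` of `data/` keeps these sub-folders, so each table can point at its own prefix.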
2️⃣ Athena SQL Querying
Creating External Tables
Athena’s power lies in querying data directly from S3 without moving it. Here’s how I created the tables:
-- Standings table
CREATE EXTERNAL TABLE IF NOT EXISTS standings (
team_name STRING,
matches_played INT,
wins INT,
draws INT,
losses INT,
goals_for INT,
goals_against INT,
goal_difference INT,
points INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
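LOCATION 's3://your-bucket/raw-data/standings/'
TBLPROPERTIES ('skip.header.line.count'='1');
-- Match results table
-- Note: Athena reads every file under a table's LOCATION, so each table points at
-- its own prefix here (the standings/ and match_results/ folder names are assumed).
CREATE EXTERNAL TABLE IF NOT EXISTS match_results (
team_name STRING,
result_type STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://your-bucket/raw-data/match_results/'
TBLPROPERTIES ('skip.header.line.count'='1');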
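These statements can be pasted into the Athena console, but they can also be submitted from the command line. The following is a minimal sketch rather than the setup used in this project: the `run_athena_query` helper, the `default` database, and the `athena-results/` output prefix are all assumptions to replace with your own values.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed names: replace with your own database and query-results location.
ATHENA_DATABASE="${ATHENA_DATABASE:-default}"
ATHENA_OUTPUT="${ATHENA_OUTPUT:-s3://your-bucket/athena-results/}"

# Hypothetical helper: submit one SQL statement, poll until it reaches a
# terminal state, then print "<query-execution-id> <state>".
run_athena_query() {
  local sql="$1" id state
  id=$(aws athena start-query-execution \
        --query-string "$sql" \
        --query-execution-context "Database=${ATHENA_DATABASE}" \
        --result-configuration "OutputLocation=${ATHENA_OUTPUT}" \
        --query 'QueryExecutionId' --output text)
  while true; do
    state=$(aws athena get-query-execution --query-execution-id "$id" \
              --query 'QueryExecution.Status.State' --output text)
    case "$state" in
      SUCCEEDED|FAILED|CANCELLED) break ;;
    esac
    sleep 2
  done
  echo "$id $state"
}

# Example (requires AWS credentials and Athena permissions):
# run_athena_query "SELECT team_name, points FROM standings ORDER BY points DESC LIMIT 6"
```

Once a query has succeeded, `aws athena get-query-results --query-execution-id <id>` returns its rows.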
Sample Analytics Queries
-- Top 6 teams analysis
SELECT team_name, points, goal_difference
FROM standings
ORDER BY points DESC
LIMIT 6;
-- Win‑percentage calculation
SELECT team_name,
ROUND((wins * 100.0 / matches_played), 2) AS win_percentage
FROM standings
ORDER BY win_percentage DESC;
-- Verify data integrity
SELECT * FROM standings;
SELECT * FROM match_results;
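To cross-check a query result without touching AWS, the win-percentage calculation can be mirrored locally with `awk`. This is only a rough stand-in for the Athena query, run against a few made-up rows in the same column order as the standings table. The `win_percentage` helper name is hypothetical, and the `$2 > 0` guard against division by zero is an addition that the SQL version does not have.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Made-up rows in the same column order as the standings table.
csv=$(mktemp)
cat > "$csv" <<'EOF'
team_name,matches_played,wins,draws,losses,goals_for,goals_against,goal_difference,points
Arsenal,38,26,6,6,88,43,45,84
Liverpool,38,24,10,4,86,41,45,82
Everton,38,10,12,16,40,55,-15,42
EOF

# Skip the header (NR > 1), compute wins * 100 / matches_played to two decimals,
# then sort by that percentage in descending order.
win_percentage() {
  awk -F',' 'NR > 1 && $2 > 0 { printf "%s,%.2f\n", $1, $3 * 100 / $2 }' "$1" |
    sort -t',' -k2,2nr
}

win_percentage "$csv"
```

For the sample rows this prints Arsenal at 68.42, Liverpool at 63.16, and Everton at 26.32.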
3️⃣ QuickSight Dashboard (Brief Overview)
- Connect QuickSight to the Athena data source.
- Import the standings and match_results tables.
- Build visualisations such as:
- Bar chart of points per team.
- Heat‑map of win percentages.
- Line chart showing points progression over the season.
The dashboard picks up new CSV files once they land under the table's S3 prefix, because Athena scans whatever is in that location on each query. Datasets imported into SPICE need a refresh before the new rows appear.
Takeaways
- Serverless services (S3 + Athena + QuickSight) let you focus on data, not infrastructure.
- Athena provides instant, pay‑per‑query analytics on data stored in S3.
- QuickSight turns query results into interactive visualisations with virtually no operational overhead.
Give it a try—swap the football data for any CSV‑based dataset, and you’ll have a fully‑managed analytics pipeline ready in minutes!
Key Athena Benefits
- No data movement required
- Standard SQL interface
- Pay only for data scanned ($5/TB)
- Results in seconds
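The pay-per-scan model is easy to reason about with a little arithmetic. The sketch below estimates the charge for a single query from the number of bytes scanned. It assumes the $5/TB rate quoted above, 1 TB = 1024^4 bytes, and a 10 MB minimum billed per query. Check the current Athena pricing page before relying on these numbers; the `athena_query_cost_usd` helper name is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Estimate the charge for one query from the bytes it scanned.
# Assumptions: 5 USD per TB, 1 TB = 1024^4 bytes, 10 MB minimum per query.
athena_query_cost_usd() {
  awk -v bytes="$1" 'BEGIN {
    min = 10 * 1024 * 1024
    if (bytes < min) bytes = min
    printf "%.6f\n", bytes / (1024 ^ 4) * 5
  }'
}

athena_query_cost_usd 1099511627776   # 1 TB scanned
athena_query_cost_usd 50000           # small CSV, billed at the 10 MB minimum
```

For CSV files of a few kilobytes, each query is billed at the minimum, which works out to a small fraction of a cent.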
QuickSight Dashboards
Built interactive visualizations, including:
- League standings table
- Points comparison charts
- Goal‑difference analysis
- Team performance metrics
Business Value for Management
QuickSight delivers immediate ROI through:
- Decision Speed – Real‑time dashboards eliminate waiting for IT reports
- Cost Savings – $9/user vs $70+ for traditional BI tools (e.g., Tableau)
- Self‑Service Analytics – Business users create their own insights without technical dependencies
- Mobile Access – Executive dashboards available anywhere, anytime
- Scalability – Handles 10 users or 10 000 users with the same architecture
- Security – Enterprise‑grade AWS security and compliance built‑in

Management Benefits
- Reduce reporting cycle from weeks to minutes
- Democratize data access across all departments
- Lower total cost of ownership by 60‑80 % vs traditional solutions
- Eliminate server maintenance and upgrade costs
Results & Insights
Cost Breakdown
- S3 Storage: ~$0.05/month
- Athena Queries: ~$0.25/month
- QuickSight: $9/user/month
Total: ~$9.30/month for enterprise‑grade analytics!
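The total is dominated by the per-user QuickSight fee, so it scales roughly with headcount. As a rough sketch using only the figures in the breakdown above, and assuming (simplistically) that the S3 and Athena costs stay flat as users are added:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Monthly estimate from the breakdown above:
# S3 ~0.05 USD, Athena ~0.25 USD, QuickSight 9 USD per user.
# Hypothetical helper; storage and query costs are assumed constant.
monthly_cost_usd() {
  awk -v users="$1" 'BEGIN { printf "%.2f\n", 0.05 + 0.25 + 9 * users }'
}

monthly_cost_usd 1    # single author
monthly_cost_usd 5    # hypothetical five-person team
```

With one user this reproduces the ~$9.30 figure, and a five-person team would come to about $45.30 under the same assumptions.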
Key Learnings
- ✅ Setup completed in under 2 hours
- ✅ Serverless = zero infrastructure management
- ✅ SQL familiarity accelerated development
- ⚠️ QuickSight permissions required some initial troubleshooting
Next Steps
This foundation opens doors to:
- Real‑time data integration
- Machine‑learning predictions
- Advanced ETL pipelines with AWS Glue
Final Reflections
Starting with data fundamentals before diving into AI/ML proved invaluable. This serverless analytics pipeline demonstrates that powerful data solutions don’t require complex infrastructure—just the right AWS services working together.
The S3 + Athena + QuickSight combination delivers enterprise‑grade analytics at startup costs, making it perfect for both learning and production use cases.
Resources
- GitHub Repository: AWS‑Analytics Project
- AWS Cookbook – Chapter 7 (Big Data)
- AI‑Powered BI Tool: Amazon QuickSight
- Athena SQL Documentation: Amazon Athena
- Serverless S3: Amazon S3
Building your own data pipeline? Connect with me on LinkedIn – I’d love to hear about your experience!





