Visualizing eBay Competitor Pricing: From Raw JSONL to Price Trend Dashboard
📊 eBay Competitor Price Tracker: End-to-End Pipeline
In the high-stakes world of e-commerce, price is often the only thing standing between a customer clicking "Buy It Now" on your listing or your competitor's. Scraping data is just the first step; the real value comes from turning raw data into actionable insights: spotting undercutting as it happens, identifying price floors, and tracking trends over time.
This guide shows how to build an end-to-end pipeline that:
- Scrapes eBay product data with Playwright.
- Processes the data with Pandas.
- Visualizes competitor price movements in a Streamlit dashboard.
1. The Setup
We'll use the Ebay.com-Scrapers repository, which contains production-ready scrapers optimized for eBay's structure.
Prerequisites
- Python 3.8+
- A ScrapeOps API key (for anti-bot bypass); you can get one from the ScrapeOps website
- Basic terminal skills
Install
# Clone the repo
git clone https://github.com/scraper-bank/Ebay.com-Scrapers.git
cd Ebay.com-Scrapers/python/playwright/product_data
# Install dependencies
pip install playwright playwright-stealth pandas streamlit
# Install the Chromium browser used by Playwright
playwright install chromium
We use the playwright/product_data implementation because it extracts granular details like productId, price, and availability, which are essential for time-series tracking.
2. Configuring the Scraper for Competitors
The default scraper handles individual URLs. To track competitors, run the scraper against a list of product pages at regular intervals.
Create a wrapper script called run_tracker.py that loops through target URLs and saves the results. The scraper automatically appends a timestamp to the filename (e.g., ebay_com_product_page_scraper_data_20260116_090000.jsonl), making historical tracking straightforward.
# run_tracker.py
import asyncio
from scraper.ebay_com_scraper_product_v1 import extract_data, API_KEY
from playwright.async_api import async_playwright

# List of competitor product URLs to monitor
COMPETITOR_URLS = [
    "https://www.ebay.com/itm/123456789012",
    "https://www.ebay.com/itm/987654321098",
]

async def run_monitoring_session():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        for url in COMPETITOR_URLS:
            print(f"Scraping competitor: {url}")
            await page.goto(url)
            data = await extract_data(page)
            # Data saving is handled by the DataPipeline class in the core scraper
            print(f"Extracted Price: {data.price} {data.currency}")
        await browser.close()

if __name__ == "__main__":
    asyncio.run(run_monitoring_session())
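Before scheduling anything, run the tracker once from the product_data directory to confirm the scraper and your API key are working:
python run_tracker.py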
3. Understanding the Data Structure
The scraper outputs JSONL (JSON Lines) files: a streamable format that is memory-efficient for large price-tracking datasets.
Example record
{
"productId": "123456789012",
"name": "Apple iPhone 15 Pro - 128GB - Blue Titanium",
"price": 899.0,
"currency": "USD",
"availability": "in_stock",
"seller": { "name": "TopTierElectronics", "rating": 99.8 }
}
Key details
- Price cleaning: the scraper converts strings like "$899.00" into a float (899.0).
- Availability: when a competitor goes out_of_stock, decide how to represent that on a chart (e.g., break the line or plot a zero).
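If you ever need to replicate that cleaning step yourself (say, on prices from another source), here is a minimal sketch; clean_price is an illustrative helper, not part of the repository:

def clean_price(raw: str) -> float:
    """Convert a raw price string like '$899.00' or '$1,299.99' to a float."""
    return float(raw.replace("$", "").replace(",", "").strip())

assert clean_price("$899.00") == 899.0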
4. Data Ingestion & Cleaning with Pandas
Each scraper run creates a new file. The first step is to merge all files into a single chronological DataFrame. We extract the timestamp from the filenames to build the time axis.
import pandas as pd
import glob
import re
import json
from datetime import datetime
def load_historical_data(directory="./"):
    """Load all JSONL files generated by the scraper into a single DataFrame."""
    all_data = []
    # Find all JSONL files matching the scraper's naming pattern
    files = glob.glob(f"{directory}/ebay_com_product_page_scraper_data_*.jsonl")
    for file in files:
        # Extract timestamp from filename: e.g., 20260116_090000
        match = re.search(r'(\d{8}_\d{6})', file)
        if not match:
            continue
        timestamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
        with open(file, "r", encoding="utf-8") as f:
            for line in f:
                item = json.loads(line)
                item["scrape_timestamp"] = timestamp
                all_data.append(item)
    df = pd.DataFrame(all_data)
    # Ensure price is numeric
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    return df
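A quick sanity check after a couple of scraper runs (assuming the JSONL files live in the current directory):

df = load_historical_data("./")
print(df[["scrape_timestamp", "productId", "price"]].head())
print(f"{df['productId'].nunique()} products across {df['scrape_timestamp'].nunique()} scrape runs")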
5. Building the Dashboard with Streamlit
Now we create a visual interface that lets you filter by product name and view price fluctuations over days or weeks.
dashboard.py
import streamlit as st
import pandas as pd

# Load the helper function from the previous section
from load_data import load_historical_data  # adjust import path as needed

st.set_page_config(page_title="eBay Price Tracker", layout="wide")
st.title("📈 eBay Competitor Price Trends")

# Load data
df = load_historical_data()

# Sidebar filters
available_products = df["name"].unique()
selected_products = st.sidebar.multiselect(
    "Select Products to Track",
    options=available_products,
    default=available_products[:2],
)

filtered_df = df[df["name"].isin(selected_products)]

# Main chart
if not filtered_df.empty:
    # Pivot data so each product has its own time series
    pivot = (
        filtered_df.pivot_table(
            index="scrape_timestamp",
            columns="name",
            values="price",
            aggfunc="first",
        )
        .sort_index()
    )
    st.line_chart(pivot)
else:
    st.info("No data available for the selected products.")
Run the dashboard:
streamlit run dashboard.py
You'll now have an interactive dashboard that:
- Shows price trends for any selected competitor product.
- Picks up new JSONL files as they appear; just refresh the page.
- Highlights outāofāstock periods via gaps in the line chart.
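For those gaps to actually appear, out-of-stock rows need NaN prices before pivoting. One way to do it, assuming the availability field from Section 3 (apply right after loading the data):

import numpy as np

# Mask prices captured while the listing was out of stock;
# st.line_chart renders NaN values as breaks in the line
df.loc[df["availability"] != "in_stock", "price"] = np.nan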
Latest-Price Metrics
To make the dashboard more glanceable, add metric cards showing the most recent price for each tracked product. This goes inside the if not filtered_df.empty: branch in dashboard.py, after st.line_chart(pivot):

# Latest-price metric cards, one per selected product
cols = st.columns(len(selected_products))
for i, product in enumerate(selected_products):
    # Sort by timestamp so iloc[-1] is the most recent observation
    history = filtered_df[filtered_df["name"] == product].sort_values("scrape_timestamp")
    latest_price = history.iloc[-1]["price"]
    cols[i].metric(label=product[:30] + "...", value=f"${latest_price:.2f}")
6. Automating the Workflow
A dashboard is only useful if the data is fresh. Running the scraper manually every morning is inefficient.
Linux / macOS
Set up a cron job to run the run_tracker.py script every 6 hours:
0 */6 * * * /usr/bin/python3 /path/to/run_tracker.py
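If the scraper writes its JSONL files relative to the working directory, change into the project directory first and capture output for easier debugging (paths are illustrative):
0 */6 * * * cd /path/to/product_data && /usr/bin/python3 run_tracker.py >> tracker.log 2>&1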
Windows
Use Task Scheduler to achieve the same result. Once automated, the Streamlit dashboard will update with the latest price points every time you refresh the page.
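A one-line equivalent with the built-in schtasks command (task name and paths are placeholders):
schtasks /create /sc hourly /mo 6 /tn "eBayPriceTracker" /tr "python C:\path\to\run_tracker.py"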
To Wrap Up
We have moved from simple data extraction to building a functional business tool. By combining the scraping capabilities of the ScrapeOps eBay repository with the analytical power of Pandas and Streamlit, you now have a custom priceāintelligence platform.
Key Takeaways
- JSONL is efficient: the best format for logging time-series scraping data.
- Filename timestamps: storing metadata in the filename prevents data loss if the internal JSON structure changes.
- Visualization works: a 5% price drop on a line chart is much easier to act on than scanning raw text files.
Next Steps
- Store the merged DataFrame in a lightweight database (SQLite or DuckDB) for faster queries.
- Add alerts (email, Slack, etc.) when a competitor undercuts your price by a configurable threshold.
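For the database step, a minimal sketch using the standard-library sqlite3 module with the DataFrame from Section 4 (the prices table name is arbitrary, and nested columns like seller need flattening or dropping first):

import sqlite3

con = sqlite3.connect("price_history.db")
# SQLite can't store nested dicts, so drop (or flatten) the seller column
flat = df.drop(columns=["seller"], errors="ignore")
flat.to_sql("prices", con, if_exists="append", index=False)
con.close()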
To learn more about bypassing anti-bot measures, check out the eBay Scraping Breakdown.
With this pipeline in place, raw scraping data becomes a strategic asset, giving you the visibility needed to stay competitive in real time. Happy tracking!