Build an AI-Powered Competitive Intelligence Monitor
Source: Dev.to
Competitive Intelligence Monitor
Staying ahead of competitors requires constant vigilance—tracking product launches, funding rounds, partnerships, and strategic moves across the web. The open‑source Competitive Intelligence Monitor project demonstrates how to automate this process using CocoIndex, Tavily Search, and LLM extraction to continuously track and structure competitor news into a queryable PostgreSQL database.
How It Works
The system automates web monitoring in three stages:
- Tavily AI Search – pulls full‑text articles.
- LLM Extraction (GPT‑4o‑mini) – detects structured “competitive events”.
- PostgreSQL – stores events and source articles for query‑driven intelligence.
Event Types Extracted
- Product launches and feature releases
- Partnerships and collaborations
- Funding rounds and financial news
- Key executive hires / departures
- Acquisitions and mergers
Events and their source articles are stored in PostgreSQL, so teams can ask natural-language questions such as:
- “What has Anthropic been doing recently?”
- “Which competitors are making the most news this week?”
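A question like "Which competitors are making the most news this week?" comes down to a grouped count over the stored events. The sketch below uses an in-memory SQLite database purely as a stand-in for PostgreSQL so that it runs anywhere. The table and column names (`events`, `competitor`, `published_at`) and the sample rows are assumptions for illustration, not the project's actual schema or data.

```python
import sqlite3
from datetime import datetime, timedelta


def top_competitors(conn: sqlite3.Connection, since: datetime) -> list[tuple[str, int]]:
    """Count events per competitor published on or after `since`."""
    rows = conn.execute(
        """
        SELECT competitor, COUNT(*) AS n
        FROM events
        WHERE published_at >= ?
        GROUP BY competitor
        ORDER BY n DESC, competitor ASC
        """,
        (since.isoformat(),),
    ).fetchall()
    return [(name, count) for name, count in rows]


# Hypothetical sample data; a real deployment would query PostgreSQL instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (competitor TEXT, event_type TEXT, published_at TEXT)")
now = datetime(2025, 1, 15)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("Anthropic", "partnership", (now - timedelta(days=1)).isoformat()),
        ("Anthropic", "funding", (now - timedelta(days=2)).isoformat()),
        ("OpenAI", "product_launch", (now - timedelta(days=3)).isoformat()),
        ("OpenAI", "key_hire", (now - timedelta(days=30)).isoformat()),  # outside the window
    ],
)
weekly = top_competitors(conn, since=now - timedelta(days=7))
```

Storing timestamps as ISO 8601 strings keeps the comparison correct in SQLite; with PostgreSQL a native `timestamp` column would be the more natural choice.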
Architecture Diagram
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Tavily AI   │────▶│  CocoIndex   │────▶│  PostgreSQL  │
│    Search    │     │   Pipeline   │     │   Database   │
└──────────────┘     └──────────────┘     └──────────────┘
       │                    │                    │
       ▼                    ▼                    ▼
    Articles            Extraction          Intelligence
   (web data)         (GPT‑4o‑mini)         (structured)
Data flows from Tavily search results into an LLM extraction step that produces CompetitiveEvent objects, then into two tables—one for raw articles and another for normalized events.
Data Model
import dataclasses


@dataclasses.dataclass
class CompetitiveEvent:
    """A competitive intelligence event extracted from text.

    Examples:
    - Product Launch: "OpenAI released GPT‑5 with multimodal capabilities"
    - Partnership: "Anthropic partnered with Google Cloud for enterprise AI"
    - Funding: "Mistral AI raised $400M Series B led by Andreessen Horowitz"
    - Key Hire: "Former Meta AI director joined Cohere as Chief Scientist"
    - Strategic Move: "Microsoft acquired AI startup Inflection for $650M"
    """

    event_type: str  # "product_launch", "partnership", "funding",
                     # "key_hire", "acquisition", "other"
    competitor: str  # Company name (e.g., "OpenAI")
    description: str  # Brief description of the event
    significance: str  # "high", "medium", "low"
    related_companies: list[str]  # Other companies mentioned
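Because `event_type` and `significance` are plain strings, an LLM may return variants such as "Product Launch" or "HIGH". One way to guard against that is a small normalization step before events are stored. The following is a minimal sketch: `normalize_event` and the two constant sets are hypothetical helpers, not part of the project, and falling back to "other" and "low" for unknown labels is an assumed policy.

```python
import dataclasses

# Allowed labels, taken from the comments on the data model.
EVENT_TYPES = {"product_launch", "partnership", "funding", "key_hire", "acquisition", "other"}
SIGNIFICANCE_LEVELS = {"high", "medium", "low"}


@dataclasses.dataclass
class CompetitiveEvent:
    event_type: str
    competitor: str
    description: str
    significance: str
    related_companies: list[str]


def normalize_event(event: CompetitiveEvent) -> CompetitiveEvent:
    """Return a copy with labels lower-cased and unknown labels mapped to defaults."""
    event_type = event.event_type.strip().lower().replace(" ", "_")
    significance = event.significance.strip().lower()
    return dataclasses.replace(
        event,
        event_type=event_type if event_type in EVENT_TYPES else "other",
        significance=significance if significance in SIGNIFICANCE_LEVELS else "low",
        competitor=event.competitor.strip(),
    )


raw = CompetitiveEvent(
    event_type="Product Launch",
    competitor=" OpenAI ",
    description="Example description",
    significance="HIGH",
    related_companies=[],
)
clean = normalize_event(raw)
```

Normalizing at write time keeps filters like `event_type = 'funding'` reliable later on.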
Tavily Search Source Connector
from typing import AsyncIterator

from tavily import TavilyClient

# SourceSpec, source_connector, PartialSourceRow, and PartialSourceRowData come
# from CocoIndex's custom source API. _ArticleKey and _Article are defined
# elsewhere in the project, as is the code that sets self._spec and self._api_key.


class TavilySearchSource(SourceSpec):
    """Fetches competitive intelligence using Tavily AI Search API."""

    competitor: str
    days_back: int = 7
    max_results: int = 10


@source_connector(
    spec_cls=TavilySearchSource,
    key_type=_ArticleKey,
    value_type=_Article,
)
class TavilySearchConnector:
    async def list(self) -> AsyncIterator[PartialSourceRow[_ArticleKey, _Article]]:
        """List articles from Tavily search."""
        search_query = (
            f"{self._spec.competitor} AND "
            "(funding OR partnership OR product launch OR acquisition OR executive hire)"
        )
        client = TavilyClient(api_key=self._api_key)
        response = client.search(
            query=search_query,
            search_depth="advanced",
            max_results=self._spec.max_results,
            include_raw_content=True,
        )
        # Each result URL becomes the row key, so the same article maps to the same row.
        for ordinal, result in enumerate(response.get("results", [])):
            url = result["url"]
            yield PartialSourceRow(
                key=_ArticleKey(url=url),
                data=PartialSourceRowData(ordinal=ordinal),
            )
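The two pure steps in the connector, building the query string and turning results into URL keys, can be pulled out and tested without calling the Tavily API. This is a minimal sketch under that assumption: `build_search_query` and `article_keys` are hypothetical helper names, and skipping repeated or missing URLs is an added safeguard that the connector above does not perform.

```python
def build_search_query(competitor: str) -> str:
    """Combine the competitor name with the event topics the monitor tracks."""
    topics = ["funding", "partnership", "product launch", "acquisition", "executive hire"]
    return f"{competitor} AND ({' OR '.join(topics)})"


def article_keys(results: list[dict]) -> list[tuple[int, str]]:
    """Pair each unique result URL with its position in the response."""
    seen: set[str] = set()
    keys: list[tuple[int, str]] = []
    for position, result in enumerate(results):
        url = result.get("url")
        if not url or url in seen:
            continue  # skip entries without a URL and repeated URLs
        seen.add(url)
        keys.append((position, url))
    return keys


query = build_search_query("OpenAI")
keys = article_keys(
    [
        {"url": "https://example.com/a"},
        {"url": "https://example.com/a"},  # duplicate of the first result
        {"url": "https://example.com/b"},
        {},  # malformed result without a URL
    ]
)
```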
Main Pipeline (CocoIndex Flow)
import os
from datetime import timedelta

import cocoindex


@cocoindex.flow_def(name="CompetitiveIntelligence")
def competitive_intelligence_flow(
    flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
) -> None:
    """Main pipeline for competitive intelligence monitoring."""
    competitors = os.getenv("COMPETITORS", "OpenAI,Anthropic").split(",")
    refresh_interval = int(os.getenv("REFRESH_INTERVAL_SECONDS", "3600"))

    # Add Tavily search source for each competitor
    for competitor in competitors:
        data_scope[f"articles_{competitor.strip()}"] = flow_builder.add_source(
            TavilySearchSource(
                competitor=competitor.strip(),
                days_back=7,
                max_results=10,
            ),
            refresh_interval=timedelta(seconds=refresh_interval),
        )

    articles_index = data_scope.add_collector()
    events_index = data_scope.add_collector()

    # Process each competitor's articles
    for competitor in competitors:
        articles = data_scope[f"articles_{competitor.strip()}"]
        with articles.row() as article:
            # Extract competitive events using GPT‑4o‑mini via OpenRouter
            article["events"] = article["content"].transform(
                cocoindex.functions.ExtractByLlm(
                    llm_spec=cocoindex.LlmSpec(
                        api_type=cocoindex.LlmApiType.OPENAI,
                        model="openai/gpt-4o-mini",
                        address="https://openrouter.ai/api/v1",
                    ),
                    output_type=list[CompetitiveEvent],
                    instruction=(
                        "Extract competitive intelligence events from this article. "
                        "Focus on: product launches, partnerships, funding rounds, "
                        "key hires, acquisitions, and other strategic moves."
                    ),
                )
            )
            # Collecting rows into articles_index / events_index and exporting
            # them to PostgreSQL is not shown in this excerpt.
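The flow derives one source field per competitor directly from the `COMPETITORS` string, so a name like "Google AI" yields a field name containing a space, and a trailing comma yields an empty name. The article does not say whether such names cause problems, so the helper below is only a defensive sketch: `parse_competitors` is a hypothetical function that keeps the display name for searching and derives a separate identifier-safe field name.

```python
import re


def parse_competitors(raw: str) -> list[tuple[str, str]]:
    """Return (display_name, field_name) pairs from a comma-separated string."""
    pairs: list[tuple[str, str]] = []
    for part in raw.split(","):
        name = part.strip()
        if not name:
            continue  # skip empty entries such as a trailing comma
        # Collapse anything that is not a letter or digit into a single underscore.
        slug = re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()
        pairs.append((name, f"articles_{slug}"))
    return pairs


pairs = parse_competitors("OpenAI, Google AI,Mistral AI,")
```

The display name would still be passed to the search source unchanged; only the key used inside the flow is sanitized.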
Query Handlers
@competitive_intelligence_flow.query_handler()
def search_events(query: str) -> list[dict]:
    """
    Execute a natural‑language query against the PostgreSQL store.

    Example queries:
    - "What product launches did Anthropic announce this month?"
    - "List all funding rounds for competitors in the last week."
    """
    # Implementation omitted for brevity – uses SQL generation from the LLM.
    pass
Summary
- Tavily AI fetches the latest web articles about target competitors.
- CocoIndex orchestrates the pipeline and runs GPT‑4o‑mini extraction to produce structured CompetitiveEvent records.
- All raw articles and normalized events are persisted in PostgreSQL, enabling natural‑language queries and dashboards for continuous competitive intelligence.
def search_by_competitor(
    competitor: str,
    event_type: str | None = None,
    limit: int = 20,
) -> cocoindex.QueryOutput:
    """Find recent competitive intelligence about a specific competitor."""
    # connection_pool(), events_table, and articles_table are defined elsewhere.
    with connection_pool().connection() as conn:
        with conn.cursor() as cur:
            sql = f"""
                SELECT e.competitor,
                       e.event_type,
                       e.description,
                       e.significance,
                       e.related_companies,
                       a.title,
                       a.url,
                       a.source,
                       a.published_at
                FROM {events_table} e
                JOIN {articles_table} a ON e.article_id = a.id
                WHERE LOWER(e.competitor) LIKE LOWER(%s)
            """
            params = [f"%{competitor}%"]
            if event_type:
                sql += " AND e.event_type = %s"
                params.append(event_type)
            sql += " ORDER BY a.published_at DESC LIMIT %s"
            params.append(limit)  # bind the LIMIT placeholder
            cur.execute(sql, params)
            return cocoindex.QueryOutput(results=[...])
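When a query is assembled from optional filters, every `%s` placeholder needs a matching entry in the parameter list, including the one for `LIMIT`. Isolating that logic in a pure function makes the invariant easy to check without a database. The sketch below assumes a shortened column list and default table names (`events`, `articles`); `build_competitor_query` is a hypothetical helper, not part of the project.

```python
from typing import Optional


def build_competitor_query(
    competitor: str,
    event_type: Optional[str] = None,
    limit: int = 20,
    events_table: str = "events",      # assumed name; must come from trusted config
    articles_table: str = "articles",  # assumed name; must come from trusted config
) -> tuple[str, list]:
    """Build the SQL text and its bound parameters together."""
    sql = (
        "SELECT e.competitor, e.event_type, e.description, a.url, a.published_at "
        f"FROM {events_table} e "
        f"JOIN {articles_table} a ON e.article_id = a.id "
        "WHERE LOWER(e.competitor) LIKE LOWER(%s)"
    )
    params: list = [f"%{competitor}%"]
    if event_type:
        sql += " AND e.event_type = %s"
        params.append(event_type)
    sql += " ORDER BY a.published_at DESC LIMIT %s"
    params.append(limit)
    return sql, params


sql_all, params_all = build_competitor_query("OpenAI")
sql_funding, params_funding = build_competitor_query("Anthropic", event_type="funding", limit=5)
```

Table names are interpolated into the string because identifiers cannot be bound as parameters, which is why they should never come from user input. User-supplied values go through placeholders only.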
Configuration (environment variables)
DATABASE_URL=postgresql://user:password@localhost:5432/competitive_intel
COCOINDEX_DATABASE_URL=postgresql://user:password@localhost:5432/competitive_intel
OPENAI_API_KEY=sk-or-v1-...
TAVILY_API_KEY=tvly-...
COMPETITORS=OpenAI,Anthropic,Google AI,Meta AI,Mistral AI
REFRESH_INTERVAL_SECONDS=3600
SEARCH_DAYS_BACK=7
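A missing API key or a malformed interval is easier to diagnose at startup than in the middle of a pipeline run. The following is a minimal sketch of a fail-fast check over the variables listed above. `load_settings` is a hypothetical helper, and which variables are strictly required is an assumption here. The defaults mirror the values shown in the pipeline code and the example configuration.

```python
from typing import Mapping

# Assumed to be mandatory; adjust to match what the project actually needs.
REQUIRED_KEYS = ("COCOINDEX_DATABASE_URL", "OPENAI_API_KEY", "TAVILY_API_KEY")


def load_settings(env: Mapping[str, str]) -> dict:
    """Validate required variables and parse the numeric settings."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required environment variables: {', '.join(missing)}")
    settings = {
        "refresh_interval_seconds": int(env.get("REFRESH_INTERVAL_SECONDS", "3600")),
        "search_days_back": int(env.get("SEARCH_DAYS_BACK", "7")),
    }
    for name, value in settings.items():
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return settings


example_env = {
    "COCOINDEX_DATABASE_URL": "postgresql://user:password@localhost:5432/competitive_intel",
    "OPENAI_API_KEY": "sk-or-v1-...",
    "TAVILY_API_KEY": "tvly-...",
    "SEARCH_DAYS_BACK": "14",
}
settings = load_settings(example_env)
```

In a real entry point the function would be called as `load_settings(os.environ)`.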
First‑time Setup
python3 run_interactive.py
Automated Deployment (CocoIndex)
cocoindex update main -f # Initial sync
cocoindex update -L main.py # Continuous monitoring
How the Monitor Works
- AI‑native search – Tavily extracts article content, avoiding brittle scraping.
- De‑duplication – CocoIndex tracks processed articles via incremental processing.
- Signal extraction – Structured events are scored for significance.
- Flexible analysis – Dual indexing (raw + extracted) provides maximum flexibility.
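The de-duplication idea can be illustrated with a toy example: remember a fingerprint of each article's content per URL and skip the expensive LLM call when nothing has changed. This is only a sketch of the concept, not how CocoIndex implements incremental processing internally. `SeenArticles` is a hypothetical class that keeps its state in memory.

```python
import hashlib


class SeenArticles:
    """Remembers one content fingerprint per URL (in memory only)."""

    def __init__(self) -> None:
        self._fingerprints: dict[str, str] = {}

    def needs_processing(self, url: str, content: str) -> bool:
        """Return True for new or changed articles and record their fingerprint."""
        fingerprint = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._fingerprints.get(url) == fingerprint:
            return False  # same URL, same content: nothing to re-extract
        self._fingerprints[url] = fingerprint
        return True


seen = SeenArticles()
first = seen.needs_processing("https://example.com/a", "original text")
repeat = seen.needs_processing("https://example.com/a", "original text")
updated = seen.needs_processing("https://example.com/a", "edited text")
```

Keying on the URL while comparing content hashes means an article that is edited after publication is processed again, while an unchanged one is skipped.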
Supported Query Types
- Search by competitor name
- Filter by event type (funding, partnerships, acquisitions, etc.)
- Rank by significance (weighted scoring: high = 3, medium = 2, low = 1)
- Trend analysis across time periods
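The significance weighting can be sketched as a simple sum per competitor. The weights come from the list above. Summing them, ignoring unknown labels, and breaking ties alphabetically are assumptions of this sketch, and `rank_competitors` is a hypothetical helper operating on plain dictionaries rather than database rows.

```python
from collections import defaultdict

SIGNIFICANCE_WEIGHTS = {"high": 3, "medium": 2, "low": 1}


def rank_competitors(events: list[dict]) -> list[tuple[str, int]]:
    """Sum significance weights per competitor, highest total first."""
    scores: dict[str, int] = defaultdict(int)
    for event in events:
        # Unknown significance labels contribute nothing to the score.
        scores[event["competitor"]] += SIGNIFICANCE_WEIGHTS.get(event["significance"], 0)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical events for illustration only.
ranking = rank_competitors(
    [
        {"competitor": "Anthropic", "significance": "high"},
        {"competitor": "Anthropic", "significance": "low"},
        {"competitor": "OpenAI", "significance": "medium"},
        {"competitor": "OpenAI", "significance": "medium"},
        {"competitor": "Mistral AI", "significance": "low"},
    ]
)
```

Restricting the input to a date range before ranking turns the same function into a basic trend comparison across periods.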
Project Overview
Competitive Intelligence Monitor
Track competitor mentions across the web using AI‑powered search and LLM extraction. The pipeline automatically:
- Searches the web with Tavily AI (optimized for agents)
- Extracts competitive-intelligence events with LLM analysis (GPT-4o-mini via OpenRouter in the code shown above)
- Indexes both raw articles and extracted events in PostgreSQL
Example Queries
- “What has OpenAI been doing recently?”
- “Which competitors are making the most news?”
- “Find all partnership announcements”
- “What are the most significant competitive moves this week?”
Prerequisites
- PostgreSQL (local installation or cloud service)
- Python 3.11+ (required for CocoIndex)
- API keys (required):
  - Tavily API key (free tier: 1,000 requests/day)
  - OpenAI / OpenRouter API key (for LLM extraction)
Built With
- CocoIndex – modern data‑pipeline framework
- Tavily AI Search – AI‑native search engine
- OpenRouter – multi‑model API gateway
Contributing
Have questions or want to contribute? Drop a comment below or open an issue on GitHub!
License: MIT
Repository: