Investigating Error Logs Using LangGraph, LangChain and Watsonx.ai
Source: Dev.to
Introduction
When dealing with production systems, observability plays a key role. It is a vital component of incident investigations, the foundation for monitoring and alerting, and incredibly useful for validation of new functionality, improvements, or bug fixes being shipped. Application logs are a big part of observability.[^1] Logs can help us understand what the system was doing at any particular point in time with a high degree of granularity.
However, understanding application logs can be difficult. There can be many logs, and finding the relevant ones is challenging. Indexing the logs and using a search engine to query them helps, but it cannot tell you which logs are related to the issue you are investigating. When I see an error in the logs that correlates with the timing of an incident, I usually ask myself a bunch of follow‑up questions:
- Is the error related to or possibly the root cause of the issue?
- Is this error a known problem?
- If it’s known, has it been reported to the right team?
- If it has been reported, is it being worked on or fixed?
- If it’s fixed, is the fix rolled out to the environment I am investigating?
- If it’s rolled out, why is the error still happening? Is there a regression?
- If there’s a regression, has it been reported to the right team?
Answering these questions during an incident can consume valuable time. Yet the answers often contain crucial information—such as a workaround in a bug ticket or a hot‑fix release that can be applied immediately. This is why a deeper log investigation during incidents is valuable, and why I believe GenAI can help answer these questions quickly.
In this post we explore how to use GenAI to investigate error logs with IBM Watsonx.ai, LangGraph, and LangChain in Python. The remainder of the post is structured as follows:
- Technological foundation – introducing LangGraph, LangChain, and Watsonx.ai.
- Design and implementation of the log‑investigation agent.
- Summary of findings and outlook for future work.
LangGraph, LangChain and Watsonx.ai
LangGraph
LangGraph is a graph‑based orchestration framework for building stateful AI workflows (e.g., agents). It lets you model an AI application as a directed graph where:
- Nodes are functions (LLM calls, tool calls, or custom logic) that operate on a shared state.
- Edges define control flow, including conditional branches and loops.
- State is an explicit, shared data structure (e.g., a `dict` or `TypedDict`) that all nodes can read and update, making it easy to build long‑running, stateful agents.
LangChain
LangChain is often used as a building block within LangGraph nodes. It provides utilities for connecting LLMs to data and tools. By combining LangChain and LangGraph, you can build AI agents that reason and act in cycles, optionally adding a human‑in‑the‑loop.
Watsonx.ai
Watsonx.ai is IBM’s enterprise AI platform, offering managed LLMs among other capabilities. We will combine these three tools to build an AI agent that investigates logs. The required Python packages are `langgraph`, `langchain`, and `langchain-ibm` (which provides the Watsonx.ai integration).
Simple Example
The code below demonstrates a basic agent with access to a weather‑retrieval tool. It uses the create_agent helper (the successor to LangGraph's create_react_agent) to build a pre‑configured graph that maintains the message history in its state.
import os

from langchain.agents import create_agent
from langchain_ibm import ChatWatsonx

llm = ChatWatsonx(
    model_id="meta-llama/llama-3-70b-instruct",
    url=os.getenv("WATSONX_URL"),
    project_id=os.getenv("WATSONX_PROJECT_ID"),
)

def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"Weather in {city}: 30°C and sunny."

agent = create_agent(llm, tools=[get_weather])

response = agent.invoke({
    "messages": [
        {"role": "user", "content": "What is the weather in Berlin?"}
    ]
})
print(response)
For more complex applications you can build your own graph using the Graph API.
Log Investigation Agent
Scope
The agent will address the high‑level questions introduced earlier. For this post we focus on three core capabilities:
- Search relevant work/tickets/conversations – Parallel searches in Jira and GitHub (demonstrating LangGraph’s parallelism). The design can be extended to Slack, incident‑tracking tools, post‑mortems, or a company‑wide search engine like Glean.
- Gather operational context – Retrieve pod/container name and deployed version to assess relevance of found items.
- Investigate tickets and conversations – Look for workarounds or fixes.
The agent will be wrapped in a lightweight Dash UI.

Defining the State
In LangGraph, state is shared among all nodes and passed along edges. When multiple nodes modify the same property concurrently, a custom merge function is required. For our use case we store all intermediate results in the state so they can be displayed to the user after the graph completes, building trust in the agent’s reasoning.
from typing import Optional

from pydantic import BaseModel

class LogInvestigationState(BaseModel):
    # Raw log text as provided by the user
    log_text: Optional[str] = None
    # Additional fields will be added later (e.g., search queries, results, context)
Defining the Graph
The high‑level graph architecture consists of the following steps:
- Inspect Log & Derive Queries – The first node parses the provided log and generates search queries for each external system (Jira, GitHub).
- Parallel Search – Nodes perform API calls to the respective systems in parallel; no LLM is needed for this step.
- Spawn Ticket Nodes – Using the `Send` functionality, the graph spawns a node for each found ticket/conversation.
- Grade Ticket Relevance – An LLM grades each ticket based on title, description, the full log, and operational context to determine relevance.
- Summarize Findings – The final node aggregates the most relevant tickets, extracts workarounds, and produces a concise answer for the user.
Below is a skeleton of the graph definition (details such as API wrappers and merge logic are omitted for brevity).
import os
from typing import Dict, List

from langchain_ibm import ChatWatsonx
from langgraph.graph import StateGraph
from langgraph.types import Send

# Initialize LLM
llm = ChatWatsonx(
    model_id="meta-llama/llama-3-70b-instruct",
    url=os.getenv("WATSONX_URL"),
    project_id=os.getenv("WATSONX_PROJECT_ID"),
)

# Note: the state fields referenced below (queries, jira_results, github_results,
# graded_tickets, summary) are assumed to be added to LogInvestigationState,
# with a merge function (reducer) on graded_tickets.

def inspect_log(state: LogInvestigationState) -> Dict:
    """Generate search queries from the raw log."""
    response = llm.invoke(
        f"Extract concise, comma-separated search terms from the following log:\n{state.log_text}"
    )
    return {"queries": response.content.split(",")}

def search_jira(state: LogInvestigationState) -> Dict:
    """Search Jira using the generated queries."""
    # Placeholder for actual Jira API calls
    results: List[Dict] = []  # List of ticket dicts
    return {"jira_results": results}

def search_github(state: LogInvestigationState) -> Dict:
    """Search GitHub issues/PRs using the generated queries."""
    # Placeholder for actual GitHub API calls
    results: List[Dict] = []  # List of issue dicts
    return {"github_results": results}

def grade_ticket(payload: Dict) -> Dict:
    """Use the LLM to grade the relevance of a single ticket.

    Receives the payload of a Send object instead of the full graph state.
    """
    ticket = payload["ticket"]
    prompt = f"""
    Log: {payload['log_text']}
    Operational context: {payload.get('context', '')}
    Ticket title: {ticket['title']}
    Ticket description: {ticket['description']}
    Rate the relevance of this ticket to the log on a scale of 0-10 and provide a short justification.
    """
    ticket["rating"] = llm.invoke(prompt).content
    # The graded_tickets reducer merges the results of all spawned grading nodes
    return {"graded_tickets": [ticket]}

def summarize(state: LogInvestigationState) -> Dict:
    """Create a final summary for the user."""
    relevant = [
        t for t in state.graded_tickets
        # Assumes the LLM reply starts with the numeric score
        if int(t["rating"].split()[0]) >= 7
    ]
    response = llm.invoke(
        f"Summarize the following relevant tickets and any suggested workarounds:\n{relevant}"
    )
    return {"summary": response.content}

def spawn_jira_grading(state: LogInvestigationState) -> List[Send]:
    """Spawn one grade_ticket node per Jira result."""
    return [
        Send("grade_ticket", {"ticket": t, "log_text": state.log_text})
        for t in state.jira_results
    ]

def spawn_github_grading(state: LogInvestigationState) -> List[Send]:
    """Spawn one grade_ticket node per GitHub result."""
    return [
        Send("grade_ticket", {"ticket": t, "log_text": state.log_text})
        for t in state.github_results
    ]

# Build the graph
graph = StateGraph(LogInvestigationState)
graph.add_node("inspect_log", inspect_log)
graph.add_node("search_jira", search_jira)
graph.add_node("search_github", search_github)
graph.add_node("grade_ticket", grade_ticket)
graph.add_node("summarize", summarize)

# Define edges
graph.add_edge("inspect_log", "search_jira")
graph.add_edge("inspect_log", "search_github")
graph.add_conditional_edges("search_jira", spawn_jira_grading)
graph.add_conditional_edges("search_github", spawn_github_grading)
graph.add_edge("grade_ticket", "summarize")
graph.set_entry_point("inspect_log")

log_investigation_app = graph.compile()
The compiled graph (log_investigation_app) can be invoked with a dictionary containing the raw log text. The final summary field can be displayed in the Dash UI.
Conclusion
By combining LangGraph for orchestrating parallel searches, LangChain for LLM‑driven reasoning, and Watsonx.ai for powerful model inference, we can build an agent that quickly answers the critical questions that arise during incident investigations. The modular graph architecture makes it straightforward to extend the agent with additional data sources (e.g., Slack, internal knowledge bases) or richer operational context.
Future work may explore:
- Dynamic tool selection based on the log content.
- Feedback loops where a human can approve or reject suggested workarounds.
- Caching of search results to reduce latency for repeated investigations.
Footnotes

[^1]: Observability fundamentals – see any standard observability reference.