Querying and Plotting Data with LangChain’s Pandas Agent + OpenAI
Natural language interfaces for data analysis are moving from research into everyday engineering practice.
With LangChain and OpenAI, you can now query a DataFrame directly in English, let the model write and execute the Pandas/Matplotlib code, and even plot results, all inside Python.
In this post, I’ll show you how I built a Pandas DataFrame Agent with LangChain + OpenAI that can:
- Answer tabular questions about a dataset.
- Generate Python plotting code automatically.
- Execute that code safely to produce charts.
🔧 Setup
We need a few packages:
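The exact package list depends on your LangChain version. As an assumption (not a list taken from this post), the workflow needs roughly pandas, matplotlib, tabulate, langchain, langchain-experimental, and langchain-openai. A small check like this sketch prints a `pip install` line for whichever of them are still missing:

```python
import importlib.util

# Assumed package set (import name -> pip name); adjust to your environment.
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "tabulate": "tabulate",  # needed by DataFrame.to_markdown()
    "langchain": "langchain",
    "langchain_experimental": "langchain-experimental",
    "langchain_openai": "langchain-openai",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be imported."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("pip install " + " ".join(missing))
```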
And of course, set your OpenAI key:
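One way to do this without hard-coding the key in a script is sketched below. `OPENAI_API_KEY` is the environment variable the OpenAI client reads; the helper name `ensure_openai_key` is made up for this example.

```python
import os
from getpass import getpass


def ensure_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting once if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # Prompt interactively so the key never lands in source control.
        key = getpass(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Call `ensure_openai_key()` once before creating the model.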
📊 The Dataset
For demo purposes, let’s mock up some sales data:
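A minimal stand-in dataset could look like the sketch below. The column names (`month`, `region`, `units`, `revenue`) and every value are invented for illustration, not the post's actual data.

```python
def make_sales_rows():
    """Build a tiny, made-up sales table as a list of dicts."""
    months = ["2024-01", "2024-02", "2024-03"]
    regions = ["North", "South"]
    units = [[120, 95], [135, 110], [150, 90]]  # units sold per month, per region
    unit_price = 20.0  # hypothetical flat price

    rows = []
    for month, per_region in zip(months, units):
        for region, sold in zip(regions, per_region):
            rows.append(
                {
                    "month": month,
                    "region": region,
                    "units": sold,
                    "revenue": sold * unit_price,
                }
            )
    return rows


def make_sales_df():
    """Wrap the rows in a DataFrame for the agent to query."""
    # Local import keeps make_sales_rows() usable without pandas installed.
    import pandas as pd

    return pd.DataFrame(make_sales_rows())
```

With pandas installed, `df = make_sales_df()` gives a six-row DataFrame to hand to the agent.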
🤖 Creating the Agent
The magic comes from create_pandas_dataframe_agent.
This wraps the DataFrame in a tool the LLM can call with code execution:
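A minimal sketch of that wiring, assuming recent `langchain-experimental` and `langchain-openai` releases. The model name `gpt-4o-mini` and the helper `build_agent` are choices made for this example, and keyword arguments may differ between versions.

```python
def build_agent(df, model: str = "gpt-4o-mini"):
    """Create a DataFrame agent that can run generated Python against df."""
    # Imported lazily so this module loads even before LangChain is installed.
    from langchain_experimental.agents import create_pandas_dataframe_agent
    from langchain_openai import ChatOpenAI

    # temperature=0 keeps the generated code as deterministic as possible.
    llm = ChatOpenAI(model=model, temperature=0)

    return create_pandas_dataframe_agent(
        llm,
        df,
        agent_type="openai-tools",
        verbose=True,  # print the generated code and intermediate steps
        # Recent versions require an explicit opt-in, because the agent
        # executes model-written Python in your own process.
        allow_dangerous_code=True,
    )
```

The `allow_dangerous_code=True` opt-in is a reminder that nothing here is sandboxed: the generated code runs with the same permissions as your script.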
🔎 Asking Questions in English
Now you can query the data without touching Pandas:
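As a sketch, a thin helper around the agent's `invoke` method might look like this. The helper name `ask` and the sample question are illustrative; `agent` is assumed to be the object returned by `create_pandas_dataframe_agent`.

```python
def ask(agent, question: str) -> str:
    """Send a natural-language question to the agent and return its answer."""
    # The agent executor takes a dict with an "input" key and returns a dict
    # whose "output" key holds the final answer text.
    result = agent.invoke({"input": question})
    return result["output"]
```

For example, `ask(agent, "Which region had the highest total revenue?")` would return the model's answer as a string.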
Output:
The agent generated and executed the Pandas code under the hood.
📈 Asking for a Plot
We can go one step further and ask the model to plot monthly revenue:
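A plotting request is just another prompt. The sketch below asks for the figure to be saved to a file, which also works in scripts and headless environments. The helper name, prompt wording, and file name are assumptions for this example.

```python
from pathlib import Path


def request_plot(agent, out_path: str = "monthly_revenue.png") -> Path:
    """Ask the agent to chart revenue by month and save it to out_path."""
    prompt = (
        "Plot total revenue by month as a bar chart using matplotlib. "
        f"Label both axes, save the figure to '{out_path}', "
        "and do not call plt.show()."
    )
    # The agent writes and runs the matplotlib code itself; we only pass the prompt.
    agent.invoke({"input": prompt})
    return Path(out_path)
```

Being explicit about the chart type, labels, and output path tends to make the generated code more predictable than a bare "plot this".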
This produces:
🚀 Why This Matters
This pattern unlocks:
- Rapid ad-hoc analytics: Ask questions in English, get code + charts.
- Non-technical users can explore data without writing Pandas.
- Bridges RAG + analytics: Instead of only text retrieval, you can augment LLMs with structured data queries.
Of course, for production you’d want guardrails (e.g., AST linting before execution, schema checks, sandboxing). But as a prototyping tool, this workflow is incredibly powerful.
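To make the AST-linting idea concrete, here is a minimal sketch that uses Python's built-in `ast` module to reject generated code before it runs. The allow-list, the ban-list, and the function name are assumptions for illustration. A check like this complements a real sandbox rather than replacing one.

```python
import ast

# Illustrative policy; tune both sets for your own environment.
ALLOWED_IMPORTS = {"pandas", "numpy", "matplotlib", "math"}
BANNED_CALLS = {"eval", "exec", "open", "compile", "__import__", "input"}


def lint_generated_code(source: str) -> list[str]:
    """Return a list of policy violations found in model-generated code."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"import of '{alias.name}' is not allowed")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root not in ALLOWED_IMPORTS:
                problems.append(f"import from '{node.module}' is not allowed")
        elif isinstance(node, ast.Call):
            # Only direct calls by name (e.g. eval(...)) are caught here.
            if isinstance(node.func, ast.Name) and node.func.id in BANNED_CALLS:
                problems.append(f"call to '{node.func.id}' is not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            # Dunder attribute access is a common route around simple checks.
            problems.append(f"access to '{node.attr}' is not allowed")
    return problems
```

An empty list means the snippet passed this particular policy. Anything else should block execution and be logged for review.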
Source code can be found here: JordiCorbilla/langgraph-cookbook