Building a MiFID II Compliance Chatbot with CrewAI

Posted by Jordi Corbilla October 11, 2025

Building a MiFID II Compliance Chatbot with CrewAI

Financial regulation is notoriously complex. Investment firms must interpret and comply with lengthy documents such as the Markets in Financial Instruments Directive (MiFID II) and its companion regulation (MiFIR), while preparing for MiFID III. Compliance officers routinely pore over hundreds of pages of policy statements, ESMA Q&As and Regulatory Technical Standards (RTS). At the same time, regulators are tightening expectations: by September 2025 firms will need to implement the MiFID III amendments and be ready to demonstrate that surveillance and inducement controls work across every channel (luware.comluware.com). Manual search is inefficient and error‑prone.

This article shows how to build a MiFID II compliance chatbot using CrewAI, a multi‑agent framework. You will learn how to organise the workflow, assign tools at the agent or task level, avoid common validation errors, and anticipate upcoming regulatory changes. The techniques apply beyond finance; they illustrate a pattern for building reliable agentic systems that combine retrieval with language models.

Framework
CrewAI orchestrates multiple AI agents to search MiFID-related PDFs and draft compliance answers

Approach
Compare agent‑centric vs task‑centric tool assignment for reliability and predictability

Key Fix
Teach the language model to include the mandatory {"query": "<text>"} argument when calling the PDF search tool

Regulatory Horizon

MiFID III amendments enter force March 28 2024 with a main implementation deadline of September 29 2025 (luware.comluware.com)

New Features

CrewAI’s MCP standardises tool access and data sources (blog.crewai.com) while guardrails and event buses increase reliability (blog.crewai.comblog.crewai.com)

Agent‑centric vs Task‑centric Tool Assignment

CrewAI allows tools—such as a PDF searcher or a web search API—to be provided either to the agent as a global toolbox or to individual tasks. These two patterns affect reliability and maintainability.

Agent‑centric (Generalist)

In this common setup the agent receives a list of tools that it may call at any time. The agent decides which tool to use based on the prompt and its internal reasoning. This is flexible and quick to prototype, but it can lead to mis‑calls. For example, the PDF search tool expects a JSON argument with a query string. Without explicit instructions, the LLM may omit the argument, leading to the familiar Pydantic error:


Error: Arguments validation failed: `query` field required

When the agent owns the entire toolbox, every task must encode tool‑calling rules in its prompt to avoid such errors. In our experience this reduces predictability as the system grows; even seasoned engineers forget to repeat the tool signature in each description.

Task‑centric (Specialist)

The task‑centric approach grants tools only to the tasks that need them. When the agent begins a task it is temporarily given the required tools and cannot access others. This design leads to more predictable tool calls because the agent knows which tools are available for that job and the prompt can focus on how to use them.

In regulated domains this scoping is critical. It reduces the chance that a general‑purpose LLM will call a web search when only document evidence is allowed, and it makes unit testing straightforward: each task can be executed in isolation with its own tools and guardrails (blog.crewai.com).

Pros and Cons

Agent‑centric:
- Pros: Simple configuration; fewer moving parts; agent decides which tool to use.
- Cons: LLM may mis‑call tools without guidance; prompts must repeat tool signatures; difficult to test tasks in isolation.
Task‑centric:
- Pros: Predictable and secure; tools scoped to tasks; easier to unit‑test and update; better adherence to compliance restrictions.
- Cons: Slightly more verbose setup; requires careful mapping of tasks to tools.

Code Walkthrough

Let’s examine the core elements of the chatbot. The code below instantiates an LLM, sets up PDF search tools for each regulatory document, defines an agent with a system prompt and tool‑calling guide, and creates two tasks orchestrated sequentially by a crew. Code can be found here: (langgraph-cookbook/09 - Mifid CrewAI Chatbot at main · JordiCorbilla/langgraph-cookbook)


from crewai import Agent, Task, Crew, Process
from crewai import LLM
from crewai_tools import PDFSearchTool, SerperDevTool

llm = LLM(model="openai/gpt-4o-mini")

web_search_tool = SerperDevTool()

pdf_search_tool = PDFSearchTool(
    pdf="https://www.fca.org.uk/publication/policy/ps17-14.pdf",
    config={ 'embedder': { 'provider': 'huggingface', 'config': { 'model': 'sentence-transformers/all-MiniLM-L6-v2' } } }
)
pdf_search_tool2 = PDFSearchTool(
    pdf="https://www.fca.org.uk/publication/consultation/cp16-19.pdf",
    config={ ... }
)
pdf_search_tool3 = PDFSearchTool(
    pdf="https://www.esma.europa.eu/sites/default/files/library/2016-1452_guidelines_mifid_ii_transaction_reporting.pdf",
    config={ ... }
)

COMPLIANCE_SYSTEM_PROMPT = """
You are a MiFID II / MiFIR compliance analyst. Answer ONLY with information grounded in the provided PDFs.  Always include article‑level citations.  Prefer the latest ESMA RTS for conflicts.  No speculation.
"""

CITATION_FORMAT_PROMPT = """
Format citations as markdown footnotes: [MiFID II Art X(Y)](SOURCE_URL#page=?), [MiFIR Art X(Y)](SOURCE_URL#page=?).
"""

TOOL_CALLING_GUIDE = """
When using "Search a PDF's content", you MUST pass a JSON argument like {"query": "<short search string>"}.  Never omit 'query'.
"""

agent_centric_agent = Agent(
    role="MiFID Compliance Analyst",
    goal="Answer MiFID II / MiFIR questions with exact citations.",
    backstory="Seasoned regulatory analyst trained on MiFID II, MiFIR and FCA/ESMA RTS/Q&A.",
    tools=[pdf_search_tool, pdf_search_tool2, pdf_search_tool3, web_search_tool],
    llm=llm,
    system_prompt=COMPLIANCE_SYSTEM_PROMPT + CITATION_FORMAT_PROMPT + TOOL_CALLING_GUIDE
)

# Define tasks and crew
agent_centric_task = Task(
    description=(
        "Answer '{customer_query}'. First, search the PDFs using 'Search a PDF's content' with {\"query\": \"<your search string>\"}. Extract relevant passages and synthesise the answer."),
    tools=[pdf_search_tool, pdf_search_tool2, pdf_search_tool3],
    agent=agent_centric_agent
)

response_drafting_task = Task(
    description="Use the gathered information to draft a comprehensive response to '{customer_query}'.",
    agent=agent_centric_agent,
    context=[agent_centric_task]
)

task_centric_crew = Crew(
    agents=[agent_centric_agent],
    tasks=[agent_centric_task, response_drafting_task],
    process=Process.sequential
)

The system prompt instructs the agent to answer using only information from the PDFs, include citations, and avoid speculation. We append a tool‑calling guide that explicitly defines the JSON argument required by the PDFSearchTool. Without this guide the agent might call the tool incorrectly, causing the validation error described earlier.

The first task instructs the agent to search the PDFs for relevant text using a JSON query. The second task uses the extracted passages to draft the final answer. By assigning the PDF search tools only to the search task, we prevent accidental web queries during drafting.

Preparing for MiFID III

Our chatbot currently answers questions under MiFID II and MiFIR. However, regulatory evolution continues. MiFID III amendments were published in February 2024, entered into force in March 2024 and must be implemented by September 29 2025 (luware.com). The third iteration does not overhaul the framework but raises the bar on surveillance and inducements: compliance teams must prepare for heightened expectations around off‑channel communications and record‑keeping (luware.com). MiFID III builds on MiFID II by widening the focus to behavioural risks such as conflicts of interest and pricing opacity (luware.com), and regulators expect active monitoring rather than passive archiving (luware.com).

To stay ahead:

Integrate multi‑channel monitoring: capture voice, chat, video and messaging across platforms; apply AI to detect misconduct patterns (luware.com).
Extend the knowledge base: ingest ESMA Q&As, RTS documents and MiFID III draft texts; update embeddings regularly.
Use CrewAI’s guardrails: implement output‑length checks or hallucination detection to ensure answers remain grounded (blog.crewai.com).
Leverage event buses for audit logs and compliance evidence: CrewAI’s event bus emits events such as task started, tool use error, and LLM call completed (blog.crewai.com).
Adopt query rewriting and agentic RAG to improve retrieval. CrewAI now supports query rewriting to focus on relevant keywords and integrates with vector databases like Qdrant and Weaviate (blog.crewai.com).
Plan for MiFID III deadlines: ensure your system can adapt to new PFOF bans, stricter inducement rules, and expanded communication capture (luware.com).

Beyond Compliance – General Lessons

While this capstone project addresses a specific regulatory domain, the design patterns are broadly applicable:

Be explicit about tool signatures: LLMs cannot infer parameter names; include examples and constraints in prompts. For production systems, consider using structured tool calling (JSON Schema) or wrappers that handle input validation.
Scope tools to tasks: Minimises misuse and aids debugging. For high‑impact domains (finance, healthcare) this can be critical for legal and safety reasons.
Instrument your agents: Use CrewAI’s event bus to collect telemetry on tool use and LLM calls. This data helps debug issues and demonstrates compliance (blog.crewai.com).
Stay up to date: Regulations change. Regularly ingest new documents and search the web to supplement your knowledge base. MiFID III emphasises behaviour monitoring, so your system must adapt to detect tone and sentiment across channels (luware.com).
Embrace guardrails and RAG: Guardrails catch excessive or irrelevant outputs; agentic RAG (retrieval augmented generation) tailors the knowledge retrieval to each query, increasing precision and reducing hallucinations (blog.crewai.com).

Conclusion

By combining CrewAI’s multi‑agent orchestration with dedicated tools for PDF and web search, we built a robust MiFID II compliance chatbot. Assigning tools at the task level rather than the agent level yields more predictable and testable workflows. The critical missing‑query error was resolved by teaching the LLM to call tools correctly. Looking ahead, MiFID III will demand even greater surveillance and behavioural oversight; our modular architecture can adapt by adding new documents, query‑rewriting agents, guardrails and event logging.

Agentic AI is evolving rapidly. CrewAI’s adoption of the MCP, guardrails and event bus underscores this progress (blog.crewai.comblog.crewai.comblog.crewai.com). With careful design, developers can harness these advances to build trustworthy, compliance‑ready chatbots that empower experts rather than replace them. Start small, iterate quickly, and always validate your tools.

Search This Blog

Random thoughts on coding and technology

Building a MiFID II Compliance Chatbot with CrewAI

Building a MiFID II Compliance Chatbot with CrewAI

Agent‑centric vs Task‑centric Tool Assignment

Agent‑centric (Generalist)

Task‑centric (Specialist)

Pros and Cons

Code Walkthrough

Preparing for MiFID III

Beyond Compliance – General Lessons

Conclusion

Comments

Post a Comment

Popular Posts

Train Your Own LoRA with ComfyUI: A Step-by-Step Guide

"Cannot load SSL Library" using Delphi XE7