Introduction
At Waterplan, we often face questions that require weaving together structured data, domain-specific calculations, and natural language interaction. Questions vary depending on the use case, but these are some examples:
What drives the high flood risk for this site?
What sites are at the highest risk of running out of water?
Which projects have a decreasing estimated Volumetric Water Benefit that would require new projects to achieve this year’s target?
A simple chatbot layer was not enough. We needed an assistant capable of understanding context, accessing specialized tools, and enforcing domain-specific constraints around risk scoring and water stewardship.
In this post, we’ll walk through how we built our multi-agent assistant architecture. We’ll cover the design decisions, the role of each agent, how they interact through an orchestrator, and some of the challenges we faced along the way.
The Problem We Wanted to Solve
To effectively assess and manage water risk, we need to integrate a variety of data sources. At Waterplan we gather information at different granularity levels: global datasets, local data, regulations, and many other sources. These are a few examples of information at different levels:
Local Data
Data sourced from local information such as regional reports, news, or measurements, offering a focused view of hazards in a particular area.

Global Data
Data sourced from comprehensive global datasets that provide a worldwide perspective on potential hazards.

Careful analysis of this diverse information, utilizing established methodologies and industry standards, is crucial for informed decision-making. The primary objective is not merely data collection, but a comprehensive understanding of multifaceted risks, facilitating improved management and sustainable water stewardship.
We initially attempted a monolithic approach, using one "agent" with a vast set of instructions to manage all of this information. However, the diverse nature of the information and rules quickly overwhelmed this single agent, leading to confusion and a decline in result quality that worsened as we added more data and rules. The inherent complexity of the problem made it clear that a monolithic design would be ineffective.
Our solution involved developing a system with multiple specialized "agents" working collaboratively under a main "orchestrator." This design effectively breaks down complex problems into smaller, more manageable tasks, with each agent focusing on a specific aspect of the analysis. This significantly improves system performance and accuracy.
Below is a simple diagram showing the high-level architecture of the agent. The different pieces will be explained in more detail later:

This collaborative approach ensures that our answers are grounded in real data and established rules, reducing speculation and incorrect information. Each specialized agent contributes unique skills, such as validating risk figures, interpreting regulations, and identifying site-specific details, while the orchestrator unifies their efforts to deliver comprehensive and coherent responses. This solution successfully addressed our operational needs.
Request Handling Pipeline
When a request arrives, we first authenticate and verify that the caller is allowed to use the system.
We then parse the input, which consists of a set of messages and a context object.
The context object consists of information provided by the frontend that helps the agent tailor its answers to what the user is doing in the platform. Below is a simple example of what the context object looks like. Depending on what the user is doing in the platform, the object may contain more or less information, but it is always information that helps us better understand how to answer the user's question.

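(The object below is an illustrative sketch; the field names are assumptions about the schema, not the exact shape the frontend sends.)

```ts
// Hypothetical shape of the UI context object — field names are illustrative.
const uiContext = {
  view: "risk-by-site",
  filters: {
    siteId: "site456",
    riskType: "flood",
    year: 2025,
  },
};
```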
In this case we can derive that the user is looking at the Risk By Site view with the following filters:
Site: the site with id site456
Risk: flood risk for the year 2025
Once the input is normalized, the request is delegated to the OrchestratorAgent, which receives the role, message history, and UI context.
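As a rough sketch of this pipeline (the helper names and request shape below are assumptions made for the example, not our actual code), the handler does roughly the following:

```ts
type Message = { role: "user" | "assistant"; content: string };

// Hypothetical helpers — declared only for this sketch.
declare function authenticate(req: Request): Promise<{ role: string }>;
declare function runOrchestrator(
  role: string,
  messages: Message[],
  context: unknown,
): Promise<string>;

async function handleAssistantRequest(req: Request): Promise<Response> {
  // 1. Authenticate and verify the caller is allowed to use the system.
  const caller = await authenticate(req);

  // 2. Parse the input: a set of messages plus the UI context object.
  const { messages, context } = (await req.json()) as {
    messages: Message[];
    context: unknown;
  };

  // 3. Delegate to the OrchestratorAgent with role, message history, and UI context.
  const answer = await runOrchestrator(caller.role, messages, context);

  return Response.json({ answer });
}
```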
Agent Hierarchy
At the heart of our design is an agent hierarchy, with each agent having a specific role. The OrchestratorAgent is the top-level coordinator. It does not answer questions directly but instead decides which specialist agent should handle the request. It always begins by invoking the UserContextAgent to enrich the available context.
The UserContextAgent transforms raw platform context such as company, site, or current view into a structured block of information. Its system prompt explicitly forbids it from answering questions; its sole purpose is to prepare consistent and reliable context for downstream agents. As part of its job, the UserContextAgent fetches additional information about the company and the site the user is working with, which lets the other agents process requests faster.
The RiskFrameworkAgent is responsible for queries related to the Waterplan Risk Framework. It has tools that let it query the database for the same information the user can access through the platform. This allows us to enforce permissions at the user level, so the agent only accesses what the user has been given permission to see, and it keeps the agent consistent with the platform. Using this information, it can answer questions about hazards, vulnerabilities, categories, and risk scores, and it uses tools that fetch risk scorings, indicators, and evidence to ground its answers in data.
The WaterStewardshipAgent specializes in water stewardship topics. It follows the same pattern as the RiskFrameworkAgent in terms of tools and permission checks, has access to stewardship-specific datasets, and returns evidence-based answers.
By constraining the scope of each agent, we maintain clarity of responsibilities and limit the risk of hallucinations.
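To make the hierarchy concrete, here is a simplified sketch of how an orchestrator could delegate to its specialists. The types, class shape, and keyword-based routing are illustrative assumptions; in practice the routing decision is made by the model through tool calls:

```ts
type Message = { role: "user" | "assistant"; content: string };
type EnrichedContext = { company: string; site: string; view: string };

interface DomainAgent {
  run(messages: Message[], context: EnrichedContext): Promise<string>;
}

// Illustrative orchestrator — not the actual Waterplan implementation.
class OrchestratorAgent {
  constructor(
    private enrichContext: (raw: unknown) => Promise<EnrichedContext>, // UserContextAgent
    private specialists: Record<"riskFramework" | "waterStewardship", DomainAgent>,
  ) {}

  async run(messages: Message[], rawContext: unknown): Promise<string> {
    // Always start by enriching the raw platform context.
    const context = await this.enrichContext(rawContext);

    // Pick the specialist whose domain matches the question
    // (simplified keyword check; the real routing is done by the LLM).
    const lastMessage = messages.at(-1)?.content ?? "";
    const domain = /hazard|flood|risk score/i.test(lastMessage)
      ? "riskFramework"
      : "waterStewardship";

    return this.specialists[domain].run(messages, context);
  }
}
```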
Tools and Data Access
Domain agents are exposed to the Orchestrator as tools. Each tool enforces input schemas with Zod to ensure that the Orchestrator provides the correct parameters. When invoked, the tool builds a system message that includes the current timestamp, the conversation summary, the enriched user context, and domain-specific instructions. For example, the RiskFrameworkAgent’s system prompt contains a strict specification of framework categories and indicators and instructions to only use its tools as sources of truth. These measures ensure consistency and reliability across the system.
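As an illustration of this pattern, a domain agent exposed as a tool might look like the sketch below. The tool shape, the schema fields, and the runRiskFrameworkAgent helper are assumptions made for the example, not our exact definitions:

```ts
import { z } from "zod";

// Hypothetical handoff to the specialist agent — declared only for this sketch.
declare function runRiskFrameworkAgent(systemMessage: string, args: unknown): Promise<string>;

// Zod input schema: the Orchestrator must supply valid parameters to invoke the tool.
const riskFrameworkInput = z.object({
  siteId: z.string().describe("The site the question is about"),
  question: z.string().describe("The user's question, as routed by the Orchestrator"),
});

export const riskFrameworkTool = {
  name: "risk-framework-agent",
  description: "Answers questions about hazards, vulnerabilities, categories, and risk scores.",
  parameters: riskFrameworkInput,
  execute: async (
    args: z.infer<typeof riskFrameworkInput>,
    ctx: { conversationSummary: string; enrichedUserContext: string },
  ) => {
    // Every invocation builds a fresh system message with the timestamp,
    // conversation summary, enriched user context, and domain instructions.
    const systemMessage = [
      `Current time: ${new Date().toISOString()}`,
      `Conversation summary: ${ctx.conversationSummary}`,
      `User context:\n${ctx.enrichedUserContext}`,
      "Only use your tools as sources of truth for risk scorings, indicators, and evidence.",
    ].join("\n\n");

    return runRiskFrameworkAgent(systemMessage, args);
  },
};
```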
Example: RiskFrameworkAgent in Action
When a query such as “What are the biggest water-related risks at the São Paulo site?” comes in, the flow looks like this. First, the UserContextAgent enriches the context into a structured block such as:

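(The block below is illustrative only; the company and site details are made up for the example.)

```
Company: the user's company
Site: São Paulo site (id: site456), São Paulo, Brazil
Current view: Risk By Site, filtered to flood risk for 2025
```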
The Orchestrator then recognizes the query as risk-related and calls the RiskFrameworkAgent tool. The RiskFrameworkAgent in turn invokes functions like get-site-risk-scorings and get-site-global-indicators. The large language model composes a grounded answer along the lines of: “Flood risk is high, primarily due to local precipitation variability. Vulnerability indicators suggest…” That answer is then passed back to the Orchestrator and returned through the API.
Challenges We Faced
One of the first challenges we encountered was prompt structure and consistency. Early prompts varied in format, which made improving them difficult and often led to inconsistent behavior. To solve this, we defined a structured prompt template for all of our agents. This template brought clarity, enforced consistent behavior across the system, and simplified future improvements. A simplified skeleton of the prompt looks like this:

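(Shown here as an illustrative sketch; the section names are examples, not our exact template.)

```
# Role
You are the <AgentName>. You are responsible for <domain> and nothing else.

# Context
<current timestamp, conversation summary, enriched user context>

# Rules
- Only use your tools as sources of truth; never speculate.
- Do not answer questions outside your domain; defer to the Orchestrator.

# Tools
<each tool, what it returns, and when to use it>

# Output
<expected structure and tone of the answer>
```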
Another significant challenge was the presence of redundant tools across multiple agents. Initially, several tools were repeated in different agents, which meant that agents invoked later in a request often called the same tools again unnecessarily. This redundancy created overhead and increased processing time. By carefully defining which tools belong to which agent, we reduced duplication, ensured that each agent focused only on its specific responsibilities, and achieved faster and more efficient execution.
What We Learned
Through this process we learned that context enrichment is critical. Even a small investment in a dedicated context agent dramatically improved the quality of responses. We also confirmed that structured schemas act as guardrails, making the system more predictable and reducing the likelihood of errors. Finally, by modularizing agents according to their domains, we enabled independent iteration and improvement, which paid off significantly in development speed and system stability.
What’s Next
Looking forward, we are planning to enable streaming answers to improve user experience, as the scaffolding for token streaming is already in place. We are also exploring the addition of new domain agents, for example in regulatory compliance, which would further expand the assistant’s scope. Another important area of improvement will be creating a feedback loop that allows answers to be rated and corrected, providing valuable input for prompt refinement and system tuning.
Conclusion
By building a multi-agent architecture, we created an assistant that goes beyond generic chat. It understands context, respects domain constraints, and delivers accurate, grounded answers for Waterplan. This approach keeps the system modular, reliable, and extensible, and it is already proving invaluable in how we interact with water risk and stewardship data.