This article demonstrates how multi-agent architectures solve the limitations of single-agent AI in complex enterprise environments. By decomposing workflows into specialized agents, engineers can achieve higher accuracy, better context management, and faster execution for data-heavy tasks.
In our Engineering Energizers Q&A series, we highlight the engineering minds driving innovation across Salesforce. Today, we spotlight Neha Awasthi, Senior Manager of Software Engineering at Informatica and the engineering leader behind CLAIRE, a multi-agent AI system embedded across the Intelligent Data Management Cloud (IDMC) that executes enterprise data workflows at a 90% task success rate.
Explore how Neha’s team transformed enterprise data workflows once requiring up to three months of manual effort spanning discovery, governance, data quality, and pipeline orchestration within IDMC — and how they moved beyond the constraints of single-agent AI systems to coordinate multi-agent execution across workflows involving 50–60 model calls per request.
Our mission is to build a multi-agent AI system that transforms enterprise data management across the IDMC — making complex workflows intuitive and automated for users beyond technical specialists. Historically, workflows spanning cataloging, governance, data quality, integration, and master data remained fragmented, forcing teams to navigate multiple tools and apply deep expertise just to complete a single use case.
We addressed this by building CLAIRE as a multi-agent AI layer embedded directly across IDMC rather than as a standalone system. It orchestrates capabilities across discovery, classification, data quality, and pipeline execution through a unified interface — requiring agents designed to interpret user intent, operate across heterogeneous systems, and execute workflows that span multiple domains.
To meet users where they work, we introduced two interaction models: CLAIRE GPT as a generalist interface for cross-domain requests, and specialized copilots embedded within individual products. Users can explore workflows conversationally or execute tasks in context, ensuring AI assistance is available regardless of where they start.
The CLAIRE multi-agent architecture predates “agents” as an industry term — driven by the fundamental complexity of enterprise data workflows that demanded autonomous, coordinated execution rather than simple copilot-style assistance.

Stack View: Agentic foundation, intelligence and experiences.
Enterprise data workflows break down at scale because discovery, governance, data quality, and pipeline orchestration operate across disconnected systems with no unified execution layer. Each stage depends on different tools, interfaces, and expertise — creating a fragile chain across teams with no single point of coordination.
A unified system proved difficult to build because workflows are interdependent but distributed. A business user identifying an issue could not act on it directly. That insight had to move through translation, rule creation, validation, and pipeline integration across multiple systems — each handoff introducing latency and context loss, slowing execution and compounding error risk.
We addressed this with a multi-agent system that collapses these stages into a continuous workflow. Users initiate requests in natural language, and the system orchestrates discovery, profiling, rule generation, and execution end to end — eliminating manual handoffs and reducing workflows that previously took up to three months down to days.
Single-agent AI systems fail in enterprise workflows because they cannot simultaneously handle context-heavy reasoning, diverse toolchains, and multi-step execution. Each step — search, profiling, statistical analysis, rule generation, and code execution — demands different inputs, tools, and reasoning patterns that a single agent cannot reliably manage in one pass.
In early iterations, we exposed multiple tools within a single agent. These designs broke down quickly: the agent selected incorrect tools, exceeded context limits, and produced inconsistent outputs across steps. Latency compounded as the agent attempted to reason across too many operations simultaneously.
These failures made specialization the necessary path forward. By decomposing the system into multiple agents with focused responsibilities, we reduced context load, improved tool selection accuracy, and stabilized execution across multi-step workflows.
Designing the multi-agent system required solving orchestration, planning, and agent specialization as core architectural challenges. We introduced an orchestration agent as the control plane, responsible for intent detection, plan generation, and routing execution across specialized agents.
Each specialized agent operates within a constrained context and optimized toolset. The data quality agent, for example, encompasses distinct skills — profiling, rule recommendation, rule generation, and cleansing — each with dedicated tools and model configurations tuned for accuracy and efficiency.
Balancing flexibility with control was a central challenge. Because enterprise workflows vary widely, we introduced a planning layer in which the orchestration agent generates a high-level plan that users can review and modify before execution begins. This separation allows the system to adapt to diverse requests while maintaining predictable, auditable behavior.
We also implemented deterministic tool routing logic to ensure agents invoke the correct tools based on intent and context — without it, outputs became inconsistent. Together, orchestration, planning, and specialization form the foundation that enables reliable execution at scale.
Coordinating multiple agents introduces distributed system challenges across execution reliability, dependency management, and failure propagation. With workflows involving 50–60 model calls — where outputs from one step feed directly into the next — the system is highly sensitive to errors at any intermediate stage.
Failure propagation became a critical issue early on. A single incorrect output, such as malformed metadata or an invalid rule, cascaded across agents and caused full workflow failure. These failures compounded the problem by surfacing several steps later, making them difficult to trace back to their origin.
We addressed this by introducing validation checkpoints and strict data contracts between agents. Each step validates inputs and outputs before passing them forward, preventing invalid states from propagating downstream. Guardrails detect anomalies early in execution, containing failures before they can spread.
Dynamic user modifications introduced additional complexity. Because users can adjust execution plans in real time, the system must recompute dependencies on the fly without disrupting the workflow in progress. Adaptive planners handle these changes while preserving execution integrity.
While the conversational interface represents the most visible interaction model, the architecture also supports background and headless agent execution. Agents can operate autonomously on scheduled or event-driven triggers — running data quality assessments, executing rule validation, or monitoring pipeline health without requiring a user to be actively present in the conversation. This ensures the system delivers value not only through interactive workflows but also through continuous, autonomous data quality operations at scale.
Context modeling becomes critical when a multi-agent system must interpret open-ended enterprise data requests across heterogeneous datasets. Because users express intent in business language rather than system instructions, that intent must be translated into executable workflows before any agent can act on it.
The challenge is both dynamic and constrained. The system must identify the relevant context for each step without exceeding model limits — too much context degrades performance, while too little produces incorrect results.
We addressed this by building a semantic layer that performs entity resolution, intent decomposition, and metadata enrichment before execution begins. This ensures each agent receives precisely the context it needs — no more, no less.
Without this layer, the system would fall back on pattern matching and produce generic or incorrect results. With it, the system delivers accurate, context-aware execution across enterprise data environments.

Semantic knowledge preparation.

Inference time consumption.
Preparing the system for production required validating accuracy, grounding, and reliability across the full execution pipeline — and we recognized early that every agent deserves its own specialized validation framework, not a one-size-fits-all approach.
Unlike traditional AI evaluation built on precision, recall, and F1 scores, we introduced agent-specific metrics tailored to what each agent actually does. For the cleansing agent, we validate code execution correctness, fix completeness, and before/after data integrity. For rule generation, we measure deduplication accuracy, rule logic validity, and test data coverage. For recommendation, we evaluate contextual relevance, grounding against actual catalog metadata, and acceptance rate.
This shift from generic model metrics to agent-specific outcome metrics represents a fundamental change in how enterprise AI systems should be evaluated — measuring not just whether the model produced a correct token, but whether the agent completed the right action end to end. The data quality agent achieved a 90% task success rate, 98% grounding accuracy, and a 1% hallucination rate, establishing a measurable foundation for outputs that are both complete and trustworthy.
We implemented a multi-layered evaluation framework encompassing LLM-as-a-judge scoring and reflection-based validation for generated code. Code is reviewed for correctness prior to execution, preventing failures from propagating downstream before they can be caught.
Reliability demanded consistency across every stage — intent detection, planning, execution, and summarization. We introduced validation layers, guardrails, and horizontally scalable infrastructure to sustain performance under production load.
Together, these improvements ensure the system operates reliably at enterprise scale and delivers consistent results in real-world environments.
The post How Informatica Built a Multi-Agent AI System to Reduce Data Workflows from Months to Days appeared first on Salesforce Engineering Blog.
Continue reading on the original blog to support the author
Read full articleScaling distributed systems to 120 trillion rows requires moving beyond query federation. Adopting a file-based approach with Apache Iceberg eliminates bottlenecks between compute and storage, enabling high-performance AI at petabyte scale without data duplication.
As AI agents move to complex multi-system workflows, siloed security fails. This platform-centric approach ensures consistent identity, data, and API governance, preventing unauthorized access and ensuring auditability across distributed enterprise environments.
As AI agents become more autonomous, traditional governance fails. This integration provides engineers with deterministic lineage and tracing, allowing them to audit AI decisions, ensure data quality, and mitigate risks like hallucinations in complex, dynamic execution environments.
Scaling security operations manually is impossible in complex cloud environments. SATA demonstrates how AI agents can automate high-volume triage with 95% accuracy, allowing engineers to focus on critical threats while maintaining trust through confidence scoring and orchestration.