Why Unified, Current and Trusted Business Context is the Missing Layer in the AI Stack
Jakki Geiger, CMO at Arango in conversation with
Ravi Marwaha, Chief Product & Technology Officer at Arango
TL;DR
Vector databases are excellent at retrieval, but enterprise AI fails in production when business context is fragmented across systems. When meaning, relationships, and time are managed separately, AI systems are forced to reconstruct context at query time. This approach works in pilots, but it breaks at scale. Reliable, explainable, and safe AI requires a unified, current, and trusted business context that all models, agents, and co-pilots can operate on consistently. Retrieval surfaces relevant information. Understanding and action require shared business context.
Jakki: Ravi, for those who haven’t met you, can you please introduce yourself?
Ravi: Thank you, Jakki. Before joining Arango to lead product and technology, I was Chief Product Officer for the Commercial Banking digital platform at JPMorgan Chase. Working at that scale made one thing very clear to me: AI is transformational for enterprises, but only when the data foundation is right. Most failures don’t come from models—they come from fragmented data and a lack of unified, current and trusted business context.
Prior to JPMorgan, I led Global Delivery Services at Uptake and earlier ran global services and solutions at GE Digital. I also spent years in product roles at SAP and Informatica. Across all of these roles, what has consistently driven me is building innovative technology that delivers real, measurable business outcomes. That’s exactly why Arango resonates with me, and why I’m so excited to be here.
Enterprise AI Failures Aren’t About the Model
Jakki: We hear a lot that “enterprise AI failures aren’t about the model.” From your perspective, what do they usually fail on?
Ravi: Most enterprise systems were built in silos—customer information here, documents there, tickets and logs somewhere else and analytics off to the side. When you drop an LLM on top of that and ask it to securely connect to enterprise data systems, the model has no consistent understanding of meaning, relationships, time, or trust. Without that foundation, it can’t reason reliably or earn user confidence.
When business context—meaning, relationships, time, and trust—is fragmented across enterprise data systems, AI outputs may be searchable, but they become difficult to reason over, explain, audit, or act on consistently.
When enterprise data is fragmented, business context fragments with it—and that’s what ultimately holds AI back from real business impact. The result is AI that can’t be trusted, scaled, or afforded.
By unified, current and trusted business context, we mean knowing:
- what information is important to whom,
- right now,
- under which constraints, and
- based on what evidence.
This isn’t a philosophical definition. It’s the minimum context enterprise AI systems need to operate reliably in production.
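To make that definition concrete, here is a minimal sketch in plain Python of what a single unit of business context would need to carry before an AI system can act on it. The field names and example values are hypothetical illustrations, not an Arango schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ContextRecord:
    """One unit of business context an AI system can act on.
    Hypothetical structure for illustration, not an Arango schema."""
    entity_id: str               # what the information is about
    audience: list[str]          # who it matters to (roles, teams, agents)
    as_of: datetime              # when it was true ("right now")
    constraints: dict[str, str]  # the policies and permissions that apply
    evidence: list[str]          # the source records backing the claim

record = ContextRecord(
    entity_id="customer:4711",
    audience=["support-agent", "renewal-copilot"],
    as_of=datetime.now(timezone.utc),
    constraints={"region": "EU", "data_policy": "GDPR-restricted"},
    evidence=["crm:account/4711", "ticket:INC-2093"],
)
print(record.entity_id, record.as_of)
```

Each field maps to one bullet above: entity and audience cover what matters to whom, as_of covers right now, and constraints and evidence cover the rest.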
Most enterprise data isn’t AI-ready, lacking the trusted business context required for production-ready, decision-ready, action-ready and audit-ready AI.
What’s missing is a contextual data layer.
A contextual data layer sits between enterprise data systems and AI models, providing a unified, current, and trusted view of business context. It doesn’t replace vectors, graphs, or systems of record—it unifies them, so AI systems don’t have to reconstruct meaning, relationships, and state at runtime.
What Vector Databases Do Well & When They Are Enough
Jakki: Vector databases are often positioned as the solution to that problem. What do they actually do really well?
Ravi: Vector databases excel at similarity search over relatively homogeneous data, where meaning can be effectively captured in fixed-dimensional embeddings.
They’re very good at:
- Finding “things like this” or nearest neighbor
- Retrieving relevant unstructured text
- Powering fast semantic search for Retrieval-Augmented Generation (RAG)
- Supporting low-latency, high-throughput AI applications
If you’re working with a manageable number of documents and you want to retrieve passages that are conceptually related to a query, vector databases are a great tool.
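To make “finding things like this” concrete, here is a minimal nearest-neighbor sketch over toy embeddings using cosine similarity. It uses NumPy only; the random vectors and document names are stand-ins, and real vector databases add approximate indexing on top of this same core operation:

```python
import numpy as np

def top_k(query: np.ndarray, embeddings: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k embeddings most similar to the query,
    ranked by cosine similarity (the core operation of a vector database)."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q                       # cosine similarity against every row
    return list(np.argsort(-scores)[:k])

docs = ["vacation policy", "onboarding guide", "API reference", "expense rules"]
embeddings = np.random.default_rng(0).normal(size=(4, 8))  # stand-in embeddings
print([docs[i] for i in top_k(embeddings[0] + 0.1, embeddings, k=2)])
```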
Retrieval helps you find information. Reasoning is what enables AI to make decisions you can trust.
But vector databases also have inherent limits. As data volumes grow, some relevant relationships become impossible to encode perfectly, and increasing training data or model size alone doesn’t remove that constraint. In practice, fixed-dimensional embeddings hit a representational ceiling: accuracy eventually plateaus even as more vectors are added, so embedding-based retrieval doesn’t scale indefinitely.
Jakki: Is there a case where a vector database alone is enough in enterprise AI?
Ravi: Yes. Vector databases work extremely well for internal knowledge search—use cases where employees need to find and read information, not reason over it or act on it.
Think internal chatbots for HR policies, onboarding guides, or technical documentation. The data is largely static, the interaction is read-only, and the goal is semantic relevance.
In those scenarios, similarity is enough. A vector database can quickly retrieve the right passages and support helpful, low-risk answers at scale.
But as data volumes and use cases grow, vector retrieval hits a natural ceiling: fixed-dimensional embeddings can’t reliably encode every distinct relevance decision an enterprise needs, so accuracy eventually plateaus even if you add more data.
The limitation shows up when AI systems are expected to do more than answer questions and need to scale: when they need to understand the current state, reason across relationships and systems, explain why an answer is correct, or take action. That’s the moment enterprises cross the line where vector search alone is no longer sufficient.
Where Vector Databases Fall Short
Jakki: So where do they start to fall short for enterprise AI?
Ravi: They fall short the moment enterprise AI needs to do more than retrieve information—when it needs explicit understanding, not just similarity.
Vector databases are great at finding related content, especially in unstructured text. But companies and government agencies don’t work with just text—they work with a mix of structured, semi-structured, and multimodal data like records, logs, code, and documents.
Vector databases find what’s similar. Enterprise AI needs to understand what’s connected, what’s allowed, and what’s true.
More importantly, vectors don’t actually know what things are. They don’t understand that:
- a customer in CRM is the same entity referenced in a support ticket
- a customer incident is related to specific services, regions, and SLAs
- a policy applies differently depending on time, jurisdiction, or product
Vectors store proximity, not meaning. That’s fine for search and retrieval use cases.
But vectors break down when AI needs to reason, explain decisions, or take action.
Similarity isn’t the same thing as meaning.
As vector systems grow, more things start to look “similar,” and important distinctions get blurred. One relevant-looking chunk can drown out constraints, relationships, or rules.
In enterprise terms, this shows up as retrieval dominance: one entity, one document, or one recent message overwhelms everything else—so the AI misses nuance, ignores policies, or connects the wrong facts to the wrong entities.
The result isn’t just wrong answers—it’s plausible answers that sound right but aren’t.
As more data is added, the system increasingly prefers what looks relevant over what is correct, governed, or in scope. That’s where trust breaks down.
This is why vector-only approaches often look good in pilots but struggle in production—because scale amplifies ambiguity, not clarity.
Vector search can tell you what’s nearby, but not what’s true, what’s allowed, what’s related to what, or how context fits together when situations get complex.
A multimodel approach fixes this by combining:
- Vectors for fast semantic recall (“find potentially relevant stuff”),
- Documents/records for the full, auditable source of truth (the actual content and metadata),
- Graphs to preserve structure (entities, relationships, “who/what/why,” and how pieces connect).
With explicit structure in place, similarity is grounded in context—so meanings stay distinct instead of collapsing as data and use cases grow.
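Here is a minimal sketch of that combination in plain Python with toy in-memory stores. The data is hypothetical and a real multimodel database would do this in one query, but the mechanics are the same: vector recall proposes candidates, the graph keeps only candidates connected to the entity in question, and the document store supplies the authoritative record.

```python
import numpy as np

# Toy in-memory stand-ins for the three models (hypothetical data).
documents = {  # document store: the full, auditable records
    "doc:runbook-7": {"text": "Restart payment gateway", "owner": "sre-team"},
    "doc:memo-3":    {"text": "Payment gateway brand memo", "owner": "marketing"},
}
edges = {  # graph: explicit relationships between entities
    ("incident:INC-2093", "affects"): ["service:payments"],
    ("doc:runbook-7", "covers"):      ["service:payments"],
    ("doc:memo-3", "covers"):         ["service:branding"],
}
vectors = {  # vector store: embeddings for fast semantic recall
    "doc:runbook-7": np.array([0.9, 0.1]),
    "doc:memo-3":    np.array([0.95, 0.25]),
}

def grounded_retrieval(query_vec, incident):
    """Vector recall proposes candidates; the graph keeps only those
    connected to the affected service; the document store resolves them."""
    affected = set(edges[(incident, "affects")])
    ranked = sorted(vectors, key=lambda d: -float(vectors[d] @ query_vec))
    for doc_id in ranked:  # most similar first...
        if affected & set(edges[(doc_id, "covers")]):  # ...but must be connected
            return documents[doc_id]
    return None

print(grounded_retrieval(np.array([1.0, 0.2]), "incident:INC-2093"))
```

Note that the memo scores highest on similarity alone, but the graph shows it isn’t connected to the affected service, so the runbook wins: similarity grounded in context.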
Why Context Reconstruction Fails at Scale
Jakki: Can’t teams just add metadata or filters to solve that?
Ravi: First, let me define context reconstruction: it’s what you have to do when context isn’t built into the data layer, so you end up rebuilding meaning at query time instead of operating on trusted, structured context.
Metadata tries to help: it’s additional information attached to data to preserve context when the raw content alone isn’t enough.
Think of metadata as labels or tags that try to answer:
- What is this?
- Where did it come from?
- Who is it about?
- When was it created?
Those labels are useful, but they don’t solve the core issue.
Metadata is:
- Flat
- Loosely enforced
- Hard to evolve
- Easy to misapply
As soon as you want to ask questions that involve relationships—like, “Which customers were affected by this incident?”—you’re forced to reimplement logic in application code. That logic becomes brittle, inconsistent, and impossible to reuse across co-pilots and agents.
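Here is a sketch of what that reimplementation typically looks like, with hypothetical data in plain Python: the multi-hop question “which customers were affected by this incident?” becomes hand-written joins over flat metadata, and every co-pilot that needs the answer duplicates this logic.

```python
# Flat metadata, as a vector-plus-tags setup would store it (hypothetical data).
tickets = [
    {"id": "T-1", "customer": "acme", "service": "payments"},
    {"id": "T-2", "customer": "globex", "service": "search"},
]
incidents = [{"id": "INC-2093", "services": ["payments"]}]

def customers_affected_by(incident_id: str) -> set[str]:
    """'Which customers were affected by this incident?' rebuilt in app code.
    Every hop (incident -> service -> ticket -> customer) is a manual join
    that each agent or co-pilot has to reimplement and keep in sync."""
    services = next(i["services"] for i in incidents if i["id"] == incident_id)
    return {t["customer"] for t in tickets if t["service"] in services}

print(customers_affected_by("INC-2093"))  # {'acme'}
```

In a graph model this is a single traversal over explicit edges; in a metadata-only model it’s application logic that quietly breaks the first time a schema or tag changes.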
In practice, this forces teams to reconstruct business context at runtime, inside pipelines and application logic, instead of managing it once as shared infrastructure. Reconstructing business context on the fly works for pilots—but it breaks down once AI becomes mission-critical and data changes continuously across systems.
Over time, every schema change, new agent, or application update turns into an integration project—driving up cost, slowing delivery, and increasing operational risk.
The result isn’t just technical fragility—it’s inconsistent decisions, slower response times, and higher operational risk as AI expands.
Enterprise AI Production Reality
Vector databases are powerful—but retrieval alone isn’t understanding.
As AI systems move beyond search into reasoning and action, fragmented data forces every model and agent to rebuild context on the fly. That’s when accuracy drops, explainability fades, and trust erodes.
Vector databases find what’s similar. Enterprise AI needs to understand what’s connected, what’s allowed, and what’s true.
— Ravi Marwaha, CPTO, Arango
AI architectures are evolving from retrieval to reasoning to agentic decision-making, where vector-only and search-centric systems fall short.
Jakki: Time is another big requirement. How well do vector databases handle it?
Ravi: Poorly, by design.
Most vector systems treat time as:
- A timestamp
- A filter
- Or a recency boost
They don’t model:
- What was true at a specific point in time
- How entities changed over time
- Which version of information was valid when a decision was made
That’s a serious problem for companies and government agencies operating under audit, compliance, and regulatory requirements. They don’t just care about answers—they care about when those answers were true.
When AI can’t reason over what was true when, teams lose confidence in decisions and hesitate to let systems act autonomously.
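To make the contrast concrete, here is a minimal “as of” lookup over versioned facts. It’s a sketch with hypothetical data rather than any particular database’s temporal feature: each fact carries a validity interval, so the system can answer what was true when a decision was made, not just what is true now.

```python
from datetime import date

# Versioned facts with validity intervals (hypothetical example).
sla_history = [
    {"customer": "acme", "sla": "gold",
     "valid_from": date(2023, 1, 1), "valid_to": date(2024, 6, 30)},
    {"customer": "acme", "sla": "silver",
     "valid_from": date(2024, 7, 1), "valid_to": date(9999, 12, 31)},
]

def sla_as_of(customer: str, when: date) -> str | None:
    """Return the SLA that was valid for a customer on a given date."""
    for fact in sla_history:
        if (fact["customer"] == customer
                and fact["valid_from"] <= when <= fact["valid_to"]):
            return fact["sla"]
    return None

print(sla_as_of("acme", date(2024, 5, 1)))  # 'gold'   (what was true then)
print(sla_as_of("acme", date(2025, 1, 1)))  # 'silver' (what is true now)
```

A timestamp filter or recency boost can answer the second query, but not the first; it only knows what’s newest.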
Jakki: What about trust and provenance? That’s becoming more important as AI shifts from just providing answers to taking action.
Ravi: Exactly—and this is another major gap.
Vector databases don’t natively capture:
- Where information came from
- Whether it’s authoritative
- How it was transformed
- Or why one source should be trusted over another
They also don’t natively encode permissions or access policies, which makes “who can see what” brittle as data, roles, and agents multiply. When embeddings from mixed-trust sources are blended together, the AI has no way to explain why it chose a particular answer. That’s a deal-breaker for adoption in companies and government agencies.
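Here is a minimal sketch of what natively capturing provenance would mean in practice. The fields and sources are hypothetical: every candidate answer carries its source, an authority level, and lineage, so the system can prefer governed sources over merely similar ones and record why.

```python
# Candidate answers with provenance attached (hypothetical fields and sources).
candidates = [
    {"answer": "Refunds allowed within 30 days", "source": "wiki:old-draft",
     "authority": 1, "derived_from": []},
    {"answer": "Refunds allowed within 14 days", "source": "policy:refunds-v7",
     "authority": 3, "derived_from": ["legal:review-2024"]},
]

def best_answer(candidates: list[dict]) -> dict:
    """Prefer the most authoritative source and keep the trail that
    explains why it was chosen over the alternatives."""
    chosen = max(candidates, key=lambda c: c["authority"])
    chosen["explanation"] = (
        f"Chosen from {chosen['source']} (authority {chosen['authority']}), "
        f"derived from {chosen['derived_from'] or 'original source'}"
    )
    return chosen

result = best_answer(candidates)
print(result["answer"])       # the governed policy wins over the stale wiki
print(result["explanation"])  # and the choice is auditable
```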
Why Multimodal Data Increases the Context Challenge
Enterprise AI must reason across structured, unstructured, and multimodal data—something traditional, siloed architectures can’t support.
Jakki: Many teams are also working with multimodal data now—text, images, logs and code. Does that help?
Ravi: Not unless those signals are unified.
Vector databases can store embeddings across all of those modalities, but they can’t connect them into a single, contextual object.
They don’t inherently know that:
- This log caused that incident
- This code change explains that outage
- This screenshot supports that policy
Without shared context, multimodal data is just more vectors—not more understanding.
As companies and government agencies add more data types, the problem actually gets worse: business context has to be reconstructed across even more systems, signals, and pipelines.
Vector databases are optimized for unstructured data and similarity search. They’re powerful in that lane, but incomplete by design when it comes to modeling structured relationships and operational context. That’s where graph-based systems become essential.
Without unifying multimodal signals into shared business context, AI systems struggle to explain decisions, reason over what actually happened, or take action confidently as data changes.
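Concretely, connecting modalities into a single contextual object means typed edges between artifacts, not just co-located embeddings. A minimal sketch with hypothetical data:

```python
# Multimodal artifacts of different types (hypothetical example).
artifacts = {
    "log:4221":      {"type": "log",    "text": "OOM kill on payments-pod-3"},
    "commit:ab12":   {"type": "code",   "text": "Raise memory limit to 2Gi"},
    "incident:2093": {"type": "record", "text": "Payments outage, 14:02 UTC"},
}
# Typed edges carry the semantics that embeddings alone can't express.
links = [
    ("log:4221", "caused", "incident:2093"),
    ("commit:ab12", "resolves", "incident:2093"),
]

def explain(incident_id: str) -> list[str]:
    """Walk typed edges to explain an incident across modalities."""
    return [
        f"{artifacts[src]['type']} {src} {rel} {incident_id}: {artifacts[src]['text']}"
        for src, rel, dst in links if dst == incident_id
    ]

for line in explain("incident:2093"):
    print(line)
```

Without those edges, the log, the commit, and the incident record are just three nearby vectors; with them, the system can say which log caused which incident and which change resolved it.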
The Core Misconception about Vector Databases
Jakki: So what’s the core misconception teams have about vector databases?
Ravi: They assume similarity equals business context—and retrieval equals reasoning.
Vector databases are powerful retrieval engines, but they’re not designed to manage business context or connect enterprise data systems to LLMs in a way that supports reasoning and action.
That’s why AI in large, complex organizations needs a data platform that acts as a bridge between enterprise data systems and AI models, unifying structured, unstructured, and multimodal data into a shared business context that co-pilots and agents can reason over and act on reliably.
A contextual data layer connects fragmented enterprise data to LLMs, providing the shared business context required for agents, co-pilots, and chatbots.
Jakki: Enterprise agents and co-pilots are supposed to do things—resolve tickets, assess risk, recommend actions. What kind of data do they actually need to operate reliably?
Ravi: They need all of it, together.
Agents don’t succeed on unstructured text alone. They need:
- Structured data to understand state, constraints, and business rules
- Unstructured data to capture human knowledge and nuance
- Multimodal signals (logs, code, text, media) to explain what actually happened
AI Applications, Agents, Co-Pilots, and Chatbots
- AI Applications (Embedded GenAI): AI applications embed intelligence directly into business software to automate decisions and improve outcomes.
- AI Agents (AI Workflow Automation): AI agents are autonomous systems that can plan, reason, and take actions across tools and data.
- Co-Pilots (AI Assistants / Productivity AI): AI co-pilots assist humans inside workflows by providing real-time recommendations, summaries, and next steps.
- AI Chatbots (Conversational AI): AI chatbots answer questions conversationally by retrieving knowledge from enterprise content and systems.
Because agents act, not just retrieve, they have to reason over what is true right now, under specific constraints, and based on trusted information.
And most importantly, all of that data needs to be connected to the same underlying entities. A co-pilot can’t resolve an incident if the runbook, logs, impacted customers, SLA, and current system state all live in different systems with no shared context.
When AI agents have to reconstruct this context across systems at runtime, actions become inconsistent, confidence drops, and teams hesitate to let AI operate autonomously in production.
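Put together, resolving an incident means assembling every connected signal around one entity before acting. Here is a minimal sketch of that assembly step with hypothetical data; in a contextual data layer this is one lookup rather than five separate system calls:

```python
# Everything an agent needs, keyed to the same entity (hypothetical data).
context_store = {
    "incident:2093": {
        "state":     {"status": "open", "severity": "P1"},      # structured
        "runbook":   "Restart payment gateway, verify queues",  # unstructured
        "logs":      ["OOM kill on payments-pod-3"],            # multimodal
        "customers": ["acme", "initech"],                       # relationships
        "sla":       {"acme": "gold"},                          # constraints
    }
}

def act_on(incident_id: str) -> str:
    """An agent can only act safely once all signals are connected."""
    ctx = context_store[incident_id]
    if ctx["state"]["severity"] == "P1" and "gold" in ctx["sla"].values():
        return (f"Escalate: follow runbook '{ctx['runbook']}' "
                f"and notify {ctx['customers']}")
    return "Queue for standard triage"

print(act_on("incident:2093"))
```

If any one of those signals lives in a system the agent can’t reach, or is attached to a different copy of the entity, the decision logic above becomes guesswork.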
What Enterprise AI Needs
Jakki: What does enterprise AI actually need instead?
Ravi: It needs a contextual data layer—a purpose-built layer that unifies structured, unstructured, and multimodal data into shared business context.
This layer gives AI systems a consistent view of meaning, relationships, time, and trust, so co-pilots and agents can reason, explain decisions, and take action safely in production.
Vector search still plays an important role—but as one signal among many, not the foundation.
We’re seeing a new layer emerge in enterprise AI architectures. It’s often described as a knowledge layer, because modeling relationships is a critical step.
But in production, enterprise AI systems need more than just relationships alone. They need business context—relationships combined with time, provenance, operational state and multimodal signals—so answers are consistent, decisions are explainable and actions are safe.
An AI-ready data architecture unifies graph, vector, document, search, and key-value data to deliver trusted context for RAG, agents, and AI applications.
Bottom Line:
Vector databases can help you get started—but they won’t get you to production on their own.
Without a coherent data architecture, every new AI use case adds complexity instead of compounding value.
What’s Next?
The Definitive Guide to Agentic AI-Ready Data Architecture
Vector search can get AI started—but it won’t get it to production. This guide helps leaders understand the architectural choices required to move beyond retrieval and build AI systems that scale with trust.
Forrester on Multimodel Data Platforms
See how analysts are thinking about the limits of vector-only architectures—and why enterprises are turning to multimodel approaches as AI moves into production.
The Missing Layer in Enterprise AI
Learn why unified, current, and trusted business context is the missing layer in the AI stack—and what a contextual data layer actually is.