Contextus: The AI that finds what you didn’t know to look for

Your business data already contains the answers. The problem is that no search, dashboard, or AI tool was built to find them — because they’re hidden in the connections between things, not in the things themselves.

Customer relationships. Supply chains. Biological pathways. Financial networks. Code repositories. All of these are fundamentally interconnected systems — and yet most AI tools treat this data as flat text, discarding the structural signal that makes graphs uniquely powerful.

Contextus is different. It is a graph-guided AI discovery agent that finds the connections your data already contains — but that no query, dashboard, or conventional AI system would ever surface.

Two drug compounds linked via a shared protein pathway. A chip design component with near-identical architecture to one in a separate repository. A financial instrument with unexpected exposure to a correlated asset cluster. These discoveries exist in your data. Contextus finds them.

The gap in your AI stack

RAG systems retrieve. LLMs summarize. Search finds what you name. All of them answer questions you already know how to ask.

None of them discover.

Discovery means finding entities that are structurally similar to your query node but sit in completely different neighborhoods — nodes that share indirect pathways, common patterns, and latent relationships that no explicit edge connects. This is what graph neural networks are built for. And it is what Contextus delivers.

Search

Finds what you name

Known entities and indexed content
Direct keyword and semantic matches
Blind to graph structure
Can’t find what you don’t name

“What is X?”

RAG

Retrieves similar text

Semantically similar documents
Good for unstructured text
Treats data as flat text
No structural graph reasoning

“Tell me about X.”

Contextus Graph-guided AI

Discovers what you missed

Structurally similar nodes across disconnected neighbourhoods
GNN link prediction on any graph
Plain-language explanation of every connection
Works on any domain, any Arango graph

“What didn’t I know to ask?”

How it works

Contextus operates in four stages, taking you from a single node-key query to a plain-language discovery narrative.

01 Ingest & Embed

Your graph is loaded into Arango’s graph-native multi-model data platform, and a Graph Neural Network (GNN) is trained directly on its structure. A GNN learns the structural patterns in your data: not just what each entity is, but how it connects to everything around it. Unlike graph-only databases, Arango stores each node’s attributes, its graph position, and its GNN embedding together in a single document, with no separate vector database, no ETL, and no sync overhead. Training is automatic and takes minutes.
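As a rough illustration of what "attributes and embedding together" could look like, here is a sketch of one node document and one edge document written as Python dictionaries. The collection name (`components`), the attribute names (`node_type`, `name`, `embedding`, `relation`), and the toy four-dimensional vector are all assumptions made for this example; only `_key`, `_from`, and `_to` are standard Arango document fields.

```python
# Hypothetical node document: attributes and GNN embedding side by side.
node_doc = {
    "_key": "mmu_001",                        # made-up key
    "node_type": "module",                    # assumed attribute name
    "name": "data MMU",                       # assumed attribute name
    "embedding": [0.12, -0.48, 0.33, 0.91],   # toy 4-d vector; real ones are larger
}

# Hypothetical edge document: graph position is expressed through _from/_to.
edge_doc = {
    "_from": "components/mmu_001",
    "_to": "components/tlb_007",
    "relation": "depends_on",                 # assumed attribute name
}


def node_id(collection: str, doc: dict) -> str:
    """Build the Arango-style document handle '<collection>/<_key>'."""
    return f"{collection}/{doc['_key']}"


print(node_id("components", node_doc))
```

Because the embedding is just another field on the node, a single read returns both the attributes and the vector for that node.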

02 BFS Traversal

Maps every connection from your starting point. From your query node, an Arango Query Language (AQL) breadth-first search (BFS) maps all structural connections up to N hops away. This builds a local subgraph that captures the full structural neighbourhood.
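The traversal stage can be sketched as follows. The AQL string shows one way such a bounded BFS might be phrased; the bind variable names are placeholders, the traversal options should be checked against your Arango version, and this is not Contextus’s actual query. The Python function is an in-memory stand-in for the same idea, run on a toy adjacency dictionary.

```python
from collections import deque

# One possible AQL phrasing of a bounded BFS (placeholder bind variables).
AQL_BFS = """
FOR v, e, p IN 1..@max_hops ANY @start GRAPH @graph_name
  OPTIONS { order: "bfs", uniqueVertices: "global" }
  RETURN { vertex: v._id, hops: LENGTH(p.edges) }
"""


def bfs_neighbourhood(adjacency, start, max_hops):
    """Return {node: hop_distance} for every node within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Toy undirected graph stored as an adjacency dictionary.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}

print(bfs_neighbourhood(toy_graph, "a", 2))
```

With a limit of two hops, node `e` (three hops from `a`) stays outside the extracted subgraph; raising the limit widens the neighbourhood at the cost of more candidates to score.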

03 Link Prediction

Finds structurally similar entities across the graph. Cosine similarity is computed between your query node’s GNN embedding and the embedding of every node in the traversal. Nodes that are structurally similar, even if topologically distant, score highly. The top-K are surfaced as latent discoveries.
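A minimal sketch of the scoring step, assuming embeddings are plain lists of floats keyed by node. The function names and the optional `exclude` argument (for example, to skip nodes already directly connected to the query node) are assumptions for illustration, not part of Contextus.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_u * norm_v)


def top_k_candidates(query_key, embeddings, k, exclude=()):
    """Rank all other nodes by similarity to the query node; keep the best k."""
    query_vec = embeddings[query_key]
    skip = set(exclude) | {query_key}
    scored = [
        (key, cosine_similarity(query_vec, vec))
        for key, vec in embeddings.items()
        if key not in skip
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-d embeddings; real GNN embeddings have many more dimensions.
toy_embeddings = {
    "q": [1.0, 0.0],
    "a": [0.9, 0.1],
    "b": [0.0, 1.0],
    "c": [-1.0, 0.0],
}

print(top_k_candidates("q", toy_embeddings, k=2))
```

Cosine similarity ignores vector magnitude, so two nodes score highly when their embeddings point in the same direction, regardless of how far apart they sit in the graph.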

04 LLM Reasoning

Explains what was found, in your domain’s language. A LangGraph ReAct agent reasons over the scored subgraph. For each discovery, it explains the connection path, the predicted link strength, and what the structural similarity means, in plain language specific to your domain.
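To make the hand-off to the language model concrete, here is a sketch of how scored discoveries might be serialized into a prompt. It does not use LangGraph and is not the agent itself; the `Discovery` class, the function names, and the prompt wording are hypothetical, and the node keys are toy values.

```python
from dataclasses import dataclass


@dataclass
class Discovery:
    source: str        # query node key
    target: str        # discovered node key
    path: list[str]    # node keys from source to target, inclusive
    score: float       # predicted link strength


def describe(d: Discovery) -> str:
    """One-line summary of a discovery: endpoints, score, and path."""
    hops = len(d.path) - 1
    route = " -> ".join(d.path)
    return (
        f"{d.source} -> {d.target}: predicted link strength "
        f"{d.score:.4f}, {hops}-hop path via {route}"
    )


def build_prompt(domain: str, discoveries: list[Discovery]) -> str:
    """Assemble the context an LLM would be asked to explain."""
    lines = [
        f"You are an analyst for a {domain} graph.",
        "Explain each candidate connection in plain language:",
    ]
    lines += [f"- {describe(d)}" for d in discoveries]
    return "\n".join(lines)


example = Discovery("n1", "n4", ["n1", "n2", "n3", "n4"], 0.5)
print(build_prompt("chip design", [example]))
```

Passing the path and the score explicitly gives the model the evidence to cite when it explains a connection.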

Domain-aware, automatically

After ingestion, Contextus samples your graph and calls the LLM of your choice to infer the domain, terminology, and what kinds of discoveries matter. A biomedical graph gets biomedical discovery framing. An HR graph gets org-structure framing. No manual configuration or fine-tuning required.
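The sampling step might look something like the sketch below, which draws a few nodes per node type and turns them into a domain-inference prompt. The field names (`node_type`, `name`), the per-type sample size, and the prompt text are assumptions for illustration; how Contextus actually samples is not described here.

```python
import random
from collections import defaultdict


def sample_by_type(nodes, per_type, seed=0):
    """Pick up to per_type nodes of each type, reproducibly."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for node in nodes:
        by_type[node["node_type"]].append(node)
    sample = {}
    for node_type in sorted(by_type):
        members = by_type[node_type]
        sample[node_type] = rng.sample(members, min(per_type, len(members)))
    return sample


def domain_prompt(sample):
    """Turn the sample into a prompt asking the LLM to infer the domain."""
    parts = ["Infer the domain and key terminology of this graph from sample nodes:"]
    for node_type, members in sample.items():
        names = ", ".join(m["name"] for m in members)
        parts.append(f"{node_type}: {names}")
    return "\n".join(parts)


# Toy node list standing in for a real graph.
toy_nodes = [
    {"node_type": "drug", "name": "drug_a"},
    {"node_type": "drug", "name": "drug_b"},
    {"node_type": "drug", "name": "drug_c"},
    {"node_type": "protein", "name": "protein_x"},
]

print(domain_prompt(sample_by_type(toy_nodes, per_type=2)))
```

Sampling per type rather than uniformly keeps rare node types visible to the model, which matters on heterogeneous graphs where one type can dominate.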

Any graph. Any domain.

Contextus is graph-agnostic by design. The same engine, the same pipeline, and the same agent loop work across any domain. Because it runs natively on Arango’s multi-model data platform, you don’t need to bolt on a vector database or build an ETL layer to make it work. Graph traversal, node attributes, and GNN embeddings all live together in one place.

Biomedical knowledge graphs — drug-protein-disease interaction networks, pathway analysis, side-effect clustering
Financial networks — instrument correlations, counterparty exposure, contagion pathways
HR and organisational graphs — talent clusters, reporting hierarchies, knowledge flow
Software and code repositories — component dependencies, co-authorship, architectural similarity
Supply chain and logistics — supplier dependencies, risk propagation, network resilience
Any Arango graph you already have — native ingestion, no ETL required

A graph-only database like Neo4j would require a separate vector store for embeddings, a sync layer to keep them consistent, and two query languages on every request, adding infrastructure, latency, and complexity at every step. Arango handles graph traversal, node attributes, and embeddings natively in one system.

Contextus is built for multiple use cases. Every time you ingest a new graph, Contextus automatically samples it, infers the domain, and regenerates its configuration from scratch. Switch from chip design to drug discovery to HR analytics to financial risk: the agent adapts its language, framing, and discovery logic each time, with no manual reconfiguration required.

Proven on real graphs

Temporal chip design graph: 27,583 nodes, 40,955 edges, 24 node types.

On a temporal chip design graph spanning four open-source RISC-V processor repositories, Contextus identified structurally similar memory management units across architecturally distinct processor families — connections invisible to direct AQL queries and undetectable by text similarity alone.

Example discovery

In chip design, finding that two memory management units share underlying architectural patterns — even when they live in completely separate repositories — can save weeks of redundant development. This is the kind of discovery Contextus makes routine.
For example, Contextus surfaced a latent connection between OR1200_e9c0251d27c13c65 (an OpenRISC specification chunk) and MOR1KX_g_f158f9ef98d3 (a DMMU implementation in a separate processor repository) with a predicted link strength of 0.5965 — via a three-hop path through shared architectural patterns. Neither a keyword search nor a direct graph query would find this.

Built on Arango

Contextus is built natively on Arango’s graph-native multi-modal and multi-model data platform — the only database that handles document, graph, and vector storage in a single system, without a vector database sidecar or a separate graph layer.

GNN embeddings stored alongside node attributes in the same Arango collection
AQL traversal for expressive, graph-native BFS subgraph extraction
Heterogeneous graph support out of the box — multiple node and edge types, one named graph
Direct ingestion from any existing Arango instance — no ETL pipeline needed

Get started

Contextus runs as a self-contained FastAPI application with a built-in web UI. It ingests from any Arango graph, CSV dataset, or Open Graph Benchmark dataset, and requires an LLM API key (OpenAI, Anthropic, and others are supported).

Contact the Arango team to arrange a live demonstration on your own graph data.
