Your business data already contains the answers. The problem is that no search, dashboard, or AI tool was built to find them — because they’re hidden in the connections between things, not in the things themselves.
Customer relationships. Supply chains. Biological pathways. Financial networks. Code repositories. All of these are fundamentally interconnected systems — and yet most AI tools treat this data as flat text, discarding the structural signal that makes graphs uniquely powerful.
Contextus is different. It is a graph-guided AI discovery agent that finds the connections your data already contains — but that no query, dashboard, or conventional AI system would ever surface.
Two drug compounds linked via a shared protein pathway. A chip design component with near-identical architecture to one in a separate repository. A financial instrument with unexpected exposure to a correlated asset cluster. These discoveries exist in your data. Contextus finds them.
The gap in your AI stack
RAG systems retrieve. LLMs summarize. Search finds what you name. All of them answer questions you already know how to ask.
None of them discover.
Discovery means finding entities that are structurally similar to your query node but sit in completely different neighborhoods — nodes that share indirect pathways, common patterns, and latent relationships that no explicit edge connects. This is what graph neural networks are built for. And it is what Contextus delivers.
- Search: finds what you name. “What is X?”
- RAG: retrieves similar text. “Tell me about X.”
- Contextus (graph-guided AI): discovers what you missed. “What didn’t I know to ask?”
How it works
Contextus works in four stages, taking you from a single query node key to a plain-language discovery narrative.
01
Ingest & Embed
Your graph is loaded into Arango’s graph-native multi-model data platform, and a Graph Neural Network (GNN) is trained directly on its structure. A GNN learns the structural patterns in your data: not just what each entity is, but how it connects to everything around it. Unlike graph-only databases, Arango stores each node’s attributes, its graph position, and its GNN embedding together in a single document, with no separate vector database, no ETL, and no sync overhead. Training is automatic and takes minutes.
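To make the single-document idea concrete, here is a minimal sketch of what one node record could look like when attributes and embedding sit side by side. The field names `node_type` and `embedding`, the key, and all sample values are assumptions chosen for illustration; they are not Contextus's actual schema.

```python
import json

def make_node_document(key, node_type, attributes, embedding):
    """Bundle a node's attributes and its GNN embedding into one document."""
    return {
        "_key": key,                                  # ArangoDB's document key attribute
        "node_type": node_type,                       # hypothetical field name
        **attributes,                                 # domain attributes at the top level
        "embedding": [float(x) for x in embedding],   # hypothetical field name
    }

doc = make_node_document(
    key="module_42",                                  # made-up key
    node_type="Module",
    attributes={"name": "example_mmu", "repo": "example_repo"},
    embedding=[0.12, -0.40, 0.88],                    # toy 3-dimensional vector
)
print(json.dumps(doc, indent=2))
```

Because the embedding is just another field on the document, a single read returns both the attributes and the vector.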
02
BFS Traversal
Maps every connection from your starting point. From your query node, an Arango Query Language (AQL) breadth-first search maps all structural connections up to N hops. This builds a local subgraph that captures the full structural neighbourhood.
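The traversal step can be sketched in a few lines. The example below is an illustration, not Contextus's own code: the AQL string shows the general shape such a query might take (bind parameter names are assumptions, and the `order: "bfs"` option follows ArangoDB 3.8 and later), while the pure-Python function runs the same N-hop breadth-first search over a toy adjacency list so the logic can be checked without a database.

```python
from collections import deque

# Rough shape of an N-hop BFS traversal in AQL. Bind parameter names
# (@max_hops, @start, @graph) are illustrative assumptions.
BFS_AQL = """
FOR v, e, p IN 1..@max_hops ANY @start GRAPH @graph
  OPTIONS { order: "bfs", uniqueVertices: "global" }
  RETURN { node: v._id, hops: LENGTH(p.edges) }
"""

def bfs_subgraph(adjacency, start, max_hops):
    """Return {node: hop_distance} for every node within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:  # first visit in BFS is the shortest path
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops

# Toy undirected graph with hypothetical node names.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}
print(bfs_subgraph(toy_graph, "a", 2))
```

With a limit of two hops, node `e` (three hops from `a`) stays outside the local subgraph; raising the limit to three brings it in.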
03
Link Prediction
Finds structurally similar entities across the graph. Contextus computes the cosine similarity between your query node’s GNN embedding and the embedding of every node in the traversal. Nodes that are structurally similar, even if topologically distant, score highly. The top-K are surfaced as latent discoveries.
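The scoring step comes down to cosine similarity followed by a top-K cut. The sketch below shows that calculation on toy two-dimensional vectors. The function names and the optional `exclude` argument (for skipping nodes you already know about, such as direct neighbours) are assumptions for illustration rather than part of the product's API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)

def top_k_similar(query_key, embeddings, k, exclude=()):
    """Rank all other nodes by similarity to the query node and keep the top k."""
    query_vec = embeddings[query_key]
    skipped = set(exclude) | {query_key}
    scored = [
        (key, cosine_similarity(query_vec, vec))
        for key, vec in embeddings.items()
        if key not in skipped
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy embeddings keyed by hypothetical node names.
toy_embeddings = {
    "q": [1.0, 0.0],
    "n1": [2.0, 0.0],   # same direction as q
    "n2": [0.0, 1.0],   # orthogonal to q
    "n3": [1.0, 1.0],   # 45 degrees from q
}
discoveries = top_k_similar("q", toy_embeddings, k=2)
print(discoveries)
```

Cosine similarity ignores vector length, which is why `n1` scores as a perfect match for `q` even though it is twice as long.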
04
LLM Reasoning
Explains what was found, in your domain’s language. A LangGraph ReAct agent reasons over the scored subgraph. For each discovery, it explains the connection path, the predicted link strength, and what the structural similarity means, in plain language specific to your domain.
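Before an agent can explain a discovery, the path and score have to be handed to it in some readable form. The sketch below shows one plausible way to render that context as text. It deliberately leaves out the LangGraph agent itself; the function name, the layout of the output, and the node names are all hypothetical.

```python
def format_discovery(query_node, target_node, path, score):
    """Render one scored discovery as plain text for an LLM agent to reason over."""
    hops = len(path) - 1  # a path of N nodes has N - 1 edges
    return (
        f"Query node: {query_node}\n"
        f"Candidate: {target_node}\n"
        f"Connection path ({hops} hops): {' -> '.join(path)}\n"
        f"Predicted link strength: {score:.4f}"
    )

# Toy discovery with made-up node names and a made-up score.
context = format_discovery(
    query_node="node_a",
    target_node="node_d",
    path=["node_a", "node_b", "node_c", "node_d"],
    score=0.75,
)
print(context)
```

Keeping the path and the score explicit in the context gives the agent concrete facts to cite in its explanation.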
Domain-aware, automatically
After ingestion, Contextus samples your graph and calls the LLM of your choice to infer the domain, terminology, and what kinds of discoveries matter. A biomedical graph gets biomedical discovery framing. An HR graph gets org-structure framing. No manual configuration or fine-tuning required.
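One way to picture the domain-inference step is as a sampling pass followed by a single prompt. The sketch below is an assumption-laden illustration: the `type` field, the prompt wording, and the `llm` argument (any callable that takes a prompt string and returns text) are hypothetical stand-ins, and the stub at the end replaces a real model call.

```python
import random
from collections import Counter

def sample_graph_summary(nodes, edges, sample_size, seed=0):
    """Count node and edge types in a random sample of the graph."""
    rng = random.Random(seed)
    node_sample = rng.sample(nodes, min(sample_size, len(nodes)))
    edge_sample = rng.sample(edges, min(sample_size, len(edges)))
    return {
        "node_types": Counter(n["type"] for n in node_sample),
        "edge_types": Counter(e["type"] for e in edge_sample),
    }

def build_domain_prompt(summary):
    """Turn the sampled type counts into a prompt asking for the graph's domain."""
    node_part = ", ".join(f"{t} ({c})" for t, c in sorted(summary["node_types"].items()))
    edge_part = ", ".join(f"{t} ({c})" for t, c in sorted(summary["edge_types"].items()))
    return (
        "A graph sample contains these node types: " + node_part + ". "
        "It contains these edge types: " + edge_part + ". "
        "Name the domain and the kinds of discoveries that would matter in it."
    )

def infer_domain(nodes, edges, llm, sample_size=100):
    """Sample the graph, build the prompt, and ask the supplied LLM callable."""
    return llm(build_domain_prompt(sample_graph_summary(nodes, edges, sample_size)))

# Toy graph and a stub standing in for a real LLM call.
toy_nodes = [{"type": "Drug"}, {"type": "Protein"}, {"type": "Disease"}]
toy_edges = [{"type": "TARGETS"}, {"type": "TREATS"}]
prompt = build_domain_prompt(sample_graph_summary(toy_nodes, toy_edges, 100))
print(prompt)
print(infer_domain(toy_nodes, toy_edges, llm=lambda text: "biomedical"))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider.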
Any graph. Any domain.
Contextus is graph-agnostic by design. The same engine, the same pipeline, and the same agent loop work across any domain. Because it runs natively on Arango’s multi-model data platform, you don’t need to bolt on a vector database or build an ETL layer to make it work. Graph traversal, node attributes, and GNN embeddings all live together in one place.
- Biomedical knowledge graphs — drug-protein-disease interaction networks, pathway analysis, side-effect clustering
- Financial networks — instrument correlations, counterparty exposure, contagion pathways
- HR and organisational graphs — talent clusters, reporting hierarchies, knowledge flow
- Software and code repositories — component dependencies, co-authorship, architectural similarity
- Supply chain and logistics — supplier dependencies, risk propagation, network resilience
- Any Arango graph you already have — native ingestion, no ETL required
A graph-only database like Neo4j has traditionally been paired with a separate vector store for embeddings, a sync layer to keep the two consistent, and two query languages on every request, adding infrastructure, latency, and complexity at every step. Arango handles graph traversal, node attributes, and embeddings natively, in one system.
Contextus is built for multiple use cases. Every time you ingest a new graph, Contextus automatically samples it, infers the domain, and regenerates its configuration from scratch. Switch from chip design to drug discovery to HR analytics to financial risk: the agent adapts its language, framing, and discovery logic each time, with no manual reconfiguration required.
Proven on real graphs
- 27,583 nodes (temporal chip design graph)
- 40,955 edges
- 24 node types
On a temporal chip design graph spanning four open-source RISC-V processor repositories, Contextus identified structurally similar memory management units across architecturally distinct processor families — connections invisible to direct AQL queries and undetectable by text similarity alone.
Example discovery
In chip design, finding that two memory management units share underlying architectural patterns — even when they live in completely separate repositories — can save weeks of redundant development. This is the kind of discovery Contextus makes routine.
For example, Contextus surfaced a latent connection between OR1200_e9c0251d27c13c65 (an OpenRISC specification chunk) and MOR1KX_g_f158f9ef98d3 (a DMMU implementation in a separate processor repository) with a predicted link strength of 0.5965 — via a three-hop path through shared architectural patterns. Neither a keyword search nor a direct graph query would find this.
Built on Arango
Contextus is built natively on Arango’s graph-native multi-model data platform, which handles document, graph, and vector storage in a single system, without a vector database sidecar or a separate graph layer.
- GNN embeddings stored alongside node attributes in the same Arango collection
- AQL traversal for expressive, graph-native BFS subgraph extraction
- Heterogeneous graph support out of the box — multiple node and edge types, one named graph
- Direct ingestion from any existing Arango instance — no ETL pipeline needed
Get started
Contextus runs as a self-contained FastAPI application with a built-in web UI. It ingests from any Arango graph, CSV dataset, or Open Graph Benchmark dataset, and requires an LLM API key (OpenAI, Anthropic, and others are supported).