Leaders in geometric deep learning · Yale spinout

See the shape
of your data.

Introducing Alpha Lake — the geometric intelligence layer

Alpha Lake, from Latent Data, sits over the relationship networks inside enterprise data, AI workflows, and agentic systems — revealing structure, behavior, and risk that flat tools cannot see.

Benchmarked at scale: 40M nodes (single machine, CPU path, no cluster required)
Relationships processed: 397.8M edges
Sustained ingest: 28.7M edges/sec
Quality preserved: 1.4e-5 max dev

// synthetic SBM / Barabási–Albert graph at production scale

Validated ingest: SQL edge views · Parquet (local + S3) · BigQuery · Kafka · Neo4j · Bitcoin · Ethereum

02 / The problem

Modern data has shape.
Most tools cannot see it.

Enterprise data is no longer just rows, dashboards, and reports. It includes transactions, documents, model outputs, embeddings, logs, agent traces, workflows, and decisions — all connected, all changing.

Traditional tools store and query that data, but miss the deeper structure that determines what it actually means.

  • 01 Which entities are actually related
  • 02 Where behavior is changing
  • 03 Where anomalies are emerging
  • 04 How AI agents are behaving
  • 05 Which patterns matter for decisions

Note · not an LLM

Language models read text. They do not measure structure. They are made for linear, sequential data.

Latent Data uses geometric deep learning — the math of relationships, manifolds, and flow — to analyze data the way it is actually organized, not the way it reads.

// view_a · flat table · nothing flagged
id       from        to          amt
tx_8401  acct_44…91  acct_77…02  $2,400
tx_8402  acct_31…ab  acct_77…02  $2,360
tx_8403  acct_44…91  acct_31…ab  $1,980
tx_8404  acct_55…1c  acct_44…91  $2,400
tx_8405  acct_77…02  acct_55…1c  $2,360
tx_8406  acct_55…1c  acct_31…ab  $2,400
tx_8407  acct_92…7e  acct_44…91  $1,200
tx_8408  acct_18…d3  acct_92…7e  $850
row-by-row scoring: all 8 transactions pass

// view_b · same data, geometric · ring detected
structural pattern: 4-account cycle · circular value flow
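
The ring in view_b can be recovered from the eight rows of view_a with ordinary graph tooling. The sketch below is illustrative only and is not Latent Data's method: it treats each row as a directed edge and uses Tarjan's strongly-connected-components algorithm, a stand-in technique chosen here, to find accounts that can all reach one another. The names TRANSACTIONS and strongly_connected_components are hypothetical, and the account ids are kept truncated exactly as they appear in the table.

```python
from collections import defaultdict

# Rows copied from view_a: (tx id, from, to, amount in USD).
# Account ids stay truncated ("…") exactly as shown in the table.
TRANSACTIONS = [
    ("tx_8401", "acct_44…91", "acct_77…02", 2400),
    ("tx_8402", "acct_31…ab", "acct_77…02", 2360),
    ("tx_8403", "acct_44…91", "acct_31…ab", 1980),
    ("tx_8404", "acct_55…1c", "acct_44…91", 2400),
    ("tx_8405", "acct_77…02", "acct_55…1c", 2360),
    ("tx_8406", "acct_55…1c", "acct_31…ab", 2400),
    ("tx_8407", "acct_92…7e", "acct_44…91", 1200),
    ("tx_8408", "acct_18…d3", "acct_92…7e", 850),
]


def strongly_connected_components(edges):
    """Tarjan's algorithm: return a list of sets of mutually reachable nodes."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                lowlink[node] = min(lowlink[node], lowlink[nxt])
            elif nxt in on_stack:
                lowlink[node] = min(lowlink[node], index[nxt])
        # node is the root of a component: pop the stack down to it.
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return components


edges = [(src, dst) for _, src, dst, _ in TRANSACTIONS]
# A "ring" here means any group of more than one account with circular flow.
rings = [c for c in strongly_connected_components(edges) if len(c) > 1]
print(rings)
```

On these rows the sketch reports a single group of four accounts (acct_44…91, acct_77…02, acct_31…ab, acct_55…1c). The other two accounts, acct_92…7e and acct_18…d3, only send value toward the ring and are left out, which is the kind of structure a row-by-row score has no way to express.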

03 / Platform

A geometric digital twin
of your data system.

Latent Data sits above your existing stack and builds a living, geometric digital twin of how entities, behaviors, and workflows actually relate — then keeps it fresh as new data arrives.

01

Twin

Build a geometric digital twin of the relationships, behaviors, and workflows inside your data.

02

Monitor

Track drift, anomalies, agent behavior, and structural change as the system evolves.

03

Predict

Surface emerging risks, opportunities, and likely future states before they reach the dashboard.

04

Act

Send signals into dashboards, alerts, APIs, copilots, and decision systems.

Reference architecture · works alongside your stack

Your data & systems
Postgres · Snowflake · BigQuery · Databricks · Kafka · Neo4j · S3 / Parquet · Agent traces

Latent Data · geometric intelligence layer
Manifold learning · Graph learning · Diffusion geometry · Temporal modeling · Representation learning

Outputs
Dashboards · APIs · Alerts · Risk scores · Agent governance · Decision intelligence

Latent Data works alongside data services and providers such as Postgres, Snowflake, Databricks, BigQuery, Kafka, and Neo4j. We do not replace the systems you already trust; we read from them and add the intelligence layer they were never designed to expose.

04 / Solutions

Where geometric intelligence
becomes operational.

Six surfaces where the structural layer changes the work. Each one is a place flat tools have always struggled.

01 · Fraud & AML

Catch rings, not just transactions.

Reveal coordinated fraud and money-laundering structure across accounts, devices, and transactions — the patterns that single-row scoring can't see.

ring detection · community-aware

02 · Blockchain investigation

Follow the flow on-chain.

Map address relationships, transaction trajectories, and entity behavior across Bitcoin and Ethereum. Validated on the public Elliptic Bitcoin fraud graph.

btc · eth · erc-20

03 · AI-agent observability

See how your agents actually behave.

Track agent behavior, tool usage, workflow drift, repeated failures, and token-cost patterns — as a graph of decisions, not a wall of logs.

trace.graph · tool calls

04 · Drift & anomaly detection

Know when the system moves.

Detect when models, data streams, users, agents, or whole subsystems move into unusual regions of behavior — before metrics break.

window · rolling

05 · Identity & ad fraud

Untangle who is really who.

Connect users, devices, accounts, and sessions into a stable identity graph; surface synthetic identities and coordinated abuse.

identity graph · cross-signal

06 · Decision intelligence

Turn behavior into action.

Convert complex data behavior into risk scores, opportunity signals, recommendations, and next-best actions — with structural context attached.

signal · risk@p95

05 / Technology & proof

Grounded in geometry.
Built for deployment.

Latent Data is built on geometric deep learning — the mathematics of relationships, manifolds, graphs, and flow. Where language models read text and tabular tools score rows, geometric methods analyze data based on its structure: how things connect, how behavior moves through a system, and how that structure changes over time.

01

Manifold learning

Finds the shape of complex, high-dimensional data and the low-dimensional structure that actually governs its behavior.

02

Graph learning

Models relationships between entities — users, documents, transactions, agents, decisions — as first-class objects.

03

Temporal modeling

Monitors how data, models, and workflows evolve, and where they begin to drift away from expected regimes.

04

Representation learning

Creates compact, AI-ready data layers that downstream models, agents, and dashboards can reuse.

05

Geometric digital twin

Combines all of the above into a living model of your data system — one that can be queried, monitored, and refreshed as reality changes.

Validated benchmarks

Numbers we can stand behind.

Reported at >95% confidence from internal benchmark runs. Headline scale numbers are on synthetic networks at production scale; real-world validation uses the public OGBn-products graph and the Elliptic Bitcoin fraud graph.

Headline scale
40M nodes
40M nodes and 397.8M edges processed end-to-end in 2 hours, which fits inside a nightly refresh window.
single machine · CPU path

Ingest throughput
28.7M edges/sec
Sustained edge-ingest rate at 40M-node scale: how fast the geometric layer pulls relationships in.
sustained, end-to-end

Quality preservation
1.4e-5 max dev
Quadratic-form deviation at 40M nodes; structural fidelity preserved.
all hybrid query accuracy floors held

Real-data validation
2.45M nodes
OGBn-products public graph: 61.9M edges, 23.4s snapshot, 20.8 GB peak RSS.
real-world benchmark

Note · headline 40M-node figures are measured on synthetic networks at production scale. Real-data validation is reported separately on OGBn-products and the Elliptic Bitcoin fraud graph. Third-party head-to-head benchmarks against incumbent graph systems are in progress.
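
For readers who want to sanity-check how the published figures relate to one another, a few lines of arithmetic are enough. The sketch below uses only the numbers on the cards above. It assumes, and this is an assumption rather than something the benchmark states, that the sustained ingest rate applies across the whole 397.8M-edge set. All variable and function names are hypothetical.

```python
# Figures copied from the benchmark cards; nothing here is newly measured.
SYNTHETIC_NODES = 40e6
SYNTHETIC_EDGES = 397.8e6
INGEST_EDGES_PER_SEC = 28.7e6
END_TO_END_SECONDS = 2 * 3600  # "2 hours" end-to-end

OGBN_NODES = 2.45e6
OGBN_EDGES = 61.9e6


def edges_per_node(edges, nodes):
    """Plain ratio of edges to nodes (not doubled as for undirected degree)."""
    return edges / nodes


# Assumption: the sustained rate holds for the full synthetic edge set.
ingest_seconds = SYNTHETIC_EDGES / INGEST_EDGES_PER_SEC
ingest_share = ingest_seconds / END_TO_END_SECONDS

synthetic_density = edges_per_node(SYNTHETIC_EDGES, SYNTHETIC_NODES)
ogbn_density = edges_per_node(OGBN_EDGES, OGBN_NODES)

print(f"synthetic edges per node: {synthetic_density:.1f}")
print(f"OGBn-products edges per node: {ogbn_density:.1f}")
print(f"implied ingest time: {ingest_seconds:.1f} s")
print(f"ingest share of 2-hour run: {ingest_share:.2%}")
```

Under that assumption, ingest accounts for roughly 14 seconds of the 2-hour end-to-end run, so nearly all of the window goes to the stages after ingest. The real-data graph is also denser than the synthetic one, at about 25 edges per node against about 10.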

Provenance

From the Krishnaswamy Lab at Yale.

Latent Data is built on more than a decade of published research in geometric deep learning, manifold learning, and graph signal processing — the same foundation behind sister companies Latent Alpha and Latent Bio.

06 / Research

Foundational research
in geometric intelligence.

The methods Latent Data builds on — manifold learning, graph signal processing, diffusion geometry, optimal transport on graphs — come out of more than a decade of peer-reviewed work. A selection of the foundational papers is below.

// foundational research from the Krishnaswamy Lab at Yale · same lineage as Latent Alpha and Latent Bio.

Contact

See the shape
of your data.

For demos, partnerships, press, or careers, reach out directly. We respond to qualified outreach within one business week.

New Haven, CT · Yale spinout · Remote-friendly