CAINC

Navigate the structural shape of institutional language

CAINC takes a topological reading of large document corpora and renders the result as a navigable Semantic Terrain. Topics have peaks. Boundaries have real edges. Years of enforcement priority become visible at a glance.

What is a Semantic Terrain

A corpus has a shape. CAINC reads it.

Every large document corpus contains structure. Topics cluster. Some dominate; others are peripheral. Enforcement priorities shift over time. Narratives converge and diverge.

A Semantic Terrain is the computed reading produced by CAINC. You navigate it the way you navigate geography: by moving through a landscape with real topographic features, not by entering keywords into a search box.

Each terrain is built from the full corpus of a named institution. It is updated as the corpus grows. It is the same document body, seen differently.

Peaks

The dominant topics of the corpus. Each peak is a stable cluster of language that defines a region of institutional focus. Drill into any peak to reveal finer structure within it.

Contours

The boundaries between topics. Where one enforcement focus ends and another begins. Computed from the actual geometry of the corpus.

Passes

The transition zones between peaks, where topics overlap, share language, or shift into one another.

Corridors

The paths connecting related peaks. Follow a corridor to see how two enforcement areas share language, entities, or structural proximity.

Entities

People, organizations, and cases located precisely within the terrain. Find any entity and see its full dossier: co-occurrences, peak membership, and related documents.

Live -- Department of Justice

Free -- No Account Required

DOJ Corpus Terrain

  • 115,000+ press releases
  • 445,000+ entities mapped
  • 2020 to present (full temporal range)

The full body of DOJ public communication, geometrically structured and navigable. See which enforcement topics dominate by era. Locate any defendant, case, or charge type within the broader terrain of federal prosecution. The DOJ Terrain is our public reference reading, open to anyone.

OPEN TERRAIN

No account. No paywall.

Who uses Semantic Terrains

Journalism & Investigation

Journalists and Investigators

See years of enforcement priority without reading years of press releases. The shape of the corpus tells you what an institution cared about, when, and how much.

  • Spot enforcement priority shifts across administrations
  • Find the topographic neighborhood of any charge type or case
  • Surface structural patterns invisible to keyword search

Legal & Litigation

White-Collar Defense Attorneys

Map how the DOJ frames similar conduct across time and jurisdiction. See where your case sits within the full landscape of federal enforcement and what surrounds it.

  • Locate structurally similar prosecution narratives
  • Identify proximity between charge types in the terrain
  • Track how enforcement language around a topic has evolved

Compliance & Risk

Compliance Analysts and Risk Officers

Detect emerging enforcement focus before it becomes a crisis. Monitor how regulatory language around your sector is moving relative to the full corpus and where it is heading.

  • Identify rising peaks in enforcement terrain
  • Track topic boundary shifts as regulatory focus moves
  • Monitor entity proximity to high-density prosecution clusters

What you can do in CAINC

Four ways to explore a Semantic Terrain

Terrain Map

See the full topographic landscape of the corpus. Peaks, contours, and density rendered as navigable geography. Click any region to explore the documents within it. Drill into peaks to find the finer structure inside.

Network View

Explore how entities relate to each other and to the terrain. See co-occurrence connections, cluster membership, and the structural neighborhood around any person, organization, or case.

Article View

Search the full corpus by meaning, not just keywords. Vector search surfaces documents that are semantically related to your query, even when they use different language. Filter by date, peak, or entity.
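As a rough illustration of what searching by meaning can look like, the sketch below ranks documents by cosine similarity between a query vector and document vectors, with an optional peak filter. It is a minimal sketch under stated assumptions, not a description of CAINC's search: the three-dimensional vectors, the document ids, the peak labels, and the `search` function are all hypothetical. Real embeddings come from a language model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def search(query_vec, docs, top_k=3, peak=None):
    """Return ids of the top_k most similar docs, optionally within one peak."""
    candidates = [d for d in docs if peak is None or d["peak"] == peak]
    scored = [(cosine_similarity(query_vec, d["vec"]), d["id"]) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

# Hypothetical toy data: ids, peak labels, and vectors are invented.
DOCS = [
    {"id": "wire-fraud-plea", "peak": "fraud", "vec": [0.9, 0.1, 0.0]},
    {"id": "ponzi-sentencing", "peak": "fraud", "vec": [0.8, 0.3, 0.1]},
    {"id": "fentanyl-indictment", "peak": "narcotics", "vec": [0.0, 0.2, 0.9]},
]

print(search([1.0, 0.2, 0.0], DOCS, top_k=2))
```

With this toy data the query vector points toward the two fraud documents, so they rank first even though no keywords are compared at any point.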

Entity View

Pull up any entity in the corpus and see its full dossier: every document it appears in, which peaks it belongs to, which other entities it co-occurs with, and how its presence in the terrain has developed.
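The dossier idea can be sketched with plain counting: given which entities appear in which documents, list an entity's documents and tally the entities that appear alongside it. This is a minimal sketch under stated assumptions. The `dossier` function, the release ids, and the entity names are hypothetical, and CAINC's own entity extraction and scoring are not described in this page.

```python
from collections import Counter

def dossier(entity, doc_entities):
    """Collect an entity's documents and its co-occurring entities.

    doc_entities maps a document id to the list of entities found in it.
    """
    docs = sorted(d for d, ents in doc_entities.items() if entity in ents)
    neighbours = Counter()
    for doc_id in docs:
        # sorted() keeps tie order deterministic; set() ignores repeats
        for other in sorted(set(doc_entities[doc_id])):
            if other != entity:
                neighbours[other] += 1
    return {"documents": docs, "co_occurs_with": neighbours.most_common()}

# Hypothetical toy data: placeholder names, not real DOJ entities.
DOC_ENTITIES = {
    "release-001": ["Acme Corp", "Jane Doe"],
    "release-002": ["Acme Corp", "Jane Doe", "John Roe"],
    "release-003": ["John Roe"],
}

print(dossier("Acme Corp", DOC_ENTITIES))
```

Here "Acme Corp" appears in two releases, sharing both with "Jane Doe" and one with "John Roe". Peak membership could be added the same way by carrying a peak label per document.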

How Semantic Terrains work

Built on the geometry of language. Not an approximation of it.

Most tools that visualize document similarity render contours as visual approximations. The shape looks real, but it is an artifact of the rendering method, not a computed structure.

CAINC computes a genuine scalar field over the semantic space of the corpus, then extracts its topological structure: peaks, passes, and contours that reflect the actual geometry of the language.

The mathematical substrate is documented in full on the Methodology page. For definitions of terrain features, see the Terminology page.
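To make the idea of a computed scalar field concrete, here is a toy sketch that evaluates a Gaussian kernel density over a grid and reports the grid cells that are strict local maxima. Everything in it is an assumption made for illustration: the two-dimensional point positions, the Gaussian kernel, the bandwidth, and the grid-based peak test. It is not CAINC's method, which this page does not specify.

```python
import math

def density_field(points, grid_size=20, bandwidth=0.1):
    """Sum of Gaussian kernels evaluated at cell centres of a grid over [0, 1]^2."""
    field = [[0.0] * grid_size for _ in range(grid_size)]
    for i in range(grid_size):
        for j in range(grid_size):
            gx = (i + 0.5) / grid_size  # x coordinate of the cell centre
            gy = (j + 0.5) / grid_size  # y coordinate of the cell centre
            field[i][j] = sum(
                math.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2 * bandwidth ** 2))
                for px, py in points
            )
    return field

def find_peaks(field):
    """Return (i, j) cells strictly higher than every adjacent cell."""
    n = len(field)
    peaks = []
    for i in range(n):
        for j in range(n):
            neighbours = [
                field[a][b]
                for a in range(max(0, i - 1), min(n, i + 2))
                for b in range(max(0, j - 1), min(n, j + 2))
                if (a, b) != (i, j)
            ]
            if all(field[i][j] > value for value in neighbours):
                peaks.append((i, j))
    return peaks

# Hypothetical 2-D document positions forming two loose clusters.
POINTS = [
    (0.21, 0.26), (0.23, 0.29), (0.25, 0.26),  # cluster A
    (0.76, 0.72), (0.78, 0.74),                # cluster B
]

print(find_peaks(density_field(POINTS)))
```

The two clusters produce two peaks, one grid cell each. The point of the sketch is that the peaks come from the field values themselves, not from how the field is drawn.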

01

Documents are ingested from the source institution

The full document corpus of a named institution is collected, cleaned, and prepared for analysis. Every document is preserved in full.

02

Language is mapped into structural space

Documents are embedded into high-dimensional space where meaning determines position. Similar language clusters together. Distinct topics separate.

03

Topics, entities, and boundaries are identified

Peaks, passes, and contours are computed from the actual geometry of the corpus. Entities are located precisely within the terrain.

04

The terrain updates as new documents are published

As the live corpus grows, the Semantic Terrain is recomputed on its defined cadence. You see terrain movement, not static snapshots.

05

You navigate by topic, entity, time, and proximity

Explore the terrain by moving through its geography. Find any entity and see what surrounds it. Follow a topic across time. Trace the boundary between two enforcement areas.

Available Terrains

Each terrain is a named corpus. One institution. Full depth.

Terrain Subscriptions give you continuous access to a specific institutional Semantic Terrain built from a named corpus, updated as new documents are published, and navigable by entity, topic, and time. The DOJ Terrain is our permanent free reference reading. Named terrain subscriptions are in development.

● Live -- Free

DOJ

Department of Justice press releases, 2020 to present. 115,000+ documents. The full landscape of federal prosecution, enforcement priority, and institutional communication.

○ In Development

DEA

Drug Enforcement Administration. The full terrain of federal narcotics enforcement: charge types, priority shifts, geographic and entity patterns across decades.

○ In Development

FTC

Federal Trade Commission. Consumer protection and antitrust enforcement terrain for compliance, M&A due diligence, and competitive landscape research.

○ In Development

SCOTUS

Supreme Court of the United States. Published opinions and argumentation. The structural shape of constitutional doctrine across two centuries of language.

Terrain Subscriptions are updated on a defined cadence as new documents are published. Access details are handled by direct inquiry ahead of each terrain launch.

Custom Terrains and Deep Dives are available for organizations that need a specific body of documents mapped or a scoped analytical engagement. These are handled by direct inquiry.

INQUIRE ABOUT A TERRAIN

Interested in a specific terrain?

Tell us about your corpus, your questions, or the institution you need mapped.

hello@semanticgaps.com