Terminology

The language of Semantic Terrain analysis

SemanticGaps operates at the intersection of topology, semantics, and institutional language. These are the terms that define our work, the instrument we build, and the readings it produces.

Core Concepts

Semantic Terrain

A computed, navigable map of a document corpus's structural geography. A Semantic Terrain is the reading produced when CAINC takes a topological measurement of a full corpus. It reveals how topics cluster, where boundaries form, and how the shape of institutional language changes over time. It is not a search engine, a summary, or a visualization of keywords.

Corpus Intelligence

Intelligence derived from the shape and movement of large document bodies. Where traditional research reads individual documents, corpus intelligence reads the structural relationships between all documents at once. The output is a map of what an institution talks about, how those topics relate, and where the landscape is shifting.

CAINC

The proper name of the SemanticGaps instrument for taking a topological reading of a document corpus. The name comes from the Welsh word for branch; it is not an acronym. CAINC ingests and embeds the full document body of a named institution, then computes it into a navigable Semantic Terrain. The DOJ Semantic Terrain is the first publicly available reading produced by CAINC.

Corpus

A bounded collection of documents from a single institution or domain. A corpus might consist of years of Department of Justice press releases, a decade of FDA enforcement actions, or the full published opinion history of a federal court. Each corpus has a shape. CAINC makes that shape visible and navigable.

Live Corpus

A document body that grows as new material is published by the source institution. A live corpus is not static. As the DOJ publishes new press releases, or as a regulatory body issues new filings, the corpus expands and the Semantic Terrain updates to reflect its new shape.

Terrain Features

Peaks

The dominant topic clusters within a Semantic Terrain. Peaks represent the conceptual high ground of a corpus. In the DOJ terrain, peaks correspond to major enforcement categories: healthcare fraud, national security, public corruption. Each peak is a stable cluster of language that defines a region of institutional focus. Peaks can contain sub-peaks, allowing you to drill down from broad enforcement categories into finer-grained topics within them.

Passes

Transition zones where topics overlap, share language, or shift into one another. Passes are the low points between peaks. They reveal where jurisdiction is contested, where enforcement categories converge, or where a single case touches multiple domains. Passes are often where the most interesting structural insights live.

Corridors

The paths between connected peaks in a Semantic Terrain. Corridors show how topics relate to one another through shared language and overlapping entities. Following a corridor between two peaks reveals the structural connection between seemingly separate enforcement areas.

Contours

Computed boundaries between topic regions within a Semantic Terrain. Contours show where one enforcement focus ends and another begins. They are derived from the actual geometry of the language in the corpus, not from human tagging or manual categorization.

Persistence

A measure of how dominant and stable a peak is within the terrain. High-persistence peaks represent deeply established institutional focus areas that have remained prominent across the corpus. Low-persistence peaks may represent emerging or peripheral topics. Persistence helps distinguish the signal from the noise in a large corpus.
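
One common way to make a measure like this concrete is topological persistence from topological data analysis: a peak is born at its own height and dies at the height where its region merges into a taller neighbor. The sketch below computes that quantity for a one-dimensional slice of a density surface. It is an illustration under that assumption, not a description of how CAINC computes persistence, and the function name and sample profile are hypothetical.

```python
def peak_persistence(values):
    """Return (peak_index, persistence) pairs for the local maxima of a 1-D profile.

    Points are added from highest to lowest. When two regions meet, the one
    with the lower peak "dies" and its persistence is peak height minus the
    height at which the merge happened.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    parent = {}    # union-find forest over points added so far
    peak_of = {}   # region root -> index of that region's highest point
    result = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        peak_of[i] = i
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            a, b = find(i), find(j)
            if a == b:
                continue
            lo, hi = sorted((a, b), key=lambda r: values[peak_of[r]])
            if peak_of[lo] != i:  # ignore the trivial region made of point i alone
                result[peak_of[lo]] = values[peak_of[lo]] - values[i]
            parent[lo] = hi

    # The global maximum never merges into a taller region.
    top = order[0]
    result[top] = values[top] - min(values)
    return sorted(result.items(), key=lambda kv: -kv[1])


# Hypothetical density values along a line through the terrain.
profile = [0, 5, 1, 3, 0.5, 9, 0]
ranked = peak_persistence(profile)
# ranked == [(5, 9), (1, 4.5), (3, 2)]
```

In this toy profile the peak at index 3 is almost as tall as its surroundings allow but merges quickly into its neighbor, so it ranks last: height alone does not make a peak persistent.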

Entity Mapping

The process of locating people, organizations, and cases precisely within a Semantic Terrain. Every entity in the corpus occupies a specific position in the terrain. Entity mapping reveals not just that a name appears, but where it sits relative to the surrounding topics, which peaks it is near, and what other entities share its topological neighborhood.

Entity Dossier

A comprehensive profile of a single entity within a Semantic Terrain. A dossier shows every document the entity appears in, which peaks it is associated with, which other entities it co-occurs with, and how its presence in the corpus has developed over time. Dossiers turn a name into a full structural portrait.

Terrain Movement

Changes in the shape, peaks, or boundaries of a Semantic Terrain over time as the live corpus grows. When an institution shifts its enforcement priorities, when a new topic emerges, or when an existing focus area contracts, terrain movement captures that structural change. Movement is the signal that something in the institution has changed.
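
As a rough illustration of how movement could be quantified, the sketch below compares each peak's share of documents across two time windows. It assumes every document has already been assigned to a peak. The function names and the sample labels are hypothetical, and this is a simplification rather than the method CAINC uses.

```python
from collections import Counter


def peak_shares(labels):
    """Fraction of documents assigned to each peak."""
    counts = Counter(labels)
    total = len(labels) or 1  # avoid dividing by zero on an empty window
    return {peak: n / total for peak, n in counts.items()}


def terrain_movement(before, after):
    """Change in each peak's share of the corpus between two windows."""
    a, b = peak_shares(before), peak_shares(after)
    return {peak: b.get(peak, 0.0) - a.get(peak, 0.0)
            for peak in sorted(set(a) | set(b))}


# Hypothetical peak assignments for two periods of eight documents each.
earlier = ["healthcare fraud"] * 4 + ["national security"] * 2 + ["public corruption"] * 2
later = ["healthcare fraud"] * 2 + ["national security"] * 4 + ["public corruption"] * 2

movement = terrain_movement(earlier, later)
# movement == {"healthcare fraud": -0.25, "national security": 0.25, "public corruption": 0.0}
```

A share-based comparison only captures growth and contraction of existing peaks. Detecting a newly formed peak or a shifted boundary would require recomputing the terrain itself.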

How You Explore

Terrain Map

The primary view of a Semantic Terrain: a topographic visualization showing the density landscape of the corpus, with peaks, contours, and document positions rendered as navigable geography. Click any region to explore the documents and entities within it. Drill into any peak to reveal the finer structure inside.

Network View

A view of the entity relationships within a Semantic Terrain. Entities are shown as nodes and their co-occurrences as connections, positioned according to their location in the terrain. The network view reveals which entities appear together, how tightly they cluster, and where they sit relative to the broader topic landscape.
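
The co-occurrence connections described above can be sketched as a weighted edge list: two entities are linked once for every document in which both appear. The document IDs and entity names below are hypothetical, and the sketch covers only the counting step, not how nodes are positioned in the terrain.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(doc_entities):
    """Count how many documents each pair of entities shares."""
    edges = Counter()
    for entities in doc_entities.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(entities)), 2):
            edges[pair] += 1
    return edges


# Hypothetical extraction results: document id -> entities mentioned in it.
doc_entities = {
    "doc-1": ["Acme Health", "J. Doe", "Medicare"],
    "doc-2": ["Acme Health", "Medicare"],
    "doc-3": ["J. Doe", "Globex"],
}

edges = cooccurrence_edges(doc_entities)
# edges[("Acme Health", "Medicare")] == 2
```

The edge weight gives a simple measure of how tightly two entities cluster: pairs that share many documents end up with heavier connections.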

Article View

A full-text document browser within CAINC. Search the corpus using natural language or specific terms. Results are ranked by both textual relevance and semantic similarity using vector search, surfacing documents that are structurally related to your query even when they do not share exact keywords.

Entity View

A dedicated interface for exploring individual entities within the terrain. Select any entity to see its full dossier: the documents it appears in, the peaks it belongs to, the other entities it co-occurs with, and its structural position within the corpus.

Vector Search

A search method that finds documents by meaning rather than exact keyword match. When you search a Semantic Terrain, vector search identifies documents that are semantically similar to your query, even if they use different words. Combined with full-text search, this surfaces results that keyword-only tools can miss.
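
A minimal sketch of combining the two signals: cosine similarity between embedding vectors stands in for semantic similarity, a simple term-overlap ratio stands in for full-text relevance, and a weight `alpha` blends them. The three-dimensional vectors, the sample documents, the scoring formula, and the `alpha` parameter are all assumptions made for illustration. A production system would use a real embedding model and a proper text-ranking function.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_vec, docs, alpha=0.5):
    """Rank document ids by a weighted blend of semantic and keyword scores."""
    scored = []
    for doc in docs:
        score = (alpha * cosine(query_vec, doc["vec"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        scored.append((score, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical documents with toy embeddings.
docs = [
    {"id": "a", "text": "physician billed medicare for services never provided", "vec": (0.9, 0.1, 0.0)},
    {"id": "b", "text": "healthcare fraud sentencing announced", "vec": (0.8, 0.2, 0.1)},
    {"id": "c", "text": "export control violation charged", "vec": (0.0, 0.1, 0.9)},
]

blended = hybrid_search("healthcare fraud", (1.0, 0.0, 0.0), docs)
vector_only = hybrid_search("healthcare fraud", (1.0, 0.0, 0.0), docs, alpha=1.0)
# blended == ["b", "a", "c"]; vector_only == ["a", "b", "c"]
```

Document "a" shares no words with the query, so a keyword-only ranking would score it the same as the unrelated document "c". The vector component is what places it near the top.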

Services

Terrain Subscription

Continuous access to a named institutional Semantic Terrain, updated on a defined cadence. A terrain subscription is not a one-time report. It is ongoing access to a living reading produced by CAINC as the corpus changes. As the institution publishes new material, the terrain updates and subscribers see the movement.

Terrain Cadence

The update frequency of a Semantic Terrain. Cadence is determined by the natural publication rhythm of the source institution: some terrains update daily, others weekly, monthly, quarterly, or yearly. The DOJ terrain updates as new press releases are published.

Custom Terrain

A bespoke Semantic Terrain built to a client's specification from a designated corpus. Organizations that need a specific body of documents mapped can request a custom terrain. The corpus is defined, ingested, and computed into a navigable terrain maintained on an agreed cadence. Custom terrain pricing is handled by direct inquiry.

Deep Dive

A scoped analytical engagement using Semantic Terrain data to investigate a specific question or thesis. A deep dive uses the terrain as a starting point, then goes further: tracing entity relationships, analyzing terrain movement over a specific period, or mapping the structural context around a particular case, topic, or enforcement action.

Technical Foundations

Scalar Field

The mathematical foundation underlying a Semantic Terrain. A scalar field is a computed density function over the embedding space of the corpus. It assigns a value to every point in the terrain, creating the surface from which peaks, passes, and contours are extracted. The scalar field is what makes a Semantic Terrain a genuine topological structure rather than a visual approximation.
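
To make the idea of a density function over embedding space concrete, here is a minimal sketch using a Gaussian kernel density estimate on two-dimensional points. The choice of kernel, the `bandwidth` parameter, the two-dimensional coordinates, and the sample points are all assumptions for illustration. The source does not specify which density estimator CAINC uses.

```python
import math


def gaussian_density(point, embeddings, bandwidth=1.0):
    """Unnormalized Gaussian kernel density at `point`.

    Each document contributes most where it sits and progressively less
    with distance, so tight groups of documents produce high values.
    """
    total = 0.0
    for e in embeddings:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, e))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(embeddings)


# Hypothetical 2-D document positions forming two separate groups.
docs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0)]

on_peak = gaussian_density((0.0, 0.0), docs, bandwidth=0.5)
between = gaussian_density((1.5, 1.5), docs, bandwidth=0.5)
# on_peak is much larger than between: high ground at the group, low ground in the gap.
```

Evaluating a function like this on a grid yields a surface whose local maxima correspond to peaks and whose low regions between them correspond to passes.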

See the method behind the terrain

The technical foundations of how Semantic Terrains are computed are documented on our Methodology page. To explore a live terrain, visit the free DOJ Semantic Terrain.