What This API Does
Hypernym is a context compression engine with semantic fact extraction. You submit any text — source code, medical papers, legal contracts, prose — and Hypernym returns a frequency-ranked list of semantic facts: the deduplicated, clustered concepts that the text is actually about, ordered by how often they appear across 60 independent stochastic compression trials.
The output is a tunable compression dial. Request the top 10 facts and you get the document's skeleton. Request the top 100 and you get its musculature. Request all 500+ and you get its complete semantic anatomy. Each fact includes a frequency percentage (how many of the 60 trials independently surfaced it) and a cluster size (how many raw extractions collapsed into this single concept).
Document Understanding
What is this 50-page paper actually about? The top 20 facts tell you.
Semantic Deduplication
Two documents about the same thing produce overlapping fact sets.
Compression Quality
Coherence scores tell you how stable the compression is. High coherence (>0.85) means deterministic extraction. Low coherence (<0.7) means the document resists compression.
Research Data
The full pipeline output (P-Span curves, trial distributions, coherence matrices) is available for downstream analysis.
Verified Runs
| Input | Domain | Tokens | Facts | Time | Coherence |
|---|---|---|---|---|---|
| zephyr_regulator.py | code | 6,880 | 549 | 812s | 0.935 |
| governor.py | code | 6,343 | 343 | 586s | 0.947 |
| arXiv 2602.16000 | medical | 14,297 | 541 | 966s | 0.781 |
How It Works
Hypernym runs a 4-phase pipeline on every submission. Processing time is 2–20 minutes depending on input size. A 7K-token code file takes ~10 min. A 14K-token medical paper takes ~16 min. The bottleneck is Phase 2 (60 FC calls).
P-Span Sweep
Sweeps 20 compression levels from aggressive to light. Finds the optimal compression parameter (N) where quality stabilizes — the "shear boundary."
Comprehensive Trials
Runs 60 independent stochastic compressions at the optimal N. Each trial extracts elements and details from the text independently.
Coherence Analysis
Computes pairwise similarity between all 60 trials (3,600 ordered pairs, i.e. the full 60 × 60 matrix including self-pairs). High coherence = stable compression. Uses BGE-M3 embeddings via Outfinity.
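The coherence statistics can be illustrated with a small sketch. It assumes one embedding vector per trial, cosine similarity as the metric, a population standard deviation, and that the pair count covers every ordered pair including self-pairs (inferred from the sample result, where 39 trials report 1,521 pairs). The function names are hypothetical and are not part of the Hypernym API.

```python
import math
from statistics import mean, pstdev


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence_stats(embeddings):
    """Summarise the full N x N similarity matrix (assumption: self-pairs are counted)."""
    sims = [cosine(u, v) for u in embeddings for v in embeddings]
    return {
        "avg": mean(sims),
        "std": pstdev(sims),  # assumption: population standard deviation
        "min": min(sims),
        "max": max(sims),
        "pairs": len(sims),
    }


# Toy input: three 2-dimensional "trial embeddings".
stats = coherence_stats([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(stats["pairs"], round(stats["avg"], 3))
```

With self-pairs included, max is always 1.0 for non-zero vectors, which matches the max of 1.0 in the sample coherence object.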
Semantic Clustering
Collects all element/detail pairs from all trials. Gets embeddings, computes cosine similarity matrix, runs greedy single-linkage clustering at 0.85 threshold. Ranks clusters by frequency (% of trials containing them).
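One possible reading of the clustering step is sketched below, on toy vectors. It assumes "greedy single-linkage" means each item joins the first existing cluster containing any member at or above the threshold, and that frequency is the share of trials contributing at least one member. The helper names and the toy data are hypothetical; the production pipeline may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_single_linkage(vectors, threshold=0.85):
    """Assign each vector to the first cluster with any member >= threshold."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if any(cosine(vec, vectors[j]) >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no cluster was close enough: start a new one
    return clusters


def rank_clusters(clusters, trial_ids, total_trials):
    """Rank clusters by the percentage of trials that contributed a member."""
    ranked = []
    for cluster in clusters:
        trials = {trial_ids[i] for i in cluster}
        ranked.append({
            "members": cluster,
            "cluster_size": len(cluster),
            "frequency_pct": round(100.0 * len(trials) / total_trials, 2),
        })
    ranked.sort(key=lambda c: c["frequency_pct"], reverse=True)
    return ranked


# Toy data: four extracted pairs from three trials.
vectors = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.98, 0.15]]
trial_ids = [0, 1, 1, 2]
ranked = rank_clusters(greedy_single_linkage(vectors), trial_ids, total_trials=3)
for entry in ranked:
    print(entry)
```

In this toy run the three near-parallel vectors collapse into one cluster seen in all three trials, while the orthogonal vector forms its own cluster seen in one trial.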
Output: Frequency-Ranked Semantic Facts
Each fact = {element, detail, frequency_pct, cluster_size}. Top facts = what the document is most consistently about.
Authentication
All endpoints except /api/omnifact/health require an API key in the X-API-Key header.
- API keys are provided by the administrator. Contact your Hypernym account manager for a key.
- Keep your API key secure and never expose it in client-side code.
- Use environment variables to store your API key.
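The points above can be sketched as a small helper that builds the request headers from an environment variable. The variable name HYPERNYM_API_KEY and the function name are assumptions chosen for illustration; only the X-API-Key header name comes from this document.

```python
import os


def build_headers(env=None, var_name="HYPERNYM_API_KEY"):
    """Read the API key from the environment and return request headers.

    `var_name` is a hypothetical variable name, not one mandated by the API.
    """
    env = os.environ if env is None else env
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {var_name} before calling the API")
    return {"Content-Type": "application/json", "X-API-Key": api_key}


# Passing a dict instead of os.environ keeps the example self-contained.
headers = build_headers({"HYPERNYM_API_KEY": "example-key"})
print(sorted(headers))
```

Failing fast on a missing key avoids sending unauthenticated requests that would only come back as 401.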
Health Check
/api/omnifact/health
Check if the server is up and view current queue load. No authentication required.
Response (200)
{
"status": "healthy",
"version": "1.5.0",
"uptime_seconds": 2240.7,
"queue": {
"active": 0,
"queued": 0,
"max_concurrent": 10
}
}
The queue field shows current system load. active = experiments currently running, queued = experiments waiting for a slot, max_concurrent = total processing capacity.
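As a rough illustration of how a caller might read the queue field before submitting, the sketch below derives the number of free slots from a health payload. The function name and the "will_wait" heuristic are assumptions, not documented server behaviour.

```python
def queue_summary(health):
    """Summarise a /api/omnifact/health payload (field names as documented above)."""
    queue = health.get("queue", {})
    active = queue.get("active", 0)
    queued = queue.get("queued", 0)
    capacity = queue.get("max_concurrent", 0)
    free_slots = max(capacity - active, 0)
    return {
        "healthy": health.get("status") == "healthy",
        "free_slots": free_slots,
        # Assumption: a new submission waits if others are queued or no slot is free.
        "will_wait": queued > 0 or free_slots == 0,
    }


sample = {
    "status": "healthy",
    "version": "1.5.0",
    "uptime_seconds": 2240.7,
    "queue": {"active": 0, "queued": 0, "max_concurrent": 10},
}
summary = queue_summary(sample)
print(summary)
```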
Submit Text
/api/omnifact/begin
Submit text for semantic fact extraction.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Required | The text to analyze. 10–500,000 characters. |
| sector | string | Optional | Domain hint. Default: "prose". One of: code, prose, legal, medical, academic, financial. |
Response — New experiment (202 Accepted)
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "PROCESSING",
"content_hash": "f887207933dbe2d4a59cb04331d334519c3f7b998b61c63622f158346f5251ba",
"input_tokens": 14297
}
The experiment_id is a UUID. Use it in all subsequent requests.
Response (200, cached result)
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "COMPLETED",
"content_hash": "f887207933dbe...",
"input_tokens": 14297,
"semantic_fact_count": 541
}
Submitting identical text returns the cached result immediately, with no reprocessing.
Response (same text already in progress)
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "PROCESSING",
"phase": "COMPREHENSIVE",
"progress_pct": 35,
"content_hash": "f887207933dbe..."
}
Errors
| Status | Meaning |
|---|---|
| 400 | Missing text field, text too short/long, invalid sector |
| 401 | Invalid API key |
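Most 400 responses listed above can be avoided by checking the request body before sending it. The sketch below applies the documented limits client-side; it assumes the server counts characters the same way Python's len() does, which this document does not confirm. The function and constant names are hypothetical.

```python
VALID_SECTORS = ("code", "prose", "legal", "medical", "academic", "financial")
MIN_CHARS = 10
MAX_CHARS = 500_000


def build_request_body(text, sector="prose"):
    """Validate inputs against the documented limits and return the JSON body."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    # Assumption: len() matches the server's notion of a character.
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        raise ValueError(f"text must be {MIN_CHARS} to {MAX_CHARS} characters, got {len(text)}")
    if sector not in VALID_SECTORS:
        raise ValueError(f"sector must be one of {VALID_SECTORS}, got {sector!r}")
    return {"text": text, "sector": sector}


body = build_request_body("def add(a, b):\n    return a + b\n", sector="code")
print(body["sector"], len(body["text"]))
```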
Poll Progress
/api/omnifact/{experiment_id}/status
Poll processing progress with granular per-phase detail.
The detail field changes per phase:
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "PROCESSING",
"phase": "PSPAN",
"phase_number": 1,
"total_phases": 4,
"progress_pct": 45,
"input_tokens": 14297,
"timing": { "elapsed_seconds": 101.2 },
"detail": {
"current_point": 10,
"total_points": 20,
"current_n": 164
}
}
{
"status": "PROCESSING",
"phase": "COMPREHENSIVE",
"phase_number": 2,
"total_phases": 4,
"progress_pct": 25,
"detail": {
"completed_trials": 0,
"total_trials": 60,
"percent_complete": 30
}
}
completed_trials stays 0 until all 60 finish (single FC request). percent_complete is FC's internal progress.
{
"phase": "COHERENCE",
"detail": {
"step": "generating_embeddings",
"generated": 35,
"total": 60
}
}
{
"phase": "SEMANTIC_CLUSTERING",
"detail": {
"step": "fetching_embeddings",
"total_pairs": 1970
}
}
Clustering steps cycle through: collecting_pairs → fetching_embeddings → clustering.
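Because the detail object has a different shape in each phase, a client may want one function that turns any status payload into a readable line. The sketch below handles the four phases shown above; the function name and output wording are illustrative assumptions, and it reports percent_complete rather than completed_trials during the trial phase for the reason given above.

```python
def describe_progress(status):
    """Turn a status payload into a one-line progress description."""
    phase = status.get("phase")
    pct = status.get("progress_pct")
    detail = status.get("detail") or {}
    if phase is None:
        return f"{status.get('status', 'UNKNOWN')} (no active phase)"
    if phase == "PSPAN":
        extra = (f"sweep point {detail.get('current_point')}/"
                 f"{detail.get('total_points')}, N={detail.get('current_n')}")
    elif phase == "COMPREHENSIVE":
        # completed_trials stays 0 until the batch finishes, so use percent_complete.
        extra = f"trial batch {detail.get('percent_complete')}% done"
    elif phase == "COHERENCE":
        extra = f"{detail.get('step')}: {detail.get('generated')}/{detail.get('total')}"
    elif phase == "SEMANTIC_CLUSTERING":
        extra = f"{detail.get('step')}: {detail.get('total_pairs')} pairs"
    else:
        extra = "no detail available"
    prefix = phase if pct is None else f"{phase} {pct}%"
    return f"{prefix} ({extra})"


pspan_line = describe_progress({
    "phase": "PSPAN",
    "progress_pct": 45,
    "detail": {"current_point": 10, "total_points": 20, "current_n": 164},
})
coherence_line = describe_progress({
    "phase": "COHERENCE",
    "detail": {"step": "generating_embeddings", "generated": 35, "total": 60},
})
done_line = describe_progress({"status": "COMPLETED", "phase": None, "progress_pct": 100})
print(pspan_line)
print(coherence_line)
print(done_line)
```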
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "COMPLETED",
"phase": null,
"progress_pct": 100,
"optimal_n": 882,
"semantic_fact_count": 541,
"input_tokens": 14297,
"timing": { "total_seconds": 965.3 }
}
{
"experiment_id": "f3a1874f-2a12-4e60-bbf1-98f3770a0a36",
"status": "FAILED",
"error": "P-Span analysis returned no results",
"timing": { "elapsed_seconds": 12.3 }
}
Cancel Experiment
/api/omnifact/{experiment_id}/cancel
Cancel a running or queued experiment.
Response (200)
{
"cancelled": true,
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"was_status": "running"
}
Errors
| Status | Meaning |
|---|---|
| 409 | Experiment already in terminal state (completed, failed, or cancelled) |
| 404 | Experiment not found or not owned by your key |
Get Full Results
/api/omnifact/{experiment_id}/result
Full experiment data including P-Span curve, trial statistics, coherence metrics, all semantic facts, and timing breakdown.
Response Fields
| Parameter | Type | Required | Description |
|---|---|---|---|
| metadata | object | Optional | Sector, input stats, content hash, timestamps |
| pspan | array | Optional | 20-point P-Span sweep curve with compression ratios |
| trials | object | Optional | Trial count, avg compression ratio, element counts |
| coherence | object | Optional | Pairwise similarity stats (avg, std, min, max, pairs) |
| semantic_facts | object | Optional | All facts ranked by frequency |
| tuned_compression | array | Optional | Compression dial checkpoints at different fact counts |
| timing | object | Optional | Per-phase wall-clock time in seconds |
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "COMPLETED",
"metadata": {
"sector": "medical",
"input_tokens": 14297,
"input_bytes": 60454,
"input_lines": 1568,
"optimal_n": 882,
"content_hash": "f88720793...",
"created_at": "2026-02-20 11:13:54",
"completed_at": "2026-02-20 11:29:59"
},
"pspan": [
{
"pspan_index": 0,
"requested_n": 73,
"actual_n": 37,
"delivery_ratio": 0.507,
"is_shear": 1,
"compression_ratio": 0.021,
"similarity": 1.0
}
],
"trials": {
"total": 39,
"avg_compression_ratio": 0.134,
"avg_elements_raw": 59.5,
"avg_elements_clean": 48.2
},
"coherence": {
"avg": 0.7808,
"std": 0.161,
"min": 0.3995,
"max": 1.0,
"pairs": 1521
},
"semantic_facts": {
"total": 541,
"facts": [
{
"rank": 1,
"element": "Fractional Flow Reserve",
"detail": "FFR is defined as the ratio of maximal hyperaemic myocardial blood flow...",
"frequency_pct": 77.78,
"cluster_size": 18,
"token_count": 63
}
]
},
"tuned_compression": [
{ "facts_count": 10, "tokens": 361, "compression_ratio": 0.025 },
{ "facts_count": 50, "tokens": 1523, "compression_ratio": 0.107 },
{ "facts_count": 100, "tokens": 3210, "compression_ratio": 0.225 },
{ "facts_count": 200, "tokens": 6891, "compression_ratio": 0.482 }
],
"timing": {
"pspan_analysis_seconds": 201.9,
"comprehensive_seconds": 688.2,
"coherence_analysis_seconds": 26.5,
"coherence_extraction_seconds": 0.1,
"semantic_clustering_seconds": 48.6,
"total_seconds": 965.3
}
}
Errors
| Status | Meaning |
|---|---|
| 202 | Experiment still processing (body contains current status) |
| 404 | Experiment not found |
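The 202 case deserves care in client code: it is a success status, so a check such as raise_for_status() in the requests library will not flag it, and the body is a status payload rather than a result. A minimal sketch of telling the cases apart, using a hypothetical helper that takes the status code and the parsed JSON body:

```python
def interpret_result(status_code, body):
    """Classify a /result response as ready or still processing.

    Returns a (state, body) tuple; raises for the documented 404 and for
    any status this document does not describe.
    """
    if status_code == 200:
        return ("ready", body)
    if status_code == 202:
        # Body holds the current status payload, not the experiment result.
        return ("processing", body)
    if status_code == 404:
        raise LookupError(body.get("detail", "Experiment not found"))
    raise RuntimeError(f"Unexpected status {status_code}: {body}")


state, payload = interpret_result(202, {"status": "PROCESSING", "phase": "COHERENCE"})
print(state, payload["phase"])
```

A caller can keep polling while the state is "processing" and only read fields such as semantic_facts once it is "ready".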
Get Top Facts
/api/omnifact/{experiment_id}/facts?top=N
Get just the top-N ranked semantic facts. This is the primary output endpoint.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| top | int | Optional | Number of top facts to return. Default: 50. |
Response (200)
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"input_tokens": 14297,
"facts_requested": 20,
"facts_returned": 20,
"total_facts": 541,
"cumulative_tokens": 786,
"compression_ratio": 0.055,
"facts": [
{
"rank": 1,
"element": "Fractional Flow Reserve (FFR)",
"detail": "FFR is defined as the ratio of maximal hyperaemic myocardial blood flow in the presence of a stenosis to the theoretical maximal flow in the absence of that stenosis.",
"frequency_pct": 77.78,
"cluster_size": 18,
"token_count": 63
},
{
"rank": 2,
"element": "CFD and Physics-Based Approaches",
"detail": "Full-order CFD computes coronary pressure and flow by solving the Navier-Stokes equations on a patient-specific coronary geometry reconstructed from imaging",
"frequency_pct": 75.0,
"cluster_size": 13,
"token_count": 38
}
]
}
Fact Fields
| Parameter | Type | Required | Description |
|---|---|---|---|
| rank | int | Optional | Position in frequency ranking (1 = most frequent) |
| element | string | Optional | The semantic concept extracted |
| detail | string | Optional | Explanation or context of the concept |
| frequency_pct | float | Optional | Percentage of 60 independent trials that surfaced this fact |
| cluster_size | int | Optional | Number of raw extractions merged into this fact |
| token_count | int | Optional | Token count of this fact |
Key response fields
- cumulative_tokens: Total tokens across all returned facts.
- compression_ratio: cumulative_tokens / input_tokens. How much of the original document this fact set represents by size.
- frequency_pct: 90%+ = extremely consistent extraction. 30% = appeared in roughly a third of trials.
- cluster_size: Number of raw element/detail pairs from different trials merged via semantic clustering at cosine threshold 0.85.
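The relationship between these fields can be checked with a few lines of arithmetic. The sketch below recomputes cumulative_tokens and compression_ratio from a list of facts; rounding to three decimals is an assumption that happens to match the sample values (786 / 14,297 ≈ 0.055), and the function name is hypothetical.

```python
def summarize_top_facts(facts, input_tokens, top):
    """Recompute the summary fields for the `top` highest-ranked facts."""
    selected = sorted(facts, key=lambda f: f["rank"])[:top]
    cumulative = sum(f["token_count"] for f in selected)
    return {
        "facts_returned": len(selected),
        "cumulative_tokens": cumulative,
        # Assumption: the API rounds the ratio to three decimal places.
        "compression_ratio": round(cumulative / input_tokens, 3),
    }


# The two facts shown in the sample response above (token counts only).
sample_facts = [
    {"rank": 2, "token_count": 38},
    {"rank": 1, "token_count": 63},
]
summary = summarize_top_facts(sample_facts, input_tokens=14297, top=2)
documented_ratio = round(786 / 14297, 3)
print(summary, documented_ratio)
```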
List Experiments
/api/omnifact/list
List all experiments for your API key.
Response (200)
{
"experiments": [
{
"experiment_id": "36058a73-ff9a-4777-9c13-677f88d040bc",
"status": "completed",
"sector": "medical",
"input_tokens": 14297,
"optimal_n": 882,
"semantic_fact_count": 541,
"phase": null,
"progress_pct": 100,
"created_at": "2026-02-20 11:13:54",
"completed_at": "2026-02-20 11:29:59"
}
]
}
Usage Statistics
/api/omnifact/usage
Metering data for your API key.
Response (200)
{
"key_name": "test-dev",
"today": {
"requests": 21,
"tokens_in": 14297,
"tokens_out": 16953,
"experiments_started": 1,
"experiments_completed": 1
},
"total": {
"requests": 42,
"tokens_in": 48034,
"tokens_out": 27983,
"experiments_started": 7,
"experiments_completed": 3
},
"daily": [
{
"date": "2026-02-20",
"requests": 21,
"tokens_in": 14297,
"tokens_out": 16953,
"experiments_started": 1,
"experiments_completed": 1
}
]
}
Content-Hash Deduplication
Hypernym deduplicates by SHA-256 of the input text. Submitting the same text twice returns the cached result immediately (200 instead of 202) with no reprocessing. Deduplication is global across all API keys — if any key has already processed this exact text, you get the cached result.
This only applies when the exact same input is submitted — there is no way to browse or discover other users' submissions. Do not submit text containing sensitive personal information (SSNs, medical records, credentials). Hypernym processes semantic structure, not identity data.
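To predict whether two submissions will deduplicate, a client can hash the text locally. The sketch below assumes the server hashes the UTF-8 bytes of the text exactly as submitted, with no normalisation; this document states only that SHA-256 is used, so treat a local match as a hint rather than a guarantee.

```python
import hashlib


def content_hash(text):
    """SHA-256 hex digest of the text (assumption: UTF-8 bytes, no normalisation)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = content_hash("abc")
print(digest)
# Any difference, even trailing whitespace, yields a different hash.
print(content_hash("abc") == content_hash("abc\n"))
```

Since only exact matches are deduplicated, trimming or re-wrapping a document before resubmission will start a new experiment.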
Experiment Statuses
| Status | Meaning |
|---|---|
| pending | Queued, waiting for a processing slot |
| running | Pipeline active (one of 4 phases) |
| completed | All phases done, facts available |
| failed | Pipeline error (see error field in status response) |
| cancelled | Intentionally stopped by user or administrator |
Sectors
The sector parameter hints at the domain of the input text. It affects how the compression pipeline interprets the content.
| Sector | Use For |
|---|---|
| code | Source code, configuration files, scripts |
| prose | General text, articles, essays (default) |
| legal | Contracts, regulations, legal documents |
| medical | Clinical papers, medical records, health data |
| academic | Research papers, dissertations, textbooks |
| financial | Financial reports, market analysis, filings |
Error Responses
All error responses follow this format:
{"detail": "Error message here"}
| Status | Meaning |
|---|---|
| 400 | Bad request — missing fields, invalid values, text too short/long |
| 401 | Invalid or missing API key |
| 404 | Experiment not found |
| 409 | Conflict — e.g., cancelling an already-terminal experiment |
| 202 | Returned by /result when experiment is still processing |
Configuration
| Parameter | Value | Description |
|---|---|---|
| P-Span sweep points | 20 | Number of compression levels sampled |
| Comprehensive trials | 60 | Independent stochastic compression runs |
| Clustering threshold | 0.85 | Cosine similarity threshold for merging facts |
| Embedding model | BAAI/bge-m3 | Used for coherence analysis and clustering |
| Max input | 500,000 chars | Maximum input text length |
| Min input | 10 chars | Minimum input text length |
| Timeouts | 30 min | P-Span and Comprehensive both allow up to 30 minutes |
End-to-End Walkthrough
Step 1: Verify the server is up
Step 2: Submit your text
Note the experiment_id (UUID) from the response.
Step 3: Poll until complete
Step 4: Get the facts
Step 5: Get full research data (optional)
Python Integration
import time

import requests


class HypernymClient:
    """Minimal client for the Hypernym omnifact endpoints."""

    def __init__(self, api_key: str, timeout: float = 30.0):
        self.api_key = api_key
        self.base_url = "https://zephyr.hypernym.ai"
        self.timeout = timeout  # per-request timeout in seconds, so a stalled call cannot hang forever
        self.headers = {
            "Content-Type": "application/json",
            "X-API-Key": api_key
        }

    def submit(self, text: str, sector: str = "prose") -> str:
        """Submit text and return experiment_id."""
        resp = requests.post(
            f"{self.base_url}/api/omnifact/begin",
            headers=self.headers,
            json={"text": text, "sector": sector},
            timeout=self.timeout
        )
        resp.raise_for_status()
        return resp.json()["experiment_id"]

    def poll(self, experiment_id: str, interval: int = 30) -> dict:
        """Poll until experiment completes. Returns final status."""
        while True:
            resp = requests.get(
                f"{self.base_url}/api/omnifact/{experiment_id}/status",
                headers=self.headers,
                timeout=self.timeout
            )
            resp.raise_for_status()
            data = resp.json()
            # Status casing differs between the examples in this document
            # ("COMPLETED" vs "completed"), so normalise before comparing.
            status = data["status"].upper()
            if status == "COMPLETED":
                return data
            elif status in ("FAILED", "CANCELLED"):
                raise RuntimeError(f"Experiment {status}: {data.get('error', '')}")
            phase = data.get("phase") or "?"  # phase can be null in the payload
            pct = data.get("progress_pct", 0)
            print(f"  {phase} {pct}%")
            time.sleep(interval)

    def get_facts(self, experiment_id: str, top: int = 50) -> dict:
        """Get top-N semantic facts."""
        resp = requests.get(
            f"{self.base_url}/api/omnifact/{experiment_id}/facts",
            headers=self.headers,
            params={"top": top},
            timeout=self.timeout
        )
        resp.raise_for_status()
        return resp.json()

    def get_result(self, experiment_id: str) -> dict:
        """Get full experiment result with all pipeline data."""
        resp = requests.get(
            f"{self.base_url}/api/omnifact/{experiment_id}/result",
            headers=self.headers,
            timeout=self.timeout
        )
        # A 202 (still processing) does not raise here; call poll() first.
        resp.raise_for_status()
        return resp.json()


# Usage
client = HypernymClient("your-api-key")
with open("document.txt") as fh:
    exp_id = client.submit(
        text=fh.read(),
        sector="medical"
    )
print(f"Experiment: {exp_id}")
client.poll(exp_id)
facts = client.get_facts(exp_id, top=20)
for f in facts["facts"]:
    print(f"  [{f['frequency_pct']:.0f}%] {f['element']}: {f['detail'][:80]}...")
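Since every fact carries a token_count, the ranked list returned by get_facts can also be trimmed to a token budget instead of a fixed fact count. The helper below is a hypothetical client-side sketch, not an API feature: it walks the facts in rank order and stops before the budget would be exceeded.

```python
def facts_within_budget(facts, max_tokens):
    """Return the longest rank-ordered prefix of facts that fits in max_tokens."""
    chosen, used = [], 0
    for fact in sorted(facts, key=lambda f: f["rank"]):
        if used + fact["token_count"] > max_tokens:
            break  # stop at the first fact that would overflow the budget
        chosen.append(fact)
        used += fact["token_count"]
    return chosen, used


# Token counts taken from the sample facts earlier in this document.
sample_facts = [
    {"rank": 1, "element": "Fractional Flow Reserve (FFR)", "token_count": 63},
    {"rank": 2, "element": "CFD and Physics-Based Approaches", "token_count": 38},
]
chosen, used = facts_within_budget(sample_facts, max_tokens=80)
print(len(chosen), used)
```

Stopping at the first overflow keeps the selection a strict prefix of the ranking, so the most frequent facts are never skipped in favour of shorter, lower-ranked ones.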