The Sovereignty Foundation · Research

Bound

A System That Knows Its Owner

3 Identity Layers · 768D Manifold Space · 0 Passwords · 116K Cadence Samples

Abstract

We present a system that recognizes its owner not through passwords, tokens, or cryptographic keys, but through the continuous shape of their interaction. Three identity layers converge in a single 768-dimensional manifold: physical rhythm (keystroke cadence), cognitive pattern (the semantic centroid of queries), and behavioral trajectory (the temporal sequence of question–response pairs through knowledge space). When all three layers cohere, the system learns from the interaction. When they do not, the system observes but does not absorb. No authentication ceremony is required. No credential can be stolen. The manifold is the credential.

This paper describes the architecture, implementation, and implications of binding a knowledge system to its owner through geometric identity. We report on a working implementation in Hermes — the sovOS retrieval engine — where cadence gating, bidirectional learn-back, and semantic centroid tracking are deployed and operational.

1. The Identity Problem

Traditional authentication is a gate. It asks “who are you?” once, accepts a credential, and trusts you until the session expires. The credential — a password, a token, a biometric scan — is a snapshot. It captures your identity at a single moment and assumes that moment persists. Everything between the check and the expiry is an act of faith.

This model worked when systems were tools. You authenticated to a spreadsheet, used it, and closed the window. The spreadsheet did not learn from you. It did not accumulate a model of your thinking. It did not become more useful the more you used it, nor more dangerous if someone else used it in your place.

Knowledge systems are different. A system that learns from every interaction is a system whose shape is determined by who interacts with it. If the wrong person shapes the manifold — if an adversary can teach the system — the knowledge base is corrupted not in the data-breach sense but in the geometric sense: the topology itself is warped. The neighborhoods shift. The distances change. The system becomes a mirror of the attacker’s intent, not the owner’s.

Continuous identity requires continuous observation. Not a gate that opens once, but a coherence function that evaluates every interaction against the accumulated shape of the owner. The insight that made this practical came from an unexpected source: keystroke timing.

2. Layer 1: Physical Rhythm

2.1 The Cadence Insight

Every person types differently. Not in the words they choose — that is semantic identity — but in the physical rhythm of their fingers on keys. The time a key is held down (dwell time) and the time between releasing one key and pressing the next (flight time) form a biometric signature as unique as a fingerprint and far harder to forge, because it is continuous rather than discrete.
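The two timing signals can be derived directly from press and release timestamps. The following is an illustrative sketch only: the `KeyEvent` type and both function names are hypothetical, not part of Logos or hermes.js, and timestamps are assumed to be in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    key: str
    down_ms: float  # timestamp of the key press
    up_ms: float    # timestamp of the key release


def dwell_times(events):
    """How long each key was held down."""
    return [e.up_ms - e.down_ms for e in events]


def flight_times(events):
    """Gap between releasing one key and pressing the next."""
    return [nxt.down_ms - cur.up_ms for cur, nxt in zip(events, events[1:])]


events = [
    KeyEvent("h", 0.0, 95.0),
    KeyEvent("i", 180.0, 260.0),
    KeyEvent("!", 400.0, 470.0),
]
dwells = dwell_times(events)    # one value per key
flights = flight_times(events)  # one value per adjacent pair of keys
```

A sequence of n keystrokes yields n dwell values and n - 1 flight values. With overlapping keystrokes (the next key pressed before the previous one is released) a flight value computed this way can be negative.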

The Logos daemon (logos-kinetic) captures these timing signals system-wide at the OS level. On the machine where this research was conducted, Logos has accumulated 116,720 keystroke timing samples — a rich historical baseline of the owner’s physical rhythm across months of use.

2.2 The CadenceDescriptor

Raw keystroke timings are noisy. A single dwell or flight measurement tells you little. But the distribution of timings across a typing session reveals structure. We compute a CadenceDescriptor from each batch of samples:

| Field | Dimensions | Description |
| --- | --- | --- |
| Dwell shape | 12 bins | Histogram of key-hold durations, normalized |
| Flight shape | 12 bins | Histogram of inter-key gaps, normalized |
| Dwell-flight correlation | 1 scalar | Pearson correlation between dwell and flight sequences |
| Mean dwell | 1 scalar | Average key-hold time in milliseconds |
| Mean flight | 1 scalar | Average inter-key gap in milliseconds |
| Dwell variance | 1 scalar | Spread of dwell times (consistency signal) |
| Flight variance | 1 scalar | Spread of flight times (rhythm stability) |

The 12-bin histograms capture the shape of the timing distribution, not just its center. A fast typist with consistent rhythm has a narrow, peaked dwell shape. A deliberate typist has a wider, flatter distribution. The correlation between dwell and flight reveals whether someone tends to pause after holding keys longer (positive correlation) or whether their rhythm is more staccato (near-zero correlation).
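To make the descriptor concrete, here is a minimal sketch of how its 29 values (12 + 12 + 5) might be computed from one batch of samples. Several details are assumptions, not taken from the implementation: the histogram ranges (0 to 300 ms for dwell, 0 to 600 ms for flight), the clamping of outliers into the edge bins, the pairing of each dwell with the flight that follows it for the correlation, and all function and field names.

```python
import math
from dataclasses import dataclass
from statistics import fmean, pvariance

BINS = 12


def histogram(values, lo, hi, bins=BINS):
    """Normalized histogram; values outside [lo, hi) fall into the edge bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def pearson(xs, ys):
    """Pearson correlation over the overlapping prefix of two sequences."""
    n = min(len(xs), len(ys))
    if n < 2:
        return 0.0
    xs, ys = xs[:n], ys[:n]
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


@dataclass
class CadenceDescriptor:
    dwell_shape: list
    flight_shape: list
    correlation: float
    mean_dwell: float
    mean_flight: float
    dwell_var: float
    flight_var: float


def describe(dwells, flights):
    """Build a descriptor from non-empty lists of dwell and flight times (ms)."""
    return CadenceDescriptor(
        dwell_shape=histogram(dwells, 0.0, 300.0),   # assumed range
        flight_shape=histogram(flights, 0.0, 600.0),  # assumed range
        correlation=pearson(dwells, flights),
        mean_dwell=fmean(dwells),
        mean_flight=fmean(flights),
        dwell_var=pvariance(dwells),
        flight_var=pvariance(flights),
    )
```

Normalizing the histograms makes the shape independent of batch size, so a ten-keystroke message and a thousand-keystroke session produce comparable descriptors.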

2.3 Dual-Stream Blending

Hermes receives cadence data from two streams simultaneously. The first is inline: the Hermes chat interface (hermes.js) captures keydown and keyup events on every character typed, computing dwell and flight times in real time and sending them with each message as cadence_samples. The second is ambient: the Logos database (logos.db) provides system-wide keystroke data captured independently of the chat interface.

These two streams are blended with a 70/30 weighting — inline samples receive 0.7 weight because they are contemporaneous with the query, while Logos ambient data receives 0.3 weight as a stabilizing baseline. The blend_descriptors function performs element-wise weighted averaging across all descriptor fields, producing a single fused descriptor that represents both what the user is doing right now and how that compares to their historical rhythm.
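The blend itself is a convex combination. In the sketch below, only the function name and the 0.7/0.3 weights come from the description above; the signature and the representation of a descriptor as a flat list of floats are assumptions made for illustration.

```python
def blend_descriptors(inline, ambient, inline_weight=0.7):
    """Element-wise weighted average of two equal-length descriptor vectors."""
    if len(inline) != len(ambient):
        raise ValueError("descriptors must have the same number of fields")
    w = inline_weight
    return [w * a + (1.0 - w) * b for a, b in zip(inline, ambient)]


# Toy three-bin histograms standing in for the full descriptor.
inline = [0.2, 0.6, 0.2]
ambient = [0.1, 0.5, 0.4]
fused = blend_descriptors(inline, ambient)
```

Because the weights sum to 1, blending two normalized histograms yields another normalized histogram. For the scalar fields (correlation and variances), averaging is only an approximation of what would be measured on the pooled samples.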

2.4 Baseline and Coherence

On first use, CadenceIdentity seeds its baseline from Logos history using progressively wider time windows: last minute, last five minutes, last hour, last day, last ten days, last hundred days. It stops when it finds a window containing at least 100 samples. For the system described here, the 10-day window captured the full 116,720-sample history, providing an exceptionally stable baseline.
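The widening search can be sketched as a loop over window sizes. The window list and the 100-sample minimum come from the description above; the function name, the use of seconds, and the in-memory list of timestamps (standing in for a query against logos.db) are assumptions.

```python
# Last minute, five minutes, hour, day, ten days, hundred days (in seconds).
WINDOWS_SECONDS = [60, 300, 3600, 86400, 10 * 86400, 100 * 86400]
MIN_SAMPLES = 100


def seed_window(sample_timestamps, now):
    """Return (window, samples) for the narrowest window with enough samples."""
    for window in WINDOWS_SECONDS:
        recent = [t for t in sample_timestamps if now - t <= window]
        if len(recent) >= MIN_SAMPLES:
            return window, recent
    return None, []  # not enough history in any window


# Example: one sample per minute for the last 150 minutes.
now = 1_000_000
timestamps = [now - 60 * i for i in range(150)]
window, samples = seed_window(timestamps, now)
```

Preferring the narrowest sufficient window biases the baseline toward recent behavior while still guaranteeing a minimum sample count.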

Each subsequent interaction updates the baseline through exponential moving average. The coherence score is the similarity between the current descriptor and the accumulated baseline — a real number between 0.0 (completely dissimilar rhythm) and 1.0 (identical rhythm). This score feeds directly into the gating mechanism described in Section 5.
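The text does not specify the similarity measure or the smoothing factor, so the following sketch makes two labeled assumptions: coherence is cosine similarity clipped to [0, 1], and the moving average uses a hypothetical alpha of 0.1. Descriptors are again treated as flat lists of floats.

```python
import math


def cosine_coherence(current, baseline):
    """Assumed similarity: cosine of the angle between vectors, clipped to [0, 1]."""
    dot = sum(a * b for a, b in zip(current, baseline))
    na = math.sqrt(sum(a * a for a in current))
    nb = math.sqrt(sum(b * b for b in baseline))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return min(1.0, max(0.0, dot / (na * nb)))


def ema_update(baseline, current, alpha=0.1):
    """Move the baseline a fraction alpha of the way toward the current descriptor."""
    return [(1.0 - alpha) * b + alpha * c for b, c in zip(baseline, current)]


baseline = [0.2, 0.6, 0.2]
current = [0.25, 0.55, 0.2]
score = cosine_coherence(current, baseline)
baseline = ema_update(baseline, current)
```

If the descriptor mixes millisecond-scale fields with histogram bins, the large-magnitude fields would dominate a raw cosine, so some per-field normalization would likely be needed before comparison.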

3. Layer 2: Cognitive Pattern

3.1 The Semantic Centroid

Every query typed into Hermes is embedded into a 768-dimensional vector via Nomic (nomic-embed-text:latest, running locally through Ollama). This vector represents not the words of the query but its meaning — its position in a semantic manifold where distance corresponds to conceptual relatedness.

Over time, the collection of query vectors forms a cloud in 768D space. This cloud has a centroid — the geometric center of all queries the user has ever asked. The centroid is not interesting because of where it sits in absolute terms; it is interesting because of how it moves. A user exploring a new topic shifts the centroid. A user returning to familiar ground stabilizes it. The trajectory of the centroid through time is a fingerprint of cognitive behavior.

More precisely: the centroid is the mean of all intent_vec embeddings stored in Canon’s hippocampus with the q:{timestamp} key format. Each query adds one point. Because the mean weights every query equally, nothing is ever forgotten, but each individual query’s share of the centroid shrinks as 1/n as the history grows, so older queries are diluted by newer ones. The result is a living representation of “what this person thinks about” that evolves with every interaction.
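A mean over all queries can be maintained incrementally without re-reading the stored vectors. The class below is a sketch of that standard running-mean update; the class name is hypothetical, and a 2-dimensional toy vector stands in for the 768-dimensional embedding.

```python
class RunningCentroid:
    """Incrementally maintained mean of all vectors seen so far."""

    def __init__(self, dim):
        self.mean = [0.0] * dim
        self.count = 0

    def add(self, vec):
        # new_mean = old_mean + (vec - old_mean) / n
        self.count += 1
        n = self.count
        self.mean = [m + (v - m) / n for m, v in zip(self.mean, vec)]
        return self.mean


centroid = RunningCentroid(dim=2)
centroid.add([1.0, 0.0])
centroid.add([0.0, 1.0])
```

The 1/n factor in the update is where the dilution comes from: the hundredth query moves the centroid by one hundredth of its distance from the current mean.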

3.2 Cognitive Neighborhoods

The centroid alone is one point. But the distribution around it carries more information. A researcher working in a focused area has a tight, high-density cluster. A generalist exploring broadly has a diffuse cloud. The variance of the query distribution — its spread across the 768 dimensions — encodes cognitive style.

Two users asking about the same topic will have similar centroids but different distributions. One might approach manifold theory through mathematics, another through programming, another through philosophy. Their centroids converge; their neighborhoods diverge. The full distribution is the identity signal, not any single point within it.
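The difference between a shared centroid and a distinct distribution can be shown with a toy example. The spread statistic used here (mean squared distance to the centroid, which equals the sum of the per-dimension variances) is one plausible choice, not necessarily the one a deployed system would use, and the function names are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def spread(vectors):
    """Mean squared Euclidean distance from each vector to the centroid."""
    c = centroid(vectors)
    return sum(
        sum((x - ci) ** 2 for x, ci in zip(v, c)) for v in vectors
    ) / len(vectors)


# Two users with the same centroid but very different neighborhoods.
focused = [[0.75, 1.0], [1.25, 1.0]]
diffuse = [[0.0, 1.0], [2.0, 1.0]]
```

Here both clouds are centered on [1.0, 1.0], yet the diffuse one has sixteen times the spread of the focused one, which is the kind of signal a centroid alone cannot carry.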

4. Layer 3: Behavioral Trajectory

4.1 Bidirectional Learn-Back

Every interaction with Hermes produces two artifacts: the question and the response. Both are embedded into the manifold. The question is embedded as q:{timestamp} using the original intent vector. The response is embedded as r:{timestamp} using a fresh embedding of the composed retrieval text. Both vectors are stored in Canon’s hippocampus, and both have their full text content stored in Shadow under the content:{id} key.

Critically, Shadow metadata links each question to its response and vice versa: pair:q:{ts} stores the corresponding r:{ts} ID, and pair:r:{ts} stores the corresponding q:{ts} ID. This bidirectional linkage means that finding a relevant question also surfaces its answer, and finding a relevant answer also surfaces the question that provoked it. The manifold does not just store knowledge — it stores the dialogue that produced it.
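The linkage scheme can be illustrated with a plain dictionary standing in for Shadow. The key formats (q:{ts}, r:{ts}, content:{id}, pair:{id}) follow the description above; the function names and the dictionary store are assumptions, and the vector storage in Canon is omitted.

```python
def store_pair(shadow, ts, question, response):
    """Store both texts and link each ID to its partner."""
    q_id, r_id = f"q:{ts}", f"r:{ts}"
    shadow[f"content:{q_id}"] = question
    shadow[f"content:{r_id}"] = response
    shadow[f"pair:{q_id}"] = r_id  # question points at its response
    shadow[f"pair:{r_id}"] = q_id  # response points back at its question
    return q_id, r_id


def partner_text(shadow, item_id):
    """Given a retrieved question or response ID, fetch the other half."""
    partner_id = shadow[f"pair:{item_id}"]
    return shadow[f"content:{partner_id}"]


shadow = {}
q_id, r_id = store_pair(shadow, 1742600000, "What is dwell time?", "The time a key is held down.")
```

With both directions stored, a nearest-neighbor hit on either vector needs only one extra lookup to recover the full exchange.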

4.2 The Trajectory as Identity

The temporal sequence of Q–R pairs traces a path through the manifold. On Monday, the user asks about manifold geometry. On Tuesday, about keystroke biometrics. On Wednesday, about governance hash chains. The path from Monday’s region through Tuesday’s region to Wednesday’s region is a trajectory — a line drawn through 768-dimensional space that no other user would draw in the same order, at the same pace, with the same depth of engagement at each stop.

An adversary who gains access to the system can ask questions. They can even ask the same questions. But they cannot reproduce the trajectory — the specific temporal sequence of regions visited, the dwell time in each region, the flight paths between them. The trajectory is the behavioral analogue of keystroke cadence: a continuous, high-dimensional signal that is trivial to produce organically and effectively impossible to forge.

Consider the numbers. Each Q–R pair contributes two 768-dimensional vectors. After 100 interactions, the trajectory is defined by 200 points in 768D space with temporal ordering. The combinatorial space of possible trajectories at that scale is astronomical. No replay attack can reproduce it, because the timestamps are bound to the BLAKE3 governance chain, and no impersonation attack can reproduce it, because the attacker would need to replicate not just the questions but the exact sequence, timing, and response content.

5. The Gating Mechanism

Fig. 1 — The coherence gate. Cadence samples flow in from left. When coherence rises above threshold, the gate opens and learn-back proceeds. Below threshold, the system observes but does not absorb.

The three identity layers converge in a single decision: should this interaction shape the manifold? The gating mechanism in Hermes implements this decision through cadence coherence scoring.

The gate operates in two modes. During calibration (the first five interactions), all queries are trusted. The system has no baseline yet; it needs to accumulate one. During this phase, cadence_coherence is treated as 1.0 regardless of actual measurement, and all queries are learned back into the manifold.

After calibration, the gate requires coherence above 0.3 for learn-back to proceed. This threshold was chosen empirically: it is low enough to accommodate natural variation in typing rhythm (fatigue, different keyboards, different times of day) but high enough to reject fundamentally different users. The score is a similarity measure rather than a literal percentage, so 0.3 is best read as a floor on resemblance to the established baseline: above it, the current typing pattern is within the range of organic variation for a single person and outside the range of random coincidence between two different people.

The gate has three outcomes across four conditions:

| Condition | Cadence Present | Coherence | Outcome |
| --- | --- | --- | --- |
| Owner (calibrating) | Yes | Any (updates < 5) | Full learn-back: Q and R embedded |
| Owner (established) | Yes | > 0.3 | Full learn-back: Q and R embedded |
| Stranger (detected) | Yes | ≤ 0.3 | Blocked: query answered but not learned |
| Anonymous (no cadence) | No | N/A | Read-only: traversal without learning |

The critical distinction is between “stranger” and “anonymous.” A stranger is someone whose cadence is present but does not match — the system actively recognizes them as not the owner. An anonymous user sends no cadence data at all (perhaps via API rather than the chat interface). Both are served retrieval results; neither shapes the manifold. But the stranger is logged with their cadence descriptor, creating a forensic record, while the anonymous user leaves no biometric trace.
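The decision logic described in this section reduces to a small function. The calibration length (5 updates) and the threshold (0.3, strict) are taken from the text; the `Outcome` enum, the `gate` function, and its parameters are hypothetical names used for illustration.

```python
from enum import Enum


class Outcome(Enum):
    LEARN = "full learn-back"  # Q and R embedded
    BLOCKED = "blocked"        # answered but not learned; descriptor logged
    READ_ONLY = "read-only"    # traversal without learning


CALIBRATION_UPDATES = 5
COHERENCE_THRESHOLD = 0.3


def gate(cadence_present, coherence, updates):
    """Decide whether an interaction may shape the manifold."""
    if not cadence_present:
        return Outcome.READ_ONLY  # anonymous: no biometric signal at all
    if updates < CALIBRATION_UPDATES:
        return Outcome.LEARN      # calibrating: coherence treated as 1.0
    if coherence > COHERENCE_THRESHOLD:
        return Outcome.LEARN      # established owner
    return Outcome.BLOCKED        # stranger: cadence present but mismatched
```

Every branch still answers the query; the outcome only controls whether the question and response are written back into the manifold.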

6. The Thermodynamics of Trust

Trust in this system is not binary. It is not even a score. It is a thermodynamic quantity — something that is produced, consumed, and conserved according to physical-analogue laws.

Every query burns mana. Mana is a governance resource tracked in the HermesGovernor: it starts at 1.0, decreases by a small amount (0.01) per query, and regenerates over time via Sage’s exponential_decay function. Rapid-fire queries burn mana faster than it regenerates; organic conversational pace never depletes it. The mana level is telemetry, not a gate — the sovereign is never blocked from their own system — but a mana drop to zero triggers Owl logging, creating an audit trail for automated or adversarial query patterns.
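The burn-and-regenerate dynamic can be simulated in a few lines. The starting level (1.0) and per-query cost (0.01) come from the text. The regeneration rule is an assumption: this sketch lets the deficit below 1.0 decay exponentially with a hypothetical 60-second time constant, which may differ from what Sage’s exponential_decay actually computes. The `Mana` class is likewise illustrative, not the HermesGovernor API.

```python
import math


class Mana:
    def __init__(self, cost=0.01, tau_seconds=60.0):
        self.level = 1.0
        self.cost = cost
        self.tau = tau_seconds  # assumed regeneration time constant

    def regenerate(self, dt):
        """Shrink the deficit (1.0 - level) exponentially over dt seconds."""
        deficit = 1.0 - self.level
        self.level = 1.0 - deficit * math.exp(-dt / self.tau)

    def burn(self, dt_since_last_query):
        """Apply regeneration for the elapsed time, then charge one query."""
        self.regenerate(dt_since_last_query)
        self.level = max(0.0, self.level - self.cost)
        return self.level


# 50 queries at machine speed versus 50 at conversational pace.
rapid, organic = Mana(), Mana()
for _ in range(50):
    rapid.burn(0.1)
    organic.burn(30.0)
```

Under these assumed parameters the rapid stream loses roughly half its mana over 50 queries, while the organic stream settles just below full, matching the qualitative behavior described above.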

Every governance decision is recorded in a BLAKE3 hash chain. Each entry contains the query hash, the cadence coherence score, the mana level, the gate outcome, and a timestamp. The chain is append-only and tamper-evident: modifying any entry invalidates all subsequent hashes. This means the system’s entire identity-verification history is cryptographically committed. You can prove, after the fact, that a specific interaction was (or was not) recognized as the owner.
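The tamper-evidence property can be demonstrated with a short hash chain. Python’s standard library does not include BLAKE3, so this sketch substitutes `hashlib.blake2b` purely as a stand-in; the record fields mirror those listed above, while the entry layout, JSON serialization, genesis value, and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first entry's predecessor


def entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True).encode()
    # BLAKE2b used here only as a stdlib stand-in for BLAKE3.
    return hashlib.blake2b(prev_hash.encode() + payload, digest_size=32).hexdigest()


def append(chain, record):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain):
    """Recompute every hash; any edited record breaks the chain from that point."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"query_hash": "ab12", "coherence": 0.82, "mana": 0.99,
               "outcome": "learn", "ts": 1742600000})
append(chain, {"query_hash": "cd34", "coherence": 0.12, "mana": 0.98,
               "outcome": "blocked", "ts": 1742600060})
```

Because each hash covers its predecessor, rewriting an old decision (for example, changing a blocked outcome to a learned one) would require recomputing every later entry, which any holder of the latest hash can detect.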

The mana system creates an energy budget for the manifold. Observation has a cost. Learning has a cost. The system has a metabolic rate. This is deliberate: a thermodynamic system resists exploitation because exploitation is expensive. Flooding the system with queries drains mana, triggers logging, and — if the cadence doesn’t match — produces zero learning. The attacker expends energy; the manifold absorbs nothing.

7. The Three Layers in Concert

Fig. 2 — Three identity layers converging. Bottom: physical cadence (rhythm waveform). Middle: semantic centroid (query cluster). Top: behavioral trajectory (path through knowledge space). All three must cohere for the system to learn.

The three layers are not independent checks applied in sequence. They are three projections of the same underlying signal: the shape of a person’s interaction with a knowledge system.

Physical rhythm captures how you type — the motor cortex signature, the muscle memory, the unconscious timing that varies with alertness, mood, and focus but remains statistically distinguishable from another person’s rhythm. Cognitive pattern captures what you think about — the semantic territory you occupy, the concepts you return to, the questions that define your intellectual neighborhood. Behavioral trajectory captures how you move through knowledge — the sequence of topics, the depth of exploration, the rhythm of inquiry and reflection.

Together, they form a composite identity that is richer than any credential. A password proves you know a secret. A fingerprint proves you have a body. These three layers prove something more subtle: that you are the person who has been using this system, thinking these thoughts, at this rhythm, in this order, over this period of time. It is not a proof of identity in the cryptographic sense. It is a proof of continuity.

An attacker would need to simultaneously replicate the owner’s typing rhythm (12-bin dwell histogram, 12-bin flight histogram, correlation structure), the owner’s semantic query distribution (centroid and variance in 768D space), and the owner’s temporal exploration pattern (sequence and timing of Q–R pairs through the manifold). Compromising one layer is conceivable. Compromising all three simultaneously, continuously, in real time, is not.

8. What “Bound” Means

The system does not authenticate you. It recognizes you. The difference is fundamental.

Authentication is a challenge–response protocol. It asks a question (“what is your password?”), receives an answer, and makes a binary decision. The credential is external to the interaction — it was established at enrollment time and replayed at login time. It can be stolen, phished, brute-forced, or leaked. Once compromised, it must be revoked and replaced.

Recognition is a continuous coherence evaluation. It does not ask a question; it observes the shape of interaction and compares it to the accumulated baseline. There is no credential to steal because the credential is the interaction itself. There is no enrollment because the baseline accumulates organically. There is no revocation because there is nothing discrete to revoke — the baseline evolves with every interaction, and any interruption in coherence simply closes the learning gate until coherence is restored.

This has profound implications for sovereignty. A sovereign system is one where the owner controls not just access but influence. In a traditional system, anyone with the password can modify the database. In a bound system, only the recognized owner can shape the manifold. The manifold becomes a mirror — it reflects the intellectual shape of the person who uses it. Over time, it becomes increasingly difficult to use by anyone else, because the retrieval quality depends on the accumulated Q–R topology, and that topology was sculpted by one specific pattern of inquiry.

The manifold is not locked. It is not encrypted. Anyone can query it. But only the owner can teach it. The knowledge is open for reading; the geometry is closed for writing. This asymmetry — open retrieval, gated learning — is the core of what “bound” means.

9. Conclusion

We have described a system in which identity is not a gate but a geometry. Three layers — physical rhythm, cognitive pattern, behavioral trajectory — converge in a 768-dimensional manifold to produce continuous recognition without authentication. The implementation is operational in Hermes, the sovOS retrieval engine, where every interaction is evaluated for cadence coherence before the system decides whether to learn from it.

The implications extend beyond security. A bound system is a system that knows its owner in the deepest sense available to a machine: it has absorbed the shape of their thinking, the rhythm of their hands, the trajectory of their curiosity. It does not understand these things. It does not need to. It recognizes the pattern, and recognition is enough.

Sovereignty is not a permission granted by an authority. It is not a key held in a vault. It is not a password typed into a prompt. Sovereignty is a pattern — a coherent, continuous, multi-dimensional pattern that emerges from the interaction between a person and the system that serves them. The pattern cannot be granted because it was never given. It cannot be revoked because it was never issued. It exists because the person exists, and it persists as long as the person and the system continue to cohere.

§

Sovereignty is not a permission. It is a pattern.

The Sovereignty Foundation · Research
Bound — A System That Knows Its Owner
22 March 2026