Kettle Systems provides end-to-end provenance infrastructure that makes both questions answerable: document alteration is detectable, and every claim traces to its exact source.
Institutional and cultural memory is our first priority. Kettle Systems opens a collection to search and extraction while every claim stays bound to the exact source it came from — and any later alteration of that source is detectable. Records become usable without losing the chain back to the original.
Running today against a multi-corpus archive: the Turnbull National Wildlife Refuge record and the Spokane historical newspaper corpus. 5,400 documents, 56,499 claims, every one evidenced to its source.
Opposing counsel produces a document in discovery. Your team needs to know whether it has been altered since creation. You also need to know exactly where a disputed claim appears in the record.
An analyst delivers a conclusion. The first question is whether the underlying sources are intact. The second is whether the conclusion traces to specific evidence.
An AI system returns an answer from your document corpus. The question is whether that answer was extracted or hallucinated. The follow-up is which specific paragraph it came from.
Most systems address only one side of this problem: either document integrity or claim traceability. Kettle Systems handles both as a single guarantee.
Every document is cryptographically anchored at ingest. SHA-256 content addressing establishes a deterministic identity. Entropy-based forensic barcoding captures the file's structural fingerprint. Ed25519 identity signing binds authorship to the record.
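As a rough illustration of anchoring at ingest, the sketch below computes a SHA-256 content address and a simple per-block Shannon entropy profile. It is not Kettle's implementation: the function names, the 256-byte block size, and the use of an entropy profile as a stand-in for forensic barcoding are all assumptions made for this example. Ed25519 signing is left out because it needs a library outside the Python standard library.

```python
import hashlib
import math
from collections import Counter


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as the document's identity."""
    return hashlib.sha256(data).hexdigest()


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # Sum p * log2(1/p) over the byte values that occur in the block.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(data: bytes, block_size: int = 256) -> list[float]:
    """Assumed stand-in for a structural fingerprint: entropy of each block."""
    return [
        round(shannon_entropy(data[i:i + block_size]), 4)
        for i in range(0, len(data), block_size)
    ]


def anchor(data: bytes) -> dict:
    """Build an illustrative (hypothetical) anchor record at ingest time."""
    return {
        "sha256": content_address(data),
        "size": len(data),
        "entropy_profile": entropy_profile(data),
    }
```

Changing a single byte of the input produces a different `sha256` value, which is the property that makes later alteration detectable.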
Every claim extracted from a document is traced to its exact source sentence, paragraph, and page. A structured graph retrieval pipeline preserves that provenance chain through every stage of the answer.
Kettle Core unifies both layers. A claim is only valid if its source document is intact. A source document is only meaningful if its claims can be traced. Both conditions are enforced simultaneously. That enforcement holds from raw file to delivered answer.
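The two conditions can be sketched as a single check: a claim passes only if the source still hashes to the digest recorded at ingest and the cited span is found at the recorded location. This is a minimal sketch under stated assumptions, not Kettle Core itself. The `Claim` fields, the function names, and the use of byte offsets in place of sentence, paragraph, and page coordinates are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; field names are illustrative only."""
    text: str            # the extracted claim
    evidence: bytes      # supporting span, quoted verbatim from the source
    source_sha256: str   # digest of the source recorded at ingest
    start: int           # byte offset where the evidence begins
    end: int             # byte offset just past the evidence


def source_intact(source: bytes, expected_sha256: str) -> bool:
    """True if the source bytes still match the digest recorded at ingest."""
    return hashlib.sha256(source).hexdigest() == expected_sha256


def claim_is_valid(claim: Claim, source: bytes) -> bool:
    """Require both conditions: intact source and evidence at the cited span."""
    if not source_intact(source, claim.source_sha256):
        return False
    return source[claim.start:claim.end] == claim.evidence
```

In this sketch an altered source invalidates every claim that cites it, even when the cited span itself is unchanged, because the integrity check runs first.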
Anchor archival collections with tamper-evident provenance. Extract structured claims and relationships from the corpus. Make institutional knowledge searchable while maintaining the integrity chain back to original materials.
Anchor documents at production. Detect post-production modification at the byte level. Trace specific claims to their exact location. Provide forensic evidence that holds up under challenge.
Ensure every conclusion traces to intact source material. Enforce access controls at the claim level. Deploy entirely on-premises within classified or air-gapped environments.
Every answer in a Kettle-powered RAG system carries its source citation. That source can be verified as unmodified. Extraction is distinguished from hallucination structurally, not heuristically.
The system has been deployed against a multi-corpus archive spanning the Turnbull National Wildlife Refuge documentary record and the Spokane historical newspaper corpus. Every claim in the graph traces to a source via an evidence relationship. Source traceability is enforced by the store for every claim, not just for a subset.
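One way to make traceability a property of the store rather than a convention is to reject any claim that arrives without evidence. The sketch below shows that idea with an in-memory store. The class, method, and exception names are hypothetical and do not describe Kettle's actual graph schema.

```python
class MissingEvidenceError(ValueError):
    """Raised when a claim is submitted without a resolvable source."""


class ClaimStore:
    """Illustrative in-memory store that refuses untraceable claims."""

    def __init__(self) -> None:
        self._sources: dict[str, str] = {}        # source_id -> sha256 digest
        self._claims: dict[str, str] = {}         # claim_id -> claim text
        self._evidence: dict[str, set[str]] = {}  # claim_id -> source ids

    def add_source(self, source_id: str, sha256: str) -> None:
        self._sources[source_id] = sha256

    def add_claim(self, claim_id: str, text: str, evidence: list[str]) -> None:
        # Enforce the invariant at write time: no evidence, no claim.
        if not evidence:
            raise MissingEvidenceError(f"claim {claim_id!r} has no evidence")
        unknown = [s for s in evidence if s not in self._sources]
        if unknown:
            raise MissingEvidenceError(f"unknown sources: {unknown}")
        self._claims[claim_id] = text
        self._evidence[claim_id] = set(evidence)

    def sources_for(self, claim_id: str) -> set[str]:
        """Return the source ids that evidence the given claim."""
        return set(self._evidence[claim_id])

    def untraced_claims(self) -> list[str]:
        """Audit helper; stays empty as long as add_claim is the only writer."""
        return [c for c in self._claims if not self._evidence.get(c)]
```

Because the check happens on insert, an audit such as `untraced_claims()` serves as a confirmation rather than a cleanup step.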
The system deploys inside your environment, under your control. No data leaves your network.
All documented bugs were found through active deployment against real documents under real query load.
Three configuration files define what the pipeline knows about any collection. New domains are bootstrapped with automated tooling. Validation runs confirm coverage before full ingest.