The proof — what was checked, and how you can re-check it.

Four independent checks stand behind the demo. Each card below is one check: a plain-language sentence, the real recorded number, and — one click deeper — exactly what ran and where it lives in the repo. Nothing here is live; these are the committed results of real runs, and one command reproduces the core on your machine in about a minute.

Key: anything with this symbol is powered by Helix — compiled from Helix source by kovc.

EXHIBIT A

The output under test

GPT-2-XL was given five words and asked for twenty more, running entirely on Helix-built kernels powered by Helix. This exact sentence is what every check below judges:

The capital of France is the city of Paris. It is the capital of France and the largest city in France. It is

THE FOUR CHECKS

Click any card for the full story

Each one attacks the question “did this really run correctly?” from a different angle. All four passed, fail-closed — a single mismatch turns the whole run red.

THE FOUNDATION

The ladder under it all

Every tool was built only by the tool before it — no pre-built compiler is ever trusted. Hover any rung; the sizes are the real committed binaries.

hex0299 B→ hex1622 B→ hex21.5 KB→ catm299 B→ M01.7 KB→ cc_amd6418 KB→ M2-Planet196 KB→ seed61 KB→ kovc682 KB

Reproduce it yourself — no GPU, no weights, about a minute:

# clean checkout · CPU-only · fail-closed (exits red on any mismatch) git clone https://github.com/Questeria/helix && cd helix bash scripts/reproduce_trust.sh # asserts: seed 9837db12 · fixpoint 0992dddd · DDC K1 84363adb

EVERY MODEL

Nothing ships without a gate

A model appears in this demo only after its output matched the independent oracle. The recorded results, per model — hover the column titles for what they mean:

model	argmax	max score diff	tokens
GPT-2 124M · 12 layers	id 262 — exact	2.59e-04	25 / 25
GPT-2-Large 774M · 36 layers	id 262 — exact	3.8e-05	25 / 25
GPT-2-XL 1.5B · 48 layers	id 262 — exact	4.4e-05	25 / 25
SmolLM2-135M · 30 layers · 2024 Llama arch	id 260 — exact	4.9e-05 over 49,152	25 / 25

THE EDGES

Honest residuals — said before you ask

Complete to PTX, not SASS. Below the PTX layer, NVIDIA's closed assembler, driver and the C launcher are trusted-once.
fp32 only, single GPU (sm_86). One RTX 3070-class card; no other precision or hardware is claimed.
The oracle shares the spec. It's an independent implementation of GPT-2, not an independent specification — it catches implementation bugs, not shared misunderstandings.
~10 s/token live, by design. The recorded run took 195.5 s for 20 tokens (0.102 tok/s). Trust first; speed is roadmap.
The attestation's full hashes appear on a live run. This page shows the recorded prefixes (e.g. model file sha 248dfc39…, 548,105,171 bytes).

Honest residuals: fp32 · verified to PTX, not SASS · single GPU (sm_86) · base models, not assistants · the oracle shares the model's spec. Every number on this page is a recorded, committed result. start here · guided run · expert · proof · models

The proof — what was checked, and how you can re-check it.

The output under test

Click any card for the full story

The compiler rebuilt itself from 299 bytes

An independent oracle got the same answer

Run twice, identical to the byte

A signed attestation binds it together

The ladder under it all

Nothing ships without a gate

Honest residuals — said before you ask

The three anchors (recorded)

Recorded figures (GPT-2 124M leg)

Why it matters

What the record names