299B Helix × GPT-2 ▶ RECORDED RESULTS — real gate outputs, not live

The proof — what was checked, and how you can re-check it.

Four independent checks stand behind the demo. Each check is presented as a plain-language sentence, the real recorded number, and a description of exactly what ran and where it lives in the repo. Nothing here is live; these are the committed results of real runs, and one command reproduces the core on your machine in about a minute.

Key: anything marked with the Helix symbol is powered by Helix, that is, compiled from Helix source by kovc.
EXHIBIT A

The output under test

GPT-2-XL was given five words and asked for twenty more, running entirely on Helix-built kernels. This exact sentence is what every check below judges:

The capital of France is the city of Paris. It is the capital of France and the largest city in France. It is
THE FOUR CHECKS

Click any card for the full story

Each one attacks the question “did this really run correctly?” from a different angle. All four passed, fail-closed — a single mismatch turns the whole run red.
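A minimal sketch of what "fail-closed" means for a set of checks, written in Python for illustration. The names `CheckResult` and `run_fail_closed` are hypothetical and are not taken from the Helix repo; the only behaviour assumed is the one stated above, that a single mismatch turns the whole run red.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def run_fail_closed(
    checks: List[Tuple[str, Callable[[], bool]]]
) -> Tuple[bool, List[CheckResult]]:
    """Run every check; the run is green only if each one returns exactly True.

    Anything else (False, None, a raised exception) counts as a failure,
    which is what makes the gate fail-closed rather than fail-open.
    """
    results: List[CheckResult] = []
    for name, check in checks:
        try:
            outcome = check()
            passed = outcome is True
            detail = "" if passed else f"returned {outcome!r}"
        except Exception as exc:  # a crashing check must never read as a pass
            passed = False
            detail = f"raised {type(exc).__name__}: {exc}"
        results.append(CheckResult(name, passed, detail))
    # An empty check list is also red: nothing was actually verified.
    all_green = bool(results) and all(r.passed for r in results)
    return all_green, results
```

A caller would typically map the first return value to the process exit code, for example `sys.exit(0 if all_green else 1)`, so that scripts and CI see the red state too.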

THE FOUNDATION

The ladder under it all

Every tool was built only by the tool before it; no pre-built compiler is ever trusted. The sizes shown are those of the real committed binaries.

hex0 (299 B) → hex1 (622 B) → hex2 (1.5 KB) → catm (299 B) → M0 (1.7 KB) → cc_amd64 (18 KB) → M2-Planet (196 KB) → seed (61 KB) → kovc (682 KB)

Reproduce it yourself — no GPU, no weights, about a minute:

# clean checkout · CPU-only · fail-closed (exits red on any mismatch)
git clone https://github.com/Questeria/helix && cd helix
bash scripts/reproduce_trust.sh
# asserts: seed 9837db12 · fixpoint 0992dddd · DDC K1 84363adb
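The asserts line pins three short hex prefixes. As a hedged illustration of how such pins can be checked, the Python sketch below compares the leading hex digits of a digest against an expected prefix and reports every mismatch. It assumes SHA-256, which this page does not state, and the function names and dictionary labels are hypothetical; what `scripts/reproduce_trust.sh` actually does may differ.

```python
import hashlib
from typing import Dict, List

# Prefixes copied from the asserts line above. The labels are illustrative,
# and SHA-256 is an assumption: the page does not name the hash function.
PINNED_PREFIXES: Dict[str, str] = {
    "seed": "9837db12",
    "fixpoint": "0992dddd",
    "ddc_k1": "84363adb",
}


def digest_prefix(data: bytes, length: int = 8) -> str:
    """First `length` hex digits of the SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()[:length]


def verify_pins(artifacts: Dict[str, bytes], pins: Dict[str, str]) -> List[str]:
    """Return one message per mismatch; an empty list means every pin matched."""
    problems: List[str] = []
    for label, expected in pins.items():
        if label not in artifacts:
            # A missing artifact is a failure, never a silent skip.
            problems.append(f"{label}: artifact missing")
            continue
        actual = digest_prefix(artifacts[label], len(expected))
        if actual != expected:
            problems.append(f"{label}: expected {expected}, got {actual}")
    return problems
```

In use, `artifacts` would map each label to the bytes of the corresponding built file, and a non-empty result would make the script exit with a non-zero status.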
EVERY MODEL

Nothing ships without a gate

A model appears in this demo only after its output matched the independent oracle. The recorded results, per model:

| model | argmax | max score diff | tokens |
|---|---|---|---|
| GPT-2 124M · 12 layers | id 262, exact | 2.59e-04 | 25 / 25 |
| GPT-2-Large 774M · 36 layers | id 262, exact | 3.8e-05 | 25 / 25 |
| GPT-2-XL 1.5B · 48 layers | id 262, exact | 4.4e-05 | 25 / 25 |
| SmolLM2-135M · 30 layers · 2024 Llama arch | id 260, exact | 4.9e-05 over 49,152 | 25 / 25 |
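To make the three columns concrete, here is a small Python sketch of a gate that compares a candidate's scores and tokens against an oracle's. It is an illustration under stated assumptions, not the repo's gate: the function names are hypothetical, and the `tolerance` default is invented for the example because this page records the observed differences but not the threshold used.

```python
from typing import Dict, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (the first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two score vectors."""
    if len(a) != len(b):
        raise ValueError("score vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def gate(
    candidate_scores: Sequence[float],
    oracle_scores: Sequence[float],
    candidate_tokens: Sequence[int],
    oracle_tokens: Sequence[int],
    tolerance: float = 1e-3,  # hypothetical threshold, not taken from the page
) -> Dict[str, object]:
    """Compute the three table columns and an overall pass/fail verdict."""
    top = argmax(candidate_scores)
    argmax_exact = top == argmax(oracle_scores)
    diff = max_abs_diff(candidate_scores, oracle_scores)
    matched = sum(c == o for c, o in zip(candidate_tokens, oracle_tokens))
    # Using the longer length means a truncated sequence cannot pass.
    total = max(len(candidate_tokens), len(oracle_tokens))
    passed = argmax_exact and diff <= tolerance and total > 0 and matched == total
    return {
        "argmax_id": top,
        "argmax_exact": argmax_exact,
        "max_score_diff": diff,
        "tokens": f"{matched} / {total}",
        "passed": passed,
    }
```

The verdict is a conjunction on purpose: an exact argmax with one wrong token, or matching tokens with a score difference above the tolerance, still fails.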
THE EDGES

Honest residuals — said before you ask

Honest residuals: fp32 · verified to PTX, not SASS · single GPU (sm_86) · base models, not assistants · the oracle shares the model's spec. Every number on this page is a recorded, committed result.