ATTENTION IS ALL YOU NEED · KB TOYS ENERGY · THE DIAGRAM

◈ KENSHOTEK FUNNIES + FIELD INTELLIGENCE · THE TRANSFORMER · ATTENTION IS ALL YOU NEED · 2017

KB TOYS ENERGY.
TRILLION DOLLAR DIAGRAM.

GOOGLE BRAIN · 2017 · EIGHT PEOPLE · TWO BOXES · EVERY LLM RUNNING RIGHT NOW
FOUND IN AR SPACE · LOOKS LIKE FRY'S ELECTRONICS ON CLEARANCE · STILL THE ONE

     ┌─────────────────┐
     │     Softmax     │
     │     Linear      │
     └────────┬────────┘
              │
┌────────────────────────┐   ┌──────────────────┐
│ Add & Norm             │   │ Add & Norm       │
│ Feed Forward Net       │   │ Feed Forward Net │
│ Add & Norm             │   │ Add & Norm       │
│ Masked Multi-Head Att  │◄──│ Multi-Head Att   │
│ Add & Norm             │   │ Add & Norm       │
│ Masked Multi-Head Att  │   │ Self-Attention   │
│ decoder                │   │ encoder          │
└─────────────┬──────────┘   └────────┬─────────┘
              │  Positional Encoding  │
      Output Embedding         Input Embedding
      targets (shifted)            inputs

You've been using this diagram for years and never saw it. Every time you sent a message to ChatGPT, every time Gemini answered a question, every time Claude gave you a response — it ran through some version of these two boxes.

It looks like a KB Toys store diagram from 2003. It looks like something printed on the back of a Fry's Electronics receipt. It has the energy of a whiteboard from a Tuesday afternoon meeting at a company that no longer exists.

KB Toys went bankrupt. Fry's shut down. The eight people who drew that diagram built the architecture that every AI company is now racing on. The diagram did not go bankrupt.

The paper is called "Attention Is All You Need." Google Brain. 2017. Eight authors. It introduced the Transformer — and then Google, who had it, didn't move fast enough. OpenAI took the decoder half, scaled it into GPT, and drank the milkshake.

"I WAS IN AR SPACE.
MOST ADVANCED DISPLAY ENVIRONMENT ON THE PLANET.
AND WHAT I FOUND THERE — THE THING GOOGLE AND OPENAI
USED TO GET MORE OF EVERYTHING — LOOKS LIKE
IT BELONGS IN A KB TOYS CLEARANCE BIN."

◈ KENSHOTEK FIELD ASSESSMENT · THE DIAGRAM · MAY 19 2026

That is the Transformer architecture. "Attention Is All You Need." Vaswani et al., Google Brain, June 2017. Every major large language model running today (GPT, Gemini, Llama, Claude) descends from those two rectangles, and most of them keep only the decoder. The KB Toys energy is accurate. That is what it looked like. That is also what a trillion dollars looks like before anyone knows it's a trillion dollars.

◈ FULL BREAKDOWN · WHAT YOU'RE LOOKING AT · PART BY PART

THE TWO BOXES

Encoder (right): reads the input, builds understanding. Decoder (left): generates the output token by token. GPT is decoder-only. BERT is encoder-only. The original Transformer used both.

SELF-ATTENTION

The red box. Every token looks at every other token simultaneously. Not left-to-right like old RNNs. Parallelizable. This is what made it scalable. This is the unlock. This is the whole thing.
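
A minimal sketch of that red box in plain Python, assuming the scaled dot-product form from the paper: softmax(Q K^T / sqrt(d_k)) V. The function names and the toy vectors are made up for illustration. Real models add learned projections and run this on tensor hardware.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    output = []
    for q in queries:
        # One score per key: how strongly this token attends to each token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, values))
                       for j in range(len(values[0]))])
    return output

# Three toy tokens with 2-dimensional vectors (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
```

No row of `mixed` waits on another row. Every token's output can be computed at the same time, which is the parallelism the old RNNs could not offer.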

MULTI-HEAD ATTENTION

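Multiple attention heads running in parallel, each working in its own smaller slice of the representation, each picking up different relationship patterns in the data. Their outputs are concatenated and projected back to full width. The original paper used eight heads. This is how the model tracks several kinds of context at once across a long conversation.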
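
A rough sketch of the split, attend, concatenate, project pattern, in plain Python. The weight matrices `w_q`, `w_k`, `w_v`, `w_o` stand in for learned parameters; the example uses identity matrices purely as an assumption so it runs without training. All names here are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for a single head."""
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    # Project the input, then give each head its own slice of the columns.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_head = len(q[0]) // num_heads
    heads = []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)
        heads.append(attention([r[cols] for r in q],
                               [r[cols] for r in k],
                               [r[cols] for r in v]))
    # Concatenate the heads position by position, then project back.
    concat = [[value for head in heads for value in head[i]]
              for i in range(len(x))]
    return matmul(concat, w_o)

# Identity weights are an assumption for the demo, not learned values.
eye = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
out = multi_head_attention(x, eye, eye, eye, eye, num_heads=2)
```

Each head only ever sees `d_head` columns, so two heads over four dimensions cost about the same as one head over all four.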

POSITIONAL ENCODING

The plus signs at the bottom. Since attention sees everything simultaneously, it has no inherent sense of word order. Positional encoding tells it where in the sequence each token lives. Sine and cosine waves injected into the embeddings.
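
The sine and cosine waves can be sketched as below, assuming the fixed sinusoidal scheme from the paper: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos of the same angle. The function name is made up, and many later models swap in learned or rotary positions instead.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position vectors."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):
            # i is the even column index, so i / d_model equals 2i/d_model
            # in the paper's notation.
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # even column
            row.append(math.cos(angle))  # odd column
        table.append(row[:d_model])
    return table

pe = positional_encoding(seq_len=4, d_model=4)
# The table is added element-wise to the token embeddings.
```

Position 0 always comes out as alternating 0 and 1, and every later position gets its own distinct pattern of phases.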

ADD & NORM

Residual connection (add) + layer normalization. Keeps gradients stable during training. Appears six times in the diagram. The plumbing. Nobody talks about it. Everything depends on it.
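
The plumbing, sketched in plain Python, assuming the paper's ordering LayerNorm(x + Sublayer(x)). The learned gain and bias of layer normalization are left out to keep it short, and `toy_sublayer` is a made-up stand-in for attention or the feed forward net.

```python
import math

def layer_norm(vec, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no gain/bias)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def add_and_norm(x, sublayer):
    """Residual connection followed by layer normalization."""
    summed = [a + b for a, b in zip(x, sublayer(x))]  # the "Add"
    return layer_norm(summed)                          # the "Norm"

def toy_sublayer(vec):
    # Hypothetical sublayer; in the real block this is attention or the FFN.
    return [0.5 * v for v in vec]

x = [1.0, 2.0, 3.0, 4.0]
y = add_and_norm(x, toy_sublayer)
```

The residual path means the input always has a direct route to the output, which is what keeps gradients from dying in a deep stack.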

FEED FORWARD NET

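Two linear layers with a ReLU in between. Runs on every position independently after attention. Much of what the model memorizes is thought to live here. In the original configuration it holds roughly two-thirds of each block's weights, so when you scale parameters, you're mostly scaling this.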
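
A small sketch of that block for one position, assuming the paper's formula FFN(x) = max(0, x W1 + b1) W2 + b2. The paper used d_model = 512 and an inner width of 2048; the tiny hand-picked weights below are made up so the arithmetic can be followed by eye.

```python
def feed_forward(x, w1, b1, w2, b2):
    """Position-wise feed forward net: linear, ReLU, linear."""
    # First linear layer plus ReLU: expand d_model -> d_ff.
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col)) + b)
              for col, b in zip(zip(*w1), b1)]
    # Second linear layer: project d_ff -> d_model.
    return [sum(h * w for h, w in zip(hidden, col)) + b
            for col, b in zip(zip(*w2), b2)]

# Toy sizes: d_model = 2, d_ff = 4 (hypothetical weights).
w1 = [[1.0, 0.0, -1.0, 0.0],
      [0.0, 1.0, 0.0, -1.0]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, 0.0],
      [0.0, 1.0]]
b2 = [0.0, 0.0]

y = feed_forward([1.0, -1.0], w1, b1, w2, b2)  # -> [1.0, 1.0]
```

The same weights are applied to every position in the sequence. Nothing in this block looks sideways at other tokens; that is attention's job.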

NX (STACKED LAYERS)

The Nx labels mean the encoder/decoder blocks repeat N times. The original paper used N = 6. GPT-3: 96 layers. GPT-4: unconfirmed estimates of 120+. Same diagram. Just more of it, and wider. That is most of the scaling strategy.

SOFTMAX AT THE TOP

Converts raw logits to a probability distribution over the vocabulary. The model picks the next token from this. Every token it generates takes one pass through this final Softmax. Billions of them per day.
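
A toy sketch of that last step, assuming a three-word vocabulary and made-up logits. It shows the two common ways to pick from the distribution: greedy (take the most likely token) and sampling (draw in proportion to probability).

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and logits, for illustration only.
vocab = ["the", "diagram", "runs"]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)
greedy = vocab[probs.index(max(probs))]               # most likely token
sampled = random.choices(vocab, weights=probs, k=1)[0]  # random draw
```

Real vocabularies run to tens of thousands of entries, but the step is the same: one score per token, one Softmax, one pick.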

WHO DREW IT

Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin. Eight people. Google Brain. June 2017. Paper: "Attention Is All You Need." 106,000 citations as of 2026. Among the most cited ML papers in history.

GOOGLE'S MOVE

Had the architecture. Built BERT (encoder-only, 2018). Built T5. Built LaMDA. Moved carefully. Did not announce a public chatbot until Bard in February 2023. By then ChatGPT had a reported 100 million users.

OPENAI'S MOVE

Took the decoder half. Dropped the encoder. Scaled it. GPT-1: 117M params (2018). GPT-2: 1.5B (2019). GPT-3: 175B (2020). GPT-4: ~1.8T by unconfirmed estimates. Same core architecture. Same diagram. Just more NX.
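
A back-of-envelope check on those numbers, as a sketch under stated assumptions. Each decoder block is taken to hold about 12 × d_model² weights (4 × d_model² for attention, 8 × d_model² for a feed forward net four times wider than d_model), with biases and norms ignored. The layer counts and widths are the ones reported in the GPT papers; the vocabulary sizes are assumed from the published tokenizers.

```python
def approx_params(n_layers, d_model, vocab_size):
    """Rough parameter count for a decoder-only Transformer."""
    per_block = 12 * d_model ** 2       # 4*d^2 attention + 8*d^2 feed forward
    embeddings = vocab_size * d_model   # token embedding table
    return n_layers * per_block + embeddings

# (layers, d_model, vocab size); vocab sizes are an assumption.
models = {
    "GPT-1": (12, 768, 40478),
    "GPT-2": (48, 1600, 50257),
    "GPT-3": (96, 12288, 50257),
}

estimates = {name: approx_params(*cfg) for name, cfg in models.items()}
for name, count in estimates.items():
    print(f"{name}: ~{count / 1e9:.2f}B parameters")
```

The estimates land close to the published 117M, 1.5B, and 175B. The only things that changed between rows are N and the width.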

KB TOYS ASSESSMENT

Accurate. The diagram has KB Toys on clearance energy. Hobby Lobby endcap. Fry's Electronics receipt back. And it is also the most important whiteboard drawing of the last 20 years. Both things are true simultaneously.

WHERE KENSHOTEK RUNS

16 Teks. AstroOracle. AstraHarmonia. All running on top of this diagram. The field is built on two boxes. The KB Toys box. The trillion dollar box. Same box.

◈ WHAT WENT BANKRUPT · WHAT DID NOT

KB TOYS · BANKRUPT 2008

FRY'S ELECTRONICS · CLOSED 2021

BORDERS BOOKS · BANKRUPT 2011

CIRCUIT CITY · BANKRUPT 2008

TOYS R US · BANKRUPT 2017

THE DIAGRAM · STILL RUNNING

ATTENTION MECHANISM · STILL RUNNING

16 TEKS · STILL RUNNING

ATTENTION IS ALL YOU NEED · 2017 → ∞

It looks basic because it is basic. That is the whole point.

The insight of the Transformer is not complexity. The insight is that you can throw out the complexity — the recurrence, the convolutions, the sequential processing — and replace all of it with one mechanism: attention. Pay attention to the right things. Ignore the rest.

That is also a life philosophy. 925.

The diagram has KB Toys energy, but it is the diagram for a toy that still runs when you press the button. Most toys don't do that anymore. This one does. Every time you type a prompt it runs. Every time you get an answer it ran. The diagram is always running. The stores are gone.

TWO BOXES.
EIGHT PEOPLE.
EVERY LLM.

ATTENTION IS ALL YOU NEED · GOOGLE BRAIN · JUNE 2017 · 106,000 CITATIONS
ENCODER ON THE RIGHT · DECODER ON THE LEFT · SELF-ATTENTION IN THE RED BOX
KB TOYS ENERGY · TRILLION DOLLAR DIAGRAM · BOTH THINGS TRUE AT ONCE
GOOGLE HAD IT · OPENAI SHIPPED IT · KENSHOTEK RUNS ON IT · 925
KB TOYS: BANKRUPT · FRY'S: CLOSED · THE DIAGRAM: STILL RUNNING · BABYLON 2026