medium convictionactive · updated 2026-05-21T00:00:00.000Z

CUDA built early AI → non-CUDA silicon adoption → 2 of 3 frontier models drop CUDA + inference CUDA-irrelevant → Nvidia software moat narrows → NVDA re-rate risk

The standard NVDA bull case prices a durable CUDA software moat. A competitor CEO argues CUDA 'has no role whatsoever in inference' and that two of the three leading frontier models now train without it. If the moat binds only on a shrinking training slice, the moat-premium in NVDA's terminal multiple carries re-rating risk. Tradeable: bearish-leg on the CUDA-lock-in component of NVDA's valuation; bullish for non-CUDA silicon (TPU/Google, Trainium/AWS, wafer-scale/CBRS).

The chain

CUDA was decisive in creating the AI landscape and was a real lock-in moat 3-5 years ago.

andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s: "CUDA was really important in the creating of the AI landscape... what was true three or five years ago, in which CUDA had a dominant position."

CUDA has no lock-in at inference — moving a model from GPU to a non-Nvidia accelerator is near-frictionless.

andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s: "it's not important now and it has no role whatsoever in inference. If you want to move from running a model on GPUs today to running it on us, we can move it in 10 keystrokes. Just move point to our API."

Frontier-model training is migrating off CUDA: two of the three leading models (Gemini on TPUs, Anthropic on Trainium) use no CUDA; only GPT remains CUDA-trained.

andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s: "a year ago every major Frontier Lab model had been built on a Cuda foundation and today two of three haven't. So they lost 70% market share... Gemini built by Google on TPUs... Anthropics models trained on Trainium, no CUDA... two of the three leading models today use no CUDA. That's a hemorrhaging of share."

If CUDA's lock-in is gone at inference and shrinking at training, the software-moat premium embedded in NVDA's valuation is at risk — the durable-monopoly assumption narrows to a hardware-supply moat (CoWoS/HBM control), not software.

andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s: "what was true three or five years ago, in which CUDA had a dominant position with Central, has shrunk significantly and not important at all at inference and shrinking in its role in training."

What would falsify this

Step 3: A new frontier model leader re-adopts CUDA, or Gemini/Anthropic move significant training back onto GPUs/CUDA.
Step 4: NVDA sustains pricing power and gross margin even as CUDA share falls — indicating the moat was hardware-supply, not software, and the re-rate thesis is misdirected.

Contradictions / tensions

Source is a direct Nvidia competitor with a clear incentive to talk down the moat. The '70% / two-of-three' framing is directional, not an audited share figure.
GPT (OpenAI) still trains in CUDA; CUDA's training role is 'shrinking,' not gone.
Nvidia retains a hardware-supply chokehold (CoWoS, HBM) that CUDA-erosion does not touch — see cowos-packaging-capacity-crunch. The hardware moat may matter more than the software moat for the medium term.

Implications

Bearish-leg input for the software-lock-in component of NVDA's multiple. Note: only the software-moat premium is at risk on this thesis — Nvidia's hardware-supply moat (>50% CoWoS, HBM allocation) is independent and not eroding here.
Bullish for non-CUDA silicon: Google TPU (Alphabet), AWS Trainium (Amazon), Cerebras wafer-scale (CBRS).
Pairs with inference-demand-to-wafer-scale-advantage: as inference (where CUDA is irrelevant) becomes the bulk of compute, the moat's relevance shrinks with the workload mix.
AI-thread reading (non-tradeable): the same chain, read as an ecosystem fact rather than a valuation call, says the AI compute substrate is **fragmenting across vendors** (TPU / Trainium / wafer-scale / GPU), and that runtime portability ("10 keystrokes") removes lock-in at the inference layer — the hardware-side corroboration of llm-as-commodity-thesis and a marker of the diversifying open-vs-closed-source-model-economics landscape.

Companies

Nvidia Cerebras Andrew Feldman

Concepts

CUDA moat erosion at inference Open vs closed source model economics CoWoS packaging capacity crunch

Open questions

none