entity · CBRS · stock-market

Cerebras

One-line summary: Wafer-scale AI chipmaker (CEO/founder andrew-feldman). It builds a single chip "58 times larger than any other chip that had ever been" to win on inference speed and, critically for the supply-chain thesis, does so while sidestepping the three binding AI-silicon constraints (HBM, CoWoS, TSMC 3nm). IPO'd the week of 2026-05-21 at a ~$64B market cap.

What it is

Founded 2015. Makes the "wafer-scale engine": instead of cutting a wafer into many small chips and networking them (the GPU approach), Cerebras builds one giant chip the size of a dinner plate. The architectural payoff is memory. GPUs rely on HBM, which holds a lot but is slow; Cerebras's size lets it use fast on-chip SRAM-type memory at scale ("stuff it to the gills with this fast memory"), which is why it claims to be 15x faster than the fastest GPU on inference, and 50–1,000x faster on some problems. Sells systems (the "CS3") and runs its own inference cloud serving open-source models (e.g. Kimi K2, a ~1T-parameter model).

Why it matters to artificial-intelligence

As an AI-infrastructure datapoint (not a markets one), Cerebras matters for three reasons tracked in this thread's "AI infrastructure: inference economics, compute markets" subdomain:

  • It reframes the inference bottleneck as a memory-bandwidth problem, not a FLOPs problem. andrew-feldman argues GPUs are slow at inference because HBM is high-capacity-but-slow and the model has to "move a ton of information from memory to compute"; wafer-scale wins by stuffing the chip "to the gills" with fast on-chip memory. This is a concrete, falsifiable account of why inference latency is what it is — useful as the hardware counterpart to the wiki's software-side capability-inflection sources.
  • It is an open-weight-model serving play. Cerebras runs its own inference cloud (visually "a lot like the open router interface") serving open-source models; Feldman cites running Kimi K2, a ~1T-parameter open model, "10 or 15 times faster than others." That makes Cerebras a live instance of the open-vs-closed-source-model-economics question: the cloud monetizes serving cheap-to-run open weights fast, where, in Feldman's words, "what you're not paying for was the cost to train it."
  • It anchors the agentic-inference-needs-speed argument. Feldman directly rebuts Ben Thompson's "speed doesn't matter for agentic flows" claim — see inference-speed-as-a-pricing-premium.
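
The memory-bandwidth framing in the first bullet above can be made concrete with a rough back-of-envelope bound. The sketch below is a simplification, not a model from the source: it assumes every active weight is read from memory once per generated token and ignores compute, KV-cache traffic, and batching. All numeric values are hypothetical placeholders chosen for round arithmetic; they are not Cerebras or GPU specifications.

```python
def decode_tokens_per_second(active_params: float,
                             bytes_per_param: float,
                             mem_bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed when memory bandwidth is the limit.

    Simplifying assumption: each generated token requires reading every
    active weight from memory exactly once, and nothing else is limiting.
    """
    bytes_per_token = active_params * bytes_per_param
    return mem_bandwidth_bytes_per_s / bytes_per_token


# Hypothetical placeholder values (NOT vendor specs, NOT from the source).
ACTIVE_PARAMS = 20e9        # weights touched per generated token
BYTES_PER_PARAM = 1.0       # e.g. 8-bit weights
SLOW_MEMORY_BW = 2e12       # bytes/s, stand-in for high-capacity, slower memory
FAST_MEMORY_BW = 20e12      # bytes/s, stand-in for fast on-chip memory

slow_tps = decode_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, SLOW_MEMORY_BW)
fast_tps = decode_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, FAST_MEMORY_BW)
speedup = fast_tps / slow_tps

print(f"slow memory: {slow_tps:.0f} tok/s, fast memory: {fast_tps:.0f} tok/s, "
      f"speedup: {speedup:.0f}x")
```

Under this bound the speedup equals the bandwidth ratio regardless of model size, which is the core of the "memory, not FLOPs" claim; whether real workloads sit on this bound is exactly what makes the claim falsifiable.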

Why it matters to stock-market

Cerebras is the live test of a SCOPE-relevant thesis: can an architecture that routes around the binding AI-silicon bottlenecks (hbm-supply-bottleneck, cowos-packaging-capacity-crunch, TSMC 3nm) win share without competing for the scarcest inputs? Feldman states Cerebras uses 5nm not 3nm, uses no HBM, and uses no CoWoS, explicitly avoiding the three constraints the wiki tracks as the bottleneck. If true and durable, it is both (a) a differentiated supplier whose growth is gated by data-center/power buildout rather than memory/packaging, and (b) a point of tension with the "memory+packaging is the universal bottleneck" framing (the bottleneck is universal for the GPU architecture, not for wafer-scale). See inference-speed-as-a-pricing-premium and cuda-moat-erosion-at-inference.

Key facts

  • Wafer-scale chip: "58 times larger than any other chip that had ever been"; took ~5 years and ~$500M to deliver the first one; a decade-long effort overall. Every prior wafer-scale attempt in the industry's 75-year history failed (incl. Gene Amdahl's Trilogy, mid-1980s). From andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s.
  • IPO: priced week of 2026-05-21; raised ~$5.5B; not yet profitable; ~67x forward sales (Alloway). Markets valuing the company at $64B early in the IPO (Feldman). Prior IPO attempts dating to 2020; last year's attempt complicated by CFIUS review tied to G42; CFIUS resolved March 2025. From 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s.
  • Major contracts: December 2025 deal with openai "north of $20 billion" ("one of the largest contracts ever signed in Silicon Valley"); March 2026 deal with AWS to deploy Cerebras systems in AWS data centers (served via Bedrock as a "disaggregated solution" combining Trainium + the CS3). From andrew-feldman.
  • Largest customer: g42 (Abu Dhabi) — biggest customer last year and a minority investor; deployments to date in US data centers (Santa Clara, Minneapolis, Dallas, soon Toronto). From andrew-feldman.
  • TSMC relationship: 5nm node (not the constrained 3nm); collaborated closely with TSMC on lithography to make wafer-scale work; "TSMC has given us as many wafers as we've needed." From andrew-feldman.
  • Trump-adjacent investor: 1789 Capital (associated with Donald Trump Jr.) participated in the September 2025 "G round." Feldman states this had "no role at all" in clearing CFIUS/IPO national-security review (resolved March 2025, before the 1789 money).
  • Headcount/wealth: IPO made "more than 800 millionaires" inside the company (per Feldman).
  • Own inference cloud serving open models: interface resembles OpenRouter; serves open-source models including Kimi K2 (~1T params) at "10 or 15 times faster than others." Customers including openai and Cognition "power" their offerings off Cerebras; the cloud's value proposition is fast serving of weights whose training cost the user doesn't bear. From andrew-feldman in 2026-05-21-odd-lots-why-cerebras-ceo-andrew-feldman-built-the-world-s. See open-vs-closed-source-model-economics.
  • Speed is architectural, not free: the architecture's bet is that engaged/agentic AI work rewards speed; Feldman rebuts the "speed doesn't matter for agentic flows" view as "dead wrong." See inference-speed-as-a-pricing-premium.
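
As a rough cross-check on the IPO bullet above, the two valuation figures imply a forward-sales number. This is a sketch under a loud assumption: the ~$64B market cap (Feldman) and the ~67x forward-sales multiple (Alloway) come from different speakers and may not refer to the same share price or moment, so the result is an order-of-magnitude estimate only. The function name is illustrative.

```python
def implied_forward_sales(market_cap_usd: float, forward_sales_multiple: float) -> float:
    """Back out forward sales from a market cap and a price-to-forward-sales multiple."""
    return market_cap_usd / forward_sales_multiple


MARKET_CAP_USD = 64e9            # ~$64B, per Feldman (approximate)
FORWARD_SALES_MULTIPLE = 67.0    # ~67x, per Alloway (approximate)

implied = implied_forward_sales(MARKET_CAP_USD, FORWARD_SALES_MULTIPLE)
print(f"Implied forward sales: ~${implied / 1e9:.2f}B")  # prints ~$0.96B
```

Roughly $1B of forward sales against the "north of $20 billion" openai contract gives a sense of how much of the valuation rests on contracted demand converting to revenue.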

Strengths (from a thesis-input perspective)

  • Routes around HBM, CoWoS, and TSMC 3nm — the three constraints that gate GPU-architecture competitors.
  • Anchor demand locked: OpenAI ($20B+), AWS, G42.
  • Genuine speed advantage on inference (15x vs fastest GPU per Feldman) at lower power.

Weaknesses (from a thesis-input perspective)

  • Not profitable; ~67x forward sales — priced for flawless execution.
  • Still needs a "meaningful allocation" from TSMC; not constraint-free, just constraint-shifted to data centers/power.
  • Speed-premium thesis is unproven at scale — hosts flag that incremental speed may matter less as cost-of-speed rises (see inference-speed-as-a-pricing-premium tensions).
  • Closed-vs-open model economics still unsettled; Cerebras's cloud monetizes open-source serving, which is structurally lower-margin than owning a frontier model.

Open questions

Sources

Related

Referenced by