LLMs are a commodity; the moat is proprietary data + workflow integration

Vintage: Dec 2025. Articulated by ali-ghodsi on the Bg2 Pod December 23, 2025. The "LLM as commodity" framing has become a consensus operator-side view through 2025-2026 — Karpathy (2026-03-20-no-priors-andrej-karpathy-skill-issue-code-agents in March) and Benioff (2026-05-15-all-in-podcast-trump-xi-benioff-saaspocalypse-openai-apple in May) both implicitly hold it. Re-validate against frontier-lab counter-framings as those emerge.

One-line summary: Frontier-lab LLMs are differentiated briefly on capability before competitors catch up, so durable enterprise value accrues to (a) proprietary data not available to competitors, (b) workflow integration and agentic systems built around that data, and (c) the application layer where end-users interact. The lab layer is the "TSMC of AI" — high-revenue but commoditized at the unit level.

The framing

ali-ghodsi in 2025-12-23-bg2-databricks-glean-enterprise-ai: "I think the LLM is a commodity. People are not saying that, but it is a commodity. Like you can get gas from this gas station, you can get gas from that gas station. It doesn't matter. Just compare price. LLMs have become that way. Like it doesn't really matter. This one is better right now, next week, that one is better."

Three load-bearing claims:

  1. Interchangeability at the unit level. A Databricks customer can swap from OpenAI to Anthropic to Gemini to an open-source model in ~one day. Ghodsi: "during all these people just switch LLMs like in one day. That's not the case with your iPhone versus Android or your Windows versus your Mac or your anything versus anything." The switching cost is near-zero — there is no installed-base ecosystem the way there is in OS / hardware / search.
  2. The moat is the data + workflow layer. Ghodsi: "What data does your company have that's special that your competitors don't have? Can you leverage that and can you build AI that really understands that data? Because that's not a commodity. There's not an AI out there that understands all your business processes in your company, your secret sauce and your data." This is the "AI strategy starts as data strategy" framing.
  3. Commodity ≠ unprofitable. Ghodsi: "That doesn't mean those companies are not going to be valuable. They can be very. I mean TSMC is very valuable. But I'm saying they're going to be kind of like these fab like companies but they're interchangeable." The frontier labs can be very large businesses while being commoditized at the product level — the analog is foundry economics, not differentiated-software economics.
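A minimal sketch of what claim 1 looks like in application code, assuming the application talks to models through one thin shared interface. Everything here is hypothetical: the provider names, the prices, and the stub `complete` functions stand in for real vendor SDK calls and are not taken from the podcast or from any real price list. The point is only that selection can reduce to "just compare price" when nothing else in the application depends on which lab is behind the call.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical provider record. Names, prices, and `complete` callables are
# illustrative stand-ins, not real vendor SDK calls or real price points.
@dataclass(frozen=True)
class Provider:
    name: str
    usd_per_million_tokens: float
    complete: Callable[[str], str]


def make_stub(name: str) -> Callable[[str], str]:
    """Return a fake completion function that only labels its output."""
    def complete(prompt: str) -> str:
        return f"[{name}] response to: {prompt}"
    return complete


def cheapest(providers: Dict[str, Provider]) -> Provider:
    """Pick the lowest-priced provider (the 'just compare price' rule)."""
    return min(providers.values(), key=lambda p: p.usd_per_million_tokens)


def answer(prompt: str, providers: Dict[str, Provider]) -> str:
    """Application code depends only on the shared interface."""
    return cheapest(providers).complete(prompt)


providers = {
    "lab_a": Provider("lab_a", 10.0, make_stub("lab_a")),
    "lab_b": Provider("lab_b", 8.0, make_stub("lab_b")),
}
first = answer("summarize Q3 churn", providers)

# A new, cheaper entrant is a one-line registry change; `answer` is untouched.
providers["lab_c"] = Provider("lab_c", 5.0, make_stub("lab_c"))
second = answer("summarize Q3 churn", providers)
```

In this sketch the only part that does not swap is whatever goes into the prompt: the proprietary data and the workflow around the call, which is where claim 2 locates the moat.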

Why it matters to this thread

The thread tracks both the tool layer (claude-code, cursor, etc.) and the lab/business layer (anthropic, openai). The commodity thesis is the operator-side view of the lab competitive dynamic and bounds how much sustainable differentiation any one lab can capture from a frontier-model release. If true, it predicts:

  • Lab revenue grows substantially in absolute terms (TAM is huge) but margins compress as substitution becomes routine.
  • Capability gaps close quickly (within weeks per Ghodsi) — the wiki's chronological framing of practitioner views per ../../_meta/AI_CAPABILITY_TRACKING is consistent with this; a model that was best in October is rarely still best in March.
  • The high-value layer for both startups and enterprises is the application / data / workflow integration layer, not the lab layer. This is congruent with reading anthropic's "narrow-focus on coding" strategic bet (Sax framing) as differentiation through use-case specialization rather than differentiation at the model layer per se.

Evidence

The commoditization observation (Dec 2025)

  • ali-ghodsi in 2025-12-23-bg2-databricks-glean-enterprise-ai: "It really comes down to your company. What data does your company have that's special that your competitors don't have? Can you leverage that and can you build AI that really understands that data? Because that's not a commodity. There's not an AI out there that understands all your business processes in your company, your secret sauce and your data that's not a commodity."

Enterprise switching behavior

Failed-fine-tuning evidence (Glean's pivot to foundation models)

  • arvind-jain in 2025-12-23-bg2-databricks-glean-enterprise-ai: "some of our fine tuning work, building models for a specific use case within our product didn't really pan out for us. And ultimately the choice was that we can go with already built models, whether they are small open source models hosted on databricks or one of the large foundation models." — Glean (top-tier enterprise-AI vendor) tried building specialized fine-tuned models and abandoned the effort because foundation models caught up too fast. First-person operator evidence that lab models commoditize the customization layer.

Adjacent corroboration from a different vantage

Tensions / open questions

  • Is reasoning-frontier different? Frontier reasoning models (GPT-5.4, Opus 4.5) may be sticky for power users in ways general LLMs aren't. nick-turley in 2026-03-15-bg2-chatgpt-super-assistant-era frames reasoning models as the productizable next-generation tech — if reasoning behaves like a separate market with higher switching costs, the commodity thesis would weaken for that segment.
  • Pricing power evidence is mixed. anthropic grew ARR 10×/year per Sax — that doesn't look like commodity-margin economics yet. The commodity thesis predicts margin compression eventually, not from day one. The next-12-month test: do lab pricing moves diverge or converge?
  • Capability inflections (Q4 2025 / early 2026 per Karpathy + Benioff + Turley) introduce discontinuities that temporarily break commodity dynamics. The Anthropic-coding-agent inflection may be the cleanest counter-example: enterprises did not switch out of Claude in the way the thesis predicts when GPT-5 or Gemini matched its language metrics, because the coding-agent workflow integration was sticky. The thesis may be most true at the raw model level and least true at the integrated-product level.

What would falsify this

  • A frontier model launch where enterprises don't substitute even when the price or capability gap is large (e.g., a model is 2× cheaper at capability parity but enterprise adoption stays flat). Would suggest there's switching friction the thesis ignores.
  • Lab margin expansion sustained for >12 months — would suggest the commoditization isn't happening at the unit level.
  • A frontier capability that durably differentiates one lab for >6 months (e.g., a true continual-learning model only one lab can produce). Would shift the thesis from "commodity" to "temporary leader-rotates."

Related
