Can LLMs choose the right research question to investigate (not just run the experiment)?
The question
LLM agents can now implement experiments, optimize hyperparameters, and push a metric upward through sustained optimization. But can they select which experiment matters next, recognize that a research track is a dead end and back out to first principles, and form the high-level "this idea should work, so the failure must be a bug" prior that human researchers use to persevere? This is the load-bearing capability for an actual intelligence explosion: automating execution is not the same as automating direction.
Why it matters
This is the crux of the recursive-self-improvement / fast-takeoff thesis. If the verifiable inner loop (run experiments) is automatable but the direction-setting outer judgment is not, then automated AI research stays human-gated and the explosion is bottlenecked on human research taste — not on compute. The answer directly conditions intelligence-explosion timing and what it would look like "from the inside."
What we currently believe
As of May 2026 (Eric Jang, first-person from running the loop on Opus 4.6/4.7): no, not yet. Current public models are good at experiment execution and open-ended optimization but "don't seem to be that great at selecting what the next experiment should be" and can't do the lateral thinking to escape a dead-end track. The discriminator is verifiability — "if you can't evaluate it, then you can't auto research it" — and research-direction-selection is exactly the hard-to-verify part. See automated-ai-research-llm-capability-boundary. Karpathy (March 2026, autoresearch-recursive-self-improvement) holds that ideas can be contributed by an automated scientist but enactment/direction should stay queued by humans — broadly the same boundary.
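The verifiability boundary described above can be illustrated with a minimal sketch. This is not Jang's or Karpathy's actual setup; every name in it (`inner_loop`, `outer_loop`, `choose_track`, the two example tracks) is hypothetical and chosen only for illustration. It assumes an experiment track is a list of candidate settings plus an optional scoring function, and shows that the inner loop cannot run at all on a track that has no evaluator, while the choice of track is delegated to a caller-supplied function standing in for the human.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# Hypothetical type: maps a candidate setting to a score (higher is better).
Evaluator = Callable[[float], float]


def inner_loop(candidates: Sequence[float], evaluate: Optional[Evaluator]) -> float:
    """Automatable part: return the best candidate under a given metric."""
    if evaluate is None:
        # No evaluator means there is no signal to optimize against.
        raise ValueError("cannot auto-research a track without an evaluator")
    return max(candidates, key=evaluate)


def outer_loop(
    tracks: dict[str, tuple[Sequence[float], Optional[Evaluator]]],
    choose_track: Callable[[Sequence[str]], str],
) -> tuple[str, float]:
    """Direction-setting part: the choice of track is delegated to
    choose_track, which stands in for the human in this sketch."""
    name = choose_track(sorted(tracks))
    candidates, evaluate = tracks[name]
    return name, inner_loop(candidates, evaluate)


# Illustrative tracks (assumed, not from the source): one verifiable, one not.
tracks = {
    "learning_rate": ([0.1, 0.01, 0.001], lambda lr: -abs(lr - 0.01)),
    "what_to_measure": ([1.0, 2.0], None),
}

result = outer_loop(tracks, lambda names: "learning_rate")
```

In this toy framing, swapping `choose_track` for a model call is trivial to write, but nothing in the loop can score whether the chosen track was the right one, which is the hard-to-verify part the paragraph above points at.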
Evidence we have
- eric-jang in 2026-05-15-dwarkesh-podcast-eric-jang-building-alphago-from-scratch: models "don't seem to be that great at selecting what the next experiment should be in a given track ... I had to catch infra bugs myself. By prompting the right question to Claude."
- dwarkesh-patel in 2026-05-15-dwarkesh-podcast-eric-jang-building-alphago-from-scratch: the Ilya research-taste framing — a good researcher distinguishes "bug" from "wrong idea" via a strong high-level prior.
- Cross-reference: autoresearch-recursive-self-improvement — Karpathy's nanochat result (agents found tunings he missed) is the bullish counter-datapoint on the execution side; the direction side remains human in both accounts.
Evidence we need
- A documented case of an agent loop autonomously abandoning a dead-end track and re-deriving the right question to ask — without a human reformulating the prompt.
- A Mythos-class (or later) model evaluated specifically on research-direction selection, to test Jang's speculation that scaling moves this boundary.
- Frontier-lab research-headcount trajectory 2026-2027 as an indirect signal (does direction-setting labor contract?).
How to resolve
Watch for: (1) frontier-lab releases of automated-scientist tooling that claims direction-selection, not just execution; (2) reproducible third-party demonstrations of dead-end-escape; (3) whether AutoResearch-style loops (autoresearch-recursive-self-improvement) extend from hyperparameter search to "what should we even be measuring." Re-validate against any newer Jang or Karpathy material, since the claims above are dated (March and May 2026) and may have gone stale.