Journalism Error Typology

One-line summary: Two complementary scholarly frameworks for classifying journalism errors: Tillinghast's 1983 14-category newspaper classification and Chang's 2015 3-type integrated framework (errors / omissions / misinterpretations, each typed as objective or subjective). Both are directly portable to wiki-ingest accuracy checking. Critical empirical baseline: 40-60% of straight news articles contain at least one source-perceived error across the six accuracy studies Tillinghast tabulated. Misinterpretation is the highest-priority operational target for LLM-driven wiki synthesis.

The insight

The journalism-studies literature has formal error typologies that map cleanly onto what an LLM ingest pipeline can check. Without these typologies, "accuracy" is a vague aspiration; with them, accuracy becomes a checklist of specific failure modes the wiki should systematically catch.

Two frameworks are useful together:

  1. Tillinghast 1983 — granular 14-category classification capturing concrete factual errors (names, dates, numbers, locations). Operationally tight, narrow scope.
  2. Chang 2015 — integrated 3-type framework (errors / omissions / misinterpretations) capturing both concrete errors and judgment errors. Broader scope; captures the most-common LLM-synthesis failure mode (misinterpretation).

Used together, they cover roughly the full surface of errors that can be caught by post-hoc review.

Evidence

Tillinghast's 14-category framework (1983)

From 2026-05-13-academic-research-journalism-standards-political-reporting citing Tillinghast 1983 (Newspaper Research Journal):

The classic news-source error classification, derived from six accuracy-research studies that tabulated errors from source-perceived perspectives:

Omissions, underemphasis, overemphasis, misquotes, faulty headlines, spellings, names, ages, other numbers, titles, addresses, other locations, time, and dates.

Operational distinction inside the 14 categories:

  • Objective errors — factual mistakes (wrong names, wrong dates, wrong numbers, wrong locations). Verifiable against primary sources.
  • Subjective errors — mistakes of judgment (omission, underemphasis, overemphasis). Verifiable only against editorial reasoning.

Crucially: between 40% and 60% of all straight news articles contain at least one source-perceived error across the six studies tabulated. This is the empirical baseline against which any synthesis (wiki included) should calibrate its own error rate.

Chang's 3-type integrated framework (2015)

From 2026-05-13-academic-research-journalism-standards-political-reporting citing Chang 2015 (Journal of Health Communication, 17 citations):

A more operationally tractable framework that integrates errors, omissions, and misinterpretations:

| Type | Subtype | Examples |
|---|---|---|
| Errors | Objective | Misstating facts, wrong dates, wrong names, wrong numbers |
| Omissions | Objective | Missing research methods, missing source identities, missing critical context |
| Misinterpretations | Subjective | Errors in inferences, offering speculations as facts, overemphasis on uniqueness, overgeneralization of findings, shifting emphases |

The Chang framework's distinctive contribution is the misinterpretation type. Tillinghast's "underemphasis/overemphasis" categories capture some of this, but Chang's breakdown (errors in inferences, speculation-as-fact, overemphasis on uniqueness, overgeneralization, shifting emphases) is specifically operational for LLM-driven synthesis. These are the failure modes that emerge when an LLM synthesizes across multiple sources without explicit interpretation discipline.

Chang's empirical finding: across health-research news coverage, objective inaccuracy (errors + omissions) and subjective inaccuracy (misinterpretations) are independent predictors of scientist-perceived inaccuracy. They don't substitute; they compound. A synthesis can be objectively accurate (every named fact verifiable) but subjectively inaccurate (misinterprets what those facts mean) — and the reader experiences the result as wrong.
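Because the two axes compound rather than substitute, an audit is more informative if it reports them separately instead of as one combined error count. A minimal sketch of the typology as data, assuming the type-to-subtype assignment in Chang's table; the names `ErrorType`, `Subtype`, and `inaccuracy_profile` are hypothetical and not part of any existing wiki tooling:

```python
from enum import Enum


class Subtype(Enum):
    OBJECTIVE = "objective"
    SUBJECTIVE = "subjective"


class ErrorType(Enum):
    ERROR = "error"
    OMISSION = "omission"
    MISINTERPRETATION = "misinterpretation"


# Assignment follows Chang's table: errors and omissions are objective,
# misinterpretations are subjective.
SUBTYPE_OF = {
    ErrorType.ERROR: Subtype.OBJECTIVE,
    ErrorType.OMISSION: Subtype.OBJECTIVE,
    ErrorType.MISINTERPRETATION: Subtype.SUBJECTIVE,
}


def inaccuracy_profile(findings):
    """Count audit findings per subtype so the two axes stay separate."""
    counts = {Subtype.OBJECTIVE: 0, Subtype.SUBJECTIVE: 0}
    for error_type in findings:
        counts[SUBTYPE_OF[error_type]] += 1
    return counts
```

A page with zero objective findings and several subjective ones is the "every fact verifiable, overall reading wrong" case described above, and a single combined count would hide it.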

How the wiki should apply this

Per journalism-practitioner-codes-canonical-tenets convergent-core item 6 (accuracy/error typology), the wiki's ingest checklist for political-source ingest should include:

A. Objective error check (Tillinghast lineage)

Every dispositive claim about concrete facts should be verified against the source:

  • Names of persons / organizations / programs spelled correctly per primary documents.
  • Dates verified against primary sources where possible.
  • Dollar figures verified against primary sources; do not round without flagging.
  • Locations verified; do not generalize ("Minneapolis" vs "St. Paul" vs "Twin Cities" matter).
  • Titles of individuals reflect their actual role at the time of the cited event.
  • Ages of individuals are calculated correctly relative to the cited event's date.
  • Numbers (counts, percentages, durations) reflect the source's claim without unit-conversion drift.
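Part of the objective check is mechanical enough to script. As one illustration, the sketch below flags dollar figures in a wiki claim that do not appear verbatim in the source text, which surfaces silent rounding. It is a rough first-pass filter under stated assumptions: the regex covers only simple US-style figures, and the function name and example strings are hypothetical.

```python
import re

# Matches simple figures such as "$250", "$1,250,000", "$3.4 million".
DOLLAR_RE = re.compile(
    r"\$\d+(?:,\d{3})*(?:\.\d+)?(?:\s(?:thousand|million|billion))?"
)


def unverified_dollar_figures(claim: str, source: str) -> list[str]:
    """Return dollar figures in the claim that are not verbatim in the source."""
    source_figures = set(DOLLAR_RE.findall(source))
    return [fig for fig in DOLLAR_RE.findall(claim) if fig not in source_figures]


# Hypothetical example: the claim rounds the source's figure without flagging it.
source_text = "The program received $1,250,000 in 2023."
flagged = unverified_dollar_figures("The program received $1.3 million.", source_text)
# flagged == ["$1.3 million"]
```

A flagged figure is not necessarily wrong; it only means a reviewer should confirm the number against the source and mark any rounding explicitly.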

B. Omission check (Tillinghast + Chang)

Critical context should not be silently dropped:

  • Methodology of cited studies disclosed (sample size, study design, population) when load-bearing for a claim.
  • Source's known biases / framing noted when material to the claim (per politics/SCOPE framing-bias-note rule).
  • Prior contested claims about the same subject acknowledged rather than ignored.
  • Counterevidence or counter-framing included or explicitly noted as absent.
  • The subject's response to allegations cited where relevant (see no-surprises-rule).

C. Misinterpretation check (Chang's main contribution)

The highest-priority class for LLM-driven synthesis, because it's where the failure mode is most subtle:

  • Facts vs inferences clearly distinguished. The wiki page should be able to tell the reader: "this is what the source says; this is what I'm inferring from it."
  • Speculations not offered as facts. When a source speculates, the wiki should preserve the speculation-frame ("X suggested," "Y argued it may be") rather than collapsing it into an assertion.
  • No overemphasis on uniqueness. Don't characterize a finding as unprecedented when it isn't — check for prior comparable cases.
  • No overgeneralization. A finding from one population/context should not be generalized to all populations/contexts. Per Chang: this is the most common subjective failure mode.
  • No shifting emphasis. The wiki's framing should match the source's framing — if a source emphasizes one finding and treats another as secondary, the wiki should preserve that proportionality rather than re-weighting silently.
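Most of the misinterpretation check needs human or model judgment, but the speculation-as-fact item has a cheap lexical proxy: if the source sentence carries a hedge and the corresponding wiki sentence carries none, the speculation-frame was probably dropped. A minimal sketch, assuming source and wiki sentences are already paired; the hedge list is illustrative and would need tuning on audited pages, and `drops_hedge` is a hypothetical helper name:

```python
import re

# Illustrative hedge markers only; not an exhaustive or validated list.
HEDGE_RE = re.compile(
    r"\b(?:may|might|could|possibly|likely|reportedly|"
    r"suggest(?:s|ed)?|argue[sd]?|appear(?:s|ed)?|speculat(?:e|es|ed))\b",
    re.IGNORECASE,
)


def drops_hedge(source_sentence: str, wiki_sentence: str) -> bool:
    """True when the source hedges but the wiki sentence asserts flatly."""
    source_hedged = HEDGE_RE.search(source_sentence) is not None
    wiki_hedged = HEDGE_RE.search(wiki_sentence) is not None
    return source_hedged and not wiki_hedged


# Hypothetical sentence pair for illustration.
source_sentence = "The auditor suggested the funds may have been misallocated."
wiki_sentence = "The funds were misallocated."
needs_review = drops_hedge(source_sentence, wiki_sentence)  # True
```

This catches only the lexical form of the failure. A wiki sentence can keep the word "may" and still overgeneralize or shift emphasis, so a clean result here is not evidence that the interpretation is sound.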

Design implications

For the wiki's operational checklist (vault/_meta/JOURNALISTIC_STANDARDS.md planned):

  • Adopt both frameworks. Tillinghast for concrete-fact verification; Chang for interpretation discipline.
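  • Track empirical baseline. The 40-60% source-perceived-error rate is a per-article figure (articles containing at least one error). The wiki should aim for substantially better, but should still expect roughly 5-10% of the dispositive claims on any given page to be wrong even with discipline. Note that this second figure is per claim, so the two numbers are not directly comparable.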
  • Misinterpretation is the priority class. LLM-driven synthesis is most prone to the subjective-error failure modes (overgeneralization, speculation-as-fact, shifting emphasis). The checklist should weight this class heavily.
  • Errors found in audit become CALIBRATION entries. When a wiki page is found wrong on a checkable claim, the error should be classified per the typology and logged in CALIBRATION so the wiki tracks its own miscalibration over time.
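The baseline and the wiki's expected error rate use different units: 40-60% counts articles with at least one error, while ~5-10% counts individual claims. A small calculation converts a per-claim rate into a per-page rate. It assumes claims fail independently, which is a simplification, and the figure of 10 dispositive claims per page is a hypothetical chosen for illustration:

```python
def page_error_probability(per_claim_rate: float, claims_per_page: int) -> float:
    """P(at least one wrong claim on a page), assuming independent claims."""
    return 1.0 - (1.0 - per_claim_rate) ** claims_per_page


# Hypothetical page with 10 dispositive claims.
low = page_error_probability(0.05, 10)   # about 0.40
high = page_error_probability(0.10, 10)  # about 0.65
```

Under these assumptions, a 5-10% per-claim rate on a 10-claim page means roughly 40-65% of pages contain at least one wrong claim, which is no better than the journalism baseline when measured per page. Beating the baseline at page level would therefore require either a much lower per-claim rate or fewer dispositive claims per page. The comparison remains loose, since the journalism figure is source-perceived rather than externally adjudicated.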

Contradictions / tensions

  • The objective/subjective distinction is itself contested in the literature. Some research (per Tillinghast) treats omissions as subjective; some (per Chang) treats them as objective. The framework choice affects whether omission-rates are framed as judgment-error or factual-error. For wiki purposes, the boundary doesn't matter much, since both classes need checking.
  • Source-perceived error rate isn't the same as ground-truth error rate. Sources reviewing their own coverage may overstate errors (treating their own self-perception as the accurate account) or understate them (motivated reasoning when the coverage is favorable). The 40-60% figure is source-perceived, not externally adjudicated.
  • For LLM synthesis specifically, the failure-mode distribution may differ from human journalism. LLMs over-generalize and confabulate; they're less prone to wrong-date / wrong-name errors than to subjective-interpretation errors. Empirical work specifically on LLM-synthesis errors is needed; the journalism typology is a starting framework, not a fitted one.

Open questions

  • What's the empirical error rate of LLM-driven political-source synthesis specifically? The journalism baseline (40-60%) may not transfer. See the future wiki self-audit pass mentioned in 2026-05-13-academic-research-journalism-standards-political-reporting — that's the operational vehicle to measure this.
  • Should the wiki track per-page error counts in frontmatter? Trade-off: visibility of error rate vs noise in the wiki structure. Possibly a _meta/ quality dashboard.
  • Does Chang's misinterpretation typology cover all the LLM-specific failure modes? Or are there modes (e.g., hallucination — making up specific quotes that don't exist in the source) that aren't covered? Likely the latter; "hallucinated citations" might need its own category beyond the journalism typology.
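One way to explore the per-page tracking question without adding noise to frontmatter is to derive counts from the CALIBRATION entries proposed under design implications. The sketch below is an assumption throughout: no CALIBRATION schema is defined yet, so the field names, the JSON-lines format, and the helper names are all hypothetical.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass

# Chang's three types. A "hallucination" type could be added here if the
# journalism typology turns out not to cover LLM-specific failures.
ERROR_TYPES = ("error", "omission", "misinterpretation")


@dataclass(frozen=True)
class CalibrationEntry:
    page: str        # wiki page where the wrong claim was found
    claim: str       # the claim as it appeared on the page
    error_type: str  # one of ERROR_TYPES
    found_on: str    # audit date, ISO 8601

    def __post_init__(self) -> None:
        if self.error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {self.error_type!r}")


def to_log_line(entry: CalibrationEntry) -> str:
    """Serialize one entry as a JSON line for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)


def errors_per_page(entries: list[CalibrationEntry]) -> Counter:
    """Tally audited errors by page, for a dashboard rather than frontmatter."""
    return Counter(entry.page for entry in entries)
```

Keeping the counts derived rather than stored means the pages themselves stay clean, and the typology can be extended later without rewriting frontmatter across the vault.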
