Journalism Error Typology
One-line summary: Two complementary scholarly frameworks for classifying journalism errors — Tillinghast's 1983 14-category newspaper classification and Chang's 2015 3-type integrated framework (errors / omissions / misinterpretations, with objective/subjective subtyping). Both are directly portable to wiki-ingest accuracy checking. Critical empirical baseline: 40-60% of straight news articles contain at least one source-perceived error per multiple replication studies. The misinterpretation subtype is the highest-priority operational target for LLM-driven wiki synthesis.
The insight
The journalism-studies literature has formal error typologies that map cleanly onto what an LLM ingest pipeline can check. Without these typologies, "accuracy" is a vague aspiration; with them, accuracy becomes a checklist of specific failure modes the wiki should systematically catch.
Two frameworks are useful together:
- Tillinghast 1983 — granular 14-category classification capturing concrete factual errors (names, dates, numbers, locations). Operationally tight, narrow scope.
- Chang 2015 — integrated 3-type framework (errors / omissions / misinterpretations) capturing both concrete errors and judgment errors. Broader scope; captures the most-common LLM-synthesis failure mode (misinterpretation).
Used together, they cover roughly the full surface of errors that can be caught by post-hoc review.
Evidence
Tillinghast's 14-category framework (1983)
From 2026-05-13-academic-research-journalism-standards-political-reporting citing Tillinghast 1983 (Newspaper Research Journal):
The classic news-source error classification, derived from six accuracy-research studies that tabulated errors from source-perceived perspectives:
Omissions, underemphasis, overemphasis, misquotes, faulty headlines, spellings, names, ages, other numbers, titles, addresses, other locations, time, and dates.
Operational distinction inside the 14 categories:
- Objective errors — factual mistakes (wrong names, wrong dates, wrong numbers, wrong locations). Verifiable against primary sources.
- Subjective errors — mistakes of judgment (omission, underemphasis, overemphasis). Verifiable only against editorial reasoning.
Crucially: across the six studies tabulated, between 40% and 60% of straight news articles contained at least one source-perceived error. This is the empirical baseline against which any synthesis (wiki included) should calibrate its own error rate.
Chang's 3-type integrated framework (2015)
From 2026-05-13-academic-research-journalism-standards-political-reporting citing Chang 2015 (Journal of Health Communication, 17 citations):
A more operationally tractable framework that integrates errors, omissions, and misinterpretations:
| Type | Subtype | Examples |
|---|---|---|
| Errors | Objective | Misstating facts, wrong dates, wrong names, wrong numbers |
| Omissions | Objective | Missing research methods, missing source identities, missing critical context |
| Misinterpretations | Subjective | Errors in inferences, offering speculations as facts, overemphasis on uniqueness, overgeneralization of findings, shifting emphases |
The Chang framework's distinctive contribution is the misinterpretation subtype. Where Tillinghast's "underemphasis/overemphasis" categories capture some of this, Chang's breakdown — errors in inferences, speculation-as-fact, overemphasis on uniqueness, overgeneralization, shifting emphases — is specifically operational for LLM-driven synthesis. These are the exact failure modes that emerge from LLMs synthesizing across multiple sources without explicit interpretation discipline.
Chang's empirical finding: across health-research news coverage, objective inaccuracy (errors + omissions) and subjective inaccuracy (misinterpretations) are independent predictors of scientist-perceived inaccuracy. They don't substitute; they compound. A synthesis can be objectively accurate (every named fact verifiable) but subjectively inaccurate (misinterprets what those facts mean) — and the reader experiences the result as wrong.
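One way to make the table above machine-checkable is to encode the three types and their objective/subjective subtypes directly. The following is a minimal sketch, not part of Chang's paper: the `Finding` record and the `inaccuracy_profile` helper are hypothetical names introduced here for illustration. It reports the two dimensions separately, since the source describes them as independent predictors.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    ERROR = "error"
    OMISSION = "omission"
    MISINTERPRETATION = "misinterpretation"


class Subtype(Enum):
    OBJECTIVE = "objective"
    SUBJECTIVE = "subjective"


# Subtype assignment follows the table above.
SUBTYPE_OF = {
    ErrorType.ERROR: Subtype.OBJECTIVE,
    ErrorType.OMISSION: Subtype.OBJECTIVE,
    ErrorType.MISINTERPRETATION: Subtype.SUBJECTIVE,
}


@dataclass(frozen=True)
class Finding:
    """One audit finding against a synthesized claim (hypothetical record shape)."""
    claim: str
    error_type: ErrorType
    note: str = ""

    @property
    def subtype(self) -> Subtype:
        return SUBTYPE_OF[self.error_type]


def inaccuracy_profile(findings):
    """Count findings per subtype so both dimensions are reported separately."""
    counts = {Subtype.OBJECTIVE: 0, Subtype.SUBJECTIVE: 0}
    for finding in findings:
        counts[finding.subtype] += 1
    return counts
```

A page with zero objective findings but several subjective ones would still show up as inaccurate in this profile, which matches the "objectively accurate but subjectively inaccurate" case described above.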
How the wiki should apply this
Per journalism-practitioner-codes-canonical-tenets convergent-core item 6 (accuracy/error typology), the wiki's ingest checklist for political-source ingest should include:
A. Objective error check (Tillinghast lineage)
Every dispositive claim about concrete facts should be verified against the source:
- Names of persons / organizations / programs spelled correctly per primary documents.
- Dates verified against primary sources where possible.
- Dollar figures verified against primary sources; do not round without flagging.
- Locations verified; do not generalize ("Minneapolis" vs "St. Paul" vs "Twin Cities" matter).
- Titles of individuals reflect their actual role at the time of the cited event.
- Ages of individuals are calculated correctly relative to the cited event's date.
- Numbers (counts, percentages, durations) reflect the source's claim without unit-conversion drift.
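Part of the objective check can be approximated mechanically. The sketch below flags numeric figures that appear in the wiki text but not verbatim in the source, which catches silent rounding and unit-conversion drift. It is a rough heuristic under stated assumptions: the function names are hypothetical, the regex only recognizes simple digit-based figures, and a flagged figure is a prompt for human review, not proof of an error.

```python
import re

# Matches simple figures such as "$1,250,000", "42", "3.5", "18%".
_NUMBER = re.compile(r"\$?\d[\d,]*(?:\.\d+)?%?")


def extract_figures(text):
    """Return the set of numeric tokens in text, with thousands separators removed."""
    return {match.group(0).replace(",", "") for match in _NUMBER.finditer(text)}


def unverified_figures(wiki_text, source_text):
    """Figures present in the wiki text that do not appear verbatim in the source."""
    return extract_figures(wiki_text) - extract_figures(source_text)


source = "The program disbursed $1,250,000 to 42 recipients, or 18% of applicants."
wiki = "The program disbursed $1.3 million to 42 recipients (18% of applicants)."
flagged = unverified_figures(wiki, source)  # the rounded "$1.3" is flagged
```

Spelled-out numbers ("forty-two") and names, titles, and locations are outside what this sketch covers.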
B. Omission check (Tillinghast + Chang)
Critical context should not be silently dropped:
- Methodology of cited studies disclosed (sample size, study design, population) when load-bearing for a claim.
- Source's known biases / framing noted when material to the claim (per politics/SCOPE framing-bias-note rule).
- Prior contested claims about the same subject acknowledged rather than ignored.
- Counterevidence or counter-framing included or explicitly noted as absent.
- The subject's response to allegations cited where relevant (see no-surprises-rule).
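The methodology item in the list above lends itself to a simple completeness check. This is a minimal sketch assuming a hypothetical citation record (a plain dict with `load_bearing`, `sample_size`, `study_design`, and `population` keys); none of these field names come from the source, and the other omission items need editorial judgment that a field check cannot supply.

```python
# Hypothetical field names for a cited study's methodology.
REQUIRED_METHOD_FIELDS = ("sample_size", "study_design", "population")


def missing_methodology(citation):
    """List methodology fields that are absent or empty on a load-bearing citation.

    Citations not marked load-bearing are exempt, mirroring the
    "when load-bearing for a claim" condition in the checklist.
    """
    if not citation.get("load_bearing", False):
        return []
    return [field for field in REQUIRED_METHOD_FIELDS if not citation.get(field)]


citation = {"load_bearing": True, "sample_size": 1200, "study_design": "RCT"}
gaps = missing_methodology(citation)  # population is not disclosed
```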
C. Misinterpretation check (Chang's main contribution)
The highest-priority class for LLM-driven synthesis, because it's where the failure mode is most subtle:
- Facts vs inferences clearly distinguished. The wiki page should be able to tell the reader: "this is what the source says; this is what I'm inferring from it."
- Speculations not offered as facts. When a source speculates, the wiki should preserve the speculation-frame ("X suggested," "Y argued may be") rather than collapsing to assertion.
- No overemphasis on uniqueness. Don't characterize a finding as unprecedented when it isn't — check for prior comparable cases.
- No overgeneralization. A finding from one population/context should not be generalized to all populations/contexts. Per Chang: this is the most common subjective failure mode.
- No shifting emphasis. The wiki's framing should match the source's framing — if a source emphasizes one finding and treats another as secondary, the wiki should preserve that proportionality rather than re-weighting silently.
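Of the items above, speculation-as-fact is the easiest to approximate with a cheap heuristic: if the source sentence carries a hedge and the wiki sentence derived from it carries none, the speculation frame may have been collapsed. The sketch below assumes a small, hand-picked marker list (an illustration, not a validated lexicon) and hypothetical function names. It will misfire on words like "May" used as a month, so it is best treated as a flag for review.

```python
import re

# Illustrative hedge markers; an assumption, not an exhaustive or validated list.
HEDGE_MARKERS = (
    "may", "might", "could", "suggest", "suggests", "suggested",
    "argue", "argues", "argued", "possibly", "likely", "appears",
)

_HEDGE = re.compile(r"\b(?:" + "|".join(HEDGE_MARKERS) + r")\b", re.IGNORECASE)


def has_hedge(sentence):
    """True if the sentence contains at least one hedge marker as a whole word."""
    return _HEDGE.search(sentence) is not None


def collapsed_to_assertion(source_sentence, wiki_sentence):
    """True if the source hedges but the wiki sentence states the claim flatly."""
    return has_hedge(source_sentence) and not has_hedge(wiki_sentence)


source = "The authors suggest the policy may reduce turnout."
wiki = "The policy reduces turnout."
needs_review = collapsed_to_assertion(source, wiki)
```

Overgeneralization and shifting emphasis depend on meaning rather than surface wording, so they are not addressed by a marker check like this one.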
Design implications
For the wiki's operational checklist (vault/_meta/JOURNALISTIC_STANDARDS.md planned):
- Adopt both frameworks. Tillinghast for concrete-fact verification; Chang for interpretation discipline.
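- Track empirical baseline. The 40-60% source-perceived-error rate is the baseline; the wiki should aim for substantially better, but should expect ~5-10% of its dispositive claims to be wrong on any given page even with discipline. Note that the units differ: the baseline counts articles with at least one error, while the ~5-10% figure counts individual claims.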
- Misinterpretation is the priority class. LLM-driven synthesis is most prone to the subjective-error failure modes (overgeneralization, speculation-as-fact, shifting emphasis). The checklist should weight this class heavily.
- Errors found in audit become CALIBRATION entries. When a wiki page is found wrong on a checkable claim, the error should be classified per the typology and logged in CALIBRATION so the wiki tracks its own miscalibration over time.
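The 40-60% baseline is a per-article rate (at least one error), while the ~5-10% expectation is a per-claim rate, so comparing them requires a conversion. A minimal sketch, assuming claim errors are independent (a simplification; errors on a page likely cluster) and using an illustrative count of 10 dispositive claims per page that does not come from the source:

```python
def page_error_probability(per_claim_error_rate, n_claims):
    """P(at least one wrong claim on a page) = 1 - (1 - p)^n, assuming independence."""
    if not 0.0 <= per_claim_error_rate <= 1.0:
        raise ValueError("per_claim_error_rate must be between 0 and 1")
    if n_claims < 0:
        raise ValueError("n_claims must be non-negative")
    return 1.0 - (1.0 - per_claim_error_rate) ** n_claims


# Illustrative page with 10 dispositive claims (an assumed figure).
low = page_error_probability(0.05, 10)   # about 0.40
high = page_error_probability(0.10, 10)  # about 0.65
```

Under these assumptions, a 5-10% per-claim error rate still leaves roughly 40-65% of ten-claim pages with at least one error, so "substantially better than baseline" is easier to judge when both rates are expressed in the same unit.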
Contradictions / tensions
- The objective/subjective distinction is itself contested in the literature. The Tillinghast lineage treats omissions as subjective (errors of judgment); Chang treats them as objective. The framework choice affects whether omission rates are framed as factual error or judgment error. For wiki purposes, the boundary doesn't matter much, since both classes need checking.
- Source-perceived error rate isn't the same as ground-truth error rate. Sources reviewing their own coverage may overstate errors (treating any departure from their own self-image as a mistake) or understate them (letting errors pass when the coverage is favorable). The 40-60% figure is source-perceived, not externally adjudicated.
- For LLM synthesis specifically, the failure-mode distribution may differ from human journalism. The working expectation is that LLMs over-generalize and confabulate, and are less prone to wrong-date / wrong-name errors than to subjective-interpretation errors, but this is an expectation rather than a measured result. Empirical work specifically on LLM-synthesis errors is needed; the journalism typology is a starting framework, not a fitted one.
Open questions
- What's the empirical error rate of LLM-driven political-source synthesis specifically? The journalism baseline (40-60%) may not transfer. See the future wiki self-audit pass mentioned in 2026-05-13-academic-research-journalism-standards-political-reporting — that's the operational vehicle to measure this.
- Should the wiki track per-page error counts in frontmatter? Trade-off: visibility of error rate vs noise in the wiki structure. Possibly a _meta/quality dashboard.
- Does Chang's misinterpretation typology cover all the LLM-specific failure modes? Or are there modes (e.g., hallucination: making up specific quotes that don't exist in the source) that aren't covered? Likely the latter; "hallucinated citations" might need its own category beyond the journalism typology.
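If hallucinated quotes do get their own category, one cheap detector is to confirm that every quoted span in the wiki text appears verbatim in the source. The sketch below is a minimal illustration with hypothetical function names. It assumes double-quoted spans, normalizes only whitespace and curly quotes, and would wrongly flag a quote that includes trailing punctuation the source lacks, as well as scare-quoted terms that were never meant as quotations.

```python
import re

_QUOTED = re.compile(r'"([^"]+)"')


def _normalize(text):
    """Straighten curly double quotes and collapse runs of whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return " ".join(text.split())


def unsupported_quotes(wiki_text, source_text):
    """Quoted spans in the wiki text that do not appear verbatim in the source."""
    source_norm = _normalize(source_text)
    quotes = _QUOTED.findall(_normalize(wiki_text))
    return [quote for quote in quotes if quote not in source_norm]


source = 'The director said the audit was "long overdue" and promised reforms.'
wiki = 'The director called the audit "long overdue" and "a disgrace".'
suspect = unsupported_quotes(wiki, source)  # only the second quote is unsupported
```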
Related
- journalism-practitioner-codes-canonical-tenets — parent concept; this is the operational expansion of convergent-core item 6
- no-surprises-rule — sibling discrete operational principle
- how-to-enforce-journalism-checklist-in-wiki — the enforcement question
- CALIBRATION — where individual error-instances should be logged as belief updates
- politics/SCOPE — where the operational application sits
- 2026-05-13-academic-research-journalism-standards-political-reporting — primary source