New: a framework inside metaConvert

metaDETECT

metaDETECT is the data-extraction error framework now built into metaConvert. As the package computes your effect sizes, it checks every study in a pairwise meta-analysis and flags the values that don't add up, before they reach your pooled estimate.

For meta-analysts

Checking your extraction before you pool

metaDETECT checks every row of your extraction at two stages: the values you entered (the Invalid family) and the effect sizes they produce (the Unusual and Discordant families). It flags any that do not hold up, before they reach your meta-analysis. This guidance is organised in two parts.

Part 1 · The three families

Three families of error

Each family is set out below with a worked example, what makes the value wrong, and how to resolve it.

Mathematically impossible

Values that cannot be true for any study, whatever the data.

Worked example
study: Smith 2021
or: 1.80
ci_low: 2.10
ci_up: 1.55
What's wrong

The interval's lower bound (2.10) is above its upper bound (1.55). A lower bound can never exceed the upper bound, so no study could have produced this interval.

How to resolve it
Likely cause

The two bounds were swapped, or a digit was dropped.

What to do

Re-enter the interval in the right order. The value is set to NA so it cannot enter the pool.

Also in this family
SD < 0 · |r| > 1 · OR / RR ≤ 0 · RD ∉ [−1, 1] · estimate ∉ its CI · cells > margin
metaDETECT flag
Inverted CI for 'or': lower > upper, set to NA
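
The logic of this family can be sketched in a few lines of base R. This is a minimal illustration, not metaConvert's own code: check_invalid_or is a hypothetical helper, and its messages are not the package's flag strings.

```r
# Hypothetical helper (not part of metaConvert): returns a message for each
# row whose odds ratio and interval cannot be true for any study.
check_invalid_or <- function(or, ci_low, ci_up) {
  msg <- rep(NA_character_, length(or))
  msg[which(or < ci_low | or > ci_up)] <- "estimate outside its CI"
  msg[which(ci_low > ci_up)]           <- "inverted CI: lower > upper"
  msg[which(or <= 0)]                  <- "OR must be > 0"
  msg
}

# Smith 2021 from the worked example above
flag <- check_invalid_or(or = 1.80, ci_low = 2.10, ci_up = 1.55)
print(flag)
```

A row that passes every check returns NA, so the result can be stored as an extra column beside the extraction sheet.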

Statistically implausible

Values that are technically possible but highly improbable, given the rest of the dataset.

Worked example
study: Lee 2020
mean_exp: 18.4
mean_sd_exp: 1.2
n_exp: 54
mean_nexp: 12.1
mean_sd_nexp: 1.4
n_nexp: 51
What's wrong

The two SDs (1.2 and 1.4) are about ten times smaller than the peers in the pool (≈ 14), giving a Hedges' g of 4.70.

How to resolve it
Likely cause

A standard error entered as a standard deviation (SD ≈ SE × √n).

What to do

Check whether the column is an SD or an SE, against the sample size. The value is kept, not deleted.

Also in this family
|logOR| > 5 · α / ICC > 0.99 · n-ratio > 10 · SE ≫ ES · cross-row IQR outlier
metaDETECT flag
Large SMD: |g| = 4.70 (threshold: 3) (from means_sd)
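
To test the likely cause by hand, the entered dispersions can be treated as standard errors and converted with SD = SE × √n. The sketch below assumes the standard Hedges' g formula with the usual small-sample correction; hedges_g is an illustrative helper, not a metaConvert function.

```r
# Hedges' g from group means and SDs (standard small-sample correction).
hedges_g <- function(m1, sd1, n1, m2, sd2, n2) {
  dof <- n1 + n2 - 2
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / dof)
  (1 - 3 / (4 * dof - 1)) * (m1 - m2) / sd_pooled
}

# Lee 2020: if the "SD" columns actually hold standard errors,
# the implied SDs are SE * sqrt(n).
sd_exp  <- 1.2 * sqrt(54)
sd_nexp <- 1.4 * sqrt(51)
g_if_se <- hedges_g(18.4, sd_exp, 54, 12.1, sd_nexp, 51)

round(c(sd_exp = sd_exp, sd_nexp = sd_nexp, g_if_se = g_if_se), 2)
```

If the converted SDs fall in line with the rest of the pool and the effect drops to an ordinary size, the SE-for-SD explanation is worth confirming against the paper.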

Inconsistent across methods

Independent computations of the same effect that ought to agree, but do not.

Worked example
study: Park 2019
mean_exp: 13.1
mean_sd_exp: 5.1
n_exp: 52
mean_nexp: 12.0
mean_sd_nexp: 5.4
n_nexp: 53
student_t: 3.18
from means + SDs: g = 0.21
from student_t: g = 0.62
What's wrong

The two estimates disagree: g = 0.21 versus g = 0.62.

How to resolve it
Likely cause

At least one of the redundant inputs was mis-extracted.

What to do

Re-check the means, the SDs and the t. The disagreement localises the suspect value.

Also in this family
CI overlap = 0 · large min-max range · CI width ≠ SE · cross-study sign flip
metaDETECT flag
Low CI overlap between min/max estimates: 31.2% (threshold: 85%) - min: means_sd, max: student_t
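
The two routes for Park 2019 can be reproduced with the textbook conversions, assuming an independent-samples t and the usual small-sample correction. The helper names are illustrative and are not metaConvert functions.

```r
# Route 1: Hedges' g from means and SDs.
g_from_means <- function(m1, sd1, n1, m2, sd2, n2) {
  dof <- n1 + n2 - 2
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / dof)
  (1 - 3 / (4 * dof - 1)) * (m1 - m2) / sd_pooled
}

# Route 2: Hedges' g from an independent-samples Student's t.
g_from_t <- function(t, n1, n2) {
  dof <- n1 + n2 - 2
  (1 - 3 / (4 * dof - 1)) * t * sqrt(1 / n1 + 1 / n2)
}

g_means <- g_from_means(13.1, 5.1, 52, 12.0, 5.4, 53)
g_t     <- g_from_t(3.18, 52, 53)

round(c(means_sd = g_means, student_t = g_t), 2)
```

Both values describe the same comparison, so a gap of this size points to a mis-extracted input rather than to the study itself.
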
Part 2 · A worked example

From the source articles to a clean dataset

The analyst works from the included articles, recording each study's statistics in the wide-format sheet. Capturing every statistic a paper reports for an effect - not only the minimum one formula needs - gives metaDETECT more than one estimate of that effect to compare. Run before pooling, metaDETECT then surfaces an error a forest plot cannot: two redundant statistics in one row that disagree, while the value entering the model still looks unremarkable.

Collect the source articles

It begins with the included studies themselves: each article is the source for one row, and the analyst records its statistics directly from the paper. Five trials are included here; what each reports varies, and one, Okafor 2020, also reports a Student's t for the comparison.

Adesina 2018: means + SDs
Brandt 2019: means + SDs
Okafor 2020: means, SDs + t
Petrov 2021: means + SDs
Sandberg 2022: means + SDs
Each article is the primary source for one study. Most report group means and standard deviations; Okafor 2020 also reports a test statistic, a second route to the same effect.

Extract every statistic reported for the effect

A paper usually reports an effect several ways - group means and standard deviations, a test statistic, a confidence interval, a p-value. Extract all of them, not only the one a single formula needs: metaConvert turns each into an effect size, and the agreement between those routes is the check that follows. Each statistic must be matched to the right outcome, since a paper often reports more than one.

Okafor 2020 · results section

“Primary outcome (cognitive score): the intervention group exceeded control (24.6 ± 6.3 versus 21.7 ± 6.5; n = 52, 48), t(98) = 2.26, P = .026.”

“Secondary outcome (fatigue scale): a larger separation was observed, t(98) = 4.78, P < .001.”

One effect, several routes. The means, SDs, sample sizes and the primary-outcome t all describe the cognitive-score effect, and each yields an effect size. The t = 4.78 belongs to a different outcome and must not be paired with this comparison.

Record each statistic in the extraction sheet

Each study occupies one row. Okafor 2020 carries both the group statistics and its reported t (in the student_t column), giving metaDETECT two estimates of that effect to compare; the other four offer a single route.

study_id        n_exp  mean_exp  mean_sd_exp  n_nexp  mean_nexp  mean_sd_nexp  student_t
Adesina 2018    44     31.2      7.4          45      27.6       7.6           -
Brandt 2019     39     28.9      6.9          41      25.7       7.2           -
Okafor 2020     52     24.6      6.3          48      21.7       6.5           4.78
Petrov 2021     50     35.1      8.1          49      30.9       8.4           -
Sandberg 2022   36     19.8      5.9          38      16.9       6.1           -
The column names are metaConvert's recognised inputs; the input-data page lists every one. Only Okafor 2020 provides a second route, so only its row can be cross-checked.
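
For analysts working in R, the same sheet can be typed as a data frame. The sketch below uses the column names shown in the table; representing the empty student_t cells as NA is an assumption made here, not something this page specifies.

```r
# The extraction sheet above as an R data frame.
# Assumption: cells shown as "-" are stored as NA.
data <- data.frame(
  study_id     = c("Adesina 2018", "Brandt 2019", "Okafor 2020",
                   "Petrov 2021", "Sandberg 2022"),
  n_exp        = c(44, 39, 52, 50, 36),
  mean_exp     = c(31.2, 28.9, 24.6, 35.1, 19.8),
  mean_sd_exp  = c(7.4, 6.9, 6.3, 8.1, 5.9),
  n_nexp       = c(45, 41, 48, 49, 38),
  mean_nexp    = c(27.6, 25.7, 21.7, 30.9, 16.9),
  mean_sd_nexp = c(7.6, 7.2, 6.5, 8.4, 6.1),
  student_t    = c(NA, NA, 4.78, NA, NA)
)

# Rows that offer a second route to the same effect
two_routes <- data$study_id[!is.na(data$student_t)]
print(two_routes)
```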

metaDETECT flags suspect estimates before pooling

metaConvert recomputes every effect size each row allows, and metaDETECT checks all of them - including the routes a forest plot discards - and flags any that disagree. In R this is two lines; the same sheet is uploaded, and the same checks run, in the web application:

    ## compute every effect size, then flag suspect rows
    res <- convert_df(data, measure = "g")
    summary(res, flags = TRUE)

The summary returns one row per study: the selected estimate in es_crude, its standard error in se_crude, and any checks in flags_crude. Four rows are clear; Okafor 2020 carries a flag.

study_id        es_crude  se_crude  flags_crude
Adesina 2018    0.476     0.213     -
Brandt 2019     0.449     0.224     -
Okafor 2020     0.450     0.201     Discordant
Petrov 2021     0.505     0.203     -
Sandberg 2022   0.478     0.233     -
Discordant - Low CI overlap between min/max estimates: 24% (threshold: 85%) - min: means_sd, max: student_t
es_crude, se_crude and flags_crude are the crude-scope columns metaConvert returns by default; the flags_crude cell above is abbreviated to its family, and the complete message appears beneath the table.

Read the flag, and correct it at the source

The two routes for Okafor 2020 disagree. The means and standard deviations give g = 0.45; the recorded t = 4.78 implies g = 0.95, more than twice as large, and the two confidence intervals overlap by only 24 percent. Because both describe the same comparison, they cannot both be right.

from means + SDs: g = 0.45 [0.05, 0.85]
from student_t: g = 0.95 [0.53, 1.37]
CI overlap: 24% (threshold 85%)
Two estimates of one effect that ought to coincide. metaConvert carries the higher-ranked means-and-SD estimate, g = 0.45, into the analysis, so a forest plot would look entirely normal; the conflict shows only when metaDETECT checks the second route.
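
One way to arrive at a figure like 24 percent is to divide the length shared by the two intervals by the length of their combined span. That definition is an assumption made for this sketch; the page does not state how metaDETECT computes the overlap, and ci_overlap is a hypothetical helper.

```r
# Assumed definition (not confirmed by the source): length of the
# intersection of two intervals divided by the length of their combined span.
ci_overlap <- function(lo1, up1, lo2, up2) {
  intersection <- max(0, min(up1, up2) - max(lo1, lo2))
  span <- max(up1, up2) - min(lo1, lo2)
  intersection / span
}

# Okafor 2020: means + SDs route versus student_t route
overlap <- ci_overlap(0.05, 0.85, 0.53, 1.37)
round(100 * overlap)   # percent
```

Under this definition two identical intervals overlap by 100 percent and two disjoint intervals by 0.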

Returning to the article resolves it: the t = 4.78 is the secondary-outcome (fatigue) value, recorded in place of the primary-outcome statistic, t = 2.26. With the value corrected, the two routes coincide (g = 0.45 from each, intervals overlapping by 99.8 percent) and the row clears:

study_id      mean_exp  mean_sd_exp  mean_nexp  mean_sd_nexp  student_t  es_crude  flags_crude
Okafor 2020   24.6      6.3          21.7       6.5           2.26       0.450     -
The means and SDs were always the higher-ranked route, so the selected estimate is unchanged (es_crude = 0.450); with the corrected t = 2.26 the two routes now agree, and metaDETECT no longer flags the row.
Why capture the redundant statistic. Had only the means and SDs been extracted, Okafor 2020 would have entered the pool at a plausible g = 0.45 and the mis-recorded t would never have been examined. The second route is what gives metaConvert a value to disagree with, and metaDETECT is what reads the disagreement before pooling.

Checklist before pooling

  • Was every statistic each paper reports for an effect captured, not only the minimum one formula needs?
  • Where a row offers two routes to the same effect, do they agree?
  • Is any effect size implausibly large for the intervention or exposure?
  • Does any estimate sit far from the rest of the pool?
  • Do the studies split into opposed directions - some intervals entirely above the null, others entirely below?
  • Does the same trial enter the dataset more than once?
For reviewers & editors

Detecting extraction errors in a published meta-analysis

Reviewers and editors rarely hold the data file, yet a substantial share of extraction errors remain detectable from the published forest plot together with the source articles. This guidance is organised in two parts.

Part 1 · The patterns

Error patterns visible in forest plots

Each pattern below is described by its visual signature, its likely cause, and the metaDETECT check that flags it.

The plots are illustrative; the highlighted study is the one that warrants scrutiny.

An implausibly large effect

Forest plot
Appearance

One study's effect is far larger than any intervention could plausibly produce: here a standardised mean difference above 3.

How to read it
Likely cause

A standard error entered as a standard deviation, or a unit error. For scale, an SMD of 3 on a cognitive test is a 45-point IQ gain (3 × the 15-point SD).

Detected by
Unusual: |g| > 3
Reviewer action

Convert the effect back to the test's units. If the implied change is clinically impossible, an input is wrong.
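
The back-conversion is a single multiplication. A minimal sketch, with smd_to_units as a hypothetical helper and 15 points as the SD of an IQ-type scale:

```r
# Convert a standardised mean difference back to the outcome's own units.
# scale_sd is the SD of the instrument (15 for IQ-type scores).
smd_to_units <- function(smd, scale_sd) smd * scale_sd

# An ordinary effect and an implausible one, side by side
implied_points <- smd_to_units(c(0.45, 3), scale_sd = 15)
print(implied_points)
```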

An outlying effect estimate

Forest plot
Appearance

One study sits far from the others, though its value is not impossible on its own.

How to read it
Likely cause

A wrong row or arm, a unit mismatch, or a misread statistic.

Detected by
Unusual: cross-study outlier
Reviewer action

Re-extract that study from the source and recompute.

An implausibly narrow interval

Forest plot
Appearance

One interval is very short and its square dominates the plot, though the study is no larger than its neighbours.

How to read it
Likely cause

A standard error entered as a confidence-interval width, or an inflated sample size.

Detected by
Unusual: SE too small for the sample size
Reviewer action

Check the interval against the reported sample size.
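
A rough version of this check compares the standard error implied by the published interval with the one the sample size would support. The sketch assumes a 95% interval and the usual large-sample approximation for the SE of a standardised mean difference; the cut-off metaDETECT applies is not stated on this page, and the second study below is made up.

```r
# Approximate large-sample standard error of a standardised mean difference.
expected_se <- function(g, n1, n2) {
  sqrt(1 / n1 + 1 / n2 + g^2 / (2 * (n1 + n2)))
}

# Standard error implied by a reported 95% confidence interval.
se_from_ci <- function(lo, up) (up - lo) / (2 * qnorm(0.975))

# Okafor 2020 from the meta-analyst example: interval and sample size agree.
ratio_ok <- se_from_ci(0.05, 0.85) / expected_se(0.45, 52, 48)

# Hypothetical study (made-up numbers): 40 per arm, yet a very short interval.
ratio_narrow <- se_from_ci(0.35, 0.45) / expected_se(0.40, 40, 40)

round(c(ok = ratio_ok, narrow = ratio_narrow), 2)
```

A ratio near 1 means the interval is about as wide as the sample size predicts; a ratio far below 1 marks an interval too narrow for the study that produced it.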

Estimates split in both directions

Forest plot
Appearance

The pool splits into two opposed clusters: a group of studies is significant in one direction and another group in the other, each with intervals that exclude the null. A few studies still straddle it.

How to read it
Likely cause

In one cluster the treatment and control arms were swapped, or the sign of a difference was dropped, flipping those studies across the null.

Detected by
Discordant: cross-study direction conflict
Reviewer action

Confirm the arm coding for the studies on each side of the split.
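
Read off a plot, the pattern amounts to finding intervals that exclude the null on both sides. A minimal sketch with made-up intervals (not taken from any real review), assuming a null value of 0 as for a standardised mean difference:

```r
# Hypothetical 95% intervals for five studies (illustrative values only)
ci_lo <- c(0.20, 0.15, -0.10, -0.70, -0.65)
ci_up <- c(0.80, 0.75,  0.50, -0.10, -0.05)

above <- ci_lo > 0   # interval entirely above the null
below <- ci_up < 0   # interval entirely below the null

# A direction conflict needs at least one study on each side
direction_conflict <- any(above) && any(below)
print(direction_conflict)
```

For ratio measures such as OR or RR the same comparison would be made against 1, or against 0 on the log scale.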

The same trial, counted twice

Forest plot
Appearance

Two rows carry an identical estimate and confidence interval: the same trial appears more than once.

How to read it
Likely cause

Usually an accidental duplicate of one trial. A study can also contribute several rows legitimately (subgroups, timepoints, or outcomes), all belonging to the same trial.

Detected by
Info: duplicate study_id
Reviewer action

Remove an accidental duplicate. If the rows are genuinely separate estimates from one trial, model the dependency with a multilevel or multivariate meta-analysis rather than entering them independently, so the trial is not double-counted.
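
Both signs of a duplicate, a repeated identifier and repeated numbers, can be found with base R's duplicated(). The sheet below is made up for illustration, and the short column names es, ci_lo and ci_up are placeholders rather than metaConvert inputs.

```r
# Hypothetical sheet (made-up studies): "Trial A 2018" was entered twice.
sheet <- data.frame(
  study_id = c("Trial A 2018", "Trial B 2019", "Trial A 2018", "Trial C 2020"),
  es       = c(0.42, 0.55, 0.42, 0.38),
  ci_lo    = c(0.10, 0.22, 0.10, 0.05),
  ci_up    = c(0.74, 0.88, 0.74, 0.71)
)

# Mark every member of a duplicated set, not only the later copies
both_ways <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

sheet$dup_id      <- both_ways(sheet$study_id)
sheet$dup_numbers <- both_ways(sheet[c("es", "ci_lo", "ci_up")])

sheet[sheet$dup_id | sheet$dup_numbers, ]
```

Rows that share an identifier but not their numbers are the candidates for legitimate multiple estimates from one trial.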

Part 2 · A worked review

Verifying a suspect estimate, from plot to author query

A published forest plot reports each trial's group statistics alongside the effect size they yield. The procedure below re-derives those effect sizes with metaConvert and checks them with metaDETECT.

Identify the anomalous estimate

The forest plot lists, for each trial, the per-arm sample size, mean and standard deviation, and the standardised mean difference they produce. Five estimates fall between g = 0.38 and 0.61; Nakamura 2021 is reported at g = 3.04. Its standard deviations (2.0 and 2.1) are also far smaller than those of the other trials (about 10 to 12) - a combination that is implausible for a cognitive-training outcome and requires verification.

Re-derive the estimate, rather than judge it by eye

An implausible value is best confirmed by recomputation. The statistics the plot reports are entered into metaConvert, which recomputes each effect size and applies the metaDETECT checks - an objective record that can be cited in a review.

Transcribe the data into one extraction sheet

Every trial is entered in a single sheet: the per-arm summary statistics in the standard columns, and the published effect size in metaConvert's user-input columns - not a column named after the measure. Holding both lets metaConvert recompute each estimate from the raw data and cross-check it against the published value.

study           n_exp  mean_exp  mean_sd_exp  n_nexp  mean_nexp  mean_sd_nexp  user_es_original_measure_crude  user_es_crude  user_ci_lo_crude  user_ci_up_crude
Alvarez 2017    41     62.4      11.0         43      57.7       11.4          g                               0.42           0.10              0.74
Bianchi 2018    38     58.1      9.8          39      52.6       10.1          g                               0.55           0.22              0.88
Cohen 2019      52     70.3      12.1         50      65.8       11.6          g                               0.38           0.05              0.71
Duarte 2020     45     49.2      10.4         44      44.1       10.7          g                               0.49           0.16              0.82
Faulkner 2021   33     -         -            31      -          -             g                               0.61           0.20              1.02
Nakamura 2021   30     24.5      2.0          30      18.2       2.1           g                               3.04           2.62              3.46
The reported effect size occupies the user-input columns, never a column named g; user_es_target_measure_crude is set from the analysis measure and is not extracted. Faulkner 2021 reports no means or SDs, so its effect size alone is used. The input-data page lists every recognised column.
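
The cross-check itself can be sketched by hand in base R: recompute Hedges' g from each row's summary statistics and set it beside the published value. This assumes the standard formula with the usual small-sample correction and stores the "-" cells as NA; it illustrates the idea and is not metaConvert's implementation.

```r
sheet <- data.frame(
  study         = c("Alvarez 2017", "Bianchi 2018", "Cohen 2019",
                    "Duarte 2020", "Faulkner 2021", "Nakamura 2021"),
  n_exp         = c(41, 38, 52, 45, 33, 30),
  mean_exp      = c(62.4, 58.1, 70.3, 49.2, NA, 24.5),
  mean_sd_exp   = c(11.0, 9.8, 12.1, 10.4, NA, 2.0),
  n_nexp        = c(43, 39, 50, 44, 31, 30),
  mean_nexp     = c(57.7, 52.6, 65.8, 44.1, NA, 18.2),
  mean_sd_nexp  = c(11.4, 10.1, 11.6, 10.7, NA, 2.1),
  user_es_crude = c(0.42, 0.55, 0.38, 0.49, 0.61, 3.04)
)

# Hedges' g recomputed from the summary statistics (NA where none are reported)
dof <- sheet$n_exp + sheet$n_nexp - 2
sd_pooled <- sqrt(((sheet$n_exp - 1) * sheet$mean_sd_exp^2 +
                   (sheet$n_nexp - 1) * sheet$mean_sd_nexp^2) / dof)
sheet$g_recomputed <- (1 - 3 / (4 * dof - 1)) *
  (sheet$mean_exp - sheet$mean_nexp) / sd_pooled

# Gap between published and recomputed values, and rows above |g| = 3
sheet$gap   <- abs(sheet$user_es_crude - sheet$g_recomputed)
sheet$large <- abs(sheet$g_recomputed) > 3

sheet[c("study", "user_es_crude", "g_recomputed", "gap", "large")]
```

Small gaps are expected from rounding in the published plot; Faulkner 2021 returns NA because it has no summary statistics to recompute from.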

Recompute and check with metaConvert

metaConvert recomputes each effect size from the summary statistics and applies the metaDETECT checks to every row. The procedure runs in R, or without code in the web application:

    ## recompute every effect size, then check each row
    res <- convert_df(data, measure = "g")
    summary(res, flags = TRUE)

In the metaConvert web app the same sheet is uploaded and the metaDETECT panel opened. In either case, one row is returned with a flag:

Unusual - Large SMD: |g| = 3.04 (threshold: 3) (from means_sd)

Refer the discrepancy to the authors

metaDETECT identifies an implausible value, not its cause. A standardised mean difference of 3.04 corresponds to a separation of approximately three standard deviations between arms. The authors should be asked to verify the means and standard deviations for the trial and, in particular, to confirm whether the reported dispersion is a standard deviation or a standard error - the latter being smaller by a factor of √n.

Suggested query. “The reported standardised mean difference for Nakamura 2021 (g ≈ 3.04) implies a three-standard-deviation separation between arms. Please confirm the group means and standard deviations, and whether the value in the SD column is a standard deviation or a standard error.”
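
A reviewer can strengthen the query by testing the standard-error hypothesis directly. The sketch below treats the reported dispersions as standard errors, converts them with SD = SE × √n, and recomputes Hedges' g with the standard formula; it indicates what the authors may find and is not a correction to apply without their confirmation.

```r
# Nakamura 2021: suppose the reported "SDs" (2.0 and 2.1) are standard errors.
n <- 30                                            # per arm
sd_if_se <- c(exp = 2.0, nexp = 2.1) * sqrt(n)     # implied SDs

# Equal arm sizes, so the pooled variance is the mean of the two variances
sd_pooled <- sqrt(mean(sd_if_se^2))
dof <- 2 * n - 2
g_if_se <- (1 - 3 / (4 * dof - 1)) * (24.5 - 18.2) / sd_pooled

round(c(sd_if_se, g_if_se = g_if_se), 2)
```

Implied SDs in the 10 to 12 range of the other trials, and an effect inside the 0.38 to 0.61 band of the other estimates, would make a standard error in the SD column the leading explanation.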

Checklist for reviewers

  • Is any effect larger than the intervention could plausibly produce?
  • Does any study sit far from the others?
  • Is any interval so narrow that one study holds most of the weight?
  • Do the studies split into opposed directions, some intervals excluding the null above it and others below?
  • Does any interval sit unevenly around its estimate?
  • Do two studies report identical numbers?
  • Does the same trial appear more than once?