Understanding PT statistics

Technical notes | 2024 | Eurachem

Summary

Importance of the topic

The statistical descriptors reported in proficiency testing (PT) schemes determine how laboratories interpret their performance relative to peers and reference values. Clear understanding of location estimators, measures of dispersion and the uncertainty of those estimates is essential for accreditation, method validation, quality assurance and regulatory compliance. Correct selection and interpretation of these statistics reduce the risk of misidentifying method bias, wrongly excluding results as outliers, or drawing invalid conclusions from small datasets.

Objectives and overview of the document

This leaflet aims to help participants in quantitative PT schemes interpret the statistical parameters in PT reports. It summarizes the principal summary statistics used to describe a set of technically valid results: location (central tendency), dispersion (spread), and the standard uncertainty of the location estimate. It also highlights choices recommended by ISO 13528, the implications of method heterogeneity among participants, and practical cautions for small datasets and outliers.

Methodology and statistical approaches

Summary of location estimators and dispersion measures:

Location (central tendency): arithmetic mean, median, and robust estimators (Algorithm A, Hampel estimator and similar). ISO 13528 recommends robust estimators when the data show outliers or asymmetry. The arithmetic mean is appropriate when the dataset has been screened and outliers removed.
Dispersion (spread): classical standard deviation (s) and robust alternatives such as scaled median absolute deviation (MADe), normalized interquartile range (nIQR), robust standard deviation s* (from Algorithm A, Qn, Q methods) and related robust measures. These robust measures are often scaled to be comparable with the standard deviation of a normal distribution.
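As an illustration of the estimators listed above, the following minimal sketch computes the classical and two robust summaries with the Python standard library. The data are hypothetical, and the function names are chosen for this example only. The scaling factors 1.483 (MAD to MADe) and 0.7413 (IQR to nIQR) are the usual normal-distribution factors. The quartile convention (`method="inclusive"`) is an assumption; PT providers may use a different one, which changes nIQR slightly.

```python
import statistics

def made(results):
    """Scaled median absolute deviation: 1.483 * median(|x - median(x)|)."""
    med = statistics.median(results)
    mad = statistics.median([abs(x - med) for x in results])
    return 1.483 * mad

def niqr(results):
    """Normalized interquartile range: 0.7413 * (Q3 - Q1).
    The quartile convention used here is an assumption of this sketch."""
    q1, _, q3 = statistics.quantiles(results, n=4, method="inclusive")
    return 0.7413 * (q3 - q1)

# Hypothetical participant results; the last one is a gross outlier.
results = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 13.5]

mean = statistics.mean(results)      # pulled upwards by the outlier
s = statistics.stdev(results)        # classical standard deviation, inflated
median = statistics.median(results)  # barely affected by the outlier
made_value = made(results)
niqr_value = niqr(results)

print(f"mean={mean:.3f}  s={s:.3f}")
print(f"median={median:.3f}  MADe={made_value:.4f}  nIQR={niqr_value:.4f}")
```

With this dataset the single discrepant result shifts the mean and inflates s, while the median, MADe and nIQR stay close to the values for the main body of results.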

Key statistical points and practical rules:

Assigned value (x_pt), its uncertainty u(x_pt), and the standard deviation for proficiency assessment (σ_pt) should be reported by the PT provider along with the method used to obtain them.
When using the arithmetic mean, PT providers typically test the dataset for outliers and remove them before reporting the mean.
Robust estimators are preferred when the dataset contains outliers or lacks symmetry; they reduce the influence of discrepant results.
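All estimators (location or dispersion) are less reliable with small numbers of results: the uncertainty of the estimate increases, and even robust estimators can break down when a few discrepant results make up a large fraction of the dataset.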
If participants use different analytical methods, PT providers may report separate estimated locations and dispersions per method to avoid masking method-dependent biases.
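The last point can be sketched as follows: results are grouped by the method each participant declared, and a location and dispersion are computed per group. The method labels, the data and the `summarize_by_method` helper are hypothetical; the choice of median and MADe is one option among the robust estimators mentioned above, not a prescribed procedure.

```python
import statistics
from collections import defaultdict

def summarize_by_method(submissions):
    """Return {method: {"p", "median", "MADe"}} for (method, value) pairs."""
    groups = defaultdict(list)
    for method, value in submissions:
        groups[method].append(value)
    summary = {}
    for method, values in groups.items():
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        summary[method] = {"p": len(values), "median": med, "MADe": 1.483 * mad}
    return summary

# Hypothetical submissions: method "B" reads consistently higher than "A".
submissions = [
    ("A", 5.0), ("A", 5.1), ("A", 4.9), ("A", 5.0), ("A", 5.2),
    ("B", 5.6), ("B", 5.5), ("B", 5.7), ("B", 5.6),
]

summary = summarize_by_method(submissions)
pooled_median = statistics.median([v for _, v in submissions])

for method, stats in summary.items():
    print(method, stats)
print("pooled median:", pooled_median)
```

In this example the pooled median falls between the two method medians and describes neither group well, which is the masking effect that per-method reporting is meant to avoid. Note that each subgroup is small, so the per-method estimates are themselves less certain.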

Standard uncertainty of the location:

For the arithmetic mean, the standard uncertainty is the classical standard deviation divided by the square root of the number of results: u(mean) = s / sqrt(p), where s is the classical standard deviation and p is the number of results.
For robust location estimators ISO 13528 recommends a conservative multiplicative factor (1.25) when estimating the uncertainty: u(robust) ≈ 1.25 · s* / sqrt(p), where s* is a robust estimate of dispersion (e.g., MADe, nIQR, Qn-derived s*).
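The two formulas above translate directly into code. This is a minimal sketch with hypothetical function names and example numbers; `s_star` is assumed to have been obtained beforehand from whichever robust dispersion estimator the PT provider uses.

```python
import math
import statistics

def u_mean(results):
    """Standard uncertainty of the arithmetic mean: s / sqrt(p)."""
    p = len(results)
    return statistics.stdev(results) / math.sqrt(p)

def u_robust(s_star, p):
    """Standard uncertainty of a robust location: 1.25 * s* / sqrt(p)."""
    return 1.25 * s_star / math.sqrt(p)

# Hypothetical, outlier-free results for the classical case.
screened = [10.0, 10.2, 9.8, 10.0]
u_classical = u_mean(screened)

# Hypothetical robust standard deviation from p = 16 results.
u_rob = u_robust(s_star=0.12, p=16)

print(f"u(mean)   = {u_classical:.4f}")
print(f"u(robust) = {u_rob:.4f}")
```

Because both expressions scale with 1 / sqrt(p), halving the uncertainty requires roughly four times as many results.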

Conceptual characteristics of estimators described in PT reports:

Resistance to outliers and breakdown point: robust estimators have higher tolerance to a minority of discrepant values.
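Efficiency: how the variance of an estimator compares with that of the arithmetic mean for normally distributed data, commonly expressed as the ratio of the variance of the mean to the variance of the estimator; high-efficiency estimators yield lower uncertainty when normality holds.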
Resistance to minor modes (RMM): the ability of an estimator to avoid bias when a minority subgroup of results forms a distinct mode; different estimators vary in this property.
Some scaling factors used to convert MAD to MADe or IQR to nIQR assume the underlying data are normally distributed; the conversion is approximate if normality does not hold.
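The notion of efficiency can be made concrete with a small simulation. The sketch below draws repeated normal samples and compares the variance of the sample mean with the variance of the sample median. The sample size, number of trials and seed are arbitrary choices for this illustration, and the function name is hypothetical.

```python
import random
import statistics

def relative_efficiency_of_median(p=21, trials=4000, seed=1):
    """Estimate var(mean) / var(median) for normal samples of size p."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(p)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means) / statistics.variance(medians)

eff = relative_efficiency_of_median()
print(f"relative efficiency of the median (p=21): {eff:.2f}")
```

For large normal samples the theoretical value of this ratio is 2/π (about 0.64), so the median needs noticeably more results than the mean to reach the same uncertainty. That is the price paid for its resistance to outliers.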

Main results and discussion

The leaflet emphasizes transparency and appropriate statistical practice in PT reporting. Main conclusions and implications for participants include:

PT reports should explicitly state the statistical approach used to derive x_pt, u(x_pt) and σ_pt so participants can assess the relevance and comparability of the reported performance metrics.
A significant difference between the estimated location and an external reference value suggests method bias and should prompt investigation; the bias may originate from one or more of the methods used by participants.
If multiple analytical methods were used across participants, separate summaries per method help to reveal method-dependent bias and differing dispersion.
Table 1 in the source contrasts estimators: simple arithmetic mean and classical standard deviation are efficient for normal, outlier-free data but vulnerable to outliers; median and robust methods trade some efficiency (when data are normal) for much greater robustness in the presence of outliers or multimodality.
Any outlier removal or trimming should be documented; robust estimators reduce dependence on ad hoc outlier rejection.
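To show how a robust estimator limits the influence of discrepant results without explicit rejection, here is a minimal sketch of an iterative winsorizing procedure in the style of Algorithm A as described in ISO 13528. The constants 1.483, 1.5 and 1.134 follow the usual description of that algorithm; the convergence tolerance, iteration cap and example data are assumptions of this sketch, and a PT provider's implementation may differ in detail.

```python
import statistics

def algorithm_a(results, tol=1e-6, max_iter=100):
    """Return (x_star, s_star), a robust mean and standard deviation."""
    # Initial values: median and scaled median absolute deviation.
    x_star = statistics.median(results)
    s_star = 1.483 * statistics.median([abs(x - x_star) for x in results])
    for _ in range(max_iter):
        delta = 1.5 * s_star
        lo, hi = x_star - delta, x_star + delta
        # Winsorize: pull results outside [lo, hi] back to the limits.
        wins = [min(max(x, lo), hi) for x in results]
        new_x = statistics.mean(wins)
        new_s = 1.134 * statistics.stdev(wins)
        converged = abs(new_x - x_star) <= tol and abs(new_s - s_star) <= tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star

# Hypothetical results; the last one is a gross outlier.
results = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 13.5]
x_star, s_star = algorithm_a(results)

print(f"arithmetic mean = {statistics.mean(results):.3f}")
print(f"robust mean x*  = {x_star:.3f}")
print(f"robust SD s*    = {s_star:.3f}")
```

The outlier is not deleted; it is replaced at each iteration by the nearest limit, so it still counts as one result but cannot drag the estimate far. If more than half of the results are identical, the initial s* is zero and the procedure returns the median with zero dispersion, a known limitation with heavily rounded data.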

Benefits and practical applications

Understanding these statistical parameters helps laboratories and PT participants to:

Choose PT schemes whose statistical treatment matches their data characteristics (e.g., presence of multiple methods, expected heterogeneity).
Interpret PT results correctly for accreditation evidence, internal QA, and regulatory reporting.
Distinguish between random variability and systematic bias, and decide when to investigate method performance or implement corrective actions.
Assess the reliability of the assigned value and the reported proficiency standard deviation, especially in small-sample scenarios.

Future trends and possibilities for use

Emerging and desirable directions for PT statistics and reporting include:

Wider adoption of robust statistical workflows and explicit reporting conventions for assigned values and uncertainties.
Stratified reporting by analytical method or instrument class to improve comparability and to reveal method-specific issues.
Improved guidance for small-sample PTs: use of bootstrap or simulation-based approaches to quantify uncertainty and to support decision rules when p is small.
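As a rough illustration of the bootstrap idea mentioned above, the sketch below resamples a small set of results with replacement and takes the standard deviation of the resampled medians as an estimate of the uncertainty of the median. The data, the number of resamples, the seed and the function name are all hypothetical, and this is not a procedure taken from the leaflet or from ISO 13528.

```python
import random
import statistics

def bootstrap_u_median(results, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard uncertainty of the median."""
    rng = random.Random(seed)
    p = len(results)
    medians = [
        statistics.median(rng.choices(results, k=p))  # resample with replacement
        for _ in range(n_boot)
    ]
    return statistics.stdev(medians)

# Hypothetical small PT round with six results.
small_round = [4.8, 5.1, 5.0, 5.3, 4.9, 5.2]
u_boot = bootstrap_u_median(small_round)
print(f"bootstrap u(median) = {u_boot:.3f}")
```

With very few results the bootstrap distribution is itself coarse, so the estimate should be treated as indicative rather than exact.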
Software tools and automated pipelines that implement ISO 13528-recommended procedures, conservatively estimate uncertainties, and document all steps to support transparency and reproducibility.
Application of machine learning and anomaly-detection methods to flag atypical submissions while maintaining clear statistical bases for decisions.

Conclusion

Proficiency testing reports must provide not only summary numbers but clear descriptions of how those numbers were obtained. Robust estimators and conservative uncertainty practices protect participants against misleading conclusions arising from outliers or asymmetric data. Participants should review PT provider methodology descriptions (x_pt, u(x_pt), σ_pt), consider method-specific reporting, and exercise caution with small datasets. These practices improve the value of PT as a tool for method assessment, interlaboratory comparison and quality assurance.

References

Brookman B., Mann I., editors. Eurachem Guide: Selection, Use and Interpretation of Proficiency Testing (PT) Schemes. 3rd ed. 2021.
Eurachem Proficiency Testing Working Group. Understanding PT performance assessment. Eurachem leaflet.
ISO 13528:2022. Statistical methods for use in proficiency testing by interlaboratory comparison.

Content was automatically generated from an original PDF document using AI and may contain inaccuracies.