LabRulez s.r.o. All rights reserved. Content available under a CC BY-SA 4.0 Attribution-ShareAlike

Validating and Comparing Component Detection Algorithms for LC-MS Data Assignment

Posters | 2015 | Thermo Fisher Scientific
Instrumentation
Software, LC/MS
Industries
Manufacturer
Thermo Fisher Scientific

Summary

Importance of the Topic


The reliable detection of chromatographic features in liquid chromatography–mass spectrometry (LC-MS) is fundamental for metabolomics, drug metabolism studies and quality control in pharmaceutical and biological research. In practice, evaluating feature detection algorithms is hampered by the scarcity of fully annotated benchmark datasets that include both true positive and true negative signals. An exhaustive annotation approach addresses this gap, enabling an objective stress test of algorithm accuracy on complex data.

Study Objectives and Overview


This work introduces a semi-automated annotation workflow and visualization tool (TotalRecall) to generate comprehensive LC-MS feature annotations. The objectives are to:
  • Create exhaustive lists of true positives (TP) and true negatives (TN) in diverse samples.
  • Compare feature detection algorithms under realistic conditions.
  • Demonstrate precision-recall analysis as a robust performance metric for imbalanced data.


Methodology


Three sample types were annotated: (1) a pure analyte spike (buspirone), (2) complex biological spiked mixtures (rat hepatocyte metabolite extracts and amino acid standards in positive and negative modes), and (3) a drug example lacking a targeted key (diclofenac). TotalRecall groups monoisotopic and isotopic signals into clusters, then classifies features as TP (validated A0 isotopes with consistent chromatographic shape) or TN (orphan isotopic signals or misaligned clusters). An exhaustive answer key is built via manual curation assisted by visualization tabs for clusters, TP, TN and unresolved (“Unknown”) features. Algorithm outputs are scored against these keys using varied signal‐intensity thresholds to generate precision-recall curves.
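The scoring step described above can be sketched in Python as follows. This is a minimal illustration, not the TotalRecall implementation: the `Feature` class, the 5 ppm and 0.1 min matching tolerances, the function names and the example values are all assumptions made here. For simplicity, any detection that does not match a TP entry is counted as a false positive, whereas the poster also distinguishes TN and "Unknown" features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float         # mass-to-charge ratio
    rt: float         # retention time in minutes
    intensity: float  # peak intensity, arbitrary units


def matches(a, b, ppm=5.0, rt_tol=0.1):
    """True if two features agree within the assumed m/z (ppm) and RT tolerances."""
    return abs(a.mz - b.mz) <= b.mz * ppm * 1e-6 and abs(a.rt - b.rt) <= rt_tol


def precision_recall(detected, true_positives, threshold):
    """Score detections with intensity >= threshold against the TP answer key."""
    kept = [d for d in detected if d.intensity >= threshold]
    # Detections that correspond to some annotated true positive.
    hits = sum(any(matches(d, t) for t in true_positives) for d in kept)
    # Annotated true positives that were recovered by some detection.
    found = sum(any(matches(d, t) for d in kept) for t in true_positives)
    precision = hits / len(kept) if kept else 1.0
    recall = found / len(true_positives) if true_positives else 0.0
    return precision, recall


def pr_curve(detected, true_positives, thresholds):
    """One (threshold, precision, recall) point per intensity threshold."""
    return [(t, *precision_recall(detected, true_positives, t)) for t in thresholds]


# Made-up answer key and algorithm output, for illustration only.
key = [Feature(386.2551, 5.20, 1e6), Feature(402.2500, 4.10, 2e5)]
detected = [
    Feature(386.2552, 5.21, 9e5),
    Feature(402.2501, 4.12, 1.5e5),
    Feature(250.1000, 3.00, 5e4),  # no counterpart in the key
    Feature(310.2000, 7.50, 2e4),  # no counterpart in the key
]

curve = pr_curve(detected, key, [1e4, 1e5, 5e5])
for threshold, p, r in curve:
    print(f"threshold={threshold:g} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold in this toy example first removes the two unmatched detections (precision rises) and then starts to drop genuine features (recall falls), which is the trade-off a precision-recall curve traces.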

Instrumentation


Mass spectra were acquired on Thermo Scientific Exactive™ Orbitrap instruments, coupled to standard LC systems. Sample preparation involved spiking known concentrations into biological matrices and standard mixes.

Main Results and Discussion


Precision-recall analysis across datasets revealed that:
  • Algorithms achieve near-complete recall on simple spikes but produce many false positives when benchmarked against exhaustive annotations.
  • Complex samples (amino acids, buspirone metabolites, diclofenac) show clear differences among algorithms at lower intensity thresholds.
  • ROC curves understate performance gaps because of extreme class imbalance; precision-recall plots visualize the trade-offs more clearly.
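The point about ROC curves can be checked with simple arithmetic. The sketch below uses hypothetical counts (100 true features among 100,000 noise signals, not figures from the poster) and compares two imaginary algorithms with the same recall but a ninefold difference in false positives.

```python
def rates(tp, fn, fp, tn):
    """Return (recall, false positive rate, precision) from confusion counts."""
    recall = tp / (tp + fn)     # y-axis of a ROC curve, x-axis of a PR curve
    fpr = fp / (fp + tn)        # x-axis of a ROC curve
    precision = tp / (tp + fp)  # y-axis of a PR curve
    return recall, fpr, precision


# Hypothetical counts: 100 true features, 100,000 true negatives.
algo_a = rates(tp=90, fn=10, fp=100, tn=99_900)
algo_b = rates(tp=90, fn=10, fp=900, tn=99_100)

for name, (recall, fpr, precision) in (("A", algo_a), ("B", algo_b)):
    print(f"{name}: recall={recall:.2f} FPR={fpr:.4f} precision={precision:.3f}")
```

On a ROC plot the two algorithms differ by less than one percentage point in false positive rate (0.001 versus 0.009), while precision falls from roughly 0.47 to roughly 0.09, so the gap is obvious only in precision-recall space.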

The exhaustive annotation approach exposes algorithm limitations that are masked by limited targeted keys.

Benefits and Practical Applications


By providing both TP and TN features, exhaustive annotation supports:
  • Objective benchmarking of feature detection and peak picking tools.
  • Evaluation of algorithm robustness in real-world complex matrices.
  • Optimization of threshold settings for high-throughput LC-MS data processing in metabolomics and pharmaceutical QC.
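As one possible way to use such an answer key for threshold optimization, the sketch below picks the intensity threshold that maximizes an F-beta score over a set of precision-recall points. The choice of F-beta as the criterion, the function names and the example curve are assumptions made here for illustration; the poster does not prescribe a selection rule.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta > 1 favours recall)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(curve, beta=1.0):
    """curve: iterable of (threshold, precision, recall); returns the best threshold."""
    return max(curve, key=lambda row: f_beta(row[1], row[2], beta))[0]


# Hypothetical precision-recall points at four intensity thresholds.
example_curve = [
    (1e4, 0.20, 0.98),
    (5e4, 0.55, 0.90),
    (1e5, 0.80, 0.70),
    (5e5, 0.95, 0.30),
]

balanced = best_threshold(example_curve)               # F1: equal weight
recall_heavy = best_threshold(example_curve, beta=2.0) # F2: recall weighted higher
print(balanced, recall_heavy)
```

With these made-up numbers the balanced criterion selects the 1e5 threshold, while weighting recall more heavily moves the choice down to 5e4, which may suit discovery-oriented workflows where missing a feature costs more than reviewing a false positive.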


Future Trends and Possibilities


Potential extensions include:
  • Annotation of replicates and time‐course studies with automated feature tracking.
  • Benchmarking of isotope detection, adduct grouping, and full component detection workflows.
  • Integration with machine-learning models for automated curation and interactive interface enhancements.


Conclusion


This study demonstrates that semi-automated exhaustive annotation combined with precision-recall metrics provides a rigorous framework for evaluating LC-MS feature detection algorithms. The TotalRecall workflow uncovers performance differences in complex datasets, guiding method selection and development.

Reference


TotalRecall annotation tool and methodology presented in: Razumovskaya J., Brown J., Wright D., Baran R., Mohtashemi I. (2015) “Validating and Comparing Component Detection Algorithms for LC-MS Data Assignment.” Thermo Fisher Scientific.

Content was automatically generated from an original PDF document using AI and may contain inaccuracies.
