Use of carry-over models in false-discovery rate filtering in real-time search
Posters | 2022 | Thermo Fisher Scientific | ASMS
Instrumentation: LC/HRMS, LC/MS, LC/MS/MS, LC/Orbitrap
Industries: Other
Manufacturer: Thermo Fisher Scientific
Summary
Significance of the Topic
Real-time control of false discovery rate (FDR) in mass spectrometry-based proteomics is critical to maximize sensitivity and specificity during LC-MS experiments. Incorporating an online FDR filter allows sequencing decisions to be made dynamically, improving the quality of data acquisition. The ability to reuse an FDR estimation model across runs can reduce instrument idle time and increase consistency, particularly when sample complexity or PSM yields vary.
Objectives and Study Overview
This work evaluates two strategies for carrying over a linear discriminant analysis (LDA)-based FDR estimator between LC-MS runs: a static carry-over model that freezes feature weights after an initial run, and a continuation model that incrementally refines weights across successive runs. The study aims to determine whether these approaches improve protein and peptide identification metrics and to assess their practical value in real-time search workflows.
Methodology and Instrumentation
The experiments used a Thermo Scientific EASY-nLC 1200 coupled to an Orbitrap Eclipse mass spectrometer, with a two-hour gradient and 250 ng of TMT-labeled yeast peptides. Real-time spectral searches were performed with modified instrument control software (ICSW) running Comet for MS2 identification. LDA training and scoring used Accord.NET. Two carry-over strategies were implemented:
- Static Model Carry-Over: LDA weights computed at the end of a base run are held constant for subsequent runs.
- Continuation Model Carry-Over: LDA weights are continually updated from the base run through all continuation runs.
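The difference between the two strategies can be illustrated with a minimal Python sketch. It is not the poster's implementation (which used Accord.NET inside the instrument control software): the class and function names, the two-feature synthetic PSMs, and the choice to implement continuation by accumulating per-class sufficient statistics are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple

Vector = List[float]
Psm = Tuple[Vector, bool]  # (feature vector, is_decoy); hypothetical layout


def solve(a: List[Vector], b: Vector) -> Vector:
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


class ClassStats:
    """Running count, sum, and sum of outer products for one class."""

    def __init__(self, dim: int) -> None:
        self.n = 0
        self.total = [0.0] * dim
        self.outer = [[0.0] * dim for _ in range(dim)]

    def add(self, x: Sequence[float]) -> None:
        self.n += 1
        for i, xi in enumerate(x):
            self.total[i] += xi
            for j, xj in enumerate(x):
                self.outer[i][j] += xi * xj

    def mean(self) -> Vector:
        return [t / self.n for t in self.total]

    def scatter(self) -> List[Vector]:
        mu, d = self.mean(), len(self.total)
        return [[self.outer[i][j] - self.n * mu[i] * mu[j] for j in range(d)]
                for i in range(d)]


class CarryOverLda:
    """Two-class Fisher LDA whose weights can be frozen (hypothetical class)."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.target, self.decoy = ClassStats(dim), ClassStats(dim)
        self.weights: Optional[Vector] = None
        self.frozen = False

    def train(self, psms: Sequence[Psm]) -> None:
        """Fold one run into the statistics; assumes both classes are present."""
        if self.frozen:
            return  # static carry-over: later runs never change the weights
        for x, is_decoy in psms:
            (self.decoy if is_decoy else self.target).add(x)
        dof = self.target.n + self.decoy.n - 2
        st, sd = self.target.scatter(), self.decoy.scatter()
        # Pooled within-class covariance with a tiny ridge for stability.
        pooled = [[(st[i][j] + sd[i][j]) / dof + (1e-9 if i == j else 0.0)
                   for j in range(self.dim)] for i in range(self.dim)]
        diff = [t - d for t, d in zip(self.target.mean(), self.decoy.mean())]
        self.weights = solve(pooled, diff)

    def score(self, x: Sequence[float]) -> float:
        return sum(w * v for w, v in zip(self.weights, x))


def simulate_run(rng: random.Random, n: int = 200) -> List[Psm]:
    """Synthetic PSMs: targets centred at (3, 1), decoys at (0, 0)."""
    targets = [([rng.gauss(3, 1), rng.gauss(1, 1)], False) for _ in range(n)]
    decoys = [([rng.gauss(0, 1), rng.gauss(0, 1)], True) for _ in range(n)]
    return targets + decoys


def demo() -> Tuple[CarryOverLda, CarryOverLda]:
    rng = random.Random(0)
    base_run = simulate_run(rng)
    static, continuation = CarryOverLda(2), CarryOverLda(2)
    static.train(base_run)
    continuation.train(base_run)
    static.frozen = True  # weights held constant after the base run
    for _ in range(3):  # three continuation runs
        run = simulate_run(rng)
        static.train(run)        # no-op because the model is frozen
        continuation.train(run)  # weights refined with each new run
    return static, continuation
```

After `demo()`, the static model still carries the base-run weights, while the continuation model's weights reflect all four runs. Accumulating sums and outer products is only one way to realize a continuation model; the poster does not state which update scheme was used.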
Main Results and Discussion
Both carry-over approaches maintained a 20% targeted FDR and produced similar discriminant score distributions. Key observations include:
- Protein and peptide quantification gains were modest, falling within run-to-run variability.
- The continuation model reduced variability of feature weights compared to the static model, but did not yield significantly higher identifications.
- Early-run MS3 acquisition rates improved slightly with carry-over, reflecting fewer training-related delays.
- The discriminant score remained stable over runs, indicating rapid model convergence.
These findings suggest the LDA filter converges quickly within a single run, limiting the benefit of further training in similar sample contexts.
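To make the idea of a targeted FDR on discriminant scores concrete, the following Python sketch picks a score cutoff by target-decoy counting. The summary does not say how the filter estimates FDR, so the decoys/targets estimator, the function name `fdr_threshold`, and the `(score, is_decoy)` input layout are assumptions for illustration only.

```python
from typing import Iterable, Optional, Tuple


def fdr_threshold(scored: Iterable[Tuple[float, bool]],
                  target_fdr: float) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR meets target_fdr.

    FDR at a cutoff is estimated as (#decoys) / (#targets) among PSMs
    scoring at or above it. Returns None if no cutoff qualifies.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(scored, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= target_fdr:
            cutoff = score  # everything down to this score still passes
    return cutoff


def passes_filter(score: float, cutoff: Optional[float]) -> bool:
    """A PSM passes only when a cutoff exists and its score reaches it."""
    return cutoff is not None and score >= cutoff
```

For example, with scored PSMs `[(5.0, False), (4.0, False), (3.0, True), (2.0, False), (1.0, True)]` and `target_fdr=0.5`, the cutoff is 2.0: three targets and one decoy score at or above it, an estimated FDR of 1/3. In a carry-over setting, the cutoff could be derived from scores produced by the carried-over weights instead of waiting for within-run training.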
Benefits and Practical Applications
Employing a carry-over FDR model can streamline real-time search workflows by removing the need for per-run LDA training. This can be particularly advantageous when instrument time is limited or when analyzing large sample cohorts with consistent complexity. For applications with low initial PSM yields or highly heterogeneous samples, carry-over models may reduce early-run sensitivity loss.
Future Trends and Potential Applications
Advanced carry-over strategies may find greater utility in workflows involving low-abundance samples, single-cell proteomics, or novel acquisition schemes where initial PSM counts are insufficient for robust model training. Integration with adaptive learning algorithms and cloud-based data sharing could further enhance cross-run model generalizability. Combining carry-over with retention time alignment and real-time quality metrics offers avenues to refine on-the-fly decision making.
Conclusion
Carry-over of an online FDR estimator via LDA presents a practical method to bypass per-run training without compromising identification quality. Although gains in protein and peptide yields were marginal under these experimental conditions, the approach holds promise for specialized applications where training stages are limiting. Rapid convergence of LDA weights underlines the robustness of the method.
References
- Canterbury J. et al., Poster MP112, ASMS 2020.
- Schweppe D. et al., J. Proteome Res. 2020, 19, 2026–2034.
- Eng J. et al., JASMS 2015, 26, 1865–.
- Accord.NET Framework v. 3.8.0.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.