Retention time prediction for 653 pesticides on a biphenyl liquid chromatography stationary phase using machine learning
Posters | 2019 | Shimadzu
Instrumentation: LC/MS, LC/MS/MS, LC/QQQ
Industries: Environmental, Food & Agriculture
Manufacturer: Shimadzu
Summary
Importance of the Topic
Retention time prediction plays a critical role in liquid chromatography–mass spectrometry workflows, especially in suspect screening of complex samples where authentic standards may not be available. Accurate in silico estimates of chromatographic behavior help reduce the reliance on reference materials, accelerate the identification of trace-level contaminants and support more efficient quality control and research applications.
Objectives and Study Overview
This work aimed to develop and validate a machine learning model capable of predicting the retention times of 653 pesticides on a biphenyl reversed-phase column. Key goals included:
- Generating a robust dataset of measured retention times under a 12 min gradient LC-MS/MS method
- Selecting informative molecular descriptors to capture structural and physicochemical diversity
- Training and evaluating neural network architectures to achieve high prediction accuracy
- Demonstrating the applicability domain and identifying the most influential descriptors
Methodology and Instrumentation
A dataset of 653 pesticides spanning herbicides, fungicides, insecticides and other classes was measured using a triple-quadrupole LC-MS/MS system. Key steps included:
- Extraction of compounds from food matrices by QuEChERS
- Chromatographic separation on a Restek Raptor biphenyl column (100 × 2.1 mm, 2.7 µm) at 35 °C, 0.4 mL/min, gradient from water/formate to methanol/formate over 10.5 min
- Detection by LCMS-8060 with heated electrospray ionization, polarity switching, and multiple reaction monitoring
- Calculation of over 5 000 molecular descriptors per compound, followed by collinearity filtering, correlation analysis, genetic feature selection and expert curation to yield 16 candidate descriptors
- Training of three-layer multi-layer perceptron (MLP) neural networks and construction of an ensemble of four identical networks using a 70:15:15 split for training, verification and blind testing
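The last step above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that "three-layer" means an input layer, one hidden layer and an output layer, and that the four ensemble members share one architecture and differ only in their random initialization. The names `split_70_15_15`, `TinyMLP` and `ensemble_predict` are hypothetical, and the data are synthetic stand-ins (two scaled descriptors and a scaled retention value) rather than anything measured in the study.

```python
import math
import random


def split_70_15_15(items, seed=0):
    """Shuffle, then split into training, verification and blind-test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.70 * len(shuffled))
    n_verify = round(0.15 * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_verify],
            shuffled[n_train + n_verify:])


class TinyMLP:
    """Input layer -> one tanh hidden layer -> linear output, trained by SGD."""

    def __init__(self, n_inputs, n_hidden, seed):
        self.rng = random.Random(seed)
        self.w1 = [[self.rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [self.rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self._hidden(x))) + self.b2

    def fit(self, rows, epochs=30, lr=0.05):
        rows = list(rows)
        for _ in range(epochs):
            self.rng.shuffle(rows)
            for x, y in rows:
                hidden = self._hidden(x)
                err = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2 - y
                for j, h in enumerate(hidden):
                    grad_pre = err * self.w2[j] * (1.0 - h * h)  # backprop through tanh
                    self.w2[j] -= lr * err * h
                    self.b1[j] -= lr * grad_pre
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_pre * xi
                self.b2 -= lr * err
        return self


def ensemble_predict(models, x):
    """Average the predictions of all ensemble members."""
    return sum(m.predict(x) for m in models) / len(models)


# Synthetic stand-in data: 653 rows of two scaled descriptors and a scaled
# retention value. None of these numbers come from the poster.
data_rng = random.Random(42)
dataset = []
for _ in range(653):
    x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    y = 0.5 + 0.3 * x[0] + 0.1 * x[1] + data_rng.gauss(0, 0.02)
    dataset.append((x, y))

train, verify, blind = split_70_15_15(dataset)
# Four networks with the same architecture, differing only in random seed.
ensemble = [TinyMLP(n_inputs=2, n_hidden=4, seed=s).fit(train) for s in range(4)]

blind_mae = sum(abs(ensemble_predict(ensemble, x) - y) for x, y in blind) / len(blind)
print(len(train), len(verify), len(blind))
print(f"blind-test mean absolute error (scaled units): {blind_mae:.3f}")
```

With 653 compounds, this rounding scheme yields subsets of 457, 98 and 98. The verification subset is held out here but not used; in practice it would typically guide early stopping or model selection, while the blind-test subset is touched only once for the final error estimate.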
Instrumentation Used
- UHPLC system: Shimadzu Nexera
- Analytical column: Restek Raptor biphenyl, 100 × 2.1 mm, 2.7 µm
- Mass spectrometer: Shimadzu LCMS-8060 triple quadrupole
- Ionization: Heated electrospray with rapid polarity switching
- MRM transitions: ~1 919 (positive and negative modes), dwell times of 4 ms
Main Results and Discussion
The ensemble model achieved strong agreement between predicted and measured retention times:
- 75 % of compounds were predicted within ±39 s of the measured retention time over a 12 min gradient
- Mean error of 28–31 s and R² values of 0.85–0.89 across the training, verification and blind test sets
- Sensitivity analysis identified logD at pH 5.4 as the most influential descriptor, followed by number of six-membered rings, hydrophilic factor and number of benzene rings
- Principal component analysis of descriptor space revealed clear clustering and a well-defined applicability domain, with a few outliers corresponding to larger ring systems
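The accuracy figures above can be reproduced for any set of predictions with a few lines of code. The sketch below is illustrative only: it assumes that "mean error" means mean absolute error and that R² is the coefficient of determination, neither of which the summary states explicitly. The function name `retention_metrics` and the four example retention times are hypothetical and do not come from the study.

```python
def retention_metrics(measured_s, predicted_s, window_s=39.0):
    """Return (mean absolute error, fraction within +/- window, R^2).

    All retention times are in seconds. R^2 is computed as
    1 - SS_res / SS_tot (an assumption about the reported statistic).
    """
    n = len(measured_s)
    abs_errors = [abs(p - m) for m, p in zip(measured_s, predicted_s)]
    mean_abs_error = sum(abs_errors) / n
    fraction_within = sum(e <= window_s for e in abs_errors) / n

    mean_measured = sum(measured_s) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured_s, predicted_s))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured_s)
    r_squared = 1.0 - ss_res / ss_tot
    return mean_abs_error, fraction_within, r_squared


# Hypothetical retention times in seconds, for illustration only.
measured = [120.0, 300.0, 450.0, 600.0]
predicted = [130.0, 280.0, 500.0, 590.0]

mae, within, r2 = retention_metrics(measured, predicted)
print(f"MAE = {mae:.1f} s, within ±39 s = {within:.0%}, R² = {r2:.3f}")
```

Reporting all three numbers together is useful because they answer different questions: the mean error summarizes typical deviation, the ±39 s fraction describes how often a fixed screening window would capture the true compound, and R² indicates how much of the spread in retention times the model explains.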
Benefits and Practical Applications
This predictive approach offers several advantages:
- Rapid in silico screening to prioritize candidate pesticides without purchasing expensive standards
- Enhanced selectivity by leveraging biphenyl media as an alternative to conventional C18 phases
- Improved confidence in suspect identification in environmental, food and forensic analyses
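One way such predictions could be used in suspect screening is to shortlist candidates whose predicted retention time lies close to an observed peak. The following sketch assumes a simple symmetric tolerance window, defaulting to the ±39 s figure reported for 75 % of compounds. The function `plausible_candidates`, the candidate names and their predicted times are all hypothetical.

```python
def plausible_candidates(observed_rt_s, predicted_rt_s, tolerance_s=39.0):
    """Return candidate names whose predicted RT is within the tolerance
    of the observed peak, ordered from closest to furthest."""
    hits = []
    for name, predicted in predicted_rt_s.items():
        deviation = abs(predicted - observed_rt_s)
        if deviation <= tolerance_s:
            hits.append((deviation, name))
    return [name for _, name in sorted(hits)]


# Hypothetical predicted retention times (seconds) for four suspects that
# share a precursor mass; none of these values come from the study.
predicted_rt_s = {
    "candidate_A": 300.0,
    "candidate_B": 352.0,
    "candidate_C": 330.0,
    "candidate_D": 410.0,
}

shortlist = plausible_candidates(335.0, predicted_rt_s)
print(shortlist)
```

Because about one in four compounds falls outside ±39 s according to the reported results, a window of this width is better suited to ranking or prioritizing candidates than to excluding them outright; a wider tolerance trades fewer false rejections for a longer shortlist.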
Future Trends and Potential Uses
Advances likely to further improve retention time prediction include:
- Integration with high-resolution mass spectrometry and data-independent acquisition workflows
- Application of deep learning and transfer learning to expand coverage to new compound classes and stationary phases
- Development of real-time retention time alignment tools for online screening
- Creation of community-wide curated descriptor databases to refine applicability domains
Conclusion
This study presents the first machine learning–based retention time model for 653 pesticides on a biphenyl reversed-phase column, achieving high accuracy and demonstrating the value of logD and structural descriptors in prediction. The ensemble MLP approach enables efficient suspect screening and supports more robust chromatographic identification workflows without extensive use of reference standards.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.