A Facile Database Search Engine for Metabolite Identification and Biomarker Discovery in Metabolomics
Applications | 2014 | Waters
Instrumentation: Software, LC/TOF, LC/HRMS, LC/MS, LC/MS/MS
Industries: Metabolomics
Manufacturer: Waters
Summary
Significance of the topic
Metabolite identification remains a key bottleneck in untargeted metabolomics, where high confidence in compound assignment is essential for biomarker discovery, quality control, and nutritional studies. Reliance on a single identifier, such as accurate mass alone, often yields numerous false positives and negatives. Integrating multiple orthogonal properties—mass accuracy, retention time, collision cross section, and fragmentation—enhances specificity and reliability.
Objectives and overview of the study
This work presents a streamlined workflow using Progenesis QI Informatics with its embedded MetaScope search engine to process, filter, and identify metabolites from complex biological samples. As a practical example, the study investigates how the oxygen level at bottling (ambient oxygen vs. low oxygen achieved by nitrogen addition) affects the metabolomic profile of Italian wines, demonstrating the software’s utility in a real-world application.
Methodology and instrumentation
Wine samples were collected under two conditions: with nitrogen addition (low oxygen) and without (ambient oxygen). After dilution and filtration, samples were analyzed by UPLC–MSᴱ. Data processing steps included retention time alignment, peak picking from an aggregate ion map, isotopic and adduct deconvolution, and statistical filtering via ANOVA (p < 0.01) and fold-change (> 2) criteria to highlight discriminant features.
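The statistical filtering step can be illustrated with a minimal sketch. It is not the Progenesis QI implementation: the function names, feature names, and abundance values below are hypothetical, and only the thresholds (p < 0.01, fold change > 2) come from the text. With two sample groups, one-way ANOVA is equivalent to a pooled-variance t-test, so the p-value is obtained here by numerically integrating the Student t density using the standard library only.

```python
import math
from statistics import mean


def two_group_anova_p(a, b, steps=2000):
    """p-value of a one-way ANOVA on two groups (equivalent to a pooled t-test)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    df = na + nb - 2
    ss_within = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    if ss_within == 0:
        return 1.0 if ma == mb else 0.0
    t = abs(ma - mb) / math.sqrt(ss_within / df * (1 / na + 1 / nb))
    # Substituting x = sqrt(df) * tan(theta) turns the Student t integral over
    # [0, t] into an integral of cos(theta)^(df - 1) over a bounded range.
    theta_max = math.atan(t / math.sqrt(df))
    h = theta_max / steps
    f = lambda th: math.cos(th) ** (df - 1)
    area = f(0.0) + f(theta_max)
    area += sum((4 if i % 2 else 2) * f(i * h) for i in range(1, steps))
    area *= h / 3  # composite Simpson's rule (steps must be even)
    coef = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    return max(0.0, 1.0 - 2.0 * coef * area)


def fold_change(a, b):
    """Ratio of the larger group mean to the smaller one."""
    hi, lo = max(mean(a), mean(b)), min(mean(a), mean(b))
    return math.inf if lo == 0 else hi / lo


def filter_markers(features, p_max=0.01, fc_min=2.0):
    """Keep features that pass both the ANOVA and the fold-change threshold."""
    return [
        name
        for name, (a, b) in features.items()
        if two_group_anova_p(a, b) < p_max and fold_change(a, b) > fc_min
    ]


# Hypothetical abundances: (low-oxygen replicates, ambient-oxygen replicates)
features = {
    "feature_1": ([100, 110, 90, 105], [300, 320, 310, 290]),  # passes both
    "feature_2": ([100, 102, 98, 101], [120, 122, 118, 121]),  # fold change too small
    "feature_3": ([10, 200, 50, 400], [100, 900, 20, 600]),    # not significant
}

markers = filter_markers(features)
print(markers)  # ['feature_1']
```

Applying both criteria together is what shrinks the feature list: a feature can be statistically significant yet change too little to be of interest (feature_2), or change a lot on average without being reproducible across replicates (feature_3).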
Instrumentation used
- ACQUITY UPLC System with HSS T3 column (1.8 µm, 2.1 × 150 mm) and VanGuard pre-column
- SYNAPT HDMS System operated in MSᴱ mode with ESI+/- ionization
- Progenesis QI Informatics platform including the MetaScope search engine
Main results and discussion
Data-independent analysis generated over 3,000 deconvoluted features. Statistical filtering reduced this to fewer than 200 marker compounds. Principal component analysis showed clear separation of the wine groups based on oxygen exposure. Initial database searches against HMDB yielded multiple ambiguous hits per feature. Querying customized in-house libraries that include accurate mass, retention time, collision cross section, and experimental fragments significantly reduced false positives and negatives. A representative identification of quercetin illustrates a high-confidence match across several criteria: mass accuracy, retention time, isotopic pattern, and four characteristic fragments.
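The effect of adding orthogonal search criteria can be sketched as follows. This is an illustration only, not the MetaScope algorithm or its API: the class and function names, the tolerance values, and all library and feature values are hypothetical placeholders. The sketch compares a mass-only search with one that also requires retention time, collision cross section, and fragment agreement.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryEntry:
    name: str
    mz: float        # expected precursor m/z
    rt_min: float    # expected retention time (min)
    ccs: float       # expected collision cross section (Å²)
    fragments: list = field(default_factory=list)  # expected fragment m/z values


@dataclass
class Feature:
    mz: float
    rt_min: float
    ccs: float
    fragments: list = field(default_factory=list)  # observed fragment m/z values


def ppm_error(observed, expected):
    """Relative mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def count_fragment_matches(observed, expected, tol_ppm):
    """Number of expected fragments found among the observed ones."""
    return sum(
        any(abs(ppm_error(o, e)) <= tol_ppm for o in observed) for e in expected
    )


def search(feature, library, mz_ppm=5.0, rt_tol=0.2, ccs_pct=2.0,
           frag_ppm=10.0, min_frags=2, mass_only=False):
    """Return names of library entries matching the feature (tolerances assumed)."""
    hits = []
    for entry in library:
        if abs(ppm_error(feature.mz, entry.mz)) > mz_ppm:
            continue
        if not mass_only:
            if abs(feature.rt_min - entry.rt_min) > rt_tol:
                continue
            if abs(feature.ccs - entry.ccs) / entry.ccs * 100 > ccs_pct:
                continue
            matched = count_fragment_matches(
                feature.fragments, entry.fragments, frag_ppm
            )
            if matched < min_frags:
                continue
        hits.append(entry.name)
    return hits


# Hypothetical in-house library with two near-isobaric compounds
library = [
    LibraryEntry("compound_A", 303.0499, 6.10, 165.0, [153.0182, 137.0233, 229.0495]),
    LibraryEntry("compound_B", 303.0505, 8.40, 172.0, [285.0400, 165.0200]),
]
feature = Feature(303.0501, 6.15, 166.1, [153.0185, 137.0231, 200.0000])

print(search(feature, library, mass_only=True))  # ['compound_A', 'compound_B']
print(search(feature, library))                  # ['compound_A']
```

Both entries fall within 5 ppm of the measured m/z, so accurate mass alone cannot separate them. Only one also agrees on retention time, collision cross section, and fragments, which is how the additional descriptors remove the ambiguous hit.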
Benefits and practical applications
- Comprehensive data processing: integrated alignment, peak picking, and statistical filtering.
- Flexible database searching: supports both public repositories and user-defined libraries.
- Enhanced confidence: orthogonal descriptors and fragment matching lower false discovery rates.
- Rapid workflow: complete analysis and metabolite annotation in a few hours.
Future trends and potential uses
Advances may include machine learning–driven annotation, expansion of collision cross section libraries, real-time database updates, and integration with multi-omics platforms. Broader application to food authentication, clinical diagnostics, and environmental monitoring is anticipated as software capabilities and spectral libraries grow.
Conclusion
Progenesis QI Informatics offers a robust solution for high-throughput metabolite identification and biomarker discovery. By combining multivariate filtering with user-definable search parameters across several physicochemical dimensions, it enhances reliability and reduces false discoveries, making it well suited for diverse metabolomics applications.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.