Conversion to mzML Format for NIST26 Chromatogram Processing

Presentations | 2026 | James Little/Mass Spec Interpretation Services
Instrumentation
Software, LC/MS, LC/MS/MS
Industries
Other
Manufacturer
Wiley

Summary


Significance of the topic

Converting vendor raw mass spectrometry files to the open mzML format is a critical preprocessing step for standardized chromatogram processing, spectral deconvolution, and library searching workflows such as NIST26. Using a single accepted interchange format ensures reproducibility, interoperability between tools, and consistent behavior of downstream algorithms (deconvolution, matching, and quantitation). Proper handling of centroiding and metadata during conversion strongly influences spectral quality, file size, and the reliability of automated library searches and deconvolution results.

Objectives and overview of the guidance

The source material outlines a practical, workflow-oriented procedure for producing mzML input files suitable for NIST26 chromatogram processing. Primary goals are: (1) ensure files are in mzML format only, (2) guarantee spectra are centroided (peak picked) before analysis, (3) describe use of ProteoWizard msconvert as the recommended conversion tool, and (4) provide practical tips for handling Agilent vendor files and selecting appropriate software versions and filters.

Methodology and workflow

Key procedural steps
  1. Obtain vendor raw data.
  2. Convert to mzML using either the vendor’s native export/conversion tool or msconvert from ProteoWizard.
  3. Ensure resulting mzML files contain centroided spectra; if original data are profile-mode, perform peak picking during conversion.
  4. Restrict the conversion to the MS levels that downstream processing requires, then start the conversion.

Details on msconvert and ProteoWizard

Installation and versions
  • msconvert is bundled in the ProteoWizard installer (Windows .msi or .exe). After installation, msconvert is available as a GUI (Start Menu → ProteoWizard → MSConvert) and a command-line utility (msconvert.exe).
  • Choose stable releases for routine workflows for reliability; consider nightly builds when working with newer vendor formats or when encountering vendor-reader edge cases.

Conversion settings and filters
  • Output format must be set to mzML, because the downstream NIST26 processing accepts only mzML files.
  • Spectra must be centroided. Profile data are not acceptable. If your raw data are profile mode, add a peak-picking filter in msconvert.
  • The peak-picking (centroiding) filter must be the first algorithm in the filter list so that subsequent filters operate on centroided spectra.
  • If the file is already centroided, adding the peak-picking filter will not modify it; this is a safe check when in doubt.
  • Include the appropriate MS levels during conversion (e.g., MS1 for GC-EI workflows, MS2 for LC-MS/MS) according to the requirements of deconvolution and library search.
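The settings above can be sketched as a small Python helper that assembles an msconvert command line. This is only an illustration under stated assumptions: the function name, the input name sample.d, and the output folder are hypothetical, and the `peakPicking vendor` and `msLevel` filter strings follow the msconvert filter syntax as generally documented, so check them against the help text of your installed build.

```python
import shlex

def build_msconvert_command(raw_path, out_dir, ms_levels="1-", extra_filters=()):
    """Return an msconvert argument list that requests centroided mzML.

    Hypothetical helper: it only builds the argument list, it does not run msconvert.
    """
    # Peak picking is listed first so later filters operate on centroided spectra.
    filters = [f"peakPicking vendor msLevel={ms_levels}"]
    # Keep only the MS levels that downstream processing needs.
    filters.append(f"msLevel {ms_levels}")
    filters.extend(extra_filters)

    cmd = ["msconvert", str(raw_path), "--mzML", "-o", str(out_dir)]
    for flt in filters:
        cmd += ["--filter", flt]
    return cmd

# Example: MS1 only, as might suit a GC-EI data set (file names are placeholders).
cmd = build_msconvert_command("sample.d", "converted", ms_levels="1")
print(shlex.join(cmd))
```

The printed line uses POSIX-style quoting; on Windows it is usually simpler to pass the list directly to `subprocess.run` than to paste the string into a command prompt.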

Instrument/vendor-specific notes
  • Users working with Agilent data may encounter centroid/profile handling differences; newer ProteoWizard nightly builds sometimes improve vendor reader support and offer finer control over peak-picking parameters.
  • File size can increase substantially after conversion to mzML (notably when storing detailed binary arrays and metadata). Be prepared for larger storage needs and, if required, adjust compression options available in msconvert.
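As a rough illustration of the compression options mentioned above, the sketch below appends msconvert's `--zlib` flag (compression of the binary arrays inside the mzML) and, optionally, `--gzip` (compression of the whole output file). The helper name and file names are hypothetical, and whether NIST26 accepts zlib-compressed arrays or a gzip-wrapped file is not stated in the source, so treat both as something to test on one file first.

```python
def compression_flags(zlib=True, gzip_file=False):
    """Return msconvert flags for the requested compression (hypothetical helper)."""
    flags = []
    if zlib:
        flags.append("--zlib")  # compress binary data arrays inside the mzML
    if gzip_file:
        flags.append("--gzip")  # gzip the entire output file (adds a .gz extension)
    return flags

# Base command with peak picking as the first filter; names are placeholders.
base = ["msconvert", "sample.d", "--mzML",
        "--filter", "peakPicking vendor msLevel=1-"]
print(base + compression_flags())
```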

Used instrumentation

The workflow references common GC-MS and LC-MS/MS platforms and specifically notes Agilent vendor files as a practical example. The main software tool is ProteoWizard (msconvert) used to export vendor data to mzML. Conversion can be performed using either the GUI or command-line msconvert.exe.

Main results and discussion (practical outcomes and implications)

Summary of expected outcomes
  • Properly converted mzML files with centroided spectra lead to consistent and predictable performance of NIST26 integrated deconvolution and library searching routines.
  • Failing to centroid spectra (leaving profile-mode data) or misordering filters can produce failures or poor-quality library matches and deconvolution artifacts.
  • Choosing the wrong ProteoWizard build can result in incomplete vendor-reader support or suboptimal peak-picking behavior, especially for certain instrument vendors.

Risks and trade-offs
  • Centroiding reduces data volume and speeds many downstream algorithms but discards detailed peak-shape information present in profile data that can be useful for advanced peak modeling or isotopic fine-structure analysis.
  • Automated peak picking parameters may need adjustment to avoid missing low-intensity peaks or generating spurious centroids, which would negatively affect identification.
  • Large mzML file sizes may impact storage, transfer, and indexing performance; compression and selective retention of metadata or binary arrays can mitigate this.
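Where peak-picking parameters need adjusting, one option is to build the filter string programmatically so the settings are recorded alongside the data. The sketch below assumes the msconvert `peakPicking` filter syntax with the `vendor` and `cwt` picker types and the `snr`, `peakSpace`, and `msLevel` options; the helper name is hypothetical and the numeric values are placeholders, not recommendations.

```python
def peak_picking_filter(picker="vendor", ms_levels="1-", snr=None, peak_space=None):
    """Build an msconvert peakPicking filter string (hypothetical helper)."""
    if picker not in ("vendor", "cwt"):
        raise ValueError("picker must be 'vendor' or 'cwt'")
    parts = ["peakPicking", picker]
    # snr and peakSpace are only added for the wavelet-based (cwt) picker.
    if picker == "cwt":
        if snr is not None:
            parts.append(f"snr={snr}")
        if peak_space is not None:
            parts.append(f"peakSpace={peak_space}")
    parts.append(f"msLevel={ms_levels}")
    return " ".join(parts)

print(peak_picking_filter())
print(peak_picking_filter("cwt", snr=0.5, peak_space=0.05))
```

Comparing the number of centroids per spectrum between two parameter sets on a known sample is a simple way to spot missing low-intensity peaks or spurious centroids before committing to a setting.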

Benefits and practical applications

Practical advantages of following this workflow include:
  • Compatibility with standardized NIST26 deconvolution and library search pipelines.
  • Improved reproducibility across laboratories by using a common interchange format.
  • Flexibility: msconvert supports batch conversion, command-line automation, and integration into larger processing pipelines.
  • Ability to correct for vendor-specific quirks via choice of ProteoWizard build and filter settings.
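Batch conversion and command-line automation could look roughly like the following. This is a sketch, not the presenter's procedure: the folder names, the `*.d` pattern (Agilent data sets are stored as `.d` folders), and both function names are assumptions, and the script only runs msconvert if it is found on the PATH.

```python
import shutil
import subprocess
from pathlib import Path

def plan_batch(raw_dir, out_dir, pattern="*.d"):
    """Return one msconvert argument list per raw data set found in raw_dir."""
    commands = []
    for raw in sorted(Path(raw_dir).glob(pattern)):
        commands.append([
            "msconvert", str(raw), "--mzML",
            "--filter", "peakPicking vendor msLevel=1-",  # centroid first
            "-o", str(out_dir),
        ])
    return commands

def run_batch(commands):
    """Run each planned command, or report that msconvert is unavailable."""
    if shutil.which("msconvert") is None:
        print("msconvert not found on PATH; nothing was converted")
        return
    for cmd in commands:
        # check=True stops the batch at the first failed conversion.
        subprocess.run(cmd, check=True)

# Placeholder folders; planning alone does not touch any data.
planned = plan_batch("raw_data", "converted")
print(f"{len(planned)} conversion(s) planned")
```

Separating planning from execution makes it easy to review the exact command lines before a long batch is started.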

Recommendations and checklist for reliable conversion
  1. Confirm the required input format for your processing pipeline is mzML; do not submit other formats.
  2. Verify whether raw data are centroided or profile; if profile, enable peak picking and ensure it is the first filter.
  3. Test conversions with both stable and recent ProteoWizard builds if vendor-reader issues appear, particularly with Agilent files.
  4. Review converted mzML metadata to confirm MS levels and centroiding status are as expected.
  5. Monitor resulting file sizes and enable compression where appropriate to manage storage.
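Item 4 of the checklist can be partly automated. The sketch below counts spectra by MS level and by centroid/profile flag using only the Python standard library. It assumes each spectrum element carries the PSI-MS terms directly as cvParam children (centroid spectrum MS:1000127, profile spectrum MS:1000128, ms level MS:1000511); files that place these terms in a referenceable parameter group would be reported as "unknown". The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

MZML_NS = "{http://psi.hupo.org/ms/mzml}"
CENTROID = "MS:1000127"  # PSI-MS: centroid spectrum
PROFILE = "MS:1000128"   # PSI-MS: profile spectrum
MS_LEVEL = "MS:1000511"  # PSI-MS: ms level

def summarize_mzml(source):
    """Count spectra by (ms level, representation) in an mzML file or file object."""
    counts = Counter()
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != MZML_NS + "spectrum":
            continue
        level, mode = "unknown", "unknown"
        # Only direct cvParam children of the spectrum are inspected.
        for cv in elem.findall(MZML_NS + "cvParam"):
            accession = cv.get("accession")
            if accession == MS_LEVEL:
                level = cv.get("value")
            elif accession == CENTROID:
                mode = "centroid"
            elif accession == PROFILE:
                mode = "profile"
        counts[(level, mode)] += 1
        elem.clear()  # drop binary arrays already processed to limit memory use
    return counts

# Usage (path is a placeholder):
# for (level, mode), n in sorted(summarize_mzml("converted/sample.mzML").items()):
#     print(f"MS{level} {mode}: {n}")
```

Any "profile" entry in the result indicates spectra that were not centroided and would need to be reconverted with peak picking enabled.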

Future trends and potential uses

Emerging and anticipated developments include:
  • Improved vendor reader fidelity and more robust, standardized metadata mapping in conversion tools.
  • On-the-fly centroiding and adaptive peak-picking algorithms that preserve key profile features while optimizing data volume.
  • Enhanced mzML extensions for richer chromatogram and acquisition metadata to better support complex deconvolution and machine-learning workflows.
  • Greater cloud integration and pipeline standardization enabling large-scale, reproducible NIST26 processing across distributed compute resources.

Conclusion

Converting vendor mass-spectrometry data to centroided mzML is a required preprocessing step for reliable NIST26 chromatogram deconvolution and library searching. ProteoWizard msconvert is the recommended, widely used tool for this task; careful attention to centroiding, filter ordering, and software version helps prevent common failures and improves downstream identification quality. A short verification checklist run after conversion confirms that the converted files meet the expectations of deconvolution and search workflows.

References

  • Little J. Conversion to mzML Format for NIST26 Chromatogram Processing. Video/Handout. Mass Spec Interpretation Services; April 24, 2026.
  • ProteoWizard Toolkit. msconvert component. ProteoWizard project (ProteoWizard software package).

Content was automatically generated from an original PDF document using AI and may contain inaccuracies.

LabRulez s.r.o. All rights reserved. Content available under a CC BY-SA 4.0 Attribution-ShareAlike