Author
Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences
The Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences (IOCB Prague) is a leading scientific institution in the Czech Republic, recognized internationally. Its primary mission is basic research in the fields of chemical biology and medicinal chemistry, organic and material oriented chemistry, chemistry of natural compounds, biochemistry and molecular biology, physical chemistry, theoretical chemistry, and analytical chemistry.

For the first time, scientists have access to a comprehensive data set for identifying unknown compounds – thanks to experts at IOCB Prague

Wed, 24.9.2025
| Original article from: IOCB Prague
Scientists at IOCB Prague created MSⁿLib, a vast mass spectrometry library with millions of records, enabling rapid unknown compound identification and boosting drug discovery and biomedical AI.
  • Photo: IOCB Prague/Tomáš Belloň: Dr. Tomáš Pluskal, head of the Biochemistry of Plant Specialized Metabolites group at IOCB Prague
  • Video: IOCB Prague: For the first time, a complete dataset helps identify unknown compounds 

Scientists from the laboratory of Dr. Tomáš Pluskal are helping colleagues around the world identify previously unknown compounds. They have created an extensive library called MSⁿLib, which contains several million records showing how small molecules “break apart” when measured by mass spectrometry. Until now, comparable databases have expanded only very slowly, but thanks to a new approach developed at IOCB Prague, data on unknown molecules can now be obtained in a matter of minutes. This opens the potential for faster drug discovery, better monitoring of chemical substances in the environment, and further advances in artificial intelligence for biomedicine. An article about the library has been published in the journal Nature Methods.

Photo: IOCB Prague/Tomáš Belloň: Dr. Robin Schmid, a former postdoctoral researcher in Dr. Pluskal’s group, now works for mzio GmbH.

Mass spectrometry reveals the composition of chemical substances and is a key tool in medicine, pharmacy, and environmental research. The instrument breaks a compound into smaller parts, and from these fragments scientists determine the structure of the original molecule. Fragment spectra, which can be imagined as a fingerprint unique to each substance, are compared with already known spectra stored in libraries. However, existing databases have covered only a limited number of known compounds, making the search considerably more difficult.
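The "fingerprint" comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by MSⁿLib or mzmine: it assumes spectra are given as lists of (m/z, intensity) pairs, bins them with a fixed width, and scores them with plain cosine similarity. The function names, the bin width, and the two toy library entries are hypothetical and the peak values are made up.

```python
import math

def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of peaks that fall into the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity of two binned spectra (0 = no shared peaks, 1 = same pattern)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(query, library, bin_width=0.01):
    """Return the name of the library entry most similar to the query spectrum."""
    return max(library, key=lambda name: cosine_similarity(query, library[name], bin_width))

# Toy data, invented for illustration only.
library = {
    "compound_A": [(50.0, 10.0), (77.0, 100.0), (105.0, 40.0)],
    "compound_B": [(60.0, 100.0), (88.0, 20.0)],
}
query = [(50.0, 12.0), (77.0, 95.0), (105.0, 35.0)]

print(best_match(query, library))
```

A query can only be matched to compounds that are already in the library, which is why the limited coverage of earlier databases made identification so difficult.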

Tomáš Pluskal and his team have moved the development of spectral libraries significantly forward. At the time they prepared their study for Nature Methods, they had compiled a catalog of thirty thousand small molecules. For these they recorded two million high-quality spectra, and they did not settle for a rough picture. Through multistage fragmentation (MSⁿ), i.e. repeated breaking of molecules, they obtained a more detailed view of their internal structure. Such a comprehensive data set is available to the scientific world for the first time. Tomáš Pluskal explains: “During the twenty years I’ve worked in this field, spectral libraries have not expanded much. We managed to change this practice and created the largest database currently in existence. Moreover, we’ve made it openly available to the global scientific community.”
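One way to picture multistage fragmentation is as a tree: the first fragment spectrum of a molecule sits at the root, and each fragment that is isolated and broken again adds a child spectrum one level deeper. The sketch below is only an illustrative data structure under that assumption. The class and function names are hypothetical, the m/z values are invented, and it does not reflect the actual file format of MSⁿLib.

```python
from dataclasses import dataclass, field

@dataclass
class SpectrumNode:
    """One fragment spectrum in a hypothetical MSn tree."""
    precursor_mz: float          # m/z of the ion that was fragmented
    ms_level: int                # 2 for MS2, 3 for MS3, and so on
    peaks: list                  # list of (m/z, intensity) pairs
    children: list = field(default_factory=list)

    def fragment_again(self, fragment_mz, peaks):
        """Attach the spectrum obtained by fragmenting one of this node's fragments."""
        child = SpectrumNode(fragment_mz, self.ms_level + 1, peaks)
        self.children.append(child)
        return child

def count_spectra(node):
    """Total number of spectra stored in the tree rooted at node."""
    return 1 + sum(count_spectra(child) for child in node.children)

def deepest_level(node):
    """Highest MS level reached anywhere in the tree."""
    return max([node.ms_level] + [deepest_level(child) for child in node.children])

# Toy tree with invented values: one MS2 spectrum, two MS3 spectra, one MS4 spectrum.
root = SpectrumNode(300.0, 2, [(250.0, 100.0), (180.0, 60.0)])
ms3_a = root.fragment_again(250.0, [(200.0, 100.0), (150.0, 30.0)])
ms3_b = root.fragment_again(180.0, [(120.0, 100.0)])
ms4 = ms3_a.fragment_again(200.0, [(90.0, 100.0)])

print(count_spectra(root), deepest_level(root))
```

Each extra level records how a fragment itself falls apart, which is the "more detailed view" that a single fragmentation step cannot provide.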

Photo: IOCB Prague/Tomáš Belloň: To acquire the spectral library data, a pipetting robot is used to prepare mixtures of ten chemical compounds in plates, and the mass spectrometer then analyzes each mixture for about 90 seconds. During this time, the spectrometer collects all the needed spectra and the analysis can move on to the next mixture of compounds. This efficient procedure makes it possible to collect spectra for about 3000 substances per day.

The researchers also substantially accelerated the analysis itself. They can measure ten compounds at once, and the entire process takes only a minute and a half. Because Pluskal’s team is exceptionally well known and active in the global scientific community, they have received thousands of compounds as gifts from companies and institutions.
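The quoted figures (ten compounds per mixture, about 90 seconds per mixture, about 3000 substances per day) can be checked with simple arithmetic. The sketch below uses only those numbers. The function names are hypothetical, and the 24-hour figure is a theoretical ceiling that assumes no time is lost between mixtures, which the article does not claim.

```python
COMPOUNDS_PER_MIXTURE = 10    # from the article
SECONDS_PER_MIXTURE = 90      # from the article ("about 90 seconds")

def mixtures_needed(n_compounds, per_mixture=COMPOUNDS_PER_MIXTURE):
    """Number of mixtures required, rounding up for a partly filled last mixture."""
    return -(-n_compounds // per_mixture)

def acquisition_hours(n_compounds):
    """Pure measurement time in hours, ignoring any overhead between mixtures."""
    return mixtures_needed(n_compounds) * SECONDS_PER_MIXTURE / 3600

def max_compounds_per_day():
    """Upper bound for 24 hours of uninterrupted measurement (an idealized assumption)."""
    mixtures_per_day = (24 * 3600) // SECONDS_PER_MIXTURE
    return mixtures_per_day * COMPOUNDS_PER_MIXTURE

print(mixtures_needed(3000))       # mixtures for one day's quoted throughput
print(acquisition_hours(3000))     # hours of pure measurement time
print(max_compounds_per_day())     # idealized 24-hour ceiling
```

Under these assumptions, 3000 substances correspond to 300 mixtures and 7.5 hours of pure measurement time, so the quoted daily throughput leaves room for sample changes and other overhead.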

“Since writing the article in Nature Methods, we’ve advanced further. So far, we’ve processed about 70,000 compounds, and we have another 150,000 awaiting analysis. We continue uploading data online, and by the end of the year we’d like to reach 200,000 measured compounds. That’s roughly ten times more than has been available over the past twenty years,” says the first author of the article, Dr. Corinna Brungs.

Tomáš Pluskal and his colleagues are also using the enormous amount of new data to improve AI algorithms that autonomously recognize unknown chemical substances – from metabolites in the human body to compounds in plants and microorganisms.

Photo: IOCB Prague/Tomáš Belloň: Dr. Corinna Brungs, a former postdoctoral researcher in Tomáš Pluskal’s group, is now head of the metabolomics core facility at the University of Vienna.

Scientists “feed” the machine learning model with data from the chemical library. The more data it receives, the more accurately the model can predict, based on the supplied spectrum, what the molecule behind the spectrum might look like.

The spectral library was created using the open-source software mzmine, which enabled automated processing of a vast number of measurements. As a result, the resource is not only extensive but also easily usable for further scientific projects worldwide.


 

LabRulez s.r.o. All rights reserved. Content available under a CC BY-SA 4.0 Attribution-ShareAlike