Author
Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences
The Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences (IOCB Prague) is a leading scientific institution in the Czech Republic, recognized internationally. Its primary mission is basic research in the fields of chemical biology and medicinal chemistry, organic and material oriented chemistry, chemistry of natural compounds, biochemistry and molecular biology, physical chemistry, theoretical chemistry, and analytical chemistry.

Researchers from IOCB Prague and CIIRC CTU discover unknown molecules with the help of AI

Wed, 28.5.2025
| Original article from: IOCB Prague
Dr. Tomáš Pluskal, winner of the Neuron Award, and his team created DreaMS, a machine learning model that accelerates molecule analysis. Published in Nature Biotechnology.
  • Photo: Dr. Tomáš Belloň / IOCB Prague: Researchers from IOCB Prague and CIIRC CTU discover unknown molecules with the help of AI (From left: Dr. Tomáš Pluskal, head of the Biochemistry of Plant Specialized Metabolites research group at IOCB Prague; Roman Bushuiev, IOCB Prague; Anton Bushuiev, CIIRC CTU; Raman Samusevich, IOCB Prague; Dr. Josef Šivic, CIIRC CTU)

This year's winner of the Neuron Award for young promising scientists, Dr. Tomáš Pluskal from IOCB Prague, together with his student Roman Bushuiev and colleagues from the Czech Institute of Informatics, Robotics and Cybernetics at the Czech Technical University (CIIRC CTU), Dr. Josef Šivic and Anton Bushuiev, has developed a machine learning model called DreaMS, which significantly accelerates the analysis of previously unknown molecules. The study was published in the influential scientific journal Nature Biotechnology.


Nature is full of chemicals that have yet to be discovered. It is believed that the vast majority of natural molecules remain unknown. Describing them could pave the way to new drugs, more environmentally friendly pesticides, a deeper understanding of biological processes, or more advanced research into life in the universe.

Each substance has a unique pattern, similar to a human fingerprint, called a mass spectrum, which can be captured using a method known as mass spectrometry. Although this approach generates large quantities of data, interpreting it and uncovering exact molecular structures is extremely difficult. The resulting datasets often appear as vast tables of numbers with no obvious meaning.

To unravel the mystery of unknown molecules, the team from IOCB and CIIRC CTU turned to artificial intelligence. Much like large language models such as ChatGPT learn to understand language without knowing the meaning of words in advance, the DreaMS model attempts to interpret mass spectra without prior knowledge of their chemical structures. “ChatGPT can infer the meaning of words and the connections between them from large volumes of text, and the DreaMS neural network, using self-supervised machine learning, learns to recognize what molecular structures are hidden within spectra. It draws on data from millions of examples,” explains Josef Šivic.

IOCB Prague: The DreaMS AI enables the characterization of molecular structures present in nature by interpreting mass spectrometry data.

“The DreaMS model was trained on tens of millions of spectra from diverse organisms and environments – plants, microbes, food, tissue, and soil samples. Thanks to this, it can uncover hidden similarities between spectra that, at first glance, seem unrelated,” says Tomáš Pluskal. The result is an interconnected network that helps navigate the vast body of chemical data. This network, which can be imagined as an internet of mass spectra, has been named the DreaMS Atlas. Each spectrum is like a website linked to others. On this “internet of spectra”, users can search, explore discovered connections, and ask new questions – for example: What do pesticides, food, and human skin have in common? DreaMS uncovered unexpected chemical similarities between them and hypothesized that certain pesticides may be linked to autoimmune diseases such as psoriasis.
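
The idea of a network in which each spectrum is linked to similar ones can be illustrated with a small sketch. It assumes that every spectrum has already been turned into a numeric vector (an embedding) and links each one to its most similar neighbours by cosine similarity. The function names, the toy three-dimensional vectors, and the sample labels are hypothetical and chosen only for illustration; this is not the published DreaMS Atlas code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_graph(embeddings, k):
    """Link each spectrum to its k most similar neighbours.

    embeddings: dict mapping a spectrum id to its embedding vector.
    Returns a dict mapping each id to a list of (neighbour_id, similarity)
    pairs, sorted from most to least similar.
    """
    graph = {}
    for name, vec in embeddings.items():
        scored = [
            (other, cosine(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != name
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        graph[name] = scored[:k]
    return graph


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "pesticide_sample": [0.9, 0.1, 0.0],
    "food_sample": [0.8, 0.2, 0.1],
    "skin_sample": [0.7, 0.3, 0.0],
    "soil_sample": [0.0, 0.1, 0.9],
}

atlas = knn_graph(toy_embeddings, k=2)
for spectrum_id, neighbours in atlas.items():
    print(spectrum_id, "->", [n for n, _ in neighbours])
```

In this toy data the pesticide, food, and skin vectors point in similar directions, so they end up linked to one another, while the soil vector sits apart. A brute-force comparison like this only suits small collections; tens of millions of spectra would need an approximate nearest-neighbour index instead.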

In addition to connecting spectra from different studies, DreaMS can also be used for various practical tasks – for instance, to estimate how many specific fragments a molecule contains or whether it includes particular chemical elements. “We were especially surprised that the model learned to detect fluorine,” says Roman Bushuiev. “Fluorine is present in about one-third of all drugs and agrochemicals, but we were previously unable to reliably detect it from the mass spectrum. After pretraining DreaMS on millions of spectra, we fine-tuned it with a few thousand examples of fluorine-containing molecules – and suddenly it worked.”
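
The "pretrain first, then adapt with a small labelled set" pattern described above can be sketched in a few lines. The sketch uses a simpler stand-in than the full fine-tuning the researchers describe: it keeps the embeddings fixed and trains only a small logistic-regression classifier on top of them (often called linear probing). The two-dimensional embeddings and the labels are invented toy values, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=500, lr=0.5):
    """Fit a logistic-regression head with full-batch gradient descent."""
    n_dims = len(features[0])
    n = len(features)
    weights = [0.0] * n_dims
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_dims
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return True if the head predicts the positive class for x."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(score) >= 0.5


# Invented toy embeddings standing in for the output of a pretrained model.
# Label 1 = molecule contains fluorine, 0 = it does not (hypothetical data).
embeddings = [
    [0.9, 0.1], [0.8, 0.3], [0.7, 0.2],
    [0.1, 0.8], [0.2, 0.9], [0.3, 0.7],
]
has_fluorine = [1, 1, 1, 0, 0, 0]

weights, bias = train_head(embeddings, has_fluorine)
print(predict(weights, bias, [0.85, 0.15]))
print(predict(weights, bias, [0.15, 0.85]))
```

The point of the sketch is the division of labour: the expensive, label-free pretraining produces the embeddings, and only a small model has to be learned from the scarce labelled examples.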

The researchers are now working on the next step: teaching the model to predict entire molecular structures. If successful, it could fundamentally transform our understanding of chemical diversity – whether on planet Earth or beyond.

Original article

Bushuiev, R.; Bushuiev, A.; Samusevich, R.; Brungs, C.; Sivic, J.; Pluskal, T. Self-supervised learning of molecular representations from millions of tandem mass spectra using DreaMS. Nat. Biotechnol. 2025. https://doi.org/10.1038/s41587-025-02663-3

LabRulez s.r.o. All rights reserved. Content available under a CC BY-SA 4.0 Attribution-ShareAlike