IBM Research and Thieme Chemistry’s partnership combines machine learning and manually curated data | Scientific Computing World

2021-12-13 22:31:45 By : Ms. Rose Lee


IBM Research Europe and Thieme Chemistry have announced the first results of their collaboration, which were evaluated by seven well-known synthetic chemistry experts and their research teams from China, Germany, Switzerland, New Zealand and the United States.

Professor Dame Margaret Brimble from the University of Auckland in New Zealand commented: “This innovative IBM/Thieme Chemistry platform provides synthetic chemistry researchers with an effective tool to verify their own retrosynthesis plans while also providing alternative solutions. It can carry out a rigorous evaluation of the design phase of the retrosynthesis of a given target, which will undoubtedly bring benefits in the implementation of the selected synthesis plan.”

The collaboration between IBM Research Europe and Thieme Chemistry is based on the synergy between high-quality data and state-of-the-art machine learning models for predicting organic synthesis. RXN for Chemistry is a cloud platform based on artificial intelligence (AI), which was recently trained on high-quality, manually curated datasets from Thieme's Science of Synthesis and Synfacts.

Organic compounds can react with each other in thousands of different ways. Empirical knowledge is key to helping organic chemists avoid countless rounds of trial and error in the laboratory. To improve synthesis planning, IBM Research and Thieme Chemistry combined Science of Synthesis, Thieme's full-text resource on methods in synthetic organic chemistry with expert-curated datasets, and the commentary of the journal Synfacts with an artificial intelligence model called the Molecular Transformer in IBM RXN for Chemistry.

The Molecular Transformer is a neural machine translation model used to reliably predict the outcome of chemical reactions. It was later extended to cover retrosynthesis analysis, that is, determining the chemicals needed to create a given target molecule. The model has proved very successful at learning the chemical reactivity information present in chemical reaction datasets. However, it is limited by the content and correctness of those datasets.
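To make the translation framing concrete, here is a minimal Python sketch of how a single reaction record could be turned into training pairs for both directions. It assumes reactions are written as SMILES strings (a text notation for molecules, which the article itself does not mention), and the tokenizer pattern and the `tokenize` and `make_pairs` helpers are simplified, hypothetical stand-ins rather than the code used in RXN for Chemistry.

```python
import re

# Simplified SMILES tokenizer (an illustrative assumption, not RXN's own code).
# Order matters: bracket atoms and two-letter halogens are tried before
# single letters so that "Br" is one token rather than "B" + "r".
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().:~@*$>]"
)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Refuse input containing characters the simplified pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in {smiles!r}")
    return tokens


def make_pairs(reactants, product):
    """Build (source, target) token pairs for both prediction directions."""
    forward = (tokenize(reactants), tokenize(product))  # reactants -> product
    retro = (tokenize(product), tokenize(reactants))    # product -> reactants
    return forward, retro


# Esterification of acetic acid with ethanol, written as SMILES.
forward, retro = make_pairs("CC(=O)O.OCC", "CC(=O)OCC")
print("forward source:", " ".join(forward[0]))
print("forward target:", " ".join(forward[1]))
```

The same record serves both tasks: the forward pair asks the model to "translate" reactants into a product, and the retrosynthesis pair simply swaps source and target. This also shows why data quality matters so much, since an incorrect record yields two incorrect training examples.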

Science of Synthesis and Synfacts cover a wide range of reaction space. Models trained on commercial patent datasets generally perform poorly on many of these reactions. The higher quality of the chemical records in Science of Synthesis and Synfacts is reflected in the greater proportion of usable records. This consistency in the Thieme dataset facilitates the learning process of the AI model and leads to more consistent predictions: the results show that the model trained on Thieme data on the RXN for Chemistry platform improves prediction accuracy threefold for forward prediction and ninefold for retrosynthesis.

The collaboration between Thieme and IBM Research Europe demonstrates the impact of high-quality chemical reaction data on future AI tools for chemical synthesis. The integration of high-quality, carefully curated data from Science of Synthesis and Synfacts provides a unique opportunity to raise the chemical performance of RXN to unprecedented levels, because it unlocks the knowledge contained in hundreds of thousands of chemical reaction records.

Professor Richmond Sarpong of the University of California, Berkeley said: “The sustainable future of synthesis will include minimizing the number of non-productive strategies by running only those reactions that lead to productive results. This can only be achieved through a combination of computer design and human design, which makes the cooperation between IBM and Thieme Chemistry exciting.”

Also involved in testing the retrained model were Prof. Alois Fürstner (MPI Mülheim, Germany), Prof. Karl Gademann and Prof. Cristina Nevado (University of Zurich, Switzerland), Prof. Ang Li (Shanghai Institute of Organic Chemistry, China), and Prof. Dirk Trauner (New York University, USA), together with their research teams.
