09 November 2021 Machine learning reveals the atomic structure of large and complex molecules

Schematic illustration of the molecular structure of the Fenchone molecule.

Schematic of the machine learning framework, which employs a convolutional neural network (CNN) for LIED.

ICFO researchers report in Communications Chemistry on a newly developed machine learning algorithm for laser-induced electron diffraction (LIED) that extracts the three-dimensional structure of large and complex molecules. Until very recently, the idea of watching how molecules break or transform during chemical reactions was unfathomable. In 2016, researchers from ICFO developed mid-IR-driven LIED with kinematic coincidence detection to image the position of each and every atom inside a single molecule using one of the molecule's own electrons. The combined picometer spatial and attosecond temporal resolution allowed them to image and track the breakup of a molecular bond in acetylene (C2H2) nine femtoseconds after its ionization, an approach they dubbed the “molecular selfie”.

Applying the LIED technique to take snapshots of small gas-phase molecules proved to be an extremely powerful way to understand how molecules react, change, break and bend. However, the technique had never been applied to more complex molecular structures, because the larger the molecule is, the harder the structural retrieval becomes: many thousands of molecular configurations must be calculated for all possible orientations of the molecule, which would take a prohibitively long time.

In a recent study published in Communications Chemistry, ICFO researchers Xinyao Liu, Kasra Amini, Aurelien Sanchez, Blanca Belsa and Tobias Steinle, led by ICREA Professor at ICFO Jens Biegert, report on a solution to this problem: a newly developed machine learning algorithm for LIED that extracts the three-dimensional structure of large and complex molecules.

In their experiment, the team developed a machine learning (ML) model and combined it with a convolutional neural network (CNN) algorithm which, according to the researchers, is well suited to image-recognition problems because it identifies subtle features in an image at different levels of complexity, much as a human brain does. Using the CNN-ML framework, the team could drastically reduce the pre-calculated database of configurations needed to unambiguously identify a complex chiral molecular structure such as the fenchone molecule.

This result is of major importance because determining the 3D structure of complex molecules with sufficient structural resolution has so far been a very difficult challenge. The study is a major step forward in this field: it shows that the combination of LIED, machine learning and a CNN can not only predict and determine the structure of these large molecules, but also do so within a reasonable computing time.