Workshop: AI4S: 6th Workshop on Artificial Intelligence and Machine Learning for Scientific Applications
Authors: Kazuyuki Yasuda, Masahito Kumagai, Masayuki Sato, Kazuhiko Komatsu, and Hiroaki Kobayashi (Tohoku University)
Abstract: Three-dimensional electron diffraction (3D ED) has become an essential technique for determining high-resolution molecular structures. In typical 3D ED experiments, hundreds to thousands of molecular structures are reconstructed from a single sample. However, many of these structures are inaccurate. Traditionally, researchers have had to manually inspect each structure to identify valid ones, which is a time-consuming and labor-intensive process.
We propose an LLM-centric automatic screening method to efficiently identify correct molecular structures from 3D ED outputs. The method proceeds through three stages: (1) rule-based filtering to eliminate clearly impossible candidates, (2) classification by a fine-tuned LLM trained on both correct and artificially generated corrupted molecules, and (3) grouping to merge identical topologies. This combination allows diverse 3D ED datasets to be classified quickly and accurately.
This method substantially reduces the manual burden and enables efficient large-scale classification of 3D ED data.
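The three stages described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the `Candidate` container, the bond-length thresholds, the `classify` callable (a stand-in for the fine-tuned LLM), and the `topology_key` grouping rule are all hypothetical and chosen only to illustrate the control flow of filter, classify, then group.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one reconstructed structure."""
    name: str
    elements: Tuple[str, ...]                  # element symbol per atom index
    bonds: Tuple[Tuple[int, int, float], ...]  # (atom i, atom j, length in angstroms)


def passes_rules(c: Candidate, min_len: float = 0.7, max_len: float = 2.5) -> bool:
    """Stage 1: rule-based filter. The thresholds are illustrative assumptions."""
    return bool(c.bonds) and all(min_len <= d <= max_len for _, _, d in c.bonds)


def topology_key(c: Candidate) -> Tuple[Tuple[str, ...], ...]:
    """Stage 3 helper: sorted list of bonded element pairs, ignoring geometry.

    This is a crude proxy for "identical topology"; a real system would
    more likely compare molecular graphs up to isomorphism.
    """
    pairs = (tuple(sorted((c.elements[i], c.elements[j]))) for i, j, _ in c.bonds)
    return tuple(sorted(pairs))


def screen(
    candidates: Iterable[Candidate],
    classify: Callable[[Candidate], bool],
) -> Dict[Tuple[Tuple[str, ...], ...], List[Candidate]]:
    """Run the three stages in order and return the accepted candidates by group."""
    groups: Dict[Tuple[Tuple[str, ...], ...], List[Candidate]] = {}
    for c in candidates:
        if not passes_rules(c):   # Stage 1: drop clearly impossible candidates
            continue
        if not classify(c):       # Stage 2: stand-in for the fine-tuned LLM
            continue
        groups.setdefault(topology_key(c), []).append(c)  # Stage 3: merge
    return groups


# Toy usage with a stub classifier that accepts everything it is shown.
example_candidates = [
    Candidate("a", ("C", "O"), ((0, 1, 1.43),)),
    Candidate("b", ("O", "C"), ((0, 1, 1.41),)),  # same connectivity as "a"
    Candidate("c", ("C", "O"), ((0, 1, 0.20),)),  # implausibly short bond
]
example_groups = screen(example_candidates, classify=lambda c: True)
```

Passing the classifier in as a callable keeps the cheap rule-based filter ahead of the more expensive model call, which is the ordering the abstract describes, and lets the model be swapped without touching the other stages.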