Bundesanstalt für Materialforschung und -prüfung (BAM)
High-throughput materials simulations are an integral part of modern materials science. However, there is no straightforward way to recognize synthesizable materials before feeding them to simulation pipelines. Common heuristics for distinguishing stable crystals, such as the Pauling rules, have been shown to have limited predictive power [1]. Beyond stability, factors such as reaction kinetics and technological limitations significantly affect synthesizability.
In this study, we develop a machine learning model for predicting the synthesizability of crystals. The task can be formulated as a classification problem with positive (experimental) and unlabeled (theoretical) data. We employ an iterative positive and unlabeled (PU) learning approach to build and train our model. Two deep learning classifiers are used, SchNetPack [2] and ALIGNN [3]. We combine them via co-training [4] to learn the positive distribution more comprehensively and increase prediction reliability.
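The idea of iterative PU learning combined with co-training can be illustrated with a minimal sketch. It is not the implementation used in this study: simple logistic regressions on two feature subsets stand in for SchNetPack and ALIGNN, the data are synthetic, and all function names, the resampling scheme, and the way the two classifiers exchange information are assumptions made for illustration only.

```python
import numpy as np

def _add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logreg(X, y, lr=0.1, steps=300):
    """Gradient-descent logistic regression (stand-in for a deep classifier)."""
    Xb = _add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w

def predict_proba(w, X):
    return 1.0 / (1.0 + np.exp(-(_add_bias(X) @ w)))

def pu_scores(X_pos, X_unl, rng, n_iter=20, neg_weight=None):
    """Iterative PU learning: repeatedly draw provisional negatives from the
    unlabeled set, train positives vs. those negatives, and average the
    scores of the unlabeled samples left out of each draw."""
    n_unl = len(X_unl)
    k = min(len(X_pos), n_unl // 2)
    p = None if neg_weight is None else neg_weight / neg_weight.sum()
    total = np.zeros(n_unl)
    count = np.zeros(n_unl)
    for _ in range(n_iter):
        neg_idx = rng.choice(n_unl, size=k, replace=False, p=p)
        X = np.vstack([X_pos, X_unl[neg_idx]])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(k)])
        w = fit_logreg(X, y)
        oob = np.ones(n_unl, dtype=bool)
        oob[neg_idx] = False  # score only samples not used as negatives
        total[oob] += predict_proba(w, X_unl[oob])
        count[oob] += 1
    # Samples that were never left out keep a score of 0.
    return total / np.maximum(count, 1)

def co_train(views_pos, views_unl, rng, rounds=2):
    """Assumed co-training scheme: each classifier sees its own view, and in
    later rounds samples the other classifier considers likely positive are
    drawn less often as provisional negatives."""
    a = pu_scores(views_pos[0], views_unl[0], rng)
    b = pu_scores(views_pos[1], views_unl[1], rng)
    for _ in range(rounds - 1):
        a_new = pu_scores(views_pos[0], views_unl[0], rng, neg_weight=1.0 - b + 1e-3)
        b_new = pu_scores(views_pos[1], views_unl[1], rng, neg_weight=1.0 - a + 1e-3)
        a, b = a_new, b_new
    return (a + b) / 2.0

# Synthetic toy data: labeled positives plus an unlabeled mix of hidden
# positives and true negatives (all values are hypothetical).
rng = np.random.default_rng(0)
X_pos = rng.normal(1.5, 1.0, size=(100, 4))
is_hidden_pos = rng.random(300) < 0.3
X_unl = np.where(
    is_hidden_pos[:, None],
    rng.normal(1.5, 1.0, size=(300, 4)),
    rng.normal(-1.5, 1.0, size=(300, 4)),
)
scores = co_train((X_pos[:, :2], X_pos[:, 2:]), (X_unl[:, :2], X_unl[:, 2:]), rng)
predicted_synthesizable = scores > 0.5
```

In this toy setting the two views are simply disjoint feature columns; in a setting like the one described above, the two views would instead come from two different model architectures applied to the same crystal structures.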
The final model achieves a true-positive rate of nearly 95% on the test set of experimentally synthesized crystals and predicts that 17% of the theoretical crystals are synthesizable. These results go beyond what thermodynamic stability analysis alone can provide. One direct application is filtering structure predictions from high-throughput simulations to identify synthesizable candidates.
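For reference, the two reported quantities can be computed as sketched below. The prediction arrays are hypothetical toy values, not data from this study, and the function names are chosen for illustration. Because a PU setting has no labeled negatives, quantities such as precision or the false-positive rate cannot be measured directly in this way.

```python
import numpy as np

def true_positive_rate(pred_on_experimental_test):
    """Fraction of held-out experimental (known positive) crystals
    that the model predicts as synthesizable."""
    return float(np.mean(pred_on_experimental_test))

def predicted_positive_fraction(pred_on_theoretical):
    """Fraction of unlabeled theoretical crystals predicted synthesizable."""
    return float(np.mean(pred_on_theoretical))

# Hypothetical binary predictions (True = predicted synthesizable).
test_pred = np.array([1, 1, 1, 0, 1, 1, 1, 1, 1, 1], dtype=bool)
theo_pred = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=bool)

tpr = true_positive_rate(test_pred)
frac = predicted_positive_fraction(theo_pred)
```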
References
[1] J. George, D. Waroquiers, D. Di Stefano, G. Petretto, G. Rignanese, and G. Hautier, “The Limited Predictive Power of the Pauling Rules,” Angew. Chem., vol. 132, no. 19, pp. 7639–7645, May 2020.
[2] K. T. Schütt et al., “SchNetPack: A Deep Learning Toolbox For Atomistic Systems,” J. Chem. Theory Comput., vol. 15, no. 1, pp. 448–455, Jan. 2019.
[3] K. Choudhary and B. DeCost, “Atomistic Line Graph Neural Network for improved materials property predictions,” npj Comput. Mater., vol. 7, 2021.
[4] G. Katz, C. Caragea, and A. Shabtai, “Vertical Ensemble Co-Training for Text Classification,” ACM Trans. Intell. Syst. Technol., vol. 9, pp. 21:1–21:23, 2018.
© 2026