Kanazawa University
Formulating design principles for functional materials using machine learning often depends on interpreting the descriptors that determine prediction accuracy[1]. At the atomistic level, materials design begins with the elements of the periodic table as the fundamental building blocks, whose properties and interactions determine physical behavior. Materials design using machine learning therefore requires robust descriptors that capture correlations among elemental features while remaining interpretable. However, traditional descriptors, typically hand-crafted numerical quantities such as atomic radii or electronegativity, are labor-intensive to construct and inherently limited by the features chosen a priori[2]. Neural-network-based representations are powerful, but they often lack transparency[3]. Motivated by advances in natural language processing, we propose text-based representations of elements as descriptors, obtained by fine-tuning BERT[4]. In preliminary experiments, we fine-tuned BERT to classify elements into their periodic groups and analyzed the resulting embeddings, which revealed a meaningful organization reflecting chemical similarity. These results suggest that text-based representations can produce chemically meaningful embeddings that capture predictive features and organize elements in line with known periodic trends.
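One way to check whether element embeddings reflect chemical similarity, as described above, is to ask whether each element's nearest neighbor in embedding space belongs to the same periodic group. The following is a minimal sketch of such a check, not the procedure used in this work. The vectors in `toy_embeddings` are invented placeholders standing in for embeddings extracted from a fine-tuned model, and the function names are hypothetical.

```python
import math

# ASSUMPTION: invented placeholder vectors for illustration only. They stand in
# for embeddings taken from a fine-tuned model and are not results of any kind.
toy_embeddings = {
    "Li": [0.90, 0.10, 0.00],
    "Na": [0.80, 0.20, 0.10],
    "K":  [0.85, 0.15, 0.05],
    "F":  [0.10, 0.90, 0.10],
    "Cl": [0.20, 0.80, 0.00],
    "Br": [0.15, 0.85, 0.10],
}

# Periodic group of each element (alkali metals: group 1, halogens: group 17).
groups = {"Li": 1, "Na": 1, "K": 1, "F": 17, "Cl": 17, "Br": 17}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor(symbol, embeddings):
    """Return the other element whose embedding is most similar to `symbol`."""
    candidates = (other for other in embeddings if other != symbol)
    return max(
        candidates,
        key=lambda other: cosine_similarity(embeddings[symbol], embeddings[other]),
    )


def same_group_neighbor_rate(embeddings, groups):
    """Fraction of elements whose nearest neighbor shares their periodic group."""
    hits = sum(
        groups[symbol] == groups[nearest_neighbor(symbol, embeddings)]
        for symbol in embeddings
    )
    return hits / len(embeddings)


if __name__ == "__main__":
    rate = same_group_neighbor_rate(toy_embeddings, groups)
    print(f"Same-group nearest-neighbor rate: {rate:.2f}")
```

A rate close to 1 would indicate that the embedding space groups chemically similar elements together. With real embeddings, the same check could be run over all elements and compared against a shuffled-label baseline.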
References:
[1] L. M. Ghiringhelli et al., Physical Review Letters 114, (2015).
[2] L. Ward et al., npj Computational Materials 2, (2016).
[3] X. Zhong et al., npj Computational Materials 8, (2022).
[4] J. Devlin et al., (2018). arXiv:1810.04805
© 2025