LKR Leichtmetallkompetenzzentrum Ranshofen GmbH
Predicting the mechanical properties of aluminum alloys is critical for optimizing their performance in industrial applications. However, data-driven methods are often held back by the limited size of available datasets. We automated the extraction of chemical compositions, process parameters, and mechanical properties from a large number of published research articles using a locally hosted Large Language Model (LLM). After cleaning the data, we performed physics-based feature engineering using basic elemental properties as well as the CALculation of PHAse Diagrams (CALPHAD) approach via the MatCalc software. Features were then selected with a genetic algorithm. The trained machine learning models show promising results in cross-validation on the LLM-extracted dataset, but generalize only to a limited extent to independent datasets. By sharing our methods as open-source code, we provide the materials science community with a practical tool and demonstrate the potential of LLMs for automating scientific data extraction and processing in combination with physics-based feature engineering.
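To illustrate what feature engineering from basic elemental properties can look like, the following minimal Python sketch converts an alloy composition from weight percent to atomic fractions and computes composition-weighted electronegativity descriptors. It is not the authors' published code: the function names, the choice of descriptors, and the example composition are assumptions made purely for illustration, and the CALPHAD/MatCalc and genetic-algorithm steps are not covered.

```python
# Hypothetical sketch, not the authors' implementation.
# Elemental data: standard atomic masses (g/mol) and Pauling electronegativities.
ELEMENTS = {
    "Al": {"mass": 26.982, "en": 1.61},
    "Mg": {"mass": 24.305, "en": 1.31},
    "Si": {"mass": 28.085, "en": 1.90},
    "Cu": {"mass": 63.546, "en": 1.90},
    "Zn": {"mass": 65.38, "en": 1.65},
}


def atomic_fractions(wt_percent):
    """Convert a composition in weight percent to atomic fractions."""
    moles = {el: w / ELEMENTS[el]["mass"] for el, w in wt_percent.items()}
    total = sum(moles.values())
    return {el: n / total for el, n in moles.items()}


def elemental_features(wt_percent):
    """Return composition-weighted mean and spread of electronegativity."""
    x = atomic_fractions(wt_percent)
    mean_en = sum(x[el] * ELEMENTS[el]["en"] for el in x)
    var_en = sum(x[el] * (ELEMENTS[el]["en"] - mean_en) ** 2 for el in x)
    return {"mean_en": mean_en, "std_en": var_en ** 0.5}


# Assumed example composition (wt%), chosen only to exercise the functions.
example_alloy = {"Al": 97.0, "Mg": 1.0, "Si": 0.6, "Cu": 1.4}
features = elemental_features(example_alloy)
```

Descriptors of this kind can be computed for every extracted alloy record and passed, together with process parameters, to a feature-selection step.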
Poster
© 2025