Collaborative Achievement Released at WAIC - "AISI Beyond Boundaries: Exploring Fundamental Research Forum for Next-Generation Large Models"
AI4EC Lab, 2024/7/13

The forum "Beyond Boundaries: Exploring Fundamental Research for Next-Generation Large Models" was held on July 4 to further unleash the innovative potential of artificial intelligence and to advance two technological frontiers: "AI for Science infrastructure" and "next-generation general AI methodologies". The event was organized by the World Artificial Intelligence Conference (WAIC) Organizing Committee Office and co-hosted by the Shanghai Institute of Algorithm Innovation and the Beijing Institute of Scientific Intelligence (BISI).

During the product launch session of the forum, Dr. Linfeng Zhang, President of the Beijing Institute of Scientific Intelligence, Founder of DP Technology, and Deputy Director of the AI4EC Lab, officially unveiled the Large Atomistic Model (OpenLAM). As a model solution for dynamic catalysis, "DPA-DynaCat"—jointly developed by AI4EC Lab, the Beijing Institute of Scientific Intelligence, and DP Technology—was introduced as part of the OpenLAM product release. DPA-DynaCat is a general-purpose machine learning potential model designed for dynamic catalytic reaction systems on metal nanocluster surfaces. Trained on datasets of reaction states involving various elemental compositions and cluster sizes, it enables free energy calculations for dynamic catalytic reactions using molecular dynamics and enhanced sampling algorithms.

Background

As energy and environmental issues have become strategic priorities globally, catalysis science has gained renewed momentum in critical areas such as new energy development and carbon emission reduction. To achieve a comprehensive atomic-scale understanding of catalytic processes and explore the dynamic evolution of catalyst structures under in-situ reaction conditions, AI4EC Lab launched ai²-cat—an in-house developed intelligent computational workflow for dynamic catalysis—earlier this year. By constructing machine learning potentials for simulating dynamic catalytic processes, ai²-cat enables automated, high-throughput free energy calculations, thereby extracting key information about catalytic reaction mechanisms.

Leveraging the large-scale, high-precision computational data generated by ai²-cat, we have established DynaCat, a specialized dynamic catalysis database encompassing small molecule dissociation processes on common metal nanoclusters. Built upon an efficient automated computational workflow and standardized data collection and management, DynaCat provides new insights and tools for catalyst research and will continue to expand in both data volume and dimensionality.

Figure 2: The DynaCat Database

As data accumulates and scales, our integrated physicochemical property database will enable more efficient data management and operations. Building upon this high-efficiency workflow and specialized database, we have collaborated with the Beijing Institute of Scientific Intelligence and DP Technology to develop and release DPA-DynaCat—a general-purpose potential model targeting elementary dynamic catalytic reactions on metal nanocluster surfaces—aimed at accelerating the computation and prediction of dynamic properties of metal nanocatalysts under reactive conditions.

Application

The training data for the DPA-DynaCat model were obtained via enhanced sampling of metastable and transition-state structures from dissociation reactions of small molecules (O₂, H₂, H₂O, CO, CO₂, CH₄) on pure metal (Au/Ag/Cu/Pt/Pd/Ni) and binary alloy nanoclusters. This approach ensures comprehensive coverage across the reaction coordinate, overcoming the limitation of conventional sampling methods that capture only metastable states, and enables efficient training of a potential function capable of describing the entire reaction system.

The training data were labeled using DFT calculations performed with CP2K, employing the PBE functional and TZV2P basis set. The dataset comprises 619,642 data points covering over 9,000 distinct reaction systems. The general-purpose potential model was trained from scratch using the DPA-1 framework.

Evaluation

We evaluated model performance in three respects: (1) energy and force errors on the training set; (2) energy and force errors on the test set; (3) errors in reaction free energy calculations.

(1) Training Set Energy/Force Errors

On the training set, the model achieves a root mean square error (RMSE) of 26.8 meV/atom for energy predictions and 199.5 meV/Å for force predictions. As shown in the figures below, DPA-DynaCat predicts the structural energies and atomic forces of adsorbed reaction states on metal and alloy clusters well for all tested elements except Pt.

Figure 3: Distribution of model prediction errors for structural energies of adsorbed reaction states on metal clusters of different elemental compositions
Figure 4: Distribution of model prediction errors for atomic forces of adsorbed reaction states on metal clusters of different elemental compositions
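The energy and force RMSE values quoted in this evaluation can be illustrated with a minimal sketch. It assumes the common convention for machine learning potentials, in which the energy error of each structure is divided by its atom count before averaging and the force error is averaged over all Cartesian components. The article does not state the exact convention used, and the function names and toy numbers below are hypothetical.

```python
import math


def energy_rmse_per_atom(e_pred, e_ref, n_atoms):
    """RMSE of per-atom energy errors.

    e_pred, e_ref: total energies per structure (eV).
    n_atoms: number of atoms in each structure.
    Returns the RMSE in eV/atom (assumed convention, see lead-in).
    """
    sq = [((p - r) / n) ** 2 for p, r, n in zip(e_pred, e_ref, n_atoms)]
    return math.sqrt(sum(sq) / len(sq))


def force_rmse(f_pred, f_ref):
    """RMSE over every Cartesian force component.

    f_pred, f_ref: per-structure lists of (fx, fy, fz) tuples in eV/Å.
    Returns the RMSE in eV/Å.
    """
    sq = [
        (cp - cr) ** 2
        for frame_p, frame_r in zip(f_pred, f_ref)
        for atom_p, atom_r in zip(frame_p, frame_r)
        for cp, cr in zip(atom_p, atom_r)
    ]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    # Toy data (hypothetical): two structures with 10 and 20 atoms.
    e_model = [10.2, 19.6]
    e_dft = [10.0, 20.0]
    print(1000 * energy_rmse_per_atom(e_model, e_dft, [10, 20]), "meV/atom")

    # One structure with two atoms.
    f_model = [[(0.1, 0.0, 0.0), (0.0, -0.1, 0.0)]]
    f_dft = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
    print(1000 * force_rmse(f_model, f_dft), "meV/Å")
```

Normalizing by atom count lets clusters of different sizes be compared on the same scale, which matters here because the dataset spans multiple cluster sizes.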

(2) Test Set Energy/Force Errors

We constructed a test dataset by performing enhanced sampling simulations on catalytic reaction systems involving 15 binary alloys with compositions and sizes not covered in the training data. Evaluation results indicate that the trained general-purpose potential achieves accurate energy and force predictions across various compositions. The energy prediction RMSE ranges from 10 to 30 meV/atom, and the force prediction RMSE generally falls between 100 and 200 meV/Å, meeting expectations.

Figure 5: Distribution of model-predicted energy errors on test sets of alloy clusters with different compositions
Figure 6: Distribution of model-predicted atomic force errors on test sets of alloy clusters with different compositions

The RMSE values for energy and force predictions across different alloy systems are shown below:

(3) Reaction Free Energy Calculation Errors

Compared with baseline free energy surfaces obtained from system-specific model training and simulation, the current general-purpose model provides reasonably accurate predictions of energy barriers and reaction free energies. For pure metal systems, fine-tuning with DPA-2 yields nearly identical free energy profiles. However, predictions for alloy systems remain unsatisfactory, with errors arising primarily from inaccuracies in energy prediction.

Figure 7: Model errors in reaction free energy calculations
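To make the compared quantities concrete, the following minimal sketch reads an energy barrier and a reaction free energy off a one-dimensional free energy profile. It assumes the profile has already been computed on a grid along a single collective variable, and that the reactant and product basins are given as index ranges. The function name, the basin ranges, and the toy profile are hypothetical; the article does not describe the lab's actual analysis procedure.

```python
def barrier_and_reaction_free_energy(profile, reactant_range, product_range):
    """Extract barrier and reaction free energy from a 1D profile.

    profile: free energies (eV) on a grid along one collective variable.
    reactant_range, product_range: (start, stop) index ranges, stop exclusive,
        marking where each basin is expected (an assumption of this sketch).
    Returns (barrier, reaction_free_energy), both relative to the reactant minimum.
    """
    r_lo, r_hi = reactant_range
    p_lo, p_hi = product_range
    # Locate the minimum of each basin.
    i_r = min(range(r_lo, r_hi), key=lambda i: profile[i])
    i_p = min(range(p_lo, p_hi), key=lambda i: profile[i])
    # The transition state is taken as the highest point between the two minima.
    lo, hi = sorted((i_r, i_p))
    f_ts = max(profile[lo:hi + 1])
    return f_ts - profile[i_r], profile[i_p] - profile[i_r]


if __name__ == "__main__":
    # Toy profile (hypothetical values, eV).
    toy = [0.0, 0.3, 0.9, 0.4, -0.2]
    barrier, delta_f = barrier_and_reaction_free_energy(toy, (0, 2), (3, 5))
    print(f"barrier = {barrier:.2f} eV, reaction free energy = {delta_f:.2f} eV")
```

Applying the same function to a profile from the general-purpose model and to one from a system-specific baseline, then taking the differences, gives errors of the kind reported in this section.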

Trial Access

DPA-DynaCat is now available on the Scientific Intelligence Square (AI-Square) platform (https://www.aissquare.com/openlam). The model can be directly downloaded and deployed in local computing environments, or accessed and used via the platform’s application interface.