Electrocatalysis is a key technology for sustainable energy conversion and for reducing carbon emissions. Electrocatalytic reactions occur under specific environmental conditions and involve interactions with surrounding species, so the catalytically active phase often forms only during operation. Traditional algorithms for active phase discovery tend to be condition-dependent and struggle to capture disordered and dynamic structures, which highlights the need for new paradigms for exploring catalytic active phases.
Professor Feng Pan’s team at the School of Advanced Materials (SAM), Peking University Shenzhen Graduate School, has long integrated graph theory and algebraic topology with structural chemistry by mapping chemical structures onto mathematical models. This work has produced a series of materials research methods, including a graph-theory-based structural chemistry approach that resolved the challenge of crystallographic isomorphism (Sci. China Chem., 2019, DOI: 10.1007/s11426-019-9502-5), a database of 650,000 crystal structures, and subsequent research in materials genomics and AI for Science (AI4S). These approaches have been applied to the discovery of low-dimensional materials (National Science Review, 2022, DOI: 10.1093/nsr/nwac028) and the design of new solid-state electrolytes (J. Am. Chem. Soc., 2024, 146, 27, 18535–18543). The team further developed an active-learning framework based on graph representation, graph isomorphism, and machine learning that can rapidly predict optimal thermodynamic pathways in catalytic reaction networks containing hundreds of intermediate species (CCS Chem., 2024, 7, 1–14).
Recently, Pan’s team, in collaboration with the teams of Jianfeng Li and Shisheng Zheng at Xiamen University, proposed an automated framework for active phase discovery in heterogeneous catalysis that integrates graph-theory-based structural chemistry, topological data analysis, and machine-learning force fields. By coupling a topology-guided sampling algorithm with machine learning, the study achieved systematic sampling and efficient computation of active phases across diverse catalytic materials. The framework offers a new technical route for mechanistic studies of heterogeneous catalysis and for catalyst design. The results were published in Nature Communications (2025, 16, 2542) under the title “Active Phase Discovery in Heterogeneous Catalysis via Topology-Guided Sampling and Machine Learning.”
In heterogeneous catalysis, identifying the active phase is essential for understanding reaction mechanisms. However, the atomic structure of catalyst surfaces and interiors evolves in complex ways under reaction conditions, and traditional computational methods struggle to cover the vast structural space efficiently. To address this challenge, the team proposed a persistent-homology-based sampling algorithm (PH-SA). Persistent homology is a tool from algebraic topology that tracks how loops and voids in a set of points appear and disappear as a length scale grows. PH-SA uses this analysis to detect potential adsorption and embedding sites in a bottom-up manner. Because it places no constraints on morphology, PH-SA can explore interactions between active species and surfaces, subsurfaces, and even bulk regions, and it applies to both crystalline and amorphous structures.
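To make the idea of topology-based site detection concrete, the following is a minimal sketch, not the team's PH-SA implementation. It assumes a Vietoris-Rips filtration over atomic positions, computes only one-dimensional classes (loops) with the standard boundary-matrix reduction over Z/2, and places a candidate site at the centroid of each loop's atoms. The function name `candidate_sites`, the `max_edge` cutoff, and the centroid placement rule are all illustrative assumptions; detecting enclosed three-dimensional voids would additionally require tetrahedra.

```python
from itertools import combinations
from math import dist


def candidate_sites(points, max_edge):
    """Return (birth, death, centre) for each finite 1-D persistence class.

    Illustrative only: builds a Vietoris-Rips filtration up to triangles and
    reduces its boundary matrix over Z/2 with the standard column algorithm.
    """
    # Every simplex is stored as (filtration value, dimension, vertex tuple).
    simplices = [(0.0, 0, (i,)) for i in range(len(points))]
    for size in (2, 3):
        for verts in combinations(range(len(points)), size):
            value = max(dist(points[a], points[b])
                        for a, b in combinations(verts, 2))
            if value <= max_edge:
                simplices.append((value, size - 1, verts))
    # Sorting by (value, dimension) guarantees faces come before cofaces.
    simplices.sort()
    index = {verts: i for i, (_, _, verts) in enumerate(simplices)}

    reduced = {}  # pivot row -> reduced boundary column owning that pivot
    sites = []
    for value, dim, verts in simplices:
        if dim == 0:
            continue
        column = {index[face] for face in combinations(verts, dim)}
        while column and max(column) in reduced:
            column = column ^ reduced[max(column)]  # add columns modulo 2
        if not column:
            continue  # this simplex creates a new cycle instead of killing one
        pivot = max(column)
        reduced[pivot] = column
        birth = simplices[pivot][0]
        if dim == 2 and value > birth:
            # The reduced column is a representative loop of the dying class.
            ring = sorted({v for edge in column for v in simplices[edge][2]})
            centre = tuple(sum(points[v][k] for v in ring) / len(ring)
                           for k in range(len(points[0])))
            sites.append((birth, value, centre))
    return sites


# Toy input (assumed): four atoms on a unit square, resembling one four-fold
# hollow of a square surface lattice, in arbitrary length units.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
sites = candidate_sites(square, max_edge=2.0)
```

For this toy square the sketch reports a single loop that is born when the four sides connect (length 1) and dies when the diagonals fill it in (length √2), with its centre at (0.5, 0.5), which is where a four-fold hollow site would sit. The gap between birth and death gives a rough measure of how open the site is.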

Figure 1. Overview of the persistent homology–based sampling algorithm (PH-SA) and the general framework for active phase exploration.
The team demonstrated the framework on two representative systems. In the Pd–H system, more than 50,000 possible hydrogen adsorption/embedding configurations were screened, and machine-learning force fields were used to calculate the distribution of active phases at different hydrogen concentrations. Under electrochemical conditions, the Pd(100) surface was found to reconstruct from four-fold to six-fold vacancies, a phenomenon closely associated with enhanced catalytic activity in CO2 electroreduction. In the Pt–O system, over 100,000 oxidation configurations of Pt nanoclusters were analyzed. With increasing oxygen concentration, Pt55 clusters gradually developed internal Pt–O coordination structures, which reduced their activity for the oxygen reduction reaction. These findings are consistent with experimental observations.
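Once the energies of many sampled configurations are available, a common way to decide which hydrogen content is favored under given electrochemical conditions is to compare grand potentials. The sketch below illustrates that step under stated assumptions and is not taken from the paper: it uses the computational hydrogen electrode relation, ignores zero-point and entropic corrections, and all energies are placeholder numbers. The names `most_stable_phase` and `mu_h_at_potential` are hypothetical.

```python
def most_stable_phase(configs, e_clean, mu_h):
    """Pick the configuration with the lowest grand potential.

    configs: iterable of (label, n_h, energy_eV); e_clean: energy of the
    hydrogen-free structure; mu_h: hydrogen chemical potential in eV.
    """
    def grand_potential(cfg):
        _, n_h, energy = cfg
        return energy - e_clean - n_h * mu_h
    return min(configs, key=grand_potential)[0]


def mu_h_at_potential(half_e_h2, u_rhe):
    """Computational hydrogen electrode: mu_H = 1/2 E(H2) - e*U."""
    return half_e_h2 - u_rhe  # energies in eV, U in volts vs. RHE


# Placeholder numbers for illustration, NOT results from the paper.
E_CLEAN = -100.0
HALF_E_H2 = -3.4
CONFIGS = [("H0", 0, -100.0), ("H1", 1, -103.5),
           ("H2", 2, -106.8), ("H4", 4, -113.0)]

# Most stable hydrogen content at three applied potentials.
phases = {u: most_stable_phase(CONFIGS, E_CLEAN, mu_h_at_potential(HALF_E_H2, u))
          for u in (0.3, 0.0, -0.5)}
```

With these placeholder energies, the hydrogen-free structure is favored at +0.3 V, the single-hydrogen configuration at 0 V, and the most hydrogen-rich configuration at −0.5 V, reflecting the general trend that more negative potentials stabilize higher hydrogen content.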

Figure 2. Analysis of PdHx systems at different hydrogen concentrations.

Figure 3. Analysis of PtOx systems at different oxygen concentrations.
The study demonstrates that topology-based sampling can overcome the limitations of traditional intuition-driven sampling strategies. Combined with the computational efficiency of machine-learning force fields, the approach enables automated exploration of large structural spaces. The research team emphasized that the method is not limited to metal catalysts and can be extended to other complex systems, such as CuOx phase transitions in CO2 electroreduction and the energy storage mechanisms of SiOx in lithium-ion batteries.
Professor Feng Pan of the School of Advanced Materials, Peking University Shenzhen Graduate School (PKUSZ); Dr. Shisheng Zheng, a PhD graduate of PKUSZ; and Professor Jianfeng Li of Xiamen University are the corresponding authors of this work. Shisheng Zheng and Ximing Zhang from the Institute of Artificial Intelligence, Xiamen University, are the first authors. This research was supported by the National Natural Science Foundation of China and the Guangdong Provincial Key Laboratory programs.