a. Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Pharmaceutical Sciences, Hainan University, Haikou, 570228, China;
b. School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, 510000, China
Funds:
This work was supported by the National Key R&D Program of China (Grant No.: 2023YFF1205102), the National Natural Science Foundation of China (Grant Nos.: 82273856, 22077143, and 21977127), and the Science Foundation of Guangzhou, China (Grant No.: 2024A04J2172).
Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure around a scaffold to improve both its biological activity and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One class of deep molecular generative models preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes scaffold-based structural modification difficult and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. We introduce the dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE), which encodes both bonding and nonbonding information based on strong and weak atomic interactions. Additionally, a gated multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and for accelerating drug discovery.
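To make the phrase "a gradual process that creates novel, chemically feasible molecules from noise" concrete, the following is a minimal sketch of the forward (noising) half of a generic diffusion model applied to 3D atomic coordinates. It is not the 3D-EDiffMG implementation: the linear beta schedule, the function names (`alpha_bar_schedule`, `center`, `noise_coordinates`), and the toy three-atom geometry are all assumptions chosen for illustration. The generative model would learn to reverse this process; scaffold preservation and atom-type features are not modeled here.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (an assumed choice)."""
    alpha_bars = []
    running = 1.0
    for step in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * step / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def center(coords):
    """Subtract the centroid so the point cloud has zero center of mass."""
    n = len(coords)
    centroid = [sum(atom[axis] for atom in coords) / n for axis in range(3)]
    return [[atom[axis] - centroid[axis] for axis in range(3)] for atom in coords]


def noise_coordinates(coords, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    # Centering both the data and the noise keeps the sample translation-free,
    # a common convention in equivariant diffusion models (assumed here).
    x0 = center(coords)
    eps = center([[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in coords])
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [
        [signal * x0[i][axis] + sigma * eps[i][axis] for axis in range(3)]
        for i in range(len(coords))
    ]


# Toy three-atom geometry in angstroms, invented for illustration only.
toy_coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
alpha_bars = alpha_bar_schedule(num_steps=1000)
rng = random.Random(0)
slightly_noised = noise_coordinates(toy_coords, alpha_bars[0], rng)    # near the input
heavily_noised = noise_coordinates(toy_coords, alpha_bars[-1], rng)    # near pure noise
```

At the first step almost all of the original geometry survives, while at the last step the coordinates are close to centered Gaussian noise. Generation runs in the opposite direction, starting from noise and denoising step by step with a learned network.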