Journal of Pharmaceutical Analysis

Innovative pharmaceutical research facilitated by AI

Feng Zhu, Caisheng Wu, Tingjun Hou

2025, 15(6): 101388. doi: 10.1016/j.jpha.2025.101388

Advancement of artificial intelligence based treatment strategy in type 2 diabetes: A critical update

Aniruddha Sen, Palani Selvam Mohanraj, Vijaya Laxmi, Sumel Ashique, Rajalakshimi Vasudevan, Afaf Aldahish, Anupriya Velu, Arani Das, Iman Ehsan, Anas Islam, Sabina Yasmin, Mohammad Yousuf Ansari

2025, 15(6): 101305. doi: 10.1016/j.jpha.2025.101305

Abstract:

In the ongoing effort to improve type 2 diabetes mellitus (T2DM) care, this review surveys recent advances in treatment and the emerging role of artificial intelligence (AI) tools. Given the considerable global burden of T2DM, innovative therapeutic approaches to improve patient outcomes remain a public health priority. The review first provides an in-depth analysis of the current state of therapy, from novel pharmacotherapy to lifestyle interventions and new treatment methods. It then traces the rapidly growing role of AI in diabetes care, focusing on how insulin therapy can be modified and personalized through algorithms and predictive modelling. It also examines the existing synergies between the two, which helps clarify how collaborative opportunities may shape the future of T2DM care. Integrating recent therapeutic advances with AI supports better screening, diagnosis, therapeutic decision-making, and outcome prediction in T2DM. The review emphasizes the transformative potential of AI applications in insulin therapy for diabetes care. These person-centred approaches to T2DM management, which are more effective and personalized than some traditional strategies, depend on the often-hidden synergies between AI algorithms in areas such as diagnostic criteria, predictive methods, and classification tools that identify subgroups with relevant predictors of prognosis or treatment responsiveness.

A review of transformer models in drug discovery and beyond

Jian Jiang, Long Chen, Lu Ke, Bozheng Dou, Chunhuan Zhang, Hongsong Feng, Yueying Zhu, Huahai Qiu, Bengong Zhang, Guo-Wei Wei

2025, 15(6): 101081. doi: 10.1016/j.jpha.2024.101081

Abstract:

Transformer models have emerged as pivotal tools within the realm of drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capabilities of transformer architectures to comprehend intricate hierarchical dependencies inherent in sequential data, these models showcase remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models renders them indispensable assets for driving data-centric advancements in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, seamlessly combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters synergistic collaborations and exchange of ideas among disparate fields. In our review, we elucidate the myriad applications of transformers in drug discovery, as well as chemistry and biology, spanning from protein design and protein engineering, to molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single cell data. Finally, we conclude the survey by deliberating on promising trends in transformer models within the context of drug discovery and other sciences.

Advances and challenges in drug design against dental caries: Application of in silico approaches

Zhongxin Chen, Xinyao Zhao, Hanyu Zheng, Yufei Wang, Linglin Zhang

2025, 15(6): 101161. doi: 10.1016/j.jpha.2024.101161

Abstract:

Dental caries, a chronic disease characterized by tooth decay, occupies the second position in terms of disease burden and is primarily caused by cariogenic bacteria, especially Streptococcus mutans, because of its acidogenic, aciduric, and biofilm-forming capabilities. Developing novel targeted anti-virulence agents has long been a focal point in caries control, with the aim of overcoming the limitations of conventional anti-virulence agents. This review provides an up-to-date overview of in silico approaches to drug design against dental caries, which have emerged as an increasingly powerful complement to biochemical approaches. First, we categorize the in silico approaches into computer-aided drug design (CADD) and AI-assisted drug design (AIDD) and highlight the specific methods and models each contains. Subsequently, we detail the design of anti-virulence drugs targeting single or multiple cariogenic virulence targets of S. mutans, such as glucosyltransferases (Gtfs), antigen I/II (AgI/II), sortase A (SrtA), the VicRK signal transduction system and superoxide dismutases (SODs). Finally, we outline the current opportunities and challenges encountered in this field to aid future endeavors and applications of CADD and AIDD in anti-virulence drug design.

Diffusion-based generative drug-like molecular editing with chemical natural language

Jianmin Wang, Peng Zhou, Zixu Wang, Wei Long, Yangyang Chen, Kyoung Tai No, Dongsheng Ouyang, Jiashun Mao, Xiangxiang Zeng

2025, 15(6): 101137. doi: 10.1016/j.jpha.2024.101137

Abstract:

Recently, diffusion models have emerged as a promising paradigm for molecular design and optimization. However, most diffusion-based molecular generative models focus on modeling 2D graphs or 3D geometries, with limited research on molecular sequence diffusion models. The International Union of Pure and Applied Chemistry (IUPAC) names are more akin to chemical natural language than the simplified molecular input line entry system (SMILES) for organic compounds. In this work, we apply an IUPAC-guided conditional diffusion model to facilitate molecular editing from chemical natural language to chemical language (SMILES) and explore whether the pre-trained generative performance of diffusion models can be transferred to chemical natural language. We propose DiffIUPAC, a controllable molecular editing diffusion model that converts IUPAC names to SMILES strings. Evaluation results demonstrate that our model outperforms existing methods and successfully captures the semantic rules of both chemical languages. Chemical space and scaffold analysis show that the model can generate similar compounds with diverse scaffolds within the specified constraints. Additionally, to illustrate the model’s applicability in drug design, we conducted case studies in functional group editing, analogue design and linker design.

A disentangled generative model for improved drug response prediction in patients via sample synthesis

Kunshi Li, Bihan Shen, Fangyoumin Feng, Xueliang Li, Yue Wang, Na Feng, Zhixuan Tang, Liangxiao Ma, Hong Li

2025, 15(6): 101128. doi: 10.1016/j.jpha.2024.101128

Abstract:

Personalized drug response prediction from molecular data is an important challenge in precision medicine for treating cancer. Computational methods have been widely explored and have become increasingly accurate in recent years. However, the clinical application of prediction methods is still in its infancy due to large discrepancies between preclinical models and patients. We present a novel disentangled synthesis transfer network (DiSyn) for drug response prediction specifically designed for transfer learning from preclinical models to clinical patients. DiSyn uses a domain separation network (DSN) to disentangle drug response related features, employs data synthesis technology to increase the sample size, and trains iteratively for better feature disentanglement. DiSyn is pretrained on large-scale unlabeled cancer samples and validated on three datasets, The Cancer Genome Atlas (TCGA), Investigation of Serial Studies to Predict Your Therapeutic Response With Imaging And moLecular Analysis 2 (I-SPY2) and Novartis Institutes for Biomedical Research Patient-Derived Xenograft Encyclopedia (NIBR PDXE), achieving performance competitive with state-of-the-art methods on cancer patients and mice. Furthermore, the application of DiSyn to thousands of breast cancer patients shows the heterogeneity in drug responses and demonstrates its potential value in biomarker discovery and drug combination prediction.

Machine learning-assisted microfluidic approach for broad-spectrum liposome size control

Yujie Jia, Xiao Liang, Li Zhang, Jun Zhang, Hajra Zafar, Shan Huang, Yi Shi, Jian Chen, Qi Shen

2025, 15(6): 101221. doi: 10.1016/j.jpha.2025.101221

Abstract:

Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small-sized liposomes, predominantly through experimental data analysis. However, the production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely-used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.

Identify drug-drug interactions via deep learning: A real world study

Jingyang Li, Yanpeng Zhao, Zhenting Wang, Chunyue Lei, Lianlian Wu, Yixin Zhang, Song He, Xiaochen Bo, Jian Xiao

2025, 15(6): 101194. doi: 10.1016/j.jpha.2025.101194

Abstract:

Identifying drug-drug interactions (DDIs) is essential to prevent adverse effects from polypharmacy. Although deep learning has advanced DDI identification, the gap between powerful models and their lack of clinical application and evaluation has hindered clinical benefits. Here, we developed a Multi-Dimensional Feature Fusion model named MDFF, which integrates one-dimensional simplified molecular input line entry system sequence features, two-dimensional molecular graph features, and three-dimensional geometric features to enhance drug representations for predicting DDIs. MDFF was trained and validated on two DDI datasets, evaluated across three distinct scenarios, and compared with advanced DDI prediction models using accuracy, precision, recall, area under the curve, and F1 score metrics. MDFF achieved state-of-the-art performance across all metrics. Ablation experiments showed that integrating multi-dimensional drug features yielded the best results. More importantly, we obtained adverse drug reaction reports uploaded by Xiangya Hospital of Central South University from 2021 to 2023 and used MDFF to identify potential adverse DDIs. Among 12 real-world adverse drug reaction reports, the predictions of 9 reports were supported by relevant evidence. Additionally, MDFF demonstrated the ability to explain adverse DDI mechanisms, providing insights into the mechanisms behind one specific report and highlighting its potential to assist practitioners in improving medical practice.
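
The threshold-based metrics named in this abstract (accuracy, precision, recall, F1) can be illustrated with a minimal sketch. The labels below are invented toy data, not results from the paper, and the function name `classification_metrics` is hypothetical. Area under the curve is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = interacting pair)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false positive and one false negative, so all four metrics come out at 0.75.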

In silico prediction of pK_a values using explainable deep learning methods

Chen Yang, Changda Gong, Zhixing Zhang, Jiaojiao Fang, Weihua Li, Guixia Liu, Yun Tang

2025, 15(6): 101174. doi: 10.1016/j.jpha.2024.101174

Abstract:

Negative logarithm of the acid dissociation constant (pK_a) significantly influences the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of molecules and is a crucial indicator in drug research. Given the rapid and accurate characteristics of computational methods, their role in predicting drug properties is increasingly important. Although many pK_a prediction models currently exist, they often focus on enhancing model precision while neglecting interpretability. In this study, we present GraFpK_a, a pK_a prediction model using graph neural networks (GNNs) and molecular fingerprints. The results show that our acidic and basic models achieved mean absolute errors (MAEs) of 0.621 and 0.402, respectively, on the test set, demonstrating good predictive performance. Notably, to improve interpretability, GraFpK_a also incorporates Integrated Gradients (IGs), providing a clearer visual description of the atoms significantly affecting the pK_a values. The high reliability and interpretability of GraFpK_a ensure accurate pK_a predictions while also facilitating a deeper understanding of the relationship between molecular structure and pK_a values, making it a valuable tool in the field of pK_a prediction.

Perturbation response scanning of drug-target networks: Drug repurposing for multiple sclerosis

Yitan Lu, Ziyun Zhou, Qi Li, Bin Yang, Xing Xu, Yu Zhu, Mengjun Xie, Yuwan Qi, Fei Xiao, Wenying Yan, Zhongjie Liang, Qifei Cong, Guang Hu

2025, 15(6): 101295. doi: 10.1016/j.jpha.2025.101295

Abstract:

Combined with the elastic network model (ENM), perturbation response scanning (PRS) has emerged as a robust technique for pinpointing allosteric interactions within proteins. Here, we propose the PRS analysis of drug-target networks (DTNs), which could provide a promising avenue in network medicine. We demonstrate the utility of the method by introducing a deep learning and network perturbation-based framework for drug repurposing in multiple sclerosis (MS). First, the MS comorbidity network was constructed by performing a random walk with restart algorithm, using genes shared between MS and other diseases as seed nodes. Then, based on topological analysis and functional annotation, the neurotransmission module was identified as the “therapeutic module” of MS. Further, perturbation scores of drugs on the module were calculated by constructing the DTN and introducing the PRS analysis, giving a list of repurposable drugs for MS. Mechanism of action analysis at both the pathway and structural levels identified dihydroergocristine as a candidate drug for MS that targets the serotonin 2B receptor (HTR2B). Finally, we established a cuprizone-induced chronic mouse model to evaluate the alteration of HTR2B in mouse brain regions and observed that HTR2B was significantly reduced in the cuprizone-induced mouse cortex. These findings show that network perturbation modeling is a promising avenue for drug repurposing in MS. As a useful systematic method, our approach can also be used to discover new molecular mechanisms and provide effective candidate drugs for other complex diseases.
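
The random walk with restart step mentioned in this abstract can be sketched generically as follows. This is a textbook formulation on a toy unweighted graph, not the authors' pipeline. The node names, the restart probability of 0.5, and the function name `random_walk_with_restart` are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency maps every node to a list of its neighbours (all of which
    must themselves be keys); seeds is a non-empty set of start nodes.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path graph g1 - g2 - g3 - g4 with g1 as the only seed (hypothetical names).
toy_graph = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
scores = random_walk_with_restart(toy_graph, seeds={"g1"})
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

The resulting scores form a probability distribution that decays with distance from the seed, which is what makes them usable for ranking nodes by proximity to a seed set.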

Generation of SARS-CoV-2 dual-target candidate inhibitors through 3D equivariant conditional generative neural networks

Zhong-Xing Zhou, Hong-Xing Zhang, Qingchuan Zheng

2025, 15(6): 101229. doi: 10.1016/j.jpha.2025.101229

Abstract:

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) mutations are influenced by random and uncontrollable factors, and the risk of the next widespread epidemic remains. Dual-target drugs that synergistically act on two targets exhibit strong therapeutic effects and advantages against mutations. In this study, a novel computational workflow was developed to design dual-target SARS-CoV-2 candidate inhibitors with the Envelope protein and Main protease selected as the two target proteins. The drug-like molecules of our self-constructed 3D scaffold database were used as high-throughput molecular docking probes for feature extraction of the two target protein pockets. A multi-layer perceptron (MLP) was employed to embed the binding affinities into a latent space as conditional vectors to control the conditional distribution. Utilizing a conditional generative neural network, cG-SchNet, with 3D Euclidean group (E3) symmetries, the conditional probability distributions of molecular 3D structures were acquired and a set of novel SARS-CoV-2 dual-target candidate inhibitors was generated. The 1D probability, 2D joint probability, and 2D cumulative probability distribution results indicate that the generated sets are significantly enhanced compared to the training set in the high binding affinity area. Among the 201 generated molecules, 42 exhibited a sum binding affinity exceeding 17.0 kcal/mol, and 9 of these had a sum binding affinity exceeding 19.0 kcal/mol, demonstrating structural diversity along with strong dual-target affinities, good absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, and ease of synthesis. Dual-target drugs are rare and difficult to find, and our “high-throughput docking-multi-conditional generation” workflow offers a wide range of options for designing or optimizing potent dual-target SARS-CoV-2 inhibitors.

Fingerprint-enhanced hierarchical molecular graph neural networks for property prediction

Shuo Liu, Mengyun Chen, Xiaojun Yao, Huanxiang Liu

2025, 15(6): 101242. doi: 10.1016/j.jpha.2025.101242

Abstract:

Accurate prediction of molecular properties is crucial for selecting compounds with ideal properties and reducing the costs and risks of trials. Traditional methods based on manually crafted features and graph-based methods have shown promising results in molecular property prediction. However, traditional methods rely on expert knowledge and often fail to capture the complex structures and interactions within molecules. Similarly, graph-based methods typically overlook the chemical structure and function hidden in molecular motifs and struggle to effectively integrate global and local molecular information. To address these limitations, we propose a novel fingerprint-enhanced hierarchical graph neural network (FH-GNN) for molecular property prediction that simultaneously learns information from hierarchical molecular graphs and fingerprints. The FH-GNN captures diverse hierarchical chemical information by applying directed message-passing neural networks (D-MPNN) on a hierarchical molecular graph that integrates atomic-level, motif-level, and graph-level information along with their relationships. Additionally, we used an adaptive attention mechanism to balance the importance of hierarchical graphs and fingerprint features, creating a comprehensive molecular embedding that integrated hierarchical molecular structures with domain knowledge. Experiments on eight benchmark datasets from MoleculeNet showed that FH-GNN outperformed the baseline models in both classification and regression tasks for molecular property prediction, validating its capability to comprehensively capture molecular information. By integrating molecular structure and chemical knowledge, FH-GNN provides a powerful tool for the accurate prediction of molecular properties and aids in the discovery of potential drug candidates.
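
The idea of weighting a graph embedding against a fingerprint embedding with attention can be illustrated with a generic softmax gate. This is not the FH-GNN mechanism, whose details the abstract does not give. The vectors, the `query` vector (which stands in for learned parameters), and the function names are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_embeddings(graph_emb, fp_emb, query):
    """Weight two same-length embeddings by softmax attention scores.

    Each view is scored by its dot product with `query`; the fused
    embedding is the attention-weighted sum of the two views.
    """
    score_graph = sum(g * q for g, q in zip(graph_emb, query))
    score_fp = sum(f * q for f, q in zip(fp_emb, query))
    w_graph, w_fp = softmax([score_graph, score_fp])
    fused = [w_graph * g + w_fp * f for g, f in zip(graph_emb, fp_emb)]
    return fused, (w_graph, w_fp)


# Toy two-dimensional embeddings, invented for illustration only.
graph_emb = [1.0, 0.0]
fp_emb = [0.0, 1.0]
query = [2.0, 0.0]
fused, weights = fuse_embeddings(graph_emb, fp_emb, query)
print(fused, weights)
```

In a trained model the scoring parameters would be learned jointly with the encoders, so the balance between the two views could differ from molecule to molecule.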

Scaffold and SAR studies on c-MET inhibitors using machine learning approaches

Jing Zhang, Mingming Zhang, Weiran Huang, Changjie Liang, Wei Xu, Jinghua Zhang, Jun Tu, Innocent Okohi Agida, Jinke Cheng, Dong-Qing Wei, Buyong Ma, Yanjing Wang, Hongsheng Tan

2025, 15(6): 101303. doi: 10.1016/j.jpha.2025.101303

Abstract:

Numerous c-mesenchymal-epithelial transition (c-MET) inhibitors have been reported as potential anticancer agents. However, most fail to enter clinical trials owing to poor efficacy or drug resistance. To date, the scaffold-based chemical space of small-molecule c-MET inhibitors has not been analyzed. In this study, we constructed the largest c-MET dataset, which included 2,278 structurally distinct molecules annotated with half maximal inhibitory concentration (IC₅₀) values for inhibition of kinase activity. No significant differences in drug-like properties were observed between active molecules (1,228) and inactive molecules (1,050), including chemical space coverage, physicochemical properties, and absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiles. The higher chemical diversity of the active molecules was shown by reducing the high-dimensional data with t-distributed stochastic neighbor embedding (t-SNE). Further clustering and chemical space network (CSN) analyses revealed commonly used scaffolds for c-MET inhibitors, such as M5, M7, and M8. Activity cliffs and structural alerts were used to reveal “dead ends” and “safe bets” for c-MET, as well as dominant structural fragments consisting of pyridazinones, triazoles, and pyrazines. Finally, the decision tree model precisely indicated the key structural features required to constitute active c-MET inhibitor molecules, including at least three aromatic heterocycles, five aromatic nitrogen atoms, and eight nitrogen–oxygen atoms. Overall, our analyses revealed potential structure-activity relationship (SAR) patterns for c-MET inhibitors, which can inform the screening of new compounds and guide future optimization efforts.
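
The structural thresholds reported from the decision tree can be written as a simple screening predicate. This sketch assumes that "at least" applies to all three counts and that the counts have already been computed by a cheminformatics toolkit. It is not the fitted tree itself, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeatureCounts:
    """Precomputed structural counts for one molecule (hypothetical container)."""
    aromatic_heterocycles: int
    aromatic_nitrogens: int
    nitrogen_oxygen_atoms: int


def matches_active_profile(counts: FeatureCounts) -> bool:
    """Apply the thresholds quoted in the abstract (assumed to be lower bounds)."""
    return (counts.aromatic_heterocycles >= 3
            and counts.aromatic_nitrogens >= 5
            and counts.nitrogen_oxygen_atoms >= 8)


# Invented example molecules, described only by their counts.
candidate_a = FeatureCounts(aromatic_heterocycles=3, aromatic_nitrogens=6, nitrogen_oxygen_atoms=9)
candidate_b = FeatureCounts(aromatic_heterocycles=2, aromatic_nitrogens=6, nitrogen_oxygen_atoms=9)
print(matches_active_profile(candidate_a), matches_active_profile(candidate_b))
```

A predicate like this is only a coarse pre-filter. A fitted decision tree can contain further splits and branch-specific thresholds that a flat conjunction does not capture.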

EvoNB: A protein language model-based workflow for nanobody mutation prediction and optimization

Danyang Xiong, Yongfan Ming, Yuting Li, Shuhan Li, Kexin Chen, Jinfeng Liu, Lili Duan, Honglin Li, Min Li, Xiao He

2025, 15(6): 101260. doi: 10.1016/j.jpha.2025.101260

Abstract:

The identification and optimization of mutations in nanobodies are crucial for enhancing their therapeutic potential in disease prevention and control. However, this process is often complex and time-consuming, which limits its widespread application in practice. In this study, we developed a workflow, named Evolutionary-Nanobody (EvoNB), to predict key mutation sites of nanobodies by combining protein language models (PLMs) and molecular dynamics (MD) simulations. By fine-tuning the ESM2 model on a large-scale nanobody dataset, the ability of EvoNB to capture specific sequence features of nanobodies was significantly enhanced. The fine-tuned EvoNB model demonstrated higher predictive accuracy in the conserved framework and highly variable complementarity-determining regions of nanobodies. Additionally, we selected four widely representative nanobody–antigen complexes to verify the predicted effects of mutations. MD simulations were used to analyze the energy changes caused by these mutations and to predict their impact on binding affinity to the targets. The results showed that multiple mutations screened by EvoNB significantly enhanced the binding affinity between the nanobody and its target, further validating the potential of this workflow for designing and optimizing nanobody mutations. Additionally, sequence-based predictions generally do not depend on the availability of structures, allowing them to be more easily integrated with tools for structural predictions, such as AlphaFold 3. Through mutation prediction and systematic analysis of key sites, we can quickly predict the most promising variants for experimental validation without relying on traditional evolutionary or selection processes. The EvoNB workflow provides an effective tool for the rapid optimization of nanobodies and facilitates the application of PLMs in the biomedical field.

3D-EDiffMG: 3D equivariant diffusion-driven molecular generation to accelerate drug discovery

Chao Xu, Runduo Liu, Yufen Yao, Wanyi Huang, Zhe Li, Hai-Bin Luo

2025, 15(6): 101257. doi: 10.1016/j.jpha.2025.101257

Abstract:

Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activities and absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One deep molecular generative modeling approach preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, the existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes it difficult to modify scaffold-based molecular structures and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. The dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE) is introduced to encode both the bonding and non-bonding information based on strong and weak atomic interactions. Additionally, a gate multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and accelerating drug discovery.

Adaptive multi-view learning method for enhanced drug repurposing using chemical-induced transcriptional profiles, knowledge graphs, and large language models

Yudong Yan, Yinqi Yang, Zhuohao Tong, Yu Wang, Fan Yang, Zupeng Pan, Chuan Liu, Mingze Bai, Yongfang Xie, Yuefei Li, Kunxian Shu, Yinghong Li

2025, 15(6): 101275. doi: 10.1016/j.jpha.2025.101275

Abstract:

Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, the current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations, to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven out of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and codes used in this study were open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.

druglikeFilter 1.0: An AI powered filter for collectively measuring the drug-likeness of compounds

Minjie Mou, Yintao Zhang, Yuntao Qian, Zhimeng Zhou, Yang Liao, Tianle Niu, Wei Hu, Yuanhao Chen, Ruoyu Jiang, Hongping Zhao, Haibin Dai, Yang Zhang, Tingting Fu

2025, 15(6): 101298. doi: 10.1016/j.jpha.2025.101298

Abstract:

Advancements in artificial intelligence (AI) and emerging technologies are rapidly expanding the exploration of chemical space, facilitating innovative drug discovery. However, the transformation of novel compounds into safe and effective drugs remains a lengthy, high-risk, and costly process. Comprehensive early-stage evaluation is essential for reducing costs and improving the success rate of drug development. Despite this need, no comprehensive tool currently supports systematic evaluation and efficient screening. Here, we present druglikeFilter, a deep learning-based framework designed to assess drug-likeness across four critical dimensions: 1) physicochemical rule evaluated by systematic determination, 2) toxicity alert investigated from multiple perspectives, 3) binding affinity measured by dual-path analysis, and 4) compound synthesizability assessed by retro-route prediction. By enabling automated, multidimensional filtering of compound libraries, druglikeFilter not only streamlines the drug development process but also plays a crucial role in advancing research efforts towards viable drug candidates. The tool can be freely accessed at https://idrblab.org/drugfilter/.

SITA: Predicting site-specific immunogenicity for therapeutic antibodies

Yewei Cun, Hao Ding, Tiantian Mao, Yuan Wang, Caicui Wang, Jiajun Li, Zihao Li, Mengdie Hu, Zhiwei Cao, Tianyi Qiu

2025, 15(6): 101316. doi: 10.1016/j.jpha.2025.101316

Abstract:

Antibody humanization is critical to reduce immunogenicity and enhance efficacy in the preclinical phase of the development of therapeutic antibodies originating from animal models. Computational suggestions have long been desired, but available tools focus on immunogenicity calculation for whole antibody sequences and sequence segments, missing the individual residue sites. This study introduces Site-specific Immunogenicity for Therapeutic Antibody (SITA), a novel computational framework that predicts a B-cell immunogenicity score not only for the overall antibody but also for individual residues, based on a comprehensive set of amino acid descriptors characterizing physicochemical and spatial features of antibody structures. A transfer-learning-inspired framework was purposely adopted to overcome the scarcity of Antibody-Antibody structural complexes. On an independent testing dataset derived from 13 Antibody-Antibody structural complexes, SITA successfully predicted the epitope sites for Antibody-Antibody structures with a receiver operating characteristic (ROC)-area under the ROC curve (AUC) of 0.85 and a precision-recall (PR)-AUC of 0.305 at the residue level. Furthermore, the SITA score can significantly distinguish the immunogenicity levels of whole human antibodies, therapeutic antibodies and non-human-derived antibodies. More importantly, analysis of an additional 25 therapeutic antibodies revealed that over 70% of them were detected with decreased immunogenicity after modification compared to their parent variants. Among these, for nearly 66% of the antibodies, actual modification sites were among the five sites with the highest SITA scores, suggesting that SITA scores can guide antibody humanization. Overall, these findings highlight the potential of SITA in optimizing immunogenicity assessments during the process of therapeutic antibody design.
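
The top-five check described in this abstract, whether an actual modification site appears among the five highest-scoring residues, can be sketched as below. The residue labels and scores are invented, and the function names are hypothetical. Nothing here reproduces the SITA scoring model itself.

```python
def top_k_sites(site_scores, k=5):
    """Return the k residue sites with the highest scores, best first."""
    ranked = sorted(site_scores, key=lambda site: site_scores[site], reverse=True)
    return ranked[:k]


def hits_in_top_k(site_scores, modified_sites, k=5):
    """Return the modified sites that fall within the top-k scored sites."""
    return sorted(set(top_k_sites(site_scores, k)) & set(modified_sites))


# Invented per-residue scores for one antibody (labels are hypothetical).
site_scores = {
    "H31": 0.91, "H52": 0.15, "H57": 0.78, "H100": 0.05,
    "L24": 0.40, "L27": 0.55, "L50": 0.66, "L93": 0.83,
}
modified_sites = {"H57", "H100"}  # sites changed during humanization (toy data)

print(top_k_sites(site_scores))
print(hits_in_top_k(site_scores, modified_sites))
```

An antibody would count as a success under this criterion when the returned list is non-empty. The fraction of such antibodies across a panel gives a figure comparable in kind to the percentage quoted in the abstract.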

TCMKD: From ancient wisdom to modern insights-A comprehensive platform for traditional Chinese medicine knowledge discovery

Wenke Xiao, Mengqing Zhang, Danni Zhao, Fanbo Meng, Qiang Tang, Lianjiang Hu, Hongguo Chen, Yixi Xu, Qianqian Tian, Mingrui Li, Guiyang Zhang, Liang Leng, Shilin Chen, Chi Song, Wei Chen

2025, 15(6): 101297. doi: 10.1016/j.jpha.2025.101297

Abstract:

Traditional Chinese medicine (TCM) serves as a treasure trove of ancient knowledge, holding a crucial position in the medical field. However, the exploration of TCM's extensive information has been hindered by challenges related to data standardization, completeness, and accuracy, primarily due to the decentralized distribution of TCM resources. To address these issues, we developed a platform for TCM knowledge discovery (TCMKD, https://cbcb.cdutcm.edu.cn/TCMKD/). Seven types of data, including syndromes, formulas, Chinese patent drugs (CPDs), Chinese medicinal materials (CMMs), ingredients, targets, and diseases, were manually proofread and consolidated within TCMKD. To strengthen the integration of TCM with modern medicine, TCMKD employs analytical methods such as TCM data mining, enrichment analysis, and network localization and separation. These tools help elucidate the molecular-level commonalities between TCM and contemporary scientific insights. In addition to its analytical capabilities, a quick question and answer (Q&A) system is also embedded within TCMKD to query the database efficiently, thereby improving the interactivity of the platform. The platform also provides a TCM text annotation tool, offering a simple and efficient method for TCM text mining. Overall, TCMKD not only has the potential to become a pivotal repository for TCM, delving into the pharmacological foundations of TCM treatments, but its flexible embedded tools and algorithms can also be applied to the study of other traditional medical systems, extending beyond just TCM.

Breaking barriers: MS-BDF tools in the quality control of insect-derived traditional Chinese medicine

Caixia Yuan, Dandan Zhang, Hairong Zhang, Jiyang Dong, Caisheng Wu

2025, 15(6): 101193. doi: 10.1016/j.jpha.2025.101193

Combining transformer and 3DCNN models to achieve co-design of structures and sequences of antibodies in a diffusional manner

Yue Hu, Feng Tao, Jiajie Xu, Wen-Jun Lan, Jing Zhang, Wei Lan

2025, 15(6): 101267. doi: 10.1016/j.jpha.2025.101267

2025 Vol. 15, No. 6