Yuwei Zhou, Haoxiang Tang, Changchun Wu, Zixuan Zhang, Jinyi Wei, Rong Gong, Samarappuli Mudiyanselage Savini Gunarathne, Changcheng Xiang, Jian Huang. Enhancing polyreactivity prediction of preclinical antibodies through fine-tuned protein language models[J]. Journal of Pharmaceutical Analysis. doi: 10.1016/j.jpha.2025.101448

Enhancing polyreactivity prediction of preclinical antibodies through fine-tuned protein language models

doi: 10.1016/j.jpha.2025.101448
Funds:

This work was supported by the National Natural Science Foundation of China (Grant Nos.: 62071099 and 62371112) and the Sichuan Province Science and Technology Support Program (Grant No.: 2024NSFSC0636).

  • Received Date: Jan. 12, 2025
  • Accepted Date: Sep. 08, 2025
  • Revised Date: Sep. 05, 2025
  • Available Online: Sep. 10, 2025
  • Therapeutic monoclonal antibodies (mAbs) have garnered significant attention for their efficacy in treating a variety of diseases. However, some candidate antibodies bind non-specifically to off-target proteins or other biomolecules. This high polyreactivity can compromise therapeutic efficacy and cause other complications, thereby reducing the approval rate of antibody drug candidates. Predicting the polyreactivity risk of therapeutic mAbs at an early stage of development is therefore crucial. In this study, we fine-tuned six pre-trained protein language models to predict the polyreactivity of antibody sequences. The most effective model, named PolyXpert, achieved a sensitivity (SN) of 90.10%, specificity (SP) of 90.08%, accuracy (ACC) of 90.10%, F1-score of 0.9301, Matthews correlation coefficient (MCC) of 0.7654, and area under the curve (AUC) of 0.9672 on the external independent test dataset. These results suggest its potential as a valuable in silico tool for assessing antibody polyreactivity and for selecting superior therapeutic mAb candidates for clinical development. Furthermore, we demonstrated that fine-tuned language model classifiers are more robust in prediction than classifiers trained on pre-trained model embeddings. PolyXpert is available at https://github.com/zzyywww/PolyXpert.
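
For readers unfamiliar with the threshold-based metrics quoted above, the following is a minimal sketch of how SN, SP, ACC, F1-score, and MCC are conventionally derived from binary confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study or from the PolyXpert repository. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return SN, SP, ACC, F1 and MCC from binary confusion-matrix counts.

    Assumes both classes are present and at least one positive prediction
    was made, so none of the ratio denominators is zero.
    """
    sn = tp / (tp + fn)                      # sensitivity (recall)
    sp = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)    # accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "ACC": acc, "F1": f1, "MCC": mcc}


# Hypothetical counts for illustration only; NOT the study's data.
metrics = classification_metrics(tp=90, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike SN and SP, the F1-score and MCC depend on the ratio of positive to negative samples, so the same sensitivity and specificity can yield different F1 and MCC values on datasets with different class balance.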