Article

Benchmarking co-folding methods to predict the structures of covalent protein–ligand complexes

Tong-han Zhang1, Jin-tao Zhu2, Zhi-xian Huang3, Juan Xie3, Jian-feng Pei2, Lu-hua Lai1,2,3,4
1 Peking–Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
2 Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
3 BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
4 Chengdu Academy for Advanced Interdisciplinary Biotechnologies, Peking University, Chengdu 610213, China
Correspondence to: Jian-feng Pei: jfpei@pku.edu.cn, Lu-hua Lai: lhlai@pku.edu.cn
DOI: 10.1038/s41401-025-01721-5
Received: 29 July 2025
Accepted: 24 November 2025
Advance online: 12 January 2026

Abstract

Targeted covalent inhibitors (TCIs) are emerging as a new modality in drug discovery because of their strong binding affinity and prolonged target engagement. However, the rational design of TCIs remains a significant challenge and is hindered by the lack of methods that accurately predict the structures of covalent protein–ligand complexes. Recent co-folding approaches have made substantial strides in modeling complex biomolecular structures. Despite this progress, their performance in predicting the structures of covalent protein–ligand complexes remains largely unexplored because rigorous benchmarks are lacking. Here, we introduce CoFD-Bench, a comprehensive benchmark dataset comprising 218 recently resolved covalent complexes, designed to systematically evaluate both classical docking methods (AutoDock-GPU, CovDock, and GNINA) and deep learning co-folding models (AlphaFold3 (AF3), Chai-1, and Boltz-1x). Our results demonstrate that co-folding methods achieve superior ligand RMSD accuracy and protein–ligand interaction recovery. However, their performance declines markedly for novel pocket–ligand pairs. In contrast, classical docking methods exhibit stable but modest performance, which is primarily limited by target conformations. Furthermore, computational efficiency evaluations show that co-folding methods are slower than classical approaches, posing challenges for large-scale predictions. We also reveal that AF3 has the potential to identify native covalent residues through noncovalent co-folding, with a ligand RMSD comparable to that of covalent co-folding. These findings offer a possible route to explore covalent binding without prior specification of reactive residues, which are often unknown in real-world scenarios. Our study provides crucial insights and new opportunities for co-folding-based TCI design, informing future model applications and improvements. CoFD-Bench offers rigorous evaluation criteria, diverse docking scenarios, and various methodological baselines, positioning it as an important benchmark for future model development and assessment.
Keywords: covalent complex prediction; co-folding methods; benchmarking; targeted covalent inhibitors
