ME-pKa: A Deep Learning Method with Multimodal Learning for Protein pKa Prediction.

Journal of chemical theory and computation 2026 Vol.22(2) p. 1149-1163

Shi S, Miao R, Liu D, Zhang Y, Ruan S, Xu Q, Wang J, Li H, Li S

Cite this paper

APA Shi S, Miao R, et al. (2026). ME-pKa: A Deep Learning Method with Multimodal Learning for Protein pKa Prediction. Journal of chemical theory and computation, 22(2), 1149-1163. https://doi.org/10.1021/acs.jctc.5c01747
MLA Shi S, et al. "ME-pKa: A Deep Learning Method with Multimodal Learning for Protein pKa Prediction." Journal of chemical theory and computation, vol. 22, no. 2, 2026, pp. 1149-1163.
PMID 41528986

Abstract

Proteins are central to biological processes: they mediate biochemical reactions, regulate cellular processes, and facilitate drug binding through their active sites and surface residues. The pKa values of a protein's ionizable amino acids determine their protonation states at a given pH, profoundly affecting protein structure, function, and drug design. However, experimental determination of pKa values is laborious and complex. Moreover, existing prediction methods are limited by the quantity and quality of available data and by their inability to capture the intricate structural and physicochemical attributes of proteins, which hinders accuracy and generalization, especially for buried residues. In this study, we developed a multimodal protein pKa prediction model named ME-pKa (Multimodal ESM pKa), which leverages multimodal information and employs a multifidelity learning strategy to predict pKa values quickly and accurately. The ME-pKa method facilitates data augmentation by integrating the local environmental attributes of amino acids with the FASTA sequence characteristics of proteins. Furthermore, multifidelity learning addresses the challenge of limited data availability to some extent. Our ME-pKa model outperforms several state-of-the-art models in predicting protein pKa values, achieving an RMSE of 0.845 ± 0.09, an MAE of 0.641 ± 0.07, an R² of 0.921 ± 0.02, and an R of 0.959 ± 0.01 on the PE-pKa data set. Notably, ME-pKa demonstrated balanced and robust performance across the major ionizable residue types (ASP, GLU, HIS, LYS). It is also more accurate than the compared models for buried residues (RSA < 0.2), achieving the lowest MAE values of 0.921 ± 0.05 on the PE-pKa data set and 0.911 ± 0.06 on the Small Set data set, which indicates that it captures complex environmental influences on pKa. Moreover, our method confirmed the pH-dependent binding of PD-L1 antibodies mediated by the protonation state of His-69 in PD-L1, emphasizing the critical role of amino acid protonation states in drug design. The source code of ME-pKa can be found at https://github.com/yzjyg215/ME-pKa.
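
The abstract's statement that a residue's pKa fixes its protonation state at a given pH can be illustrated with the Henderson-Hasselbalch relation. The sketch below is not part of the ME-pKa code and does not reproduce the model; it only shows how a predicted pKa would translate into a protonated fraction. The function name `protonated_fraction` and the value `example_pka = 6.5` are hypothetical and chosen purely for illustration; they are not values reported for His-69 or any other residue in the paper.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a single titratable site that is protonated at the given pH.

    Uses the Henderson-Hasselbalch relation for one independent site:
    fraction = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa, for illustration only (not a value from the paper).
example_pka = 6.5

# Compare a mildly acidic, a matching, and a physiological pH.
for ph in (5.5, 6.5, 7.4):
    fraction = protonated_fraction(example_pka, ph)
    print(f"pH {ph}: protonated fraction = {fraction:.2f}")
```

Under this single-site assumption, a site is half protonated when pH equals its pKa, and a shift of about one pH unit in either direction moves the balance strongly toward one state, which is why small pKa errors near the working pH matter for pH-dependent binding.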

MeSH Terms

Deep Learning; Proteins; Hydrogen-Ion Concentration; Amino Acids
