Poster Presentation 50th Lorne Proteins Conference 2025

Prediction of the toxicity of proteins and peptides by exploiting structural information based on deep learning (#139)

Jing J Xu 1, Jiangning J Song 1
  1. Monash University, Clayton, Victoria, Australia

Proteins and peptides have recently gained considerable attention as promising therapeutic agents due to their specificity, efficacy, and versatility in targeting a wide range of diseases. However, toxicity remains a key challenge in the development and clinical application of protein- and peptide-based therapies, as unintended interactions with human cells can lead to adverse effects. Therefore, there is a critical need to develop advanced computational tools that can accurately predict the potential toxicity of proteins or peptides across a vast pool of candidates, thus enabling the safer and more effective design of therapeutic biomolecules.

Several computational tools have been developed to predict the toxicity of proteins and peptides. However, most of these tools rely solely on sequence information and overlook structural aspects, despite extensive research demonstrating that the structural conformation of proteins and peptides plays a crucial role in determining their bioactivity and interactions. Recognizing this gap, we sought to investigate how incorporating structural information could improve the accuracy of toxicity predictions.

In this study, we explored the impact of structural data on predictive performance by integrating secondary structure information into our computational models. We employed ESMFold, a state-of-the-art structure prediction tool, to obtain the secondary structures of proteins and peptides. ESMFold enables the rapid and accurate generation of structural predictions, which we then used to characterize each candidate in greater detail. To make full use of these structural features, we integrated them into our models using graph neural networks (GNNs), a machine learning approach that effectively captures complex relationships between structural components. Comparing models that included secondary structure information with those that relied only on sequence-based features, we observed an improvement in predictive performance for the models incorporating structural data. This suggests that leveraging secondary structure information can improve the reliability of toxicity predictions, ultimately reducing the risk of adverse effects in therapeutic development.
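
As an illustration of how per-residue secondary structure can enter a graph-based model, the following is a minimal NumPy sketch and not the model used in this work. It assumes three-state secondary structure labels (H, E, C), C-alpha coordinates taken from a predicted structure, an 8 angstrom contact cutoff, and a single round of mean-aggregation message passing; all function names and parameter values are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
SS_STATES = "HEC"  # assumed 3-state labels: helix, strand, coil


def node_features(sequence, ss_labels):
    """One-hot residue identity concatenated with one-hot secondary structure."""
    if len(sequence) != len(ss_labels):
        raise ValueError("sequence and ss_labels must have the same length")
    feats = np.zeros((len(sequence), len(AMINO_ACIDS) + len(SS_STATES)))
    for i, (aa, ss) in enumerate(zip(sequence, ss_labels)):
        # str.index raises ValueError for non-standard residues or labels
        feats[i, AMINO_ACIDS.index(aa)] = 1.0
        feats[i, len(AMINO_ACIDS) + SS_STATES.index(ss)] = 1.0
    return feats


def contact_adjacency(coords, cutoff=8.0):
    """Link residues whose C-alpha atoms lie within `cutoff` angstroms.

    The diagonal is 1 (distance 0), so every node has a self-loop.
    """
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return (dists < cutoff).astype(float)


def message_pass(feats, adj):
    """One round of mean aggregation over each residue's neighbours."""
    degree = adj.sum(axis=1, keepdims=True)  # >= 1 because of self-loops
    return adj @ feats / degree


def graph_embedding(sequence, ss_labels, coords):
    """Average the updated node features into one fixed-length vector."""
    h = message_pass(node_features(sequence, ss_labels), contact_adjacency(coords))
    return h.mean(axis=0)


if __name__ == "__main__":
    # Toy peptide with residues placed 3.8 angstroms apart on a line
    toy_coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(4)])
    print(graph_embedding("ACDE", "HHEC", toy_coords).shape)
```

In a trained model the mean aggregation would be replaced by learned GNN layers and the embedding passed to a toxicity classifier; a sequence-only baseline corresponds to dropping the three secondary structure columns from the node features.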