Publication: BERT2OME: Prediction of 2′-O-methylation modifications from RNA sequence by transformer architecture based on BERT
Type
Article
Access
info:eu-repo/semantics/restrictedAccess
Publication Status
Published
Abstract
Recent work on language models has resulted in state-of-the-art performance on various language tasks. Among these, Bidirectional Encoder Representations from Transformers (BERT) focuses on contextualizing word embeddings to extract the context and semantics of words. Separately, post-transcriptional 2'-O-methylation (Nm) RNA modification is important in various cellular tasks and is related to a number of diseases. Existing high-throughput experimental techniques for detecting these modifications are time-consuming and costly when used to explore the underlying functional processes. Here, to understand the associated biological processes faster, we propose an efficient method, BERT2OME (B2O), to infer 2'-O-methylation RNA modification sites from RNA sequences. B2O combines a BERT-based model with convolutional neural networks (CNNs) to infer the relationship between modification sites and RNA sequence content. Unlike previously proposed methods, B2O treats each RNA sequence as text and focuses on improving modification prediction performance by integrating the pretrained deep learning-based language model BERT. Additionally, our transformer-based approach can infer modification sites across multiple species. According to 5-fold cross-validation, human and mouse accuracies were and respectively. Similarly, ROC AUC scores were 0.99 and 0.94 for the same species. Detailed results show that B2O reduces the time consumed in biological experiments and outperforms existing approaches across different datasets and species on multiple metrics. Additionally, deep learning approaches such as 2D CNNs are more promising for learning from BERT attributes than more conventional machine learning methods. Our code and datasets can be found at .
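To illustrate what treating an RNA sequence as text can look like in practice, the following is a minimal sketch that splits a sequence into overlapping k-mer "words" that a text tokenizer could then consume. The abstract does not state how sequences are tokenized, so the k-mer scheme, the function name `rna_to_words`, and the default `k=3` are assumptions made for illustration only, not the authors' method.

```python
from typing import List


def rna_to_words(sequence: str, k: int = 3) -> List[str]:
    """Split an RNA sequence into overlapping k-mers ("words").

    Hypothetical helper: the k-mer scheme is an assumption, not taken
    from the abstract.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Normalize case and map DNA-style T to RNA-style U.
    seq = sequence.upper().replace("T", "U")
    # Slide a window of width k over the sequence, one base at a time.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Join the words with spaces to form a "sentence" for a text model.
sentence = " ".join(rna_to_words("AUGGCUA", k=3))
print(sentence)
```

A sequence shorter than `k` yields an empty list, so callers would need to pad or skip such inputs before passing them to a language model.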
Date
2023-06
Publisher
IEEE