Finsentiment : predicting financial sentiment and risk through transfer learning

Ergün, Zehra Erva

Authors

Ergün, Zehra Erva

Organizational Unit

Department of Computer Science

Type

Master's thesis

Access

restrictedAccess

Publication Status

Unpublished

Abstract

Interest in financial text mining is growing. Deep learning models trained on generic corpora have made significant progress and also give reasonable results on financial text mining tasks such as financial sentiment analysis. However, financial sentiment analysis remains challenging because labeled data for the financial domain is scarce and the domain uses a specialized language. General-purpose deep learning methods are less effective, mainly because of this specialized language. In this study, we focus on improving performance on financial text mining tasks by enhancing existing pretrained language models through NLP transfer learning. Pretrained language models require only a small number of labeled samples, and they can be improved further by training them on domain-specific corpora. We propose FinSentiment, which comprises enhanced versions of several recently proposed pretrained models (BERT, XLNet, and RoBERTa) that are trained on financial-domain corpora to perform better on NLP tasks in the financial domain. The corresponding finance-specific models in FinSentiment are called Fin-BERT, Fin-XLNet, and Fin-RoBERTa, respectively. We also propose variants of these models that are jointly trained on financial-domain and general corpora. Our finance-specific FinSentiment models generally show the best performance across three financial sentiment analysis datasets, even when only part of each model is fine-tuned with a smaller training set. Our results improve on the existing results for these datasets on every performance criterion tested. Extensive experiments demonstrate the effectiveness and robustness of RoBERTa pretrained on financial corpora in particular. Overall, we show that NLP transfer learning techniques are well suited to financial sentiment analysis tasks.
Financial risk is empirically quantified in terms of asset return volatility, which is the degree of deviation from the asset's expected return. In financial risk management, predicting asset volatility is one of the most crucial problems because of its important role in investment decisions. Although a number of previous studies have investigated the role of natural language knowledge in improving the quality of volatility predictions, volatility estimation can still be improved with recent deep learning techniques. Specifically, transfer learning approaches such as BERT have not previously been used to extract financial knowledge from text for risk prediction. Here, we propose RiskBERT, the first BERT-based transfer learning method that predicts asset volatility by simultaneously considering a broad set of financial attributes and financial sentiment. As the language dataset, we train our model on the text of the annual 10-K filings of publicly traded companies. RiskBERT uses an attention mechanism to model verbal context and performs markedly better than state-of-the-art methods and baselines such as historical volatility. We observe this outperformance even when RiskBERT is fine-tuned with a smaller training set. We find RiskBERT to be more effective in risk prediction after the Sarbanes-Oxley Act of 2002 was passed, since this legislation made annual reports more effective. Overall, we show that NLP transfer learning techniques are well suited to the financial risk prediction task. Our pretrained models and source code will be made publicly available once the review is finished.
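To make the notion of asset return volatility concrete, the following is a minimal sketch of one common way to measure it: the sample standard deviation of log returns around their mean. The abstract does not give the thesis's exact definition, so the use of log returns, the sample (n - 1) estimator, the function names, and the price series are all assumptions made for illustration.

```python
import math


def log_returns(prices):
    """Log returns r_t = ln(P_t / P_{t-1}) from a series of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def return_volatility(prices):
    """Sample standard deviation of log returns around their mean.

    Assumed definition for illustration; not taken from the thesis.
    """
    returns = log_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices to estimate volatility")
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(variance)


# Made-up closing prices, used only to exercise the functions.
prices = [100.0, 102.0, 101.0, 105.0, 103.0]
print(f"volatility: {return_volatility(prices):.4f}")
```

A quantity of this kind, computed over a past window, is the sort of historical volatility baseline that a text-based predictor would be compared against.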

URI

https://discover.ozyegin.edu.tr/iii/encore/record/C__Rb7038410
https://hdl.handle.net/10679/10270
https://tez.yok.gov.tr/
