Deep reinforcement learning approach for trading automation in the stock market

Kabbani, Taylan

dc.contributor.author	Kabbani, Taylan
dc.date.accessioned	2022-06-07T12:36:52Z
dc.date.available	2022-06-07T12:36:52Z
dc.identifier.uri	http://hdl.handle.net/10679/7708
dc.identifier.uri	https://tez.yok.gov.tr
dc.identifier.uri	https://discover.ozyegin.edu.tr/iii/encore/record/C__Rb4969441?lang=eng&ivts=X5GUFipAB5kpB5cO9xz9pA%3D%3D&casts=jYQ%2BCMssaa6aMV2eMracVQ%3D%3D
dc.description	Thesis (M.A.)--Özyeğin University, Graduate School of Sciences and Engineering, Department of Data Science, December 2021.
dc.description.abstract	Deep Reinforcement Learning (DRL) algorithms can scale to previously intractable problems. DRL makes it possible to automate profit generation in the stock market by combining the asset price "prediction" step and the portfolio "allocation" step into one unified process, producing a fully autonomous system that interacts with its environment to make optimal decisions through trial and error. In this study, a continuous action space approach is adopted so that the trading agent can gradually adjust the portfolio's positions at each time step (dynamically reallocating investments), resulting in better agent-environment interaction and faster convergence of the learning process. In addition, the approach supports managing a portfolio of several assets instead of a single one. This work presents a novel DRL model for generating profitable trades in the stock market, effectively overcoming the limitations of supervised learning approaches. We formulate the trading problem as a Partially Observed Markov Decision Process (POMDP), considering the constraints imposed by the stock market, such as liquidity and transaction costs. More specifically, we design an environment that simulates the real-world trading process by augmenting the state representation with ten different technical indicators and sentiment analysis of news articles for each stock. We then solve the formulated POMDP using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and achieve a Sharpe ratio of 2.68 on the test dataset. From the point of view of stock market forecasting and intelligent decision-making, this study demonstrates the superiority of deep reinforcement learning in financial markets over other types of machine learning, such as supervised learning, and proves the credibility and advantages of strategic decision-making using DRL.	en_US
dc.description.abstract	Deep Reinforcement Learning (DRL) algoritmaları, daha önceki zorlu sorunlara ölçeklenebilir. Hisse senedi piyasasında kâr üretiminin otomasyonu, finansal varlıkların fiyat "tahmin" adımı ve portföyün "tahsis" adımını tek bir birleşik süreçte birleştirerek, deneme yanılma yoluyla optimal kararlar almak için çevresiyle etkileşime girebilen tamamen özerk bir sistem üretmek için DRL kullanılarak mümkündür. Bu çalışmada, alım-satım aracısına portföyün pozisyonlarını her zaman adımı için kademeli olarak ayarlama (yatırımları dinamik olarak yeniden tahsis etme) özelliği vermek için sürekli bir eylem alanı yaklaşımı benimsenmiştir, bu da daha iyi aracı-ortam etkileşimi ve öğrenme sürecinin daha hızlı yakınsaması ile sonuçlanmıştır. Ayrıca yaklaşım, tek bir varlık yerine birkaç varlık içeren bir portföyün yönetilmesini destekler. Bu çalışma, denetimli öğrenme yaklaşımlarının sınırlamalarını etkin bir şekilde aşarak borsada karlı işlemler oluşturmak için yeni bir DRL modelini temsil etmektedir. Ticaret problemini, likidite ve işlem maliyetleri gibi hisse senedi piyasasının getirdiği kısıtlamaları dikkate alarak Partially Observed Markov Decision Process (POMDP) modeli olarak formüle ediyoruz. Daha spesifik olarak, her hisse senedi için on farklı teknik gösterge ve haber makalelerinin duygu analizi ile durum temsilini artırarak gerçek dünyadaki ticaret sürecini simüle eden bir ortam tasarlıyoruz. Daha sonra formüle edilmiş POMDP problemini Twin Delayed Deep Deterministic Policy Gradient (TD3) algoritmasını kullanarak çözdük ve test veri setinde 2.68 Sharpe oranı elde ettik. Hisse senedi piyasası tahmini ve akıllı karar verme mekanizması açısından bu makale, finansal piyasalarda derin pekiştirmeli öğrenmenin denetimli öğrenme gibi diğer makine öğrenimi türlerine göre üstünlüğünü göstermekte, aynı zamanda DRL yaklaşımının güvenilirliğini ve stratejik karar vermenin avantajlarını kanıtlamaktadır.
dc.language.iso	eng	en_US
dc.rights	restrictedAccess
dc.title	Deep reinforcement learning approach for trading automation in the stock market	en_US
dc.title.alternative	Hisse senetlerinde işlem otomasyonu için derin güçlendirme öğrenme yaklaşımı
dc.type	Master's thesis	en_US
dc.contributor.advisor	Duman, Ekrem
dc.contributor.committeeMember	Duman, Ekrem
dc.contributor.committeeMember	Albey, Erinç
dc.contributor.committeeMember	Alkaya, A. F.
dc.publicationstatus	Unpublished	en_US
dc.contributor.department	Özyeğin University
dc.contributor.ozugradstudent	Kabbani, Taylan
dc.contributor.authorMale	1
dc.relation.publicationcategory	Thesis - Institutional Graduate Student