Show simple item record

dc.contributor.author	Siddiqi, U. F.
dc.contributor.author	Sait, S. M.
dc.contributor.author	Uysal, Murat
dc.date.accessioned	2021-02-08T08:08:21Z
dc.date.available	2021-02-08T08:08:21Z
dc.date.issued	2020
dc.identifier.issn	2169-3536	en_US
dc.identifier.uri	http://hdl.handle.net/10679/7274
dc.identifier.uri	https://ieeexplore.ieee.org/document/9130159
dc.description.abstract	The traditional approach to nondeterministic-polynomial-time (NP)-hard optimization problems is to apply meta-heuristic algorithms. In contrast, deep Q-learning (DQL) uses a memory of experience and a deep neural network (DNN) to choose actions and progress towards a solution. The dynamic time-division multiple access (DTDMA) scheme is a viable transmission method in visible light communication (VLC) systems. In DTDMA systems, the time slots of the users are adjusted to maximize the spectral efficiency (SE) of the system. The users in a VLC network have different channel gains because of their physical locations, so variable time slots can improve system performance. In this work, we propose a Markov decision process (MDP) model of the DTDMA-based VLC system. The MDP model integrates into DQL and supplies it with information reflecting the behavior of the VLC system and the objective of maximizing the SE. Using the proposed MDP model in deep Q-learning with experience replay gives the light-emitting diode (LED)-based transmitter the autonomy to solve the problem, adjusting the users' time slots from data the device has collected in the past. The proposed model includes definitions of the states, actions, and rewards based on the specific characteristics of the problem. Simulations show that the proposed DQL method produces results competitive with well-known metaheuristic algorithms such as simulated annealing and tabu search.	en_US
dc.description.sponsorship	King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
dc.language.iso	eng	en_US
dc.publisher	IEEE	en_US
dc.relation.ispartof	IEEE Access
dc.rights	openAccess
dc.title	Deep Q-learning based optimization of VLC systems with dynamic time-division multiplexing	en_US
dc.type	Article	en_US
dc.description.version	Publisher version	en_US
dc.peerreviewed	yes	en_US
dc.publicationstatus	Published	en_US
dc.contributor.department	Özyeğin University
dc.contributor.authorID	(ORCID 0000-0001-5945-0813 & YÖK ID 124615) Uysal, Murat
dc.contributor.ozuauthor	Uysal, Murat
dc.identifier.volume	8	en_US
dc.identifier.startpage	120375	en_US
dc.identifier.endpage	120387	en_US
dc.identifier.wos	WOS:000551989600001
dc.identifier.doi	10.1109/ACCESS.2020.3005885	en_US
dc.subject.keywords	Deep Q learning	en_US
dc.subject.keywords	Deep reinforcement learning	en_US
dc.subject.keywords	Dynamic time division multiple access	en_US
dc.subject.keywords	Visible light communications	en_US
dc.subject.keywords	Optimization	en_US
dc.subject.keywords	Non-deterministic algorithms	en_US
dc.identifier.scopus	SCOPUS:2-s2.0-85088314354
dc.contributor.authorMale	1
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Academic Staff
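
The pipeline the abstract describes (an MDP whose actions reallocate DTDMA time slots, trained by Q-learning with experience replay to maximize spectral efficiency) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the per-user channel gains, the SE reward shaping, the 0.1 slot granularity, and a tabular Q-function standing in for the paper's DNN are all assumptions made for the example.

```python
import math
import random
from collections import deque

# Hypothetical sketch of Q-learning with experience replay for DTDMA
# time-slot allocation. Gains, reward, and discretization are assumed,
# and a lookup-table Q replaces the paper's deep neural network.

GAINS = [3.0, 1.5, 0.8]  # assumed per-user channel SNRs
STEP = 0.1               # slot-fraction granularity (assumed)

def spectral_efficiency(slots):
    # SE proxy: sum over users of tau_k * log2(1 + SNR_k)
    return sum(t * math.log2(1 + g) for t, g in zip(slots, GAINS))

def actions(slots):
    # an action moves STEP of airtime from user i to user j,
    # keeping every user at least one STEP of airtime
    return [(i, j) for i in range(len(slots)) for j in range(len(slots))
            if i != j and slots[i] - STEP >= 0.0499]

def apply(slots, act):
    i, j = act
    s = list(slots)
    s[i] = round(s[i] - STEP, 2)
    s[j] = round(s[j] + STEP, 2)
    return tuple(s)

def train(episodes=300, horizon=20, eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = {}                       # tabular stand-in for the DNN
    replay = deque(maxlen=500)   # experience-replay memory
    start = (0.4, 0.3, 0.3)
    for _ in range(episodes):
        s = start
        for _ in range(horizon):
            acts = actions(s)
            if not acts:
                break
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda a: Q.get((s, a), 0.0))
            s2 = apply(s, a)
            r = spectral_efficiency(s2) - spectral_efficiency(s)
            replay.append((s, a, r, s2))
            # sample a mini-batch from replay memory and update Q
            batch = rng.sample(list(replay), min(8, len(replay)))
            for bs, ba, br, bs2 in batch:
                best = max((Q.get((bs2, na), 0.0) for na in actions(bs2)),
                           default=0.0)
                old = Q.get((bs, ba), 0.0)
                Q[(bs, ba)] = old + alpha * (br + gamma * best - old)
            s = s2
    return Q, start

Q, start = train()

# greedy rollout from the learned Q; stop when SE no longer improves
s = start
for _ in range(20):
    acts = actions(s)
    if not acts:
        break
    a = max(acts, key=lambda a: Q.get((s, a), 0.0))
    s2 = apply(s, a)
    if spectral_efficiency(s2) <= spectral_efficiency(s):
        break
    s = s2
print(s, round(spectral_efficiency(s), 3))
```

In this toy setup the greedy policy tends to shift airtime toward the highest-gain user, which matches the intuition in the abstract that exploiting unequal channel gains via variable time slots improves SE.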

