Publication:
Deep reinforcement based power allocation for the max-min optimization in non-orthogonal multiple access

dc.contributor.authorSiddiqi, U. F.
dc.contributor.authorSait, S. M.
dc.contributor.authorUysal, Murat
dc.contributor.departmentElectrical & Electronics Engineering
dc.contributor.ozuauthorUYSAL, Murat
dc.date.accessioned2021-03-08T11:21:47Z
dc.date.available2021-03-08T11:21:47Z
dc.date.issued2020
dc.description.abstractNOMA is a radio access technique that multiplexes several users over the same frequency resource, providing high throughput and fairness among users. Maximizing the minimum data-rate, also known as max-min, is a popular approach to ensure fairness among the users. NOMA performs max-min by optimizing the transmission powers (or power-coefficients) of the users. The problem is a constrained non-convex optimization for more than two users. We propose to solve this problem using Double Deep Q Learning (DDQL), a popular reinforcement-learning technique. DDQL employs a Deep Q-Network (DQN) that learns to choose optimal actions for optimizing the users' power-coefficients. The model of the Markov Decision Process (MDP) is critical to the success of the DDQL method and helps the DQN learn to take better actions. We propose an MDP model in which the state consists of the power-coefficient values, the users' data-rates, and vectors indicating which power-coefficients can be increased or decreased. An action simultaneously increases the power-coefficient of one user and decreases another user's power-coefficient by the same amount; the amount of change can be small or large. The action-space contains all possible ways to alter the values of any two users at a time. The DQN consists of a convolutional layer and fully connected layers. We compared the proposed method with the sequential least squares programming and trust-region constrained algorithms and found that it produces competitive results.en_US
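The action design described in the abstract (raise one user's power-coefficient and lower another's by the same amount, with a small or large step) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the step sizes, function names, and the total-power-of-one normalization are assumptions made for the example.

```python
from itertools import permutations

# Illustrative "small" and "large" step sizes (assumed values, not from the paper).
STEPS = (0.01, 0.05)

def build_action_space(num_users):
    """All (increase_user, decrease_user, step) triples for any two users.

    Matches the abstract's description: the action-space contains every way
    to alter the power-coefficients of two users at a time, at two step sizes.
    """
    return [(i, j, s) for i, j in permutations(range(num_users), 2) for s in STEPS]

def apply_action(coeffs, action):
    """Shift power from user j to user i; the total power stays constant."""
    i, j, step = action
    new = list(coeffs)
    new[i] += step
    new[j] -= step
    return new

def max_min_objective(rates):
    """The max-min criterion rewards improving the worst user's data-rate."""
    return min(rates)

actions = build_action_space(3)  # 3 users -> 6 ordered pairs x 2 steps = 12 actions
```

Because each action moves power between exactly two users, the sum of the power-coefficients is invariant, which keeps every visited state on the total-power constraint surface.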
dc.description.sponsorshipDeanship of Scientific Research, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
dc.description.versionPublisher version
dc.identifier.doi10.1109/ACCESS.2020.3038923en_US
dc.identifier.endpage211247en_US
dc.identifier.issn2169-3536en_US
dc.identifier.scopus2-s2.0-85097130085
dc.identifier.startpage211235en_US
dc.identifier.urihttp://hdl.handle.net/10679/7370
dc.identifier.urihttps://doi.org/10.1109/ACCESS.2020.3038923
dc.identifier.volume8en_US
dc.identifier.wos000596356100001
dc.language.isoengen_US
dc.peerreviewedyesen_US
dc.publicationstatusPublisheden_US
dc.publisherIEEEen_US
dc.relation.ispartofIEEE Access
dc.relation.publicationcategoryInternational Refereed Journal
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subject.keywordsNOMAen_US
dc.subject.keywordsOptimizationen_US
dc.subject.keywordsSilicon carbideen_US
dc.subject.keywordsResource managementen_US
dc.subject.keywordsTask analysisen_US
dc.subject.keywordsRelaysen_US
dc.subject.keywordsReinforcement learningen_US
dc.subject.keywordsNon-orthogonal multiplexingen_US
dc.subject.keywordsDouble deep Q learningen_US
dc.subject.keywordsDeep reinforcement learningen_US
dc.subject.keywordsNon-convex optimizationen_US
dc.subject.keywordsPower-domain NOMAen_US
dc.titleDeep reinforcement based power allocation for the max-min optimization in non-orthogonal multiple accessen_US
dc.typeArticleen_US
dspace.entity.typePublication
relation.isOrgUnitOfPublication7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery7b58c5c4-dccc-40a3-aaf2-9b209113b763

Files

Original bundle

Now showing 1 - 1 of 1
Name:
Deep reinforcement based power allocation for the max-min optimization in non-orthogonal multiple access.pdf
Size:
6.59 MB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
Name:
license.txt
Size:
1.45 KB
Format:
Item-specific license agreed upon to submission