Show simple item record

dc.contributor.author Moussaoui, Hanae
dc.contributor.author El akkad, Nabil
dc.contributor.author Benslimane, Mohamed
dc.date.accessioned 2023-05-07T08:51:33Z
dc.date.available 2023-05-07T08:51:33Z
dc.date.issued 2023-05-07
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4943
dc.description.abstract Reinforcement learning is a branch of machine learning in which an agent learns to solve problems by trial and error. The agent interacts with an environment and attempts to achieve a multi-step goal within it. Take the example of a self-driving car on real roads, whose goal is to drive its owner from a point A to a point B while avoiding obstacles. The environment is characterized by a state that the agent observes and examines; for the car, the state might include its location, the condition of the road, and the locations of other vehicles. Each action the agent takes changes the state of the environment. The agent receives reward signals as it moves closer to its goal and uses these signals to determine which actions were successful and which were not. This loop of state, action, and reward repeats until the agent learns to operate effectively within the environment. The agent's main objective is to learn to choose, in any state of the environment, the action that leads it closer to its goal. In this paper, we review the methods used in the literature to solve reinforcement learning problems, including multi-armed bandits, the Markov decision process, dynamic programming, Monte Carlo methods, and temporal-difference learning. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.subject Reinforcement learning; multi-armed bandits; Markov decision process; dynamic programming; Monte Carlo methods; Deep reinforcement learning en_US
dc.title Reinforcement Learning: A review en_US
dc.identifier.doi http://dx.doi.org/10.12785/ijcds/1301118 en
dc.volume 13 en_US
dc.issue 1 en_US
dc.pagestart 1 en_US
dc.pageend 1 en_US
dc.contributor.authorcountry Morocco en_US
dc.contributor.authoraffiliation Sidi Mohamed Ben Abdellah University Fez - ENSA en_US
dc.contributor.authoraffiliation ENSA of Fez, Sidi Mohamed Ben Abdellah University en_US
dc.contributor.authoraffiliation EST of Fez, Sidi Mohamed Ben Abdellah University en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US
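
The state, action, and reward loop described in the abstract can be illustrated with a small sketch. This is not code from the paper: it assumes a hypothetical one-dimensional corridor in which the agent starts at point A (state 0) and must reach point B (the last state), and it uses tabular Q-learning, one temporal-difference method, as the learning rule. The function names, the reward of 1.0 at the goal, and all parameter values are assumptions chosen for illustration.

```python
import random


def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    State 0 plays the role of point A, state n_states - 1 of point B.
    Returns a table q[state][action] with action 0 = left, 1 = right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 moves left, action 1 right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: explore with probability epsilon (or on ties),
            # otherwise exploit the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # The environment changes state in response to the action
            # (positions are clamped to the corridor).
            next_state = min(max(state + moves[action], 0), goal)

            # Reward signal: assumed to be 1.0 only when the goal is reached.
            reward = 1.0 if next_state == goal else 0.0

            # Temporal-difference update toward reward + discounted best value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action (0 = left, 1 = right) for every non-goal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

Under these assumptions, the learned greedy policy would be expected to pick the move-right action in every non-goal state, which is the behaviour that brings the agent from A to B in the fewest steps.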

