Enhanced reinforcement learning by recursive updating of Q-values for reward propagation

Yunsick Sung, Eunyoung Ahn, Kyungeun Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose a method to reduce the learning time of Q-learning by combining two techniques: updating the Q-values of unexecuted actions, and adding a terminal reward to unvisited Q-values. To verify the method, its performance was compared with that of conventional Q-learning. The proposed approach achieved the same performance as conventional Q-learning while requiring only 27% of the learning episodes that conventional Q-learning needs. Accordingly, we verified that the proposed method reduces learning time by updating more Q-values in the early stage of learning and by distributing the terminal reward to more Q-values.
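For context, the baseline the abstract compares against can be sketched as standard tabular Q-learning. The snippet below is a minimal illustration on a hypothetical 5-state chain task (state names, rewards, and hyperparameters are illustrative assumptions, not taken from the paper); the paper's extensions — updating Q-values of unexecuted actions and distributing the terminal reward to unvisited Q-values — are not reproduced, since the abstract does not give their exact update rules.

```python
import random

# Minimal tabular Q-learning sketch on a toy 5-state chain MDP.
# Everything below is standard one-step Q-learning; the paper's proposed
# extensions are NOT implemented here.

N_STATES = 5          # states 0..4; entering state 4 yields reward 1 and ends the episode
ACTIONS = [0, 1]      # 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3   # illustrative hyperparameters

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Deterministic chain: action 1 moves toward the goal, action 0 away."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

random.seed(0)
for _ in range(200):                                  # learning episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a_: Q[(s, a_)])
        s2, r = step(s, a)
        # One-step Q-learning update: only the executed (s, a) pair changes,
        # which is why reward propagation back along the chain is slow.
        Q[(s, a)] += ALPHA * (r + GAMma * max(Q[(s2, a_)] for a_ in ACTIONS) - Q[(s, a)]) if False else \
            ALPHA * (r + GAMMA * max(Q[(s2, a_)] for a_ in ACTIONS) - Q[(s, a)])
        s = s2

# Moving right should now dominate in every non-terminal state.
print(all(Q[(s, 1)] > Q[(s, 0)] for s in range(N_STATES - 1)))
```

Because only the executed state–action pair is updated per step, the terminal reward propagates backward one state per episode at best; this is the inefficiency the paper's recursive-update and reward-distribution techniques target.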

Original language: English
Title of host publication: IT Convergence and Security 2012
Pages: 1003-1008
Number of pages: 6
DOIs
State: Published - 2013
Event: International Conference on IT Convergence and Security, ICITCS 2012 - Pyeong Chang, Korea, Republic of
Duration: 5 Dec 2012 - 7 Dec 2012

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 215 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: International Conference on IT Convergence and Security, ICITCS 2012
Country/Territory: Korea, Republic of
City: Pyeong Chang
Period: 5/12/12 - 7/12/12

Keywords

  • Propagation
  • Q-learning
  • Q-value
  • Terminal reward
