Implementation of Q-Learning and Backpropagation in an Agent Playing the Flappy Bird Game
Abstract: This paper shows how
to implement a combination of Q-learning and backpropagation for an
agent learning to play the game Flappy Bird. Q-learning and backpropagation are
combined to predict the value function of each action, a technique known as
value-function approximation. Value-function approximation is used to reduce
learning time and the number of weights stored in memory. Previous studies using
only regular reinforcement learning required longer learning times and stored
more weights in memory. The artificial neural network (ANN) architecture used in
this study is one ANN per action. The results show that combining Q-learning and
backpropagation can reduce the agent's learning time for playing Flappy Bird by up
to 92% and reduce the weights stored in memory by up to 94%, compared to regular
Q-learning alone. Although the learning time and the stored weights are reduced,
Q-learning combined with backpropagation has the same ability as regular
Q-learning to play the Flappy Bird game.
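The core idea the abstract describes, replacing the tabular Q-function with one small neural network per action and training each network by backpropagation on the temporal-difference target, can be sketched as follows. This is a minimal illustration under assumed details (a 2-dimensional state such as the horizontal and vertical distance to the next pipe gap, a single tanh hidden layer, and the hyperparameter values shown), not the authors' exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

class ActionNet:
    """One hidden-layer network approximating Q(s, a) for a fixed action a."""
    def __init__(self, n_in=2, n_hidden=8, lr=0.01):
        self.W1 = rng.normal(0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def forward(self, s):
        # Cache activations so backward() can reuse them.
        self.s = s
        self.h = np.tanh(self.W1 @ s + self.b1)
        return self.W2 @ self.h + self.b2

    def backward(self, td_error):
        # Gradient descent on 0.5 * td_error**2 w.r.t. the network weights.
        dq = -td_error                       # d(loss)/d(Q)
        dW2 = dq * self.h
        dh = dq * self.W2 * (1 - self.h**2)  # backprop through tanh
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * dq
        self.W1 -= self.lr * np.outer(dh, self.s)
        self.b1 -= self.lr * dh

# One network per action, as in the paper: action 0 = do nothing, action 1 = flap.
nets = [ActionNet(), ActionNet()]
GAMMA, EPSILON = 0.95, 0.1  # assumed hyperparameter values

def choose_action(state):
    """Epsilon-greedy action selection over the per-action networks."""
    if rng.random() < EPSILON:
        return int(rng.integers(2))
    return int(np.argmax([net.forward(state) for net in nets]))

def update(state, action, reward, next_state, done):
    """Q-learning update: backpropagate the TD error into the chosen action's net."""
    target = reward if done else reward + GAMMA * max(n.forward(next_state) for n in nets)
    td_error = target - nets[action].forward(state)
    nets[action].backward(td_error)
```

A game loop would call `choose_action` each frame and `update` after observing the reward (e.g. a small positive reward per frame survived and a large negative reward on collision). Because the networks generalize across nearby states, far fewer parameters are stored than a full Q-table over discretized states, which is the source of the memory and learning-time savings the abstract reports.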
Keywords:
Flappy Bird, Q-Learning,
Value-Function Approximation, Artificial Neural Network, Backpropagation
Authors: Ardiansyah, Ednawati Rainarli
Journal Code: jptlisetrodd170177