Bi Jinbo, Wu Cangpu. Efficient Temporal Difference Learning with Adaptive λ[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 1999, 8(3): 251-257.

Efficient Temporal Difference Learning with Adaptive λ

Aim: To find a more efficient learning method, based on temporal difference learning, for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ), with adaptive schemes for selecting the value of λ and intended for absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up its convergence.
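For readers unfamiliar with truncated TD(λ), the following is a minimal illustrative sketch of how a truncated λ-return over a window of m steps is commonly computed by working backwards through the window. It is not taken from the paper: the function name `truncated_lambda_return`, its arguments, and the convention of indexing a per-step λ by the later step are all assumptions made for illustration, and the paper's adaptive scheme for choosing λ is not reproduced here (the caller simply supplies the λ values).

```python
def truncated_lambda_return(rewards, next_values, lambdas, gamma):
    """Truncated lambda-return for the first step of an m-step window.

    Hypothetical helper, not the authors' code.
    rewards[k]     : reward received at step t+k
    next_values[k] : value estimate of the state reached after step t+k,
                     e.g. max over actions of Q; 0.0 for an absorbing state
    lambdas[k]     : lambda used at step t+k (a constant list gives fixed lambda)
    gamma          : discount factor
    """
    m = len(rewards)
    # The last step in the window bootstraps fully from the value estimate.
    z = rewards[m - 1] + gamma * next_values[m - 1]
    # Backward recursion:
    # z_k = r_k + gamma * ((1 - lam) * V_{k+1} + lam * z_{k+1})
    for k in range(m - 2, -1, -1):
        lam = lambdas[k + 1]  # assumed convention: lambda of the later step
        z = rewards[k] + gamma * ((1.0 - lam) * next_values[k] + lam * z)
    return z


# Example window of 3 steps with a cost of 1 per move, as in a
# shortest-path task; the final state is absorbing (value 0).
rewards = [-1.0, -1.0, -1.0]
next_values = [5.0, 4.0, 0.0]
one_step = truncated_lambda_return(rewards, next_values, [0.0] * 3, 0.5)
full = truncated_lambda_return(rewards, next_values, [1.0] * 3, 0.5)
```

With λ = 0 the result reduces to the one-step target, and with λ = 1 it becomes the discounted sum of rewards over the window plus the bootstrapped final value. In a Q-learning setting, such a return would typically serve as the target in an update of the form Q(s, a) ← Q(s, a) + α (z − Q(s, a)), where α is a step size.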