David Silver - The Predictron: End-To-End Learning and Planning (2016)

Created: June 23, 2017 / Updated: November 2, 2024 / Status: finished / 1 min read (~126 words)
Machine learning

  • The predictron is composed of four main components
    • A state representation $\mathbf{s} = f(s)$ that encodes the raw input $s$ into an internal state $\mathbf{s}$
    • A model $\mathbf{s}', r, \gamma = m(\mathbf{s}, \beta)$ that maps from internal state $\mathbf{s}$ to subsequent internal state $\mathbf{s}'$, internal reward $r$, and internal discount $\gamma$
    • A value function $v$ that outputs internal values $v = v(\mathbf{s})$ representing the future internal return from internal state $\mathbf{s}$ onwards
    • An accumulator, which combines internal rewards, discounts, and values into an overall estimate of value $g$
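
How the four components fit together can be sketched in a few lines of Python. This is a minimal illustration, not the paper's architecture. The names `predictron_value` and `k_step_accumulate`, the toy `f`, `m`, and `v`, and the fixed number of planning steps `k` are all hypothetical. The accumulator shown is an assumed k-step form that folds rewards and discounts backwards from the final value; the notes above do not specify how the accumulator combines its inputs. `beta` is passed through to the model untouched, since the notes do not define it.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]
# Hypothetical model signature: (internal state, beta) -> (next state, reward, discount)
Model = Callable[[State, object], Tuple[State, float, float]]


def k_step_accumulate(rewards: Sequence[float],
                      discounts: Sequence[float],
                      final_value: float) -> float:
    """Assumed accumulator: g = r1 + y1 * (r2 + y2 * (... (rk + yk * v_k)))."""
    g = final_value
    # Fold from the last planning step back to the first.
    for r, gamma in zip(reversed(rewards), reversed(discounts)):
        g = r + gamma * g
    return g


def predictron_value(raw_input: float,
                     f: Callable[[float], State],
                     m: Model,
                     v: Callable[[State], float],
                     k: int,
                     beta: object = None) -> float:
    """Encode the input, unroll the model k steps, then accumulate into g."""
    s = f(raw_input)                 # state representation
    rewards: List[float] = []
    discounts: List[float] = []
    for _ in range(k):
        s, r, gamma = m(s, beta)     # model: next state, reward, discount
        rewards.append(r)
        discounts.append(gamma)
    return k_step_accumulate(rewards, discounts, v(s))  # value + accumulator


# Toy components, chosen only so the arithmetic is easy to follow.
def toy_f(x: float) -> State:
    return [x]


def toy_m(s: State, beta: object) -> Tuple[State, float, float]:
    # beta is unused in this toy model.
    return [0.5 * s[0]], s[0], 0.9


def toy_v(s: State) -> float:
    return s[0]


if __name__ == "__main__":
    g = predictron_value(2.0, toy_f, toy_m, toy_v, k=2)
    print(round(g, 3))  # 2.0 + 0.9 * (1.0 + 0.9 * 0.5) = 3.305
```

With `k=0` the sketch reduces to `v(f(raw_input))`, so the value function alone gives the estimate. Each extra planning step adds one internal reward and discount from the model before the value is applied.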

  • Silver, David, et al. "The predictron: End-to-end learning and planning." arXiv preprint arXiv:1612.08810 (2016).