We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework
requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible,
non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole
problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good
policy is found after only a handful of iterations.
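
To illustrate what it means for a model to explicitly characterize its level of confidence, the following minimal sketch fits a non-parametric regression model to a few observations and returns both a predictive mean and a predictive variance. It assumes Gaussian process regression with a squared-exponential kernel and fixed hyperparameters; the text above does not name a specific model class, so this choice, the toy data, and all function names (`sq_exp_kernel`, `gp_predict`, and the helpers) are illustrative assumptions rather than the framework's actual implementation.

```python
import math

# Assumption: squared-exponential kernel with fixed, hand-picked hyperparameters.
def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        x[i] = (b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i]
    return x

def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(xs, ys, x_star, noise_var=1e-2):
    """Predictive mean and variance of the latent function at x_star."""
    n = len(xs)
    # Kernel matrix of the training inputs, with observation noise on the diagonal.
    K = [[sq_exp_kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, ys))  # alpha = K^{-1} y
    k_star = [sq_exp_kernel(x, x_star) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = sq_exp_kernel(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, var

# Hypothetical toy data: three noisy-free samples of sin(x).
xs = [-1.0, 0.0, 1.0]
ys = [math.sin(x) for x in xs]

mean_near, var_near = gp_predict(xs, ys, 0.1)  # close to observed data
mean_far, var_far = gp_predict(xs, ys, 5.0)    # far from any observation

print("near data:", mean_near, var_near)
print("far from data:", mean_far, var_far)
```

In this sketch the predictive variance is small near the observed inputs and reverts to the prior variance far from them, which is the kind of confidence information a model-based learner could use to avoid over-trusting predictions in unexplored regions of the state space.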