We introduce a new approach to genetic programming with memory for reinforcement learning, in which memories are selected
so as to increase the probability of modelling the most relevant parts of memory space. We evolve maps directly
from state to action, rather than maps that predict reward based on state and action, which reduces the complexity of the
evolved mappings. The work is motivated by applications to the control of autonomous robots. Preliminary results in software
simulations indicate improved learning speed and quality.
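
To illustrate the distinction between a map from state to action and a map that predicts reward from state and action, the following minimal Python sketch contrasts how an action is obtained in each case. It is not the method of this paper: the toy task, the scalar state, the two-element action set, and all names (`act_direct`, `act_via_reward_model`, `toy_policy`, `toy_reward_model`, `ACTIONS`) are hypothetical and chosen only for illustration. In practice the two callables would be evolved programs rather than hand-written functions.

```python
from typing import Callable, Sequence

State = float   # assumption: a scalar state, for illustration only
Action = int    # assumption: a small discrete action set


def act_direct(policy: Callable[[State], Action], state: State) -> Action:
    """Direct map: a single evaluation of the program yields the action."""
    return policy(state)


def act_via_reward_model(
    reward_model: Callable[[State, Action], float],
    state: State,
    actions: Sequence[Action],
) -> Action:
    """Indirect map: the program predicts reward, so every candidate
    action must be scored before the best one can be chosen."""
    return max(actions, key=lambda a: reward_model(state, a))


# Hypothetical toy task: reward is highest when the action has the
# same sign as the state.
ACTIONS = (-1, 1)


def toy_policy(state: State) -> Action:
    # Stand-in for an evolved state-to-action program.
    return 1 if state >= 0 else -1


def toy_reward_model(state: State, action: Action) -> float:
    # Stand-in for an evolved (state, action)-to-reward program.
    return state * action


if __name__ == "__main__":
    for s in (0.5, -2.0):
        print(s, act_direct(toy_policy, s),
              act_via_reward_model(toy_reward_model, s, ACTIONS))
```

In this sketch the direct map takes one argument and is evaluated once per decision, whereas the reward-predicting map takes two arguments and is evaluated once per candidate action, which gives one informal reading of the reduced complexity referred to above.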