Heterogeneous clusters and grid infrastructures are becoming increasingly popular. In these computing infrastructures, machines
have different resources, including memory sizes, disk space, and installed software packages. These differences give rise
to a problem of over-provisioning, that is, sub-optimal utilization of a cluster due to users requesting resource capacities
greater than what their jobs actually need. Our analysis of a real workload file (LANL CM5) revealed differences of up to
two orders of magnitude between requested memory capacity and actual memory usage. This paper presents an algorithm to estimate
actual resource capacities used by batch jobs. Such an algorithm reduces the need for users to correctly predict the resources
required by their jobs, while at the same time enabling the scheduling system to achieve better utilization of the available
hardware. The algorithm is based on the Reinforcement Learning paradigm; it learns its estimation policy on-line and dynamically
modifies it according to the overall cluster load. The paper includes simulation results indicating that our algorithm
can improve the utilization (overall throughput) of heterogeneous clusters by over 30%.