Modern computers provide excellent opportunities for performing fast computations. They are equipped with powerful microprocessors
and large memories. However, programs are not necessarily able to exploit those computer resources effectively. In this paper,
we present the way in which we have implemented a nearest neighbor classification. We show how performance can be improved
by exploiting the ability of superscalar processors to issue multiple instructions per cycle and by using the memory hierarchy
adequately. This is accomplished by the use of floating-point arithmetic which usually outperforms integer arithmetic, and
block (tiled) algorithms which exploit the data locality of programs allowing for an efficient use of the data stored in the
cache memory. Our results are validated with both an analytical model and empirical results. We show that regular codes could
be performed faster than more complex irregular codes using standard data sets.