In this paper we present and analyze the architecture and implementation of an artificial neural network hardware engine. The
engine was designed to overcome the performance limitations of serial software implementations. It is based on a hierarchical, parallel,
and parameterized architecture. Based on the verification results, we conclude that the engine improves computational
performance, producing speedups from 52.3 to 204.5, and that its architectural parameterization provides greater flexibility.