Packet routing is an important issue for maintaining good performance and operating web-based systems successfully. This
problem is naturally formulated as a dynamic programming problem, which, however, is too complex to be solved exactly. We
propose here two adaptive routing algorithms based on reinforcement learning. In the first algorithm, we use a neural
network to approximate a reinforcement signal, allowing the learner to incorporate various parameters into its distance estimation
such as local queue size. Moreover, each router uses an online learning module to optimize the path in terms of average packet
delivery time, by taking into account the waiting queue states of neighboring routers. In the second algorithm, the exploration
of paths is limited to the N best loop-free paths in terms of hop count (the number of routers in a path), leading to a substantial
reduction in convergence time. The performance of the proposed algorithms is evaluated experimentally for different levels
of traffic load and compared to standard shortest-path and Q-routing algorithms. Our approaches prove superior to the classical
algorithms and are able to route efficiently even when critical aspects of the simulation, such as the network load, are allowed
to vary dynamically.
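
As a rough illustration of two ingredients named above, the following Python sketch enumerates the N best loop-free paths by hop count and applies the standard tabular Q-routing update used as a baseline. It is not the proposed neural-network algorithm. The function names, the adjacency-list graph, the nested-dictionary Q table, and the learning rate `lr` are all assumptions made for this example.

```python
from collections import deque


def n_best_loop_free_paths(graph, src, dst, n):
    """Return up to n loop-free paths from src to dst, fewest hops first.

    graph is a hypothetical adjacency list: {node: [neighbor, ...]}.
    """
    paths = []
    queue = deque([[src]])  # breadth-first: shorter paths are dequeued first
    while queue and len(paths) < n:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            paths.append(path)
            continue
        for nxt in graph[node]:
            if nxt not in path:  # never revisit a node, so no loops
                queue.append(path + [nxt])
    return paths


def q_routing_update(q, x, y, dest, queue_delay, link_delay, lr=0.5):
    """Tabular Q-routing update at router x after forwarding to neighbor y.

    q[x][dest][y] is x's estimate of the delivery time to dest via y
    (an assumed table layout for this sketch).
    """
    # Neighbor y reports its own best remaining-time estimate (0 if y is dest).
    best_from_y = 0.0 if y == dest else min(q[y][dest].values())
    target = queue_delay + link_delay + best_from_y
    q[x][dest][y] += lr * (target - q[x][dest][y])
    return q[x][dest][y]


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    print(n_best_loop_free_paths(graph, "A", "D", 2))

    q = {"A": {"D": {"B": 10.0, "C": 10.0}}, "B": {"D": {"D": 2.0}}}
    print(q_routing_update(q, "A", "B", "D", queue_delay=1.0, link_delay=1.0))
```

The breadth-first enumeration is exponential in the worst case and is meant only for small topologies. Restricting exploration to the paths it returns is one plausible way to realize the N-best limitation described above.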