In this paper, we present an analytical performance model of the parallel left-right looking out-of-core LU factorization
algorithm. We show the accuracy of its performance predictions for a prototype implementation in the ScaLAPACK library. We
also show that, with a correct distribution of the matrix and with I/O overlapped by computation, we obtain performance similar
to that of the in-core algorithm. To reach such performance, the size of the physical main memory only needs to be proportional
to the product of the matrix order (not the matrix size) and the ratio between the I/O bandwidth and the computation rate: there
is no need for a large main memory to factorize huge matrices.
This work is supported by a grant from the “Pôle de Modélisation de la Région Picardie”.