Spam remains a widespread concern, yet current spam filtering systems suffer from weak semantic analysis and low
effectiveness against multi-send spam. This paper proposes a spam filtering model based on latent semantic analysis (LSA)
and the Secure Hash Algorithm (SHA). LSA is used to mark the latent feature phrases in spam, which brings semantic analysis
into the filtering technique. SHA is then applied to the result of the LSA step to generate an "e-mail fingerprint" for
multi-send spam, which addresses the low effectiveness of existing filters on such spam. We have designed a spam filtering
system based on this model and evaluated it on a selected dataset. Compared with a KNN-based filter, the experimental
results show that the system based on LSA and SHA outperforms KNN. The experiments yield the expected results and validate
the feasibility and advantages of the new spam filtering method.
Keywords: Latent Semantic Analysis - Secure Hash Algorithm - Mail Characteristic ID - Sliding Window - Spam Filtering
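
The fingerprinting idea can be illustrated with a minimal sketch. It is not the system described here: it assumes the LSA step has already produced a set of feature terms (the hypothetical `FEATURES` set below), and it assumes SHA-256, since the text does not say which SHA variant is used. The function name `mail_fingerprint` and the sample messages are likewise illustrative. The sketch hashes the sorted feature terms found in a message, so copies of a multi-send spam that differ only in non-feature wording map to the same fingerprint.

```python
import hashlib
import re


def mail_fingerprint(message, feature_terms):
    """Return a hex SHA-256 digest over the feature terms present in message.

    feature_terms stands in for the output of the LSA step (an assumption
    of this sketch); only those terms contribute to the fingerprint.
    """
    # Lowercase and split into alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    # Keep only feature terms, de-duplicated and sorted, so that word order
    # and surrounding filler text do not change the digest.
    present = sorted(set(tokens) & set(feature_terms))
    canonical = "\n".join(present)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical feature terms, standing in for LSA output.
FEATURES = {"free", "winner", "prize", "click", "offer"}

spam_a = "Dear John, you are a WINNER! Click here for your free prize."
spam_b = "Hello Mary, click now: free prize for every winner!!"
ham = "Meeting moved to 3pm, agenda attached."

# Both spam variants contain the same feature terms, so they match.
print(mail_fingerprint(spam_a, FEATURES) == mail_fingerprint(spam_b, FEATURES))
print(mail_fingerprint(spam_a, FEATURES) == mail_fingerprint(ham, FEATURES))
```

A filter built along these lines could keep the fingerprints of known spam in a set and reject incoming mail whose fingerprint is already present. A message that contains no feature terms hashes the empty string, so a real system would need to treat that case separately.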