In this paper, we address the problem of automatically assigning keywords to scientific publications. We consider using textual
traces, learned from training data in a supervised manner, to identify appropriate keywords. We introduce the
transparent concept of an identification cloud as a means of representing the semantics of scientific terms. This concept is defined mathematically
by models of the stochastic distributions of scientific terms over publication texts. Characteristics of the models as well as
procedures for both non-parametric and parametric estimation of probability distributions are presented.