In this paper, we investigate the integration of heterogeneous network monitoring data. Specifically, we synchronize
and integrate flow-level records, exemplified by Cisco NetFlow, with packet-level traces, exemplified by NLANR PMA. Integrating
the two lets each source be cross-validated against the other and lets the two sources complement each other. However, finding the correspondences among timestamps, flows, and packets
in PMA and NetFlow data is non-trivial, because the two sources differ in granularity, sampling strategy,
time source, and IP address masking. To integrate heterogeneous monitoring data, we first synchronize
their timestamps, and then match their masked IP addresses. Our key observation is that although the IP addresses are masked,
other header fields can be exploited to match records across the different types of monitoring data. To reduce the search space
and the processing overhead, we adopt a top-down approach that limits the search scope, and iterative algorithms that reduce
the matching errors step by step.
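
As a rough illustration of the two steps named above (synchronize timestamps, then match records without relying on the masked addresses), the following minimal Python sketch may help. It is not the algorithm of this paper. It assumes, purely for illustration, that protocol and port numbers are the unmasked header fields, that the clock difference is a constant offset, and that a simple search over candidate offsets is adequate. The names `Flow`, `Packet`, `match_packets`, and `estimate_offset` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    start: float  # flow start time in seconds, on the flow exporter's clock
    end: float    # flow end time in seconds, on the flow exporter's clock
    proto: int    # assumed unmasked: IP protocol number
    sport: int    # assumed unmasked: source port
    dport: int    # assumed unmasked: destination port


@dataclass(frozen=True)
class Packet:
    ts: float     # capture time in seconds, on the packet monitor's clock
    proto: int
    sport: int
    dport: int


def match_packets(flows, packets, offset, slack=0.0):
    """Map each packet index to the indices of flows that share its
    (proto, sport, dport) key and whose time window contains the packet
    timestamp after adding `offset` seconds."""
    by_key = {}
    for i, f in enumerate(flows):
        by_key.setdefault((f.proto, f.sport, f.dport), []).append(i)

    matches = {}
    for j, p in enumerate(packets):
        t = p.ts + offset
        hits = [
            i
            for i in by_key.get((p.proto, p.sport, p.dport), [])
            if flows[i].start - slack <= t <= flows[i].end + slack
        ]
        if hits:
            matches[j] = hits
    return matches


def estimate_offset(flows, packets, candidates, slack=0.0):
    """Return the candidate offset under which the most packets fall
    inside a flow with the same key (a plain exhaustive search)."""
    return max(
        candidates,
        key=lambda d: len(match_packets(flows, packets, d, slack)),
    )
```

In this sketch, `estimate_offset` would be called first with a coarse list of candidate offsets, and `match_packets` would then be run once with the chosen offset. A packet that maps to several flows remains ambiguous and would need further fields or iterations to resolve.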
Keywords Heterogeneous network monitoring data · NetFlow · PMA
This work is sponsored by the University Research Program of Cisco Systems, Inc., from 09/01/04 to 08/31/06.