Volume 2, Number 1, 85-102, DOI: 10.1007/s10723-004-2809-x

Metadata for Managing Grid Resources in Data Mining Applications

Carlo Mastroianni, Domenico Talia and Paolo Trunfio

From the issue entitled "Highlights of the 4th International Workshop on Grid Computing (Grid2003)"

Abstract

The Grid is an infrastructure for resource sharing and coordinated use of those resources in dynamic, heterogeneous distributed environments. Effective use of a Grid requires metadata to manage the heterogeneity of the resources involved, which include computers, data, network facilities, and software tools provided by different organizations. Metadata management becomes a key issue when complex applications, such as data-intensive simulations and data mining applications, are executed on a Grid. This paper discusses metadata models for heterogeneous resource management in Grid-based data mining applications. In particular, it discusses how resources are represented and managed in the Knowledge Grid, a framework for Grid-enabled distributed data mining. The paper illustrates how XML-based metadata is used to describe data mining tools, data sources, mining models, and execution plans, and how that metadata supports the design and execution of distributed knowledge discovery applications on Grids.

Keywords: data mining - discovery service - dynamic scheduling - knowledge grid - metadata management - peer-to-peer - resource categorization - semantic grid
