Analyzing complex scientific data, e.g., graphs and images, often requires comparing features such as regions on graphs, visual
aspects of images and related metadata, with some features being more important than others. The notion of similarity for comparison
is typically a distance between data objects, which can be expressed in terms of distances between their features. We refer to the
distance based on each feature as a component. Weights of components, representing the relative importance of features, can be learned using
distance function learning algorithms. However, it is seldom known which components optimize learning, given criteria such
as accuracy, efficiency and simplicity. This is the problem we address. We propose and theoretically compare four component
selection approaches: Maximal Path Traversal, Minimal Path Traversal, Maximal Path Traversal with Pruning and Minimal Path
Traversal with Pruning. Experimental evaluation is conducted using real data from Materials Science, Nanotechnology and Bioinformatics.
A trademarked software tool developed from this work is one of its highlights.
Keywords: Feature Selection, Data Mining, Multimedia, Scientific Analysis
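
To make the notion of components concrete, the following is a minimal sketch of a distance between two data objects built from per-feature components and their weights. It assumes the overall distance is a weighted sum of component distances, which the text above does not state explicitly. The feature names (`peak`, `area`), the absolute-difference components and the weight values are hypothetical and chosen only for illustration; in practice the weights would come from a distance function learning algorithm.

```python
from typing import Callable, Dict, Sequence

# A data object is modeled here as a mapping from feature name to value
# (an assumption made for this illustration only).
DataObject = Dict[str, float]
Component = Callable[[DataObject, DataObject], float]


def feature_component(name: str) -> Component:
    """Build a component: the distance based on one (hypothetical) feature."""
    def component(a: DataObject, b: DataObject) -> float:
        return abs(a[name] - b[name])
    return component


def weighted_distance(
    a: DataObject,
    b: DataObject,
    components: Sequence[Component],
    weights: Sequence[float],
) -> float:
    """Combine component distances using weights (assumed weighted-sum form)."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component")
    return sum(w * c(a, b) for w, c in zip(weights, components))


# Two hypothetical data objects described by two features.
obj_a = {"peak": 1.0, "area": 4.0}
obj_b = {"peak": 3.0, "area": 1.0}

components = [feature_component("peak"), feature_component("area")]
weights = [0.5, 2.0]  # illustrative values, not learned

print(weighted_distance(obj_a, obj_b, components, weights))  # prints 7.0
```

In this picture, component selection amounts to choosing which entries of `components` are passed to the learner before the weights are estimated.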