This is a bit of an oversimplification of data mining to the point where I am no...

This is a bit of an oversimplification of data mining to the point where I am not sure it is useful. Most interesting data exists in only a subset of a large feature set, where most items are irrelevant to the similarity metric. Take movies for example, if you tried to find similar movies using all features, key grip names and minor actors would unrealistically mess up your similarity score. This relates to the "curse of dimensionality".

Many data mining approaches first use a feature selection or feature extraction approach. That is, an approach which finds the relevant feature subsets, or discovers the underlying features of the data set.

Inverse Image search and the solution to the Netflix prize both used feature extraction approaches.