
Invited Talks

Distances in Classification

Prof. Dr. Claus Weihs
Department of Statistics
TU Dortmund/Germany

Abstract

The notion of distance is the most important basis for classification. This is especially true for unsupervised learning, i.e. clustering, since there is no validation mechanism by means of objects with known groups. For every individual problem, an adequate distance has to be chosen. This is demonstrated by means of three practical examples from very different application areas, namely social science, music science, and production economics. In social science, models are often used that take into account spatial distances between objects that might have very irregular borders. These borders have to be taken into account when defining distances. In music science, the main problem is often to find an adequate transformation of the input time series as the basis for the distance definition. In addition, local modelling is proposed in order to account for different subpopulations, e.g. instruments. In production economics, many quality criteria with very different scaling often have to be taken into account. To find a compromise optimum classification, this leads to a pre-transformation of all criteria onto the same scale, called desirability.
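The effect of such a pre-transformation on distances can be illustrated with a minimal Python sketch. It assumes a simple "larger is better" desirability function that maps each criterion linearly onto [0, 1]; the abstract does not specify which desirability function the talk uses, so the function, the specification limits, and the three example objects below are all hypothetical.

```python
import math

def desirability(y, low, target, weight=1.0):
    """Assumed 'larger is better' desirability: 0 at or below `low`,
    1 at or above `target`, and a power curve in between."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight

def to_desirabilities(obj, specs):
    """Transform every criterion of one object onto the common [0, 1] scale."""
    return [desirability(v, low, target) for v, (low, target) in zip(obj, specs)]

def euclidean(a, b):
    """Plain Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical (low, target) limits for two criteria on very different scales.
specs = [(0.0, 1000.0), (0.0, 1.0)]

# Hypothetical objects, each described by the two quality criteria.
a = (200.0, 0.9)
b = (800.0, 0.1)
c = (210.0, 0.1)

# On the raw scale the first criterion dominates the distance.
raw_ab = euclidean(a, b)
raw_ac = euclidean(a, c)

# On the desirability scale both criteria contribute comparably.
da, db, dc = (to_desirabilities(o, specs) for o in (a, b, c))
scaled_ab = euclidean(da, db)
scaled_ac = euclidean(da, dc)

print(f"raw:    d(a,b)={raw_ab:.2f}  d(a,c)={raw_ac:.2f}")
print(f"scaled: d(a,b)={scaled_ab:.2f}  d(a,c)={scaled_ac:.2f}")
```

In this toy setting, objects a and c look almost identical on the raw scale only because the second criterion has a small numeric range. After the transformation, their large difference in that criterion is reflected in the distance.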
