Presented by: Cinzia Viroli
Thu 3 Jul, 11:00 am - 12:00 pm

Abstract: A distance-based classifier built on component-wise quantiles is defined: an observation is classified according to a sum of appropriately weighted distances between its components and the corresponding within-class quantiles. The method is inspired by the recent median-based classifiers (Hall et al., 2009), which are a robust version of the conventional Euclidean distance-based classifiers for potentially high-dimensional data.
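
One way to write such a rule down, as a sketch only: the abstract does not specify the distance or the weights, so the check-loss distance $\rho_\theta$, the weights $w_j$, and all other symbols below are illustrative assumptions rather than the speaker's notation.

```latex
% Assumed notation: x = (x_1, ..., x_p) is the observation, k = 1, ..., K indexes classes,
% q_{kj}(\theta) is the within-class \theta-quantile of component j in class k,
% and w_j >= 0 are component weights (left unspecified here).
\[
  \rho_\theta(u) = u\,\bigl(\theta - \mathbf{1}\{u < 0\}\bigr), \qquad 0 < \theta < 1,
\]
\[
  D_k(x;\theta) = \sum_{j=1}^{p} w_j \,\rho_\theta\bigl(x_j - q_{kj}(\theta)\bigr),
  \qquad
  \hat{c}(x) = \arg\min_{k} D_k(x;\theta).
\]
```

Under this assumed distance, $\rho_{1/2}(u) = |u|/2$, so $\theta = 1/2$ with equal weights gives an $L_1$ distance to the component-wise class medians, up to a constant factor.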

The optimal quantiles can be chosen by minimizing the misclassification error in the training sample. This choice is shown to be consistent, as the sample size increases, for the classification rule with asymptotically optimal quantiles. Moreover, under some assumptions, the probability of correct classification converges to one as the dimensionality increases. The role of the skewness of the variables involved is discussed, which leads to an improved classifier. In a comprehensive simulation study and on a real data set from chemistry (classification of bioaerosols), the optimal quantile classifier performs very well compared with other classifiers.
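
The selection step described above can be illustrated with a minimal Python sketch. It is not the speaker's implementation: it assumes a quantile check-loss distance, equal weights for all components, a single quantile level shared by all components, and a simple grid search. The function names (`check_loss`, `class_quantiles`, `predict`, `choose_theta`) and the default grid are hypothetical.

```python
import numpy as np


def check_loss(u, theta):
    """Assumed distance: theta * u for u >= 0 and (theta - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * (theta - (u < 0).astype(float))


def class_quantiles(X, y, theta):
    """Component-wise theta-quantiles within each class; returns (classes, Q) with Q of shape (K, p)."""
    classes = np.unique(y)
    Q = np.stack([np.quantile(X[y == c], theta, axis=0) for c in classes])
    return classes, Q


def predict(X, classes, Q, theta):
    """Assign each row of X to the class with the smallest summed component-wise distance."""
    diffs = X[:, None, :] - Q[None, :, :]          # shape (n, K, p)
    dist = check_loss(diffs, theta).sum(axis=2)    # equal weights (assumption)
    return classes[np.argmin(dist, axis=1)]


def choose_theta(X, y, grid=None):
    """Pick the grid value of theta with the lowest training misclassification rate."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)         # illustrative grid
    best_theta, best_err = None, np.inf
    for theta in grid:
        classes, Q = class_quantiles(X, y, theta)
        err = np.mean(predict(X, classes, Q, theta) != y)
        if err < best_err:
            best_theta, best_err = float(theta), float(err)
    return best_theta, best_err
```

Given a training matrix `X` of shape `(n, p)` and a label vector `y`, `choose_theta(X, y)` returns the selected quantile level together with its training error. Because the error is measured on the same sample used to estimate the quantiles, it is optimistic; a held-out sample would be needed to estimate the out-of-sample error.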
