Explainable Structuring and Discovery of Relevant Cases for Exploration of High-Dimensional Data
Abstract
Data described by many features are challenging for domain experts because they are difficult to manipulate, explore, and visualize. As the number of features grows, a phenomenon called the "curse of dimensionality" arises: the data become sparser and distance metrics lose relevance because most elements of the dataset become nearly equidistant. As a result, traditional machine learning algorithms lose efficiency. Moreover, many state-of-the-art approaches act as black boxes from the user's point of view and cannot explain their results. We propose an instance-based method that structures datasets around important elements called exemplars. The similarity measure used by our approach is less sensitive to high dimensionality and yields results that are both explainable and interpretable, which are important properties for decision-making tools such as recommender systems. The described algorithm relies on exemplar theory to provide a data exploration tool suited to the reasoning used by experts in various fields. We apply our method to synthetic and real-world datasets and compare the results with recommendations made using a nearest-neighbor approach.
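The claim that elements become nearly equidistant in high dimensions can be illustrated with a small numerical experiment. The sketch below is not taken from the paper: the function name `relative_contrast`, the uniform random data, and the choice of Euclidean distance are all assumptions made for illustration. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance.

```python
import numpy as np


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for distances from one random query
    to n_points random points drawn uniformly in the unit hypercube.

    Hypothetical helper for illustration only; not the paper's method.
    """
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    # Euclidean distance from the query to every point.
    dists = np.linalg.norm(points - query, axis=1)
    return float((dists.max() - dists.min()) / dists.min())


# Compare a low-dimensional and a high-dimensional setting.
low = relative_contrast(n_points=500, n_dims=2)
high = relative_contrast(n_points=500, n_dims=1000)
print(f"relative contrast in    2 dimensions: {low:.3f}")
print(f"relative contrast in 1000 dimensions: {high:.3f}")
```

Under these assumptions the contrast is much smaller in 1000 dimensions than in 2, meaning the nearest and farthest neighbors are almost equally far away, which is why a plain nearest-neighbor ranking carries less information as dimensionality grows.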
Origin | Files produced by the author(s) |
---|