DI-UMONS: Institutional Repository of the University of Mons

2018-07-10 - Conference/Presentation - oral communication - English - 0 page(s)

Itani Sarah, Lecron Fabian, Fortemps Philippe, "A Novel Training Algorithm to Build Decision Trees for Anomaly Detection" in EURO2018 - 29th European Conference on Operational Research, Valencia, Spain, 2018

  • CREF codes: Mathematical models for decision support (DI1151), Operational research (DI1150)
  • UMONS research units: Management of Technological Innovation (F113), Mathematics and Operational Research (F151)
  • UMONS institutes: Research Institute for Information Technologies and Computer Sciences (InforTech)

Abstract(s):

(English) Anomaly detection is a widespread problem in data science. One-Class Classification (OCC) is an approach that addresses this problem in various applications, e.g. fraud detection, medical diagnosis, and monitoring. OCC training algorithms are run on a set of training instances that belong to a single class, possibly with a few additional outliers. An OCC methodology thus makes it possible to handle problems associated with poor data availability. One-Class Support Vector Machine (OCSVM) is one of the most popular OCC methods. Though high-performing, this predictive technique can hardly satisfy certain specific needs, such as those of diagnosis aid. In that context, predictions should be explained and justified, so interpretability is an important modeling goal. In the present work, we propose a one-class learning algorithm for building decision trees. These models were originally designed for multi-class prediction; we propose an adaptation to the one-class setting. The original aspect of our proposal lies in the division mechanism, which is based here on density information and does not require physically generating outliers as representatives of a second class. Our proposal, which is readable and interpretable, performs favorably in comparison with the most common OCC methods (e.g. OCSVM) and shows robustness to high-dimensional data.
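
The abstract does not give the details of the division mechanism, so the following is only a minimal sketch of the general idea of density-based, axis-parallel division on a single class, and should not be read as the authors' algorithm. It assumes a Gaussian kernel density estimate per attribute and keeps the intervals where the estimated density exceeds a fraction of its peak; the names `kde`, `dense_intervals`, `fit` and `is_inlier`, and the parameters `bandwidth`, `rel_threshold` and `grid_size`, are all hypothetical choices made for illustration.

```python
import math

def kde(values, x, bandwidth):
    """Gaussian kernel density estimate of `values`, evaluated at x."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

def dense_intervals(values, bandwidth=0.5, rel_threshold=0.1, grid_size=200):
    """Intervals (lo, hi) where the density is >= rel_threshold * peak density.

    Assumption: the density is evaluated on a regular grid, so interval
    bounds are only accurate up to the grid step.
    """
    lo = min(values) - 3 * bandwidth
    hi = max(values) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    dens = [kde(values, g, bandwidth) for g in grid]
    cut = rel_threshold * max(dens)

    intervals, start, prev = [], None, grid[0]
    for g, d in zip(grid, dens):
        if d >= cut and start is None:
            start = g                        # entering a dense region
        elif d < cut and start is not None:
            intervals.append((start, prev))  # leaving a dense region
            start = None
        prev = g
    if start is not None:
        intervals.append((start, grid[-1]))
    return intervals

def fit(rows, **kwargs):
    """One list of dense intervals per attribute (column) of the training set."""
    return [dense_intervals(list(col), **kwargs) for col in zip(*rows)]

def is_inlier(model, row):
    """A row is accepted only if every attribute falls in a dense interval."""
    return all(any(lo <= x <= hi for lo, hi in ivs) for ivs, x in zip(model, row))

# Training instances from a single (target) class; no outliers are generated.
train = [(1.0, 10.0), (1.1, 10.2), (0.9, 9.8), (1.2, 10.1), (0.8, 9.9)]
model = fit(train, bandwidth=0.3)
print(is_inlier(model, (1.0, 10.0)))  # inside the dense region of both attributes
print(is_inlier(model, (1.0, 20.0)))  # second attribute far from the training data
```

Each retained interval reads as a pair of threshold tests on one attribute, which is the kind of rule a decision tree exposes; an actual tree would apply such divisions recursively within each sub-region rather than once per attribute as done here.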