A non-asymptotic theory for model selection in a high-dimensional mixture of experts via joint rank and variable selection - IRT SystemX
Conference paper, Year: 2023

A non-asymptotic theory for model selection in a high-dimensional mixture of experts via joint rank and variable selection

Abstract

We are motivated by the problem of identifying potentially nonlinear regression relationships between high-dimensional outputs and high-dimensional inputs of heterogeneous data. This requires regression, clustering, and model selection simultaneously. In this framework, we apply mixture of experts models, which are among the most popular ensemble learning techniques developed in the field of neural networks. In particular, we consider a more general case of mixture of experts models characterized by multiple Gaussian experts whose means are polynomials of the input variables and whose covariance matrices have block-diagonal structures. More specifically, each expert is weighted by a gating network that is a softmax function of a polynomial of the input variables. These models require several hyper-parameters, including the number of mixture components, the complexity of the softmax gating networks and Gaussian mean experts, and the hidden block-diagonal structures of the covariance matrices. We provide a non-asymptotic theory for model selection of such complex hyper-parameters using the slope heuristic approach in a penalized maximum likelihood estimation framework. Specifically, we establish a non-asymptotic risk bound on the penalized maximum likelihood estimation, which takes the form of an oracle inequality, given lower bound assumptions on the penalty function.
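The model class described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function and parameter name below (`poly_features`, `moe_density`, `select_model`, `gate_w`, `expert_w`, `expert_blocks`, `kappa`) is hypothetical, and the polynomial features omit cross terms for brevity. It evaluates a conditional density of the form "softmax gate times Gaussian expert", using the fact that a block-diagonal covariance lets each block contribute to the log-density independently. The penalty constant `kappa` is simply passed in here, whereas the slope heuristic would calibrate it from the data.

```python
import math

def poly_features(x, degree):
    """[1, x_j, x_j**2, ..., x_j**degree] per input j; no cross terms (simplification)."""
    return [1.0] + [xj ** d for d in range(1, degree + 1) for xj in x]

def dot(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))

def softmax(scores):
    m = max(scores)                      # subtract the max for numerical stability
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [v / total for v in e]

def cholesky(a):
    """Lower-triangular Cholesky factor of a small symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def block_log_pdf(resid, cov):
    """Log density of N(0, cov) at resid, for a single covariance block."""
    l = cholesky(cov)
    z = []
    for i in range(len(resid)):          # forward substitution: solve l z = resid
        z.append((resid[i] - sum(l[i][k] * z[k] for k in range(i))) / l[i][i])
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(len(resid)))
    return -0.5 * (sum(v * v for v in z) + log_det + len(resid) * math.log(2 * math.pi))

def moe_density(x, y, gate_w, expert_w, expert_blocks, degree=2):
    """Conditional density of y given x under a softmax-gated Gaussian mixture of experts.

    gate_w: one weight vector per expert (gating polynomial coefficients).
    expert_w: per expert, one weight vector per output coordinate (mean polynomial).
    expert_blocks: per expert, a list of covariance blocks whose sizes sum to len(y).
    """
    phi = poly_features(x, degree)
    gates = softmax([dot(w, phi) for w in gate_w])
    density = 0.0
    for g, w_k, blocks in zip(gates, expert_w, expert_blocks):
        resid = [yi - dot(w, phi) for yi, w in zip(y, w_k)]
        log_pdf, start = 0.0, 0
        for cov in blocks:               # blocks of a block-diagonal covariance factorize
            size = len(cov)
            log_pdf += block_log_pdf(resid[start:start + size], cov)
            start += size
        density += g * math.exp(log_pdf)
    return density

def select_model(neg_log_liks, dims, kappa):
    """Index of the candidate minimizing negative log-likelihood + kappa * dimension."""
    crit = [nll + kappa * d for nll, d in zip(neg_log_liks, dims)]
    return min(range(len(crit)), key=crit.__getitem__)
```

In this sketch, a candidate model is a choice of number of experts, polynomial degree, and block partition; `select_model` compares candidates that have already been fitted, given their negative log-likelihoods and parameter counts.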
Main file
SGaBloME.pdf (536.77 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03984011, version 1 (11-02-2023)
hal-03984011, version 2 (24-10-2023)
hal-03984011, version 3 (09-11-2023)

Identifiers

  • HAL Id: hal-03984011, version 1

Cite

Trungtin Nguyen, Dung Ngoc Nguyen, Hien Duy Nguyen, Faicel Chamroukhi. A non-asymptotic theory for model selection in a high-dimensional mixture of experts via joint rank and variable selection. AJCAI Australasian Joint Conference on Artificial Intelligence 2023, Nov 2023, Brisbane, Australia. ⟨hal-03984011v1⟩
419 views
141 downloads