Powered by OpenAIRE graph
Hal
Article . 2019
Data sources: Hal
https://dx.doi.org/10.48550/ar...
Article . 2018
License: arXiv Non-Exclusive Distribution
Data sources: Datacite

Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models

Authors: Chamroukhi, Faicel; Huynh, Bao-Tuyen

Abstract

Mixtures of experts (MoE) are effective models for heterogeneous data in many statistical learning problems, including regression, clustering, and classification. They are generally fitted by maximum likelihood via the EM algorithm, which makes their application to high-dimensional problems difficult. We consider the problem of estimation and variable selection in mixture-of-experts models, and propose a regularized maximum likelihood estimation approach that encourages sparse solutions for heterogeneous regression data models with a potentially large number of predictors. Unlike state-of-the-art methods for mixtures of experts, the proposed regularization does not rely on an approximate penalty and requires no thresholding to recover the sparse solution. Sparse parameter estimation relies on regularizing the maximum likelihood estimator for both the experts and the gating functions, and is implemented by two versions of a hybrid EM algorithm. The M-step, carried out by coordinate ascent or by an MM algorithm, avoids matrix inversion in the update, which makes scaling the algorithm up promising. An experimental study demonstrates the good performance of the proposed approach.
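
As a hedged illustration of what a regularized maximum likelihood objective of this kind can look like, the sketch below writes a penalized log-likelihood for a Gaussian mixture of experts with softmax gating. The abstract does not state the exact penalty or parameterization, so the $\ell_1$ (lasso-type) penalties, the Gaussian experts, and all symbols ($\beta_k$, $w_k$, $\sigma_k$, $\lambda_k$, $\gamma_k$) are assumptions made for illustration only.

```latex
\begin{align*}
PL(\theta) &= \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k(x_i; w)\,
  \mathcal{N}\!\left(y_i;\ \beta_{k0} + x_i^\top \beta_k,\ \sigma_k^2\right)
  - \sum_{k=1}^{K} \lambda_k \lVert \beta_k \rVert_1
  - \sum_{k=1}^{K-1} \gamma_k \lVert w_k \rVert_1, \\
\pi_k(x; w) &= \frac{\exp\!\left(w_{k0} + x^\top w_k\right)}
  {\sum_{l=1}^{K} \exp\!\left(w_{l0} + x^\top w_l\right)},
  \qquad (w_{K0}, w_K) = (0, 0).
\end{align*}
```

Under these assumptions, the first penalty acts on the expert regression coefficients and the second on the gating coefficients, with the intercepts left unpenalized; the $\ell_1$ norm is kept exact rather than replaced by a smooth approximation.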

Mixtures of Experts (MoE) are successful models for heterogeneous data in many statistical learning problems, including regression, clustering, and classification. They are generally fitted by maximum likelihood estimation via the well-known EM algorithm, and their application to high-dimensional problems therefore remains challenging. We consider the problem of fitting and feature selection in MoE models, and propose a regularized maximum likelihood estimation approach that encourages sparse solutions for heterogeneous regression data models with potentially high-dimensional predictors. Unlike state-of-the-art regularized MLE for MoE, the proposed approach does not require an approximation of the penalty function. We develop two hybrid EM algorithms: an Expectation-Majorization-Maximization (EM/MM) algorithm, and an EM algorithm with a coordinate ascent algorithm. The proposed algorithms automatically yield sparse solutions without thresholding, and avoid matrix inversion by allowing univariate parameter updates. An experimental study shows the good performance of the algorithms in terms of recovering the actual sparse solutions, parameter estimation, and clustering of heterogeneous regression data.
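
To make the idea of univariate updates that need neither matrix inversion nor thresholding more concrete, here is a minimal Python sketch of cyclic coordinate updates for a weighted lasso regression. It is not the authors' algorithm: it assumes the M-step for a single expert reduces to a weighted least-squares problem with an $\ell_1$ penalty, where the weights stand in for E-step responsibilities. The function names `soft_threshold` and `weighted_lasso_cd` and the simulated data are hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward 0 by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def weighted_lasso_cd(X, y, w, lam, n_sweeps=200, tol=1e-10):
    """Minimize 0.5 * sum_i w[i] * (y[i] - b0 - X[i].b)**2 + lam * sum_j |b[j]|
    by cyclic coordinate updates (assumed stand-in for one expert's M-step).

    X is a list of rows, y a list of responses, w a list of nonnegative
    weights. Each update is a scalar closed form, so no matrix is inverted.
    """
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    resid = list(y)  # residuals at the starting point b0 = 0, b = 0
    w_sum = sum(w)
    col_norm = [sum(w[i] * X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        # Unpenalized intercept: shift by the weighted mean of the residuals.
        d0 = sum(w[i] * resid[i] for i in range(n)) / w_sum
        b0 += d0
        resid = [r - d0 for r in resid]
        max_change = abs(d0)
        for j in range(p):
            if col_norm[j] == 0.0:
                continue
            # Weighted correlation of column j with the partial residual.
            rho = sum(w[i] * X[i][j] * resid[i] for i in range(n)) + col_norm[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_norm[j]
            d = new_bj - b[j]
            if d != 0.0:
                resid = [resid[i] - d * X[i][j] for i in range(n)]
                b[j] = new_bj
                max_change = max(max_change, abs(d))
        if max_change < tol:
            break
    return b0, b


if __name__ == "__main__":
    # Hypothetical simulated data: only predictors 0 and 2 are relevant.
    rng = random.Random(0)
    true_b = [2.0, 0.0, -1.5] + [0.0] * 7
    X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
    y = [0.5 + sum(x * t for x, t in zip(row, true_b)) + rng.gauss(0, 0.01) for row in X]
    w = [rng.uniform(0.2, 1.0) for _ in range(200)]
    b0, b = weighted_lasso_cd(X, y, w, lam=10.0)
    print("intercept:", b0)
    print("nonzero coefficients:", [j for j, v in enumerate(b) if v != 0.0])
```

In this sketch the soft-thresholding step sets irrelevant coefficients to exactly zero during the update itself, so no post hoc thresholding is needed, and every update costs one pass over a single column of the data.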

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (stat.ML), [STAT.CO] Statistics [stat]/Computation [stat.CO], Statistics - Computation, Machine Learning (cs.LG), Methodology (stat.ME), MM algorithm, Model-based clustering, Mixture of experts, Statistics - Machine Learning, Variable selection, Regularization, EM algorithm, Statistics - Methodology, Computation (stat.CO), Coordinate ascent, [STAT] Statistics [stat], High-dimensional data, 62-XX, 62H30, 62G05, 62G07, 62H12, 62-07, 62J07, 68T05, Feature selection

  • BIP! impact indicators
    selected citations: 0 (derived from selected sources; an alternative to the "Influence" indicator)
    popularity: Average (the "current" impact or attention of the article in the research community at large, based on the underlying citation network)
    influence: Average (the overall impact of the article in the research community at large, based on the underlying citation network, diachronically)
    impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)