
Sparse Multinomial Logistic Regression (SMLR) is widely used in the field of image classification, multi-class object recognition, and so on, because it has the function of embedding feature selection during classification. However, it cannot meet the time and memory requirements for processing large-scale data. We have reinvestigated the classification accuracy and running efficiency of the algorithm for solving SMLR problems using the Alternating Direction Method of Multipliers (ADMM), which is called fast SMLR (FSMLR) algorithm in this paper. By reformulating the optimization problem of FSMLR, we transform the serial convex optimization problem to the distributed convex optimization problem, i.e., global consensus problem and sharing problem. Based on the distributed optimization problem, we propose two distribute parallel SMLR algorithms, sample partitioning-based distributed SMLR (SP-SMLR), and feature partitioning-based distributed SMLR (FP-SMLR), for a large-scale sample and large-scale feature datasets in big data scenario, respectively. The experimental results show that the FSMLR algorithm has higher accuracy than the original SMLR algorithm. The big data experiments show that our distributed parallel SMLR algorithms can scale for massive samples and large-scale features, with high precision. In a word, our proposed serial and distribute SMLR algorithms outperform the state-of-the-art algorithms.
distributed parallel, Alternating Direction Method of Multipliers, sparse multinomial logistic regression, big data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
distributed parallel, Alternating Direction Method of Multipliers, sparse multinomial logistic regression, big data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 3 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
