
In this paper, we describe our approach to the 2024 MuSe-Perception challenge, which aims to automatically recognize and quantify social attributes of CEOs from multimodal data. We investigated five approaches: (1) optimizing basic RNN and Transformer encoder models, (2) adopting the xLSTM architecture for improved long-term dependency modeling, (3) extracting text features using pre-trained language models, (4) grouping similar social attributes for joint learning, and (5) incorporating novel fusion methods. Experimental results on the development set showed that different approaches excelled for different attributes, with multimodal methods generally outperforming unimodal methods, and text feature extraction notably improving the prediction of a subset of the agentive attributes. On the test set, our most effective strategy achieved a mean Pearson correlation coefficient of 0.35 by combining the highest values from four different approaches. This outcome points to a substantial gap between development and test set performance, indicating limited model generalization and calling for further research to improve the accuracy and reliability of methods for social attribute prediction.
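To make the evaluation metric and the combination strategy concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "combining the highest values" means choosing, for each attribute, the approach with the highest development-set Pearson correlation, and it reports the mean of those per-attribute correlations. The approach names (`audio_rnn`, `text_plm`), the attribute names, and all numbers are hypothetical toy values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_best_per_attribute(dev_preds, dev_labels):
    """Pick, per attribute, the approach with the highest dev-set Pearson r.

    dev_preds:  {approach: {attribute: [predictions]}}
    dev_labels: {attribute: [labels]}
    Returns ({attribute: chosen approach}, mean of the chosen r values).
    """
    choice, scores = {}, {}
    for attr, y in dev_labels.items():
        best = max(dev_preds, key=lambda name: pearson(dev_preds[name][attr], y))
        choice[attr] = best
        scores[attr] = pearson(dev_preds[best][attr], y)
    return choice, sum(scores.values()) / len(scores)


# Hypothetical toy data: two attributes, two approaches, four samples each.
dev_labels = {
    "confident": [0.1, 0.4, 0.6, 0.9],
    "friendly": [0.2, 0.5, 0.3, 0.8],
}
dev_preds = {
    "audio_rnn": {
        "confident": [0.2, 0.3, 0.7, 0.8],
        "friendly": [0.5, 0.4, 0.6, 0.5],
    },
    "text_plm": {
        "confident": [0.5, 0.4, 0.5, 0.6],
        "friendly": [0.1, 0.6, 0.3, 0.9],
    },
}

choice, mean_r = select_best_per_attribute(dev_preds, dev_labels)
print(choice, round(mean_r, 3))
```

Because the selection in this sketch is made on the same development data that is used for scoring, the resulting mean is optimistic by construction, which is one way a gap between development and test scores can arise.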
Technology and Engineering, Fusion methods, Attribute grouping, Machine learning, Multimodal analysis, Social attribute prediction, Text features
