
We consider the task of predicting the political views of VKontakte users from the textual data posted on their personal pages. The analysis of social media is an increasingly important area of digital humanities, opinion mining, and natural language processing, since social networks now contain a large amount of meaningful, freely available data describing the views and moods of society. First, we analyzed information from the pages of users in several categories identified on the basis of political polarization on VKontakte. Personal profiles contain categorical fields with textual values and text fields that users fill in free form. We encoded the categorical features as one-hot numeric arrays and used the Bag-of-Words model to represent the free-form text. Next, we applied a simple machine learning classifier, a linear Support Vector Machine, to the textual data of the user page. We show that the classifier separates groups of social media users with opposing political views better than it separates adherents of closer political ideologies.
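The feature encoding and classifier described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the field names `life_main` and `about`, the toy profiles, and the labels are hypothetical, and a plain hinge-loss subgradient trainer without an intercept stands in for whatever linear SVM solver was actually used.

```python
from collections import Counter

# Hypothetical toy profiles: field names, texts, and labels are illustrative only.
profiles = [
    {"life_main": "career", "about": "freedom reform"},
    {"life_main": "career", "about": "reform rights"},
    {"life_main": "family", "about": "tradition order"},
    {"life_main": "family", "about": "order stability"},
]
labels = [1, 1, -1, -1]  # two hypothetical groups with opposing views


def one_hot(value, categories):
    """Encode a categorical value as a one-hot vector over known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def bag_of_words(text, vocabulary):
    """Count occurrences of each vocabulary word in a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def featurize(profile, categories, vocabulary):
    """Concatenate the one-hot categorical part and the bag-of-words part."""
    return one_hot(profile["life_main"], categories) + bag_of_words(
        profile["about"], vocabulary
    )


def train_linear_svm(X, y, epochs=20, lr=0.1, reg=0.01):
    """Linear SVM (no intercept) trained by subgradient descent on the
    L2-regularized hinge loss. Hyperparameters are arbitrary assumptions."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink weights (gradient step on the L2 penalty).
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1


# Build the category list and vocabulary from the toy data.
categories = sorted({p["life_main"] for p in profiles})
vocabulary = sorted({tok for p in profiles for tok in p["about"].lower().split()})

X = [featurize(p, categories, vocabulary) for p in profiles]
w = train_linear_svm(X, labels)
predictions = [predict(w, x) for x in X]
print(predictions)
```

In practice, a library implementation of one-hot encoding, Bag-of-Words vectorization, and a linear SVM would normally replace these hand-written helpers; the sketch only shows how the two feature types are concatenated into one vector before a linear decision rule is learned.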
opinion mining, Engineering, Sociology, political polarization, social networks, Computational Engineering, Social and Behavioral Sciences, machine learning, Politics and Social Change, VKontakte, social network
