
handle: 10230/35646
Emojis are very common in social media and understanding their underlying semantics is of great interest from a Natural Language Processing point of view. In this work, we investigate emoji prediction in short text messages using a multi-task pipeline that simultaneously predicts emojis, their categories and sub-categories. The categories are either manually predefined in the unicode standard or automatically obtained by clustering over word embeddings. We show that using this categorical information adds meaningful information, thus improving the performance of emoji prediction task. We systematically analyze the performance of the emoji prediction task by varying the number of training samples and also do a qualitative analysis by using attention weights from the prediction task.
This work was done when Francesco B. interned at Snap Inc. Francesco B. acknowledge support also from the TUNER project (TIN2015-65308-C5-5-R, MINECO/FEDER, UE) and the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502).
Comunicació presentada a la 12th International AAAI Conference on Web and Social Media (ICWSM 2018) celebrada a Stanford (EEUU) el 25 de juny de 2018.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
