The perfect solution for detecting sarcasm in tweets #not.

To avoid a sarcastic message being understood in its unintended literal meaning, in microtexts such as messages on Twitter.com sarcasm is often explicitly marked with the hashtag ‘#sarcasm’. We collected a training corpus of about 78 thousand Dutch tweets with this hashtag. Assuming that the human labeling is correct (annotation of a sample indicates that about 85% of these tweets are indeed sarcastic), we train a machine learning classifier on the harvested examples, and apply it to a test set of a day’s stream of 3.3 million Dutch tweets. Of the 135 explicitly marked tweets on this day, we detect 101 (75%) when we remove the hashtag. We annotate the top of the ranked list of tweets most likely to be sarcastic that do not have the explicit hashtag. 30% of the top-250 ranked tweets are indeed sarcastic. Analysis shows that sarcasm is often signalled by hyperbole, using intensifiers and exclamations; in contrast, non-hyperbolic sarcastic messages often receive an explicit marker. We hypothesize that explicit markers such as hashtags are the digital extralinguistic equivalent of nonverbal expressions that people employ in live interaction when conveying sarcasm.

4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA-2013), 14 June 2013

Contains fulltext: 112949.pdf (Publisher’s version) (Open Access)

Country

Netherlands

Related Organizations

Radboud University Nijmegen
Netherlands

Keywords

Style and Persuasive Power: Language Intensity, Language in Society, The changing dynamics of news (project of: ADNEXT (Adaptive Information Extraction over Time (is project of COMIC))), ADNEXT (Adaptive Information Extraction over Time), Persuasive Communication, Nederlab, Language & Speech Technology

