Publication . Article . Conference object . Preprint . 2018

Neural Based Statement Classification for Biased Language

Christoph Hube; Besnik Fetahu
Open Access
Biased language commonly occurs around topics of a controversial nature, stirring disagreement between the parties involved in a discussion. This is because the understanding and use of language, and of specific phrases in particular, is cohesive within a group but not across groups. In collaborative environments or environments where impartial language is desired (e.g. Wikipedia, news media), statements and the language therein should represent the involved parties equally and be neutrally phrased. Biased language is introduced through the presence of inflammatory words or phrases, or through statements that may be incorrect or one-sided, thus violating such consensus. In this work, we focus on the specific case of phrasing bias, which may be introduced through specific inflammatory words or phrases in a statement. For this purpose, we propose an approach that relies on recurrent neural networks to capture the inter-dependencies between the words in a phrase that introduces bias. We perform a thorough experimental evaluation, in which we show the advantages of a neural based approach over competitors that rely on word lexicons and other hand-crafted features for detecting biased language. We are able to distinguish biased statements with a precision of P=0.92, significantly outperforming baseline models with an improvement of over 30%. Finally, we release the largest corpus of statements annotated for biased language.
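To make the comparison in the abstract concrete, the following is a minimal sketch of the kind of word-lexicon baseline it mentions, together with the precision metric it reports. It is not the authors' implementation: the lexicon `BIAS_LEXICON`, the helper names, and the toy statements are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-lexicon of inflammatory words; not taken from the paper.
BIAS_LEXICON = {"notorious", "so-called", "regime", "radical", "disastrous"}

def tokenize(statement):
    """Lowercase and split into word tokens, keeping internal hyphens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", statement.lower())

def lexicon_predict(statement, lexicon=BIAS_LEXICON):
    """Flag a statement as biased if any token appears in the lexicon."""
    return any(token in lexicon for token in tokenize(statement))

def precision(predictions, labels):
    """Precision = true positives / all predicted positives."""
    true_pos = sum(1 for p, y in zip(predictions, labels) if p and y)
    pred_pos = sum(1 for p in predictions if p)
    return true_pos / pred_pos if pred_pos else 0.0

# Toy examples invented for illustration: (statement, is_biased)
examples = [
    ("The so-called reform was passed in 2010.", True),
    ("The reform was passed in 2010.", False),
    ("The regime announced new elections.", True),
    ("The radical of a number is its root.", False),  # lexicon false positive
    ("He shamelessly ignored the committee.", True),  # missed: not in lexicon
]

predictions = [lexicon_predict(text) for text, _ in examples]
labels = [label for _, label in examples]
score = precision(predictions, labels)
```

The fourth example shows why a context-free lexicon lookup can lose precision: "radical" is flagged even in a neutral, mathematical sense. Modelling the dependencies between words in a phrase, as the abstract proposes with recurrent neural networks, is one way to address this kind of error.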
The Twelfth ACM International Conference on Web Search and Data Mining, February 11-15, 2019, Melbourne, VIC, Australia
Subjects by Vocabulary

Microsoft Academic Graph classification: Word (computer architecture); News media; Artificial intelligence; Group cohesiveness; Natural language processing; Phrase; Focus (linguistics); Computer science; Order (exchange); Statement (computer science)


Computer Science - Computation and Language, Computation and Language (cs.CL), FOS: Computer and information sciences


Funded by
Foundations for Temporal Retrieval, Exploration and Analytics in Web Archives
  • Funder: European Commission (EC)
  • Project Code: 339233
  • Funding stream: FP7 | SP2 | ERC
AFEL - Analytics For Everyday Learning
  • Funder: European Commission (EC)
  • Project Code: 687916
  • Funding stream: H2020 | RIA
DARIAH ERIC Sustainability Refined
  • Funder: European Commission (EC)
  • Project Code: 731081
  • Funding stream: H2020 | CSA