
doi: 10.32388/9a17f4
In this paper, we show that fiction vs. non-fiction genre classification can be achieved with very high accuracy using simple readability metrics, which linguists have studied extensively for many decades. In addition, we explore the BERT model for this classification and find that, although it also achieves very high accuracy with the same amount of training data, its results are very hard to interpret. We tried many adversarial attacks to break the fine-tuned BERT model but found it to be quite resilient.
