
Artifacts for "Can Machine Learning Support the Selection of Studies for Systematic Literature Review Updates?".

Files used to answer RQ1:
  RQ1-RF-predictions.csv
  RQ1-RQ3-best-configuration-RF.csv

Files used to answer RQ2:
  RQ2-SVM-predictions.csv
  RQ2-best-configuration-SVM.csv

Files used to answer RQ3:
  RQ3-RF-normalized-predictions.csv
  RQ1-RQ3-best-configuration-RF.csv

The file assessment-team-votes.csv contains the title of each study, a boolean indicating whether it was included, and the individual marks of each reviewer before the agreement criteria were applied.

The .bib files used in our experiment are available at:

Our testing set: 'Testing set - Excluded.bib' (513 studies) and 'Testing set - Included.bib' (38 studies). All 551 of these studies were obtained from the actual SLR update.

Our training set: 'Training set - Excluded.bib' (83 studies, obtained by performing backward snowballing on the original SLR) and 'Training set - Included.bib' (45 studies, all of the studies that were included in the original SLR).

All of our code is available in the .zip file. Besides our pipeline, there are also some Jupyter notebooks in code/analysis illustrating how we answered each of our questions.
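As a sanity check on the set sizes stated above, the entries in each .bib file can be counted after download. The following is only a minimal sketch and is not part of the authors' pipeline: the helper names (count_bib_entries, count_bib_file, report_counts) are hypothetical, and the line-start heuristic assumes each BibTeX entry begins on its own line with "@type{". The expected counts are the ones given in this description.

```python
import os
import re

# Heuristic (assumption): every BibTeX entry starts on its own line as "@type{".
ENTRY_START = re.compile(r"^[ \t]*@(\w+)[ \t]*\{", re.MULTILINE)

# BibTeX block types that are not bibliographic records.
NON_RECORD_TYPES = {"comment", "string", "preamble"}

# Study counts as stated in the artifact description.
EXPECTED_COUNTS = {
    "Testing set - Excluded.bib": 513,
    "Testing set - Included.bib": 38,
    "Training set - Excluded.bib": 83,
    "Training set - Included.bib": 45,
}


def count_bib_entries(bib_text: str) -> int:
    """Count bibliographic entries in a BibTeX string."""
    return sum(
        1
        for match in ENTRY_START.finditer(bib_text)
        if match.group(1).lower() not in NON_RECORD_TYPES
    )


def count_bib_file(path: str) -> int:
    """Count bibliographic entries in a BibTeX file on disk."""
    with open(path, encoding="utf-8") as handle:
        return count_bib_entries(handle.read())


def report_counts(directory: str) -> dict:
    """Return {file name: (found, expected)} for the four .bib files."""
    return {
        name: (count_bib_file(os.path.join(directory, name)), expected)
        for name, expected in EXPECTED_COUNTS.items()
    }
```

Calling report_counts with the directory that holds the downloaded .bib files returns the found and expected count for each file, so any mismatch is easy to spot. The two testing files should sum to 551 studies and the two training files to 128.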
