Downloads provided by UsageCounts
artifact_detection
A tool for NLP tasks on textual bug reports. It automatically classifies text, on a line-by-line basis, into natural language (e.g. English in the contained datasets) and non-natural-language portions (e.g. stack traces, code snippets, log outputs, file listings, and URLs). This repository contains the Python implementation of a machine learning classifier model and basic scripts for automated training set creation from GitHub issue tickets. It also provides a scikit-learn transformer implementation that wraps pretrained models, ready to be used as a preprocessing step. The datasets consist of issue tickets and documentation files mined from C++, Java, JavaScript, PHP, and Python projects hosted on GitHub. A detailed discussion of this model can be found in "Detecting non-natural language artifacts for de-noising bug reports" by Hirsch T. and Hofer B. (in review). This project is also available on GitHub: https://github.com/AmadeusBugProject/artifact_detection
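As a rough illustration of the line-by-line preprocessing idea described above, the following sketch shows a transformer-style class that drops lines judged to be non-natural language. It is not the project's implementation: the class name `ArtifactRemover`, the function `is_artifact`, and the regular expressions in `ARTIFACT_PATTERNS` are all hypothetical, and the hand-written heuristics merely stand in for the trained classifier that the repository provides. The `fit`/`transform` methods follow the scikit-learn transformer convention, but the sketch has no dependencies beyond the standard library.

```python
import re

# ASSUMPTION: these regexes are illustrative stand-ins for the trained
# machine learning classifier shipped with the project; they are not part of it.
ARTIFACT_PATTERNS = [
    re.compile(r"^\s*Traceback \(most recent call last\):"),    # Python traceback header
    re.compile(r"^\s*File \".*\", line \d+"),                   # Python traceback frame
    re.compile(r"^\s*at\s+[\w.$]+\(.*\)\s*$"),                  # Java stack frame
    re.compile(r"^\s*https?://\S+\s*$"),                        # line containing only a URL
    re.compile(r"^\s*\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"),  # log line with timestamp
    re.compile(r"[;{}]\s*$"),                                   # code-like line ending
]


def is_artifact(line):
    """Return True if the line looks like non-natural-language text (hypothetical heuristic)."""
    return any(pattern.search(line) for pattern in ARTIFACT_PATTERNS)


class ArtifactRemover:
    """Hypothetical transformer with a scikit-learn style fit/transform interface."""

    def fit(self, X, y=None):
        # Nothing to learn in this sketch; a real model would be trained or loaded here.
        return self

    def transform(self, X):
        # Classify every document line by line and keep only the natural-language lines.
        cleaned = []
        for document in X:
            kept = [line for line in document.splitlines() if not is_artifact(line)]
            cleaned.append("\n".join(kept))
        return cleaned


report = (
    "The app crashes when I click save.\n"
    "Traceback (most recent call last):\n"
    '  File "app.py", line 10, in <module>\n'
    "    save();\n"
    "https://example.com/log\n"
    "Please help."
)
cleaned_reports = ArtifactRemover().fit([report]).transform([report])
```

Because the class exposes `fit` and `transform`, an object of this shape could in principle be placed in front of a text vectorizer in a preprocessing pipeline, which is the usage pattern the description suggests for the pretrained models.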
bug report, nlp, data cleaning
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 7 |
| downloads | | 2 |

Views provided by UsageCounts