Gistable: Evaluating the Executability of Python Code Snippets on GitHub

Publication type: Article, Preprint, Conference object
Date: 01 Sep 2018
Embargo end date: 01 Jan 2018
Publisher: IEEE
Journal: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Authors: Eric Horton; Chris Parnin

doi: 10.1109/icsme.2018.00031, 10.48550/arxiv.1808.04919

arXiv: 1808.04919

Abstract

Software developers create and share code online to demonstrate programming language concepts and programming tasks. Code snippets can be a useful way to explain and demonstrate a programming concept, but may not always be directly executable. A code snippet can contain parse errors, or fail to execute if the environment contains unmet dependencies. This paper presents an empirical analysis of the executable status of Python code snippets shared through the GitHub gist system, and the ability of developers familiar with software configuration to correctly configure and run them. We find that 75.6% of gists require non-trivial configuration to overcome missing dependencies, configuration files, reliance on a specific operating system, or some other environment configuration. Our study also suggests the natural assumption developers make about resource names when resolving configuration errors is correct less than half the time. We also present Gistable, a database and extensible framework built on GitHub's gist system, which provides executable code snippets to enable reproducible studies in software engineering. Gistable contains 10,259 code snippets, approximately 5,000 with a Dockerfile to configure and execute them without import error. Gistable is publicly available at https://github.com/gistable/gistable.
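The two failure modes named in the abstract, parse errors and unmet dependencies, can be illustrated with a minimal sketch. This is not the authors' tooling: the function name `classify_snippet` and the returned dictionary layout are hypothetical, and the check only inspects import statements statically, so it says nothing about missing configuration files or operating-system requirements.

```python
import ast
import importlib.util


def classify_snippet(source: str) -> dict:
    """Report whether a snippet parses and which imports are unavailable here.

    Hypothetical helper for illustration only; it does not execute the snippet.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        # The snippet cannot even be parsed by this interpreter.
        return {"parses": False, "error": err.msg, "missing_imports": []}

    # Collect the top-level name of every absolutely imported module.
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import, which is not a dependency name.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])

    missing = []
    for name in sorted(modules):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)

    return {"parses": True, "error": None, "missing_imports": missing}


if __name__ == "__main__":
    print(classify_snippet("import os\nimport some_unlikely_module_name\n"))
    print(classify_snippet("def broken(:\n    pass\n"))
```

A name reported as missing is only an import name. The package that provides it may be called something else (for example, `import yaml` is provided by the `PyYAML` package), which is why guessing the package from the import name is unreliable.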

Related Organizations

North Carolina State University
United States

Keywords

Software Engineering (cs.SE), FOS: Computer and information sciences, Computer Science - Software Engineering

Related research products

- basic-python-logger (software on GitHub), IsRelatedTo
- nodejs-guidelines (software on GitHub), IsRelatedTo

Impact by BIP!

- Selected citations: 21. These citations are derived from selected sources. This is an alternative to the "Influence" indicator.
- Popularity: Top 10%. This indicator reflects the "current" impact or attention of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.