Compressibility as Proxy for Readability

Name: Compressibility as Proxy for Readability
Keywords: Programvaruteknik, Compressibility, Reactive, Software Engineering, Reliability, Proxy, Object-Oriented

Hansson, Axel; Lönnqvist, Marcus



Data sources: Digitala Vetenskapliga Arkivet - Academic Archive On-line


Type: Other research product (Other ORP type)
Date: 01 Jan 2021
Language: English
Publisher: Mittuniversitetet, Institutionen för data- och systemvetenskap



Abstract

The main objective of this study is to examine whether there is a correlation between the readability and the compressibility of Java code. Code readability is important for software maintainability and code comprehension, and it can be measured with a range of metrics, such as the B&W, Scalabrino, and Dorn readability metrics. Should such a correlation exist, compressibility could prove to be a simple yet useful readability metric.

Data compression encodes code or data using fewer bits than the original representation. There are several algorithms for this, and this study works with some of the most popular ones. To examine the correlation, we first tested the compression algorithms against each other to see whether there was a major difference in the size of the resulting files. We then compared the compressibility of two different types of code with previously established differences in readability.

In total, the source code of 20 popular GitHub projects was tested with 3 compression algorithms to compare the differences between the algorithms. For the comparison of compressibility in relation to readability, a combined total of 104 code snippets were tested, 52 for each of the two coding paradigms compared.

Result: For the first test, we concluded that there was no significant difference between the compression rates of the algorithms, which ended up within roughly 4% of each other on average.

The second result reveals a small difference in compressibility between sets of code using reactive Java and object-oriented Java. According to earlier research, these two paradigms differ in readability, but the difference in compressibility was so small that it was considered negligible. This is due to a lack of variety in the snippets tested, and the difference can largely be attributed to the small file sizes of some snippets. The smaller files increased in size because compression adds an "overhead" when a file is compressed. This is more noticeable on smaller files, of which this study tested many.

In conclusion, the study was unable to indicate a clear connection between source code readability and compressibility. Thus, it does not indicate that compressibility is a suitable proxy for readability as of now. This study does, however, start a conversation on a previously untouched topic, and we hope that it can point other studies in the right direction. The scope of this research is too big to be fully explored in this study alone, and we strongly suggest future research on the topic.
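
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three compression algorithms used, so gzip, bz2, and lzma from the Python standard library are assumptions chosen only for illustration. The helper names, the sample inputs, and the definition of the ratio as compressed size divided by original size are likewise hypothetical and not taken from the study.

```python
import bz2
import gzip
import lzma

# Assumed stand-ins for "3 compression algorithms"; the study's actual
# choices are not stated in the abstract.
COMPRESSORS = {
    "gzip": gzip.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def compression_ratio(data: bytes, compress) -> float:
    """Return compressed size / original size (assumed definition).

    A value above 1.0 means the compressed output is larger than the input.
    """
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return len(compress(data)) / len(data)


def compare(data: bytes) -> dict:
    """Compute the ratio for every compressor in COMPRESSORS."""
    return {name: compression_ratio(data, fn) for name, fn in COMPRESSORS.items()}


# Hypothetical inputs: a very small snippet and a larger, repetitive one.
tiny_snippet = b"int x = 1;"
repetitive_source = b"public int getValue() { return value; }\n" * 200

for label, data in (("tiny", tiny_snippet), ("repetitive", repetitive_source)):
    ratios = compare(data)
    print(label, len(data), {name: round(r, 3) for name, r in ratios.items()})
```

For the tiny input, the fixed headers and trailers that each format writes outweigh any savings, so every ratio exceeds 1.0, which is the kind of overhead the abstract describes for small files. For the larger, repetitive input, all three ratios fall well below 1.0.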


Related research products (16 in total; 10 shown), all software on GitHub:

- AndroidUtilCode
- picasso
- gson
- CompressTest
- kafka
- dubbo
- RxJava
- taskell
- elasticsearch
- shellcheck

