
The degree of correlation between the source code of different software programs is important for uncovering plagiarism, trade secret theft, copyright infringement, and patent infringement. Other uses include locating open source code within a proprietary program and determining common authorship of different programs. Measuring source code correlation is an important factor in any punitive determination of rights infringement. Existing measures of source code correlation tend to focus on only one or two types of correlation. This paper presents a theoretical basis for a measure of source code correlation predicated on various uses and requirements. It also describes a software tool that uses multiple algorithms to determine this correlation measure. Finally, the paper compares the results produced by this tool against those produced by other tools on a controlled set of correlated source code files and finds that the new tool is more accurate than the other tools in determining all types of source code correlation.
