
This is the experimental artifact for the TACAS '25 submission "Tidyparse: A Tool for Realtime Syntax Repair". To run it, ensure you have Java 21 installed, then download the file tacas25-experiments.jar and run the command:

    java -Xmx4g -jar tacas25-artifact.jar 2>&1 | tee repairs.log

After a while, the log should contain the repair instances and a list of aggregate statistics. The parent folder will contain two files named bar_hillel_results_{positive, negative}*.csv, containing the statistics for each repair instance. For the negative examples, the file contains the following columns:

    length, lev_dist, samples...

where length is the length of the broken code snippet, lev_dist is the Levenshtein distance between the broken and fixed code snippets, and samples is the total number of samples drawn before timeout. For the positive examples, the file contains the following columns:

    length, lev_dist, sample_ms ... rank

where the first two columns are the same, sample_ms is the time it took to find the human repair after constructing the language intersection automaton, and rank is the rank of the true repair in the list of all repair suggestions. These statistics are also aggregated and displayed in a streaming fashion in the terminal and the repairs.log file.
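As a rough illustration of how the per-instance CSV for the positive examples could be post-processed, the sketch below groups rows by lev_dist and reports the fraction ranked first together with the mean sample_ms. It rests on several assumptions that should be checked against the actual files: that the CSV has a header row using the column names listed above, that fields are comma-separated, and that the top-ranked suggestion has rank 0 (the `first_rank` parameter is hypothetical and can be set to 1 if ranks are one-based). The demo rows are made up for illustration and are not real results.

```python
import csv
import io
from statistics import mean


def summarize_positive(rows, first_rank=0):
    """Summarize rows of a positive-results CSV, grouped by lev_dist.

    `rows` is an iterable of dicts keyed by column name; only lev_dist,
    sample_ms and rank are read. `first_rank` is the rank value assumed to
    denote the top suggestion (an assumption, not confirmed by the artifact).
    """
    by_dist = {}
    for row in rows:
        dist = int(row["lev_dist"])
        stats = by_dist.setdefault(dist, {"total": 0, "top1": 0, "sample_ms": []})
        stats["total"] += 1
        stats["top1"] += 1 if int(row["rank"]) == first_rank else 0
        stats["sample_ms"].append(float(row["sample_ms"]))
    return {
        dist: {
            "total": s["total"],
            "precision_at_1": s["top1"] / s["total"],
            "mean_sample_ms": mean(s["sample_ms"]),
        }
        for dist, s in sorted(by_dist.items())
    }


# Made-up rows for illustration only; the real files contain more columns.
demo = io.StringIO(
    "length, lev_dist, sample_ms, rank\n"
    "12, 1, 40, 0\n"
    "25, 1, 60, 3\n"
    "31, 2, 500, 0\n"
)
# skipinitialspace tolerates a space after each comma, in the header as well.
summary = summarize_positive(csv.DictReader(demo, skipinitialspace=True))
for dist, stats in summary.items():
    print(dist, stats)
```

To apply this to the real output, one could open each file matching bar_hillel_results_positive*.csv and pass `csv.DictReader(f, skipinitialspace=True)` to `summarize_positive`, adjusting the column names if the header differs.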

Next to the individual repair instances, it will periodically display running statistics that look as follows:

    Lev(*): Top-1/rec/pos/total: 1 / 1 / 1 / 1, errors: 0, P@1: 1.0, P@All: 1.0
    Lev(3): Top-1/rec/pos/total: 1 / 1 / 1 / 1, errors: 0, P@1: 1.0, P@All: 1.0
    Draw timings (ms): {1=0.0, 2=0.0, 3=731.0}
    Full timings (ms): {1=0.0, 2=0.0, 3=10513.0}
    Avg samples drawn: {1=0.0, 2=0.0, 3=9853.0}

- Top-1 is the number of repair instances where the true repair was sampled and ranked first
- rec is the number of repair instances where the true repair was sampled, but not ranked first
- pos is the number of instances where the true repair could have been sampled, but was not
- total is the total number of repair instances evaluated so far

It will also contain the following data, which is a more granular summary of the running average precision across all repair instances, broken down by snippet length and edit distance, where |σ| is the length of the broken code snippet and Δ indicates the Levenshtein distance of the true repair.

    Precision@1
    ===========
    |σ|∈[0, 10): Top-1/total: 17 / 37 ≈ 0.4594594594594595
    |σ|∈[10, 20): Top-1/total: 13 / 29 ≈ 0.4482758620689655
    |σ|∈[20, 30): Top-1/total: 13 / 48 ≈ 0.2708333333333333
    |σ|∈[30, 40): Top-1/total: 6 / 9 ≈ 0.6666666666666666
    Δ(1)= Top-1/total: 31 / 48 ≈ 0.6458333333333334
    Δ(2)= Top-1/total: 14 / 39 ≈ 0.358974358974359
    Δ(3)= Top-1/total: 4 / 36 ≈ 0.1111111111111111
    (|σ|∈[0, 10), Δ=1): Top-1/total: 12 / 14 ≈ 0.8571428571428571
    (|σ|∈[0, 10), Δ=2): Top-1/total: 4 / 13 ≈ 0.3076923076923077
    (|σ|∈[0, 10), Δ=3): Top-1/total: 1 / 10 ≈ 0.1
    (|σ|∈[10, 20), Δ=1): Top-1/total: 8 / 12 ≈ 0.6666666666666666
    (|σ|∈[10, 20), Δ=2): Top-1/total: 3 / 9 ≈ 0.3333333333333333
    (|σ|∈[10, 20), Δ=3): Top-1/total: 2 / 8 ≈ 0.25
    (|σ|∈[20, 30), Δ=1): Top-1/total: 8 / 17 ≈ 0.47058823529411764
    (|σ|∈[20, 30), Δ=2): Top-1/total: 4 / 13 ≈ 0.3076923076923077
    (|σ|∈[20, 30), Δ=3): Top-1/total: 1 / 18 ≈ 0.05555555555555555
    (|σ|∈[30, 40), Δ=1): Top-1/total: 3 / 5 ≈ 0.6
    (|σ|∈[30, 40), Δ=2): Top-1/total: 3 / 4 ≈ 0.75

Running the full set of experiments can take several hours depending on the machine.
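
Because the Precision@1 summary is printed repeatedly as the run progresses, it can be convenient to pull the latest counts out of repairs.log rather than reading them by eye. The following is a minimal sketch that assumes each summary entry sits on its own line in the format shown above (a label, then `:` or `=`, then `Top-1/total: a / b`). The function name `parse_precision_lines` is hypothetical and not part of the artifact, and the sample lines are copied from the output above.

```python
import re

# A label, then ':' or '=', then "Top-1/total: <top1> / <total>".
# Lines such as "Lev(3): Top-1/rec/pos/total: ..." deliberately do not match.
LINE = re.compile(
    r"^(?P<bucket>.+?)[:=] Top-1/total: (?P<top1>\d+) / (?P<total>\d+)"
)


def parse_precision_lines(lines):
    """Map each bucket label to its (top1, total) counts.

    Later occurrences of a label overwrite earlier ones, so for a streaming
    log the result reflects the most recent summary that was printed.
    """
    counts = {}
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            counts[match["bucket"]] = (int(match["top1"]), int(match["total"]))
    return counts


sample = [
    "|σ|∈[0, 10): Top-1/total: 17 / 37 ≈ 0.4594594594594595",
    "Δ(1)= Top-1/total: 31 / 48 ≈ 0.6458333333333334",
    "(|σ|∈[30, 40), Δ=2): Top-1/total: 3 / 4 ≈ 0.75",
    "Lev(3): Top-1/rec/pos/total: 1 / 1 / 1 / 1, errors: 0, P@1: 1.0, P@All: 1.0",
]
counts = parse_precision_lines(sample)
for bucket, (top1, total) in counts.items():
    print(f"{bucket}: {top1} / {total} = {top1 / total:.3f}")
```

For a real run, the same function could be fed `open("repairs.log", encoding="utf-8")`; the explicit encoding matters because the labels contain non-ASCII characters.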
