publication . Preprint . 2018

DeepBugs: A Learning Approach to Name-based Bug Detection

Pradel, Michael; Sen, Koushik
Open Access English
  • Published: 30 Apr 2018
Abstract
Natural language elements in source code, e.g., the names of variables and functions, convey useful information. However, most existing bug detection tools ignore this information and therefore miss some classes of bugs. The few existing name-based bug detection approaches reason about names on a syntactic level and rely on manually designed and tuned algorithms to detect bugs. This paper presents DeepBugs, a learning approach to name-based bug detection, which reasons about names based on a semantic representation and which automatically learns bug detectors instead of manually writing them. We formulate bug detection as a binary classification problem and trai...
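To make the binary-classification framing above concrete, here is a minimal sketch in Python. It is not the DeepBugs implementation: the hashed character-trigram vectors are only a toy stand-in for a semantic name representation, and the labelled argument-name pairs, the helper names (`embed`, `features`, `train`, `predict`, `log_loss`), the vector size, and the plain logistic-regression classifier are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

DIM = 16  # length of each toy name vector (arbitrary choice for this sketch)


def embed(name: str) -> List[float]:
    """Toy stand-in for a name representation: hashed character trigrams, L2-normalised."""
    padded = f"^{name.lower()}$"
    vec = [0.0] * DIM
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # Deterministic bucket index (Python's built-in hash() is salted per process).
        bucket = sum(ord(ch) * (pos + 1) for pos, ch in enumerate(trigram)) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def features(arg_names: Tuple[str, str]) -> List[float]:
    """Represent a two-argument call site by concatenating both name vectors."""
    first, second = arg_names
    return embed(first) + embed(second)


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Probability that the call site is suspicious (label 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, bias, data) -> float:
    """Mean binary cross-entropy over (feature vector, label) pairs."""
    total = 0.0
    for x, y in data:
        p = min(max(predict(weights, bias, x), 1e-12), 1.0 - 1e-12)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(data)


def train(data, steps: int = 200, lr: float = 0.5):
    """Full-batch gradient descent for logistic regression, starting from zero weights."""
    weights = [0.0] * (2 * DIM)
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in data:
            err = predict(weights, bias, x) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias


# Hypothetical hand-labelled argument-name pairs: 0 = plausible order, 1 = suspicious order.
LABELLED = [
    (("width", "height"), 0), (("height", "width"), 1),
    (("start", "end"), 0), (("end", "start"), 1),
    (("min", "max"), 0), (("max", "min"), 1),
]
DATA = [(features(names), label) for names, label in LABELLED]
WEIGHTS, BIAS = train(DATA)
```

Calling `predict(WEIGHTS, BIAS, features(("height", "width")))` returns the classifier's estimated probability that the argument order is suspicious. The point of the sketch is only the shape of the problem: names become vectors, and a learned binary classifier, rather than a hand-written rule, decides whether a code location looks buggy.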
Subjects
free text keywords: Computer Science - Software Engineering, Computer Science - Programming Languages
Funded by
NSF | SHF: Medium: Automated Graphical User Interface Testing with Learning
  • Funder: National Science Foundation (NSF)
  • Project Code: 1409872
  • Funding stream: Directorate for Computer & Information Science & Engineering | Division of Computing and Communication Foundations
NSF | SHF: Small: A Dynamic Analysis and Test Generation Framework for JavaScript and Web Applications
  • Funder: National Science Foundation (NSF)
  • Project Code: 1423645
  • Funding stream: Directorate for Computer & Information Science & Engineering | Division of Computing and Communication Foundations