How do you test a program when only a single user, with no expertise in software testing, is able to determine if the program is performing correctly? Such programs are common today in the form of machine-learned classifiers. We consider the problem of testing this comm...