Publication: Preprint, 2018

Neural language representations predict outcomes of scientific research

Bagrow, James P.; Berenberg, Daniel; Bongard, Joshua
Open Access English
  • Published: 17 May 2018
Abstract
Many research fields codify their findings in standard formats, often by reporting correlations between quantities of interest. But the space of all testable correlates is far larger than scientific resources can currently address, so the ability to accurately predict correlations would be useful to plan research and allocate resources. Using a dataset of approximately 170,000 correlational findings extracted from leading social science journals, we show that a trained neural network can accurately predict the reported correlations using only the text descriptions of the correlates. Accurate predictive models such as these can guide scientists towards promising ...
Subjects
free text keywords: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Learning, Statistics - Machine Learning