55 references, page 1 of 4
[1] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. Cambridge, MA, USA: MIT Press, 1986, pp. 318-362.
[2] H. T. Siegelmann and E. D. Sontag, “Turing computability with neural nets,” Applied Mathematics Letters, vol. 4, no. 6, pp. 77-80, 1991.
[3] H. T. Siegelmann and E. D. Sontag, “On the computational power of neural nets,” Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132-150, 1995.
[4] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554-2558, 1982.
[5] M. I. Jordan, “Serial order: A parallel, distributed processing approach,” Institute for Cognitive Science, University of California, San Diego, Tech. Rep. 8604, 1986.
[6] J. L. Elman, “Finding structure in time,” Cognitive Science, vol. 14, no. 2, pp. 179-211, 1990.
[7] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, 1990.
[8] R. J. Williams and D. Zipser, “A learning algorithm for continually running fully recurrent neural networks,” Neural Computation, vol. 1, no. 2, pp. 270-280, 1989.
[9] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, 1994.
[10] J. Martens and I. Sutskever, “Learning recurrent neural networks with Hessian-free optimization,” in Proceedings of the 28th International Conference on Machine Learning, 2011, pp. 1033-1040.
[11] J. Schmidhuber, D. Wierstra, M. Gagliolo, and F. Gomez, “Training recurrent networks by Evolino,” Neural Computation, vol. 19, no. 3, pp. 757-779, 2007.
[12] F. Gomez, J. Schmidhuber, and R. Miikkulainen, “Accelerated neural evolution through cooperatively coevolved synapses,” Journal of Machine Learning Research, vol. 9, pp. 937-965, 2008.
[13] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[14] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics, 2014, pp. 1724-1734.
[15] M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Computer Science Review, vol. 3, pp. 127-149, 2009.