
handle: 10486/677832
This Bachelor's Thesis examines corruption across countries, studying how it has evolved over a period of time in order to predict its future development. To that end, data from the Global Corruption Barometer for the years 2004, 2005, 2006, 2010 and 2013, compiled by Transparency International, were collected. To predict the future corruption level in a given country, several machine learning techniques were used, including nearest neighbors, the multilayer perceptron and decision trees. The quality of the predictions was assessed through the convergence of the Lyapunov exponents and the error rate measured during the testing phase. Before making these predictions, the collected data were analyzed by means of clustering, by computing the entropy of each available field, and by establishing links between countries based on the Euclidean distances between their field values for each year. These experiments and predictions were carried out with Weka. In addition, custom tools were developed for tasks the Thesis required: a set of programs that convert data sets into an ARFF input file for Weka, and a program that calls Weka classifiers through the command line to make the predictions.
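The data-study steps named in the abstract (per-field entropy, Euclidean distances between countries, and conversion to an ARFF file) can be illustrated with a minimal Python sketch. This is not the thesis code: the function names, the toy field names and all values below are hypothetical, and only the ARFF keywords (`@relation`, `@attribute`, `@data`) follow Weka's file format.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of a list of discrete field values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_arff(relation, attributes, rows):
    """Render numeric rows as ARFF text (all attributes assumed numeric)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up values for three hypothetical countries in one year.
    fields = ["field_1", "field_2"]
    countries = {"A": [3.0, 4.0], "B": [0.0, 0.0], "C": [3.0, 4.0]}

    # Entropy of one field across all countries.
    first_field = [row[0] for row in countries.values()]
    print("entropy of field_1:", shannon_entropy(first_field))

    # Distance between two countries, usable as a link weight.
    print("distance A-B:", euclidean_distance(countries["A"], countries["B"]))

    # ARFF text that could be saved to disk and loaded into Weka.
    print(to_arff("toy_example", fields, countries.values()))
```

In a sketch like this, pairs of countries whose distance falls below some chosen threshold could be treated as linked; the threshold and the choice of fields are left open here because the abstract does not specify them.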
Machine Learning, Computer Science, Cluster, Corruption
