Powered by OpenAIRE graph
Biblos-e Archivo
Bachelor thesis . 2016
Data sources: Biblos-e Archivo

Introducción al aprendizaje por refuerzo. Problema bandido multibrazo (Introduction to Reinforcement Learning: The Multi-Armed Bandit Problem)

Authors: Guinea Juliá, Álvaro


Abstract

The aim of this final degree project is to present the multi-armed bandit problem and to develop a practical application based on it for portfolio design. The first part of the report introduces reinforcement learning, which includes the multi-armed bandit problem. It reviews the importance of exploration and exploitation and introduces relevant concepts such as regret. The second part develops the concept of stochastic bandits and presents several algorithms that fall within this framework, such as ε-Greedy, Softmax, UCB1 and others. The behavior of each algorithm is analyzed exhaustively. Finally, the algorithms presented are applied as techniques for selecting assets in financial portfolios. Basic knowledge of financial mathematics is used to reduce risk, understood as the variance of the returns of each asset. In addition, alternative algorithms for portfolio design are explored, for example the Orthogonal Bandit Portfolio algorithm [6], which selects orthogonal portfolios, each investing in different assets with different weights. Asset prices are taken from Yahoo Finance. From the captured data, four test groups were built from stocks in the Standard & Poor's and IBEX35 indexes.
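As a rough illustration of the algorithms named in the abstract, the following minimal sketch plays a Bernoulli bandit with ε-Greedy, Softmax and UCB1 and accumulates the pseudo-regret of each policy. It is not taken from the thesis: the function names, the arm probabilities, the parameter values and the textbook UCB1 index (mean plus sqrt(2 ln t / n)) are all assumptions made for this example.

```python
import math
import random


def run_bandit(means, horizon, choose, seed=0):
    """Play a Bernoulli bandit; return (pull counts, cumulative pseudo-regret)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # running mean reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        arm = choose(t, counts, values, rng)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        regret += best - means[arm]  # expected loss versus the best arm
    return counts, regret


def epsilon_greedy(epsilon):
    """Explore uniformly with probability epsilon, otherwise exploit."""
    def choose(t, counts, values, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(values))
        return max(range(len(values)), key=values.__getitem__)
    return choose


def softmax(temperature):
    """Sample an arm with probability proportional to exp(value / temperature)."""
    def choose(t, counts, values, rng):
        top = max(values)  # subtract the maximum for numerical stability
        weights = [math.exp((v - top) / temperature) for v in values]
        return rng.choices(range(len(values)), weights=weights)[0]
    return choose


def ucb1(t, counts, values, rng):
    """Pull each arm once, then maximise mean + sqrt(2 ln t / n)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]  # hypothetical success probabilities per arm
    policies = [
        ("eps-greedy", epsilon_greedy(0.1)),
        ("softmax", softmax(0.1)),
        ("UCB1", ucb1),
    ]
    for name, policy in policies:
        counts, regret = run_bandit(means, 2000, policy)
        print(name, counts, round(regret, 1))
```

The pseudo-regret here is computed from the true arm means, which are only known in a simulation; with real asset returns the best arm is unknown and regret can only be estimated after the fact.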

Country: Spain

Keywords: Computer science

  • Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.