ZENODO
Other literature type . 2017
License: CC BY
Data sources: ZENODO
ZENODO
Thesis . 2017
License: CC BY
Data sources: Datacite

Convergence of distributed symbolic regression using metaheuristics

Authors: Cardoen, Ben;

Abstract

Symbolic regression (SR) fits a symbolic expression to a set of expected values. Among its advantages over other techniques is that a practitioner can interpret the resulting expression, determine important features by their usage in the expression, and gain insight into the behavior of the resulting model, such as continuity, derivatives and extrema. SR combines a discrete combinatorial problem, combining base functions, with the continuous optimization problem of selecting and mutating real-valued constants. One of the main algorithms used in SR is Genetic Programming (GP). The convergence characteristics of SR using GP are still an open issue. The continuous aspect of the problem has traditionally been an issue in GP-based symbolic regression. This paper studies the convergence of a GP-SR implementation on selected benchmarks known for poor convergence characteristics. We introduce a cooling schedule on the mutation operator and observe the computational savings. The constant optimization problem is studied using a two-phase approach. We apply a variation on constant folding and evaluate its effects. The hybridization of GP with three metaheuristics (Differential Evolution, Artificial Bee Colony, Particle Swarm Optimization) is evaluated. We use a distributed GP-SR implementation to evaluate the effect of topologies on the convergence characteristics of the algorithm and the difference in communication overhead and speedup. We introduce and evaluate a topology with the aim of finding a new balance between diffusion on the one hand and communication and synchronization overhead on the other. We introduce a variation of k-fold cross-validation to estimate how accurately a generated solution predicts unknown data points. This validation technique is implemented in parallel in the algorithm, combining the advantages of cross-validation with the increase in coverage of the search space.
Our tool offers a wide array of statistics describing the convergence characteristics of the algorithm over time, offering practitioners nuanced insights into the algorithm as it approximates the symbolic regression problem. We combine our incremental support with a design of experiment technique applied on a simulator and evaluate the impact on the convergence characteristics in combination with our constant optimization approach on the one hand and the distributed algorithm on the other hand.
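
For readers unfamiliar with the term, the following is a minimal sketch of plain constant folding on an expression tree, the standard technique that the abstract says the thesis varies. It is not the thesis's implementation or its variation: the Const, Var and Op classes, the FUNCS table and the fold and size helpers are hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical node types for illustration; not taken from the thesis code.
@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    name: str
    args: Tuple["Node", ...]


Node = Union[Const, Var, Op]

# Assumed set of base functions; a real GP system would have a larger one.
FUNCS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "sin": math.sin,
}


def fold(node: Node) -> Node:
    """Replace every subtree that contains only constants by a single constant."""
    if not isinstance(node, Op):
        return node
    args = tuple(fold(a) for a in node.args)
    if all(isinstance(a, Const) for a in args):
        return Const(FUNCS[node.name](*(a.value for a in args)))
    return Op(node.name, args)


def size(node: Node) -> int:
    """Count the nodes in a tree."""
    if isinstance(node, Op):
        return 1 + sum(size(a) for a in node.args)
    return 1


# x + 2 * (1 + 3) has a constant subtree that folds to 8.
tree = Op("+", (Var("x"), Op("*", (Const(2.0), Op("+", (Const(1.0), Const(3.0)))))))
folded = fold(tree)
print(size(tree), size(folded))
```

Folding shrinks the tree without changing its value at any input, which is why it is commonly used to reduce evaluation cost and expression bloat in GP-style systems.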

Master Thesis, University of Antwerp, Computer Science.

Keywords

FOS: Computer and information sciences, Hyperheuristics, Epidemiology, openMPI, Symbolic Regression, Genetic Programming, Metaheuristics, Distributed Computing, Python
