ZENODO
Software . 2019
Data sources: Datacite

kengz/SLM-Lab: RAdam+Lookahead optim, TensorBoard, Full Benchmark Upload

Authors: Wah Loon Keng; Graesser, Laura; Allan-Avatar1; Snyk Bot; Gillen, Sean; Rahim16; Cvitkovic, Milan; +1 Authors

Abstract

This marks a stable release of SLM Lab with full benchmark results.

RAdam+Lookahead optimizer

Lookahead + RAdam optimizer significantly improves the performance of some RL algorithms (A2C (n-step), PPO) on continuous domain problems, but does not improve (A2C (GAE), SAC). #416

TensorBoard

Add TensorBoard in body to auto-log summary variables, graph, network parameter histograms, action histogram. To launch TensorBoard, run `tensorboard --logdir=data` after a session/trial is completed. Example screenshot:

<img width="1423" alt="Screen Shot 2019-10-14 at 10 41 36 PM" src="https://user-images.githubusercontent.com/8209263/66803221-d9bc0980-eed3-11e9-92b8-0e5cd42a6eab.png">

Full Benchmark Upload

Plot Legend

<img width="400" alt="legend" src="https://user-images.githubusercontent.com/8209263/67737544-d727dc80-f9c8-11e9-904a-319b9aafd41b.png">

Discrete Benchmark

Upload PR #427
Dropbox data

| Env. \ Alg. | DQN | DDQN+PER | A2C (GAE) | A2C (n-step) | PPO | SAC |
|---|---|---|---|---|---|---|
| Breakout <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737546-dabb6380-f9c8-11e9-901e-b96cc28f1fdf.png"></details> | 80.88 | 182 | 377 | 398 | 443 | - |
| Pong <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737554-e018ae00-f9c8-11e9-92b5-3bd8d213b1e0.png"></details> | 18.48 | 20.5 | 19.31 | 19.56 | 20.58 | 19.87* |
| Seaquest <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737557-e3139e80-f9c8-11e9-9446-119593ca956b.png"></details> | 1185 | 4405 | 1070 | 1684 | 1715 | - |
| Qbert <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737559-e575f880-f9c8-11e9-8c98-f14c82041a45.png"></details> | 5494 | 11426 | 12405 | 13590 | 13460 | 214* |
| LunarLander <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737566-e7d85280-f9c8-11e9-8df8-39c1205c5308.png"></details> | 192 | 233 | 25.21 | 68.23 | 214 | 276 |
| UnityHallway <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737569-ead34300-f9c8-11e9-9e26-61fe1d779989.png"></details> | -0.32 | 0.27 | 0.08 | -0.96 | 0.73 | - |
| UnityPushBlock <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737577-eeff6080-f9c8-11e9-931c-843ba697779c.png"></details> | 4.88 | 4.93 | 4.68 | 4.93 | 4.97 | - |

Episode score at the end of training attained by SLM Lab implementations on discrete-action control problems. Reported episode scores are the average over the last 100 checkpoints, and then averaged over 4 Sessions. Results marked with * were trained using the hybrid synchronous/asynchronous version of SAC to parallelize and speed up training time.

For the full Atari benchmark, see Atari Benchmark.

Continuous Benchmark

Upload PR #427
Dropbox data

| Env. \ Alg. | A2C (GAE) | A2C (n-step) | PPO | SAC |
|---|---|---|---|---|
| RoboschoolAnt <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737923-1571cb80-f9ca-11e9-8f6b-b288fa19bff0.png"></details> | 787 | 1396 | 1843 | 2915 |
| RoboschoolAtlasForwardWalk <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737924-1571cb80-f9ca-11e9-98ee-82c920dfbf44.png"></details> | 59.87 | 88.04 | 172 | 800 |
| RoboschoolHalfCheetah <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737925-1571cb80-f9ca-11e9-9c7f-3a8294a517af.png"></details> | 712 | 439 | 1960 | 2497 |
| RoboschoolHopper <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737926-160a6200-f9ca-11e9-8cae-9afc532e5af8.png"></details> | 710 | 285 | 2042 | 2045 |
| RoboschoolInvertedDoublePendulum <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737927-160a6200-f9ca-11e9-8eb2-e04554e3844f.png"></details> | 996 | 4410 | 8076 | 8085 |
| RoboschoolInvertedPendulum <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737928-160a6200-f9ca-11e9-8eae-e7a3ccbe914a.png"></details> | 995 | 978 | 986 | 941 |
| RoboschoolReacher <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737929-160a6200-f9ca-11e9-9423-b27165def32e.png"></details> | 12.9 | 10.16 | 19.51 | 19.99 |
| RoboschoolWalker2d <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737930-160a6200-f9ca-11e9-9a0f-edbd4f01f4e0.png"></details> | 280 | 220 | 1660 | 1894 |
| RoboschoolHumanoid <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737931-16a2f880-f9ca-11e9-9340-fe90ab48e95f.png"></details> | 99.31 | 54.58 | 2388 | 2621* |
| RoboschoolHumanoidFlagrun <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737932-16a2f880-f9ca-11e9-92bb-9c896ec3991e.png"></details> | 73.57 | 178 | 2014 | 2056* |
| RoboschoolHumanoidFlagrunHarder <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737933-16a2f880-f9ca-11e9-98c8-7388fa9e1775.png"></details> | -429 | 253 | 680 | 280* |
| Unity3DBall <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737934-16a2f880-f9ca-11e9-912b-37c8840d0acc.png"></details> | 33.48 | 53.46 | 78.24 | 98.44 |
| Unity3DBallHard <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67737935-16a2f880-f9ca-11e9-9275-f3b5fef22e1b.png"></details> | 62.92 | 71.92 | 91.41 | 97.06 |

Episode score at the end of training attained by SLM Lab implementations on continuous control problems. Reported episode scores are the average over the last 100 checkpoints, and then averaged over 4 Sessions. Results marked with * require 50M-100M frames, so we use the hybrid synchronous/asynchronous version of SAC to parallelize and speed up training time.
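
The score aggregation described in the benchmark captions (average over the last 100 checkpoints, then over 4 Sessions) can be sketched roughly as below. This is only an illustration under stated assumptions, not SLM Lab's own code: the function name `final_score`, its `window` parameter, and the synthetic checkpoint data are all hypothetical, and the release does not say how the lab stores or loads these values.

```python
from statistics import mean


def final_score(sessions, window=100):
    """Average the last `window` checkpoint scores of each session,
    then average those per-session values across sessions.

    `sessions` is a list of per-session score lists (hypothetical layout).
    """
    per_session = [mean(scores[-window:]) for scores in sessions]
    return mean(per_session)


# Synthetic data (an assumption for illustration): 4 sessions, each with
# 150 checkpoint scores that rise linearly from a different offset.
sessions = [[float(offset + i) for i in range(150)] for offset in range(4)]

print(final_score(sessions))  # 101.0
```

Averaging within each session first and across sessions second gives every session equal weight, even if sessions were to log different numbers of checkpoints.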
Atari Benchmark

Upload PR #427
- Dropbox data: DQN
- Dropbox data: DDQN+PER
- Dropbox data: A2C (GAE)
- Dropbox data: A2C (n-step)
- Dropbox data: PPO
- Dropbox data: all Atari graphs

| Env. \ Alg. | DQN | DDQN+PER | A2C (GAE) | A2C (n-step) | PPO |
|---|---|---|---|---|---|
| Adventure <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738131-d6904580-f9ca-11e9-8818-0d027b668a97.png"></details> | -0.94 | -0.92 | -0.77 | -0.85 | -0.3 |
| AirRaid <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738132-d6904580-f9ca-11e9-9585-41f69fd8bb33.png"></details> | 1876 | 3974 | 4202 | 3557 | 4028 |
| Alien <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738133-d6904580-f9ca-11e9-8375-4c134255cfe1.png"></details> | 822 | 1574 | 1519 | 1627 | 1413 |
| Amidar <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738134-d6904580-f9ca-11e9-865c-eb41f4e712f9.png"></details> | 90.95 | 431 | 577 | 418 | 795 |
| Assault <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738135-d6904580-f9ca-11e9-8f8d-61732ecc3ce4.png"></details> | 1392 | 2567 | 3366 | 3312 | 3619 |
| Asterix <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738138-d6904580-f9ca-11e9-86c0-3589622a311c.png"></details> | 1253 | 6866 | 5559 | 5223 | 6132 |
| Asteroids <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738139-d728dc00-f9ca-11e9-8741-e9a59883197e.png"></details> | 439 | 426 | 2951 | 2147 | 2186 |
| Atlantis <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738140-d728dc00-f9ca-11e9-9649-ecc4b2db782f.png"></details> | 68679 | 644810 | 2747371 | 2259733 | 2148077 |
| BankHeist <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738141-d728dc00-f9ca-11e9-924a-a02be1639ee6.png"></details> | 131 | 623 | 855 | 1170 | 1183 |
| BattleZone <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738142-d728dc00-f9ca-11e9-82b0-382bbb0bcc6c.png"></details> | 6564 | 6395 | 4336 | 4533 | 13649 |
| BeamRider <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738143-d728dc00-f9ca-11e9-84eb-2ec8988ff545.png"></details> | 2799 | 5870 | 2659 | 4139 | 4299 |
| Berzerk <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738144-d728dc00-f9ca-11e9-83c6-2e50a69b4ed3.png"></details> | 319 | 401 | 1073 | 763 | 860 |
| Bowling <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738145-d7c17280-f9ca-11e9-9a2e-bc179e3186f4.png"></details> | 30.29 | 39.5 | 24.51 | 23.75 | 31.64 |
| Boxing <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738146-d7c17280-f9ca-11e9-95ac-008f35834ed1.png"></details> | 72.11 | 90.98 | 1.57 | 1.26 | 96.53 |
| Breakout <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738147-d7c17280-f9ca-11e9-890e-319a21e036e0.png"></details> | 80.88 | 182 | 377 | 398 | 443 |
| Carnival <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738148-d7c17280-f9ca-11e9-95e9-58309efb8ee4.png"></details> | 4280 | 4773 | 2473 | 1827 | 4566 |
| Centipede <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738150-d7c17280-f9ca-11e9-8a27-3cc7160c1e60.png"></details> | 1899 | 2153 | 3909 | 4202 | 5003 |
| ChopperCommand <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738151-d7c17280-f9ca-11e9-8316-90cf4e944e97.png"></details> | 1083 | 4020 | 3043 | 1280 | 3357 |
| CrazyClimber <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738152-d85a0900-f9ca-11e9-8b48-1a988dc31627.png"></details> | 46984 | 88814 | 106256 | 109998 | 116820 |
| Defender <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738153-d85a0900-f9ca-11e9-8b30-750fc49b25dd.png"></details> | 281999 | 313018 | 665609 | 657823 | 534639 |
| DemonAttack <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738154-d85a0900-f9ca-11e9-8e5e-e99b336e6fbb.png"></details> | 1705 | 19856 | 23779 | 19615 | 121172 |
| DoubleDunk <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738155-d85a0900-f9ca-11e9-8fd4-e94d1be4a6ee.png"></details> | -21.44 | -22.38 | -5.15 | -13.3 | -6.01 |
| ElevatorAction <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738156-d85a0900-f9ca-11e9-9006-903a9c823230.png"></details> | 32.62 | 17.91 | 9966 | 8818 | 6471 |
| Enduro <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738158-d85a0900-f9ca-11e9-8167-ebc713c59fdc.png"></details> | 437 | 959 | 787 | 0.0 | 1926 |
| FishingDerby <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738159-d8f29f80-f9ca-11e9-9166-ebe3ea5339ab.png"></details> | -88.14 | -1.7 | 16.54 | 1.65 | 36.03 |
| Freeway <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738161-d8f29f80-f9ca-11e9-9727-2584ac850507.png"></details> | 24.46 | 30.49 | 30.97 | 0.0 | 32.11 |
| Frostbite <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738163-d8f29f80-f9ca-11e9-9d36-1cb7985360ac.png"></details> | 98.8 | 2497 | 277 | 261 | 1062 |
| Gopher <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738164-d8f29f80-f9ca-11e9-8ba3-fb1d75ef81f1.png"></details> | 1095 | 7562 | 929 | 1545 | 2933 |
| Gravitar <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738166-d8f29f80-f9ca-11e9-9d57-c02118eba7c1.png"></details> | 87.34 | 258 | 313 | 433 | 223 |
| Hero <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738167-d8f29f80-f9ca-11e9-9faf-2c30048c8621.png"></details> | 1051 | 12579 | 16502 | 19322 | 17412 |
| IceHockey <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738168-d98b3600-f9ca-11e9-8695-8014fd177416.png"></details> | -14.96 | -14.24 | -5.79 | -6.06 | -6.43 |
| Jamesbond <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738170-d98b3600-f9ca-11e9-9f4a-25929639efc1.png"></details> | 44.87 | 702 | 521 | 453 | 561 |
| JourneyEscape <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738171-d98b3600-f9ca-11e9-9679-15a1586719dd.png"></details> | -4818 | -2003 | -921 | -2032 | -1094 |
| Kangaroo <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738172-d98b3600-f9ca-11e9-9770-3d63043a716b.png"></details> | 1965 | 8897 | 67.62 | 554 | 4989 |
| Krull <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738173-d98b3600-f9ca-11e9-9244-0933adbfedd8.png"></details> | 5522 | 6650 | 7785 | 6642 | 8477 |
| KungFuMaster <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738174-d98b3600-f9ca-11e9-95e3-33621db77541.png"></details> | 2288 | 16547 | 31199 | 25554 | 34523 |
| MontezumaRevenge <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738175-da23cc80-f9ca-11e9-81cf-58e16e210b5e.png"></details> | 0.0 | 0.02 | 0.08 | 0.19 | 1.08 |
| MsPacman <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738176-da23cc80-f9ca-11e9-8906-d54475705442.png"></details> | 1175 | 2215 | 1965 | 2158 | 2350 |
| NameThisGame <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738177-da23cc80-f9ca-11e9-9093-0a0e2456fb4c.png"></details> | 3915 | 4474 | 5178 | 5795 | 6386 |
| Phoenix <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738178-da23cc80-f9ca-11e9-93a1-188c75b888f6.png"></details> | 2909 | 8179 | 16345 | 13586 | 30504 |
| Pitfall <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738179-da23cc80-f9ca-11e9-8c76-0d339ac0034a.png"></details> | -68.83 | -73.65 | -101 | -31.13 | -35.93 |
| Pong <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738180-dabc6300-f9ca-11e9-826b-3d72cd0b13a0.png"></details> | 18.48 | 20.5 | 19.31 | 19.56 | 20.58 |
| Pooyan <details><summary><i>graph</i></summary><img src="https://user-images.githubusercontent.com/8209263/67738181-dabc6300-f9ca-11e9-922e-0b13b973a4d9.png"></details> | 1958 | 2741 | 2862 | 2531 | 6799 |
| PrivateEye | … | … | … | … | … |
