ZENODO
Dataset . 2021
License: CC BY
Data sources: Datacite
Smithsonian figshare
Dataset . 2021
License: CC BY

Game Walkthrough Corpus (GWTC)

Authors: Tiepmar, Jochen; Burghardt, Manuel;

Abstract

Motivation

The Game Walkthrough Corpus (GWTC) contains 12,295 unique walkthrough documents that cover a total of 6,117 games. For each game walkthrough, it provides unigram and bigram frequencies, treating the document as a bag of words. In addition, it provides word frequencies at the sentence level. The GWTC also contains game-related metadata, including title, publisher, developer, year, and genre. All language statistics and metadata are stored in separate plain text files and can be referenced by means of uniform resource names (URNs). These URNs can also be used to derive any combination of statistics and metadata. Researchers can, for instance, investigate the most frequent unigrams for games in the “Adventure” genre. The GWTC can thus be reused for different kinds of research questions on gaming language, an approach that may be summarized as “distant playing”.

Copyright Information

Game walkthroughs are protected by individual copyright notices that are often very strict. For this reason, the data set does not include the documents themselves. Instead, it provides various data formats that are useful for text mining and distant reading methods but do not allow the documents to be recreated. It is highly unlikely that even a single sentence can be reconstructed from the published data. Since only text mining statistics about the documents are published, and not the documents themselves (not even in part), this project does not violate copyright. Links to the original documents are available in the sourceUrls file in the data folder.
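The combination of statistics and metadata described under Motivation can be sketched as follows. This is a minimal illustration, not the project's own code: the URN strings, the in-memory dictionaries, and the function name `top_unigrams_for_genre` are all hypothetical, and the actual file layout of the GWTC is not assumed here. The sketch only shows the idea of joining per-document word frequencies to per-game genre metadata through shared identifiers.

```python
from collections import Counter

def top_unigrams_for_genre(bag_of_words, doc_to_game, game_genres, genre, n=3):
    """Sum unigram counts over all documents whose game is tagged with `genre`.

    All three lookup tables are hypothetical stand-ins for GWTC files:
    bag_of_words: document URN -> {word: frequency}
    doc_to_game:  document URN -> game URN
    game_genres:  game URN -> set of genre names
    """
    totals = Counter()
    for doc_urn, counts in bag_of_words.items():
        game_urn = doc_to_game.get(doc_urn)
        if genre in game_genres.get(game_urn, set()):
            totals.update(counts)  # adds the per-document frequencies
    return totals.most_common(n)

# Toy data with made-up identifiers (not real GWTC URNs).
bag_of_words = {
    "urn:example:doc:1": {"door": 4, "key": 3, "sword": 1},
    "urn:example:doc:2": {"key": 2, "door": 1, "lap": 5},
    "urn:example:doc:3": {"door": 2, "map": 5},
}
doc_to_game = {
    "urn:example:doc:1": "urn:example:game:a",
    "urn:example:doc:2": "urn:example:game:b",
    "urn:example:doc:3": "urn:example:game:c",
}
game_genres = {
    "urn:example:game:a": {"Adventure"},
    "urn:example:game:b": {"Racing"},
    "urn:example:game:c": {"Adventure", "Puzzle"},
}

print(top_unigrams_for_genre(bag_of_words, doc_to_game, game_genres, "Adventure"))
# → [('door', 6), ('map', 5), ('key', 3)]
```

Because only frequencies are combined, this kind of query works on the published statistics alone and never needs the original walkthrough texts.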
File Information

data folder: document data
  • bagofwords: Word frequencies per document
  • bigrams: Bigram frequencies per document
  • corpusstats: Min, avg and max token count, type count, type/token ratio, documents per game plus corresponding standard deviation
  • game_walkthrough_mapping: Documents per game
  • game_walkthrough_mapping: Number of documents per game
  • sentencecollocations: Word frequencies per sentence per document
  • sourceUrls: Links to original text
  • textlength: Number of characters per document
  • tfidf_deu: Word significance per document (German)
  • ifidf_eng: Word significance per document (English)
  • tokencount: Number of words per document
  • typecount: Number of unique words per document

metadata folder: game metadata
  • file names that do not start with "_": metadata [filename] per game
  • _all: All metadata in one file
  • _mapping_release_date*: Metadata combined with release dates for time series

doc folder: documentation
  • createdata: Python script to create content of data folder
  • extractMetainformation: Python script to create content of metadata folder
  • metadata_rawg: Game metadata collected from RAWG
  • metadata_steam: Game metadata collected from Steam
  • metadata_symbol: Quality control. Relation of text in source HTML and extracted text
  • titlesandurns: Game titles mapped to project identifiers

Walkthrough Sources
  • https://portforward.com/games/walkthroughs/
  • https://www.neoseeker.com
  • https://www.spieletipps.de
  • https://jayisgames.com/
  • http://gamesetter.com/

Corpus Statistics
  • Number of unique games: 6,013
  • Number of documents: 12,295
  • Genre associations: 3,806
  • Gameplay tags: 10,246
  • Release dates: 2,443
  • Developers: 3,152
  • Publishers: 2,782
  • Steam IDs: 1,086
  • Platform associations: 5,293 (PC, Gameboy, iOS, Linux,...)
  • Game language associations: 4,631
  • Languages: English, German, and a small amount of French

External Resources
  • Project Website: https://www.informatik.uni-leipzig.de/~jtiepmar/forschung/gwtc/
  • Bitbucket: https://bitbucket.org/jtiepmar/game-walkthrough-corpus/src/master/

Two versions of the GWTC are available for download. Ver. 0.99 contains all of the corpus files listed above, plus the Git files. Note that after downloading ver. 0.99, the Git folders may be hidden by default, depending on your operating system. Ver. 1.0 is a cleaned-up version that comes without the Git files.
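The measures named for the corpusstats file (token count, type count, type/token ratio, each with min, avg, max and standard deviation) can be illustrated with a short sketch. It assumes the usual definitions, where tokens are all word occurrences and types are distinct words, and it uses the population standard deviation. Which variant of the standard deviation the corpus itself reports is not stated in the record, and the function names and toy frequencies below are hypothetical.

```python
from statistics import mean, pstdev

def type_token_ratio(word_counts):
    """Return types / tokens for one document given {word: frequency}."""
    tokens = sum(word_counts.values())  # all word occurrences
    types = len(word_counts)            # distinct words
    return types / tokens if tokens else 0.0

def summarize(values):
    """Min, average, max and population standard deviation of a list of numbers."""
    return {
        "min": min(values),
        "avg": mean(values),
        "max": max(values),
        "std": pstdev(values),  # assumption: population, not sample, deviation
    }

# Toy bag-of-words data for two made-up documents.
docs = {
    "doc_a": {"door": 4, "key": 3, "sword": 1},  # 8 tokens, 3 types
    "doc_b": {"map": 2, "door": 2},              # 4 tokens, 2 types
}

token_counts = [sum(counts.values()) for counts in docs.values()]
ratios = [type_token_ratio(counts) for counts in docs.values()]

print(summarize(token_counts))
print(ratios)  # → [0.375, 0.5]
```

A low type/token ratio indicates a repetitive vocabulary, which is one reason the ratio is reported alongside the raw counts.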

Keywords

Evolutionary Biology, Mental Health, Infectious Diseases, Game Studies, Sociology, Video Games, Walkthrough, Text Corpus, Marine Biology
