ZENODO
Dataset . 2024
License: CC BY SA
Data sources: Datacite

Text Datasets for DSI

Authors: Tatsuya Haga

Abstract

Text data used in the article Tatsuya Haga, Yohei Oseki, Tomoki Fukai, "A unified neural representation model for spatial and semantic computations" (bioRxiv preprint, doi: https://doi.org/10.1101/2023.05.11.540307). Code and usage instructions for the data are available at https://github.com/TatsuyaHaga/DSI_codes

Main dataset (enwiki_processed_pickle): This file contains preprocessed text data from 100,000 articles randomly sampled from the English Wikipedia dump taken on 22-May-2020 (https://dumps.wikimedia.org/enwiki/latest/).

Additional dataset (wikitext103train_processed_pickle): This file contains preprocessed text data based on the WikiText-103 dataset (Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher. 2016. Pointer Sentinel Mixture Models. http://arxiv.org/abs/1609.07843).

Both datasets have already been preprocessed: all characters were lowercased, punctuation characters were removed, and the text was tokenized into words. The data are stored in Python pickle format. The data are published under CC-BY-SA, following the license of the original datasets.
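
The preprocessing and storage format described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names `preprocess` and `load_pickle` are hypothetical, whitespace splitting stands in for whatever tokenizer was actually used, and the list-of-token-lists layout is an assumption, since the internal structure of the published pickle files is not stated here. The DSI_codes repository remains the reference for actual usage.

```python
import io
import pickle
import string


def preprocess(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace.

    Assumption: this only approximates the described preprocessing;
    the authors' exact tokenizer is not specified in the abstract.
    """
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def load_pickle(path):
    """Load a pickle file from disk (hypothetical helper).

    Unpickling can execute arbitrary code, so only use this on files
    obtained from a trusted source.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy round trip on made-up text (not the real dataset).
articles = ["Hello, World!", "Grid cells; place cells."]
tokens = [preprocess(a) for a in articles]

# Serialize to an in-memory buffer and read it back, mirroring how a
# preprocessed corpus could be stored in and restored from pickle format.
buffer = io.BytesIO()
pickle.dump(tokens, buffer)
buffer.seek(0)
restored = pickle.load(buffer)
```

With a downloaded file, something like `data = load_pickle("enwiki_processed_pickle")` would return whatever object the authors serialized; inspecting `type(data)` first is a sensible way to learn its layout before relying on the assumed structure.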
