ZENODO · Software · 2020 · Data source: Datacite

huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline

Authors: Wolf, Thomas; Debut, Lysandre; Chaumond, Julien; Sanh, Victor; von Platen, Patrick; Augustin, Aymeric; Louf, Rémi; +23 Authors

Abstract

T5 Model (@patrickvonplaten, @thomwolf)

T5 is a powerful encoder-decoder model that casts every NLP problem into a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...). Five sets of pre-trained weights, pre-trained on a multi-task mixture of unsupervised and supervised tasks, are released. In ascending order, from 60 million to 11 billion parameters: t5-small, t5-base, t5-large, t5-3b, t5-11b. T5 can now be used with the translation and summarization pipelines (a usage sketch follows these notes).

Related: paper, official code, model available in Hugging Face's community models, docs.

Big thanks to the original authors, especially @craffel, who helped answer our questions, reviewed PRs, and tested T5 extensively.

New BART checkpoint: bart-large-xsum (@sshleifer)

These weights come from BART fine-tuned on the XSum abstractive summarization challenge, which encourages shorter (more abstractive) summaries. The checkpoint achieves state-of-the-art results on that task (a summarization sketch follows these notes).

BART summarization example with pytorch-lightning (@acarrera94)

New example: BART for summarization, using pytorch-lightning. It trains on CNN/DailyMail and evaluates.

Translation pipeline (@patrickvonplaten)

A new pipeline is available, leveraging the T5 model. The T5 model was added to the summarization pipeline as well (see the pipeline sketch after these notes).

Memory improvements with BART (@sshleifer)

To reduce the memory footprint and the computing power necessary to run inference with BART, several improvements have been made to the model:
• Remove the LM head and use the embedding matrix instead (~200 MB)
• Call the encoder before expanding input_ids (~1 GB)
• SelfAttention only returns weights if config.output_attentions is set (~500 MB)
• Use two separate, smaller decoder attention masks (~500 MB)
• Drop columns that consist exclusively of pad_token_id from input_ids in the evaluate_cnn example (a sketch of this trick follows these notes)

New model: XLMForTokenClassification (@sakares)

A new head was added to XLM: XLMForTokenClassification (see the sketch below).
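As a minimal usage sketch of the text-to-text setup described above, assuming the v2.7-era transformers API with PyTorch installed (class names and generate() defaults may differ in other versions):

```python
# Minimal sketch: using a released T5 checkpoint for a text-to-text task.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 casts every task as text-to-text via a task prefix, e.g. translation:
input_ids = tokenizer.encode(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```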
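The new translation pipeline and the T5-backed summarization pipeline can be sketched as follows; the task names match this release, while the example inputs and generation arguments are illustrative:

```python
from transformers import pipeline

# New translation pipeline, backed by T5.
translator = pipeline("translation_en_to_de")
print(translator("The house is wonderful.", max_length=40))

# T5 was also added to the existing summarization pipeline.
summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small")
print(summarizer("A long article to condense ...", max_length=60))
```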
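A hedged sketch of abstractive summarization with the new checkpoint; the identifier "bart-large-xsum" matches this release, while later library versions host it as "facebook/bart-large-xsum":

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("bart-large-xsum")
model = BartForConditionalGeneration.from_pretrained("bart-large-xsum")

article = "A long news article to summarize ..."
inputs = tokenizer.batch_encode_plus([article], return_tensors="pt")
# XSum fine-tuning encourages short, highly abstractive summaries.
summary_ids = model.generate(
    inputs["input_ids"], num_beams=4, max_length=60, early_stopping=True
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```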
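The last memory improvement, dropping all-padding columns before inference, can be illustrated with a small helper; this is a reconstruction of the idea, not the exact code from the evaluate_cnn example:

```python
import torch

def trim_pad_columns(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Drop columns that contain only pad_token_id (illustrative helper)."""
    # Keep a column if at least one row has a non-pad token there,
    # shrinking the effective sequence length passed to the model.
    keep = (input_ids != pad_token_id).any(dim=0)
    return input_ids[:, keep]

batch = torch.tensor([[5, 6, 1, 1],
                      [7, 1, 1, 1]])  # 1 = pad_token_id in this toy example
print(trim_pad_columns(batch, pad_token_id=1))
# tensor([[5, 6],
#         [7, 1]])
```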
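A sketch of the new XLMForTokenClassification head; the checkpoint name is a standard XLM release, while num_labels=9 is a hypothetical tag count for a NER-style task:

```python
import torch
from transformers import XLMForTokenClassification, XLMTokenizer

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
# num_labels=9 is an assumed label count, e.g. for CoNLL-style NER.
model = XLMForTokenClassification.from_pretrained("xlm-mlm-en-2048", num_labels=9)

input_ids = tokenizer.encode("Hello from New York", return_tensors="pt")
logits = model(input_ids)[0]  # tuple outputs in this era: (batch, seq_len, num_labels)
predictions = torch.argmax(logits, dim=-1)
print(predictions)
```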

Impact indicators (BIP!):
• citations: 0 (an alternative to "influence"; also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
• popularity: Average (reflects the "current" impact/attention, the "hype", of an article in the research community at large, based on the underlying citation network)
• influence: Average (reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
• impulse: Average (reflects the initial momentum of an article directly after its publication, based on the underlying citation network)

Usage (OpenAIRE UsageCounts):
• views: 120
• downloads: 4