## T5 Model (@patrickvonplaten, @thomwolf)

T5 is a powerful encoder-decoder model that frames every NLP problem in a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).

Five sets of pre-trained weights (pre-trained on a multi-task mixture of unsupervised and supervised tasks) are released. In ascending order from 60 million to 11 billion parameters: t5-small, t5-base, t5-large, t5-3b, t5-11b.

T5 can now be used with the translation and summarization pipelines (see the usage sketch below).

Related: paper, official code, model available in Hugging Face's community models, docs.

Big thanks to the original authors, especially @craffel, who helped answer our questions, reviewed PRs and tested T5 extensively.

## New BART checkpoint: bart-large-xsum (@sshleifer)

These weights are from BART fine-tuned on the XSum abstractive summarization challenge, which encourages shorter (more abstractive) summaries. It achieves state-of-the-art results on that task.

## BART summarization example with pytorch-lightning (@acarrera94)

New example: BART for summarization, using PyTorch Lightning. It trains on CNN/DailyMail and evaluates.

## Translation pipeline (@patrickvonplaten)

A new translation pipeline is available, leveraging the T5 model. The T5 model was added to the summarization pipeline as well.

## Memory improvements with BART (@sshleifer)

Several improvements were made to the model to reduce the memory footprint and computing power necessary to run inference on BART:

- Remove the LM head and use the embedding matrix instead (~200MB)
- Call the encoder before expanding input_ids (~1GB)
- SelfAttention only returns weights if config.output_attentions (~500MB)
- Two separate, smaller decoder attention masks (~500MB)
- Drop columns that are exclusively pad_token_id from input_ids in the evaluate_cnn example

## New model: XLMForTokenClassification (@sakares)

A new head was added to XLM: XLMForTokenClassification (see the sketch below).
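As a quick illustration of the new pipelines, here is a minimal sketch assuming transformers >= 2.7.0 is installed; the checkpoint names (t5-base, bart-large-xsum) are taken from the notes above, and the example text and generation parameters are arbitrary.

```python
from transformers import pipeline

ARTICLE = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and is the tallest structure in Paris."
)

# Summarization: T5 checkpoints can now back this pipeline alongside BART.
t5_summarizer = pipeline("summarization", model="t5-base", tokenizer="t5-base")
print(t5_summarizer(ARTICLE, max_length=40, min_length=10))

# The new bart-large-xsum checkpoint yields shorter, more abstractive summaries.
xsum_summarizer = pipeline("summarization", model="bart-large-xsum")
print(xsum_summarizer(ARTICLE, max_length=40, min_length=10))

# Translation: the new pipeline leverages T5 (English -> German here).
translator = pipeline("translation_en_to_de", model="t5-base", tokenizer="t5-base")
print(translator("How old are you?"))
```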
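Similarly, a minimal sketch of the new XLM token-classification head; the checkpoint name xlm-mlm-en-2048 and num_labels=9 are illustrative assumptions, and the classification head is randomly initialized until fine-tuned.

```python
import torch
from transformers import XLMTokenizer, XLMForTokenClassification

# xlm-mlm-en-2048 is just one example XLM checkpoint; num_labels is task-dependent.
tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
model = XLMForTokenClassification.from_pretrained("xlm-mlm-en-2048", num_labels=9)

# Tokenize a sentence and predict a label for every token.
input_ids = tokenizer.encode("Hugging Face is based in New York City", return_tensors="pt")
with torch.no_grad():
    logits = model(input_ids)[0]  # shape: (batch, sequence_length, num_labels)
predictions = logits.argmax(dim=-1)
print(predictions)
```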
| Indicator | Description | Value |
| --- | --- | --- |
| citations | This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 120 |
| downloads | | 4 |

Views and downloads provided by UsageCounts.