ZENODO
Software . 2019
Data sources: Datacite
Data sources: ZENODO

huggingface/transformers: ALBERT, CamemBERT, DistilRoberta, GPT-2 XL, and Encoder-Decoder architectures

Authors: Thomas Wolf; Lysandre Debut; Victor SANH; Julien Chaumond; Rémi Louf; Denis; erenup; +23 Authors

Abstract

New model architectures: ALBERT, CamemBERT, GPT2-XL, DistilRoberta

Four new models have been added in v2.2.0:
  • ALBERT (PyTorch & TF), from Google Research and the Toyota Technological Institute at Chicago, released with the paper "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
  • CamemBERT (PyTorch), from Facebook AI Research, INRIA, and La Sorbonne Université, as the first large-scale Transformer language model for French. Released alongside the paper "CamemBERT: a Tasty French Language Model" by Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suarez, Yoann Dupont, Laurent Romary, Eric Villemonte de la Clergerie, Djame Seddah, and Benoît Sagot. It was added by @louismartin with the help of @julien-c.
  • DistilRoberta (PyTorch & TF), from @VictorSanh, as the third distilled model after DistilBERT and DistilGPT-2.
  • GPT-2 XL (PyTorch & TF), as the last GPT-2 checkpoint released by OpenAI.

Encoder-Decoder architectures

It is now possible to create full seq2seq models with Encoder-Decoder architectures, using a PreTrainedEncoderDecoder class that can be initialized from pre-trained models. The base BERT class has been modified so that it can behave as a decoder. In addition, a Model2Model class has been added that simplifies the definition of an encoder-decoder when both the encoder and the decoder are based on the same model. @rlouf

Benchmarks and performance improvements

Work by @tlkh and @LysandreJik benchmarks the library's models with different technologies: TensorFlow and PyTorch, mixed precision (AMP and FP-16), and model tracing (TorchScript and XLA). A new "benchmarks" section in the documentation points to Google Sheets with the results.

Breaking changes

Tokenizers now add special tokens by default. @LysandreJik

New model templates

Model templates that ease the addition of new models to the library have been added. @thomwolf

Inputs Embeddings

A new input has been added to all models' forward (for PyTorch) and call (for TensorFlow) methods. These inputs_embeds are a direct embedded representation of the input. This gives more control over how input_ids indices are converted into their associated vectors than the model's internal embedding lookup matrix allows. @julien-c

Getters and setters for input and output embeddings

A new API for the input and output embeddings is available. These methods are model-independent and allow easy acquisition and modification of the models' embeddings. @thomwolf

Additional architectures

New model architectures are available, namely DistilBertForTokenClassification and CamembertForTokenClassification. @stefan-it

Community additions/bug-fixes/improvements

  • The Fairseq RoBERTa model conversion script has been patched. @louismartin
  • einsum now runs in FP-16 in the library's examples. @slayton58
  • In-depth work on the squad script for XLNet to reproduce the original paper's results. @hlums
  • Additional improvements on the run_squad script by @WilliamTambellini, @orena1
  • The run_generation script has seen several improvements by @leo-du
  • The RoBERTa TensorFlow model has been patched for several use-cases: TPU and keras.fit. @LysandreJik
  • The documentation is now versioned; links are available on the GitHub readme. @LysandreJik
  • The run_ner script has seen several improvements. @mmaybeno, @oneraghavan, @manansanghi
  • The run_tf_glue script now works for all GLUE tasks. @LysandreJik
  • The run_lm_finetuning script now correctly evaluates perplexity on MLM tasks. @altsoph
  • An issue related to the training of the XLM TensorFlow implementation has been fixed. @tlkh
  • run_bertology has been updated to be closer to the run_glue example. @adrianbg
  • Added special tokens in decoded sequences have been fixed. @LysandreJik
  • Several performance improvements have been made to the tokenizers. @iedmrc
  • A memory leak has been identified and patched in the library's schedulers. @rlouf
  • The warning issued when encoding a sequence that is too long while specifying a maximum length has been corrected. @LysandreJik
  • Resizing the token embeddings now works as expected in the run_lm_finetuning script. @iedmrc
  • The difference in versions between PyPI and source needed to run the examples has been clarified. @rlouf
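
The inputs_embeds input and the embedding getters and setters described in the notes above can be illustrated with a toy model. This is only a sketch of the idea, not the library's implementation: ToyModel, its mean-pooling "network", and the tiny two-dimensional table are all hypothetical, and the method names merely echo the getters and setters the notes mention. It shows that a caller can either pass input_ids and let the model do the lookup, or fetch the table, build the vectors itself, and pass them directly.

```python
from typing import List, Optional

Vector = List[float]


class ToyModel:
    """Hypothetical stand-in for a model with an internal embedding lookup table."""

    def __init__(self, embedding_table: List[Vector]) -> None:
        self.embedding_table = embedding_table

    def get_input_embeddings(self) -> List[Vector]:
        # Model-independent getter (name assumed for this sketch).
        return self.embedding_table

    def set_input_embeddings(self, table: List[Vector]) -> None:
        # Model-independent setter (name assumed for this sketch).
        self.embedding_table = table

    def forward(
        self,
        input_ids: Optional[List[int]] = None,
        inputs_embeds: Optional[List[Vector]] = None,
    ) -> Vector:
        # Exactly one of the two inputs must be given.
        if (input_ids is None) == (inputs_embeds is None):
            raise ValueError("Specify exactly one of input_ids or inputs_embeds")
        if inputs_embeds is None:
            # Default path: look each index up in the internal table.
            inputs_embeds = [self.embedding_table[i] for i in input_ids]
        # Placeholder for the rest of the network: mean-pool over the sequence.
        length = len(inputs_embeds)
        dim = len(inputs_embeds[0])
        return [sum(vec[d] for vec in inputs_embeds) / length for d in range(dim)]


model = ToyModel([[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]])

# Path 1: the model converts indices to vectors itself.
from_ids = model.forward(input_ids=[1, 2])

# Path 2: the caller controls the conversion, here halving the second vector.
table = model.get_input_embeddings()
custom_embeds = [table[1], [value * 0.5 for value in table[2]]]
from_embeds = model.forward(inputs_embeds=custom_embeds)
```

Passing the unmodified looked-up vectors as inputs_embeds gives the same result as passing the indices, which is the property that makes the second path a drop-in alternative.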
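
The breaking change noted above, tokenizers adding special tokens by default, matters for any code that added them by hand. A minimal sketch of the behavior follows, assuming a BERT-style [CLS] ... [SEP] layout; the vocabulary, the whitespace splitting, and this encode function are hypothetical and stand in for a real tokenizer.

```python
from typing import Dict, List

# Hypothetical vocabulary; the token strings and IDs are for illustration only.
VOCAB: Dict[str, int] = {"[CLS]": 0, "[SEP]": 1, "hello": 2, "world": 3}


def encode(text: str, add_special_tokens: bool = True) -> List[int]:
    """Toy encoder: whitespace tokenization plus optional special tokens."""
    ids = [VOCAB[token] for token in text.lower().split()]
    if add_special_tokens:
        # Default behavior: wrap the sequence in [CLS] ... [SEP].
        ids = [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]
    return ids


with_special = encode("hello world")
without_special = encode("hello world", add_special_tokens=False)
```

Code that previously wrapped sequences manually would now get the special tokens twice unless it opts out, which is why the change is listed as breaking.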

Impact indicators (BIP!)
  • Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator.
  • Popularity: Average. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
  • Influence: Average. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
  • Impulse: Average. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Usage (OpenAIRE UsageCounts)
  • Views: 14