Publication: Article, Other literature type. 14 Apr 2021. English. Publisher: Springer Science and Business Media LLC. Journal: Machine Learning, volume 110, pages 989-1028 (ISSN: 0885-6125, eISSN: 1573-0565)

Authors: Matej Martinc; Blaž Škrlj; Nada Lavrač; Senja Pollak

doi: 10.1007/s10994-021-05968-x

pmid: 34720391

pmc: PMC8550026

autoBOT: evolving neuro-symbolic representations for explainable low resource text classification

Abstract

Learning from texts has been widely adopted throughout industry and science. While state-of-the-art neural language models have shown very promising results for text classification, they are expensive to (pre-)train and require large amounts of data and the tuning of hundreds of millions of parameters or more. This paper explores how automatically evolved text representations can serve as the basis for an explainable, low-resource branch of models with competitive performance that are subject to automated hyperparameter tuning. We present autoBOT (automatic Bags-Of-Tokens), an autoML approach suitable for low-resource learning scenarios, where both the hardware and the amount of data available for training are limited. The proposed approach consists of an evolutionary algorithm that jointly optimizes various sparse representations of a given text (including word, subword, POS tag, keyword-based, knowledge graph-based and relational features) and two types of document embeddings (non-sparse representations). The key idea of autoBOT is that, instead of evolving at the learner level, evolution is conducted at the representation level. The proposed method offers competitive classification performance on fourteen real-world classification tasks when compared against a competitive autoML approach that evolves ensemble models, as well as against state-of-the-art neural language models such as BERT and RoBERTa. Moreover, the approach is explainable, because the importance of each part of the input space is part of the final solution yielded by the proposed optimization procedure, which also offers potential for meta-transfer learning.
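To make the idea of evolving at the representation level more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes only two feature spaces (word tokens and character trigrams standing in for subword features), a 1-nearest-neighbour cosine classifier as the fixed learner, and a very simple evolutionary loop with elitism, uniform crossover and Gaussian mutation. All function names, parameters and the tiny data set are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter

def word_features(text):
    """Bag of lowercase word tokens."""
    return Counter(text.lower().split())

def char_features(text, n=3):
    """Bag of character n-grams (a stand-in for subword features)."""
    t = text.lower()
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))

# Hypothetical, toy-sized set of feature spaces (the paper uses many more).
FEATURE_SPACES = [word_features, char_features]

def represent(text, weights):
    """Concatenate all feature spaces, scaling each one by its weight."""
    vec = {}
    for idx, (extract, w) in enumerate(zip(FEATURE_SPACES, weights)):
        for token, count in extract(text).items():
            vec[(idx, token)] = w * count
    return vec

def cosine(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def fitness(weights, train, valid):
    """Validation accuracy of a 1-nearest-neighbour classifier (an assumption)."""
    train_vecs = [(represent(t, weights), y) for t, y in train]
    hits = 0
    for text, label in valid:
        v = represent(text, weights)
        pred = max(train_vecs, key=lambda tv: cosine(v, tv[0]))[1]
        hits += pred == label
    return hits / len(valid)

def evolve(train, valid, pop_size=8, generations=10, seed=0):
    """Evolve one weight per feature space; the learner itself stays fixed."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in FEATURE_SPACES] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda w: fitness(w, train, valid), reverse=True)
        parents = ranked[:pop_size // 2]          # elitism: keep the best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1))) for g in child]  # mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda w: fitness(w, train, valid))
    return best, fitness(best, train, valid)

# Made-up example data, for illustration only.
train = [
    ("great movie loved it", "pos"), ("wonderful acting and story", "pos"),
    ("really enjoyable film", "pos"), ("terrible film hated it", "neg"),
    ("boring plot and bad acting", "neg"), ("awful waste of time", "neg"),
]
valid = [
    ("loved the story", "pos"), ("enjoyable and great", "pos"),
    ("hated the boring plot", "neg"), ("bad and awful", "neg"),
]

best, acc = evolve(train, valid)
print("evolved feature-space weights:", best)
print("validation accuracy:", acc)
```

In this sketch the evolved weight vector doubles as a rough explanation: a feature space whose weight is driven towards zero contributes little to the final representation, which mirrors the abstract's point that the importance of parts of the input space is part of the solution.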

Related Organizations

University of Nova Gorica
Slovenia
Jožef Stefan International Postgraduate School
Slovenia

Keywords

Article

Related research

12 research products, page 1 of 2

- AutoBoT: Resilient and Cost-Effective Scheduling of a Bag of Tasks on Spot VMs (2019, IsAmongTopNSimilarDocuments)
- Autobot for Effective Design Space Exploration and Agile Generation of RBFNN Hardware Accelerator in Embedded Real-time Computing (2020, IsAmongTopNSimilarDocuments)
- Vendor Selection for an Autobot System for VDK Gloves Manufacturing Company Using Fuzzy Analytical Hierarchy Process (2022, IsAmongTopNSimilarDocuments)
- A study on firms’ communication based on artificial intelligence and its influence on customers’ complaint behavior in Social media environment (2021, IsAmongTopNSimilarDocuments)
- Automation Tools for Invenio (2020, IsAmongTopNSimilarDocuments)
- A Review on Smart Autobot in Building Eradication Using WSN Technology (2020, IsAmongTopNSimilarDocuments)
- Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction (2021, IsAmongTopNSimilarDocuments)
- simpletransformers software on GitHub (IsRelatedTo)
- hate-speech-dataset software on GitHub (IsRelatedTo)
- conceptnet5 software on GitHub (IsRelatedTo)

Impact by BIP!

- citations: 15. An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.