Berri Corpus Manager: A Corpus Analysis Tool Using MongoDB Technology

Name: Berri Corpus Manager: A Corpus Analysis Tool Using MongoDB Technology
Creator: Hugo Sanjurjo-González
Keywords: 0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

- https://doi.org/10.3233/faia20... : Part of book or chapter of book, 2020, peer-reviewed. License: CC BY NC. Data sources: Crossref
- https://ebooks.iospress.nl/pdf... : Part of book or chapter of book. License: CC BY NC. Data sources: UnpayWall
- mEDRA: Part of book or chapter of book, 2020. Data sources: mEDRA
- ResearchGate Data: Conference object, 2020. Data sources: Datacite
- DBLP: Conference object, 2021. Data sources: DBLP

Publication: Part of book or chapter of book, Conference object. 15 Sep 2020. Publisher: IOS Press

Authors: Hugo Sanjurjo-González

doi: 10.3233/faia200619, 10.13140/rg.2.2.36594.73928

Abstract

There are now many options for corpus linguistic analysis, and they take different approaches to corpus storage. Some tools are based on SQL databases, some are dedicated implementations such as CQP/CWB, and others work on plain-text corpora. NoSQL databases have been widely used for big data, data mining and even sentiment analysis. However, to our knowledge, there is no widely used concordancer or consolidated framework that uses the MongoDB architecture for corpus linguistics. This paper describes the architecture of a software tool that allows users to analyse monolingual and bilingual parallel corpora with grammatical annotation using MongoDB technology. Our premises are that MongoDB is well suited to unstructured data and provides high flexibility and scalability, so it may also be useful for corpus linguistic research. We analyse MongoDB functionalities such as text search indexes and the query format in order to examine its suitability.
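To make the two MongoDB features named in the abstract more concrete, the following is a minimal sketch of how an annotated sentence might be stored and queried. It is not the schema or code of Berri Corpus Manager: the field names (`corpus`, `sent_id`, `text`, `tokens`, `word`, `lemma`, `pos`) and the helper functions are hypothetical and chosen only for illustration. The sketch builds plain Python dictionaries and does not connect to a server; `$text`, `$search` and `$elemMatch` are standard MongoDB query operators.

```python
from typing import Any

# Index specification in the (field, type) form that pymongo's
# create_index accepts; "text" requests a MongoDB text index.
TEXT_INDEX = [("text", "text")]


def sentence_document(corpus: str, sent_id: int,
                      tokens: list[tuple[str, str, str]]) -> dict[str, Any]:
    """Build one document per sentence (hypothetical schema).

    Each token is a (word, lemma, part-of-speech) triple.
    """
    return {
        "corpus": corpus,
        "sent_id": sent_id,
        # Raw sentence text, the field a text index would cover.
        "text": " ".join(word for word, _, _ in tokens),
        # Grammatical annotation kept as an array of sub-documents.
        "tokens": [{"word": w, "lemma": l, "pos": p} for w, l, p in tokens],
    }


def text_search_filter(phrase: str) -> dict[str, Any]:
    """Phrase query via the $text operator (needs a text index)."""
    return {"$text": {"$search": f'"{phrase}"'}}


def annotation_filter(lemma: str, pos: str) -> dict[str, Any]:
    """Match sentences with a single token having this lemma and POS tag."""
    return {"tokens": {"$elemMatch": {"lemma": lemma, "pos": pos}}}


doc = sentence_document(
    "demo", 1,
    [("The", "the", "DET"), ("cats", "cat", "NOUN"), ("sleep", "sleep", "VERB")],
)
```

With a running server and pymongo, these values could be passed to `collection.insert_one(doc)`, `collection.create_index(TEXT_INDEX)` and `collection.find(annotation_filter("cat", "NOUN"))`. Storing the annotation as an array of sub-documents is one possible design; a token-per-document layout is another, and the record above does not say which the paper uses.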

Related Organizations

University of Deusto
Spain

Impact by BIP!

selected citations: 0
popularity: Average
influence: Average
impulse: Average

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering