Downloads provided by UsageCounts
Abstract The paper presents an alternative for identifying languages on Twitter without relying on training sets or aggregated information. The approach uses techniques based on trigram recognition and small-words algorithms. These algorithms are evaluated both on their own and in a composite model. The effect of tweet pre-processing on the accuracy of language identification is also analyzed. Finally, following a process of experimentation, the best of the alternatives studied is determined.
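The composite idea described in the abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the word lists, trigram sets, equal weights, pre-processing rules, and the function names `preprocess`, `small_word_score`, `trigram_score`, and `identify` are all hypothetical, since this record does not give the actual resources or scoring used.

```python
import re

# Hypothetical, hand-picked resources for illustration only; the paper's
# actual small-word lists and trigram profiles are not given in this record.
SMALL_WORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "for"},
    "es": {"el", "la", "de", "que", "y", "en", "los"},
}
TRIGRAMS = {
    "en": {"the", "ing", "and", "ion", " th", "he "},
    "es": {"que", "ent", "ión", " de", "de ", "os "},
}


def preprocess(tweet):
    """Lowercase and drop URLs, @mentions, #hashtags and punctuation."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[@#]\w+", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def small_word_score(text, lang):
    """Fraction of tokens that appear in the language's small-word list."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(t in SMALL_WORDS[lang] for t in tokens) / len(tokens)


def trigram_score(text, lang):
    """Fraction of character trigrams found in the language's trigram set."""
    padded = f" {text} "
    grams = [padded[i:i + 3] for i in range(len(padded) - 2)]
    if not grams:
        return 0.0
    return sum(g in TRIGRAMS[lang] for g in grams) / len(grams)


def identify(tweet, w_small=0.5, w_tri=0.5):
    """Combine both scores (assumed equal weights); None if nothing matches."""
    text = preprocess(tweet)
    scores = {
        lang: w_small * small_word_score(text, lang)
        + w_tri * trigram_score(text, lang)
        for lang in SMALL_WORDS
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Setting `w_small` or `w_tri` to zero gives each technique on its own, which mirrors the comparison between standalone and composite use mentioned above. Skipping `preprocess` would let URLs and mentions contribute noise trigrams, which is the kind of effect the pre-processing analysis concerns.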
Language detection; n-grams; trigrams; small words; Twitter; HD28-70; Management. Industrial management; JEL: D81, M1, O3, O31, M15, O32, D8
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| Views | | 1 |
| Downloads | | 2 |

Views provided by UsageCounts