
handle: 10045/3019
La mayoría de los sistemas previos para construir un diccionario bilingüe a partir de un corpus paralelo dependen de un algoritmo iterativo, usando probabilidades de traducción de palabras para alinear palabras en el corpus y sus alineamientos para estimar probabilidades de traducción, repitiendo hasta la convergencia. Si bien este enfoque produce resultados razonables, es computacionalmente lento, limitando el tamaño del corpus que se puede analizar y el del diccionario producido. Nosotros proponemos una aproximación no iterativa para producir un diccionario bilingüe unidireccional que, si bien menos precisa que las aproximaciones iterativas, es mucho más rápida, permitiendo procesar córpora mayores en un tiempo razonable. Asimismo, permite una estimación en tiempo real de la probabilidad de traducción de un par de términos, lo que significa que permite obtener un diccionario de traducción con los n términos más frecuentes, y calcular las probabilidades de traducción de términos infrecuentes cuando se encuentren en documentos reales.
Most previous systems for constructing a bilingual dictionary from a parallel corpus have depended on an iterative algorithm, using word translation probabilities to align words in the corpus, and using word alignments to estimate word translation probabilities, and repeating until convergence. While this approach produces reasonable results, it is computationally slow, limiting the size of the corpus that can be analysed and the size of the dictionary produced. We propose a non-iterative approach for producing a uni-directional bilingual dictionary which, while less accurate than iterative approaches, is far quicker, allowing larger corpora to be processed in reasonable time. The approach also allows on-the-fly estimation of translation likelihoods between a pair of terms, meaning that a translation dictionary can be generated with the n most frequent terms in an initial pass, and the translation likelihood of infrequent terms can be calculated as encountered in real documents.
Diccionarios bilingües, Statistical machine translation, Word-to-word models, Modelos palabra-a-palabra, Traducción automática estadística, Bilingual dictionaries
Diccionarios bilingües, Statistical machine translation, Word-to-word models, Modelos palabra-a-palabra, Traducción automática estadística, Bilingual dictionaries
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
