Powered by OpenAIRE graph
RUC. Repositorio da ...

Síntesis de voz de alta calidad para el idioma gallego (High-Quality Speech Synthesis for the Galician Language)

Authors: Adega Fernández, Enrique

Abstract

[Abstract]: Text-to-speech (TTS) conversion has seen remarkable progress in recent years, driven by new techniques based on deep learning. These advances have made synthetic voices increasingly natural, expressive, and faithful to the characteristics of human speech. However, most of this work focuses on widely spoken languages, leaving behind minority languages that lack the resources needed to train state-of-the-art models. In this context, this project aims to apply the latest advances in TTS to Galician, facilitating the creation of high-quality voices that respect the language's prosody and sound patterns. It also explores voice cloning techniques that can accurately reproduce the characteristics of individual speakers, even from limited samples. In this way, it helps enrich the technological tools available for Galician and promotes the language's use in applications such as accessibility, education, dubbing, and content generation. The project addresses both theoretical foundations and practical aspects, describing the methods and technologies used, how they work, and the rationale for choosing them. It also includes a market study analyzing the viability of the developed library, along with a detailed project plan with cost estimates and execution timelines. Finally, the corresponding chapters describe the methodologies applied in developing both the intelligent system and the library itself, together with the complete process of design, experimentation, implementation, and validation.

Bachelor's thesis (UDC.FIC). Computer Engineering. Academic year 2024/2025

Country
Spain
Keywords

Minority languages, Deep learning, Voice cloning, Zero-shot voice cloning, Text-to-speech conversion, Speech synthesis, Galician, SparkTTS, XTTS

  • BIP! impact indicators
    selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Open Access route: Green