Samodejno tvorjenje govora iz besedil. Postopek za izdelavo sintetizatorja slovenskega govora

Jerneja Gros

doi:10.3986/9616358219

https://orcid.org/0000-0001-5011-8486

Text-to-speech (TTS) synthesis automatically converts any available textual information into spoken form. The Slovene TTS system has been implemented in several speech technology applications: as a stand-alone TTS system, in a speech recognition and dialog system, in a reading system for blind people, and in a TTS web server.

The input to the TTS system is unrestricted Slovene text, which a series of modules transforms into its spoken equivalent. A grapheme-to-phoneme module produces strings of phonetic symbols: it first converts special formats into standard graphemic strings and then derives word pronunciation from a pronunciation dictionary and letter-to-sound rules. A prosodic generator assigns pitch and duration values to individual phones, using parameters derived from prosodic analysis of a large amount of recorded Slovene speech. The final speech synthesis is based on diphone concatenation. The adequacy of the spoken output was evaluated with subjective tests, as recommended by the International Telecommunication Union.
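
To make the dictionary-plus-rules lookup and the diphone step more concrete, the following is a minimal Python sketch and not the system described in the book. The lexicon entry, the phone symbols, the one-letter fallback table, and the function names `to_phones` and `to_diphones` are all hypothetical, chosen only for illustration; real letter-to-sound rules for Slovene are context dependent and far richer.

```python
from typing import Dict, List

# Hypothetical toy pronunciation dictionary: word -> list of phone symbols.
LEXICON: Dict[str, List[str]] = {
    "dan": ["d", "a", "n"],
}

# Hypothetical fallback table mapping single letters to phone symbols.
# Letters not listed here are passed through unchanged.
LETTER_TO_PHONE: Dict[str, str] = {"č": "tS", "š": "S", "ž": "Z"}


def to_phones(word: str) -> List[str]:
    """Look the word up in the dictionary, else apply the letter fallback."""
    word = word.lower()
    if word in LEXICON:
        return list(LEXICON[word])
    return [LETTER_TO_PHONE.get(ch, ch) for ch in word if ch.isalpha()]


def to_diphones(phones: List[str], silence: str = "_") -> List[str]:
    """Pad the phone sequence with silence and list adjacent phone pairs."""
    padded = [silence] + phones + [silence]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


if __name__ == "__main__":
    print(to_diphones(to_phones("dan")))
```

In a concatenative synthesizer, each label in the resulting list would select a recorded unit from a diphone inventory, and the prosodic generator's pitch and duration values would then be applied to those units. That selection and signal-processing stage is not shown here.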