SU1683063A1

SU1683063A1 - Method of compilatory speech synthesis and device thereof

Info

Publication number: SU1683063A1
Application number: SU884459706A
Authority: SU
Inventors: Борис Мефодьевич Лобанов
Original assignee: Институт Технической Кибернетики Ан Бсср
Priority date: 1988-07-14
Filing date: 1988-07-14
Publication date: 1991-10-07

Description

(21) 4459706/10 (22) 14.07.88 (46) 07.10.91. Bulletin No. 37

(71) Institute of Technical Cybernetics of the Academy of Sciences of the BSSR

(72) B.M. Lobanov

(53) 534.78 (088.8)

(56) U.S. Patent No. 4398059, cl. G 10 L 5/00, 1983.

USSR Author's Certificate No. 1599888, cl. G 10 L 5/02, 1990.

(54) METHOD OF COMPILATORY SPEECH SYNTHESIS AND DEVICE FOR ITS IMPLEMENTATION

(57) The invention relates to speech informatics and can be used to compress the information used in synthesizing continuous speech from text. The aim of the invention is to compress the stored information and to simplify the device. Text entered into text processor 1 is converted into a sequence of diphone syllables and individual sounds. The parameters of the sounds and of the transitions between them are stored in advance in read-only memories 2 and 3 and are read out, as the text is converted into sounds, by output digital-to-analog converter 7, which is implemented as a formant sound interface loaded by electroacoustic unit 8. 2 independent claims, 1 drawing.

The invention relates to speech informatics and instrument engineering and is intended for synthesizing speech messages from text in systems for acoustic communication between a human and automatic devices.

The aim of the invention is to compress the information stored in advance and to simplify the device.

During storage, the diphone transitions are encoded by the coarticulation time constants of these transitions in the speaker's speech. During playback, the current parameters of the sounds, which are reconstructed by the formant interface, are set by linearly combining the parameters corresponding to the current, preceding, and following phonemes obtained by sequential transcription of the text. The results of the linear combination are smoothed over an interval whose duration does not exceed the duration of the following phoneme.
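The combination and smoothing steps can be illustrated with a minimal Python sketch. The description gives no formulas, so the weights `w_prev` and `w_next`, the first-order exponential smoothing law, the 5 ms frame step, the function names, and the formant values are all assumptions made for illustration. Only the overall scheme comes from the text: a linear combination of three phoneme targets, then smoothing governed by a coarticulation time constant over an interval no longer than the following phoneme.

```python
import math

def combine_targets(prev, cur, nxt, w_prev=0.25, w_next=0.25):
    """Linearly combine the parameters of the preceding, current, and
    following phonemes. The weights are hypothetical."""
    w_cur = 1.0 - w_prev - w_next
    return [w_prev * p + w_cur * c + w_next * n
            for p, c, n in zip(prev, cur, nxt)]

def smooth(start, target, tau_ms, interval_ms, next_duration_ms, frame_ms=5.0):
    """Move from `start` toward `target` frame by frame.

    Assumed law: first-order exponential approach with time constant tau_ms.
    The smoothing interval is clipped to the duration of the following phoneme.
    """
    interval_ms = min(interval_ms, next_duration_ms)
    n_frames = int(interval_ms // frame_ms)
    frames = []
    for k in range(1, n_frames + 1):
        a = 1.0 - math.exp(-k * frame_ms / tau_ms)
        frames.append([s + a * (t - s) for s, t in zip(start, target)])
    return frames

# Hypothetical (F1, F2) values in Hz for three consecutive phonemes.
prev, cur, nxt = [300.0, 2200.0], [700.0, 1200.0], [400.0, 2000.0]
target = combine_targets(prev, cur, nxt)
track = smooth(prev, target, tau_ms=20.0, interval_ms=80.0, next_duration_ms=60.0)
```

In this sketch a transition is described by a single number (`tau_ms`) rather than by a stored parameter track, which is the kind of saving the stated aim of compressing the stored information refers to.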

The drawing shows a block diagram of the described device for compilatory speech synthesis.

The device comprises text processor 1, read-only memories 2 and 3, buffer random-access memory 4, interpolating processor 5, driver 6, output digital-to-analog converter 7, and electroacoustic unit 8. The input of the device is the input of text processor 1, which is connected to read-only memories 2 and 3 and to buffer random-access memory 4. Interpolating processor 5 and driver 6 are connected to buffer random-access memory 4, which is connected through output digital-to-analog converter 7, implemented as a formant sound interface, to electroacoustic unit 8.
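The connections listed above can be written down as a small graph, sketched here in Python. The block names are hypothetical identifiers built from the reference numerals, and the links are treated as undirected because the description does not state a signal direction for every connection.

```python
from collections import deque

# Links named in the description; direction is not modelled (an assumption).
LINKS = [
    ("text_processor_1", "rom_2"),
    ("text_processor_1", "rom_3"),
    ("text_processor_1", "buffer_ram_4"),
    ("interpolating_processor_5", "buffer_ram_4"),
    ("driver_6", "buffer_ram_4"),
    ("buffer_ram_4", "formant_dac_7"),
    ("formant_dac_7", "electroacoustic_unit_8"),
]

def neighbours(block):
    """Return the blocks directly linked to `block`, sorted by name."""
    return sorted({b for a, b in LINKS if a == block} |
                  {a for a, b in LINKS if b == block})

def shortest_path(src, dst):
    """Breadth-first search over the block diagram."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_path("text_processor_1", "electroacoustic_unit_8")
```

Here `path` runs from the text processor through the buffer memory and the formant converter to the electroacoustic unit, which matches the signal route the description gives from input to output.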

The text entered into the text processor is converted by it into a sequence of diphone syllables and individual sounds. The formant parameters of the diphone elements and sounds are transferred from read-only memory 2 to buffer memory 4, into which the coarticulation time constants and the sound durations are also fetched from read-only memory 3. These data are needed to combine and smooth the parameters of the sounds, which imitate natural continuous speech thanks to this text- and sound-dependent variation in tempo and in the smoothness of the transitions from sound to sound and from word to word.
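The transfer from the two read-only memories into the buffer can be sketched as a table lookup in Python. The table layout, the record type `BufferEntry`, the function `fill_buffer`, and every numeric value below are invented placeholders, not data from the patent. The sketch only mirrors the division described above: formant parameters in memory 2, time constants and durations in memory 3.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical contents of read-only memory 2: formant targets (Hz) per sound.
ROM2_FORMANTS = {"m": (250.0, 1100.0), "a": (700.0, 1200.0), "i": (300.0, 2200.0)}

# Hypothetical contents of read-only memory 3: coarticulation time constants
# (ms) per diphone transition, and sound durations (ms).
ROM3_TAU_MS = {("m", "a"): 15.0, ("a", "m"): 20.0, ("m", "i"): 12.0}
ROM3_DURATION_MS = {"m": 60.0, "a": 120.0, "i": 100.0}

@dataclass
class BufferEntry:
    sound: str
    formants: Tuple[float, float]
    duration_ms: float
    tau_to_next_ms: Optional[float]  # None for the last sound in the sequence

def fill_buffer(sounds):
    """Collect, for each transcribed sound, the data fetched from both memories."""
    entries = []
    for i, s in enumerate(sounds):
        nxt = sounds[i + 1] if i + 1 < len(sounds) else None
        tau = ROM3_TAU_MS.get((s, nxt)) if nxt is not None else None
        entries.append(BufferEntry(s, ROM2_FORMANTS[s], ROM3_DURATION_MS[s], tau))
    return entries

buffer_4 = fill_buffer(["m", "a", "m", "i"])
```

Each buffer entry then holds what the combining and smoothing steps need: the targets of the sound, its duration, and the time constant of the transition to the next sound.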

Claims

1. A method of compilatory speech synthesis, comprising storing in advance the parameters of individual speech sounds and of the various transitions between speech sounds, and converting text into a sequence of reproducible diphones and individual sounds, characterized in that, in order to compress the stored information, during storage the diphone transitions are encoded as the coarticulation time constants of these transitions, and during playback the parameters of the reproduced sounds are set by a linear combination of the parameters corresponding to the current, preceding, and following phonemes, and the results of the combination are smoothed over an interval set by the duration of the following phoneme.

2. A device for compilatory speech synthesis, comprising a text processor connected to read-only memories and to a buffer random-access memory, which is connected through an output digital-to-analog converter to an electroacoustic unit, characterized in that, for the purpose of simplification, it includes an interpolating processor and a driver connected by corresponding buses to the buffer random-access memory, and the output digital-to-analog converter is implemented as a formant sound interface.