TW451183B - Coarticulation processing apparatus for Chinese speech synthesis - Google Patents

Coarticulation processing apparatus for Chinese speech synthesis

Info

Publication number
TW451183B
Authority
TW
Taiwan
Prior art keywords
syllables
chinese
liaison
syllable
speech
Application number
TW088120889A
Other languages
Chinese (zh)
Inventor
Jiun-Jie Guo
Original Assignee
Matsushita Electric Ind Co Ltd
Priority date
1998-12-02
Filing date
1999-11-30
Publication date
2001-08-21
Application filed by Matsushita Electric Ind Co Ltd
Application granted
Publication of TW451183B

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

By analyzing how adjacent syllables in a word string coarticulate, the present invention derives variation rules that describe how the preceding syllable changes depending on the initial phoneme and the tone of the succeeding syllable. These rules are used to build a CV-VC(VV) coarticulation processing apparatus. The apparatus finds the VC(VV) coarticulation segment from the combination of consonants and vowels in the preceding and succeeding syllables, and overlap-adds the waveforms of the preceding and succeeding syllables, so that the synthesized speech sounds natural and fluent.
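The lookup of a VC(VV) segment from the two syllables at a boundary can be illustrated with a minimal sketch. It assumes numbered Hanyu Pinyin input (such as `tai2 wan1`) and the standard list of Pinyin initials. The function names, the returned tuple, and the choice to treat syllables written with y- or w- as vowel-initial are assumptions made for illustration and are not taken from the patent.

```python
import re

# Standard Hanyu Pinyin initials; two-letter initials come first so that
# "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s")


def parse_syllable(token):
    """Split numbered Pinyin such as 'wan1' into (initial, final, tone)."""
    match = re.fullmatch(r"([a-z]+)([1-5])", token)
    if match is None:
        raise ValueError(f"not a numbered Pinyin syllable: {token!r}")
    letters, tone = match.group(1), int(match.group(2))
    for initial in INITIALS:
        if letters.startswith(initial):
            return initial, letters[len(initial):], tone
    # No consonant initial (assumption: y-/w- syllables count as vowel-initial).
    return "", letters, tone


def boundary_kind(prev_token, next_token):
    """Describe the boundary between two adjacent syllables.

    Returns (kind, vowel part of the preceding syllable,
    onset of the succeeding syllable, tone of the succeeding syllable),
    where kind is "VC" for a consonant onset and "VV" for a vowel onset.
    """
    _, prev_final, _ = parse_syllable(prev_token)
    next_initial, next_final, next_tone = parse_syllable(next_token)
    if next_initial:
        return "VC", prev_final, next_initial, next_tone
    return "VV", prev_final, next_final, next_tone


print(boundary_kind("tai2", "wan1"))
print(boundary_kind("mei3", "li4"))
```

A key such as the tuple returned here could index a table of stored segments; how the patent actually indexes its segment store is not stated in this text.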

Description

Background of the Invention

Field of the Invention

The present invention relates to a coarticulation processing apparatus for Chinese speech synthesis, and in particular to a coarticulation processing apparatus that provides a smooth transition from one syllable to the next in Chinese speech synthesis.

Description of the Related Art

In Chinese speech synthesis, the junctions between syllables must be smoothed so that a sequence of syllables sounds fluent; this smoothing is called coarticulation processing. To obtain a smooth transition from one syllable of a character string to the next, coarticulation processing must overlap some phonemes of the preceding syllable with some phonemes of the following syllable. FIG. 3 shows the wideband spectrum of an utterance of "中文" ("Chinese"), in which coarticulation is clearly visible. Most conventional Chinese speech synthesis systems, however, simply concatenate the adjacent phonemes of the two syllables of "中文" without any coarticulation processing, as shown in FIG. 4, and the resulting synthesized speech sounds unnatural.

Furthermore, the coarticulation technique used in conventional Chinese speech synthesis systems models the coarticulation segment of a character string in the time domain. The most suitable coarticulation segment is first searched for in a large body of recorded coarticulation data, and interpolation is then performed between the preceding syllable and the following syllable of that segment. The key steps of this procedure are deciding which coarticulation segment is best and finding it in the recorded coarticulation speech data. A paper titled "Speech Synthesis of Chinese Two-Character Words with Coarticulation", presented at the Ninth R.O.C. Computational Linguistics Conference (1996), is incorporated herein by reference.

Referring to FIG. 5, reference numeral 100 denotes a Pinyin sentence input unit through which the user enters the Pinyin sentence to be synthesized. 110 denotes a character string storage unit that stores a large amount of recorded character string speech data. 180 denotes a single-syllable storage unit that stores recorded single-syllable speech data. 120 denotes a character string search unit that, for a character string of the input Pinyin sentence that requires coarticulation processing, searches the character string storage unit 110 and analyzes the string it finds to determine the coarticulation segment. 130 denotes a midpoint detection unit that finds the midpoint of the coarticulation segment in the character string. 140 denotes a calculation unit that computes the duration of the coarticulation segment. 150 denotes a preceding-syllable synthesis unit that, according to the input Pinyin sentence, searches the single-syllable storage unit 180 for the recorded speech data of the preceding syllable and synthesizes it. 160 denotes a coarticulation segment synthesis unit that combines the synthesized speech data output by the preceding-syllable synthesis unit 150 with the coarticulation segment. 170 denotes a following-syllable synthesis unit that, according to the input Pinyin sentence, searches the single-syllable storage unit 180 for the recorded speech data of the following syllable and combines it with the synthesized speech data output by the coarticulation segment synthesis unit 160. 190 denotes a synthesized speech output unit that outputs the synthesized speech data as sound.

As FIG. 5 shows, the conventional Chinese speech synthesis system retrieves the best coarticulation segment from the character string storage unit 110 and the best single-syllable speech data from the single-syllable storage unit 180, and combines the retrieved data in order to improve the naturalness and intelligibility of the synthesized speech.

For example, to synthesize the character string "中文" with coarticulation processing on the system of FIG. 5, the user first enters "中文" in Pinyin through the input unit 100. The character string storage unit 110 is then searched for a character string speech record that matches "中文". If such a record exists, the character string search unit 120 retrieves it from the character string storage unit 110 and analyzes it to determine the coarticulation segment of "中文". The midpoint detection unit 130 computes the midpoint of that segment, and the calculation unit 140 computes its duration. The preceding-syllable synthesis unit 150 searches the single-syllable storage unit 180 for the single-syllable speech record that matches "中", and the coarticulation segment synthesis unit 160 combines that record with the coarticulation segment. The following-syllable synthesis unit 170 then searches the single-syllable storage unit 180 for the record that matches "文" and combines it with the synthesized speech data output by the coarticulation segment synthesis unit 160. Finally, the synthesized speech output unit 190 outputs the synthesized speech data as sound.

However, if the character string storage unit holds no record that matches "中文", the closest coarticulation is looked up from the vowel of the preceding syllable (ㄨㄥ) and the initial sound of the following syllable (ㄨㄣ), for example "通問" (ㄊㄨㄥ ㄨㄣˋ), and synthesis proceeds as described above. The resulting speech sounds very unnatural. Moreover, the system needs about 55 MB of memory to store the large number of character string speech records, which wastes memory. In addition, because recorded speech is the basic unit of synthesis, pitch and duration cannot be changed, and retrieving and combining the speech records is time-consuming.

In summary, the conventional technique has the following shortcomings:

1. A large number of single-syllable speech records and character string speech records must be stored.
2. If the desired character string speech record is not in the character string storage unit, the synthesized speech sounds very unnatural.
3. Because recorded speech is used, duration and tone cannot be changed.
4. Retrieving the speech records is time-consuming.

Summary of the Invention

The basic object of the present invention is therefore to provide a coarticulation processing apparatus that makes a smooth transition from one syllable to the next in Chinese speech synthesis and that overcomes the shortcomings of the conventional technique described above.

According to the present invention, a coarticulation processing apparatus for Chinese speech synthesis comprises:

a dictionary memory that stores a dictionary containing a large number of Chinese character strings and their corresponding Pinyin symbols;

a storage unit that stores the pitch-period data of various Chinese syllables and coarticulation segments, the Pinyin symbols corresponding to those syllables and segments, and the start and end positions of the consonants and vowels of those syllables and segments;

a word analysis unit that analyzes the input Pinyin sentence according to the dictionary stored in the dictionary memory and segments the sentence into a plurality of character strings;

a syllable analysis unit that, for the character strings produced by the word analysis unit, determines according to the storage unit which character string must undergo coarticulation processing, so that its coarticulation segment can be retrieved; and

a speech synthesis unit that interpolates between the syllables of the character string in the input Pinyin sentence using the retrieved coarticulation segment and produces synthesized speech.

In the above apparatus, the coarticulation segments stored in the storage unit are the initial sounds of the second syllables of Chinese character strings, as defined in FIG. 6.

With the above configuration, the CV-VC(VV) coarticulation processing apparatus for Chinese speech synthesis of the present invention first segments the Pinyin sentence entered by the user into a plurality of character strings according to the dictionary stored in the dictionary memory. The syllable analysis unit then determines which pairs of adjacent syllables must undergo coarticulation processing. Next, the pitch period of each syllable and the start and end positions of its consonant and vowel are retrieved from the syllable data storage unit. Finally, the speech synthesis unit computes the duration and pitch information, applies the corresponding duration and pitch modifications, and synthesizes and outputs the speech.

Brief Description of the Drawings

The above and other objects and features of the present invention will be more readily understood from the following description of the preferred embodiment, taken together with the accompanying drawings and their reference numerals:

FIG. 1 is a block diagram of the coarticulation processing apparatus for Chinese speech synthesis according to the preferred embodiment of the present invention.
FIG. 2 is a table showing the syllable contents stored in the register unit 13 of FIG. 1.
FIG. 3 is the wideband spectrum of "中文" spoken by a person.
FIG. 4 is the wideband spectrum of "中文" produced by a conventional Chinese speech synthesis system.
FIG. 5 is a system block diagram of a conventional Chinese speech synthesis system.
FIG. 6 is a table showing the types of second-syllable initial sound used in the preferred embodiment to decide whether a character string requires coarticulation processing.
FIG. 7 is an explanatory diagram of coarticulation processing applied to the character string "台灣" ("Taiwan") according to the preferred embodiment of the present invention.

Detailed Description of the Preferred Embodiment

The preferred embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a system block diagram of the coarticulation processing apparatus for Chinese speech synthesis according to the embodiment of the present invention.

Referring to FIG. 1, the input unit 10 includes, for example, a keyboard, through which the user directly enters the Pinyin sentence to be synthesized. The character analysis unit 11 analyzes the input sentence according to the dictionary stored in the dictionary memory (storage unit) 12, which holds a large number of character strings and their corresponding Pinyin symbols, segments the sentence into a plurality of character strings, and marks the character positions between adjacent character strings. The syllable analysis unit 14 uses the pitch-period data storage unit 15 for CV syllables and VC(VV) coarticulation segments and the mark data storage unit 16 for CV syllables and VC(VV) coarticulation segments to determine which characters must undergo coarticulation processing, and retrieves the pitch-period data and mark data of the CV syllables and VC(VV) coarticulation segments so determined. The duration detection unit 17 and the pitch detection unit 18 determine the corresponding duration and pitch according to syllable prosody rules. Here C denotes a consonant and V denotes a vowel. The register (storage) unit 13 stores the corresponding duration, frequency, tone (the Chinese tone attached to each Chinese syllable), and Pinyin symbol of each syllable. The waveform overlap-add unit 19 overlap-adds the CV syllables and the VC(VV) coarticulation segments. The synthesized speech output unit 20 outputs the synthesized speech.

An application of the present invention is explained below. Take the Pinyin sentence [tai2 wan1 shi4 yi2 ge5 mei3 li4 de5 bao3 dao3] (台灣是一個美麗的寶島, "Taiwan is a beautiful treasure island") …

… Chinese syllables that carry the first of the four Chinese tones.

The waveform overlap-add unit 19 overlap-adds the waveforms of the CV syllables and the VC(VV) coarticulation segments according to the detailed data from the register unit 13. These data include the syllable duration, the duration of the syllable's consonant, the syllable start position, the syllable end position, the frequency values of the syllable's eight sections, the tone type, the consonant type, the vowel type, the position of the syllable within the character string, the index of the syllable's CV syllable, and the index of its VC(VV) coarticulation segment. Finally, the synthesized speech output unit 20 outputs the final synthesized speech.

FIG. 7 illustrates coarticulation processing of the character string "台灣" ("Taiwan") according to the preferred embodiment of the present invention. First, the pitch-period data and mark data of each CV syllable and its VC(VV) coarticulation segment are stored in the register unit 13. Next, the pitch and duration of the syllables "台" and "灣" and of the coarticulation segment "ㄞㄨㄢ" are computed according to the prosody rules, so that the waveforms of these syllables and of the segment can be overlap-added to produce the waveform of "台灣".

Because "台灣" is synthesized from pitch-period data, its duration and pitch can be varied, and memory space is saved.

In summary, the preferred embodiment of the present invention solves the coarticulation problem of the conventional technique. Retrieving the coarticulation segment of a character string and overlap-adding its waveform with the waveforms of the preceding and following syllables yields natural synthesized speech. Moreover, the duration and pitch of a character string can be varied, so character strings with different tones and durations can be produced, and memory space is saved.
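The overlap-add of a preceding syllable, a coarticulation segment, and a following syllable can be sketched as follows. This is a minimal illustration on plain lists of samples, not the implementation of unit 19. The linear cross-fade weights, the fixed overlap length, and the function names `overlap_add` and `join_with_segment` are assumptions, since the text does not specify the window or how the overlap region is chosen.

```python
def overlap_add(head, tail, overlap):
    """Join two sample sequences, cross-fading `overlap` samples.

    The last `overlap` samples of `head` are mixed with the first
    `overlap` samples of `tail` using linearly changing weights
    (an assumed window; the source does not name one).
    """
    if overlap < 0 or overlap > min(len(head), len(tail)):
        raise ValueError("overlap must fit inside both sequences")
    split = len(head) - overlap
    out = list(head[:split])
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # rises from head toward tail
        out.append((1.0 - weight) * head[split + i] + weight * tail[i])
    out.extend(tail[overlap:])
    return out


def join_with_segment(prev_syllable, segment, next_syllable, overlap):
    """Chain syllable + coarticulation segment + syllable with overlap-add."""
    joined = overlap_add(prev_syllable, segment, overlap)
    return overlap_add(joined, next_syllable, overlap)


# Toy waveforms standing in for "台", the VV segment, and "灣".
tai = [1.0] * 10
segment = [1.0] * 6
wan = [1.0] * 10
result = join_with_segment(tai, segment, wan, overlap=2)
print(len(result))
```

Each join shortens the total length by the overlap, so the output has `len(tai) + len(segment) + len(wan) - 2 * overlap` samples. In a real synthesizer the overlap would more plausibly be aligned to pitch periods using the stored start and end marks, which this sketch does not attempt.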

Although the present invention has been fully described in connection with the preferred embodiment and with reference to the accompanying drawings, it should be understood that various changes and modifications will be apparent to those skilled in the art. Such changes and modifications are to be understood as included within the scope of the present invention as defined by the appended claims, without departing from its principles.

Claims (4)

1. A coarticulation processing apparatus for Chinese speech synthesis, comprising:
a dictionary memory for storing a dictionary that includes a plurality of Chinese character strings and their corresponding Pinyin symbols;
a storage unit for storing pitch-period data of various Chinese syllables and coarticulation segments, the Pinyin symbols corresponding to those syllables and coarticulation segments, and the start and end positions of the consonants and vowels of those syllables and coarticulation segments;
a character analysis unit for analyzing, according to the dictionary stored in the dictionary memory, an input Pinyin sentence to be synthesized, and segmenting the sentence into a plurality of character strings;
a syllable analysis unit for determining, according to the storage unit, which character string from the character analysis unit must undergo coarticulation processing, so as to retrieve the coarticulation segment of the character string that must undergo the coarticulation processing; and
a speech synthesis unit for inserting the retrieved coarticulation segment between the syllables of the character string in the input Pinyin sentence and producing synthesized speech.
2. The coarticulation processing apparatus of claim 1, wherein the storage unit contains 409 Chinese syllables having the first Chinese tone.
3. The coarticulation processing apparatus of claim 1, wherein the coarticulation segments stored in the storage unit are the initial sounds of the second syllables of Chinese character strings.
4. The coarticulation processing apparatus of claim 2, wherein the coarticulation segments stored in the storage unit are the initial sounds of the second syllables of Chinese character strings.
TW088120889A 1998-12-02 1999-11-30 Coarticulation processing apparatus for Chinese speech synthesis TW451183B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP10342796A JP2000172286A (en) 1998-12-02 1998-12-02 Simultaneous articulation processor for chinese voice synthesis

Publications (1)

Publication Number Publication Date
TW451183B true TW451183B (en) 2001-08-21

Family

ID=18356572

Family Applications (1)

Application Number Title Priority Date Filing Date
TW088120889A TW451183B (en) 1998-12-02 1999-11-30 Coarticulation processing apparatus for Chinese speech synthesis

Country Status (4)

Country Link
JP (1) JP2000172286A (en)
CN (1) CN1257271A (en)
SG (1) SG77275A1 (en)
TW (1) TW451183B (en)

Also Published As

Publication number Publication date
SG77275A1 (en) 2000-12-19
CN1257271A (en) 2000-06-21
JP2000172286A (en) 2000-06-23

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MM4A Annulment or lapse of patent due to non-payment of fees