TW382094B - Base tone synchronous differential coding method and device thereof - Google Patents



Publication number
TW382094B
Authority
TW
Taiwan
Prior art keywords
pitch
patent application
item
scope
data
Prior art date
Application number
TW086118724A
Other languages
Chinese (zh)
Inventor
Jing-Sung Jang
Jin-Yu Jang
Yi Jeng
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp filed Critical Inventec Corp
Priority to TW086118724A priority Critical patent/TW382094B/en
Application granted granted Critical
Publication of TW382094B publication Critical patent/TW382094B/en


Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A pitch-synchronous (base tone synchronous) differential coding method and a device thereof. In the method, a signal representing a Chinese syllable is first input and pitch information is extracted from it; the input signal is then format-converted according to this pitch information; finally, a first-order difference is applied to the format-converted signal and the result is output as data. The pitch-synchronous differential coding device includes a pitch extractor, a format converter, and a differencer. The pitch extractor receives an input signal representing a Chinese syllable and extracts pitch information from it. The format converter converts the format of the input signal according to the pitch information. The differencer applies a first-order difference to the format-converted signal and produces an output signal.

Description


The present invention relates to speech processing technology, and in particular to a pitch-synchronous differential coding method and a device thereof. Used together with a grammar processing device, it can synthesize the pronunciation of arbitrary Chinese characters, words, and sentences, and the synthesized speech approaches the effect of a real human voice.

In the field of speech synthesis, speech material is generally first converted by a specific coding method into speech data that is easy to process, and then stored; when speech is later to be synthesized, the speech data to be synthesized are retrieved with the corresponding decoding method and combined. Practitioners in this field seek synthesized speech that comes as close as possible to real human pronunciation.
At present, coding methods for speech synthesis fall broadly into two types: parametric coding (parameter coding) and waveform coding.

Parametric coding samples the speech waveform, analyzes it, extracts speech parameters, and stores them; at synthesis time the speech data are reconstructed from those parameters. Parametric coding is based on a speech production model: analysis yields a number of characteristic parameters that characterize the sound source and the vocal tract, the sampled speech data are converted into a set of parameters for storage, and these parameters are used to reassemble the speech data at synthesis time. This approach, however, requires first establishing a correct pronunciation model and then describing the model with suitable parameters. Its technical key is therefore a successful pronunciation model together with an accurate set of parameters that describe it. Because the understanding of the human speech production process is still at an exploratory stage and no serviceable model has yet been established, parametric coding tends to introduce noise, the synthesized speech loses the timbre of a real voice, and the considerable amount of computation required makes real-time synthesis difficult. Linear predictive coding (LPC) is one kind of parametric coding; speech synthesized with it is rather stiff and contains clearly audible noise, so it cannot meet the requirement of approaching a real human timbre.

Waveform coding samples the speech waveform and stores the samples; its signal-to-noise ratio is relatively high and it better preserves the timbre of the original speech material. Adaptive differential pulse code modulation (ADPCM) is one kind of waveform coding. It preserves the original timbre of the speech material, so speech synthesized with it has a better timbre than speech synthesized with parametric coding.
However, the source of its original sound library is fixed and cannot be adjusted for the various situations that arise within a sentence, so Chinese sentences synthesized with it still sound stiff and fall short of real human pronunciation.

Accordingly, one object of the present invention is to provide a pitch-synchronous differential encoding and decoding method and a device thereof, with which coherent, natural Chinese speech close to real human pronunciation can be synthesized.

Another object of the present invention is to provide a pitch-synchronous differential encoding and decoding method and a device thereof that can synthesize speech in real time.

To achieve the above objects, the present invention provides a pitch-synchronous differential coding method. First, an input signal representing a Chinese syllable is input and pitch information is extracted from it. Next, the input signal is format-converted according to this pitch information. Then a first-order difference is applied to the format-converted signal, and the result is output as output data.

The present invention further provides a pitch-synchronous differential coding device comprising a pitch extractor, a format converter, and a differencer. The pitch extractor receives an input signal representing a Chinese syllable and extracts pitch information from it. The format converter converts the format of the input signal according to the pitch information.
The differencer applies a first-order difference to the format-converted signal and outputs the result as output data.

To make the above and other objects, features, and advantages of the present invention easier to understand, a preferred embodiment is described in detail below with reference to the accompanying drawings.

Brief description of the drawings:

Fig. 1 is a block diagram of a preferred embodiment of the pitch-synchronous differential encoding device according to the present invention;
Fig. 2 is a flowchart of the method for calculating the voiced pitch period according to the present invention;
Fig. 3 is a block diagram of a preferred embodiment of the pitch-synchronous differential decoding device according to the present invention;
Fig. 4 is a block diagram of a speech synthesis system;
Fig. 5 is a block diagram of the grammar processing device of Fig. 4;
Figs. 6A, 6B, and 6C compare, against the spectral envelope of the original speech, the spectra of speech synthesized with the coding method of the present invention, with ADPCM coding, and with LPC, respectively; and
Fig. 7 shows the waveforms of unvoiced and voiced sounds.

Explanation of reference numerals:

10: input signal; 11: pitch extractor; 12: pitch information; 13: format converter; 14: differencer; 15: data compressor; 16: PSDC code; 20: speech data input module; 21: band-pass filtering module; 22: peak and valley detection module; 23: pitch extraction candidate modules; 24: best logic decision module; 25: module outputting the voiced pitch period value; 30: PSDC code; 31: data decompressor; 32: inverse differencer; 33: pitch signal; 34: inverse format converter; 35: PCM code; 40: Chinese text; 41: grammar processing device; 42: speech data synthesis device; 43: Chinese speech library; 44: Chinese speech output; 50: text preprocessor; 51: word segmentation processor; 52: Chinese word library; 53: Chinese word frequency library; 54: grammar processor; 55: Chinese part-of-speech table; 56: Chinese grammar rule base; and 57: syllable converter.

Embodiment:

The pitch-synchronous differential coding method of the present invention combines the two coding approaches, parametric coding and waveform coding. Compared with parametric coding or waveform coding alone, the resulting speech data both retain the timbre of a real voice and remain easy to control.
Moreover, the amount of computation required on the speech data is reduced, so real-time synthesis can also be achieved.

Refer to Fig. 1, a block diagram of a preferred embodiment of the pitch-synchronous differential encoding device according to the present invention. The device takes an input signal in pulse code modulation (PCM) code 10, converts it by pitch-synchronous differential coding into a pitch-synchronous differential code (hereinafter PSDC code) 16, and outputs it for storage. The PCM input signal 10 represents one Chinese syllable. The device comprises a pitch extractor 11, a format converter 13, a differencer 14, and a data compressor 15.

Chinese syllables contain both unvoiced and voiced sounds; Fig. 7 shows their waveforms. Unvoiced sounds have a small amplitude and a rather irregular waveform with no periodicity; voiced sounds have a large amplitude and a regular waveform with a clear period.

As shown in Fig. 1, PCM code 10 is fed to both the pitch extractor 11 and the format converter 13. Because Chinese syllables contain unvoiced and voiced sounds, the pitch extractor 11 makes an unvoiced/voiced decision on the input PCM code 10, calculates the period of voiced sounds, and outputs the result as pitch information 12. The unvoiced/voiced decision can be made, for example, by computing the short-time average energy or the short-time average zero-crossing rate, as follows.

Short-time average energy:

$$E_n = \sum_{m} \left[ x(m)\, w(n-m) \right]^2$$

where x(m) is the speech data in PCM code and w(n-m) is a window function.
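The short-time average energy above can be sketched in a few lines of Python. This is an illustration only, not the patent's implementation: the function names are hypothetical, and the Hamming window is an assumed choice, since the text does not say which window function w is used.

```python
import math

def hamming(length):
    """Hamming window of the given length (an assumed choice for w)."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))
            for k in range(length)]

def short_time_energy(x, n, window):
    """E_n = sum over m of [x(m) * w(n - m)]^2.

    The window is taken to be nonzero only for 0 <= n - m < len(window),
    so the sum covers the len(window) samples ending at index n.
    """
    total = 0.0
    for k, w in enumerate(window):  # k plays the role of n - m
        m = n - k
        if 0 <= m < len(x):
            total += (x[m] * w) ** 2
    return total
```

With a rectangular window `[1.0, 1.0, 1.0]`, `short_time_energy([1, 2, 3, 4], 3, ...)` sums the squares of the last three samples.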
Short-time average zero-crossing rate:

$$Z_n = \sum_{m} \left| \operatorname{sgn}[x(m)] - \operatorname{sgn}[x(m-1)] \right| w(n-m)$$

where sgn[x(n)] = 1 when x(n) ≥ 0, and sgn[x(n)] = -1 when x(n) < 0.

The voiced pitch period can be calculated with the Gold-Rabiner method, whose flow is shown in Fig. 2. First, in step 20, the speech data are input; these are the PCM code 10 shown in Fig. 1. In step 21 the input speech data are band-pass filtered; the pass band may be, for example, 100 to 900 Hz. The band-pass filtered speech data then proceed to step 22 for peak and valley detection, which yields N pitch extraction candidates, shown as blocks 23A through 23N; N is typically 6. Then, in step 24, a best logic decision is made among the N pitch extraction candidates, and in step 25 the period value of the voiced pitch is output.

The pitch information 12 thus includes the unvoiced/voiced decision and the voiced period value.

In short, after pitch extraction has been performed on PCM code 10, the pitch extractor 11 outputs pitch information 12 to the format converter 13; this pitch information 12 contains the unvoiced/voiced decision and the period value of the voiced pitch. The format converter 13 then format-converts the input PCM code 10 according to the pitch information 12.
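As a rough sketch of the unvoiced/voiced decision described above, the zero-crossing rate can be computed next to the energy and the two compared with thresholds. The decision rule (high energy and low zero-crossing rate means voiced) and both threshold parameters are assumptions made for illustration; the text only says that either measure can be used. The Gold-Rabiner period estimator is not reproduced here.

```python
def sgn(value):
    """sgn[x] as defined above: 1 when x >= 0, otherwise -1."""
    return 1 if value >= 0 else -1

def short_time_energy(x, n, window):
    """E_n = sum over m of [x(m) * w(n - m)]^2."""
    total = 0.0
    for k, w in enumerate(window):  # k plays the role of n - m
        m = n - k
        if 0 <= m < len(x):
            total += (x[m] * w) ** 2
    return total

def short_time_zcr(x, n, window):
    """Z_n = sum over m of |sgn[x(m)] - sgn[x(m-1)]| * w(n - m).

    Each sign change inside the window contributes 2 * w(n - m).
    """
    total = 0.0
    for k, w in enumerate(window):
        m = n - k
        if 1 <= m < len(x):
            total += abs(sgn(x[m]) - sgn(x[m - 1])) * w
    return total

def is_voiced(x, n, window, energy_threshold, zcr_threshold):
    """Assumed rule: voiced frames are loud and cross zero rarely."""
    return (short_time_energy(x, n, window) >= energy_threshold
            and short_time_zcr(x, n, window) <= zcr_threshold)
```

In practice the thresholds would have to be tuned to the recording level and sampling rate; the values used to exercise this sketch are arbitrary.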
For example, if the input PCM code 10 is unvoiced, it is converted to the format:

| unvoiced flag | unvoiced length | speech data |

If the input PCM code 10 is voiced, it is converted to the format:

| voiced flag | number of periods | length of each period | speech data |

After the format converter 13 has converted the voiced and unvoiced sounds into their respective formats, the data pass to the differencer 14, which applies differencing to the voiced portion. This may be, for example, a first-order difference between adjacent voiced periods, expressed as:

$$S_d(n) = S(n) - S(n-T)$$

where S_d(n) is the current sample after differencing; S(n) is the current sample before differencing; S(n-T) is the corresponding sample one period earlier, before differencing; and T is the current period length.

Because unvoiced sounds are higher in frequency, first-order differencing has little effect on them, and the pronunciation of Chinese characters consists mainly of voiced sounds, so the differencer 14 mainly processes the voiced portion. The differencing may also be second-order or of a higher order.

If the amount of data to be stored is to be reduced, the data processed by the differencer 14 can be passed to the data compressor 15 for compression and then output as PSDC code 16.
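The voiced format and the first-order difference can be sketched as follows. The class and function names are hypothetical, and two details are assumptions the text does not settle: samples with no earlier sample one period back (typically the whole first period) are stored unchanged, and T is taken literally as the length of the period the current sample belongs to. The inverse function corresponds to what an inverse differencer would do.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UnvoicedSegment:
    """Unvoiced format: the flag is the type, the length is len(samples)."""
    samples: List[int]

@dataclass
class VoicedSegment:
    """Voiced format: one length per pitch period, then the sample data."""
    period_lengths: List[int]
    samples: List[int]  # len(samples) == sum(period_lengths)

def period_of_each_sample(period_lengths):
    """For every sample index, the length T of the period it belongs to."""
    per_sample = []
    for length in period_lengths:
        per_sample.extend([length] * length)
    return per_sample

def difference(segment):
    """S_d(n) = S(n) - S(n - T); samples with n - T < 0 are kept as-is."""
    t_at = period_of_each_sample(segment.period_lengths)
    s = segment.samples
    d = [s[n] - s[n - t_at[n]] if n - t_at[n] >= 0 else s[n]
         for n in range(len(s))]
    return VoicedSegment(list(segment.period_lengths), d)

def undo_difference(segment):
    """Inverse operation: S(n) = S_d(n) + S(n - T), rebuilt left to right."""
    t_at = period_of_each_sample(segment.period_lengths)
    s = []
    for n, dn in enumerate(segment.samples):
        t = t_at[n]
        s.append(dn + s[n - t] if n - t >= 0 else dn)
    return VoicedSegment(list(segment.period_lengths), s)
```

Because adjacent pitch periods of a voiced sound resemble each other, the differenced samples tend to be small, which is what makes a later entropy coding stage effective. With integer samples the round trip is exact.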
The data compressor 15 may, for example, use the well-known Huffman method to compress the data, reducing the amount of data and easing the capacity required of the storage device. The Huffman method includes steps such as gathering symbol frequency statistics over the sound library and building the Huffman tree; since it is well-known technology and not the focus of the present invention, it is not described further here. It is worth noting that compression by the data compressor 15 is an optional step, so the data processed by the differencer 14 may also be output directly as PSDC code 16. The output PSDC code 16, which already contains the corresponding pitch information, is sent to a storage device (not shown) for storage; the storage device may be, for example, a disk drive or a memory. The storage device holds the PSDC codes of the voiced and unvoiced speech data of the various Chinese syllables and thus constitutes a Chinese speech library.

When the characters of a text string are to be pronounced, the speech data of the corresponding voiced or unvoiced sounds are retrieved from the Chinese speech library and then decoded and synthesized. Refer to Fig. 3, a block diagram of a preferred embodiment of the pitch-synchronous differential decoding device according to the present invention. The decoding device performs the inverse of the processing of the encoding device described above: it converts the PSDC code 30 stored in the Chinese speech library into PCM code 35 by pitch-synchronous differential decoding. The decoding device comprises a data decompressor 31, an inverse differencer 32, and an inverse format converter 34.

As shown in Fig. 3, if the speech data were compressed by the data compressor 15, the retrieved PSDC code 30 must first be decompressed by the data decompressor 31; in short, this is a data restoration step. If the data were originally compressed with the Huffman method, decompression necessarily applies the Huffman method in reverse. If the data were not processed by the data compressor 15, the decompression step may be omitted. The inverse differencer 32 then performs the inverse difference calculation on the decompressed PSDC code 30 together with its corresponding pitch information 33; this calculation is the inverse of the difference calculation. At this point the voiced and unvoiced sounds are each in the formats described above, and the inverse format converter 34 converts them into PCM code 35 for output. This format conversion step is the inverse of the conversion performed by the format converter 13.

Application of the invention

The encoding and decoding methods described above can be applied in a speech synthesis system, a block diagram of which is shown in Fig. 4. The system receives Chinese text 40 at its input and, after speech synthesis processing, outputs Chinese speech 44. The speech synthesis system shown in Fig. 4 comprises a grammar processing device 41, a speech data synthesis device 42, and a Chinese speech library 43.

經濟部中央標準局員工消費合作社印製 A7 B7 五、發明説明(ίο ) 庫52進行所有可能切分路徑;譬如字串”天氣預報說明天 天氣晴朗”,根據漢語詞語庫52進行可能的詞語匹配後, 可得兩種切分路徑:”天氣//預報//說//明天//天氣//晴朗//” 和”天氣//預報//說明//天天//氣//晴朗//”。然後,再經選擇 最短路徑步驟,選擇諸切分路徑最短者。若最短切分路徑 不為唯一時,便再配合漢語詞語詞頻庫53,對詞頻當量進 行計算,決定出首選者做為切分結果。 接著,及於語法處理器54做處理。此語法處理器54 以漢語詞性表55為基礎,對經切分語句諸詞語成份進行分 析,以標定各詞語的詞性,究係屬主語、狀語、謂語、補 語、定語、賓語、或兼詞等。譬如,代名詞在動詞之前是 做為主語等。然後,根據漢語語法規則庫56,對相應詞語 做音量、停頓、語速、頻率等之調整,獲致相關的語音合 成參數。譬如,就切分語句”他//說出//去//公園//的//目的” 言,”他”為主語,”說出”為謂語,”去公園”為定語,”目的” 屬賓語,故本語句結構為「主語+謂語+定語+賓語」,故 根據漢語語法規則庫56,此結構之賓語應重讀,故調整” 目的”之音量參數,而主謂語間要停頓200ms、謂定語間 要停頓150ms等等。 至於音節轉換器57係將漢字内碼先轉換為注音,再由 注音轉換為拼音。此轉換過程亦對破音字進行判斷,譬如,” 中”一字在詞語”中國”,其發音係為一聲、非為四聲。以如 是之對照方式可大幅增加轉換速度。 如第4圖所示,經文法處理器41處理後輸出至語音數 12 本紙張尺度適用中國國家標準(CNS ) A4規格(210'〆297公釐) (請先閲讀背面之注意事項再填寫本頁) —^1·1ν —B^i y —on* flu n^i ' ix .·1 經濟部中央標準局員工消費合作社印製 A7 B7 五、發明説明(11 ) 據合成裝置42。此語音數據合成裝置42根據中文文章40 内容,自漢字語音庫43擷取相對應音節之語音數據,此擷 取語音數據的動作,便是以上述基音同步差分解碼方法為 之,並對此等解碼後之語音數據進行合成處理,同時,亦 配合文法處理器41送出的語音合成參數進行音韻、語調、 音量、音調、語速等的調整,而成漢語語音44輸出。 請參照第6A、6B、6C圖所示,分別為以本發明編 碼方法、配接差動脈衝編碼調變(ADPCM)編碼、以及線性 預測編碼(LPC)等所合成之語音頻譜與原始語音頻譜包絡 之比較圖。圖示中,曲線E代表原始語音頻譜包絡,曲線 F代表根據本發明方法所合成之語音頻譜,曲線G代表由 習知配接差動脈衝編碼調變(ADPCM)編碼所合成之語音 頻譜,曲線Η代表以習知線性預測編碼(LPC)所合成之語 音頻譜。 由第6圖之頻譜比較可知,本發明之基音同步差分編 碼方法,均較習知配接差動脈衝編碼調變(ADPCM)編碼和 線性預測編碼(LPC)更為接近原始語音信號。據此,故可實 現近於真人發音效果、連貫且自然之漢語語音;再者,能 夠即時地合成語音。 雖然本發明已以較佳實施例揭露如上,然其並非用以 限定本發明,任何熟習此技藝者,在不脫離本發明之精神 和範圍内,當可作更動與潤飾,因此本發明之保護範圍當 視後附之申請專利範圍所界定者為準。 13 本紙張尺度適用中國國家標準(CNS ) A4規格(210X297公釐) (銪先閩讀背面之注意事項再填寫本頁) 裝- 訂Printed by the Consumer Standards Cooperative of the Central Standards Bureau of the Ministry of Economic Affairs A7 B7 V. Invention Description (ίο) Library 52 performs all possible segmentation paths; After that, two kinds of split paths can be obtained: "weather // forecast // say // tomorrow // weather // clear //" and "weather // forecast // description // day // air // clear // ". Then, through the step of selecting the shortest path, the shortest path is selected. 
If the shortest segmentation path is not unique, as in this example where both paths contain six words, the Chinese word-frequency library 53 is consulted: the word-frequency equivalent of each path is calculated, and the preferred path is taken as the segmentation result.

Next, the grammar processor 54 takes over. Based on the Chinese part-of-speech table 55, the grammar processor 54 analyzes the words of the segmented sentence and labels the role of each: subject, adverbial, predicate, complement, attributive, object, pivot word (兼詞), and so on. For example, a pronoun placed before a verb serves as the subject. Then, according to the Chinese grammar rule base 56, the volume, pauses, speaking rate, frequency, and so on of the corresponding words are adjusted to obtain the relevant speech synthesis parameters. For example, in the segmented sentence "他//說出//去//公園//的//目的" (he // stated // go // park // [particle] // purpose), "他" (he) is the subject, "說出" (stated) is the predicate, "去公園" (going to the park) is the attributive, and "目的" (purpose) is the object, so the sentence structure is "subject + predicate + attributive + object". According to the Chinese grammar rule base 56, the object of this structure should be stressed, so the volume parameter of "目的" is adjusted; a pause of 200 ms is inserted between subject and predicate, a pause of 150 ms between predicate and attributive, and so on.

The syllable converter 57 first converts the internal codes of the Chinese characters into Zhuyin and then converts the Zhuyin into Pinyin. The conversion also resolves polyphonic characters (破音字): for example, the character "中" in the word "中國" (China) is pronounced with the first tone, not the fourth. This lookup-based approach greatly increases conversion speed.

As shown in FIG. 4, the output of the grammar processor 41 is passed to the speech data synthesis device 42.
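The grammar-rule example above (object stressed, 200 ms and 150 ms pauses) could be represented along the following lines. This is a hedged sketch: the `WordProsody` record, the `RULES` table layout, and the `STRESS_GAIN` value of 1.2 are hypothetical, since the description only says that the volume parameter is adjusted. For brevity the attributive is passed as the single phrase "去公園" and the particle "的" is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class WordProsody:
    """Hypothetical per-word synthesis parameters."""
    word: str
    role: str
    volume_gain: float = 1.0
    pause_after_ms: int = 0


STRESS_GAIN = 1.2  # assumed value; the description gives no number

# Hypothetical entry standing in for one rule of the grammar rule base.
RULES: Dict[Tuple[str, ...], dict] = {
    ("subject", "predicate", "attributive", "object"): {
        "stressed_role": "object",
        "pauses_ms": {
            ("subject", "predicate"): 200,
            ("predicate", "attributive"): 150,
        },
    },
}


def apply_rules(tagged: List[Tuple[str, str]]) -> List[WordProsody]:
    """Look up the sentence structure and set stress and pauses accordingly."""
    result = [WordProsody(word, role) for word, role in tagged]
    rule = RULES.get(tuple(role for _, role in tagged))
    if rule is None:
        return result  # unknown structure: keep neutral parameters
    for i, item in enumerate(result):
        if item.role == rule["stressed_role"]:
            item.volume_gain = STRESS_GAIN
        if i + 1 < len(result):
            pair = (item.role, result[i + 1].role)
            item.pause_after_ms = rule["pauses_ms"].get(pair, 0)
    return result


sentence = [
    ("他", "subject"),
    ("說出", "predicate"),
    ("去公園", "attributive"),
    ("目的", "object"),
]
result = apply_rules(sentence)
```

Keying the table on the whole role sequence keeps the lookup trivial; a fuller rule base would presumably need pattern matching over partial structures.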
According to the content of the Chinese article 40, the speech data synthesis device 42 retrieves the speech data of the corresponding syllables from the Chinese character speech library 43. This retrieval uses the pitch-synchronous differential decoding method described above. The decoded speech data is then synthesized while prosody, intonation, volume, pitch, speaking rate, and so on are adjusted according to the speech synthesis parameters sent by the grammar processor 41, producing the Chinese speech 44 for output.

Please refer to FIGS. 6A, 6B, and 6C, which compare the original speech spectrum envelope with the speech spectra synthesized by the encoding method of the present invention, by adaptive differential pulse code modulation (ADPCM) coding, and by linear predictive coding (LPC), respectively. In the figures, curve E represents the original speech spectrum envelope, curve F the speech spectrum synthesized by the method of the present invention, curve G the speech spectrum synthesized by conventional ADPCM coding, and curve H the speech spectrum synthesized by conventional LPC.

The spectrum comparison in FIG. 6 shows that the pitch-synchronous differential coding method of the present invention comes closer to the original speech signal than either conventional ADPCM coding or LPC. Coherent, natural Chinese speech close to a real human voice can therefore be achieved; moreover, the speech can be synthesized in real time.
Although the present invention has been disclosed above by way of a preferred embodiment, that embodiment is not intended to limit the invention. Anyone skilled in the art may make changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.

Claims (1)

1. A pitch-synchronous differential encoding method, comprising: inputting an input signal representing a Chinese syllable; extracting pitch information from the input signal; performing format conversion on the input signal according to the pitch information; and differencing the format-converted signal to produce data for output.
2. The pitch-synchronous differential encoding method of claim 1, wherein the input signal is a pulse code modulation (PCM) code.
3. The pitch-synchronous differential encoding method of claim 1, wherein the pitch information is extracted by calculating the short-time average energy and the short-time average zero-crossing rate.
4. The pitch-synchronous differential encoding method of claim 1, wherein the pitch information indicates whether the input signal is voiced or unvoiced.
5. The pitch-synchronous differential encoding method of claim 4, wherein, if the input signal is voiced, the pitch information further includes period data of the voiced sound.
6. The pitch-synchronous differential encoding method of claim 5, wherein the format-converted signal includes a voiced flag, the number of periods, the lengths of the periods, and speech data.
7. The pitch-synchronous differential encoding method of claim 4, wherein, if the input signal is unvoiced, the format-converted signal includes an unvoiced flag, the unvoiced length, and speech data.
8. The pitch-synchronous differential encoding method of claim 1, wherein, when the data is to be decoded, decoding is performed by a pitch-synchronous differential decoding method comprising the steps of: inverse differencing the data; and performing inverse format conversion on the inverse-differenced signal according to the pitch information.
9. The pitch-synchronous differential encoding method of claim 8, wherein the data is further compressed after differencing.
10. The pitch-synchronous differential encoding method of claim 9, wherein the data is decompressed before being inverse differenced.
11. A pitch-synchronous differential encoding device, comprising: a pitch extractor that receives an input signal representing a Chinese syllable and extracts pitch information from the input signal; a format converter that performs format conversion on the input signal according to the pitch information; and a differentiator that differences the format-converted signal to produce data for output.
12. The pitch-synchronous differential encoding device of claim 11, wherein the input signal is a pulse code modulation (PCM) code.
13. The pitch-synchronous differential encoding device of claim 11, wherein the pitch extractor extracts the pitch information by calculating the short-time average energy and the short-time average zero-crossing rate.
14. The pitch-synchronous differential encoding device of claim 11, wherein the pitch information indicates whether the input signal is voiced or unvoiced.
15. The pitch-synchronous differential encoding device of claim 14, wherein, if the input signal is voiced, the pitch information further includes period data of the voiced sound.
16. The pitch-synchronous differential encoding device of claim 15, wherein the format-converted signal includes a voiced flag, the number of periods, the lengths of the periods, and speech data.
17. The pitch-synchronous differential encoding device of claim 14, wherein, if the input signal is unvoiced, the format-converted signal includes an unvoiced flag, the unvoiced length, and speech data.
18. The pitch-synchronous differential encoding device of claim 11, wherein, when the data is to be decoded, decoding is performed by a pitch-synchronous differential decoding device comprising: an inverse differentiator that inverse differences the data; and a format inverse converter that performs inverse format conversion on the inverse-differenced signal according to the pitch information.
19. The pitch-synchronous differential encoding device of claim 18, wherein the data processed by the differentiator is further compressed.
20. The pitch-synchronous differential encoding device of claim 19, wherein the data is decompressed before being processed by the inverse differentiator.
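Claims 3 and 13 name the short-time average energy and the short-time average zero-crossing rate as the quantities used for pitch extraction. The following is a minimal sketch of how those two measures might drive a voiced/unvoiced decision. The decision rule (high energy together with a low zero-crossing rate means voiced), the threshold values, and the function names are assumptions for illustration and are not stated in the claims.

```python
import math
from typing import Sequence


def short_time_energy(frame: Sequence[float]) -> float:
    """Average of the squared samples over one short frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_voiced(frame: Sequence[float],
              energy_threshold: float,
              zcr_threshold: float) -> bool:
    """Assumed rule: voiced frames are energetic and cross zero rarely."""
    return (short_time_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)


# Toy frames: a strong low-frequency tone and a weak, rapidly alternating signal.
tone = [1000.0 * math.sin(2 * math.pi * n / 80) for n in range(160)]
hiss = [10.0 * (-1) ** n for n in range(160)]

print(is_voiced(tone, energy_threshold=1000.0, zcr_threshold=0.3))
print(is_voiced(hiss, energy_threshold=1000.0, zcr_threshold=0.3))
```

In practice the thresholds would have to be tuned to the sampling rate and recording level, and determining the period lengths of a voiced frame would need a further step that this sketch does not cover.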
TW086118724A 1997-12-11 1997-12-11 Base tone synchronous differential coding method and device thereof TW382094B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW086118724A TW382094B (en) 1997-12-11 1997-12-11 Base tone synchronous differential coding method and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW086118724A TW382094B (en) 1997-12-11 1997-12-11 Base tone synchronous differential coding method and device thereof

Publications (1)

Publication Number Publication Date
TW382094B true TW382094B (en) 2000-02-11

Family

ID=21627400

Family Applications (1)

Application Number Title Priority Date Filing Date
TW086118724A TW382094B (en) 1997-12-11 1997-12-11 Base tone synchronous differential coding method and device thereof

Country Status (1)

Country Link
TW (1) TW382094B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8255211B2 (en) 2004-08-25 2012-08-28 Dolby Laboratories Licensing Corporation Temporal envelope shaping for spatial audio coding using frequency domain wiener filtering
TWI393120B (en) * 2004-08-25 2013-04-11 Dolby Lab Licensing Corp Method and syatem for audio signal encoding and decoding, audio signal encoder, audio signal decoder, computer-accessible medium carrying bitstream and computer program stored on computer-readable medium

Similar Documents

Publication Publication Date Title
Dutoit High-quality text-to-speech synthesis: An overview
CN100568343C (en) Generate the apparatus and method of pitch cycle waveform signal and the apparatus and method of processes voice signals
CN102231278A (en) Method and system for realizing automatic addition of punctuation marks in speech recognition
Ghai et al. Analysis of automatic speech recognition systems for indo-aryan languages: Punjabi a case study
TW487902B (en) Method and apparatus for mandarin Chinese speech recognition by using initial/final phoneme similarity vector
CN111028824A (en) Method and device for synthesizing Minnan
Bellegarda et al. Statistical prosodic modeling: from corpus design to parameter estimation
Dutoit A short introduction to text-to-speech synthesis
Lee et al. Voice response systems
Kim Singing voice analysis/synthesis
US20080162134A1 (en) Apparatus and methods for vocal tract analysis of speech signals
Levinson et al. Speech synthesis in telecommunications
CN113838169A (en) Text-driven virtual human micro-expression method
TW382094B (en) Base tone synchronous differential coding method and device thereof
Aida–Zade et al. The main principles of text-to-speech synthesis system
Reddy et al. Speech-to-Text and Text-to-Speech Recognition Using Deep Learning
Chettri et al. Nepali text to speech synthesis system using ESNOLA method of concatenation
Kumar et al. Significance of durational knowledge for speech synthesis system in an Indian language
Lam et al. Alternative vietnamese speech synthesis system with phoneme structure
Ma et al. Russian speech recognition system design based on HMM
Hassana et al. Text to Speech Synthesis System in Yoruba Language
de Carvalho Campinho Automatic Speech Recognition for European Portuguese
Campinho Automatic speech recognition for European Portuguese
Hande A review on speech synthesis an artificial voice production
Nkosi Creation of a pronunciation dictionary for automatic speech recognition: a morphological approach

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MM4A Annulment or lapse of patent due to non-payment of fees