TW446935B - Method and apparatus of multi-channel voice analysis and synthesis

Info

Publication number: TW446935B
Application number: TW88118460A
Authority: TW
Inventors: Wen-Tzung Li; Yi-Lung Huang
Original assignee: Elan Microelectronics Corp
Priority date: 1999-10-26
Filing date: 1999-10-26
Publication date: 2001-07-21

Abstract

The present invention relates to a method and apparatus for multi-channel voice analysis and synthesis. The voice to be processed is passed through a pre-emphasis filter and a Hamming window and is then transformed into a frequency-domain signal by a fast Fourier transform. The important frequencies and their corresponding amplitudes are extracted from the frequency-domain signal, quantized, and stored in the memory. For synthesis, the data stored for each channel is separated into a frequency and an amplitude; the frequency is passed through a corresponding frequency divider, and the amplitude, together with the divided frequency, is passed through a corresponding waveform generator. An adder then combines the channel waveforms, so that the synthesized voice is free of distortion. The invention thereby reduces both the amount of data to be stored and the amount of memory required.

Description

The present invention relates to a method and apparatus for multi-channel sound analysis and synthesis. It converts sound into the frequency domain, extracts the important frequencies from the frequency domain, quantizes and stores them, and exploits the multi-channel structure so that the synthesized sound is undistorted while the amount of data to be stored is reduced.

Conventional sound analysis and synthesis techniques process the waveform sample by sample in the time domain. Because sound requires a very large number of sampling points per unit time, the amount of data to be processed and stored is enormous, and after quantization a large amount of memory is needed, which burdens the system. Reducing the number of sampling points per unit time lowers the memory requirement but increases the distortion of the sound.

How to devise a sound analysis and synthesis technique and apparatus that needs little data yet resists distortion has therefore long been a need of users, and it is the problem the present invention sets out to solve. Drawing on many years of practical experience in the research, development, and sale of sound analysis and synthesis products, and after extensive design, study, prototyping, and refinement, the inventors arrived at the present method and apparatus for multi-channel sound analysis and synthesis.

The primary object of the invention is to provide a method and apparatus for multi-channel sound analysis and synthesis that convert the sound waveform into the frequency domain and then extract frequencies and amplitudes, so that the amount of data to be stored is reduced and the memory requirement is lowered.

A secondary object of the invention is to provide a method and apparatus for multi-channel sound analysis and synthesis that synthesize sound over multiple channels, so that the frequency components are preserved more completely and the synthesized sound is almost indistinguishable from the original.

To give the examiner a fuller understanding of the features, structure, method, and effects of the invention, a preferred embodiment is described in detail below.

Referring first to FIG. 1, which is a flowchart of the multi-channel sound analysis method of the invention, the main steps are as follows:

Step 10: Record the sound to be processed with a recording device; the user selects a sampling frequency at which the sound is sampled.
Step 11: According to how fast the sound frequency is, divide the sampled points into fixed sections (nodes). A smaller section is chosen when the sound frequency is faster and a larger section when it is slower; in this invention the section size is about 2 ms to 32 ms.
Step 12: To maintain the continuity of the sound, group a plurality of sections into a frame. The number of sampling points in each frame must be 2^n; a frame with fewer than 2^n points is padded.
Step 13: Process the frame data with a pre-emphasis filter.
Step 14: Process the resulting data with a Hamming window.
Step 15: Apply a fast Fourier transform to the resulting data, converting it from a time-domain signal into a frequency-domain signal.
Step 16: In the frequency domain, extract a plurality of the higher-energy frequencies and their corresponding amplitudes, and quantize the frequencies and amplitudes.
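The per-frame front end of the analysis (padding to 2^n points, pre-emphasis, Hamming window, fast Fourier transform) can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names are hypothetical, padding with zeros is an assumption (the source only says the frame is "padded"), and the pre-emphasis coefficient 0.95 is an assumed typical value that the source does not state.

```python
import cmath
import math


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def pre_emphasis(samples, alpha=0.95):
    """First-order filter y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    out = [samples[0]] if samples else []
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def hamming(n_points):
    """Standard symmetric Hamming window coefficients."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def analyze_frame(frame):
    """Steps 12 to 15 for one frame: pad (zeros assumed), pre-emphasize, window, transform."""
    size = next_power_of_two(len(frame))
    padded = list(frame) + [0.0] * (size - len(frame))
    emphasized = pre_emphasis(padded)
    windowed = [s * w for s, w in zip(emphasized, hamming(size))]
    return fft(windowed)
```

Calling `analyze_frame` on a 60-sample frame returns a 64-bin spectrum, which is the reason the frame length is constrained to a power of two: the radix-2 transform requires it.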


Because the same number of frequencies and amplitudes is extracted from every frame, the sound can be processed in a multi-channel manner.

Step 17: Store the quantized data in the memory by channel.
Step 18: Determine whether all frames have been processed. If not, return to step 12; if so, end the multi-channel sound analysis.

Because the sound waveform is converted into the frequency domain and only frequencies and amplitudes are extracted, the amount of data to be stored is reduced and memory is saved.

Conversely, when sound data already stored in the memory is to be read out, the multi-channel sound synthesis method and its apparatus are used. Referring to FIG. 2 and FIG. 3, which are a flowchart of the multi-channel sound synthesis method of the invention and a schematic diagram of its apparatus, the main steps are as follows (described together with the apparatus structure shown in FIG. 3):

Step 20: Read the frequency and amplitude of each channel from the memory 30.
Step 21: Pass each frequency through its corresponding frequency divider 31 for frequency division.
Step 22: Pass each amplitude, together with the divided frequency, through the corresponding waveform generator 32 to restore the original sound waveform.
Step 23: Combine the sound waveforms of the different channels in an adder 33.
Step 24: Determine whether all frequencies and amplitudes in the memory have been read. If not, reading continues; if so, go to step 25.
Step 25: Output the sound and end the sound synthesis.
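The synthesis path (memory, frequency dividers, waveform generators, adder) can be modeled in software roughly as below. This is a sketch under several assumptions that the source does not state: the frequency divider is modeled as an integer division of a hypothetical 4 MHz master clock, the waveform generator is modeled as a sine oscillator, and each oscillator restarts at zero phase in every frame. All function names are illustrative.

```python
import math


def divider_ratio(target_hz, clock_hz):
    """Integer ratio a hypothetical divider would use; target_hz must be positive."""
    return max(1, round(clock_hz / target_hz))


def generate_waveform(freq_hz, amplitude, n_samples, sample_rate):
    """Waveform generator modeled as a sine oscillator (the shape is an assumption)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def synthesize_frame(channels, n_samples, sample_rate, clock_hz=4_000_000):
    """Steps 20 to 23 for one frame; channels is a list of (frequency_hz, amplitude)."""
    mixed = [0.0] * n_samples
    for freq_hz, amplitude in channels:              # step 20: read each channel
        ratio = divider_ratio(freq_hz, clock_hz)     # step 21: frequency divider
        actual_hz = clock_hz / ratio                 # frequency the divider really yields
        wave = generate_waveform(actual_hz, amplitude, n_samples, sample_rate)  # step 22
        mixed = [m + w for m, w in zip(mixed, wave)]  # step 23: adder
    return mixed


def synthesize(frames, n_samples, sample_rate):
    """Steps 24 and 25: loop until every stored frame is read, then return the output."""
    output = []
    for channels in frames:
        output.extend(synthesize_frame(channels, n_samples, sample_rate))
    return output
```

One consequence of the divider model is that the reproduced frequency is `clock_hz / ratio`, which equals the stored frequency only when the clock is an integer multiple of it; a higher clock rate narrows that gap.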

Because the sound is synthesized over multiple channels, the frequency components are preserved more completely, so the synthesized sound is almost indistinguishable from the original and faithful reproduction of the original sound is achieved.

To help the examiner understand the invention further, the multi-channel sound analysis method is illustrated with an example. Suppose the sound is sampled at 16 kHz. Because its frequency changes quickly, a small section is chosen; with a section length of 2 ms, each section contains 32 points. Two adjacent sections in the time domain are taken as one frame, so each frame contains 64 points. The frame is processed by the pre-emphasis filter and the Hamming window [input and output expressions not recoverable from the source text], and the output is then fast-Fourier-transformed. In the frequency domain, the two frequencies (f1, f2) with the greatest energy in the frame and their corresponding amplitudes (a1, a2) are found, quantized, and stored in the memory.

For multi-channel sound synthesis with the method and apparatus of the invention, the frequencies (f1, f2) stored in the memory are each frequency-divided and passed, together with the amplitudes (a1, a2) respectively, through the waveform generators to recover the original waveforms. The two waveforms are then combined by the adder and output, reproducing the original sound.

In summary, the invention relates to a method and apparatus for multi-channel sound analysis and synthesis. It converts sound into the frequency domain, extracts the important frequencies from the frequency domain, quantizes and stores them, and exploits the multi-channel structure so that the synthesized sound is undistorted while the amount of data to be stored is reduced.
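The arithmetic of this example (16 kHz, 2 ms sections, two sections per frame) and the selection of the two strongest frequencies can be checked with a short sketch. It is illustrative only: the test signal, the quantization step sizes, and the function names are assumptions, and the pre-emphasis filter and Hamming window are deliberately left out so that the recovered amplitudes are easy to verify by hand.

```python
import cmath
import math

SAMPLE_RATE = 16_000        # Hz, from the example
SECTION_MS = 2              # ms, from the example
SECTIONS_PER_FRAME = 2      # two adjacent sections form one frame

points_per_section = SAMPLE_RATE * SECTION_MS // 1000       # 32
points_per_frame = points_per_section * SECTIONS_PER_FRAME  # 64


def dft(x):
    """Direct O(N^2) DFT; adequate for a 64-point illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def top_peaks(frame, sample_rate, count=2):
    """Return the `count` highest-energy (frequency_hz, amplitude) pairs of a frame."""
    n = len(frame)
    spectrum = dft(frame)
    bins = range(1, n // 2)  # skip the DC and Nyquist bins
    strongest = sorted(bins, key=lambda k: abs(spectrum[k]), reverse=True)[:count]
    return [(k * sample_rate / n, 2 * abs(spectrum[k]) / n) for k in strongest]


def quantize(value, step):
    """Uniform quantizer; the step sizes used below are assumed, not from the source."""
    return round(value / step) * step


# Synthetic frame (assumed test input): 1 kHz at amplitude 1.0 plus 3 kHz at 0.5.
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
         for t in range(points_per_frame)]

# (f1, a1), (f2, a2): four stored values stand in for 64 time-domain samples.
stored = [(quantize(f, 250.0), quantize(a, 1 / 128))
          for f, a in top_peaks(frame, SAMPLE_RATE)]
```

With a 64-point frame at 16 kHz the bin spacing is 250 Hz, which is why 250 Hz is used as the frequency step here. Storing two frequency and amplitude pairs per frame means four values replace 64 samples; the actual saving in bits depends on word lengths that the source does not give.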

The invention is therefore novel, inventive, and industrially applicable, and should satisfy the requirements for an invention patent. An application is accordingly filed in accordance with the law, and the examiner is respectfully requested to grant the patent at an early date.

The foregoing is only a preferred embodiment of the invention and is not intended to limit the scope of its implementation. All equivalent changes and modifications made in accordance with the method, structure, features, and spirit described in the claims shall fall within the scope of the claims.

Brief description of the drawings:
FIG. 1 is a flowchart of the multi-channel sound analysis method of the invention;
FIG. 2 is a flowchart of the multi-channel sound synthesis method of the invention; and
FIG. 3 is a schematic diagram of the apparatus for the multi-channel sound synthesis method of the invention.

Reference numerals:
30 memory
31 frequency divider
32 waveform generator
33 adder

Claims

1. A method of multi-channel sound analysis, comprising the following steps:
(a) sampling the sound to be processed at a sampling frequency;
(b) dividing the sampled points into fixed sections;
(c) processing the resulting data with a pre-emphasis filter;
(d) processing the resulting data with a Hamming window;
(e) applying a fast Fourier transform to the resulting data, so that the data is converted from the time domain into a frequency-domain signal;
(f) extracting, in the frequency domain, a plurality of frequencies of greatest energy and their corresponding amplitudes, and quantizing the frequencies and amplitudes; and
(g) storing the quantized data in a memory in a plurality of channels.

2. The method of claim 1, wherein in step (b) the fixed sections are sized according to the speed of the sound frequency, a smaller section being chosen when the sound frequency is faster and a larger section when it is slower.

3. The method of claim 1, wherein in step (b), to maintain the continuity of the sound, a plurality of sections are set as a frame, the number of sampling points in each frame must be 2^n, and a frame with fewer than 2^n points is padded.

4. A method of multi-channel sound synthesis, comprising the following steps:
(A) reading the frequency and amplitude of each channel from a memory;
(B) frequency-dividing each frequency with a frequency divider;
(C) passing the amplitude, together with the divided frequency, through the corresponding waveform generator to restore a waveform; and
(D) combining the sound waveforms of the different channels in an adder and outputting the result.

5. A multi-channel sound synthesis apparatus, comprising:
a memory for storing the quantized frequencies and amplitudes of a plurality of channels;
a plurality of frequency dividers, each having an input connected to the memory, for receiving the corresponding frequency from the memory and frequency-dividing it;
a plurality of waveform generators, each having one end connected to the corresponding frequency divider and another end connected to the memory, for restoring the sound amplitude and the corresponding divided frequency into a waveform; and
an adder connected to the outputs of the plurality of waveform generators, for combining the received waveforms and outputting the result.

6. The multi-channel sound synthesis apparatus of claim 5, wherein the line connecting each waveform generator to the memory carries the corresponding amplitude.