TW446935B - Method and apparatus of multi-channel voice analysis and synthesis - Google Patents


Info

Publication number
TW446935B
Authority
TW
Taiwan
Prior art keywords
frequency
sound
memory
amplitude
processed
Prior art date
Application number
TW88118460A
Other languages
Chinese (zh)
Inventor
Wen-Tzung Li
Yi-Lung Huang
Original Assignee
Elan Microelectronics Corp
Priority date
1999-10-26
Filing date
1999-10-26
Publication date
2001-07-21
Application filed by Elan Microelectronics Corp
Priority to TW88118460A
Application granted
Publication of TW446935B

Abstract

The present invention relates to a method and apparatus for multi-channel voice analysis and synthesis. The voice to be processed is passed through a pre-emphasis filter and a Hamming window and is then transformed into a frequency-domain signal by a fast Fourier transform. The important frequencies and their corresponding amplitudes are picked from the frequency domain, quantized, and stored in memory. For synthesis, the data stored for each channel is separated into a frequency and an amplitude; the frequency is processed by the corresponding frequency divider, and the divided frequency and the amplitude are passed to the corresponding waveform generator. An adder then combines the channel waveforms so that the synthesized voice is not distorted. The invention thereby reduces both the amount of data to be stored and the amount of memory required.
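For reference, the three front-end operations named in the abstract have the following textbook forms. The patent text does not state the filter coefficient, so \alpha here is an assumption (values around 0.9 to 0.97 are common in speech processing); N is the frame length in samples.

```latex
\begin{align}
y[n] &= x[n] - \alpha\, x[n-1], \quad 0 < \alpha < 1
  && \text{(pre-emphasis filter)} \\
w[n] &= 0.54 - 0.46 \cos\!\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1
  && \text{(Hamming window)} \\
X[k] &= \sum_{n=0}^{N-1} y[n]\, w[n]\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1
  && \text{(discrete Fourier transform, computed by FFT)}
\end{align}
```

The magnitude |X[k]| at bin k corresponds to the frequency k f_s / N, where f_s is the sampling frequency.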

Description

The present invention relates to a method and apparatus for multi-channel sound analysis and synthesis. The sound is transformed into the frequency domain, the important frequencies are extracted from the frequency domain, quantized, and stored, and the multi-channel structure is used so that the synthesized sound is not distorted while the amount of data to be stored is reduced.

Conventional sound analysis and synthesis techniques process the waveform sample by sample in the time domain. Because each unit of time of sound requires a very large number of samples, the amount of data to be processed and stored is huge, so a large amount of memory is needed after quantization, which burdens the system. Reducing the number of samples per unit time lowers the memory requirement, but it also increases the distortion of the sound.

How to design a sound analysis and synthesis technique and apparatus that needs little data and is not prone to distortion has therefore long been what users hope for, and it is the difficulty the present invention sets out to solve. Drawing on many years of practical experience in the research, development, and sale of sound analysis and synthesis products, and after extensive design work, discussion, and numerous trial samples and improvements, the inventors arrived at the present method and apparatus for multi-channel sound analysis and synthesis.

The primary object of the invention is to provide a method and apparatus for multi-channel sound analysis and synthesis that transforms the sound waveform into the frequency domain and then extracts frequencies and amplitudes, so that the amount of data to be stored, and hence the memory required, is reduced.

A secondary object of the invention is to provide a method and apparatus for multi-channel sound analysis and synthesis that synthesizes the sound from multiple channels, so that the frequency components are preserved more completely and the synthesized sound is almost indistinguishable from the original.

To give the examiner a fuller understanding of the features, structure, method, and effects of the invention, a preferred embodiment is described in detail below.

Referring first to FIG. 1, a flowchart of the multi-channel sound analysis method of the invention, the main steps are as follows:

Step 10: record the sound to be processed with a recording device, and sample it at a sampling frequency selected by the user;
Step 11: divide the samples into fixed segments (nodes) according to how quickly the frequency of the sound changes, choosing a smaller segment when it changes quickly and a larger segment when it changes slowly; in this work the segment length was determined to be about 2 ms to 32 ms;
Step 12: to maintain the continuity of the sound, group a plurality of segments into one frame; the number of samples in each frame must be 2^n, and a frame with fewer than 2^n samples must be padded;
Step 13: process the frame data with a pre-emphasis filter;
Step 14: process the resulting data with a Hamming window;
Step 15: process the resulting data with a fast Fourier transform, converting it from a time-domain signal into a frequency-domain signal;
Step 16: in the frequency domain, extract a plurality of higher-energy frequencies and their corresponding amplitudes, then quantize the frequencies and amplitudes; because the same number of frequencies and amplitudes is extracted from every frame, the sound can be handled as multiple channels;
Step 17: store the quantized data in memory channel by channel; and
Step 18: determine whether all frames have been processed; if not, return to step 12; if so, end the multi-channel sound analysis. Because the sound waveform is transformed into the frequency domain and only frequencies and amplitudes are extracted, the amount of data to be stored is reduced and memory is saved.

Conversely, when the sound data stored in memory is to be read out, the sound synthesis method and apparatus are as shown in FIG. 2 and FIG. 3, which are a flowchart of the multi-channel sound synthesis method of the invention and a schematic diagram of its apparatus. The main steps are as follows (with reference to the apparatus shown in FIG. 3):

Step 20: read the frequency and amplitude of each channel from the memory 30;
Step 21: pass each frequency through its corresponding frequency divider 31 for frequency division;
Step 22: pass the amplitude, together with the divided frequency, through the waveform generator 32 to restore the original sound waveform of that channel;
Step 23: combine the sound waveforms of the different channels with an adder 33;
Step 24: determine whether all frequencies and amplitudes in the memory have been read; if so, execute step 25; if not, execute step [illegible]; and
Step 25: output the sound and end the sound synthesis. Because the sound is synthesized from multiple channels, the frequency components are preserved more completely, the synthesized sound is almost indistinguishable from the original, and the original sound is reproduced.

To help the examiner understand this work further, the multi-channel sound analysis method of the invention is illustrated with an example. Assume a sound sampled at 16 kHz. Because its frequency changes quickly, a smaller segment is chosen; assuming a segment of 2 ms, each segment contains 32 samples. Two adjacent segments in the time domain are treated as one frame, so each frame contains 64 samples. The frame is then processed by the pre-emphasis filter and the Hamming window.

[Figure: input and output of the pre-emphasis filter and Hamming window]

The output is processed with a fast Fourier transform, and the two frequencies with the greatest energy in this frame (f1, f2) and their two corresponding amplitudes (a1, a2) are found in the frequency domain, quantized, and stored in memory.

For the multi-channel sound synthesis method and apparatus of the invention, the frequencies (f1, f2) stored in memory are each divided by a frequency divider and passed, together with the amplitudes (a1, a2) respectively, through the waveform generators to recover the original waveforms. The two waveforms are then combined by the adder and output, reproducing the original sound.

In summary, the present invention relates to a method and apparatus for multi-channel sound analysis and synthesis in which the sound is transformed into the frequency domain, the important frequencies are extracted from the frequency domain, quantized, and stored, and the multi-channel structure is used so that the synthesized sound is not distorted while the amount of data to be stored is reduced. The invention is therefore novel, inventive, and industrially applicable, and meets the requirements for an invention patent application. This application is accordingly filed in accordance with the law, and the examiner is respectfully requested to grant the patent at an early date.

The foregoing is only a preferred embodiment of the invention and is not intended to limit the scope of its implementation. All equivalent changes and modifications made in accordance with the methods, structures, features, and spirit described in the claims of the invention shall fall within the scope of the claims.

Brief description of the drawings:
FIG. 1 is a flowchart of the multi-channel sound analysis method of the invention;
FIG. 2 is a flowchart of the multi-channel sound synthesis method of the invention; and
FIG. 3 is a schematic diagram of the apparatus for the multi-channel sound synthesis method of the invention.

Reference numerals:
30 memory
31 frequency divider
32 waveform generator
33 adder
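The worked example in the description (16 kHz sampling, 2 ms segments, two segments per frame, the two strongest frequencies per frame) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the pre-emphasis coefficient ALPHA, the local-maximum rule used to pick peaks, and all function names are assumptions that do not appear in the patent text, and the quantization step is omitted because the text does not specify it.

```python
import cmath
import math

SAMPLE_RATE = 16_000      # Hz, from the worked example
SEGMENT_MS = 2            # ms, from the worked example
SEGMENTS_PER_FRAME = 2    # two adjacent segments form one frame
ALPHA = 0.95              # assumed pre-emphasis coefficient (not given in the text)


def samples_per_frame(rate=SAMPLE_RATE, seg_ms=SEGMENT_MS, segs=SEGMENTS_PER_FRAME):
    """Samples per segment times segments per frame (32 * 2 in the example)."""
    return rate * seg_ms // 1000 * segs


def pre_emphasis(x, alpha=ALPHA):
    """y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def strongest_peaks(frame, rate=SAMPLE_RATE, count=2):
    """Return [(frequency_hz, magnitude), ...] for the strongest spectral peaks."""
    n = len(frame)
    if n == 0 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    windowed = [s * w for s, w in zip(pre_emphasis(frame), hamming(n))]
    mag = [abs(c) for c in fft(windowed)[: n // 2 + 1]]
    # Assumed peak rule: keep bins larger than both neighbours, so that one
    # spectral peak is not counted twice because of window leakage.
    peaks = [k for k in range(1, n // 2) if mag[k - 1] < mag[k] >= mag[k + 1]]
    peaks.sort(key=lambda k: mag[k], reverse=True)
    return [(k * rate / n, mag[k]) for k in peaks[:count]]


# Usage: one 64-sample frame containing tones at 1 kHz and 3 kHz.
n = samples_per_frame()
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE) for t in range(n)]
(f1, a1), (f2, a2) = strongest_peaks(frame)
```

With a 64-sample frame at 16 kHz the frequency resolution is 250 Hz per bin, so only frequencies near multiples of 250 Hz are resolved cleanly. Note also that pre-emphasis boosts high frequencies, so the ranking of peaks after filtering can differ from the ranking in the raw signal.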

Claims (6)

1. A method of multi-channel sound analysis, comprising the following steps:
(a) sampling the sound to be processed at a sampling frequency;
(b) dividing the samples into fixed segments;
(c) processing the resulting data with a pre-emphasis filter;
(d) processing the resulting data with a Hamming window;
(e) processing the resulting data with a fast Fourier transform, converting it from a time-domain signal into a frequency-domain signal;
(f) extracting, in the frequency domain, a plurality of frequencies with the greatest energy and their corresponding amplitudes, and quantizing the frequencies and amplitudes; and
(g) storing the quantized data in memory in a plurality of channels.

2. The method of claim 1, wherein in step (b) the samples may be divided into fixed segments according to how quickly the frequency changes, a smaller segment being chosen when the sound frequency changes quickly and a larger segment being chosen when it changes slowly.

3. The method of claim 1, wherein in step (b), to maintain the continuity of the sound, a plurality of segments may be grouped into one frame, the number of samples in each frame must be 2^n, and a frame with fewer than 2^n samples must be padded.

4. A method of multi-channel sound synthesis, comprising the following steps:
(A) reading the frequency and amplitude of each channel from memory;
(B) dividing the frequency with a frequency divider;
(C) passing the amplitude, together with the divided frequency, through a corresponding waveform generator to restore a waveform; and
(D) combining the sound waveforms from the different channels with an adder and outputting the result.

5. A multi-channel sound synthesis apparatus, comprising:
a memory for storing a plurality of quantized frequencies and amplitudes;
a plurality of frequency dividers whose inputs are connected to the memory, each dividing the corresponding frequency received from the memory;
a plurality of waveform generators, each having one end connected to the corresponding frequency divider and the other end connected to the memory, so that the sound amplitude and its corresponding divided frequency are restored into a waveform; and
an adder connected to the outputs of the plurality of waveform generators, which combines the received waveforms and outputs the result.

6. The multi-channel sound synthesis apparatus of claim 5, wherein the line connecting each waveform generator to the memory carries the corresponding amplitude.
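The synthesis path recited in claims 4 to 6 (memory, frequency divider, waveform generator, adder) can be sketched in Python as follows. This is only one plausible reading: the patent does not say how the divider is configured or what wave shape the generator produces, so the integer divide ratio taken from a master clock, the 4 MHz clock value, the sine wave shape, and every function name below are assumptions made for illustration.

```python
import math

CLOCK_HZ = 4_000_000   # hypothetical master clock feeding the dividers
SAMPLE_RATE = 16_000   # Hz, output sample rate (matches the worked example)


def divider_ratio(freq_hz, clock_hz=CLOCK_HZ):
    """Integer divide ratio so that clock_hz / ratio approximates freq_hz."""
    return max(1, round(clock_hz / freq_hz))


def waveform(freq_hz, amplitude, n_samples, sample_rate=SAMPLE_RATE):
    """Assumed waveform generator: a sine tone with the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def synthesize_frame(channels, n_samples, sample_rate=SAMPLE_RATE, clock_hz=CLOCK_HZ):
    """Sum one tone per channel; channels is [(frequency_hz, amplitude), ...]."""
    out = [0.0] * n_samples
    for freq_hz, amplitude in channels:
        # Frequency divider: the tone actually produced is clock / ratio,
        # which is only an approximation of the stored frequency.
        tone_hz = clock_hz / divider_ratio(freq_hz, clock_hz)
        wave = waveform(tone_hz, amplitude, n_samples, sample_rate)
        # Adder: accumulate this channel into the output.
        out = [a + b for a, b in zip(out, wave)]
    return out


# Usage: two channels (f1, a1) and (f2, a2) for one 64-sample frame.
samples = synthesize_frame([(1000.0, 1.0), (3000.0, 0.5)], n_samples=64)
```

Because the divide ratio is an integer, the generated tone can deviate slightly from the stored frequency (3000 Hz becomes about 3000.75 Hz with the assumed clock). The sketch also restarts every tone at zero phase, so a real implementation would need to keep phase continuous across frames.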

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW88118460A TW446935B (en) 1999-10-26 1999-10-26 Method and apparatus of multi-channel voice analysis and synthesis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW88118460A TW446935B (en) 1999-10-26 1999-10-26 Method and apparatus of multi-channel voice analysis and synthesis

Publications (1)

Publication Number Publication Date
TW446935B (en) 2001-07-21

Family

ID=21642760

Family Applications (1)

Application Number Title Priority Date Filing Date
TW88118460A TW446935B (en) 1999-10-26 1999-10-26 Method and apparatus of multi-channel voice analysis and synthesis

Country Status (1)

Country Link
TW (1) TW446935B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8700388B2 (en) 2008-04-04 2014-04-15 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio transform coding using pitch correction
TWI480861B (en) * 2006-02-07 2015-04-11 Nokia Corp Method, apparatus, and system for controlling time-scaling of audio signal

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MM4A Annulment or lapse of patent due to non-payment of fees