JP6163785B2

JP6163785B2 - Voice band extending apparatus and program

Info

Publication number: JP6163785B2
Application number: JP2013039606A
Authority: JP
Inventors: 大藤枝
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2013-02-28
Filing date: 2013-02-28
Publication date: 2017-07-19
Anticipated expiration: 2033-02-28
Also published as: JP2014167557A

Description

The present invention relates to a voice band extending apparatus and program, and can be applied to, for example, telephone equipment (including softphones).

Legacy telephone equipment can transmit audio signals in a frequency band of about 300 Hz to 3.4 kHz. A narrowband audio signal band-limited to this telephone band sounds more muffled than the original voice, which makes words harder to hear.

To solve this problem, band extension techniques have been developed that improve the clarity of speech by adding an extension signal at 3.4 kHz and above, extending the narrowband signal into a wideband audio signal. They are used, for example, to improve the sound quality of the audio signal output by telephone equipment.

The approach the applicant focuses on generates an extension signal by processing the narrowband audio signal in the time domain, and then generates a pseudo-wideband audio signal by combining the narrowband audio signal with the generated extension signal. Most such time-domain processing is nonlinear. Many methods also use suitable noise as part or all of the extension signal. Because this approach works in the time domain and needs no codebook, it has the advantage of achieving band extension with little computation and few resources.

FIG. 6 shows the most basic configuration of this approach, which is briefly described below.

The voice band extending apparatus 100 configured as in FIG. 6 includes a sampling conversion unit 101, a band-pass filtering unit (BPF) 102, a full-wave rectification unit 103, a high-pass filtering unit (HPF) 104, a frequency analysis unit 105, an extension gain calculation unit 106, a multiplication unit 107, and an addition unit 108.

The sampling conversion unit 101 upsamples the narrowband audio signal S, whose sampling frequency is 8 kHz, to a signal with a sampling frequency of 16 kHz. The upsampled narrowband audio signal XL is supplied to the band-pass filtering unit 102 and the addition unit 108. The band-pass filtering unit 102 passes, for example, the 2 kHz to 4 kHz band of the upsampled narrowband audio signal XL. The full-wave rectification unit 103 full-wave rectifies the filtered signal XB into a signal E having a band of, for example, 0 Hz to 8 kHz, and the high-pass filtering unit 104 passes the components of the full-wave rectified signal at, for example, 4 kHz and above to generate the extension signal EH. The frequency analysis unit 105 frequency-analyzes the narrowband audio signal S and calculates a spectral parameter SF relating to at least one of the amplitude envelope of the frequency spectrum and the slope of the frequency spectrum. The extension gain calculation unit 106 calculates an extension gain EG from the spectral parameter SF and supplies it to the multiplication unit 107 (the method described in Non-Patent Document 1 can be used for the frequency analysis and the extension gain calculation). The multiplication unit 107 multiplies the generated extension signal EH by the calculated extension gain EG to adjust the amplitude of the extension signal, and the addition unit 108 combines (adds) the upsampled narrowband audio signal XL and the amplitude-adjusted extension signal XH to generate the pseudo-wideband audio signal X.
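
The role of the full-wave rectification step can be illustrated with a small sketch. It is not the apparatus of FIG. 6: it skips the filters and only shows that taking the absolute value of a band-limited tone creates energy above 4 kHz. The 3 kHz test tone, the 1600-sample length, and the helper name tone_amplitude are assumptions made for illustration.

```python
import cmath
import math

FS = 16000  # sampling frequency after upsampling (Hz), as in the text


def tone_amplitude(x, freq_hz, fs=FS):
    """Amplitude of the sinusoidal component of x at freq_hz (single-bin DFT)."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs)
              for n, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)


# Hypothetical input: a 3 kHz tone, inside the 2 kHz to 4 kHz band passed by BPF 102.
n_samples = 1600
xb = [math.sin(2 * math.pi * 3000 * n / FS) for n in range(n_samples)]

# Full-wave rectification (unit 103): absolute value of every sample.
e = [abs(v) for v in xb]

amp_before = tone_amplitude(xb, 6000)  # a pure 3 kHz tone has no 6 kHz component
amp_after = tone_amplitude(e, 6000)    # rectification adds a second harmonic at 6 kHz
```

In the full pipeline, HPF 104 would keep only such components at 4 kHz and above as the extension signal EH.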

Non-Patent Document 1: Naofumi Aoki, "A Band Extension Technique for Narrow-Band Telephony Speech Based on Full Wave Rectification", IEICE Trans. Commun., Vol. E93-B(3), pp. 729-731, 2010.

However, conventional voice band extending apparatuses cannot sufficiently extend the high band of unvoiced sounds and therefore cannot improve the clarity or intelligibility of speech, so the generated pseudo-wideband speech sounds muffled.

If the high band of unvoiced sounds is forcibly extended to avoid this, voiced sounds are extended excessively, and the generated pseudo-wideband speech sounds as if noise were superimposed on it.

The present invention aims to solve the above conventional problems. Its object is to extend a narrowband voice signal whose frequency band is limited beyond the limited band with a small amount of computation, such that the extended speech has sound quality and intelligibility at a level sufficient for practical use.

A first aspect of the present invention is a voice band extending apparatus that extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, the apparatus comprising: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means limits the maximum value of the extension gain according to the magnitude of the power information.
A second aspect of the present invention is a voice band extending apparatus that extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, the apparatus comprising: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means controls, according to the magnitude of the power information, how readily the value of the extension gain becomes large in the extension gain calculation method.
A third aspect of the present invention is a voice band extending apparatus that extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, the apparatus comprising: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means controls, according to the magnitude of the power information, a parameter of a nonlinear expression of the extension gain calculation method (the exponent of the squared term).
A fourth aspect of the present invention is a voice band extending apparatus that extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, the apparatus comprising: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means supports both a linear-function expression and a quadratic-function expression as the extension gain calculation method, and selects, according to the magnitude of the power information, either the linear function or the quadratic function as the calculation method to apply.
A fifth aspect of the present invention is a voice band extending apparatus that extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, the apparatus comprising: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain spectral parameters including an in-narrowband low-band power covering a band below a predetermined frequency, an in-narrowband high-band power covering a band above the predetermined frequency, and a gradient index; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameters, an extension gain for adjusting the magnitude of the extension component in the extension band signal, by selecting, according to the magnitude of the power information, which feature to use: the value obtained by dividing the in-narrowband high-band power by the in-narrowband low-band power, or the gradient index.

A voice band extending program according to a sixth aspect of the present invention causes a computer mounted in a voice band extending apparatus, which extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, to function as: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means limits the maximum value of the extension gain according to the magnitude of the power information.
A voice band extending program according to a seventh aspect of the present invention causes a computer mounted in a voice band extending apparatus, which extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, to function as: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means controls, according to the magnitude of the power information, how readily the value of the extension gain becomes large in the extension gain calculation method.
A voice band extending program according to an eighth aspect of the present invention causes a computer mounted in a voice band extending apparatus, which extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, to function as: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means controls, according to the magnitude of the power information, a parameter of a nonlinear expression of the extension gain calculation method (the exponent of the squared term).
A voice band extending program according to a ninth aspect of the present invention causes a computer mounted in a voice band extending apparatus, which extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, to function as: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain a spectral parameter; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameter, an extension gain for adjusting the magnitude of the extension component in the extension band signal, and for dynamically controlling how the extension gain is obtained according to the power information, wherein (4) the extension gain forming means supports both a linear-function expression and a quadratic-function expression as the extension gain calculation method, and selects, according to the magnitude of the power information, either the linear function or the quadratic function as the calculation method to apply.
A voice band extending program according to a tenth aspect of the present invention causes a computer mounted in a voice band extending apparatus, which extends a narrowband audio signal having a limited frequency band so as to include signal components of an extension band outside the limited band, to function as: (1) frequency analysis means for frequency-analyzing the narrowband audio signal to obtain spectral parameters including an in-narrowband low-band power covering a band below a predetermined frequency, an in-narrowband high-band power covering a band above the predetermined frequency, and a gradient index; (2) power information acquisition means for obtaining power information on the narrowband audio signal; and (3) extension gain forming means for obtaining, based on the spectral parameters, an extension gain for adjusting the magnitude of the extension component in the extension band signal, by selecting, according to the magnitude of the power information, which feature to use: the value obtained by dividing the in-narrowband high-band power by the in-narrowband low-band power, or the gradient index.

According to the present invention, the high band of unvoiced sounds can be sufficiently extended. As a result, it is possible to provide a voice band extending apparatus and program that improve the clarity and intelligibility of speech and generate a pseudo-wideband audio signal that sounds clear and open.

FIG. 1 is a scatter diagram of the low-band component power (0 Hz to 4 kHz) and the high-band component power (4 kHz to 8 kHz) of wideband speech.
FIG. 2 is a functional block diagram showing the configuration of the voice band extending method of the first embodiment.
FIG. 3 is a functional block diagram showing the configuration of the voice band extending method of the second embodiment.
FIG. 4 is a functional block diagram showing the configuration of the voice band extending method of the third embodiment.
FIG. 5 is a functional block diagram showing the configuration of the voice band extending method of the fourth embodiment.
FIG. 6 is a functional block diagram showing the configuration of a conventional basic voice band extending apparatus.

(A) Technical idea common to the embodiments
Before the voice band extending apparatus of each embodiment is described, the technical idea common to the embodiments is explained.

The voice band extending apparatus of each embodiment dynamically changes the extension gain by which the generated extension signal is multiplied, according to the power of the narrowband speech before extension.

FIG. 1 is a scatter plot obtained from a wideband audio signal with a sampling frequency of 16 kHz that is not limited to the telephone band. The low-band component of 0 Hz to 4 kHz (corresponding to the signal from which the narrowband audio power is calculated) and the high-band component of 4 kHz to 8 kHz were extracted with an appropriate LPF and HPF, respectively, and plotted with the low-band component power on the horizontal axis and the high-band component power on the vertical axis. FIG. 1 shows that the high-band component power becomes large only when the low-band component power is small, and cannot become large when the low-band component power is large.

Based on this fact, the voice band extending apparatus of each embodiment determines the extension gain dynamically according to the magnitude of the narrowband audio power, for example so that the maximum value of the extension gain is limited, or so that how readily the extension gain becomes large is adjusted (specific methods are described later). As a result, a scatter plot with the narrowband audio power on the horizontal axis and the extension signal power on the vertical axis would show characteristics close to those of FIG. 1.

(B) First embodiment
Next, a first embodiment of the voice band extending apparatus and program according to the present invention is described with reference to the drawings.

(B-1) Configuration of the first embodiment
FIG. 2 is a block diagram showing the functional configuration of the voice band extending apparatus of the first embodiment; parts identical or corresponding to those in FIG. 6 are given the same or corresponding reference numerals. Each part of the voice band extending apparatus of the first embodiment may be implemented in hardware, or the apparatus may be implemented as a CPU and a program executed by the CPU (a voice band extending program); for example, the function of each block shown in FIG. 2 may be implemented as a subroutine of the program. In either case, the apparatus can be represented functionally by FIG. 2.

In FIG. 2, the voice band extending apparatus 200 of the first embodiment includes a sampling conversion unit 101, a band-pass filtering unit (BPF) 102, a full-wave rectification unit 103, a high-pass filtering unit (HPF) 104, a frequency analysis unit 105, a multiplication unit 107, and an addition unit 108, which are the same as those of the voice band extending apparatus 100 shown in FIG. 6, together with a power calculation unit 209 and an extension gain calculation unit 210, which are specific to the voice band extending apparatus 200 of the first embodiment.

The power calculation unit 209 calculates the power SP of the narrowband audio signal S and supplies it to the extension gain calculation unit 210.

Using a predetermined, variable extension gain calculation method, the extension gain calculation unit 210 calculates the extension gain EG from the spectral parameter SF supplied by the frequency analysis unit 105 and the power SP of the narrowband audio signal S supplied by the power calculation unit 209, and supplies the resulting extension gain EG to the multiplication unit 107.

(B-2) Operation of the first embodiment
Next, the operation of the voice band extending apparatus 200 of the first embodiment is described.

The voice band extending apparatus 200 of the first embodiment differs from the voice band extending apparatus 100 shown in FIG. 6 in that it has the power calculation unit 209 and in that the extension gain calculation unit 210 uses the output of the power calculation unit 209 in addition to the output of the frequency analysis unit 105. The following therefore touches on the operation of the frequency analysis unit 105 and then describes the operation of the power calculation unit 209 and the extension gain calculation unit 210.

As described above, the frequency analysis unit 105 frequency-analyzes the narrowband audio signal S and calculates a spectral parameter SF relating to at least one of the amplitude envelope of the frequency spectrum and the slope of the frequency spectrum.

Examples of such calculation methods include the method described in Non-Patent Document 1 and the method proposed by the same inventor in Japanese Patent Application No. 2012-258651.

In the former method, the gradient index GI expressed by the following equations (1) to (4) serves as the spectral parameter SF. The gradient index GI is an indicator of how often, and by how much, the slope direction of the signal waveform changes. In equations (1) to (4), n is the time index and S(n) is the narrowband audio signal.
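
Equations (1) to (4) are not reproduced in this text, so the following is only a sketch under an assumption: it uses a gradient-index definition commonly cited in the bandwidth-extension literature (the sum of slope magnitudes at points where the slope direction flips, normalized by the square root of the frame energy). It may differ in detail from the equations of the patent, and the function name gradient_index is hypothetical.

```python
import math


def gradient_index(frame):
    """Gradient index of one frame, using a commonly cited definition (an assumption here)."""
    energy = sum(v * v for v in frame)
    if energy == 0.0 or len(frame) < 3:
        return 0.0
    total = 0.0
    prev_sign = 0
    for n in range(1, len(frame)):
        diff = frame[n] - frame[n - 1]
        sign = (diff > 0) - (diff < 0)  # +1, 0 or -1: direction of the slope
        if n >= 2:
            # 1 where the slope direction flips (0.5 if one slope is flat), else 0
            psi = 0.5 * abs(sign - prev_sign)
            total += psi * abs(diff)
        prev_sign = sign
    return total / math.sqrt(energy)


# A rapidly alternating frame (noise-like) versus a slowly varying one (voiced-like).
noisy = [1.0, -1.0] * 50
smooth = [math.sin(2 * math.pi * n / 100) for n in range(100)]
```

With this definition, the noise-like frame yields a much larger value than the smooth frame, which is the property that makes such an index useful for telling unvoiced from voiced sounds.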

In the calculation method proposed in Japanese Patent Application No. 2012-258651, the modified gradient index MGI expressed by any of the following equations (5) to (8) serves as the spectral parameter SF. The modified gradient index MGI is a parameter that is highly correlated with the gradient index GI but jumps less in value than GI. Alternatively, the parameter MGI'(n) proposed in Japanese Patent Application No. 2012-258651, obtained by smoothing the modified gradient index MGI as defined by equations (9) and (10), may be used as the spectral parameter SF. In equation (9), b is a forgetting factor of at least 0 and less than 1.

The power calculation unit 209 calculates the power SP of the narrowband audio signal S and supplies it to the extension gain calculation unit 210. Any method may be used to calculate the power. For example, a moving average of the absolute value of the narrowband audio signal S, or a moving average of its squared value, can be used as the power SP.
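
The two power measures mentioned above can be sketched as follows. The window length of 160 samples (20 ms at 8 kHz) and the function name moving_average_power are assumptions for illustration; the text does not specify a window.

```python
from collections import deque


def moving_average_power(samples, window=160, use_square=True):
    """Yield a running power estimate SP for each input sample.

    window=160 (20 ms at 8 kHz) is a hypothetical choice, not taken from the source.
    use_square=True averages squared values; False averages absolute values.
    """
    buf = deque(maxlen=window)
    running = 0.0
    for s in samples:
        value = s * s if use_square else abs(s)
        if len(buf) == window:
            running -= buf[0]  # the oldest value is about to leave the window
        buf.append(value)
        running += value
        yield running / len(buf)
```

For example, list(moving_average_power([2.0, 0.0, 0.0], window=2)) gives [4.0, 2.0, 0.0].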

Depending on the frequency analysis method applied, the frequency analysis unit 105 may itself need to calculate the power SP of the narrowband audio signal S. In that case, the power calculation unit inside the frequency analysis unit 105 and the power calculation unit 209 may be shared.

The extension gain calculation unit 210 basically converts the spectral parameter SF into the extension gain EG by multiplying SF by a conversion coefficient, and supplies the resulting extension gain EG to the multiplication unit 107.

In the first embodiment, the conversion coefficient is not fixed but changes dynamically according to the narrowband audio power SP. The relationship between the narrowband audio power SP and the conversion coefficient is determined in advance according to the relationship between the low-band component power and the high-band component power shown in FIG. 1. The extension gain calculation unit 210 contains either a conversion table or a calculation unit for a conversion function (which may be a step function) that converts the narrowband audio power SP into a conversion coefficient. It obtains the conversion coefficient corresponding to the input narrowband audio power SP and then multiplies the spectral parameter SF by that coefficient to obtain the extension gain EG. Instead of converting SP into a conversion coefficient and then multiplying SF by it, the unit may switch the conversion table or conversion formula according to SP and convert the spectral parameter SF directly into the extension gain EG.
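
A conversion table of the step-function kind described above might look like the following sketch. The threshold and coefficient values are invented for illustration; the source only requires that the mapping be prepared in advance from the relationship in FIG. 1, with smaller coefficients for larger narrowband power.

```python
import bisect

# Hypothetical table: ascending power thresholds and one coefficient per interval.
# Coefficients fall as the narrowband power SP rises.
SP_THRESHOLDS = [0.01, 0.1]
COEFFICIENTS = [2.0, 1.0, 0.5]


def table_conversion_coefficient(sp):
    """Step-function lookup of the conversion coefficient for power sp."""
    return COEFFICIENTS[bisect.bisect_right(SP_THRESHOLDS, sp)]


def table_extension_gain(sf, sp):
    """Extension gain EG: spectral parameter SF times the looked-up coefficient."""
    return table_conversion_coefficient(sp) * sf
```

A lookup keeps the per-sample cost to one search and one multiplication, in line with the low computational load the approach aims for.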

As described above, in a wideband audio signal with a sampling frequency of 16 kHz that is not limited to the telephone band, the power of the high-frequency component becomes large only when the power of the low-frequency component is small; when the power of the low-frequency component is large, the power of the high-frequency component cannot become large. The conversion table and conversion formula described above are prepared in advance on the basis of this fact.

For example, the method of converting the spectral parameter SF into the extension gain EG may be defined so that the maximum value of the extension gain EG is limited according to the magnitude of the narrowband audio power SP. Alternatively, it may be defined so that how readily the extension gain EG takes a large value is adjusted.

Here, suppose that the spectral parameter SF is a scalar and that the extension gain EG is calculated by multiplying the spectral parameter SF by a positive conversion coefficient A, as shown in equation (11). In that case, for example, the maximum value SPmax that the narrowband audio power SP can take is set in advance, the minimum and maximum values of the conversion coefficient A are denoted Amin and Amax, and equation (12) may be applied as the formula for converting the narrowband audio power SP into the conversion coefficient A.

EG = A · SF … (11)
A = Amax - (Amax - Amin) · SP / SPmax … (12)

According to equations (11) and (12), when the narrowband audio power SP is large (corresponding to voiced sound), the spectral parameter SF is multiplied by a small conversion coefficient A, so the extension gain EG takes a relatively small value. When the narrowband audio power SP is small (corresponding to unvoiced sound), the spectral parameter SF is multiplied by a large conversion coefficient A, so the extension gain EG takes a relatively large value. As a result, a scatter diagram (not shown) with the narrowband audio power SP on the horizontal axis and the power of the extension signal XH on the vertical axis has characteristics close to those of FIG. 1 described above.
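Equations (11) and (12) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default values of SPmax, Amin and Amax, and the clamping of SP to the range [0, SPmax] are hypothetical and do not come from the patent text.

```python
def conversion_coefficient(sp, sp_max, a_min, a_max):
    """Equation (12): linear map from narrowband power SP to coefficient A."""
    # Assumption: SP is clamped to [0, SPmax] so that A stays in [Amin, Amax].
    sp = min(max(sp, 0.0), sp_max)
    return a_max - (a_max - a_min) * sp / sp_max


def extension_gain(sf, sp, sp_max=1.0, a_min=0.1, a_max=2.0):
    """Equation (11): EG = A * SF, with A obtained from equation (12)."""
    return conversion_coefficient(sp, sp_max, a_min, a_max) * sf
```

With these illustrative defaults, a low-power (unvoiced-like) frame receives a larger gain than a high-power (voiced-like) frame that has the same spectral parameter.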

なお、狭帯域音声パワーＳＰの大小に応じて拡張ゲインＥＧを動的に決定する方法は、（１１）式及び（１２）式を適用した方法に限定されるものではない。例えば、狭帯域音声パワーＳＰが大きいときには小さく、狭帯域音声パワーＳＰが小さいときには大きくなるように拡張ゲインＥＧの最大値（上限値）ＥＧｍａｘを動的に算出し、固定係数を適用して算出した拡張ゲインＥＧの値が最大値ＥＧｍａｘを超えている場合には、拡張ゲインＥＧの値を最大値ＥＧｍａｘに制限する（置き換える）ようにしても良い。また例えば、変換係数Ａを制御すると共に（例えば（１２）式を適用する）、拡張ゲインＥＧの最大値を動的に制限する方法も適用するようにしても良い。 Note that the method of dynamically determining the expansion gain EG according to the size of the narrowband audio power SP is not limited to the method using the equations (11) and (12). For example, the maximum value (upper limit value) EGmax of the expansion gain EG is dynamically calculated so that it is small when the narrowband audio power SP is large and large when the narrowband audio power SP is small, and is calculated by applying a fixed coefficient. When the value of the expansion gain EG exceeds the maximum value EGmax, the value of the expansion gain EG may be limited (replaced) to the maximum value EGmax. Further, for example, a method of dynamically limiting the maximum value of the expansion gain EG may be applied while controlling the conversion coefficient A (for example, applying equation (12)).
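The upper-limit variant described in the preceding paragraph might look like the following sketch. The linear shape of the ceiling, the function names and all numeric defaults are assumptions chosen for illustration; the text only requires that EGmax be small when SP is large and large when SP is small.

```python
def gain_ceiling(sp, sp_max, eg_max_low, eg_max_high):
    """Upper limit EGmax: high when SP is small, low when SP is large."""
    # Assumption: a linear interpolation between the two limits.
    ratio = min(max(sp / sp_max, 0.0), 1.0)
    return eg_max_high - (eg_max_high - eg_max_low) * ratio


def limited_gain(sf, sp, fixed_a=1.0, sp_max=1.0,
                 eg_max_low=0.2, eg_max_high=1.5):
    """Gain from a fixed coefficient, replaced by EGmax when it exceeds it."""
    eg = fixed_a * sf
    return min(eg, gain_ceiling(sp, sp_max, eg_max_low, eg_max_high))
```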

Equation (12) above represents the case where the relationship between the narrowband audio power SP and the conversion coefficient A is linear. However, a nonlinear relational expression may be applied instead. In that case, the extension gain EG may be determined dynamically according to the magnitude of the narrowband audio power SP by adjusting the nonlinearity, for example by varying a parameter of the nonlinear expression (such as the exponent of a squared term).
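One possible nonlinear counterpart of equation (12) raises the normalized power to an adjustable exponent. The specific form below, the function name and the clamping are assumptions made for illustration; the text does not fix any particular nonlinear expression.

```python
def nonlinear_coefficient(sp, sp_max, a_min, a_max, exponent=2.0):
    """Hypothetical nonlinear variant of equation (12).

    exponent = 1.0 reproduces the linear relation; other values bend the
    curve, which is one way of "adjusting the nonlinearity".
    """
    ratio = min(max(sp / sp_max, 0.0), 1.0)
    return a_max - (a_max - a_min) * ratio ** exponent
```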

The description so far has pictured the conversion formula between the narrowband audio power SP and the conversion coefficient A (or between the narrowband audio power SP and the extension gain EG) as following a continuous curve. However, one or more thresholds may be introduced for the narrowband audio power SP so that the value is determined selectively (that is, discretely or stepwise).
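A stepwise determination with one or more thresholds can be sketched as a table lookup. The threshold and coefficient values are purely illustrative assumptions, as is the convention that a power equal to a threshold falls into the higher step.

```python
from bisect import bisect_right


def stepwise_coefficient(sp, thresholds=(0.1, 0.5), coefficients=(2.0, 1.0, 0.3)):
    """Pick a conversion coefficient A from a step function of SP.

    thresholds must be sorted in ascending order, and coefficients must
    have exactly one more entry than thresholds.
    """
    return coefficients[bisect_right(thresholds, sp)]
```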

Furthermore, unlike the dynamic determination methods illustrated so far, a plurality of extension gain calculation methods may be prepared, one of which is selected according to the narrowband audio power SP and then used to convert the spectral parameter SF into the extension gain EG. For example, a single threshold TSP for the narrowband audio power SP may be set in advance; when SP ≥ TSP, the extension gain EG is calculated as a linear function of the spectral parameter SF, and when SP < TSP, it is calculated as a quadratic function of the spectral parameter SF.
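The selection between two calculation methods by a single threshold TSP can be sketched as follows. The polynomial coefficients and the threshold value are hypothetical placeholders; the text specifies only that one branch is a linear function and the other a quadratic function of SF.

```python
def select_gain(sf, sp, tsp=0.3, linear=(0.5, 0.0), quadratic=(1.0, 0.5, 0.0)):
    """Choose the gain formula according to the narrowband power SP."""
    if sp >= tsp:
        # High power: EG is a linear function of SF.
        slope, offset = linear
        return slope * sf + offset
    # Low power: EG is a quadratic function of SF.
    c2, c1, c0 = quadratic
    return c2 * sf * sf + c1 * sf + c0
```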

また、今まで例示した動的な決定方法の説明では、スペクトルパラメータＳＦがスカラーであることが前提であるかのように記載したが、スペクトルパラメータＳＦは複数のパラメータを有するベクトルや行列等であっても良く、上記動的な決定方法が、スペクトルパラメータＳＦを構成するパラメータの種類や数を決定する方法であっても良い。 Further, in the description of the dynamic determination method exemplified so far, it is described as if the spectral parameter SF is a scalar, but the spectral parameter SF is a vector or matrix having a plurality of parameters. Alternatively, the dynamic determination method may be a method of determining the type and number of parameters constituting the spectrum parameter SF.

For example, suppose that the spectral parameter SF consists of three parameters: the band power SPL of 0 Hz to 2 kHz (low-band power within the narrow band), the band power SPH of 2 kHz to 4 kHz (high-band power within the narrow band), and the gradient index GI. With a single threshold TSP for the narrowband audio power SP set in advance, the extension gain EG can be calculated from the value obtained by dividing the high-band power SPH by the low-band power SPL when SP ≥ TSP, and from the gradient index GI when SP < TSP.
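When the spectral parameter is a vector, the threshold decides which components feed the gain. In the sketch below, the class name, the scale factors k_ratio and k_gi, the threshold value and the small constant eps that guards against division by zero are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpectralParams:
    spl: float  # band power, 0 Hz to 2 kHz
    sph: float  # band power, 2 kHz to 4 kHz
    gi: float   # gradient index


def gain_from_vector(sf, sp, tsp=0.3, k_ratio=1.0, k_gi=0.1, eps=1e-12):
    """Select which components of SF determine EG, based on SP."""
    if sp >= tsp:
        # High power: use the high-band to low-band power ratio.
        return k_ratio * sf.sph / (sf.spl + eps)
    # Low power: use the gradient index.
    return k_gi * sf.gi
```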

(B-2) Effects of the First Embodiment
According to the first embodiment, the extension gain EG by which the generated extension signal EH is multiplied is determined dynamically according to the narrowband audio power SP, so that the relationship between low-frequency power and high-frequency power found in a wideband audio signal not limited to the telephone band is also realized in the extended pseudo-wideband audio signal. As a result, the high band of unvoiced sounds can be extended sufficiently, which improves the clarity and intelligibility of speech, while voiced sounds are not extended excessively and no new noise is superimposed. Pseudo-wideband speech that sounds clear and open can thus be obtained.

(C) Second Embodiment
Next, a second embodiment of the voice band extending apparatus and program according to the present invention will be described with reference to the drawings.

(C-1) Configuration of the Second Embodiment
FIG. 3 is a block diagram showing the configuration of the voice band extending apparatus of the second embodiment. Parts identical or corresponding to those in FIG. 1 described above are denoted by the same or corresponding reference numerals.

In FIG. 3, the voice band extending apparatus 300 of the second embodiment has, like the voice band extending apparatus 200 of the first embodiment, a sampling conversion unit 101, a band-pass filtering unit (BPF) 102, a full-wave rectification unit 103, a high-pass filtering unit (HPF) 104, a frequency analysis unit 105, a multiplication unit 107, an addition unit 108, a power calculation unit 209, and an extension gain calculation unit 210. In addition, it has a long-term average unit 311 and a power normalization unit 312, which are specific to the second embodiment.

長期平均部３１１は、パワー算出部２０９から与えられた狭帯域音声パワーＳＰの長期平均値ｌｏｎｇＳＰを算出し、得られた狭帯域音声パワーの長期平均値ｌｏｎｇＳＰはパワー正規化部３１２に与えるものである。 The long-term average unit 311 calculates the long-term average value longSP of the narrowband voice power SP given from the power calculation unit 209, and the obtained long-term average value longSP of the narrowband voice power is given to the power normalization unit 312. is there.

パワー正規化部３１２は、パワー算出部２０９から与えられた狭帯域音声パワーＳＰを長期平均部３１１から与えられたその長期平均値ｌｏｎｇＳＰで除することで、狭帯域音声の正規化パワーＮＳＰを算出し、得られた狭帯域音声の正規化パワーＮＳＰを拡張ゲイン算出部２１０に与えるものである。 The power normalization unit 312 calculates the normalized power NSP of the narrowband speech by dividing the narrowband speech power SP given from the power computation unit 209 by the long-term average value longSP given from the long-term average unit 311. Then, the normalized power NSP of the obtained narrowband speech is given to the extension gain calculation unit 210.

Note that, unlike in the first embodiment, the extension gain calculation unit 210 of the second embodiment dynamically determines the extension gain EG, by which the generated extension signal EH is multiplied, according to the normalized power NSP of the narrowband speech.

(C-2) Operation of the Second Embodiment
Next, the operation of the voice band extending apparatus 300 of the second embodiment will be described.

The voice band extending apparatus 300 of the second embodiment differs from the voice band extending apparatus 200 of the first embodiment in that the power information of the narrowband audio signal given to the extension gain calculation unit 210 is the normalized power NSP instead of the plain power SP. The following description therefore focuses on the operation of the long-term average unit 311 and the power normalization unit 312, which produce the normalized power NSP.

長期平均部３１１は、パワー算出部２０９から与えられた狭帯域音声のパワーＳＰの長期平均値ｌｏｎｇＳＰを算出する。長期平均値ｌｏｎｇＳＰの算出方法には任意の方法を用いることができる。例えば、移動平均や、（１３）式に示すような時定数フィルタによる平滑化を適用することができる。（１３）式におけるｔａｕは、０＜ｔａｕ＜１の範囲内の値をとる時定数、演算子「←」は右辺から左辺への代入を表す。長期平均の長さをＴ秒とすると、移動平均を適用する場合にはＴ秒間の平均値を長期平均値とし、時定数フィルタを適用する場合には追従に要する時間がＴ秒となるような時定数ｔａｕによって平滑化された値を長期平均値とする。なお、移動平均を適用すると比較的大きなメモリ領域を確保する必要が生じるため、移動平均を適用する場合と比較すると、時定数フィルタを用いることが好ましい。長期平均の長さＴは、５秒〜２０秒程度が望ましい。 The long-term average unit 311 calculates the long-term average value longSP of the power SP of the narrowband speech given from the power calculation unit 209. Any method can be used as a method of calculating the long-term average value longSP. For example, moving average or smoothing by a time constant filter as shown in the equation (13) can be applied. In equation (13), tau is a time constant that takes a value in the range of 0 <tau <1, and the operator “←” represents assignment from the right side to the left side. If the length of the long-term average is T seconds, the average value for T seconds is the long-term average value when moving average is applied, and the time required for tracking is T seconds when the time constant filter is applied. A value smoothed by the time constant tau is defined as a long-term average value. Note that, when the moving average is applied, it is necessary to secure a relatively large memory area. Therefore, it is preferable to use a time constant filter as compared with the case where the moving average is applied. The long-term average length T is preferably about 5 to 20 seconds.

longSP ← tau · longSP + (1 - tau) · SP … (13)

The power normalization unit 312 calculates the normalized power NSP of the narrowband speech by dividing the narrowband audio power SP by the long-term average value longSP. Because NSP is normalized by the long-term average of the input power, it takes a large value for voiced sounds and a small value for unvoiced sounds regardless of the speaker's loudness or the microphone sensitivity. This normalization therefore avoids, for example, an excessively large extension gain EG during voiced periods when the speaker's voice is quiet, and an excessively small extension gain EG during unvoiced periods when the speaker's voice is loud.
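The time-constant filter of equation (13) and the subsequent normalization can be sketched as a small stateful class. Several details are assumptions rather than statements from the patent: the class name, the frame rate, the initial value of longSP, the choice tau = exp(-1 / (T · frame_rate)) as one common way to obtain a time constant of about T seconds, the floor that avoids division by zero, and the decision to update longSP before dividing.

```python
import math


class PowerNormalizer:
    """Long-term averaging (equation (13)) followed by NSP = SP / longSP."""

    def __init__(self, t_seconds=10.0, frame_rate=100.0, initial=1e-6):
        # Assumption: exponential smoothing whose time constant is about
        # t_seconds when update() is called frame_rate times per second.
        self.tau = math.exp(-1.0 / (t_seconds * frame_rate))
        self.long_sp = initial

    def update(self, sp):
        # Equation (13): longSP <- tau * longSP + (1 - tau) * SP
        self.long_sp = self.tau * self.long_sp + (1.0 - self.tau) * sp
        # Normalized power; the floor guards against division by zero.
        return sp / max(self.long_sp, 1e-12)
```

Only one value of state is kept per stream, which reflects the memory advantage over a moving average that the text mentions.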

また、狭帯域音声のパワーの長期平均値ｌｏｎｇＳＰの算出において、長期平均の長さＴを有限長としているので、話者が変わったり、マイク感度が変化したりした場合などでも、狭帯域音声の正規化パワーＮＳＰはＴ秒後には適切な値に戻すことができる。 In addition, in calculating the long-term average value longSP of the power of the narrowband speech, the long-term average length T is finite, so even if the speaker changes or the microphone sensitivity changes, the narrowband speech power The normalized power NSP can be returned to an appropriate value after T seconds.

(C-3) Effects of the Second Embodiment
According to the second embodiment, the following effects can be obtained in addition to the same effects as the first embodiment.

According to the second embodiment, the power of the narrowband speech is normalized by its long-term average value before being reflected in the extension gain. The degree to which voiced and unvoiced sounds are extended is therefore unaffected by the speaker's loudness or the microphone sensitivity, and even if the speaker or the microphone sensitivity changes, the apparatus adapts to the new environment after a certain time. As a result, a pseudo-wideband audio signal can be obtained that sounds clear and open, with further improved speech clarity and intelligibility.

(D) Third Embodiment
Next, a third embodiment of the voice band extending apparatus and program according to the present invention will be described with reference to the drawings.

FIG. 4 is a block diagram showing the configuration of the voice band extending apparatus of the third embodiment. Parts identical or corresponding to those in FIG. 3 of the second embodiment described above are denoted by the same or corresponding reference numerals.

図４において、第３の実施形態の音声帯域拡張装置４００は、第２の実施形態の音声帯域拡張装置３００の構成に加えて音声区間検出部４１３を備え、長期平均部４１４が音声区間検出部４１３からの検出信号ＶＡＤを利用するものになっている点が、第２の実施形態の音声帯域拡張装置３００と異なっている。 In FIG. 4, the voice band extending apparatus 400 of the third embodiment includes a voice section detecting unit 413 in addition to the configuration of the voice band expanding apparatus 300 of the second embodiment, and the long-term average unit 414 is a voice section detecting unit. The point that the detection signal VAD from 413 is used is different from the voice band expansion device 300 of the second embodiment.

音声区間検出部４１３は、狭帯域音声信号Ｓに基づいて、狭帯域音声信号Ｓが音声区間か無音区間かを判定し、得られた音声区間判定結果ＶＡＤを長期平均部４１４に与れる。 Based on the narrowband speech signal S, the speech segment detection unit 413 determines whether the narrowband speech signal S is a speech segment or a silent segment, and gives the obtained speech segment determination result VAD to the long-term average unit 414.

Any known method can be applied to detect voice sections. For example, the power of the narrowband audio signal S may be observed, and the signal judged to be in a voice section if the power is equal to or greater than a predetermined threshold and in a silent section if it is below the threshold. In this case, the amount of computation in the voice section detection unit 413 can be reduced by feeding it the narrowband audio power SP instead of the narrowband audio signal S.

The long-term average unit 414 of the third embodiment receives the voice section determination result VAD. When the narrowband audio signal S is in a voice section, it updates the long-term average value longSP of the narrowband speech power in the same way as the long-term average unit 311 of the second embodiment. When the narrowband audio signal S is in a silent section, it does not update longSP and holds the previous value. The long-term average value longSP, whether updated or held, is given to the power normalization unit 312 and processed as described in the second embodiment.

無音区間では、話者やマイク感度とは無関係に、狭帯域音声パワーＳＰが小さい。そこで、この第３の実施形態においては、無音区間で狭帯域音声のパワーの長期平均値ｌｏｎｇＳＰの更新を止めることで、長期平均値ｌｏｎｇＳＰが無音区間のパワーに追従して小さくなることを回避している。 In the silent section, the narrow band voice power SP is small regardless of the speaker and microphone sensitivity. Therefore, in the third embodiment, the update of the long-term average value longSP of the power of the narrowband speech in the silent period is stopped to avoid the long-term average value longSP following the power of the silent period and becoming smaller. ing.
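The gated update can be sketched by combining a power-threshold voice section detector with the time-constant filter of equation (13). The names, the threshold and the default tau are illustrative assumptions; any known detection method could replace power_vad.

```python
def power_vad(sp, threshold=1e-3):
    """Voice section if the power is at or above the threshold, else silent."""
    return sp >= threshold


class GatedLongTermAverage:
    """Long-term average that is frozen during silent sections."""

    def __init__(self, tau=0.999, initial=1e-2):
        self.tau = tau
        self.long_sp = initial

    def update(self, sp, is_speech):
        if is_speech:
            # Voice section: update as in equation (13).
            self.long_sp = self.tau * self.long_sp + (1.0 - self.tau) * sp
        # Silent section: the previous value is held unchanged.
        return self.long_sp
```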

第３の実施形態によれば、第２の実施形態と同様な効果に加え、以下の効果を奏することができる。 According to the third embodiment, in addition to the same effects as those of the second embodiment, the following effects can be achieved.

第３の実施形態によれば、無音区間で狭帯域音声のパワーの長期平均値の更新を止めるようにしたので、長期平均値が意図せずに小さくなり過ぎることを回避でき、拡張度合いが安定した擬似広帯域音声信号を得ることができる。 According to the third embodiment, since the update of the long-term average value of the power of the narrowband speech is stopped in the silent period, the long-term average value can be prevented from becoming unintentionally too small, and the degree of expansion is stable. Thus, a pseudo broadband audio signal can be obtained.

(E) Fourth Embodiment
Next, a fourth embodiment of the voice band extending apparatus and program according to the present invention will be described with reference to the drawings.

FIG. 5 is a block diagram showing the configuration of the voice band extending apparatus of the fourth embodiment. Parts identical or corresponding to those in FIG. 2 of the first embodiment described above are denoted by the same or corresponding reference numerals.

In FIG. 5, the voice band extending apparatus 500 of the fourth embodiment differs from the voice band extending apparatus 200 of the first embodiment in that it includes a signal processing unit 515 in addition to the configuration of the apparatus 200.

信号処理部５１５は、入力された狭帯域音声信号Ｓに所定の信号処理を施して処理後狭帯域音声信号Ｓ’を得て、周波数解析部１０５及びパワー算出部２０９に与えるものである。 The signal processing unit 515 performs predetermined signal processing on the input narrowband audio signal S to obtain a processed narrowband audio signal S ′, and provides it to the frequency analysis unit 105 and the power calculation unit 209.

Here, the predetermined signal processing can be any of a wide variety of processes, such as high-band emphasis filtering (generally called pre-emphasis), noise suppression, and formant emphasis. The signal processing unit 515 may perform one type of signal processing or two or more types. Because high-band emphasis filtering cancels the radiation characteristic of the lips, it allows the phonemic character of the narrowband speech to be reflected more accurately in the spectral parameter SF. In a noisy environment, noise suppression prevents the spectral parameter SF from being disturbed by noise. Noise suppression and high-band emphasis filtering may also be combined to emphasize the phonemic character further.
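As one concrete instance of such signal processing, pre-emphasis is commonly implemented as the first-order filter y[n] = x[n] - alpha · x[n-1]. The sketch below uses that common form; the coefficient value 0.97 is a typical choice in speech processing and an assumption here, since the patent text does not specify a filter.

```python
def pre_emphasis(samples, alpha=0.97):
    """First-order high-band emphasis: y[n] = x[n] - alpha * x[n-1]."""
    out = []
    prev = 0.0  # assumes silence before the first sample
    for x in samples:
        out.append(x - alpha * prev)
        prev = x
    return out
```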

FIG. 5 shows the processed narrowband audio signal S' obtained by the signal processing unit 515 being given to the frequency analysis unit 105 and the power calculation unit 209. However, S' may be given to only one of the frequency analysis unit 105 and the power calculation unit 209, and the sampling conversion unit 101 may be given the processed narrowband audio signal S' instead of the narrowband audio signal S. The signal processing unit 515 may also include a plurality of signal processing units with different configurations that produce processed narrowband audio signals subjected to different signal processing, each of which is given only to the corresponding one of the sampling conversion unit 101, the frequency analysis unit 105, and the power calculation unit 209.

第４の実施形態によれば、第１の実施形態と同様な効果に加え、以下の効果を奏することができる。 According to the fourth embodiment, in addition to the same effects as those of the first embodiment, the following effects can be achieved.

第４の実施形態によれば、狭帯域音声信号に適当な処理を施してから、後段の拡張処理や解析処理を実行するようにしたので、音韻性の反映を強化したり、雑音の影響を弱めたりすることができ、その結果、より音声の明瞭度や了解度が改善された擬似広帯域音声信号を得ることができる。 According to the fourth embodiment, after appropriate processing is performed on the narrowband audio signal, the subsequent expansion processing and analysis processing are executed, so that the reflection of phonology is enhanced or the influence of noise is reduced. As a result, it is possible to obtain a pseudo wideband audio signal with improved voice clarity and intelligibility.

(F) Other Embodiments
Various modified embodiments have been mentioned in the description of the above embodiments; further modified embodiments such as the following are also possible.

The fourth embodiment introduces a signal processing unit into the technical idea of the first embodiment, but the signal processing unit described in the fourth embodiment may also be introduced into the technical idea of the second or third embodiment.

In each of the above embodiments, the extension signal is generated by full-wave rectifying the 2 kHz to 4 kHz signal extracted by the BPF and limiting the result to the extension band with the HPF, but the method of generating the extension signal is not limited to this. For example, half-wave rectification, a power operation such as squaring, or a tanh operation may be applied instead of full-wave rectification. Although these are nonlinear processes, linear processing may also be used. The band extracted by the BPF is not limited to 2 kHz to 4 kHz, and the BPF filtering may be omitted. In each of the above embodiments the audio signal itself is extended, but the extension signal may instead be generated from a sound source signal obtained by linear prediction analysis or the like, or from a noise signal output by a noise source included in the configuration. The extension signal may also be generated by combining a plurality of signals statically or dynamically.
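The interchangeable nonlinear operations listed above can be sketched as a lookup of per-sample functions. The dictionary keys and function name are hypothetical, and the band-pass and high-pass filtering stages that surround this step are omitted.

```python
import math

# Per-sample nonlinearities that may replace full-wave rectification.
NONLINEARITIES = {
    "full_wave": abs,
    "half_wave": lambda x: max(x, 0.0),
    "square": lambda x: x * x,
    "tanh": math.tanh,
}


def apply_nonlinearity(samples, kind="full_wave"):
    """Apply the chosen nonlinearity to every sample of the band-limited signal."""
    func = NONLINEARITIES[kind]
    return [func(x) for x in samples]
```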

また、上記各実施形態において、拡張ゲイン算出部２１０の各種のパラメータや拡張ゲイン算出方法、動的な決定方法をユーザが手動で制御、選択できるようにしても良い。これにより、ユーザの好みに合わせた音質の擬似広帯域音声信号が得られる音声帯域拡張装置を実現できる。ここで、ユーザが選択できる選択肢を、狭帯域音声パワーの情報（狭帯域音声パワーそのもの、若しくは、狭帯域音声の正規化パワー）に応じて切り替えるようにしても良く、ユーザが選択した方法の処理の中で、狭帯域音声パワーの情報に応じて、変換式やパラメータを変更するようにしても良い。 In each of the above embodiments, the user may be able to manually control and select various parameters of the expansion gain calculation unit 210, the expansion gain calculation method, and the dynamic determination method. As a result, it is possible to realize an audio band expansion device that can obtain a pseudo wideband audio signal having a sound quality suited to the user's preference. Here, the options that can be selected by the user may be switched according to narrowband audio power information (the narrowband audio power itself or the normalized power of the narrowband audio). Among them, the conversion formula and parameters may be changed in accordance with the information on the narrowband audio power.

また、上記各実施形態において、拡張ゲイン算出部２１０の各種のパラメータや拡張ゲイン算出方法、動的な決定方法を、狭帯域音声信号Ｓを解析した結果に基づいて、自動的に制御できるようにしても良い。例えば、スペクトル包終の長期平均値やピッチ周波数の長期平均等の話者性の情報によって切り替える。このようにすると、使用環境に自動的に適応する音声帯域拡張装置を実現できる。ここで、自動制御された方法の処理の中で、狭帯域音声パワーの情報に応じて、変換式やパラメータを変更するようにすれば良い。 Further, in each of the above embodiments, various parameters of the expansion gain calculation unit 210, the expansion gain calculation method, and the dynamic determination method can be automatically controlled based on the result of analyzing the narrowband audio signal S. May be. For example, the switching is performed according to the information of the speaker property such as the long-term average value of the spectrum envelope and the long-term average of the pitch frequency. In this way, it is possible to realize a voice band expansion device that automatically adapts to the usage environment. Here, in the process of the automatically controlled method, the conversion formula and parameters may be changed according to the information on the narrowband audio power.

また、長期平均部３１１又は長期平均部４１４を含む第２の実施形態、第３の実施形態又は第４の実施形態において、長期平均の長さＴをユーザが手動で制御できるようにしても良い。これにより、環境の変化への追従速度をユーザの好みに合わせた音声帯域拡張装置を実現できる。 In the second embodiment, the third embodiment, or the fourth embodiment including the long-term average unit 311 or the long-term average unit 414, the user may be able to manually control the long-term average length T. . As a result, it is possible to realize a voice band expansion device that matches the user's preference with the follow-up speed to environmental changes.

また、長期平均部３１１又は長期平均部４１４を含む第２の実施形態、第３の実施形態又は第４の実施形態において、長期平均の長さＴを、狭帯域音声信号Ｓ又は処理後狭帯域音声信号Ｓ’を解析した結果に基づいて、自動的に制御できるようにしても良い。例えば、上記話者性の変化量に関する情報や、無音区間の頻度や長さによって切り替えるようにすれば良い。このようにすると、使用環境に自動的に適応する音声帯域拡張装置を実現できる。 In the second embodiment, the third embodiment, or the fourth embodiment including the long-term average unit 311 or the long-term average unit 414, the long-term average length T is set to the narrowband audio signal S or the processed narrowband. It may be possible to automatically control based on the result of analyzing the audio signal S ′. For example, the switching may be performed according to the information on the amount of change in the speaker property and the frequency and length of the silent section. In this way, it is possible to realize a voice band expansion device that automatically adapts to the usage environment.

また、信号処理部５１５を含む第４の実施形態において、狭帯域音声信号Ｓに施す信号処理の内容やパラメータをユーザが手動で制御できるようにしても良い。これにより、ユーザの好みに合わせた音質の擬似広帯域音声信号が得られる音声帯域拡張装置を実現できる。 In the fourth embodiment including the signal processing unit 515, the user may be able to manually control the contents and parameters of signal processing performed on the narrowband audio signal S. As a result, it is possible to realize an audio band expansion device that can obtain a pseudo wideband audio signal having a sound quality suited to the user's preference.

また、信号処理部５１５を含む第４の実施形態において、狭帯域音声信号Ｓに施す信号処理の内容やパラメータを、狭帯域音声信号Ｓを解析した結果に基づいて、自動的に制御できるようにしても良い。例えば、上記話者性の変化量に関する情報や、無音区間の頻度や長さによって切り替えるようにしても良い。このようにすると、使用環境に自動的に適応する音声帯域拡張装置を実現することができる。 Further, in the fourth embodiment including the signal processing unit 515, the contents and parameters of the signal processing applied to the narrowband audio signal S can be automatically controlled based on the analysis result of the narrowband audio signal S. May be. For example, the switching may be performed according to the information regarding the amount of change in the speaker property, the frequency and the length of the silent section. In this way, it is possible to realize a voice band expansion device that automatically adapts to the usage environment.

The narrowband audio signal S input to the voice band extending apparatus of each of the above embodiments may have been transmitted from a remote communication device, or may have been read from a recording medium or the like. Likewise, the pseudo-wideband audio signal X obtained by the voice band extending apparatus of each of the above embodiments may be output as sound from a speaker or the like, transmitted to another device, or recorded on a recording medium.

DESCRIPTION OF REFERENCE SIGNS: 200, 300, 400, 500 ... voice band extending apparatus; 101 ... sampling conversion unit; 102 ... band-pass filtering unit (BPF); 103 ... full-wave rectification unit; 104 ... high-pass filtering unit (HPF); 105 ... frequency analysis unit; 107 ... multiplication unit; 108 ... addition unit; 209 ... power calculation unit; 210 ... extension gain calculation unit; 311, 414 ... long-term averaging unit; 312 ... power normalization unit; 413 ... voice section detection unit; 515 ... signal processing unit.

Claims

A voice band extending apparatus that extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, the apparatus comprising:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means limits the maximum value of the extension gain according to the magnitude of the power information.

A voice band extending apparatus that extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, the apparatus comprising:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means controls, according to the magnitude of the power information, how readily the value of the extension gain increases in the extension gain calculation method.

A voice band extending apparatus that extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, the apparatus comprising:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means controls a nonlinear parameter (the exponent of the squared term) of the extension gain calculation method according to the magnitude of the power information.

A voice band extending apparatus that extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, the apparatus comprising:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means supports both a linear-function expression and a quadratic-function expression as methods of calculating the extension gain, and selects which of the two functions to apply according to the magnitude of the power information.

A voice band extending apparatus that extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, the apparatus comprising:
frequency analysis means for obtaining, by performing frequency analysis on the narrowband audio signal, spectral parameters including a narrowband low-band power covering the band below a predetermined frequency, a narrowband high-band power covering the band above the predetermined frequency, and a gradient index;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameters, an extension gain that adjusts the magnitude of the extension component in the extension band signal, by selecting, according to the magnitude of the power information, which of two feature quantities to use: the value obtained by dividing the narrowband high-band power by the narrowband low-band power, or the gradient index.

The voice band extending apparatus according to any one of claims 1 to 5, wherein the power information acquisition means comprises:
a power calculation unit that calculates the power of the narrowband audio signal;
a long-term averaging unit that obtains a long-term average value of the power of the narrowband audio signal; and
a power normalization unit that obtains a normalized power by dividing the power of the narrowband audio signal by the long-term average value of the power of the narrowband audio signal,
and wherein the normalized power is output as the power information.

The voice band extending apparatus according to claim 6, wherein the power information acquisition means further comprises a voice section detection unit that determines, based on the narrowband audio signal, whether the current section is a voice section or a silent section, and wherein the long-term averaging unit updates the long-term average value when the voice section detection unit determines that the section is a voice section, and holds the long-term average value when the voice section detection unit determines that the section is a silent section.

The voice band extending apparatus according to any one of claims 1 to 7, further comprising signal processing means for obtaining a processed narrowband audio signal by applying predetermined signal processing to the narrowband audio signal, wherein the processed narrowband audio signal is input to the frequency analysis means and the power information acquisition means.

The voice band extending apparatus according to claim 8, wherein the signal processing means performs at least one kind of signal processing, and the signal processing includes high-band emphasis filtering.

The voice band extending apparatus according to claim 8, wherein the signal processing means performs at least one kind of signal processing, and the signal processing includes noise suppression.

The voice band extending apparatus according to claim 8, wherein the signal processing means performs at least one kind of signal processing, and the signal processing includes formant emphasis.

A voice band extension program causing a computer mounted on a voice band extending apparatus, which extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, to function as:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means limits the maximum value of the extension gain according to the magnitude of the power information.

A voice band extension program causing a computer mounted on a voice band extending apparatus, which extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, to function as:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means controls, according to the magnitude of the power information, how readily the value of the extension gain increases in the extension gain calculation method.

A voice band extension program causing a computer mounted on a voice band extending apparatus, which extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, to function as:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means controls a nonlinear parameter (the exponent of the squared term) of the extension gain calculation method according to the magnitude of the power information.

A voice band extension program causing a computer mounted on a voice band extending apparatus, which extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, to function as:
frequency analysis means for obtaining a spectral parameter by performing frequency analysis on the narrowband audio signal;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameter, an extension gain that adjusts the magnitude of the extension component in the extension band signal, and for dynamically controlling the method of obtaining the extension gain according to the power information,
wherein the extension gain forming means supports both a linear-function expression and a quadratic-function expression as methods of calculating the extension gain, and selects which of the two functions to apply according to the magnitude of the power information.

A voice band extension program causing a computer mounted on a voice band extending apparatus, which extends a narrowband audio signal whose frequency band is limited so that it includes signal components of an extension band outside the limited band, to function as:
frequency analysis means for obtaining, by performing frequency analysis on the narrowband audio signal, spectral parameters including a narrowband low-band power covering the band below a predetermined frequency, a narrowband high-band power covering the band above the predetermined frequency, and a gradient index;
power information acquisition means for obtaining power information on the narrowband audio signal; and
extension gain forming means for obtaining, based on the spectral parameters, an extension gain that adjusts the magnitude of the extension component in the extension band signal, by selecting, according to the magnitude of the power information, which of two feature quantities to use: the value obtained by dividing the narrowband high-band power by the narrowband low-band power, or the gradient index.
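
As a rough illustration of two mechanisms recited in the claims, the sketch below computes a normalized power whose long-term average is updated only in voice sections and held in silent sections, and then applies a power-dependent ceiling to an extension gain. It is a sketch under stated assumptions, not the claimed implementation: the class and function names, the moving average over the last T voiced frames, the neutral value returned before any voiced frame, and the two-level ceiling with its threshold are all hypothetical choices made for the example.

```python
from collections import deque
from typing import Sequence


class NormalizedPower:
    """Frame power divided by a long-term average of voiced-frame power."""

    def __init__(self, t_frames: int = 200, eps: float = 1e-12):
        # ASSUMPTION: the long-term average is a moving average over the
        # last T voiced frames; the claims do not fix the averaging method.
        self.history = deque(maxlen=t_frames)
        self.eps = eps

    def update(self, frame: Sequence[float], is_voice: bool) -> float:
        if not frame:
            raise ValueError("frame must contain at least one sample")
        power = sum(x * x for x in frame) / len(frame)
        if is_voice:
            # Voice section: update the long-term average.
            self.history.append(power)
        # Silent section: the history, and so the average, is held as is.
        if not self.history:
            return 1.0  # ASSUMPTION: neutral value before any voiced frame
        average = sum(self.history) / len(self.history)
        return power / max(average, self.eps)


def limit_gain(raw_gain: float, norm_power: float,
               low_cap: float = 1.0, high_cap: float = 4.0,
               thresh: float = 0.5) -> float:
    """Clamp the extension gain to a ceiling chosen from the power information.

    ASSUMPTION: low-power frames get the lower ceiling; the claims state only
    that the maximum is limited according to the magnitude of the power.
    """
    cap = high_cap if norm_power >= thresh else low_cap
    return min(raw_gain, cap)
```

In use, each incoming frame would be passed to `NormalizedPower.update` together with the voice section decision, and the returned value would be fed to `limit_gain` along with the gain derived from the spectral parameters.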