JP4514149B2 - Speech quality estimation apparatus and speech quality estimation method - Google Patents


Publication number
JP4514149B2
JP4514149B2 JP2005254533A JP2005254533A JP4514149B2 JP 4514149 B2 JP4514149 B2 JP 4514149B2 JP 2005254533 A JP2005254533 A JP 2005254533A JP 2005254533 A JP2005254533 A JP 2005254533A JP 4514149 B2 JP4514149 B2 JP 4514149B2
Authority
JP
Japan
Prior art keywords
frequency band
speech
quality
voice
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2005254533A
Other languages
Japanese (ja)
Other versions
JP2007065547A (en)
Inventor
仁志 青木
玲 高橋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2005254533A
Publication of JP2007065547A
Application granted
Publication of JP4514149B2
Active
Anticipated expiration

Description

The present invention relates to a speech quality estimation technique that estimates the quality of speech as experienced by a user from the physical characteristics of the speech and similar information, without involving the user.

A conventional speech quality estimation technique is the E-model disclosed in Non-Patent Document 1. The E-model takes quality parameters (speech codec, delay, loudness, echo level, and so on) as input and outputs a communication quality index called the R value.
Speech whose quality is to be estimated may occupy various bands. Typical bands for voice communication services are the telephone band (300 to 3400 Hz) and the wideband (50 to 7000 Hz). The telephone band is widely used for telephone services on telephone service networks and IP networks, while the wideband is used for video conference services and high-quality telephone services on those networks. The E-model is used to estimate the quality of telephone-band speech.

Non-Patent Document 1: "The E-model, a computational model for use in transmission planning", ITU-T Recommendation G.107, 2003

Because the E-model was developed with telephone-band speech as its target, it does not achieve sufficient accuracy when estimating the quality of wideband speech, whose band extends below and above the telephone band. Specifically, the maximum communication quality index value in the E-model is 93, so a value above 93 is never output even for wideband communication that delivers clearer, higher-quality speech. The E-model therefore cannot be said to estimate the quality of high-quality wideband voice communication adequately.

Designing network and terminal quality so that users are satisfied is important when providing a voice communication service. However, because the E-model targets telephone-band speech, using it for network or terminal quality design allows design only up to the best quality attainable in the telephone band. The quality of a higher-quality wideband voice communication service cannot be designed with it.

An object of the present invention is to use a speech quality estimation technique developed for the telephone band, such as the E-model, to derive a speech quality estimate that is sufficiently accurate for wider-band voice communication as well.

The present invention is a speech quality estimation apparatus that estimates the quality of target speech in a target speech frequency band. The apparatus comprises: quality estimation means for estimating the quality of the target speech in a reference speech frequency band different from the target speech frequency band and outputting a quality index value; a database that stores in advance a speech evaluation value for each speech frequency band; reading means for acquiring from the database the speech evaluation values of the target speech frequency band and of the reference speech frequency band; calculation means for calculating, based on the speech evaluation value of the target speech frequency band and that of the reference speech frequency band, a correction value that corrects the quality index value to a value in the target speech frequency band; and correction means for correcting the quality index value with the correction value.
In one configuration example of the speech quality estimation apparatus of the present invention, the database stores, for each speech frequency band, the speech evaluation value obtained in advance by a subjective quality evaluation experiment that evaluates the relationship between each speech frequency band and speech quality.
In another configuration example of the speech quality estimation apparatus of the present invention, the database stores, for each speech frequency band, the speech evaluation value obtained in advance by an objective quality measurement that measures the relationship between each speech frequency band and speech quality.
A further configuration example of the speech quality estimation apparatus of the present invention additionally comprises target speech frequency band determination means that, when the target speech frequency band is unknown, obtains the target speech frequency band based on a speech signal output from a communication device.

The speech quality estimation method of the present invention comprises: a quality estimation procedure of estimating the quality of the target speech in a reference speech frequency band different from the target speech frequency band and outputting a quality index value; an evaluation value acquisition procedure of obtaining the speech evaluation values of the target speech frequency band and of the reference speech frequency band, based on a relationship between each speech frequency band and speech quality obtained in advance; a correction value calculation procedure of calculating, based on the speech evaluation value of the target speech frequency band and that of the reference speech frequency band, a correction value that corrects the quality index value to a value in the target speech frequency band; and a correction procedure of correcting the quality index value with the correction value.

According to the present invention, a correction value is calculated from the speech evaluation value of the target speech frequency band and that of the reference speech frequency band, and the quality index value of the target speech in the reference speech frequency band is corrected with this correction value. The quality index value in the reference speech frequency band is thereby extended to a quality index value in the target speech frequency band. As a result, by registering the relationship between each speech frequency band and speech quality in a database in advance as speech evaluation values, the present invention can use conventional quality estimation means such as the E-model and still estimate, with high accuracy, a quality index value that corresponds to the subjective quality of the target speech in a target speech frequency band wider than the reference speech frequency band.

In addition, by providing the target speech frequency band determination means, the present invention can estimate a quality index value corresponding to the subjective quality of the target speech in the target speech frequency band even when that band is unknown.

[First Embodiment]
Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a speech quality estimation apparatus according to the first embodiment of the present invention. The apparatus of this embodiment comprises: an overall quality estimation unit 1 that estimates the quality of target speech Ain in a reference speech frequency band and outputs a quality index value R; a subjective quality database 2 that stores in advance a subjective speech evaluation value for each speech frequency band; a MOS data reading unit 3 that acquires from the subjective quality database 2 the subjective speech evaluation values of the target speech frequency band, which is the subject of quality estimation, and of the reference speech frequency band; an interval scale conversion unit 4 that converts the subjective speech evaluation values acquired by the MOS data reading unit 3 to an interval scale; a calculation unit 5 that calculates, from the evaluation values converted to the interval scale, a correction value for correcting the quality index value R to a value in the target speech frequency band; and a scaling unit 6 that corrects the quality index value R with the correction value.

The overall quality estimation unit 1 receives the target speech Ain, which is the subject of quality estimation, from a communication device (not shown). It estimates the overall quality of Ain in the reference speech frequency band (in this embodiment the telephone band, 300 to 3400 Hz) and outputs a quality index value R. R corresponds to the subjective quality of Ain in the reference speech frequency band. One example of the overall quality estimation unit 1 is the E-model disclosed in Non-Patent Document 1. With the E-model, the quality parameters of Ain (speech codec, delay, loudness, echo level, and so on) are obtained first, and R is then computed from them.

Subjective speech evaluation values for each speech frequency band are registered in the subjective quality database 2 in advance. One example of such a value is the mean opinion score (MOS), an evaluation value obtained when subjects listen to speech signal samples and rate them on a five-point scale. In this embodiment, a speech signal sample that differs only in bandwidth and has no other degradation is prepared for each speech frequency band. The MOS of each sample is then obtained with, for example, the global test proposed in A. Takahashi, A. Kurashima, and H. Yoshino, "Subjective quality index for compatibly evaluating narrowband and wideband speech", MESAQIN 2005, June 2005, and registered in the subjective quality database 2. FIG. 2 shows a configuration example of the subjective quality database 2. Each entry associates the lower limit frequency and upper limit frequency of a speech frequency band with the MOS of that band.

The MOS data reading unit 3 accepts, for example from a user of the speech quality estimation apparatus, a specification of the lower limit frequency Fl (for example 50 Hz) and the upper limit frequency Fh (for example 7000 Hz) of the target speech frequency band that is the subject of quality design. It then acquires from the subjective quality database 2 the subjective speech evaluation values of the specified target speech frequency band (50 to 7000 Hz) and of the predetermined reference speech frequency band (300 to 3400 Hz). In the following, the subjective speech evaluation value of the target speech frequency band is denoted MOSt and that of the reference speech frequency band MOSb. In the example of FIG. 2, MOSt is 4.3 and MOSb is 3.5.
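The lookup performed by the MOS data reading unit 3 can be pictured as a keyed table read. The following is a minimal Python sketch, not the patent's implementation: the dictionary layout, the names `SUBJECTIVE_QUALITY_DB`, `REFERENCE_BAND` and `read_mos`, and the exact-match lookup are assumptions made for illustration. Only the two MOS values are taken from the FIG. 2 example.

```python
# Hypothetical in-memory stand-in for the subjective quality database 2.
# Keys are (lower limit Hz, upper limit Hz); values are MOS.
# Only these two entries are taken from the FIG. 2 example in the text.
SUBJECTIVE_QUALITY_DB = {
    (300, 3400): 3.5,   # reference band (telephone band)
    (50, 7000): 4.3,    # wideband example
}

REFERENCE_BAND = (300, 3400)


def read_mos(fl_hz, fh_hz, db=SUBJECTIVE_QUALITY_DB):
    """Return (MOSt, MOSb) for the target band (fl_hz, fh_hz)."""
    try:
        mos_t = db[(fl_hz, fh_hz)]
    except KeyError:
        # The text does not say how unregistered bands are handled;
        # raising an error is an assumption of this sketch.
        raise ValueError(f"band {fl_hz}-{fh_hz} Hz is not registered") from None
    return mos_t, db[REFERENCE_BAND]


mos_t, mos_b = read_mos(50, 7000)
print(mos_t, mos_b)
```

A real database would presumably hold many more bands; how a band that falls between registered entries should be treated is left open by the description.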

Because MOS is a quantity on a psychological scale, the relationship between ratios of scale-value differences and quality is not constant on the MOS scale. The scale saturates: near the highest scale value of 5 the quality is very good and near the lowest scale value of 1 it is very poor, so the quality changes little even when the scale value changes. The interval scale conversion unit 4 therefore converts MOSt and MOSb, acquired by the MOS data reading unit 3, to an interval scale on which the relationship between ratios of scale-value differences and quality is constant. In the following, the evaluation values obtained by converting MOSt and MOSb are denoted Rt and Rb, respectively. One possible implementation of the interval scale conversion unit 4 uses the conversion formula proposed in Appendix I of Non-Patent Document 1, which converts a MOS to a quality index value R.
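One way to picture the interval scale conversion is as the inverse of the R-to-MOS mapping in ITU-T G.107, which is MOS = 1 + 0.035R + R(R − 60)(100 − R) × 7 × 10⁻⁶ for 0 < R < 100. The sketch below is an illustration under assumptions, not the formula of Appendix I itself: instead of a closed-form inverse it inverts the mapping numerically by bisection, restricted to 6.5 ≤ R ≤ 100, where the mapping is monotonically increasing. The function names are hypothetical.

```python
def r_to_mos(r):
    """R-to-MOS mapping as given in ITU-T G.107."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


def mos_to_r(mos, lo=6.5, hi=100.0, tol=1e-9):
    """Invert r_to_mos by bisection.

    Bisection is an assumption of this sketch, used only because it is
    easy to verify; it is not the closed-form conversion of Appendix I.
    """
    if not r_to_mos(lo) <= mos <= r_to_mos(hi):
        raise ValueError("MOS outside the invertible range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# MOSt and MOSb from the FIG. 2 example.
print(mos_to_r(4.3), mos_to_r(3.5))
```

With this particular mapping, the FIG. 2 values 4.3 and 3.5 convert to roughly 88 and 68 on the R scale.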

The calculation unit 5 calculates an extension coefficient λ, which is the correction value for the quality index value R, from the evaluation values Rt and Rb converted by the interval scale conversion unit 4, according to equation (1).
λ = Rt − Rb   (1)

Finally, the scaling unit 6 corrects the quality index value R output from the overall quality estimation unit 1 with the extension coefficient λ, as shown in equation (2), to calculate the overall quality index value R′ of the target speech Ain in the target speech frequency band. R′ corresponds to the subjective quality of Ain in the target speech frequency band.
R′ = R + λ   (2)

As described above, this embodiment calculates the extension coefficient λ from the evaluation value Rt of the target speech frequency band and the evaluation value Rb of the reference speech frequency band, and corrects the quality index value R of the target speech in the reference speech frequency band with λ. The quality index value R is thereby extended to the quality index value R′ in the target speech frequency band.
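Equations (1) and (2) amount to a single additive offset. The following Python sketch restates them as code. The function names and the numeric inputs are hypothetical and chosen only to make the arithmetic easy to follow; in the apparatus, Rt and Rb would come from the interval scale conversion and R from the narrowband quality estimator.

```python
def extension_coefficient(rt, rb):
    """Equation (1): lambda = Rt - Rb."""
    return rt - rb


def scale_quality_index(r, lam):
    """Equation (2): R' = R + lambda."""
    return r + lam


# Illustrative inputs only (not values from the patent).
rt, rb = 88.0, 68.0   # interval-scale values of target and reference bands
r = 75.0              # quality index estimated in the reference band
lam = extension_coefficient(rt, rb)
r_dash = scale_quality_index(r, lam)
print(lam, r_dash)  # 20.0 95.0
```

Because the correction is a constant offset for a given pair of bands, λ only needs to be computed once per target band and can then be reused for every estimate R.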

This embodiment uses the subjective quality database 2, which stores subjective speech evaluation values obtained by a subjective quality evaluation experiment that evaluates the relationship between each speech frequency band and speech quality. In place of this database, a database may be used that stores, for each speech frequency band, speech evaluation values obtained by an objective quality measurement that measures the relationship between each speech frequency band and speech quality.

[Second Embodiment]
Next, a second embodiment of the present invention is described. FIG. 3 is a block diagram showing the configuration of a speech quality estimation apparatus according to the second embodiment of the present invention. Components identical to those in FIG. 1 are given the same reference numerals. The target speech Ain input to the overall quality estimation unit 1 described in the first embodiment is output from a voice communication device 11. This embodiment addresses the case in which the speech frequency band used by the voice communication device 11 is unknown.

This embodiment includes a target speech frequency bandwidth determination unit 7. During a measurement carried out in advance to determine the target speech frequency band of the voice communication device 11, this unit inputs a test signal Stest to the voice communication device 11 and obtains the target speech frequency band from the output speech Atest that the voice communication device 11 produces in response. The test signal Stest can be a signal that contains components across the whole frequency range, such as white noise or pink noise. The target speech frequency band can be determined by, for example, taking the frequencies at which the power falls x dB (x is an arbitrary value) below the average passband power as the lower limit frequency Fl and the upper limit frequency Fh.

An example of determining the target speech frequency band is described with reference to FIG. 4. FIG. 4(A) shows the frequency characteristics of the test signal Stest, and FIG. 4(B) shows the frequency characteristics of the output speech Atest of the voice communication device. In this embodiment, white noise with a flat characteristic over the whole frequency range is used as the test signal Stest, as shown in FIG. 4(A). When white noise is input to the voice communication device 11, only the passband components pass through, so the band-limited output speech Atest shown in FIG. 4(B) is produced. If the frequencies at which the power falls 3 dB below the average passband power are taken as the lower limit frequency Fl and the upper limit frequency Fh, then Fl is determined to be 50 Hz and Fh to be 7000 Hz.
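The band-edge rule above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the patent's procedure: it presumes the spectrum of Atest has already been measured as frequency and level (dB) pairs, and it takes the bins within `drop_db` of the peak as a first guess of the passband when computing the average passband power. The function name and the sample spectrum are hypothetical.

```python
import math


def estimate_band_edges(freqs_hz, levels_db, drop_db=3.0):
    """Return (Fl, Fh): the outermost frequencies whose level is no more
    than drop_db below the average passband power."""
    peak_db = max(levels_db)
    # Assumption: bins within drop_db of the peak approximate the passband.
    passband = [lv for lv in levels_db if lv >= peak_db - drop_db]
    # Average in linear power, then convert back to dB.
    mean_power = sum(10 ** (lv / 10) for lv in passband) / len(passband)
    threshold_db = 10 * math.log10(mean_power) - drop_db
    inside = [f for f, lv in zip(freqs_hz, levels_db) if lv >= threshold_db]
    return min(inside), max(inside)


# Made-up spectrum of a band-limited white-noise response.
freqs = [20, 50, 100, 1000, 3400, 7000, 8000, 10000]
levels = [-40, -2, 0, 0, 0, -2, -30, -60]
print(estimate_band_edges(freqs, levels))  # (50, 7000)
```

With a measured spectrum the frequency resolution of the analysis limits how precisely Fl and Fh can be located, so a real implementation might interpolate between bins.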

The lower limit frequency Fl and the upper limit frequency Fh of the target speech frequency band obtained by the target speech frequency bandwidth determination unit 7 are notified to the overall quality estimation unit 1 through the reception unit 8.
After the measurement of the target speech frequency band is complete, the target speech frequency bandwidth determination unit 7 stops inputting the test signal Stest to the voice communication device 11 and returns the voice communication device 11 to its normal state. From then on, the target speech Ain is input from the voice communication device 11 to the overall quality estimation unit 1. The operations of the overall quality estimation unit 1, the subjective quality database 2, the MOS data reading unit 3, the interval scale conversion unit 4, the calculation unit 5, and the scaling unit 6 are the same as in the first embodiment.

As described above, by providing the target speech frequency bandwidth determination unit 7, this embodiment can estimate the quality of the target speech in the target speech frequency band even when that band is unknown.

[Third Embodiment]
Next, a third embodiment of the present invention is described. FIG. 5 is a block diagram showing the configuration of a speech quality estimation apparatus according to the third embodiment of the present invention. Components identical to those in FIG. 1 are given the same reference numerals. In place of the subjective quality database 2, the MOS data reading unit 3, and the interval scale conversion unit 4 of the first embodiment, this embodiment includes an interval scale quality database 9 and a quality data reading unit 10.

The interval scale quality database 9 holds evaluation values obtained by converting the subjective speech evaluation value of each speech frequency band to an interval scale in advance. As in the first embodiment, MOS is one example of a subjective speech evaluation value, and the MOS can be converted to an interval scale with the conversion formula described for the interval scale conversion unit 4 of the first embodiment. FIG. 6 shows a configuration example of the interval scale quality database 9. Each entry associates the lower limit frequency and upper limit frequency of a speech frequency band with the interval scale value of that band.

The quality data reading unit 10 accepts from the user a specification of the lower limit frequency Fl (for example 50 Hz) and the upper limit frequency Fh (for example 7000 Hz) of the target speech frequency band. It then acquires from the interval scale quality database 9, and outputs, the interval scale values Rt and Rb of the specified target speech frequency band and of the predetermined reference speech frequency band (300 to 3400 Hz). In the example of FIG. 6, the interval scale value Rt of the target speech frequency band is 100 and the interval scale value Rb of the reference speech frequency band is 80.
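As a worked example with the FIG. 6 values, equations (1) and (2) give the following. The input R = 93 is chosen here only for illustration, because it is the largest value the description says the narrowband E-model can output; it is not a figure stated for this embodiment.

```latex
\begin{aligned}
\lambda &= R_t - R_b = 100 - 80 = 20 \\
R'      &= R + \lambda = 93 + 20 = 113
\end{aligned}
```

The corrected index thus exceeds the narrowband ceiling of 93, which is the extension to wideband quality that the invention aims for.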

The operations of the overall quality estimation unit 1, the calculation unit 5, and the scaling unit 6 are the same as in the first embodiment.
As described above, because this embodiment converts the subjective speech evaluation values to an interval scale in advance, the interval scale conversion unit 4 can be omitted.

The present invention can be applied to speech quality estimation techniques.

FIG. 1 is a block diagram showing the configuration of the speech quality estimation apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of the subjective quality database in the first embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of the speech quality estimation apparatus according to the second embodiment of the present invention.
FIG. 4 is a diagram showing the frequency characteristics of the test signal and of the output speech that the voice communication device produces in response to the test signal.
FIG. 5 is a block diagram showing the configuration of the speech quality estimation apparatus according to the third embodiment of the present invention.
FIG. 6 is a diagram showing a configuration example of the interval scale quality database in the third embodiment of the present invention.

Explanation of symbols

1: overall quality estimation unit, 2: subjective quality database, 3: MOS data reading unit, 4: interval scale conversion unit, 5: calculation unit, 6: scaling unit, 7: target speech frequency bandwidth determination unit, 8: reception unit, 9: interval scale quality database, 10: quality data reading unit, 11: voice communication device.

Claims (5)

1. A speech quality estimation apparatus for estimating the quality of target speech in a target speech frequency band, comprising:
quality estimation means for estimating the quality of the target speech in a reference speech frequency band different from the target speech frequency band and outputting a quality index value;
a database that stores in advance a speech evaluation value for each speech frequency band;
reading means for acquiring from the database the speech evaluation value of each of the target speech frequency band and the reference speech frequency band;
calculation means for calculating, based on the speech evaluation value of the target speech frequency band and the speech evaluation value of the reference speech frequency band, a correction value for correcting the quality index value to a value in the target speech frequency band; and
correction means for correcting the quality index value with the correction value.
2. The speech quality estimation apparatus according to claim 1, wherein
the database stores, for each speech frequency band, the speech evaluation value obtained in advance by a subjective quality evaluation experiment that evaluates the relationship between each speech frequency band and speech quality.
3. The speech quality estimation apparatus according to claim 1, wherein
the database stores, for each speech frequency band, the speech evaluation value obtained in advance by an objective quality measurement that measures the relationship between each speech frequency band and speech quality.
4. The speech quality estimation apparatus according to claim 1,
further comprising target speech frequency band determination means for obtaining, when the target speech frequency band is unknown, the target speech frequency band based on a speech signal output from a communication device.
A speech quality estimation method for estimating the quality of target speech in a target speech frequency band, comprising:
a quality estimation procedure for estimating the quality of the target speech in a reference speech frequency band different from the target speech frequency band and outputting a quality index value;
an evaluation value acquisition procedure for obtaining a speech evaluation value for each of the target speech frequency band and the reference speech frequency band, based on a relationship between each speech frequency band and speech quality obtained in advance;
a correction value calculation procedure for calculating, based on the speech evaluation value of the target speech frequency band and the speech evaluation value of the reference speech frequency band, a correction value for correcting the quality index value to a value in the target speech frequency band; and
a correction procedure for correcting the quality index value by the correction value.
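The procedures of the method claim can be sketched end to end as follows. This is an illustration under stated assumptions, not the patented implementation: the evaluation values in `EVALUATION_DB` are made up, the band names are hypothetical, the quality index value is passed in as if an estimator had already produced it for the reference band, and the correction is assumed to be a simple additive offset, since the claim does not fix the formula.

```python
# Hypothetical per-band evaluation values on an arbitrary common scale.
EVALUATION_DB = {"narrowband": 3.8, "wideband": 4.4}


def acquire_evaluation_values(db, target_band, reference_band):
    """Evaluation value acquisition: look up both bands."""
    return db[target_band], db[reference_band]


def calculate_correction(target_eval, reference_eval):
    """Correction value calculation.

    Assumption: the correction is the difference of the two evaluation
    values; the claim leaves the actual formula open.
    """
    return target_eval - reference_eval


def apply_correction(quality_index, correction):
    """Correction: shift the reference-band quality index value."""
    return quality_index + correction


def estimate_target_quality(quality_index, target_band, reference_band,
                            db=EVALUATION_DB):
    """Run acquisition, correction calculation, and correction in order."""
    target_eval, reference_eval = acquire_evaluation_values(
        db, target_band, reference_band)
    correction = calculate_correction(target_eval, reference_eval)
    return apply_correction(quality_index, correction)


# Quality index 4.0 estimated in the reference band, corrected to the target band.
corrected = estimate_target_quality(4.0, "narrowband", "wideband")  # approximately 3.4
```

When the target and reference bands coincide, the correction is zero and the quality index value passes through unchanged.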
JP2005254533A 2005-09-02 2005-09-02 Speech quality estimation apparatus and speech quality estimation method Active JP4514149B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2005254533A JP4514149B2 (en) 2005-09-02 2005-09-02 Speech quality estimation apparatus and speech quality estimation method

Publications (2)

Publication Number Publication Date
JP2007065547A (en) 2007-03-15
JP4514149B2 (en) 2010-07-28

Family

ID=37927789

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2005254533A Active JP4514149B2 (en) 2005-09-02 2005-09-02 Speech quality estimation apparatus and speech quality estimation method

Country Status (1)

Country Link
JP (1) JP4514149B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101611355B1 (en) 2015-01-26 2016-04-26 유니트론 주식회사 Automatic measurement apparatus and method for mobile communication system applying an adaptive sampling algorithm

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH113097A (en) * 1997-06-13 1999-01-06 Nippon Telegr & Teleph Corp <Ntt> Evaluating method for quality of coded voice signal and data base using it
JP2005164870A (en) * 2003-12-02 2005-06-23 Nippon Telegr & Teleph Corp <Ntt> Objective evaluation apparatus for speech quality taking band limitation into consideration

Also Published As

Publication number Publication date
JP2007065547A (en) 2007-03-15

Similar Documents

Publication Publication Date Title
JP4879180B2 (en) Frequency compensation for perceptual speech analysis
KR101430321B1 (en) Method and system for determining a perceived quality of an audio system
JP5204904B2 (en) Audio signal quality prediction
EP1434197B1 (en) Estimation method and apparatus of overall conversational speech quality
EP2143104A2 (en) Method and system for speech quality prediction of the impact of time localized distortions of an audio trasmission system
JP2007013674A (en) Comprehensive speech communication quality evaluating device and comprehensive speech communication quality evaluating method
JP4341586B2 (en) Call quality objective evaluation server, method and program
JP4514149B2 (en) Speech quality estimation apparatus and speech quality estimation method
Köster et al. Non-intrusive estimation of noisiness as a perceptual quality dimension of transmitted speech
JP2004222257A (en) Total call quality estimating method and apparatus, program for executing method, and recording medium thereof
JP4430566B2 (en) Objective quality evaluation apparatus and method
JP5952252B2 (en) Call quality estimation method, call quality estimation device, and program
JP4309749B2 (en) Voice quality objective evaluation system considering bandwidth limitation
Neves et al. Quality model for monitoring QoE in VoIP services
JP4116955B2 (en) Voice quality objective evaluation apparatus and voice quality objective evaluation method
Reimes et al. Perceived listening effort for in-car communication systems
JP2007233264A (en) Apparatus and method for objectively evaluating speech quality
JP3490380B2 (en) Apparatus and method for evaluating signal transmission quality of signal transmission medium, and information recording medium
Becvar et al. Impact of saturation on speech quality in VoIP
Côté et al. Assessment of Different Loudness Models for Perceived Speech Quality
JP2006148752A (en) Method and server for deciding evaluation sample number for subjective evaluation of telephone call quality
Song et al. Subjective and objective assessment of loudness for mobile phone applications
JP2004056612A (en) Method, device and program for evaluating objective quality, and recording medium having objective quality evaluation program recorded thereon
Gierlich et al. Objective Prediction of Speech Quality for Wideband Communication Scenarios Including Background Noise
Singh et al. Non-Intrusive Speech Quality with Different Time Scale

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20070810

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20100316

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20100506

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20100507

R151 Written notification of patent or utility model registration

Ref document number: 4514149

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R151

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130521

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140521

Year of fee payment: 4

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350