JP3750705B2 - Speech coding transmission method and speech coding transmission apparatus - Google Patents

Publication number
JP3750705B2
Authority
JP
Japan
Prior art keywords
listener
masking threshold
receiving side
information
ambient noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP15079297A
Other languages
Japanese (ja)
Other versions
JPH10341162A (en)
Inventor
正之 三崎
潤一 田川
宏嗣 谷口
美治男 松本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Corp
Panasonic Holdings Corp
Original Assignee
Panasonic Corp
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Corp, Matsushita Electric Industrial Co Ltd
Priority to JP15079297A
Publication of JPH10341162A
Application granted
Publication of JP3750705B2
Anticipated expiration
Expired - Fee Related

Description

【0001】
【発明の属する技術分野】
本発明は伝送路を用いて音声信号を効率よく伝送する音声符号化伝送方法及び音声符号化伝送装置に関するものである。
【0002】
【従来の技術】
従来の音声符号化方法とその装置について説明する。図5は従来の音声符号化装置の基本構成を示すブロック図である。本図に示すように音声符号化装置は、周波数包絡演算手段1、マスキング閾値決定手段2A、適応ビット割当手段3、第1〜第Nの帯域に帯域分割を行う第1〜第Nの帯域分割手段4、各帯域毎に量子化を行う第1〜第Nの量子化手段5、各帯域毎にエントロピー符号化を行う第1〜第Nのエントロピー符号化手段6、マルチプレクサ7を含んで構成される。
【0003】
まず、周波数包絡演算手段1に入力された音声信号は、フレーム単位でスペクトル包絡が求められる。求められたスペクトル包絡をもとに、マスキング閾値決定手段2Aは帯域分割されている帯域のマスキング閾値を決定する。このマスキング閾値は、臨界帯域幅を考慮した同時マスキング効果により決定される。適応ビット割当手段3は、得られたマスキング閾値を超える入力信号に対して、スペクトル包絡成分を各帯域毎に求める。そしてその比に応じて各帯域へのビット割当量を決定する。
【0004】
一方、入力信号が第1〜第Nの帯域分割手段4に入力されると、第1〜第NのN帯域に分割される。そして、第1〜第Nの帯域分割手段4の出力信号は夫々第1〜第Nの量子化手段5に入力され、適応ビット割当手段3によって与えられたビット数で量子化される。そして量子化された各帯域分割信号は第1〜第Nのエントロピー符号化手段6に入力され、冗長性を削除するためのエントロピー符号化が行われる。そして各々の帯域の符号化データは適応ビット割当手段3で決定されたビット割当情報と共に、マルチプレクサ7でまとめられて伝送路に送出される。
【0005】
【発明が解決しようとする課題】
しかしながら,上記のような方法では、符号化効率を良くするために同時マスキング効果を用いて符号化データを削減しているが、音声信号を受信する聴取者側の周囲騒音の影響や、聴取者個人の聴覚能力(聴覚特性)を考慮したものではない。特に受信側の環境において、騒音レベルが高かったり、全可聴帯域を聴くとのできない聴取者にとっては、一方的にこのような帯域分割信号を受信することは、冗長な情報を取得することになる。
【0006】
本発明は、このような従来の問題点に鑑みてなされたものであって、音声信号を受信する聴取者側の周囲の騒音特性、及び聴取者の聴力特性を考慮することにより、符号化音声信号の再生品質又は符号化効率を向上させる音声符号化伝送方法及び音声符号化伝送装置を実現することを目的とするものである。
【0007】
【課題を解決するための手段】
この課題を達成するために本願の請求項1記載の発明は、受信側の聴取条件に基づいて聴感上の再生品質を補償するために送信側で適用する音声符号化伝送方法であって、送信側から受信側へ受信側の周囲騒音特性に関する情報の転送を要求して受信側から周囲騒音スペクトル情報を入手し、送信側から受信側へ受信側聴取者の聴力特性に関する情報の転送を要求して受信側から聴取者聴力特性情報を入手し、符号化伝送すべき音声信号のスペクトル包絡を求め、同時マスキング効果によって前記スペクトル包絡に関するマスキング閾値を、受信側から得た周囲騒音スペクトル情報と受信側から得た聴取者聴力特性情報の少なくとも1つに基づいて補正し、得られた新しいマスキング閾値を基に各周波数帯域に対するビット割当量の配分を減少するよう調整し、調整されたビット割当量に基づいて各周波数帯域信号に対して所定の符号化アルゴリズムで符号化を行い、受信側へ伝送することを特徴とするものであり、本願の請求項5記載の発明は、受信側の聴取条件に基づいて聴感上の再生品質を補償するために送信側で適用する音声符号化伝送装置であって、受信側の周囲騒音スペクトル情報を送信側で入手する騒音特性参照手段と、受信側の聴取者聴力特性情報を送信側で入手する聴力特性参照手段と、符号化伝送すべき音声信号のスペクトル包絡を求める周波数包絡演算手段と、同時マスキング効果によって前記スペクトル包絡に関するマスキング閾値を、受信側から得た周囲騒音スペクトル情報と受信側から得た聴取者聴力特性情報の少なくとも1つに基づいて補正するマスキング閾値決定手段と、得られた新しいマスキング閾値を基に各周波数帯域に対するビット割当量の配分を減少するよう調整する適応ビット割当手段と、調整されたビット割当量に基づいて各周波数帯域信号に対して所定の符号化アルゴリズムで符号化を行う符号化手段と、を備えたことを特徴とするものである。
【0008】
また本願の請求項2記載の発明は、受信側の聴取条件に基づいて聴感上の再生品質を補償するために送信側で適用する音声符号化伝送方法であって、送信側から受信側へ受信側の周囲騒音特性に関する情報の転送を要求して受信側から周囲騒音スペクトル情報を入手し、送信側から受信側へ受信側聴取者の聴力特性に関する情報の転送を要求して受信側から聴取者聴力特性情報を入手し、符号化伝送すべき音声信号のスペクトル包絡を求め、同時マスキング効果によって前記スペクトル包絡に関するマスキング閾値を、受信側から得た周囲騒音スペクトル情報と受信側から得た聴取者聴力特性情報の少なくとも1つに基づいて補正し、得られた新しいマスキング閾値を基に各周波数帯域に対するビット割当量を、音声信号のSN値が所定値以上となるよう変更し、変更されたビット割当量に基づいて各周波数帯域信号に対して所定の符号化アルゴリズムで符号化を行い、受信側へ伝送することを特徴とするものであり、本願の請求項6記載の発明は、受信側の聴取条件に基づいて聴感上の再生品質を補償するために送信側で適用する音声符号化伝送装置であって、受信側の周囲騒音スペクトル情報を送信側で入手する騒音特性参照手段と、受信側の聴取者聴力特性情報を送信側で入手する聴力特性参照手段と、符号化伝送すべき音声信号のスペクトル包絡を求める周波数包絡演算手段と、同時マスキング効果によって前記スペクトル包絡に関するマスキング閾値を、受信側から得た周囲騒音スペクトル情報と受信側から得た聴取者聴力特性情報の少なくとも1つ以上に基づいて補正するマスキング閾値決定手段と、得られた新しいマスキング閾値を基に各周波数帯域に対するビット割当量を、音声信号のSN値が所定値以上となるよう変更する適応ビット割当手段と、変更されたビット割当量に基づいて各周波数帯域信号に対して所定の符号化アルゴリズムで符号化を行う符号化手段と、を備えたことを特徴とするものである。
【0009】
また本願の請求項3,7記載の発明は、上述した音声符号化伝送方法及び音声符号化伝送装置において、マスキング閾値の補正に際し、前記周囲騒音スペクトル情報から得られるマスキングノイズを基に、マスキング閾値を調整することを特徴とするものである。
【0010】
また本願の請求項4,8記載の発明は、上述した音声符号化伝送方法及び音声符号化伝送装置において、マスキング閾値の補正に際し、前記聴取者聴力特性情報に基づいて得られる聴取者の周波数帯域毎の最小可聴値と臨界帯域幅を基に、マスキング閾値を調整することを特徴とするものである。
【0011】
【発明の実施の形態】
(実施の形態1)
以下本発明の実施の形態1における音声符号化伝送方法について,図1〜図3を参照しつつ説明する。図1は本実施の形態の音声符号化装置の基本構成を示すブロック図であり、従来例と同一部分は同一符号をつけ、それらの説明は省略する。この音声符号化装置は、周波数包絡演算手段1、マスキング閾値決定手段2B、適応ビット割当手段3、N帯域に帯域分割を行う第1〜第Nの帯域分割手段4、各帯域毎に量子化を行う第1〜第Nの量子化手段5、各帯域毎にエントロピー符号化を行う第1〜第Nのエントロピー符号化手段6、マルチプレクサ7に加えて、騒音特性参照手段8、聴力特性参照手段9を含んで構成される。
【0012】
騒音特性参照手段8は、伝送路を介して入力された受信側の周囲の騒音特性を入手し、マスキング閾値決定手段2Bに与える手段である。また聴力特性参照手段9は、伝送路を介して入力された聴取者の聴力特性を入手し、マスキング閾値決定手段2Bに与える手段である。マスキング閾値決定手段2Bは、入力音声信号の周波数包絡情報と、受信側の騒音特性及び聴力特性に基づき、マスキング閾値を決定する手段である。
【0013】
このように構成された音声符号化装置の動作について図1〜図3を用いて説明する。図2,図3は本実施の形態における音声符号化伝送方法の信号処理の流れを示すフローチャートである。
【0014】
ステップS1においてまず送信側は、受信側の周囲騒音環境を知るために、受信側の周囲騒音特性に関する騒音スペクトル情報の転送を要求する。これに対して図3のステップS11では、受信側は送信側からの周囲騒音特性に関する情報の転送要求を受理する。そして次のステップS12で、受信側は受信端末側の周囲騒音に関する騒音スペクトルを測定し、得られた騒音スペクトル情報を送信側の騒音特性参照手段8に転送する。こうして送信側は、周囲の騒音スペクトル情報を入手する。
【0015】
図2のステップS2では、送信側は、受信側の聴取者の聴力特性を知るために、受信側の聴取者の聴力特性に関する聴力情報の転送を要求する。これに対して受信側は図3のステップS21において、送信側からの受信側の聴取者の聴力情報の転送要求を受理する。そして次のステップS22で、受信端末側の聴取者の聴力情報を収集し、得られた聴力情報を送信側の聴力特性参照手段9に転送する。なお、この受信端末側の聴取者の正確な聴覚特性の特性が既に得られていて、その情報を受信端末から転送できるものとする。なお受信側の周囲騒音特性や聴取者の聴力特性の情報が得られないときは、送信側がその情報を推定する。
【0016】
次のステップS3では、送信側のマスキング閾値決定手段2Bは入力信号のスペクトル包絡をフレーム単位で求め、受信側から得られた周囲騒音スペクトルの包絡をこれに付加する。そしてステップS4では、同時マスキング効果によるマスキング閾値を決定する。これにより、受信側の周囲騒音環境を含めたマスキング閾値が得られることになる。
【0017】
ステップS5に進むと、マスキング閾値決定手段2Bは聴力特性参照手段9を介して得られた聴取者の聴力特性である最小可聴値を基に、マスキング閾値を補正する。これにより聴取者が例えば高域周波数の感度が劣化している場合などに、可聴域外の無駄な符号化データの送信をなくすことができる。
【0018】
ステップS6では、適応ビット割当て手段3は以上で求められたマスキング閾値と入力信号のスペクトル包絡とを用いて、適応的にビット割当量を変更する。なお、本実施の形態では、伝送する符号化音声のビットレートは上限が制限されているものとする。次に各帯域のマスキング閾値を越える成分の比を求め、その比に応じたビット配分を行う。全体でのビット数は所定値以下とするが、その割当量は先のビット配分に応じて適応的に変更される。
【0019】
ビット割り当て以降の動作は従来例と同様である。即ち、ステップS7では、帯域分割手段4が入力信号を帯域分割する。そして量子化手段5は各帯域に割当てられたビット数で量子化し、エントロピー符号化手段6がエントロピー符号化を実施する。次のステップS8では、マルチプレクサ7は各帯域の符号化されたデータと、量子化に割り当てられたビット割当て数を多重化して伝送路に出力する。
【0020】
(実施の形態2)
次に本発明の実施の形態2における音声符号化伝送方法について、図3及び図4を参照しつつ説明する。図4は本実施の形態における音声符号化伝送方法の信号処理の流れを示すフローチャートである。なお、音声符号化装置の基本構成は図1と同様であるので、図1の各手段の引用は省略する。
【0021】
図4のステップT1においてまず送信側は、受信側の周囲騒音環境を知るために、受信側の周囲騒音特性に関する騒音スペクトル情報の転送を要求する。これに対して図3のステップT11では、受信側は送信側からの周囲騒音特性に関する情報の転送要求を受理する。そして次のステップT12で、受信側は受信端末側の周囲騒音に関する騒音スペクトルを測定し、得られた騒音スペクトル情報を送信側に転送する。こうして送信側は、周囲の騒音スペクトル情報を入手する。
【0022】
次のステップT2では、送信側は、受信側の聴取者の聴力特性を知るために、聴取者の聴力特性に関する聴力情報の転送を要求する。これに対して図3のステップT21では、受信側は送信側からの受信側の聴取者の聴力特性に関する情報の転送要求を受理する。そして次のステップT22で、受信端末側の聴取者の聴力情報を収集し、得られた聴力情報を送信側に転送する。なお、この受信端末側の聴取者の正確な聴力の特性が既に得られていて、その情報を受信端末から転送できるものとする。
【0023】
次のステップT3では、送信側は入力信号のスペクトル包絡をフレーム単位で求め、受信側から得られた周囲騒音スペクトルの包絡をこれに付加する。そしてステップT4では、同時マスキング効果によるマスキング閾値を決定する。これにより、受信側の周囲騒音環境を含めたマスキング閾値が得られることになる。
【0024】
ステップT5に進むと、更に受信側の聴取者の聴力特性である最小可聴値を基に、マスキング閾値を補正する。これにより聴取者が例えば高域周波数の感度が劣化している場合などに、可聴域外の符号化データの無駄に送信を事前になくすことができる。
【0025】
ステップT6では、以上で求められたマスキング閾値と入力信号のスペクトル包絡とを用いて、適応的にビット割当量を変更する。なお、本実施の形態では、伝送する符号化音声のビットレートは可変できるとする。まず各帯域のマスキング閾値を越える成分の絶対値から、音声のSN値が所定の値になるようにビット数の決定を行う。このため、全体でのビット数は一定値ではなく、信号の状態などに応じて適応的に可変する。
【0026】
ビット割り当て以降の動作は従来例と同様である。即ち、ステップT7では、入力信号を帯域分割し、各帯域に割当てられたビット数で量子化し、エントロピー符号化を実施する。次のステップT8では、各帯域の符号化されたデータと、量子化に割り当てられたビット割当て数を多重化して伝送路に出力する。
【0027】
【発明の効果】
以上のように、請求項1〜8記載の発明によれば、受信側の周囲騒音の影響や受信側聴取者の聴力特性を考慮して全帯域の符号化データのビット数を制限することにより、符号化伝送する情報量を削減できる効果が得られる。
【0028】
また請求項2,3,4,6,7,8記載の発明によれば、受信側の周囲騒音の影響や聴取者の聴力特性を考慮して各帯域へのビット配分を変更することにより、聴感上の符号化再生品質を改善できるという効果が得られる。
【図面の簡単な説明】
【図1】 本発明の音声符号化伝送方法を実現するための音声符号化装置の基本構成図である。
【図2】 本発明の実施の形態1における音声符号化伝送方法の信号処理を示すフローチャート(その1)である。
【図3】 実施の形態1,2における音声符号化伝送方法の信号処理を示すフローチャート(その2)である。
【図4】 本発明の実施の形態2における音声符号化伝送方法の信号処理を示すフローチャート(その1)である。
【図5】 従来の音声符号化装置の構成図である。
【符号の説明】
1 周波数包絡演算手段
2A,2B マスキング閾値決定手段
3 適応ビット割当手段
4 帯域分割手段
5 量子化手段
6 エントロピー符号化手段
7 マルチプレクサ
8 騒音特性参照手段
9 聴力特性参照手段
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a voice encoding transmission method and a voice encoding transmission apparatus for efficiently transmitting a voice signal using a transmission line.
[0002]
[Prior art]
A conventional speech coding method and apparatus will be described. FIG. 5 is a block diagram showing the basic configuration of a conventional speech coding apparatus. As shown in the figure, the speech coding apparatus comprises a frequency envelope calculation means 1, a masking threshold determination means 2A, an adaptive bit allocation means 3, first to Nth band division means 4 that divide the signal into first to Nth bands, first to Nth quantization means 5 that quantize each band, first to Nth entropy encoding means 6 that entropy-encode each band, and a multiplexer 7.
[0003]
First, the spectral envelope of the speech signal input to the frequency envelope calculation means 1 is obtained frame by frame. Based on the obtained spectral envelope, the masking threshold determination means 2A determines the masking threshold of each of the divided bands. This masking threshold is determined by the simultaneous masking effect, taking the critical bandwidth into account. The adaptive bit allocation means 3 obtains, for each band, the spectral envelope component of the input signal that exceeds the obtained masking threshold, and determines the bit allocation for each band according to the ratio of these components.
[0004]
Meanwhile, the input signal is input to the first to Nth band division means 4 and divided into N bands. The output signals of the first to Nth band division means 4 are input to the first to Nth quantization means 5, respectively, and quantized with the number of bits given by the adaptive bit allocation means 3. Each quantized band signal is then input to the first to Nth entropy encoding means 6 and entropy-encoded to remove redundancy. The encoded data of each band is combined by the multiplexer 7 with the bit allocation information determined by the adaptive bit allocation means 3 and sent to the transmission line.
[0005]
[Problems to be solved by the invention]
However, although the above method reduces the encoded data by using the simultaneous masking effect to improve coding efficiency, it takes into account neither the influence of ambient noise on the side of the listener receiving the speech signal nor the listener's individual hearing ability (auditory characteristics). In particular, for a listener whose receiving environment has a high noise level, or who cannot hear the entire audible band, unilaterally receiving such band-divided signals means acquiring redundant information.
[0006]
The present invention has been made in view of these problems of the prior art. Its object is to realize a speech coding transmission method and a speech coding transmission apparatus that improve the reproduction quality or the coding efficiency of the coded speech signal by taking into account the ambient noise characteristics on the side of the listener receiving the speech signal and the hearing characteristics of the listener.
[0007]
[Means for Solving the Problems]
To achieve this object, the invention according to claim 1 of the present application is a speech coding transmission method applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, comprising: requesting, from the transmitting side to the receiving side, transfer of information on the ambient noise characteristics of the receiving side, and obtaining ambient noise spectrum information from the receiving side; requesting, from the transmitting side to the receiving side, transfer of information on the hearing characteristics of the receiving-side listener, and obtaining listener hearing characteristic information from the receiving side; obtaining the spectral envelope of the speech signal to be encoded and transmitted; correcting the masking threshold for the spectral envelope, derived from the simultaneous masking effect, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side; adjusting the distribution of the bit allocation to each frequency band so as to reduce it, based on the resulting new masking threshold; and encoding each frequency band signal with a predetermined coding algorithm based on the adjusted bit allocation and transmitting the result to the receiving side. The invention according to claim 5 of the present application is a speech coding transmission apparatus applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, comprising: noise characteristic reference means for obtaining, on the transmitting side, the ambient noise spectrum information of the receiving side; hearing characteristic reference means for obtaining, on the transmitting side, the listener hearing characteristic information of the receiving side; frequency envelope calculation means for obtaining the spectral envelope of the speech signal to be encoded and transmitted; masking threshold determination means for correcting the masking threshold for the spectral envelope, derived from the simultaneous masking effect, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side; adaptive bit allocation means for adjusting the distribution of the bit allocation to each frequency band so as to reduce it, based on the resulting new masking threshold; and encoding means for encoding each frequency band signal with a predetermined coding algorithm based on the adjusted bit allocation.
[0008]
The invention according to claim 2 of the present application is a speech coding transmission method applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, comprising: requesting, from the transmitting side to the receiving side, transfer of information on the ambient noise characteristics of the receiving side, and obtaining ambient noise spectrum information from the receiving side; requesting, from the transmitting side to the receiving side, transfer of information on the hearing characteristics of the receiving-side listener, and obtaining listener hearing characteristic information from the receiving side; obtaining the spectral envelope of the speech signal to be encoded and transmitted; correcting the masking threshold for the spectral envelope, derived from the simultaneous masking effect, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side; changing the bit allocation for each frequency band, based on the resulting new masking threshold, so that the SN value of the speech signal is equal to or greater than a predetermined value; and encoding each frequency band signal with a predetermined coding algorithm based on the changed bit allocation and transmitting the result to the receiving side. The invention according to claim 6 of the present application is a speech coding transmission apparatus applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, comprising: noise characteristic reference means for obtaining, on the transmitting side, the ambient noise spectrum information of the receiving side; hearing characteristic reference means for obtaining, on the transmitting side, the listener hearing characteristic information of the receiving side; frequency envelope calculation means for obtaining the spectral envelope of the speech signal to be encoded and transmitted; masking threshold determination means for correcting the masking threshold for the spectral envelope, derived from the simultaneous masking effect, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side; adaptive bit allocation means for changing the bit allocation for each frequency band, based on the resulting new masking threshold, so that the SN value of the speech signal is equal to or greater than a predetermined value; and encoding means for encoding each frequency band signal with a predetermined coding algorithm based on the changed bit allocation.
[0009]
According to the inventions of claims 3 and 7 of the present application, in the speech coding transmission method and speech coding transmission apparatus described above, when the masking threshold is corrected, it is adjusted based on the masking noise obtained from the ambient noise spectrum information.
[0010]
According to the inventions of claims 4 and 8 of the present application, in the speech coding transmission method and speech coding transmission apparatus described above, when the masking threshold is corrected, it is adjusted based on the listener's minimum audible value and critical bandwidth for each frequency band, obtained from the listener hearing characteristic information.
[0011]
DETAILED DESCRIPTION OF THE INVENTION
(Embodiment 1)
Hereinafter, the speech coding transmission method according to Embodiment 1 of the present invention will be described with reference to FIGS. 1 to 3. FIG. 1 is a block diagram showing the basic configuration of the speech coding apparatus of this embodiment; parts identical to those of the conventional example are given the same reference numerals, and their description is omitted. This speech coding apparatus comprises a frequency envelope calculation means 1, a masking threshold determination means 2B, an adaptive bit allocation means 3, first to Nth band division means 4 that divide the signal into N bands, first to Nth quantization means 5 that quantize each band, first to Nth entropy encoding means 6 that entropy-encode each band, and a multiplexer 7, together with a noise characteristic reference means 8 and a hearing characteristic reference means 9.
[0012]
The noise characteristic reference means 8 is means for obtaining the noise characteristic of the reception side input via the transmission path and giving it to the masking threshold value determination means 2B. The hearing characteristic reference means 9 is means for obtaining the hearing characteristic of the listener inputted via the transmission path and giving it to the masking threshold value determination means 2B. The masking threshold value determining means 2B is a means for determining a masking threshold value based on the frequency envelope information of the input audio signal and the noise characteristics and hearing characteristics on the receiving side.
[0013]
The operation of the speech coding apparatus configured as described above will be described with reference to FIGS. 1 to 3. FIGS. 2 and 3 are flowcharts showing the signal processing flow of the speech coding transmission method according to the present embodiment.
[0014]
In step S1, the transmitting side first requests the transfer of noise spectrum information related to the ambient noise characteristics on the receiving side in order to know the ambient noise environment on the receiving side. On the other hand, in step S11 of FIG. 3, the receiving side accepts a transfer request for information on ambient noise characteristics from the transmitting side. In the next step S12, the receiving side measures the noise spectrum related to the ambient noise on the receiving terminal side, and transfers the obtained noise spectrum information to the noise characteristic reference means 8 on the transmitting side. In this way, the transmitting side obtains ambient noise spectrum information.
[0015]
In step S2 of FIG. 2, the transmitting side requests transfer of hearing information relating to the hearing characteristics of the receiving listener in order to know the hearing characteristics of the receiving listener. On the other hand, in step S21 in FIG. 3, the receiving side accepts a request for transferring the hearing information of the listener on the receiving side from the transmitting side. In the next step S22, the hearing information of the listener on the receiving terminal side is collected, and the obtained hearing information is transferred to the hearing characteristic reference means 9 on the transmitting side. It is assumed that the accurate auditory characteristics of the listener on the receiving terminal side have already been obtained, and that information can be transferred from the receiving terminal. If information on the ambient noise characteristics on the receiving side and the hearing characteristics of the listener cannot be obtained, the transmitting side estimates the information.
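The request and fallback behaviour of steps S1, S2, S11 to S12 and S21 to S22 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `ListeningConditions` and `gather_listening_conditions`, the per-band dB representation, and the use of fixed default values as the sender's "estimate" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class ListeningConditions:
    """Receiver-side information held by the sender (hypothetical container)."""
    noise_spectrum_db: List[float]     # per-band ambient noise level (steps S11-S12)
    hearing_threshold_db: List[float]  # per-band minimum audible level (steps S21-S22)


def gather_listening_conditions(
    request_noise: Callable[[], Optional[Sequence[float]]],
    request_hearing: Callable[[], Optional[Sequence[float]]],
    default_noise_db: Sequence[float],
    default_hearing_db: Sequence[float],
) -> ListeningConditions:
    """Ask the receiver for both kinds of information; estimate what is missing.

    Each request callable stands in for one round trip over the transmission
    line and returns None when the receiver cannot supply the information.
    """
    noise = request_noise()      # step S1
    hearing = request_hearing()  # step S2
    return ListeningConditions(
        noise_spectrum_db=list(noise) if noise is not None else list(default_noise_db),
        hearing_threshold_db=list(hearing) if hearing is not None else list(default_hearing_db),
    )
```

Passing the two requests as callables keeps the sketch independent of any particular transport; a real system would replace them with whatever signalling the transmission line provides.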
[0016]
In the next step S3, the masking threshold value determination means 2B on the transmission side obtains the spectrum envelope of the input signal in units of frames, and adds the envelope of the ambient noise spectrum obtained from the reception side to this. In step S4, a masking threshold value due to the simultaneous masking effect is determined. Thereby, the masking threshold value including the ambient noise environment on the receiving side is obtained.
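Steps S3 and S4 can be illustrated with a small sketch. The patent does not specify how the two envelopes are combined or which masking model is used, so the choices here are assumptions: levels are per-band values in dB, the envelopes are added as powers, and the simultaneous masking threshold is approximated by a triangular spreading function with a hypothetical slope and offset.

```python
import math
from typing import List, Sequence


def db_to_power(level_db: float) -> float:
    return 10.0 ** (level_db / 10.0)


def power_to_db(power: float) -> float:
    return 10.0 * math.log10(power)


def combined_envelope_db(signal_db: Sequence[float], noise_db: Sequence[float]) -> List[float]:
    """Step S3: add the receiver's ambient-noise envelope to the signal envelope (power sum)."""
    return [power_to_db(db_to_power(s) + db_to_power(n)) for s, n in zip(signal_db, noise_db)]


def masking_threshold_db(
    envelope_db: Sequence[float],
    slope_db_per_band: float = 15.0,  # assumed roll-off of masking per band of distance
    offset_db: float = 10.0,          # assumed gap between masker level and threshold
) -> List[float]:
    """Step S4: simplified simultaneous-masking threshold for each band."""
    n = len(envelope_db)
    thresholds = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            # Contribution of the masker in band j to the threshold in band i.
            spread_db = envelope_db[j] - slope_db_per_band * abs(i - j) - offset_db
            total += db_to_power(spread_db)
        thresholds.append(power_to_db(total))
    return thresholds
```

Because the noise envelope can only raise the combined envelope, the threshold computed from it is never lower than the threshold computed from the signal alone, which is what lets the sender drop components the listener could not hear over the ambient noise.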
[0017]
In step S5, the masking threshold value determining unit 2B corrects the masking threshold value based on the minimum audible value that is the hearing characteristic of the listener obtained through the hearing characteristic reference unit 9. As a result, it is possible to eliminate transmission of useless encoded data outside the audible range, for example, when the listener has deteriorated sensitivity at high frequencies.
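One simple way to picture the correction in step S5 is to take, in each band, the larger of the masking threshold and the listener's minimum audible value. The patent only says the threshold is corrected based on the minimum audible value, so the element-wise maximum and the function names below are assumptions made for illustration.

```python
from typing import List, Sequence


def correct_for_hearing(
    masking_db: Sequence[float], hearing_threshold_db: Sequence[float]
) -> List[float]:
    """Raise the masking threshold to the listener's minimum audible level where that is higher."""
    return [max(m, h) for m, h in zip(masking_db, hearing_threshold_db)]


def audible_bands(signal_db: Sequence[float], corrected_db: Sequence[float]) -> List[bool]:
    """A band is worth transmitting only if the signal exceeds the corrected threshold."""
    return [s > t for s, t in zip(signal_db, corrected_db)]
```

For example, with a masking threshold of 30 dB in every band and a listener whose minimum audible value in the highest band is 70 dB, a 60 dB component in that band falls below the corrected threshold and need not be sent.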
[0018]
In step S6, the adaptive bit allocation means 3 adaptively changes the bit allocation amount using the masking threshold obtained above and the spectrum envelope of the input signal. In the present embodiment, it is assumed that the upper limit of the bit rate of encoded audio to be transmitted is limited. Next, a ratio of components exceeding the masking threshold value of each band is obtained, and bit allocation is performed according to the ratio. Although the total number of bits is set to a predetermined value or less, the allocated amount is adaptively changed according to the previous bit allocation.
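The capped allocation of step S6 might be sketched as below. The text states only that bits are distributed according to the ratio of the components exceeding the masking threshold and that the total stays at or below a predetermined value; measuring the excess in dB and rounding down are assumptions of this example.

```python
from typing import List, Sequence


def allocate_bits_capped(
    signal_db: Sequence[float], threshold_db: Sequence[float], total_bits: int
) -> List[int]:
    """Share at most total_bits among bands in proportion to their excess over the threshold."""
    excess = [max(0.0, s - t) for s, t in zip(signal_db, threshold_db)]
    total_excess = sum(excess)
    if total_excess == 0.0:
        # Nothing is audible in this frame, so no bits are spent.
        return [0] * len(excess)
    # Rounding down guarantees that the sum never exceeds total_bits.
    return [int(total_bits * e / total_excess) for e in excess]
```

With envelope levels of 60, 50 and 40 dB, thresholds of 30, 40 and 70 dB and a budget of 16 bits, the excesses are 30, 10 and 0 dB, so the bands receive 12, 4 and 0 bits.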
[0019]
The operation after bit allocation is the same as in the conventional example. That is, in step S7, the band division means 4 divides the input signal into bands. The quantization means 5 quantizes each band with the number of bits assigned to it, and the entropy encoding means 6 performs entropy coding. In the next step S8, the multiplexer 7 multiplexes the encoded data of each band with the number of bits allocated for quantization, and outputs the result to the transmission line.
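A toy version of steps S7 and S8 is shown below, assuming one normalized sample per band and a uniform quantizer. Band splitting and entropy coding are omitted, and the frame layout (a dictionary holding the bit allocation as side information next to the codes) is a hypothetical choice for the example, not the format used in the patent.

```python
from typing import Dict, List, Sequence


def quantize(sample: float, bits: int) -> int:
    """Uniform quantizer for a sample in [-1.0, 1.0); returns an integer code."""
    if bits == 0:
        return 0  # band not transmitted
    levels = 1 << bits
    code = int((sample + 1.0) / 2.0 * levels)
    return min(max(code, 0), levels - 1)


def multiplex(band_samples: Sequence[float], bit_allocation: Sequence[int]) -> Dict[str, List[int]]:
    """Quantize each band with its allotted bits and bundle codes with the allocation."""
    codes = [quantize(s, b) for s, b in zip(band_samples, bit_allocation)]
    return {"bit_allocation": list(bit_allocation), "codes": codes}
```

Sending the allocation alongside the codes is what allows the receiver to know how many bits to read for each band when it demultiplexes the frame.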
[0020]
(Embodiment 2)
Next, the speech coding transmission method according to Embodiment 2 of the present invention will be described with reference to FIGS. 3 and 4. FIG. 4 is a flowchart showing the signal processing flow of the speech coding transmission method according to the present embodiment. Since the basic configuration of the speech coding apparatus is the same as that in FIG. 1, references to the individual means of FIG. 1 are omitted.
[0021]
In step T1 in FIG. 4, first, the transmitting side requests transfer of noise spectrum information related to the ambient noise characteristics on the receiving side in order to know the ambient noise environment on the receiving side. On the other hand, in step T11 of FIG. 3, the receiving side accepts a transfer request for information on ambient noise characteristics from the transmitting side. In the next step T12, the receiving side measures the noise spectrum related to the ambient noise on the receiving terminal side, and transfers the obtained noise spectrum information to the transmitting side. In this way, the transmitting side obtains ambient noise spectrum information.
[0022]
In the next step T2, the transmitting side requests transfer of hearing information relating to the hearing characteristics of the listener in order to know the hearing characteristics of the listener on the receiving side. On the other hand, in step T21 of FIG. 3, the receiving side accepts a transfer request for information related to the hearing characteristics of the listener on the receiving side from the transmitting side. In the next step T22, the hearing information of the listener on the receiving terminal side is collected, and the obtained hearing information is transferred to the transmitting side. Note that it is assumed that accurate hearing characteristics of the listener on the receiving terminal side have already been obtained, and that information can be transferred from the receiving terminal.
[0023]
In the next step T3, the transmitting side obtains the spectrum envelope of the input signal in units of frames, and adds the envelope of the ambient noise spectrum obtained from the receiving side to this. In step T4, a masking threshold value due to the simultaneous masking effect is determined. Thereby, the masking threshold value including the ambient noise environment on the receiving side is obtained.
[0024]
In step T5, the masking threshold is further corrected based on the minimum audible value, which is a hearing characteristic of the listener on the receiving side. As a result, when the listener's sensitivity at high frequencies has deteriorated, for example, wasteful transmission of encoded data outside the audible range can be eliminated in advance.
[0025]
In step T6, the bit allocation amount is adaptively changed using the masking threshold obtained above and the spectrum envelope of the input signal. In the present embodiment, it is assumed that the bit rate of transmitted encoded audio can be varied. First, the number of bits is determined from the absolute value of the component exceeding the masking threshold of each band so that the SN value of the voice becomes a predetermined value. For this reason, the total number of bits is not a constant value, but is adaptively varied according to the signal state and the like.
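The variable-rate allocation of step T6 can be sketched under the common approximation that a uniform quantizer gains about 6.02 dB of signal-to-noise ratio per bit. The patent does not give a formula, so this reading is an assumption: every band whose envelope exceeds the corrected masking threshold receives enough bits to reach the target SN value, and every other band receives none.

```python
import math
from typing import List, Sequence

DB_PER_BIT = 6.02  # approximate SNR gain per bit of a uniform quantizer


def allocate_bits_for_snr(
    signal_db: Sequence[float], threshold_db: Sequence[float], target_snr_db: float
) -> List[int]:
    """Give each audible band enough bits to reach target_snr_db; inaudible bands get none."""
    bits_needed = math.ceil(target_snr_db / DB_PER_BIT)
    return [bits_needed if s > t else 0 for s, t in zip(signal_db, threshold_db)]
```

Because the number of audible bands changes with the signal, the ambient noise and the listener's hearing, the total bit count varies from frame to frame. For a 30 dB target, two audible bands cost 10 bits in total, and a louder noise floor that masks one of them halves that to 5.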
[0026]
The operation after bit allocation is the same as in the conventional example. That is, in step T7, the input signal is divided into bands, quantized with the number of bits assigned to each band, and entropy coding is performed. In the next step T8, the encoded data of each band and the number of bits allocated for quantization are multiplexed and output to the transmission line.
[0027]
[Effect of the Invention]
As described above, according to the first to eighth aspects of the present invention, by limiting the number of bits of encoded data in the entire band in consideration of the influence of ambient noise on the reception side and the hearing characteristics of the listener on the reception side. Thus, it is possible to reduce the amount of information to be encoded and transmitted.
[0028]
According to the inventions of claims 2, 3, 4, 6, 7, and 8, by changing the bit distribution to each band in consideration of the influence of ambient noise on the receiving side and the hearing characteristics of the listener, There is an effect that the encoded reproduction quality on hearing can be improved.
[Brief description of the drawings]
FIG. 1 is a basic configuration diagram of a speech coding apparatus for realizing a speech coding and transmission method of the present invention.
FIG. 2 is a flowchart (part 1) showing signal processing of the speech coding and transmission method according to Embodiment 1 of the present invention.
FIG. 3 is a flowchart (part 2) illustrating signal processing of the speech coding and transmission method according to the first and second embodiments.
FIG. 4 is a flowchart (part 1) showing signal processing of the speech coding and transmission method according to Embodiment 2 of the present invention.
FIG. 5 is a configuration diagram of a conventional speech encoding apparatus.
[Explanation of symbols]
1 Frequency envelope calculation means
2A, 2B Masking threshold value determination means
3 Adaptive bit allocation means
4 Band division means
5 Quantization means
6 Entropy encoding means
7 Multiplexer
8 Noise characteristic reference means
9 Hearing characteristic reference means

Claims (8)

受信側の聴取条件に基づいて聴感上の再生品質を補償するために送信側で適用する音声符号化伝送方法であって、
送信側から受信側へ受信側の周囲騒音特性に関する情報の転送を要求して受信側から周囲騒音スペクトル情報を入手し、
送信側から受信側へ受信側聴取者の聴力特性に関する情報の転送を要求して受信側から聴取者聴力特性情報を入手し、
符号化伝送すべき音声信号のスペクトル包絡を求め、
同時マスキング効果によって前記スペクトル包絡に関するマスキング閾値を、受信側から得た周囲騒音スペクトル情報と受信側から得た聴取者聴力特性情報の少なくとも1つに基づいて補正し、
得られた新しいマスキング閾値を基に各周波数帯域に対するビット割当量の配分を減少するよう調整し、
調整されたビット割当量に基づいて各周波数帯域信号に対して所定の符号化アルゴリズムで符号化を行い、受信側へ伝送することを特徴とする音声符号化伝送方法。
A speech coding transmission method applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, comprising:
requesting, from the transmitting side to the receiving side, transfer of information on the ambient noise characteristics of the receiving side, and obtaining ambient noise spectrum information from the receiving side;
requesting, from the transmitting side to the receiving side, transfer of information on the hearing characteristics of the receiving-side listener, and obtaining listener hearing characteristic information from the receiving side;
obtaining the spectral envelope of the speech signal to be encoded and transmitted;
correcting the masking threshold for the spectral envelope, derived from the simultaneous masking effect, based on at least one of the ambient noise spectrum information obtained from the receiving side and the listener hearing characteristic information obtained from the receiving side;
adjusting the distribution of the bit allocation to each frequency band so as to reduce it, based on the resulting new masking threshold; and
encoding each frequency band signal with a predetermined coding algorithm based on the adjusted bit allocation, and transmitting the result to the receiving side.
A speech coding and transmission method applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, the method comprising:
requesting, from the transmitting side, that the receiving side transfer information on the ambient noise characteristics of the receiving side, and obtaining ambient noise spectrum information from the receiving side;
requesting, from the transmitting side, that the receiving side transfer information on the hearing characteristics of the listener on the receiving side, and obtaining listener hearing characteristic information from the receiving side;
obtaining the spectral envelope of the speech signal to be encoded and transmitted;
correcting the masking threshold that the simultaneous masking effect yields for the spectral envelope, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side;
changing the bit allocation for each frequency band, based on the resulting new masking threshold, so that the SN (signal-to-noise) value of the speech signal is at or above a predetermined value; and
encoding each frequency band signal with a predetermined encoding algorithm based on the changed bit allocation and transmitting it to the receiving side.
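One possible reading of the bit-allocation step in this claim is sketched below in Python. The claim only requires that the SN value reach a predetermined value; treating the SN value as a per-band signal-to-quantization-noise ratio, using about 6 dB per bit, working within a fixed bit budget, and handing leftover bits to the bands furthest above the threshold are all assumptions, and `allocate_for_min_snr` is a hypothetical name.

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gain per quantizer bit (assumption)

def allocate_for_min_snr(envelope_db, threshold_db, min_snr_db, bit_budget):
    """Give each audible band enough bits for min_snr_db, then spend leftovers."""
    # A band counts as audible when its level exceeds the corrected threshold.
    audible = [i for i, (lvl, thr) in enumerate(zip(envelope_db, threshold_db))
               if lvl > thr]
    bits = [0] * len(envelope_db)
    if not audible:
        return bits
    floor_bits = math.ceil(min_snr_db / DB_PER_BIT)
    remaining = bit_budget - floor_bits * len(audible)
    if remaining < 0:
        raise ValueError("bit budget too small for the requested SN value")
    for i in audible:
        bits[i] = floor_bits
    # Leftover bits go first to bands that stand furthest above the threshold.
    order = sorted(audible,
                   key=lambda i: envelope_db[i] - threshold_db[i],
                   reverse=True)
    for k in range(remaining):
        bits[order[k % len(order)]] += 1
    return bits

# Hypothetical values: two of four bands are masked at the receiver.
print(allocate_for_min_snr([60, 50, 40, 30], [40, 45, 45, 35], 18, 7))
# [4, 3, 0, 0]
```

Bits that the masked bands no longer need are what make the minimum SN value reachable in the audible bands without enlarging the total budget.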
3. The speech coding and transmission method according to claim 1 or 2, wherein, in correcting the masking threshold, the masking threshold is adjusted based on masking noise obtained from the ambient noise spectrum information.
4. The speech coding and transmission method according to claim 1 or 2, wherein, in correcting the masking threshold, the masking threshold is adjusted based on the listener's minimum audible value and critical bandwidth for each frequency band, obtained from the listener hearing characteristic information.
A speech coding and transmission apparatus applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, the apparatus comprising:
noise characteristic reference means for obtaining, on the transmitting side, ambient noise spectrum information of the receiving side;
hearing characteristic reference means for obtaining, on the transmitting side, listener hearing characteristic information of the receiving side;
frequency envelope calculation means for obtaining the spectral envelope of the speech signal to be encoded and transmitted;
masking threshold determination means for correcting the masking threshold that the simultaneous masking effect yields for the spectral envelope, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side;
adaptive bit allocation means for adjusting the bit allocation for each frequency band, based on the resulting new masking threshold, so that the allocation is reduced; and
encoding means for encoding each frequency band signal with a predetermined encoding algorithm based on the adjusted bit allocation.
A speech coding and transmission apparatus applied on the transmitting side to compensate for perceived reproduction quality based on listening conditions on the receiving side, the apparatus comprising:
noise characteristic reference means for obtaining, on the transmitting side, ambient noise spectrum information of the receiving side;
hearing characteristic reference means for obtaining, on the transmitting side, listener hearing characteristic information of the receiving side;
frequency envelope calculation means for obtaining the spectral envelope of the speech signal to be encoded and transmitted;
masking threshold determination means for correcting the masking threshold that the simultaneous masking effect yields for the spectral envelope, based on at least one of the ambient noise spectrum information and the listener hearing characteristic information obtained from the receiving side;
adaptive bit allocation means for changing the bit allocation for each frequency band, based on the resulting new masking threshold, so that the SN (signal-to-noise) value of the speech signal is at or above a predetermined value; and
encoding means for encoding each frequency band signal with a predetermined encoding algorithm based on the changed bit allocation.
7. The speech coding and transmission apparatus according to claim 5 or 6, wherein, in correcting the masking threshold, the masking threshold is adjusted based on masking noise obtained from the ambient noise spectrum information.
8. The speech coding and transmission apparatus according to claim 5 or 6, wherein, in correcting the masking threshold, the masking threshold is adjusted based on the listener's minimum audible value and critical bandwidth for each frequency band, obtained from the listener hearing characteristic information.
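The threshold adjustment described in the dependent claims (masking noise derived from the ambient noise spectrum, plus the listener's per-band minimum audible value and critical bandwidth) might be sketched in Python as follows. The claims give no formulas, so everything here is an assumption made for illustration: band edges stand in for the listener's critical bandwidths, the noise masking level is taken as the band noise level minus a fixed offset, the per-band maximum is used to combine thresholds, and the names `band_levels_db`, `adjust_threshold` and `offset_db` are hypothetical.

```python
import math

def band_levels_db(power_spectrum, band_edges):
    """Sum linear power inside each band and convert the total to dB."""
    levels = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        power = sum(power_spectrum[lo:hi])
        levels.append(10.0 * math.log10(power) if power > 0 else -math.inf)
    return levels

def adjust_threshold(base_db, noise_spectrum, band_edges,
                     min_audible_db, offset_db=5.0):
    """Raise the base threshold to the noise masking level or hearing floor."""
    noise_db = band_levels_db(noise_spectrum, band_edges)
    return [max(b, n - offset_db, m)
            for b, n, m in zip(base_db, noise_db, min_audible_db)]

# Hypothetical receiver-side data: 20 spectrum bins grouped into two bands.
noise_spectrum = [1e3] * 10 + [1e5] * 10   # linear power per bin
band_edges = [0, 10, 20]                   # bin indices; wider bands would
                                           # model broader critical bandwidths
base_threshold = [30.0, 30.0]              # dB, from the speech signal alone
min_audible = [45.0, 20.0]                 # dB, listener's hearing floor

print(adjust_threshold(base_threshold, noise_spectrum, band_edges, min_audible))
```

Here the first band ends up limited by the listener's hearing floor and the second by the ambient noise, which shows how either source of receiver-side information can raise the threshold.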
JP15079297A 1997-06-09 1997-06-09 Speech coding transmission method and speech coding transmission apparatus Expired - Fee Related JP3750705B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP15079297A JP3750705B2 (en) 1997-06-09 1997-06-09 Speech coding transmission method and speech coding transmission apparatus

Publications (2)

Publication Number Publication Date
JPH10341162A JPH10341162A (en) 1998-12-22
JP3750705B2 JP3750705B2 (en) 2006-03-01

Family

ID=15504543

Family Applications (1)

Application Number Title Priority Date Filing Date
JP15079297A Expired - Fee Related JP3750705B2 (en) 1997-06-09 1997-06-09 Speech coding transmission method and speech coding transmission apparatus

Country Status (1)

Country Link
JP (1) JP3750705B2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3574123B2 (en) 2001-03-28 2004-10-06 三菱電機株式会社 Noise suppression device
JP4464707B2 (en) 2004-02-24 2010-05-19 パナソニック株式会社 Communication device
JP5065687B2 (en) * 2007-01-09 2012-11-07 株式会社東芝 Audio data processing device and terminal device
CN102169694B (en) * 2010-02-26 2012-10-17 华为技术有限公司 Method and device for generating psychoacoustic model
JP6160072B2 (en) * 2012-12-06 2017-07-12 富士通株式会社 Audio signal encoding apparatus and method, audio signal transmission system and method, and audio signal decoding apparatus
JP2015227912A (en) * 2014-05-30 2015-12-17 富士通株式会社 Audio coding device and method
JP6317235B2 (en) * 2014-11-07 2018-04-25 日本電信電話株式会社 Content server device, operation method of content server device, and computer program

Also Published As

Publication number Publication date
JPH10341162A (en) 1998-12-22

Similar Documents

Publication Publication Date Title
US9165564B2 (en) Method and apparatus for frame-based buffer control in a communication system
US6098039A (en) Audio encoding apparatus which splits a signal, allocates and transmits bits, and quantitizes the signal based on bits
US6405338B1 (en) Unequal error protection for perceptual audio coders
US4622680A (en) Hybrid subband coder/decoder method and apparatus
US7930185B2 (en) Apparatus and method for controlling audio-frame division
JP2000236262A (en) Decoding of many programs for digital audio broadcasting communication and other purposes of use
US8787490B2 (en) Transmitting data in a communication system
JP3750705B2 (en) Speech coding transmission method and speech coding transmission apparatus
JP3987317B2 (en) Method and apparatus for processing a signal for transmission in a wireless communication system
US5832427A (en) Audio signal signal-to-mask ratio processor for subband coding
US11545164B2 (en) Audio signal encoding and decoding
JP3041967B2 (en) Digital signal coding device
KR20000057816A (en) Joint multiple program coding for digital audio broadcasting and other applications
JP2913696B2 (en) Digital signal encoding method
JP2913695B2 (en) Digital signal encoding method
JP2005165183A (en) Wireless communication device
JPH04302533A (en) High-efficiency encoding device for digital data
Naik et al. Joint encoding and decoding methods for digital audio broadcasting of multiple programs
KR960003627B1 (en) Decoding method of subband decoding audio signal for people hard of hearing
KR960016814B1 (en) Sub band coding method for a poor hearer
JPH04302535A (en) Digital signal encoding method
JP2001100796A (en) Audio signal encoding device
JPH04304013A (en) Digital signal encoder
JPS633528A (en) Voice coding system
JPH04302534A (en) Digital signal encoding method

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20040430

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20040430

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20050620

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20050705

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20050901

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20050920

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20051104

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20051129

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20051129

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091216

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20101216

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20111216

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20121216

Year of fee payment: 7

LAPS Cancellation because of no payment of annual fees