JP5172965B2 - Adaptive tuning of the perceptual model - Google Patents
Adaptive tuning of the perceptual model
- Publication number: JP5172965B2
- Application number: JP2010530556A
- Authority: JP (Japan)
- Prior art keywords: signal, bit rate, parameter, mask ratio, ratio parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Expired - Fee Related
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B1/00—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
- H04B1/66—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission
- H04B1/665—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission using psychoacoustic properties of the ear, e.g. masking effect
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Claims (19)
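- A method of encoding a signal, comprising:
inputting the signal to a perceptual model;
generating a masking threshold for the signal based on the signal and a signal-to-mask ratio parameter;
quantizing and encoding the signal based on the masking threshold; and
adjusting the signal-to-mask ratio parameter based at least on a function of a bit rate of an encoded portion of the signal and a target bit rate.
- The method of claim 1, further comprising periodically repeating the step of adjusting the signal-to-mask ratio parameter.
- The method of claim 2, wherein the signal is divided into a sequence of frames, and
periodically repeating the step of adjusting the signal-to-mask ratio parameter comprises repeating that step every N frames, where N is an integer.
- The method of any one of claims 1 to 3, wherein adjusting the signal-to-mask ratio parameter comprises:
calculating an average bit rate of the encoded portion; and
adjusting the signal-to-mask ratio parameter based at least on a function of the average bit rate and the target bit rate of the signal.
- The method of claim 4, wherein the adjustment of the signal-to-mask ratio parameter is further based on a function of a short-term average bit rate calculated over a part of the encoded portion.
- The method of claim 5, wherein the part of the encoded portion consists of N frames, where N is an integer.
- The method of any one of claims 4 to 6, wherein the adjustment of the signal-to-mask ratio parameter is further based on a tuning parameter.
- The method of claim 7, further comprising updating the tuning parameter based on a measured change in bit rate.
- The method of any one of claims 1 to 10, wherein adjusting the signal-to-mask ratio parameter based at least on a function of the bit rate of the encoded portion of the signal and the target bit rate further comprises limiting the amount of change in the signal-to-mask ratio parameter.
- The method of any one of claims 1 to 11, wherein the perceptual model comprises a psychoacoustic model, and
the signal comprises an audio signal.
- An encoder comprising:
a perceptual model configured to generate a masking threshold for a signal based on the signal and a signal-to-mask ratio parameter;
means for quantizing and encoding the signal based on the masking threshold; and
means for adjusting the signal-to-mask ratio parameter based at least on a function of a bit rate of an encoded portion of the signal and a target bit rate.
- The encoder of claim 13, wherein the means for adjusting is configured to calculate an average bit rate of the encoded portion and to adjust the signal-to-mask ratio parameter based at least on a function of the average bit rate and the target bit rate of the signal.
- The encoder of claim 14, wherein the adjustment of the signal-to-mask ratio parameter is further based on a function of a short-term average bit rate calculated over a part of the encoded portion.
- The encoder of claim 14 or 15, wherein the adjustment of the signal-to-mask ratio parameter is further based on a tuning parameter.
- The encoder of any one of claims 13 to 17, wherein the means for adjusting is further configured to limit the amount of change in the signal-to-mask ratio parameter.
- The encoder of any one of claims 13 to 18, wherein the perceptual model comprises a psychoacoustic model, and
the signal comprises an audio signal.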
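The adjustment step recited in the claims (average bit rate versus target, a short-term average over N frames, a tuning parameter, and a limit on the change) can be illustrated with a minimal Python sketch. The claims listed here do not give a formula, so everything below is an assumption made for illustration: the function name `adjust_smr`, the proportional control law, the blend weight `short_term_weight`, the default values, and the sign convention (raising the signal-to-mask ratio parameter is assumed to lower the masking threshold and therefore raise the bit rate).

```python
from typing import Sequence


def adjust_smr(
    smr: float,
    frame_bitrates: Sequence[float],
    target_bitrate: float,
    n_frames: int = 8,
    tuning: float = 1e-4,
    short_term_weight: float = 0.5,
    max_step: float = 1.0,
) -> float:
    """Return an updated SMR parameter.

    Hypothetical control law for illustration only; it is not taken
    from the patent text. Bit rates are in bits per second and the SMR
    parameter is treated as a plain number (for example an offset in dB).
    """
    if not frame_bitrates:
        raise ValueError("need at least one encoded frame")
    if n_frames < 1:
        raise ValueError("n_frames must be a positive integer")

    # Average bit rate over the whole encoded portion so far.
    long_avg = sum(frame_bitrates) / len(frame_bitrates)

    # Short-term average over the most recent N frames.
    recent = frame_bitrates[-n_frames:]
    short_avg = sum(recent) / len(recent)

    # Blend the long-term and short-term deviations from the target.
    error = (
        (1.0 - short_term_weight) * (long_avg - target_bitrate)
        + short_term_weight * (short_avg - target_bitrate)
    )

    # Assumed sign convention: overshooting the target lowers the SMR
    # parameter, undershooting raises it. `tuning` scales the response.
    step = -tuning * error

    # Limit the amount of change applied in one adjustment.
    step = max(-max_step, min(max_step, step))
    return smr + step
```

In an encoder loop this function would be called once every N frames with the per-frame bit rates measured so far, and the returned value would be passed to the perceptual model for the next frames. Updating `tuning` from the measured change in bit rate, as one of the dependent claims describes, is left out of the sketch.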
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0721376.2 | 2007-10-31 | ||
GB0721376A GB2454208A (en) | 2007-10-31 | 2007-10-31 | Compression using a perceptual model and a signal-to-mask ratio (SMR) parameter tuned based on target bitrate and previously encoded data |
PCT/GB2008/050804 WO2009056867A1 (en) | 2007-10-31 | 2008-09-09 | Adaptive tuning of the perceptual model |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2011501228A (ja) | 2011-01-06 |
JP5172965B2 (ja) | 2013-03-27 |
Family ID: 38834603
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010530556A Expired - Fee Related JP5172965B2 (ja) | 2007-10-31 | 2008-09-09 | Adaptive tuning of the perceptual model |
Country Status (5)
Country | Link |
---|---|
US (2) | US8326619B2 (ja) |
EP (1) | EP2203916B1 (ja) |
JP (1) | JP5172965B2 (ja) |
GB (1) | GB2454208A (ja) |
WO (1) | WO2009056867A1 (ja) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
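KR101435411B1 (ko) * | 2007-09-28 | 2014-08-28 | 삼성전자주식회사 | Method for adaptively determining a quantization step according to the masking effect of a psychoacoustic model, and method and apparatus for encoding/decoding an audio signal using the same |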
IL205394A (en) * | 2010-04-28 | 2016-09-29 | Verint Systems Ltd | A system and method for automatically identifying a speech encoding scheme |
KR101854469B1 (ko) * | 2011-11-30 | 2018-05-04 | 삼성전자주식회사 | Apparatus and method for determining the bit rate of audio content |
US10043527B1 (en) * | 2015-07-17 | 2018-08-07 | Digimarc Corporation | Human auditory system modeling with masking energy adaptation |
US10395664B2 (en) | 2016-01-26 | 2019-08-27 | Dolby Laboratories Licensing Corporation | Adaptive Quantization |
WO2018069900A1 (en) * | 2016-10-14 | 2018-04-19 | Auckland Uniservices Limited | Audio-system and method for hearing-impaired |
CN115202163B (zh) * | 2022-09-15 | 2022-12-30 | 全芯智造技术有限公司 | Method, device and computer-readable storage medium for selecting a photoresist model |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100269213B1 (ko) * | 1993-10-30 | 2000-10-16 | 윤종용 | Method for encoding an audio signal |
JP3131542B2 (ja) * | 1993-11-25 | 2001-02-05 | シャープ株式会社 | Encoding/decoding apparatus |
US5764698A (en) * | 1993-12-30 | 1998-06-09 | International Business Machines Corporation | Method and apparatus for efficient compression of high quality digital audio |
US5956674A (en) * | 1995-12-01 | 1999-09-21 | Digital Theater Systems, Inc. | Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels |
EP0803989B1 (en) | 1996-04-26 | 1999-06-16 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for encoding of a digitalized audio signal |
CN1106085C (zh) | 1996-04-26 | 2003-04-16 | 德国汤姆逊-布朗特公司 | Method and apparatus for encoding a digital audio signal |
JP3802219B2 (ja) * | 1998-02-18 | 2006-07-26 | 富士通株式会社 | Speech encoding apparatus |
TW477119B (en) * | 1999-01-28 | 2002-02-21 | Winbond Electronics Corp | Byte allocation method and device for speech synthesis |
EP1076295A1 (en) * | 1999-08-09 | 2001-02-14 | Deutsche Thomson-Brandt Gmbh | Method and encoder for bit-rate saving encoding of audio signals |
US6499010B1 (en) * | 2000-01-04 | 2002-12-24 | Agere Systems Inc. | Perceptual audio coder bit allocation scheme providing improved perceptual quality consistency |
TW499672B (en) | 2000-02-18 | 2002-08-21 | Intervideo Inc | Fast convergence method for bit allocation stage of MPEG audio layer 3 encoders |
JP2001282295A (ja) * | 2000-03-29 | 2001-10-12 | Aiwa Co Ltd | Encoder and encoding method |
JP2002006895A (ja) * | 2000-06-20 | 2002-01-11 | Fujitsu Ltd | Bit allocation apparatus and method |
JP4055336B2 (ja) * | 2000-07-05 | 2008-03-05 | 日本電気株式会社 | Speech encoding apparatus and speech encoding method used therein |
DE10113322C2 (de) * | 2001-03-20 | 2003-08-21 | Bosch Gmbh Robert | Method for encoding audio data |
KR100477701B1 (ko) * | 2002-11-07 | 2005-03-18 | 삼성전자주식회사 | MPEG audio encoding method and MPEG audio encoding apparatus |
US7333930B2 (en) | 2003-03-14 | 2008-02-19 | Agere Systems Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation |
JP4347634B2 (ja) * | 2003-08-08 | 2009-10-21 | 富士通株式会社 | Encoding apparatus and encoding method |
US7725313B2 (en) * | 2004-09-13 | 2010-05-25 | Ittiam Systems (P) Ltd. | Method, system and apparatus for allocating bits in perceptual audio coders |
WO2007083934A1 (en) * | 2006-01-18 | 2007-07-26 | Lg Electronics Inc. | Apparatus and method for encoding and decoding signal |
US20070239295A1 (en) * | 2006-02-24 | 2007-10-11 | Thompson Jeffrey K | Codec conditioning system and method |
JP5260561B2 (ja) * | 2007-03-19 | 2013-08-14 | ドルビー ラボラトリーズ ライセンシング コーポレイション | Speech enhancement using a perceptual model |
US8788264B2 (en) * | 2007-06-27 | 2014-07-22 | Nec Corporation | Audio encoding method, audio decoding method, audio encoding device, audio decoding device, program, and audio encoding/decoding system |
JP4973397B2 (ja) * | 2007-09-04 | 2012-07-11 | 日本電気株式会社 | Encoding apparatus and encoding method, and decoding apparatus and decoding method |
2007
- 2007-10-31 GB GB0721376A, patent GB2454208A (en): not active, Withdrawn
2008
- 2008-09-09 US US12/679,729, patent US8326619B2 (en): not active, Expired - Fee Related
- 2008-09-09 EP EP08788773.3A, patent EP2203916B1 (en): active, Active
- 2008-09-09 JP JP2010530556A, patent JP5172965B2 (ja): not active, Expired - Fee Related
- 2008-09-09 WO PCT/GB2008/050804, patent WO2009056867A1 (en): active, Application Filing
2012
- 2012-07-31 US US13/562,841, patent US8589155B2 (en): active, Active
Also Published As
Publication number | Publication date |
---|---|
US8589155B2 (en) | 2013-11-19 |
EP2203916B1 (en) | 2014-02-05 |
US20130024201A1 (en) | 2013-01-24 |
EP2203916A1 (en) | 2010-07-07 |
JP2011501228A (ja) | 2011-01-06 |
US8326619B2 (en) | 2012-12-04 |
GB2454208A (en) | 2009-05-06 |
GB0721376D0 (en) | 2007-12-12 |
WO2009056867A1 (en) | 2009-05-07 |
US20100204997A1 (en) | 2010-08-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10217470B2 (en) | | Bandwidth extension system and approach |
JP5172965B2 (ja) | | Adaptive tuning of the perceptual model |
US10354665B2 (en) | | Apparatus and method for generating a frequency enhanced signal using temporal smoothing of subbands |
JP5986565B2 (ja) | | Speech encoding apparatus, speech decoding apparatus, speech encoding method and speech decoding method |
US9691398B2 (en) | | Method and a decoder for attenuation of signal regions reconstructed with low accuracy |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2011-08-17 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2012-08-14 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2012-11-12 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| | TRDD | Decision of grant or rejection written | |
| 2012-12-11 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| 2012-12-26 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 5172965; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | S533 | Written request for registration of change of name | Free format text: JAPANESE INTERMEDIATE CODE: R313533 |
| | R350 | Written notification of registration of transfer | Free format text: JAPANESE INTERMEDIATE CODE: R350 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | LAPS | Cancellation because of no payment of annual fees | |