TW201812744A - Apparatus and method for encoding an audio signal using a compensation value - Google Patents

Publication number
TW201812744A
Authority
TW
Taiwan
Prior art keywords
frequency band
audio data
analysis result
parameter
band
Prior art date
Application number
TW106128438A
Other languages
Chinese (zh)
Other versions
TWI653626B (en)
Inventor
薩斯洽 迪斯曲
法蘭茲 瑞泰爾休柏
珍恩 布特
馬庫斯 穆爾特斯
伯納德 艾德勒
Original Assignee
弗勞恩霍夫爾協會
紐倫堡大學
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 弗勞恩霍夫爾協會, 紐倫堡大學
Publication of TW201812744A
Application granted
Publication of TWI653626B

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L21/0232: Processing in the frequency domain
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60: Network streaming of media packets
    • H04L65/70: Media network packetisation
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16: Vocoder architecture

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
  • Stereophonic System (AREA)

Abstract

An apparatus for encoding an audio signal comprises: a core encoder for core encoding first audio data in a first spectral band; and a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder comprises: an analyzer for analyzing the first audio data in the first spectral band to obtain a first analysis result and for analyzing the second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value.

Description

Apparatus and method for encoding an audio signal using a compensation value

The present invention is directed to audio coding and decoding and, more specifically, to audio encoding/decoding using spectral enhancement technologies such as bandwidth extension, spectral band replication (SBR), or intelligent gap filling (IGF).

The storage or transmission of audio signals is often subject to strict bitrate constraints. In the past, when only a very low bitrate was available, coders were forced to drastically reduce the transmitted audio bandwidth. Modern audio codecs are able to code wideband signals by using bandwidth extension (BWE) methods [1-2]. These algorithms rely on a parametric representation of the high-frequency (HF) content, which is generated from the waveform-coded low-frequency (LF) part of the decoded signal by transposition into the HF spectral region ("patching") and application of parameter-driven post-processing. However, if the spectral fine structure in a patch copied to some target region differs greatly from the spectral fine structure of the original content, annoying artifacts may result and degrade the perceptual quality of the decoded audio signal.

In BWE schemes, the HF spectral region above a given so-called crossover frequency is usually reconstructed by spectral patching. Typically, the HF region is composed of multiple adjacent patches, each of which is sourced from a band-pass (BP) region of the LF spectrum below the given crossover frequency. State-of-the-art systems perform the patching efficiently within a filterbank representation by copying a set of adjacent subband coefficients from a source region to the target region. In a next step, the spectral envelope is adjusted so that it closely resembles the envelope of the original HF signal, which has been measured in the encoder and transmitted in the bitstream as side information.
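The two steps described above (copying adjacent coefficients, then adjusting the envelope) can be illustrated with a minimal sketch. It assumes a plain list of real-valued spectral coefficients, a single contiguous source region, and one transmitted mean-energy value per target band; the function names and the energy definition are illustrative assumptions, not taken from any codec specification.

```python
import math

def band_energy(coeffs):
    """Mean energy of a band of spectral coefficients."""
    return sum(c * c for c in coeffs) / len(coeffs)

def patch_and_adjust(lf_spectrum, source_start, target_energies, band_width):
    """Copy adjacent LF coefficients upward band by band, then rescale each
    copied band so that its mean energy matches the transmitted envelope."""
    hf = []
    for i, target_energy in enumerate(target_energies):
        start = source_start + i * band_width
        patch = lf_spectrum[start:start + band_width]   # patching step
        e = band_energy(patch)
        # Envelope adjustment: gain that maps the patch energy to the target.
        gain = math.sqrt(target_energy / e) if e > 0.0 else 0.0
        hf.extend(gain * c for c in patch)
    return hf

hf = patch_and_adjust([1.0, -1.0, 2.0, -2.0], 0, [4.0, 1.0], 2)
print(hf)
```

Note that the gain depends only on band energies: the fine structure of the copied patch is carried over unchanged, which is where a mismatch with the original HF content can arise.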

However, a mismatch in spectral fine structure often exists that may lead to the perception of artifacts. A commonly known mismatch relates to tonality. If the original HF includes a tone with rather dominant energy content, and the patch to be copied to the spectral location of that tone has a noisy character, this band-pass noise can be scaled up to the point where it becomes audible as an annoying noise burst.

Spectral band replication (SBR) is a well-known BWE method employed in contemporary audio codecs [1]. In SBR, the problem of tonality mismatch is addressed by inserting artificial replacement sinusoids. However, this requires additional side information to be transmitted to the decoder, which increases the bit demand of the BWE data. Moreover, inserted tones can lead to instability over time if the tone insertion toggles on and off between successive blocks.

Intelligent gap filling (IGF) denotes a semi-parametric coding technique within modern codecs such as MPEG-H 3D Audio or the 3GPP EVS codec. IGF can be applied to fill spectral holes introduced by the quantization process in the encoder due to low-bitrate constraints. Typically, if the limited bit budget does not allow for transparent coding, spectral holes emerge first in the high-frequency (HF) region of the signal and, at the lowest bitrates, increasingly affect the entire upper spectral range. At the decoder side, such spectral holes are substituted via IGF using synthetic HF content generated in a semi-parametric fashion from low-frequency (LF) content, together with post-processing controlled by additional parametric side information.

Since IGF is fundamentally based on filling the high-frequency spectrum by copying spectral parts (so-called tiles) from lower frequencies and adjusting their energies by applying a gain factor, it can be problematic if the frequency ranges in the original signal that serve as source and target of the copy-up process differ in their spectral fine structure.

One such case with a strong perceptual impact is a difference in tonality. This tonality mismatch can occur in two different ways: either a frequency range with strong tonality is copied to a spectral region that is supposed to be noise-like in structure, or, the other way around, noise replaces a tonal component of the original signal. In IGF, the former case, which is the more common one because most audio signals become more noise-like toward higher frequencies, is handled by applying spectral whitening, where parameters transmitted to the decoder signal how much whitening, if any, is necessary. For the latter case, tonality can be corrected by using the full-band encoding capability of the core coder to preserve tonal lines in the HF band through waveform coding. These so-called "surviving lines" can be selected on the basis of strong tonality. Waveform coding is rather demanding in terms of bitrate, however, and in low-bitrate scenarios it is most likely not affordable. Moreover, toggling from frame to frame between coding and not coding a tonal component, which would cause annoying artifacts, has to be prevented.
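As an illustration of the two mismatch directions, the following sketch classifies a source/target pair by comparing a tonality measure of both bands. The use of spectral flatness as the tonality measure, the decision margin, and all names are assumptions made for this example; the text above does not prescribe a particular measure.

```python
import math

def spectral_flatness(coeffs, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.
    Close to 1.0 for noise-like bands, close to 0.0 for tonal bands."""
    powers = [c * c + eps for c in coeffs]
    log_mean = sum(math.log(p) for p in powers) / len(powers)
    return math.exp(log_mean) / (sum(powers) / len(powers))

def classify_mismatch(source_band, target_band, margin=0.3):
    """Return which of the two mismatch cases applies, if any.
    `margin` is a hypothetical decision threshold."""
    diff = spectral_flatness(source_band) - spectral_flatness(target_band)
    if diff > margin:
        return "noise_replaces_tone"   # noisy source, tonal target
    if diff < -margin:
        return "tone_replaces_noise"   # tonal source, noise-like target
    return "no_mismatch"

tonal = [10.0, 0.01, 0.01, 0.01]   # one dominant line
noisy = [1.0, -1.0, 1.0, -1.0]     # flat magnitude spectrum
print(classify_mismatch(noisy, tonal))
print(classify_mismatch(tonal, noisy))
```

In terms of the passage above, "tone_replaces_noise" corresponds to the case handled by spectral whitening, and "noise_replaces_tone" to the case that would otherwise require waveform-coded surviving lines.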

The intelligent gap filling technology is additionally disclosed and described in European patent application EP 2830054 A1. The IGF technology addresses the problems related to the separation of the bandwidth extension on the one hand and the core decoding on the other hand by performing the bandwidth extension in the same spectral domain in which the core decoder operates. Therefore, a full-rate core encoder/decoder is provided which encodes and decodes the full audio signal range. This requires neither a downsampler on the encoder side nor an upsampler on the decoder side. Instead, the whole processing is performed in the full sampling rate or full bandwidth domain. In order to obtain a high coding gain, the audio signal is analyzed to find a first set of first spectral portions that has to be encoded with a high resolution; in an embodiment, this first set of first spectral portions may include tonal portions of the audio signal. On the other hand, non-tonal or noisy components in the audio signal, which constitute a second set of second spectral portions, are parametrically encoded with a low spectral resolution. The encoded audio signal then only requires the first set of first spectral portions, encoded in a waveform-preserving manner with a high spectral resolution, and the second set of second spectral portions, encoded parametrically with a low resolution using frequency "tiles" sourced from the first set. On the decoder side, the core decoder, which is a full-band decoder, reconstructs the first set of first spectral portions in a waveform-preserving manner, that is, without any knowledge of any additional frequency regeneration. However, the spectrum generated in this way has many spectral gaps. These gaps are subsequently filled with the intelligent gap filling (IGF) technology by using a frequency regeneration that applies parametric data on the one hand and uses a source spectral range, i.e. the first spectral portions reconstructed by the full-rate audio decoder, on the other hand.

The IGF technology is also included and disclosed in 3rd Generation Partnership Project 3GPP TS 26.445 V13.2.0 (2016-06); Technical Specification Group Services and System Aspects; Codec for Enhanced Voice Services (EVS); Detailed Algorithmic Description (Release 13). In particular, reference is made to section 5.3.3.2.11 "Intelligent Gap Filling" with respect to the encoder side, and additionally to section 6, in particular section 6.2.2.3.8 "IGF Apply", and to other IGF-related passages such as section 6.2.2.2.9 "IGF Bitstream Reader" or section 6.2.2.3.11 "IGF Temporal Flattening" with respect to the decoder-side implementation.

EP 2301027 B1 discloses an apparatus and a method for generating bandwidth extension output data. In voiced speech signals, lowering the calculated noise floor yields a perceptually higher quality compared to the originally calculated noise floor; as a result, speech sounds less reverberant in this case. If the audio signal comprises sibilants, an artificial increase of the noise floor may cover up drawbacks of the patching method related to sibilants. Therefore, the reference discloses decreasing the noise floor for signals such as voiced speech and increasing the noise floor for signals comprising, for example, sibilants. To distinguish the different signals, embodiments use energy distribution data (e.g. a sibilance parameter) that measures whether the energy is mostly located at higher frequencies, or in other words, whether the spectral representation of the audio signal tilts more or less strongly toward higher frequencies. Further implementations also use the first LPC coefficient (LPC = linear predictive coding) to generate the sibilance parameter.

It is an object of the present invention to provide an improved concept for audio encoding or audio processing.

This object is achieved by an apparatus for encoding an audio signal according to claim 1, a method of encoding an audio signal according to claim 23, a system for processing an audio signal according to claim 24, a method of processing an audio signal according to claim 25, or a computer program according to claim 26.

An apparatus for encoding an audio signal comprises a core encoder for core encoding first audio data in a first spectral band, and a parametric coder for parametrically coding second audio data in a second spectral band that is different from the first spectral band. In particular, the parametric coder comprises an analyzer for analyzing the first audio data in the first spectral band to obtain a first analysis result and for analyzing the second audio data in the second spectral band to obtain a second analysis result. A compensator calculates a compensation value using the first analysis result and the second analysis result. Furthermore, a parameter calculator calculates a parameter from the second audio data in the second spectral band using the compensation value determined by the compensator.

Thus, the present invention is based on the finding that, in order to determine whether a reconstruction using a certain parameter on the decoder side satisfies a certain characteristic required by the audio signal, the first spectral band, which is typically the source band, is analyzed to obtain the first analysis result. Analogously, the second spectral band is additionally analyzed by the analyzer to obtain the second analysis result; the second spectral band is typically the target band, which is reconstructed on the decoder side using the first spectral band, i.e. the source band. Thus, a separate analysis result is calculated for the source band and for the target band.

Then, based on these two analysis results, a compensator calculates a compensation value for changing a certain parameter, which would have been obtained without any compensation, to a modified value. In other words, the present invention departs from the typical procedure in which a parameter for the second spectral band is calculated from the original audio signal and transmitted to the decoder so that the second spectral band is reconstructed using the calculated parameter. Instead, it results in a compensated parameter that is calculated from the target band on the one hand and from the compensation value, which depends on both the first and the second analysis results, on the other hand.

The compensated parameter can be calculated by first calculating the non-compensated parameter and then combining this non-compensated parameter with the compensation value, or the compensated parameter can be calculated in a single step, without the non-compensated parameter as an intermediate result. The compensated parameter can then be transmitted from the encoder to the decoder, and the decoder applies a certain spectral enhancement technology, such as spectral band replication or intelligent gap filling, or any other procedure that uses the compensated parameter value. Thus, strict adherence to a certain parameter calculation algorithm, irrespective of whether the parameter produces the desired spectral band enhancement result, is flexibly overcome by performing, in addition to the parameter calculation, a signal analysis in the source band and in the target band, and by subsequently calculating a compensation value based on the result from the source band and the result from the target band, i.e. from the first spectral band and the second spectral band, respectively.
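The two calculation routes described above (via the non-compensated parameter, or in a single step) can be sketched as follows. This is only an illustrative sketch under stated assumptions: spectral flatness stands in for the analysis, an RMS band level stands in for the parameter, and the mapping from the two analysis results to a compensation factor is hypothetical; none of these choices is fixed by the description.

```python
import math

def flatness(band, eps=1e-12):
    """Assumed analysis: spectral flatness in (0, 1], high for noisy bands."""
    powers = [c * c + eps for c in band]
    geo = math.exp(sum(math.log(p) for p in powers) / len(powers))
    return geo / (sum(powers) / len(powers))

def compensation_value(first_result, second_result):
    """Hypothetical compensator: a factor in (0.5, 1.0] that shrinks when
    the source band is noisier (flatter) than the target band."""
    excess = max(0.0, first_result - second_result)
    return 1.0 - 0.5 * excess

def uncompensated_parameter(target_band):
    """Example parameter: RMS level of the target band."""
    return math.sqrt(sum(c * c for c in target_band) / len(target_band))

def compensated_two_step(source_band, target_band):
    """Route 1: compute the non-compensated parameter, then combine."""
    comp = compensation_value(flatness(source_band), flatness(target_band))
    return uncompensated_parameter(target_band) * comp

def compensated_direct(source_band, target_band):
    """Route 2: fold the compensation into the parameter computation."""
    comp = compensation_value(flatness(source_band), flatness(target_band))
    scaled = sum((comp * c) ** 2 for c in target_band) / len(target_band)
    return math.sqrt(scaled)

source = [1.0, -1.0, 1.0, -1.0]   # noise-like source band
target = [4.0, 0.0, 0.0, 0.0]     # tonal target band
print(compensated_two_step(source, target), compensated_direct(source, target))
```

Both routes yield the same value up to rounding; only the compensated parameter would need to be transmitted.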

Preferably, the analyzer and/or the compensator apply a psychoacoustic model that determines a psychoacoustic mismatch. Hence, in an embodiment, the calculation of the compensation value is based on detecting a psychoacoustic mismatch of certain signal parameters, such as tonality, and a compensation strategy is applied to minimize the overall perceptual annoyance by modifying other signal parameters, such as spectral band gain factors. Thus, a perceptually well-balanced result is obtained by trading off different types of artifacts.

In contrast to prior-art approaches that "try to fix tonality at any cost", embodiments teach remedying artifacts more appropriately by damping the problematic parts of the spectrum in which a tonality mismatch is detected, thereby trading off a spectral energy envelope mismatch against a tonality mismatch.

Given several signal parameters as input, the compensation strategy, which contains a perceptual annoyance model, can select a strategy for obtaining the best perceptual fit rather than a mere signal parameter fit.

The strategy consists of weighing the perceptual significance of potential artifacts and choosing a parameter combination that minimizes the overall impairment.

This approach is mainly intended to be applied within a BWE based on a transform such as the MDCT. Nevertheless, the teachings of the invention are generally applicable, for example analogously within a system based on a quadrature mirror filterbank (QMF).

One possible scenario in which this technique can be applied is the detection and subsequent damping of noise bands in the context of intelligent gap filling (IGF).

Embodiments handle a possible tonality mismatch by detecting its occurrence and reducing its effect by attenuating the corresponding scaling factors. On the one hand, this may lead to a deviation from the spectral energy envelope of the original; on the other hand, it reduces HF noisiness, which contributes to an overall increase in perceptual quality.

Thus, embodiments improve the perceptual quality through a novel parametric compensation technique, typically steered by a perceptual annoyance model, in particular when there is, for example, a mismatch in spectral fine structure between the source or first spectral band and the target or second spectral band.

FIG. 1 illustrates an apparatus for encoding an audio signal 100 in an embodiment of the present invention. The apparatus comprises a core encoder 110 and a parametric coder 120. The core encoder 110 and the parametric coder 120 are connected, on their input side, to a spectral analyzer 130 and, on their output side, to an output interface 140. The output interface 140 generates an encoded audio signal 150. The output interface 140 receives, on the one hand, an encoded core signal 160 and, on the other hand, at input line 170, at least one parameter for the second spectral band and typically a full parametric representation comprising the parameter for the second spectral band. Furthermore, the spectral analyzer 130 separates the audio signal 100 into a first spectral band 180 and a second spectral band 190. In particular, the parametric coder 120 comprises an analyzer 121, illustrated in FIG. 1 as a signal analyzer, for analyzing the first audio data in the first spectral band 180 to obtain a first analysis result 122 and for analyzing the second audio data in the second spectral band 190 to obtain a second analysis result 123. Both the first analysis result 122 and the second analysis result 123 are provided to a compensator 124 for calculating a compensation value 125. Thus, the compensator 124 is configured to use the first analysis result 122 and the second analysis result 123 for calculating the compensation value. Then, both the compensation value 125 and at least the second audio data from the second spectral band 190 (the first audio data from the first spectral band may be used as well) are provided to a parameter calculator 126 for calculating a parameter 170 from the second audio data in the second spectral band using the compensation value 125.

The spectral analyzer 130 in FIG. 1 can be, for example, a straightforward time/frequency converter for obtaining individual spectral bands or MDCT lines. In this implementation, the spectral analyzer 130 thus implements a modified discrete cosine transform (MDCT) to obtain spectral data. This spectral data is then further analyzed in order to separate the data for the core encoder 110 on the one hand from the data for the parametric coder 120 on the other hand. The data for the core encoder 110 comprises at least the first spectral band. Furthermore, when the core encoder is to encode more than one source band, the core data may additionally comprise further source data.

Thus, in the case of spectral band replication technologies, the core encoder receives the whole bandwidth below a crossover frequency as the input data to be core encoded, while the parametric coder receives all audio data above this crossover frequency.

In the case of an intelligent gap filling framework, however, the core encoder 110 may additionally receive spectral lines above an IGF start frequency, which are also analyzed by the spectral analyzer 130, so that the spectral analyzer 130 additionally determines data even above the IGF start frequency, where this data above the IGF start frequency is additionally encoded by the core encoder. To this end, the spectral analyzer 130 may also be implemented as a "tonal mask", which is, for example, also discussed in section 5.3.3.2.11.5 "IGF Tonal Mask" as disclosed in 3GPP TS 26.445 V13.0.0(12). Thus, in order to determine which spectral components should be transmitted with the core encoder, the tonal mask is calculated by the spectral analyzer 130. All significant spectral content is thereby identified, whereas content that is well suited for parametric coding through IGF is quantized to zero by the tonal mask. The spectral analyzer 130 nevertheless forwards the spectral content that is well suited for parametric coding to the parametric coder 120, and this data may, for example, be the data that has been set to zero by the tonal mask processing.
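The split of the spectrum between the core encoder and the parametric coder can be pictured with a deliberately simplified sketch. It is not the tonal mask algorithm of 3GPP TS 26.445; the fixed magnitude threshold, the function name, and the list-based representation are assumptions chosen only to show which lines go where.

```python
def split_spectrum(spectrum, igf_start, peak_threshold):
    """Split spectral lines into core-coded and parametrically coded data.

    Lines below `igf_start` always go to the core encoder. Above it, only
    lines whose magnitude reaches `peak_threshold` (a stand-in for a real
    tonal mask decision) stay with the core encoder; all other lines are
    zeroed for the core encoder and handed to the parametric coder instead.
    """
    core, parametric = [], []
    for k, x in enumerate(spectrum):
        keep_in_core = k < igf_start or abs(x) >= peak_threshold
        core.append(x if keep_in_core else 0.0)
        parametric.append(0.0 if keep_in_core else x)
    return core, parametric

core, parametric = split_spectrum([3.0, 1.0, 0.2, 5.0, 0.1], 2, 1.0)
print(core)        # lines seen by the core encoder
print(parametric)  # lines zeroed for the core, seen by the parametric coder
```

With `igf_start` set to the crossover index and an infinite threshold, the same function reproduces the plain spectral band replication split, in which nothing above the crossover frequency is core encoded.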

In an embodiment illustrated in FIG. 2, the parametric coder 120 is additionally configured for parametrically coding third audio data in a third spectral band to obtain a further parameter 200 for this third spectral band. In this case, the analyzer 121 is configured for analyzing the third audio data in the third spectral band 202 to obtain a third analysis result 204 in addition to the first analysis result 122 and the second analysis result 123.

Furthermore, the parametric coder 120 of Fig. 1 additionally comprises a compensation detector 210 for detecting, using at least the third analysis result 204, whether the third spectral band is to be compensated. The result of this detection is output on a control line 212, which indicates whether or not a compensation situation exists for the third spectral band. When the compensation detector detects, as signaled on control line 212, that the third spectral band is not to be compensated, the parameter calculator 126 calculates the further parameter 200 for the third spectral band without any compensation value. If, however, the compensation detector detects that the third spectral band is to be compensated, the parameter calculator calculates the further parameter 200 for the third spectral band using an additional compensation value calculated by the compensator 124 from the third analysis result 204.

In a preferred embodiment in which a quantitative compensation is applied, the analyzer 121 is configured to calculate a first quantitative value 122 as the first analysis result and a second quantitative value 123 as the second analysis result. The compensator 124 is then configured to calculate a quantitative compensation value 125 from the first quantitative value and the second quantitative value. Finally, the parameter calculator is configured to calculate the quantitative parameter using the quantitative compensation value.

However, the invention is also applicable when only qualitative analysis results are obtained. In that case a qualitative compensation value is calculated, which controls the parameter calculator to lower or raise a certain non-compensated parameter by a certain degree. Both analysis results together may thus result in a certain increase or decrease of a parameter, where the amount of the increase or decrease is fixed and therefore does not depend on any quantitative result. Quantitative results are preferred over a fixed increment or decrement, although the latter is less computationally intensive.

Preferably, the analyzer 121 analyzes a first characteristic of the first audio data to obtain the first analysis result, and additionally analyzes the same first characteristic of the second audio data in the second spectral band to obtain the second analysis result. In contrast, the parameter calculator is configured to calculate the parameter from the second audio data in the second spectral band by evaluating a second characteristic, the second characteristic being different from the first characteristic.

By way of example, Fig. 2 illustrates the situation in which the first characteristic is a spectral fine structure or an energy distribution within a certain band, such as the first, the second or any other band. In contrast, the second characteristic applied or determined by the parameter calculator is a spectral envelope measure, an energy measure or a power measure, or generally an amplitude-related measure that gives an absolute or relative measure of the power or energy in a band, such as a gain factor. However, other parameters that measure a characteristic different from a gain factor characteristic can be calculated by the parameter calculator as well. Furthermore, the analyzer 121 may apply and analyze other characteristics for the individual source band on the one hand and the destination band on the other hand, i.e. the first spectral band and the second spectral band, respectively.

Furthermore, the analyzer 121 is configured to calculate the first analysis result 122 without using the second audio data in the second spectral band 190, and to calculate the second analysis result 123 without using the first audio data in the first spectral band 180. In this embodiment, the first spectral band and the second spectral band are mutually exclusive, i.e. they do not overlap each other.

Furthermore, the spectral analyzer 130 is additionally configured to build frames of the audio signal, or to window an incoming stream of audio samples to obtain frames of audio samples, where the audio samples in neighboring frames overlap each other. With a 50% overlap, for example, the second half of an earlier frame contains audio samples derived from the same original audio samples as the first half of the subsequent frame, the audio samples within a frame being derived from the original audio samples by windowing.

In this case, when the audio signal comprises a time sequence of frames, as provided for example by block 130 of Fig. 1, which additionally has a frame builder function, the compensator 124 is configured to calculate a current compensation value for a current frame using a previous compensation value for a previous frame. This generally results in a kind of smoothing operation.
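One simple way to let the previous frame influence the current compensation value is a first-order recursive average. The sketch below is only an assumption about what such a smoothing could look like: the function name `smooth_compensation` and the weight of 0.5 are hypothetical, and the text does not state which smoothing rule is actually used.

```python
def smooth_compensation(current_raw, previous, weight=0.5):
    """Blend the raw compensation value of the current frame with the
    compensation value of the previous frame (hypothetical weight)."""
    return weight * previous + (1.0 - weight) * current_raw


# Run the smoother over a short sequence of raw per-frame values.
raw_values = [1.0, 0.4, 1.0]
smoothed = []
previous = 1.0  # assumed neutral start value (no compensation)
for raw in raw_values:
    previous = smooth_compensation(raw, previous)
    smoothed.append(previous)
```

With this rule, the isolated drop to 0.4 in the second frame is only partly followed, which avoids an abrupt change from frame to frame.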

As outlined later on, the compensation detector 210 illustrated in Fig. 2 may additionally, or as an alternative to the other features in Fig. 2, comprise a power spectrum input and a transient input, illustrated at 221 and 223, respectively.

In particular, the compensation detector 210 is configured to instruct the parameter calculator 126 to use a compensation only when a power spectrum of the original audio signal 100 of Fig. 1 is available. Whether the power spectrum is available is signaled by a certain data element or flag.

Furthermore, the compensation detector 210 is configured to allow a compensation operation via control line 212 only when a transient information line 223 signals that no transient is present for the current frame. Thus, when line 223 signals that a transient is present, the whole compensation operation is disabled irrespective of any analysis results. This naturally applies to the third spectral band when a compensation has been signaled for the second spectral band. It also applies, however, to the second spectral band in a certain frame when a situation such as a transient is detected for that frame. Hence, for a certain time frame, it can and will happen that no parameter compensation takes place at all.
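The two frame-level conditions on inputs 221 and 223 amount to a simple gate in front of any band-wise decision. The following sketch uses hypothetical names (`compensation_allowed`, `use_compensation`) and treats the band-wise detection result as a plain boolean.

```python
def compensation_allowed(power_spectrum_available, is_transient):
    """Frame-level gate: compensation is only permitted when the power
    spectrum is available and no transient is signaled for the frame."""
    return power_spectrum_available and not is_transient


def use_compensation(band_detected, power_spectrum_available, is_transient):
    """Combine the band-wise detection result with the frame-level gate."""
    return band_detected and compensation_allowed(
        power_spectrum_available, is_transient
    )
```

A transient therefore disables compensation for every band of the frame, whatever the analysis results say.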

Fig. 3a illustrates a representation of a spectrum of amplitudes A(f) or squared amplitudes A²(f). In particular, an XOVER or IGF start frequency is illustrated.

Furthermore, a set of overlapping source bands is illustrated, comprising the first spectral band 180, a further source band 302 and an even further source band 303. In addition, the destination bands above the IGF or XOVER frequency are, for example, the second spectral band 190, a further destination band 305, an even further destination band 307 and the third spectral band 202.

Typically, mapping functions within the IGF or bandwidth extension framework define a mapping between the individual source bands 180, 302, 303 and the individual destination bands 305, 190, 307, 202. This mapping may be fixed, as is the case in 3GPP TS 26.445, or may be determined adaptively by a certain IGF encoder algorithm. In any case, the upper portion of Fig. 3a illustrates the spectrum, and the table in its lower portion illustrates the mapping between destination bands and source bands for the case of non-overlapping destination bands and overlapping source bands, irrespective of whether this mapping is fixed or has been adaptively determined for a certain frame.

Fig. 4 illustrates a more detailed implementation of the compensator 124. In this implementation, the compensator 124 receives the first analysis result 122 for the first spectral band, which can be a spectral flatness measure, a crest factor, a spectral tilt value or any other kind of parametric data, and additionally receives the analysis result 123 for the second spectral band. This analysis result can again be a spectral flatness measure for the second spectral band, a crest factor for the second spectral band, or a tilt value, i.e. a spectral tilt value restricted to the second spectral band, just as the tilt value for the first spectral band is restricted to the first spectral band. In addition, the compensator 124 receives spectral information on the second spectral band, such as a stop line of the second spectral band. In the case in which the parameter calculator 126 of Fig. 2 is configured to parametrically code third audio data in the third spectral band 202, the third spectral band comprises higher frequencies than the second spectral band. This is also illustrated in the example of Fig. 3a, where the third spectral band 202 lies at higher frequencies than the second spectral band 190. In this case, the compensator 124 is configured to use a weighting value in calculating the compensation value for the third spectral band, this weighting value being different from the weighting value used for calculating the compensation value for the second spectral band. In general, therefore, the compensator 124 influences the calculation of the compensation value 125 so that, for otherwise identical input values, the compensation value is smaller for higher frequencies.

The weighting value can be, for example, an exponent such as the exponent α, described later on, that is used in calculating the compensation value from the first and second analysis results. It can also be, for example, a multiplicative value or even a value to be added or subtracted, so that a different influence is obtained for higher frequencies than when the parameter is calculated for lower frequencies.

In addition, as illustrated in Fig. 4, the compensator receives a tonal-to-noise ratio for the second spectral band in order to calculate the compensation value depending on the tonal-to-noise ratio of the second audio data in the second spectral band. Thus, a first compensation value is obtained for a first tonal-to-noise ratio and a second compensation value is obtained for a second tonal-to-noise ratio, the first compensation value being greater than the second compensation value when the first tonal-to-noise ratio is greater than the second tonal-to-noise ratio.

As stated above, the compensator 124 is generally configured to determine the compensation value by applying a psychoacoustic model, where the psychoacoustic model is configured to evaluate the psychoacoustic mismatch between the first audio data and the second audio data using the first analysis result and the second analysis result in order to obtain the compensation value. This psychoacoustic model can be implemented as a feed-forward calculation, as discussed below in the context of the SFM calculations, or alternatively as a feedback calculation module applying a kind of analysis-by-synthesis procedure. Furthermore, the psychoacoustic model may also be implemented as a neural network or a similar structure that is automatically trained with certain training data to decide in which cases a compensation is needed and in which cases it is not.

Subsequently, the functionality of the compensation detector 210 illustrated in Fig. 2 or, generally, of a detector included in the parametric coder 120 is described.

For example, as illustrated at 600 and 602 in Fig. 6, the compensation detector is configured to detect a compensation situation when a difference between the first analysis result and the second analysis result has a predetermined characteristic. Block 600 calculates the difference between the first and second analysis results, and block 602 then determines whether the difference has a predetermined characteristic or a predetermined value. If the predetermined characteristic is not present, block 602 determines that no compensation is to be performed, as indicated at 603. If the predetermined characteristic is present, control proceeds via line 604. Alternatively or additionally, the detector is configured to determine whether the second analysis result has a certain predetermined value or a certain predetermined characteristic. If it does not, line 605 signals that no compensation is to be performed; if it does, control proceeds via line 606. In some embodiments, lines 604 and 606 may be sufficient to decide whether a compensation takes place. In the embodiment illustrated in Fig. 6, however, a further decision is made based on the spectral tilt of the second audio data for the second spectral band 190 of Fig. 1, as described later on.

In an embodiment, the analyzer is configured to calculate, as the first analysis result, a spectral flatness measure, a crest factor, or a quotient of the spectral flatness measure and the crest factor for the first spectral band, and to calculate, as the second analysis result, a spectral flatness measure, a crest factor, or a quotient of the spectral flatness measure and the crest factor for the second audio data.

In such an embodiment, the parameter calculator 126 is additionally configured to calculate spectral envelope information or a gain factor from the second audio data.

Furthermore, in such an embodiment, the compensator 124 is configured to calculate the compensation value 125 so that a first compensation value is obtained for a first difference between the first analysis result and the second analysis result, and a second compensation value is obtained for a second difference between the first analysis result and the second analysis result, the first compensation value being greater than the second compensation value when the first difference is greater than the second difference.

The description of Fig. 6 is now continued with the optional additional determinations of whether a compensation situation is to be detected.

In block 608, a spectral tilt is calculated from the second audio data. When this spectral tilt is determined to be below a threshold, as indicated at 610, a compensation situation is positively affirmed, as indicated at 612. When the spectral tilt is determined not to be below the predetermined threshold but above it, this situation is signaled on line 614. In block 616, it is determined whether a tonal component is close to a border of the second spectral band 190. When a tonal component is determined to be close to the border, as indicated by item 618, a compensation situation is again positively affirmed. When no tonal component is found close to a border, any compensation is canceled, i.e. switched off, as indicated by line 620. In one embodiment, the determination in block 616 of whether a tonal component is close to a border is made by performing a shifted SFM calculation. When the slope determined by block 608 falls steeply, the frequency region for which the SFM is calculated is shifted down by half the width of the corresponding scale factor band (SFB), i.e. of the second spectral band. For a steeply rising tilt, the frequency region for which the SFM is calculated is shifted up by half the width of the second spectral band. In this way, tonal components that should be damped can still be detected correctly because of a low SFM, while no damping is applied for higher SFM values.
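The decision flow of Fig. 6 and the shifted SFM region can be sketched as below. This is an illustration under assumptions: all thresholds are hypothetical placeholders, the two initial checks are combined with a logical AND although the text allows them alternatively or additionally, and the "predetermined characteristic" is modeled as a simple threshold comparison.

```python
def detect_compensation(first_result, second_result, tilt, tonal_near_border,
                        min_difference=0.1, max_second=0.5, tilt_threshold=0.0):
    """Return True when a compensation situation is detected (Fig. 6 flow)."""
    # Blocks 600/602: the difference must show the predetermined characteristic.
    if first_result - second_result < min_difference:
        return False                      # 603: no compensation
    # Check on the second analysis result alone.
    if second_result > max_second:
        return False                      # 605: no compensation
    # Blocks 608/610: a tilt below the threshold affirms compensation (612).
    if tilt < tilt_threshold:
        return True
    # Block 616: otherwise a tonal component near the band border decides
    # (618: compensate, 620: compensation switched off).
    return tonal_near_border


def shifted_sfm_region(start, stop, tilt, strong_tilt=1.0):
    """Shift the SFM analysis region by half the band width for steep tilts."""
    half = (stop - start) // 2
    if tilt <= -strong_tilt:              # steeply falling: shift down
        return start - half, stop - half
    if tilt >= strong_tilt:               # steeply rising: shift up
        return start + half, stop + half
    return start, stop
```

For a band covering lines 100 to 120, a steeply falling tilt moves the analysis region to lines 90 to 110, so a tonal component just below the lower border still influences the SFM.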

Fig. 5 is now discussed in more detail. In particular, the parameter calculator 126 may comprise a calculator 501 for calculating the non-compensated parameter 502 from the audio data for the second spectral band, i.e. the destination band, and a combiner 503 for combining the non-compensated parameter 502 with the compensation value 125. This combination may, for example, be a multiplication when the non-compensated parameter 502 is a gain value and the compensation value 125 is a quantitative compensation value. Alternatively, the combination performed by the combiner 503 may be an operation that uses the compensation value as an exponent, or an additive modification in which the compensation value is used as an additive or subtractive value.
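The three combination variants named for the combiner 503 can be written out as a small dispatch function. The function name `combine` and the mode strings are assumptions chosen for this sketch.

```python
def combine(non_compensated, compensation, mode="multiply"):
    """Combine a non-compensated parameter with a compensation value."""
    if mode == "multiply":    # e.g. a gain value scaled by a quantitative factor
        return non_compensated * compensation
    if mode == "add":         # compensation used as additive or subtractive value
        return non_compensated + compensation
    if mode == "exponent":    # compensation used as an exponent
        return non_compensated ** compensation
    raise ValueError(f"unknown combination mode: {mode}")
```

For example, `combine(2.0, 0.5)` halves a gain of 2.0, while `combine(2.0, -0.5, "add")` lowers it by a fixed amount.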

Furthermore, it is to be noted that the embodiment illustrated in Fig. 5, in which the non-compensated parameter is calculated first and then combined with the compensation value, is only one embodiment. In alternative embodiments, the compensation value can be introduced directly into the calculation of the compensated parameter, so that no intermediate result with an explicit non-compensated parameter occurs. Instead, only a single operation is performed, in which the compensated parameter is calculated using the compensation value and a calculation algorithm that would yield the non-compensated parameter if the compensation value 125 were not introduced into the calculation.

Fig. 7 illustrates a procedure to be used by the calculator 501 for calculating the non-compensated parameter. The representation "IGF scale factor calculation" in Fig. 7 roughly corresponds to Section 5.3.3.2.11.4 of 3GPP TS 26.445 V13.3.3 (2015/12). When a "complex" TCX power spectrum P (a spectrum in which the real and imaginary parts of the spectral lines are evaluated) is available, the calculator 501 for calculating the non-compensated parameter of Fig. 5 calculates an amplitude-related measure for the second spectral band from the power spectrum P, as indicated at 700. Furthermore, the calculator 501 calculates an amplitude-related measure for the first spectral band from the complex power spectrum P, as indicated at 702. In addition, the calculator 501 calculates an amplitude-related measure from the real part of the first spectral band, i.e. the source band, as indicated at 704. The three amplitude-related measures E_cplx,target, E_cplx,source and E_real,source thus obtained are input into a further gain factor calculation function 706 to finally obtain a gain factor, which is a function of the quotient of E_real,source and E_cplx,source multiplied by E_cplx,target.

Alternatively, when the complex TCX power spectrum is not available, the amplitude-related measure is calculated only from the real-valued second spectral band, as illustrated at the bottom of Fig. 7.
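The two branches of Fig. 7 can be sketched as follows. The text only states that the gain factor is a function of (E_real,source / E_cplx,source) multiplied by E_cplx,target; taking the square root of the per-line mean energy is an assumption made here for illustration and is not claimed to match 3GPP TS 26.445. The function name `gain_factor` and the band tuples are likewise hypothetical.

```python
import math


def gain_factor(power_spectrum, real_spectrum, source_band, target_band):
    """Non-compensated gain for a target band.

    Bands are (start, stop) line indices with stop exclusive.
    power_spectrum may be None when the complex power spectrum is unavailable.
    """
    (s_start, s_stop), (t_start, t_stop) = source_band, target_band
    if power_spectrum is not None:
        e_cplx_target = sum(power_spectrum[t_start:t_stop])              # 700
        e_cplx_source = sum(power_spectrum[s_start:s_stop])              # 702
        e_real_source = sum(r * r for r in real_spectrum[s_start:s_stop])  # 704
        if e_cplx_source > 0.0:
            energy = e_real_source / e_cplx_source * e_cplx_target
        else:
            energy = 0.0
    else:
        # Fallback: energy of the real-valued target band only.
        energy = sum(r * r for r in real_spectrum[t_start:t_stop])
    # Assumed mapping from energy to gain: root of the mean energy per line.
    return math.sqrt(energy / (t_stop - t_start))


real = [1.0, 1.0, 2.0, 2.0]
power = [1.0, 1.0, 4.0, 4.0]   # imaginary part assumed zero in this toy input
gain_complex = gain_factor(power, real, (0, 2), (2, 4))
gain_real_only = gain_factor(None, real, (0, 2), (2, 4))
```

With a zero imaginary part both branches see the same energies, so the two calls agree in this toy case; with real signals they generally differ.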

Furthermore, it is to be noted that the TCX power spectrum P is calculated, for example, as illustrated in subclause 5.3.3.2.11.1.2 based on the following equation: P(sb) = R²(sb) + I²(sb), sb = 0, 1, 2, …, n−1.

Here, n is the actual TCX window length, R is the vector containing the real-valued (cosine-transformed) part of the current TCX spectrum, and I is the vector containing the imaginary (sine-transformed) part of the current TCX spectrum. The term "TCX" stems from 3GPP terminology but generally refers to the spectral values in the first or second spectral band as provided by the spectral analyzer 130 to the core encoder 110 or the parametric coder 120 of Fig. 1.
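The power spectrum equation translates directly into code. The function name `tcx_power_spectrum` and the plain-list representation of R and I are assumptions made for this sketch.

```python
def tcx_power_spectrum(real_part, imag_part):
    """Compute P(sb) = R(sb)^2 + I(sb)^2 for sb = 0, ..., n-1."""
    if len(real_part) != len(imag_part):
        raise ValueError("R and I must have the same length n")
    return [r * r + i * i for r, i in zip(real_part, imag_part)]


P = tcx_power_spectrum([3.0, 0.0, -1.0], [4.0, 1.0, 1.0])
```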

Fig. 8a illustrates a preferred embodiment in which the signal analyzer 121 further comprises a core decoder 800 for calculating an encoded and again decoded first spectral band and, naturally, the audio data in the encoded/decoded first spectral band.

The core decoder 800 then feeds the encoded/decoded first spectral band into an analysis result calculator 801 included in the signal analyzer 121 to calculate the first analysis result 122. Furthermore, the signal analyzer 121 of Fig. 1 comprises a second analysis result calculator 802 for calculating the second analysis result 123. The signal analyzer 121 is thus configured so that the actual first analysis result 122 is calculated using the encoded and again decoded first spectral band, while the second analysis result is calculated from the original second spectral band. The situation on the decoder side is therefore simulated more accurately on the encoder side, since the input of the analysis result calculator 801 already contains all quantization errors included in the decoded first audio data for the first spectral band as available at the decoder.

Fig. 8b illustrates a preferred further implementation of the signal analyzer which, as an alternative or in addition to the Fig. 8a procedure, has a patch simulator 804. The patch simulator 804 specifically takes into account a feature of the IGF encoder, namely that there can be lines, or at least one line, within the second destination band that are actually encoded by the core encoder.

In particular, this situation is illustrated in Fig. 3b.

Similar to Fig. 3a, Fig. 3b illustrates, in its upper portion, the first spectral band 180 and the second spectral band 190. In addition to what has been discussed for Fig. 3a, however, the second spectral band comprises specific lines 351, 352 that have been determined by the spectral analyzer 130 as lines to be encoded by the core encoder 110 in addition to the first spectral band 180.

This specific coding of certain lines above the IGF start frequency 310 reflects the fact that the core encoder 110 is a full-band encoder with a Nyquist frequency of up to f_max 354, which lies above the IGF start frequency. This is in contrast to SBR-related implementations, where the crossover frequency is also the maximum frequency and hence the Nyquist frequency of the core encoder 110.

The patch simulator 804 receives either the first spectral band 180 or the decoded first spectral band from the core decoder 800 and additionally receives information, from the spectral analyzer 130 or the core encoder 110, that there are lines in the second spectral band that are actually included in the core encoder output signal. This is signaled by the spectral analyzer 130 via a line 806 or by the core encoder via a line 808. The patch simulator 804 now simulates the first audio data for the first spectral band by using the straightforward first audio data for the first spectral band and by inserting the lines 351, 352 from the second spectral band into the first spectral band, i.e. by shifting these lines to the first spectral band. Lines 351' and 352' thus represent the spectral lines obtained by shifting lines 351, 352 of Fig. 3b from the second spectral band to the first spectral band. Preferably, the lines are placed so that their positions relative to the band borders are identical in both bands, i.e. the frequency difference between a line and the band border is the same in the second spectral band 190 and in the first spectral band 180.

The patch simulator thus outputs the simulated data 808 illustrated in Fig. 3c, which contains the straightforward first spectral band data and additionally the lines shifted from the second spectral band to the first spectral band. The analysis result calculator 801 now calculates the first analysis result 122 using the specific data 808, while the analysis result calculator 802 calculates the second analysis result 123 from the original second audio data in the second spectral band, i.e. the original audio data including the lines 351, 352 illustrated in Fig. 3b.
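The patch simulation can be sketched as copying the source band and then placing each core-coded line of the destination band at the same offset from the band start. The function name `simulate_patch` and its arguments are hypothetical, and overwriting the source line at that offset is an assumption about what "inserting" means here.

```python
def simulate_patch(spectrum, source_start, target_start, width, core_coded_lines):
    """Return simulated source-band data for the analysis (cf. Fig. 3c).

    core_coded_lines holds absolute indices of lines inside the target band
    that the core encoder codes itself (lines 351, 352 in Fig. 3b).
    """
    simulated = list(spectrum[source_start:source_start + width])
    for line in core_coded_lines:
        offset = line - target_start      # same distance from the band border
        if 0 <= offset < width:
            simulated[offset] = spectrum[line]
    return simulated


spectrum = [1.0, 2.0, 3.0, 4.0, 0.5, 9.0, 0.5, 0.5]
simulated = simulate_patch(spectrum, 0, 4, 4, [5])
```

Here the core-coded line at index 5 sits one line above the target band start, so it replaces the value one line above the source band start in the simulated data.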

This procedure using the patch simulator 804 has the advantage that no conditions, such as high tonality or any other condition, need to be imposed on the additional lines 351, 352. Instead, it is entirely up to the spectral analyzer 130 or the core encoder 110 to decide whether certain lines in the second spectral band are to be encoded by the core encoder. The result of this decision is nevertheless taken into account automatically by using these lines as an additional input for calculating the first analysis result 122, as illustrated in Fig. 8b.

Subsequently, the effect of a tonality mismatch within an Intelligent Gap Filling framework is illustrated.

In order to detect noise band artifacts, the difference in tonality between the source and the target scale factor band (SFB) has to be determined. The spectral flatness measure (SFM) can be used for the tonality calculation. If a tonality mismatch is found, in which the source band is much noisier than the target band, a certain amount of damping should be applied. Fig. 9 illustrates this situation without the inventive processing applied.

To avoid abrupt on/off behavior of the tool, it is also reasonable to apply some smoothing to the damping factor. The steps necessary to apply the damping in the right places are described in detail below. (Note that damping is only applied if the TCX power spectrum P is available and the frame is non-transient, i.e. the flag isTransient is inactive.)

Tonality mismatch detection: parameters

In a first step, those SFBs have to be identified in which a tonality mismatch might cause noise band artifacts. To do so, the tonality in each SFB of the IGF range and in the corresponding band used for copy-up has to be determined. A suitable measure of tonality is the spectral flatness measure (SFM), which divides the geometric mean of a spectrum by its arithmetic mean and ranges between 0 and 1. A value close to 0 indicates strong tonality, whereas a value approaching 1 indicates a very noisy spectrum. The formula is given as [equation omitted], where P is the TCX power spectrum, b is the start line and e the stop line of the current SFB, and p is defined as [equation omitted].
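As a rough illustration of the flatness measure described above, the following Python sketch computes the ratio of geometric to arithmetic mean over one band. It is not the codec's own formula, which is not reproduced in this text; the function name `spectral_flatness`, the half-open line range `[b, e)` and the small guard constant `eps` are assumptions made for the example.

```python
import math

def spectral_flatness(power, b, e, eps=1e-12):
    """Geometric mean divided by arithmetic mean of power[b:e].

    Hypothetical helper: the half-open range [b, e) and the eps guard
    against log(0) are assumptions, not taken from the codec text.
    """
    band = power[b:e]
    if not band:
        raise ValueError("empty band")
    # Geometric mean computed in the log domain for numerical stability.
    log_mean = sum(math.log(p + eps) for p in band) / len(band)
    arithmetic_mean = sum(band) / len(band) + eps
    return math.exp(log_mean) / arithmetic_mean

noisy = [1.0] * 8            # flat spectrum, noise-like
tonal = [1e-6] * 7 + [1.0]   # one dominant spectral line
print(spectral_flatness(noisy, 0, 8))
print(spectral_flatness(tonal, 0, 8))
```

A flat band gives a value close to 1 and a band dominated by a single line gives a value close to 0, which matches the interpretation stated in the text.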

In addition to the SFM, the crest factor is calculated. It likewise indicates how the energy is distributed within a spectrum, by dividing the maximum energy by the mean energy of all frequency bins in the spectrum. Dividing the SFM by the crest factor yields a tonality measure of an SFB for the current frame. The crest factor is calculated as [equation omitted], where P is the TCX power spectrum, b is the start line and e the stop line of the current SFB, and the remaining term is defined as [equation omitted].
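The crest factor and the combined tonality measure can be sketched in the same spirit. This is a minimal sketch only: the names `crest_factor` and `tonality_measure`, the half-open range `[b, e)` and the handling of an all-zero band are assumptions, and the codec's exact formula is not reproduced here.

```python
def crest_factor(power, b, e):
    """Maximum energy divided by mean energy of power[b:e] (hypothetical helper)."""
    band = power[b:e]
    mean_energy = sum(band) / len(band)
    if mean_energy <= 0.0:
        return 1.0  # assumed convention for an all-zero band
    return max(band) / mean_energy

def tonality_measure(sfm, crest):
    """SFM divided by crest factor; smaller values indicate a more tonal band."""
    return sfm / crest

print(crest_factor([1.0, 1.0, 1.0, 1.0], 0, 4))   # flat band
print(crest_factor([0.0, 0.0, 0.0, 4.0], 0, 4))   # single peak
print(tonality_measure(0.5, 2.0))
```

Because a tonal band has both a low SFM and a high crest factor, the division pushes tonal bands further towards 0 than the SFM alone would.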

However, it is sensible to also use the results of previous frames in order to obtain a smoothed tonality estimate. The tonality estimate is therefore calculated with the following formula: [equation omitted], where sfm denotes the result of the actual spectral flatness calculation, while the variable SFM includes the division by the crest factor as well as the smoothing.

Now the difference in tonality between source and destination is calculated: SFM_diff = SFM_src − SFM_dest

For positive values of this difference, the condition that something noisier than the target spectrum is used for copy-up is fulfilled. Such an SFB becomes a likely candidate for damping.

However, a low SFM value does not necessarily indicate strong tonality; it can also be caused by a sharp drop or slope of the energy within an SFB. This applies in particular to items that are band-limited somewhere in the middle of an SFB. It can lead to unwanted damping, creating the impression of a slightly low-pass filtered signal.

To avoid damping in such cases, the SFBs that may be affected are determined by calculating the spectral tilt of the energy in all bands with a positive SFM_diff, where a strong tilt in one direction may indicate a sudden drop that causes a low SFM value. The spectral tilt is calculated as a linear regression through all spectral bins in the SFB, with the slope of the regression line given by the following formula: [equation omitted], where x is the bin number, P is the TCX power spectrum, b is the start line and e the stop line of the current SFB.
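The regression slope can be illustrated with an ordinary least-squares fit over the bins of one band. The sketch below assumes the regression is run directly on the power values and uses a half-open range `[b, e)`; both choices, as well as the name `spectral_tilt`, are assumptions and may differ from the omitted formula.

```python
def spectral_tilt(power, b, e):
    """Least-squares slope of power[x] over bin numbers x in [b, e).

    Hypothetical helper: a textbook regression slope, not the codec's
    exact SLOPE function.
    """
    xs = list(range(b, e))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_p = sum(power[x] for x in xs) / n
    numerator = sum((x - mean_x) * (power[x] - mean_p) for x in xs)
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator if denominator else 0.0

ramp = [2.0 * x + 1.0 for x in range(10)]   # energy rising by 2 per bin
step = [1.0] * 4 + [0.0] * 4                # band limit in the middle of the band
print(spectral_tilt(ramp, 2, 8))
print(spectral_tilt(step, 0, 8))
```

The band-limited example produces a clearly negative slope, which is the situation the text wants to separate from a genuinely tonal band.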

However, a tonal component close to a border of an SFB can also cause a steep tilt, yet it should still be subjected to damping. To separate these two cases, a further, shifted SFM calculation should be performed for bands with a steep tilt. The threshold for the slope value is defined as [equation omitted], with the division by the SFB width serving as normalization.

If there is a strong decline in slope, < −thresh_tilt, the frequency region for which the SFM is calculated is shifted down by half the width of the SFB; for a strong incline, > thresh_tilt, it is shifted up. In this way, tonal components that should be damped are still detected correctly because of a low SFM, while for higher SFM values damping is not applied. The threshold is defined here as the value 0.04, and damping is applied only if the shifted SFM falls below this threshold.

Perceptual annoyance model

Damping should not be applied for any positive SFM_diff, but only makes sense if the target SFB is indeed very tonal. If, in a particular SFB, the original signal is superimposed by a noisy background signal, the perceptual difference to an even noisier band will be small, and the dullness due to the loss of energy caused by damping may outweigh the benefits.

To keep the application within reasonable limits, damping should be used only if the target SFB is indeed very tonal. Damping should therefore be applied only if both of the following conditions hold: [equations omitted].

Another aspect to consider is the background of the tonal components in the IGF spectrum. The perceptual degradation caused by noise band artifacts is likely to be most apparent whenever there is little or no noise-like background surrounding the original tonal component. In this case, when comparing the original with the IGF-created HF spectrum, an introduced noise band will be perceived as something completely new and will therefore stand out very prominently. If, on the other hand, a considerable amount of background noise is already present, the additional noise blends into the background, resulting in a less jarring perceptual difference. The amount of damping applied should therefore also depend on the tonal-to-noise ratio in the affected SFB.

To calculate this tonal-to-noise ratio, the squared TCX power spectrum values P of all bins in an SFB are summed and divided by the width of the SFB (given by start line b and stop line e) to obtain the average energy of the band. This average is subsequently used to normalize all the energies in the band. [equation omitted]

All bins with a normalized energy P_norm,k below 1 are then summed up and counted as the noise part P_noise, while everything above a threshold of 1 + adap ([equation omitted]) is counted as the tonal part P_tonal. This threshold depends on the width of the SFB, so that narrower bands get a lower threshold to account for the higher average caused by the greater influence of the high-energy bins of the tonal component. Finally, a logarithmic ratio is computed from the tonal and the noise parts. [equation omitted]
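The normalization and the split into tonal and noise parts can be sketched as follows. Several details are assumptions made for illustration: `adap` is passed in as a plain parameter because its formula is not reproduced here, the ratio is expressed as 10·log10 (the text only says "logarithmic"), the values in `power` are treated as bin energies, and `eps` guards against division by zero.

```python
import math

def tonal_to_noise_ratio(power, b, e, adap, eps=1e-12):
    """Log ratio of tonal to noise energy in power[b:e] (hypothetical helper)."""
    band = power[b:e]
    mean_energy = sum(band) / len(band)
    if mean_energy <= 0.0:
        return 0.0  # assumed convention for an all-zero band
    normalized = [p / mean_energy for p in band]
    # Bins below the band average count as noise ...
    noise = sum(v for v in normalized if v < 1.0)
    # ... and bins clearly above it count as tonal.
    tonal = sum(v for v in normalized if v > 1.0 + adap)
    return 10.0 * math.log10((tonal + eps) / (noise + eps))

band = [0.1] * 7 + [7.3]   # one strong line over a weak background
print(tonal_to_noise_ratio(band, 0, 8, adap=0.5))
```

Bins whose normalized energy lies between 1 and 1 + adap contribute to neither part, which is how the adaptive threshold keeps moderately raised bins from being counted as tonal.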

Damping depends on both the difference in SFM between source and destination and the SFM of the target SFB, where a higher difference and a smaller target SFM should both lead to stronger damping. It is reasonable that stronger damping should be applied for a larger difference in tonality. Furthermore, the amount of damping should also increase more quickly if the target SFM is lower, i.e. if the target SFB is more tonal. This means that stronger damping will be applied for extremely tonal SFBs than for SFBs whose SFM just falls within the damping range.

In addition, damping should be applied more sparingly for higher frequencies, since taking away energy in the highest bands can easily lead to the perceptual impression of band limitation, while the fine structure of the SFBs becomes less important because the sensitivity of the human auditory system decreases towards higher frequencies.

Tonality mismatch compensation: calculation of the damping factor

To incorporate all these considerations into a single damping formula, the ratio between target and source SFM is taken as the basis of the formula. In this way, both a larger absolute difference in SFM and a smaller target SFM value lead to stronger damping, which makes the ratio more suitable than simply taking the difference. To also add dependencies on frequency and on the tonal-to-noise ratio, adjustment parameters are applied to this ratio. The damping formula can thus be written as [equation omitted], where d is the damping factor that will be multiplied with the scaling factor, and α and β are the damping adjustment parameters, which are calculated as [equation omitted], where e is the stop line of the current SFB, and [equation omitted], where adap depends on the SFB width and is calculated as [equation omitted].

The parameter α decreases with frequency in order to apply less damping for higher frequencies, while β is used to further reduce the strength of the damping if the tonal-to-noise ratio of the SFB to be damped falls below a threshold. The further it falls below this threshold, the more the damping is reduced.

Since damping is only activated within certain constraints, smoothing needs to be applied to prevent abrupt on/off transitions. To this end, several smoothing mechanisms are active.

Directly after a transient, a core switch to TCX, or a previous frame without damping, damping is applied with full force only gradually, to avoid extreme energy drops after high-energy transients. Furthermore, a forgetting factor in the form of an IIR filter is used to also take the results of previous frames into account.

All smoothing techniques are comprised in the following formula: [equation omitted], where d_prev is the damping factor of the previous frame. If damping was not active in the previous frame, d_prev is overwritten with d_curr but limited to a minimum of 0.1. The variable smooth is an additional smoothing factor that is set to 2 during transient frames (flag isTransient active) or after core switches (flag isCelpToTCX active), and to 1 if damping was inactive in the previous frame. In each frame with damping, the variable is decreased by 1 but may not fall below 0.
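The bookkeeping around the smoothing formula, though not the formula itself, can be sketched as below. The function names and the order in which the conditions are checked are assumptions; the text states the individual rules but not their precedence, and the combining formula is deliberately left out because it is not reproduced here.

```python
def next_smooth(smooth, is_transient, is_celp_to_tcx,
                prev_damping_active, damping_active):
    """Update the additional smoothing counter for one frame (hypothetical helper)."""
    if is_transient or is_celp_to_tcx:
        return 2                      # restart the ramp after a transient or core switch
    if not prev_damping_active:
        return 1                      # damping was off in the previous frame
    if damping_active:
        return max(smooth - 1, 0)     # count down, never below 0
    return smooth

def effective_prev_damping(d_prev, d_curr, prev_damping_active):
    """Value used as d_prev: overwritten by d_curr, floored at 0.1, if damping was off."""
    if prev_damping_active:
        return d_prev
    return max(d_curr, 0.1)

print(next_smooth(0, True, False, True, True))
print(next_smooth(2, False, False, True, True))
print(effective_prev_damping(0.8, 0.05, False))
```

Keeping the counter and the previous damping factor as explicit per-band state makes it straightforward to reset them together with the other filter states.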

In the final step, the damping factor d is multiplied with the scaling gain g: [equation omitted]

FIG. 10 illustrates a preferred implementation of the present invention.

The audio signal as output, for example, by the spectral analyzer 130 is available as an MDCT spectrum or even as a complex spectrum, as indicated by (c) on the left of FIG. 10.

The signal analyzer 121 is implemented by the tonality detectors 801 and 802 in FIG. 10, with block 802 detecting the tonality of the target content and item 801 detecting the tonality of the (simulated) source content.

Then the damping factor calculation 124 is performed to obtain the compensation value, and the compensator 503 operates on the data obtained from items 501, 700-706. Item 501 and items 700-706 reflect the envelope estimation from the target content and the envelope estimation from the simulated source content, as well as the subsequent scale factor calculation, as illustrated for example at items 700-706 in FIG. 7.

Thus, the non-compensated scaling vector is input into block 503 as value 502, analogously to what was discussed in the context of FIG. 5. Furthermore, a noise model 1000 is illustrated in FIG. 10 as a separate building block, although it can also be included directly within the damping factor calculator 124, as discussed in the context of FIG. 4.

Furthermore, the parametric IGF encoder in FIG. 10 additionally comprises a whitening estimator, which is configured to calculate whitening levels as discussed, for example, in item 5.3.3.2.11.6.4 "Coding of IGF whitening levels". In particular, IGF whitening levels are calculated and transmitted using one or two bits per tile. This data is also fed into the bitstream multiplexer 140 in order to finally obtain the complete IGF parametric data.

Furthermore, a block "sparsify spectrum", which may correspond to block 130 with respect to determining the spectral lines to be encoded by the core encoder 110, is additionally provided and is illustrated as a separate block 1020 in FIG. 10. This information is preferably used by the compensator 503 in order to reflect the specific IGF situation.

Furthermore, the term "simulated" to the left of block 801 and of the "envelope estimation" block in FIG. 10 refers to the situation illustrated in FIG. 8a, where the "simulated source content" is the audio data in the first spectral band that has been coded and decoded again.

Alternatively, the "simulated" source content is the data obtained by the patch simulator 804 from the original first audio data in the first spectral band as indicated by line 180, or the decoded first spectral band as obtained by the core decoder 800, enriched with the lines shifted from the second spectral band to the first spectral band.

Subsequently, a further embodiment of the invention constituting an amended version of the 3GPP TS 26.445 codec is described. The newly added text specifying the inventive processing is provided below. Herein, explicit reference is made to certain subclauses already contained in the 3GPP TS 26.445 specification.

5.3.3.2.11.1.9 Spectral tilt function SLOPE

Let P be the TCX power spectrum as calculated according to subclause 5.3.3.2.11.1.2, and let b be the start line and e the stop line of the spectral tilt measurement range.

The SLOPE function, applied with IGF, is defined by: [equation omitted], where n is the actual TCX window length and x is the bin number.

5.3.3.2.11.1.10 Tonal-to-noise ratio function TNR

Let P be the TCX power spectrum as calculated according to subclause 5.3.3.2.11.1.2, and let b be the start line and e the stop line of the tonal-to-noise ratio measurement range.

The TNR function, applied with IGF, is defined by: [equation omitted], where n is the actual TCX window length, P_norm(sb) is defined by [equation omitted], and adap is defined by [equation omitted].

Damping:

For the calculation of the IGF damping factors, six static arrays, all of size nB, are needed to hold the filter states across frames: prevTargetFIR, prevSrcFIR, prevTargetIIR and prevSrcIIR for the SFM calculation in the target and source range, as well as prevDamp and dampSmooth. In addition, a static flag wasTransient has to store the information of the input flag isTransient from the previous frame.

Resetting filter states

The vectors prevTargetFIR, prevSrcFIR, prevTargetIIR, prevSrcIIR, prevDamp and dampSmooth are all static arrays of size nB in the IGF module and are initialized as follows, for k = 0, 1, …, nB − 1: [equations omitted]

This initialization shall be done:
- at codec start-up
- on any bit rate switch
- on any codec type switch
- on a transition from CELP to TCX, e.g. isCelpToTCX = true
- if the current frame has transient properties, e.g. isTransient = true
- if the TCX power spectrum P is not available

Calculation of the damping factor

If the TCX power spectrum P is available and isTransient is false, calculate [equation omitted] and [equation omitted], where t(0), t(1), …, t(nB) shall already be mapped with the function tF, see subclause 5.3.3.2.11.1.1, m: N → N is the mapping function that maps the IGF target range into the IGF source range, described in subclause 5.3.3.2.11.1.8, and nB is the number of scale factor bands, see Table 94. SFM is a spectral flatness measurement function described in subclause 5.3.3.2.11.1.3, and CREST is a crest factor function described in subclause 5.3.3.2.11.1.4.

If isCelpToTCX is true or wasTransient is true, set for k = 0, 1, …, nB − 1: [equation omitted]. Calculate: [equation omitted] and [equation omitted].

With these vectors calculate: [equation omitted]

If, for k = 0, 1, …, nB − 1, [condition omitted], set [equation omitted]; else calculate the spectral tilt with the function SLOPE, described in subclause 5.3.3.2.11.1.9: [equation omitted]

If, for k = 0, 1, …, nB − 1, [condition omitted] or else if [condition omitted], where threshTilt is defined as [equation omitted], calculate the SFM on a shifted spectrum: [equation omitted], with the shift defined as [equation omitted].

If [condition omitted], set [equation omitted].

If, for k = 0, 1, …, nB − 1, [condition omitted], the damping factor dampCurr of the current frame in band k is set to zero: [equation omitted]

Else, calculate dampCurr(k) as follows: [equation omitted], where alpha is defined as [equation omitted] and beta is defined as [equation omitted], where TNR is the tonal-to-noise ratio function described in subclause 5.3.3.2.11.1.10, and adap is defined as [equation omitted].

If, for k = 0, 1, …, nB − 1, [condition omitted], set [equation omitted].

Calculate the vector of damping factors of size nB: [equation omitted]

Finally, if isTransient is false and the power spectrum P is available, update the filters for k = 0, 1, …, nB − 1: [equations omitted]

The values/indices/parameters in the preceding section are similar to the corresponding parameters/indices/values discussed throughout this specification. Subsequently, several results from listening tests are discussed in the context of FIGS. 11a to 11c.

These listening tests were conducted to show the benefit of damping by comparing items coded with damping enabled against items coded without it.

The first result, illustrated in FIG. 11a, is an A-B comparison test at a bit rate of 13.2 kbps and a sampling rate of 32 kHz using mono items. FIG. 11a shows the results of the A-B test of damping versus no damping at 13.2 kbps.

The second, illustrated in FIG. 11b, is a MUSHRA test at 24.4 kbps and a sampling rate of 32 kHz using mono items. Here, two versions without damping were compared to the new version with damping. The results are shown in FIG. 11b (absolute scores) and FIG. 11c (difference scores).

The inventively encoded audio signal can be stored on a digital storage medium or a non-transitory storage medium, or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding apparatus.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine-readable carrier or a non-transitory storage medium.

In other words, an embodiment of the inventive method is therefore a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive methods is therefore a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the pending patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

100‧‧‧audio signal
110‧‧‧core encoder
120‧‧‧parametric coder
121‧‧‧analyzer
122, 123‧‧‧analysis results
124‧‧‧compensator
125‧‧‧compensation value
126‧‧‧parameter calculator
130‧‧‧spectral analyzer
140‧‧‧output interface
150‧‧‧encoded audio signal
160‧‧‧encoded core signal
170‧‧‧input line
180, 190, 202‧‧‧spectral bands
200‧‧‧further parameter
204‧‧‧analysis result
210‧‧‧compensation detector
212‧‧‧control line
221‧‧‧power spectrum input
223‧‧‧transient input
302‧‧‧further source band
303‧‧‧even further source band
305‧‧‧further destination band
307‧‧‧even further destination band
310‧‧‧IGF start frequency
351, 351', 352, 352', 604, 605, 606, 620, 806, 808‧‧‧lines
354‧‧‧maximum frequency
400‧‧‧tonal-to-noise ratio
501‧‧‧calculator
502‧‧‧non-compensated parameter
503‧‧‧combiner
600, 602, 1020‧‧‧blocks
603‧‧‧no compensation
610, 618, 614‧‧‧predetermined thresholds
612‧‧‧determination
616‧‧‧check
700-706‧‧‧items
800‧‧‧core decoder
801, 802‧‧‧analysis result calculators
804‧‧‧patch simulator
1000‧‧‧noise model

Preferred embodiments are subsequently described with reference to the accompanying drawings, in which:
FIG. 1 illustrates a block diagram of an apparatus for encoding an audio signal in accordance with an embodiment;
FIG. 2 illustrates a block diagram of an apparatus for encoding with a focus on the compensation detector;
FIG. 3a illustrates a schematic representation of an audio spectrum having a source range and an IGF or bandwidth extension range, and an associated mapping between source and destination bands;
FIG. 3b illustrates a spectrum of an audio signal where the core encoder applies IGF technology and where there are surviving lines in the second spectral band;
FIG. 3c illustrates a representation of simulated first audio data in the first spectral band to be used for calculating the first analysis result;
FIG. 4 illustrates a more detailed representation of the compensator;
FIG. 5 illustrates a more detailed representation of the parameter calculator;
FIG. 6 illustrates a flow chart of the compensation detector functionality in an embodiment;
FIG. 7 illustrates a functionality of the parameter calculator for calculating a non-compensated gain factor;
FIG. 8a illustrates an encoder implementation having a core decoder for calculating the first analysis result from an encoded and decoded first spectral band;
FIG. 8b illustrates a block diagram of an encoder in an embodiment in which a patch simulator is applied for generating first spectral band lines shifted from the second spectral band to obtain the first analysis result;
FIG. 9 illustrates an effect of a tonality mismatch in an intelligent gap filling implementation;
FIG. 10 illustrates an implementation of the parametric encoder in an embodiment; and
FIGS. 11a to 11c illustrate listening test results obtained from audio data encoded using compensated parameter values.

Claims (26)

一種用於一編碼音訊信號之設備,其包含: 用於將一第一頻譜帶中之第一音訊資料核心編碼之一核心編碼器; 用於將與該第一頻譜帶不同之一第二頻譜帶中之第二音訊資料參數性寫碼之一參數寫碼器,其中該參數寫碼器包含: 用於分析該第一頻譜帶中之第一音訊資料以取得一第一分析結果、及用於分析該第二頻譜帶中之第二音訊資料以取得一第二分析結果之一分析器; 用於使用該第一分析結果及該第二分析結果計算一補償值之一補償器;以及 用於使用該補償值從該第二頻譜帶中之該第二音訊資料計算一參數之一參數計算器。A device for encoding an audio signal, comprising: a core encoder for core encoding a first audio data in a first frequency band; and a second spectrum different from the first frequency band One of the parameter writers for the parametric coding of the second audio data in the band, wherein the parameter coder includes: analyzing the first audio data in the first frequency band to obtain a first analysis result, and An analyzer for analyzing second audio data in the second frequency band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and A parameter calculator for calculating a parameter from the second audio data in the second frequency band using the compensation value. 如請求項1之設備, 其中該參數寫碼器係組配成用於將一第三頻譜帶中之第三音訊資料參數性寫碼; 其中該分析器係組配成用於分析該第三頻譜帶中之該第三音訊資料以取得一第三分析結果; 其中該參數寫碼器更包含用於至少使用該第三分析結果來偵檢是否要補償該第三頻譜帶之一補償偵檢器, 其中該參數計算器被組配來當該補償偵檢器偵檢不要補償該第三頻譜帶時,不用任何補償值而從該第三頻譜帶中之該音訊資料計算進一步參數。For example, the device of claim 1, wherein the parameter writer is configured to parametrically write the third audio data in a third frequency band; wherein the analyzer is configured to analyze the third The third audio data in the frequency band to obtain a third analysis result; wherein the parameter writer further includes a compensation detection for detecting at least one of the third frequency band using the third analysis result; The parameter calculator is configured to calculate further parameters from the audio data in the third frequency band without any compensation value when the compensation detector does not compensate the third frequency band. 
3. The apparatus of claim 1 or 2, wherein the analyzer is configured to calculate a first quantitative value as the first analysis result and a second quantitative value as the second analysis result, wherein the compensator is configured to calculate a quantitative compensation value from the first quantitative value and from the second quantitative value, and wherein the parameter calculator is configured to calculate a quantitative parameter using the quantitative compensation value.
4. The apparatus of any one of the preceding claims, wherein the analyzer is configured to analyze a first characteristic of the first audio data to obtain the first analysis result, and to analyze the same first characteristic of the second audio data in the second spectral band to obtain the second analysis result, and wherein the parameter calculator is configured to calculate the parameter from the second audio data in the second spectral band by evaluating a second characteristic, the second characteristic being different from the first characteristic.
5. The apparatus of claim 4, wherein the first characteristic is a spectral fine structure characteristic or an energy distribution characteristic within the first spectral band, or wherein the second characteristic is an envelope measure, an energy-related measure, or a power-related measure of spectral values within the second spectral band.
6. The apparatus of any one of the preceding claims, wherein the first spectral band and the second spectral band are mutually exclusive, and wherein the analyzer is configured to calculate the first analysis result without using the second audio data in the second spectral band, and to calculate the second analysis result without using the first audio data in the first spectral band.
7. The apparatus of any one of the preceding claims, wherein the audio signal comprises a time sequence of frames, and wherein the compensator is configured to calculate a current compensation value for a current frame using a previous compensation value for a previous frame.
8. The apparatus of any one of the preceding claims, wherein the parametric coder is configured to parametrically code third audio data in a third spectral band, wherein the third spectral band comprises higher frequencies than the second spectral band, and wherein the compensator is configured to calculate the compensation value for the third spectral band using a third weighting value, the third weighting value being different from a second weighting value used for calculating the compensation value for the second spectral band.
9. The apparatus of any one of the preceding claims, wherein the analyzer is configured to additionally calculate a tonal-to-noise ratio of the second audio data in the second spectral band, and wherein the compensator is configured to calculate the compensation value depending on the tonal-to-noise ratio of the second audio data, so that a first compensation value is obtained for a first tonal-to-noise ratio or a second compensation value is obtained for a second tonal-to-noise ratio, the first compensation value being greater than the second compensation value, and the first tonal-to-noise ratio being greater than the second tonal-to-noise ratio.
10. The apparatus of any one of the preceding claims, wherein the parameter calculator is configured to calculate a non-compensated parameter from the second audio data, and to combine the non-compensated parameter and the compensation value to obtain the parameter.
11. The apparatus of any one of the preceding claims, further comprising an output interface for outputting the core-encoded audio data in the first spectral band and the parameter.
12. The apparatus of any one of the preceding claims, wherein the compensator is configured to determine the compensation value by applying a psychoacoustic model, wherein the psychoacoustic model is configured to evaluate a psychoacoustic mismatch between the first audio data and the second audio data using the first analysis result and the second analysis result to obtain the compensation value.
13. The apparatus of any one of the preceding claims, wherein the audio signal comprises a time sequence of frames, and wherein the analyzer is configured to analyze first audio data in the first spectral band of a frame to obtain the first analysis result, and to analyze second audio data of the frame in the second spectral band to obtain a second analysis result for the frame, wherein the compensator is configured to calculate a compensation value for the frame using the first analysis result for the frame and the second analysis result for the frame; and wherein the parameter calculator is configured to calculate the parameter from the second audio data in the second spectral band of the frame using the compensation value for the frame, or wherein the parametric coder further comprises a compensation detector for detecting, based on the first analysis result and the second analysis result, whether the parameter for the second spectral band of a frame is to be calculated using the compensation value in a compensation situation or in a non-compensation situation.
14. The apparatus of any one of claims 1 to 13, wherein a compensation detector is configured to detect the compensation situation when a difference between the first analysis result and the second analysis result has a predetermined characteristic, or when the second analysis result has a predetermined characteristic, wherein the detector is configured to detect that a spectral band is not to be compensated when a power spectrum is not available to the audio encoder, or when a current frame is detected to be a transient frame, or wherein the compensator is configured to calculate the compensation value based on a quotient of the first analysis result and the second analysis result.
15. The apparatus of any one of the preceding claims, wherein the analyzer is configured to calculate, for the first spectral band, a spectral flatness measure, a crest factor, or a quotient of the spectral flatness measure and the crest factor as the first analysis result, and to calculate, for the second spectral band, a spectral flatness measure, a crest factor, or a quotient of the spectral flatness measure and the crest factor as the second analysis result, or wherein the parameter calculator is configured to calculate spectral envelope information or a gain factor from the second audio data, or wherein the compensator is configured to calculate the compensation value so that a first compensation value is obtained for a first difference between the first analysis result and the second analysis result, and a second compensation value is obtained for a second difference between the first analysis result and the second analysis result, wherein the first difference is greater than the second difference, and wherein the first compensation value is greater than the second compensation value.
16. The apparatus of claim 15, wherein the analyzer is configured to calculate a spectral tilt from the second audio data, wherein the analyzer is configured to check whether a tonal component is close to a border of the second spectral band, and wherein a compensation detector of the parametric coder is configured to determine that the parameter is to be calculated using the compensation value only when the spectral tilt is below a predetermined threshold, or only when the spectral tilt is above a predetermined threshold and the check has determined that a tonal component close to the border exists.
17. The apparatus of any one of the preceding claims, further comprising: a decoder for decoding the encoded first audio data in the first spectral band to obtain encoded and decoded first audio data, wherein the analyzer is configured to calculate the first analysis result using the encoded and decoded first audio data, and to calculate the second analysis result from the second audio data of the audio signal input into the apparatus.
18. The apparatus of any one of the preceding claims, further comprising: a patch simulator for simulating a patching result for the second spectral band, the patching result comprising at least one spectral line from the second spectral band that is included in a core-encoded audio signal; wherein the analyzer is configured to calculate the first analysis result using the first audio data and the at least one spectral line from the second spectral band, and to calculate the second analysis result from the second audio data of the audio signal input into the apparatus for encoding.
19. The apparatus of any one of the preceding claims, wherein the core encoder is configured to encode the first audio data into a sequence of real-valued spectra, wherein the analyzer is configured to calculate the first and second analysis results from a sequence of power spectra, and wherein a power spectrum is calculated from the audio signal input into the apparatus for encoding, or is derived from a real-valued spectrum used by the core encoder.
20. The apparatus of any one of the preceding claims, wherein the core encoder is configured to core encode the audio signal at least in a core spectral band extending up to an enhancement start frequency, wherein the core spectral band comprises the first spectral band and at least one further source band overlapping the first spectral band, wherein the audio signal comprises an enhancement range extending from the enhancement start frequency to a maximum frequency, wherein the second spectral band and at least one further target band are included in the enhancement range, and wherein the second spectral band and the further target band do not overlap each other.
21. The apparatus of claim 20, wherein the enhancement start frequency is a crossover frequency and a core-encoded signal is band-limited to the crossover frequency, or wherein the enhancement start frequency is an intelligent gap filling (IGF) start frequency and a core-encoded signal is band-limited to a maximum frequency greater than the enhancement start frequency.
22. The apparatus of any one of the preceding claims, wherein the parameter calculator is configured to calculate a gain factor for the second spectral band based on the second audio data in the second spectral band, to calculate a damping factor as the compensation value, and to multiply the gain factor for the spectral band by the damping factor to obtain a compensated gain factor as the parameter, and wherein the apparatus further comprises an output interface for outputting the core-encoded audio data in the first spectral band and the compensated gain factor as the parameter.
23. A method of encoding an audio signal, comprising: core encoding first audio data in a first spectral band; and parametrically coding second audio data in a second spectral band different from the first spectral band, wherein the parametric coding comprises: analyzing the first audio data in the first spectral band to obtain a first analysis result, and analyzing the second audio data in the second spectral band to obtain a second analysis result; calculating a compensation value using the first analysis result and the second analysis result; and calculating a parameter from the second audio data in the second spectral band using the compensation value.
24. A system for processing an audio signal, comprising: an apparatus for encoding an audio signal according to any one of claims 1 to 22; and a decoder for receiving an encoded audio signal, the encoded audio signal comprising encoded first audio data in the first spectral band and a parameter representing second audio data in the second spectral band, wherein the decoder is configured to perform a spectral enhancement operation in order to generate synthesized audio data for the second spectral band using the parameter and decoded first audio data in the first spectral band.
25. A method of processing an audio signal, comprising: encoding an audio signal in accordance with claim 23; receiving an encoded audio signal, the encoded audio signal comprising encoded first audio data in the first spectral band and a parameter representing second audio data in the second spectral band; and performing a spectral enhancement operation in order to generate synthesized audio data for the second spectral band using the parameter and decoded first audio data in the first spectral band.
26. A computer program for performing, when running on a computer or a processor, the method of claim 23 or 25.
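As an illustration only, the parametric coding path recited in claims 7, 14, 15, and 22 might be sketched as follows. The claims do not fix how the quotient of the analysis results is mapped to a damping factor, so the clipping range, the smoothing weight, and every function name below are hypothetical choices made for this sketch, not the patented implementation.

```python
import math

EPS = 1e-12  # numerical guard; an assumption of this sketch


def spectral_flatness(power):
    """Geometric mean over arithmetic mean of a band's power spectrum."""
    log_mean = sum(math.log(p + EPS) for p in power) / len(power)
    arith_mean = sum(power) / len(power) + EPS
    return math.exp(log_mean) / arith_mean


def crest_factor(power):
    """Peak over mean of a band's power spectrum (large for tonal bands)."""
    mean = sum(power) / len(power)
    return (max(power) + EPS) / (mean + EPS)


def analysis_result(power):
    """Quotient of flatness and crest factor, one option named in claim 15."""
    return spectral_flatness(power) / crest_factor(power)


def should_compensate(power_spectrum_available, is_transient):
    """Claim 14: skip compensation without a power spectrum or on transients."""
    return power_spectrum_available and not is_transient


def damping_factor(first_result, second_result, previous=None,
                   smoothing=0.5, floor=0.1):
    """Hypothetical mapping of the quotient of results to [floor, 1]."""
    current = min(1.0, max(floor, second_result / first_result))
    if previous is not None:
        # Claim 7: reuse the previous frame's value (weight is assumed).
        current = smoothing * previous + (1.0 - smoothing) * current
    return current


def compensated_gain(second_power, damping):
    """Claim 22: gain factor from band energy times the damping factor."""
    gain = math.sqrt(sum(second_power) / len(second_power))
    return gain * damping


# Noise-like first (source) band and tonal second (target) band.
source_band = [1.0] * 8
target_band = [0.01] * 7 + [7.93]

if should_compensate(power_spectrum_available=True, is_transient=False):
    d = damping_factor(analysis_result(source_band),
                       analysis_result(target_band))
else:
    d = 1.0
parameter = compensated_gain(target_band, d)
```

In this sketch a noise-like source band patched into a tonal target band yields a quotient well below one, so the transmitted gain is reduced; when both bands have the same analysis result the factor is one and the gain is left unchanged.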
TW106128438A 2016-08-23 2017-08-22 Apparatus and method for encoding an audio signal using a compensation value TWI653626B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP16185398.1A EP3288031A1 (en) 2016-08-23 2016-08-23 Apparatus and method for encoding an audio signal using a compensation value

Publications (2)

Publication Number Publication Date
TW201812744A TW201812744A (en) 2018-04-01
TWI653626B TWI653626B (en) 2019-03-11

Family

ID=56799328

Family Applications (1)

Application Number Title Priority Date Filing Date
TW106128438A TWI653626B (en) 2016-08-23 2017-08-22 Apparatus and method for encoding an audio signal using a compensation value

Country Status (18)

Country Link
US (2) US11521628B2 (en)
EP (4) EP3288031A1 (en)
JP (3) JP6806884B2 (en)
KR (1) KR102257100B1 (en)
CN (3) CN109863556B (en)
AR (1) AR109391A1 (en)
AU (1) AU2017317554B2 (en)
BR (1) BR112019003711A2 (en)
CA (1) CA3034686C (en)
ES (2) ES2967183T3 (en)
MX (1) MX2019002157A (en)
PL (2) PL3796315T3 (en)
PT (1) PT3504707T (en)
RU (1) RU2727728C1 (en)
SG (1) SG11201901645SA (en)
TW (1) TWI653626B (en)
WO (1) WO2018036972A1 (en)
ZA (1) ZA201901624B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3288031A1 (en) * 2016-08-23 2018-02-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding an audio signal using a compensation value
EP3671741A1 (en) * 2018-12-21 2020-06-24 FRAUNHOFER-GESELLSCHAFT zur Förderung der angewandten Forschung e.V. Audio processor and method for generating a frequency-enhanced audio signal using pulse processing
CN111383643B (en) * 2018-12-28 2023-07-04 南京中感微电子有限公司 Audio packet loss hiding method and device and Bluetooth receiver
KR20210003507A (en) * 2019-07-02 2021-01-12 한국전자통신연구원 Method for processing residual signal for audio coding, and audio processing apparatus
CN113808596A (en) * 2020-05-30 2021-12-17 华为技术有限公司 Audio coding method and audio coding device
CN113808597A (en) * 2020-05-30 2021-12-17 华为技术有限公司 Audio coding method and audio coding device
TWI755901B (en) * 2020-10-21 2022-02-21 美商音美得股份有限公司 Real-time audio processing system with frequency shifting feature and real-time audio processing procedure with frequency shifting function
CN115472171A (en) * 2021-06-11 2022-12-13 华为技术有限公司 Encoding and decoding method, apparatus, device, storage medium, and computer program
CN113612808B (en) * 2021-10-09 2022-01-25 腾讯科技(深圳)有限公司 Audio processing method, related device, storage medium, and program product

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003046891A1 (en) * 2001-11-29 2003-06-05 Coding Technologies Ab Methods for improving high frequency reconstruction
JP4296752B2 (en) 2002-05-07 2009-07-15 ソニー株式会社 Encoding method and apparatus, decoding method and apparatus, and program
JP2005114814A (en) * 2003-10-03 2005-04-28 Nippon Telegr & Teleph Corp <Ntt> Method, device, and program for speech encoding and decoding, and recording medium where same is recorded
US8417515B2 (en) * 2004-05-14 2013-04-09 Panasonic Corporation Encoding device, decoding device, and method thereof
KR100636144B1 (en) * 2004-06-04 2006-10-18 삼성전자주식회사 Apparatus and method for encoding/decoding audio signal
JP5117407B2 (en) * 2006-02-14 2013-01-16 フランス・テレコム Apparatus for perceptual weighting in audio encoding / decoding
JP4984983B2 (en) * 2007-03-09 2012-07-25 富士通株式会社 Encoding apparatus and encoding method
AU2008339211B2 (en) * 2007-12-18 2011-06-23 Lg Electronics Inc. A method and an apparatus for processing an audio signal
CA2730200C (en) 2008-07-11 2016-09-27 Max Neuendorf An apparatus and a method for generating bandwidth extension output data
KR101381513B1 (en) 2008-07-14 2014-04-07 광운대학교 산학협력단 Apparatus for encoding and decoding of integrated voice and music
JP5203077B2 (en) * 2008-07-14 2013-06-05 株式会社エヌ・ティ・ティ・ドコモ Speech coding apparatus and method, speech decoding apparatus and method, and speech bandwidth extension apparatus and method
RU2591012C2 (en) 2010-03-09 2016-07-10 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Форшунг Е.Ф. Apparatus and method for handling transient sound events in audio signals when changing replay speed or pitch
US8751225B2 (en) * 2010-05-12 2014-06-10 Electronics And Telecommunications Research Institute Apparatus and method for coding signal in a communication system
KR101826331B1 (en) 2010-09-15 2018-03-22 삼성전자주식회사 Apparatus and method for encoding and decoding for high frequency bandwidth extension
CN102800317B (en) * 2011-05-25 2014-09-17 华为技术有限公司 Signal classification method and equipment, and encoding and decoding methods and equipment
JP5942358B2 (en) * 2011-08-24 2016-06-29 ソニー株式会社 Encoding apparatus and method, decoding apparatus and method, and program
US8527264B2 (en) * 2012-01-09 2013-09-03 Dolby Laboratories Licensing Corporation Method and system for encoding audio data with adaptive low frequency compensation
CN105229735B (en) * 2013-01-29 2019-11-01 弗劳恩霍夫应用研究促进协会 Technology for coding mode switching compensation
AU2014211479B2 (en) * 2013-01-29 2017-02-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder, audio decoder, method for providing an encoded audio information, method for providing a decoded audio information, computer program and encoded representation using a signal-adaptive bandwidth extension
US9741350B2 (en) * 2013-02-08 2017-08-22 Qualcomm Incorporated Systems and methods of performing gain control
ES2688134T3 (en) * 2013-04-05 2018-10-31 Dolby International Ab Audio encoder and decoder for interleaved waveform coding
EP2830061A1 (en) * 2013-07-22 2015-01-28 Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding and decoding an encoded audio signal using temporal noise/patch shaping
CN107770718B (en) * 2014-01-03 2020-01-17 杜比实验室特许公司 Generating binaural audio by using at least one feedback delay network in response to multi-channel audio
EP2980795A1 (en) 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoding and decoding using a frequency domain processor, a time domain processor and a cross processor for initialization of the time domain processor
EP2980794A1 (en) 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder using a frequency domain processor and a time domain processor
US20160372127A1 (en) * 2015-06-22 2016-12-22 Qualcomm Incorporated Random noise seed value generation
CA3017241C (en) 2016-04-04 2023-09-19 Mazaro Nv Planetary variator for variable transmission
EP3288031A1 (en) * 2016-08-23 2018-02-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding an audio signal using a compensation value

Also Published As

Publication number Publication date
US20220392465A1 (en) 2022-12-08
EP3796315B1 (en) 2023-09-20
CN117198305A (en) 2023-12-08
US11935549B2 (en) 2024-03-19
AU2017317554A1 (en) 2019-04-11
PL3796315T3 (en) 2024-03-18
PT3504707T (en) 2021-02-03
TWI653626B (en) 2019-03-11
EP4250289A2 (en) 2023-09-27
CA3034686A1 (en) 2018-03-01
JP2021047441A (en) 2021-03-25
EP3504707B1 (en) 2020-12-16
AU2017317554B2 (en) 2019-12-12
CN109863556B (en) 2023-09-26
ES2967183T3 (en) 2024-04-29
EP3796315C0 (en) 2023-09-20
AR109391A1 (en) 2018-11-28
CN109863556A (en) 2019-06-07
PL3504707T3 (en) 2021-06-14
SG11201901645SA (en) 2019-03-28
KR102257100B1 (en) 2021-05-27
JP2019528479A (en) 2019-10-10
JP2023082142A (en) 2023-06-13
EP3288031A1 (en) 2018-02-28
WO2018036972A1 (en) 2018-03-01
CN117198306A (en) 2023-12-08
US20190189137A1 (en) 2019-06-20
EP4250289A3 (en) 2023-11-08
KR20190042070A (en) 2019-04-23
RU2727728C1 (en) 2020-07-23
BR112019003711A2 (en) 2019-05-28
CA3034686C (en) 2022-03-15
EP3504707A1 (en) 2019-07-03
MX2019002157A (en) 2019-07-01
US11521628B2 (en) 2022-12-06
EP3796315A1 (en) 2021-03-24
JP6806884B2 (en) 2021-01-06
ZA201901624B (en) 2019-12-18
JP7385549B2 (en) 2023-11-22
ES2844930T3 (en) 2021-07-23

Similar Documents

Publication Publication Date Title
TWI653626B (en) Apparatus and method for encoding an audio signal using a compensation value
RU2660605C2 (en) Noise filling concept
TW201009812A (en) Time warp activation signal provider, audio signal encoder, method for providing a time warp activation signal, method for encoding an audio signal and computer programs
WO2010091013A1 (en) Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder
KR20110040820A (en) An apparatus and a method for generating bandwidth extension output data
KR20170103996A (en) Optimized scale factor for frequency band extension in an audiofrequency signal decoder
TWI529701B (en) Apparatus and method for generating a frequency enhanced signal using shaping of the enhancement signal
CN107710324A (en) Audio coder and the method for being encoded to audio signal
US20240221765A1 (en) Apparatus and method for encoding an audio signal using a compensation value
TW201443888A (en) Apparatus and method for generating a frequency enhancement signal using an energy limitation operation