CN101140758A - Perceptual weighting filtering method and perceptual weighting filter thereof - Google Patents


Publication number
CN101140758A
Authority
CN
China
Prior art keywords
filtering
coefficients
audio signal
transfer function
perceptual weighting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006101269218A
Other languages
Chinese (zh)
Other versions
CN100487789C (en)
Inventor
胡瑞敏
张伟
杨玉红
张勇
王庭红
马付伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Wuhan University WHU
Original Assignee
Huawei Technologies Co Ltd
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, Wuhan University WHU filed Critical Huawei Technologies Co Ltd
Priority to CNB2006101269218A priority Critical patent/CN100487789C/en
Publication of CN101140758A publication Critical patent/CN101140758A/en
Application granted granted Critical
Publication of CN100487789C publication Critical patent/CN100487789C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a perceptual weighting filtering method comprising: A. performing spectral tilt filtering on an input speech or audio signal; B. performing conventional perceptual weighting filtering on the speech or audio signal output by the spectral tilt filtering; C. outputting the speech or audio signal processed by the conventional perceptual weighting filtering; wherein part of the coefficients of the transfer function used in the conventional perceptual weighting filtering are taken as the coefficients of the transfer function used in the spectral tilt filtering. The invention also discloses a perceptual weighting filter designed according to this method. By making full use of the human auditory masking effect, the invention simulates the formant structure of the spectrum of the input wideband speech or audio signal, improving coding efficiency and enhancing the subjective auditory effect.

Description

Perceptual weighting filtering method and perceptual weighting filter
Technical Field
The invention relates to the technical field of coding and decoding of voice and audio, in particular to a perceptual weighting filtering technology suitable for a broadband voice and/or audio coding and decoding system.
Background
The ability of the human ear to perceive one pure tone is affected by the presence of another pure tone, even when the first tone would otherwise be clearly audible; this phenomenon is called the masking effect of the human ear.
In speech coding, the masking effect means that noise at the formants is less easily perceived than noise in frequency bands with lower energy. By allocating more distortion to the formant regions, the subjectively more noticeable noise in the valleys between formants is reduced; the importance of the formant frequency regions is thereby weakened, coding efficiency is improved, and a better subjective auditory effect is obtained. This speech coding technique is called perceptual weighting filtering, and the software module that performs it is called a perceptual weighting filter.
In prior art one, a conventional perceptual weighting filter is used in conventional narrowband speech coding to distribute noise in the perceptual weighting domain according to the formant structure of the spectrum. Its transfer function is:
$$W(z) = \frac{A(z/\lambda_1)}{A(z/\lambda_2)}, \qquad (1)$$
where
$$A(z) = 1 + \sum_{i=1}^{l} a_i z^{-i},$$
a_i (i = 1, …, l) are the order-l LPC coefficients calculated for the original input signal, and λ_1 and λ_2 are weighting coefficients with 0 < λ_2 < λ_1 ≤ 1.
The perceptual weighting filter of expression (1) lets the coding noise simulate the formant structure of the signal spectrum well for narrowband speech in the range 200 Hz to 3400 Hz, and it has been successfully applied in speech codecs such as G.729 and AMR (adaptive multi-rate).
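As a minimal illustration of expression (1), the Python sketch below forms the numerator A(z/λ_1) and denominator A(z/λ_2) of W(z) by scaling each LPC coefficient a_i by λ^i. It assumes the common convention A(z) = 1 + Σ a_i z^{-i}; the function name `weighted_lpc`, the 2nd-order coefficients and the λ values are hypothetical and chosen only for illustration.

```python
def weighted_lpc(a, lam):
    """Coefficients of A(z/lam) = 1 + sum_i a_i * lam**i * z**-i.

    `a` lists a_1..a_l (the leading 1 is implicit); the sign convention
    is an assumption, not something the source text states explicitly.
    """
    return [1.0] + [a_i * lam ** i for i, a_i in enumerate(a, start=1)]


# Hypothetical 2nd-order LPC coefficients, for illustration only.
a = [-1.2, 0.5]
numerator = weighted_lpc(a, 0.9)    # A(z / lambda_1)
denominator = weighted_lpc(a, 0.6)  # A(z / lambda_2)
```

Because λ_2 < λ_1, the denominator coefficients decay faster than the numerator coefficients, which is what broadens the formant peaks in the weighting response.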
With the development of communication technology and growing user demands, clearer and more lifelike sound quality is required, which speech coding in the narrowband range of 200 Hz to 3400 Hz cannot provide. The speech coding bandwidth has therefore been extended to the wideband range of 50 Hz to 7000 Hz. In this wideband range the signal spectrum has a larger dynamic range and its spectral tilt is pronounced, so the conventional perceptual weighting filter of expression (1) cannot simulate the formant structure of a wideband signal spectrum with pronounced tilt and is not suitable for wideband speech coding.
Prior art two improves on the conventional perceptual weighting filter by cascading a spectral tilt filter with the conventional perceptual weighting filter described by expression (1). The spectral tilt filter has the following form:
$$P(z) = \frac{1}{1 + \sum_{i=1}^{2} p_i \delta^i z^{-i}}, \qquad (2)$$
where p_i (i = 1, 2) are the 2nd-order LPC coefficients calculated for the original input signal and δ is a weighting coefficient. The spectral tilt filter is thus an all-pole filter.
Multiplying expression (1) by expression (2) yields the perceptual weighting filter given in prior art two:
W′(z)=W(z)P(z), (3)
The corresponding transfer function is:
$$W'(z) = \frac{A(z/\lambda_1)}{A(z/\lambda_2)} \cdot \frac{1}{1 + \sum_{i=1}^{2} p_i \delta^i z^{-i}}, \qquad (4)$$
where W(z) simulates the formant structure of the speech signal and P(z) removes the spectral tilt of the wideband speech spectrum.
When the perceptual weighting filter described by expression (4) is applied in an AMR-WB (adaptive multi-rate wideband) codec with λ_1 = 0.95, λ_2 = 0.8 and δ = 0.7, a good simulation effect is obtained. In this coding mode the encoding end must calculate the 2nd-order LPC coefficients constituting P(z) in addition to the order-l LPC coefficients constituting A(z). However, the perceptual weighting filter described by expression (4) is not suitable for mixed speech and audio coding, such as the widely adopted AMR-WB+ (extended adaptive multi-rate wideband) codec. The reason is that in the ACELP/TCX (algebraic code excited linear prediction / transform coded excitation) hybrid coding mode adopted by AMR-WB+, perceptual weighting must be performed on the signal at the decoding end as well as at the encoding end. If the filter of expression (4) is used in this mode, the corresponding calculation must be completed at the encoding end, and the 2nd-order LPC coefficients constituting P(z) must also be obtained at the decoding end. There are two ways to obtain them. One is for the encoding end to transmit the 2nd-order LPC coefficients of P(z) to the decoding end, which increases the transmitted bit rate. The other is for the decoding end to calculate the 2nd-order LPC coefficients of P(z) from the order-l LPC coefficients of W(z) transmitted by the encoding end; this derivation is complex and involves inverting a high-order matrix with no special structure, so it greatly increases the computational complexity of the decoder.
It can be seen that the application of the perceptual weighting filter described in expression (4) to the speech and audio hybrid coding mode, while theoretically feasible, will encounter the above problems in practical applications.
Prior art three is the perceptual weighting filter adopted in the AMR-WB+ standard, whose transfer function is:
Figure A20061012692100071
where μ =0.68.
AMR-WB+ reduces the dynamic range of the signal with the same, fixed pre-emphasis filter in order to eliminate the spectral tilt of the signal spectrum. However, this processing is not adapted to the characteristics of the spectrum currently being coded, so the resulting noise does not follow the formant structure of the signal spectrum well.
Disclosure of Invention
In view of the above, the first main object of the present invention is to provide a perceptual weighting filtering method that reduces the computational complexity at the encoding and decoding ends of a speech and/or audio coding system and improves the subjective auditory effect.
Building on the first main object, the second main object of the present invention is to provide a perceptual weighting filter that improves the subjective auditory effect.
To achieve the first object, the invention provides the following technical solution:
a perceptual weighted filtering method for performing perceptual weighted filtering processing on a wideband speech or audio signal, comprising the steps of:
A. carrying out spectrum tilt filtering processing on an input broadband voice or audio signal;
B. carrying out traditional perception weighting filtering processing on the broadband voice or audio signal output after the spectrum tilt filtering processing;
C. outputting the broadband voice or audio signal subjected to the perception weighting processing;
wherein part of the coefficients of the transfer function used for the conventional perceptual weighting filtering are directly taken as the coefficients of the transfer function used for the spectral tilt filtering.
The coefficients of the transfer functions used for the spectral tilt filtering and for the conventional perceptual weighting filtering are all linear prediction (LPC) coefficients.
The transfer function coefficients used for the spectral tilt filtering are a group of p-order LPC coefficients forming the set b = {b_j, j = 1, …, p}; the transfer function coefficients used for the conventional perceptual weighting filtering are a group of l-order LPC coefficients forming the set a = {a_i, i = 1, …, l}. Then b ⊂ a and 1 < p < l, and the p elements of b are identical, one to one, to the first p elements of a.
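The coefficient sharing described above amounts to taking a prefix of the LPC coefficient list. The sketch below shows this in Python; the function name `tilt_coefficients` and the placeholder coefficient values are hypothetical, and the range check simply mirrors the stated condition 1 < p < l.

```python
def tilt_coefficients(a, p):
    """Return b_1..b_p, taken as the first p entries of a_1..a_l."""
    if not 1 < p < len(a):
        raise ValueError("the text requires 1 < p < l")
    return list(a[:p])


# Hypothetical order-16 LPC coefficients (placeholder values, not real data).
a = [0.1 * k for k in range(1, 17)]
b = tilt_coefficients(a, 2)  # b_1 = a_1, b_2 = a_2
```

No additional linear prediction analysis or matrix inversion is needed to obtain `b`, which is the point of the scheme.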
The input wideband speech or audio signal is spectral-tilt filtered with the transfer function P′(z):
$$P'(z) = 1 + \sum_{j=1}^{p} b_j \delta^j z^{-j},$$
where δ is a weighting coefficient and z is a complex variable; P′(z) is an all-zero transfer function.
The wideband speech or audio signal output by the spectral tilt filtering is then given conventional perceptual weighting filtering with the transfer function W(z):
$$W(z) = \frac{1 + \sum_{i=1}^{l} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{l} a_i \lambda_2^i z^{-i}},$$
where λ_1 and λ_2 are weighting coefficients with 0 < λ_2 < λ_1 ≤ 1.
Let the overall transfer function used in the method be W″(z); then:
$$W''(z) = W(z)P'(z) = \frac{1 + \sum_{i=1}^{l} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{l} a_i \lambda_2^i z^{-i}} \left(1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}\right).$$
When the method is used in an AMR-WB (adaptive multi-rate wideband) or AMR-WB+ (extended adaptive multi-rate wideband) wideband audio codec, the parameters in W″(z) take the values l = 16, p = 2, λ_1 = 0.91, λ_2 = 0.3 and δ = 0.2, with b_1 = a_1 and b_2 = a_2.
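To make the parameter set above concrete, the following Python sketch assembles the numerator and denominator polynomials of the combined transfer function for l = 16 and p = 2. It assumes A(z) = 1 + Σ a_i z^{-i} and a tilt filter of the form 1 + Σ b_j δ^j z^{-j}; the helper names (`weighted`, `poly_mul`, `build_combined`) and the LPC values are hypothetical.

```python
def weighted(coeffs, g):
    """[1, c_1*g, c_2*g**2, ...]: a polynomial in z**-1 with scaled coefficients."""
    return [1.0] + [c * g ** i for i, c in enumerate(coeffs, start=1)]


def poly_mul(x, y):
    """Multiply two polynomials in z**-1 (plain convolution)."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def build_combined(a, p=2, lam1=0.91, lam2=0.3, delta=0.2):
    """Numerator and denominator of W(z) * P'(z), reusing a_1..a_p as b_1..b_p."""
    tilt = weighted(a[:p], delta)            # all-zero tilt filter
    num = poly_mul(weighted(a, lam1), tilt)  # A(z/lam1) * P'(z)
    den = weighted(a, lam2)                  # A(z/lam2)
    return num, den


# Hypothetical order-16 LPC coefficients (placeholder values, not real data).
a = [(-0.5) ** k for k in range(1, 17)]
num, den = build_combined(a)
```

The numerator has order l + p = 18 and the denominator order l = 16; only the sixteen coefficients in `a` are needed to build both.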
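When the method is used to transform signals from the perceptual weighting domain to the non-perceptual weighting domain, the transfer function used is:
$$W''(z)^{-1} = \frac{1 + \sum_{i=1}^{l} a_i \lambda_2^i z^{-i}}{\left(1 + \sum_{i=1}^{l} a_i \lambda_1^i z^{-i}\right)\left(1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}\right)}.$$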
To achieve the second object, the invention provides the following technical solution:
a perceptual weighting filter for performing perceptual weighting filtering on an input wideband speech or audio signal, comprising a spectral tilt filtering unit and a perceptual weighting filtering unit;
the spectral tilt filtering unit flattens the spectrum of the input wideband speech or audio signal, and the signal processed by this unit is passed on to the perceptual weighting filtering unit;
the perceptual weighting filtering unit adjusts the noise distribution of the signal input to it according to the formant structure of the original input wideband speech or audio signal, and outputs the filtered wideband speech or audio signal.
Part of the coefficients of the transfer function used by the perceptual weighting filtering unit are directly used as the coefficients of the transfer function used by the spectral tilt filtering unit.
Let the transfer function used by the spectral tilt filtering unit be P′(z); then:
$$P'(z) = 1 + \sum_{j=1}^{p} b_j \delta^j z^{-j},$$
where δ is a weighting coefficient, z is a complex variable, and b_j are the p-order LPC coefficients;
let the transfer function used by the perceptual weighting filtering unit be W(z); then:
$$W(z) = \frac{1 + \sum_{i=1}^{l} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{l} a_i \lambda_2^i z^{-i}},$$
where λ_1 and λ_2 are weighting coefficients with 0 < λ_2 < λ_1 ≤ 1;
and let b = {b_j, j = 1, …, p} and a = {a_i, i = 1, …, l}; then b ⊂ a and 1 < p < l, and the p elements of b are identical, one to one, to the first p elements of a.
Let the transfer function used by the perceptual weighting filter be W″(z); then:
$$W''(z) = W(z)P'(z) = \frac{1 + \sum_{i=1}^{l} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{l} a_i \lambda_2^i z^{-i}} \left(1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}\right).$$
It can be seen from the above technical solutions that, when the perceptual weighting method or perceptual weighting filter provided by the present invention is applied in a wideband speech and/or audio codec, especially a wideband mixed speech and audio codec, the encoding end only calculates the coefficients of the transfer function W(z) of the conventional perceptual weighting filtering, which had to be calculated anyway, and directly uses part of them as the coefficients of the transfer function P′(z) of the spectral tilt filtering. Compared with the prior art, the invention therefore reduces the computational complexity at the encoding end. The decoding end decodes the coefficients of W(z) and directly uses part of them as the coefficients of P′(z), so the invention requires no additional transmission bit rate and greatly reduces the algorithmic complexity and amount of computation at the decoding end. In addition, while reducing the algorithmic complexity and amount of computation of the perceptual weighting stage, the invention matches or even exceeds the perceptual weighting effect of the prior art.
Moreover, the perceptual weighting filter and the perceptual weighting method provided by the invention are suitable for processing wideband speech or audio signals.
In addition, the technical solution provided by the invention is also suitable for an AMR-WB+ coding system and in practice achieves a better noise shaping effect than the prior art. When the technical solution is applied to wideband speech/audio coding systems other than AMR-WB and AMR-WB+, the values of the parameters λ_1, λ_2 and δ can be adjusted to obtain the best perceptual weighting effect.
Drawings
FIG. 1 is a flow diagram of a process for perceptually weighting and filtering an input wideband speech or audio signal in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of a perceptual weighting filter in accordance with an embodiment of the present invention;
FIG. 3 is a comparison of experimental results of perceptual weighting of an original wideband speech signal using the perceptual weighting filters of prior art two and of the present invention in an AMR-WB+ ACELP/TCX coding system;
FIG. 4 is a comparison of experimental results of perceptual weighting of an original wideband audio signal using the perceptual weighting filters of prior art two and of the present invention in an AMR-WB+ ACELP/TCX coding system;
FIG. 5 is a comparison of experimental results of perceptual weighting of an original wideband speech signal using the perceptual weighting filters of prior art three and of the present invention in an AMR-WB+ ACELP/TCX coding system;
FIG. 6 is a comparison of experimental results of perceptual weighting of an original wideband audio signal using the perceptual weighting filters of prior art three and of the present invention in an AMR-WB+ ACELP/TCX coding system.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Further analysis of the masking effect mentioned in the background art shows that in the speech spectrum, the noise in the frequency band with higher energy, i.e. the formants, is not easily perceived relative to the noise in the frequency band with lower energy. Therefore, when the error between the original speech and the synthesized speech, i.e., the noise, is assigned, the noise may be allowed to be larger in the formant region where the energy is relatively large, and accordingly, the noise may be required to be smaller in the formant valley where the energy is relatively small. Based on the principle, the invention achieves the purposes of improving the coding efficiency and enhancing the subjective auditory quality of the speech coding by adjusting the noise distribution in the perceptual weighting domain in the broadband speech coding.
When the perceptual weighting filtering method provided by the invention filters an input wideband speech or audio signal, the transfer function still takes the form of expression (3) in the Background, that is, it still consists of two parts: the transfer function of the conventional perceptual weighting filtering and the transfer function of the spectral tilt filtering. The invention focuses on improving the latter: the coefficients of the spectral tilt transfer function are not separately derived and calculated; instead, part of the coefficients of the conventional perceptual weighting transfer function are used directly as the coefficients of the spectral tilt transfer function. The resulting transfer function is then used to perform perceptual weighting filtering, that is, noise shaping, on the input wideband speech or audio signal, so that the noise is masked.
First, the transfer function required for the perceptual weighting process of wideband speech or audio signals provided by the present invention will be described in detail.
In the present invention, the transfer function used for perceptual weighting of an input wideband speech or audio signal is W″(z), the transfer function of the conventional perceptual weighting filtering is W(z), and the transfer function of the improved spectral tilt filtering is P′(z), so that:
W″(z)=W(z)P′(z), (6)
where W(z) is the function described by expression (1). If l = 16, then:
$$W(z) = \frac{1 + \sum_{i=1}^{16} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{16} a_i \lambda_2^i z^{-i}}, \qquad (7)$$
where λ_1 and λ_2 are weighting coefficients.
The functional form of P′(z) is:
$$P'(z) = 1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}, \qquad (8)$$
where b_j are the p-order LPC coefficients corresponding to the original input wideband speech or audio signal.
Without the improvement of the present invention, the p-order LPC coefficients would have to be derived from the l-order coefficients constituting A(z), which is rather complicated. With the improvement, the p-order LPC coefficients are taken directly from the l-order coefficients. Let the group of p-order LPC coefficients constituting P′(z) form the set b = {b_j, j = 1, …, p}, and let the group of l-order LPC coefficients constituting A(z) form the set a = {a_i, i = 1, …, l}. Then b ⊂ a and 1 < p < l, and the p elements of b are identical, one to one, to the first p elements of a. For example, if p = 2 and l = 16, the first two coefficients a_1 and a_2 of the set a are taken as the two coefficients b_j, that is, b_1 = a_1 and b_2 = a_2.
As can be seen from expressions (2) and (8), the transfer function P(z) used for spectral tilt filtering in the prior art is an all-pole function, whereas the P′(z) used for spectral tilt filtering in the present invention is an all-zero function.
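The practical difference between an all-pole and an all-zero filter built from the same coefficients can be seen from their impulse responses. The sketch below is only an illustration with hypothetical coefficients: the all-zero form needs no feedback and its response ends after as many samples as it has coefficients, while the all-pole form feeds its own output back.

```python
def all_zero(c, x):
    """y[n] = sum_k c[k] * x[n-k]  (finite impulse response)."""
    return [sum(c[k] * x[n - k] for k in range(len(c)) if n >= k)
            for n in range(len(x))]


def all_pole(c, x):
    """y[n] = x[n] - sum_{k>=1} c[k] * y[n-k]  (c[0] is assumed to be 1)."""
    y = []
    for n in range(len(x)):
        y.append(x[n] - sum(c[k] * y[n - k] for k in range(1, len(c)) if n >= k))
    return y


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
c = [1.0, 0.5, 0.25]              # hypothetical coefficients
zero_resp = all_zero(c, impulse)  # nonzero only for the first len(c) samples
pole_resp = all_pole(c, impulse)  # still nonzero beyond len(c) samples
```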
Combining expression (7) and expression (8) gives the transfer function provided by the present invention for perceptual weighting of the input wideband speech or audio signal:
$$W''(z) = \frac{1 + \sum_{i=1}^{16} a_i \lambda_1^i z^{-i}}{1 + \sum_{i=1}^{16} a_i \lambda_2^i z^{-i}} \left(1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}\right). \qquad (9)$$
The process of perceptually weighting the input wideband speech or audio signal with this transfer function is described below with reference to FIG. 1.
Step 101: perform spectral tilt filtering on the input wideband speech or audio signal.
In this step, the transfer function P′(z) provided by the present invention is used to filter the wideband speech or audio signal, removing its spectral tilt so that the spectrum of the output signal is flat.
Step 102: perform conventional perceptual weighting filtering on the wideband speech or audio signal output by step 101 to adjust the noise distribution.
In this step, the W(z) of the conventional perceptual weighting filtering is used to adjust the noise distribution of the output wideband speech or audio signal, so that the noise distribution follows the spectral formants of the signal.
Step 103: output the perceptually weighted wideband speech or audio signal.
As the above shows, the perceptual weighting filtering method of the present invention first flattens the spectrum of the wideband speech or audio signal by spectral tilt filtering and then applies conventional perceptual weighting filtering to the output signal, so that the noise distribution follows the spectral formants of the speech or audio signal and the noise is masked.
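Steps 101 to 103 can be sketched as two cascaded filters. The Python code below is a minimal, unoptimized illustration rather than a codec implementation: it assumes A(z) = 1 + Σ a_i z^{-i}, zero initial filter state, and the parameter values quoted for AMR-WB/AMR-WB+; the function names and the short 3rd-order coefficient list are hypothetical.

```python
def weighted(coeffs, g):
    """[1, c_1*g, c_2*g**2, ...]: coefficients scaled by powers of g."""
    return [1.0] + [c * g ** i for i, c in enumerate(coeffs, start=1)]


def filt(num, den, x):
    """Direct-form filter with zero initial state; den[0] must be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n >= k)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n >= k)
        y.append(acc)
    return y


def perceptual_weight(x, a, p=2, lam1=0.91, lam2=0.3, delta=0.2):
    # Step 101: all-zero spectral tilt filter, reusing a_1..a_p as b_1..b_p.
    flat = filt(weighted(a[:p], delta), [1.0], x)
    # Step 102: conventional weighting A(z/lam1) / A(z/lam2).
    shaped = filt(weighted(a, lam1), weighted(a, lam2), flat)
    # Step 103: output the perceptually weighted signal.
    return shaped


a = [-1.2, 0.5, 0.1]  # hypothetical LPC coefficients (l = 3, so 1 < p < l holds)
y = perceptual_weight([1.0, 0.0, 0.0, 0.0], a)
```

Only the single coefficient list `a` is passed in; the tilt filter needs no coefficients of its own.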
In addition, in practical applications, for example when synthesizing the speech signal in an analysis-by-synthesis method, signals in the perceptual weighting domain must be restored to the non-perceptual weighting domain. This restoration is the inverse of perceptual weighting, so the corresponding transfer function is the reciprocal of expression (9):
$$W''(z)^{-1} = \frac{1 + \sum_{i=1}^{16} a_i \lambda_2^i z^{-i}}{\left(1 + \sum_{i=1}^{16} a_i \lambda_1^i z^{-i}\right)\left(1 + \sum_{j=1}^{p} b_j \delta^j z^{-j}\right)}.$$
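That the restoration is the exact inverse of the weighting can be checked numerically by swapping numerator and denominator. The sketch below is an illustration under the same assumptions as the text's transfer functions (A(z) = 1 + Σ a_i z^{-i}, zero initial state); all names and the sample values are hypothetical.

```python
def weighted(coeffs, g):
    return [1.0] + [c * g ** i for i, c in enumerate(coeffs, start=1)]


def poly_mul(x, y):
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def filt(num, den, x):
    """Direct-form filter with zero initial state; den[0] must be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n >= k)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n >= k)
        y.append(acc)
    return y


def weight(x, a, p, lam1, lam2, delta):
    """Apply W(z) * P'(z): into the perceptual weighting domain."""
    num = poly_mul(weighted(a, lam1), weighted(a[:p], delta))
    return filt(num, weighted(a, lam2), x)


def unweight(y, a, p, lam1, lam2, delta):
    """Apply the reciprocal: back to the non-weighted domain."""
    den = poly_mul(weighted(a, lam1), weighted(a[:p], delta))
    return filt(weighted(a, lam2), den, y)


a = [-1.2, 0.5, 0.1]  # hypothetical LPC coefficients
params = dict(p=2, lam1=0.91, lam2=0.3, delta=0.2)
x = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0]
restored = unweight(weight(x, a, **params), a, **params)
```

Up to floating-point rounding, `restored` equals `x`, using only the coefficients in `a` on both sides.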
The present invention also provides a perceptual weighting filter, shown in FIG. 2, which is designed according to the perceptual weighting filtering method described above and filters the input wideband speech or audio signal. The perceptual weighting filter comprises a spectral tilt filtering unit 201 and a perceptual weighting filtering unit 202.
The spectral tilt filtering unit 201 performs spectral tilt filtering on the input wideband speech or audio signal to flatten its spectrum, and the signal processed by this unit is passed on to the perceptual weighting filtering unit 202.
The perceptual weighting filtering unit 202 adjusts the noise distribution of the wideband speech or audio signal input to it according to the formant structure of the original wideband speech or audio signal, and outputs the filtered wideband speech or audio signal. After the spectrum has been flattened, the perceptual weighting filtering unit 202 can reduce the importance of the formant frequency regions.
The transfer function of the perceptual weighting filtering unit is given by expression (1), the transfer function of the spectral tilt filtering unit by expression (8), and, correspondingly, the transfer function of the perceptual weighting filter by expression (9).
In addition, as in the perceptual weighting filtering method provided by the present invention, part of the coefficients of the transfer function used by the perceptual weighting filtering unit are directly used as the coefficients of the transfer function used by the spectral tilt filtering unit. Let the group of p-order LPC coefficients constituting P′(z) in expression (8) form the set b = {b_j, j = 1, …, p}, and let the group of l-order LPC coefficients constituting A(z) in expression (1) form the set a = {a_i, i = 1, …, l}. Then b ⊂ a and 1 < p < l, and the p elements of b are identical, one to one, to the first p elements of a. For example, if p = 2 and l = 16, the first two coefficients a_1 and a_2 of the set a are taken as the two coefficients b_j, that is, b_1 = a_1 and b_2 = a_2.
The perceptual weighting filter of the present invention, the perceptual weighting filter of prior art two, and the perceptual weighting filter of prior art three (that adopted by AMR-WB+) were each applied in a wideband mixed speech and audio coding system, such as the ACELP/TCX mixed speech and audio coding system of AMR-WB+. The perceptual weighting results obtained for wideband speech and audio signals with the invention and with the prior art are compared below with reference to the drawings. For prior art two, the transfer function of the perceptual weighting filter is expression (4), with parameters p = 2, l = 16, λ_1 = 0.95, λ_2 = 0.8 and δ = 0.7. For prior art three, the transfer function is expression (5), with μ = 0.68. For the present invention, the transfer function used in this embodiment is expression (9), with p = 2 and l = 16, so that the two coefficients b_j equal the first two coefficients a_i, and with λ_1 = 0.91, λ_2 = 0.3 and δ = 0.2.
FIG. 3 shows the experimental results of perceptually weighting an original wideband speech signal with the perceptual weighting filters of prior art two and of the present invention. As can be seen from FIG. 3, in this embodiment, after the original wideband speech signal is filtered by the perceptual weighting filter of the present invention, the resulting quantization noise follows the spectral envelope of the original wideband speech signal well, and the frequency components at the formants mask the quantization noise well. The quantization noise obtained with the perceptual weighting filter of prior art two also follows the spectral envelope of the original wideband speech signal, but its masking effect is clearly worse than that obtained with the improved filter of the present invention. The curves in FIG. 3 show that with prior art two the quantization noise is relatively high and that at several positions the formant frequency components fail to mask the noise effectively, which leaves clear room for improvement. By comparison, the result of this embodiment is clearly better: a larger share of the quantization noise is placed in the formant regions and a smaller share in the valleys between formants, so a better subjective auditory effect is obtained.
FIG. 4 shows the experimental results of perceptually weighting an original wideband audio signal with the perceptual weighting filters of prior art two and of the present invention. The analysis of FIG. 3 above also applies to FIG. 4. Moreover, FIG. 4 shows that in the low frequency band the result of prior art two achieves almost no masking of quantization noise by the formant frequency components, so the subjective auditory effect is not ideal; this experimental result agrees with the conclusion in the Background that prior art two is not suitable for a mixed wideband speech and audio coding system. FIG. 4 makes clear that, in this embodiment, the result of the present invention is better.
FIG. 5 shows the experimental results of perceptually weighting an original wideband speech signal with the perceptual weighting filters of prior art three and of the present invention. As can be seen from FIG. 5, in this embodiment the result of prior art three gives lower quantization noise than the result of the present invention. However, FIG. 5 also exposes the inherent drawback of prior art three noted in the Background: the processing is not adapted to the characteristics of the spectrum currently being coded, so the quantization noise does not follow the formant spectral envelope well. The consequence is reduced coding efficiency, because more coding bits must be allocated to reduce the noise. The quantization noise produced by the present invention follows the spectral envelope of the original wideband speech signal, so coding efficiency is improved by exploiting the masking effect.
FIG. 6 shows the experimental results of perceptually weighting an original wideband audio signal with the perceptual weighting filters of prior art three and of the present invention. The analysis of FIG. 5 also applies to FIG. 6 and is not repeated here.
From the above, when the perceptual weighting method or the perceptual weighting filter provided by the present invention is applied in a speech and/or audio coding/decoding system, especially in a hybrid speech and audio coding/decoding system, the encoding end calculates the coefficients of the transfer function W(z), which it has to calculate in any case. W(z) is the transfer function used for the conventional perceptual weighting filtering of the wideband speech or audio signal, that is, the transfer function of the conventional perceptual weighting filter. Part of these coefficients is then used directly as the coefficients of the transfer function P'(z), which is used for the spectral tilt filtering of the wideband speech or audio signal, that is, the transfer function of the spectral tilt filter. Compared with the prior art, the present invention therefore reduces the computational complexity at the encoding end. The decoding end decodes the coefficients of W(z) and uses part of them directly as the coefficients of P'(z), so the invention can be implemented without any additional transmission bit rate and greatly reduces the algorithm complexity and the amount of computation at the decoding end. Moreover, the experimental results in Figs. 3 to 6 show that the invention matches or even exceeds the perceptual weighting performance of the prior art while reducing the algorithm complexity and the amount of computation of the perceptual weighting filtering stage for wideband speech or audio signals.
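The coefficient reuse described above can be illustrated with a minimal Python sketch. The exact expressions of P'(z) and W(z) appear only as figures in this text, so the forms used here are assumptions: A(z) = 1 + Σ a_i z^(-i), W(z) = A(z/λ_1)/A(z/λ_2), and an all-zero P'(z) = 1 + Σ b_j δ^j z^(-j). The helper names `weight_poly`, `iir_filter` and `perceptual_weighting` are hypothetical; the default parameter values are those stated for AMR-WB and AMR-WB+ in claim 7.

```python
def weight_poly(coeffs, gamma):
    """Return [1, c_1*gamma, c_2*gamma**2, ...] (assumed form of a polynomial in z/gamma)."""
    return [1.0] + [c * gamma ** (k + 1) for k, c in enumerate(coeffs)]


def iir_filter(num, den, x):
    """Filter x with num(z)/den(z), zero initial state; den[0] is assumed to be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


def perceptual_weighting(x, a, p=2, lam1=0.91, lam2=0.3, delta=0.2):
    """Spectral tilt filtering followed by conventional perceptual weighting.

    a holds the l LPC coefficients already computed for W(z); no extra
    analysis is run for P'(z), its coefficients are simply the first p of a.
    """
    b = a[:p]                              # coefficient reuse: b_j = a_j, j = 1..p
    tilt_num = weight_poly(b, delta)       # assumed all-zero P'(z)
    w_num = weight_poly(a, lam1)           # assumed numerator A(z/lam1)
    w_den = weight_poly(a, lam2)           # assumed denominator A(z/lam2)
    tilted = iir_filter(tilt_num, [1.0], x)    # step A: spectral tilt filtering
    return iir_filter(w_num, w_den, tilted)    # step B: conventional weighting


# Illustrative 4th-order coefficients (made up for the example, not from the patent).
a = [-1.2, 0.8, -0.3, 0.1]
impulse_response = perceptual_weighting([1.0, 0.0, 0.0, 0.0], a)
```

Because `b` is a slice of `a`, a decoder that has already decoded the coefficients of W(z) needs no further side information to build P'(z), which is the point made in the paragraph above.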
In addition, the perceptual weighting filter provided by the present invention is suitable for processing wideband speech or audio signals. When it is applied in wideband speech/audio coding systems other than AMR-WB and AMR-WB+, the values of the parameters λ_1, λ_2 and δ can be adjusted to obtain the best perceptual weighting filtering effect.
In practical applications, for example when synthesizing speech signals in analysis-by-synthesis methods, signals in the perceptually weighted domain need to be restored to the non-weighted domain. This restoration is the inverse of perceptual weighting. The present invention therefore provides an inverse perceptual weighting filter based on the above perceptual weighting filter; it implements the inverse of the perceptual weighting filter, and its transfer function is expression (10).
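The round trip between the weighted and non-weighted domains can be sketched as follows. Expression (10) is not reproduced in this text, so the sketch assumes the inverse filter is simply the reciprocal of the combined weighting filter, with the assumed forms A(z) = 1 + Σ a_i z^(-i), W(z) = A(z/λ_1)/A(z/λ_2) and P'(z) = 1 + Σ b_j δ^j z^(-j). All function names are hypothetical, and the inverse is only stable when P'(z) and A(z/λ_1) are minimum phase, which holds for the illustrative coefficients chosen here.

```python
def weight_poly(coeffs, gamma):
    """Return [1, c_1*gamma, c_2*gamma**2, ...] (assumed form of a polynomial in z/gamma)."""
    return [1.0] + [c * gamma ** (k + 1) for k, c in enumerate(coeffs)]


def iir_filter(num, den, x):
    """Filter x with num(z)/den(z), zero initial state; den[0] is assumed to be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


def to_weighted_domain(x, a, p=2, lam1=0.91, lam2=0.3, delta=0.2):
    """Assumed forward filter: P'(z) followed by A(z/lam1)/A(z/lam2)."""
    tilted = iir_filter(weight_poly(a[:p], delta), [1.0], x)
    return iir_filter(weight_poly(a, lam1), weight_poly(a, lam2), tilted)


def from_weighted_domain(y, a, p=2, lam1=0.91, lam2=0.3, delta=0.2):
    """Assumed inverse filter: A(z/lam2)/A(z/lam1) followed by 1/P'(z)."""
    unweighted = iir_filter(weight_poly(a, lam2), weight_poly(a, lam1), y)
    return iir_filter([1.0], weight_poly(a[:p], delta), unweighted)


# Illustrative coefficients and signal (made up for the example, not from the patent).
a = [-1.2, 0.8, -0.3, 0.1]
x = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.3, 0.1]
restored = from_weighted_domain(to_weighted_domain(x, a), a)
max_error = max(abs(r - s) for r, s in zip(restored, x))
```

With zero initial filter states, `restored` matches `x` up to floating-point rounding, which is what "inverse process" means in the paragraph above. The inverse reuses the same l coefficients, so it also needs no parameters beyond those of W(z).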

Claims (11)

1. A perceptual weighting filtering method for performing perceptual weighting filtering on a wideband speech or audio signal, characterized by comprising the following steps:
A. performing spectral tilt filtering on an input wideband speech or audio signal;
B. performing conventional perceptual weighting filtering on the wideband speech or audio signal output by the spectral tilt filtering;
C. outputting the wideband speech or audio signal processed by the conventional perceptual weighting filtering;
wherein part of the coefficients of the transfer function used for the conventional perceptual weighting filtering is used directly as the coefficients of the transfer function used for the spectral tilt filtering.
2. The method according to claim 1, wherein the coefficients of the transfer functions used in the spectral tilt filtering and in the perceptual weighting filtering are linear prediction coding (LPC) coefficients.
3. The method according to claim 2, wherein the transfer function coefficients used in the spectral tilt filtering are a set of p-order LPC coefficients forming the set b = {b_j, j = 1, ..., p}; the transfer function coefficients used in the conventional perceptual weighting filtering are a set of l-order LPC coefficients forming the set a = {a_i, i = 1, ..., l}; then b ⊂ a, 1 < p < l, and the p elements of set b correspond one-to-one to, and are identical to, the first p elements of set a, where b_j are the p-order LPC coefficients and a_i are the l-order LPC coefficients.
4. The method of claim 3, wherein the spectral tilt filtering is performed on the input wideband speech or audio signal using a transfer function P'(z), where P'(z) is expressed as:
Figure A2006101269210002C1
where δ is a weighting coefficient and z is a complex variable; P'(z) is an all-zero transfer function.
5. The method of claim 4, wherein the wideband speech or audio signal output by the spectral tilt filtering is subjected to conventional perceptual weighting filtering using a transfer function W(z), where W(z) is expressed as:
Figure A2006101269210003C1
where λ_1 and λ_2 are weighting coefficients, and 0 < λ_2 < λ_1 ≤ 1.
6. The method of claim 5, wherein the transfer function used in the method is denoted W''(z) and is expressed as:
Figure A2006101269210003C2
7. The method of claim 6, wherein when the method is used in an AMR-WB or AMR-WB+ wideband codec system, the parameters in W''(z) take the following values: l = 16, p = 2, λ_1 = 0.91, λ_2 = 0.3, δ = 0.2; and b_1 = a_1, b_2 = a_2.
8. The method of claim 6, wherein the method transforms a signal in the perceptually weighted domain into the non-weighted domain using a transfer function expressed as:
Figure A2006101269210003C3
9. A perceptual weighting filter for performing perceptual weighting filtering on an input wideband speech or audio signal, comprising: a spectral tilt filtering unit and a perceptual weighting filtering unit;
the spectral tilt filtering unit flattens the spectrum of the input wideband speech or audio signal, and the signal processed by this unit is then passed on to the perceptual weighting filtering unit;
the perceptual weighting filtering unit adjusts the noise distribution of the signal input to it according to the formant structure of the original input wideband speech or audio signal, and outputs the wideband speech or audio signal filtered by this unit;
wherein part of the coefficients of the transfer function used by the perceptual weighting filtering unit is used directly as the coefficients of the transfer function used by the spectral tilt filtering unit.
10. The perceptual weighting filter of claim 9, wherein the spectral tilt filtering unit uses a transfer function P'(z) expressed as:
where δ is a weighting coefficient, z is a complex variable, and b_j are the p-order LPC coefficients;
the transfer function used by the perceptual weighting filtering unit is denoted W(z), and
where λ_1 and λ_2 are weighting coefficients, and 0 < λ_2 < λ_1 ≤ 1;
and, with b = {b_j, j = 1, ..., p} and a = {a_i, i = 1, ..., l}, then b ⊂ a, 1 < p < l, and the p elements of set b correspond one-to-one to, and are identical to, the first p elements of set a.
11. The perceptual weighting filter of claim 9 or 10, wherein the perceptual weighting filter has a transfer function W''(z) expressed as:
Figure A2006101269210004C3
CNB2006101269218A 2006-09-06 2006-09-06 Perception weighting filtering wave method and perception weighting filter thererof Active CN100487789C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101269218A CN100487789C (en) 2006-09-06 2006-09-06 Perception weighting filtering wave method and perception weighting filter thererof

Publications (2)

Publication Number Publication Date
CN101140758A true CN101140758A (en) 2008-03-12
CN100487789C CN100487789C (en) 2009-05-13

Family

ID=39192679

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101269218A Active CN100487789C (en) 2006-09-06 2006-09-06 Perception weighting filtering wave method and perception weighting filter thererof

Country Status (1)

Country Link
CN (1) CN100487789C (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010127616A1 (en) * 2009-05-05 2010-11-11 Huawei Technologies Co., Ltd. System and method for frequency domain audio post-processing based on perceptual masking
US8391212B2 (en) 2009-05-05 2013-03-05 Huawei Technologies Co., Ltd. System and method for frequency domain audio post-processing based on perceptual masking
CN102054482B (en) * 2009-10-27 2012-11-28 中国移动通信集团公司 Method and device for enhancing voice signal
CN104703093A (en) * 2013-12-09 2015-06-10 中国移动通信集团公司 Audio output method and device
CN104703093B (en) * 2013-12-09 2018-07-17 中国移动通信集团公司 A kind of audio-frequency inputting method and device
CN106575508A (en) * 2014-06-10 2017-04-19 瑞内特有限公司 Digital encapsulation of audio signals
CN109478407A (en) * 2016-03-15 2019-03-15 弗劳恩霍夫应用研究促进协会 Decoding apparatus for handling the code device of input signal and for handling the signal after encoding
CN109478407B (en) * 2016-03-15 2023-11-03 弗劳恩霍夫应用研究促进协会 Encoding device for processing an input signal and decoding device for processing an encoded signal

Also Published As

Publication number Publication date
CN100487789C (en) 2009-05-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant