CN108630212B - Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension - Google Patents
Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension
- Publication number
- CN108630212B, CN201810290508.8A
- Authority
- CN
- China
- Prior art keywords
- excitation signal
- frequency excitation
- energy
- sub
- frequency
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
Abstract
The invention relates to a perceptual reconstruction method and device for a high-frequency excitation signal in non-blind bandwidth extension. First, a perceptual feature parameter of the original high-frequency excitation signal is calculated, and the source for the reconstructed high-frequency excitation signal is selected according to statistical thresholds on that parameter. Next, the sub-band energies of the original high-frequency excitation signal are calculated in the MDCT domain and used to adjust the energy of the selected source. Finally, perceptual recovery is applied to the energy-adjusted high-frequency excitation signal, guided by the perceptual feature parameter. By adding only the perceptual feature parameter, the invention significantly improves the reconstructed sound quality of the high-frequency excitation signal.
Description
Technical Field
The invention relates to the technical field of audio coding and decoding, in particular to a perceptual reconstruction method and a perceptual reconstruction device for a high-frequency excitation signal in non-blind bandwidth extension.
Background
Global mobile communication is developing rapidly, and the 5G era is approaching. As informatization spreads worldwide, the volume of data in the communication field keeps growing, and voice and audio data account for a large share of it. Compressing voice and audio data to reduce storage and transmission costs has therefore become a common global challenge.
Non-blind audio bandwidth extension is a standard technique in modern audio coding. It exploits the correlation between the high-frequency and low-frequency parts of a signal: the encoder transmits only a small number of high-frequency parameters, and the decoder reconstructs the high band by copying the low-frequency signal and adjusting it with those parameters. Because it delivers good sound quality at very low bit rates, non-blind bandwidth extension has been adopted by most mainstream standards organizations. Existing non-blind bandwidth extension methods generally synthesize the high-frequency signal with a source-filter model, in which a linear prediction synthesis filter combines a high-frequency excitation signal with the high-frequency linear prediction coefficients (LPC). The LPC cost few bits to code, so they are generally transmitted directly to the decoding end as parameters. The high-frequency excitation signal costs many more bits and is correlated to some degree with the low band, so it is generally replaced by the low-frequency excitation signal available at the decoding end. This approach gives good coding quality when the high- and low-frequency excitation signals are strongly correlated, but the quality drops sharply when the correlation is weak. The present invention adds a perceptual feature parameter of the high-frequency excitation signal and uses a perceptual recovery method to reconstruct the high-frequency excitation signal more accurately, thereby improving the reconstructed sound quality of non-blind bandwidth extension.
The invention therefore provides a method and a device for perceptual reconstruction of a high-frequency excitation signal in non-blind bandwidth extension, improving the reconstructed sound quality of the high-frequency signal at the cost of only a small increase in coding bit rate.
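As a rough illustration of the source-filter model described in the Background, the sketch below pairs a linear prediction analysis filter (signal to excitation) with the matching synthesis filter (excitation back to signal). The function names, the sign convention A(z) = 1 + sum of a_k z^-k, and the example coefficients are assumptions made for illustration; the patent text does not specify them.

```python
def lpc_residual(signal, lpc):
    """Analysis filter A(z): e[n] = s[n] + sum_k lpc[k] * s[n-1-k].

    `lpc` holds a_1..a_p under the assumed convention A(z) = 1 + sum a_k z^-k.
    """
    residual = []
    for n, sample in enumerate(signal):
        acc = sample
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * signal[n - 1 - k]
        residual.append(acc)
    return residual


def lpc_synthesize(excitation, lpc):
    """Synthesis filter 1/A(z): s[n] = e[n] - sum_k lpc[k] * s[n-1-k]."""
    output = []
    for n, sample in enumerate(excitation):
        acc = sample
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * output[n - 1 - k]
        output.append(acc)
    return output


if __name__ == "__main__":
    # Hypothetical second-order coefficients, chosen only for the demo.
    coeffs = [-0.9, 0.2]
    original = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
    excitation = lpc_residual(original, coeffs)
    rebuilt = lpc_synthesize(excitation, coeffs)
    print(max(abs(a - b) for a, b in zip(original, rebuilt)))
```

In a bandwidth extension decoder the excitation fed to the synthesis filter is not the true residual but a substitute, which is why the choice and shaping of that substitute matters for quality.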
Disclosure of Invention
The invention aims to provide a method and a device for perceptual reconstruction of a high-frequency excitation signal in non-blind bandwidth extension that significantly improve the coded sound quality at the cost of only a small increase in coding bit rate.
In order to achieve the above object, the present invention provides a perceptual reconstruction apparatus for high-frequency excitation signals, which includes a high-frequency excitation signal perceptual feature calculation module, a signal source selection module, a high-frequency excitation signal subband energy calculation module, an energy adjustment module, and an excitation signal perceptual recovery module.
The high-frequency excitation signal perception feature calculation module is used for calculating a perception feature parameter HFEPF of an original high-frequency excitation signal and outputting the parameter to the signal source selection module and the excitation signal perception recovery module;
the signal source selection module is used for selecting a signal source for reconstructing the high-frequency excitation signal, in the selection process, the excitation signal source is selected according to different thresholds of a perception characteristic parameter HFEPF of the original high-frequency excitation signal calculated by the high-frequency excitation signal perception characteristic calculation module, the excitation signal source comprises a random noise signal, a low-frequency excitation signal and a low-frequency whitening excitation signal, and the selected excitation signal source is output to the energy adjustment module;
the high-frequency excitation signal subband energy calculating module is used for calculating subband energy HFESE in the frequency domain of the original high-frequency excitation signal, and the subband energy is output to the energy adjusting module and used for adjusting the excitation signal source obtained in the signal source selecting module;
the energy adjusting module is used for adjusting the energy of the excitation signal source obtained by the signal source selecting module, the parameters required by the energy adjustment are derived from subband energy HFESE output by the high-frequency excitation signal subband energy calculating module, and the excitation signal after the energy adjustment is output to the excitation signal perception restoring module;
and the excitation signal perception recovery module is used for performing perceptual recovery on the energy-adjusted excitation signal obtained by the energy adjustment module; the recovery is guided by the perceptual feature parameter HFEPF output by the high-frequency excitation signal perception feature calculation module.
The invention also provides a perception reconstruction method of the high-frequency excitation signal in the non-blind bandwidth extension, which comprises the following steps:
Step 1: Perform linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T. Apply the modified discrete cosine transform (MDCT) to Exci_sig_T to obtain the MDCT-domain high-frequency excitation signal Exci_sig_F. Then calculate the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, and take the logarithm of the ratio of SFM to SKF as the perceptual feature parameter HFEPF of the original high-frequency excitation signal;
Step 2: Select the high-frequency excitation signal source according to thresholds on the HFEPF calculated in step 1: if HFEPF is less than α1, select the low-frequency excitation signal provided by the decoding end; if HFEPF is greater than or equal to α1 and less than α2, select the low-frequency whitened excitation signal provided by the decoding end; if HFEPF is greater than or equal to α2, select a random noise signal. The selected source is denoted Exci_sig_source, where 0 < α1 < α2 and the threshold values are obtained from statistics over a large amount of data;
Step 3: Calculate the sub-band energies of the original MDCT-domain high-frequency excitation signal Exci_sig_F from step 1. The signal is divided non-uniformly into four sub-bands: for a frame of length N, the first sub-band has length N/8, the second N/8, the third N/4, and the fourth N/2. The energy of each sub-band is then calculated, giving four sub-band energy parameters Sub_Energy_ori(i), where i = 1, 2, 3, 4;
Step 4: Adjust the energy of the high-frequency excitation signal source Exci_sig_source obtained in step 2. First calculate the sub-band energy parameters Sub_Energy_source(i), i = 1, 2, 3, 4, of the source using the same method as in step 3. Then calculate the ratio Sub_Energy_r(i) of Sub_Energy_ori(i) from step 3 to Sub_Energy_source(i), where i is the same sub-band index in both. Finally, multiply the signal in each sub-band of the source by the corresponding Sub_Energy_r(i) to obtain the energy-adjusted high-frequency excitation signal Exci_Sig_adjust;
Step 5: Perform perceptual recovery on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, as follows: set a spectral peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonic factor β (0 ≤ β ≤ 10); scale the spectral coefficients by σ and add comfort noise weighted by β; and control the values of σ and β with the perceptual feature parameter HFEPF of the high-frequency excitation signal to obtain the final high-frequency excitation signal Exci_Sig_Final.
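Step 1 of the method relies on the MDCT to move the excitation signal into the frequency domain. The sketch below evaluates the standard MDCT definition directly, mapping 2N time samples to N coefficients. It is an assumption-laden illustration: the patent text does not state the window, the overlap, or how a frame of length N relates to the transform input, so no window is applied and the function name `mdct` is hypothetical.

```python
import math


def mdct(frame):
    """Direct-form MDCT: 2N input samples -> N coefficients (no windowing).

    X[k] = sum_{n=0}^{2N-1} x[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))
    Cost is O(N^2); practical codecs use an FFT-based fast algorithm instead.
    """
    total = len(frame)
    if total == 0 or total % 2:
        raise ValueError("MDCT input length must be a positive even number")
    half = total // 2
    coeffs = []
    for k in range(half):
        acc = 0.0
        for n, sample in enumerate(frame):
            angle = math.pi / half * (n + 0.5 + half / 2) * (k + 0.5)
            acc += sample * math.cos(angle)
        coeffs.append(acc)
    return coeffs


if __name__ == "__main__":
    # Hypothetical 8-sample block standing in for Exci_sig_T.
    exci_sig_t = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    exci_sig_f = mdct(exci_sig_t)
    print(len(exci_sig_f))
```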
Compared with the related art, the perceptual reconstruction method and device for a high-frequency excitation signal in non-blind bandwidth extension have the following beneficial effects: by adding a perceptual feature parameter of the high-frequency excitation signal and applying perceptual recovery, the invention reconstructs the high-frequency excitation signal more accurately and improves the reconstructed sound quality of non-blind bandwidth extension. The perceptual feature parameter is based on the ratio of the spectral flatness to the peak factor of the spectral envelope, and therefore better reflects the characteristics of auditory perception.
The abbreviations used in the above technical scheme are as follows:
HFEPF: High Frequency Excitation Perception Feature (perceptual feature parameter of the high-frequency excitation signal)
HFESE: High Frequency Excitation Sub-band Energy (sub-band energy of the original high-frequency excitation signal in the frequency domain)
LPC: Linear Prediction Coefficient
Exci_sig_T: Excitation Signal on Time Domain
MDCT: Modified Discrete Cosine Transform
Exci_sig_F: Excitation Signal on Frequency Domain
SFM: Spectral Flatness Measurement
SKF: Spectral Kurtosis Factor
Exci_sig_source: Excitation Signal Source
Sub_Energy_ori: Original Sub-band Energy
Sub_Energy_source: Source Sub-band Energy (sub-band energy of the excitation signal source)
Sub_Ener_r: Sub-band Energy Ratio
Exci_Sig_adjust: Adjusted Excitation Signal (excitation signal after energy adjustment)
Exci_Sig_perception: Excitation Signal Perception (excitation signal after perceptual recovery)
Exci_Sig_Final: Final Excitation Signal (final high-frequency excitation signal)
Drawings
Fig. 1 is a block diagram of a perceptual reconstruction apparatus for a high-frequency excitation signal in non-blind bandwidth extension according to the present invention;
Fig. 2 is a flowchart of a perceptual reconstruction method for a high-frequency excitation signal in non-blind bandwidth extension according to the present invention;
Fig. 3 is a flowchart of perceptual recovery in the perceptual reconstruction method for a high-frequency excitation signal in non-blind bandwidth extension according to the present invention.
Detailed Description
The technical method of the invention is further explained by the following concrete embodiments in combination with the attached drawings:
Referring to Fig. 1, the perceptual reconstruction device for a high-frequency excitation signal in non-blind bandwidth extension according to an embodiment of the present invention includes a high-frequency excitation signal perception feature calculation module 1, a signal source selection module 2, a high-frequency excitation signal subband energy calculation module 3, an energy adjustment module 4, and an excitation signal perception recovery module 5. Each module may be implemented using software-solidification (firmware) technology.
The high-frequency excitation signal perception feature calculation module 1 is configured to calculate the perceptual feature parameter HFEPF of the original high-frequency excitation signal. It performs linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T, applies the MDCT to Exci_sig_T to obtain the frequency-domain high-frequency excitation signal Exci_sig_F, calculates the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, takes the logarithm of the ratio of SFM to SKF as the HFEPF, and outputs the HFEPF to the signal source selection module 2 and the excitation signal perception recovery module 5;
the signal source selection module 2 is configured to select the signal source for reconstructing the high-frequency excitation signal. The source is chosen from a random noise signal, a low-frequency excitation signal, and a low-frequency whitened excitation signal according to thresholds on the HFEPF calculated by module 1: if HFEPF is less than α1, the low-frequency excitation signal provided by the decoding end is selected; if HFEPF is greater than or equal to α1 and less than α2, the low-frequency whitened excitation signal provided by the decoding end is selected; if HFEPF is greater than or equal to α2, a random noise signal is selected. Here 0 < α1 < α2, and the threshold values are obtained from statistics over a large amount of data. The selected excitation signal source, denoted Exci_sig_source, is output to the energy adjustment module 4;
the high-frequency excitation signal subband energy calculation module 3 is configured to calculate the sub-band energy HFESE of the original high-frequency excitation signal in the frequency domain; the sub-band energy is output to the energy adjustment module 4 and used to adjust the excitation signal source Exci_sig_source obtained by the signal source selection module 2;
the energy adjustment module 4 is configured to adjust the energy of the excitation signal source Exci_sig_source obtained by the signal source selection module 2, using the sub-band energy HFESE output by the high-frequency excitation signal subband energy calculation module 3, and to output the energy-adjusted excitation signal Exci_Sig_adjust to the excitation signal perception recovery module 5;
the excitation signal perception recovery module 5 is configured to perform perceptual recovery on the energy-adjusted excitation signal Exci_Sig_adjust obtained by the energy adjustment module 4; the recovery is guided by the perceptual feature parameter HFEPF from the high-frequency excitation signal perception feature calculation module 1 and finally yields the reconstructed high-frequency excitation signal Exci_Sig_Final.
Referring to Fig. 2, the perceptual reconstruction method for a high-frequency excitation signal in non-blind bandwidth extension according to the embodiment of the present invention may be run automatically using computer software, and specifically includes the following steps:
Step 1: Perform linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T. Apply the modified discrete cosine transform (MDCT) to Exci_sig_T to obtain the MDCT-domain high-frequency excitation signal Exci_sig_F. Then calculate the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, and take the logarithm of the ratio of SFM to SKF as the perceptual feature parameter HFEPF of the original high-frequency excitation signal. The HFEPF is calculated as follows:
Step 1.1: Calculate the spectral flatness SFM.
In the formula, n denotes the frame index and N denotes the frame length.
Step 1.2: Calculate the spectral kurtosis factor SKF.
In the formula, n denotes the frame index and N denotes the frame length.
Step 1.3: Calculate the perceptual feature parameter HFEPF of the high-frequency excitation signal as the logarithm of the ratio of SFM to SKF.
In the formula, n denotes the frame index.
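The formulas for steps 1.1 to 1.3 are not reproduced in this text, so the following is only a sketch under stated assumptions. It takes SFM as the ratio of the geometric mean to the arithmetic mean of the power spectrum, SKF as the normalized fourth moment of the coefficients, and a natural logarithm for HFEPF. All three choices, the function names, and the small `eps` guard are assumptions. Under these definitions HFEPF is never positive, so the thresholds 6.5 and 14.5 quoted in step 2 evidently belong to a different scaling and do not apply to this sketch.

```python
import math


def spectral_flatness(coeffs, eps=1e-12):
    """Assumed SFM: geometric mean / arithmetic mean of the power spectrum."""
    power = [c * c + eps for c in coeffs]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def spectral_kurtosis(coeffs, eps=1e-12):
    """Assumed SKF: fourth moment over squared second moment of the coefficients."""
    count = len(coeffs)
    m2 = sum(c ** 2 for c in coeffs) / count
    m4 = sum(c ** 4 for c in coeffs) / count
    return (m4 + eps) / (m2 * m2 + eps)


def hfepf(coeffs):
    """Logarithm of SFM / SKF, as described in step 1.3 (log base assumed natural)."""
    return math.log(spectral_flatness(coeffs) / spectral_kurtosis(coeffs))


if __name__ == "__main__":
    flat = [1.0, -1.0, 1.0, -1.0]   # noise-like: flat magnitude spectrum
    peaky = [10.0, 0.0, 0.0, 0.0]   # tonal: one dominant coefficient
    print(hfepf(flat), hfepf(peaky))
```

A flat, noise-like spectrum yields a higher value than a peaky, tonal one, which matches the direction of the source selection rule: larger HFEPF favours the random noise source.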
Step 2: Select the high-frequency excitation signal source according to thresholds on the HFEPF calculated in step 1. First, the perceptual feature parameter HFEPF of the high-frequency excitation signal is calculated for a large number of audio signals of different types (e.g., speech, music, natural sounds), and the thresholds α1 and α2 are determined from the statistical distribution of the HFEPF values; in this embodiment, α1 = 6.5 and α2 = 14.5. The source is then selected by comparing HFEPF with α1 and α2: if HFEPF < α1, the low-frequency excitation signal provided by the decoding end is selected; if α1 ≤ HFEPF < α2, the low-frequency whitened excitation signal provided by the decoding end is selected; if HFEPF ≥ α2, a random noise signal is selected, where the random noise signal is the average of a large number of high-frequency excitation signals of different types. The selected high-frequency excitation signal source is denoted Exci_sig_source.
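The threshold rule of step 2 can be sketched as a small selection function. The thresholds 6.5 and 14.5 are the values given for this embodiment; the function name and the returned labels are hypothetical and only stand in for the three candidate sources.

```python
ALPHA_1 = 6.5   # embodiment value of alpha1 from the text
ALPHA_2 = 14.5  # embodiment value of alpha2 from the text


def select_source_kind(hfepf_value, alpha_1=ALPHA_1, alpha_2=ALPHA_2):
    """Return a label for the excitation source chosen by the HFEPF thresholds."""
    if not 0 < alpha_1 < alpha_2:
        raise ValueError("thresholds must satisfy 0 < alpha_1 < alpha_2")
    if hfepf_value < alpha_1:
        return "low_frequency_excitation"
    if hfepf_value < alpha_2:
        return "low_frequency_whitened_excitation"
    return "random_noise"


if __name__ == "__main__":
    for value in (3.0, 6.5, 10.0, 14.5, 20.0):
        print(value, select_source_kind(value))
```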
Step 3: Calculate the sub-band energies of the original MDCT-domain high-frequency excitation signal Exci_sig_F from step 1. The signal is divided non-uniformly into four sub-bands: for a frame of length N, the first sub-band has length N/8, the second N/8, the third N/4, and the fourth N/2. The energy of each sub-band is then calculated, giving four sub-band energy parameters Sub_Energy_ori. The calculation formula is as follows:
In the formula, N denotes the frame length.
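The non-uniform split of step 3 can be sketched as follows. The band boundaries follow the N/8, N/8, N/4, N/2 lengths given in the text; the energy measure is assumed to be the sum of squared coefficients, since the formula itself is not reproduced here, and the function names are hypothetical.

```python
def subband_edges(frame_len):
    """Half-open index ranges of the four sub-bands: N/8, N/8, N/4, N/2 long."""
    if frame_len <= 0 or frame_len % 8:
        raise ValueError("frame length must be a positive multiple of 8")
    n = frame_len
    return [(0, n // 8), (n // 8, n // 4), (n // 4, n // 2), (n // 2, n)]


def subband_energies(coeffs):
    """Assumed energy measure: sum of squared MDCT coefficients per sub-band."""
    return [sum(c * c for c in coeffs[lo:hi])
            for lo, hi in subband_edges(len(coeffs))]


if __name__ == "__main__":
    exci_sig_f = [1.0] * 16              # hypothetical frame, N = 16
    print(subband_edges(16))
    print(subband_energies(exci_sig_f))  # plays the role of Sub_Energy_ori(1..4)
```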
Step 4: Adjust the energy of the high-frequency excitation signal source Exci_sig_source obtained in step 2. First calculate the sub-band energy parameters Sub_Energy_source(i) of Exci_sig_source using the same method as in step 3. Then calculate the ratio Sub_Energy_r(i) of Sub_Energy_ori(i) from step 3 to Sub_Energy_source(i). Finally, multiply the signal in each sub-band of Exci_sig_source by the corresponding Sub_Energy_r(i) to obtain the energy-adjusted high-frequency excitation signal Exci_Sig_adjust. The energy ratio Sub_Energy_r(i) is calculated as follows:
where i denotes the sub-band index, i = 1, 2, 3, 4.
The energy adjustment is calculated as follows:
Exci_Sig_adjust(1:N/8) = Exci_Sig_source(1:N/8) * Sub_Ener_r1,
Exci_Sig_adjust(N/8+1:N/4) = Exci_Sig_source(N/8+1:N/4) * Sub_Ener_r2,
Exci_Sig_adjust(N/4+1:N/2) = Exci_Sig_source(N/4+1:N/2) * Sub_Ener_r3,
Exci_Sig_adjust(N/2+1:N) = Exci_Sig_source(N/2+1:N) * Sub_Ener_r4,
where (a:b) denotes the sample points from a to b, and N denotes the frame length.
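The per-band scaling of step 4 can be sketched as below. Because the formula for the energy ratio is not reproduced in this text, the sketch makes two assumptions: energy is the sum of squared coefficients, and the amplitude gain applied to each band is the square root of the energy ratio, so that the adjusted band energy equals the original one. If the ratio in the source is already defined on amplitudes, the square root would be dropped. Function names and the `eps` guard are hypothetical.

```python
import math


def subband_edges(frame_len):
    """Half-open index ranges of the four sub-bands: N/8, N/8, N/4, N/2 long."""
    if frame_len <= 0 or frame_len % 8:
        raise ValueError("frame length must be a positive multiple of 8")
    n = frame_len
    return [(0, n // 8), (n // 8, n // 4), (n // 4, n // 2), (n // 2, n)]


def band_energy(coeffs, lo, hi):
    """Assumed energy measure: sum of squares over the half-open range [lo, hi)."""
    return sum(c * c for c in coeffs[lo:hi])


def adjust_energy(source, original, eps=1e-12):
    """Scale each sub-band of `source` so its energy matches that of `original`."""
    if len(source) != len(original):
        raise ValueError("source and original must have the same length")
    adjusted = list(source)
    for lo, hi in subband_edges(len(source)):
        ratio = band_energy(original, lo, hi) / (band_energy(source, lo, hi) + eps)
        gain = math.sqrt(ratio)  # assumption: amplitude gain = sqrt(energy ratio)
        for i in range(lo, hi):
            adjusted[i] = source[i] * gain
    return adjusted


if __name__ == "__main__":
    exci_sig_f = [2.0] * 8                                          # original
    exci_sig_source = [1.0, -1.0, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0]     # chosen source
    print(adjust_energy(exci_sig_source, exci_sig_f))
```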
Step 5: Perform perceptual recovery on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, as follows: set a spectral peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonic factor β (0 ≤ β ≤ 10); scale the spectral coefficients by σ and add comfort noise weighted by β; and control the values of σ and β with the perceptual feature parameter HFEPF of the high-frequency excitation signal to obtain the final high-frequency excitation signal Exci_Sig_Final. The specific steps are as follows (as shown in Fig. 3):
Step 5.1: Set the initial values σ0 = 0 and β0 = 0, the step size d1 = 0.3 for σ and the step size d2 = 1.0 for β, the perceptual error threshold delta = 0.1, and the recorded perceptual error E1 of the perceptually recovered excitation signal to 1000;
Step 5.2: Calculate the perceptual recovery signal Exci_Sig_perception:
Exci_Sig_perception(n,i) = σ0_n · Exci_Sig_adjust(n,i) + β0_n · Rand(n,i),
where Rand(·) denotes the generation of random noise, n denotes the frame index, and i indexes the sample points of the frame.
Step 5.3: Calculate the perceptual error E0.
First, calculate the perceptual feature parameter HFEPF_n' of the perceptual recovery signal Exci_Sig_perception using the method of step 1. Then take the difference between HFEPF_n' and the perceptual feature parameter HFEPF_n of the original high-frequency excitation signal obtained in step 1 as the perceptual error E0:
E0 = |HFEPF_n - HFEPF_n'|.
Step 5.4: Determine whether E0 is smaller than the perceptual error threshold delta. If not, go to step 5.5; otherwise, end step 5 and take the perceptual recovery signal Exci_Sig_perception as the final high-frequency excitation signal Exci_Sig_Final;
Step 5.5: Determine whether the perceptual error E0 is smaller than E1. If so, go to step 5.6; otherwise, go to step 5.7;
Step 5.6: Record the perceptual error E0 in E1, the current value of σ0 in σ1, and the current value of β0 in β1;
Step 5.7: Determine whether σ0 ≤ 3 and β0 ≤ 10. If so, go to step 5.8; otherwise, go to step 5.9;
Step 5.8: Increase σ0 and β0 by the step sizes d1 and d2, respectively, and return to step 5.2;
Step 5.9: Recalculate the perceptual recovery signal using the σ1 and β1 recorded in step 5.6, according to:
Exci_Sig_perception(n,i) = σ1_n · Exci_Sig_adjust(n,i) + β1_n · Rand(n,i),
where n denotes the frame index and i indexes the sample points of the frame. The resulting Exci_Sig_perception is taken as the final high-frequency excitation signal Exci_Sig_Final.
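The search in steps 5.1 to 5.9 can be sketched as a loop that steps σ0 and β0 together, stops early when the perceptual error falls below delta, and otherwise falls back to the best pair seen. Several details are assumptions rather than the patent's definitions: the HFEPF feature (geometric/arithmetic mean flatness over a normalized fourth moment, natural log), uniform noise drawn once per frame for Rand, a fixed seed, and an integer step counter that keeps σ0 and β0 within their stated ranges. Function and parameter names are hypothetical.

```python
import math
import random


def hfepf(coeffs, eps=1e-12):
    """Assumed feature: log(SFM / SKF) computed on the power spectrum."""
    count = len(coeffs)
    power = [c * c + eps for c in coeffs]
    m2 = sum(power) / count
    m4 = sum(p * p for p in power) / count
    flatness = math.exp(sum(math.log(p) for p in power) / count) / m2
    return math.log(flatness * m2 * m2 / m4)  # SKF = m4 / m2^2


def perceptual_recovery(adjusted, target_hfepf, delta=0.1,
                        d1=0.3, d2=1.0, steps=10, seed=0):
    """Return (signal, sigma, beta, error) following steps 5.1 to 5.9."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in adjusted]  # Rand(n, i), assumed uniform

    def mix(sigma, beta):
        return [sigma * x + beta * r for x, r in zip(adjusted, noise)]

    best_err, best_sigma, best_beta = 1000.0, 0.0, 0.0  # E1, sigma1, beta1
    for k in range(steps + 1):            # sigma0 = 0..3, beta0 = 0..10
        sigma, beta = k * d1, k * d2      # step 5.8 (joint update)
        candidate = mix(sigma, beta)      # step 5.2
        err = abs(target_hfepf - hfepf(candidate))  # step 5.3, E0
        if err < delta:                   # step 5.4: good enough, stop
            return candidate, sigma, beta, err
        if err < best_err:                # steps 5.5 and 5.6: remember the best
            best_err, best_sigma, best_beta = err, sigma, beta
    # step 5.9: rebuild the signal from the best recorded pair
    return mix(best_sigma, best_beta), best_sigma, best_beta, best_err


if __name__ == "__main__":
    exci_sig_adjust = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25, 2.0, -0.5]
    final, sigma, beta, err = perceptual_recovery(exci_sig_adjust, target_hfepf=-2.0)
    print(sigma, beta, err)
```

With the scale-invariant feature assumed here, stepping σ0 and β0 together keeps their ratio fixed at d1:d2 after the first step, so later iterations change little; the feature actually defined in the patent may behave differently.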
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (4)
1. A perception reconstruction method of a high-frequency excitation signal in non-blind bandwidth extension adopts a perception reconstruction device comprising a high-frequency excitation signal perception characteristic calculation module, a signal source selection module, a high-frequency excitation signal sub-band energy calculation module, an energy adjustment module and an excitation signal perception recovery module to carry out perception reconstruction, and is characterized by comprising the following steps:
step 1, performing linear prediction analysis on an original high-frequency signal to obtain a high-frequency linear prediction coefficient LPC and a time domain high-frequency excitation signal Exci _ sig _ T, performing modified discrete cosine transform (MDCT transform) on the high-frequency excitation signal Exci _ sig _ T to obtain a high-frequency excitation signal Exci _ sig _ F in an MDCT domain, then calculating the spectral flatness SFM and the spectral peak state factor SKF of the high-frequency excitation signal Exci _ sig _ F in the MDCT domain through a high-frequency excitation signal perceptual feature calculation module, and taking the logarithm of the ratio of the SFM to the SKF as a perceptual feature parameter HFEPF of the original high-frequency excitation signal;
step 2, according to different threshold values of the HFEPF calculated in the step 1, selecting different high-frequency excitation signal sources through the signal source selection module, if the HFEPF is less than α 1, selecting a low-frequency excitation signal provided by a decoding end, if the HFEPF is greater than or equal to α 1 and less than α 2, selecting a low-frequency whitening excitation signal provided by the decoding end, if the HFEPF is greater than or equal to α 2, selecting a random noise signal, wherein the selected high-frequency excitation signal source is represented as an exi _ sig _ source, wherein 0< α 1< α 2, and the value is obtained by a large amount of data statistics;
step 3, calculating the Sub-band Energy of the original high-frequency excitation signal inci _ sig _ F in the MDCT domain in step 1 by using the high-frequency excitation signal Sub-band Energy calculation module, wherein the Sub-band division method adopts non-uniform division and totally divides the original high-frequency excitation signal inci _ sig _ F into four Sub-bands, if the length of a frame is N, the length of a first Sub-band is 1/8N, the length of a second Sub-band is 1/8N, the length of a third Sub-band is 1/4N, and the length of a fourth Sub-band is 1/2N, and then calculating the Energy of each Sub-band to obtain four Sub-band Energy parameters Sub _ Energy _ ori (i), wherein i is 1, 2, 3 and 4;
step 4, adjusting the Energy of the high-frequency excitation signal source, namely, the external _ Sig _ source obtained in the step 2 through the Energy adjustment module, wherein in the adjustment process, a Sub-band Energy parameter Sub _ Energy _ source (i) of the high-frequency excitation signal source, namely, 1, 2, 3, 4 is firstly calculated, the calculation method is the same as that in the step 3, then, a ratio Sub _ Energy _ r (i) of the Sub-band _ Energy _ ori (i) and the Sub-band _ Energy _ source (i) obtained in the step 3 is calculated, and the signal on each Sub-band of the external _ Sig _ source is multiplied by the corresponding value in the Sub _ Energy _ r (i) to obtain an Energy-adjusted high-frequency excitation signal, namely, the external _ Sig _ j;
step 5, performing perception recovery, through the excitation signal perception recovery module, on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, wherein the perception recovery process is as follows: setting a spectrum peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonic factor β (0 ≤ β ≤ 10), expanding or contracting the spectral coefficients by using the spectrum peak expansion factor σ, performing comfort noise harmonization by using the noise harmonic factor β, and controlling the values of σ and β by the perceptual characteristic parameter HFEPF of the high-frequency excitation signal, to obtain the final high-frequency excitation signal Exci_Sig_Final.
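The threshold rule of step 2 can be illustrated with a minimal sketch. The function name, the argument names, the use of Gaussian noise for the random noise signal, and the threshold values in the usage note are all assumptions made for illustration; the claim only states that 0 < α1 < α2 and that the values come from data statistics.

```python
import random


def select_excitation_source(hfepf, low_exc, low_exc_whitened,
                             alpha1, alpha2, rng=None):
    """Return the excitation source for one frame, following step 2.

    hfepf            -- perceptual characteristic parameter of the frame
    low_exc          -- low-frequency excitation signal from the decoder
    low_exc_whitened -- low-frequency whitened excitation signal
    alpha1, alpha2   -- thresholds with 0 < alpha1 < alpha2
    rng              -- optional random.Random instance (hypothetical
                        parameter, only here to make the noise repeatable)
    """
    if not 0 < alpha1 < alpha2:
        raise ValueError("thresholds must satisfy 0 < alpha1 < alpha2")

    if hfepf < alpha1:
        # HFEPF below alpha1: reuse the low-frequency excitation.
        return list(low_exc)
    if hfepf < alpha2:
        # alpha1 <= HFEPF < alpha2: use the whitened low-frequency excitation.
        return list(low_exc_whitened)

    # HFEPF >= alpha2: use random noise. The claim does not specify the
    # noise distribution; unit-variance Gaussian noise is an assumption.
    if rng is None:
        rng = random.Random()
    return [rng.gauss(0.0, 1.0) for _ in range(len(low_exc))]
```

For example, with placeholder thresholds `alpha1=0.5` and `alpha2=1.5`, a frame with `hfepf=0.2` would reuse `low_exc`, a frame with `hfepf=0.5` would use `low_exc_whitened`, and a frame with `hfepf=1.5` would be filled with noise.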
2. The perceptual reconstruction method of a high-frequency excitation signal in non-blind bandwidth extension according to claim 1, wherein the high-frequency excitation signal perceptual feature calculation module is configured to calculate a perceptual feature parameter HFEPF of an original high-frequency excitation signal and output the parameter to the signal source selection module and the excitation signal perceptual restoration module;
the signal source selection module is used for selecting a signal source for reconstructing the high-frequency excitation signal, wherein the excitation signal source is selected by comparing the perceptual characteristic parameter HFEPF of the original high-frequency excitation signal, calculated by the high-frequency excitation signal perceptual feature calculation module, with thresholds; the candidate excitation signal sources comprise a random noise signal, a low-frequency excitation signal and a low-frequency whitened excitation signal; and the selected excitation signal source is output to the energy adjustment module;
the high-frequency excitation signal subband energy calculating module is used for calculating subband energy HFESE in the frequency domain of the original high-frequency excitation signal, and the subband energy is output to the energy adjusting module and used for adjusting the excitation signal source obtained in the signal source selecting module;
the energy adjusting module is used for adjusting the energy of the excitation signal source obtained by the signal source selecting module, the parameters required by the energy adjustment are derived from subband energy HFESE output by the high-frequency excitation signal subband energy calculating module, and the excitation signal after the energy adjustment is output to the excitation signal perception restoring module;
and the excitation signal perception recovery module is used for performing perception recovery on the energy-adjusted excitation signal obtained by the energy adjustment module, wherein the perception recovery process is controlled by the perceptual characteristic parameter HFEPF output by the high-frequency excitation signal perceptual feature calculation module.
3. The method of claim 1, wherein the perceptual feature parameter HFEPF is represented by a logarithm of a ratio of a spectral flatness SFM and a spectral peak-state factor SKF of the original high-frequency excitation signal in the MDCT domain.
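The relation stated in claim 3 can be written out as follows. The first equation restates the claim; the base of the logarithm is not given in this section. The second equation is the common textbook definition of spectral flatness (geometric mean over arithmetic mean of the power spectrum) and is included only as an assumption about how SFM might be computed, with X(k) denoting the N MDCT coefficients of the original high-frequency excitation signal in a frame. The definition of SKF is not given in this section and is left undefined here.

```latex
\mathrm{HFEPF} = \log\!\left(\frac{\mathrm{SFM}}{\mathrm{SKF}}\right),
\qquad
\mathrm{SFM} = \frac{\left(\prod_{k=0}^{N-1} |X(k)|^{2}\right)^{1/N}}
                    {\frac{1}{N}\sum_{k=0}^{N-1} |X(k)|^{2}}
```

Under this assumed definition, SFM approaches 1 for a flat, noise-like spectrum and approaches 0 for a spectrum dominated by a few peaks, which fits the rule in claim 1 that a random noise source is chosen when HFEPF is large.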
4. The method according to claim 1, wherein the energy adjustment is performed by multiplying each frame signal by the ratio of the corresponding sub-band energy of the original high-frequency excitation signal to the sub-band energy of the excitation signal source.
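The sub-band division of step 3 and the energy adjustment of step 4 and claim 4 can be sketched as below. The function names and the `eps` guard are hypothetical, and taking sub-band energy to be the sum of squared MDCT coefficients is an assumption, since this section does not define the energy measure. The scaling follows the claim text literally and multiplies each coefficient by the energy ratio.

```python
def subband_edges(n):
    """Boundaries of four sub-bands with lengths N/8, N/8, N/4 and N/2."""
    if n <= 0 or n % 8 != 0:
        raise ValueError("this sketch assumes a frame length divisible by 8")
    return [0, n // 8, n // 4, n // 2, n]


def subband_energy(coeffs):
    """Energy of each sub-band, assumed here to be the sum of squares."""
    edges = subband_edges(len(coeffs))
    return [sum(c * c for c in coeffs[a:b])
            for a, b in zip(edges, edges[1:])]


def adjust_energy(source, sub_energy_ori, eps=1e-12):
    """Scale each sub-band of the source by Sub_Energy_r(i).

    Sub_Energy_r(i) = Sub_Energy_ori(i) / Sub_Energy_source(i).
    eps is a hypothetical guard against division by zero.
    """
    edges = subband_edges(len(source))
    sub_energy_source = subband_energy(source)
    adjusted = []
    for i, (a, b) in enumerate(zip(edges, edges[1:])):
        ratio = sub_energy_ori[i] / (sub_energy_source[i] + eps)
        # Multiply the coefficients of sub-band i by the ratio, as the
        # claim states. Scaling coefficients by r changes a sum-of-squares
        # energy by r**2, so this does not by itself equalize the energies.
        adjusted.extend(c * ratio for c in source[a:b])
    return adjusted
```

A typical call would compute `sub_energy_ori = subband_energy(original_hf_excitation)` at the encoder side and then `adjust_energy(selected_source, sub_energy_ori)` on the selected source; both variable names are placeholders.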
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810290508.8A CN108630212B (en) | 2018-04-03 | 2018-04-03 | Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108630212A (en) | 2018-10-09 |
CN108630212B (en) | 2021-05-07 |
Family
ID=63696673
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810290508.8A Active CN108630212B (en) | 2018-04-03 | 2018-04-03 | Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108630212B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1342230A1 (en) * | 2000-11-14 | 2003-09-10 | Coding Technologies Sweden AB | Enhancing perceptual performance of high frequency reconstruction coding methods by adaptive filtering |
CN101436406A (en) * | 2008-12-22 | 2009-05-20 | 西安电子科技大学 | Audio encoder and decoder |
CN103026408A (en) * | 2010-07-19 | 2013-04-03 | 华为技术有限公司 | Audio frequency signal generation device |
CN103493131A (en) * | 2010-12-29 | 2014-01-01 | 三星电子株式会社 | Apparatus and method for encoding/decoding for high-frequency bandwidth extension |
WO2014199632A1 (en) * | 2013-06-11 | 2014-12-18 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Device and method for bandwidth extension for acoustic signals |
CN104321815A (en) * | 2012-03-21 | 2015-01-28 | 三星电子株式会社 | Method and apparatus for high-frequency encoding/decoding for bandwidth extension |
CN104517610A (en) * | 2013-09-26 | 2015-04-15 | 华为技术有限公司 | Band spreading method and apparatus |
CN105513601A (en) * | 2016-01-27 | 2016-04-20 | 武汉大学 | Method and device for frequency band reproduction in audio coding bandwidth extension |
EP3174049A1 (en) * | 2011-07-13 | 2017-05-31 | Huawei Technologies Co., Ltd. | Audio signal coding method and device |
CN107221334A (en) * | 2016-11-01 | 2017-09-29 | 武汉大学深圳研究院 | The method and expanding unit of a kind of audio bandwidth expansion |
Non-Patent Citations (4)
Title |
---|
Adaptive Bandwidth Extension of Low Bitrate Compressed Audio Based on Spectral Correlation; Jiang Lin et al.; 2015 8th International Conference on Intelligent Computation Technology and Automation (ICICTA); 2016-05-19; entire document * |
AVS2 speech and audio coding scheme for high quality at low bitrates; Lin Jiang et al.; 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2014-09-08; entire document * |
Design and Implementation of Low-Bit-Rate Audio Bandwidth Extension; 顾莹; 《中国优秀硕士学位论文全文数据库 信息科技辑》; 2016-08-15; entire document * |
Optimization of AVS-P10 Bandwidth Extension Based on Correlation Coefficients; 文彬 et al.; 《计算机应用与软件》; 2017-02-28; Vol. 34, No. 2; pp. 179-183 * |
Also Published As
Publication number | Publication date |
---|---|
CN108630212A (en) | 2018-10-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||