CN108630212B - Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension - Google Patents

Info

Publication number
CN108630212B
Authority
CN
China
Prior art keywords
excitation signal
frequency excitation
energy
sub
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810290508.8A
Other languages
Chinese (zh)
Other versions
CN108630212A (en)
Inventor
姜林
余绍黔
李小龙
王同罕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HUNAN UNIVERSITY OF COMMERCE
East China Institute of Technology
Original Assignee
HUNAN UNIVERSITY OF COMMERCE
East China Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HUNAN UNIVERSITY OF COMMERCE and East China Institute of Technology
Priority to CN201810290508.8A
Publication of CN108630212A
Application granted
Publication of CN108630212B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis, using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis, using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters, the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders

Abstract

The invention relates to a method and a device for perceptual reconstruction of the high-frequency excitation signal in non-blind bandwidth extension. First, a perceptual feature parameter of the original high-frequency excitation signal is calculated, and the source of the reconstructed high-frequency excitation signal is selected according to statistically determined thresholds on that parameter. Next, the sub-band energies of the original high-frequency excitation signal are calculated in the MDCT domain and used to adjust the energy of the selected source. Finally, perceptual recovery is applied to the energy-adjusted high-frequency excitation signal under the control of the perceptual feature parameter. By adding only the perceptual feature parameter, the invention markedly improves the reconstructed sound quality of the high-frequency excitation signal.

Description

Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension
Technical Field
The invention relates to the technical field of audio coding and decoding, in particular to a perceptual reconstruction method and a perceptual reconstruction device for a high-frequency excitation signal in non-blind bandwidth extension.
Background
Global mobile communication is developing rapidly, and the 5G era is approaching. As global informatization advances, the volume of data in the communications field keeps growing, and voice and audio data account for a large share of it. Compressing voice and audio data to reduce storage and transmission costs has therefore become a common global challenge.
Non-blind audio bandwidth extension is one of the standard techniques in modern audio coding. It exploits the correlation between the high-frequency and low-frequency parts of a signal: the encoder transmits only a small number of high-frequency parameters, and the decoder reconstructs the high band by copying the low-frequency signal and adjusting it with those parameters. Because it achieves comparatively high sound quality at very low bit rates, non-blind bandwidth extension has been adopted by most mainstream standards organizations. Existing non-blind bandwidth extension methods generally synthesize the high-frequency signal with a source-filter model, in which a high-frequency excitation signal and the high-frequency linear prediction coefficients (LPC) are combined by a linear prediction synthesis filter. The LPC cost few bits to encode and are generally transmitted directly to the decoder as parameters. The high-frequency excitation signal, by contrast, is expensive to encode and is correlated to some degree with the low band, so it is generally replaced outright by the low-frequency excitation signal available at the decoder. This approach gives good coding quality when the high-frequency and low-frequency excitation signals are strongly correlated, but the quality drops sharply when the correlation is weak. The present invention adds a perceptual feature parameter of the high-frequency excitation signal and uses a perceptual recovery method to reconstruct the high-frequency excitation signal more accurately, thereby improving the reconstructed sound quality of non-blind bandwidth extension.
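The source-filter synthesis described above can be illustrated with a minimal sketch of an all-pole linear prediction synthesis filter. The function name `lpc_synthesize` and the sign convention A(z) = 1 + Σ a_k z^(-k) are assumptions made for illustration; the patent does not state which convention it uses.

```python
def lpc_synthesize(excitation, lpc):
    """All-pole synthesis: y[t] = e[t] - sum_k lpc[k] * y[t-1-k].

    Assumes the convention A(z) = 1 + sum_k a_k z^-k (an assumption,
    not taken from the patent text).
    """
    out = []
    for t, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if t - 1 - k >= 0:          # only use samples already produced
                acc -= a * out[t - 1 - k]
        out.append(acc)
    return out
```

With a single coefficient of -0.5, a unit impulse produces a geometrically decaying response, which is the expected behaviour of a one-pole filter.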
The invention therefore provides a method and a device for perceptual reconstruction of the high-frequency excitation signal in non-blind bandwidth extension, which improve the reconstructed sound quality of the high-frequency signal at the cost of only a small increase in coding bit rate.
Disclosure of Invention
The invention aims to provide a method and a device for perceptual reconstruction of the high-frequency excitation signal in non-blind bandwidth extension, so that coding sound quality is markedly improved at the cost of only a small increase in coding bit rate.
In order to achieve the above object, the present invention provides a perceptual reconstruction apparatus for high-frequency excitation signals, which includes a high-frequency excitation signal perceptual feature calculation module, a signal source selection module, a high-frequency excitation signal subband energy calculation module, an energy adjustment module, and an excitation signal perceptual recovery module.
The high-frequency excitation signal perceptual feature calculation module calculates the perceptual feature parameter HFEPF of the original high-frequency excitation signal and outputs it to the signal source selection module and the excitation signal perceptual recovery module;
the signal source selection module selects the signal source used to reconstruct the high-frequency excitation signal. The source is chosen by comparing the HFEPF calculated by the high-frequency excitation signal perceptual feature calculation module against thresholds; the candidate sources are a random noise signal, a low-frequency excitation signal, and a low-frequency whitened excitation signal. The selected excitation signal source is output to the energy adjustment module;
the high-frequency excitation signal sub-band energy calculation module calculates the sub-band energy HFESE of the original high-frequency excitation signal in the frequency domain and outputs it to the energy adjustment module, where it is used to adjust the excitation signal source obtained by the signal source selection module;
the energy adjustment module adjusts the energy of the excitation signal source obtained by the signal source selection module, using the sub-band energy HFESE output by the high-frequency excitation signal sub-band energy calculation module, and outputs the energy-adjusted excitation signal to the excitation signal perceptual recovery module;
and the excitation signal perceptual recovery module performs perceptual recovery on the energy-adjusted excitation signal obtained by the energy adjustment module, the recovery being controlled by the perceptual feature parameter HFEPF from the high-frequency excitation signal perceptual feature calculation module.
The invention also provides a perception reconstruction method of the high-frequency excitation signal in the non-blind bandwidth extension, which comprises the following steps:
Step 1: Perform linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T. Apply the Modified Discrete Cosine Transform (MDCT) to Exci_sig_T to obtain the high-frequency excitation signal Exci_sig_F in the MDCT domain. Then calculate the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, and take the logarithm of the ratio of SFM to SKF as the perceptual feature parameter HFEPF of the original high-frequency excitation signal;
Step 2: Select the high-frequency excitation signal source according to thresholds on the HFEPF calculated in step 1. If HFEPF < α1, select the low-frequency excitation signal provided by the decoder; if α1 ≤ HFEPF < α2, select the low-frequency whitened excitation signal provided by the decoder; if HFEPF ≥ α2, select a random noise signal. The selected high-frequency excitation signal source is denoted Exci_sig_source, where 0 < α1 < α2 and the threshold values are obtained from statistics over a large amount of data;
Step 3: Calculate the sub-band energies of the original high-frequency excitation signal Exci_sig_F in the MDCT domain from step 1. The sub-bands are divided non-uniformly into four bands: if the frame length is N, the first sub-band has length N/8, the second N/8, the third N/4, and the fourth N/2. The energy of each sub-band is then calculated, giving four sub-band energy parameters Sub_Energy_ori(i), where i = 1, 2, 3, 4;
Step 4: Adjust the energy of the high-frequency excitation signal source Exci_sig_source obtained in step 2. First calculate the sub-band energy parameters Sub_Energy_source(i), i = 1, 2, 3, 4, of the source, using the same method as in step 3. Then calculate the ratio Sub_Energy_r(i) of Sub_Energy_ori(i) obtained in step 3 to Sub_Energy_source(i), where i is the same sub-band index in both. Multiply the signal in each sub-band of the source by the corresponding value of Sub_Energy_r(i) to obtain the energy-adjusted high-frequency excitation signal Exci_Sig_adjust;
Step 5: Perform perceptual recovery on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, as follows: set a spectral peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonic factor β (0 ≤ β ≤ 10); scale the spectral coefficients by σ and blend in comfort noise weighted by β; and control the values of σ and β by the perceptual feature parameter HFEPF of the high-frequency excitation signal, to obtain the final high-frequency excitation signal Exci_Sig_Final.
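The MDCT used in step 1 can be sketched directly from its textbook definition, which maps 2N input samples to N coefficients. This is a minimal, unoptimized O(N²) illustration; the function name `mdct` is hypothetical, and no analysis window or overlap handling is applied because the patent text does not specify either.

```python
import math

def mdct(frame):
    """Direct MDCT of 2N samples to N coefficients (no window applied).

    X[k] = sum_t x[t] * cos(pi/N * (t + 0.5 + N/2) * (k + 0.5)), k = 0..N-1
    """
    two_n = len(frame)
    n = two_n // 2
    return [
        sum(frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]
```

A practical codec would use a windowed, FFT-based implementation; the direct form is shown only because it makes the transform's definition easy to read.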
Compared with the related art, the method and device for perceptual reconstruction of the high-frequency excitation signal in non-blind bandwidth extension have the following beneficial effects: by adding a perceptual feature parameter of the high-frequency excitation signal and applying a perceptual recovery method, the invention reconstructs the high-frequency excitation signal more accurately and improves the reconstructed sound quality of non-blind bandwidth extension. Because the perceptual feature parameter takes into account both the spectral flatness and the peak-factor proportion of the spectral envelope, it reflects auditory perception characteristics more closely.
The abbreviations used in the above technical solution are as follows:
HFEPF: High Frequency Excitation Perception Feature (perceptual feature parameter of the high-frequency excitation signal)
HFESE: High Frequency Excitation Sub-band Energy (sub-band energy of the original high-frequency excitation signal in the frequency domain)
LPC: Linear Prediction Coefficient
Exci_sig_T: Excitation Signal on Time Domain
MDCT: Modified Discrete Cosine Transform
Exci_sig_F: Excitation Signal on Frequency Domain
SFM: Spectral Flatness Measurement
SKF: Spectral Kurtosis Factor
Exci_sig_source: Excitation Signal Source
Sub_Energy_ori: Original Sub-band Energy
Sub_Energy_source: Source Sub-band Energy (sub-band energy of the excitation signal source)
Sub_Energy_r: Sub-band Energy Ratio
Exci_Sig_adjust: Adjusted Excitation Signal (excitation signal after energy adjustment)
Exci_Sig_perception: Excitation Signal Perception (excitation signal after perceptual recovery)
Exci_Sig_Final: Final Excitation Signal (final high-frequency excitation signal)
Drawings
Fig. 1 is a block diagram of the perceptual reconstruction device for the high-frequency excitation signal in non-blind bandwidth extension according to the present invention;
Fig. 2 is a flowchart of the perceptual reconstruction method for the high-frequency excitation signal in non-blind bandwidth extension according to the present invention;
Fig. 3 is a flowchart of the perceptual recovery step in the perceptual reconstruction method for the high-frequency excitation signal in non-blind bandwidth extension according to the present invention.
Detailed Description
The technical method of the invention is further explained by the following concrete embodiments in combination with the attached drawings:
Referring to Fig. 1, the device for perceptual reconstruction of the high-frequency excitation signal in non-blind bandwidth extension according to an embodiment of the present invention includes a high-frequency excitation signal perceptual feature calculation module 1, a signal source selection module 2, a high-frequency excitation signal sub-band energy calculation module 3, an energy adjustment module 4, and an excitation signal perceptual recovery module 5. Each module may be implemented as software solidified in hardware (firmware).
The high-frequency excitation signal perceptual feature calculation module 1 calculates the perceptual feature parameter HFEPF of the original high-frequency excitation signal. It performs linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T, applies the MDCT to Exci_sig_T to obtain the frequency-domain high-frequency excitation signal Exci_sig_F, calculates the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, takes the logarithm of the ratio of SFM to SKF as the HFEPF of the original high-frequency excitation signal, and outputs the HFEPF to the signal source selection module 2 and the excitation signal perceptual recovery module 5;
the signal source selection module 2 selects the signal source used to reconstruct the high-frequency excitation signal. The source is chosen by comparing the HFEPF calculated by the high-frequency excitation signal perceptual feature calculation module 1 against thresholds; the candidate sources are a random noise signal, a low-frequency excitation signal, and a low-frequency whitened excitation signal. If HFEPF < α1, the low-frequency excitation signal provided by the decoder is selected; if α1 ≤ HFEPF < α2, the low-frequency whitened excitation signal provided by the decoder is selected; if HFEPF ≥ α2, a random noise signal is selected. The selected source is denoted Exci_sig_source, where 0 < α1 < α2 and the threshold values are obtained from statistics over a large amount of data. The selected excitation signal source Exci_sig_source is output to the energy adjustment module 4;
the high-frequency excitation signal sub-band energy calculation module 3 calculates the sub-band energy HFESE of the original high-frequency excitation signal in the frequency domain and outputs it to the energy adjustment module 4, where it is used to adjust the excitation signal source Exci_sig_source obtained by the signal source selection module 2;
the energy adjustment module 4 adjusts the energy of the excitation signal source Exci_sig_source obtained by the signal source selection module 2, using the sub-band energy HFESE output by the high-frequency excitation signal sub-band energy calculation module 3, and outputs the energy-adjusted excitation signal Exci_Sig_adjust to the excitation signal perceptual recovery module 5;
the excitation signal perceptual recovery module 5 performs perceptual recovery on the energy-adjusted excitation signal Exci_Sig_adjust obtained by the energy adjustment module 4. The recovery is controlled by the perceptual feature parameter HFEPF from the high-frequency excitation signal perceptual feature calculation module 1 and finally yields the reconstructed high-frequency excitation signal Exci_Sig_Final.
Referring to Fig. 2, the perceptual reconstruction method for the high-frequency excitation signal in non-blind bandwidth extension according to the embodiment of the present invention may be executed automatically by computer software, and specifically includes the following steps:
Step 1: Perform linear prediction analysis on the original high-frequency signal to obtain the high-frequency linear prediction coefficients LPC and the time-domain high-frequency excitation signal Exci_sig_T. Apply the Modified Discrete Cosine Transform (MDCT) to Exci_sig_T to obtain the high-frequency excitation signal Exci_sig_F in the MDCT domain. Then calculate the spectral flatness SFM and the spectral kurtosis factor SKF of Exci_sig_F, and take the logarithm of the ratio of SFM to SKF as the perceptual feature parameter HFEPF of the original high-frequency excitation signal. The specific calculation steps for the HFEPF are as follows:
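Step 1.1: Calculate the spectral flatness SFM
[Formula image BDA0001617367420000071 not reproduced in this text: definition of the spectral flatness SFM of Exci_sig_F for frame n.]
In the formula, n denotes the frame index and N denotes the frame length.
Step 1.2: Calculate the spectral kurtosis factor SKF
[Formula image BDA0001617367420000081 not reproduced in this text: definition of the spectral kurtosis factor SKF of Exci_sig_F for frame n.]
In the formula, n denotes the frame index and N denotes the frame length.
Step 1.3: Calculate the perceptual feature parameter HFEPF of the high-frequency excitation signal
[Formula image BDA0001617367420000082 not reproduced in this text: HFEPF for frame n, the logarithm of the ratio of SFM to SKF.]
In the formula, n denotes the frame index.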
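Because the formula images for steps 1.1 to 1.3 are not available in this text, the following is only a sketch under stated assumptions. It takes the spectral flatness to be the usual ratio of geometric to arithmetic mean of the power spectrum, takes the kurtosis factor to be the normalized fourth central moment of the coefficients, and uses a natural logarithm. All three choices, the function names, and the `eps` guard are assumptions, so the values produced here should not be expected to line up with the thresholds quoted in the embodiment.

```python
import math

def spectral_flatness(spec, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (assumed definition)."""
    power = [c * c + eps for c in spec]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

def spectral_kurtosis(spec, eps=1e-12):
    """Fourth central moment over squared variance of the coefficients (assumed definition)."""
    n = len(spec)
    mean = sum(spec) / n
    m2 = sum((c - mean) ** 2 for c in spec) / n
    m4 = sum((c - mean) ** 4 for c in spec) / n
    return m4 / (m2 * m2 + eps)

def hfepf(spec, eps=1e-12):
    """Logarithm of the SFM-to-SKF ratio, as the text describes (log base assumed natural)."""
    return math.log(spectral_flatness(spec) / (spectral_kurtosis(spec) + eps))
```

Under these assumptions a flat, noise-like spectrum gives a larger value than a spectrum dominated by one peak, which matches the direction of the source selection in step 2 (noise for large HFEPF).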
Step 2: Select the high-frequency excitation signal source according to thresholds on the HFEPF calculated in step 1. The perceptual feature parameter HFEPF of the high-frequency excitation signal is first calculated for a large number of audio signals of different types (e.g., speech, music, natural sounds), and the thresholds α1 and α2 are determined from the statistical distribution of the HFEPF values; in this embodiment, α1 = 6.5 and α2 = 14.5. The high-frequency excitation signal source is then selected according to α1 and α2: if HFEPF < α1, the low-frequency excitation signal provided by the decoder is selected; if α1 ≤ HFEPF < α2, the low-frequency whitened excitation signal provided by the decoder is selected; if HFEPF ≥ α2, a random noise signal is selected, where the random noise signal is the average of a large number of high-frequency excitation signals of different types. The resulting high-frequency excitation signal source is denoted Exci_sig_source.
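The threshold rule in step 2 can be sketched as a three-way selection. The thresholds are the values quoted for this embodiment; the function name `select_source` and its argument names are hypothetical.

```python
ALPHA1 = 6.5    # threshold values quoted for the embodiment
ALPHA2 = 14.5

def select_source(hfepf_value, low_exc, low_exc_whitened, noise,
                  alpha1=ALPHA1, alpha2=ALPHA2):
    """Return the excitation source chosen by the HFEPF thresholds."""
    if hfepf_value < alpha1:
        return low_exc              # HFEPF < alpha1
    if hfepf_value < alpha2:
        return low_exc_whitened     # alpha1 <= HFEPF < alpha2
    return noise                    # HFEPF >= alpha2
```

Note that both boundaries are inclusive on the upper branch, so a value exactly equal to α1 selects the whitened signal and a value exactly equal to α2 selects noise.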
Step 3: Calculate the sub-band energies of the original high-frequency excitation signal Exci_sig_F in the MDCT domain from step 1. The sub-bands are divided non-uniformly into four bands: if the frame length is N, the first sub-band has length N/8, the second N/8, the third N/4, and the fourth N/2. The energy of each sub-band is then calculated, giving four sub-band energy parameters Sub_Energy_ori, with the following calculation formulas:
[Formula images BDA0001617367420000091 to BDA0001617367420000094 not reproduced in this text: Sub_Energy_ori(1) to Sub_Energy_ori(4), the energies of Exci_sig_F over sample points 1 to N/8, N/8+1 to N/4, N/4+1 to N/2, and N/2+1 to N, respectively.]
In the formulas, N denotes the frame length.
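The non-uniform partition in step 3 can be sketched as follows. The sketch assumes that sub-band energy means the plain sum of squared MDCT coefficients (the formula images are not available here, so any normalization is unknown), that N is divisible by 8, and it uses zero-based half-open index ranges; the function names are hypothetical.

```python
def subband_edges(n):
    """Half-open index ranges for sub-bands of length N/8, N/8, N/4, N/2."""
    return [(0, n // 8), (n // 8, n // 4), (n // 4, n // 2), (n // 2, n)]

def subband_energies(spec):
    """Sum of squared coefficients in each of the four sub-bands (assumed energy definition)."""
    return [sum(c * c for c in spec[lo:hi])
            for lo, hi in subband_edges(len(spec))]
```

For a frame of 16 unit-valued coefficients this yields energies of 2, 2, 4 and 8, mirroring the 1/8, 1/8, 1/4, 1/2 split.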
Step 4: Adjust the energy of the high-frequency excitation signal source Exci_sig_source obtained in step 2. First calculate the sub-band energy parameters Sub_Energy_source(i) of Exci_sig_source, using the same method as in step 3. Then calculate the ratio Sub_Energy_r(i) of Sub_Energy_ori(i) obtained in step 3 to Sub_Energy_source(i). Multiply the signal in each sub-band of Exci_sig_source by the corresponding value of Sub_Energy_r(i) to obtain the energy-adjusted high-frequency excitation signal Exci_Sig_adjust. The energy ratio Sub_Energy_r(i) is calculated as follows:
[Formula image BDA0001617367420000095 not reproduced in this text: Sub_Energy_r(i), the ratio of Sub_Energy_ori(i) to Sub_Energy_source(i).]
where i denotes the sub-band index, i = 1, 2, 3, 4.
The energy adjustment is calculated as follows, where N denotes the frame length:
Exci_Sig_adjust(1:N/8) = Exci_sig_source(1:N/8) * Sub_Energy_r(1),
where (1:N/8) denotes sample points 1 to N/8;
Exci_Sig_adjust(N/8+1:N/4) = Exci_sig_source(N/8+1:N/4) * Sub_Energy_r(2),
where (N/8+1:N/4) denotes sample points N/8+1 to N/4;
Exci_Sig_adjust(N/4+1:N/2) = Exci_sig_source(N/4+1:N/2) * Sub_Energy_r(3),
where (N/4+1:N/2) denotes sample points N/4+1 to N/2;
Exci_Sig_adjust(N/2+1:N) = Exci_sig_source(N/2+1:N) * Sub_Energy_r(4),
where (N/2+1:N) denotes sample points N/2+1 to N.
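The per-band scaling in step 4 can be sketched as below. The sketch follows the wording of the text literally and multiplies each band by the plain ratio of energies; since the formula image for the ratio is not available, whether the original applies a square root (which is what would make the adjusted band energy equal the target) is unknown. The function name, the `eps` guard, and the sum-of-squares energy are assumptions.

```python
def adjust_energy(source, target_energies, eps=1e-12):
    """Scale each of the four sub-bands of `source` by target/source energy."""
    n = len(source)
    edges = [(0, n // 8), (n // 8, n // 4), (n // 4, n // 2), (n // 2, n)]
    adjusted = list(source)
    for (lo, hi), target in zip(edges, target_energies):
        src_energy = sum(c * c for c in source[lo:hi])
        ratio = target / (src_energy + eps)   # Sub_Energy_r(i), taken literally from the text
        for k in range(lo, hi):
            adjusted[k] = source[k] * ratio
    return adjusted
```

The input list is left unmodified; the function returns a new list of the same length.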
Step 5: Perform perceptual recovery on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, as follows: set a spectral peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonic factor β (0 ≤ β ≤ 10); scale the spectral coefficients by σ and blend in comfort noise weighted by β; and control the values of σ and β by the perceptual feature parameter HFEPF of the high-frequency excitation signal, to obtain the final high-frequency excitation signal Exci_Sig_Final. The specific steps are as follows (see Fig. 3):
Step 5.1: Set the initial values σ0 = 0 and β0 = 0, the step size of σ to d1 = 0.3, the step size of β to d2 = 1.0, and the perceptual error threshold to delta = 0.1, and initialize E1, the recorded perceptual error of the perceptually recovered excitation signal, to 1000;
Step 5.2: Calculate the perceptual recovery signal Exci_Sig_perception:
Exci_Sig_perception(n,i) = σ0_n · Exci_Sig_adjust(n,i) + β0_n · Rand(n,i),
where Rand(·) denotes the generation of random noise, n denotes the frame index, and i denotes each sample point of the frame.
Step 5.3: Calculate the perceptual error E0.
First calculate the perceptual feature parameter HFEPF_n' of the perceptual recovery signal Exci_Sig_perception by the method of step 1, then take the difference between HFEPF_n' and the perceptual feature parameter HFEPF_n of the original high-frequency excitation signal obtained in step 1 as the perceptual error E0:
E0 = |HFEPF_n - HFEPF_n'|.
Step 5.4: Check whether E0 is smaller than the perceptual error threshold delta. If not, go to step 5.5; otherwise end step 5 and take the perceptual recovery signal Exci_Sig_perception as the final high-frequency excitation signal Exci_Sig_Final;
Step 5.5: Check whether the perceptual error E0 is smaller than E1. If so, go to step 5.6; otherwise go to step 5.7;
Step 5.6: Record the perceptual error E0 in E1, the current value of σ0 in σ1, and the current value of β0 in β1;
Step 5.7: Check whether σ0 ≤ 3 and β0 ≤ 10. If so, go to step 5.8; otherwise go to step 5.9;
Step 5.8: Increase σ0 and β0 by the step sizes d1 and d2, respectively, and return to step 5.2;
Step 5.9: Recalculate the perceptual recovery signal using σ1 and β1 recorded in step 5.6:
Exci_Sig_perception(n,i) = σ1_n · Exci_Sig_adjust(n,i) + β1_n · Rand(n,i),
where n denotes the frame index and i denotes each sample point of the frame; the resulting Exci_Sig_perception is taken as the final high-frequency excitation signal Exci_Sig_Final.
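The search in steps 5.1 to 5.9 can be sketched as a loop that tracks the best (σ, β) pair seen so far. Several details are assumptions made to keep the sketch self-contained: the perceptual feature is passed in as a callable `feature` rather than fixed to a particular HFEPF formula, Rand is modelled as uniform noise in [-1, 1] drawn once per frame from a seeded generator, and the loop returns to step 5.2 after each update. The function and parameter names are hypothetical.

```python
import random

def perceptual_recovery(adjusted, target_feature, feature,
                        d1=0.3, d2=1.0, sigma_max=3.0, beta_max=10.0,
                        delta=0.1, seed=0):
    """Search sigma/beta so the mixed signal's feature approaches target_feature."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in adjusted]   # Rand(n, i), assumed uniform

    def mix(sigma, beta):
        return [sigma * a + beta * r for a, r in zip(adjusted, noise)]

    sigma, beta = 0.0, 0.0                  # step 5.1: initial values
    best_err, best = 1000.0, (0.0, 0.0)     # E1 and the recorded (sigma1, beta1)
    while True:
        candidate = mix(sigma, beta)                       # step 5.2
        err = abs(target_feature - feature(candidate))     # step 5.3
        if err < delta:                                    # step 5.4
            return candidate
        if err < best_err:                                 # steps 5.5 and 5.6
            best_err, best = err, (sigma, beta)
        if sigma <= sigma_max and beta <= beta_max:        # step 5.7
            sigma += d1                                    # step 5.8
            beta += d2
        else:
            return mix(*best)                              # step 5.9
```

Because σ and β advance together by fixed steps, the loop evaluates roughly a dozen candidates before falling back to the best recorded pair, so termination does not depend on the error threshold ever being met.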
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (4)

1. A perception reconstruction method of a high-frequency excitation signal in non-blind bandwidth extension adopts a perception reconstruction device comprising a high-frequency excitation signal perception characteristic calculation module, a signal source selection module, a high-frequency excitation signal sub-band energy calculation module, an energy adjustment module and an excitation signal perception recovery module to carry out perception reconstruction, and is characterized by comprising the following steps:
step 1, performing linear prediction analysis on an original high-frequency signal to obtain a high-frequency linear prediction coefficient LPC and a time domain high-frequency excitation signal Exci _ sig _ T, performing modified discrete cosine transform (MDCT transform) on the high-frequency excitation signal Exci _ sig _ T to obtain a high-frequency excitation signal Exci _ sig _ F in an MDCT domain, then calculating the spectral flatness SFM and the spectral peak state factor SKF of the high-frequency excitation signal Exci _ sig _ F in the MDCT domain through a high-frequency excitation signal perceptual feature calculation module, and taking the logarithm of the ratio of the SFM to the SKF as a perceptual feature parameter HFEPF of the original high-frequency excitation signal;
step 2, according to different threshold values of the HFEPF calculated in the step 1, selecting different high-frequency excitation signal sources through the signal source selection module, if the HFEPF is less than α 1, selecting a low-frequency excitation signal provided by a decoding end, if the HFEPF is greater than or equal to α 1 and less than α 2, selecting a low-frequency whitening excitation signal provided by the decoding end, if the HFEPF is greater than or equal to α 2, selecting a random noise signal, wherein the selected high-frequency excitation signal source is represented as an exi _ sig _ source, wherein 0< α 1< α 2, and the value is obtained by a large amount of data statistics;
step 3, calculating, through the high-frequency excitation signal sub-band energy calculation module, the sub-band energies of the original high-frequency excitation signal Exci_sig_F in the MDCT domain from step 1, wherein the sub-bands are divided non-uniformly into four sub-bands: if the frame length is N, the first sub-band has length N/8, the second N/8, the third N/4 and the fourth N/2; the energy of each sub-band is then calculated to obtain four sub-band energy parameters Sub_Energy_ori(i), where i = 1, 2, 3, 4;
step 4, adjusting, through the energy adjustment module, the energy of the high-frequency excitation signal source Exci_sig_source obtained in step 2, wherein the sub-band energy parameters Sub_Energy_source(i), i = 1, 2, 3, 4, of the high-frequency excitation signal source are first calculated by the same method as in step 3; the ratio Sub_Energy_r(i) of Sub_Energy_ori(i) obtained in step 3 to Sub_Energy_source(i) is then calculated; and the signal in each sub-band of Exci_sig_source is multiplied by the corresponding value of Sub_Energy_r(i) to obtain the energy-adjusted high-frequency excitation signal Exci_Sig_adjust;
step 5, performing, through the excitation signal perceptual recovery module, perceptual recovery on the energy-adjusted high-frequency excitation signal Exci_Sig_adjust from step 4, wherein the perceptual recovery process is as follows: setting a spectral peak expansion factor σ (0 ≤ σ ≤ 3) and a noise harmonization factor β (0 ≤ β ≤ 10), stretching or compressing the spectral coefficients using σ, performing comfort-noise harmonization using β, and controlling the values of σ and β by the perceptual feature parameter HFEPF of the high-frequency excitation signal, to obtain the final high-frequency excitation signal Exci_Sig_Final.
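Steps 2 and 3 of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names and the returned source labels are hypothetical, the thresholds α1 and α2 are passed in because the claim only says they are obtained statistically, and sub-band energy is assumed to be the sum of squared MDCT coefficients, which the claim does not spell out.

```python
def select_source(hfepf, alpha1, alpha2):
    """Step 2: choose the excitation source from the HFEPF thresholds."""
    if not 0 < alpha1 < alpha2:
        raise ValueError("expected 0 < alpha1 < alpha2")
    if hfepf < alpha1:
        return "low_frequency_excitation"           # hypothetical label
    if hfepf < alpha2:
        return "low_frequency_whitened_excitation"  # hypothetical label
    return "random_noise"                           # hypothetical label


def subband_bounds(n):
    """Step 3: non-uniform split of an n-coefficient frame into N/8, N/8, N/4, N/2."""
    edges = [0, n // 8, n // 4, n // 2, n]
    return list(zip(edges[:-1], edges[1:]))


def subband_energies(coeffs):
    """Energy per sub-band, assumed here to be the sum of squared coefficients."""
    return [sum(c * c for c in coeffs[lo:hi])
            for lo, hi in subband_bounds(len(coeffs))]
```

For a frame of 16 coefficients, subband_bounds(16) gives the index ranges (0, 2), (2, 4), (4, 8) and (8, 16), matching the 1/8, 1/8, 1/4, 1/2 proportions in step 3. Frame lengths that are not multiples of 8 are not handled specially in this sketch.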
2. The perceptual reconstruction method for a high-frequency excitation signal in non-blind bandwidth extension according to claim 1, wherein the high-frequency excitation signal perceptual feature calculation module is configured to calculate the perceptual feature parameter HFEPF of the original high-frequency excitation signal and to output the parameter to the signal source selection module and the excitation signal perceptual recovery module;
the signal source selection module is used for selecting a signal source for reconstructing the high-frequency excitation signal; during selection, the excitation signal source is chosen according to where the perceptual feature parameter HFEPF of the original high-frequency excitation signal, calculated by the high-frequency excitation signal perceptual feature calculation module, falls relative to the thresholds; the candidate excitation signal sources comprise a random noise signal, a low-frequency excitation signal and a low-frequency whitened excitation signal, and the selected excitation signal source is output to the energy adjustment module;
the high-frequency excitation signal subband energy calculating module is used for calculating subband energy HFESE in the frequency domain of the original high-frequency excitation signal, and the subband energy is output to the energy adjusting module and used for adjusting the excitation signal source obtained in the signal source selecting module;
the energy adjusting module is used for adjusting the energy of the excitation signal source obtained by the signal source selecting module, the parameters required by the energy adjustment are derived from subband energy HFESE output by the high-frequency excitation signal subband energy calculating module, and the excitation signal after the energy adjustment is output to the excitation signal perception restoring module;
and the excitation signal perceptual recovery module is used for performing perceptual recovery on the energy-adjusted excitation signal obtained from the energy adjustment module, the perceptual recovery process being controlled by the perceptual feature parameter HFEPF from the high-frequency excitation signal perceptual feature calculation module.
3. The method of claim 1, wherein the perceptual feature parameter HFEPF is represented by a logarithm of a ratio of a spectral flatness SFM and a spectral peak-state factor SKF of the original high-frequency excitation signal in the MDCT domain.
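The HFEPF of claim 3 can be sketched as follows, under stated assumptions. The spectral flatness uses the common definition (geometric mean over arithmetic mean of the power spectrum), which the claims do not confirm. The claims do not define how SKF is computed, so it is taken here as an input rather than guessed. The natural logarithm is also an assumption, since the base is not stated. Function names are hypothetical.

```python
import math


def spectral_flatness(coeffs, eps=1e-12):
    """SFM under the common definition: geometric mean / arithmetic mean of power.

    eps is a small floor (an assumption) that keeps log() defined for zero bins.
    """
    power = [c * c + eps for c in coeffs]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def hfepf(sfm, skf):
    """HFEPF = log(SFM / SKF); natural log assumed, SKF supplied by the caller."""
    if sfm <= 0 or skf <= 0:
        raise ValueError("SFM and SKF must be positive")
    return math.log(sfm / skf)
```

A flat spectrum gives an SFM close to 1 and a single-peak spectrum gives an SFM close to 0, so for a fixed SKF a noise-like excitation yields a larger HFEPF than a strongly peaked one. This is consistent with claim 1 selecting random noise when HFEPF is at or above the upper threshold.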
4. The method according to claim 1, wherein the energy adjustment is performed frame by frame by multiplying the signal in each sub-band by the ratio of the corresponding sub-band energy of the original high-frequency excitation signal to the sub-band energy of the excitation signal source.
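The per-sub-band scaling of claim 4 might look like the sketch below. The names are hypothetical, and energy is again assumed to be the sum of squared coefficients. The claim multiplies by the energy ratio itself, and the default path follows that wording. Under the sum-of-squares assumption, it is the square root of the ratio that would make the adjusted sub-band energy equal the original, so the sketch offers that as an optional variant rather than as the claimed method.

```python
import math


def subband_bounds(n):
    """Non-uniform split of an n-coefficient frame into N/8, N/8, N/4, N/2."""
    edges = [0, n // 8, n // 4, n // 2, n]
    return list(zip(edges[:-1], edges[1:]))


def adjust_energy(source, energy_ori, use_sqrt=False):
    """Scale each sub-band of `source` by Sub_Energy_ori(i) / Sub_Energy_source(i).

    use_sqrt=True applies sqrt(ratio) instead; this is an illustrative variant,
    not something stated in the claim.
    """
    out = list(source)
    for (lo, hi), e_ori in zip(subband_bounds(len(source)), energy_ori):
        e_src = sum(c * c for c in source[lo:hi])
        if e_src == 0.0:
            continue  # silent sub-band: leave unchanged, avoid division by zero
        ratio = e_ori / e_src
        gain = math.sqrt(ratio) if use_sqrt else ratio
        for k in range(lo, hi):
            out[k] = source[k] * gain
    return out
```

With an 8-coefficient source of all ones and target energies [4, 1, 8, 4], the source sub-band energies are [1, 1, 2, 4], so the ratios are [4, 1, 4, 1] and the literal reading returns [4, 1, 4, 4, 1, 1, 1, 1].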
CN201810290508.8A 2018-04-03 2018-04-03 Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension Active CN108630212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810290508.8A CN108630212B (en) 2018-04-03 2018-04-03 Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension

Publications (2)

Publication Number Publication Date
CN108630212A (en) 2018-10-09
CN108630212B (en) 2021-05-07

Family

ID=63696673

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810290508.8A Active CN108630212B (en) 2018-04-03 2018-04-03 Perception reconstruction method and device for high-frequency excitation signal in non-blind bandwidth extension

Country Status (1)

Country Link
CN (1) CN108630212B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1342230A1 (en) * 2000-11-14 2003-09-10 Coding Technologies Sweden AB Enhancing perceptual performance of high frequency reconstruction coding methods by adaptive filtering
CN101436406A (en) * 2008-12-22 2009-05-20 西安电子科技大学 Audio encoder and decoder
CN103026408A (en) * 2010-07-19 2013-04-03 华为技术有限公司 Audio frequency signal generation device
CN103493131A (en) * 2010-12-29 2014-01-01 三星电子株式会社 Apparatus and method for encoding/decoding for high-frequency bandwidth extension
WO2014199632A1 (en) * 2013-06-11 2014-12-18 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Device and method for bandwidth extension for acoustic signals
CN104321815A (en) * 2012-03-21 2015-01-28 三星电子株式会社 Method and apparatus for high-frequency encoding/decoding for bandwidth extension
CN104517610A (en) * 2013-09-26 2015-04-15 华为技术有限公司 Band spreading method and apparatus
CN105513601A (en) * 2016-01-27 2016-04-20 武汉大学 Method and device for frequency band reproduction in audio coding bandwidth extension
EP3174049A1 (en) * 2011-07-13 2017-05-31 Huawei Technologies Co., Ltd. Audio signal coding method and device
CN107221334A (en) * 2016-11-01 2017-09-29 武汉大学深圳研究院 The method and expanding unit of a kind of audio bandwidth expansion

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Adaptive Bandwidth Extension of Low Bitrate Compressed Audio Based on Spectral Correlation; Jiang Lin et al.; 2015 8th International Conference on Intelligent Computation Technology and Automation (ICICTA); 20160519; full text *
AVS2 speech and audio coding scheme for high quality at low bitrates; Lin Jiang et al.; 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 20140908; full text *
Design and Implementation of Low-Bitrate Audio Bandwidth Extension; Gu Ying; China Master's Theses Full-text Database, Information Science and Technology; 20160815; full text *
Optimization of AVS-P10 Bandwidth Extension Based on Correlation Coefficient; Wen Bin et al.; Computer Applications and Software; 20170228; vol. 34, no. 2; pp. 179-183 *

Also Published As

Publication number Publication date
CN108630212A (en) 2018-10-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant