CN102543086A - Device and method for expanding speech bandwidth based on audio watermarking - Google Patents
Publication number: CN102543086A · Authority: CN · Legal status: Granted
Abstract
The invention discloses a device and a method for expanding speech bandwidth based on audio watermarking. They work as follows: at the sending end, the speech produced by the speaker is a wideband signal; before the speech is transmitted over the telephone line, high-frequency parameters are embedded into the narrowband code stream, and the narrowband speech signal is transmitted over the telephone line; at the receiving end, A-law decoding is performed and the high-frequency parameters are extracted; the high-frequency part of the wideband speech is recovered from these parameters; finally, the high-frequency speech and the low-frequency speech are synthesized into wideband speech. The device and method exploit the characteristics of audio watermarking to build a hidden channel inside the narrowband speech and use this channel to transmit the parameters of the high-frequency speech, achieving bandwidth extension of the speech signal without changing the original network protocol.
Description
Technical field
The present invention relates to speech processing technology, and in particular to a device and method for speech bandwidth extension based on audio watermarking.
Background technology
The energy of the human speech signal is concentrated mainly in 0.3–3.4 kHz, and a 4 kHz bandwidth is enough to guarantee adequate intelligibility. Therefore, the public switched telephone network (PSTN) coding standard G.711 (A-law and μ-law) formulated by the International Telecommunication Union (ITU-T) uses a sampling frequency of 8 kHz, and it remains in use to this day.
Narrowband speech reduces the demand on communication bandwidth while guaranteeing a certain intelligibility, but at the cost of sacrificing the naturalness of the voice: narrowband speech loses the high-frequency components of the original speech, so it does not sound natural enough. To improve speech quality, ITU-T proposed G.722, the first wideband speech codec, intended for remote telephone conferencing. Wideband voice communication could be realized by redesigning the transmission links, but for the huge PSTN fixed telephone network, redesigning the transmission links is far too expensive.
A traditional watermark is a mark visible when paper is held up to the light, generally used to detect the authenticity of important bills. Digital watermarking technology, by contrast, exploits the ubiquitous redundancy and randomness of multimedia digital works to embed numerical information into the work, realizing hidden transmission of information. Digital watermarking is mainly used to protect the copyright and integrity of a work. Because human hearing is sharper than human vision, embedding a watermark into audio is more difficult than embedding one into an image.
Audio watermarking based on the least significant bit (LSB): speech bandwidth extension based on LSB embeds the high-frequency parameters into the lowest bit of the encoded code stream. This method embeds a large amount of watermark data with a simple algorithm, and suits communication channels with a low bit error rate.
Audio watermarking based on time-domain echo hiding: this technique exploits the temporal masking effect of human hearing: even after one sound has ended, it still influences the perception of another sound. The amount of watermark data this method can embed is small, and embedding the watermark has some influence on the original sound.
Audio watermarking based on the frequency-domain discrete Fourier transform: this method first applies the DFT to the audio, then embeds the watermark in the DFT coefficients of the 2.4–6.4 kHz band, replacing the corresponding DFT coefficients with spectral components that represent the watermark sequence. Although this method has good robustness, when the embedded watermark differs too much from the original DFT coefficients, the influence on the original speech is large.
Audio watermarking based on the frequency-domain discrete cosine transform: this method first applies a discrete cosine transform to the time-domain signal, then a modified discrete cosine transform (MDCT) to the sequence, and embeds the watermark by altering the MDCT coefficients. This method has good robustness, but the amount of watermark data it can embed is small.
Shortcomings of the prior art: none of the above methods achieves a good balance among robustness, imperceptibility, and embedding capacity; each has its own drawbacks, and therefore none of them serves speech bandwidth extension well.
Summary of the invention
In view of the various shortcomings and defects of existing audio watermarking for bandwidth extension, the invention provides a device and a method for speech bandwidth extension based on audio watermarking.
To achieve the above object, the method for speech bandwidth extension based on audio watermarking provided by the invention comprises the following steps:
Step A. Use the QMF analysis filter bank module to divide the wideband speech into two parts: the narrowband component of 0–8000 Hz and the high-frequency component of 8000–16000 Hz; reduce the sampling frequency of both output signals to 8 kHz, obtaining the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
Step B. Extract 30 high-frequency parameters through the high-frequency parameter extraction module: 16 temporal envelope parameters, 12 frequency-domain envelope parameters, the average temporal envelope parameter, and the average frequency-domain envelope parameter. This part follows the approach of the document "Research and implementation of the DTX/CNG algorithm based on a layered wideband speech coding/decoding system". The concrete extraction method for each parameter is as follows:
Step B1. Extract the 16 temporal envelope parameters and the average temporal envelope parameter: the high-frequency component s_H(n) of each 20 ms frame is divided into 16 segments, each containing 10 samples, from which the 16 temporal envelope parameters T(i) are computed. The average temporal envelope is then calculated, and each T(i) is differenced with the mean and normalized.
Step B2. Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter: the 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are passed through a windowing process using the window function Window(n), whose length is 208 samples (N = 208). The windowed signal is zero-padded to 256 points, and a 256-point FFT yields S_F(k), where L = 256. The frequency domain is divided into 12 even intervals; the frequency-domain envelope parameter of each interval is computed and converted into a logarithmically weighted sub-band energy parameter. The average frequency-domain envelope is calculated, and each frequency-domain envelope parameter F(i) is differenced with the mean and normalized.
Step C. Through the G.711 coding/decoding module, the narrowband speech signal s_L(n) is encoded by the A-law encoder, yielding a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with the A-law decoder to obtain the narrowband speech signal.
Step D. Embedding the watermark into the code stream through the watermark embedding module comprises the following two modes:
D1. Embed the watermark into the code stream uniformly through the watermark embedding module: since one frame of the signal has 160 samples and the number of watermark bits to embed is 66, one bit of information is embedded at every other sample.
Or D2. Selectively embed the watermark information into the samples of small amplitude through the watermark embedding module. Let C0–C7 denote the bits of the encoded code stream from the lowest to the highest. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample value the code stream represents. This method uses bit C6 to divide the signal into large samples (C6 = 1) and small samples (C6 = 0), and embeds the watermark when C6 is 0. If one frame offers fewer than 66 embedding positions, the remaining watermark bits are embedded at other positions.
Step E. Extracting the watermark through the watermark extraction module corresponds to step D and comprises two modes:
E1. The watermark extraction module extracts the watermark according to the positions where it was embedded.
Or E2. Judge from the characteristics of the code stream whether a watermark has been embedded: examining the frame from its start, if C6 is 0 the watermark bit is extracted from the lowest bit, and if C6 is 1 no bit is extracted; if fewer than 66 watermark bits have been extracted when the end of the frame is reached, return to the start of the frame and extract from the positions where C6 is 1, until 66 watermark bits have been extracted.
Step F. Recover the high-frequency speech from white noise through the high-frequency speech recovery module: first, the generated white noise sequence is passed through an AR model constructed from the low-frequency speech; then the extracted high-frequency parameters are used to apply temporal envelope shaping and frequency-domain envelope shaping, which yields the high-frequency speech signal.
Step F1. Use white noise to recover the high-frequency speech: because the high-frequency and low-frequency speech have a certain correlation, an AR model is constructed from the decoded low-frequency speech. A white noise sequence is generated at the decoding end and shaped by the constructed AR model, so that the noise takes on the characteristics of the high-frequency speech.
Step F2. Local adjustment of the temporal envelope, following the approach of the document "Research and implementation of the DTX/CNG algorithm based on a layered wideband speech coding/decoding system": the temporal envelope parameters of the high-frequency signal are computed from the normalized temporal envelope parameters recovered from the watermark and the average temporal envelope; the local time-domain gain factors are computed from the temporal envelope parameters of the noise and of the high-frequency signal; the temporal envelope of the noise is adjusted with these local gain factors; and the gain between two adjacent segments is handled by linear interpolation.
Step F3. Local adjustment of the frequency-domain envelope, following the approach of the same document: the time-domain-adjusted signal is processed as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the logarithmically weighted sub-band energy parameters and the average frequency-domain envelope of the noise. The frequency-domain envelope of the noise is then locally adjusted by the same method used for the local adjustment of its temporal envelope.
Step F4. Global adjustment of the frequency-domain envelope: the global frequency-domain gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal, and the frequency-domain envelope of each frame is globally adjusted with this factor. The adjusted spectrum is transformed by the IFFT, and the resulting time-domain signal is windowed with the window function and stored in a buffer of length 208, where L = 256 and n = 0, 1, …, 207. The last 48 samples of the previous frame's buffer are added to the first 48 samples of the current frame's buffer; together with the values at n = 48–159 of the current buffer, these constitute the time-domain signal recovered for the current frame.
Step F5. Global adjustment of the temporal envelope: the temporal envelope is globally adjusted following the same steps as the global adjustment of the frequency-domain envelope, and the adjusted signal is the high-frequency signal estimated from the noise.
Step G. Through the QMF synthesis filter bank module, the sampling frequency of the low-frequency signal at 8 kHz and of the estimated high-frequency signal is raised to 16 kHz; the two signals then pass through low-pass and high-pass FIR filters respectively, whose coefficients are identical to those of the QMF analysis filters. Adding the two filtered signals yields the final wideband signal with a 16 kHz sampling frequency.
The invention further provides a device for speech bandwidth extension based on audio watermarking. The device comprises: a QMF analysis filter bank module, a high-frequency parameter extraction module, a G.711 coding/decoding module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filter bank module.
Said QMF analysis filter bank module divides the wideband speech into two parts: the narrowband component of 0–8000 Hz and the high-frequency component of 8000–16000 Hz; the sampling frequency of both output signals is reduced to 8 kHz, obtaining the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
Said high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 temporal envelope parameters, 12 frequency-domain envelope parameters, the average temporal envelope parameter, and the average frequency-domain envelope parameter, following the approach of the document "Research and implementation of the DTX/CNG algorithm based on a layered wideband speech coding/decoding system". To extract the 16 temporal envelope parameters and the average temporal envelope parameter, the high-frequency component s_H(n) of each 20 ms frame is divided into 16 segments of 10 samples each, from which the 16 temporal envelope parameters T(i) are computed; the average temporal envelope is calculated, and each T(i) is differenced with the mean and normalized.
To extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, the 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are passed through a windowing process using the window function Window(n), whose length is 208 samples (N = 208). The windowed signal is zero-padded to 256 points, and a 256-point FFT yields S_F(k), where L = 256. The frequency domain is divided into 12 even intervals; the frequency-domain envelope parameter of each interval is computed and converted into a logarithmically weighted sub-band energy parameter. The average frequency-domain envelope is calculated, and each F(i) is differenced with the mean and normalized.
Said G.711 coding/decoding module encodes the narrowband speech signal s_L(n) with the A-law encoder, yielding a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with the A-law decoder to obtain the narrowband speech signal.
Said watermark embedding module embeds the watermark into the code stream in either of two modes:
Mode one: embed the watermark into the code stream uniformly: since one frame of the signal has 160 samples and the number of watermark bits to embed is 66, one bit of information is embedded at every other sample.
Mode two: selectively embed the watermark information into the samples of small amplitude. Let C0–C7 denote the bits of the encoded code stream from the lowest to the highest. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample value the code stream represents. This method uses bit C6 to divide the signal into large samples (C6 = 1) and small samples (C6 = 0), and embeds the watermark when C6 is 0. If one frame offers fewer than 66 embedding positions, the remaining watermark bits are embedded at other positions.
Said watermark extraction module extracts the watermark in a mode corresponding to the watermark embedding module:
Mode one: extract the watermark according to the positions where it was embedded.
Mode two: judge from the characteristics of the code stream whether a watermark has been embedded: examining the frame from its start, if C6 is 0 the watermark bit is extracted from the lowest bit, and if C6 is 1 no bit is extracted; if fewer than 66 watermark bits have been extracted when the end of the frame is reached, return to the start of the frame and extract from the positions where C6 is 1, until 66 watermark bits have been extracted.
Said high-frequency speech recovery module uses white noise to recover the high-frequency speech: first, the generated white noise sequence is passed through an AR model constructed from the low-frequency speech; then the extracted high-frequency parameters are used to apply temporal envelope shaping and frequency-domain envelope shaping, which yields the high-frequency speech signal. Because the high-frequency and low-frequency speech have a certain correlation, the AR model is constructed from the decoded low-frequency speech; the white noise sequence generated at the decoding end is shaped by this model so that the noise takes on the characteristics of the high-frequency speech.
Local adjustment of the temporal envelope, following the approach of the document "Research and implementation of the DTX/CNG algorithm based on a layered wideband speech coding/decoding system": the temporal envelope parameters of the high-frequency signal are computed from the normalized temporal envelope parameters recovered from the watermark and the average temporal envelope; the local time-domain gain factors are computed from the temporal envelope parameters of the noise and of the high-frequency signal; the temporal envelope of the noise is adjusted with these local gain factors; and the gain between two adjacent segments is handled by linear interpolation.
Local adjustment of the frequency-domain envelope, following the approach of the same document: the time-domain-adjusted signal is processed as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the logarithmically weighted sub-band energy parameters and the average frequency-domain envelope of the noise. The frequency-domain envelope of the noise is then locally adjusted by the same method used for the local adjustment of its temporal envelope.
Global adjustment of the frequency-domain envelope: the global frequency-domain gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal, and the frequency-domain envelope of each frame is globally adjusted with this factor. The adjusted spectrum is transformed by the IFFT, and the resulting time-domain signal is windowed with the window function and stored in a buffer of length 208, where L = 256 and n = 0, 1, …, 207. The last 48 samples of the previous frame's buffer are added to the first 48 samples of the current frame's buffer; together with the values at n = 48–159 of the current buffer, these constitute the time-domain signal recovered for the current frame.
Global adjustment of the temporal envelope: the temporal envelope is globally adjusted following the same steps as the global adjustment of the frequency-domain envelope, and the adjusted signal is the high-frequency signal estimated from the noise.
Said QMF synthesis filter bank module raises the sampling frequency of the low-frequency signal at 8 kHz and of the estimated high-frequency signal to 16 kHz; the two signals then pass through low-pass and high-pass FIR filters respectively, whose coefficients are identical to those of the QMF analysis filters. Adding the two filtered signals yields the final wideband signal with a 16 kHz sampling frequency.
Beneficial effects: the invention provides a method of improving speech quality based on audio watermarking. The method exploits the characteristics of audio watermarking to set up a hidden channel inside the narrowband speech and uses this channel to transmit the parameters of the high-frequency speech, thereby realizing bandwidth extension of the speech signal without changing the legacy network protocol. The invention uses an adaptive audio watermark to realize speech bandwidth extension; its influence on the original speech is small, the amount of embedded high-frequency information is large, its robustness is good, it suits various types of speech, and the recovered wideband speech sounds better than narrowband speech.
Description of drawings
Fig. 1 is a block diagram of the principle of the invention.
Fig. 2 shows the window function of the invention.
Fig. 3 shows the G.711 encoded code stream format of the invention.
Fig. 4 is a block diagram of the high-frequency speech recovery of the invention.
Embodiment
The invention is described in detail below with reference to the accompanying drawings and embodiments.
Fig. 1 gives the complete block diagram of the invention. At the sending end, the speech produced by the speaker is a wideband signal; before transmission over the telephone line, the high-frequency parameters are embedded into the narrowband code stream, and the narrowband speech signal is transmitted over the telephone line. At the receiving end, A-law decoding is performed, the high-frequency parameter extraction module extracts the high-frequency parameters, the high-frequency parameter synthesis module recovers the high-frequency part of the wideband speech, and finally the high-frequency speech and the low-frequency speech are synthesized into wideband speech.
The modules in the block diagram of the invention are introduced as follows:
1. QMF analysis filter bank module
The speech produced at the sending end is wideband, while the telephone line carries narrowband speech, so the invention uses the QMF analysis filter bank to divide the wideband speech into two parts: the narrowband component of 0–8000 Hz and the high-frequency component of 8000–16000 Hz. The QMF analysis filters of the invention are 64-order FIR filters; the coefficients of the low-pass FIR filter h_L(n) are given in the appendix. The high-pass filter h_H(n) is obtained from the low-pass filter h_L(n) by frequency shifting, that is, by modulation with a complex sinusoidal sequence. Passing the wideband signal through the QMF analysis filter bank and reducing the sampling frequency of the two output signals to 8 kHz yields the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
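As a concrete illustration, the subband split described above can be sketched in Python with NumPy. This is a minimal sketch: the 2-tap filter pair below is only a stand-in for the patent's 64-order FIR filters, whose actual coefficients are given in the appendix.

```python
import numpy as np

def qmf_analysis(x, h_lo):
    """Split a wideband signal into low/high subbands with a QMF pair,
    then keep every other sample, halving the sampling rate."""
    n = np.arange(len(h_lo))
    h_hi = h_lo * (-1.0) ** n       # high-pass obtained by modulating the low-pass
    s_l = np.convolve(x, h_lo)[::2]
    s_h = np.convolve(x, h_hi)[::2]
    return s_l, s_h

# Toy 2-tap pair standing in for the patent's 64-order filters
h_lo = np.array([0.5, 0.5])
x = np.sin(2 * np.pi * 1000 * np.arange(160) / 16000.0)  # 1 kHz tone at 16 kHz
s_L, s_H = qmf_analysis(x, h_lo)
```

For a 1 kHz tone nearly all of the energy lands in the low band, as expected of the analysis split.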
2. High-frequency parameter extraction module
The invention extracts 30 high-frequency parameters: 16 temporal envelope parameters, 12 frequency-domain envelope parameters, the average temporal envelope parameter, and the average frequency-domain envelope parameter. The concrete extraction method for each parameter is given below.
(1) Extract the 16 temporal envelope parameters and the average temporal envelope parameter
The high-frequency component s_H(n) of each 20 ms frame is divided into 16 segments, each containing 10 samples, from which the 16 temporal envelope parameters T(i) are computed. The average temporal envelope is calculated, and each T(i) is differenced with the mean and normalized.
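The segmentation and normalization above can be sketched as follows. The per-segment RMS used for T(i) is an assumption, since the patent's exact formula is not reproduced in the source text.

```python
import numpy as np

def temporal_envelope(frame_h):
    """16 temporal envelope parameters of one 20 ms high-band frame:
    160 samples -> 16 segments of 10. T(i) is taken here as the RMS of
    segment i (an assumed formula), then differenced with the mean and
    normalized."""
    seg = np.asarray(frame_h, dtype=float).reshape(16, 10)
    T = np.sqrt(np.mean(seg ** 2, axis=1))     # envelope per segment
    T_mean = float(np.mean(T))                 # average temporal envelope
    T_norm = (T - T_mean) / (T_mean + 1e-12)   # normalized difference
    return T, T_mean, T_norm
```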
(2) Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter
The 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are passed through a windowing process using the window function window(n), whose length is 208 samples (N = 208); the window function is shown in Fig. 2. The windowed signal is zero-padded to 256 points, and a 256-point FFT yields S_F(k), where L = 256. The frequency domain is divided into 12 even intervals; the frequency-domain envelope parameter of each interval is computed and converted into a logarithmically weighted sub-band energy parameter; the sub-band division and the computation of each band's logarithmically weighted energy F(i) are given in the appendix. The average frequency-domain envelope is calculated, and each F(i) is differenced with the mean and normalized.
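A sketch of this extraction step follows. The Hann window and the exact band edges are assumptions: the patent's window is in its Fig. 2 and its sub-band table is in the appendix.

```python
import numpy as np

def freq_envelope(prev_tail48, cur_frame160):
    """Window 208 samples (48 from the previous frame + 160 current),
    zero-pad to 256, take a 256-point FFT, and compute a log-weighted
    energy per each of 12 even sub-bands."""
    x = np.concatenate([prev_tail48, cur_frame160])   # 208 samples
    X = np.fft.fft(x * np.hanning(len(x)), 256)       # zero-padded 256-pt FFT
    mag2 = np.abs(X[:128]) ** 2                       # one-sided power spectrum
    F = 10 * np.log10(mag2[:120].reshape(12, 10).sum(axis=1) + 1e-12)
    F_mean = float(np.mean(F))
    return F, F_mean, F - F_mean
```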
3. G.711 coding/decoding module
The narrowband speech signal s_L(n) is encoded by the A-law encoder, yielding a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with the A-law decoder to obtain the narrowband speech signal.
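The byte layout that the embedding algorithm below exploits can be made concrete with a sketch of A-law expansion, adapted from the classic public-domain g711.c routine (the 0x55 even-bit inversion applied on the wire is included here):

```python
def alaw_to_linear(byte):
    """Expand one 8-bit A-law byte to a linear sample. After the 0x55
    inversion, C7 is the sign, C6-C4 the segment code and C3-C0 the
    intra-segment code."""
    a = byte ^ 0x55                  # undo the even-bit inversion
    t = (a & 0x0F) << 4              # intra-segment code C3..C0
    seg = (a & 0x70) >> 4            # segment code C6..C4
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if (a & 0x80) else -t
```

Smaller segment codes map to smaller amplitudes, which is exactly the property the selective embedding mode relies on.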
4. Watermark embedding module
The existing least-significant-bit embedding algorithm simply embeds the watermark information in the lowest bit of the narrowband code stream. Considering the characteristics of the transmission protocol and the subjective perception of the human ear, this work proposes two improved least-significant-bit embedding algorithms.
The first method embeds the watermark into the code stream fairly uniformly: since one frame of the signal has 160 samples and the number of watermark bits to embed is 66, one bit of information can be embedded at every other sample. This avoids degrading the listening experience through excessive localized distortion and keeps the overall auditory quality at a high level.
The second method is a selective least-significant-bit embedding algorithm based on the characteristics of the transmission protocol and the auditory properties of the human ear. G.711 uses non-uniform quantization: when the sample value is small, the quantization interval is small; when the sample value is large, the quantization interval is large. Consequently, changing the code of a small sample changes its value only slightly, while changing the code of a large sample changes its value over a wide range; in theory, the resulting signal-to-noise ratio varies very little whether the watermark is embedded in small samples or in large ones. However, according to the temporal masking effect of the human ear, a large signal masks the small signals that follow it, so modifications of small signals are hard to perceive. Exploiting this property, the watermark information can be selectively embedded into the samples of small amplitude, making the watermark better hidden. Let C0–C7 denote the bits of the encoded code stream from the lowest to the highest, as shown in Fig. 3. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample value the code stream represents. This work uses bit C6 to divide the signal into large samples (C6 = 1) and small samples (C6 = 0), and embeds the watermark when C6 is 0. If one frame offers fewer than 66 embedding positions, the remaining watermark bits are embedded at other positions.
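A minimal sketch of the second (selective) embedding mode, operating directly on A-law bytes; the fallback order over the large samples is an assumption, since the patent only says "other positions":

```python
def embed_watermark(stream, bits):
    """Mode two: write watermark bits into the LSB (C0) of bytes whose
    C6 bit is 0 (small-amplitude samples); if the frame has too few
    small samples, fall back to the remaining (C6 = 1) positions."""
    out = bytearray(stream)
    k = 0
    for small in (True, False):          # small samples first, then the rest
        for i, b in enumerate(out):
            if k == len(bits):
                return bytes(out)
            if ((b & 0x40) == 0) == small:
                out[i] = (b & 0xFE) | bits[k]
                k += 1
    return bytes(out)
```

Note that writing C0 never changes C6, so the small/large classification seen by the extractor is unchanged by embedding.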
5. Watermark extraction module
Depending on the embedding algorithm, the corresponding extraction method is used. For the first algorithm, the watermark is extracted according to the positions where it was embedded. For the second, the characteristics of the code stream determine whether a watermark has been embedded: examining the frame from its start, if C6 is 0 the watermark bit is extracted from the lowest bit, and if C6 is 1 no bit is extracted; if fewer than 66 watermark bits have been extracted when the end of the frame is reached, return to the start of the frame and extract from the positions where C6 is 1, until 66 watermark bits have been extracted.
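The matching mode-two extractor can be sketched as:

```python
def extract_watermark(stream, n_bits=66):
    """Mode two extraction: read the LSB of each byte whose C6 bit is 0,
    scanning the frame from its start; if fewer than n_bits are found by
    the end of the frame, return to the start and read the LSBs of the
    C6 = 1 positions until n_bits watermark bits are collected."""
    bits = [b & 1 for b in stream if (b & 0x40) == 0]
    if len(bits) < n_bits:
        bits += [b & 1 for b in stream if b & 0x40]
    return bits[:n_bits]
```

Because it visits the small-amplitude positions first and the large ones second, in the same order as the embedder, the round trip recovers the embedded bits.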
6. High-frequency speech recovery module
Because the characteristics of high-frequency speech are similar to those of noise, this module uses white noise to recover the high-frequency speech. First, the generated white noise sequence is passed through an AR model constructed from the low-frequency speech; then the extracted high-frequency parameters are used to apply temporal envelope shaping and frequency-domain envelope shaping, which yields the high-frequency speech signal. The block diagram of high-frequency speech recovery is shown in Fig. 4.
(1) Use white noise to recover the high-frequency speech
Because the high-frequency and low-frequency speech have a certain correlation, an AR model is constructed from the decoded low-frequency speech. A white noise sequence is generated at the decoding end and shaped by the constructed AR model, so that the noise takes on the characteristics of the high-frequency speech.
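Under the assumption that the AR model is obtained by standard linear prediction (Levinson-Durbin on the autocorrelation of the decoded low-band signal — the patent does not spell out the estimator), the shaping step can be sketched as:

```python
import numpy as np

def lpc(x, order):
    """Levinson-Durbin: AR coefficients a (a[0] = 1) of the model
    x(n) ~ -sum_{k=1..order} a[k] x(n-k), from the autocorrelation of x."""
    x = np.asarray(x, dtype=float)
    r = np.array([x[: len(x) - k] @ x[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def shape_noise(noise, a):
    """All-pole synthesis filter 1/A(z): gives white noise the spectral
    envelope of the signal the AR model was fitted to."""
    y = np.zeros(len(noise))
    for n in range(len(noise)):
        y[n] = noise[n] - sum(a[k] * y[n - k] for k in range(1, min(len(a) - 1, n) + 1))
    return y
```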
(2) the local adjustment of temporal envelope
The temporal envelope parameter of normalization temporal envelope parameter of from watermark, recovering and average temporal envelope calculating high-frequency signal:
The temporal envelope calculation of parameter time domain local gain factor by noise and high-frequency signal:
The local time-domain gain factor is then used to adjust the temporal envelope of the noise:
The gain factor between two adjacent segments is smoothed by linear interpolation:
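A hedged sketch of this local gain adjustment: the exact gain and interpolation formulas were carried by equation images not present in this text, so an RMS-ratio gain per 10-sample segment, linearly interpolated between segment centres, is assumed.

```python
import numpy as np

def apply_local_gains(noise, target_env, seg_len=10):
    """Scale each segment of `noise` toward the target temporal envelope,
    interpolating the gain linearly across segment boundaries.
    target_env: one envelope value per segment (16 per 160-sample frame)."""
    n_seg = len(noise) // seg_len
    segs = noise[: n_seg * seg_len].reshape(n_seg, seg_len)
    env = np.sqrt(np.mean(segs ** 2, axis=1)) + 1e-12   # per-segment RMS
    g = target_env[:n_seg] / env                        # per-segment gain
    # Linear interpolation between segment centres avoids a step
    # discontinuity in gain at each segment boundary.
    centres = (np.arange(n_seg) + 0.5) * seg_len
    gain = np.interp(np.arange(n_seg * seg_len), centres, g)
    return noise[: n_seg * seg_len] * gain
```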
(3) Local frequency-domain-envelope adjustment
The time-domain-adjusted signal is processed using the extracted 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the log-weighted subband energy parameters and the average frequency-domain envelope of the noise. The frequency-domain envelope of the noise is then locally adjusted with the same method used for the local temporal-envelope adjustment.
(4) Global frequency-domain-envelope adjustment
The global frequency-domain gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
The global frequency-domain gain factor is used to globally adjust the frequency-domain envelope of each frame:
The adjusted spectrum is transformed back with an IFFT; the resulting time-domain signal is then windowed with the window function of Fig. 2 and stored in a buffer of length 208:
Wherein L=256 and n=0, 1, ..., 207.
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; the values at n=48~159 of the current frame's buffer then constitute the time-domain signal recovered for the current frame.
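Under one reading of the buffering described above (208-sample windowed buffers, 48-sample overlap, 160-sample output frame: the overlapped 48 samples form the head of the output, followed by samples 48~159 of the current buffer), the reconstruction can be sketched as:

```python
import numpy as np

def overlap_add(prev_buf, cur_buf, frame=160, overlap=48):
    """Recover the current 160-sample frame from two 208-sample
    windowed IFFT buffers (this interpretation of the overlap scheme
    is an assumption; the window itself is the one in Fig. 2)."""
    # Tail of the previous buffer overlaps the head of the current one.
    head = cur_buf[:overlap] + prev_buf[frame:frame + overlap]
    # Samples 48..159 of the current buffer are taken as-is.
    return np.concatenate([head, cur_buf[overlap:frame]])
```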
(5) Global temporal-envelope adjustment
The temporal envelope is globally adjusted following the same steps as the global frequency-domain-envelope adjustment; the adjusted signal is the high-frequency signal estimated from noise.
7. QMF synthesis filterbank module
The 8 kHz-sampled low-frequency signal and the estimated high-frequency signal are upsampled to 16 kHz and then passed through low-pass and high-pass FIR filters, respectively; the filter coefficients are identical to those of the QMF analysis filterbank. Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling rate:
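A generic two-band synthesis sketch of the step above. The patent's QMF coefficients are not given in this text, so a windowed-sinc half-band low-pass (and its spectral mirror for the high band) stands in for them:

```python
import numpy as np

def upsample2(x):
    """Zero-insertion upsampling by 2 (8 kHz -> 16 kHz)."""
    y = np.zeros(2 * len(x))
    y[::2] = x
    return y

def qmf_synthesis(low, high, taps=64):
    """Upsample both bands, low-pass the low band, high-pass the
    high band with the mirrored filter, and sum to a wideband signal."""
    n = np.arange(taps)
    # Ideal half-band low-pass 0.5*sinc(n/2), scaled by 2 to compensate
    # the energy lost in zero insertion, then Hamming-windowed.
    h = np.sinc(0.5 * (n - (taps - 1) / 2)) * np.hamming(taps)
    g = h * (-1.0) ** n   # spectral mirror: half-band high-pass
    return np.convolve(upsample2(low), h) + np.convolve(upsample2(high), g)
```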
Summary: this embodiment proposes two improved least-significant-bit watermarking algorithms. The first improvement embeds 1 bit of information at every other sample point, which avoids degrading the auditory quality through excessive local distortion and keeps the overall auditory quality at a high level. The second improvement is a selectable LSB embedding algorithm based on the characteristics of the transmission protocol and the auditory properties of the human ear. According to the temporal masking effect of the human ear, a large signal masks the small signals that follow it, so modifications to small signals are hard to perceive. Exploiting this property, the watermark can be selectively embedded in small-amplitude sample points, improving its imperceptibility.
Based on the above watermarking algorithms, this system embeds the high-frequency information of the speech signal into the narrowband code stream, transmits it over the wired telephone network, extracts the high-frequency parameters at the receiving end, and synthesizes wideband speech, thereby extending the bandwidth of the speech signal. Because the watermarking algorithms have good masking properties, even a receiving end without the watermark-extraction and wideband-synthesis modules still delivers normal speech quality, while a telephone terminal equipped with these functions hears the bandwidth-extended speech with greatly improved quality.
The above content is a further detailed description of the present invention in combination with preferred technical solutions, and the specific implementation of the invention is not limited to these descriptions. Those of ordinary skill in the technical field of the present invention may make simple deductions and substitutions without departing from the concept of the invention, and all such modifications shall be regarded as falling within the protection scope of the present invention.
Appendix
Subband division of the frequency-domain envelope:
The log-weighted energy F(i) of each subband is computed as follows:
Subband 0:
Subbands 1-10:
Subband 11:
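The per-subband formulas were carried by equation images lost from this text, so the following sketch only illustrates the general idea: split the positive half of a 256-point FFT into 12 bands and take a log energy per band. The uniform band edges and the 10*log10 scaling here are assumptions, and the special-case weighting for subbands 0 and 11 is not reproduced.

```python
import numpy as np

def subband_log_energy(frame, n_fft=256, n_bands=12):
    """Log energy of 12 near-equal bands of the half-spectrum
    (illustrative stand-in for the patent's weighted formulas)."""
    spec = np.fft.fft(frame, n_fft)[: n_fft // 2]
    edges = np.linspace(0, n_fft // 2, n_bands + 1).astype(int)
    return np.array([
        10.0 * np.log10(np.sum(np.abs(spec[a:b]) ** 2) / (b - a) + 1e-12)
        for a, b in zip(edges[:-1], edges[1:])
    ])
```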
Claims (2)
1. A method for speech bandwidth extension based on audio watermarking, comprising the following steps, wherein step B and steps F2 and F3 follow the approach of the document "DTX/CNG Algorithm Research and Implementation Based on a Layered Wideband Speech Coding/Decoding System":
Step A. A QMF analysis filterbank module divides the wideband speech into two parts: the narrowband speech of 0~8000 Hz and the high-frequency component of 8000~16000 Hz; the two output signals pass through a downsampling module that reduces the sampling rate to 8 kHz, yielding the low-frequency signal s_L(n) and the high-frequency signal s_H(n);
Step B. The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 temporal-envelope parameters, 12 frequency-domain envelope parameters, an average temporal-envelope parameter, and an average frequency-domain envelope parameter; the concrete extraction method for each parameter is as follows:
Step B1. Extract the 16 temporal-envelope parameters and the average temporal-envelope parameter:
The high-frequency component s_H(n) of each 20 ms frame is divided into 16 segments of 10 sample points each; the 16 temporal-envelope parameters are:
The average temporal envelope is computed:
The temporal-envelope parameters T(i) and their mean are differenced and normalized:
Step B2. Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:
The 160 sample points of the current frame of the high-frequency component s_H(n), together with the last 48 sample points of the previous frame, are processed by a windowing module using a window function Window(n) of length 208:
Wherein N=208;
The windowed signal is zero-padded to 256 points and a 256-point FFT is applied, yielding S_F(k):
Wherein L=256; the frequency domain is divided into 12 uniform intervals, the frequency-domain envelope parameter of each interval is computed and converted into a log-weighted subband energy parameter;
The average frequency-domain envelope is computed:
The frequency-domain envelope parameters F(i) and their mean are differenced and normalized:
Step D1. The watermark embedding module embeds the watermark uniformly in the code stream: since one frame of the signal has 160 sample points and the watermark is 66 bits, 1 bit of information is embedded at every other sample point;
Or step D2. The watermark embedding module selectively embeds the watermark information in small-amplitude sample points; let C0~C7 denote the bits of the code stream from least significant to most significant; according to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6~C4 form the segment code, and C3~C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample the code stream represents; this method uses bit C6 to divide samples into large signals (C6=1) and small signals (C6=0), and embeds the watermark when C6 is 0; if a frame provides fewer than 66 embedding positions, the remaining bits are embedded at other positions;
Step E. The watermark extraction module extracts the watermark correspondingly to step D, in one of two ways:
E1. The watermark extraction module extracts the watermark according to the positions at which it was embedded;
Or E2. The characteristics of the code stream determine whether a watermark bit was embedded: scanning from the start of a frame, if C6 is 0, a watermark bit is extracted from the least significant bit; if C6 is 1, no bit is extracted; if fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and continues at the positions where C6 is 1, until all 66 watermark bits have been extracted;
Step F. The high-frequency speech recovery module recovers the high-frequency speech from white noise:
A white noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to perform temporal-envelope shaping and frequency-domain-envelope shaping, yielding the high-frequency speech signal;
Step F1. Recover the high-frequency speech from white noise:
Since the high-frequency and low-frequency speech are correlated to some degree, an AR model module is constructed from the decoded low-frequency speech; the decoder generates a white noise sequence and shapes it with this AR model module, so that the noise acquires the characteristics of high-frequency speech;
Step F2. Local temporal-envelope adjustment module:
The normalized temporal-envelope parameters recovered from the watermark and the average temporal-envelope parameter are used to compute the temporal envelope of the high-frequency signal:
The local time-domain gain factor is computed from the temporal-envelope parameters of the noise and of the high-frequency signal:
The local time-domain gain module adjusts the temporal envelope of the noise:
The gain factor between two adjacent segments is smoothed by linear interpolation:
Step F3. Local frequency-domain-envelope adjustment module:
The time-domain-adjusted signal is processed using the extracted 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the log-weighted subband energy parameters and the average frequency-domain envelope of the noise; the frequency-domain envelope of the noise is then locally adjusted with the same method used for the local temporal-envelope adjustment;
Step F4. Global frequency-domain-envelope adjustment module:
The global frequency-domain gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
The global frequency-domain gain factor is used to globally adjust the frequency-domain envelope of each frame:
The adjusted spectrum is fed to an IFFT module; the resulting time-domain signal is then windowed by the window function module and stored in a buffer of length 208:
Wherein L=256;
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; the values at n=48~159 of the current frame's buffer then constitute the time-domain signal recovered for the current frame;
Step F5. Global temporal-envelope adjustment module:
The temporal envelope is globally adjusted following the same steps as the global frequency-domain-envelope adjustment; the adjusted signal is the high-frequency signal estimated from noise;
Step G. The QMF synthesis filterbank module raises the sampling rate of the 8 kHz-sampled low-frequency signal and of the estimated high-frequency signal to 16 kHz; the signals then pass through low-pass and high-pass FIR filter modules, respectively, whose coefficients are identical to those of the QMF analysis filterbank; adding the two filtered signals yields the final wideband signal at a 16 kHz sampling rate:
2. A device for speech bandwidth extension based on audio watermarking, characterized in that the device comprises: a QMF analysis filterbank module, a high-frequency parameter extraction module, a G.711 encoding/decoding module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filterbank module;
Said QMF analysis filterbank module divides the wideband speech into two parts: the narrowband speech of 0~8000 Hz and the high-frequency component of 8000~16000 Hz; the two output signals are fed to a downsampling module that reduces the sampling rate to 8 kHz, yielding the low-frequency signal s_L(n) and the high-frequency signal s_H(n);
Said high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 temporal-envelope parameters, 12 frequency-domain envelope parameters, an average temporal-envelope parameter, and an average frequency-domain envelope parameter; this part follows the approach of the document "DTX/CNG Algorithm Research and Implementation Based on a Layered Wideband Speech Coding/Decoding System"; the concrete extraction method for each parameter is as follows:
The module for extracting the 16 temporal-envelope parameters and the average temporal-envelope parameter:
The high-frequency component s_H(n) of each 20 ms frame is divided into 16 segments of 10 sample points each; the 16 temporal-envelope parameters are:
The average temporal envelope is computed:
The temporal-envelope parameters T(i) and their mean are differenced and normalized:
The module for extracting the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:
The 160 sample points of the current frame of the high-frequency component s_H(n), together with the last 48 sample points of the previous frame, are processed by a windowing module using a window function Window(n) of length 208:
Wherein N=208;
The windowed signal is zero-padded to 256 points and a 256-point FFT is applied, yielding S_F(k):
Wherein L=256; the frequency domain is divided into 12 uniform intervals, the frequency-domain envelope parameter of each interval is computed and converted into a log-weighted subband energy parameter;
The average frequency-domain envelope is computed:
The frequency-domain envelope parameters F(i) and their mean are differenced and normalized:
Said G.711 encoding/decoding module encodes the narrowband speech signal s_L(n) with an A-law encoder module, producing a code stream of 8 bits per sample; the watermark information is embedded in this code stream, which is sent over the telephone line into the network; the receiving end extracts the watermark information from the code stream and decodes it with an A-law decoder, obtaining the narrowband speech signal;
Said watermark embedding module embeds the watermark in the code stream in one of two ways:
Mode one: the watermark embedding module embeds the watermark uniformly in the code stream: since one frame of the signal has 160 sample points and the watermark is 66 bits, 1 bit of information is embedded at every other sample point;
Mode two: the watermark embedding module selectively embeds the watermark information in small-amplitude sample points; let C0~C7 denote the bits of the code stream from least significant to most significant; according to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6~C4 form the segment code, and C3~C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample the code stream represents; this method uses bit C6 to divide samples into large signals (C6=1) and small signals (C6=0), and embeds the watermark when C6 is 0; if a frame provides fewer than 66 embedding positions, the remaining bits are embedded at other positions;
Said watermark extraction module extracts the watermark correspondingly to the watermark embedding module, in one of two ways:
Mode one: the watermark extraction module extracts the watermark according to the positions at which it was embedded;
Mode two: the characteristics of the code stream determine whether a watermark bit was embedded: scanning from the start of a frame, if C6 is 0, a watermark bit is extracted from the least significant bit; if C6 is 1, no bit is extracted; if fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and continues at the positions where C6 is 1, until all 66 watermark bits have been extracted;
Said high-frequency speech recovery module recovers the high-frequency speech from white noise:
A white noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to perform temporal-envelope shaping and frequency-domain-envelope shaping, yielding the high-frequency speech signal;
The module for recovering the high-frequency speech from white noise:
Since the high-frequency and low-frequency speech are correlated to some degree, an AR model module is constructed from the decoded low-frequency speech; the decoder generates a white noise sequence and shapes it with this AR model module, so that the noise acquires the characteristics of high-frequency speech;
The local temporal-envelope adjustment module, which follows the approach of the document "DTX/CNG Algorithm Research and Implementation Based on a Layered Wideband Speech Coding/Decoding System":
The normalized temporal-envelope parameters recovered from the watermark and the average temporal-envelope parameter are used to compute the temporal envelope of the high-frequency signal:
The local time-domain gain factor is computed from the temporal-envelope parameters of the noise and of the high-frequency signal:
The local time-domain gain factor is used to adjust the temporal envelope of the noise:
The gain factor between two adjacent segments is smoothed by linear interpolation:
The local frequency-domain-envelope adjustment module, which follows the approach of the document "DTX/CNG Algorithm Research and Implementation Based on a Layered Wideband Speech Coding/Decoding System":
The time-domain-adjusted signal is processed using the extracted 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the log-weighted subband energy parameters and the average frequency-domain envelope of the noise; the frequency-domain envelope of the noise is then locally adjusted with the same method used for the local temporal-envelope adjustment;
Global frequency-domain-envelope adjustment:
The global frequency-domain gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
The global frequency-domain gain factor is used to globally adjust the frequency-domain envelope of each frame:
The adjusted spectrum is fed through an IFFT module; the resulting time-domain signal is then windowed by the window function module and stored in a buffer of length 208:
Wherein L=256 and n=0, 1, ..., 207;
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; the values at n=48~159 of the current frame's buffer then constitute the time-domain signal recovered for the current frame;
Global temporal-envelope adjustment module:
The temporal envelope is globally adjusted following the same steps as the global frequency-domain-envelope adjustment; the adjusted signal is the high-frequency signal estimated from noise;
Said QMF synthesis filterbank module raises the sampling rate of the 8 kHz-sampled low-frequency signal and of the estimated high-frequency signal to 16 kHz; the signals then pass through low-pass and high-pass FIR filter modules, respectively, whose coefficients are identical to those of the QMF analysis filterbank; adding the two filtered signals yields the final wideband signal at a 16 kHz sampling rate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011104223927A CN102543086B (en) | 2011-12-16 | 2011-12-16 | Device and method for expanding speech bandwidth based on audio watermarking |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011104223927A CN102543086B (en) | 2011-12-16 | 2011-12-16 | Device and method for expanding speech bandwidth based on audio watermarking |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102543086A true CN102543086A (en) | 2012-07-04 |
CN102543086B CN102543086B (en) | 2013-08-14 |
Family
ID=46349824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2011104223927A Expired - Fee Related CN102543086B (en) | 2011-12-16 | 2011-12-16 | Device and method for expanding speech bandwidth based on audio watermarking |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102543086B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103001915A (en) * | 2012-11-30 | 2013-03-27 | 南京邮电大学 | Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system |
CN103258543A (en) * | 2013-04-12 | 2013-08-21 | 大连理工大学 | Method for expanding artificial voice bandwidth |
CN103474079A (en) * | 2012-08-06 | 2013-12-25 | 苏州沃通信息科技有限公司 | Voice encoding method |
CN104269173A (en) * | 2014-09-30 | 2015-01-07 | 武汉大学深圳研究院 | Voice frequency bandwidth extension device and method achieved in switching mode |
CN105264599A (en) * | 2013-01-29 | 2016-01-20 | 弗劳恩霍夫应用研究促进协会 | Audio encoder, audio decoder, method for providing encoded audio information and decoded audio information, computer program and encoded representation using a signal-adaptive bandwidth extension |
CN105659321A (en) * | 2014-02-28 | 2016-06-08 | 松下电器(美国)知识产权公司 | Decoding device, encoding device, decoding method, encoding method, terminal device, and base station device |
CN105723458A (en) * | 2013-09-12 | 2016-06-29 | 沙特阿拉伯石油公司 | Dynamic threshold methods, systems, computer readable media, and program code for filtering noise and restoring attenuated high-frequency components of acoustic signals |
CN106612168A (en) * | 2016-12-23 | 2017-05-03 | 中国电子科技集团公司第三十研究所 | Voice out-of-synchronism detection method based on PCM coding characteristics |
CN108074578A (en) * | 2016-11-17 | 2018-05-25 | 中国科学院声学研究所 | A kind of transmission of audio frequency watermark and the system and method for information exchange |
CN108269585A (en) * | 2013-04-05 | 2018-07-10 | 杜比实验室特许公司 | The companding device and method of quantizing noise are reduced using advanced spectrum continuation |
CN110544472A (en) * | 2019-09-29 | 2019-12-06 | 上海依图信息技术有限公司 | Method for improving performance of voice task using CNN network structure |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4667340A (en) * | 1983-04-13 | 1987-05-19 | Texas Instruments Incorporated | Voice messaging system with pitch-congruent baseband coding |
CN101140759A (en) * | 2006-09-08 | 2008-03-12 | 华为技术有限公司 | Band-width spreading method and system for voice or audio signal |
CN101521014A (en) * | 2009-04-08 | 2009-09-02 | 武汉大学 | Audio bandwidth expansion coding and decoding devices |
CN102105931A (en) * | 2008-07-11 | 2011-06-22 | 弗朗霍夫应用科学研究促进协会 | Apparatus and method for generating a bandwidth extended signal |
2011-12-16: CN CN2011104223927A, granted as patent CN102543086B (en); status: not active (Expired - Fee Related)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4667340A (en) * | 1983-04-13 | 1987-05-19 | Texas Instruments Incorporated | Voice messaging system with pitch-congruent baseband coding |
CN101140759A (en) * | 2006-09-08 | 2008-03-12 | 华为技术有限公司 | Band-width spreading method and system for voice or audio signal |
CN102105931A (en) * | 2008-07-11 | 2011-06-22 | 弗朗霍夫应用科学研究促进协会 | Apparatus and method for generating a bandwidth extended signal |
CN101521014A (en) * | 2009-04-08 | 2009-09-02 | 武汉大学 | Audio bandwidth expansion coding and decoding devices |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103474079A (en) * | 2012-08-06 | 2013-12-25 | 苏州沃通信息科技有限公司 | Voice encoding method |
CN103001915B (en) * | 2012-11-30 | 2015-01-28 | 南京邮电大学 | Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system |
CN103001915A (en) * | 2012-11-30 | 2013-03-27 | 南京邮电大学 | Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system |
CN105264599B (en) * | 2013-01-29 | 2019-05-10 | 弗劳恩霍夫应用研究促进协会 | Audio coder, provides the method for codes audio information at audio decoder |
CN105264599A (en) * | 2013-01-29 | 2016-01-20 | 弗劳恩霍夫应用研究促进协会 | Audio encoder, audio decoder, method for providing encoded audio information and decoded audio information, computer program and encoded representation using a signal-adaptive bandwidth extension |
US11423923B2 (en) | 2013-04-05 | 2022-08-23 | Dolby Laboratories Licensing Corporation | Companding system and method to reduce quantization noise using advanced spectral extension |
CN108269585A (en) * | 2013-04-05 | 2018-07-10 | 杜比实验室特许公司 | The companding device and method of quantizing noise are reduced using advanced spectrum continuation |
CN103258543A (en) * | 2013-04-12 | 2013-08-21 | 大连理工大学 | Method for expanding artificial voice bandwidth |
CN105723458A (en) * | 2013-09-12 | 2016-06-29 | 沙特阿拉伯石油公司 | Dynamic threshold methods, systems, computer readable media, and program code for filtering noise and restoring attenuated high-frequency components of acoustic signals |
CN105723458B (en) * | 2013-09-12 | 2019-09-24 | 沙特阿拉伯石油公司 | For filtering out noise and going back dynamic threshold method, the system, computer-readable medium of the high fdrequency component that acoustic signal is decayed |
US10672409B2 (en) | 2014-02-28 | 2020-06-02 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Decoding device, encoding device, decoding method, and encoding method |
CN105659321A (en) * | 2014-02-28 | 2016-06-08 | 松下电器(美国)知识产权公司 | Decoding device, encoding device, decoding method, encoding method, terminal device, and base station device |
US11257506B2 (en) | 2014-02-28 | 2022-02-22 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Decoding device, encoding device, decoding method, and encoding method |
CN104269173B (en) * | 2014-09-30 | 2018-03-13 | 武汉大学深圳研究院 | The audio bandwidth expansion apparatus and method of switch mode |
CN104269173A (en) * | 2014-09-30 | 2015-01-07 | 武汉大学深圳研究院 | Voice frequency bandwidth extension device and method achieved in switching mode |
CN108074578A (en) * | 2016-11-17 | 2018-05-25 | 中国科学院声学研究所 | A kind of transmission of audio frequency watermark and the system and method for information exchange |
CN106612168B (en) * | 2016-12-23 | 2019-07-16 | 中国电子科技集团公司第三十研究所 | A kind of voice step failing out detecting method based on pcm encoder feature |
CN106612168A (en) * | 2016-12-23 | 2017-05-03 | 中国电子科技集团公司第三十研究所 | Voice out-of-synchronism detection method based on PCM coding characteristics |
CN110544472A (en) * | 2019-09-29 | 2019-12-06 | 上海依图信息技术有限公司 | Method for improving performance of voice task using CNN network structure |
Also Published As
Publication number | Publication date |
---|---|
CN102543086B (en) | 2013-08-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102543086B (en) | Device and method for expanding speech bandwidth based on audio watermarking | |
JP7330934B2 (en) | Apparatus and method for bandwidth extension of acoustic signals | |
CN101512639B (en) | Method and equipment for voice/audio transmitter and receiver | |
CN1808568B (en) | Audio encoding/decoding apparatus having watermark insertion/abstraction function and method using the same | |
CN102522092B (en) | Device and method for expanding speech bandwidth based on G.711.1 | |
CN101202043B (en) | Method and system for encoding and decoding audio signal | |
CN105280190B (en) | Bandwidth extension encoding and decoding method and device | |
CN101206860A (en) | Method and apparatus for encoding and decoding layered audio | |
KR100921867B1 (en) | Apparatus And Method For Coding/Decoding Of Wideband Audio Signals | |
CN101779236A (en) | Temporal masking in audio coding based on spectral dynamics in frequency sub-bands | |
CN101662288A (en) | Method, device and system for encoding and decoding audios | |
CN102194458B (en) | Spectral band replication method and device and audio decoding method and system | |
Chen et al. | An audio watermark-based speech bandwidth extension method | |
CN102142255A (en) | Method for embedding and extracting digital watermark in audio signal | |
Bao et al. | MP3-resistant music steganography based on dynamic range transform | |
CN105957533B (en) | Voice compression method, voice decompression method, audio encoder and audio decoder | |
CN101604527A (en) | Under the VoIP environment based on the method for the hidden transferring of wideband voice of G.711 encoding | |
CN114974270A (en) | Audio information self-adaptive hiding method | |
Xu et al. | Performance analysis of data hiding in MPEG-4 AAC audio | |
Dymarski et al. | Robust audio watermarks in frequency domain | |
Nishimura | Data hiding for audio signals that are robust with respect to air transmission and a speech codec | |
CN114863942B (en) | Model training method for voice quality conversion, method and device for improving voice quality | |
CN103474079A (en) | Voice encoding method | |
Koduri | Discrete cosine transform-based data hiding for speech bandwidth extension | |
CN101833953B (en) | Method and device for lowering redundancy rate of multi-description coding and decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20130814; Termination date: 20151216 |
EXPY | Termination of patent right or utility model | |