CN102543086A - Device and method for expanding speech bandwidth based on audio watermarking - Google Patents

Device and method for expanding speech bandwidth based on audio watermarking

Info

Publication number
CN102543086A
Authority
CN
China
Prior art keywords
frequency
watermark
module
signal
frequency domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011104223927A
Other languages
Chinese (zh)
Other versions
CN102543086B (en)
Inventor
陈喆
殷福亮
赵承勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN2011104223927A priority Critical patent/CN102543086B/en
Publication of CN102543086A publication Critical patent/CN102543086A/en
Application granted granted Critical
Publication of CN102543086B publication Critical patent/CN102543086B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a device and method for expanding speech bandwidth based on audio watermarking. They operate as follows: the speech produced by the talker is originally a wideband signal; before transmission over the telephone line, the high-frequency parameters are embedded into the narrowband code stream as a watermark; the narrowband speech signal is transmitted over the telephone line; at the receiving end, A-law decoding is performed and the high-frequency parameters are extracted; the high-frequency part of the wideband speech is recovered from these parameters; finally, the high-frequency and low-frequency speech are synthesized into wideband speech. The device and method exploit the properties of audio watermarking to establish a hidden channel inside the narrowband speech and use this channel to transmit the parameters of the high-frequency speech, achieving bandwidth extension of the speech signal without changing the original network protocol.

Description

Apparatus and method for speech bandwidth extension based on audio watermarking
Technical field
The present invention relates to speech processing technology, and in particular to an apparatus and method for speech bandwidth extension based on audio watermarking.
Background art
The energy of the human speech signal is concentrated mainly in 0.3–3.4 kHz, and a 4 kHz bandwidth is sufficient to guarantee intelligibility. For this reason, the public switched telephone network (PSTN) coding standard G.711 (A-law and μ-law) formulated by the International Telecommunication Union (ITU) uses a sampling frequency of 8 kHz, and it remains in use to this day.
Narrowband speech reduces the demand on communication bandwidth while maintaining adequate intelligibility, but at the cost of naturalness: narrowband speech loses the high-frequency components of the original speech and therefore sounds unnatural. To improve speech quality, ITU-T proposed G.722, the first wideband speech codec, for remote telephone conferencing. Wideband speech communication could be achieved by redesigning the transmission links, but for the huge PSTN fixed telephone network, redesigning the links is prohibitively expensive.
A traditional watermark is the mark visible when paper is held up to the light, generally used to verify the authenticity of important documents. Digital watermarking, by contrast, exploits the ubiquitous redundancy and randomness of multimedia works to embed digital information into them, achieving covert transmission of information. Digital watermarking is mainly used to protect the copyright and integrity of such works. Because human hearing is more acute than human vision, embedding a watermark in audio is more difficult than embedding one in an image.
Audio watermarking based on the least significant bit (LSB): LSB-based speech bandwidth extension embeds the high-frequency parameters into the lowest bit of the encoded code stream. This method embeds many watermark bits and is algorithmically simple, suiting channels with a low bit error rate.
Audio watermarking based on time-domain echo hiding: this technique exploits the temporal masking effect of human hearing, namely that even after one sound ends it still affects the audibility of another. The number of watermark bits that can be embedded is small, and the embedded watermark has some influence on the original sound.
Audio watermarking based on the discrete Fourier transform (DFT): this method first applies a DFT to the audio, then embeds the watermark into the DFT coefficients of the 2.4–6.4 kHz band, replacing the corresponding DFT coefficients with spectral components representing the watermark sequence. Although this method is quite robust, when the embedded watermark differs too much from the original DFT coefficients it noticeably degrades the original speech.
Audio watermarking based on the discrete cosine transform: this method applies a modified discrete cosine transform (MDCT) to the time-domain signal and embeds the watermark by modifying the MDCT coefficients. It is robust, but the number of watermark bits that can be embedded is small.
Shortcomings of the prior art: none of the above methods achieves a good balance among robustness, imperceptibility, and watermark capacity; each has its own drawbacks, so none is well suited to speech bandwidth extension.
Summary of the invention
To overcome the various shortcomings and defects of existing audio watermarking methods when used for bandwidth extension, the invention provides an apparatus and method for speech bandwidth extension based on audio watermarking.
To achieve the above object, the method for speech bandwidth extension based on audio watermarking provided by the invention comprises the following steps:
Step A. A QMF analysis filterbank module splits the wideband speech into two parts: the narrowband speech of 0–8000 Hz and the high-frequency component of 8000–16000 Hz; the sampling frequency of the two output signals is reduced to 8 kHz, yielding the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
Step B. A high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, an average time-domain envelope parameter, and an average frequency-domain envelope parameter. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithms Based on a Layered Wideband Speech Codec System". The extraction of each parameter is as follows:
Step B1. Extract the 16 time-domain envelope parameters and the average time-domain envelope parameter: each 20 ms of the high-frequency component s_H(n) is divided into 16 segments of 10 samples each, from which the 16 time-domain envelope parameters T(i) are computed [equation shown only as an image in the original]. The average time-domain envelope is then computed [equation image], and each T(i) is differenced against the mean and normalized [equation image].
Step B2. Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter: the 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are windowed with a window function window(n) of length N = 208 samples [window definition shown as an image]. The windowed signal is zero-padded to 256 points and a 256-point FFT is applied to obtain S_F(k), where L = 256 [equation image]. The frequency domain is divided into 12 uniform intervals; the frequency-domain envelope parameter of each interval is computed and converted into a log-weighted subband energy parameter. The average frequency-domain envelope is computed, and each frequency-domain envelope parameter F(i) is differenced against the mean and normalized [equation images].
Step C. A G.711 codec module encodes the narrowband speech signal s_L(n) with an A-law encoder, producing a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with an A-law decoder to obtain the narrowband speech signal.
Step D. A watermark embedding module embeds the watermark into the code stream in one of the following two ways:
D1. The watermark is embedded uniformly in the code stream: since one frame of the signal has 160 samples and the watermark is 66 bits, 1 bit of information is embedded every other sample.
D2. Alternatively, the watermark information is embedded selectively into samples of small amplitude. Let C0–C7 denote the bits of the code word from least significant to most significant. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample the code word represents. This method uses bit C6 to divide the samples into large signals (C6 = 1) and small signals (C6 = 0), and embeds the watermark when C6 is 0. If fewer than 66 embedding positions are available in a frame, the remaining watermark bits are embedded in other positions.
Step E. A watermark extraction module extracts the watermark in a manner matching Step D, in one of two ways:
E1. The watermark is extracted according to the positions in which it was embedded.
E2. Alternatively, whether a watermark bit is present is judged from the characteristics of the code stream: scanning from the start of a frame, if C6 is 0 a watermark bit is extracted from the lowest bit, and if C6 is 1 no bit is extracted. If fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and takes bits from the samples where C6 is 1, until 66 watermark bits have been extracted.
Step F. A high-frequency speech recovery module uses white noise to recover the high-frequency speech:
First, a white-noise sequence is shaped by an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to apply time-domain and frequency-domain envelope shaping, yielding the high-frequency speech signal.
Step F1. Recover the high-frequency speech from white noise: since the high-frequency and low-frequency speech are correlated to some degree, an AR model is constructed from the decoded low-frequency speech. A white-noise sequence generated at the decoding end is shaped by this AR model so that the noise takes on the characteristics of high-frequency speech.
Step F2. Local adjustment of the time-domain envelope, following the approach of the document "Research and Implementation of DTX/CNG Algorithms Based on a Layered Wideband Speech Codec System": the time-domain envelope parameters of the high-frequency signal are computed from the normalized time-domain envelope parameters recovered from the watermark and the average time-domain envelope [equation image]. A time-domain local gain factor is computed from the time-domain envelope parameters of the noise and of the high-frequency signal [equation image], and the time-domain envelope of the noise is adjusted with this gain factor [equation images]. The gain factor between two segments is smoothed by linear interpolation [equation image].
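Step F2 can be sketched as follows. Because the patent's equations survive only as images, the exact gain formula is not recoverable; this sketch assumes the common form (gain = square root of the target-to-actual segment energy ratio) and ramps the gain linearly across each 10-sample segment, consistent with the interpolation the text describes. `local_gain_adjust` is a hypothetical helper name.

```python
import math

def local_gain_adjust(noise, target_env, seg_len=10):
    """Scale each segment of the shaped noise so its energy matches the
    target time-domain envelope, ramping the gain linearly across each
    segment so adjacent segments join smoothly."""
    n_seg = len(noise) // seg_len
    # per-segment energies of the noise as it stands
    noise_env = [sum(v * v for v in noise[i * seg_len:(i + 1) * seg_len])
                 for i in range(n_seg)]
    # local gain: sqrt of target-to-actual energy ratio (assumed form)
    gains = [math.sqrt(target_env[i] / noise_env[i]) if noise_env[i] > 0 else 0.0
             for i in range(n_seg)]
    out, prev_g = [], gains[0]
    for i in range(n_seg):
        for k in range(seg_len):
            g = prev_g + (gains[i] - prev_g) * (k + 1) / seg_len  # linear ramp
            out.append(noise[i * seg_len + k] * g)
        prev_g = gains[i]
    return out
```

For a constant-amplitude noise segment with energy 10 and a target envelope of 40, the gain is 2 throughout, as expected from the energy ratio.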
Step F3. Local adjustment of the frequency-domain envelope, following the same document: the time-domain-adjusted signal is processed as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the log-weighted subband energy parameters and the average frequency-domain envelope of the noise [equation images]. The frequency-domain envelope of the noise is then adjusted locally by the same method used for the local adjustment of the time-domain envelope.
Step F4. Global adjustment of the frequency-domain envelope:
A frequency-domain global gain factor is computed for each frame from the average frequency-domain envelopes of the noise and of the high-frequency signal [equation image].
The frequency-domain envelope of each frame is adjusted globally with this gain factor [equation image].
The adjusted spectrum is transformed back with an IFFT; the resulting time-domain signal is windowed with the window function and stored in a buffer of length 208 [equation image], where L = 256 and n = 0, 1, ..., 207.
The last 48 samples of the previous frame's buffer are added to the first 48 samples of the current frame's buffer; these sums, together with the values at n = 48–159 of the current frame's buffer, constitute the time-domain signal recovered for the current frame.
Step F5. Global adjustment of the time-domain envelope: the time-domain envelope is adjusted globally following the same steps as the global adjustment of the frequency-domain envelope; the adjusted signal is the high-frequency signal estimated from the noise.
Step G. A QMF synthesis filterbank module raises the sampling frequency of the 8 kHz low-frequency signal and of the estimated high-frequency signal to 16 kHz; each is then passed through a low-pass or high-pass FIR filter, respectively, whose coefficients are identical to those of the QMF analysis filters. Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling frequency [equation image].
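Step G amounts to zero-insertion upsampling, FIR filtering with the shared QMF coefficient set, and addition. A minimal sketch under that reading follows; the one-tap filter in the test below is only a placeholder for the patent's 64th-order low-pass, whose coefficients are in the appendix.

```python
def upsample2(x):
    """Zero-insertion upsampling: 8 kHz -> 16 kHz."""
    out = []
    for v in x:
        out.extend([v, 0.0])
    return out

def fir(x, h):
    """Direct-form FIR convolution, truncated to the input length."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

def qmf_synthesize(low, high, h_lp):
    """Upsample both bands, filter with the shared QMF pair, and add."""
    h_hp = [((-1) ** n) * c for n, c in enumerate(h_lp)]  # high-pass by modulation
    return [a + b for a, b in zip(fir(upsample2(low), h_lp),
                                  fir(upsample2(high), h_hp))]
```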
The invention further provides a device for speech bandwidth extension based on audio watermarking. The device comprises: a QMF analysis filterbank module, a high-frequency parameter extraction module, a G.711 codec module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filterbank module.
The QMF analysis filterbank module splits the wideband speech into two parts: the narrowband speech of 0–8000 Hz and the high-frequency component of 8000–16000 Hz; the sampling frequency of the two output signals is reduced to 8 kHz, yielding the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, an average time-domain envelope parameter, and an average frequency-domain envelope parameter. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithms Based on a Layered Wideband Speech Codec System". The extraction of each parameter is as follows:
Extract the 16 time-domain envelope parameters and the average time-domain envelope parameter: each 20 ms of the high-frequency component s_H(n) is divided into 16 segments of 10 samples each, from which the 16 time-domain envelope parameters T(i) are computed [equation image]. The average time-domain envelope is then computed [equation image], and each T(i) is differenced against the mean and normalized [equation image].
Extract 12 frequency domain envelope parameters and average frequency domain envelope parameters:
High fdrequency component s H ( n) 160 sampled points and last 48 of previous frame of present frame adopt point to get through a windowing process , use long 208 the sampling point window functions of window here Window( n):
Figure DEST_PATH_IMAGE039
Figure 934484DEST_PATH_IMAGE040
Wherein, N=208.
Signal after the windowing is mended 0 to 256 point, and the FFT conversion of doing then at 256 gets S F ( k):
Figure DEST_PATH_IMAGE041
Figure 100017DEST_PATH_IMAGE042
Wherein, L=256; Frequency domain is divided into 12 evenly at interval, calculates each frequency domain envelope parameters at interval, and convert logarithm weighting sub belt energy parameter to.
Calculate the average frequency domain envelope:
With the frequency domain envelope parameters F( i) and mean value
Figure 104882DEST_PATH_IMAGE044
Make difference and carry out normalization:
Figure DEST_PATH_IMAGE045
Figure 415253DEST_PATH_IMAGE046
The G.711 codec module encodes the narrowband speech signal s_L(n) with an A-law encoder, producing a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with an A-law decoder to obtain the narrowband speech signal.
The watermark embedding module embeds the watermark into the code stream in one of the following two ways:
Mode one: the watermark is embedded uniformly in the code stream: since one frame of the signal has 160 samples and the watermark is 66 bits, 1 bit of information is embedded every other sample.
Mode two: the watermark information is embedded selectively into samples of small amplitude. Let C0–C7 denote the bits of the code word from least significant to most significant. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample the code word represents. This method uses bit C6 to divide the samples into large signals (C6 = 1) and small signals (C6 = 0), and embeds the watermark when C6 is 0. If fewer than 66 embedding positions are available in a frame, the remaining watermark bits are embedded in other positions.
The watermark extraction module extracts the watermark in a manner matching the watermark embedding module, in one of two ways:
Mode one: the watermark is extracted according to the positions in which it was embedded.
Mode two: whether a watermark bit is present is judged from the characteristics of the code stream: scanning from the start of a frame, if C6 is 0 a watermark bit is extracted from the lowest bit, and if C6 is 1 no bit is extracted. If fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and takes bits from the samples where C6 is 1, until 66 watermark bits have been extracted.
The high-frequency speech recovery module uses white noise to recover the high-frequency speech:
First, a white-noise sequence is shaped by an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to apply time-domain and frequency-domain envelope shaping, yielding the high-frequency speech signal.
Recover the high-frequency speech from white noise: since the high-frequency and low-frequency speech are correlated to some degree, an AR model is constructed from the decoded low-frequency speech. A white-noise sequence generated at the decoding end is shaped by this AR model so that the noise takes on the characteristics of high-frequency speech.
Local adjustment of the time-domain envelope, following the approach of the document "Research and Implementation of DTX/CNG Algorithms Based on a Layered Wideband Speech Codec System": the time-domain envelope parameters of the high-frequency signal are computed from the normalized time-domain envelope parameters recovered from the watermark and the average time-domain envelope [equation image]. A time-domain local gain factor is computed from the time-domain envelope parameters of the noise and of the high-frequency signal [equation image], and the time-domain envelope of the noise is adjusted with this gain factor [equation images]. The gain factor between two segments is smoothed by linear interpolation [equation image].
Local adjustment of the frequency-domain envelope, following the same document: the time-domain-adjusted signal is processed as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, yielding the log-weighted subband energy parameters and the average frequency-domain envelope of the noise [equation images]. The frequency-domain envelope of the noise is then adjusted locally by the same method used for the local adjustment of the time-domain envelope.
Global adjustment of the frequency-domain envelope:
A frequency-domain global gain factor is computed for each frame from the average frequency-domain envelopes of the noise and of the high-frequency signal [equation image].
The frequency-domain envelope of each frame is adjusted globally with this gain factor [equation image].
The adjusted spectrum is transformed back with an IFFT; the resulting time-domain signal is windowed with the window function and stored in a buffer of length 208 [equation image], where L = 256 and n = 0, 1, ..., 207.
The last 48 samples of the previous frame's buffer are added to the first 48 samples of the current frame's buffer; these sums, together with the values at n = 48–159 of the current frame's buffer, constitute the time-domain signal recovered for the current frame.
Global adjustment of the time-domain envelope: the time-domain envelope is adjusted globally following the same steps as the global adjustment of the frequency-domain envelope; the adjusted signal is the high-frequency signal estimated from the noise.
The QMF synthesis filterbank module raises the sampling frequency of the 8 kHz low-frequency signal and of the estimated high-frequency signal to 16 kHz; each is then passed through a low-pass or high-pass FIR filter, respectively, whose coefficients are identical to those of the QMF analysis filters. Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling frequency [equation image].
Beneficial effects: the invention provides a method for improving speech quality based on audio watermarking. The method exploits the properties of audio watermarking to establish a hidden channel inside the narrowband speech and uses this channel to transmit the parameters of the high-frequency speech, thereby achieving bandwidth extension of the speech signal without changing the legacy network protocol. The invention uses an adaptive audio watermark to realize speech bandwidth extension; it affects the original speech little, embeds a large amount of high-frequency information, is robust, suits various types of speech, and the recovered wideband speech sounds better than narrowband speech.
Description of the drawings
Fig. 1 is a block diagram of the principle of the invention.
Fig. 2 shows the window function used by the invention.
Fig. 3 shows the G.711 code-stream format used by the invention.
Fig. 4 is a block diagram of the high-frequency speech recovery of the invention.
Embodiment
The invention is described in detail below with reference to the accompanying drawings and embodiments.
Fig. 1 shows the complete block diagram of the invention. At the start, the speech produced by the talker is a wideband signal; before transmission over the telephone line, the high-frequency parameters are embedded into the narrowband code stream, and the narrowband speech signal is transmitted over the telephone line. At the receiving end, A-law decoding is performed, the high-frequency parameter extraction module extracts the high-frequency parameters, the high-frequency parameter synthesis module recovers the high-frequency part of the wideband speech, and finally the high-frequency and low-frequency speech are synthesized into wideband speech.
The modules appearing in the block diagram are introduced as follows:
1. QMF analysis filterbank module
The speech produced by the talker is wideband, while the telephone line transmits narrowband speech, so the invention uses a QMF analysis filterbank to split the wideband speech into two parts: the narrowband speech of 0–8000 Hz and the high-frequency component of 8000–16000 Hz. The QMF analysis filters of the invention are 64th-order FIR filters; the coefficients of the low-pass FIR filter h_L(n) are given in the appendix. The high-pass filter h_H(n) is obtained from the low-pass filter h_L(n) by frequency shifting, i.e., by modulation with a complex sinusoidal sequence, giving h_H(n) = (-1)^n · h_L(n).
Passing the wideband signal through the QMF analysis filterbank and reducing the sampling frequency of the two output signals to 8 kHz yields the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
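The analysis side described above, filtering the wideband signal with the low-pass/high-pass QMF pair and then halving the sampling rate, can be sketched as below. The two-tap filter in the test is a placeholder for the 64th-order filter whose coefficients the appendix gives.

```python
def qmf_analyze(wideband, h_lp):
    """Split a 16 kHz signal into low and high bands: filter with the
    QMF low-pass/high-pass pair, then keep every other sample."""
    # high-pass obtained from the low-pass by (-1)^n modulation
    h_hp = [((-1) ** n) * c for n, c in enumerate(h_lp)]

    def fir(x, h):
        # direct-form FIR convolution, truncated to the input length
        return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
                for n in range(len(x))]

    low = fir(wideband, h_lp)[::2]    # low band, decimated to 8 kHz
    high = fir(wideband, h_hp)[::2]   # high band, decimated to 8 kHz
    return low, high
```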
2. High-frequency parameter extraction module
The invention extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, an average time-domain envelope parameter, and an average frequency-domain envelope parameter. The extraction of each parameter is described below.
(1) Extract the 16 time-domain envelope parameters and the average time-domain envelope parameter
Each 20 ms of the high-frequency component s_H(n) is divided into 16 segments of 10 samples each, from which the 16 time-domain envelope parameters T(i) are computed [equation image]. The average time-domain envelope is then computed [equation image], and each T(i) is differenced against the mean and normalized [equation image].
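The segmentation and mean-normalization above can be sketched as follows. The patent's envelope formula survives only as an image, so the per-segment log-energy used here is an assumption; only the 16-segments-of-10-samples framing and the subtract-the-mean normalization are taken from the text.

```python
import math

def temporal_envelope_params(s_h):
    """Split a 160-sample (20 ms at 8 kHz) high-band frame into 16
    segments of 10 samples, take a log-energy per segment as the
    envelope measure (an assumption), then subtract the mean."""
    assert len(s_h) == 160
    t = [math.log10(sum(x * x for x in s_h[i * 10:(i + 1) * 10]) + 1e-12)
         for i in range(16)]                  # 16 time-domain envelope parameters
    t_mean = sum(t) / 16.0                    # average time-domain envelope
    return [ti - t_mean for ti in t], t_mean  # normalized parameters + mean
```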
(2) Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter
The 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are windowed with a window function window(n) of length N = 208 samples [window definition shown as an image]; the window function is shown in Fig. 2. The windowed signal is zero-padded to 256 points and a 256-point FFT is applied to obtain S_F(k), where L = 256 [equation image]. The frequency domain is divided into 12 uniform intervals; the frequency-domain envelope parameter of each interval is computed and converted into a log-weighted subband energy parameter. The subband division and the computation of the log-weighted energy F(i) of each subband are given in the appendix. The average frequency-domain envelope is computed [equation image], and each F(i) is differenced against the mean and normalized [equation images].
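The pipeline just described can be sketched as below. Two stand-ins are assumed: a Hann window in place of the patent's Fig. 2 window, and plain log subband energies in place of the appendix's log-weighting; the naive half-spectrum DFT would be an FFT in practice.

```python
import cmath, math

def freq_envelope_params(frame, prev_tail):
    """Window 160 current + 48 previous samples, zero-pad to 256, take
    the half-spectrum, split it into 12 uniform subbands, and normalize
    the subband log energies against their mean."""
    x = list(prev_tail) + list(frame)      # 48 + 160 = 208 samples
    n_len = len(x)
    win = [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_len - 1)) for n in range(n_len)]
    xw = [a * w for a, w in zip(x, win)] + [0.0] * (256 - n_len)   # pad to L = 256
    # naive 256-point DFT, first half only
    spec = [sum(xw[n] * cmath.exp(-2j * math.pi * k * n / 256) for n in range(256))
            for k in range(128)]
    per_band = 128 // 12                   # bins per subband (remainder bins ignored)
    f = [math.log10(sum(abs(spec[b * per_band + j]) ** 2 for j in range(per_band)) + 1e-12)
         for b in range(12)]
    f_mean = sum(f) / 12.0                 # average frequency-domain envelope
    return [fi - f_mean for fi in f], f_mean
```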
3. G.711 codec module
The narrowband speech signal s_L(n) is encoded with an A-law encoder, producing a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes it with an A-law decoder to obtain the narrowband speech signal.
4. Watermark embedding module
Existing LSB embedding algorithms simply embed the watermark information into the lowest bit of the narrowband code stream. Taking into account the characteristics of the transmission protocol and the subjective perception of the human ear, two improved LSB embedding algorithms are proposed here.
The first method embeds the watermark fairly uniformly in the code stream: since one frame of the signal has 160 samples and the watermark is 66 bits, 1 bit of information can be embedded every other sample. This avoids the poor listening quality caused by excessive localized distortion and keeps the overall listening quality at a high level.
The second method is a selective LSB embedding algorithm based on the characteristics of the transmission protocol and the auditory properties of the human ear. G.711 uses non-uniform quantization: when the sample value is small, the quantization interval is also small; when the sample value is large, the quantization interval is also large. Hence modifying the code word of a small sample changes the sample value only slightly, while modifying the code word of a large sample changes it over a wide range, so in theory the resulting signal-to-noise ratio changes very little whether the watermark is embedded in small samples or in large ones. However, by the temporal masking effect of the human ear, a large signal masks the small signals that follow it, so modifications to small signals are harder to perceive. Exploiting this property, the watermark information can be embedded selectively into samples of small amplitude, which hides the watermark better. Let C0–C7 denote the bits of the code word from least significant to most significant, as shown in Fig. 3. According to the G.711 standard, the most significant bit C7 is the sign bit of the sample, C6–C4 form the segment code, and C3–C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample the code word represents. Here bit C6 divides the samples into large signals (C6 = 1) and small signals (C6 = 0), and the watermark is embedded when C6 is 0. If fewer than 66 embedding positions are available in a frame, the remaining bits are embedded in other positions.
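The selective embedding rule (LSB of the C6 = 0 samples first, spillover into the rest) can be sketched directly from the description above; `embed_selective` is a hypothetical helper name, not the patent's own code.

```python
def embed_selective(codewords, bits):
    """Write watermark bits into the LSB (C0) of A-law code words whose
    C6 bit is 0 (small-amplitude samples); any bits left over spill into
    the C6 = 1 samples, as the fallback in the text describes."""
    out = list(codewords)
    queue = list(bits)
    for wanted_c6 in (0, 1):               # first the small samples, then the rest
        for i, cw in enumerate(out):
            if not queue:
                return out
            if (cw >> 6) & 1 == wanted_c6:
                out[i] = (cw & ~1) | queue.pop(0)
    return out
```

Note that writing only C0 never disturbs C6, so the receiver can re-derive the embedding positions from the code stream itself.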
5. Watermark extraction module
The extraction method corresponds to the embedding algorithm used. For the first algorithm, the watermark bits are extracted from the known embedding positions. For the second, the code-stream characteristics indicate whether a sample carries a watermark bit: scanning from the start of a frame, a watermark bit is extracted from the least significant bit whenever C6 is 0, and skipped when C6 is 1. If fewer than 66 bits have been extracted when the end of the frame is reached, scanning returns to the start of the frame and bits are extracted from the positions where C6 is 1, until all 66 watermark bits are recovered.
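The two-pass scan of the second extraction rule can be sketched as follows (function name illustrative):

```python
def extract_selective(frame, nbits=66):
    """Two-pass extraction matching the selective embedder: read the LSB
    of small samples (C6 == 0) first; if the end of the frame is reached
    with fewer than `nbits` bits, rescan from the frame start over the
    large samples (C6 == 1)."""
    bits = [c & 1 for c in frame if (c >> 6) & 1 == 0][:nbits]
    if len(bits) < nbits:
        bits += [c & 1 for c in frame if (c >> 6) & 1 == 1][:nbits - len(bits)]
    return bits
```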
6. High-frequency speech recovery module
Because the characteristics of high-frequency speech are similar to those of noise, this module recovers the high-frequency speech from white noise. A white noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to perform temporal-envelope and frequency-domain-envelope shaping, yielding the high-frequency speech signal. The block diagram of high-frequency speech recovery is shown in Fig. 4.
(1) Recovering high-frequency speech from white noise
Because the high-frequency and low-frequency speech are correlated to some extent, an AR model is constructed from the decoded low-frequency speech. A white noise sequence is generated at the decoding end and shaped by this AR model, so that the noise acquires the characteristics of the high-frequency speech.
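This AR shaping step can be sketched as follows; it is a minimal illustration, assuming a Levinson-Durbin LPC fit of illustrative order 8 stands in for the patent's unspecified AR model, with NumPy used for the arithmetic:

```python
import numpy as np

def lpc(x, order):
    """Fit AR coefficients a[0..order] (a[0] = 1) to x by the
    Levinson-Durbin recursion on its autocorrelation."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]  # update with the old coefficients
        a[i] = k
        err *= 1.0 - k * k
    return a

def shape_noise(noise, a):
    """Run white noise through the all-pole filter 1/A(z) so it takes on
    the spectral envelope of the modelled low-band speech."""
    y = np.zeros(len(noise))
    for n in range(len(noise)):
        acc = noise[n]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y[n] = acc
    return y
```

Levinson-Durbin yields a stable (minimum-phase) synthesis filter whenever the autocorrelation sequence is positive definite, which keeps the shaped noise bounded.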
(2) Local adjustment of the temporal envelope
The temporal envelope parameters of the high-frequency signal are computed from the normalized temporal envelope parameters recovered from the watermark and the average temporal envelope parameter:
(equations given as images in the original; not reproduced here)
The time-domain local gain factor is computed from the temporal envelope parameters of the noise and of the high-frequency signal:
(equation given as an image in the original; not reproduced here)
The time-domain local gain factor is used to adjust the temporal envelope of the noise:
(equations given as images in the original; not reproduced here)
The gain factors between adjacent segments are smoothed by linear interpolation:
(equation given as an image in the original; not reproduced here)
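A sketch of this local gain computation and interpolation smoothing (NumPy; the 10-sample segment length follows the 16-segments-of-10-points framing described for parameter extraction, while the function and variable names are illustrative):

```python
import numpy as np

def adjust_temporal_envelope(noise, target_env, seg_len=10):
    """Scale each 10-sample segment of the shaped noise so its RMS matches
    the target temporal envelope, linearly interpolating the gain between
    neighbouring segments to avoid steps at segment boundaries."""
    n_seg = len(noise) // seg_len
    gains = np.empty(n_seg)
    for i in range(n_seg):
        seg = noise[i * seg_len:(i + 1) * seg_len]
        rms = np.sqrt(np.mean(seg ** 2))
        gains[i] = target_env[i] / max(rms, 1e-12)  # guard against silence
    out = noise.astype(float)
    for i in range(n_seg):
        g0 = gains[i]
        g1 = gains[i + 1] if i + 1 < n_seg else gains[i]
        ramp = g0 + (g1 - g0) * np.arange(seg_len) / seg_len
        out[i * seg_len:(i + 1) * seg_len] *= ramp
    return out
```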
(3) Local adjustment of the frequency-domain envelope
The time-domain-adjusted signal is processed in the same way as when the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter were extracted, yielding the logarithmic weighted sub-band energy parameters and the average frequency-domain envelope of the noise (equations given as images in the original). The frequency-domain envelope of the noise is then adjusted locally, using the same method applied to its temporal envelope in the local temporal-envelope adjustment.
(4) Global adjustment of the frequency-domain envelope
The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
(equation given as an image in the original; not reproduced here)
The frequency-domain global gain factor is used to adjust the frequency-domain envelope of each frame globally:
(equations given as images in the original; not reproduced here)
The adjusted spectrum is transformed back by an IFFT; the resulting time-domain signal is then windowed with the window function of Fig. 2 and stored in a buffer of length 208 (equation given as an image in the original), where L = 256 and n = 0, 1, ..., 207.
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; these summed points, together with the values at n = 48~159 of the current buffer, constitute the time-domain signal recovered for the current frame.
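The buffer bookkeeping just described can be sketched as follows (NumPy; buffer length 208 = 160-sample frame plus 48-sample overlap, function name illustrative):

```python
import numpy as np

def overlap_add(prev_buf, cur_buf, frame_len=160, overlap=48):
    """Combine two consecutive 208-sample windowed buffers: the last 48
    points of the previous buffer are added to the first 48 points of the
    current one, and the result together with samples 48..159 of the
    current buffer forms the 160-sample output frame."""
    out = np.array(cur_buf[:frame_len], dtype=float)
    out[:overlap] += prev_buf[frame_len:frame_len + overlap]
    return out
```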
(5) Global adjustment of the temporal envelope
The temporal envelope is adjusted globally following the same steps as the global frequency-domain envelope adjustment; the adjusted signal is the high-frequency signal estimated from the noise.
7. QMF synthesis filter bank module
The sampling frequency of the 8 kHz low-frequency signal and of the estimated high-frequency signal is raised to 16 kHz; the two signals are then passed through low-pass and high-pass FIR filters, respectively, whose coefficients are identical to those of the QMF analysis filters.
Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling frequency:
(equation given as an image in the original; not reproduced here)
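A minimal sketch of this synthesis stage (NumPy; the 2-tap prototype filter in the test is purely illustrative, since the patent only states that the actual coefficients equal those of the QMF analysis filters):

```python
import numpy as np

def qmf_synthesize(low, high, h_low):
    """Upsample both 8 kHz bands to 16 kHz by zero insertion, filter the
    low band with h_low and the high band with its sign-alternated mirror,
    and sum the two filtered signals into the wideband output."""
    h_high = h_low * (-1.0) ** np.arange(len(h_low))  # highpass mirror

    def upsample2(x):
        y = np.zeros(2 * len(x))
        y[::2] = x  # insert a zero after every sample
        return y

    n = 2 * len(low)
    s_low = np.convolve(upsample2(low), h_low)[:n]
    s_high = np.convolve(upsample2(high), h_high)[:n]
    return s_low + s_high
```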
Summary: this embodiment proposes two improved least-significant-bit (LSB) watermarking algorithms. The first improvement embeds one bit of information at every other sampled point, which avoids the degraded auditory effect caused by excessive localized distortion and keeps the overall auditory quality at a high level. The second improvement is a selectable LSB embedding algorithm based on the characteristics of the transmission protocol and the auditory properties of the human ear: because of temporal masking, a large signal masks the small signals that follow it, so modifications to small signals are hard to perceive; the watermark information is therefore selectively embedded into sample points of small amplitude, improving the imperceptibility of the watermark.
Based on the above watermarking algorithms, this system embeds the high-frequency information of the speech signal into the narrowband code stream, transmits it over the wired telephone network, extracts the high-frequency parameters at the receiving end, and synthesizes wideband speech, thereby realizing bandwidth extension of the speech signal. Because the watermarking algorithm has good masking properties, even a receiving end without the watermark-extraction and wideband-synthesis modules suffers no loss of normal speech quality, while a telephone terminal equipped with these functions hears the bandwidth-extended speech, whose quality is greatly improved.
The above is a further detailed description of the present invention in combination with preferred technical solutions, and the concrete implementation of the invention is not limited to these descriptions. Those of ordinary skill in the technical field of the present invention may make simple deductions and substitutions without departing from the concept of the invention, and all such modifications shall be regarded as falling within the protection scope of the present invention.
Appendix
Sub-band division of the frequency-domain envelope:
(table given as an image in the original; not reproduced here)
Computation of the logarithmic weighted energy F(i) of each sub-band:
Sub-band 0:
(equations given as images in the original; not reproduced here)
Sub-bands 1~10:
(equations given as images in the original; not reproduced here)
Sub-band 11:
(equations given as images in the original; not reproduced here)
Claims (2)

1. A method for speech bandwidth extension based on audio watermarking, comprising the following steps, wherein step B and steps F2, F3 follow the approach of the document "Research and implementation of DTX/CNG algorithms based on a layered wideband speech coding/decoding system":
Step A. A QMF analysis filter bank module divides the wideband speech into two parts: narrowband speech at 0~8000 Hz and a high-frequency component at 8000~16000 Hz; the two output signals are passed through a down-sampling module that reduces the sampling frequency to 8 kHz, yielding a low-frequency signal s_L(n) and a high-frequency signal s_H(n);
Step B. A high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 temporal envelope parameters, 12 frequency-domain envelope parameters, an average temporal envelope parameter, and an average frequency-domain envelope parameter; the specific extraction method for each parameter is as follows:
Step B1. Extract the 16 temporal envelope parameters and the average temporal envelope parameter:
The high-frequency component s_H(n) of every 20 ms frame is divided into 16 segments of 10 sampled points each; the 16 temporal envelope parameters are:
(equations given as images in the original; not reproduced here)
Calculate average temporal envelope:
(equation given as an image in the original; not reproduced here)
The difference between each temporal envelope parameter T(i) and the mean value is taken and normalized (equations given as images in the original);
Step B2. Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:
The 160 sampled points of the current frame of the high-frequency component s_H(n), together with the last 48 sampled points of the previous frame, are processed by a windowing module using a window function Window(n) of length 208 (equations given as images in the original), where N = 208;
The windowed signal is zero-padded to 256 points and a 256-point FFT is applied to obtain S_F(k) (equations given as images in the original);
Wherein L = 256; the frequency domain is divided into 12 uniform intervals, the frequency-domain envelope parameter of each interval is calculated and converted into a logarithmic weighted sub-band energy parameter;
Calculate the average frequency domain envelope:
The difference between each frequency-domain envelope parameter F(i) and the mean value is taken and normalized (equations given as images in the original);
Step D1. A watermark embedding module embeds the watermark uniformly into the code stream: since one frame contains 160 sampled points and the watermark is 66 bits, one bit of information is embedded at every other sampled point;
Or, step D2. The watermark embedding module selectively embeds the watermark information into sample points of small amplitude; C0~C7 denote the bits of a code word from least significant to most significant; according to the G.711 protocol, the most significant bit C7 is the sign bit of the sampled point, C6~C4 form the segment code, and C3~C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude represented by the code word; this method uses the C6 bit to divide the signal into large samples (C6 = 1) and small samples (C6 = 0), and embeds the watermark when C6 is 0; if a frame provides fewer than 66 embedding positions, the remaining bits are embedded at other positions;
Step E. A watermark extraction module extracts the watermark in a manner corresponding to step D, in one of two ways:
E1. The watermark extraction module extracts the watermark bits from the known embedding positions;
Or, E2. Whether a sample carries a watermark bit is judged from the code-stream characteristics: scanning from the start of a frame, a watermark bit is extracted from the least significant bit whenever C6 is 0 and skipped when C6 is 1; if fewer than 66 bits have been extracted when the end of the frame is reached, scanning returns to the start of the frame and bits are extracted from the positions where C6 is 1, until all 66 watermark bits are recovered;
Step F. A high-frequency speech recovery module recovers the high-frequency speech from white noise:
A white noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to perform temporal-envelope and frequency-domain-envelope shaping, yielding the high-frequency speech signal;
Step F1. Recover the high-frequency speech using white noise:
Because the high-frequency and low-frequency speech are correlated to some extent, an AR model module is constructed from the decoded low-frequency speech; a white noise sequence generated at the decoding end is shaped by this AR model module so that the noise acquires the characteristics of the high-frequency speech;
Step F2. Local temporal-envelope adjustment module:
The temporal envelope parameters of the high-frequency signal are computed from the normalized temporal envelope parameters recovered from the watermark and the average temporal envelope parameter:
(equations given as images in the original; not reproduced here)
The time-domain local gain factor is computed from the temporal envelope parameters of the noise and of the high-frequency signal:
(equation given as an image in the original; not reproduced here)
A time-domain local gain module adjusts the temporal envelope of the noise:
(equations given as images in the original; not reproduced here)
The gain factors between adjacent segments are smoothed by linear interpolation:
(equation given as an image in the original; not reproduced here)
Step F3. Local frequency-domain envelope adjustment module:
The time-domain-adjusted signal is processed in the same way as when the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter were extracted, yielding the logarithmic weighted sub-band energy parameters and the average frequency-domain envelope of the noise (equations given as images in the original); the frequency-domain envelope of the noise is then adjusted locally, using the same method applied to its temporal envelope in the local temporal-envelope adjustment;
Step F4. Global frequency-domain envelope adjustment module:
The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
(equation given as an image in the original; not reproduced here)
The frequency-domain global gain factor is used to adjust the frequency-domain envelope of each frame globally:
(equations given as images in the original; not reproduced here)
The adjusted spectrum is fed to an IFFT module; the resulting time-domain signal is windowed by the window function module and stored in a buffer of length 208 (equation given as an image in the original), where L = 256;
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; these summed points, together with the values at n = 48~159 of the current buffer, constitute the time-domain signal recovered for the current frame;
Step F5. Global temporal-envelope adjustment module:
The temporal envelope is adjusted globally following the same steps as the global frequency-domain envelope adjustment; the adjusted signal is the high-frequency signal estimated from the noise;
Step G. A QMF synthesis filter bank module raises the sampling frequency of the 8 kHz low-frequency signal and of the estimated high-frequency signal to 16 kHz; the two signals are then passed through low-pass and high-pass FIR filter modules, respectively, whose coefficients are identical to those of the QMF analysis filters;
Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling frequency:
(equation given as an image in the original; not reproduced here)
2. A device for speech bandwidth extension based on audio watermarking, characterized in that the device comprises: a QMF analysis filter bank module, a high-frequency parameter extraction module, a G.711 coding/decoding module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filter bank module;
The QMF analysis filter bank module divides the wideband speech into two parts: narrowband speech at 0~8000 Hz and a high-frequency component at 8000~16000 Hz; the two output signals are fed to a down-sampling module that reduces the sampling frequency to 8 kHz, yielding a low-frequency signal s_L(n) and a high-frequency signal s_H(n);
The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 temporal envelope parameters, 12 frequency-domain envelope parameters, an average temporal envelope parameter, and an average frequency-domain envelope parameter; this part follows the approach of the document "Research and implementation of DTX/CNG algorithms based on a layered wideband speech coding/decoding system"; the specific extraction method for each parameter is as follows:
Extract the 16 temporal envelope parameters and the average temporal envelope parameter:
The high-frequency component s_H(n) of every 20 ms frame is divided into 16 segments of 10 sampled points each; the 16 temporal envelope parameters are:
(equation given as an image in the original; not reproduced here)
Calculate average temporal envelope:
The difference between each temporal envelope parameter T(i) and the mean value is taken and normalized (equations given as images in the original);
Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:
The 160 sampled points of the current frame of the high-frequency component s_H(n), together with the last 48 sampled points of the previous frame, are processed by a windowing step using a window function Window(n) of length 208 (equations given as images in the original), where N = 208;
The windowed signal is zero-padded to 256 points and a 256-point FFT is applied to obtain S_F(k) (equations given as images in the original);
Wherein L = 256; the frequency domain is divided into 12 uniform intervals, the frequency-domain envelope parameter of each interval is calculated and converted into a logarithmic weighted sub-band energy parameter;
Calculate the average frequency domain envelope:
(equation given as an image in the original; not reproduced here)
The difference between each frequency-domain envelope parameter F(i) and the mean value is taken and normalized (equations given as images in the original);
The G.711 coding/decoding module encodes the narrowband speech signal s_L(n) with an A-law encoder module to obtain a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is transmitted into the network over the telephone line; the receiving end extracts the watermark information from the code stream and decodes it with an A-law decoder to obtain the narrowband speech signal;
The watermark embedding module embeds the watermark into the code stream in one of the following two ways:
Mode one: the watermark embedding module embeds the watermark uniformly into the code stream: since one frame contains 160 sampled points and the watermark is 66 bits, one bit of information is embedded at every other sampled point;
Mode two: the watermark embedding module selectively embeds the watermark information into sample points of small amplitude; C0~C7 denote the bits of a code word from least significant to most significant; according to the G.711 protocol, the most significant bit C7 is the sign bit of the sampled point, C6~C4 form the segment code, and C3~C0 form the intra-segment code; the smaller the segment code, the smaller the amplitude represented by the code word; this method uses the C6 bit to divide the signal into large samples (C6 = 1) and small samples (C6 = 0), and embeds the watermark when C6 is 0; if a frame provides fewer than 66 embedding positions, the remaining bits are embedded at other positions;
The watermark extraction module extracts the watermark in a manner corresponding to the watermark embedding module, in one of two ways:
Mode one: the watermark extraction module extracts the watermark bits from the known embedding positions;
Mode two: whether a sample carries a watermark bit is judged from the code-stream characteristics: scanning from the start of a frame, a watermark bit is extracted from the least significant bit whenever C6 is 0 and skipped when C6 is 1; if fewer than 66 bits have been extracted when the end of the frame is reached, scanning returns to the start of the frame and bits are extracted from the positions where C6 is 1, until all 66 watermark bits are recovered;
The high-frequency speech recovery module recovers the high-frequency speech from white noise:
A white noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to perform temporal-envelope and frequency-domain-envelope shaping, yielding the high-frequency speech signal;
Recover the high-frequency speech using white noise:
Because the high-frequency and low-frequency speech are correlated to some extent, an AR model module is constructed from the decoded low-frequency speech; a white noise sequence generated at the decoding end is shaped by this AR model module so that the noise acquires the characteristics of the high-frequency speech;
Local temporal-envelope adjustment module; this part follows the approach of the document "Research and implementation of DTX/CNG algorithms based on a layered wideband speech coding/decoding system":
The temporal envelope parameters of the high-frequency signal are computed from the normalized temporal envelope parameters recovered from the watermark and the average temporal envelope parameter:
(equations given as images in the original; not reproduced here)
The time-domain local gain factor is computed from the temporal envelope parameters of the noise and of the high-frequency signal:
(equation given as an image in the original; not reproduced here)
The time-domain local gain factor is used to adjust the temporal envelope of the noise:
(equations given as images in the original; not reproduced here)
The gain factors between adjacent segments are smoothed by linear interpolation:
(equation given as an image in the original; not reproduced here)
Local frequency-domain envelope adjustment module; this part follows the approach of the document "Research and implementation of DTX/CNG algorithms based on a layered wideband speech coding/decoding system":
The time-domain-adjusted signal is processed in the same way as when the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter were extracted, yielding the logarithmic weighted sub-band energy parameters and the average frequency-domain envelope of the noise (equations given as images in the original); the frequency-domain envelope of the noise is then adjusted locally, using the same method applied to its temporal envelope in the local temporal-envelope adjustment;
Global frequency-domain envelope adjustment:
The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:
The frequency-domain global gain factor is used to adjust the frequency-domain envelope of each frame globally:
(equations given as images in the original; not reproduced here)
The adjusted spectrum is passed through an IFFT module; the resulting time-domain signal is windowed by the window function module and stored in a buffer of length 208 (equation given as an image in the original), where L = 256 and n = 0, 1, ..., 207;
The last 48 points of the previous frame's buffer are added to the first 48 points of the current frame's buffer; these summed points, together with the values at n = 48~159 of the current buffer, constitute the time-domain signal recovered for the current frame;
Global temporal-envelope adjustment module:
The temporal envelope is adjusted globally following the same steps as the global frequency-domain envelope adjustment; the adjusted signal is the high-frequency signal estimated from the noise;
The QMF synthesis filter bank module raises the sampling frequency of the 8 kHz low-frequency signal and of the estimated high-frequency signal to 16 kHz; the two signals are then passed through low-pass and high-pass FIR filter modules, respectively, whose coefficients are identical to those of the QMF analysis filters;
Adding the two filtered signals yields the final wideband signal at a 16 kHz sampling frequency:
(equation given as an image in the original; not reproduced here)
CN2011104223927A 2011-12-16 2011-12-16 Device and method for expanding speech bandwidth based on audio watermarking Expired - Fee Related CN102543086B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104223927A CN102543086B (en) 2011-12-16 2011-12-16 Device and method for expanding speech bandwidth based on audio watermarking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104223927A CN102543086B (en) 2011-12-16 2011-12-16 Device and method for expanding speech bandwidth based on audio watermarking

Publications (2)

Publication Number Publication Date
CN102543086A true CN102543086A (en) 2012-07-04
CN102543086B CN102543086B (en) 2013-08-14

Family

ID=46349824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104223927A Expired - Fee Related CN102543086B (en) 2011-12-16 2011-12-16 Device and method for expanding speech bandwidth based on audio watermarking

Country Status (1)

Country Link
CN (1) CN102543086B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4667340A (en) * 1983-04-13 1987-05-19 Texas Instruments Incorporated Voice messaging system with pitch-congruent baseband coding
CN101140759A (en) * 2006-09-08 2008-03-12 华为技术有限公司 Band-width spreading method and system for voice or audio signal
CN101521014A (en) * 2009-04-08 2009-09-02 武汉大学 Audio bandwidth expansion coding and decoding devices
CN102105931A (en) * 2008-07-11 2011-06-22 弗朗霍夫应用科学研究促进协会 Apparatus and method for generating a bandwidth extended signal


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103474079A (en) * 2012-08-06 2013-12-25 苏州沃通信息科技有限公司 Voice encoding method
CN103001915B (en) * 2012-11-30 2015-01-28 南京邮电大学 Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system
CN103001915A (en) * 2012-11-30 2013-03-27 南京邮电大学 Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system
CN105264599B (en) * 2013-01-29 2019-05-10 弗劳恩霍夫应用研究促进协会 Audio coder, provides the method for codes audio information at audio decoder
CN105264599A (en) * 2013-01-29 2016-01-20 弗劳恩霍夫应用研究促进协会 Audio encoder, audio decoder, method for providing encoded audio information and decoded audio information, computer program and encoded representation using a signal-adaptive bandwidth extension
US11423923B2 (en) 2013-04-05 2022-08-23 Dolby Laboratories Licensing Corporation Companding system and method to reduce quantization noise using advanced spectral extension
CN108269585A (en) * 2013-04-05 2018-07-10 杜比实验室特许公司 The companding device and method of quantizing noise are reduced using advanced spectrum continuation
CN103258543A (en) * 2013-04-12 2013-08-21 大连理工大学 Method for expanding artificial voice bandwidth
CN105723458A (en) * 2013-09-12 2016-06-29 沙特阿拉伯石油公司 Dynamic threshold methods, systems, computer readable media, and program code for filtering noise and restoring attenuated high-frequency components of acoustic signals
CN105723458B (en) * 2013-09-12 2019-09-24 沙特阿拉伯石油公司 For filtering out noise and going back dynamic threshold method, the system, computer-readable medium of the high fdrequency component that acoustic signal is decayed
US10672409B2 (en) 2014-02-28 2020-06-02 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Decoding device, encoding device, decoding method, and encoding method
CN105659321A (en) * 2014-02-28 2016-06-08 松下电器(美国)知识产权公司 Decoding device, encoding device, decoding method, encoding method, terminal device, and base station device
US11257506B2 (en) 2014-02-28 2022-02-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Decoding device, encoding device, decoding method, and encoding method
CN104269173B (en) * 2014-09-30 2018-03-13 Shenzhen Research Institute of Wuhan University Switched-mode audio bandwidth extension apparatus and method
CN104269173A (en) * 2014-09-30 2015-01-07 Shenzhen Research Institute of Wuhan University Audio bandwidth extension device and method implemented in a switched mode
CN108074578A (en) * 2016-11-17 2018-05-25 Institute of Acoustics, Chinese Academy of Sciences System and method for audio watermark transmission and information exchange
CN106612168B (en) * 2016-12-23 2019-07-16 The 30th Research Institute of China Electronics Technology Group Corporation Voice out-of-synchronization detection method based on PCM coding features
CN106612168A (en) * 2016-12-23 2017-05-03 The 30th Research Institute of China Electronics Technology Group Corporation Voice out-of-synchronization detection method based on PCM coding characteristics
CN110544472A (en) * 2019-09-29 2019-12-06 Shanghai Yitu Information Technology Co., Ltd. Method for improving performance of a voice task using a CNN network structure

Also Published As

Publication number Publication date
CN102543086B (en) 2013-08-14

Similar Documents

Publication Publication Date Title
CN102543086B (en) Device and method for expanding speech bandwidth based on audio watermarking
JP7330934B2 (en) Apparatus and method for bandwidth extension of acoustic signals
CN101512639B (en) Method and equipment for voice/audio transmitter and receiver
CN1808568B (en) Audio encoding/decoding apparatus having watermark insertion/extraction function and method using the same
CN102522092B (en) Device and method for expanding speech bandwidth based on G.711.1
CN101202043B (en) Method and system for encoding and decoding audio signal
CN105280190B (en) Bandwidth extension encoding and decoding method and device
CN101206860A (en) Method and apparatus for encoding and decoding layered audio
KR100921867B1 (en) Apparatus And Method For Coding/Decoding Of Wideband Audio Signals
CN101779236A (en) Temporal masking in audio coding based on spectral dynamics in frequency sub-bands
CN101662288A (en) Method, device and system for encoding and decoding audios
CN102194458B (en) Spectral band replication method and device and audio decoding method and system
Chen et al. An audio watermark-based speech bandwidth extension method
CN102142255A (en) Method for embedding and extracting digital watermark in audio signal
Bao et al. MP3-resistant music steganography based on dynamic range transform
CN105957533B (en) Voice compression method, voice decompression method, audio encoder and audio decoder
CN101604527A (en) Method for covert wideband speech transmission based on G.711 coding in a VoIP environment
CN114974270A (en) Adaptive audio information hiding method
Xu et al. Performance analysis of data hiding in MPEG-4 AAC audio
Dymarski et al. Robust audio watermarks in frequency domain
Nishimura Data hiding for audio signals that are robust with respect to air transmission and a speech codec
CN114863942B (en) Model training method for voice quality conversion, method and device for improving voice quality
CN103474079A (en) Voice encoding method
Koduri Discrete cosine transform-based data hiding for speech bandwidth extension
CN101833953B (en) Method and device for lowering redundancy rate of multi-description coding and decoding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20130814; termination date: 20151216)
EXPY Termination of patent right or utility model