CN101350198B

CN101350198B - Method for compressing watermark using voice based on bone conduction

Info

Publication number: CN101350198B
Application number: CN2008101507573A
Authority: CN
Inventors: 同鸣; 姬红兵; 陈巍; 闫涛
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2008-08-29
Filing date: 2008-08-29
Publication date: 2011-09-21
Anticipated expiration: 2028-08-29
Also published as: CN101350198A

Abstract

The invention discloses a bone-conduction-based speech compression watermarking method, which mainly addresses the inability of existing comparable methods to detect the position and type of an attack. During watermark embedding, the watermark embedded in the current frame is generated from the line spectral frequency coefficients of the current frame, the pitch period extracted from the next frame, and the watermark of the previous frame; different watermark generation rules are selected according to the position of the frame, and the watermark of the current frame is embedded in the position indices of the multi-pulse excitation of the speech coder. During watermark verification, the watermark is extracted from the position indices of the selected excitation pulses and compared with a verification watermark, and the position and type of the attack are detected from the different results that different attack positions and attack types produce. The invention exploits the good noise suppression and low-frequency pickup capabilities of bone conduction devices, can be used in voice recording equipment, and provides assurance of the integrity and authenticity of digital speech.

Description

Method for compressing watermark using voice based on bone conduction

Technical field

The invention belongs to the field of information processing and specifically relates to a compressed-domain speech watermarking method. It can be used in sound recording equipment to safeguard the integrity and authenticity of digital speech.

Background technology

Digital watermarking has been a research focus of multimedia information processing in recent years, and digital speech watermarking, an important branch of the field, has characteristics of its own. Speech signals have a narrower bandwidth and are easier to transmit than image and video signals, and they take more varied forms, such as telephony, audio broadcasting, and video soundtracks, all of which are common in daily life, so their coverage is very wide. With the development of modern signal processing technology, people can easily tamper with digital speech signals as they wish. To prove the authenticity of digital audio content, researchers therefore proposed the concept of the fragile watermark. In contrast to a robust watermark, a fragile watermark is highly sensitive to attacks: any small change to the watermarked speech signal prevents the watermark from being recovered, or allows only incomplete recovery, which makes it possible to judge whether the speech signal has been tampered with.

The fragile speech watermarking methods proposed so far mainly include:

1. Chung-Ping Wu, Kuo, C.C.J. Speech content integrity verification integrated with ITU G.723.1 speech coding. Proc. IEEE Int. Conf. Inf. Technol.: Coding Comput., 2001, pp. 680-684. This method extracts certain speech features, processes them, and embeds them as the watermark; the integrity of the speech is judged by comparing the extracted watermark with the selected features. Because the decision on whether the speech has been modified depends on a threshold, the method is prone to misjudgment.

2. Chen Ning, Zhu Jie. An Efficient Approach to Integrate Watermarking with Speech Coding Algorithm. Communications and Networking in China, CHINACOM '07. 2007: 355-359. This method embeds the watermark in the second-stage vector quantization parameters of the speech linear prediction coefficients. Its shortcoming is that it can only judge whether the host carrier has been attacked and cannot determine the attack type.

3. Chung-Ping Wu, Kuo C.-C.J. Fragile speech watermarking based on exponential scale quantization for tamper detection. Acoustics, Speech, and Signal Processing, 2002. Proceedings. (ICASSP '02). 2002 (4): 13-17. Although this method can distinguish attacks such as resampling, additive Gaussian noise, G.711 speech compression, and G.721 speech compression coding, it cannot distinguish attacks such as insertion, replacement, and cropping.

4. Chia-Hsiung Liu, Chen O.T.-C. A content-based fragile watermark scheme for speech waveform authentication. Circuits and Systems, 2005. 2005 (1): 432-435. Although this method can detect the position and type of attacks such as insertion, replacement, and deletion, it is a time-domain watermark and is not suitable for speech in compressed formats.

In summary, existing fragile speech watermarks have the following deficiencies: 1) the embedded watermark is fixed and is therefore easy to delete or forge; 2) most are based on the time domain or a transform domain, cannot meet real-time requirements, and lose the embedded watermark easily after compression; as speech compression technology develops, more and more speech signals exist in compressed form, and embedding a watermark with a time-domain or transform-domain method then requires a complicated encoding and decoding process; 3) most fragile watermarking methods can only judge whether the host carrier has been tampered with and cannot detect the attack position and attack type.

Summary of the invention

The purpose of the invention is to address the above deficiencies by proposing a bone-conduction-based speech compression watermarking method that can detect the attack position and attack type, safeguarding the integrity and authenticity of digital speech.

The technical key to achieving this purpose is as follows. During watermark embedding, the speech signal is first divided into groups, each comprising several frames. The watermark embedded in the current frame is generated from the line spectral frequency coefficients of the current frame, the pitch period extracted from the next frame, and the watermark of the previous frame, and different watermark generation rules are selected according to the position of the frame. During watermark verification, the extracted watermark is compared with the verification watermark, and the position of the attacked frames and the attack type are detected from the different results that different attack positions and attack types produce. The specific scheme is as follows:

I. Watermark embedding process

(1) Use the bone conduction device signal to preprocess the speech signal by removing noise and other unwanted sounds, and extract the bone conduction device number ID_D and the speech signal number ID_S;

(2) Divide the preprocessed speech signal, according to its length, into fixed-length frames of 30 ms each, and group every n frames into one group; the exception is the last group, whose frame count is the remainder of the total number of frames divided by n;

(3) Using the G.723.1 speech compression coding standard, extract the line spectral frequency coefficients L_{g, i} and the pitch period P_{g, i} of each frame for subsequent watermark generation, where g is the index of the group containing the frame and i is the position of the frame within the group;

(4) Using a hash function, generate the watermark W_{g, i} to be embedded in the current frame from the extracted bone conduction device number ID_D, the speech signal number ID_S, the line spectral frequency coefficients L_{g, i} of the current frame, the pitch period P_{g, i+1} extracted from the next frame, and the watermark W_{g, i-1} of the previous frame;

(5) Embed the watermark W_{g, i} of the current frame in the position indices of the multi-pulse excitation of the speech coder, that is, replace the least significant bit of each pulse position index with a watermark bit, to obtain the watermarked compressed speech bitstream.

II. Watermark extraction and verification process

1) Decode the watermarked compressed bitstream;

2) Extract the watermark W_{g, i}^{'} from the position indices of the selected excitation pulses;

3) Using the hash function, generate the verification watermark W_{g, i}^{''} from the bone conduction device number ID_D, the speech signal number ID_S, the line spectral frequency coefficients L_{g, i} of the current frame, the pitch period P_{g, i+1} extracted from the next frame, and the watermark W_{g, i-1}^{'} extracted from the previous frame;

4) Compare the watermark W_{g, i}^{'} extracted from frame i of group g with the generated verification watermark W_{g, i}^{''}. If W_{g, i}^{'} is not equal to W_{g, i}^{''}, frame i of group g is judged to be in error, and the position and type of the attack are determined from the positions and distribution of the errors.

The invention has the following advantages:

1) Because the watermark embedded in the current frame is generated from the line spectral frequency coefficients of the current frame, the pitch period extracted from the next frame, and the watermark of the previous frame, with different generation rules selected according to the position of the frame, and because verification compares the extracted watermark with the verification watermark, the attack position and attack type can be detected from the different results that different attack positions and attack types produce. The integrity of digital multimedia products such as digital recordings can therefore be guaranteed more effectively.

2) Because the invention uses the good noise suppression and low-frequency pickup capabilities of the bone conduction device to preprocess the speech signal and remove noise, and uses the extracted bone conduction device number to generate the watermark, the security of the watermark is increased.

Description of drawings

Fig. 1 is a block diagram of the watermark embedding process of the invention;

Fig. 2 is a block diagram of the watermark extraction process of the invention;

Fig. 3 is a simulation diagram of the invention under an insertion attack;

Fig. 4 is a simulation diagram of the invention under a replacement attack;

Fig. 5 is a simulation diagram of the invention under a deletion attack;

Fig. 6 is a simulation diagram of the invention under a combined replacement and insertion attack.

Embodiment

I. Introduction to the basic theory

1. Bone conduction device

In recent years, bone conduction devices have attracted increasing attention in the speech signal processing field because of their unique speech processing characteristics. Such a device is a highly sensitive solid-borne audio sensor with good noise suppression and low-frequency pickup capabilities. Its principle is that when a person speaks, the bones of the head vibrate, and a bone conduction device held tightly against the bone captures this vibration as the basis for speech detection. Moreover, as a solid vibration sensor, the bone conduction device is insensitive to ambient noise, which gives it a natural noise suppression advantage. The device can only pick up the components of the speech signal below about 3.5 to 4 kHz, but most of the energy of speech is concentrated in this frequency range, and the device is insensitive to external airborne vibration.

2. G.723.1 speech compression coding standard

The G.723.1 standard is a low-bit-rate coding algorithm released by the ITU-T in 1996. It is mainly used to compress speech and other multimedia audio signals, for example in video telephony systems, digital transmission systems, and high-quality speech compression systems. The coder is a dual-rate speech coder with rates of 6.3 kbps and 5.3 kbps. The 6.3 kbps high rate uses the multi-pulse maximum likelihood quantization (MP-MLQ) algorithm, and the 5.3 kbps low rate uses the algebraic code-excited linear prediction (ACELP) algorithm. The two algorithms share the same theoretical basis: both are based on linear prediction (LPC) and use an excitation source with aperiodic components.

The G.723.1 standard is based on the code-excited linear prediction (CELP) coding model. At the encoder, the analog input signal first passes through speech-band filtering; the filter output is then sampled at 8 kHz and converted to a 16-bit linear PCM speech signal. Each frame is 30 ms long, which corresponds to 240 samples. The CELP parameters are then extracted by analyzing the delayed speech, and these parameters are encoded and transmitted. At the decoder, the parameters are used to construct the excitation signal and the synthesis filter, and passing the excitation signal through the synthesis filter yields the reconstructed speech signal.

3. Hash function

A hash function is a cryptographic algorithm used in information security that maps messages of varying lengths to a fixed-length code, the hash value. The hash value is usually much shorter than the input. A secure hash function should satisfy at least the following conditions: 1) the input may be of arbitrary length; 2) the output length is fixed; 3) for any given input, the output (the hash value) is easy to compute; 4) given the description of the hash function and a randomly selected message, it is computationally infeasible to find another, different message that hashes to the same value.

Hash functions are mainly used for integrity checking and for improving the effectiveness of digital signatures. Many schemes exist, such as MD4, MD5, and SHA-1. Because the MD5 hash algorithm has a "digital fingerprint" property, it has become one of the most widely used file integrity checking algorithms, so the MD5 hash algorithm is used here to generate the watermark.
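
As an illustration of the fixed-length property described above, the following minimal Python sketch hashes an ordered list of fields with MD5 from the standard `hashlib` module. The helper name `md5_digest`, the `|` separator, and the UTF-8 text serialization are assumptions made for this example only; the source does not specify how the hash inputs are serialized.

```python
import hashlib


def md5_digest(*fields):
    """Hash an ordered sequence of fields into a 16-byte MD5 digest.

    Assumption: fields are converted to text and joined with '|'.
    The source does not define a serialization format.
    """
    payload = "|".join(str(f) for f in fields).encode("utf-8")
    return hashlib.md5(payload).digest()


# Inputs of different lengths map to digests of the same length.
short_digest = md5_digest("device-01")
long_digest = md5_digest("device-01", "recording-0001", 1, 2, 3)
print(len(short_digest), len(long_digest))
```

Any change to one field changes the digest, which is the property the watermark generation relies on.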

II. Notation

ID_D: number of the bone conduction device

ID_S: number of the speech signal

L_{g, i}: line spectral frequency coefficients of frame i of group g

P_{g, i}: pitch period of frame i of group g

W_{g, i}: watermark embedded in frame i of group g

LSB(x): least significant bit of x

Exc(i): position index of the i-th selected pulse

W(i): i-th bit of the watermark

W_{g, i}^{'}: watermark extracted from frame i of group g of the synthesized speech

W_{g, i}^{''}: verification watermark of frame i of group g, generated from the synthesized speech

III. Compressed-domain speech watermarking method based on bone conduction

With reference to Fig. 1, the watermark embedding process of the invention is as follows:

Step 1: preprocess the speech signal.

Because the bone conduction device has good noise suppression capability and the speech signal and the bone conduction signal are fully synchronized, the short-time energy distribution of the bone conduction signal is computed and a threshold e is set. If the bone conduction signal energy in a given time period is greater than e, the speech signal in that period is left unchanged; otherwise the speech signal in that period is regarded as noise and is set to zero, which removes the noise from the speech signal. The bone conduction device number ID_D and the speech number ID_S are also extracted for generating the watermark. The device number ID_D is tied to the digital recording system and ensures that identical speech segments recorded by different devices produce different watermarks. The speech signal number ID_S is tied to the time and count of the recording and ensures that identical speech data recorded by the same device at different times produce different watermarks.
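
The gating rule in Step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `gate_by_bone_energy`, the use of non-overlapping segments of `seg_len` samples, and the sum-of-squares energy measure are assumptions, since the source only states that short-time energy is compared with a threshold e.

```python
def gate_by_bone_energy(speech, bone, seg_len, threshold):
    """Zero the speech in every segment whose bone-signal energy is <= threshold.

    speech, bone: equal-length, time-aligned sample sequences.
    seg_len:      segment length in samples (assumed, non-overlapping).
    threshold:    the energy threshold e from the text.
    """
    if len(speech) != len(bone):
        raise ValueError("speech and bone signals must be synchronized")
    out = list(speech)
    for start in range(0, len(out), seg_len):
        end = min(start + seg_len, len(out))
        # Short-time energy of the bone conduction signal in this segment.
        energy = sum(x * x for x in bone[start:end])
        if energy <= threshold:
            # Treated as noise: set the speech samples to zero.
            for k in range(start, end):
                out[k] = 0
    return out


cleaned = gate_by_bone_energy([1, 2, 3, 4, 5, 6], [0, 0, 0, 3, 3, 3],
                              seg_len=3, threshold=1)
print(cleaned)
```

In the example the first segment has zero bone-conduction energy and is silenced, while the second segment has energy 27 and passes through unchanged.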

Step 2: framing and grouping.

The preprocessed speech signal is divided, according to its length, into fixed-length frames of 30 ms each, and every n frames form one group; the exception is the last group, whose frame count is the remainder of the total number of frames divided by n;
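
The framing arithmetic can be checked with a short sketch. The function name `group_frames` is hypothetical, and two details are assumptions: trailing samples that do not fill a whole frame are dropped, and when the total frame count is an exact multiple of n the last group is simply a full group.

```python
def group_frames(num_samples, n, fs=8000, frame_ms=30):
    """Return (samples per frame, total frames, list of group sizes).

    fs and frame_ms default to the 8 kHz / 30 ms values given in the text.
    """
    frame_len = fs * frame_ms // 1000            # 8000 * 30 / 1000 = 240
    total_frames = num_samples // frame_len      # partial last frame dropped (assumed)
    full_groups, remainder = divmod(total_frames, n)
    sizes = [n] * full_groups
    if remainder:
        sizes.append(remainder)                  # last, shorter group
    return frame_len, total_frames, sizes


# 25.5 s of speech at 8 kHz, grouped 50 frames at a time.
frame_len, total_frames, sizes = group_frames(204000, n=50)
print(frame_len, total_frames, len(sizes))
```

With 204000 samples (25.5 s at 8 kHz) this gives 240 samples per frame, 850 frames, and 17 groups of 50 frames.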

Step 3: extract the parameters L_{g, i} and P_{g, i}.

Using the G.723.1 speech compression coding standard, extract the line spectral frequency coefficients L_{g, i} and the pitch period P_{g, i} of each frame for subsequent watermark generation, where g is the index of the group containing the frame and i is the position of the frame within the group.

Step 4: generate the watermark.

Using the hash function, generate the watermark W_{g, i} to be embedded in the current frame from the extracted bone conduction device number ID_D, the speech signal number ID_S, the line spectral frequency coefficients L_{g, i} of the current frame, the pitch period P_{g, i+1} extracted from the next frame, and the watermark W_{g, i-1} of the previous frame, according to the following three cases:

(1) If the current frame is the first frame of a group, the watermark W_{g, 1} is generated by:

W_{g, 1} = H_{x} ({ID}_{D}, {ID}_{S}, g, W_{g - 1, n}, L_{g, 1}, P_{g, 2})

where H_{x}() denotes the hash function,

W_{g, 1} is the watermark generated for the first frame of group g; when g = 1, W_{0, n} is set to the private key Key,

W_{g - 1, n} is the watermark of frame n of group g-1, i.e. the watermark of the previous frame,

L_{g, 1} is the line spectral frequency coefficients of frame 1 of group g,

P_{g, 2} is the pitch period of frame 2 of group g.

(2) If the current frame is the last frame of the last group, and the last group has m frames, the watermark W_{T, m} is generated by:

W_{T, m} = H_{x} ({ID}_{D}, {ID}_{S}, W_{T, m - 1}, L_{T, m}, (T - 1) \times n + m)

where W_{T, m} is the watermark generated for frame m of group T,

W_{T, m - 1} is the watermark of frame m-1 of group T,

L_{T, m} is the line spectral frequency coefficients of frame m of group T,

(T - 1) \times n + m is the total number of speech frames.

(3) Otherwise, the watermark W_{g, i} is generated by:

W_{g, i} = H_{x} ({ID}_{D}, {ID}_{S}, W_{g, i - 1}, L_{g, i}, P_{g, i + 1})

where W_{g, i} is the watermark generated for frame i of group g,

W_{g, i - 1} is the watermark of frame i-1 of group g,

L_{g, i} is the line spectral frequency coefficients of frame i of group g,

P_{g, i + 1} is the pitch period of frame i+1 of group g.
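
The three cases form a hash chain in which each watermark depends on the previous one. The sketch below illustrates that chain under several assumptions that are not in the source: the names `H` and `generate_watermarks`, the `repr`-based serialization of the hash inputs, the use of the full MD5 hex digest as the watermark, and the tie-breaking rule that the last-frame formula takes precedence when the final frame is also the first frame of its group. For the last frame of a non-final group, the "next frame" pitch is taken from the first frame of the following group.

```python
import hashlib


def H(*fields):
    """Stand-in for H_x: MD5 over a repr-based serialization (assumed)."""
    payload = "|".join(repr(f) for f in fields).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def generate_watermarks(id_d, id_s, key, lsf, pitch, n):
    """Generate one watermark per frame.

    lsf[k], pitch[k]: LSF coefficients and pitch period of frame k (0-based).
    key:              private key used as W_{0,n}.
    n:                frames per group.
    """
    total = len(lsf)
    marks = []
    prev = key                                   # W_{0,n} = Key
    for k in range(total):
        g, i = divmod(k, n)
        g, i = g + 1, i + 1                      # 1-based group / position
        if k == total - 1:
            # Case (2): last frame of the last group, bound to the frame count.
            w = H(id_d, id_s, prev, lsf[k], total)
        elif i == 1:
            # Case (1): first frame of a group, bound to the group index g.
            w = H(id_d, id_s, g, prev, lsf[k], pitch[k + 1])
        else:
            # Case (3): all other frames.
            w = H(id_d, id_s, prev, lsf[k], pitch[k + 1])
        marks.append(w)
        prev = w
    return marks


lsf = [(0.1,), (0.2,), (0.3,), (0.4,), (0.5,)]
pitch = [40, 41, 42, 43, 44]
marks = generate_watermarks("D", "S", "key", lsf, pitch, n=2)
print(len(marks))
```

Because of the chaining, altering the features of one frame changes its own watermark and, if the watermarks were regenerated, every later watermark as well.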

Step 5: embed the watermark.

Because the G.723.1 standard is based on the CELP coding model, in which the excitation signal has the smallest effect on speech quality, the watermark W_{g, i} of the current frame is embedded in the position indices of the multi-pulse excitation of the speech coder; that is, the least significant bit of each pulse position index is replaced with a watermark bit, to obtain the watermarked compressed speech bitstream.
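
The least-significant-bit replacement can be illustrated on a plain list of integers. This sketch does not model the G.723.1 bitstream: `embed_bits` is a hypothetical helper, and the pulse position indices are ordinary non-negative integers chosen for the example.

```python
def embed_bits(positions, bits):
    """Replace the LSB of each pulse position index with one watermark bit.

    positions: pulse position indices of one frame (non-negative ints).
    bits:      watermark bits (0 or 1); must not outnumber the positions.
    """
    if len(bits) > len(positions):
        raise ValueError("more watermark bits than pulse positions")
    out = list(positions)
    for k, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        out[k] = (out[k] & ~1) | (bit & 1)
    return out


marked = embed_bits([10, 7, 4, 3], [1, 0, 1])
print(marked)
```

Each index changes by at most one, and indices beyond the number of watermark bits are left untouched.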

With reference to Fig. 2, the watermark extraction process of the invention is as follows:

Step A: decoding.

With reference to the G.723.1 speech compression coding standard, decode the watermarked speech bitstream and extract the line spectral frequency coefficients L_{g, i} and the pitch period P_{g, i} of each frame.

Step B: watermark extraction.

Because the watermark is embedded in the least significant bits of the position indices of the selected excitation pulses, the watermark is extracted by:

W^{'}(i) = LSB(Exc(i))

where W^{'}(i) denotes the i-th bit of the extracted watermark.
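
The extraction formula reads back the lowest bit of each selected pulse position index. A minimal sketch follows, with the hypothetical helper name `extract_bits` and plain integers standing in for the decoded position indices:

```python
def extract_bits(positions, count):
    """Return W'(i) = LSB(Exc(i)) for the first `count` pulse positions."""
    return [p & 1 for p in positions[:count]]


bits = extract_bits([11, 6, 5, 3], count=3)
print(bits)
```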

Step C: generate the verification watermark.

Using the hash function, generate the verification watermark W_{g, i}^{''} from the bone conduction device number ID_D, the speech signal number ID_S, the line spectral frequency coefficients L_{g, i} of the current frame, the pitch period P_{g, i+1} extracted from the next frame, and the watermark W_{g, i - 1}^{'} extracted from the previous frame, according to the following three cases:

(1) If the current frame is the first frame of a group, the verification watermark W_{g, 1}^{''} is generated by:

W_{g, 1}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, g, W_{g - 1, n}^{'}, L_{g, 1}, P_{g, 2})

where H_{x}() denotes the hash function,

W_{g, 1}^{''} is the verification watermark of frame 1 of group g; when g = 1, W_{0, n} is the private key Key,

W_{g - 1, n}^{'} is the watermark of frame n of group g-1, i.e. the watermark extracted from the previous frame,

P_{g, 2} is the pitch period of frame 2 of group g;

(2) If the current frame is the last frame of the last group, and the last group has m frames, the verification watermark W_{T, m}^{''} is generated by:

W_{T, m}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, W_{T, m - 1}^{'}, L_{T, m}, (T - 1) \times n + m)

where W_{T, m}^{''} is the verification watermark generated for frame m of group T,

W_{T, m - 1}^{'} is the watermark extracted from frame m-1 of group T,

(T - 1) \times n + m is the total number of speech frames.

(3) Otherwise, the verification watermark W_{g, i}^{''} is generated by:

W_{g, i}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, W_{g, i - 1}^{'}, L_{g, i}, P_{g, i + 1})

where W_{g, i}^{''} is the verification watermark generated for frame i of group g,

W_{g, i - 1}^{'} is the watermark extracted from frame i-1 of group g,

P_{g, i + 1} is the pitch period of frame i+1 of group g.

Step D: watermark verification.

Compare the watermark W_{g, i}^{'} extracted from frame i of group g with the generated verification watermark W_{g, i}^{''}. If W_{g, i}^{'} is not equal to W_{g, i}^{''}, frame i of group g is judged to be in error, and the position and type of the attack are determined from the positions and distribution of the errors by the following procedure:

(4a) Detect the attack position

Starting from the first frame of the first group, compare the extracted watermark W_{g, i}^{'} with the generated verification watermark W_{g, i}^{''}. If W_{g, i}^{'} and W_{g, i}^{''} are consistent, continue with the next frame; if

W_{g, i}^{'} \neq W_{g, i}^{''},

frame i of group g is judged to be in error, and the attacked position is frame i of group g;
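
The position scan in (4a) amounts to finding the first frame whose extracted and verification watermarks differ. The sketch below assumes the two watermark sequences are already available as equal-length lists in frame order; the function name `locate_first_error` is hypothetical.

```python
def locate_first_error(extracted, verification, n):
    """Return (g, i), 1-based, of the first mismatching frame, or None.

    extracted:    watermarks W' in frame order.
    verification: watermarks W'' in frame order.
    n:            frames per group.
    """
    for k, (w_ext, w_ver) in enumerate(zip(extracted, verification)):
        if w_ext != w_ver:
            g, i = divmod(k, n)
            return g + 1, i + 1
    return None


position = locate_first_error(["a", "b", "c", "d"], ["a", "b", "x", "d"], n=2)
print(position)
```

In the example the third frame mismatches, which is frame 1 of group 2 when n = 2.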

(4b) Define the error types

Type I error: the watermark extracted from the first frame of a group is inconsistent with the verification watermark, but the extracted watermarks of the preceding frame and the following frame are consistent with their verification watermarks, that is:

W_{g, 1}^{'} \neq W_{g, 1}^{''},

W_{g - 1, n}^{'} = W_{g - 1, n}^{''},

W_{g, 2}^{'} = W_{g, 2}^{''};

Type II error: the watermark extracted from the last frame of the last group is inconsistent with the verification watermark, but the extracted watermark of the preceding frame is consistent with its verification watermark;

Type III error: all other errors are Type III errors;
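
The three error types can be assigned mechanically from a per-frame mismatch flag. This is a sketch under stated assumptions: `classify_errors` is a hypothetical name, a missing neighbor (before the first frame or after the last) is treated as consistent, and the Type II test is applied before the Type I test.

```python
def classify_errors(mismatch, n):
    """Map each erroneous frame number (1-based) to its error type 1, 2 or 3.

    mismatch[k] is True when W' != W'' for frame k (0-based).
    n is the number of frames per group.
    """
    total = len(mismatch)
    types = {}
    for k, bad in enumerate(mismatch):
        if not bad:
            continue
        prev_ok = k == 0 or not mismatch[k - 1]
        next_ok = k == total - 1 or not mismatch[k + 1]
        if k == total - 1 and prev_ok:
            types[k + 1] = 2        # last frame of the last group
        elif k % n == 0 and prev_ok and next_ok:
            types[k + 1] = 1        # isolated error on a group's first frame
        else:
            types[k + 1] = 3        # everything else
    return types


result = classify_errors([False, False, False, True, False, True], n=3)
print(result)
```

With n = 3, frame 4 is the first frame of group 2 and is isolated, so it is Type I; frame 6 is the final frame, so it is Type II.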

(4c) Determine the attack type from the distribution of the different error types

If the attack is an insertion attack, consecutive Type III errors appear at the insertion position; after these consecutive Type III errors, Type I and Type III errors appear at equal intervals, the interval being the number of frames per group, and a Type II error appears at the last frame;

If the attack is a replacement attack, consecutive Type III errors appear at the replacement position, and the other frames are error-free;

If the attack is a deletion attack, Type I and Type III errors appear at equal intervals after the deletion position, the interval being the number of frames per group, and a Type II error appears at the last frame.
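
The three rules can be turned into a simple decision procedure for the single-attack case. The sketch below is only one possible reading of the rules: the function name `infer_attack`, the input format (a mapping from 1-based frame number to error type), and the minimum run length of two frames used to separate an insertion from a deletion are all assumptions, and combined attacks are not handled.

```python
def infer_attack(types):
    """Guess the attack type from a {frame_number: error_type} mapping.

    Returns 'none', 'replacement', 'insertion' or 'deletion'.
    """
    if not types:
        return "none"
    frames = sorted(types)
    start = frames[0]
    # Length of the run of consecutive Type III errors at the first error.
    run = 0
    while types.get(start + run) == 3:
        run += 1
    later = [f for f in frames if f >= start + run]
    if run >= 1 and not later:
        return "replacement"     # a run of Type III errors and nothing after it
    if run >= 2:
        return "insertion"       # a run followed by further errors
    return "deletion"            # only isolated, spaced errors


replaced = {f: 3 for f in range(325, 541)}
inserted = dict(replaced)
inserted.update({551: 1, 575: 3, 601: 1})
deleted = {325: 3, 351: 1, 375: 3, 674: 2}
print(infer_attack(replaced), infer_attack(inserted), infer_attack(deleted))
```

The frame numbers in the example are illustrative values, not figures read from the patent drawings.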

The effect of the invention is further illustrated by the following simulation experiments:

1. Simulation conditions

A mono WAV speech signal with a sampling frequency of 8 kHz and 16-bit quantization is used as the watermark carrier. The speech is 25.5 s long and is divided into 850 frames of 30 ms each, forming 17 groups of 50 frames. The software environment is Matlab 7.0. The following attack tests were designed:

(1) Insertion attack: a 216-frame segment of unwatermarked signal is inserted at the position of frame 325 of the synthesized speech.

(2) Replacement attack: frames 325 to 540 of the synthesized speech are replaced by a 216-frame segment of unwatermarked signal.

(3) Deletion attack: frames 325 to 500 of the synthesized speech are deleted.

(4) Combined attack: frames 175 to 290 of the synthesized speech are replaced by a 116-frame segment of unwatermarked signal, and a 116-frame segment of unwatermarked signal is inserted at the position of frame 475 of the synthesized speech.

2. Simulation results and analysis

The experimental results are shown in Fig. 3, Fig. 4, Fig. 5, and Fig. 6, which give the speech waveform and the watermark detection result after the insertion, replacement, deletion, and combined attacks, respectively. Specifically:

Fig. 3a is the waveform of the speech signal after the insertion attack. It shows that a 216-frame segment of unwatermarked signal has been inserted at frames 325 to 540.

Fig. 3b is the watermark detection result. As the figure shows, a run of 216 consecutive Type III errors appears from frame 325 to frame 540; after this run, Type I and Type III errors appear at equal intervals of 50 frames, the number of frames per group, and a Type II error appears at the last frame of the speech. The attacked position can therefore be judged to be frame 325. Because a replacement attack produces no further errors after the consecutive Type III errors, and a deletion attack produces no consecutive Type III errors, the attack can be judged to be an insertion attack with an insertion length of 216 frames.

Fig. 4a is the waveform of the speech signal after the replacement attack. It shows that frames 325 to 540 have been replaced by an unwatermarked segment of equal length.

Fig. 4b is the watermark detection result. As the figure shows, consecutive Type III errors appear from frame 325 to frame 540, and the subsequent frames are error-free. The attacked position can therefore be judged to be frame 325. Because an insertion attack produces further errors after the consecutive Type III errors, and a deletion attack produces no consecutive Type III errors, the attack can be judged to be a replacement attack covering frames 325 to 540.

Fig. 5a is the waveform of the speech signal after the deletion attack. It shows that frames 325 to 500 have been deleted.

Fig. 5 b is a watermarking detecting results, uniformly-spaced occurs Error type I and the 3rd class mistake after the 325th frame of this figure, its at interval length be i.e. 50 frames of every group of frame number that comprises, error type II then appears at the last frame of voice.Can judge that thus position under fire is the 325th frame, owing to insert to attack and after the 3rd continuous class mistake, other mistakes can occur, other mistakes can not appear after the 3rd continuous class mistake and replace to attack, therefore can judge that the suffered attack of voice is the deletion attack, because the position of the frame that the 3rd class mistake that uniformly-spaced occurs occurs is 325,375,425..., its remainder divided by every group frame number 50 all is 25, therefore the length L of deletion is: L=50 * N+25+1, N=0,1 ... recomputate the checking watermark of last frame

:

W_{T, m}^{''} = H_{x} (D, S, W_{T, m - 1}, L_{T, m}, (T - 1) \times n + m + 50 \times N + 25 + 1)

Search for N=3, make

W_{T, m}^{''} = W_{T, m}^{'},

Thereby calculating deletion length is 176 frames.
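
The search for the deletion length can be sketched as a loop over N that recomputes the last-frame verification watermark with each candidate frame count. The names `H` and `find_deleted_length`, the MD5 `repr`-based serialization, and the placeholder inputs are assumptions for illustration; only the formula L = 50 × N + 25 + 1 and the figures 850, 176 and N = 3 come from the text.

```python
import hashlib


def H(*fields):
    """Stand-in for H_x: MD5 over a repr-based serialization (assumed)."""
    payload = "|".join(repr(f) for f in fields).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def find_deleted_length(w_last_extracted, id_d, id_s, w_prev_extracted,
                        lsf_last, observed_total, n, remainder, max_n=100):
    """Search N such that the recomputed last-frame watermark matches.

    Candidate deletion length: L = n * N + remainder + 1.
    Returns (N, L) or None if no candidate up to max_n matches.
    """
    for big_n in range(max_n + 1):
        length = n * big_n + remainder + 1
        candidate = H(id_d, id_s, w_prev_extracted, lsf_last,
                      observed_total + length)
        if candidate == w_last_extracted:
            return big_n, length
    return None


# The original speech had 850 frames; 176 were deleted, so 674 remain.
w_last = H("D", "S", "w_prev", (0.1, 0.2), 850)
found = find_deleted_length(w_last, "D", "S", "w_prev", (0.1, 0.2),
                            observed_total=674, n=50, remainder=25)
print(found)
```

The search succeeds at N = 3, giving 50 × 3 + 25 + 1 = 176 frames, because the last-frame watermark was bound to the original total of 850 frames at embedding time.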

Fig. 6 a is the oscillogram of voice signal after ganging up against, and this figure demonstration is replaced by one section isometric signal that does not contain watermark from the 175th frame～290 frames, and it is the signal that does not contain watermark of one section of 116 frame that the position of the 475th frame has been inserted into length.

Fig. 6 b is a watermarking detecting results, as can be seen from the figure: the 3rd continuous class mistake all occurs at the 175th～290 frame and the 475th frame～590 frames, because the frame of replacing after attacking does not have mistake, insert then equally spaced Error type I and the 3rd class mistake of occurring of frame after attacking, therefore can judge that voice are subjected to replacing attack at the 175th～290 frame, the 475th～590 frame is subjected to inserting and attacks.

Above simulation result shows the present invention can judge not only whether host's carrier is attacked, and can accurately detect the position and the attack type of attack.

Claims

1. A bone-conduction-based speech compression watermark embedding method, comprising the following process:

(1) using the bone conduction device signal to preprocess the speech signal by removing noise and other unwanted sounds, and extracting the bone conduction device number ID_D and the speech signal number ID_S;

(3) using the G.723.1 speech compression coding standard, extracting the line spectral frequency coefficients L_{g, i} and the pitch period P_{g, i} of each frame, where L_{g, i} and P_{g, i} denote the line spectral frequency coefficients and the pitch period of frame i of group g, respectively;

2. The watermark embedding method according to claim 1, wherein step (4) is carried out according to the following three cases:

(4a) if the current frame is the first frame of a group, the watermark W_{g, 1} is generated by:

W_{g, 1} = H_{x} ({ID}_{D}, {ID}_{S}, g, W_{g - 1, n}, L_{g, 1}, P_{g, 2})

where H_{x}() denotes the hash function,

P_{g, 2} is the pitch period of frame 2 of group g;

(4b) if the current frame is the last frame of the last group, and the last group has m frames, the watermark W_{T, m} is generated by:

W_{T, m} = H_{x} ({ID}_{D}, {ID}_{S}, W_{T, m - 1}, L_{T, m}, (T - 1) \times n + m)

W_{T, m - 1} is the watermark of frame m-1 of group T,

(T - 1) \times n + m is the total number of speech frames;

(4c) otherwise, the watermark W_{g, i} is generated by:

W_{g, i} = H_{x} ({ID}_{D}, {ID}_{S}, W_{g, i - 1}, L_{g, i}, P_{g, i + 1})

where W_{g, i} is the watermark generated for frame i of group g,

W_{g, i - 1} is the watermark of frame i-1 of group g,

P_{g, i + 1} is the pitch period of frame i+1 of group g.

3. A bone-conduction-based speech compression watermark extraction and verification method, comprising the following process:

2) extracting the watermark W_{g, i}^{'} from the position indices of the selected excitation pulses;

generating the verification watermark W_{g, i}^{''};

4) comparing the watermark W_{g, i}^{'} extracted from frame i of group g with the generated verification watermark W_{g, i}^{''}; if W_{g, i}^{'} is not equal to W_{g, i}^{''}

4. The watermark extraction and verification method according to claim 3, wherein step 3) is carried out according to the following three cases:

(3a) if the current frame is the first frame of a group, the verification watermark W_{g, 1}^{''} is generated by:

W_{g, 1}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, g, W_{g - 1, n}^{'}, L_{g, 1}, P_{g, 2})

where H_{x}() denotes the hash function,

P_{g, 2} is the pitch period of frame 2 of group g;

(3b) if the current frame is the last frame of the last group, and the last group has m frames, the verification watermark W_{T, m}^{''} is generated by:

W_{T, m}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, W_{T, m - 1}^{'}, L_{T, m}, (T - 1) \times n + m)

where W_{T, m}^{''} is the verification watermark generated for frame m of group T,

W_{T, m - 1}^{'} is the watermark extracted from frame m-1 of group T,

(T - 1) \times n + m is the total number of speech frames;

(3c) otherwise, the verification watermark W_{g, i}^{''} is generated by:

W_{g, i}^{''} = H_{x} ({ID}_{D}, {ID}_{S}, W_{g, i - 1}^{'}, L_{g, i}, P_{g, i + 1})

where W_{g, i}^{''} is the verification watermark generated for frame i of group g,

W_{g, i - 1}^{'} is the watermark extracted from frame i-1 of group g,

P_{g, i + 1} is the pitch period of frame i+1 of group g.

5. The watermark extraction and verification method according to claim 3, wherein the determination of the position and type of the attack in step 4) is carried out according to the following procedure:

(4a) detecting the attack position

Starting from the first frame of the first group, compare the extracted watermark W_{g, i}^{'} with the generated verification watermark W_{g, i}^{''}. If W_{g, i}^{'} and W_{g, i}^{''} are consistent, continue with the next frame; if

(4b) defining the error types

Type III error: all other errors are Type III errors;