CN102237093B

CN102237093B - Echo hiding method based on forward and backward echo kernels

Info

Publication number: CN102237093B
Application number: CN2011101330467A
Authority: CN
Inventors: 张玲华; 李刚; 黄智渊; 张磊
Original assignee: Nanjing Post and Telecommunication University
Current assignee: Nanjing Post and Telecommunication University; Nanjing University of Posts and Telecommunications
Priority date: 2011-05-23
Filing date: 2011-05-23
Publication date: 2012-08-15
Anticipated expiration: 2031-05-23
Also published as: CN102237093A

Abstract

The invention discloses an echo hiding method based on forward and backward echo kernels. The method comprises a watermark embedding part and a watermark extraction part. In embedding, the audio carrier signal is divided into frames, forward and backward echoes with delay d are introduced into each frame of the audio signal, and a time interval of length d is added between adjacent frames of the audio carrier signal. In extraction, the watermark in each frame is detected by the cepstrum method and extracted according to the echo delay d, and the time intervals between the frames of the audio carrier signal are removed. By adding between adjacent frames a time interval whose length equals the echo delay, the method eliminates the signal loss that occurs during watermark embedding in the prior art and greatly increases the signaling rate and the recovery rate.

Description

An echo hiding method based on forward and backward echo kernels

Technical field

The invention relates to an echo hiding method, and in particular to an echo hiding method based on forward and backward echo kernels. It belongs to the field of information security technology.

Background art

With the development of science and technology, the demand for information security is growing ever more urgent. The most important task in information security is to defend against attackers. Some active attackers attack as soon as they discover secret information. To prevent such attacks, information hiding is generally used. Information hiding means concealing secret information in non-secret media data (such as sound or images) so that an active attacker cannot detect that anything is hidden; at the same time, information hiding and encryption are not mutually exclusive. The hidden information is referred to as a digital watermark. Viewed from another angle, even an attacker who has discovered the hidden information must still go through two steps, detection and decryption, to obtain it. To improve the security of a speech system, the following aspects deserve attention. First, the speech carrier should, as far as possible, sound no different before and after the secret information is embedded, so that an attacker cannot determine whether information is hidden. Second, even if an eavesdropper discovers that information is hidden, it must not be easy to extract. Third, the encryption applied to the secret information must not be easy to crack.

To study how to evade attackers, one must first understand the masking characteristics of the human auditory system. Auditory masking in the human ear is divided into frequency masking and temporal masking. Temporal masking means that a signal can be masked by a signal emitted shortly before or after it. When the time difference and the frequency difference are both small, temporal masking and simultaneous masking are both stronger. Echo-based information hiding systems hide information by exploiting exactly these masking characteristics. Echo hiding uses the auditory masking effect of the human ear and is an effective audio information hiding method. Its aim is to embed new information into the original audio by adding echoes. The mathematical model of the echo kernel is given by formula (1):

$$h(n) = \delta(n) + \alpha\,\delta(n-d) \tag{1}$$

The echo-embedded sound $y(n)$ can be expressed as the convolution of $x(n)$ and $h(n)$, where $x(n)$ is the original sound signal and $h(n)$ is the unit impulse response of the echo kernel. The echo is introduced into the original sound by the term $\alpha\,\delta(n-d)$, where $d$ is the delay and $\alpha$ is the attenuation coefficient. The sound signal after echo embedding is:

$$y(n) = x(n) * h(n) = x(n) + \alpha\,x(n-d) \tag{2}$$

The concrete procedure of echo hiding is as follows. A piece of audio data is first divided into segments containing the same number of samples, denoted N; each segment lasts from a few to a few tens of milliseconds. Each segment is used to embed 1 bit of hidden information. During embedding, formula (1) is applied to each segment: choosing $d = d_0$ embeds the hidden bit "0" in the segment, and choosing $d = d_1$ embeds the hidden bit "1". The delays $d_0$ and $d_1$ are selected according to the masking effect of human hearing. Finally, all segments containing hidden information are concatenated into a continuous signal.
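The segment-wise procedure above can be sketched in Python with NumPy. This is a minimal illustration rather than the patent's implementation: the function name `embed_echo_bits`, its parameters, and the choice to apply the echo inside each segment only are assumptions made for the example.

```python
import numpy as np

def embed_echo_bits(x, bits, frame_len, d0, d1, alpha):
    """Embed one bit per segment with a single echo (hypothetical helper).

    A delay of d0 samples encodes bit 0 and a delay of d1 samples encodes bit 1.
    """
    x = np.asarray(x, dtype=float)
    if len(bits) * frame_len > len(x):
        raise ValueError("carrier is too short for the requested number of bits")
    if max(d0, d1) >= frame_len:
        raise ValueError("echo delay must be shorter than a segment")
    y = x.copy()
    for k, bit in enumerate(bits):
        start = k * frame_len
        frame = x[start:start + frame_len]
        d = d1 if bit else d0
        # y(n) = x(n) + alpha * x(n - d), evaluated inside the segment only
        # (assumption: samples before the segment start are treated as zero)
        y[start + d:start + frame_len] += alpha * frame[:frame_len - d]
    return y
```

With an 8-sample carrier, two 4-sample segments and bits `[0, 1]`, the first segment receives an echo delayed by `d0` and the second an echo delayed by `d1`; samples beyond the last embedded segment are left unchanged.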

Extracting the embedded information amounts to determining the echo delay. Because each stego segment is a convolutional combination of signals, it is difficult to determine the echo delay directly in the time domain or the frequency domain. A convolutional homomorphic filtering system can be used to turn the convolutional combination into an additive one. Bender et al. determine the echo delay by cepstral analysis.

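For the sound signal $y(n)$, the cepstrum is defined as

$$c_y(n) = F^{-1}\{\ln F\{y(n)\}\} \tag{3}$$

where $F$ and $F^{-1}$ denote the Fourier transform and the inverse Fourier transform, respectively. Since $y(n)$ is the convolution of $x(n)$ and $h(n)$, formula (3) can be written in the following form:

$$c_y(n) = F^{-1}\{\ln X(e^{j\omega})\} + F^{-1}\{\ln H(e^{j\omega})\} \tag{4}$$

Formula (4) computes the cepstra of $x(n)$ and $h(n)$ separately and then sums them, that is,

$$c_y(n) = c_x(n) + c_h(n) \tag{5}$$

Taking the cepstrum of $h(n)$:

$$c_h(n) = F^{-1}\{\ln(1 + \alpha e^{-j\omega d})\} \tag{6}$$

where $H(e^{j\omega}) = 1 + \alpha e^{-j\omega d}$.

Because $\ln(1+z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots$ for $|z| < 1$, and because the attenuation coefficient satisfies $|\alpha| < 1$:

$$\ln(1 + \alpha e^{-j\omega d}) = \alpha e^{-j\omega d} - \frac{\alpha^2}{2} e^{-j2\omega d} + \frac{\alpha^3}{3} e^{-j3\omega d} - \cdots \tag{7}$$

So,

$$c_h(n) = \alpha\,\delta(n-d) - \frac{\alpha^2}{2}\,\delta(n-2d) + \frac{\alpha^3}{3}\,\delta(n-3d) - \cdots \tag{8}$$

Therefore, with the echo kernel cepstrum given by formula (8), the cepstrum of the sound signal after echo embedding is:

$$c_y(n) = c_x(n) + \alpha\,\delta(n-d) - \frac{\alpha^2}{2}\,\delta(n-2d) + \frac{\alpha^3}{3}\,\delta(n-3d) - \cdots \tag{9}$$

In formula (9), $c_h(n)$ is nonzero only at integer multiples of $d$. In the cepstral domain, $c_y(n)$ therefore also shows a peak at the echo delay $d$. The delay of the embedded echo can be determined from this peak, which in turn determines whether the hidden bit is "0" or "1".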
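A minimal detection sketch in Python with NumPy follows. It uses the real cepstrum (log magnitude only) in place of the complex cepstrum of formula (3), which is a common simplification and an assumption of this example; the names `real_cepstrum` and `detect_bit` and the candidate delays `d0` and `d1` are hypothetical.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(np.asarray(frame, dtype=float))
    # small constant guards against log(0); it is not part of the method itself
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    return np.real(np.fft.ifft(log_mag))

def detect_bit(frame, d0, d1):
    """Decide the hidden bit by comparing the cepstrum at the two candidate delays."""
    c = real_cepstrum(frame)
    return 1 if c[d1] > c[d0] else 0
```

For a frame carrying an echo at delay `d1`, the cepstrum gains a positive contribution at index `d1`, so `detect_bit` is expected to return 1 as long as that peak stands out from the cepstrum of the carrier itself; with small attenuation coefficients or short frames the decision becomes less reliable.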

Compared with other audio information hiding methods (for example, the LSB method, phase coding and spread-spectrum modulation), echo hiding has several advantages: 1. the hiding algorithm is simple; 2. the algorithm produces no noise and conceals well; 3. the introduced echo sometimes creates a stereo effect that makes the sound richer and fuller; 4. extracting the hidden information does not require the original audio data, so blind detection is possible. The weaknesses of the method are equally clear: the embedding capacity is small (the amount of secret information embedded is generally 2 b/s to 64 b/s, depending on the transmission environment and the parameter design), the recovery rate is not very satisfactory, and both channel noise and deliberate tampering reduce extraction accuracy.

Imperceptibility is a key property that every audio-based information hiding method, echo hiding included, should have. In an echo hiding implementation, the amplitude of the introduced echo directly determines the imperceptibility of the hidden information.

Xu et al. (Xu. Proakis, Digital Communications. New York: McGraw-Hill, 2001.) proposed an echo hiding method based on multiple echo kernels, building on Bender's original algorithm. The method uses multiple echo kernels to introduce several small-amplitude echoes and thereby improves the imperceptibility of the hidden information. Under malicious third-party attacks, however, its recovery rate is unsatisfactory, and the algorithm is not strongly robust.

Oh (Oh H-O, HyunWook Kim, JongWon Seok. Transparent and Robust Audio Watermarking with a New Echo Embedding Technique [C]. ICME, 2001. 317-320.) proposed an echo hiding method based on bipolar echo kernels. The method divides the echo perceptible to the human ear into two parts, echo and coloration. The former degrades sound quality because the delay of the introduced echo is too large; the latter can be regarded acoustically as a coloration of the original sound. Oh studied how echo signals of different polarity and number, embedded in the coloration region, affect the original sound. Polarity refers to the sign of the attenuation coefficient $\alpha$ in formula (10): if $\alpha$ is positive, the echo is said to have positive polarity; if $\alpha$ is negative, the echo is said to have negative polarity. Echo signals of different polarity and number have different frequency responses. Embedding two echoes of opposite polarity and different delays in the audio signal strengthens the imperceptibility of the echo. Huang developed Oh's method further and proposed an echo hiding method based on psychoacoustic model analysis (Psychoacoustic model MPEG-1), which further improves the robustness of the algorithm and the imperceptibility of the echo. This class of methods still has a drawback: the recovery rate of the hidden information is low, particularly when the amplitude of the introduced echo is small.

Kim proposed an echo hiding method based on forward and backward echo kernels (Hyoung Joong Kim, Yong Hee Choi. A Novel Echo-Hiding Scheme with Backward and Forward Kernels [J]. Circuits and Systems for Video Technology, 2003, 13(8): 885-889.). The method introduces a new echo kernel made up of two echo introduction factors that have the same delay but opposite directions; it is called the forward and backward echo kernel and can be expressed as

$$h(n) = \delta(n) + \alpha\,\delta(n-d) + \alpha\,\delta(n+d) \tag{10}$$

where $\alpha\,\delta(n-d)$ is called the backward echo introduction factor and $\alpha\,\delta(n+d)$ is called the forward echo introduction factor. Kim's results show that, when hidden information is detected by cepstral analysis, the size of the peak at the echo position is affected by three factors: 1. the cepstral response $c_h(n)$ of the echo kernel, which plays the most decisive role; 2. the cepstral response of the original sound signal, which also affects the peak size; 3. the direction of the delay $d$, whose influence on the peak size cannot be ignored either. For the same original sound signal, the hiding algorithm with forward and backward echo kernels achieves a lower bit error rate in hidden-information detection while maintaining good imperceptibility. In the watermark embedding process of this method, however, after the original speech and the echoes are superimposed, the signal actually obtained for each frame is not the signal that the theory calls for: in practice, part of the theoretical signal is missing. In other words, the embedding process loses its theoretical accuracy.
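One way to picture the loss described above is sketched below in Python with NumPy. It assumes the forward and backward kernel takes the form delta(n) + alpha*delta(n - d) + alpha*delta(n + d), and it assumes that the "missing part" corresponds to the echo samples that fall outside the original frame boundaries; both are readings of the text, not statements taken from it, and the function names are hypothetical.

```python
import numpy as np

def forward_backward_kernel(d, alpha):
    """Kernel taps for n = -d .. d (assumed form of the forward/backward kernel)."""
    h = np.zeros(2 * d + 1)
    h[0] = alpha       # forward echo, alpha * delta(n + d)
    h[d] = 1.0         # direct sound, delta(n)
    h[2 * d] = alpha   # backward echo, alpha * delta(n - d)
    return h

def embed_frame_full(frame, d, alpha):
    """Full linear convolution: N + 2d samples, index 0 corresponds to n = -d."""
    return np.convolve(np.asarray(frame, dtype=float), forward_backward_kernel(d, alpha))

def embed_frame_truncated(frame, d, alpha):
    """Keep only the N samples aligned with the original frame; 2d echo samples are dropped."""
    full = embed_frame_full(frame, d, alpha)
    return full[d:d + len(frame)]
```

For a frame of N samples the full result has N + 2d samples, while a frame-aligned embedding keeps only N of them, so the leading and trailing echo samples are discarded.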

Summary of the invention

The technical problem to be solved by the invention is to overcome the deficiency of the prior art that part of each frame signal is lost during watermark embedding, and to provide an echo hiding method based on forward and backward echo kernels that eliminates the signal loss in the watermark embedding process.

Specifically, the invention adopts the following technical scheme:

An echo hiding method based on forward and backward echo kernels comprises a watermark embedding part and a watermark extraction part. The watermark embedding part comprises the step of dividing the audio carrier signal into frames and the step of introducing forward and backward echoes with delay d into each frame of the audio signal. The watermark extraction part comprises the step of detecting the watermark in each frame of the audio signal by the cepstrum method and the step of extracting the watermark according to the echo delay d;

The watermark embedding part further comprises the step of adding a time interval of length d between the frames of the audio carrier signal;

The watermark extraction part further comprises the step of removing the time intervals between the frames of the audio carrier signal.

By adding between the frames of the audio carrier signal a time interval whose length equals the echo delay, the invention eliminates the signal loss in the watermark embedding process of the prior art and greatly improves the signaling rate and the recovery rate.

Description of drawings

Fig. 1 is a schematic diagram of the watermark embedding principle of the invention;

Fig. 2 compares the recovery rates of the existing method and the inventive method as the number of transmitted symbols varies, where "---" denotes the existing method and "--" the inventive method;

Fig. 3 compares the recovery rates of the existing method and the inventive method as the attenuation coefficient varies, where "---" denotes the existing method and "--" the inventive method;

Fig. 4 compares the recovery rates of the existing method and the inventive method as the number of delay points varies, where "---" denotes the existing method and "--" the inventive method;

Fig. 5 shows the robustness test results of the inventive method, where A is the original watermark; B1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method without attack; B2 is the watermark obtained when the original watermark is transmitted by the inventive method without attack; C1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a Gaussian white noise attack; C2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a Gaussian white noise attack; D1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a resampling attack; D2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a resampling attack; E1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a re-filtering attack; and E2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a re-filtering attack.

Embodiment

The technical scheme of the invention is described in detail below with reference to the accompanying drawings:

Suppose x1(n) is one frame of the original sound signal.

In the existing watermark embedding mode, after the original speech and the echoes are superimposed, the signal actually obtained for each frame is not the signal that the theory calls for: in practice, part of the theoretical signal is missing. In other words, the embedding process loses its theoretical accuracy.

In the embedding mode of the invention, as shown in Fig. 1, a time interval of length d is added between the frames of the audio carrier signal, where d is the echo delay. The signal obtained in each frame is then identical to the theoretically exact value, which avoids any numerical discrepancy between the theoretical and the actual echo-hidden signal. Although the improved embedding mode reduces sound quality to some extent, it greatly improves the signaling rate and the recovery rate. Moreover, the loss of sound quality can be compensated in hardware when the method is implemented in hardware.
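The gap-based embedding can be sketched as follows in Python with NumPy. Fig. 1 is not reproduced here, so the layout is an assumption: each frame is fully convolved with an assumed forward and backward kernel and placed at a stride of N + d samples, so that the echo samples that would otherwise be cut off land in the inserted interval. A single delay d is used for every frame, the bit-dependent choice of delay is left out, and the function names are hypothetical.

```python
import numpy as np

def embed_with_gaps(x, num_frames, frame_len, d, alpha):
    """Embed forward/backward echoes with an interval of d samples between frames."""
    x = np.asarray(x, dtype=float)
    if num_frames * frame_len > len(x) or d < 1:
        raise ValueError("carrier too short or delay not positive")
    stride = frame_len + d
    out = np.zeros(num_frames * stride + d)
    kernel = np.zeros(2 * d + 1)
    kernel[[0, d, 2 * d]] = [alpha, 1.0, alpha]   # assumed kernel, taps at n = -d, 0, +d
    for k in range(num_frames):
        frame = x[k * frame_len:(k + 1) * frame_len]
        # the full convolution has frame_len + 2d samples; nothing is discarded
        out[k * stride:k * stride + frame_len + 2 * d] += np.convolve(frame, kernel)
    return out

def remove_gaps(y, num_frames, frame_len, d):
    """Drop the inserted intervals and return the frame bodies back to back."""
    stride = frame_len + d
    return np.concatenate(
        [y[k * stride + d:k * stride + d + frame_len] for k in range(num_frames)]
    )
```

In this layout the output is longer than the carrier by d samples per frame plus d, which matches the trade-off described above: the embedded frames are numerically complete, at the cost of audible intervals that the receiver strips out again.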

To verify the validity of the invention, the existing method and the inventive method are compared:

One. Performance comparison:

A 15 s WAV-format song s1 with a sampling frequency of 22.05 kHz is used as the carrier and is transmitted with the conventional method and with the inventive method, respectively. Figs. 2 to 4 show the results of the comparison experiments.

With s1 as the carrier audio and an attenuation coefficient of 0.05, a delay of 45 points represents "1" and a delay of 55 points represents "0". The number of transmitted symbols is varied and the recovery rates of the conventional method and the inventive method are compared; the results are shown in Fig. 2. When fewer than 30 symbols are transmitted, the recovery rates of both methods are low and do not meet practical needs. When more than 30 symbols are transmitted, the recovery rate of the inventive method is clearly higher than that of the existing method. As the number of symbols varies, the inventive method therefore outperforms the existing method in recovery rate.

With s1 as the carrier audio, the number of transmitted symbols is set to 340; a delay of 45 points represents "1" and a delay of 55 points represents "0". The attenuation coefficient is varied and the recovery rates of the conventional method and the inventive method are compared; the results are shown in Fig. 3. When the attenuation coefficient is less than 0.1, the two recovery rates differ little, but both are below 0.7 and cannot meet practical needs. When the attenuation coefficient is greater than 0.1, the recovery rate of the inventive method is better than that of the existing method and is sufficient for practical use. The inventive method therefore meets practical needs better than the existing method.

With s1 as the carrier audio, the number of transmitted symbols is 340. A delay of n points represents "1" and a delay of n+10 points represents "0". The value of n is varied and the recovery rates of the conventional method and the inventive method are compared; the results are shown in Fig. 4. The recovery rate of the inventive method is better than that of the existing method and reaches high values that the conventional method cannot attain. The inventive method therefore meets practical requirements better than the existing method.

Two. Robustness test:

A 180 s WAV-format song with a sampling frequency of 22.05 kHz is used as the carrier. The watermark is an image named "digital watermarking" with a size of 99 × 95 pixels. Various attacks are applied to test the robustness of watermark detection. The results are shown in Fig. 5, where A is the original watermark; B1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method without attack; B2 is the watermark obtained when the original watermark is transmitted by the inventive method without attack; C1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a Gaussian white noise attack; C2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a Gaussian white noise attack; D1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a resampling attack; D2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a resampling attack; E1 is the watermark obtained at the receiving end when the original watermark is transmitted by the existing method and subjected to a re-filtering attack; and E2 is the watermark obtained at the receiving end when the original watermark is transmitted by the inventive method and subjected to a re-filtering attack.

100 binary images are chosen as test images, and the 70th image is embedded in the audio carrier as the watermark. The normalized correlation coefficient between the original watermark and the extracted watermark is then computed. The NC value (normalized correlation coefficient) is a quantitative measure of the similarity between the extracted watermark and the original watermark.
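The text does not give the NC formula. A minimal sketch in Python with NumPy, assuming the commonly used definition (the inner product of the two images divided by the product of their Euclidean norms), is shown below; the function name `normalized_correlation` is hypothetical, and other normalizations appear in the literature.

```python
import numpy as np

def normalized_correlation(original, extracted):
    """Normalized correlation between two watermark images of equal size (assumed definition)."""
    w = np.asarray(original, dtype=float).ravel()
    v = np.asarray(extracted, dtype=float).ravel()
    denom = np.sqrt(np.sum(w * w)) * np.sqrt(np.sum(v * v))
    # an all-zero image has no defined correlation; 0.0 is returned by convention here
    return float(np.sum(w * v) / denom) if denom > 0 else 0.0
```

Under this definition an extracted watermark identical to the original scores 1, and the score falls as more pixels are recovered incorrectly.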

Table 1. NC values under various attacks

Watermark	A	B1	B2	C1	C2	D1	D2	E1	E2
NC value	1	0.957	1	0.957	0.999	0.932	0.994	0.817	0.990

Table 1 shows that, in every case, the NC value obtained by the inventive method is higher than that of the conventional method. The inventive method therefore has a higher recovery rate than the existing method under various attacks.

Table 2 lists the SNR (signal-to-noise ratio) and PSNR (peak signal-to-noise ratio) obtained under the various attacks, which are used to measure the robustness of the algorithm.

Table 2. SNR and PSNR values under various attacks

Watermark	A	B1	B2	C1	C2	D1	D2	E1	E2
SNR	∞	11.74	∞	11.74	38.87	9.75	19.63	5.68	17.73
PSNR	87.86	11.65	87.86	11.65	39.73	9.72	20.34	5.58	18

Table 2 shows that, in every case, the SNR and PSNR values obtained by the inventive method are higher than those of the conventional method. The inventive method therefore also performs better than the existing method in terms of SNR and PSNR.

Three. Capacity test

In the STEP 2001 audio information hiding evaluation test, an embedding rate of 2 b/15 s for hidden data is regarded as meeting the standard for copy control of audio works, and an embedding rate of 72 b/30 s is regarded as the passing criterion for copyright management. In the experiment, the sound signal is segmented so that each segment contains 52 sample points, and the six audio segments A, B, C, D, E and F are all sampled at 22.05 kHz. The hidden-information capacity of the inventive method is 400 bps, which far exceeds the STEP 2001 standards.
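To put the three figures on a common footing, the two STEP 2001 thresholds can be converted to bits per second and compared with the stated capacity. The arithmetic below uses only the numbers quoted above; the symbols $R_{\text{copy}}$, $R_{\text{rights}}$ and $R_{\text{inv}}$ are labels introduced for this illustration.

```latex
\begin{align*}
R_{\text{copy}}   &= \frac{2\ \text{b}}{15\ \text{s}} \approx 0.13\ \text{b/s} \\
R_{\text{rights}} &= \frac{72\ \text{b}}{30\ \text{s}} = 2.4\ \text{b/s} \\
R_{\text{inv}}    &= 400\ \text{b/s}, \qquad
\frac{R_{\text{inv}}}{R_{\text{rights}}} = \frac{400}{2.4} \approx 167
\end{align*}
```

The stated capacity is therefore about 167 times the stricter of the two thresholds.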

Four. Subjective evaluation

To test the perceived quality of the sound signal after hidden information is embedded by the different methods, the following experiment was carried out. 5 pop music segments A, B, C, D, E and F were taken. The five original sound signals and the corresponding signals with embedded hidden information were played to 20 students with an average age of 24 and no professional music background. The students were asked to tell the two apart and to score the difference according to Subjective Difference Grades (SDG). The scores were averaged; the results are given in Table 3.

Table 3. Imperceptibility test results

The experimental results show that the SDG of the inventive method is slightly worse than that of the existing method, but the reduction in audio quality remains within an acceptable range.

Claims

1. An echo hiding method based on forward and backward echo kernels, comprising a watermark embedding part and a watermark extraction part, wherein the watermark embedding part comprises the step of dividing the audio carrier signal into frames and the step of introducing forward and backward echoes with delay d into each frame of the audio signal, and the watermark extraction part comprises the step of detecting the watermark in each frame of the audio signal by the cepstrum method and the step of extracting the watermark according to the echo delay d; characterized in that,