CN100353444C

CN100353444C - Digital audio-frequency anti-distorting method

Info

Publication number: CN100353444C
Application number: CNB2004100273560A
Authority: CN
Inventors: 黄继武; 吴绍权; 施礼
Original assignee: National Sun Yat Sen University
Current assignee: Sun Yat Sen University; National Sun Yat Sen University
Priority date: 2004-05-28
Filing date: 2004-05-28
Publication date: 2007-12-05
Anticipated expiration: 2024-05-28
Also published as: CN1585020A

Abstract

The present invention relates to a tamper-resistance method for digital audio based on wavelet-transform digital watermarking. For audio data that needs protection, watermarks are embedded into the audio in real time as it is recorded; that is, recording and watermark embedding run concurrently, so that embedding is complete when recording is complete. The embedding method comprises the following steps: the recorded audio data is decomposed by a wavelet transform; the watermark and synchronization codes are embedded in the low-frequency coefficients of the wavelet transform; and the audio containing the synchronization codes and the watermark is obtained through an inverse wavelet transform. The embedded watermark is strongly continuous, so extracting the watermark from watermarked audio reveals whether the audio has been cut and where the cuts are. The invention can be used to protect sensitive recordings and important audio data against tampering.

Description

A digital audio tamper-resistance method

Technical field

The present invention relates to a tamper-resistance method for digital audio that uses digital watermarking based on the wavelet transform.

Background technology

With the widespread use of digital audio, a series of problems has arisen because digital audio data is easily tampered with, which makes data integrity hard to guarantee. For example: 1) Audio transmitted over a network is easily tampered with, so a method is needed to guarantee its integrity. 2) Written records are the traditional means of taking evidence for bodies such as public security organs, but written records alone are not very accurate, are slow to produce, are easy to alter, and readily give rise to disputes. Digital recording largely solves these problems; in particular, it provides good evidence when it must later be determined whether a confession was induced or extorted, and it is convenient to consult. However, ordinary digital recordings share the defect that the data is easily tampered with. Once tamper resistance and integrity authentication are solved, digital recording can serve as an important means of taking evidence.

Summary of the invention

The purpose of the invention is to provide a tamper-resistance method for digital audio that resists cutting well, protects the integrity of the audio data, and leaves the processed audio with little audible difference from the original.

To achieve this purpose, the method is divided into two steps: embedding the watermark and synchronization code, and detecting them.

The watermark and synchronization code are embedded as follows: 1) Recording: a sampling rate and quantization precision are chosen for recording. For ordinary speech, a 22.1 kHz sampling rate and 8-bit quantization precision are chosen; for high-definition audio, a 44.1 kHz sampling rate and 16-bit quantization precision are chosen. The analog audio signal is converted to a digital signal by A/D conversion. 2) Real-time embedding of the watermark and synchronization code: during recording, the digital audio signal is divided into segments, with a default segment length of 0.1 second. An m-sequence with a period of 63 bits or 31 bits is chosen as the synchronization code. A 5-level wavelet transform with a wavelet from the Daubechies (biorthogonal) family is applied to each segment, and the synchronization code and a watermark with a default length of 44 bits are then embedded. The watermarks embedded throughout the audio data form a continuous number sequence, and the watermarked digital audio signal is saved to the computer's disk.

The watermark is detected as follows: 1) Watermark detection under normal conditions: the audio data to be examined is divided into segments no shorter than the segment length used at embedding, so that every segment contains at least one synchronization code. Each segment is transformed with the same wavelet and the same number of decomposition levels as in the embedding process, the synchronization code is searched for and extracted from the resulting low-frequency coefficients, and the watermark is finally extracted according to the position of the synchronization code. 2) Tamper determination, for the case where one segment of the audio data has been cut: the data is likewise segmented and wavelet transformed, and the watermark is extracted from the wavelet coefficients. When the extracted watermarks are discontinuous, a breakpoint is considered to have occurred in the audio data. When only one segment has been cut, the data after the breakpoint is complete; this fact is used to determine the breakpoint position and thereby extract all remaining watermarks. 3) For the case where several segments of the audio data have been cut, the translation property of the wavelet transform is used to resynchronize the watermark after each breakpoint and thereby extract all remaining watermarks.

To embed the watermark into the audio signal in real time, multithreading can be used so that recording and watermark embedding run concurrently. The system creates a voice thread and a watermark embedding thread and allocates two queues in memory: the speech data queue and the queue of data waiting for watermark embedding. The two queues are the same size; each is a circular queue of 20 memory blocks of 36096 bytes. The memory block size can be adjusted as needed to suit the size of the audio data segments.
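The two-queue, two-thread arrangement can be sketched as follows. This is a minimal illustration, not the patented implementation: it substitutes Python's bounded queue.Queue for the hand-built circular queues, and the names run_pipeline, voice_worker and embed_worker, the None end marker, and the placeholder embed function are all assumptions made for the example.

```python
import queue
import threading

BLOCK_SIZE = 36096    # bytes per memory block (from the text)
QUEUE_BLOCKS = 20     # memory blocks per queue (from the text)


def run_pipeline(blocks, embed=lambda block: block):
    """Move recorded blocks through two bounded queues (hypothetical sketch).

    `blocks` stands in for the sound card output and `embed` for the
    watermark embedding routine; the returned list stands in for the disk.
    """
    speech_q = queue.Queue(maxsize=QUEUE_BLOCKS)    # speech data queue
    pending_q = queue.Queue(maxsize=QUEUE_BLOCKS)   # data waiting for embedding
    stored = []

    def voice_worker():
        # Voice thread: forward every block that appears in the speech queue.
        while True:
            block = speech_q.get()
            pending_q.put(block)
            if block is None:       # assumed end-of-recording marker
                return

    def embed_worker():
        # Embedding thread: watermark each pending block, then store it.
        while True:
            block = pending_q.get()
            if block is None:
                return
            stored.append(embed(block))

    workers = [threading.Thread(target=voice_worker),
               threading.Thread(target=embed_worker)]
    for worker in workers:
        worker.start()
    for block in blocks:            # the recorder filling the speech queue
        speech_q.put(block)
    speech_q.put(None)
    for worker in workers:
        worker.join()
    return stored
```

Because both queues are bounded, the recorder blocks when 20 blocks are waiting, which mirrors the fixed-size circular queues described above.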

The specific procedure is as follows:

I. Embedding the watermark and synchronization code:

(I) The recording process is as follows:

(1) The recorder plays back and the analog speech signal is fed into the computer's sound card. The voice thread controls the sound card, which digitizes the speech signal by A/D conversion at a sampling rate of 22050 Hz and 8-bit quantization precision, and the digital audio signal is continuously stored in the speech data queue.

(2) The voice thread monitors the speech data queue. As soon as the queue is found to be non-empty, that is, data has entered the queue, a callback function provided by Windows immediately moves the data from the speech data queue into the queue of data waiting for watermark embedding.

(II) Embedding the watermark and synchronization code:

(1) The required watermark data is a sequence of consecutive numbers, each represented by 16 binary bits. The synchronization code used to locate the watermark is an m-sequence of length 31. The speech signal is transformed with the discrete wavelet transform (DWT) using 5 decomposition levels, and the wavelet basis is the Haar wavelet, the simplest of the Daubechies family. From the chosen watermark and wavelet-transform parameters, the length of each embedded watermark data block is calculated as 32 × (16+31) = 1504 bytes.
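The block-length arithmetic and the synchronization code can be illustrated with a short sketch. The factor 32 is read here as 2^5, the number of samples covered by one level-5 low-frequency coefficient, with one byte per 8-bit sample; that reading is an interpretation of the formula. The text does not name the generator polynomial of the m-sequence, so the feedback taps below (corresponding to x^5 + x^2 + 1) and the all-ones starting state are assumptions.

```python
LEVELS = 5            # DWT decomposition levels (from the text)
WATERMARK_BITS = 16   # bits per watermark number (from the text)
SYNC_BITS = 31        # length of the m-sequence (from the text)


def block_length(levels=LEVELS, wm_bits=WATERMARK_BITS, sync_bits=SYNC_BITS):
    """Samples needed to carry one synchronization code plus one watermark."""
    samples_per_coefficient = 2 ** levels      # 32 for 5 levels
    return samples_per_coefficient * (wm_bits + sync_bits)


def m_sequence(degree=5, taps=(5, 3)):
    """One period of a maximal-length sequence from a Fibonacci LFSR.

    The taps are an assumed choice; any primitive polynomial of degree 5
    would give a sequence of period 31.
    """
    state = [1] * degree                       # assumed non-zero start state
    out = []
    for _ in range(2 ** degree - 1):
        out.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return out


print(block_length())        # 1504
print(len(m_sequence()))     # 31
```

With 8-bit samples, 1504 samples occupy 1504 bytes, which matches the figure given in the text.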

(2) The steps of the wavelet-based watermark embedding algorithm are as follows:

1) The original speech signal into which the watermark is to be embedded is divided into segments of the watermark data block length;

2) The DWT is applied to each resulting segment to obtain the DWT-domain low-frequency coefficients of each audio segment;

3) The watermark and synchronization code information to be embedded is converted into a {-1, +1} sequence, which is then embedded into the DWT-domain low-frequency coefficients with a suitable strength;

4) The inverse discrete wavelet transform (IDWT) is applied to the DWT-domain low-frequency coefficients carrying the watermark information to obtain the watermarked audio data.
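The four steps above can be sketched for a single 1504-sample segment. The text does not state the rule by which the {-1, +1} sequence is written into the low-frequency coefficients, so this example assumes a simple quantization rule: each coefficient is moved to the lower or upper quarter of a quantization cell of width step. The function names and the value step=30.0 are hypothetical.

```python
import numpy as np

LEVELS = 5


def haar_dwt(x, levels=LEVELS):
    """Orthonormal Haar DWT; returns (low-frequency coeffs, detail list)."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return approx, details


def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    for detail in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + detail) / np.sqrt(2)
        out[1::2] = (approx - detail) / np.sqrt(2)
        approx = out
    return approx


def embed_block(samples, symbols, step=30.0):
    """Embed one -1/+1 symbol per low-frequency coefficient (assumed rule)."""
    approx, details = haar_dwt(samples)
    cell = np.floor(approx / step) * step
    offset = np.where(np.asarray(symbols) > 0, 0.75, 0.25) * step
    return haar_idwt(cell + offset, details)


def extract_block(samples, step=30.0):
    """Read the symbols back from the low-frequency coefficients."""
    approx, _ = haar_dwt(samples)
    return np.where(np.mod(approx, step) >= step / 2, 1, -1)
```

A segment of 1504 samples yields 47 low-frequency coefficients after 5 levels, one for each of the 31 synchronization symbols and 16 watermark bits. Under this assumed rule a larger step makes the symbols more robust but changes the samples more, which is the trade-off behind choosing a "suitable strength".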

The synchronization code is embedded in the audio data to locate the watermark signal, which makes the watermark resistant to cutting and translation. After embedding, the watermarks and synchronization codes are arranged as follows:

Synchronization code, watermark, synchronization code, watermark, synchronization code, watermark

(3) The watermark embedding thread monitors the queue of data waiting for watermark embedding. As soon as data enters the queue, the thread embeds the watermark into it with the embedding algorithm above. Each data block in the queue is 36096 bytes long and is divided by the watermark data block length into 36096/1504 = 24 segments; a watermark and its synchronization information are then embedded in each segment, so 24 watermarks are embedded in each data block in the queue. The watermark embedding thread then takes the watermarked data block out of the queue and saves it to the hard disk.

The embedded watermarks are strongly continuous, and this continuity makes it possible to detect whether the audio data is complete: if the audio data is complete, the extracted watermarks are also continuous. Conversely, a breakpoint in the extracted watermarks shows that the audio data is incomplete, and the position of the damage can be found from the watermark breakpoint.

II. Watermark detection process:

(I) Detection for normal audio data

(1) The steps of the wavelet-based watermark extraction algorithm are as follows:

1) The speech signal is divided into segments of the watermark data block length, and the DWT is applied to each segment;

2) A {-1, +1} sequence is extracted from the resulting DWT-domain low-frequency coefficients;

3) The {-1, +1} sequence is searched for the synchronization code that locates the watermark; once the synchronization code is found, the watermark number follows immediately after it.
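Step 3 can be sketched as a search over an already extracted {-1, +1} sequence. This is a minimal illustration that assumes an exact match of all 31 synchronization symbols and a most-significant-bit-first encoding of the 16-bit number; neither detail is specified in the text, and the function names are hypothetical.

```python
def find_sync(symbols, sync):
    """Index of the first exact occurrence of the sync code, or -1."""
    symbols, sync = list(symbols), list(sync)
    for start in range(len(symbols) - len(sync) + 1):
        if symbols[start:start + len(sync)] == sync:
            return start
    return -1


def read_watermark(symbols, sync, bits=16):
    """Return the number that follows the sync code, or None if not found."""
    symbols = list(symbols)
    position = find_sync(symbols, sync)
    first = position + len(sync)
    if position < 0 or first + bits > len(symbols):
        return None
    value = 0
    for symbol in symbols[first:first + bits]:
        # Assumed mapping: +1 encodes bit 1, -1 encodes bit 0, MSB first.
        value = (value << 1) | (1 if symbol > 0 else 0)
    return value
```

A practical detector would probably tolerate a few symbol errors in the synchronization code, for example by thresholding a correlation score, at the cost of more false alarms.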

(2) A buffer of 9024 bytes is allocated in memory, and the data to be examined is read into it one buffer length at a time. The data in the buffer is divided by the watermark data block length into 9024/1504 = 6 segments, and the DWT is applied to each segment. If the audio data is complete, a synchronization code can be found and a watermark obtained in every segment. The extracted watermarks are the consecutive integers 0, 1, 2, 3, and so on.

If, during detection, the difference between two consecutively extracted watermarks is not 1, or no watermark can be extracted, the system considers the audio data block to be damaged. The position of the damage can be determined from the watermark breakpoint, and the key to finding the watermark breakpoint is to find the synchronization code of the first watermark after it. When only one segment of the audio data has been cut, the breakpoint can be located accurately because the synchronization codes and watermarks after the breakpoint are all complete. When several segments have been cut, the watermarks before and after a breakpoint are no longer complete, so the translation property of the wavelet transform is used to locate the breakpoint.
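The continuity test can be written in a few lines. In this sketch, a segment from which no watermark could be extracted is represented by None (an assumed convention), and a breakpoint is reported as the pair of numbers on either side of the gap.

```python
def find_breakpoints(numbers):
    """Return (before, after) pairs where consecutive watermarks differ by != 1.

    `numbers` is the list of extracted watermark numbers in file order;
    None marks a segment where extraction failed.
    """
    breakpoints = []
    previous = None
    for number in numbers:
        if number is None:
            continue                      # damaged segment, no number read
        if previous is not None and number - previous != 1:
            breakpoints.append((previous, number))
        previous = number
    return breakpoints


extracted = [0, 1, 2, 3, 4, None, 9, 10, 11]
print(find_breakpoints(extracted))        # [(4, 9)]
```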

(II) Tamper determination:

Detection when one segment of the audio data has been cut:

When the audio data contains only one breakpoint, the procedure is as follows. Once the extracted watermarks are found to be discontinuous, the data block currently being extracted is considered incomplete. Starting from the start address of the data block containing the breakpoint, the system determines how many data blocks with complete watermarks and synchronization codes the remaining data contains (a complete data block has the buffer length, 9024 bytes) and computes their total length. Subtracting this total from the length of the remaining data gives the length of the residual data block. Extracting the complete watermarks in the residual data block then determines the breakpoint, and the watermark breakpoint gives the position at which the audio data was cut.

Detection when several segments of the audio data have been cut:

When the audio data contains several breakpoints, the procedure relies on the translation invariance that is characteristic of the wavelet transform. When the extracted watermarks are found to be discontinuous, a translation search is carried out on the data block currently being extracted, and the watermark after the breakpoint is found by translation. The translation search can produce false alarms: a synchronization code may be detected at a position where none was embedded, so that the random data following that position is mistaken for watermark data and an incorrect result is obtained. Because the search results include both correct and incorrect candidates, the system screens them and selects the correct watermark, which determines the watermark breakpoint.
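A translation search with screening might look like the sketch below. It is an illustration under several assumptions that the text does not confirm: symbols are read with a quantization rule of width step, a candidate is accepted only if two consecutive blocks both start with the synchronization code, the first number lies strictly between the search bounds, and the second number is one larger. All names are hypothetical.

```python
import numpy as np

GROUP = 32                      # samples under one level-5 Haar coefficient
SYNC_LEN, WM_BITS = 31, 16
UNIT = SYNC_LEN + WM_BITS       # symbols per watermark block
BLOCK = GROUP * UNIT            # 1504 samples


def symbols_at(samples, shift, step):
    """Read two blocks of -1/+1 symbols starting `shift` samples in."""
    window = np.asarray(samples[shift:shift + 2 * BLOCK], dtype=float)
    if len(window) < 2 * BLOCK:
        return None
    # For the Haar wavelet, a level-5 low-frequency coefficient equals the
    # sum of 32 consecutive samples divided by sqrt(32).
    coeffs = window.reshape(-1, GROUP).sum(axis=1) / np.sqrt(GROUP)
    return np.where(np.mod(coeffs, step) >= step / 2, 1, -1)


def to_number(bits):
    """Interpret -1/+1 symbols as an unsigned integer, MSB first (assumed)."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit > 0)
    return value


def translation_search(samples, sync, lower, upper, step=30.0):
    """Try every sample shift within one block; return (shift, number)."""
    sync = np.asarray(sync)
    for shift in range(BLOCK):
        symbols = symbols_at(samples, shift, step)
        if symbols is None:
            break
        first, second = symbols[:UNIT], symbols[UNIT:]
        if not (np.array_equal(first[:SYNC_LEN], sync)
                and np.array_equal(second[:SYNC_LEN], sync)):
            continue                      # no sync code at this shift
        a = to_number(first[SYNC_LEN:])
        b = to_number(second[SYNC_LEN:])
        # Screening: discard false alarms whose numbers are implausible.
        if lower < a < upper and b == a + 1:
            return shift, a
    return None
```

Requiring agreement between two consecutive blocks is one simple way to reject a false alarm, because random data after a spurious synchronization code is unlikely to decode to the expected next number.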

The method of the invention has the following notable advantages:

1) A continuous, self-synchronizing watermark is embedded into the audio data in the wavelet domain. With a suitable embedding strength, the watermarked audio differs only very slightly from the original audio to the ear.

2) The watermark resists cutting well and protects the integrity of the audio data. Extracting the watermark from watermarked audio data makes it possible, at this preliminary stage, to detect whether the audio has been cut. For audio data with few breakpoints, the method detects breakpoints reliably and locates the missing data with high precision.

Description of drawings:

To test how well watermarked audio resists cutting, experiments were carried out in which audio was recorded, a watermark was embedded into the recorded data in real time, and the cut resistance of the watermark was tested. The audio data in the experiments is in wav format. For example, in the audio data *.wav, * stands for the file name and the .wav suffix indicates that the file is in wav format.

Fig. 1 is the data plot of the original audio Original.wav, recorded without watermark embedding, with a length of 3 minutes 38 seconds.

Fig. 2 is the data plot of the audio water1.wav obtained from Original.wav after watermark embedding.

Fig. 3 is the data plot of the audio remain1.wav that remains after the tail of water1.wav is cut off.

Fig. 4 is the data plot of the audio remain2.wav that remains after the head of water1.wav is cut off.

Fig. 5 is the data plot of the audio remain3.wav that remains after a middle portion of water1.wav is cut out.

Fig. 6 is the data plot of the watermarked audio water2.wav.

Fig. 7 is the data plot of the audio remain4.wav that remains after several segments of water2.wav are cut out.

Fig. 1 shows the original audio data Original.wav, recorded in mono with 8-bit quantization precision and a sampling frequency of 22050 Hz, with a length of about 3 minutes 38 seconds.

Fig. 2 shows the audio water1.wav obtained by embedding into the original audio data Original.wav in real time with an embedding strength of 30.

A total of 3192 numbers, each of 16 bits and ranging from 0 to 3191, were embedded in the recorded audio. While embedding into and saving the recorded data, the system also saved the original audio data without the watermark. After recording ended, the two sets of audio data were compared, and the audible difference between them was very small.

Test results for a single breakpoint:

Embodiment

Test one:

The tail of the audio file water1.wav, starting from the 3-minute mark, is cut off; the resulting audio data remain1.wav is shown in Fig. 3.

Fig. 3 shows that the watermark numbers extracted from the remaining audio file run from 0 to 2640, 2641 in total. For the segment that was cut off, the search bounds are set to 2640 and 3192 and a translation search is carried out, which extracts 551 watermark numbers, from 2641 to 3191, exactly the missing numbers.

Test two:

The head of the audio file water1.wav, up to the 1-minute mark, is cut off; the resulting audio data remain2.wav is shown in Fig. 4.

The watermark numbers extracted from the remaining audio file run from 879 to 3191, 2313 in total. For the audio that was cut off, the search bounds are set to -1 and 879 and a translation search is carried out, which extracts 878 watermark numbers, from 0 to 877, exactly the missing numbers. Number 878 cannot be extracted because it was destroyed by the cut.

Test three:

A middle portion of the audio file water1.wav, from 1 minute 30 seconds to 1 minute 32 seconds, is cut out; the resulting audio data remain3.wav is shown in Fig. 5.

Fig. 5 shows that 3171 numbers in total are extracted from the remaining audio file, and that the cut is bounded by the numbers 1321 and 1343. The extracted numbers run from 0 to 1321 and from 1343 to 3191. For the audio that was cut out, a translation search with search bounds 1321 and 1343 extracts 21 numbers, from 1322 to 1342, exactly the missing numbers.

Test results for multiple breakpoints:

The original audio was recorded in mono with 8-bit quantization precision and a sampling frequency of 22050 Hz, with a length of about 3 minutes 45 seconds, and was embedded in real time with an embedding strength of 30. The resulting audio water2.wav is shown in Fig. 6.

Fig. 6 shows that a total of 3384 watermarks were embedded in the audio file, with numbers ranging from 0 to 3383.

Three segments of the audio file water2.wav are cut out: from 2 minutes 25 seconds to 2 minutes 27 seconds, from 1 minute 45 seconds to 1 minute 46 seconds, and from 1 minute 19 seconds to 1 minute 22 seconds. The resulting audio data remain4.wav is shown in Fig. 7.

In Fig. 7, watermark extraction from the cut audio data reveals 3 breakpoints, at (1140, 1232), (1610, 1645) and (2202, 2253); 3209 watermarks are extracted in total, and the last number is 3383. The three cut-out segments are then examined, each with its breakpoint pair as the bounds of the translation search. The watermark ranges extracted from the three segments are 1142-1230 (89 in total), 1612-1642 (31 in total) and 2204-2251 (48 in total), which are the missing numbers.
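The figures reported for this test can be cross-checked with simple arithmetic. The sketch below assumes that every number strictly between a breakpoint pair is absent from the remaining file and that nothing else was lost; the variable and function names are illustrative.

```python
def lost_between(breakpoints):
    """Count the numbers strictly between each (before, after) pair."""
    return [after - before - 1 for before, after in breakpoints]


total_embedded = 3384
breakpoints = [(1140, 1232), (1610, 1645), (2202, 2253)]

lost = lost_between(breakpoints)            # numbers missing at each cut
extracted = total_embedded - sum(lost)      # numbers left in remain4.wav

recovered_ranges = [(1142, 1230), (1612, 1642), (2204, 2251)]
recovered = [high - low + 1 for low, high in recovered_ranges]

print(lost, extracted, recovered)   # [91, 34, 50] 3209 [89, 31, 48]
```

The extracted total of 3209 agrees with the reported value. In each cut-out segment the recovered count is slightly smaller than the number of missing values, which is consistent with the watermarks at the edges of a cut being damaged.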

The experimental results show that when audio data has been cut, the discontinuous watermark numbers containing the breakpoints can still be extracted from the remaining audio data, which provides an important basis for locating the breakpoints.

Claims

1. A digital audio tamper-resistance method, characterized in that the method is divided into two steps, embedding the watermark and synchronization code and detecting them:

The watermark and synchronization code are embedded as follows: 1) Recording: a sampling rate and quantization precision are chosen for recording. For ordinary speech, a 22.1 kHz sampling rate and 8-bit quantization precision are chosen; for high-definition audio, a 44.1 kHz sampling rate and 16-bit quantization precision are chosen. The analog audio signal is converted to a digital signal by A/D conversion. 2) Real-time embedding of the watermark and synchronization code: during recording, using a watermarking algorithm based on the wavelet transform, the digital audio signal is divided into segments, a wavelet transform with a wavelet from the biorthogonal family is applied to each segment, and one synchronization code and one watermark are then embedded in each audio segment. The watermarks embedded throughout the audio data form a continuous number sequence, and the digital audio signal with the synchronization codes and watermarks embedded is saved to the computer's disk;

The watermark is detected as follows: 1) Watermark detection under normal conditions: the audio data to be examined is divided into segments no shorter than the segment length used at embedding, so that every segment contains at least one synchronization code. Each segment is transformed with the same wavelet and the same number of decomposition levels as in the embedding process, the synchronization code is searched for and extracted from the resulting low-frequency coefficients, and the watermark is finally extracted according to the position of the synchronization code. 2) Tamper determination, for the case where one segment of the audio data has been cut: the data is likewise segmented and wavelet transformed, and the watermark is extracted from the wavelet coefficients. When the extracted watermarks are discontinuous, a breakpoint is considered to have occurred in the audio data. When only one segment has been cut, the data after the breakpoint is complete; this fact is used to determine the breakpoint position and thereby extract all remaining watermarks. 3) For the case where several segments of the audio data have been cut, the translation property of the wavelet transform is used to resynchronize the watermark after each breakpoint and thereby extract all remaining watermarks.