CN103366753A - Method for detecting double compression of Moving Picture Experts Group Audio Layer-3 (MP3) audio at the same bit rate

Info

Publication number: CN103366753A
Application number: CN2013102707188A
Authority: CN
Inventors: 王让定; 马朋飞; 严迪群; 金超
Original assignee: Ningbo University
Current assignee: Ningbo University
Priority date: 2013-06-28
Filing date: 2013-06-28
Publication date: 2013-10-23
Anticipated expiration: 2033-06-28
Also published as: CN103366753B

Abstract

The invention discloses a method for detecting double compression of Moving Picture Experts Group Audio Layer-3 (MP3) audio at the same bit rate. The method comprises the following steps: acquiring the MP3 audio to be detected; decoding it to obtain decoded WAV audio; re-encoding the decoded WAV audio; acquiring one scale factor matrix during decoding and another during encoding, both taken at the positions of the long-window coded frames; extracting the feature values of each MP3 audio to be detected from the two scale factor matrices; and feeding the feature values into a trained support vector machine to determine whether the MP3 audio to be detected has been compressed once or compressed twice at the same bit rate. The method fills the gap in MP3 double-compression detection at the same bit rate and overcomes the weakness of audio double-compression detection at higher bit rates; because it relies on two scale factor matrices, its computational complexity is low.

Description

A method for detecting MP3 audio double compression at the same bit rate

Technical field

The present invention relates to audio double-compression techniques in multimedia digital forensics, and in particular to a method for detecting MP3 audio double compression at the same bit rate.

Background technology

With the rapid development of multimedia information technology, people obtain all kinds of multimedia information from around the world through audio, image, video and other recording devices, and this information increasingly affects their lives. In the digital era, editing tools such as Cool Edit Pro and GoldWave allow many non-professionals to edit and modify multimedia information without leaving obvious traces of tampering. Most people edit and modify multimedia information only to improve it and to enhance its visual or auditory effect. Some, however, maliciously tamper with or even forge multimedia information for ulterior purposes. Once such tampered or forged information is used in formal media, as court evidence, in scientific findings or on government websites, it can seriously harm people's lives, the security of their property and social stability.

Audio is one of the most important branches of multimedia information; it can be downloaded over networks and stored as electronic records. More and more people now record with mobile terminal devices and use the recordings as court evidence or in news media. The accuracy of such audio largely affects people's lives and social stability, so audio forensics is of great significance. MP3 is the most popular audio format on today's networks, and determining whether an MP3 audio has been tampered with is a problem that urgently needs solving. To tamper with an MP3 audio, it must first be decompressed into WAV audio; the content of the WAV audio is then modified in the time domain by insertion, deletion, cutting, splicing or other operations; finally the WAV audio is compressed into MP3 again. A tampered MP3 audio has therefore necessarily undergone MP3 double compression. By detecting whether an MP3 audio has undergone double compression, researchers can infer whether it has been tampered with, or estimate where the tampering occurred. Many scholars have studied double-compression detection for images and video extensively, but little work addresses double-compression detection for MP3 audio, and even less addresses double compression at the same bit rate.

Shi Yunqing et al. observed that the number of MDCT coefficients with values -1 and 1 in the second compression is smaller than the number in the first compression, and used Benford's law together with a vertical shift to detect whether an MP3 audio has undergone double compression. Their method detects double compression well when the first compression bit rate is RB1 and the second is RB2 with RB2 > RB1, but performs poorly when RB2 < RB1 or when the two bit rates are the same (RB2 = RB1). Liu Qingzhong et al. distinguished single from double compression by the number of non-zero MDCT coefficients and the histogram distributions of zero and non-zero coefficients. Their method remedies the weakness of double-compression detection for RB2 < RB1, but reports no results for MP3 double compression at the same bit rate. When an MP3 audio is doubly compressed at the same bit rate, the encoding parameters of the first and second compressions are identical, so the redundant information removed by the two compressions is very similar, which makes detection much harder.

Summary of the invention

The technical problem to be solved by the invention is to provide a method for detecting MP3 audio double compression at the same bit rate that achieves a high detection rate, has low computational complexity, can detect MP3 audio of different styles, and remains compatible with the audio compression coding standard.

The technical solution adopted by the invention to solve the above technical problem is a method for detecting MP3 audio double compression at the same bit rate, characterized by comprising the following steps:

1. Select N uncompressed WAV audio samples of different styles, encode each with an MP3 encoder to obtain its singly compressed MP3 audio, and then decompress each singly compressed MP3 audio into WAV audio with an MP3 decoder, where N ≥ 10;

2. Encode each WAV audio decompressed in step 1 with the MP3 encoder used in step 1 to obtain the doubly compressed MP3 audio corresponding to each uncompressed WAV audio sample. Take each singly compressed MP3 audio as a positive sample and each doubly compressed MP3 audio as a negative sample; all positive and negative samples together form the training sample set, so every subsample in the set is either a positive or a negative sample. Then extract all feature values of each subsample in the training sample set and select 250 of them in order, where the encoding bit rate of the MP3 encoder in this step is the same as that of the MP3 encoder in step 1;

3. Train a support vector machine on the 250 feature values extracted for each subsample in the training sample set, obtaining the trained support vector machine;

4. Select M MP3 audios from a network or from electronic records, each being either a singly compressed MP3 audio or an MP3 audio doubly compressed at the same bit rate, and take each selected MP3 audio as an MP3 audio sample to be detected, where M ≥ 1;

5. Take the MP3 audio sample to be detected that is currently pending as the current sample;

6. Extract all feature values of the current sample, select 250 of them in order, and normalize the 250 selected feature values to obtain 250 normalized feature values;

7. Use the trained support vector machine to classify the 250 normalized feature values of the current sample, determining whether the current sample is a singly compressed MP3 audio or an MP3 audio doubly compressed at the same bit rate;

8. Take the next pending MP3 audio sample to be detected as the current sample and return to step 6, until all M MP3 audio samples to be detected have been processed.
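The sample preparation in steps 1 and 2 can be sketched as follows. This is a minimal sketch, assuming the LAME command-line tool (the embodiment names lame3.99.5) is installed as `lame` and using its `-b` (bit rate in kbps) and `--decode` options. The helper names `encode_cmd`, `decode_cmd` and `double_compress_cmds` and the file-naming scheme are hypothetical and do not come from the patent. The functions only build the command lines; executing them is left to the caller.

```python
from pathlib import Path


def encode_cmd(wav, mp3, bitrate_kbps):
    """Command line that encodes a WAV file to MP3 at the given bit rate."""
    return ["lame", "-b", str(bitrate_kbps), str(wav), str(mp3)]


def decode_cmd(mp3, wav):
    """Command line that decodes an MP3 file back to WAV."""
    return ["lame", "--decode", str(mp3), str(wav)]


def double_compress_cmds(wav, workdir, bitrate_kbps=128):
    """Commands that turn one uncompressed WAV into a singly compressed MP3
    (positive sample) and a doubly compressed MP3 (negative sample), both at
    the same bit rate. File names are hypothetical."""
    workdir = Path(workdir)
    stem = Path(wav).stem
    single = workdir / f"{stem}_single.mp3"
    decoded = workdir / f"{stem}_decoded.wav"
    double = workdir / f"{stem}_double.mp3"
    return [
        encode_cmd(wav, single, bitrate_kbps),      # step 1: first compression
        decode_cmd(single, decoded),                # step 1: decompress to WAV
        encode_cmd(decoded, double, bitrate_kbps),  # step 2: second compression
    ]


if __name__ == "__main__":
    for cmd in double_compress_cmds("song.wav", "out", 128):
        print(" ".join(cmd))
        # To actually run a command: subprocess.run(cmd, check=True)
```

Keeping the bit rate as a single argument makes it hard to violate the requirement that both compressions use the same bit rate.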

In step 2, the 250 feature values of each subsample are extracted as follows:

2.-1, Define the subsample currently pending as the current subsample;

2.-2, Define the frames in the current subsample that were processed with the long-window coding mode as long-window coded frames. Decode every frame of the current subsample with the MP3 decoder used in step 1 to obtain the decoded WAV audio of the current subsample. During decoding, extract the position of each long-window coded frame in the current subsample and, according to these positions, obtain the scale factor matrix at the positions of all long-window coded frames in the current subsample, denoted sf_a;

2.-3, Encode the decoded WAV audio obtained in step 2.-2 with the MP3 encoder used in step 1. During encoding, according to the positions of the long-window coded frames obtained in step 2.-2, obtain the scale factor matrix at the positions of all long-window coded frames of the decoded WAV audio, denoted sf_b, where the encoding bit rate of the MP3 encoder in this step is the same as that of the MP3 encoder used in step 1;

2.-4, From sf_a and sf_b, obtain the difference matrix of all long-window coded frames in the current subsample, denoted Δsf, Δsf = sf_a - sf_b. Then, from Δsf, obtain the mean of the difference matrix of each long-window coded frame in the current subsample; the mean for the q-th long-window coded frame is denoted

\overline{\Delta sf_{frame}(q)}, \overline{\Delta sf_{frame}(q)} = \frac{1}{nC} \sum_{j = qC - C + 1}^{qC} \sum_{x = 1}^{n} \Delta sf_{jx},

where Δsf_{jx} denotes the element in row j and column x of Δsf, Σ denotes summation, x = 1, 2, ..., n, n = 21, q = 1, 2, ..., I, I is the number of long-window coded frames in the current subsample, j = qC-C+1, qC-C+2, ..., qC, and C is the channel number of the current subsample: C = 2 if the current subsample is mono audio, and C = 4 if it is stereo audio;

2.-5, Take the means of the difference matrices of all long-window coded frames in the current subsample as all the feature values of the current subsample;

2.-6, Take the first non-zero feature value of the current subsample as the initial feature value and, starting from it, select 250 consecutive feature values; express the 250 selected feature values as a one-dimensional row vector, denoted feature;

2.-7, Take the next pending subsample as the current subsample and return to step 2.-2, until all feature values of every subsample have been extracted and the 250 feature values of each subsample have been obtained.
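The feature computation in steps 2.-4 to 2.-6 can be sketched in a few lines. This is a minimal sketch, not the patented implementation: it assumes sf_a and sf_b are already available as lists of rows, with C consecutive rows per long-window coded frame and n scale factors per row. The function names `frame_means` and `select_features` are hypothetical, and raising an error when fewer than 250 values remain is an assumption, since the text does not say what happens in that case.

```python
def frame_means(sf_a, sf_b, C):
    """Mean of the difference matrix (sf_a - sf_b) for each long-window frame.

    Assumption: rows are grouped in blocks of C consecutive rows per frame.
    """
    if len(sf_a) != len(sf_b) or len(sf_a) % C != 0:
        raise ValueError("sf_a and sf_b must both hold C rows per frame")
    means = []
    for start in range(0, len(sf_a), C):
        total, count = 0, 0
        for row_a, row_b in zip(sf_a[start:start + C], sf_b[start:start + C]):
            for a, b in zip(row_a, row_b):
                total += a - b  # one element of the difference matrix
                count += 1
        means.append(total / count)  # count equals n * C
    return means


def select_features(means, length=250):
    """Take `length` consecutive values starting at the first non-zero one."""
    for start, value in enumerate(means):
        if value != 0:
            break
    else:
        raise ValueError("all feature values are zero")
    window = means[start:start + length]
    if len(window) < length:
        # Assumption: the text does not cover this case, so fail loudly.
        raise ValueError("too few feature values after the first non-zero one")
    return window


if __name__ == "__main__":
    # Toy data: three mono frames (C = 2) with n = 2 scale factors per row.
    sf_a = [[2, 2], [2, 2], [1, 1], [1, 1], [3, 5], [3, 5]]
    sf_b = [[2, 2], [2, 2], [0, 1], [1, 1], [1, 1], [1, 1]]
    means = frame_means(sf_a, sf_b, C=2)
    print(means, select_features(means, length=2))
```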

Step 6 proceeds as follows:

6.-1, Define the frames in the current sample that were processed with the long-window coding mode as long-window coded frames. Decode every frame of the current sample with the MP3 decoder used in step 1 to obtain the decoded WAV audio of the current sample. During decoding, extract the bit rate of the current sample and the position of each long-window coded frame in the current sample and, according to these positions, obtain the scale factor matrix at the positions of all long-window coded frames in the current sample, denoted sf_a';

6.-2, Encode the decoded WAV audio obtained in step 6.-1 with the MP3 encoder used in step 1. During encoding, according to the positions of the long-window coded frames obtained in step 6.-1, obtain the scale factor matrix at the positions of all long-window coded frames of the decoded WAV audio, denoted sf_b', where the encoding bit rate of the MP3 encoder in this step is the same as the bit rate of the current sample;

6.-3, From sf_a' and sf_b', obtain the difference matrix of all long-window coded frames in the current sample, denoted Δsf', Δsf' = sf_a' - sf_b'. Then, from Δsf', obtain the mean of the difference matrix of each long-window coded frame in the current sample; the mean for the q'-th long-window coded frame is denoted

\overline{\Delta sf'_{frame}(q')}, \overline{\Delta sf'_{frame}(q')} = \frac{1}{nC'} \sum_{j' = q'C' - C' + 1}^{q'C'} \sum_{x' = 1}^{n} \Delta sf'_{j'x'},

where Δsf'_{j'x'} denotes the element in row j' and column x' of Δsf', Σ denotes summation, x' = 1, 2, ..., n, n = 21, q' = 1, 2, ..., I', I' is the number of long-window coded frames in the current sample, j' = q'C'-C'+1, q'C'-C'+2, ..., q'C', and C' is the channel number of the current sample: C' = 2 if the current sample is mono audio, and C' = 4 if it is stereo audio;

6.-4, Take the means of the difference matrices of all long-window coded frames in the current sample as all the feature values of the current sample;

6.-5, Take the first non-zero feature value of the current sample as the initial feature value and, starting from it, select 250 consecutive feature values; express the 250 selected feature values as a one-dimensional row vector, denoted fea;

6.-6, From the elements of fea, compute the mean of the 250 selected feature values of the current sample, denoted mean',

mean' = \frac{1}{250} \sum_{k = 1}^{250} fea(k),

where fea(k) denotes the k-th element of fea, 1 ≤ k ≤ 250;

6.-7, From the mean mean' of the 250 selected feature values of the current sample, compute the standard variance of the one-dimensional row vector fea, denoted var',

where Σ denotes summation;

6.-8, From mean' and var', obtain the normalized values of the 250 selected feature values; the k-th normalized feature value is denoted data'(k),

data'(k) = \frac{fea(k) - mean'}{var'}.

In step 7, the trained support vector machine classifies the 250 normalized feature values of the current sample as follows. Using the support vector machine trained in step 3, evaluate the decision function

sgn\left(\sum_{i = 1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}\right)

on the 250 normalized feature values of the current sample. When its value is 1, the current sample is a positive sample and is judged to be a singly compressed MP3 audio; when its value is -1, the current sample is a negative sample and is judged to be an MP3 audio doubly compressed at the same bit rate. Here sgn() is the sign function: its value is 1 when

\sum_{i = 1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}

is greater than 0, and -1 when this sum is less than 0. n' is the total number of samples in the training sample set, n' = 2N, i = 1, 2, ..., n'; a_i^* is the Lagrange multiplier of the i-th subsample in the training sample set; y_i is the label of the i-th subsample, with y_i = 1 when the i-th subsample is a positive sample and y_i = -1 when it is a negative sample; X_i is the one-dimensional row vector feature of the i-th subsample used to train the support vector machine; and X is the feature vector P' of the current sample,

P' = [data'(1) data'(2) ... data'(250)]. K(X, X_i) is the kernel function: it maps X from the low-dimensional space to a high-dimensional space and computes the inner product of X and X_i in that space, while keeping the amount of computation on the order of X × X_i. b^* is the offset of the optimal hyperplane.
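The decision rule above can be illustrated with a small, self-contained sketch. The text does not name a specific kernel, so the RBF kernel and its `gamma` parameter here are assumptions, as are the function names `rbf_kernel` and `decide` and the toy multipliers in the usage example. Mapping a score of exactly 0 to -1 is also an arbitrary choice, since the text defines only the cases greater than 0 and less than 0.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Assumed kernel K(X, X_i) = exp(-gamma * ||X - X_i||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decide(x, train_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Evaluate sgn(sum_i a_i* y_i K(X, X_i) + b*).

    Returns 1 (singly compressed) or -1 (doubly compressed at the same
    bit rate). A score of exactly 0 is mapped to -1 by assumption.
    """
    score = bias
    for alpha, y, xi in zip(alphas, labels, train_vectors):
        score += alpha * y * kernel(x, xi)
    return 1 if score > 0 else -1


if __name__ == "__main__":
    # Toy two-dimensional example; real feature vectors have 250 entries.
    train_vectors = [[0.0, 0.0], [2.0, 2.0]]
    labels = [1, -1]
    alphas = [1.0, 1.0]  # hypothetical Lagrange multipliers
    print(decide([0.1, 0.0], train_vectors, labels, alphas, bias=0.0))
    print(decide([1.9, 2.0], train_vectors, labels, alphas, bias=0.0))
```

In practice the multipliers and offset would come from an SVM training library; only training samples with non-zero multipliers contribute to the sum.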

After the 250 feature values are selected in step 2.-6, they are normalized, and the 250 normalized feature values are used to train the support vector machine. The 250 selected feature values are normalized as follows:

2.-6a, Compute the mean of the 250 selected feature values of the current subsample, denoted mean,

mean = \frac{1}{250} \sum_{k = 1}^{250} feature(k),

where feature(k) denotes the k-th of the 250 selected feature values, 1 ≤ k ≤ 250;

2.-6b, From the mean mean of the 250 selected feature values of the current subsample, compute the standard variance of the one-dimensional row vector feature, denoted var,

where Σ denotes summation;

2.-6c, From mean and var, obtain the normalized values of the 250 selected feature values; the k-th normalized feature value of the i-th subsample is denoted data_i(k),

data_{i}(k) = \frac{feature(k) - mean}{var}.
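The normalization in steps 2.-6a to 2.-6c can be sketched as below. The formula for var is not reproduced in this text, so the sketch assumes var is the population standard deviation of the selected values; this is an assumption, not a statement of the patented formula. The function name `normalize` is hypothetical, and rejecting a zero spread is also an assumption.

```python
import math


def normalize(feature):
    """Return (feature(k) - mean) / var for every selected feature value.

    Assumption: var is the population standard deviation of the vector.
    """
    mean = sum(feature) / len(feature)
    var = math.sqrt(sum((v - mean) ** 2 for v in feature) / len(feature))
    if var == 0:
        # Assumption: a constant vector cannot be normalized this way.
        raise ValueError("all feature values are identical")
    return [(v - mean) / var for v in feature]


if __name__ == "__main__":
    # Short toy vector; the method itself uses 250 values.
    print(normalize([1.0, 3.0]))
```

The same function would apply unchanged to the vector fea of a sample to be detected in steps 6.-6 to 6.-8.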

In step 1, the encoding bit rate of the MP3 encoder ranges from 56 kbps to 192 kbps.

Compared with the prior art, the invention has the following advantages:

1. The MP3 audio sample to be detected is first decoded, yielding its decoded WAV audio and the scale factor matrix at decoding time. The decoded WAV audio is then encoded, yielding the scale factor matrix at encoding time. The feature values of the MP3 audio sample to be detected are obtained from these two scale factor matrices and normalized, and a support vector machine classifies the normalized feature values to determine whether the sample is a singly compressed or a doubly compressed MP3 audio sample. These steps not only accomplish MP3 double-compression detection at the same bit rate but also achieve high detection accuracy.

2. The method introduces the two scale factor matrices above and uses the scale factors as its basic features. Compared with methods that use MDCT coefficients as basic features, the method requires far less data, which greatly reduces its computational complexity.
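The size of this reduction can be illustrated with rough per-frame arithmetic. This is a sketch under assumptions that are not stated in the patent: an MPEG-1 Layer III frame is taken to hold 2 granules per channel with 576 MDCT frequency lines per granule, and the method's n = 21 scale factors are counted once per granule and channel, which matches C = 2 rows for mono and C = 4 rows for stereo. The constant and function names are hypothetical.

```python
MDCT_LINES_PER_GRANULE = 576  # assumed MPEG-1 Layer III value, not from the patent
SCALE_FACTORS_PER_ROW = 21    # n = 21 in the method (long-window frames)
GRANULES_PER_FRAME = 2        # assumed MPEG-1 Layer III value, not from the patent


def values_per_frame(values_per_granule_channel, channels):
    """Number of values one frame contributes for the given channel count."""
    return values_per_granule_channel * GRANULES_PER_FRAME * channels


if __name__ == "__main__":
    for name, channels in (("mono", 1), ("stereo", 2)):
        mdct = values_per_frame(MDCT_LINES_PER_GRANULE, channels)
        sf = values_per_frame(SCALE_FACTORS_PER_ROW, channels)
        print(f"{name}: {mdct} MDCT values vs {sf} scale factors per frame")
```

Under these assumptions each frame contributes roughly 27 times fewer scale factors than MDCT coefficients, before the per-frame averaging reduces each frame to a single feature value.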

Description of drawings

Fig. 1 is the overall implementation block diagram of the method of the invention;

Fig. 2 is a schematic diagram of the MP3 audio compression coding process;

Fig. 3 is a schematic diagram of the detailed detection process of the method of the invention.

Embodiment

The invention is described in further detail below with reference to the accompanying drawings and an embodiment.

The invention proposes a method for detecting MP3 audio double compression at the same bit rate. Its flow block diagram is shown in Fig. 1, and it comprises the following steps:

1. Select N uncompressed WAV audio samples of different styles, excluding any sample that is entirely silent; the samples may be drawn from styles such as blues, pop, classical, country and folk. Encode each uncompressed WAV audio sample with an MP3 encoder to obtain its singly compressed MP3 audio, then decompress each singly compressed MP3 audio into WAV audio with an MP3 decoder. Taking into account the precision of the support vector machine trained in the subsequent steps and the complexity of the overall algorithm, N ≥ 10 is used. In this embodiment, the MP3 encoder may encode the selected uncompressed WAV audio samples at an encoding bit rate of 56 to 192 kbps, for example 56, 64, 80, 96, 112, 128, 160 or 192 kbps. The MP3 encoder uses the same encoding bit rate in step 1, step 2 and step 2.-4, and the MP3 encoder and MP3 decoder in all steps are the currently popular lame3.99.5; the experimenter switches between encoding and decoding simply by setting different parameters in lame3.99.5. Fig. 2 shows a schematic of the MP3 encoder, which converts the original uncompressed WAV audio into MP3 audio mainly through Fourier transform, discrete cosine transform, quantization and entropy coding.

2. Encode each WAV audio decompressed in step 1 with the MP3 encoder used in step 1 to obtain the doubly compressed MP3 audio corresponding to each uncompressed WAV audio sample. Take each singly compressed MP3 audio as a positive sample with label +1 and each doubly compressed MP3 audio as a negative sample with label -1; all positive and negative samples together form the training sample set, so every subsample in the set is either a positive or a negative sample. Then extract all feature values of each subsample in the training sample set and select 250 of them in order. The 250 feature values of each subsample are extracted as follows:

2.-1, Define the subsample currently pending as the current subsample.

2.-2, Define the frames in the current subsample that were processed with the long-window coding mode as long-window coded frames. Decode every frame of the current subsample with the MP3 decoder used in step 1 to obtain the decoded WAV audio of the current subsample. During decoding, extract the position of each long-window coded frame in the current subsample and combine the positions of all long-window coded frames into a location matrix, denoted array,

array = [L_1 L_2 ... L_I]^T, where [L_1 L_2 ... L_I]^T denotes the transpose of [L_1 L_2 ... L_I]. Then, according to the position of each long-window coded frame in the current subsample, obtain the scale factor matrix at the position of each long-window coded frame, denoted sf_s. If the current subsample is mono audio, then

sf_{s} = \begin{bmatrix} S_{(2q-1,1)} & S_{(2q-1,2)} & \cdots & S_{(2q-1,n)} \\ S_{(2q,1)} & S_{(2q,2)} & \cdots & S_{(2q,n)} \end{bmatrix};

If the current subsample is two-channel audio, then

sf_{s} = \begin{bmatrix} S_{(4q-3,1)} & S_{(4q-3,2)} & \cdots & S_{(4q-3,n)} \\ S_{(4q-2,1)} & S_{(4q-2,2)} & \cdots & S_{(4q-2,n)} \\ S_{(4q-1,1)} & S_{(4q-1,2)} & \cdots & S_{(4q-1,n)} \\ S_{(4q,1)} & S_{(4q,2)} & \cdots & S_{(4q,n)} \end{bmatrix},

where q = 1, 2, ..., I, I is the number of long-window coded frames in the current subsample, and n = 21.

2.-3, From sf_s and the channel number of the current subsample, obtain the scale factor matrix of the current subsample in the decoding process, denoted sf_a. sf_a is formed by stacking the scale factor matrices sf_s of all long-window coded frames and is expressed as:

sf_{a} = \begin{bmatrix} S_{(1,1)} & S_{(1,2)} & \cdots & S_{(1,n)} \\ S_{(2,1)} & S_{(2,2)} & \cdots & S_{(2,n)} \\ \vdots & \vdots & & \vdots \\ S_{(m,1)} & S_{(m,2)} & \cdots & S_{(m,n)} \end{bmatrix},

where m = 2I if the current subsample is mono audio, and m = 4I if it is two-channel audio.

2.-4, Encode the decoded WAV audio obtained in step 2.-2 with the MP3 encoder used in step 1. During encoding, according to the location matrix array obtained in step 2.-2, obtain the scale factor matrix at the positions of all long-window coded frames of the decoded WAV audio, denoted sf_b, which has the same m × n size as sf_a.

2.-5, From sf_a and sf_b, obtain the difference matrix of all long-window coded frames in the current subsample, denoted Δsf,

\Delta sf = sf_{a} - sf_{b}.

Then, from Δsf, obtain the mean of the difference matrix of each long-window coded frame, denoted

\overline{\Delta sf_{frame}(q)}, \overline{\Delta sf_{frame}(q)} = \frac{1}{nC} \sum_{j = qC - C + 1}^{qC} \sum_{x = 1}^{n} \Delta sf_{jx}.

If the current subsample is mono audio, the difference matrix of the q-th long-window coded frame is

\begin{bmatrix} \Delta sf_{(2q-1,1)} & \Delta sf_{(2q-1,2)} & \cdots & \Delta sf_{(2q-1,n)} \\ \Delta sf_{(2q,1)} & \Delta sf_{(2q,2)} & \cdots & \Delta sf_{(2q,n)} \end{bmatrix};

If the current subsample is two-channel audio, the difference matrix of the q-th long-window coded frame is

\begin{bmatrix} \Delta sf_{(4q-3,1)} & \Delta sf_{(4q-3,2)} & \cdots & \Delta sf_{(4q-3,n)} \\ \Delta sf_{(4q-2,1)} & \Delta sf_{(4q-2,2)} & \cdots & \Delta sf_{(4q-2,n)} \\ \Delta sf_{(4q-1,1)} & \Delta sf_{(4q-1,2)} & \cdots & \Delta sf_{(4q-1,n)} \\ \Delta sf_{(4q,1)} & \Delta sf_{(4q,2)} & \cdots & \Delta sf_{(4q,n)} \end{bmatrix}.

Introducing the scale factor matrices sf_a and sf_b makes it possible to detect doubly compressed MP3 audio at the same bit rate accurately. Here

\overline{\Delta sf_{frame}(q)} denotes the mean of the difference matrix of the q-th long-window coded frame in the current subsample, Δsf_{jx} denotes the element in row j and column x of Δsf, x = 1, 2, ..., n, n = 21, q = 1, 2, ..., I, j = qC-C+1, qC-C+2, ..., qC, Σ denotes summation, and C is the channel number of the current subsample: C = 2 if the current subsample is mono audio, and C = 4 if it is stereo audio.

2.-6, Take the obtained means \overline{\Delta sf_{frame}(q)}, q = 1, 2, ..., I, as all the feature values of the current subsample.

2.-7, Take the first non-zero feature value of the current subsample as the first feature value and, starting from it, select 250 consecutive feature values; express the 250 selected feature values as a one-dimensional row vector, denoted feature. Then compute the mean of the 250 selected feature values of the current subsample, denoted mean,

mean = \frac{1}{250} \sum_{k = 1}^{250} feature(k),

where feature(k) denotes the k-th of the 250 selected feature values, 1 ≤ k ≤ 250.

2.-8, From the mean mean of the 250 selected feature values of the current subsample, compute the standard variance of the one-dimensional row vector feature, denoted var,

where Σ denotes summation.

2.-9, From mean and var, obtain the normalized values of the 250 selected feature values. The k-th normalized feature value of the i-th subsample is denoted data_i(k),

data_{i}(k) = \frac{feature(k) - mean}{var}.

Using the normalized feature values in sample training makes the detection result more accurate.

2.-10, Take the next pending subsample as the current subsample and return to step 2.-2, until all feature values of every subsample have been extracted and the 250 normalized feature values of each subsample have been obtained.
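The way the location matrix array ties the decoder-side and encoder-side scale factors together (steps 2.-2 to 2.-4) can be sketched as follows. This is a minimal sketch with hypothetical inputs: `decoder_blocks[f]` and `encoder_blocks[f]` are assumed to hold the C × n block of scale factors of frame f, and `is_long[f]` is assumed to flag whether frame f was coded with the long window in the MP3 being analysed. Frame positions are zero-based here, and obtaining these inputs from a real MP3 decoder or encoder is outside the sketch.

```python
def long_window_positions(is_long):
    """Location matrix: positions of the long-window coded frames."""
    return [f for f, flag in enumerate(is_long) if flag]


def stack_blocks(frame_blocks, positions):
    """Stack the per-frame C x n blocks at the given positions into one
    matrix with C rows per selected frame."""
    return [list(row) for f in positions for row in frame_blocks[f]]


if __name__ == "__main__":
    # Toy data: three mono frames (C = 2 rows each), n = 2 scale factors.
    decoder_blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 9], [9, 9]]]
    encoder_blocks = [[[1, 2], [3, 3]], [[0, 0], [0, 0]], [[8, 9], [9, 9]]]
    is_long = [True, False, True]

    array = long_window_positions(is_long)          # positions from decoding
    sf_a = stack_blocks(decoder_blocks, array)      # decoder-side matrix
    sf_b = stack_blocks(encoder_blocks, array)      # same positions, encoder side
    print(array, sf_a, sf_b)
```

Reusing the same positions for both matrices keeps sf_a and sf_b the same size, so their element-wise difference is well defined.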

3. Train a support vector machine on the 250 normalized feature values extracted for each subsample in the training sample set, obtaining the trained support vector machine. In this embodiment, the more uncompressed WAV audio samples are selected, the better in general: more samples give more precise model parameters in the trained support vector machine and thus higher detection accuracy, but they also increase the running time of the method. Depending on the required detection accuracy and the acceptable algorithm complexity, tens to several thousand WAV audio samples can therefore be collected from legitimate CDs and laboratory recordings. Encoding the WAV audio samples yields the singly compressed MP3 audio samples, and decoding and re-encoding the singly compressed MP3 audio samples yields the doubly compressed MP3 audio samples. The number of singly compressed MP3 audio samples and the number of doubly compressed MP3 audio samples each equal the number of selected WAV audio samples.

4. Select M MP3 audios from a network or from electronic records, each being either a singly compressed MP3 audio or an MP3 audio doubly compressed at the same bit rate. Exclude any MP3 audio that is entirely silent and take each remaining MP3 audio as an MP3 audio sample to be detected. Because the M selected MP3 audios are used only for detection, their number affects neither the precision nor the speed of the algorithm, so M ≥ 1 suffices.

5. Take the MP3 audio sample to be detected that is currently pending as the current sample.

6. Extract all feature values of the current sample, select 250 of them in order, and normalize the 250 selected feature values to obtain 250 normalized feature values. The detailed process is:

6.-1. Define a frame of the current sample that is processed with the long-window coding mode as a long-window coded frame. Then decode every frame of the current sample with the MP3 decoder used in step 1, obtaining the decoded WAV audio of the current sample. During decoding, extract the bit rate of the current sample and the position of each long-window coded frame in the current sample, and combine the positions of all long-window coded frames into a position matrix, denoted array', array' = [L_1' L_2' … L_I']^T, where [L_1' L_2' … L_I']^T denotes the transpose of [L_1' L_2' … L_I']. Then, according to the position of each long-window coded frame in the current sample, obtain the scale factor matrix at the position of each long-window coded frame, denoted sf_s'. If the current sample is mono audio, take

sf'_{s} = \begin{bmatrix} S'_{(2q'-1,1)} & S'_{(2q'-1,2)} & \cdots & S'_{(2q'-1,n)} \\ S'_{(2q',1)} & S'_{(2q',2)} & \cdots & S'_{(2q',n)} \end{bmatrix};

If the current sample is two-channel audio, take

sf'_{s} = \begin{bmatrix} S'_{(4q'-3,1)} & S'_{(4q'-3,2)} & \cdots & S'_{(4q'-3,n)} \\ S'_{(4q'-2,1)} & S'_{(4q'-2,2)} & \cdots & S'_{(4q'-2,n)} \\ S'_{(4q'-1,1)} & S'_{(4q'-1,2)} & \cdots & S'_{(4q'-1,n)} \\ S'_{(4q',1)} & S'_{(4q',2)} & \cdots & S'_{(4q',n)} \end{bmatrix},

where n = 21, q' = 1, 2, …, I', and I' denotes the number of long-window coded frames in the current sample.
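The row indexing in the two matrices above can be illustrated with a short sketch. It assumes nothing beyond the indices given in the text (rows 2q'-1 and 2q' for mono, rows 4q'-3 to 4q' for two-channel audio); the function names are illustrative only.

```python
N_BANDS = 21  # n = 21 scale-factor columns per row, as stated in the text


def frame_rows(q: int, stereo: bool) -> list:
    """1-based row indices occupied by long-window frame q (q = 1..I')."""
    rows_per_frame = 4 if stereo else 2
    last = rows_per_frame * q
    return list(range(last - rows_per_frame + 1, last + 1))


def frame_block(stacked, q: int, stereo: bool):
    """Return the rows of frame q from a stacked list of rows (each of length N_BANDS)."""
    return [stacked[r - 1] for r in frame_rows(q, stereo)]
```

For example, `frame_rows(3, False)` gives rows 5 and 6, and `frame_rows(2, True)` gives rows 5 to 8.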

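6.-2. According to sf_s' and the number of channels of the current sample, obtain the scale factor matrix of the current sample during decoding, denoted sf_a'. sf_a' is composed of the scale factor matrices sf_s' of all long-window coded frames and is expressed as:

sf'_{a} = \begin{bmatrix} S'_{(1,1)} & S'_{(1,2)} & \cdots & S'_{(1,n)} \\ S'_{(2,1)} & S'_{(2,2)} & \cdots & S'_{(2,n)} \\ \vdots & \vdots & & \vdots \\ S'_{(m',1)} & S'_{(m',2)} & \cdots & S'_{(m',n)} \end{bmatrix},

where m' = 2I' if the current sample is mono audio and m' = 4I' if the current sample is two-channel audio.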
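As a minimal sketch of how the per-frame matrices combine into one matrix with m' rows, the following helper stacks the blocks in frame order. The function name and the list-of-lists representation are assumptions made for illustration.

```python
def stack_scale_factors(frame_blocks):
    """Stack per-frame blocks (2 x 21 for mono, 4 x 21 for two-channel) into one matrix."""
    stacked = []
    for block in frame_blocks:
        for row in block:
            if len(row) != 21:
                raise ValueError("each row must hold n = 21 scale factors")
            stacked.append(list(row))
    return stacked
```

Stacking I' mono blocks yields 2I' rows, and stacking I' two-channel blocks yields 4I' rows.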

6.-3. Encode the decoded WAV audio obtained in step 6.-1 with the MP3 encoder used in step 1. During encoding, according to the position matrix array' obtained in step 6.-1, obtain the scale factor matrix at the positions of all long-window coded frames of the decoded WAV audio, denoted sf_b'. The coding bit rate of the MP3 encoder in this step is the same as the bit rate of the current sample.

6.-4. According to sf_a' and sf_b', obtain the difference matrix of all long-window coded frames in the current sample, denoted Δsf'. Then, from Δsf', obtain the mean of the difference matrix of each long-window coded frame, denoted \overline{Δsf'_{frame}(q')},

\overline{\Delta sf'_{frame}(q')} = \frac{1}{nC'} \sum_{j'=q'C'-q'}^{q'C'} \sum_{x'=1}^{n} \Delta sf'_{j'x'} ,

where \overline{Δsf'_{frame}(q')} denotes the mean of the difference matrix of the q'-th long-window coded frame in the current sample, Δsf'_{j'x'} denotes the element in row j' and column x' of Δsf', Σ denotes summation, x' = 1, 2, …, n, n = 21, q' = 1, 2, …, I', j' = q'C'-q', q'C'-q'+1, …, q'C', and C' is the channel number of the current sample (the number of rows that each long-window coded frame occupies in Δsf'): C' = 2 if the current sample is mono audio and C' = 4 if the current sample is stereo audio. Introducing the scale factor matrices sf_a' and sf_b' makes it possible to detect accurately MP3 audio that has been double-compressed at the same bit rate.

6.-5. Take the means \overline{Δsf'_{frame}(q')}, q' = 1, 2, …, I', obtained in step 6.-4 as all the feature values of the current sample.
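The per-frame features of steps 6.-4 and 6.-5 can be sketched as below. Two points are assumptions rather than statements of the text: the difference matrix is taken as the plain element-wise difference sf_a' - sf_b' (its defining formula is not reproduced here), and frame q' is taken to occupy the C' consecutive rows that sf_s' assigns to it. The function name is illustrative.

```python
def frame_mean_differences(sf_a, sf_b, rows_per_frame):
    """Per-frame mean of the element-wise difference of two scale-factor matrices.

    sf_a, sf_b: lists of m' rows, each holding n values (n = 21 in the text).
    rows_per_frame: C' in the text (2 for mono, 4 for stereo).
    """
    if len(sf_a) != len(sf_b) or len(sf_a) % rows_per_frame != 0:
        raise ValueError("matrices must have the same, frame-aligned number of rows")
    # Assumption: the difference matrix is the plain element-wise difference.
    delta = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(sf_a, sf_b)]
    n = len(delta[0])
    means = []
    for start in range(0, len(delta), rows_per_frame):
        block = delta[start:start + rows_per_frame]
        total = sum(sum(row) for row in block)
        means.append(total / (n * rows_per_frame))  # 1 / (n * C') scaling
    return means
```

The returned list has one value per long-window coded frame, which is the sequence from which the 250 feature values are later selected.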

6.-6. Take the first nonzero feature value of the current sample as the first feature value and, starting from it, select 250 consecutive feature values; then express the 250 selected feature values as a one-dimensional row vector, denoted fea.
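The selection rule of this step can be sketched as follows. The text does not say what happens when fewer than 250 values remain after the first nonzero one, so raising an error in that case is an assumption of this sketch, as is the function name.

```python
def select_features(values, count=250):
    """Return `count` consecutive values starting at the first nonzero one."""
    for start, value in enumerate(values):
        if value != 0:
            break
    else:
        # Reached only when no nonzero value exists (including an empty input).
        raise ValueError("all feature values are zero")
    selected = values[start:start + count]
    if len(selected) < count:
        raise ValueError("not enough feature values after the first nonzero one")
    return list(selected)
```

Leading zeros are skipped, but zeros that occur after the first nonzero value are kept, since the selection is consecutive.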

6.-7. From the elements of fea, compute the mean of the 250 feature values selected from the current sample, denoted mean',

mean' = \frac{1}{250} \sum_{k=1}^{250} fea(k),

where fea(k) denotes the k-th element of fea, 1 ≤ k ≤ 250.

6.-8. From the mean mean' of the 250 feature values selected from the current sample, compute the standard deviation of the one-dimensional row vector fea, denoted var', where Σ denotes summation.

6.-9. From mean' and var', obtain the normalized values of the 250 selected feature values, denoted data'(k),

data'(k) = \frac{fea(k) - mean'}{var'},

where data'(k) denotes the k-th normalized feature value.
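The normalization of steps 6.-7 to 6.-9 amounts to subtracting the mean and dividing by the standard deviation. The sketch below assumes the population form of the standard deviation (division by the number of values rather than by one less); the text does not settle that detail, and the function name is illustrative.

```python
import math


def normalize(fea):
    """Return (fea(k) - mean') / var' for every element of fea."""
    count = len(fea)
    mean = sum(fea) / count
    # Assumption: population standard deviation (divide by count, not count - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in fea) / count)
    if std == 0:
        raise ValueError("a constant feature vector cannot be normalized")
    return [(v - mean) / std for v in fea]
```

After normalization the values have zero mean and unit standard deviation, which keeps the feature vectors of different samples on a comparable scale for the support vector machine.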

7. Use the trained support vector machine to test the 250 normalized feature values of the current sample and determine whether the current sample is a single-compressed MP3 audio or an MP3 audio double-compressed at the same bit rate. The detailed process is: using the trained support vector machine obtained in step 3, classify the 250 normalized feature values of the current sample according to

sgn (\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}).

When its value is 1, the current sample is determined to be a positive sample and is judged to be a single-compressed MP3 audio; when its value is -1, the current sample is determined to be a negative sample and is judged to be an MP3 audio double-compressed at the same bit rate. Here sgn() is the sign function: when the value of

\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}

is greater than 0, the value of

sgn (\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*})

is 1, and when the value of

\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}

is less than 0, the value of

sgn (\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*})

is -1. n' denotes the total number of samples in the training sample set, n' = 2N, i = 1, 2, …, n'; a_i^* denotes the Lagrange multiplier of the i-th subsample in the training sample set; y_i denotes the label of the i-th subsample in the training sample set, with y_i = 1 when the i-th subsample is a positive sample and y_i = -1 when it is a negative sample; X_i denotes the row vector feature of the i-th subsample of the training sample set used for training the support vector machine, that is, [feature(1) feature(2) … feature(250)]; X denotes the feature vector P' of the current sample to be detected, P' = [data'(1) data'(2) … data'(250)]; K(X, X_i) is the kernel function, which maps X from the low-dimensional space to a high-dimensional space and computes the inner product of X and X_i in the high-dimensional space while keeping the amount of computation on the order of X × X_i; and b^* denotes the offset of the optimal hyperplane. In a specific implementation, X_i may also be represented by the feature vector P obtained by normalizing the 250 feature values of the i-th subsample in the training sample set,

P = [data_i(1) data_i(2) … data_i(250)].
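The decision rule of step 7 can be written out directly. This is a sketch under stated assumptions: the text does not name the kernel, so a Gaussian (RBF) kernel with a hypothetical `gamma` parameter is used purely for illustration, and a score of exactly zero, which the text leaves undefined, is mapped to -1.

```python
import math


def rbf_kernel(x, xi, gamma=0.1):
    """Assumed kernel: K(X, X_i) = exp(-gamma * ||X - X_i||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def svm_decide(x, train_vectors, alphas, labels, bias, kernel=rbf_kernel):
    """Return +1 (single-compressed) or -1 (double-compressed at the same bit rate)."""
    score = bias + sum(
        a * y * kernel(x, xi)
        for a, y, xi in zip(alphas, labels, train_vectors)
    )
    # The text defines sgn only for nonzero scores; zero is mapped to -1 here.
    return 1 if score > 0 else -1
```

Training subsamples whose Lagrange multiplier is zero contribute nothing to the sum, so in practice only the support vectors need to be stored and evaluated.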

8. Take the next MP3 audio sample to be detected as the current sample and return to step 6, continuing until all M MP3 audio samples to be detected have been processed.

In this embodiment, the MP3 encoder encodes the selected original WAV audio samples at coding bit rates of 56, 64, 80, 96, 112, 128, 160 and 192 kbps in turn. Each bit rate corresponds to one group of experiments, giving 8 groups in total; in each group the audio samples comprise 687 single-compressed audio samples and 687 audio samples double-compressed at the same bit rate. These audio samples can also be obtained from laboratory recordings or from legally purchased CDs. 250 feature values are extracted from each audio sample; 70% of all extracted feature values are used to train the support vector machine and the remainder are used for testing, and the test of each group is carried out 10 times. Finally, the prediction accuracy of each group is calculated as the mean of the 10 test results; the results are shown in Table 1.

Table 1 Double-compression detection results

Bit rate	TP	TN	Accuracy
56kbps	96.35%	89.05%	92.70%
64kbps	98.54%	91.53%	95.04%
80kbps	99.27%	92.70%	95.99%
96kbps	99.37%	96.85%	98.11%
112kbps	98.61%	90.66%	94.64%
128kbps	98.54%	95.62%	97.08%
160kbps	96.60%	94.66%	95.64%
192kbps	93.32%	90.32%	91.82%

Here TP denotes the prediction accuracy for single-compressed audio, TN denotes the prediction accuracy for audio double-compressed at the same bit rate, and Accuracy denotes the average prediction accuracy. Table 1 shows that, across the different coding bit rates, the detection accuracy for double compression at the same bit rate is roughly 92% or higher (from 91.82% to 98.11%); the detection rate is high, and the method remains applicable at higher bit rates.
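Because each group contains equal numbers of single-compressed and double-compressed samples, the Accuracy column should equal the mean of TP and TN. The short check below, with the values copied from Table 1 and an illustrative function name, confirms that the reported figures agree with that mean to within about 0.01 percentage points.

```python
# Rows of Table 1: (bit rate in kbps, TP %, TN %, Accuracy %)
TABLE_1 = [
    (56, 96.35, 89.05, 92.70),
    (64, 98.54, 91.53, 95.04),
    (80, 99.27, 92.70, 95.99),
    (96, 99.37, 96.85, 98.11),
    (112, 98.61, 90.66, 94.64),
    (128, 98.54, 95.62, 97.08),
    (160, 96.60, 94.66, 95.64),
    (192, 93.32, 90.32, 91.82),
]


def max_deviation(rows):
    """Largest gap between the reported accuracy and the mean of TP and TN."""
    return max(abs((tp + tn) / 2 - acc) for _, tp, tn, acc in rows)
```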

In summary, the method of the invention for detecting double compression of MP3 audio at the same bit rate achieves a high detection rate, fills the gap in MP3 audio double-compression detection at the same bit rate, and is also applicable to double-compression detection of MP3 audio at higher bit rates. Compared with current audio double-compression detection algorithms that use MDCT coefficients as features, its computational complexity is lower.

Claims

1. A method for detecting double compression of MP3 audio at the same bit rate, characterized by comprising the following steps:

Step 5. taking the MP3 audio sample to be detected that is currently to be processed as the current sample;

2. The method for detecting double compression of MP3 audio at the same bit rate according to claim 1, characterized in that the detailed process of extracting the 250 feature values from each subsample in step 2 is:

2.-1. defining the subsample currently to be processed as the current subsample;

3. The method for detecting double compression of MP3 audio at the same bit rate according to claim 2, characterized in that the detailed process of step 6 is:

where fea(k) denotes the k-th element of fea, 1 ≤ k ≤ 250;

where Σ denotes summation;

4. The method for detecting double compression of MP3 audio at the same bit rate according to claim 3, characterized in that the detailed process by which the trained support vector machine is used in step 7 to test the 250 normalized feature values of the current sample is: according to

\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}

is greater than 0, the value of

sgn (\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*})

is 1; when the value of

\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*}

is less than 0, the value of

sgn (\sum_{i=1}^{n'} a_{i}^{*} \times y_{i} \times K(X, X_{i}) + b^{*})

5. The method for detecting double compression of MP3 audio at the same bit rate according to claim 4, characterized in that in step 2.-6, after the 250 feature values are selected, the 250 selected feature values are normalized and the 250 normalized feature values are used to train the support vector machine, wherein the specific steps of normalizing the 250 selected feature values are as follows:

where feature(k) denotes the k-th of the 250 selected feature values, 1 ≤ k ≤ 250;

2.-6b. according to the mean, denoted mean, of the 250 feature values selected from the current subsample, computing the standard deviation of the one-dimensional row vector feature, denoted var,

where Σ denotes summation;

data_{i}(k) = \frac{feature(k) - mean}{var} .

6. The method for detecting double compression of MP3 audio at the same bit rate according to claim 4 or 5, characterized in that the coding bit rate of the MP3 encoder in step 1 ranges from 56 kbps to 192 kbps.