CN108806718A - Audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis

Info

Publication number
CN108806718A
CN108806718A CN201810585686.3A CN201810585686A CN108806718A CN 108806718 A CN108806718 A CN 108806718A CN 201810585686 A CN201810585686 A CN 201810585686A CN 108806718 A CN108806718 A CN 108806718A
Authority
CN
China
Prior art keywords
phase
enf
enfc
feature
instantaneous frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810585686.3A
Other languages
Chinese (zh)
Other versions
CN108806718B (en)
Inventor
王志锋
王静
左明章
叶俊民
闵秋莎
田元
夏丹
陈迪
罗恒
宁国勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Normal University
Central China Normal University
Original Assignee
Huazhong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong Normal University filed Critical Huazhong Normal University
Priority to CN201810585686.3A priority Critical patent/CN108806718B/en
Publication of CN108806718A publication Critical patent/CN108806718A/en
Application granted granted Critical
Publication of CN108806718B publication Critical patent/CN108806718B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique

Abstract

The invention belongs to the technical field of digital audio signal processing and discloses an audio authentication method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum. The signal under test is preprocessed; features are then extracted from the ENF signal by analyzing its phase spectrum and instantaneous frequency spectrum, yielding the phase spectrum fluctuation feature and the phase spectrum and frequency spectrum fitting parameter features. Feature fusion is performed with discriminant correlation analysis (DCA), which maximizes the correlation between the different feature sets. Finally, a deep random forest is used to build a model on the fused features, and the trained model is used for transfer learning. The feature-level fusion used by the invention reduces the feature dimension while widening the recognition margin, and training the model with a deep learning method greatly improves the accuracy of passive digital audio tampering detection.

Description

Audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis
Technical Field
The invention belongs to the technical field of digital audio signal processing, and in particular relates to an audio authentication method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum.
Background
At present, the prior art commonly used in the industry is as follows:
With the development of computer and Internet technologies, people increasingly rely on digital multimedia data. Digital multimedia data is easy to store, edit, and distribute, which brings much convenience and enjoyment to daily life. For example, without any professional knowledge, a person can use audio editing software to quickly splice digital audio files, add noise, or apply transformations, and this has become a popular form of entertainment in the Internet era. But technology is a double-edged sword, and the same tools give criminals an opportunity. A criminal can maliciously tamper with digital audio and distribute it widely, and the tampering is difficult to detect by ear alone. If such files are used as recorded evidence in court or to spread fake news, the consequences can be serious, damaging judicial fairness and social trust. Ensuring the authenticity and integrity of digital audio through tampering detection is therefore very important. Digital audio tampering detection is an important branch of digital audio forensics, with wide application in judicial forensics, news integrity, and scientific discovery.
Among current digital audio tampering detection methods, the most effective are those based on the consistency of the electric network frequency. Over the past decade they have become nearly a common standard for digital audio authentication and have attracted attention from academic researchers and law enforcement agencies worldwide. The principle is that if a recording device records audio while connected to the power grid, the audio signal necessarily carries electric network frequency (Electric Network Frequency, ENF) information. The ENF therefore acts as a watermark naturally embedded in the audio signal and can also serve as a timestamp. The ENF component (ENFC) embedded in an audio file can be extracted by band-pass filtering. There are generally two lines of research that use the stability and uniqueness of the ENFC for digital audio tampering detection. The first compares the extracted ENFC with the data in the power supply authority's grid frequency database to determine whether the recording time is consistent with what is claimed. Building and maintaining a large-scale ENF signal database is difficult and costly, and there is currently no ENF database of high practical value. Grigoras first established an ENF reference database in part of Romania. Liu Yuming et al. analyzed the North American power grid monitoring system and proposed a method for establishing a standard grid frequency. The second line extracts certain features from the ENF signal and analyzes their consistency or regularity.
Grigoras was the first to propose an ENF-based audio forgery detection algorithm, which mainly compares the fluctuation of the ENF in the audio under test with reference data for the same period to judge whether the audio has been tampered with. Grigoras then verified that short-time windowed analysis of the audio signal allows a finer and more accurate comparison with the database. Building on Grigoras's work, Rodríguez et al. proposed a method that does not require an ENF reference database: it uses the consistency of ENF phase changes as the feature for detecting audio tampering and selects a threshold to make the classification decision on this feature. Building on Rodríguez, Hu Yongjian et al. used an ideal sinusoid as a reference signal and constructed a new feature to detect discontinuities in the ENF phase. Hu Yongjian et al. later improved this method, proposing to compute the maximum ENF offset directly without an additional reference signal and to combine multiple features to locate the tampered region precisely. Esquef et al., observing that tampering causes an abrupt change in the ENF instantaneous frequency at the tampering point, proposed the TPSW (Two-Pass Split-Window) method to estimate the background level of ENF variation; peaks where the actual instantaneous frequency variation exceeds the background level are taken as tampering points.
In conclusion problem of the existing technology is:
There are problems that currently based on the ENF researchs for carrying out the passive tampering detection of digital audio:
1) do not have authoritative ENF comparison databases.Using being carried out in ENF ingredients and the ENF databases in measured signal pair Than distorting no reliable result to judge whether voice signal passes through;
2) most of method does not extract characteristic crucial in voice signal, can directly be to voice signal It is no to be tampered carry out decision;
3) correlation between override feature collection, it is not further to the initial characteristic data extracted to be handled;
4) existing most methods the degree of automation is not high, ineffective, and to the adaptivity of disparate databases signal Difference.
The difficulty and significance of solving the above technical problems:
Establishing an authoritative ENF reference database is costly and hard to manage, and is of little practical value. Extracting key feature data from the speech signal to decide directly whether it has been tampered with is a problem researchers have long sought to solve.
The present invention selects, as features for tampering detection, the phase spectrum and instantaneous frequency spectrum of the signal's ENF component, both of which are sensitive to signal truncation. The invention is tested on speech signals from three databases and uses a deep learning method, the deep random forest, to build the model, which ensures that the scheme is adaptive and automated enough to be applied in practice.
Summary of the Invention
In view of the problems of the prior art, the present invention provides an audio authentication method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum. The invention extracts the ENFC from the speech signal, analyzes its phase spectrum and frequency spectrum, and extracts phase and frequency features. The phase spectrum features and frequency spectrum features are fused using the DCA method, and a deep random forest is used to build a model on the fused features. The resulting model can decide whether any signal under test has been tampered with, enabling automatic detection of insertion and deletion operations on speech signals. By fusing the representative phase and instantaneous frequency features of the ENF component and training the model with a deep learning method, the method yields a model capable of automatic detection, improves detection efficiency, and automates digital audio tampering detection.
The invention is realized as a digital audio authentication method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum, comprising the following. First, the signal under test is preprocessed, including down-sampling and narrow-band filtering, to obtain a narrow-band signal centered on the nominal electric network frequency (Electric Network Frequency, ENF). Features are then extracted from the ENF signal: its phase spectrum and instantaneous frequency spectrum are analyzed, and the phase spectrum fluctuation feature and the phase spectrum and frequency spectrum fitting parameter features are extracted. Feature fusion is performed with discriminant correlation analysis (DCA), which maximizes the correlation between the different feature sets while eliminating between-class correlation and restricting correlation to within classes. Finally, a deep random forest is used to build a model on the fused features, and the trained model is used for transfer learning; that is, once the model is saved, it can decide whether any signal under test has been tampered with. The invention performs tampering detection based on the ENF marker signal in the signal under test, extracting the phase and frequency features of the ENF signal that are affected by tampering. The extracted feature sets are fused with DCA, and the fused features are used to train a classifier with the deep random forest method. The resulting classification model gives good detection results for both insertion and deletion, reduces computational complexity, greatly improves classification accuracy, and enables an automatic classification system.
The method specifically includes the following steps:
Step 1: Preprocess the signal under test;
Step 2: Extract phase spectrum and frequency spectrum features from the ENF component of the signal;
Step 3: Fuse the extracted feature sets using the DCA method;
Step 4: Build a model on the fused features using a deep random forest, so that a decision can be made on the signal under test.
Further, Step 1 specifically includes the following steps:
Step 1.1: Preprocess the signal under test $x[n]$, including down-sampling and DC component removal, to obtain $x_d[n]$;
Step 1.2: Pass the down-sampled signal $x_d[n]$ from Step 1.1 through a band-pass filter whose center frequency is the nominal ENF, to obtain the ENF component $x_{ENFC}[n]$ of the signal.
Further, Step 2 specifically includes the following steps:
Step A1: Perform $DFT^1$-based phase spectrum estimation on $x_{ENFC}[n]$ and extract the phase spectrum fluctuation feature F;
Step A2: Perform Hilbert-based instantaneous frequency spectrum estimation on $x_{ENFC}[n]$;
Step A3: Fit curves to the phase spectrum and the frequency spectrum separately, and extract the phase spectrum fitting feature and the instantaneous frequency spectrum fitting feature.
Further, in Step A1, $DFT^1$-based phase spectrum estimation is performed on $x_{ENFC}[n]$. A conventional N-point discrete Fourier transform (DFT) is first applied to $x_{ENFC}[n]$; this is the $DFT^0$-based phase estimation and yields an estimated phase. The $DFT^1$-based phase estimation builds on the $DFT^0$ phase estimate by computing the approximate first derivative of $x_{ENFC}[n]$ at point n:
$$x'_{ENFC}[n] = f_d\,(x_{ENFC}[n] - x_{ENFC}[n-1])$$
The approximate first derivative is combined with the $DFT^0$ phase estimate to perform a higher-order phase estimation, and the estimate is linearly interpolated to obtain the phase spectrum estimate, from which the phase spectrum fluctuation feature F is extracted;
In Step A2, Hilbert-transform-based instantaneous frequency estimation is performed on $x_{ENFC}[n]$. The analytic signal of $x_{ENFC}[n]$ is obtained first:
$$x^{(a)}_{ENFC}[n] = x_{ENFC}[n] + i\,H\{x_{ENFC}[n]\},$$
where $i = \sqrt{-1}$ and H denotes the Hilbert transform. The instantaneous frequency is the rate of change of the phase angle of the analytic signal $x^{(a)}_{ENFC}[n]$. The instantaneous frequency $f[n]$ of the ENF signal is estimated, oscillation and boundary effects are removed from $f[n]$, and the instantaneous frequency spectrum of $x_{ENFC}[n]$ is constructed;
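In Step A3, according to the characteristics of the phase spectrum and frequency spectrum of $x_{ENFC}[n]$, the Sum of Sines and Gaussian models are used to fit the phase spectrum curve and the frequency spectrum curve respectively;
Form of the Sum of Sines expression:
$$y = \sum_{i=1}^{n} a_i \sin(b_i x + c_i)$$
Form of the Gaussian expression:
$$y = \sum_{i=1}^{n} a_i \exp\!\left[-\left(\frac{x - b_i}{c_i}\right)^2\right]$$
where the parameters of each expression are the fitting features.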
Further, Step 3 specifically includes:
The goal of feature fusion is to combine the relevant information in two or more feature vectors into one that is more discriminative than any single input feature vector, or, when the feature dimension is too large, to reduce it through fusion while still reaching an accuracy close to that of the high-dimensional features. Discriminant correlation analysis (DCA) is used to fuse the phase feature set and the frequency feature set obtained in Step 2. DCA performs feature fusion by maximizing the pairwise correlation between the two feature sets while restricting correlation to within classes. The transformation matrices of the feature sets are computed by maximizing the covariance matrix between the feature sets while ensuring that the between-class scatter matrices are diagonalized.
Further, Step 4 specifically includes:
Step 4.1: Build a model on the fused features using a deep random forest;
The deep random forest is a deep learning model that can be used for classification. Part of the fused features is used to train the deep random forest. Its training process differs from that of a traditional random forest: it automatically determines model parameters such as the number of layers according to the change in accuracy and the limit on the number of layers. Training stops once the training accuracy no longer improves or the number of layers reaches its maximum, and the classification result at that point is taken as the final classification accuracy.
Step 4.2: After the model is saved, decide whether any signal under test has been tampered with.
The number of layers and the structural parameters of the deep random forest obtained when training completes constitute the fused-feature classification model of the invention, which can classify the fused features of any signal under test and make a decision on it.
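The stopping rule described in Step 4.1 can be sketched as a small layer-growing loop. This is only an illustration under stated assumptions: `grow_cascade` and `layer_accuracy` are hypothetical names, and the callback stands in for whatever trains one more layer and reports its accuracy; neither comes from the patent.

```python
def grow_cascade(layer_accuracy, max_layers):
    """Add layers until accuracy stops improving or max_layers is reached.

    layer_accuracy(k) is a hypothetical callback that trains layer k
    (1-based) on top of the previous layers and returns its accuracy.
    """
    best_accuracy = float("-inf")
    n_layers = 0
    for k in range(1, max_layers + 1):
        accuracy = layer_accuracy(k)
        if accuracy <= best_accuracy:
            break                      # no improvement: stop growing
        best_accuracy, n_layers = accuracy, k
    return n_layers, best_accuracy
```

The returned layer count and accuracy correspond to the "number of layers" and "final classification accuracy" mentioned above; the per-layer forests themselves are outside this sketch.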
Another object of the invention is to provide a computer program that implements the described digital audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis.
Another object of the invention is to provide a digital audio signal processing system that implements the described digital audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis.
Another object of the invention is to provide a computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to execute the described digital audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis.
In conclusion advantages of the present invention and good effect are
The present invention analyzes phase spectrum sensitive to signal cutout in ENF signals and instantaneous frequency spectrum, and extraction is effective respectively Feature set, and the feature set extracted is handled;
The feature-based fusion technology that the present invention uses carries out characteristic processing, improves and knows while reducing intrinsic dimensionality Other gap carries out model training using deep learning method, substantially increases the accuracy rate of the passive tampering detection of digital audio;
The present invention is high for complex environment recording and noisy speech stability, has very strong robustness.
The present invention is that the accuracy of the passive tampering detection of digital audio and automation propose a kind of algorithm of popularity.
The experimental data that the present invention uses come from three different databases totally 500 voices (including original language Sound and distort voice), import these voice signals using MATLAB, it is special to extract the fluctuation of ENF content consistencies by inventive step 1 Sign.According to step 2, phase fluctuation and instantaneous frequency fluctuation are fitted using 5 sin cores and 5 Gaussian kernels;According to step 3, using phase fluctuation feature and frequency fluctuation feature as a feature set, DCA Fusion Features are carried out, two-dimentional fusion is obtained Feature is characterized addition label, uses ten folding cross validations to fusion feature using depth random forest, is finally obtaining classification just True rate reaches 99.8%.
Brief Description of the Drawings
Fig. 1 is a flowchart of the digital audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis provided by an embodiment of the invention.
Fig. 2 is a flowchart of the $DFT^1$-based phase spectrum feature extraction provided by an embodiment of the invention;
Fig. 3 is a flowchart of the Hilbert-transform-based instantaneous frequency spectrum feature extraction provided by an embodiment of the invention.
Detailed Description
To make the purpose, technical scheme, and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only illustrate the invention and are not intended to limit it.
Referring to Fig. 1, the digital audio authentication method based on ENF phase spectrum and instantaneous frequency spectrum analysis provided by the invention includes the following steps:
Step 1: Preprocess the signal under test;
The specific implementation includes the following sub-steps:
Step 1.1: Preprocess the signal under test $x[n]$, including down-sampling and DC component removal, to obtain $x_d[n]$;
In this embodiment, to balance frequency aliasing, loss of signal information, and signal-to-noise ratio (oversampling can improve the signal-to-noise ratio), the resampling frequency $f_d$ is set to 1000 Hz or 1200 Hz. For a nominal ENF of 50 Hz or 60 Hz respectively, this places the nominal ENF at $\omega_0 = \pi/10$ rad/sample.
Step 1.2: Pass the down-sampled signal $x_d[n]$ from Step 1.1 through a band-pass filter whose center frequency is the nominal ENF, to obtain the ENF component $x_{ENFC}[n]$ of the signal.
This embodiment uses a 10000th-order linear zero-phase FIR filter for narrow-band filtering, which prevents phase delay. The center frequency is the nominal ENF, the bandwidth is 0.6 Hz, the passband ripple is 0.5 dB, and the stopband attenuation is 100 dB. A high-order filter is used in order to obtain a near-ideal narrow-band signal. Zero padding means appending zeros to the end of the time-domain signal to increase its length. Applying zero padding before the DFT gives a finer frequency grid, which helps locate the peak in the spectrum more accurately.
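A minimal sketch of Step 1 follows, using NumPy only. Several choices here are assumptions rather than the embodiment's design: the filter is a Hamming-windowed sinc (which does not reach the 100 dB stopband figure quoted above), the down-sampler handles only integer factors, and the function names `bandpass_fir` and `preprocess` are illustrative. Zero phase is obtained by using symmetric taps and discarding the group delay through a centered convolution.

```python
import numpy as np

def bandpass_fir(num_taps, f_center, bandwidth, fs):
    """Hamming-windowed sinc band-pass with symmetric (linear-phase) taps."""
    n = np.arange(num_taps) - (num_taps - 1) / 2        # centred tap index
    f_lo = (f_center - bandwidth / 2) / fs              # cycles per sample
    f_hi = (f_center + bandwidth / 2) / fs
    h = 2 * f_hi * np.sinc(2 * f_hi * n) - 2 * f_lo * np.sinc(2 * f_lo * n)
    h *= np.hamming(num_taps)
    # Normalise to unit gain at the centre frequency.
    h /= np.sum(h * np.cos(2 * np.pi * f_center / fs * n))
    return h

def preprocess(x, fs, fd=1000, f_enf=50.0, bandwidth=0.6, num_taps=10001):
    """Down-sample to fd, remove DC, and band-pass around the nominal ENF.

    num_taps must be odd and the down-sampled signal longer than num_taps,
    so that the centred convolution cancels the filter delay exactly.
    """
    q = int(round(fs / fd))
    if q < 1 or q * fd != fs:
        raise ValueError("this sketch only handles integer down-sampling factors")
    x = np.asarray(x, dtype=float)
    if q > 1:
        # Assumed anti-aliasing low-pass (cut-off at 0.45 * fd) before decimation.
        m = np.arange(-10 * q, 10 * q + 1)
        lp = (0.9 / q) * np.sinc(0.9 * m / q) * np.hamming(m.size)
        lp /= lp.sum()
        x = np.convolve(x, lp, mode="same")[::q]
    xd = x - x.mean()                                   # DC removal -> x_d[n]
    h = bandpass_fir(num_taps, f_enf, bandwidth, fd)
    return np.convolve(xd, h, mode="same")              # x_ENFC[n]
```

The first and last `(num_taps - 1) / 2` output samples are affected by filter start-up and are best ignored. For a 60 Hz grid the same sketch would be called with `fd=1200, f_enf=60.0`.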
Step 2: Extract phase spectrum and frequency spectrum features from the ENF component of the signal;
The specific implementation includes the following sub-steps:
Step A1: Perform $DFT^1$-based phase spectrum estimation on $x_{ENFC}[n]$ and extract the phase spectrum fluctuation feature F;
As shown in Fig. 2, a conventional N-point discrete Fourier transform (DFT) is first applied to $x_{ENFC}[n]$ to obtain $X(k)$. Let $k_{peak}$ be the integer index of the maximum of $|X(k)|$ in each frame. The phase at this index is referred to as the $DFT^0$-based phase estimate:
$$\phi_{DFT^0} = \arg[X(k_{peak})] \quad (1)$$
The approximate first derivative of the ENF signal $x_{ENFC}[n]$ at point n is computed as:
$$x'_{ENFC}[n] = f_d\,(x_{ENFC}[n] - x_{ENFC}[n-1]) \quad (2)$$
The same DFT analysis is applied to $x'_{ENFC}[n]$ to obtain $|X'(k)|$, which is multiplied by a scale coefficient $F(k)$.
This gives $DFT^0[k] = X(k)$ and $DFT^1[k] = F(k)\,|X'(k)|$, from which the frequency of $x_{ENFC}[n]$ is estimated.
The ENFC is a narrow-band signal and can be written as $x_{ENFC}[n] = a\cos(\omega_0 n + \phi_0)$, where $\omega_0 = 2\pi f_{ENFC}/f_d$, $\phi_0$ is the initial phase of $x_{ENFC}$, and $f_{ENFC}$ is the actual ENF frequency. The $DFT^1$ phase estimate follows from a mathematical derivation on this model,
in which $\theta$ denotes the estimated phase of $x'_{ENFC}$ and $X'(k)$ is linearly interpolated to obtain a more accurate value. The result is the estimated phase spectrum of the $DFT^1$-based method.
The feature F is computed to describe the phase fluctuation of the ENFC. Let $\phi(n_b)$ denote the estimated phase of the $n_b$-th frame, where $2 \le n_b \le N_{Block}$; F is computed from these frame phases and their average taken from $n_b = 2$ to $N_{Block}$.
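The two building blocks of Step A1 that are fully specified in the text, the $DFT^0$ phase at the spectral peak and the first-difference derivative of equation (2), can be sketched as below. The function names are illustrative assumptions, no analysis window is applied, and the higher-order $DFT^1$ correction and the feature F are not reproduced here.

```python
import numpy as np

def dft0_phase(frame, n_dft=None):
    """Return (k_peak, phase) for one frame.

    k_peak is the integer index of the largest |X(k)|; the phase is the
    argument of X(k_peak). n_dft > len(frame) zero-pads the frame.
    """
    X = np.fft.rfft(frame, n=n_dft)
    k_peak = int(np.argmax(np.abs(X)))
    return k_peak, float(np.angle(X[k_peak]))

def first_difference(x, fd):
    """Approximate derivative x'[n] = fd * (x[n] - x[n-1]), as in equation (2)."""
    x = np.asarray(x, dtype=float)
    return fd * np.diff(x)
```

For a frame holding an integer number of ENF cycles, `dft0_phase` returns the initial phase of the cosine in that frame; with zero padding the peak index refers to the finer frequency grid `k * fd / n_dft`.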
Step A2: Perform Hilbert-based instantaneous frequency spectrum estimation on $x_{ENFC}[n]$;
As shown in Fig. 3, a discrete Hilbert transform is applied to the signal $x_{ENFC}[n]$. The analytic signal of $x_{ENFC}[n]$ is obtained first: $x^{(a)}_{ENFC}[n] = x_{ENFC}[n] + i\,H\{x_{ENFC}[n]\}$, where $i = \sqrt{-1}$ and H denotes the Hilbert transform. The instantaneous amplitude is the magnitude of the analytic signal $x^{(a)}_{ENFC}[n]$, and the instantaneous frequency is the rate of change of its phase angle. The instantaneous frequency $f[n]$ of the ENF signal is estimated in this way. Because the Hilbert transform is computed with a numerical approximation, the resulting $f[n]$ contains some spurious oscillation, so $f[n]$ is further low-pass filtered to remove it. Because of boundary effects in the frequency estimate, 2000 samples are removed from each end of $f[n]$. The remaining $f[n]$ is the instantaneous frequency spectrum estimate of the ENFC.
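Step A2 can be sketched with an FFT-based analytic signal, as below. The moving-average smoother is an assumed stand-in for the low-pass filter, whose design the text does not give, and the function names and the `smooth` parameter are illustrative; only the trimming of 2000 samples per end is taken from the embodiment.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal x + i*H{x} (one-sided spectrum doubling)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def instantaneous_frequency(x, fs, smooth=1, trim=0):
    """Instantaneous frequency in Hz from the phase slope of the analytic signal."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    f = np.diff(phase) * fs / (2 * np.pi)
    if smooth > 1:
        # Assumed low-pass step: plain moving average over `smooth` samples.
        f = np.convolve(f, np.ones(smooth) / smooth, mode="same")
    if trim > 0:
        f = f[trim:f.size - trim]      # drop boundary-affected samples
    return f
```

With `trim=2000` the output matches the embodiment's handling of boundary effects; `smooth` should stay much smaller than `trim` so that the smoother's own edge distortion is removed as well.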
Step A3: Fit curves to the phase spectrum and the frequency spectrum separately, and extract the phase spectrum fitting feature and the instantaneous frequency spectrum fitting feature.
Given the characteristics of the ENF phase distribution and instantaneous frequency distribution, this embodiment fits the discrete data points with a different analytical expression for each. The criterion for choosing the analytical expression for the phase or frequency curve is that the expression must fit both the original signal curve and the edited signal curve, and the difference between the two must show up in the parameters. Based on this criterion, the embodiment selects the Sum of Sines and Gaussian fitting expressions to fit the phase curve and the frequency curve respectively, with the expression parameters serving as the fitting parameter features.
The Sum of Sines expression is suited to fitting the phase spectrum. Its form is:
$$y = \sum_{i=1}^{n} a_i \sin(b_i x + c_i)$$
where a is the amplitude, b is the frequency, c is the phase constant of each sine term, and n is the number of terms in the series, with $1 \le n \le 9$. The phase spectrum fitting feature is the set of fitted parameters $(a_i, b_i, c_i)$, $i = 1, \ldots, n$.
The Gaussian expression is suited to fitting peaks. Its form is:
$$y = \sum_{i=1}^{n} a_i \exp\!\left[-\left(\frac{x - b_i}{c_i}\right)^2\right]$$
where a is the amplitude of the peak, b is the position of the peak, c is related to the width of the peak, and n is the number of peaks to fit, with $1 \le n \le 8$. The frequency spectrum fitting feature is the set of fitted parameters $(a_i, b_i, c_i)$, $i = 1, \ldots, n$.
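The two model families can be written as functions of a flat parameter vector, which is also the natural layout for the fitting features. This sketch only evaluates the models; the fit itself would be done by a nonlinear least-squares routine of the reader's choice. The function names and the `[a1, b1, c1, a2, b2, c2, ...]` ordering are assumptions.

```python
import numpy as np

def sum_of_sines(x, params):
    """Evaluate sum_i a_i * sin(b_i * x + c_i); params = [a1, b1, c1, a2, ...]."""
    a, b, c = np.asarray(params, dtype=float).reshape(-1, 3).T
    x = np.asarray(x, dtype=float)[:, None]
    return np.sum(a * np.sin(b * x + c), axis=1)

def sum_of_gaussians(x, params):
    """Evaluate sum_i a_i * exp(-((x - b_i) / c_i)**2); same parameter layout."""
    a, b, c = np.asarray(params, dtype=float).reshape(-1, 3).T
    x = np.asarray(x, dtype=float)[:, None]
    return np.sum(a * np.exp(-((x - b) / c) ** 2), axis=1)
```

With five terms per model, as in the experiment reported earlier, each fitted curve contributes a 15-element parameter vector.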
Step 3: Fuse the extracted feature sets using the DCA method;
Discriminant correlation analysis (DCA) is used to fuse the phase feature set and the frequency feature set obtained in Step 2. DCA performs effective feature fusion by maximizing the pairwise correlation between the two feature sets while eliminating between-class correlation and restricting correlation to within classes. DCA here is a feature-level fusion that uses the summation method; it has the advantage of reducing the feature dimension while also reducing the gap in recognition results.
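Let $X \in R^{p \times n}$ and $Y \in R^{q \times n}$ denote two matrices, each containing n training feature vectors from a different modality. Assume the samples in the data matrices are collected from c separate classes. The n columns of each data matrix can then be divided into c separate groups, where $n_i$ columns belong to the i-th class ($n = \sum_{i=1}^{c} n_i$). Let $x_{ij} \in X$ denote the feature vector corresponding to the j-th sample in the i-th class. Let $\bar{x}_i$ and $\bar{x}$ denote the mean of the $x_{ij}$ in the i-th class and over the whole feature set respectively, i.e.
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{c} n_i \bar{x}_i .$$
The between-class scatter matrix is defined as
$$S_{bx(p \times p)} = \sum_{i=1}^{c} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T = \Phi_{bx}\Phi_{bx}^T,$$
where
$$\Phi_{bx(p \times c)} = \left[\sqrt{n_1}(\bar{x}_1 - \bar{x}),\ \sqrt{n_2}(\bar{x}_2 - \bar{x}),\ \ldots,\ \sqrt{n_c}(\bar{x}_c - \bar{x})\right].$$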
If the number of features is much larger than the number of classes ($p \gg c$), computing the covariance matrix $(\Phi_{bx}^T \Phi_{bx})_{c \times c}$ is much easier than computing $(\Phi_{bx} \Phi_{bx}^T)_{p \times p}$. The most significant eigenvectors of $\Phi_{bx}\Phi_{bx}^T$ can be obtained efficiently by mapping the eigenvectors of $\Phi_{bx}^T\Phi_{bx}$, so only the eigenvectors of the $c \times c$ matrix $\Phi_{bx}^T\Phi_{bx}$ need to be found. If the classes are well separated, $\Phi_{bx}^T\Phi_{bx}$ is a diagonal matrix. Because $\Phi_{bx}^T\Phi_{bx}$ is a symmetric positive semidefinite matrix, it can be diagonalized by the transformation:
$$P^T(\Phi_{bx}^T\Phi_{bx})P = \hat{\Lambda}$$
P is the matrix of orthogonal eigenvectors, and $\hat{\Lambda}$ is the diagonal matrix of real, non-negative eigenvalues sorted in decreasing order. $Q_{(c \times r)}$ is the matrix formed from the r eigenvectors of P that correspond to the r largest nonzero eigenvalues. Then:
$$Q^T(\Phi_{bx}^T\Phi_{bx})Q = \Lambda_{(r \times r)}$$
The r most significant eigenvectors of $S_{bx}$ are obtained through the mapping $Q \rightarrow \Phi_{bx}Q$:
$$(\Phi_{bx}Q)^T S_{bx} (\Phi_{bx}Q) = \Lambda_{(r \times r)} \quad (13)$$
$W_{bx} = \Phi_{bx} Q \Lambda^{-1/2}$ is a transformation that unitizes $S_{bx}$ and at the same time reduces the dimension of the data matrix X from p to r, i.e.:
$$W_{bx}^T S_{bx} W_{bx} = I, \quad (14)$$
$$X'_{(r \times n)} = W_{bx(r \times p)}^T X_{(p \times n)}$$
$X'$ is the projection of X into a space in which the between-class scatter matrix is I and the classes are separated. Note that there are at most $c-1$ nonzero generalized eigenvalues, so an upper limit on r is $c-1$; the other upper limits on r are given by the ranks of the data matrices, i.e. $r \le \min(c-1, \mathrm{rank}(X), \mathrm{rank}(Y))$.
Similar above-mentioned method handles second feature collection Y, and finds transformation matrix Wby, unify between the class of second mode to spread Matrix SbyThe dimension for reducing data matrix Y simultaneously is tieed up from q dimensions to r.
The updated $\Phi'_{bx}$ and $\Phi'_{by}$ are both non-square r × c orthonormal matrices. Although $S'_{bx} = S'_{by} = I$, the matrices $\Phi'^{T}_{bx}\Phi'_{bx}$ and $\Phi'^{T}_{by}\Phi'_{by}$ are strictly diagonally dominant: the elements on the diagonal are close to 1 and the off-diagonal elements are close to 0. The class centroids therefore have minimal correlation with one another, so the classes can be separated well. Next, each feature in one feature set should have nonzero correlation only with its corresponding feature in the other feature set. To achieve this, the present invention diagonalizes the between-set covariance matrix of the transformed feature sets, $S'_{xy} = X' Y'^{T}$, using singular value decomposition (SVD):
$S'_{xy} = U \Sigma V^{T} \;\Rightarrow\; U^{T} S'_{xy} V = \Sigma$
Here $X'$ and $Y'$ both have rank r, so $S'_{xy}$ (r × r) is nondegenerate, and Σ is a diagonal matrix whose main-diagonal elements are all nonzero. Let $W_{cx} = U\Sigma^{-1/2}$ and $W_{cy} = V\Sigma^{-1/2}$; then:
$(U\Sigma^{-1/2})^{T} S'_{xy} (V\Sigma^{-1/2}) = I, \quad (19)$
which unitizes the between-set covariance matrix $S'_{xy}$. Next, the feature sets are transformed:
$X^{*} = W_{cx}^{T} X' = W_{cx}^{T} W_{bx}^{T} X = W_{x} X$
$Y^{*} = W_{cy}^{T} Y' = W_{cy}^{T} W_{by}^{T} Y = W_{y} Y$
where $W_{x} = W_{cx}^{T} W_{bx}^{T}$ and $W_{y} = W_{cy}^{T} W_{by}^{T}$ are the final transformation matrices for X and Y respectively. It is easily shown that the between-class scatter matrices of the transformed feature sets are still diagonal, so the classes remain separated. The between-class scatter matrix is:
$S^{*}_{bx} = W_{cx}^{T} S'_{bx} W_{cx}$
From formula (14), $S'_{bx} = I$, and U is an orthogonal matrix, so:
$S^{*}_{bx} = (U\Sigma^{-1/2})^{T} (U\Sigma^{-1/2}) = \Sigma^{-1}$
In the same way, $S^{*}_{by}$ can be shown to be a diagonal matrix. For the transformed feature set, $X^{*} X^{*T}$, which represents the covariance between features, is a strictly diagonally dominant matrix, showing that the correlation between different features within a single feature set is minimal. $X^{*T} X^{*}$, which represents the covariance between samples, is a block diagonal matrix, showing that each sample is more highly correlated with the samples in the same class.
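The DCA fusion of step 3 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `unitize_between_class` and `dca_fuse` are hypothetical, data matrices are laid out as features × samples, the mapped eigenvectors are normalized to unit length before scaling by $\Lambda^{-1/2}$ so that the unitizing property holds numerically, and the two transformed sets are fused by concatenation.

```python
import numpy as np

def unitize_between_class(X, labels, r):
    """Return W (p x r) with W.T @ S_b @ W = I, for X of shape (p, n).

    `labels` must be a NumPy array of length n.
    """
    mean_all = X.mean(axis=1)
    # Phi (p x c): columns are sqrt(n_i) * (class mean - global mean)
    Phi = np.column_stack([
        np.sqrt(np.sum(labels == c)) * (X[:, labels == c].mean(axis=1) - mean_all)
        for c in np.unique(labels)
    ])
    # Eigen-decompose the small c x c matrix Phi^T Phi rather than the p x p S_b
    evals, evecs = np.linalg.eigh(Phi.T @ Phi)
    top = np.argsort(evals)[::-1][:r]
    E = Phi @ evecs[:, top]
    E /= np.linalg.norm(E, axis=0)    # assumption: unit-length eigenvectors of S_b
    return E / np.sqrt(evals[top])    # scale each column by lambda^(-1/2)

def dca_fuse(X, Y, labels):
    """Transform two feature sets with DCA and fuse them by concatenation."""
    c = len(np.unique(labels))
    r = min(c - 1, np.linalg.matrix_rank(X), np.linalg.matrix_rank(Y))
    Xp = unitize_between_class(X, labels, r).T @ X   # X'
    Yp = unitize_between_class(Y, labels, r).T @ Y   # Y'
    U, s, Vt = np.linalg.svd(Xp @ Yp.T)              # S'_xy = U Sigma V^T
    Xs = (U / np.sqrt(s)).T @ Xp                     # W_cx = U Sigma^(-1/2)
    Ys = (Vt.T / np.sqrt(s)).T @ Yp                  # W_cy = V Sigma^(-1/2)
    return Xs, Ys, np.vstack([Xs, Ys])
```

With two classes (original and tampered), r = 1, so concatenating the two transformed sets gives a two-dimensional fused feature per sample.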
Step 4: Build a model on the fused features using a deep random forest, so that a decision can be made for the signal under test.
Step 4.1: Build a model on the fused features using a deep random forest;
The present invention first applies multi-grained scanning to the data to enlarge the number of samples, sampling with a sliding window. With a window size of 100 and a step of 1, sampling yields 301 groups of samples with 100 features each; all of these samples come from a single original sample, so the number of samples is expanded. A random forest and a completely random forest are then trained. The decision trees in the completely random forest are generated without computing the Gini index or entropy gain: an attribute is selected at random as the splitting attribute at each step until the tree is fully grown. Assuming a three-class problem, the random forest and the completely random forest each generate 301 groups of three-dimensional feature information, which gives 1806-dimensional data after concatenation. Both forests are generated and tested using k-fold cross-validation: k-1 groups (here equivalent to 300 groups of data) are first used to train the forest and the remaining group is used for testing, and averaging the test outputs gives the output of the forest; each group of data is tested once, so cycling k times yields k groups of outputs. Different window sizes and step lengths can also be set when extracting features with the sliding window, with the outputs of the random forest and the completely random forest then concatenated again.
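The window arithmetic above can be checked with a short NumPy sketch. It assumes a 400-dimensional raw feature vector, which is inferred from 301 windows of length 100 at step 1 and is not stated in the text; the function name `multi_grained_scan` is hypothetical.

```python
import numpy as np

def multi_grained_scan(feature_vec, window=100, step=1):
    """Slice one feature vector into overlapping windows (one row per window)."""
    windows = np.lib.stride_tricks.sliding_window_view(feature_vec, window)
    return windows[::step]

raw = np.arange(400.0)               # assumed raw dimensionality (inferred)
windows = multi_grained_scan(raw)    # 301 windows of 100 features each

# Each forest emits one class-probability vector per window.
n_classes, n_forests = 3, 2          # three classes; random + completely random forest
scan_output_dim = windows.shape[0] * n_classes * n_forests
```

Here `windows.shape` is `(301, 100)` and `scan_output_dim` is 1806, matching the figures in the paragraph above.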
In the cascade forest, the outputs of two completely random forests and two ordinary random forests (3 × 4 = 12-dimensional data) are concatenated with the original data (the 3618-dimensional data output by multi-grained scanning) to form the input of the next layer (12 + 3618 = 3630-dimensional data). Because the output of the previous layer is concatenated into the input of every layer, each layer receives 3630-dimensional data, which is equivalent to correcting the parameters of the random forests. The number of layers of the deep random forest is therefore not set by hand; it is determined by the change in accuracy and by a limit on the number of layers. Training stops when the training accuracy no longer improves or the number of layers reaches the maximum, and the classification result at that point is taken as the final classification accuracy.
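The layer-growing rule described above can be sketched with scikit-learn. This is an illustrative approximation, not the patented model: `ExtraTreesClassifier` with `max_features=1` stands in for a completely random forest, the helper names `make_layer` and `train_cascade` and all hyperparameters are hypothetical, and class labels are assumed to be the integers 0..c-1.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def make_layer(seed):
    """Two ordinary random forests plus two (approximately) completely random forests."""
    return [
        RandomForestClassifier(n_estimators=50, random_state=seed),
        RandomForestClassifier(n_estimators=50, random_state=seed + 1),
        ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=seed + 2),
        ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=seed + 3),
    ]

def train_cascade(X, y, max_layers=5, k=3):
    """Add layers until k-fold accuracy stops improving or max_layers is reached."""
    layers, best_acc, layer_input = [], 0.0, X
    for depth in range(max_layers):
        forests = make_layer(seed=10 * depth)
        # k-fold class vectors: every sample is scored by forests that never saw it
        probas = [cross_val_predict(f, layer_input, y, cv=k, method="predict_proba")
                  for f in forests]
        acc = np.mean(np.mean(probas, axis=0).argmax(axis=1) == y)
        if acc <= best_acc:
            break                                  # no improvement: stop growing
        best_acc = acc
        layers.append([f.fit(layer_input, y) for f in forests])
        # Next layer sees the original features plus the 4 class vectors
        layer_input = np.hstack([X] + probas)
    return layers, best_acc
```

With three classes, each layer after the first receives the original features plus 4 × 3 = 12 class-probability values, mirroring the 12 + 3618 = 3630 bookkeeping in the text.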
Step 4.2: After the model is saved, decide whether any signal under test has been tampered with.
The number of layers and the structural parameters of the deep random forest obtained when training completes constitute the fused-feature classification model of the present invention, which can classify the fused features of any signal under test and make a decision.
The invention is further described below with reference to a specific embodiment and experimental analysis.
The experimental data used in the present invention come from three different databases and total 500 speech recordings (including original speech and tampered speech). These speech signals are imported using MATLAB, and the ENF component consistency fluctuation features are extracted according to step 1 of the invention. According to step 2, the phase fluctuation and the instantaneous frequency fluctuation are fitted using 5 sine kernels and 5 Gaussian kernels. According to step 3, the phase fluctuation features and the frequency fluctuation features are each taken as a feature set and fused by DCA to obtain a two-dimensional fused feature. Labels are added to the features, and the fused features are classified by the deep random forest with ten-fold cross-validation; the final classification accuracy reaches 99.8%.
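The curve-fitting step mentioned above can be sketched with SciPy. The model forms used here, $\sum_i a_i \sin(b_i x + c_i)$ and $\sum_i a_i \exp(-((x - b_i)/c_i)^2)$, are the usual sum-of-sines and Gaussian forms offered by common curve-fitting tools; that the patent uses exactly these forms is an assumption, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def sum_of_sines(x, *params):
    """Sum over terms of a_i * sin(b_i * x + c_i); params = (a1, b1, c1, a2, ...)."""
    a, b, c = np.reshape(params, (-1, 3)).T
    return np.sum(a[:, None] * np.sin(b[:, None] * x + c[:, None]), axis=0)

def sum_of_gaussians(x, *params):
    """Sum over terms of a_i * exp(-((x - b_i) / c_i)^2)."""
    a, b, c = np.reshape(params, (-1, 3)).T
    return np.sum(a[:, None] * np.exp(-((x - b[:, None]) / c[:, None]) ** 2), axis=0)

def fit_features(model, x, y, p0):
    """Fit the model and return the fitted parameters as a feature vector."""
    params, _ = curve_fit(model, x, y, p0=p0, maxfev=20000)
    return params
```

The number of terms is set by the length of the initial guess `p0`: a 5-term model takes 15 starting values and returns 15 fitted parameters. Nonlinear fits of this kind are sensitive to the starting point, so `p0` should be chosen close to the expected curve.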
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented wholly or partly in the form of a computer program product, the computer program product comprises one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum, characterized in that the method comprises:
First, preprocessing the signal under test, including down-sampling and narrow-band filtering, to obtain a narrow-band signal centered on the standard frequency of the electric network frequency (ENF); then performing feature extraction on the ENF signal by analyzing its phase spectrum and instantaneous frequency spectrum, and extracting the phase spectrum fluctuation feature and the phase spectrum and frequency spectrum fitting parameter features of the ENF signal;
Performing feature fusion by the discriminant correlation analysis (DCA) method, which maximizes the correlation between the different feature sets while eliminating between-class correlation and restricting correlation to within classes;
Finally, building a model on the fused features using a deep random forest, and applying the trained model for transfer learning; after the model is saved, deciding whether any signal under test has been tampered with.
2. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 1, characterized in that the method specifically comprises:
Step 1: Preprocess the signal under test;
Step 2: Extract phase spectrum and frequency spectrum features from the ENF component of the signal;
Step 3: Fuse the multiple extracted feature sets using the DCA method;
Step 4: Build a model on the fused features using a deep random forest, and make a decision for the signal under test.
3. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 2, characterized in that
Step 1 specifically comprises the following steps:
Step 1.1: Preprocess the signal under test $x[n]$, including down-sampling and removal of the DC component, to obtain $x_d[n]$;
Step 1.2: Pass the down-sampled signal $x_d[n]$ from step 1.1 through a band-pass filter whose center frequency is the ENF standard frequency, to obtain the ENF component $x_{ENFC}[n]$ of the signal.
4. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 2, wherein step 2 specifically comprises the following steps:
Step A1: Perform DFT¹-based phase spectrum estimation on $x_{ENFC}[n]$ and extract the phase spectrum fluctuation feature F;
Step A2: Perform Hilbert-based instantaneous frequency spectrum estimation on $x_{ENFC}[n]$;
Step A3: Fit curves to the phase spectrum and the frequency spectrum respectively, and extract the phase spectrum fitting feature and the instantaneous frequency spectrum fitting feature.
5. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 3, wherein in step A1, DFT¹-based phase spectrum estimation is performed on $x_{ENFC}[n]$: a conventional N-point discrete Fourier transform (DFT) is first applied to $x_{ENFC}[n]$, i.e. DFT⁰-based phase estimation, to obtain an estimated phase; DFT¹-based phase estimation then builds on the DFT⁰ phase estimation by computing the approximate first derivative of $x_{ENFC}[n]$ at point n:
$x'_{ENFC}[n] = f_d \, (x_{ENFC}[n] - x_{ENFC}[n-1])$
The approximate first derivative is combined with the DFT⁰-based result to perform a higher-order phase estimation, and linear interpolation is applied to the estimate to obtain the phase spectrum estimation result, from which the phase spectrum fluctuation feature F is extracted;
In step A2, instantaneous frequency estimation based on the Hilbert transform is performed on $x_{ENFC}[n]$. The analytic function of $x_{ENFC}[n]$ is obtained first:
$x^{(a)}_{ENFC}[n] = x_{ENFC}[n] + i \, \mathcal{H}\{x_{ENFC}[n]\}$,
where i is the imaginary unit and $\mathcal{H}$ denotes the Hilbert transform; the instantaneous frequency is the rate of change of the phase angle of the analytic function $x^{(a)}_{ENFC}[n]$. The instantaneous frequency f[n] of the ENF signal is estimated, oscillation and boundary effects are removed from f[n], and the instantaneous frequency spectrum of $x_{ENFC}[n]$ is constructed;
In step A3, according to the characteristics of the phase spectrum and the frequency spectrum of $x_{ENFC}[n]$, Sum of Sines and Gaussian models are used respectively to fit the phase spectrum and frequency spectrum curves;
The Sum of Sines expression has the form:
The Gaussian expression has the form:
where the parameters of the expressions are the fitting features.
6. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 2, wherein step 3 specifically comprises:
Fusing the phase feature set and the frequency feature set obtained in step 2 using discriminant correlation analysis (DCA); DCA performs feature fusion by maximizing the pairwise correlation between the two feature sets while restricting correlation to within classes; the transformation matrices of the feature sets are computed by maximizing the covariance matrix between the feature sets, and the within-class scatter matrix is diagonalized at the same time.
7. The digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to claim 2, wherein step 4 specifically comprises:
Step 4.1: Build a model on the fused features using a deep random forest: part of the fused features is used for training; during training, the deep random forest determines its number of layers automatically according to the change in accuracy and a limit on the number of layers; training stops when the training accuracy no longer improves or the number of layers reaches the maximum, and the classification result is taken as the final classification accuracy;
Step 4.2: After the model is saved, decide whether any signal under test has been tampered with: the number of layers and the structural parameters of the deep random forest obtained when training completes form the fused-feature classification model, which classifies the fused features of any signal under test and makes a decision.
8. A computer program implementing the digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to any one of claims 1 to 7.
9. A digital audio signal processing system implementing the digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to any one of claims 1 to 7.
10. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the digital audio authenticity identification method based on analysis of the ENF phase spectrum and instantaneous frequency spectrum according to any one of claims 1 to 7.
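The preprocessing and Hilbert-based instantaneous frequency steps recited in claims 3 to 5 can be sketched with SciPy as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the 50 Hz ENF standard frequency, the decimation factor, the filter order, and the bandwidth are placeholder values not taken from the patent. The instantaneous frequency is computed from the phase of the analytic signal.

```python
import numpy as np
from scipy.signal import butter, decimate, hilbert, sosfiltfilt

def extract_enf_component(x, fs, enf_hz=50.0, q=8, half_bw=0.6):
    """Down-sample, remove DC, and band-pass around the ENF standard frequency."""
    xd = decimate(x, q)                 # x_d[n]: down-sampled with anti-alias filtering
    xd = xd - xd.mean()                 # remove the DC component
    fd = fs / q                         # sampling rate after down-sampling
    sos = butter(2, [enf_hz - half_bw, enf_hz + half_bw],
                 btype="bandpass", fs=fd, output="sos")
    return sosfiltfilt(sos, xd), fd     # x_ENFC[n], zero-phase filtered

def instantaneous_frequency(x_enfc, fd):
    """Estimate f[n] in Hz as the rate of change of the analytic signal's phase."""
    analytic = hilbert(x_enfc)          # x + i * H{x}
    phase = np.unwrap(np.angle(analytic))
    return np.diff(phase) * fd / (2 * np.pi)
```

The first and last portions of `f[n]` are affected by filter and Hilbert-transform edge effects, which is why the claims call for removing oscillation and boundary effects before building the instantaneous frequency spectrum.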
CN201810585686.3A 2018-06-06 2018-06-06 Audio identification method based on analysis of ENF phase spectrum and instantaneous frequency spectrum Active CN108806718B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810585686.3A CN108806718B (en) 2018-06-06 2018-06-06 Audio identification method based on analysis of ENF phase spectrum and instantaneous frequency spectrum

Publications (2)

Publication Number Publication Date
CN108806718A true CN108806718A (en) 2018-11-13
CN108806718B CN108806718B (en) 2020-07-21

Family

ID=64087865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810585686.3A Active CN108806718B (en) 2018-06-06 2018-06-06 Audio identification method based on analysis of ENF phase spectrum and instantaneous frequency spectrum

Country Status (1)

Country Link
CN (1) CN108806718B (en)

Also Published As

Publication number Publication date
CN108806718B (en) 2020-07-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181113

Assignee: Hubei ZHENGBO Xusheng Technology Co.,Ltd.

Assignor: CENTRAL CHINA NORMAL University

Contract record no.: X2024980001275

Denomination of invention: Audio identification method based on analysis of ENF phase spectrum and instantaneous frequency spectrum

Granted publication date: 20200721

License type: Common License

Record date: 20240124

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181113

Assignee: Hubei Rongzhi Youan Technology Co.,Ltd.

Assignor: CENTRAL CHINA NORMAL University

Contract record no.: X2024980001548

Denomination of invention: Audio identification method based on analysis of ENF phase spectrum and instantaneous frequency spectrum

Granted publication date: 20200721

License type: Common License

Record date: 20240126
