CN101835040B - Digital Video Source Forensics Method - Google Patents


Info

Publication number
CN101835040B
CN101835040B CN201010126186A
Authority
CN
China
Prior art keywords
frame
activity
group
frames
code rate
Prior art date
Legal status
Expired - Fee Related
Application number
CN 201010126186
Other languages
Chinese (zh)
Other versions
CN101835040A (en)
Inventor
苏育挺
张静
徐俊瑜
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University
Priority: CN 201010126186
Publication of CN101835040A
Application granted
Publication of CN101835040B

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention belongs to the technical field of digital video detection, and in particular relates to a digital video source forensics method. The method comprises the following steps: establishing a video sequence sample library for training; calculating the activity of each group of pictures with an activity and complexity analysis module, and classifying it as high, medium, or low activity; partially decoding the video sequence, acquiring various information from its compressed domain, and extracting three kinds of features, namely code rate features, texture features, and motion vector features; establishing three classifiers and classifying groups of pictures of different activity with different classifiers; reading the video sequence of the video source to be detected, acquiring the feature vector of each of its groups of pictures, selecting the corresponding classifier according to its activity, and giving a classification result; and combining the classification results of all groups of pictures of the video sequence to make a final judgment. The method is highly targeted, highly practical, and not easily attacked.

Description

Digital video source forensics method
Technical field
The invention belongs to the field of digital video resource information security, and specifically relates to a digital video source forensics method.
Background technology
Since the beginning of this century, digital video has been widely used in daily life and work. Meanwhile, the rapid development of video editing and processing software allows even video amateurs to use these editing tools to modify video content and produce convincing fakes, overturning the traditional belief that "seeing is believing". If tampered or forged digital video is used in official media, scientific findings, insurance claims, court exhibits, and the like, it can have a significant impact on political and social stability. In 2007, a Czech television station broadcast a video segment that had not been strictly reviewed, showing tens of thousands of viewers a false picture of a nuclear explosion in Bohemia and nearly causing a public panic; in fact it had been spliced together from real Bohemian outdoor footage and nuclear mushroom-cloud video. On the other hand, with the development of networks, various video sharing websites such as YouTube and Youku have appeared. These online video resources differ greatly from traditional video: they are more private and less credible. How to effectively supervise and manage these multimedia resources has therefore become the key to maintaining the healthy and stable development of the information industry.
As network bandwidth grows, digital video resources are gradually replacing text and still images as the mainstream of network information resources, and the availability of easy-to-use video editing software has made video tampering techniques widespread. Under these circumstances, forensic techniques for digital video resources have become a focus of the information security field. Digital video forensics comprises active forensics and passive forensics; both can authenticate whether a video has been tampered with, but they have different applications. Existing active forensic techniques include anti-counterfeiting techniques represented by robust digital watermarking, tamper-proofing techniques represented by fragile digital watermarking, and authentication techniques represented by digital fingerprints and digital signatures. The basic idea of all these techniques is to verify the authenticity and integrity of digital video by adding additional information to it. At present, however, the overwhelming majority of digital videos contain no digital watermark or digital digest. As video forgery and tampering techniques develop rapidly, active forensics, constrained by its application conditions, cannot fundamentally contain the development of video tampering, so current digital video forensics research pays more attention to passive forensic techniques.
Current research on passive video forensics concentrates on two aspects: digital video source detection and tampering detection. Digital video source detection is the first step of media authentication; it mainly provides information about the acquisition, processing, and output devices of a digital video (such as a digital video camera). In short, it analyzes and answers the questions "where did it come from" and "how was it produced". Digital video source forensics can be carried out by the forensic party alone: the source is identified directly from the media itself, without any preprocessing of the digital video in advance (such as embedding a digital watermark), so it is highly practical. The simplest method is to examine the video file header: ordinary digital cameras write information such as system information, camera type, coding mode, date, and time into the file header, but this information is easily altered and therefore has low credibility. Another method is to identify the source from the inherent attributes of the encoder and the statistical properties of the output video stream; this method is highly credible and not easily attacked.
Summary of the invention
The object of the present invention is to provide a digital video source forensics method that is highly credible and not easily attacked. Without needing any other auxiliary information (such as a watermark embedded in advance or a video header file), the technique analyzes the video coding stream alone to identify which video camera or software encoder produced the video file. The method is highly targeted, practical, and not easily attacked. To this end, the present invention adopts the following technical scheme.
A digital video source forensics method comprises the following steps:
(1) Establish a sample library for training. The library comprises video sequences captured by video cameras and video sequences produced by various software encoders; all of these are original compressed sequences.
(2) The video sequence first passes through an activity and complexity analysis module, which calculates the activity of each group of pictures and, using double thresholds, classifies it as high, medium, or low activity. The calculation is as follows:
i. First, according to formula (1), calculate the energy difference fd(x,y) of the luminance components of each pair of adjacent frames within a group of pictures (GOP):

fd(x,y) = |f1(x,y) - f2(x,y)|    (1)

where f1(x,y) and f2(x,y) denote the DC coefficient values of the luminance block at position (x,y) in the 1st and 2nd frames, respectively;
ii. Then calculate the total average energy difference Fd:

Fd = (1/M) Σ_x Σ_y fd(x,y)    (2)

where M denotes the number of blocks in a frame, and fd(x,y) is the energy difference calculated in step i;
iii. Finally, according to formula (3), calculate the energy variance of the group of pictures and use it to decide whether the segment is a high-activity, medium-activity, or low-activity group of pictures:

Z = (1/(n-1)) Σ_{i=1}^{n} |Fd(i)|²    (3)

where Fd(i) is the average energy difference of the i-th pair of adjacent frames, i is the frame index, and n is the number of frames in the group of pictures. Finally, define two thresholds T1 and T2 (T1 < T2): if Z > T2, mark the group of pictures as high activity; if T1 < Z ≤ T2, mark it as medium activity; otherwise mark it as low activity;
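The activity measure of step (2) can be sketched as follows. This is a minimal illustration, assuming each frame is represented by its grid of luminance-block DC coefficients (a list of rows) and that the thresholds T1 and T2 are hypothetical values of the kind that would be fixed during training.

```python
# Sketch of the GOP activity measure (formulas (1)-(3)).
# dc_frames: list of frames; each frame is a list of rows of
# luminance-block DC coefficients. t1 < t2 are assumed thresholds.
def gop_activity(dc_frames, t1=50.0, t2=500.0):
    n = len(dc_frames)
    fds = []
    for f1, f2 in zip(dc_frames, dc_frames[1:]):
        # (1)-(2): average absolute DC difference over the blocks of a frame
        diffs = [abs(a - b) for r1, r2 in zip(f1, f2) for a, b in zip(r1, r2)]
        fds.append(sum(diffs) / len(diffs))
    # (3): "energy variance" of the GOP
    z = sum(fd * fd for fd in fds) / (n - 1)
    if z > t2:
        return "high"
    if z > t1:
        return "medium"
    return "low"
```

A GOP of nearly identical frames yields a small Z and is marked low activity; large frame-to-frame DC swings push Z above T2 and mark it high activity.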
(3) The video sequence is then partially decoded to obtain various information from its compressed domain, and three categories of features are extracted: code rate features, texture features, and motion vector features;
iv. The code rate features consist of the following 7 groups of feature quantities, where NB_I denotes the code rate of the I frame in a group of pictures, NB_P(i) the code rate of the i-th P frame, and NB_B(j) the code rate of the j-th B frame:
a) M, N: the number of P frames and the number of B frames in the group of pictures, respectively;
b) NB_I: the code rate of the I frame in the group of pictures;
c) RPI: the ratio of the average code rate of the P frames to the code rate of the I frame, computed as:

RPI = (1/M) Σ_{i=1}^{M} NB_P(i) / NB_I    (4)

d) RBI: the ratio of the average code rate of the B frames to the code rate of the I frame, computed as:

RBI = (1/N) Σ_{i=1}^{N} NB_B(i) / NB_I    (5)
e) RAP, RVP: the mean and variance of the relative difference of the code rates of adjacent P frames in the group of pictures:

RA_P = (1/(M-1)) Σ_{j=1}^{M-1} D_P(j)    (6)

RV_P = (1/(M-1)) Σ_{j=1}^{M-1} (D_P(j) - RA_P)²    (7)

where D_P(j) is the relative difference of the code rates of adjacent P frames:

D_P(j) = |NB_P(j+1) - NB_P(j)| / NB_P(j),  j = 1, 2, …, M-1    (8)
f) RAB, RVB: the mean and variance of the relative difference of the code rates of consecutive pairs of B frames in the group of pictures:

RA_B = (1/(N-1)) Σ_{j=1}^{N-1} D_B(j)    (9)

RV_B = (1/(N-1)) Σ_{j=1}^{N-1} (D_B(j) - RA_B)²    (10)

where D_B(j) is the relative difference of the code rates of consecutive pairs of B frames:

D_B(j) = |NB_B(j+1) - NB_B(j)| / NB_B(j),  j = 1, 3, …, N-1    (11)
g) RDIP: the ratio of the I-frame code rate difference to the P-frame code rate difference of two adjacent groups of pictures, computed as:

RDIP = (I2 - I1) / (P2 - P1)    (12)

where I1 is the I-frame code rate of the previous group of pictures and P1 is the code rate of the first P frame following I1; I2 is the I-frame code rate of the current group of pictures and P2 is the code rate of the first P frame following I2;
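A subset of the code rate feature quantities above (M, N, NB_I, RPI, RBI, RAP, RVP) can be sketched as follows, assuming the per-frame bit counts of one GOP have already been obtained by partial decoding; the function and input names are illustrative.

```python
# Sketch of part of the code rate feature vector (formulas (4)-(8)).
# nb_i: I-frame bit count; nb_p, nb_b: lists of P- and B-frame bit counts.
def rate_features(nb_i, nb_p, nb_b):
    m, n = len(nb_p), len(nb_b)
    rpi = sum(p / nb_i for p in nb_p) / m                    # (4)
    rbi = sum(b / nb_i for b in nb_b) / n                    # (5)
    # (8): relative differences of adjacent P-frame code rates
    d_p = [abs(nb_p[j + 1] - nb_p[j]) / nb_p[j] for j in range(m - 1)]
    ra_p = sum(d_p) / (m - 1)                                # (6)
    rv_p = sum((d - ra_p) ** 2 for d in d_p) / (m - 1)       # (7)
    return {"M": m, "N": n, "NBI": nb_i, "RPI": rpi, "RBI": rbi,
            "RAP": ra_p, "RVP": rv_p}
```

The B-frame quantities (9)-(11) and RDIP (12) follow the same pattern over the B-frame list and the adjacent GOP.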
v. The texture features comprise the following 7 groups of feature quantities. Q(i)_k denotes the quantization parameter of the i-th macroblock in a video frame of type k, where k is one of I frame, P frame, and B frame; QS(i)_k denotes the length of the i-th run of consecutive macroblocks with identical quantization parameter in a frame of type k; and QD(i)_k denotes the quantization parameter difference of the i-th pair of adjacent macroblocks:
a) QA_k, QV_k, k ∈ {I, P, B}: the mean and variance of Q(i)_k of frames of type k in a group of pictures;
b) QMA_k, QMI_k, k ∈ {I, P, B}: the maximum and minimum of QS(i)_k of frames of type k in a group of pictures;
c) QSA_k, QSV_k, k ∈ {I, P, B}: the mean and variance of QS(i)_k of frames of type k in a group of pictures;
d) QMD_k, k ∈ {I, P, B}: the maximum of QD(i)_k of frames of type k in a group of pictures;
e) QAD_k, QVD_k, k ∈ {I, P, B}: the mean and variance of QD(i)_k of frames of type k in a group of pictures;
f) ADI: the absolute frame difference between the I frames of two adjacent groups of pictures;
g) HEP_k, k ∈ {I, P, B}: the ratio of the high-frequency energy of frames of each type to the total energy in the group of pictures;
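The per-frame quantities underlying the texture features above can be sketched as follows; `qp` stands in for the list of macroblock quantization parameters of one frame obtained by partial decoding, and the function name is illustrative.

```python
# Sketch of the per-frame texture quantities: the mean/variance of Q(i),
# the run lengths QS(i) of identical quantization parameters, and the
# adjacent-macroblock differences QD(i).
def qp_frame_stats(qp):
    n = len(qp)
    qa = sum(qp) / n                            # mean of Q(i)
    qv = sum((q - qa) ** 2 for q in qp) / n     # variance of Q(i)
    # QS(i): run lengths of consecutive identical quantization parameters
    runs, length = [], 1
    for prev, cur in zip(qp, qp[1:]):
        if cur == prev:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    # QD(i): absolute differences of adjacent macroblock parameters
    qd = [abs(b - a) for a, b in zip(qp, qp[1:])]
    return qa, qv, runs, qd
```

The GOP-level quantities a)-e) are then the max/min/mean/variance of these values over all frames of each type.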
vi. The motion vector features comprise the following feature quantities. MV(k; x, y) denotes the motion vector of the macroblock at position (x, y) in a frame of type k, and MVH(k; x, y) and MVV(k; x, y) are its horizontal and vertical components, respectively:
a) MX, MY: the maxima of the horizontal and vertical components of the motion vectors MV(k; x, y);
b) MZ: the static macroblock feature quantity:

MZ = (MM + MS) / 2    (13)

where MM and MS are defined as:

MM = min_n ( Σ_{x=1}^{8} Σ_{y=1}^{8} |X_M(x,y; n) - X_M^R(x,y; n)| ),  n = 1, 2, …    (14)

MS = max_m ( Σ_{x=1}^{8} Σ_{y=1}^{8} |X_S(x,y; m) - X_S^R(x,y; m)| ),  m = 1, 2, …    (15)

where X_M(x,y; n) is the pixel value at position (x,y) of the n-th moving macroblock of the current frame and X_M^R(x,y; n) is the pixel value at the corresponding position of its reference frame; similarly, X_S(x,y; m) is the pixel value at position (x,y) of the m-th static macroblock of the current frame and X_S^R(x,y; m) is the pixel value at the corresponding position in the reference frame;
c) MAX_k, MAY_k, MDX_k, MDY_k, k ∈ {P, B}: the means and variances of the relative errors of the horizontal and vertical components of the motion vectors. The relative error is the distance between the motion vector MV(x, y) obtained by decoding and the optimal motion vector MV_0(x, y), where MV_0(x, y) is obtained with a full-search algorithm based on TM5.
The relative errors of the horizontal and vertical components are computed as:

F_H(k; x, y) = |(MVH(k; x, y) - MVH_0(k; x, y)) / MVH_0(k; x, y)|    (16)

F_V(k; x, y) = |(MVV(k; x, y) - MVV_0(k; x, y)) / MVV_0(k; x, y)|    (17)

where MVH_0(k; x, y) and MVV_0(k; x, y) are the horizontal and vertical components of the optimal motion vector MV_0 of a predictive frame of type k;
d) MC: the matching criterion feature quantity:

MC = (1/m) Σ_x Σ_y R_m(x, y)    (18)

where R_m(x, y) is the matching attribute of the macroblock at position (x, y) in the m-th P frame, defined as:

R(x, y) = 1 if min_{i,j} MAE(MVH + i, MVV + j) = MAE(MVH, MVV), i, j = -1, 0, 1; 0 otherwise    (19)

where the function MAE(x, y) computes the mean absolute difference between the current macroblock and the reference macroblock indicated by the motion vector (x, y);
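The static macroblock feature quantity b) above can be sketched as follows: the threshold that separates moving from static macroblocks is estimated as the midpoint between the smallest SAD among moving blocks (MM) and the largest SAD among static blocks (MS). The 8×8 block representation and function names are illustrative assumptions.

```python
# Sketch of the static-macroblock feature MZ (formulas (13)-(15)).
# Blocks are 8x8 grids (lists of rows); moving/static are lists of
# (current_block, reference_block) pairs taken from partial decoding.
def sad(cur, ref):
    # sum of absolute differences over the 8x8 block
    return sum(abs(a - b) for rc, rr in zip(cur, ref) for a, b in zip(rc, rr))

def static_threshold(moving, static):
    mm = min(sad(c, r) for c, r in moving)   # (14)
    ms = max(sad(c, r) for c, r in static)   # (15)
    return (mm + ms) / 2                     # (13)
```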
(4) Establish three classifiers and train each of them. Different classifiers are used for groups of pictures of different activity: for high-activity groups of pictures, only the motion vector features are used; for low-activity groups of pictures, only the code rate features and texture features are used; for medium-activity groups of pictures, all three groups of feature quantities are used simultaneously;
(5) Read the video sequence of the video source to be detected. For each of its groups of pictures, repeat steps (2) to (4) to obtain the feature vector of the group, select the corresponding classifier according to its activity, and give a classification result;
(6) Combine the classification results of all groups of pictures of the video sequence to make the final judgment.
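Steps (4)-(6) can be sketched as follows. The classifiers are passed in as callables; a simple majority vote stands in here for the combination of per-GOP results (the feature keys and function names are illustrative assumptions).

```python
from collections import Counter

# Dispatch each GOP to the classifier matching its activity class, then
# fuse the per-GOP labels by majority vote.
def classify_video(gops, clf_high, clf_mid, clf_low):
    votes = []
    for g in gops:  # g: dict with "activity", "mv", "rate", "texture"
        if g["activity"] == "high":
            votes.append(clf_high(g["mv"]))
        elif g["activity"] == "medium":
            votes.append(clf_mid(g["rate"] + g["texture"] + g["mv"]))
        else:
            votes.append(clf_low(g["rate"] + g["texture"]))
    return Counter(votes).most_common(1)[0][0]
```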
The present invention is mainly designed to meet the needs of video resource forensics, namely to learn what type of video camera recorded a video segment. The source detection technique designed here effectively verifies the authenticity of a video, accomplishes the first step of media authentication, and provides evidence for judicial organs. The notable characteristics of the video source forensic technique of the present invention include:
(1) Practicality: video source forensics can be carried out by the forensic party alone. The source forensics technique distinguishes encoders directly from the built-in attributes and statistical properties of the video stream, without requiring any preprocessing of the digital video in advance (such as embedding a digital watermark at the encoder side), so it is highly practical.
(2) Real-time performance: the feature extraction module extracts all feature quantities in the compressed domain, i.e., only partial decoding is needed, so very little system resource and time are required.
(3) Novelty: in the activity and complexity analysis module, each group of pictures (GOP) is judged for activity and classified as high, medium, or low activity, and videos of different activity use different feature quantities and classifiers. This makes the method targeted and flexible enough to adapt to various video resources, greatly improving the accuracy of video source detection.
Description of drawings
Fig. 1 is the flow chart of the overall video source forensics method of the present invention;
Fig. 2 is the flow chart of the feature extraction module of the present invention;
Fig. 3 is the flow chart of the comprehensive decision device of the present invention.
Embodiment
The video source forensic technique of the present invention mainly consists of an activity and complexity analysis module, a feature extraction module, classifiers, and a comprehensive decision device. Fig. 1 depicts the process of the overall video source forensics. In step A, the activity of each GOP is calculated by the activity and complexity analysis module and, using double thresholds, classified as high, medium, or low activity. The calculation is as follows:
First calculate the energy difference of the luminance components of each pair of adjacent frames in the GOP:

fd(x,y) = |f1(x,y) - f2(x,y)|    (1)

where f1(x,y) and f2(x,y) denote the DC coefficient values of the luminance block at position (x,y) in the 1st and 2nd frames, respectively. The total average energy difference is then:

Fd = (1/M) Σ_x Σ_y fd(x,y)    (2)

where M denotes the number of blocks in a frame. The energy variance of the group of pictures (GOP) is then calculated to decide whether the segment is a high-, medium-, or low-activity GOP:

Z = (1/(n-1)) Σ_{i=1}^{n} |Fd(i)|²    (3)

where Fd(i) is the average energy difference of each pair of adjacent frames calculated in the previous step, i is the frame index, and n is the number of frames in the GOP. Finally, define two thresholds T1 and T2 (T1 < T2): if Z > T2, mark the GOP as high activity; if T1 < Z ≤ T2, mark it as medium activity; otherwise mark it as low activity.
In step B, the feature extraction module first decodes the data, obtains three types of raw data, and then extracts the three categories of features: code rate features, texture features, and motion vector features. In step C, based on the result of the activity and complexity analysis module, the corresponding feature vector and classifier are selected to classify the group of pictures under test. In step D, the classification results of all groups of pictures are combined and the method of maximum likelihood estimation is used to make the final judgment.
Fig. 2 depicts the flow chart of the feature extraction module. First, the data analyzer partially decodes the video resource and obtains three types of raw data. These are then fed into the code rate feature extractor, the texture feature extractor, and the motion vector feature extractor, which output the three categories of feature vectors.
1. Code rate feature vector: bit allocation is the first step of a rate control strategy and fully reflects the different design philosophies of encoder designers. Before coding a frame, the rate controller pre-allocates the bit count of the current frame according to many parameters, such as the frame type, the bit counts of previous frames of the same type, the current buffer status, and the complexity of the current frame. Conversely, from the coded size, i.e., the code rate, of each type of frame in each group of pictures (GOP) (there are three types: I frames, P frames, B frames), we can extract feature quantities that model the differences between encoders in reverse. We use NB_I to denote the code rate of the I frame in a GOP, NB_P(i) the code rate of the i-th P frame, and NB_B(j) the code rate of the j-th B frame.
(1) M, N: the numbers of P frames and B frames in the GOP; they mainly reflect the frame structure of a group of pictures.
(2) NBI: the code rate of the I frame in the GOP; it mainly embodies the code rate baseline of the encoder.
(3) RPI: the ratio of the average code rate of the P frames to the code rate of the I frame in the GOP; it reflects the code rate pre-allocation scheme of the encoder. It is computed as:

RPI = (1/M) Σ_{i=1}^{M} NB_P(i) / NB_I    (4)

(4) RBI: the ratio of the average code rate of the B frames to the code rate of the I frame in the GOP; it reflects the code rate pre-allocation scheme of the encoder. It is computed as:

RBI = (1/N) Σ_{i=1}^{N} NB_B(i) / NB_I    (5)
(5) RAP, RVP: the mean and variance of the relative difference of the code rates of adjacent P frames in the GOP; they reflect the encoder's fine-tuning capability for P frames.

RA_P = (1/(M-1)) Σ_{j=1}^{M-1} D_P(j)    (6)

RV_P = (1/(M-1)) Σ_{j=1}^{M-1} (D_P(j) - RA_P)²    (7)

where D_P(j) is the relative difference of the code rates:

D_P(j) = |NB_P(j+1) - NB_P(j)| / NB_P(j),  j = 1, 2, …, M-1    (8)
(6) RAB, RVB: the mean and variance of the relative difference of the code rates of consecutive pairs of B frames in the GOP; they reflect the encoder's fine-tuning capability for B frames.

RA_B = (1/(N-1)) Σ_{j=1}^{N-1} D_B(j)    (9)

RV_B = (1/(N-1)) Σ_{j=1}^{N-1} (D_B(j) - RA_B)²    (10)

where D_B(j) is the relative difference of the code rates:

D_B(j) = |NB_B(j+1) - NB_B(j)| / NB_B(j),  j = 1, 3, …, N-1    (11)
(7) RDIP: the ratio of the I-frame code rate difference to the P-frame code rate difference of two adjacent GOPs, computed as:

RDIP = (I2 - I1) / (P2 - P1)    (12)

where I1 is the I-frame code rate of the previous GOP and P1 is the code rate of the first P frame following I1; likewise, I2 is the I-frame code rate of the current GOP and P2 is the code rate of the first P frame following I2.
The code rate feature vector formed by these feature quantities effectively reflects the differences among encoder rate control strategies in pre-allocation, fine-tuning, and so on, and can effectively distinguish the sources of low-activity video segments. For example, if an encoder emphasizes spatial quality, it strengthens the code rate of the I frame, so RPI falls; conversely, if it emphasizes temporal quality, it raises the code rate of the P frames, so RPI rises. As another example, the relative code rate difference reflects how finely different encoders redistribute bits among adjacent frames: some simple or real-time encoders essentially never change the bit allocation of adjacent B frames.
2. Texture feature vector: different encoders adopt different rate control strategies according to their concrete needs, but the means of stabilizing the output stream are the same, namely adjusting three parameters: the quantization parameter, the frame rate, and the coding mode of inter blocks. The latter two parameters are used to handle abnormal situations such as buffer exceptions; changing the quantization parameter is the main means of achieving the rate control target. The variation pattern of the quantization parameter likewise reflects the differences between encoders. First, the data analyzer partially decodes the video data to obtain the quantization parameter distribution of every frame in the GOP, and then the various feature quantities are extracted. We use Q(i)_k to denote the quantization parameter of the i-th macroblock in a video frame of type k (I frame, P frame, or B frame); QS(i)_k to denote the length of the i-th run of consecutive macroblocks with identical quantization parameter; and QD(i)_k to denote the quantization parameter difference of the i-th pair of adjacent macroblocks.
(1) QA_k, QV_k, k ∈ {I, P, B}: the mean and variance of Q(i)_k of frames of type k in a group of pictures (GOP).
(2) QMA_k, QMI_k, k ∈ {I, P, B}: the maximum and minimum of QS(i)_k of frames of type k in a GOP.
(3) QSA_k, QSV_k, k ∈ {I, P, B}: the mean and variance of QS(i)_k of frames of type k in a GOP.
(4) QMD_k, k ∈ {I, P, B}: the maximum of QD(i)_k of frames of type k in a GOP.
(5) QAD_k, QVD_k, k ∈ {I, P, B}: the mean and variance of QD(i)_k of frames of type k in a GOP.
(6) ADI: the absolute frame difference between two adjacent I frames.
(7) HEP_k, k ∈ {I, P, B}: the ratio of the high-frequency energy of frames of each type to the total energy in the GOP.
These feature quantities are combined to form the texture feature vector, which reflects differences in the rate control strategies of encoders. For example, if a rate control strategy emphasizes the quality of the whole frame, it rarely changes the quantization parameters of adjacent macroblocks significantly; conversely, if it tends to output a stable bit-rate stream, it introduces a fine-tuning mechanism that modulates the quantization parameter of the current macroblock according to the current buffer capacity and the quantization parameter of the previous macroblock.
3. Motion vector feature vector: the motion estimation algorithm is the core module of inter-frame predictive coding, accounting for roughly 70% of the computational load of the whole coding system, and is the key to improving overall system performance. When designing an encoder, the designer weighs coding real-time performance against coding efficiency according to concrete needs. Different coding emphases are therefore also embodied as different statistical distributions of the motion vectors. The data analyzer first obtains the motion vector distribution of each predictive frame (P frame or B frame) in the GOP, and then the various feature quantities are extracted. We use MV(k; x, y) to denote the motion vector of the macroblock at position (x, y) in a frame of type k, and MVH(k; x, y) and MVV(k; x, y) its horizontal and vertical components.
(1) MX, MY: the maxima of the horizontal and vertical components of the motion vectors MV(k; x, y).
(2) MZ: the static macroblock feature quantity; it reflects the threshold by which the motion estimation algorithm judges whether the current macroblock is static. A static macroblock is an inter macroblock in a P frame or B frame whose motion vector is zero.

MZ = (MM + MS) / 2    (13)

where MM and MS are defined as:

MM = min_n ( Σ_{x=1}^{8} Σ_{y=1}^{8} |X_M(x,y; n) - X_M^R(x,y; n)| ),  n = 1, 2, …    (14)

MS = max_m ( Σ_{x=1}^{8} Σ_{y=1}^{8} |X_S(x,y; m) - X_S^R(x,y; m)| ),  m = 1, 2, …    (15)

where X_M(x,y; n) is the pixel value at position (x,y) of the n-th moving macroblock of the current frame and X_M^R(x,y; n) is the pixel value at the corresponding position of its reference frame. Similarly, X_S(x,y; m) is the pixel value at position (x,y) of the m-th static macroblock of the current frame and X_S^R(x,y; m) is the pixel value at the corresponding position in the reference frame.
(3) MAX_k, MAY_k, MDX_k, MDY_k, k ∈ {P, B}: the means and variances of the relative errors of the horizontal and vertical components of the motion vectors. The relative error here is the distance between the motion vector MV(x, y) obtained by decoding and the optimal motion vector MV_0(x, y). To evaluate the performance of the motion estimation algorithm, a full-search algorithm based on TM5 (MPEG-2 Test Model 5) is used to re-estimate the motion and obtain the optimal motion vector MV_0(x, y).
The relative error of the horizontal component is computed as:

F_H(k; x, y) = |(MVH(k; x, y) - MVH_0(k; x, y)) / MVH_0(k; x, y)|    (16)

The relative error of the vertical component is computed as:

F_V(k; x, y) = |(MVV(k; x, y) - MVV_0(k; x, y)) / MVV_0(k; x, y)|    (17)

where MVH_0(k; x, y) and MVV_0(k; x, y) are the horizontal and vertical components of the optimal motion vector MV_0 of a predictive frame of type k.
(4) MC: the matching criterion feature quantity.

MC = (1/m) Σ_x Σ_y R_m(x, y)    (18)

where R_m(x, y) is the matching attribute of the macroblock at position (x, y) in the m-th P frame, defined as:

R(x, y) = 1 if min_{i,j} MAE(MVH + i, MVV + j) = MAE(MVH, MVV), i, j = -1, 0, 1; 0 otherwise    (19)

where the function MAE(x, y) computes the mean absolute difference between the current macroblock and the reference macroblock indicated by the motion vector (x, y). That is, R_m(x, y) = 1 means that the mean absolute difference between the current macroblock and its reference macroblock is the minimum within the 3×3 neighborhood of the reference macroblock. MC reflects how close the matching criterion adopted by the encoder is to the mean absolute difference criterion.
The motion vector feature vector formed by these feature quantities reflects differences among encoders' motion estimation algorithms. For example, although the coding standard specifies a maximum search window, a practical encoder may define a much smaller search window to reduce computational complexity. As another example, a real-time encoder may adopt a larger static macroblock threshold so that more macroblocks are judged static, improving coding speed, while other encoders may adopt a smaller threshold in order to make full use of predictive coding, which can greatly improve coding efficiency.
Fig. 3 shows the flow chart of the comprehensive decision device. This patent takes the group of pictures (GOP) as the detection unit: after the three classes of feature vectors are extracted, a classifier is selected according to the activity of the GOP, so that GOPs of different activity are classified in different ways. Classifier 1 handles high-activity GOPs and uses only the motion-vector feature vector; classifier 2 handles medium-activity GOPs and uses the motion-vector, bit-rate, and texture feature vectors together; classifier 3 handles low-activity GOPs and uses only the bit-rate and texture feature vectors. The same operation is then performed on the remaining GOPs, each yielding a judgment, and finally the classification results of all GOPs in the video sequence are combined with a maximum-likelihood estimate to make the final decision.
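The decision stage described above can be sketched as follows; the dictionary-based dispatch and the simple vote used to stand in for the maximum-likelihood combination (equivalent to it under equal priors and equal per-GOP confidence) are assumptions for illustration:

```python
from collections import Counter

def classify_sequence(gops, classifiers):
    """Classify one video sequence from its GOPs.

    `gops`: list of dicts, each with an 'activity' key ('high', 'medium',
    or 'low') plus whatever features its classifier needs.
    `classifiers`: maps an activity level to a callable that returns an
    encoder label for one GOP.
    The sequence-level label is the encoder chosen by the most GOPs."""
    votes = Counter(classifiers[g["activity"]](g) for g in gops)
    label, _ = votes.most_common(1)[0]
    return label
```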
Before carrying out digital video source forensics, a sample library for training must first be established. It contains video clips captured by cameras and video sequences produced by various software encoders; all of these are original compressed sequences. Each GOP in the library is then partially decoded and its features extracted to obtain feature vectors, and the classifiers are built and trained with the corresponding feature vectors. For a GOP whose source is to be identified, the same method is used to obtain its feature vector; the corresponding classifier then makes a judgment and reports which encoder produced the video sequence, completing the detection of the digital video source.
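The training stage can be sketched as follows. Any classifier object exposing a scikit-learn-style `fit` method fits this sketch; the tuple layout of the sample library and the per-activity bucketing are assumptions for illustration:

```python
from collections import defaultdict

def train_classifiers(sample_library, make_classifier):
    """Train one classifier per activity level.

    `sample_library`: list of (activity, feature_vector, encoder_label)
    tuples, one per GOP from the training videos.
    `make_classifier`: factory returning a fresh untrained classifier
    exposing fit(X, y)."""
    buckets = defaultdict(lambda: ([], []))
    for activity, features, label in sample_library:
        xs, ys = buckets[activity]
        xs.append(features)
        ys.append(label)
    classifiers = {}
    for activity, (xs, ys) in buckets.items():
        clf = make_classifier()
        clf.fit(xs, ys)   # learn encoder labels for this activity level
        classifiers[activity] = clf
    return classifiers
```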

Claims (1)

1. A digital video source forensics method, comprising the following steps:
(1) Establish a sample library for training. The sample library includes video sequences captured by cameras and video sequences produced by various software encoders; all of these video sequences are original compressed sequences;
(2) Pass the video sequence through the activity and complexity analysis module, compute the activity of each group of pictures, and use two thresholds to classify it as high activity, medium activity, or low activity. The activity is computed as follows:
i. First compute, according to formula (1), the energy difference fd(x, y) of the luminance components of each pair of adjacent frames in a group of pictures (GOP):
fd(x, y) = |f1(x, y) - f2(x, y)|    (1)
where f1(x, y) and f2(x, y) are the DC coefficient values of the luminance block at position (x, y) in the first and second frame, respectively;
ii. Then compute the total mean energy difference Fd:
Fd = (1/S) Σx Σy fd(x, y)    (2)
where S is the number of blocks in a frame and fd(x, y) is the energy difference computed in step i;
iii. Finally compute the energy variance of the group of pictures according to formula (3), and use it to decide whether the segment belongs to a high-activity, medium-activity, or low-activity group of pictures:
Z = (1/(n-1)) Σi (Fd(i) - avg(Fd))²    (3)
where avg(Fd) denotes the mean of Fd(i) over the n-1 adjacent-frame pairs of the group of pictures;
where Fd(i) is the average energy difference between each pair of adjacent frames, i is the frame index, and n is the number of frames contained in a group of pictures. Finally, two thresholds T1 and T2 with T1 < T2 are defined: if Z > T2, the group is marked as a high-activity group of pictures; if T2 > Z > T1, it is marked as a medium-activity group of pictures; otherwise it is marked as a low-activity group of pictures;
(3) Then partially decode the video sequence, obtain the various kinds of information in its compressed domain, and extract three classes of features: bit-rate features, texture features, and motion-vector features;
i. The bit-rate features consist of the following 7 groups of feature quantities, where NBI denotes the bit rate of the I frame in a group of pictures, NBP(i) denotes the bit rate of the i-th P frame, and NBB(j) denotes the bit rate of the j-th B frame:
a) M, N: the number of P frames and the number of B frames in a group of pictures, respectively;
b) NBI: the bit rate of the I frame in a group of pictures;
c) RPI: the ratio of the average bit rate of the P frames in a group of pictures to the bit rate of the I frame, computed as:
RPI = ((1/M) Σi=1..M NBP(i)) / NBI
d) RBI: the ratio of the average bit rate of the B frames in a group of pictures to the bit rate of the I frame, computed as:
RBI = ((1/N) Σj=1..N NBB(j)) / NBI
e) RAP, RVP: the mean and variance, respectively, of the relative differences between the bit rates of adjacent P frames in a group of pictures:
RAP = (1/(M-1)) Σj=1..M-1 DP(j)
RVP = (1/(M-1)) Σj=1..M-1 (DP(j) - RAP)²
where DP(j) is the relative difference between the bit rates of two adjacent P frames, computed as:
DP(j) = |NBP(j+1) - NBP(j)| / NBP(j)
f) RAB, RVB: the mean and variance, respectively, of the relative differences between the bit rates of consecutive B frames in a group of pictures:
RAB = (1/(N-1)) Σj=1..N-1 DB(j)
RVB = (1/(N-1)) Σj=1..N-1 (DB(j) - RAB)²
where DB(j) is the relative difference between the bit rates of two consecutive B frames, computed as:
DB(j) = |NBB(j+1) - NBB(j)| / NBB(j)
g) RDIP: the ratio of the I-frame bit-rate difference to the P-frame bit-rate difference of two adjacent groups of pictures, computed as:
RDIP = (I1 - I2) / (P1 - P2)
where I1 is the I-frame bit rate of the previous group of pictures and P1 is the bit rate of the first P frame immediately following I1; I2 is the I-frame bit rate of the current group of pictures and P2 is the bit rate of the first P frame immediately following I2;
ii. The texture features consist of the following 7 groups of feature quantities, where Q(i)k denotes the quantization parameter of the i-th macroblock in a frame of type k, k being one of I frame, P frame, and B frame; QS(i)k denotes the number of macroblocks in the i-th run of consecutive macroblocks with the same quantization parameter in a type-k frame; and QD(i)k denotes the quantization-parameter difference of the i-th pair of adjacent macroblocks:
a) QAk, QVk, k ∈ {I, P, B}: the mean and variance of Q(i)k over the type-k frames of a group of pictures;
b) QMAk, QMIk, k ∈ {I, P, B}: the maximum and minimum of QS(i)k over the type-k frames of a group of pictures;
c) QSAk, QSVk, k ∈ {I, P, B}: the mean and variance of QS(i)k over the type-k frames of a group of pictures;
d) QMDk, k ∈ {I, P, B}: the maximum of QD(i)k over the type-k frames of a group of pictures;
e) QADk, QVDk, k ∈ {I, P, B}: the mean and variance of QD(i)k over the type-k frames of a group of pictures;
f) ADI: the absolute frame difference between the I frames of two adjacent groups of pictures;
g) HEPk, k ∈ {I, P, B}: the ratio of the high-frequency energy of each type of frame in a group of pictures to its overall energy;
iii. The motion-vector features consist of the following groups of feature quantities, where MV(k; x, y) denotes the motion vector of the macroblock at position (x, y) in a type-k frame, and MVH(k; x, y) and MVV(k; x, y) are its horizontal and vertical components, respectively:
a) MX, MY: the maxima of the horizontal and vertical components of the motion vectors MV(k; x, y);
b) MZ: the static-macroblock feature quantity:
Figure FDA0000138723830000031
where MM and MS are defined as follows:
Figure FDA0000138723830000032
Figure FDA0000138723830000033
where XM(x, y; n) is the pixel value at position (x, y) of the n-th moving macroblock in the current frame, paired with the pixel value at the corresponding position in its reference frame; similarly, XS(x, y; m) is the pixel value at position (x, y) of the m-th static macroblock in the current frame, paired with the pixel value at the corresponding position in its reference frame;
c) MAXk, MAYk, MDXk, MDYk, k ∈ {P, B}: the means and variances, respectively, of the relative errors of the motion vectors in the horizontal and vertical components. The relative error is the distance between the motion vector MV(x, y) obtained by the current decoding and the optimal motion vector MV0(x, y), the latter being obtained with a TM5-based full-search algorithm;
The relative errors of the horizontal and vertical components are computed as follows:
FH(k; x, y) = |(MVH(k; x, y) - MVH0(k; x, y)) / MVH0(k; x, y)|
FV(k; x, y) = |(MVV(k; x, y) - MVV0(k; x, y)) / MVV0(k; x, y)|
where MVH0(k; x, y) and MVV0(k; x, y) are the horizontal and vertical components of the optimal motion vector MV0(x, y) of a type-k predicted frame;
d) MC: the matching-criterion feature quantity:
MC = (1/m) Σx Σy Rm(x, y)
where Rm(x, y) is the matching factor of the macroblock located at (x, y) in the m-th P frame, defined as:
R(x, y) = 1 if min over i, j = -1, 0, 1 of MAE(i + MVH, j + MVV) equals MAE(MVH, MVV); otherwise R(x, y) = 0
where the function MAE(x, y) computes the mean absolute difference between the current macroblock and the reference macroblock indicated by the motion vector (x, y);
(4) Establish 3 classifiers and train them separately. For groups of pictures of different activity, different classifiers are used: high-activity groups use only the motion-vector features, low-activity groups use only the bit-rate and texture features, and medium-activity groups use all three groups of feature quantities for classification;
(5) Read the video sequence of the video source to be detected; for each of its groups of pictures, repeat steps (2) to (4) to obtain the feature vector of the group of pictures, select the corresponding classifier according to its activity, and output the classification result;
(6) Combine the classification results of all groups of pictures of the video sequence to make the final decision.
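The activity computation of step (2) can be sketched as follows; the nested-list layout of the DC coefficients and the example threshold values are illustrative assumptions (the patent does not fix T1 and T2), and the variance normalization over the n-1 adjacent-frame pairs is one reasonable reading of formula (3):

```python
def gop_activity(dc_frames, t1, t2):
    """Classify a GOP as 'high', 'medium', or 'low' activity.

    `dc_frames`: one 2-D list per frame holding the DC coefficients of
    its luminance blocks; `t1` < `t2` are the two thresholds."""
    # Fd(i): mean absolute DC difference between adjacent frames (formulas (1)-(2))
    fds = []
    for a, b in zip(dc_frames, dc_frames[1:]):
        diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
        fds.append(sum(diffs) / len(diffs))
    # Z: variance of Fd(i) over the GOP (formula (3))
    mean_fd = sum(fds) / len(fds)
    z = sum((f - mean_fd) ** 2 for f in fds) / len(fds)
    if z > t2:
        return "high"
    if z > t1:
        return "medium"
    return "low"
```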
CN 201010126186 2010-03-17 2010-03-17 Digital Video Source Forensics Method Expired - Fee Related CN101835040B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010126186 CN101835040B (en) 2010-03-17 2010-03-17 Digital Video Source Forensics Method


Publications (2)

Publication Number Publication Date
CN101835040A CN101835040A (en) 2010-09-15
CN101835040B true CN101835040B (en) 2012-07-04

Family

ID=42718944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010126186 Expired - Fee Related CN101835040B (en) 2010-03-17 2010-03-17 Digital Video Source Forensics Method

Country Status (1)

Country Link
CN (1) CN101835040B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102122349B * 2011-02-16 2014-01-29 哈尔滨工业大学 Method of Constructing a Multi-class Support Vector Machine Classifier Based on Bhattacharyya Distance and Directed Acyclic Graph, Applied to a Servo Motor System
CN103034993A (en) * 2012-10-30 2013-04-10 天津大学 Digital video transcode detection method
CN103618899B * 2013-12-05 2016-08-17 福建师范大学 Video frame-insertion tampering detection method and device based on the intensity signal
CN105208388B * 2014-06-24 2019-03-05 深圳市腾讯计算机系统有限公司 Method and system for dynamically adjusting the encoding frame rate in video communication
CN104469361B * 2014-12-30 2017-06-09 武汉大学 A motion-adaptive video frame-deletion forensics method
CN105007466B * 2015-07-23 2019-04-19 熊建民 Surveillance video recording system and recording method for preventing editing
TWI554083B (en) * 2015-11-16 2016-10-11 晶睿通訊股份有限公司 Image processing method and camera thereof
CN105845132A (en) * 2016-03-22 2016-08-10 宁波大学 AAC audio recording file source identification method based on coding-parameter statistical features
CN108710893B (en) * 2018-04-04 2021-10-29 中山大学 A Feature Fusion-Based Classification Method for Digital Image Camera Source Models
CN113038142B (en) * 2021-03-25 2022-11-01 北京金山云网络技术有限公司 Video data screening method and device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5987178A (en) * 1996-02-22 1999-11-16 Lucent Technologies, Inc. Apparatus and method for a programmable video motion estimator
KR20010045766A (en) * 1999-11-08 2001-06-05 오길록 Apparatus For Motion Estimation With Control Section Implemented By State Translation Diagram
CN1719898A (en) * 2005-05-25 2006-01-11 中山大学 A Method of Protecting MPEG-2 Video Data


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yuting Su et al. "A Source Video Identification Algorithm Based on Features in Video Stream." 2008 International Workshop on Education Technology and Training & 2008 International Workshop on Geoscience and Remote Sensing, 2008. *
Yuting Su et al. "A Source Video Identification Algorithm Based on Motion Vectors." 2009 Second International Workshop on Computer Science and Engineering, 2009. *


Similar Documents

Publication Publication Date Title
CN101835040B (en) Digital Video Source Forensics Method
CN102124489B (en) Signature derivation for images
CN108235001B (en) Deep sea video quality objective evaluation method based on space-time characteristics
CN109635791B (en) A video forensics method based on deep learning
Ravi et al. Compression noise based video forgery detection
CN113536990A (en) Deep fake face data identification method
CN101790097B (en) Method for detecting multiple compression encodings of digital video
Akbari et al. A new forensic video database for source smartphone identification: Description and analysis
Chen et al. Unsupervised curriculum domain adaptation for no-reference video quality assessment
CN101931821B (en) Video transmission error control method and system
Hong et al. Detection of frame deletion in HEVC-Coded video in the compressed domain
Nam et al. Two-stream network for detecting double compression of H. 264 videos
CN106097241A (en) Reversible information hiding method based on eight-neighborhood pixels
CN113033379A (en) Intra-frame forensics deep learning method based on a two-stream CNN
CN111212291A (en) DFL-CNN network-based video intra-frame object removal tamper detection method
He et al. Exposing fake bitrate videos using hybrid deep-learning network from recompression error
Goodwin et al. Blind video tamper detection based on fusion of source features
Bakas et al. MPEG double compression based intra-frame video forgery detection using CNN
CN102857831A (en) H.264 video integrity authentication method
Yang et al. Blind VQA on 360° video via progressively learning from pixels, frames, and video
Yao et al. An approach to detect video frame deletion under anti-forensics
Tan et al. Hybrid deep-learning framework for object-based forgery detection in video
CN106529405A (en) Local anomaly behavior detection method based on video image block model
CN110503049B (en) A method for estimating the number of vehicles in satellite video based on generative adversarial network
Wang et al. Steganalysis of JPEG images by block texture based segmentation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120704
