CN104077590A - Video fingerprint extraction method and system - Google Patents

Video fingerprint extraction method and system

Info

Publication number
CN104077590A
CN104077590A (also published as CN 104077590 A); application CN201410307572.4A
Authority
CN
China
Prior art keywords
information
video
hash codes
moving objects
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410307572.4A
Other languages
Chinese (zh)
Inventor
吴金勇
孙威
王军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Security and Surveillance Technology PRC Inc
Original Assignee
China Security and Surveillance Technology PRC Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Security and Surveillance Technology PRC Inc filed Critical China Security and Surveillance Technology PRC Inc
Priority to CN201410307572.4A priority Critical patent/CN104077590A/en
Publication of CN104077590A publication Critical patent/CN104077590A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of communication information security and provides a video fingerprint extraction method and system. The method comprises: preprocessing video information and extracting its Y-channel information; obtaining three levels of information from the Y-channel information, deriving different feature information for each of the three levels, and performing hash feature extraction on each of the derived features to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code; and gathering the video-clip, frame-image and moving-object-level hash codes and organizing them into a tree structure. The three levels of hash codes are combined and placed in one-to-one correspondence with the actual structure of the video clip to build the tree structure. By comparing hash-code tree structures it can be judged directly whether frame information has been tampered with, and the tampered position can be located quickly, giving strong tamper detection capability and high security.

Description

Video fingerprint extraction method and system
Technical field
The present invention relates to the technical field of communication information security, and in particular to a video fingerprint extraction method and system.
Background technology
With the development of human society, public-security problems attract more and more attention. Video surveillance systems, as a fast and effective auxiliary means of addressing public-security problems, are increasingly applied in daily life, but their application also produces a huge and cluttered volume of video information. Information transmission technology has likewise made great progress, so that large amounts of information can be exchanged and shared among different users. However, information must be transmitted over public channels, and the security of public channels cannot be effectively guaranteed, so the receiving party cannot be sure of the integrity of the received information. Faced with the demand for security evaluation of such an enormous amount of video information, relying on manual evaluation is an almost impossible task.
Video fingerprint extraction generates a unique fingerprint by processing a video, and the security of the video is evaluated by comparing fingerprints. The video fingerprint extraction method is the key technology in a video information security evaluation system: it characterizes a video by extracting features of the video content and producing a binary character string, with the property that videos of similar content yield similar strings. Current video fingerprint extraction methods mainly extract key frame images from the video, apply image perceptual hashing to the key frames to extract perceptual hash features, and then concatenate the hash codes obtained from the key frames to form the video fingerprint. The advantage of such methods is that they reuse relatively mature image perceptual hashing techniques to realize similarity comparison of videos; their shortcoming is that hash features are extracted only from key frame images, so tampering with non-key frames cannot be effectively authenticated, and the authentication result is prone to error. The prior art therefore has low security and weak tamper detection capability in video authentication.
Summary of the invention
In view of this, the present invention proposes a video fingerprint extraction method and system. The Y-channel information is extracted from the video information; different feature information is obtained from the video-clip information, the frame-image information and the moving-object-level information respectively; the different feature information is hashed to obtain multi-level hash codes; a one-to-one correspondence is established between the actual structure of the video clip and the multi-level hash codes; and a tree structure is built, so that a tampered position can be located quickly, giving strong tamper detection capability and high security.
To achieve these goals, the invention provides a video fingerprint extraction method, comprising the steps of:
preprocessing video information and extracting the Y-channel information from the video information;
obtaining three levels of information from the Y-channel information, the three levels being video-clip information, frame-image information and moving-object-level information, and obtaining different feature information for each of the three levels;
performing hash feature extraction on each of the obtained feature information, to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code;
gathering the video-clip information hash code, the frame-image information hash code and the moving-object-level information hash code, and organizing them into a tree structure, wherein information of the same level is connected in cascade, and information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video.
Wherein, obtaining the video-clip information from the Y-channel information comprises:
sampling the image frame sequence of the video clip at a fixed frame interval;
performing a weighted addition of the sampled image frame sequence to obtain a spatio-temporal frame.
Wherein, performing hash feature extraction on the video-clip information comprises:
convolving the spatio-temporal frame with a horizontal operator $G_x$ and a vertical operator $G_y$ respectively, to obtain a horizontal gradient image and a vertical gradient image;
wherein the horizontal operator $G_x$ and the vertical operator $G_y$ are respectively:
$$G_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}, \qquad G_y = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{pmatrix}$$
partitioning the horizontal gradient image and the vertical gradient image into blocks;
calculating the gray-level mean of the pixels in each block, and thresholding the block means with an adaptive threshold to obtain the video-clip information hash code.
Wherein, the adaptive threshold method is specifically: the median of the block gray-level means is taken as the adaptive threshold; a block whose mean is greater than the adaptive threshold is set to 1, otherwise it is set to 0.
Wherein, obtaining the frame-image information comprises:
normalizing the size of each whole frame image;
partitioning each size-normalized image into image blocks;
calculating the correlation coefficient between each image block and a reference matrix.
Wherein, obtaining the moving-object-level information comprises extracting moving-object regions, which specifically comprises:
adopting multi-Gaussian background modeling: each pixel is regarded as a weighted mixture of Gaussians, with the probability distribution function
$$p(I(i,j)) = \sum_{k=1}^{K} w_{ij,t}^{k}\,\eta\!\left[I(i,j),\ \mu_{ij,t}^{k},\ (\sigma_{ij,t}^{k})^{2}\right]$$
where $w_{ij,t}^{k}$ denotes the weight of the $k$-th Gaussian component of the mixture at pixel $(i,j)$ at time $t$, satisfying
$$\sum_{k=1}^{K} w_{ij,t}^{k} = 1$$
and $\eta[\cdot]$ denotes the Gaussian probability density function of $I(i,j)$, where $\mu_{ij,t}^{k}$ and $(\sigma_{ij,t}^{k})^{2}$ denote respectively the mean and the variance of the $k$-th Gaussian component at pixel $(i,j)$ at time $t$;
taking the pixel values of the first frame image as the mean of the first Gaussian component of the multi-Gaussian background, setting its variance and weight to predetermined values, and sorting the Gaussian components in order of magnitude;
from the second frame image onwards, matching the value of every pixel in each subsequent frame against the multi-Gaussian background model.
Wherein, matching the value of every pixel in each frame after the second frame against the multi-Gaussian background model comprises:
judging whether the current pixel satisfies a Gaussian distribution in the background model; if so, regarding it as background and updating the mean, variance and weight of the satisfied Gaussian distribution in the background model;
if not, regarding it as a target, taking the value of the current pixel as the mean of a new Gaussian distribution, and setting its variance and weight.
Wherein, extracting the hash feature of the moving-object-level information comprises:
normalizing the size of the moving-object region and dividing it into n*n small image blocks, where n is an integer;
applying the block DCT algorithm to the Y-channel information;
extracting 1 DC coefficient and K AC coefficients, where K is an integer in [5, 9];
thereby obtaining
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_K\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ is the $i$-th AC coefficient of the Y channel;
composing the n*n feature column vectors into a FEATURE matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{n^2}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of AC and DC coefficients of the $i$-th small image block;
binarizing each row of the FEATURE matrix with an adaptive threshold to obtain the binarized FEATURE matrix.
Wherein, connecting information of different levels with pointers according to the subordination relations between the different levels of information in the video comprises: pointing the pointer in the storage unit of the video-clip information hash code layer to the storage unit holding the hash code of the first frame image (or of the last frame image) of the video clip in the frame-image hash code layer; and pointing the pointer in a storage unit of the frame-image information hash code layer to the storage unit holding the first (or last) moving-object-level information hash code of that frame image in the moving-object hash code layer.
To achieve these goals, the present invention also provides a video fingerprint extraction system, comprising:
a preprocessing module, configured to preprocess video information and extract the Y-channel information from the video information;
an acquisition module, configured to obtain three levels of information from the Y-channel information, the three levels being video-clip information, frame-image information and moving-object-level information, and to obtain different feature information for each of the three levels;
a processing module, configured to perform hash feature extraction on each of the obtained feature information, to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code;
a construction module, configured to gather the video-clip information hash code, the frame-image information hash code and the moving-object-level information hash code and to organize them into a tree structure, wherein information of the same level is connected in cascade, and information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video.
In the video fingerprint extraction method and system provided by the invention, the Y-channel information is extracted from the video information; different feature information is obtained from the video-clip information, the frame-image information and the moving-object-level information; each feature information is hashed to obtain its corresponding hash code; and the hash codes of the three different levels are gathered and placed in one-to-one correspondence with the actual structure of the video clip to build a tree structure. Thus, when any storage unit at any level of the video clip changes, the change is directly reflected in a branch of the whole tree structure or in the hash-code unit of a leaf node. By comparing hash-code tree structures it can be judged directly whether frame information has been tampered with, and the tampered position can be located quickly, giving strong tamper detection capability and high security.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the video fingerprint extraction method of an embodiment of the present invention;
Fig. 2 is a flowchart of hash feature extraction from the video-clip information in an embodiment of the present invention;
Fig. 3 is a flowchart of hash feature extraction from the frame-image information in an embodiment of the present invention;
Fig. 4 is a flowchart of hash feature extraction from the moving-object-level information in an embodiment of the present invention;
Fig. 5 illustrates how the binary matrix is cascaded into a hash code in an embodiment of the present invention;
Fig. 6 illustrates the multi-level hash-code organization of an embodiment of the present invention;
Fig. 7 is a structural diagram of the video fingerprint extraction system of an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1:
Referring to Fig. 1 to Fig. 4, the embodiment of the present invention provides a video fingerprint extraction method, which specifically comprises the following steps:
S10, preprocessing: extract the Y-channel information from the video information;
It should be noted that in the YUV color space, the Y-channel information determines the lightness of a color, while the U-channel and V-channel information determine the chrominance of the color itself. From the perspective of human cognition, human sensitivity to the luminance information of a video is far greater than to its color information. Therefore the perceptual hash feature extraction in this embodiment processes only the luminance information, that is, only the Y-channel information is extracted from the video information.
In this embodiment, the original video information may be in the YUV color space or in another color space. If it is in the YUV color space, its Y-channel information is extracted directly; if it is in another color space, it is first converted to the YUV color space and then the Y-channel information is extracted. The resolution of the input video may be, but is not limited to, CIF, D1, 720p, 1080p, etc., and the frame rate is likewise not limited.
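As a minimal illustration of this preprocessing step (not part of the patent text), the sketch below decodes frames with OpenCV and keeps only the Y plane after a BGR-to-YUV conversion; the function name extract_y_frames and the use of cv2/numpy are assumptions made for illustration only.

```python
import cv2
import numpy as np

def extract_y_frames(video_path):
    """Yield the Y (luminance) plane of every frame of a video.

    Frames decoded by OpenCV are BGR, so they are converted to YUV first;
    a video already supplied in YUV would be used directly.
    """
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yuv = cv2.cvtColor(frame, cv2.COLOR_BGR2YUV)
            yield yuv[:, :, 0].astype(np.float32)  # keep only the Y channel
    finally:
        cap.release()
```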
S20, divide the Y-channel information extracted from the video information into three progressively refined levels of information, namely video-clip information, frame-image information and moving-object-level information, and obtain different feature information for each of the three levels through different information representations;
S30, hash each of the extracted feature information to obtain the corresponding video-clip information hash code, frame-image information hash code and moving-object-level information hash code. The detailed procedure is as follows:
S201, obtain the video-clip information from the Y-channel information;
The video-clip information is represented by spatio-temporal information, whose extraction comprises the following two steps:
(a) First, sample the image frame sequence of the video clip at a fixed frame interval. Because the spatio-temporal information of a video is mainly reflected in inter-frame changes, and the change between adjacent frames is generally not obvious, a fixed frame interval of 3-10 frames is used for sampling; in this embodiment one sample is preferably taken every 5 frames.
(b) Then perform a weighted addition of the sampled image frame sequence to obtain the spatio-temporal frame.
The concrete method can be expressed by the following formula:
$$F(m,n) = \sum_{k=1}^{J} w_k\, F(m,n,k)$$
where $F(m,n,k)$ denotes the gray value at coordinate $(m,n)$ in the $k$-th sampled frame, $w_k$ is the weight of the $k$-th frame, and $F(m,n)$ denotes the gray value of the spatio-temporal information at $(m,n)$. In this preferred embodiment $w_k$ is expressed by the exponential function $\gamma^{k}$, preferably with $\gamma = 0.6$.
Thus, through steps (a) and (b) above, the video-clip information in the Y channel, represented by spatio-temporal information, is obtained.
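A non-authoritative sketch of steps (a) and (b), assuming the frames are the Y-channel arrays produced above and γ = 0.6 as in the preferred embodiment; the helper name build_spatiotemporal_frame is invented for illustration.

```python
import numpy as np

def build_spatiotemporal_frame(y_frames, interval=5, gamma=0.6):
    """Sample every `interval`-th Y frame and sum them with exponential weights.

    y_frames : list of equally sized 2-D float arrays (Y channel).
    Returns the spatio-temporal frame F(m, n) = sum_k gamma**k * F(m, n, k).
    """
    sampled = list(y_frames)[::interval]
    if not sampled:
        raise ValueError("no frames to sample")
    st_frame = np.zeros_like(sampled[0], dtype=np.float64)
    for k, frame in enumerate(sampled, start=1):
        st_frame += (gamma ** k) * frame
    return st_frame
```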
S202, perform hash feature extraction on the video-clip information;
In this preferred embodiment, the spatio-temporal frame is convolved with a horizontal operator $G_x$ and a vertical operator $G_y$ respectively, giving a horizontal gradient image and a vertical gradient image. The horizontal operator $G_x$ and the vertical operator $G_y$ are respectively:
$$G_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}, \qquad G_y = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{pmatrix}$$
The horizontal gradient image and the vertical gradient image are then each partitioned into blocks, preferably 4*4 blocks here. The gray-level mean of the pixels in each block is calculated, and the block means are thresholded with an adaptive threshold. In this embodiment the adaptive threshold method is specifically: the median MedianNumber of the block gray-level means is taken as the adaptive threshold; a block whose mean is greater than this adaptive threshold is set to 1, otherwise to 0. In this way the video-clip information hash code VideoclipHash is obtained. VideoclipHash has only 4*4*2 = 32 binary digits and serves as a rough representation of the video clip as a whole.
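A minimal sketch of S202 under the stated choices (the two gradient operators, 4*4 blocks, median threshold). The helper names block_means and videoclip_hash and the use of cv2.filter2D are illustrative assumptions; the spatio-temporal frame is assumed to have dimensions divisible by the block count.

```python
import cv2
import numpy as np

GX = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float32)
GY = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]], dtype=np.float32)

def block_means(image, blocks=4):
    """Gray-level mean of each cell of a blocks x blocks partition."""
    h, w = image.shape
    bh, bw = h // blocks, w // blocks
    return np.array([image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].mean()
                     for i in range(blocks) for j in range(blocks)])

def videoclip_hash(st_frame, blocks=4):
    """32-bit VideoclipHash: median-thresholded 4*4 block means of both gradient images."""
    bits = []
    for op in (GX, GY):
        grad = cv2.filter2D(st_frame.astype(np.float32), -1, op)
        means = block_means(grad, blocks)
        thr = np.median(means)                 # adaptive threshold
        bits.extend(int(m > thr) for m in means)
    return bits                                # 4*4*2 = 32 binary digits
```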
S203, obtain the frame-image information of the video clip;
The frame-image information is represented directly by the whole frame image, and a rough texture feature is extracted from each frame image. The rough texture feature is represented by the correlation coefficients between the image blocks and a reference matrix; in this embodiment the reference matrix is preferably an 8*8 Gaussian low-pass matrix R with standard deviation $\delta_r = 0.5$.
The Gaussian low-pass matrix R is specifically expressed as follows:
$$R = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 8.09\times10^{-5} & 0.0044 & 0.0044 & 8.09\times10^{-5} & 0 & 0 \\
0 & 0 & 0.0044 & 0.241 & 0.241 & 0.0044 & 0 & 0 \\
0 & 0 & 0.0044 & 0.241 & 0.241 & 0.0044 & 0 & 0 \\
0 & 0 & 8.09\times10^{-5} & 0.0044 & 0.0044 & 8.09\times10^{-5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$
In order to ensure that the image blocks to be processed and the reference matrix have the same size, the whole frame image is normalized in size (to 64*64 in this embodiment) and partitioned into 8*8 blocks, giving 64 image blocks of size 8*8. The correlation coefficient between each image block and the reference matrix is then calculated:
$$\rho = \frac{\sum_{i=1}^{n}\bigl(I(i)-\mu_I\bigr)\bigl(R(i)-\mu_R\bigr)}{\sqrt{\sum_{i=1}^{n}\bigl(I(i)-\mu_I\bigr)^{2}}\,\sqrt{\sum_{i=1}^{n}\bigl(R(i)-\mu_R\bigr)^{2}}}$$
where $\mu_I$ denotes the mean of the block matrix I, $\mu_R$ denotes the mean of the Gaussian low-pass matrix R, and $I(i)$ and $R(i)$ are the $i$-th data values of the block matrix I and the Gaussian low-pass matrix R respectively. The rough texture feature representation of the whole frame image is thus obtained:
$$\text{feature}^{T} = \{\rho_1, \rho_2, \ldots, \rho_i, \ldots, \rho_{64}\}$$
where $\rho_i$ is the correlation coefficient of the $i$-th image block, with $i$ an integer in [1, 64].
S204, perform hash feature extraction on the rough texture feature of the frame-image information;
The rough texture feature of the frame-image information is quantized with an adaptive threshold. In this embodiment the adaptive threshold method is specifically: compute the median MedianNumber of the elements $\rho_i$ of $\text{feature}^{T}$ and use it as the adaptive threshold; each element $\rho_i$ of $\text{feature}^{T}$ greater than MedianNumber is set to 1, otherwise to 0, giving a rough representation of the rough texture feature. The bits are then concatenated in the order of the image blocks, from top to bottom and from left to right, to obtain the frame-image information hash code Imagehash of each frame image; in this embodiment the length of this hash code is 64 binary digits. The set of frame-image information hash codes generated for the frame images of the whole video clip can be expressed as:
$$\text{Imagehash}^{n} = \{\text{Imagehash}_1, \text{Imagehash}_2, \ldots, \text{Imagehash}_n\}$$
where n is the number of frame images in the video clip.
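A sketch of S203/S204 under the stated parameters (64*64 normalization, 8*8 Gaussian low-pass reference with sigma = 0.5, median-threshold binarization). The reference matrix is built here as the outer product of cv2.getGaussianKernel(8, 0.5), which matches the values of R above up to rounding; the helper names gaussian_reference and frame_image_hash are assumptions.

```python
import cv2
import numpy as np

def gaussian_reference(size=8, sigma=0.5):
    """8*8 Gaussian low-pass reference matrix (outer product of a 1-D kernel)."""
    k = cv2.getGaussianKernel(size, sigma)
    return (k @ k.T).astype(np.float64)

def frame_image_hash(y_frame, ref=None, norm_size=64, block=8):
    """64-bit Imagehash: correlation of each 8*8 block with R, median-binarized."""
    if ref is None:
        ref = gaussian_reference(block)
    img = cv2.resize(y_frame.astype(np.float32), (norm_size, norm_size))
    rhos = []
    for i in range(0, norm_size, block):
        for j in range(0, norm_size, block):
            blk = img[i:i + block, j:j + block].ravel()
            # Pearson correlation coefficient between the block and the reference
            rhos.append(np.corrcoef(blk, ref.ravel())[0, 1])
    rhos = np.array(rhos)
    thr = np.median(rhos)                      # adaptive threshold
    return [int(r > thr) for r in rhos]        # 64 binary digits, block order
```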
S205, obtain the moving-object-level information of the video clip;
It should be noted that, from the perspective of general cognitive science, human cognition of a picture frame in a video is divided into two aspects, global information and local information, and the human visual system is most sensitive to the moving objects in a video. On this basis the present invention uses moving objects to represent the local information in an image. This specifically comprises the following steps:
S2051, moving-object region extraction;
Specifically, the moving-object regions are extracted with a multi-Gaussian background modeling method. Each pixel is regarded as a weighted mixture of Gaussians, with the probability distribution function
$$p(I(i,j)) = \sum_{k=1}^{K} w_{ij,t}^{k}\,\eta\!\left[I(i,j),\ \mu_{ij,t}^{k},\ (\sigma_{ij,t}^{k})^{2}\right]$$
where $w_{ij,t}^{k}$ denotes the weight of the $k$-th Gaussian component of the mixture at pixel $(i,j)$ at time $t$, satisfying
$$\sum_{k=1}^{K} w_{ij,t}^{k} = 1$$
and $\eta[\cdot]$ denotes the Gaussian probability density function of $I(i,j)$, where $\mu_{ij,t}^{k}$ and $(\sigma_{ij,t}^{k})^{2}$ denote respectively the mean and the variance of the $k$-th Gaussian component at pixel $(i,j)$ at time $t$.
S2052, during initialization of the multi-Gaussian background template, take the pixel values of the first frame image as the mean of the first Gaussian component of the multi-Gaussian background, set its variance and weight to predetermined values, and sort the Gaussian components in order of magnitude;
S2053, from the second frame image onwards, match the value of every pixel in each subsequent frame against the multi-Gaussian background model. The matching process is as follows:
judge whether the current pixel satisfies a Gaussian distribution in the background model; if so, regard it as background and update the mean, variance and weight of the satisfied Gaussian distribution in the background model; if not, regard it as a target, take the value of the current pixel as the mean of a new Gaussian distribution, and set its variance and weight.
Through continuous updating the multi-Gaussian background model is established, and with this model the moving target in each frame of the video can be extracted.
The moving-object regions obtained in this way are represented by n rectangular boxes aligned with the coordinate axes, each rectangular box consisting of four parameters:
$$\text{rect}_i = \{x_i, y_i, \text{width}_i, \text{height}_i\},\quad i = 1, 2, \ldots, n$$
where $(x_i, y_i)$ denotes the upper-left vertex of the rectangular box, and $\text{width}_i$ and $\text{height}_i$ are its width and height respectively. The resulting set of moving-object regions can be expressed as:
$$\text{rect}^{n} = \{\text{rect}_1, \text{rect}_2, \ldots, \text{rect}_n\}$$
where $\text{rect}_n$ denotes the region information of the $n$-th moving object.
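The extraction of S2051-S2053 can be sketched with OpenCV's built-in Gaussian-mixture background subtractor, which performs the same kind of per-pixel match/update described above; this is an illustrative substitute for the hand-written multi-Gaussian model in the patent, and the function name moving_object_rects, the binarization threshold and the minimum-area filter are assumptions.

```python
import cv2

def moving_object_rects(frames, min_area=100):
    """Yield, per frame, the bounding boxes (x, y, width, height) of moving objects.

    frames   : iterable of 8-bit grayscale or BGR images.
    min_area : small contours below this area are discarded as noise.
    """
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    for frame in frames:
        mask = subtractor.apply(frame)                     # foreground mask
        mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]
        # [-2] keeps this working on both OpenCV 3.x and 4.x return conventions
        contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]
        rects = [cv2.boundingRect(c) for c in contours
                 if cv2.contourArea(c) >= min_area]
        yield rects
```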
S206, extract the hash feature of the moving-object-level information;
Each moving-object region contains one object, so only the distinguishing features of the different moving-object regions need to be hashed; the subsequent hash-code comparison can then determine whether an object has been maliciously altered. The detailed procedure is as follows:
First, normalize the size of the moving-object region and divide it into n*n small image blocks, where n is an integer, preferably n = 32 or n = 4 here;
apply the block DCT algorithm to the Y-channel information;
extract 1 direct-current (DC) coefficient and K alternating-current (AC) coefficients, where the value of K determines the sensitivity and robustness of the hash code: the larger K is, the better the sensitivity and the worse the robustness. To balance sensitivity and robustness, K is an integer in [5, 9] in this embodiment;
thereby obtaining
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_K\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ is the $i$-th AC coefficient of the Y channel;
compose the n*n feature column vectors into a FEATURE matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{n^2}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of DC and AC coefficients of the $i$-th small image block;
binarize each row of the FEATURE matrix with an adaptive threshold to obtain the binarized FEATURE matrix.
In this embodiment the block DCT (discrete cosine transform) algorithm is applied to the Y-channel information. Taking n = 4 as an example, each moving-object region is divided into 4*4 small image blocks.
From each small image block, 1 DC coefficient is extracted; when the number of AC coefficients extracted at the same time is K = 7, we obtain
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_7\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ denotes the $i$-th AC coefficient of the Y channel.
The column vectors obtained from the 4*4 small blocks then form an 8*16 matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{16}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of DC and AC coefficients of the $i$-th small image block.
When 1 DC coefficient is extracted from each small image block and the number of AC coefficients extracted is K = 5, we obtain
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_5\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ denotes the $i$-th AC coefficient of the Y channel.
The column vectors obtained from the 4*4 small blocks then form a 6*16 matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{16}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of DC and AC coefficients of the $i$-th small image block.
When 1 DC coefficient is extracted from each small image block and the number of AC coefficients extracted is K = 9, we obtain
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_9\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ denotes the $i$-th AC coefficient of the Y channel.
The column vectors obtained from the 4*4 small blocks then form a 10*16 matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{16}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of DC and AC coefficients of the $i$-th small image block.
Depending on how many AC coefficients are extracted, the length of the resulting hash code varies accordingly: the more AC coefficients are extracted, the longer the hash code and the more transmission-channel resources it occupies. In terms of information extraction, the more AC coefficients, the more detail is captured, so the sensitivity of the corresponding hash code is better but its robustness is worse.
The performance comparison can be summarized by the following relations:
$$\text{Length}_5 < \text{Length}_7 < \text{Length}_9$$
$$\text{Sensitive}_5 < \text{Sensitive}_7 < \text{Sensitive}_9$$
$$\text{Robust}_5 > \text{Robust}_7 > \text{Robust}_9$$
where $\text{Length}_i$ denotes the length of the hash code when K = i, $\text{Sensitive}_i$ its sensitivity, and $\text{Robust}_i$ its robustness, with i an integer in [5, 9].
Therefore, in this embodiment the sensitivity of K = 9 is greater than that of K = 7, which is greater than that of K = 5, while the robustness of K = 5 is greater than that of K = 7, which is greater than that of K = 9. Weighing robustness against sensitivity, K = 7 is preferred in the embodiment of the present invention.
Finally, each row of the FEATURE matrix is binarized with an adaptive threshold (the specific adaptive threshold method being essentially the same as the adaptive threshold used for the rough texture feature of the frame-image information above), giving the binarized FEATURE matrix. The rows of the binarized matrix are then concatenated end to end in the manner of Fig. 5, with the last data item of one row followed by the first data item of the next row, to obtain the moving-object-level information hash code Objectnesshash of each object region. The set of moving-object-level information hash codes generated for the frame images of the whole video clip can be expressed as:
$$\text{Objectnesshash}^{n} = \{\text{Objectnesshash}_1, \ldots, \text{Objectnesshash}_n\}$$
where $\text{Objectnesshash}_n$ denotes the moving-object-level information hash code of the $n$-th moving object.
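A sketch of S206 with n = 4 and K = 7, using cv2.dct for the block transform. The exact ordering of the K AC coefficients is not specified in the text, so a simple row-major order over the coefficient matrix is assumed here; the helper name objectness_hash and the 64*64 normalization size are likewise assumptions.

```python
import cv2
import numpy as np

def objectness_hash(region, n=4, K=7, norm_size=64):
    """Hash a moving-object region (Y channel): block DCT, 1 DC + K AC coefficients
    per block, per-row median binarization, rows concatenated end to end."""
    img = cv2.resize(region.astype(np.float32), (norm_size, norm_size))
    step = norm_size // n                       # side length of each of the n*n blocks
    columns = []
    for i in range(0, norm_size, step):
        for j in range(0, norm_size, step):
            coeffs = cv2.dct(img[i:i + step, j:j + step])
            flat = coeffs.ravel()               # coeffs[0, 0] is the DC coefficient
            columns.append(flat[:K + 1])        # DC + first K AC coefficients
    feature = np.stack(columns, axis=1)         # (K+1) x n*n FEATURE matrix
    thr = np.median(feature, axis=1, keepdims=True)   # adaptive threshold per row
    binary = (feature > thr).astype(int)
    return binary.ravel().tolist()              # rows joined end to end (Fig. 5 style)
```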
At this point the hashing of the multi-level feature information of the whole video clip is complete, and the hash-code sets of the three levels have been obtained.
S40, gather the three hash codes of the different levels of information, namely the video-clip information hash code VideoclipHash, the frame-image information hash code Imagehash and the moving-object-level information hash code Objectnesshash of the object regions, and organize them into a tree structure. Information of the same level and information of different levels are organized in two different ways. Referring to Fig. 6, information of the same level is connected directly in cascade, while information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video. As for the storage units of the different levels, each unit contains not only the hash information of that level but also further feature information describing that level. The concrete arrangement is as follows:
Video-clip information hash code layer: this layer represents the global information of the video clip. Since the global information of the video clip is represented by a single VideoclipHash, this layer needs only one storage unit, which stores the video-clip information hash code VideoclipHash and a pointer; the pointer points to the storage unit holding the hash code of the first frame image (or of the last frame image) of the video clip in the frame-image hash code layer.
Frame-image information hash code layer: this layer represents the global information of each frame image in the video clip. Since the video clip consists of multiple frame images, and each frame image generates one frame-image information hash code, this layer consists of multiple storage units, the number of which is determined by the number of frames of the video clip. The frame-image information hash codes obtained in this embodiment are:
$$\text{Imagehash}^{n} = \{\text{Imagehash}_1, \text{Imagehash}_2, \ldots, \text{Imagehash}_n\}$$
where $\text{Imagehash}_n$ denotes the frame-image information hash code of the $n$-th frame image.
Therefore n storage units are needed, cascaded in order of frame number. Each storage unit contains the frame-image information hash code of that frame image and a pointer, which points to the storage unit holding the first (or last) moving-object-level information hash code of that frame image in the moving-object hash code layer.
Moving-object-level information hash code layer: this layer represents the set of local information of the frame image to which the moving objects belong. It likewise consists of multiple storage units, the number of which is determined by the number of moving objects obtained from each frame image; each moving object generates one moving-object-level information hash code. The set of moving-object-level information hash codes obtained from one frame image in this embodiment is:
$$\text{Objectnesshash}^{n} = \{\text{Objectnesshash}_1, \ldots, \text{Objectnesshash}_n\}$$
where $\text{Objectnesshash}_n$ denotes the moving-object-level information hash code of the $n$-th moving object.
Therefore the moving-object-level information hash code layer corresponding to this frame image needs n storage units, and the storage units under the same frame image are sorted by $(x_i, y_i)$ of $\text{rect}_i$, in this embodiment from left to right and from top to bottom. Each storage unit contains two parts: the moving-object hash code and the position information $\text{rect}_i$ of the object.
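The three-layer organization of S40 can be sketched as linked records; the class names, the use of Python dataclasses and the list-based links below are illustrative assumptions rather than the patent's storage format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectNode:                 # moving-object-level layer (leaf)
    objectness_hash: List[int]
    rect: Tuple[int, int, int, int]           # (x, y, width, height)

@dataclass
class FrameNode:                  # frame-image layer
    image_hash: List[int]
    objects: List[ObjectNode] = field(default_factory=list)   # link to object units

@dataclass
class VideoClipNode:              # video-clip layer (root of the fingerprint tree)
    videoclip_hash: List[int]
    frames: List[FrameNode] = field(default_factory=list)     # link to frame units

def build_fingerprint_tree(clip_hash, frame_hashes, objects_per_frame):
    """Cascade same-level units in order and link the levels top-down."""
    root = VideoClipNode(clip_hash)
    for img_hash, objs in zip(frame_hashes, objects_per_frame):
        frame = FrameNode(img_hash)
        for obj_hash, rect in objs:
            frame.objects.append(ObjectNode(obj_hash, rect))
        root.frames.append(frame)
    return root
```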
The video fingerprint extraction method of the present invention divides the Y-channel information of the video information into three progressively refined levels of information, namely video-clip information, frame-image information and moving-object-level information; extracts different feature information for the three levels (a gray-level feature for the video-clip information, a texture feature represented by correlation coefficients for the frame-image information, and a DCT coefficient feature for the moving-object-level information); obtains a feature hash code for each level according to its feature; and gathers the multi-level hash codes and organizes them into a tree structure, thereby obtaining the video fingerprint. The present invention establishes a one-to-one correspondence between the multi-level hash information obtained in the above steps and the actual structure of the video clip, so that when any storage unit at any level of the video clip changes, the change is directly reflected in a branch of the whole tree structure or in the hash-code unit of a leaf node. By comparing hash-code tree structures it can be judged directly whether frame information has been tampered with, and the tampered position can be located quickly, giving strong tamper detection capability and high security.
Embodiment 2:
Referring to Fig. 7, the invention provides a video fingerprint extraction system, comprising:
a preprocessing module, configured to preprocess video information and extract the Y-channel information from the video information;
an acquisition module, configured to obtain three levels of information from the Y-channel information, the three levels being video-clip information, frame-image information and moving-object-level information, and to obtain different feature information for each of the three levels;
wherein obtaining the different feature information specifically comprises extracting a gray-level feature from the video-clip information, extracting a texture feature represented by correlation coefficients from the frame-image information, and extracting a DCT coefficient feature from the moving-object-level information;
a processing module, configured to perform hash feature extraction on each of the obtained feature information, to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code;
a construction module, configured to gather the video-clip information hash code, the frame-image information hash code and the moving-object-level information hash code and to organize them into a tree structure, wherein information of the same level is connected in cascade, and information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video;
in particular, the construction module further comprises a storage unit and a generation unit:
the storage unit is configured to store the hash codes and pointers of the different information;
the generation unit is configured to establish a one-to-one correspondence between the hash information of the different levels and the actual structure of the video clip, and to build the tree structure.
In the video fingerprint extraction system of the present invention, the preprocessing module extracts the Y-channel information; different feature information is obtained from the video-clip information, the frame-image information and the moving-object-level information; each feature information is hashed to obtain its corresponding hash code; and the hash codes of the three different levels are gathered and placed in one-to-one correspondence with the actual structure of the video clip to build a tree structure. Thus, when any storage unit at any level of the video clip changes, the change is directly reflected in a branch of the whole tree structure or in the hash-code unit of a leaf node. By comparing hash-code tree structures it can be judged directly whether frame information has been tampered with, and the tampered position can be located quickly.
The video fingerprint extraction method and system of the present invention extract the overall features of the video clip (the video-clip information) roughly and extract the local features (the moving-object-level information) finely, and the hash-code organization retains the structure of the video itself, so that security evaluation of video information and quick, accurate location of tampered regions can be realized. The method and system of the invention can be used for the security evaluation of a range of video information transmitted over public channels, such as video in video surveillance systems and network transmission, and can also be used for video copy detection.
In the present invention, the modules or units and the method steps may be concentrated in a single computing device, or distributed over a network formed by multiple computing devices; they may also be made into individual integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, without thereby limiting the scope of the rights of the present invention. Those skilled in the art can implement the present invention in many variant ways without departing from the scope and spirit of the present invention; for example, a feature of one embodiment may be used in another embodiment to obtain yet another embodiment. Any modification, equivalent replacement or improvement made within the technical conception of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A video fingerprint extraction method, characterized in that it comprises the steps of:
preprocessing video information and extracting the Y-channel information from the video information;
obtaining three levels of information from the Y-channel information, the three levels being video-clip information, frame-image information and moving-object-level information, and obtaining different feature information for each of the three levels;
performing hash feature extraction on each of the obtained feature information, to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code;
gathering the video-clip information hash code, the frame-image information hash code and the moving-object-level information hash code, and organizing them into a tree structure, wherein information of the same level is connected in cascade,
and information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video.
2. The video fingerprint extraction method according to claim 1, characterized in that obtaining the video-clip information from the Y-channel information comprises:
sampling the image frame sequence of the video clip at a fixed frame interval;
performing a weighted addition of the sampled image frame sequence to obtain a spatio-temporal frame.
3. The video fingerprint extraction method according to claim 2, characterized in that performing hash feature extraction on the video-clip information comprises:
convolving the spatio-temporal frame with a horizontal operator $G_x$ and a vertical operator $G_y$ respectively, to obtain a horizontal gradient image and a vertical gradient image;
wherein the horizontal operator $G_x$ and the vertical operator $G_y$ are respectively:
$$G_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}, \qquad G_y = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{pmatrix}$$
partitioning the horizontal gradient image and the vertical gradient image into blocks;
calculating the gray-level mean of the pixels in each block, and thresholding the block means with an adaptive threshold to obtain the video-clip information hash code.
4. The video fingerprint extraction method according to claim 3, characterized in that the adaptive threshold method is specifically: the median of the block gray-level means is taken as the adaptive threshold; a block whose mean is greater than the adaptive threshold is set to 1, otherwise it is set to 0.
5. The video fingerprint extraction method according to claim 1, characterized in that obtaining the frame-image information comprises:
normalizing the size of each whole frame image;
partitioning each size-normalized image into image blocks;
calculating the correlation coefficient between each image block and a reference matrix.
6. The video fingerprint extraction method according to claim 1, characterized in that obtaining the moving-object-level information comprises extracting moving-object regions, which specifically comprises:
adopting multi-Gaussian background modeling: each pixel is regarded as a weighted mixture of Gaussians, with the probability distribution function
$$p(I(i,j)) = \sum_{k=1}^{K} w_{ij,t}^{k}\,\eta\!\left[I(i,j),\ \mu_{ij,t}^{k},\ (\sigma_{ij,t}^{k})^{2}\right]$$
where $w_{ij,t}^{k}$ denotes the weight of the $k$-th Gaussian component of the mixture at pixel $(i,j)$ at time $t$, satisfying
$$\sum_{k=1}^{K} w_{ij,t}^{k} = 1$$
and $\eta[\cdot]$ denotes the Gaussian probability density function of $I(i,j)$, where $\mu_{ij,t}^{k}$ and $(\sigma_{ij,t}^{k})^{2}$ denote respectively the mean and the variance of the $k$-th Gaussian component at pixel $(i,j)$ at time $t$;
taking the pixel values of the first frame image as the mean of the first Gaussian component of the multi-Gaussian background, setting its variance and weight to predetermined values, and sorting the Gaussian components in order of magnitude;
from the second frame image onwards, matching the value of every pixel in each subsequent frame against the multi-Gaussian background model.
7. The video fingerprint extraction method according to claim 6, characterized in that matching the value of every pixel in each frame after the second frame against the multi-Gaussian background model comprises:
judging whether the current pixel satisfies a Gaussian distribution in the background model; if so, regarding it as background and updating the mean, variance and weight of the satisfied Gaussian distribution in the background model;
if not, regarding it as a target, taking the value of the current pixel as the mean of a new Gaussian distribution, and setting its variance and weight.
8. The video fingerprint extraction method according to claim 1, characterized in that extracting the hash feature of the moving-object-level information comprises:
normalizing the size of the moving-object region and dividing it into n*n small image blocks, where n is an integer;
applying the block DCT algorithm to the Y-channel information;
extracting 1 DC coefficient and K AC coefficients, where K is an integer in [5, 9];
thereby obtaining
$$\text{feature}^{T} = \{Y_0, Y_1, Y_2, \ldots, Y_i, \ldots, Y_K\}$$
where $Y_0$ denotes the DC coefficient of the Y channel and $Y_i$ is the $i$-th AC coefficient of the Y channel;
composing the n*n feature column vectors into a FEATURE matrix:
$$\text{FEATURE} = \{\text{feature}_1^{T}, \text{feature}_2^{T}, \ldots, \text{feature}_i^{T}, \ldots, \text{feature}_{n^2}^{T}\}$$
where $\text{feature}_i^{T}$ denotes the set of AC and DC coefficients of the $i$-th small image block;
binarizing each row of the FEATURE matrix with an adaptive threshold to obtain the binarized FEATURE matrix.
9. The video fingerprint extraction method according to any one of claims 1 to 8, characterized in that connecting information of different levels with pointers according to the subordination relations between the different levels of information in the video comprises: pointing the pointer in the storage unit of the video-clip information hash code layer to the storage unit holding the hash code of the first frame image (or of the last frame image) of the video clip in the frame-image hash code layer; and pointing the pointer in a storage unit of the frame-image information hash code layer to the storage unit holding the first (or last) moving-object-level information hash code of that frame image in the moving-object hash code layer.
10. A video fingerprint extraction system, characterized in that it comprises:
a preprocessing module, configured to preprocess video information and extract the Y-channel information from the video information;
an acquisition module, configured to obtain three levels of information from the Y-channel information, the three levels being video-clip information, frame-image information and moving-object-level information, and to obtain different feature information for each of the three levels;
a processing module, configured to perform hash feature extraction on each of the obtained feature information, to obtain a video-clip information hash code, a frame-image information hash code and a moving-object-level information hash code;
a construction module, configured to gather the video-clip information hash code, the frame-image information hash code and the moving-object-level information hash code and to organize them into a tree structure, wherein information of the same level is connected in cascade, and information of different levels is connected with pointers according to the subordination relations between the different levels of information in the video.
CN201410307572.4A 2014-06-30 2014-06-30 Video fingerprint extraction method and system Pending CN104077590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410307572.4A CN104077590A (en) 2014-06-30 2014-06-30 Video fingerprint extraction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410307572.4A CN104077590A (en) 2014-06-30 2014-06-30 Video fingerprint extraction method and system

Publications (1)

Publication Number Publication Date
CN104077590A true CN104077590A (en) 2014-10-01

Family

ID=51598836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410307572.4A Pending CN104077590A (en) 2014-06-30 2014-06-30 Video fingerprint extraction method and system

Country Status (1)

Country Link
CN (1) CN104077590A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104581431A (en) * 2014-11-28 2015-04-29 安科智慧城市技术(中国)有限公司 Video authentication method and device
CN104809248A (en) * 2015-05-18 2015-07-29 成都索贝数码科技股份有限公司 Video fingerprint extraction and retrieval method
CN105611319A (en) * 2015-12-24 2016-05-25 杭州当虹科技有限公司 Video content anti-tampering method
CN105611428A (en) * 2015-12-22 2016-05-25 北京安寻网络科技有限公司 Video evidence preserving and verifying method and device
CN105930478A (en) * 2016-05-03 2016-09-07 福州市勘测院 Element object spatial information fingerprint-based spatial data change capture method
CN108021927A (en) * 2017-11-07 2018-05-11 天津大学 A kind of method for extracting video fingerprints based on slow change visual signature
CN108540823A (en) * 2018-05-15 2018-09-14 北京首汽智行科技有限公司 A kind of integrity of video method of calibration based on block chain technology
CN108629049A (en) * 2018-05-14 2018-10-09 芜湖岭上信息科技有限公司 A kind of image real-time storage and lookup device and method based on hash algorithm
CN109241317A (en) * 2018-09-13 2019-01-18 北京工商大学 Based on the pedestrian's Hash search method for measuring loss in deep learning network
CN110035327A (en) * 2019-04-17 2019-07-19 深圳市摩天之星企业管理有限公司 A kind of safe playback method
CN110083740A (en) * 2019-05-07 2019-08-02 深圳市网心科技有限公司 Video finger print extracts and video retrieval method, device, terminal and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101404750A (en) * 2008-11-11 2009-04-08 清华大学 Video fingerprint generation method and device
CN102077584A (en) * 2008-06-30 2011-05-25 思科技术公司 Video fingerprint systems and methods
CN102208026A (en) * 2011-05-27 2011-10-05 电子科技大学 Method for extracting digital video fingerprints
US8494234B1 (en) * 2007-03-07 2013-07-23 MotionDSP, Inc. Video hashing system and method
CN103593464A (en) * 2013-11-25 2014-02-19 华中科技大学 Video fingerprint detecting and video sequence matching method and system based on visual features
US8660296B1 (en) * 2012-01-10 2014-02-25 Google Inc. Systems and methods for facilitating video fingerprinting using local descriptors
CN103747271A (en) * 2014-01-27 2014-04-23 深圳大学 Video tamper detection method and device based on mixed perceptual hashing

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8494234B1 (en) * 2007-03-07 2013-07-23 MotionDSP, Inc. Video hashing system and method
CN102077584A (en) * 2008-06-30 2011-05-25 思科技术公司 Video fingerprint systems and methods
CN101404750A (en) * 2008-11-11 2009-04-08 清华大学 Video fingerprint generation method and device
CN102208026A (en) * 2011-05-27 2011-10-05 电子科技大学 Method for extracting digital video fingerprints
US8660296B1 (en) * 2012-01-10 2014-02-25 Google Inc. Systems and methods for facilitating video fingerprinting using local descriptors
CN103593464A (en) * 2013-11-25 2014-02-19 华中科技大学 Video fingerprint detecting and video sequence matching method and system based on visual features
CN103747271A (en) * 2014-01-27 2014-04-23 深圳大学 Video tamper detection method and device based on mixed perceptual hashing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李泽洲 (Li Zezhou): "Research on Video Surveillance Technology Based on Video Fingerprints", China Master's Theses Full-text Database, Information Science and Technology Series *
王典 (Wang Dian): "Research on Background Modeling and Shadow Suppression Algorithms Based on Gaussian Mixture", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104581431A (en) * 2014-11-28 2015-04-29 安科智慧城市技术(中国)有限公司 Video authentication method and device
CN104581431B (en) * 2014-11-28 2018-01-30 精宸智云(武汉)科技有限公司 Video authentication method and device
CN104809248A (en) * 2015-05-18 2015-07-29 成都索贝数码科技股份有限公司 Video fingerprint extraction and retrieval method
CN104809248B (en) * 2015-05-18 2018-01-23 成都华栖云科技有限公司 Video finger print extracts and search method
CN105611428A (en) * 2015-12-22 2016-05-25 北京安寻网络科技有限公司 Video evidence preserving and verifying method and device
CN105611319B (en) * 2015-12-24 2018-08-17 杭州当虹科技有限公司 A kind of method that video content is anti-tamper
CN105611319A (en) * 2015-12-24 2016-05-25 杭州当虹科技有限公司 Video content anti-tampering method
CN105930478A (en) * 2016-05-03 2016-09-07 福州市勘测院 Element object spatial information fingerprint-based spatial data change capture method
CN105930478B (en) * 2016-05-03 2019-04-19 福州市勘测院 Spatial data based on feature object spatial information fingerprint changes catching method
CN108021927A (en) * 2017-11-07 2018-05-11 天津大学 A kind of method for extracting video fingerprints based on slow change visual signature
CN108629049A (en) * 2018-05-14 2018-10-09 芜湖岭上信息科技有限公司 A kind of image real-time storage and lookup device and method based on hash algorithm
CN108540823A (en) * 2018-05-15 2018-09-14 北京首汽智行科技有限公司 A kind of integrity of video method of calibration based on block chain technology
CN109241317A (en) * 2018-09-13 2019-01-18 北京工商大学 Based on the pedestrian's Hash search method for measuring loss in deep learning network
CN110035327A (en) * 2019-04-17 2019-07-19 深圳市摩天之星企业管理有限公司 A kind of safe playback method
CN110083740A (en) * 2019-05-07 2019-08-02 深圳市网心科技有限公司 Video finger print extracts and video retrieval method, device, terminal and storage medium
CN110083740B (en) * 2019-05-07 2021-04-06 深圳市网心科技有限公司 Video fingerprint extraction and video retrieval method, device, terminal and storage medium

Similar Documents

Publication Publication Date Title
CN104077590A (en) Video fingerprint extraction method and system
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
CN108108751B (en) Scene recognition method based on convolution multi-feature and deep random forest
US8024775B2 (en) Sketch-based password authentication
US20120027263A1 (en) Hand gesture detection
WO2016082277A1 (en) Video authentication method and apparatus
CN102592148A (en) Face identification method based on non-negative matrix factorization and a plurality of distance functions
CN110991549A (en) Countermeasure sample generation method and system for image data
CN113761259A (en) Image processing method and device and computer equipment
CN112132099A (en) Identity recognition method, palm print key point detection model training method and device
Jia et al. SAR image change detection based on iterative label-information composite kernel supervised by anisotropic texture
CN109242796A (en) Character image processing method, device, electronic equipment and computer storage medium
CN106127748A (en) A kind of characteristics of image sample database and method for building up thereof
CN103839042A (en) Human face recognition method and human face recognition system
CN106845513A (en) Staff detector and method based on condition random forest
CN104050628A (en) Image processing method and image processing device
CN106203448B (en) A kind of scene classification method based on Nonlinear Scale Space Theory
CN110489659A (en) Data matching method and device
CN107644105A (en) One kind searches topic method and device
CN111199175A (en) Training method and device for target detection network model
Sarmah et al. Optimization models in steganography using metaheuristics
CN105069403B (en) A kind of three-dimensional human ear identification based on block statistics feature and the classification of dictionary learning rarefaction representation
CN107358244B (en) A kind of quick local invariant feature extracts and description method
Zanotta et al. An adaptive semisupervised approach to the detection of user-defined recurrent changes in image time series
CN108830217B (en) Automatic signature distinguishing method based on fuzzy mean hash learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20141001

WD01 Invention patent application deemed withdrawn after publication