Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art and to provide a well-performing robust video watermarking method that resists common attacks, in particular geometric attacks, so as to better protect the copyright of digital video.
The technical solution adopted by the present invention to solve this problem is a SIFT-based robust video watermarking method resistant to geometric attacks. As shown in Figures 1 and 2, the method comprises a watermark embedding part and a watermark extraction part. Both parts take a sequence of video frames as their operand; after the sequence is segmented, each segment of video serves as a basic operating unit.
The watermark embedding process of the watermark embedding part is as follows:
(1.1) Color space conversion: each video frame represented in the RGB color space is first converted to a frame represented in the YCbCr color space, and the Y component, i.e. the luminance component, is extracted;
(1.2) A one-level two-dimensional DWT is applied to the luminance component of the video frame, and the low-frequency component is extracted;
(1.3) Every n frames of the video form one segment, where n is a positive integer whose concrete value is determined by the size of the watermark information and the number of video frames. For the first frame of each segment, SIFT features are extracted from the low-frequency component obtained by the one-level two-dimensional DWT, and the resulting feature point positions are recorded;
(1.4) At the recorded feature point positions, the watermark is embedded in the following manner:
p_i' = p_i + a · w_i
where p_i' is the pixel value at the feature point positions of the i-th segment after watermark embedding, p_i is the original pixel value at the feature point positions of the i-th segment, a is an amplification factor whose value is a positive integer, and w represents the watermark information, taking the values 1 and -1; in array form it can be expressed as [1, -1, 1, -1, ...], with size n;
(1.5) A one-level two-dimensional inverse DWT is applied to the luminance component, the color space is converted from YCbCr back to RGB, and the video is saved.
The watermark extraction process of the watermark extraction part is as follows:
(2.1) Color space conversion: each video frame represented in the RGB color space is first converted to a frame represented in the YCbCr color space, and the Y component, i.e. the luminance component, is extracted;
(2.2) A one-level two-dimensional DWT is applied to the luminance component of the video frame, with the Haar wavelet as the chosen wavelet basis, and the low-frequency component is extracted;
(2.3) Every n frames of the video form one segment, where n is a positive integer whose value must be identical to the value used during watermark embedding. For the first frame of each segment, SIFT features are extracted from the low-frequency component obtained by the one-level two-dimensional DWT, and the resulting feature point positions are recorded;
(2.4) The pixel prediction error at the feature point positions is calculated. The predictor adopted is SGAP, and the prediction formula of SGAP is:
x' = (x_N + x_W) / 2 + (x_NE - x_NW) / 4
where x' is the pixel prediction value, x_N is the pixel value above the pixel being predicted, x_W is the pixel value to its left, x_NE is the upper-right pixel value, and x_NW is the upper-left pixel value.
The prediction error is defined as the difference between a pixel value and its prediction value. For each segment, the average prediction error is computed: all prediction errors of the segment are summed and then divided by the total number of pixels at the feature point positions, and the result is denoted e. A threshold value threshold > 0 is set, and the extraction result is:
w = 1 if e > threshold, and w = -1 if e < -threshold.
In step (1.4), the value of a satisfies 5 ≤ a ≤ 15.
In step (2.4), the value of threshold satisfies 1 ≤ threshold ≤ 5.
In the one-level two-dimensional DWT of step (1.2), the chosen wavelet basis is the Haar wavelet, and the low-frequency component is extracted.
Principle of the invention: Video in practice is almost always in color and is generally represented with the RGB model, in which the energy is dispersed and therefore inconvenient to process. To simplify processing and improve robustness, the method first converts the RGB model to the YCbCr model and then operates uniformly on the Y component, i.e. the luminance component; after the watermark operation is completed, the frames can be converted back to the RGB model and saved according to the concrete storage format. Once the Y component is obtained, a 2D-DWT is applied to it and the low-frequency coefficients are taken for the subsequent operations; the wavelet basis chosen for the wavelet transform here is the Haar wavelet.
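The color-space conversion and the one-level Haar DWT described above can be sketched in Python as follows. This is a minimal illustration, not the mandated implementation: the BT.601 luma weights and the direct 2x2-block form of the Haar LL band are assumptions of this sketch.

```python
import numpy as np

def rgb_to_luma(frame):
    """BT.601 luma (Y component) from an RGB frame of shape (H, W, 3)."""
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def haar_dwt_ll(y):
    """Low-frequency (LL) band of a one-level 2D Haar DWT.

    For the orthonormal Haar basis, the LL coefficient of each 2x2 block
    equals the block sum divided by 2.
    """
    h, w = y.shape[0] // 2 * 2, y.shape[1] // 2 * 2  # trim odd edges
    y = y[:h, :w]
    return (y[0::2, 0::2] + y[0::2, 1::2] +
            y[1::2, 0::2] + y[1::2, 1::2]) / 2.0
```

The LL band is half the size of the input in each dimension, which is where the subsequent SIFT extraction and embedding operate.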
The segmentation of the video is closely related to the amount of information the video can carry: each segment can represent 1 bit of information. In actual operation, the segmentation can be set according to the number of frames of the video and the amount of information to be embedded. Suppose every n frames form one segment; the segment number is denoted i, with i = 1, 2, ..., n.
Next, a SIFT transform is applied to the first frame of each segment and the positions of the feature points are recorded. The complete SIFT algorithm has roughly four steps: 1. Scale-space extremum detection: potential interest points that are invariant to scale and rotation are detected in scale space using a difference-of-Gaussian function. 2. Keypoint localization: at each interest point location, the position and scale of the keypoint are determined. 3. Orientation assignment: based on the local image gradient directions, an orientation is assigned to each keypoint. 4. Keypoint description: the local image gradients are measured in the neighborhood of each keypoint and finally expressed as a feature vector. The present invention uses only the positions of the feature points, so most of the unrelated operations can be omitted and only part of the SIFT algorithm is needed.
Both the embedding and the extraction of the watermark operate on the pixel values at the feature point positions. The formula used during embedding is as follows:
p_i' = p_i + a · w_i
where p_i' is the pixel value at the feature point positions of the i-th segment after watermark embedding, p_i is the original pixel value at the feature point positions of the i-th segment, a is an amplification factor whose value is a positive integer, and w represents the watermark information, taking the values 1 and -1; in array form it can be expressed as [1, -1, 1, -1, ...], with size n.
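Assuming the embedding rule is the additive form p' = p + a·w (a reconstruction from the surrounding definitions), embedding one segment's bit at its feature positions can be sketched as follows; the function name and default for a are illustrative:

```python
import numpy as np

def embed_bit(ll, positions, bit, a=10):
    """Add a * w to the low-frequency band at each feature position.

    bit is the segment's watermark value w in {+1, -1}; a is the
    amplification factor (the text suggests 5 <= a <= 15).
    """
    out = ll.astype(float).copy()
    for r, c in positions:
        out[r, c] += a * bit
    return out
```

Because only a handful of low-frequency coefficients are shifted, the visible distortion stays small, which is the basis of advantage (2) below.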
During extraction, the pixel prediction error at the feature point positions must first be calculated. The predictor adopted is SGAP, and the prediction formula of SGAP is:
x' = (x_N + x_W) / 2 + (x_NE - x_NW) / 4
where x' is the pixel prediction value, x_N is the pixel value above the pixel being predicted, x_W is the pixel value to its left, x_NE is the upper-right pixel value, and x_NW is the upper-left pixel value.
The prediction error is defined as the difference between a pixel value and its prediction value. For each segment, the average prediction error is computed: all prediction errors of the segment are summed and then divided by the total number of pixels at the feature point positions, and the result is denoted e. A threshold value threshold > 0 is set, and the extraction result is:
w = 1 if e > threshold, and w = -1 if e < -threshold.
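A minimal sketch of the predictor and the thresholded bit decision, assuming SGAP takes the standard simplified-GAP form x' = (x_N + x_W)/2 + (x_NE - x_NW)/4 and that all positions are interior pixels so the four neighbors exist:

```python
import numpy as np

def sgap_predict(y, r, c):
    """Simplified GAP (SGAP) prediction for the pixel at (r, c)."""
    xn, xw = y[r - 1, c], y[r, c - 1]        # north, west
    xne, xnw = y[r - 1, c + 1], y[r - 1, c - 1]  # north-east, north-west
    return (xn + xw) / 2.0 + (xne - xnw) / 4.0

def extract_bit(y, positions, threshold=3.0):
    """Average the prediction errors over the segment's feature
    positions, then threshold: e > t -> +1, e < -t -> -1."""
    errs = [y[r, c] - sgap_predict(y, r, c) for r, c in positions]
    e = sum(errs) / len(errs)
    if e > threshold:
        return 1
    if e < -threshold:
        return -1
    return 0  # undecided: no watermark detected at this segment
```

The additive embedding shifts each watermarked pixel's prediction error by roughly a·w, so with a in 5..15 and threshold in 1..5 the sign of e recovers w without the original video, consistent with the blind-watermark property claimed below.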
Compared with the prior art, the present invention has the following advantages:
(1) The present invention resists geometric attacks well, such as rotation, cropping, scaling, and flipping;
(2) The distortion introduced by the present invention is small: since the watermark is embedded only at the feature point positions, and the number of feature points is a very small proportion of the total number of pixels in a video frame, no large distortion is produced;
(3) The watermark capacity of the present invention is adaptively adjustable: if the amount of watermark information to be embedded is large, the number of segments can be increased appropriately; if it is small, the number of segments can be reduced;
(4) The present invention is simple to implement, the embedding and extraction processes are clearly defined, and the computational complexity is low;
(5) The watermark designed by the present invention is a blind watermark, that is, neither the original video nor the originally embedded watermark information is needed during watermark extraction.
Embodiment
As shown in Figures 1 and 4, the present invention comprises two parts: watermark embedding and watermark extraction.
The flow chart of the watermark embedding process is shown in Figure 3, and its steps are as follows:
Step 1: Each frame of video data is first transformed into the YCbCr color space and the Y component is extracted; if the original video is in YUV format, the Y component can be extracted directly.
Step 2: A two-dimensional discrete wavelet transform with the Haar wavelet as the basis is applied to the Y component, and the low-frequency component is extracted.
Step 3: The video is first divided into equal segments according to the watermark information size and the number of video frames. Suppose the watermark consists of m bits and the video has n frames; an integer k not exceeding the integer part of n/m is selected, and every k frames then form one segment. To simplify processing during information extraction and to meet related requirements such as content control, k can also be set to a fixed value, for example 30.
Step 4: SIFT feature extraction is performed on the first frame of each segment of video, and the extracted feature point positions are recorded. In each frame of the segment, at the positions determined from the first frame, the following operation is performed:
p_i' = p_i + a · w_i
where p_i' is the pixel value at the feature point positions of the i-th segment after watermark embedding, p_i is the original pixel value at the feature point positions of the i-th segment, a is an amplification factor whose value is a positive integer, and w represents the watermark information, taking the values 1 and -1; in array form it can be expressed as [1, -1, 1, -1, ...], with size n.
Step 5: A two-dimensional inverse DWT is applied to the operated video frames, the color space is converted from YCbCr back to RGB, and the video frames are saved.
The flow chart of the watermark extraction process is shown in Figure 4, and its steps are as follows:
Step 1: Color space conversion: each video frame represented in the RGB color space is first converted to a frame represented in the YCbCr color space, and the Y component, i.e. the luminance component, is extracted;
Step 2: A one-level two-dimensional DWT with the Haar wavelet as the chosen basis is applied to the luminance component of the video frame, and the low-frequency component is extracted;
Step 3: Every n frames form one segment, where n is a positive integer whose value must be identical to the value used during watermark embedding. For the first frame of each segment, SIFT features are extracted from the low-frequency component obtained by the above transform, and the resulting feature point positions are recorded;
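The segment-length choice in Step 3 amounts to simple integer arithmetic: with an m-bit watermark and n frames, the largest usable segment length is floor(n/m). A sketch (the function name is illustrative):

```python
def frames_per_segment(total_frames, watermark_bits):
    """Largest k with watermark_bits * k <= total_frames, i.e.
    k = floor(total_frames / watermark_bits).

    Every k frames then form one segment carrying one watermark bit.
    """
    k = total_frames // watermark_bits
    if k < 1:
        raise ValueError("video too short to carry the watermark")
    return k
```

For example, a 300-frame video carrying a 10-bit watermark yields k = 30, matching the fixed value suggested in the text.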
Step 4: The pixel prediction error at the feature point positions is calculated. The predictor adopted is SGAP, and the prediction formula of SGAP is:
x' = (x_N + x_W) / 2 + (x_NE - x_NW) / 4
where x' is the pixel prediction value, x_N is the pixel value above the pixel being predicted, x_W is the pixel value to its left, x_NE is the upper-right pixel value, and x_NW is the upper-left pixel value.
The prediction error is defined as the difference between a pixel value and its prediction value. For each segment, the average prediction error is computed: all prediction errors of the segment are summed and then divided by the total number of pixels at the feature point positions, and the result is denoted e. A threshold value threshold > 0 is set, and the extraction result is:
w = 1 if e > threshold, and w = -1 if e < -threshold.
In summary, the present invention improves the robustness of video watermarking against geometric attacks, introduces little distortion, and is simple to implement.
Parts of the present invention not elaborated here belong to techniques well known to those skilled in the art.
The above is only a partial embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall be encompassed within the protection scope of the present invention.