Summary of the invention
To address the above problem, the invention discloses a real-time video character localization method based on ROI motion detection on the Android platform. Image processing algorithms perform real-time data conversion and image preprocessing on the video captured by the device; an ROI-based motion detection algorithm then calculates the movement amplitude of the mobile device, and the repeated character localization process is omitted for video frames with small movement amplitude. On the premise of guaranteeing the character localization accuracy rate, the efficiency of real-time character localization is improved.
In order to achieve the above object, the invention provides the following technical scheme:
A real-time video character localization method based on ROI motion detection on the Android platform, comprising the following steps:
Step 10: convert the original video data stream in YUV420 format into video frame images in RGB format through a real-time YUV-to-RGB conversion algorithm;
Step 20: preprocess the video frame images in RGB format, the preprocessing comprising grayscale conversion, binarization and edge detection;
Step 30: detect each frame image with the ROI motion detection method, and track the movement amplitude of the device by calculating the change of state between consecutive frames. When the motion amplitude between consecutive frames is small, the character localization result of the previous frame is reused; when the motion amplitude between consecutive frames is large, character localization is performed again on the latter frame.
As a preferred embodiment of the present invention, in step 20 the grayscale conversion adopts the weighted mean method, the binarization adopts the OTSU method to calculate the binarization threshold, and the edge detection adopts the Canny edge detection algorithm.
As a preferred embodiment of the present invention, the ROI motion detection method comprises the following steps:
Step 301: perform character region localization on the initial frame, and record the position information of the located result region as the position information of the region of interest of the second frame;
Step 302: calculate the information quantity of the region of interest of each of two consecutive frames, and calculate the absolute value of the difference between these information quantities;
Step 303: when the information quantity difference in step 302 is greater than the information difference threshold, perform character localization again on this video frame; when the information quantity difference is not greater than the information difference threshold, reuse the character localization result of the previous frame. Then continue with step 302.
As a preferred embodiment of the present invention, the character localization process comprises the following steps:
Step 401: perform morphological dilation on the edge detection result that needs to be localized;
Step 402: screen the connected domains in the image processed in step 401 according to predefined screening rules to obtain the character region information, and cut out the position of the minimum bounding rectangle of the screened connected domains in the binary image, obtaining the character localization and cutting result.
As a preferred embodiment of the present invention, the information quantity is the black pixel count; the concrete computing method is: scan the region of interest in the binary image lattice and accumulate the number of points whose gray value is 0.
Compared with the prior art, the real-time video character localization method based on ROI motion detection provided by the invention makes use of the similarity and continuity between video frames: it computes the change of information quantity in the regions of interest of consecutive video frames to perform motion detection, and omits the re-localization process for identical characters, which significantly improves the efficiency of character localization. In addition, aiming at the limited processing power of Android mobile devices, the present invention implements the complex image processing procedures in the native language C++ through the Android native development framework, improving the running efficiency of the program. Compared with a method written purely in the Java language that performs localization on every video frame, the real-time performance of localization is effectively improved, which makes the method particularly suitable for multi-character localization of printed characters in simple scenes.
Embodiment
The technical scheme provided by the invention is described in detail below with reference to specific embodiments. It should be understood that the following embodiments are only used to illustrate the present invention and are not intended to limit the scope of the invention.
When performing character localization, this method first obtains the current preview frame data collected by the Android handheld device, and then carries out further image processing on the obtained data, as shown in Figure 1, specifically comprising the following steps. The present embodiment uses a color image containing characters, captured by a Lenovo ThinkPad Tablet 183823C, as the original image to be processed.
Step 10: convert the video stream in the YUV420 standard format to RGB format, which is easier to process with image operations. The formulas for calculating the RGB components of the video image from the three YUV (i.e., YCrCb) components are as follows:
R=1.164*(Y-16)+1.596*(Cr-128)
G=1.164*(Y-16)-0.813*(Cr-128)-0.392*(Cb-128) (1-1)
B=1.164*(Y-16)+2.017*(Cb-128)
where Y represents luminance, and Cr and Cb are chrominance components, defining the two aspects of color, namely hue and saturation, respectively.
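As an illustration of how formula (1-1) can be applied per pixel, below is a minimal C++ sketch assuming the NV21 (YUV420sp) buffer layout commonly delivered by Android camera preview callbacks, i.e., a full-resolution Y plane followed by an interleaved Cr/Cb plane at half resolution; the function and buffer names are illustrative assumptions, not part of the invention:

#include <algorithm>
#include <cstdint>

// Clamp a float to the valid 8-bit channel range.
static inline uint8_t clamp255(float v) {
    return static_cast<uint8_t>(std::min(std::max(v, 0.0f), 255.0f));
}

// Convert one NV21 (YUV420sp) frame to packed RGB using formula (1-1).
// Assumed layout: Y plane, then interleaved Cr/Cb pairs, one pair per 2x2 block.
void yuv420ToRgb(const uint8_t* yuv, int width, int height, uint8_t* rgb) {
    const uint8_t* vu = yuv + width * height;             // start of Cr/Cb plane
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            float y  = yuv[row * width + col];
            int   ci = (row / 2) * width + (col / 2) * 2; // Cr/Cb pair offset
            float cr = vu[ci];
            float cb = vu[ci + 1];
            uint8_t* p = rgb + (row * width + col) * 3;
            p[0] = clamp255(1.164f * (y - 16) + 1.596f * (cr - 128));                       // R
            p[1] = clamp255(1.164f * (y - 16) - 0.813f * (cr - 128) - 0.392f * (cb - 128)); // G
            p[2] = clamp255(1.164f * (y - 16) + 2.017f * (cb - 128));                       // B
        }
    }
}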
Step 20: preprocess every frame image in the video with grayscale conversion, binarization and edge detection, where the binarization adopts the OTSU method to calculate the binarization threshold, and the edge detection adopts the Canny edge detection algorithm to extract contours from the image. After preprocessing, a processed image with distinct character region features is obtained.
Specifically, step 20 comprises the following sub-steps:
Step 201: perform grayscale conversion on the video frames after format conversion, i.e., convert the color RGB image into a grayscale image. The gray value is preferably calculated with the weighted mean method: the R, G and B components of RGB are given different weights W_R, W_G and W_B, and their weighted mean is taken, formulated as:
Gray = W_R*R + W_G*G + W_B*B   (2-1)
Usually, of the three colors red, green and blue, the human eye is most sensitive to green, red comes second, and blue is lowest; therefore this example chooses W_R = 0.299, W_G = 0.587 and W_B = 0.114. The result of grayscale conversion is shown in Figure 4.
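A minimal C++ sketch of this weighted-mean calculation (formula (2-1)), assuming packed 3-byte RGB pixels in R, G, B order; the names are illustrative:

#include <cstdint>

// Weighted-mean grayscale conversion per formula (2-1).
// rgb: packed 3-byte pixels; gray: one byte per pixel.
void rgbToGray(const uint8_t* rgb, int width, int height, uint8_t* gray) {
    for (int i = 0; i < width * height; ++i) {
        const uint8_t* p = rgb + i * 3;
        // W_R = 0.299, W_G = 0.587, W_B = 0.114, as chosen in this example
        gray[i] = static_cast<uint8_t>(0.299f * p[0] + 0.587f * p[1] + 0.114f * p[2]);
    }
}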
Step 202: perform binarization on the image after grayscale conversion. Let the coordinates of a point in the grayscale image be (x, y), let G = {0, 1, ..., 255} (G being the integers from 0 to 255, i.e., the gray range), and let g(x, y) denote the gray value of the pixel at (x, y). Take a gray value t (t ∈ G) as the threshold; the pixels in the grayscale image are then divided according to the threshold into two parts, those greater than the threshold t and those not greater than the threshold t. The threshold t is determined with the maximum between-class variance method (OTSU): the algorithm splits the image into two groups at a certain gray value, corresponding respectively to the background part and the foreground part (the character part). Let the probability of occurrence of gray value i (0 ≤ i ≤ 255) in the image be P_i, and let the global threshold gray value be t. The pixels in the image are divided into two classes, namely the background class A = [0, 1, ..., t] with gray values not greater than the threshold, and the foreground class B = [t+1, t+2, ..., 255] with gray values greater than the threshold. The probabilities of occurrence of the background class and the foreground class are respectively
P_A = Σ_(i=0..t) P_i,  P_B = Σ_(i=t+1..255) P_i,
and the gray means ω_A and ω_B of the two classes are described respectively as:
ω_A = (Σ_(i=0..t) i*P_i) / P_A,  ω_B = (Σ_(i=t+1..255) i*P_i) / P_B   (2-2)
The total gray mean of the image is:
ω_0 = P_A*ω_A + P_B*ω_B   (2-3)
From this, the between-class variance of regions A and B is obtained as:
σ² = P_A*(ω_A - ω_0)² + P_B*(ω_B - ω_0)²   (2-4)
The threshold t is traversed over the gray range 0 to 255; when σ² in formula (2-4) reaches its maximum, i.e., when the variance between class A and class B is largest, the corresponding value of t is taken as the threshold.
The binarization formula is:
b(x, y) = 0 if g(x, y) ≤ t;  b(x, y) = 255 if g(x, y) > t   (2-5)
where b(x, y) is the pixel value after binarization. After OTSU binarization, the image effect is shown in Figure 5.
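The following C++ sketch illustrates the OTSU traversal described by formulas (2-2) to (2-4), using an incrementally updated histogram sum; it is an illustrative implementation, not code from the patent:

#include <cstdint>

// OTSU threshold selection: traverse t over 0..255 and keep the t that
// maximizes the between-class variance sigma^2 of formula (2-4).
int otsuThreshold(const uint8_t* gray, int numPixels) {
    int hist[256] = {0};
    for (int i = 0; i < numPixels; ++i) ++hist[gray[i]];

    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += static_cast<double>(i) * hist[i];

    double nA = 0.0, sumA = 0.0, bestSigma = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        nA   += hist[t];                        // pixels with gray <= t (class A)
        sumA += static_cast<double>(t) * hist[t];
        double nB = numPixels - nA;             // pixels with gray > t (class B)
        if (nA == 0.0 || nB == 0.0) continue;
        double pA = nA / numPixels, pB = nB / numPixels;   // P_A, P_B
        double wA = sumA / nA, wB = (sumAll - sumA) / nB;  // omega_A, omega_B
        double w0 = pA * wA + pB * wB;                     // omega_0, formula (2-3)
        double sigma = pA * (wA - w0) * (wA - w0)          // formula (2-4)
                     + pB * (wB - w0) * (wB - w0);
        if (sigma > bestSigma) { bestSigma = sigma; bestT = t; }
    }
    return bestT;
}

The returned t can then be applied to each pixel according to formula (2-5).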
Step 203: perform edge detection on the binarized image, adopting Canny edge detection, i.e., the optimal step-type edge detection algorithm. The algorithm uses the Gaussian first-order derivative to compute the gradient values of the image, obtains the intensity and direction of the image edges by searching for local maxima of the image gradient, then detects the strong and weak edges of the image with a double-threshold method, and outputs the weak edges when they connect with strong edges to form contours. The core procedure comprises the following:
(1) remove the noise in the image by smoothing the image with a Gaussian filter;
(2) compute the gradient of the image gray values, including magnitude and direction, usually with finite differences of first-order partial derivatives;
(3) find the local maxima of the gradient magnitude of the image gray values;
(4) select two thresholds: obtain the rough edges of the image with the high threshold, and collect new edges to connect the image contours with the low threshold, solving the problem of unclosed edges.
The result of performing Canny edge detection by the above steps is shown in Figure 6.
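The embodiment does not name a specific library for these four steps; one common way to realize them in native C++ is OpenCV's Canny implementation, shown here as an assumed illustration (the threshold values 50 and 150 are placeholders, not values prescribed by the method):

#include <opencv2/imgproc.hpp>

// Canny edge detection over the binarized frame. Gaussian smoothing is
// applied explicitly (step (1)); cv::Canny then performs gradient
// computation, non-maximum suppression and double thresholding (steps (2)-(4)).
cv::Mat detectEdges(const cv::Mat& binary) {
    cv::Mat smoothed, edges;
    cv::GaussianBlur(binary, smoothed, cv::Size(3, 3), 0);  // step (1): remove noise
    cv::Canny(smoothed, edges, 50, 150);                    // steps (2)-(4)
    return edges;
}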
Step 30: detect each frame image with the ROI motion detection method, and track the movement amplitude of the device by calculating the change of state between consecutive frames, judging whether the movement amplitude of the device is large. When the motion amplitude between consecutive frames is small, the character localization result of the previous frame can be reused, and the characters of the latter frame need not be re-localized; when the motion amplitude between consecutive frames is large, character localization must be performed again on the latter frame, i.e., the character region ROI is updated. Whether the motion amplitude of the latter frame relative to the previous frame is large is judged by comparing the difference in black pixel count within the ROI of the two frames. Through the above steps, every frame in the image sequence is traversed without re-localizing frames with small movement amplitude, which obviously improves localization efficiency.
The concrete processing flow of the ROI motion detection method is shown in Figure 2, with the following concrete steps:
Step 301: perform character localization on the initial frame, and record the position information of the located result region as the position information of the region of interest of the second frame. Record the character localization result of the first frame as the rectangular region Rect_1 = F_1(x_1, y_1, w_1, h_1), where (x_1, y_1) is the coordinate of the upper-left corner of the rectangle in the image, w_1 is the width of the rectangle, and h_1 is the height of the rectangle. Let the i-th frame image be F_i; then the character localization result of F_i is the rectangular region Rect_i = F_i(x_i, y_i, w_i, h_i), where (x_i, y_i) is the coordinate of the upper-left corner of the rectangle in the image, w_i is the width of the rectangle, and h_i is the height of the rectangle. Define the ROI (region of interest) of the (i+1)-th frame image F_(i+1) as the region determined by Rect_i, denoted M_(i+1), i.e., M_(i+1) = F_(i+1)(x_i, y_i, w_i, h_i). Record the black pixel count of the region of interest in the i-th frame of video as the information quantity D_i. The concrete computing method is: scan the region of interest in the binary image lattice, i.e., scan every point in the region from [x_i, y_i] to [x_i + w_i, y_i + h_i], and accumulate the number of points whose gray value is 0; this value is the information quantity D_i of the region of interest.
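A minimal C++ sketch of this information quantity calculation, assuming the binary image is stored row-major with one byte per pixel and stride equal to the image width; the names are illustrative:

#include <cstdint>

// Information quantity D_i: count of black (gray value 0) pixels inside the
// ROI with upper-left corner (x, y), width w and height h.
int blackPixelCount(const uint8_t* binary, int width,
                    int x, int y, int w, int h) {
    int count = 0;
    for (int row = y; row < y + h; ++row)
        for (int col = x; col < x + w; ++col)
            if (binary[row * width + col] == 0) ++count;
    return count;
}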
Step 302: calculate the absolute value Δ of the difference between the information quantities of the regions of interest of the i-th frame and the (i+1)-th frame, and judge whether Δ is greater than the information difference threshold d. The information difference threshold d is preferably taken as 1% of the total information content of the image, i.e., d = M × N / 100, where M and N are respectively the width and height of the image.
Step 303: if Δ > d, perform character localization again on this video frame. If Δ ≤ d, the characters are the same and re-localization is unnecessary: the character localization result and information quantity of the (i+1)-th frame reuse those of the i-th frame, concretely D_(i+1) = D_i, M_(i+1) = M_i, and i = i+1. Finally, return to step 302, i.e., continue to judge whether the difference between the information quantity of the next frame and that of the current character localization region exceeds the threshold d.
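Steps 302 and 303 together amount to the following small decision routine, sketched in C++ under the same assumptions as above:

#include <cstdlib>

// Decide whether frame i+1 must be re-localized, given the information
// quantities Di and Dnext of the ROIs of frames i and i+1.
// d = M * N / 100, i.e., 1% of the total information content of the image.
bool needsRelocation(int Di, int Dnext, int M, int N) {
    const int d = M * N / 100;          // information difference threshold d
    return std::abs(Dnext - Di) > d;    // delta > d: re-localize this frame
}

If this returns false, the ROI and information quantity of frame i are simply carried over to frame i+1.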
The method used to perform character localization on the initial frame in step 301 and to perform character localization again on the video in step 303 is the same, namely a method combining morphology and connected domain analysis. First, the dilation operation of mathematical morphology merges the character region into an approximately rectangular area; then connected-domain screening is performed on these approximately rectangular areas, the corresponding minimum bounding rectangle is found and cut out, and the character localization and cutting result is obtained.
The concrete processing flow of character localization is shown in Figure 3, with the following steps:
Step 401: perform morphological dilation on the edge detection result from step 30 that needs to be localized. Let the image to be processed be X, and choose a 3 × 3 square structuring element B. The center point of B is slid over the points of X one by one; if any point of B falls within the scope of X, the point under the center is set to a black point. The effect after morphological dilation is shown in Figure 7.
Step 402: screen the connected domains in the image after the processing of step 401 according to the screening rules shown in Figure 8 (these rules can be modified as required) to obtain the character region information, and cut out the position of the minimum bounding rectangle of the screened connected domains in the binary image obtained after step 202, obtaining the character localization and cutting result. Generally speaking, pictures shot by the same Android device are basically similar in size and pixel information; let the width of the original image be W and its height H, let the width of the minimum bounding rectangle of a character connected region be cW and its height cH, and let the area of the connected domain be cA. The cutting result of the character region is shown in Figure 9.
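Under the assumption that OpenCV is used natively (the patent specifies only morphology plus connected-domain analysis, not a library), steps 401 and 402 might be sketched as follows; the numeric screening limits are placeholders standing in for the rules of Figure 8:

#include <opencv2/imgproc.hpp>
#include <vector>

// Dilate the edge map with a 3x3 square structuring element (step 401),
// then screen connected domains by their bounding rectangles (step 402).
std::vector<cv::Rect> locateCharacters(const cv::Mat& edges, int W, int H) {
    cv::Mat dilated;
    cv::Mat B = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::dilate(edges, dilated, B);                       // step 401

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(dilated, contours, cv::RETR_EXTERNAL,
                     cv::CHAIN_APPROX_SIMPLE);

    std::vector<cv::Rect> regions;
    for (const auto& c : contours) {
        cv::Rect r = cv::boundingRect(c);                // minimum bounding rectangle
        double cA = cv::contourArea(c);                  // connected-domain area
        // Placeholder screening rule: discard regions that are too small
        // or nearly as large as the whole image.
        if (r.width > W / 100 && r.height > H / 100 &&
            r.width < W * 9 / 10 && cA > 0.0)
            regions.push_back(r);                        // step 402 result
    }
    return regions;
}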
All of the above image processing and character localization methods adopt the JNI (Java native development) processing framework: the complicated transformation processes are written in the native language (C++), which improves the efficiency of the program.
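A sketch of what such a JNI bridge can look like; the package, class and method names here are hypothetical and would depend on the actual application:

#include <jni.h>

// JNI entry point: Java passes the preview frame buffer down to the native
// (C++) processing pipeline described above.
extern "C" JNIEXPORT jboolean JNICALL
Java_com_example_locator_NativeLocator_processFrame(JNIEnv* env, jclass,
                                                    jbyteArray frame,
                                                    jint width, jint height) {
    jbyte* data = env->GetByteArrayElements(frame, nullptr);
    // ... run format conversion, preprocessing and ROI motion detection ...
    bool relocated = false;  // would be set by the pipeline above
    env->ReleaseByteArrayElements(frame, data, JNI_ABORT);  // no write-back needed
    return static_cast<jboolean>(relocated);
}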
Ten groups of image character localization experiments were carried out with the character localization method with motion detection of the present embodiment, and ten image character localization experiments were carried out with a character localization method without motion detection (i.e., every frame image is re-localized, the remaining image processing methods being identical to the present embodiment) as a comparative example. The performance test and comparison results of the two methods are shown in Figure 10. The abscissa represents the group number of the experiment, and the ordinate represents the average time spent on localization processing per video frame, in milliseconds (ms), before and after adding the ROI motion detection. With the ROI motion detection step added, the average processing time per video frame for character localization is about 90 ms; compared with the method without ROI motion detection, the processing speed is improved by about 40%.
The technical means disclosed in the scheme of the present invention are not limited to the technical means disclosed in the above embodiments, but also include technical schemes composed of any combination of the above technical features.