Summary of the invention
To solve the problems referred to above, the invention discloses a real-time video character locating method based on ROI motion detection on the Android platform. Image processing algorithms are used to perform real-time data conversion and image preprocessing on the video captured by the device; in combination with an ROI-based motion detection algorithm, the movement amplitude of the mobile device is calculated, and the repeated character locating process is omitted for video frames whose movement amplitude is small. On the premise of ensuring the character locating accuracy, the efficiency of real-time character locating is improved.
In order to achieve the above object, the present invention provides the following technical scheme:
A real-time video locating method based on ROI motion detection on the Android platform comprises the following steps:
Step 10: the original YUV420-format video data stream is converted into RGB-format video frame images by a real-time YUV-to-RGB conversion algorithm;
Step 20: the RGB-format video frame images are preprocessed, the preprocessing including gray-scale conversion, binarization and edge detection;
Step 30: each frame image is detected by the ROI motion detection method, and the change of the adjacent frame states is calculated to track the movement amplitude of the device; when the motion amplitude between adjacent frames is small, the character locating result of the previous frame is reused; when the motion amplitude between adjacent frames is large, character locating is performed again on the later frame.
As a preferred embodiment of the present invention, in said step 20 the gray-scale conversion uses the weighted average method, the binarization method uses the OTSU method to calculate the binarization threshold, and the edge detection uses the Canny edge detection algorithm.
As a preferred embodiment of the present invention, the ROI motion detection method comprises the following steps:
Step 301: character region locating is performed on the initial frame, and the position information of the located result region is recorded as the position information of the region of interest of the second frame;
Step 302: the information quantities of the regions of interest of the adjacent frames are calculated respectively, and the absolute value of the difference between the information quantities of the regions of interest of the adjacent frames is calculated;
Step 303: when the information quantity difference in step 302 is greater than the information difference threshold, character locating is performed again on this video frame; when the information quantity difference is not greater than the information difference threshold, the character locating result of the previous frame is reused; step 302 is then executed again.
As a preferred embodiment of the present invention, the character locating process comprises the following steps:
Step 401: morphological dilation processing is performed on the edge detection result that needs to be located;
Step 402: the connected domains in the image processed by step 401 are screened according to screening rules set in advance to obtain the character region information, and the binary image is cut at the positions of the minimum bounding rectangles of the connected domains that pass the screening, giving the character locating and cutting result.
As a preferred embodiment of the present invention, the information quantity is the black pixel count; the concrete calculation method is: the region of interest is scanned in the binary image dot matrix, and the number of points whose gray value is 0 is accumulated.
Compared with the prior art, the real-time video character locating method based on ROI motion detection provided by the present invention makes use of the similarity and continuity between video frames: motion detection is performed by calculating the change of the information quantities of the regions of interest of adjacent video frames, and the relocating of identical characters is omitted, which significantly improves the efficiency of character locating. In addition, in view of the limited processing capability of Android mobile devices, the present invention implements the complicated image processing procedures in the native language C++ through the Android native development framework, which improves the running efficiency of the program. Compared with methods written purely in the Java language that perform locating on every video frame, the real-time performance of the locating can be effectively improved, and the method is particularly suitable for locating multiple printed characters in simple scenes.
Detailed description of the invention
The technical scheme provided by the present invention is described in detail below with reference to specific embodiments. It should be understood that the following specific embodiments are only intended to illustrate the present invention and not to limit the scope of the present invention.
When performing character locating, the method first obtains the current preview frame data collected by the Android handheld device and performs further image processing on the obtained data. As shown in Figure 1, the method specifically includes the following steps; in this embodiment, a color image containing characters captured by a Lenovo ThinkPad Tablet 183823C is processed as the original image:
Step 10: the video stream in the standard YUV420 format is converted to RGB format, which is easier to process. The RGB components of the video image are calculated from the three YUV (i.e. YCrCb) components by the following formulas:

R = 1.164*(Y - 16) + 1.596*(Cr - 128)
G = 1.164*(Y - 16) - 0.813*(Cr - 128) - 0.392*(Cb - 128)    (1-1)
B = 1.164*(Y - 16) + 2.017*(Cb - 128)

where Y represents the luminance, and Cr and Cb are the chrominance components, describing the two aspects of color, namely hue and saturation.
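For reference, formula (1-1) can be implemented per pixel in the native C++ code roughly as in the following sketch; the function and variable names are illustrative only, and the way in which Y, Cr and Cb are fetched for each pixel depends on the concrete YUV420 layout (for example NV21, which is commonly delivered by the Android camera preview):

```cpp
#include <algorithm>
#include <cstdint>

// Clamp an intermediate value to the valid 0..255 range of an 8-bit channel.
static inline uint8_t clamp255(float v) {
    return static_cast<uint8_t>(std::min(std::max(v, 0.0f), 255.0f));
}

// Convert one YCrCb sample to RGB using formula (1-1).
void yuvToRgb(uint8_t y, uint8_t cr, uint8_t cb,
              uint8_t& r, uint8_t& g, uint8_t& b) {
    float yf = 1.164f * (y - 16);
    r = clamp255(yf + 1.596f * (cr - 128));
    g = clamp255(yf - 0.813f * (cr - 128) - 0.392f * (cb - 128));
    b = clamp255(yf + 2.017f * (cb - 128));
}
```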
Step 20 preprocesses every frame image of the video using gray-scale conversion, binarization and edge detection, where the binarization method uses the OTSU method to calculate the binarization threshold and the edge detection uses the Canny edge detection algorithm to extract the image contours. After the preprocessing, a processed image in which the character region features are salient can be obtained.
Specifically, the sub-steps of step 20 are as follows:
Step 201 performs gray-scale conversion on the video frame after format conversion, converting the color RGB image into a gray-scale image. The gray value is preferably calculated by the weighted average method: different weights W_R, W_G and W_B are assigned to the R, G and B components, and their weighted mean is taken, expressed by the formula:

Gray = W_R*R + W_G*G + W_B*B

Generally, among the three colors red, green and blue, the human eye is most sensitive to green, less sensitive to red and least sensitive to blue; therefore, in this example W_R = 0.299, W_G = 0.587 and W_B = 0.114 are chosen. The result of the gray-scale conversion is shown in Figure 4.
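A minimal C++ sketch of this weighted-average conversion, using the example weights above, could be:

```cpp
#include <cstdint>

// Weighted-average gray-scale conversion of step 201 (illustrative sketch),
// using the example weights W_R = 0.299, W_G = 0.587, W_B = 0.114.
inline uint8_t rgbToGray(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint8_t>(0.299f * r + 0.587f * g + 0.114f * b + 0.5f);
}
```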
Step 202 performs binarization on the image after gray-scale conversion. Let the coordinates of a point in the gray-scale image be (x, y), let G = {0, 1, ..., 255} be the set of integers from 0 to 255, i.e. the gray-scale range, and let g(x, y) denote the gray value of the pixel at (x, y). A gray value t (t ∈ G) is taken as the threshold, and the pixels of the gray-scale image are divided according to the threshold into two parts, those greater than the threshold t and those not greater than the threshold t. The threshold t is determined by the maximum between-class variance method (OTSU): the algorithm divides the image at a certain gray value into two groups, corresponding to the background part and the foreground part (the character part). Let the probability that gray value i (0 ≤ i ≤ 255) appears in the image be P_i, and let the global threshold gray value be t. The pixels of the image are divided into two classes, namely the background class A = [0, 1, ..., t] whose gray values are not greater than the threshold and the foreground class B = [t+1, t+2, ..., 255] whose gray values are greater than the threshold. Let the probabilities of occurrence of the background class and the foreground class be P_A and P_B respectively; then the gray means ω_A and ω_B of the two classes are respectively:

ω_A = (Σ_{i=0..t} i*P_i) / P_A,  ω_B = (Σ_{i=t+1..255} i*P_i) / P_B

The total gray mean of the image is:

ω_0 = P_A*ω_A + P_B*ω_B

The between-class variance of the regions A and B is therefore:

σ² = P_A*(ω_A - ω_0)² + P_B*(ω_B - ω_0)²    (2-4)

The threshold t is traversed over the gray-scale range 0~255; when σ² in formula (2-4) reaches its maximum, i.e. when the between-class variance of the classes A and B is maximal, the corresponding value of t is taken as the threshold.

The binarization formula is:

b(x, y) = 255 if g(x, y) > t;  b(x, y) = 0 if g(x, y) ≤ t

where b(x, y) is the pixel value after binarization. The image after OTSU binarization is shown in Figure 5.
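For reference, the OTSU threshold search and the binarization described above can be sketched in C++ as follows; this is an illustrative implementation of the traversal of t and formula (2-4), not a verbatim reproduction of the embodiment's native code:

```cpp
#include <cstdint>
#include <vector>

// OTSU threshold selection of step 202 (illustrative sketch): t is traversed over
// 0..255 and the value maximizing the between-class variance (2-4) is returned.
int otsuThreshold(const std::vector<uint8_t>& gray) {
    double p[256] = {0};                              // P_i: probability of gray value i
    for (uint8_t v : gray) p[v] += 1.0;
    for (double& pi : p) pi /= gray.size();

    int bestT = 0;
    double bestSigma = -1.0;
    for (int t = 0; t < 256; ++t) {
        double pA = 0, pB = 0, sumA = 0, sumB = 0;
        for (int i = 0; i <= t; ++i)      { pA += p[i]; sumA += i * p[i]; }
        for (int i = t + 1; i < 256; ++i) { pB += p[i]; sumB += i * p[i]; }
        if (pA == 0 || pB == 0) continue;
        double wA = sumA / pA, wB = sumB / pB;        // class means ω_A, ω_B
        double w0 = pA * wA + pB * wB;                // total mean ω_0
        double sigma = pA * (wA - w0) * (wA - w0)     // between-class variance (2-4)
                     + pB * (wB - w0) * (wB - w0);
        if (sigma > bestSigma) { bestSigma = sigma; bestT = t; }
    }
    return bestT;
}

// Binarization: pixels above the threshold become white (255), the rest black (0).
void binarize(std::vector<uint8_t>& gray, int t) {
    for (uint8_t& v : gray) v = (v > t) ? 255 : 0;
}
```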
Step 203 performs edge detection on the binarized image, using the Canny edge detector, i.e. the optimal step-edge detection algorithm. The algorithm uses the first derivative of a Gaussian to calculate the gradient values of the image; by finding the local maxima of the image gradient it obtains the strength and direction of the image edges, then detects the strong and weak edges of the image by a double-threshold method, and outputs a weak edge when it connects to a strong edge to form a contour. The core procedure includes the following:
(1) remove the noise in the image by smoothing it with a Gaussian filter;
(2) compute the gradient of the image gray values, including magnitude and direction, generally using finite differences of the first-order partial derivatives;
(3) perform non-maximum suppression, retaining the local maxima of the gradient magnitude;
(4) select two thresholds: the high threshold yields the significant edges of the image, and the low threshold collects new edges connected to them, solving the problem of unclosed edges.
The result of the Canny edge detection obtained by the above steps is shown in Figure 6.
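For reference, steps (1) to (4) can be realized with an existing library such as OpenCV, as in the following sketch; the use of OpenCV and the concrete threshold values are assumptions of this illustration, not requirements of the embodiment:

```cpp
#include <opencv2/imgproc.hpp>

// Canny edge detection of step 203 (illustrative sketch).
// The two thresholds are example values; the embodiment does not fix them.
cv::Mat detectEdges(const cv::Mat& binaryImage) {
    cv::Mat blurred, edges;
    cv::GaussianBlur(binaryImage, blurred, cv::Size(3, 3), 0);  // step (1): smoothing
    cv::Canny(blurred, edges, 50, 150);   // steps (2)-(4): gradient, NMS, double threshold
    return edges;
}
```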
Step 30 detects each frame image using the ROI motion detection method, and the change of the adjacent frame states is calculated to track the movement amplitude of the device and to judge how large it is. When the motion amplitude between adjacent frames is small, the character locating result of the previous frame can be reused and the characters of the later frame do not need to be relocated; when the motion amplitude between adjacent frames is large, character locating needs to be performed again on the later frame, and the character region ROI is updated at the same time. The motion amplitude of the later frame relative to the previous frame is judged by comparing the black pixel counts within the ROI of the two frames. Through the above steps every frame is traversed, so that frames with small movement amplitude do not need to be relocated, which obviously improves the locating efficiency.
The concrete processing flow of the ROI motion detection method is shown in Figure 2, and the specific steps are as follows:
Step 301 performs character locating on the initial frame, and the position information of the located result region is recorded as the position information of the region of interest of the second frame. The character locating result of the first frame is recorded as the rectangular region Rect_1 = F_1(x_1, y_1, w_1, h_1), where (x_1, y_1) is the coordinate of the upper-left corner of the rectangle in the image, w_1 is the width of the rectangle and h_1 is its height. Let the i-th frame image be F_i; then the character locating result of F_i is the rectangular region Rect_i = F_i(x_i, y_i, w_i, h_i), where (x_i, y_i) is the coordinate of the upper-left corner of the rectangle in the image, w_i is the width of the rectangle and h_i is its height. The ROI (region of interest) of the (i+1)-th frame image F_{i+1} is defined as the region determined by Rect_i and is denoted M_{i+1}, i.e. M_{i+1} = F_{i+1}(x_i, y_i, w_i, h_i). The black pixel count of the region of interest in the i-th video frame is recorded as the information quantity D_i; the concrete calculation method is: scan the region of interest in the binary image dot matrix, i.e. scan every point in the region from [x_i, y_i] to [x_i+w_i, y_i+h_i], and accumulate the number of points whose gray value is 0; this value is the information quantity D_i of the region of interest.
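A minimal C++ sketch of this black pixel count, assuming the binary image is stored row by row as one byte per pixel, could be:

```cpp
#include <cstdint>
#include <vector>

// Information quantity D_i of step 301 (illustrative sketch): the number of black
// pixels (gray value 0) inside the ROI [x, y, w, h] of a binary image whose
// pixels are stored row by row with the given image width.
int roiBlackPixelCount(const std::vector<uint8_t>& binary, int imageWidth,
                       int x, int y, int w, int h) {
    int count = 0;
    for (int row = y; row < y + h; ++row)
        for (int col = x; col < x + w; ++col)
            if (binary[row * imageWidth + col] == 0) ++count;
    return count;
}
```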
Step 302 calculates the absolute value of the difference between the information quantities of the regions of interest of the i-th frame and the (i+1)-th frame, and judges whether it is greater than the information difference threshold d. The information difference threshold d is preferably taken as 1% of the total information quantity of the image, i.e. d = M × N / 100, where M and N are respectively the width and the height of the image.
Step 303: if |D_{i+1} - D_i| > d, character locating is performed again on this video frame. If |D_{i+1} - D_i| ≤ d, the characters are identical and no relocating is needed; the character locating result and information quantity of the (i+1)-th frame reuse those of the i-th frame, concretely: D_{i+1} = D_i, M_{i+1} = M_i, i = i+1. Finally, the method returns to step 302, i.e. it continues to judge whether the difference between the information quantity of the next frame and that of the current character locating region exceeds the threshold d.
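The decision of steps 302 and 303 can be sketched as follows; locateCharacters and roiBlackPixelCount are hypothetical helper names standing in for steps 401-402 and step 301 respectively, and are not terms of the embodiment:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Roi { int x, y, w, h; };

// Hypothetical helpers: step 301's black pixel count and the
// morphology/connected-domain locating of steps 401-402.
int roiBlackPixelCount(const std::vector<uint8_t>& binary, int imageWidth,
                       int x, int y, int w, int h);
Roi locateCharacters(const std::vector<uint8_t>& binary, int M, int N);

// Steps 302-303 (illustrative sketch): re-locate only when the ROI black pixel
// counts of adjacent frames differ by more than d = M*N/100.
void processFrame(const std::vector<uint8_t>& binaryFrame, int M, int N,
                  Roi& roi, int& dPrev) {
    int dCurr = roiBlackPixelCount(binaryFrame, M, roi.x, roi.y, roi.w, roi.h);
    if (std::abs(dCurr - dPrev) > M * N / 100) {       // motion amplitude is large
        roi = locateCharacters(binaryFrame, M, N);     // relocate and update the ROI
        dPrev = roiBlackPixelCount(binaryFrame, M, roi.x, roi.y, roi.w, roi.h);
    }
    // otherwise D_{i+1} = D_i and M_{i+1} = M_i: the previous result is reused
}
```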
The method used to perform character locating on the initial frame in step 301 and the method used to relocate the video frame in step 303 are identical: a method combining morphology and connected domain analysis. First, the dilation operation of mathematical morphology is used to turn the character regions into approximately rectangular regions; the approximately rectangular regions are then screened by connected domain analysis, their corresponding minimum bounding rectangles are found and cut out, and the character locating and cutting result is obtained.
The concrete processing flow of character locating is shown in Figure 3, and the steps are as follows:
Step 401 performs morphological dilation on the edge detection result from step 30 that needs to be located. Let the image to be processed be X and select the structuring element B (a 3 × 3 square). The center of B is slid over the image and compared point by point with X and its surroundings; if any point covered by B falls within X, the point at the current center is set to black. The effect after the morphological dilation processing is shown in Figure 7.
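For reference, the dilation with a 3 × 3 square structuring element can be sketched with OpenCV as follows (the use of OpenCV is an assumption of this illustration):

```cpp
#include <opencv2/imgproc.hpp>

// Morphological dilation of step 401 with a 3x3 square structuring element
// (illustrative sketch).
cv::Mat dilateEdges(const cv::Mat& edgeImage) {
    cv::Mat element = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::Mat dilated;
    cv::dilate(edgeImage, dilated, element);
    return dilated;
}
```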
Step 402 screens the connected domains in the image processed by step 401 according to the screening rules shown in Figure 8 (these rules can be modified as required), obtains the character region information, and cuts the binary image obtained in step 202 at the positions of the minimum bounding rectangles of the connected domains that pass the screening, giving the character locating and cutting result. In general, the image size and pixel information captured by the same Android device are basically similar. Let the width of the original image be W and its height be H, let the width of the minimum bounding rectangle of a character connected region be cW and its height be cH, and let the area of the connected domain be cA. The cutting result of the character region is shown in Figure 9.
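A sketch of the connected-domain screening and cutting is given below; since the concrete rules of Figure 8 are not reproduced here, the screening conditions in the code are placeholders only and must be replaced by the actual rules:

```cpp
#include <opencv2/imgproc.hpp>
#include <vector>

// Connected-domain screening and cutting of step 402 (illustrative sketch).
// The screening conditions relating cW, cH and cA to the image size W x H are
// placeholders; the actual rules of Figure 8 can be modified as required.
std::vector<cv::Mat> cutCharacterRegions(const cv::Mat& dilated, const cv::Mat& binary) {
    cv::Mat work = dilated.clone();
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(work, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    int W = binary.cols, H = binary.rows;
    std::vector<cv::Mat> regions;
    for (const auto& contour : contours) {
        cv::Rect rect = cv::boundingRect(contour);        // minimum bounding rectangle
        int cW = rect.width, cH = rect.height;
        double cA = cv::contourArea(contour);
        // Placeholder rule: discard regions that are too small or too large.
        if (cW < W / 100 || cH < H / 100 || cA > 0.5 * W * H) continue;
        regions.push_back(binary(rect).clone());          // cut from the binary image
    }
    return regions;
}
```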
The above image processing and character locating methods use the JNI (Java Native Interface) development framework for processing: the complicated transformation processes are written in the native language (C++), which improves the efficiency of the program.
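An illustrative JNI bridge is sketched below; the Java package, class and method names are hypothetical and only show how the preview frame data can be handed to the native C++ code:

```cpp
#include <jni.h>

// Illustrative JNI bridge (hypothetical names): the Java side would declare e.g.
//   private native int[] locateNative(byte[] yuv, int width, int height);
// and the heavy image processing then runs in native C++ for efficiency.
extern "C" JNIEXPORT jintArray JNICALL
Java_com_example_roilocator_CharLocator_locateNative(JNIEnv* env, jobject /*thiz*/,
                                                     jbyteArray yuvData,
                                                     jint width, jint height) {
    jbyte* data = env->GetByteArrayElements(yuvData, nullptr);
    // ... YUV->RGB conversion, preprocessing, ROI motion detection, locating ...
    env->ReleaseByteArrayElements(yuvData, data, JNI_ABORT);

    jint rect[4] = {0, 0, 0, 0};                 // x, y, w, h of the located region
    jintArray result = env->NewIntArray(4);
    env->SetIntArrayRegion(result, 0, 4, rect);
    return result;
}
```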
Ten groups of image character locating experiments were carried out with the character locating method of this embodiment that adds motion detection, and another ten groups of image character locating experiments were carried out with a character locating method that does not add the motion detection mode (i.e. character locating is performed on every frame image, with the rest of the image processing identical to this embodiment) as a comparative example. The performance test and comparison results of the two methods are shown in Figure 10. The abscissa represents the group number of the experiment, and the ordinate represents the average processing time per video frame for locating before and after the ROI motion detection is added, in milliseconds (ms). With the ROI motion detection step added, the average character locating time per video frame is about 90 ms; compared with the method without ROI motion detection, the processing speed is improved by about 40%.
The technical means disclosed in the solution of the present invention are not limited to the technical means disclosed in the above embodiment, and also include technical schemes formed by any combination of the above technical features.