CN108776822A - Target area detection method, device, terminal and storage medium - Google Patents
Target area detection method, device, terminal and storage medium
- Publication number: CN108776822A
- Application number: CN201810650498.4A
- Authority
- CN
- China
- Prior art keywords
- target area
- image
- class node
- sample areas
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The embodiments of the present invention disclose a target area detection method, device, terminal and storage medium, belonging to the field of computer technology. The method includes: determining multiple sample regions and their classification results; obtaining a classifier that includes multiple class nodes arranged in order; training the first class node in the classifier according to the multiple sample regions and their classification results, and training each subsequent class node only after the previous class node has finished training, until all of the class nodes are trained; and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region in a second image after the currently tracked image and determining the target area in the second image according to the classification results. Because not every frame needs to be detected, unnecessary computation is reduced; moreover, the accuracy of the classifier is improved, which in turn improves the accuracy of the detected target area.
Description
Technical field
The embodiments of the present invention relate to the field of computer technology, and in particular to a target area detection method, device, terminal and storage medium.
Background technology
With the rapid development of the Internet and the wide rise of video-based social networking, the main medium of Internet information has gradually evolved from text and pictures to video, and various video processing functions have appeared one after another, such as video filters and video tagging. Through these functions, certain target areas in a video can be given personalized processing, adding interest.
In the related art, while a terminal plays a video, a user can manually select a target area in the current image, and the terminal edits the target area in the current image, for example by adding a sticker to it or beautifying it. Based on the position of the target area in the current image, the terminal can also perform forward tracking and backward tracking starting from the current image, determine the position of the target area in every frame before and after the current image, and apply the same editing to the target area in every frame, ensuring consistency between images.
However, if the position or attitude of the terminal changes greatly while it shoots the video, some images of the video may not contain the target area. When tracking reaches an image that does not contain the target area, tracking of the target area fails, and even if the target area is present in later images, it is difficult to detect it again.
Summary of the invention
The embodiments of the present invention provide a target area detection method, device, terminal and storage medium, which can solve the problems existing in the related art. The technical solution is as follows:
In one aspect, a target area detection method is provided. The method includes:
determining, according to a target area selected by a user in a first image of a video, multiple sample regions and the classification results of the multiple sample regions, where a classification result indicates whether a sample region belongs to the target area;
obtaining a classifier to be trained, the classifier including multiple class nodes arranged in order;
training the first class node in the classifier according to the multiple sample regions and their classification results, and training each next class node only after the previous class node has finished training, until all of the class nodes are trained; and
tracking the target area in the images of the video other than the first image, and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region in a second image after the currently tracked image and determining the target area in the second image according to the classification results.
In another aspect, a target area detection device is provided. The device includes:
a sample determining module, configured to determine, according to a target area selected by a user in a first image of a video, multiple sample regions and their classification results, where a classification result indicates whether a sample region belongs to the target area;
an acquisition module, configured to obtain a classifier to be trained, the classifier including multiple class nodes arranged in order;
a training module, configured to train the first class node in the classifier according to the multiple sample regions and their classification results, and to train each next class node only after the previous class node has finished training, until all of the class nodes are trained; and
a detection module, configured to track the target area in the images of the video other than the first image and, when it is determined that the currently tracked image does not contain the target area, apply the trained classifier to classify at least one region in a second image after the currently tracked image and determine the target area in the second image according to the classification results.
In another aspect, a terminal for detecting a target area is provided. The terminal includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set, which is loaded by the processor to implement the operations of the above target area detection method.
In another aspect, a computer-readable storage medium is provided, the storage medium storing at least one instruction, at least one program, a code set or an instruction set, which is loaded by a processor to implement the operations of the above target area detection method.
According to the method, device, terminal and storage medium provided by the embodiments of the present invention, multiple sample regions and their classification results are determined from the target area selected by the user in the first image of the video, a classification result indicating whether a sample region belongs to the target area; a classifier to be trained is obtained, the classifier including multiple class nodes arranged in order; the first class node is trained according to the sample regions and their classification results, and each next class node is trained only after the previous one has finished, until all class nodes are trained; the target area is tracked in the other images of the video, and when the currently tracked image is determined not to contain the target area, the trained classifier is applied to classify at least one region in a second image after the currently tracked image and to determine the target area in the second image according to the classification results. There is no need to run detection on every frame, which reduces unnecessary computation. Moreover, with this dynamic-programming style of training, in which the next class node in the classifier is trained only after the previous one is complete, the accuracy of the classifier is improved; when tracking fails after the classifier has been trained, applying the trained classifier to detect the target area again improves the accuracy of the detected target area.
Moreover, when the target area undergoes a large deformation, the classifier can be updated according to the deformed target area, learning the new target area in time, which improves the robustness and reliability of the classifier: even when the terminal shakes or rotates quickly or the target is occluded, the target area can still be detected promptly and accurately, giving good detection results.
Moreover, with a classifier of linear structure, the number of class nodes is fixed, the classifying space is subdivided as finely as possible, and classification accuracy is improved.
Moreover, image information is combined with sensor data: using the pose information provided by the configured sensors, the position of the target area is estimated, and tracking or detection is performed only within the estimated target region rather than in the other regions outside it. This avoids tracking failures caused by sensor error, reduces unnecessary computation, and increases processing speed.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention, and those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the TLD algorithm in the related art;
Fig. 2 is a schematic diagram of a target area detection method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a classifier provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of image tracking provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of feature points provided by an embodiment of the present invention;
Fig. 6 is another schematic diagram of feature points provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of a coordinate system provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of a cascade classifier provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of an operation procedure provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of tracking speed provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a target area detection device provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Description of embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present invention are described in detail, the TLD (Tracking-Learning-Detection) algorithm is first introduced as follows:
The TLD algorithm is used for long-term tracking of a single object in a video. Referring to Fig. 1, the TLD algorithm includes three modules: a tracking module, a detection module and a learning module.
1. Tracking module:
The tracking module tracks the motion between any two adjacent frames, and determines the position of the target area in the next frame according to its position in the previous frame and the motion between the two frames. The tracking module is effective only when the target area is present in the next frame.
In addition, the tracking module supplies the target area tracked in the next frame to the learning module as a positive sample region, and the learning module uses the positive sample region to train the classifier.
2. Detection module:
The detection module scans the image comprehensively, applies the classifier to classify the scanned regions, finds the regions similar to the target area, generates positive and negative sample regions, and supplies them to the learning module.
When the tracking module fails because the target area is absent from the tracked image, the detection module can supply the target area it finds to the tracking module, which then continues tracking in the subsequent images.
3. Learning module:
The learning module iteratively trains the classifier of the detection module according to the sample regions provided by the tracking module and the detection module, improving the classification accuracy of the classifier.
In the related art, when the target area is tracked across the frames of a video, tracking fails once an image that does not contain the target area is reached. If the target area is present in later images, it must be detected in those images before tracking can continue. Detection, however, requires the classifier, and the classifier is trained from the target areas tracked so far; at the moment tracking fails, the classifier has not yet been fully trained and its accuracy is poor, making it difficult to detect the target area accurately.
The embodiments of the present invention provide a target area detection method in which, after the user selects the target area in the first image, the classifier is trained according to multiple sample regions in the first image and their classification results. In this way, even if tracking later fails, the fully trained classifier can be applied to detect the target area accurately.
The embodiments of the present invention can be applied to scenarios in which a video is edited: when a user manually selects a target area in a certain image of the video, the terminal can edit that target area, and can also detect the target area in the other images of the video and apply the same editing to it there.
For example, when a user shoots a video and selects a head region, the terminal can add a sticker to the head region in every frame of the video, and as the position of the head region changes, the position of the sticker changes accordingly.
Fig. 2 is a schematic diagram of a target area detection method provided by an embodiment of the present invention. The method is performed by a terminal. Referring to Fig. 2, the method includes:
201. The terminal obtains the target area selected by the user in the first image of the video.
The terminal may be a device such as a mobile phone or a smart camera; it is configured with a camera and can shoot images or video through the camera. The video includes multiple frames, and the first image is any image in the video: it may be the first frame, or the image being played when the user triggers an edit instruction.
For example, while the terminal plays the video, upon detecting a pause instruction it displays the currently playing first image, and the user can select a target area in the first image to indicate that it should be edited. When the terminal detects the operation of selecting the target area, it obtains the target area. The selecting operation may be a slide operation or a click operation: the target area may be determined from the start and end positions of a slide operation, or from the clicked region of a click operation.
202. The terminal determines multiple sample regions and their classification results according to the target area in the first image.
Each sample region has one classification result, which indicates whether the sample region belongs to the target area. If a sample region belongs to the target area it is a positive sample region; otherwise it is a negative sample region.
In a possible implementation, the terminal performs region detection on the first image to obtain multiple sample regions, determines the overlap ratio between each sample region and the target area according to their positions in the first image, and determines the classification results of the sample regions according to these overlap ratios.
Optionally, when performing region detection on the first image, a window of fixed size may be slid over the first image to obtain multiple sample regions of that size. The window may be smaller than the target area so that multiple sample regions belonging to the target area can be selected, and its size can be chosen according to the size of the target area and the required accuracy.
Optionally, for each sample region, when the overlap ratio between the sample region and the target area exceeds a preset value, the sample region is determined to belong to the target area, and when the overlap ratio does not exceed the preset value, the sample region is determined not to belong to the target area. The preset value may be 0 or 50%, for example, chosen according to the required accuracy.
Of course, other methods may also be used to determine the classification result of each sample region, for example by comparing the sample region with the target area, computing their similarity, and determining the classification result from the similarity.
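The overlap-based labeling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: regions are assumed to be axis-aligned `(x, y, w, h)` rectangles, and the overlap ratio is taken as intersection-over-union.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def label_samples(samples, target, threshold=0.5):
    """Label a sample region positive (1) when its overlap ratio with the
    target area exceeds the preset value, otherwise negative (0)."""
    return [1 if overlap_ratio(s, target) > threshold else 0 for s in samples]
```

With a target of `(0, 0, 10, 10)`, a coincident window and a nearby window are labeled positive while a distant window is labeled negative, which matches the positive/negative sample split used to train the class nodes.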
203. The terminal obtains a classifier to be trained, the classifier including multiple class nodes arranged in order.
In the embodiment of the present invention, to ensure classification accuracy, the classifier is not trained from the target areas tracked during tracking; instead, the classifier is fully trained, before tracking begins, from the sample regions in the first image. This ensures that when tracking of the target area fails, a more accurate classifier can be applied to detect the target area.
Moreover, the classifier used by the terminal includes multiple class nodes arranged in order, the class nodes forming a linear structure, and each class node can be used to classify a region. The structure of the classifier may be as shown in Fig. 3.
204. The terminal trains the first class node in the classifier according to the multiple sample regions and their classification results, and trains each next class node only after the previous class node has finished training, until all of the class nodes are trained.
The embodiment of the present invention provides a dynamic-programming style of training: starting from a classifier containing only one class node, the first class node is trained; when its training is complete, it is fixed, and the second class node is trained, and so on, until all class nodes in the classifier are trained. This ensures that at each training step the part of the classifier already trained is optimal, so that an optimal classifier is obtained and the accuracy of the classifier is improved.
In a possible implementation, the terminal first initializes the node parameters of the class nodes, trains the node parameters of the first class node according to the sample regions and their classification results to obtain the trained parameters of the first class node, and then trains the node parameters of each next class node according to the sample regions, their classification results, and the trained parameters of the previous class node, obtaining that node's trained parameters, until all class nodes are trained. At that point the node parameters of all class nodes have been trained, and the class nodes can be used for classification.
Optionally, when any class node in the classifier outputs a first classification value, the region to be classified belongs to the target area, and when it outputs a second classification value, the region does not belong to the target area. The first and second classification values are different: for example, when the first classification value is 1 the second is 0, or when the first is 0 the second is 1.
Optionally, the node parameters of each class node may include two pixel positions i and j and a threshold x, where i and j are positive integers. When a region of an image is input to a class node, the region is classified according to whether the difference between the gray level at pixel position i and the gray level at pixel position j exceeds the threshold x: when the difference is not more than x, the classification result of the class node is the first classification value, and when the difference is more than x, the classification result is the second classification value.
Referring to Fig. 3, the classifier includes n class nodes and outputs n classification values, which together form a binary number; converted to decimal, its value lies in the range 0 to 2^n - 1. Each class node has the two-valued classification space {0, 1}, so the classifier has 2^n classifying spaces, which ensures that, with a fixed number of class nodes, the classifying space is subdivided as finely as possible and classification accuracy is improved. Here n is a positive integer, for example 6 or 10.
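The chain of pixel-comparison nodes above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: a region is represented as a flat list of gray levels, pixel positions i and j are indices into that list, and the "difference" is taken as an absolute difference.

```python
def node_output(region, node):
    """One class node with parameters (i, j, x): output the first
    classification value (1) when the gray-level difference between pixel
    positions i and j does not exceed the threshold x, else the second (0).
    The absolute difference is an assumption for this sketch."""
    i, j, x = node
    return 1 if abs(region[i] - region[j]) <= x else 0

def classify(region, nodes):
    """Run the region through the linear chain of class nodes; the n output
    bits, taken in node order, form a binary number whose decimal value in
    [0, 2**n - 1] indexes the region's classifying space."""
    bits = [node_output(region, nd) for nd in nodes]
    return int("".join(str(b) for b in bits), 2)
```

With three nodes the classifier has 2^3 = 8 classifying spaces; for example, outputs 1, 0, 1 combine into the binary number 101, i.e. classifying space 5.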
205. The terminal selects, from the multiple sample regions, the multiple positive sample regions that belong to the target area, and determines the target classification result from the results of classifying the positive sample regions with the classifier.
To find, among the multiple classifying spaces, the classifying space in which the target area lies, the terminal obtains multiple positive sample regions. For each positive sample region, the terminal applies the multiple class nodes to classify it, obtains the classification value output by each class node, combines the classification values in the order of the class nodes into a binary number, and takes the corresponding decimal value as the classification result of the positive sample region. The classification result that occurs most often among the positive sample regions is determined to be the target classification result. Thereafter, only a region whose classification result equals the target classification result is determined to belong to the target area, and a region whose classification result differs from the target classification result is determined not to belong to it.
For example, after a certain positive sample region is input to the classifier, the classification values output by class node 1 through class node n combine into the binary number 100110, whose decimal value is 38.
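The majority vote over the positive samples can be sketched as follows, assuming each positive sample region has already been reduced to its decimal classifying-space index as in step 204:

```python
from collections import Counter

def target_class(codes):
    """Take the classifying-space index that occurs most often among the
    positive sample regions as the target classification result."""
    return Counter(codes).most_common(1)[0][0]

def belongs_to_target(code, target):
    """A region belongs to the target area only when its classification
    result equals the target classification result."""
    return code == target
```

For instance, if the positive samples classify to 38, 38, 12, 38 and 7, the target classification result is 38, and later a candidate region is accepted only when it also classifies to 38.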
206. The terminal tracks the target area in the images of the video other than the first image.
Referring to Fig. 4, the terminal can track in both directions from the first image: for the images that precede the first image in time, it tracks frame by frame toward the beginning of the video to determine the target area in those images, and for the images that follow the first image in time, it tracks frame by frame toward the end of the video to determine the target area in those images.
Specifically, the terminal detects the target area in the first image to obtain multiple feature points, tracks these feature points across every pair of adjacent frames to determine their positions in the other images, and determines the target area in the other images according to those positions.
When extracting the feature points, referring to Fig. 5, the terminal may use uniform grid sampling: multiple equal-sized grid cells are laid over the first image, and one point is chosen in each cell as a feature point, so that a fixed number of feature points can be chosen quickly.
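The grid sampling step can be sketched as follows. As an assumption for this sketch, the chosen point in each cell is simply the cell's centre; the patent only requires one point per cell.

```python
def grid_points(x, y, w, h, rows, cols):
    """Lay a rows x cols grid of equal cells over the region (x, y, w, h)
    and take the centre of each cell as a feature point, yielding a fixed
    number (rows * cols) of points."""
    pts = []
    for r in range(rows):
        for c in range(cols):
            pts.append((x + (c + 0.5) * w / cols,
                        y + (r + 0.5) * h / rows))
    return pts
```

A 2x2 grid over a 10x10 region thus always yields exactly four feature points, regardless of image content, which is what makes this scheme fast.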
Alternatively, considering that the chosen feature points need to reflect the features of the image effectively, algorithms such as FAST (Features from Accelerated Segment Test), Harris (a corner detection algorithm), SURF (Speeded-Up Robust Features) or BRISK (Binary Robust Invariant Scalable Keypoints) may be used to extract feature points from the first image. The extracted feature points, as shown in Fig. 6, reflect the image features of the target area.
In a possible implementation, starting from the first image, the terminal can track the multiple feature points into the next frame and find the matching feature points in the next frame, thereby obtaining the motion information of the feature points, which indicates the change in position of the next frame relative to the first image. It can then iterate over the position of the target area in the first image and the motion information of the feature points to determine the position of the target area in the next frame, thereby tracking the target area. A similar tracking procedure can be used for later images: iterating over the position of the target area in the previous frame and the motion information of the feature points determines the position of the target area in the next frame.
The terminal may obtain the motion information of the feature points using an optical flow matching algorithm, or using other algorithms.
After the terminal obtains the motion information of the feature points, it can determine from the motion information the positions of the feature points in the previous frame and in the next frame, and thereby determine the rotation-translation matrix of the next frame relative to the previous frame. The displacement parameters in this rotation-translation matrix describe the change in position of the next frame relative to the previous frame, and the position of the target area in the next frame can be determined from these displacement parameters.
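The update from matched feature points can be sketched as follows. As a simplifying assumption, the sketch estimates only the translation part of the rotation-translation matrix, taking the median displacement of the matched points as a robust estimate of the region's frame-to-frame motion:

```python
def median_displacement(prev_pts, next_pts):
    """Estimate the frame-to-frame translation of the target area as the
    median displacement of the matched feature points (a simplified
    stand-in for the full rotation-translation estimate)."""
    dxs = sorted(b[0] - a[0] for a, b in zip(prev_pts, next_pts))
    dys = sorted(b[1] - a[1] for a, b in zip(prev_pts, next_pts))
    mid = len(dxs) // 2
    return dxs[mid], dys[mid]

def move_region(region, disp):
    """Shift the (x, y, w, h) target area by the estimated displacement."""
    x, y, w, h = region
    dx, dy = disp
    return (x + dx, y + dy, w, h)
```

Using the median rather than the mean keeps a few mismatched points from dragging the estimated position off the target.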
In a possible implementation, for a video shot in real time, the terminal can obtain, through its configured sensors, the pose information of the camera when each frame is shot. The pose information indicates the current position and attitude of the camera, and the sensors may include an acceleration sensor and a gyroscope sensor. According to the change in pose information between any two adjacent frames and the position of the target area in the previous frame, an estimated target region in the next frame is obtained. Feature point tracking is then performed only within the estimated target region to determine the position of the target area in the next frame, without tracking feature points outside the estimated target region, which reduces unnecessary computation and increases tracking speed.
It should be noted that if the position or attitude of the camera changes too much while the video is shot, some images may not contain the target area; in that case the following step 207 can be executed to detect the target area again in later images.
Alternatively, if parameters such as the position or attitude of the camera change too much while the video is shot, the target area in some images may deform substantially, making it difficult to track with the originally extracted feature points. To ensure that the target area can still be detected in this case, in a possible implementation, for a third image being tracked, the terminal can obtain the tracking error of the third image while tracking the target area in it. When the tracking error exceeds a first preset threshold, indicating that the target area has deformed substantially, the target area tracked in the third image is collected as a sample region, and the classifier can be updated according to this sample region to obtain an updated classifier.
The tracking error may be an FB (Forward-Backward) error, an NCC (Normalized Cross-Correlation) error, or an SSD (Sum of Squared Differences) error, for example.
In another possible implementation, the terminal may set a first preset threshold and a second preset threshold, the second preset threshold being greater than the first. When the tracking error of the third image is greater than the first preset threshold and not greater than the second preset threshold, the target area tracked in the third image is collected as a sample region, and the classifier can be updated according to that sample region to obtain an updated classifier. When the tracking error exceeds the second preset threshold, tracking has failed: the error of the currently tracked region is too large for it to serve as the target area, so it is determined that the third image does not contain the target area, and the following step 207 still needs to be executed to detect the target area again in subsequent images.
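The two-threshold rule above can be sketched as a small decision function; the threshold values and return labels below are made-up placeholders, not values from the patent.

```python
def handle_tracking_error(error, t1, t2):
    """Two-threshold rule from the passage above: below t1 keep tracking,
    between t1 and t2 collect the tracked region as a sample and update
    the classifier, above t2 declare tracking failure (re-detect as in
    step 207).  Requires t1 < t2."""
    if error <= t1:
        return "track"      # small error: target essentially undeformed
    if error <= t2:
        return "update"     # large deformation: learn from this region
    return "redetect"       # tracking failed: region unusable as target

print(handle_tracking_error(0.1, 0.3, 0.8))  # track
print(handle_tracking_error(0.5, 0.3, 0.8))  # update
print(handle_tracking_error(0.9, 0.3, 0.8))  # redetect
```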
Besides the case where the target area undergoes large deformation, the terminal may also set a preset duration or a preset number of frames: after every preset duration, or after every preset number of images, the currently tracked target area is taken as a sample region and the classifier is updated according to it, yielding an updated classifier. In this way, when the target area has not deformed but its surroundings have changed, the change is learned in time, ensuring that the updated classifier can detect the target area promptly and accurately.
207. When it is determined that the currently tracked image does not contain the target area, the terminal applies the trained classifier to classify at least one region in a second image after the currently tracked image, and determines the target area in the second image according to the classification results.
If tracking starts from the first image and the target area can no longer be tracked when a certain image is reached, it can be determined that the currently tracked image does not contain the target area and that tracking has failed. At this point the target area needs to be detected again in the images after the currently tracked image before tracking can continue.
Here, when tracking proceeds toward earlier frames, the images "after" the currently tracked image are those located before the current image in time; when tracking proceeds toward later frames, they are those located after the current image in time.
Taking the second image after the currently tracked image as an example, the terminal may perform region detection on the second image to obtain at least one region, input the at least one region into the trained classifier, and apply the classifier to classify the at least one region to obtain classification results — that is, to determine which regions among the at least one region belong to the target area and which do not — thereby determining the position of the target area in the second image according to the classification results and achieving relocation of the target area.
In one possible implementation, for a video shot in real time, the terminal may obtain during shooting, through a configured sensor, the camera's pose information when each frame image is shot. The pose information can indicate the camera's current position and attitude, and the sensor may include an acceleration sensor, a gyroscope sensor, and the like. According to the pose-information change between any two adjacent frame images and the position of the target area in the previous frame image, the estimated target area in the next frame image is obtained. Region detection is then performed on the estimated target area to obtain at least one region, and the classifier is applied to determine the precise position of the target area after classification; other regions outside the estimated target area need not be detected, which reduces unnecessary computation and improves detection speed.
The coordinate system of the terminal may be as shown in Fig. 7. The terminal's displacement along the three axis directions between shooting any two adjacent frame images can be obtained through the sensor, and the position X_{t+1} of the target area in the next frame image can be estimated from its position X_t in the previous frame image using the following formula:

    X_{t+1} = K · R · K⁻¹ · X_t

where X = (x, y, 1)ᵀ, with x and y denoting the two-dimensional coordinates of a pixel and X its homogeneous coordinates; K denotes the camera parameter matrix,

    K = | fx  0  cx |
        |  0  fy  cy |
        |  0   0   1 |

with fx, fy, cx and cy denoting camera parameters; and R denotes the rotation-translation matrix between the two frame images, which can be determined from the terminal's displacement along the three axis directions between shooting the two adjacent frame images.
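The formula above can be sketched numerically. Note that K·R·K⁻¹ is the standard homography for a camera motion dominated by rotation; the intrinsics (fx, fy, cx, cy) and pixel coordinates below are made-up illustrative values, not values from the patent.

```python
import numpy as np

def estimate_next_position(x_t, K, R):
    """Map the homogeneous pixel position X_t of the target through
    X_{t+1} = K @ R @ inv(K) @ X_t, then de-homogenize back to pixels."""
    x_next = K @ R @ np.linalg.inv(K) @ x_t
    return x_next / x_next[2]

# Illustrative camera intrinsics (assumed values).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# With an identity rotation between frames, the estimate stays in place.
X_t = np.array([100.0, 50.0, 1.0])
print(estimate_next_position(X_t, K, np.eye(3)))  # ~ [100, 50, 1]
```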
Based on the possible implementation in step 204, for each region in the second image the terminal may apply the multiple classification nodes to classify the region separately and obtain the classification values output by the multiple classification nodes; combine the values output by the nodes, in node order, into a binary number; take the decimal value corresponding to that binary number as the region's classification result; and judge whether the region's classification result equals the target classification result. When the classification result equals the target classification result, the region is determined to belong to the target area; when it does not, the region is determined not to belong to the target area. Each region in the second image can be judged in this way, thereby determining the position of the target area in the second image.
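The bit-combination step above can be sketched as follows; this is an illustrative reading of the scheme, with the node outputs and target result invented for the example.

```python
def region_classification_result(node_outputs):
    """Combine the 0/1 value emitted by each classification node, in node
    order, into a binary number and return its decimal value, as described
    above for both training (step 204) and detection (step 207)."""
    value = 0
    for bit in node_outputs:    # earlier nodes contribute higher-order bits
        value = (value << 1) | bit
    return value

def belongs_to_target(node_outputs, target_result):
    """A region belongs to the target area when its combined result equals
    the target classification result learned from the positive samples."""
    return region_classification_result(node_outputs) == target_result

# With 4 nodes outputting 1,0,1,1 the region's result is 0b1011 = 11.
print(region_classification_result([1, 0, 1, 1]))         # 11
print(belongs_to_target([1, 0, 1, 1], target_result=11))  # True
print(belongs_to_target([1, 1, 1, 1], target_result=11))  # False
```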
In one possible implementation, the regions in the second image can first be screened using the above classifier to obtain multiple regions that may belong to the target area; the remaining regions can then be further screened using a nearest-neighbor classifier — that is, the similarity between each region and the target area is computed. When the similarity exceeds a preset similarity, the region is determined to belong to the target area; when it does not, the region is determined not to belong to the target area and is filtered out. After screening, the regions belonging to the target area can be determined, and thus the position of the target area in the second image.
In another possible implementation, the regions in the second image can be screened using the above classifier to obtain multiple regions that may belong to the target area. The descriptors of the feature points in the target area can then be combined to form the target area's feature, and a feature-matching classifier applied: for each remaining region, feature points are extracted, their descriptors are combined to form the region's feature, and the distance between the region's feature and the target area's feature is computed. When the distance is less than a preset distance, the region is determined to belong to the target area; when it is not, the region is determined not to belong to the target area and is filtered out. The distance may be a Euclidean distance, a Hamming distance, or the like.
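The Hamming-distance variant can be sketched as follows. Binary descriptors are represented here as packed integers, and pairing descriptors by index is an illustrative simplification (real feature matching would pair nearest descriptors).

```python
def hamming_distance(desc_a, desc_b):
    """Hamming distance between two binary descriptors packed as ints."""
    return bin(desc_a ^ desc_b).count("1")

def matches_target(region_descs, target_descs, max_distance):
    """Treat a region as the target when the summed per-point distance
    between its descriptors and the target's is below the preset distance."""
    dist = sum(hamming_distance(a, b)
               for a, b in zip(region_descs, target_descs))
    return dist < max_distance

target = [0b10110010, 0b01101100]
close  = [0b10110011, 0b01101100]   # differs from target in a single bit
far    = [0b01001101, 0b10010011]   # differs in every bit
print(matches_target(close, target, max_distance=4))  # True
print(matches_target(far, target, max_distance=4))    # False
```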
In yet another possible implementation, the linear-structure classifier of step 207, the nearest-neighbor classifier and the feature-matching classifier can be combined into a cascade classifier as shown in Fig. 8, which detects the target area in the second image after multiple rounds of screening. Referring to Fig. 8, region 1, region 2 and region 3 are input into the cascade classifier. The linear-structure classifier determines that region 1 does not belong to the target area while regions 2 and 3 do, so region 1 is filtered out and regions 2 and 3 are input into the nearest-neighbor classifier. The nearest-neighbor classifier determines that region 2 does not belong to the target area while region 3 does, so region 2 is filtered out and region 3 is input into the feature-point-matching classifier. The feature-point-matching classifier determines that region 3 belongs to the target area, so the output target area is region 3.
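The cascade structure of Fig. 8 can be sketched generically; the toy yes/no stages below merely reproduce the walk-through in the text and stand in for the three real classifiers.

```python
def cascade(regions, stages):
    """Run candidate regions through an ordered list of accept/reject
    stages (linear classifier -> nearest neighbor -> feature matching,
    as in Fig. 8); a region survives only if every stage accepts it."""
    survivors = list(regions)
    for stage in stages:
        survivors = [r for r in survivors if stage(r)]
    return survivors

# Toy stages reproducing the Fig. 8 walk-through: region 1 fails the
# first stage, region 2 fails the second, region 3 passes all three.
linear_ok  = lambda r: r in {2, 3}
nearest_ok = lambda r: r in {3}
feature_ok = lambda r: r in {3}
print(cascade([1, 2, 3], [linear_ok, nearest_ok, feature_ok]))  # [3]
```

Ordering the cheapest classifier first is the usual design choice for a cascade: most non-target regions are rejected before the more expensive feature-matching stage runs.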
Tracking can then continue on the images after the second image: multiple feature points are first extracted from the target area in the second image, and those feature points are tracked in a manner similar to step 206 above to find the target area.
In another embodiment, when none of the regions in the second image belongs to the target area, the second image does not contain the target area either; in that case detection can continue on subsequent images until the target area is found in some image.
In this embodiment of the present invention, when the target area is tracked or detected in any frame image of the video, the target area can be edited, for example by zooming it in or out, adding a sticker or photo effect to it, or applying mosaic processing to it; the specific processing can be set by terminal default or by the user. Editing the target area can help users generate more personalized, richer and livelier videos, enhancing entertainment value and interest.
In the method provided by this embodiment of the present invention, multiple sample regions and their classification results are determined according to the target area the user specifies in the first image of the video, the classification results indicating whether each sample region belongs to the target area; a classifier to be trained is obtained, the classifier comprising multiple classification nodes arranged in order; the first classification node in the classifier is trained according to the multiple sample regions and their classification results, and after the first classification node is trained the next classification node continues to be trained, until the multiple classification nodes are all trained. The target area is then tracked in the images of the video other than the first image, and when it is determined that the currently tracked image does not contain the target area, the trained classifier is applied to classify at least one region in a second image after the currently tracked image, and the target area in the second image is determined according to the classification results — without detecting every frame image, which reduces unnecessary computation. Moreover, with the dynamic-programming-style training method, in which the next classification node is trained only after the previous one is completed, the accuracy of the classifier is improved; when target-area tracking fails, the trained classifier is applied to classify and detect the target area again, which can improve the accuracy of the target area.
Furthermore, when the target area undergoes large deformation, the classifier can be updated according to the deformed target area, learning the new target area in time, which improves the robustness and reliability of the classifier; when the terminal shakes or rotates rapidly, or the target is occluded, the target area can still be detected promptly and accurately, giving ideal detection results.
Also, when target-area tracking relies only on sensor data, rapid and violent shaking of the terminal can cause large fluctuations in the sensor data, deviating the position of the target area and causing tracking to fail. This embodiment of the present invention instead combines image visual information with sensor data: the pose information provided by the configured sensor is used to estimate the position of the target area, tracking or detection is performed within the estimated target area, and no tracking or detection is performed on other regions outside the estimated target area. This avoids tracking failures caused by sensor error, and also reduces unnecessary computation and increases computing speed.
The operational flow of this embodiment of the present invention may be as shown in Fig. 9. The terminal may include a tracking module, a detection module and a learning module. The tracking module executes step 206 above and supplies the tracked target area to the learning module as a positive sample region. The detection module executes steps 201-205 above to obtain the trained classifier and, when the tracking module fails, executes step 207 to detect the target area again, after which the tracking module continues tracking. Moreover, when the tracked target area undergoes large deformation, the learning module can learn that target area and update the classifier.
In the traditional TLD algorithm, the tracking module, detection module and learning module are combined: for each frame image, the results of the tracking module and the detection module are fused to determine the position of the target area, the determined target area is taken as a positive sample region, and the classifier is trained by the learning module, improving the robustness of the detection module. Since the traditional TLD algorithm is a single-target tracking algorithm and every frame image must pass through all three parts, the computation is large and the processing speed slow.
In the method provided by this embodiment of the present invention, by contrast, it is not necessary to detect and learn on every frame image: detection is performed only when tracking fails, and learning only when the target area undergoes large deformation, thereby avoiding unnecessary computation.
Moreover, because the classifier applied by the detection module has been trained in the dynamic-programming manner before tracking, the accuracy of the classifier is improved, which also guarantees the accuracy of the detected target area when detection is performed after a tracking failure.
Since the traditional TLD algorithm uses a classifier with a binary-tree structure, suppose the classifier contains 15 classification nodes in total: 4 layers of classification are needed, yet only 8 classification intervals are finally determined; the divided classification intervals shrink, so the classification accuracy is not high enough. This embodiment of the present invention instead uses a classifier with a linear structure: all n classification nodes in the classifier participate in classifying each region, giving the classifier 2^n classification intervals. With the number of classification nodes fixed, this subdivides the classification space maximally and improves classification accuracy.
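The interval counts claimed above can be checked arithmetically; this is only a back-of-envelope illustration of the comparison, under the assumption of a balanced 4-layer tree over 15 nodes.

```python
# A balanced binary tree of 15 nodes (1 + 2 + 4 + 8) is 4 layers deep:
# each region's path visits 4 nodes and ends at one of the 8 leaves,
# so only 8 classification intervals exist.  A linear chain instead
# uses all 15 node outputs per region, giving 2**15 distinct results.
n = 15
tree_intervals = 2 ** 3        # leaves of the 4-layer tree
linear_intervals = 2 ** n      # all bit patterns of the linear chain
print(tree_intervals, linear_intervals)  # 8 32768
```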
For 3 test videos, the tracking errors of the method used in this embodiment of the present invention and of the traditional TLD algorithm can be as shown in Table 1 below. As can be seen from Table 1, this embodiment significantly reduces the tracking error and has higher accuracy.
Table 1

|                   | Test video 1 | Test video 2 | Test video 3 |
|-------------------|--------------|--------------|--------------|
| Present invention | 5.6          | 10.11        | 1.24         |
| Traditional TLD   | 7.1          | 15.3         | 1.33         |
The tracking speeds of the method used in this embodiment of the present invention, the CT (Compressive Tracking) algorithm, the traditional TLD algorithm and the ECO (Efficient Convolution Operators for Tracking) algorithm can be as shown in Fig. 10. As can be seen from Fig. 10, this embodiment significantly improves the tracking speed and can essentially achieve real-time tracking.
Figure 11 is a structural schematic diagram of a target area detection apparatus provided by an embodiment of the present invention. Referring to Fig. 11, the apparatus includes:
A sample determining module 1101, configured to execute the step in the above embodiments of determining multiple sample regions and the classification results of the multiple sample regions;
An acquisition module 1102, configured to execute the step in the above embodiments of obtaining a classifier to be trained;
A training module 1103, configured to execute the step in the above embodiments of training the first classification node in the classifier according to the multiple sample regions and their classification results, and continuing to train the next classification node after the first classification node is trained, until the multiple classification nodes are trained;
A detection module 1104, configured to execute the step in the above embodiments of tracking the target area in the images of the video other than the first image and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region in a second image after the currently tracked image, and determining the target area in the second image according to the classification results.
Optionally, the sample determining module 1101 includes:
A region detection unit, configured to execute the step in the above embodiments of performing region detection on the first image to obtain multiple sample regions;
A determination unit, configured to execute the step in the above embodiments of determining the classification results of the multiple sample regions according to the overlap ratio between the multiple sample regions and the target area.
Optionally, the training module 1103 includes:
An initialization unit, configured to execute the step in the above embodiments of initializing the node parameters of the multiple classification nodes;
A training unit, configured to execute the step in the above embodiments of training the node parameter of the first classification node in the classifier according to the multiple sample regions and their classification results, to obtain the node parameter after the first classification node is trained;
The training unit is further configured to execute the step in the above embodiments of continuing, according to the multiple sample regions, their classification results and the node parameter obtained after the previous classification node is trained, to train the node parameter of the next classification node and obtain the node parameter after the next classification node is trained, until the multiple classification nodes are trained.
Optionally, when any classification node in the classifier outputs a first classification value, it indicates that the current region to be classified belongs to the target area; when any classification node outputs a second classification value, it indicates that the current region to be classified does not belong to the target area;
The apparatus further includes:
A selection module, configured to execute the step in the above embodiments of selecting, from the multiple sample regions, multiple positive sample regions that belong to the target area;
A classification module, configured to execute the step in the above embodiments of, for each positive sample region, applying the multiple classification nodes to classify the positive sample region separately, and obtaining the classification values output by the multiple classification nodes;
A combination module, configured to execute the step in the above embodiments of combining, according to the order of the multiple classification nodes, the classification values output by the multiple classification nodes into a binary number, and taking the decimal value corresponding to the binary number as the classification result of the positive sample region;
A target determining module, configured to execute the step in the above embodiments of determining the classification result that occurs most frequently among the multiple positive sample regions as the target classification result.
Optionally, the apparatus further includes:
An error acquisition module, configured to execute the step in the above embodiments of obtaining the tracking error when tracking the target area in a third image of the video other than the first image;
A sample acquisition module, configured to execute the step in the above embodiments of taking the target area tracked in the third image as a sample region when the tracking error exceeds the first preset threshold;
An update module, configured to execute the step in the above embodiments of updating the classifier according to the sample region to obtain an updated classifier.
Optionally, the detection module 1104 is configured to execute the steps in the above embodiments of, for each region in the second image, applying the multiple classification nodes to classify the region separately and obtaining the classification values output by the multiple classification nodes; combining, according to the order of the multiple classification nodes, the classification values output by the multiple classification nodes into a binary number, and taking the decimal value corresponding to the binary number as the classification result of the region; and determining that the region belongs to the target area when the classification result equals the target classification result.
Optionally, the detection module 1104 is configured to execute the steps in the above embodiments of detecting the target area in the first image to obtain multiple feature points; tracking the multiple feature points in any two adjacent frame images to determine the positions of the multiple feature points in the other images; and determining the target area in the other images according to the positions of the multiple feature points in those images.
Optionally, the apparatus further includes:
An error acquisition module, configured to execute the step in the above embodiments of obtaining the tracking error when tracking the target area in a third image of the video other than the first image;
A sample collection module, configured to execute the step in the above embodiments of taking the target area tracked in the third image as a sample region when the tracking error exceeds the first preset threshold;
An update module, configured to execute the step in the above embodiments of updating the classifier according to the sample region to obtain an updated classifier.
Optionally, the sample collection module is further configured to execute the step in the above embodiments of taking the target area tracked in the third image as a sample region when the tracking error exceeds the first preset threshold and does not exceed the second preset threshold, the second preset threshold being greater than the first preset threshold;
The apparatus further includes: a determining module, configured to execute the step in the above embodiments of determining that the third image does not contain the target area when the tracking error exceeds the second preset threshold.
It should be noted that when the target area detection apparatus provided by the above embodiments detects a target area, the division into the above functional modules is only used as an example; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the terminal can be divided into different functional modules to complete all or part of the functions described above. In addition, the target area detection apparatus provided by the above embodiments belongs to the same concept as the target area detection method embodiments; for its specific implementation process, refer to the method embodiments, which will not be repeated here.
Figure 12 shows a structural block diagram of a terminal 1200 provided by an exemplary embodiment of the present invention. The terminal 1200 may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, a desktop computer, a head-mounted device, or any other intelligent terminal. The terminal 1200 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
In general, the terminal 1200 includes: a processor 1201 and a memory 1202.
The processor 1201 may include one or more processing cores, such as a 4-core processor or a 5-core processor. The processor 1201 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1201 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1202 may include one or more computer-readable storage media, which may be non-transitory. The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic-disk storage devices and flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one instruction, the at least one instruction being executed by the processor 1201 to implement the target area detection method provided by the method embodiments of this application.
In some embodiments, the terminal 1200 optionally further includes: a peripheral device interface 1203 and at least one peripheral device. The processor 1201, the memory 1202 and the peripheral device interface 1203 may be connected by buses or signal lines. Each peripheral device may be connected to the peripheral device interface 1203 by a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of: a radio frequency circuit 1204, a touch display screen 1205, a camera 1206, an audio circuit 1207, a positioning component 1208 and a power supply 1209.
The peripheral device interface 1203 can be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, the memory 1202 and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1204 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with communication networks and other communication devices through electromagnetic signals: it converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1204 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to: metropolitan area networks, mobile communication networks of each generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1204 may also include circuits related to NFC (Near Field Communication), which is not limited in this application.
The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to capture touch signals on or above its surface. The touch signals can be input to the processor 1201 for processing as control signals. In this case the display screen 1205 can also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1205, arranged on the front panel of the terminal 1200; in some other embodiments, there may be at least two display screens 1205, arranged respectively on different surfaces of the terminal 1200 or in a folding design; in still other embodiments, the display screen 1205 may be a flexible display screen, arranged on a curved surface or a folding surface of the terminal 1200. The display screen 1205 can even be set in a non-rectangular irregular shape, that is, a shaped screen. The display screen 1205 can be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera assembly 1206 is used to capture images or video. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background-blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 1206 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash; a dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, which can be used for light compensation under different color temperatures.
The audio circuit 1207 may include a microphone and a speaker. The microphone is used to capture the sound waves of the user and the environment and convert them into electrical signals that are input to the processor 1201 for processing, or input to the radio frequency circuit 1204 for voice communication. For stereo capture or noise-reduction purposes, there may be multiple microphones, arranged at different parts of the terminal 1200. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 1207 may also include a headphone jack.
The positioning component 1208 is used to locate the current geographic position of the terminal 1200 to implement navigation or LBS (Location Based Service). The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 1209 is configured to supply power to the components in the terminal 1200. The power supply 1209 may be an alternating-current supply, a direct-current supply, a disposable battery, or a rechargeable battery. When the power supply 1209 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also support fast-charging technology.
In some embodiments, the terminal 1200 further includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to: an acceleration sensor 1211, a gyroscope sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
The acceleration sensor 1211 can detect the magnitudes of accelerations on the three coordinate axes of the coordinate system established with the terminal 1200. For example, the acceleration sensor 1211 may be used to detect the components of the gravitational acceleration on the three coordinate axes. The processor 1201 may control, according to the gravitational acceleration signal collected by the acceleration sensor 1211, the touch display screen 1205 to display the user interface in a landscape view or a portrait view. The acceleration sensor 1211 may also be used to collect motion data of a game or of the user.
The gyroscope sensor 1212 can detect the body direction and rotation angle of the terminal 1200, and may cooperate with the acceleration sensor 1211 to capture the user's 3D actions on the terminal 1200. According to the data collected by the gyroscope sensor 1212, the processor 1201 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 1213 may be arranged on a side frame of the terminal 1200 and/or at the lower layer of the touch display screen 1205. When the pressure sensor 1213 is arranged on the side frame of the terminal 1200, the user's grip signal on the terminal 1200 can be detected, and the processor 1201 performs left/right-hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 1213. When the pressure sensor 1213 is arranged at the lower layer of the touch display screen 1205, the processor 1201 controls the operable controls on the UI interface according to the user's pressure operation on the touch display screen 1205. The operable controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 1214 is configured to collect the user's fingerprint, and the processor 1201 recognizes the user's identity according to the fingerprint collected by the fingerprint sensor 1214; alternatively, the fingerprint sensor 1214 recognizes the user's identity according to the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 1201 authorizes the user to perform relevant sensitive operations, the sensitive operations including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1214 may be arranged on the front, the back, or a side of the terminal 1200. When a physical button or a manufacturer logo is provided on the terminal 1200, the fingerprint sensor 1214 may be integrated with the physical button or the manufacturer logo.
The optical sensor 1215 is configured to collect the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the touch display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1205 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 1205 is turned down. In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
The proximity sensor 1216, also called a distance sensor, is usually arranged on the front panel of the terminal 1200. The proximity sensor 1216 is configured to collect the distance between the user and the front of the terminal 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually decreases, the processor 1201 controls the touch display screen 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually increases, the processor 1201 controls the touch display screen 1205 to switch from the screen-off state to the screen-on state.
Those skilled in the art can understand that the structure shown in Figure 12 does not constitute a limitation on the terminal 1200, and the terminal may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
An embodiment of the present invention further provides a terminal for detecting a target area. The terminal includes a processor and a memory, and the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the operations performed in the target area detection method of the above embodiments.
An embodiment of the present invention further provides a computer-readable storage medium. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the operations performed in the target area detection method of the above embodiments.
Those of ordinary skill in the art can understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above descriptions are merely preferred embodiments of the present invention, and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (15)
1. A target area detection method, characterized in that the method comprises:
determining multiple sample areas and classification results of the multiple sample areas according to a target area designated by a user in a first image of a video, the classification results being used to indicate whether the sample areas belong to the target area;
obtaining a classifier to be trained, the classifier comprising multiple class nodes arranged in sequence;
training a first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continuing to train a next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
tracking the target area in images of the video other than the first image, and when it is determined that a currently tracked image does not contain the target area, classifying at least one region in a second image after the currently tracked image by applying the trained classifier, and determining the target area in the second image according to classification results.
2. The method according to claim 1, characterized in that the determining multiple sample areas and classification results of the multiple sample areas according to the target area designated by the user in the first image of the video comprises:
performing region detection on the first image to obtain the multiple sample areas; and
determining the classification results of the multiple sample areas according to overlap rates between the multiple sample areas and the target area.
3. The method according to claim 1, characterized in that the training a first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continuing to train a next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed, comprises:
initializing node parameters of the multiple class nodes;
training the node parameters of the first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, to obtain the node parameters after the training of the first class node; and
continuing to train the node parameters of the next class node according to the multiple sample areas, the classification results of the multiple sample areas, and the node parameters obtained after the training of a previous class node, to obtain the node parameters after the training of the next class node, until the training of the multiple class nodes is completed.
4. The method according to claim 3, characterized in that when any class node in the classifier outputs a first classification value, the class node indicates that a region to be classified belongs to the target area, and when any class node outputs a second classification value, the class node indicates that the region to be classified does not belong to the target area;
after the training of the multiple class nodes is completed, the method further comprises:
selecting, from the multiple sample areas, multiple positive sample regions that belong to the target area;
for each positive sample region, classifying the positive sample region using the multiple class nodes respectively, to obtain the classification values output by the multiple class nodes;
combining, in the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and using the decimal value corresponding to the binary number as the classification result of the positive sample region; and
determining the classification result that occurs most frequently among the multiple positive sample regions as a target classification result.
5. The method according to claim 4, characterized in that the classifying at least one region in the second image after the currently tracked image by applying the trained classifier, and determining the target area in the second image according to classification results, comprises:
for each region in the second image, classifying the region using the multiple class nodes respectively, to obtain the classification values output by the multiple class nodes;
combining, in the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and using the decimal value corresponding to the binary number as the classification result of the region; and
determining that the region belongs to the target area when the classification result is equal to the target classification result.
6. The method according to claim 1, characterized in that the tracking the target area in images of the video other than the first image comprises:
detecting the target area in the first image to obtain multiple feature points;
determining positions of the multiple feature points in the other images by tracking the multiple feature points between any two adjacent frames; and
determining the target area in the other images according to the positions of the multiple feature points in the other images.
7. The method according to claim 1, characterized in that the method further comprises:
obtaining a tracking error when tracking the target area in a third image of the video other than the first image;
using the target area tracked in the third image as a sample area when the tracking error is greater than a first preset threshold; and
updating the classifier according to the sample area to obtain the updated classifier.
8. The method according to claim 7, characterized in that the using the target area tracked in the third image as a sample area when the tracking error is greater than the first preset threshold comprises:
using the target area tracked in the third image as a sample area when the tracking error is greater than the first preset threshold and not greater than a second preset threshold, the second preset threshold being greater than the first preset threshold; and
determining that the third image does not contain the target area when the tracking error is greater than the second preset threshold.
9. A target area detection apparatus, characterized in that the apparatus comprises:
a sample determining module, configured to determine multiple sample areas and classification results of the multiple sample areas according to a target area designated by a user in a first image of a video, the classification results being used to indicate whether the sample areas belong to the target area;
an obtaining module, configured to obtain a classifier to be trained, the classifier comprising multiple class nodes arranged in sequence;
a training module, configured to train a first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continue to train a next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
a detection module, configured to track the target area in images of the video other than the first image, and when it is determined that a currently tracked image does not contain the target area, classify at least one region in a second image after the currently tracked image by applying the trained classifier, and determine the target area in the second image according to classification results.
10. The apparatus according to claim 9, characterized in that the sample determining module comprises:
a region detection unit, configured to perform region detection on the first image to obtain the multiple sample areas; and
a determining unit, configured to determine the classification results of the multiple sample areas according to overlap rates between the multiple sample areas and the target area.
11. The apparatus according to claim 9, characterized in that the training module comprises:
an initialization unit, configured to initialize node parameters of the multiple class nodes; and
a training unit, configured to train the node parameters of the first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, to obtain the node parameters after the training of the first class node;
the training unit being further configured to continue to train the node parameters of the next class node according to the multiple sample areas, the classification results of the multiple sample areas, and the node parameters obtained after the training of a previous class node, to obtain the node parameters after the training of the next class node, until the training of the multiple class nodes is completed.
12. The apparatus according to claim 11, characterized in that when any class node in the classifier outputs a first classification value, the class node indicates that a region to be classified belongs to the target area, and when any class node outputs a second classification value, the class node indicates that the region to be classified does not belong to the target area;
the apparatus further comprises:
a selecting module, configured to select, from the multiple sample areas, multiple positive sample regions that belong to the target area;
a classification module, configured to, for each positive sample region, classify the positive sample region using the multiple class nodes respectively, to obtain the classification values output by the multiple class nodes;
a combining module, configured to combine, in the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and use the decimal value corresponding to the binary number as the classification result of the positive sample region; and
a target determining module, configured to determine the classification result that occurs most frequently among the multiple positive sample regions as a target classification result.
13. The apparatus according to claim 9, characterized in that the apparatus further comprises:
an error obtaining module, configured to obtain a tracking error when tracking the target area in a third image of the video other than the first image;
a sample obtaining module, configured to use the target area tracked in the third image as a sample area when the tracking error is greater than a first preset threshold; and
an update module, configured to update the classifier according to the sample area to obtain the updated classifier.
14. A terminal for detecting a target area, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, the instruction, the program, the code set, or the instruction set being loaded and executed by the processor to implement the operations performed in the target area detection method according to any one of claims 1 to 8.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, the instruction, the program, the code set, or the instruction set being loaded and executed by a processor to implement the operations performed in the target area detection method according to any one of claims 1 to 8.
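The node-output combination scheme of claims 4 and 5 can be sketched as follows: each class node emits 0 or 1, the bits are concatenated in node order into a binary number whose decimal value is the region's classification result, and at detection time a region is accepted when its result equals the target classification result learned from the positive samples. This is a minimal illustration under assumed 0/1 node outputs; the claims do not fix a particular node model, and the helper names are the author's own.

```python
def combined_result(node_outputs):
    """Concatenate per-node 0/1 classification values, in node order,
    into a binary number and return its decimal value (claim 4)."""
    return int("".join(str(b) for b in node_outputs), 2)

def target_class_result(positive_sample_outputs):
    """The target classification result is the decimal value that occurs
    most often over the positive sample regions (claim 4)."""
    results = [combined_result(o) for o in positive_sample_outputs]
    return max(set(results), key=results.count)

def region_matches(region_outputs, target_result):
    """Claim 5: a region belongs to the target area when its combined
    result equals the target classification result."""
    return combined_result(region_outputs) == target_result

# Three positive samples; two of three yield binary 101 (decimal 5).
target = target_class_result([[1, 0, 1], [1, 0, 1], [1, 1, 1]])
hit = region_matches([1, 0, 1], target)
miss = region_matches([0, 1, 0], target)
```

Encoding the joint node outputs as a single integer makes the detection-time comparison a cheap equality test rather than a per-node vote.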
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810650498.4A CN108776822B (en) | 2018-06-22 | 2018-06-22 | Target area detection method, device, terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108776822A true CN108776822A (en) | 2018-11-09 |
CN108776822B CN108776822B (en) | 2020-04-24 |
Family
ID=64025419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810650498.4A Active CN108776822B (en) | 2018-06-22 | 2018-06-22 | Target area detection method, device, terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108776822B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110084204A (en) * | 2019-04-29 | 2019-08-02 | 北京字节跳动网络技术有限公司 | Image processing method, device and electronic equipment based on target object posture |
CN110245246A (en) * | 2019-04-30 | 2019-09-17 | 维沃移动通信有限公司 | A kind of image display method and terminal device |
CN111241869A (en) * | 2018-11-28 | 2020-06-05 | 杭州海康威视数字技术股份有限公司 | Method and device for checking materials and computer readable storage medium |
CN112312203A (en) * | 2020-08-25 | 2021-02-02 | 北京沃东天骏信息技术有限公司 | Video playing method, device and storage medium |
CN113743380A (en) * | 2021-11-03 | 2021-12-03 | 江苏博子岛智能产业技术研究院有限公司 | Active tracking method based on video image dynamic monitoring |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102831618A (en) * | 2012-07-20 | 2012-12-19 | 西安电子科技大学 | Hough forest-based video target tracking method |
CN103744924A (en) * | 2013-12-26 | 2014-04-23 | 西安理工大学 | Frequent pattern based selective ensemble classification method |
KR20160111151A (en) * | 2015-03-16 | 2016-09-26 | (주)이더블유비엠 | image processing method and apparatus, and interface method and apparatus of gesture recognition using the same |
CN106709932A (en) * | 2015-11-12 | 2017-05-24 | 阿里巴巴集团控股有限公司 | Face position tracking method and device and electronic equipment |
CN107066990A (en) * | 2017-05-04 | 2017-08-18 | 厦门美图之家科技有限公司 | A kind of method for tracking target and mobile device |
Non-Patent Citations (3)
Title |
---|
HONG LIU 等: "ROBUST HAND TRACKING BASED ON ONLINE LEARNING AND MULTI-CUE FLOCKS OF FEATURES", 《2013 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING》 * |
ZDENEK KALAL 等: "Tracking-Learning-Detection", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
姜媚: "监控视频事件检测算法", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110555883B (en) | Repositioning method and device for camera attitude tracking process and storage medium | |
WO2020216116A1 (en) | Action recognition method and apparatus, and human-machine interaction method and apparatus | |
CN109086709A (en) | Feature Selection Model training method, device and storage medium | |
CN108776822A (en) | Target area detection method, device, terminal and storage medium | |
CN111079576B (en) | Living body detection method, living body detection device, living body detection equipment and storage medium | |
CN110222789B (en) | Image recognition method and storage medium | |
CN110062269A (en) | Extra objects display methods, device and computer equipment | |
WO2019007258A1 (en) | Method, apparatus and device for determining camera posture information, and storage medium | |
CN109815150B (en) | Application testing method and device, electronic equipment and storage medium | |
CN108615247A (en) | Method for relocating, device, equipment and the storage medium of camera posture tracing process | |
CN110555839A (en) | Defect detection and identification method and device, computer equipment and storage medium | |
WO2020221012A1 (en) | Method for determining motion information of image feature point, task execution method, and device | |
CN114648480A (en) | Surface defect detection method, device and system | |
CN108537845A (en) | Pose determines method, apparatus and storage medium | |
CN110650379B (en) | Video abstract generation method and device, electronic equipment and storage medium | |
CN108682036A (en) | Pose determines method, apparatus and storage medium | |
CN108596976A (en) | Method for relocating, device, equipment and the storage medium of camera posture tracing process | |
JP2021524957A (en) | Image processing methods and their devices, terminals and computer programs | |
WO2019024717A1 (en) | Anti-counterfeiting processing method and related product | |
CN110807361A (en) | Human body recognition method and device, computer equipment and storage medium | |
CN110148178A (en) | Camera localization method, device, terminal and storage medium | |
CN111127509B (en) | Target tracking method, apparatus and computer readable storage medium | |
CN110827195B (en) | Virtual article adding method and device, electronic equipment and storage medium | |
CN112749613B (en) | Video data processing method, device, computer equipment and storage medium | |
CN109285178A (en) | Image partition method, device and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||