CN108776822A - Target area detection method, device, terminal and storage medium - Google Patents

Target area detection method, device, terminal and storage medium

Info

Publication number
CN108776822A
CN108776822A (application CN201810650498.4A)
Authority
CN
China
Prior art keywords
target area
image
class node
sample areas
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810650498.4A
Other languages
Chinese (zh)
Other versions
CN108776822B (en)
Inventor
姜媚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201810650498.4A priority Critical patent/CN108776822B/en
Publication of CN108776822A publication Critical patent/CN108776822A/en
Application granted granted Critical
Publication of CN108776822B publication Critical patent/CN108776822B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention disclose a target area detection method, device, terminal and storage medium, belonging to the field of computer technology. The method includes: determining multiple sample regions and their classification results; obtaining a classifier that includes multiple class nodes arranged in order; training the first class node in the classifier according to the multiple sample regions and their classification results, and training the next class node only after the previous one has finished training, until all class nodes are trained; and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region of a second image following the currently tracked image, and determining the target area in the second image according to the classification results. Detection therefore does not have to run on every frame, which removes unnecessary computation, and the accuracy of the classifier, and hence of the detected target area, is improved.

Description

Target area detection method, device, terminal and storage medium
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to a target area detection method, device, terminal and storage medium.
Background technology
With the rapid development of the Internet and the wide rise of video-based social networking, the main medium of Internet information has gradually evolved from text and pictures to video, and various video processing functions have appeared one after another, such as video filters and video tagging. With these functions, certain target areas in a video can be processed in a personalized way, making the video more engaging.
In the related art, while a terminal plays a video, the user can manually select a target area in the current image, and the terminal edits the target area in the current image, for example by adding a sticker to it or beautifying it. Moreover, starting from the position of the target area in the current image, the terminal can perform forward and backward tracking from the current image, determining the position of the target area in every frame before and after the current image, so that the same editing is applied to the target area in every frame and consistency between images is ensured.
However, if the position or attitude of the terminal changes substantially while it shoots the video, some images of the video may not contain the target area. When tracking reaches an image that does not contain the target area, tracking of the target area fails, and even if the target area reappears in later images, it is difficult to detect it again.
Summary of the invention
Embodiments of the present invention provide a target area detection method, device, terminal and storage medium, which can solve the above problem in the related art. The technical solution is as follows:
In one aspect, a target area detection method is provided, the method including:
determining, according to a target area selected by a user in a first image of a video, multiple sample regions and classification results of the multiple sample regions, the classification results indicating whether each sample region belongs to the target area;
obtaining a classifier to be trained, the classifier including multiple class nodes arranged in order;
training the first class node in the classifier according to the multiple sample regions and the classification results of the multiple sample regions, and training the next class node only after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
tracking the target area in the other images of the video except the first image, and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region of a second image following the currently tracked image, and determining the target area in the second image according to the classification results.
In another aspect, a target area detection device is provided, the device including:
a sample determining module, configured to determine, according to a target area selected by a user in a first image of a video, multiple sample regions and classification results of the multiple sample regions, the classification results indicating whether each sample region belongs to the target area;
an obtaining module, configured to obtain a classifier to be trained, the classifier including multiple class nodes arranged in order;
a training module, configured to train the first class node in the classifier according to the multiple sample regions and their classification results, and to train the next class node only after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
a detection module, configured to track the target area in the other images of the video except the first image, and, when it is determined that the currently tracked image does not contain the target area, to apply the trained classifier to classify at least one region of a second image following the currently tracked image and determine the target area in the second image according to the classification results.
In another aspect, a terminal for detecting a target area is provided. The terminal includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by the processor to implement the operations performed in the above target area detection method.
In another aspect, a computer-readable storage medium is provided, storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by a processor to implement the operations performed in the above target area detection method.
According to the method, device, terminal and storage medium provided by the embodiments of the present invention, multiple sample regions and their classification results are determined according to the target area selected by the user in the first image of the video, the classification results indicating whether each sample region belongs to the target area; a classifier to be trained is obtained, including multiple class nodes arranged in order; the first class node in the classifier is trained according to the multiple sample regions and their classification results, and each next class node is trained only after the previous one is completed, until all class nodes are trained; the target area is tracked in the other images of the video except the first image, and when it is determined that the currently tracked image does not contain the target area, the trained classifier is applied to classify at least one region of a second image following the currently tracked image, and the target area in the second image is determined from the classification results. Every frame therefore does not have to be detected, which removes unnecessary computation. Moreover, with this dynamic-programming style of training, a class node is trained only after the previous class node in the classifier has finished training, which improves the accuracy of the classifier; when tracking fails after the classifier has been trained, the trained classifier is applied to classify regions and detect the target area again, improving the accuracy of the detected target area.
Furthermore, when the target area undergoes a large deformation, the classifier can be updated according to the deformed target area, learning the new appearance in time, which improves the robustness and reliability of the classifier; the target area can thus be detected promptly and accurately even when the terminal shakes or rotates quickly or the target is occluded, giving good detection results.
Furthermore, with a classifier of linear structure, a fixed number of class nodes partitions the classification space as finely as possible, improving classification accuracy.
Furthermore, image visual information is combined with sensor data: the position of the target area is estimated from the attitude information provided by the configured sensors, tracking or detection is performed within the estimated target area, and the other regions outside the estimated target area are not tracked or detected. This avoids tracking failures caused by sensor errors, removes unnecessary computation, and increases processing speed.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a TLD algorithm provided in the related art;
Fig. 2 is a schematic diagram of a target area detection method provided in an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of a classifier provided in an embodiment of the present invention;
Fig. 4 is an image tracking schematic diagram provided in an embodiment of the present invention;
Fig. 5 is a feature point schematic diagram provided in an embodiment of the present invention;
Fig. 6 is another feature point schematic diagram provided in an embodiment of the present invention;
Fig. 7 is a coordinate system schematic diagram provided in an embodiment of the present invention;
Fig. 8 is a schematic diagram of a cascade classifier provided in an embodiment of the present invention;
Fig. 9 is an operation flow schematic diagram provided in an embodiment of the present invention;
Fig. 10 is a tracking speed schematic diagram provided in an embodiment of the present invention;
Fig. 11 is a structural schematic diagram of a target area detection device provided in an embodiment of the present invention;
Fig. 12 is a structural schematic diagram of a terminal provided in an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described below in further detail with reference to the accompanying drawings.
Before the embodiments of the present invention are described in detail, the TLD (Tracking-Learning-Detection) algorithm is briefly introduced as follows:
The TLD algorithm is used for long-term tracking of a single object in a video. Referring to Fig. 1, the TLD algorithm includes three modules: a tracking module, a detection module and a learning module.
1. Tracking module:
The tracking module tracks the motion between any two adjacent frames, and determines the position of the target area in the next frame according to its position in the previous frame and the motion between the two frames. The tracking module is effective only when the target area is actually present in the next frame.
In addition, the tracking module supplies the target area tracked in the next frame to the learning module as a positive sample region, and the learning module uses the positive sample region to train the classifier.
2. Detection module:
The detection module scans the image comprehensively, applies the classifier to classify the scanned regions, finds the regions similar to the target area, and generates positive and negative sample regions that are supplied to the learning module.
When the tracking module fails because the tracked image does not contain the target area, the detection module can supply the target area it finds to the tracking module, which then continues tracking in subsequent images.
3. Learning module:
The learning module iteratively trains the classifier of the detection module according to the sample regions supplied by the tracking module and the detection module, improving the classification accuracy of the classifier.
In the related art, when the target area is tracked across the frames of a video and tracking reaches an image that does not contain the target area, tracking fails. If the images tracked afterwards do contain the target area, the target area has to be detected in an image before tracking can continue. Detection, however, requires a classifier, and in the related art the classifier is trained on the target areas traced so far; when tracking fails, the classifier is not yet fully trained and its accuracy is poor, so the target area is difficult to detect accurately.
Embodiments of the present invention provide a target area detection method in which, once the user has selected the target area in a first image, the classifier is trained according to multiple sample regions in the first image and their classification results. Even if tracking then fails, the fully trained classifier can be applied to detect the target area accurately.
The embodiments of the present invention can be applied to scenarios where a video is edited: when the user manually selects a target area in a certain image of the video, the terminal can edit the target area, and can also detect the target area in the other images of the video and apply the same editing to it there.
For example, when the user shoots a video and selects a head region, the terminal can add a sticker to the head region in every frame of the video, and as the position of the head region changes, the position of the sticker changes accordingly.
Fig. 2 is a schematic diagram of a target area detection method provided in an embodiment of the present invention. The method is executed by a terminal. Referring to Fig. 2, the method includes:
201. The terminal obtains the target area selected by the user in the first image of the video.
The terminal may be a device such as a mobile phone or a smart camera; it is configured with a camera and can shoot images or videos with it. The video includes multiple frames, and the first image is any image of the video: it may be the first frame, or the image being played when the user triggers an edit instruction, etc.
For example, while the terminal plays the video, when a pause instruction is detected, the currently played first image is displayed, and the user can select a target area in the first image to indicate that the target area is to be edited; when the terminal detects the operation of selecting the target area, it obtains the target area. The selecting operation may be a slide operation, a click operation, etc., and the target area can be determined from the start and end positions of a slide operation, or from the clicked region of a click operation.
202. The terminal determines multiple sample regions and their classification results according to the target area in the first image.
Each sample region has one classification result, which indicates whether the sample region belongs to the target area: if the sample region belongs to the target area, it is a positive sample region; if it does not, it is a negative sample region.
In a possible implementation, the terminal performs region detection on the first image to obtain multiple sample regions, determines the overlap rate between each sample region and the target area according to their positions in the first image, and determines the classification results of the multiple sample regions according to these overlap rates.
Optionally, when performing region detection on the first image, a window of fixed size may be used to traverse the first image, yielding multiple sample regions of that size. The window may be smaller than the target area, so that multiple sample regions belonging to the target area can be selected, and its size can be chosen according to the size of the target area and the required accuracy.
Optionally, for each sample region, when the overlap rate between the sample region and the target area exceeds a preset value, the sample region is determined to belong to the target area; when the overlap rate does not exceed the preset value, the sample region is determined not to belong to the target area. The preset value may be 0, 50%, etc., depending on the required accuracy.
Of course, the classification result of each sample region can also be determined in other ways, for example by comparing the sample region with the target area, computing their similarity, and determining the classification result from the similarity. A sketch of this step is given below.
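As an illustration of this step, the following Python sketch (a minimal example; the window size, stride and function names are assumptions, and the overlap rate is taken here as intersection-over-union) slides a fixed-size window over the first image and labels each sample region by its overlap rate with the user-selected target area:

```python
def overlap_rate(a, b):
    """Overlap rate of two (x, y, w, h) rectangles, taken here as
    intersection-over-union."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def label_sample_regions(image_shape, target, win=(32, 32), stride=8,
                         preset_value=0.5):
    """Traverse the first image with a fixed-size window and classify each
    sample region: positive (1) when its overlap rate with the target area
    exceeds the preset value, negative (0) otherwise."""
    h, w = image_shape[:2]
    samples = []
    for y in range(0, h - win[1] + 1, stride):
        for x in range(0, w - win[0] + 1, stride):
            region = (x, y, win[0], win[1])
            label = 1 if overlap_rate(region, target) > preset_value else 0
            samples.append((region, label))
    return samples
```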
203. The terminal obtains the classifier to be trained; the classifier includes multiple class nodes arranged in order.
In the embodiments of the present invention, to guarantee classification accuracy, the classifier is not trained on target areas obtained while tracking; instead, it is fully trained on the sample regions of the first image before the target area is tracked. This ensures that when tracking of the target area fails, a more accurate classifier is available to detect it.
Moreover, the classifier used by the terminal includes multiple class nodes arranged in order; the class nodes form a linear structure, and each class node can classify a region. The structure of the classifier may be as shown in Fig. 3.
204. The terminal trains the first class node in the classifier according to the multiple sample regions and their classification results, and trains the next class node only after the first class node is trained, until all class nodes are trained.
Embodiments of the present invention provide a dynamic-programming style of training: starting from a classifier containing only one class node, the first class node is trained; when its training is complete it is fixed, the second class node is trained, and so on until all class nodes in the classifier are trained. This ensures that at every training step the classifier trained so far is optimal, so an optimal classifier is obtained and its accuracy is improved.
In a possible implementation, the terminal first initializes the node parameters of the multiple class nodes, trains the node parameters of the first class node in the classifier according to the multiple sample regions and their classification results to obtain the trained parameters of the first class node, then continues to train the node parameters of each next class node according to the multiple sample regions, their classification results and the trained parameters of the previous class node, until all class nodes are trained. At that point the node parameters of all class nodes are trained and the multiple class nodes can be used for classification.
Optionally, when any class node in the classifier outputs a first classification value, it indicates that the region to be classified belongs to the target area; when it outputs a second classification value, it indicates that the region to be classified does not belong to the target area. The first and second classification values differ; for example, when the first value is 1 the second is 0, or when the first is 0 the second is 1.
Optionally, the node parameters of each class node may include two pixel positions i and j and a threshold x, where i and j are positive integers. When a region of an image is input into a class node, the region is classified according to whether the difference between the grey value at pixel position i and the grey value at pixel position j exceeds the threshold x: when the difference does not exceed x, the classification result of the node is the first classification value; when it exceeds x, the result is the second classification value.
Referring to Fig. 3, the classifier includes n class nodes and can output n classification values, which together form a binary value; converted to decimal, its range is 0 to 2^n - 1. Each class node has the two classification outcomes 0 and 1, so the classifier has 2^n classification spaces, which guarantees that a fixed number of class nodes partitions the classification space as finely as possible and improves classification accuracy. Here n is a positive integer, for example 6 or 10. A sketch of such a classifier is given below.
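The following Python sketch illustrates a classifier of this linear structure. It is a minimal sketch, not the patented implementation: the random initialization of the pixel positions, the patch size, and the comparison convention (output 1 when the grey difference does not exceed the threshold) are assumptions based on the description above.

```python
import numpy as np

class LinearClassifier:
    """n class nodes arranged in order; each node compares the grey values
    at two pixel positions i and j of a flattened region patch against a
    threshold x, and the n node outputs form an n-bit binary code."""

    def __init__(self, n_nodes, patch_size=(15, 15), seed=0):
        rng = np.random.default_rng(seed)
        npix = patch_size[0] * patch_size[1]
        self.i = rng.integers(0, npix, n_nodes)  # pixel position i per node
        self.j = rng.integers(0, npix, n_nodes)  # pixel position j per node
        self.x = np.zeros(n_nodes)               # threshold x per node

    def node_output(self, k, patch):
        """First classification value (1) when the grey difference does not
        exceed the threshold, second classification value (0) otherwise."""
        p = patch.reshape(-1).astype(np.float32)
        return 1 if p[self.i[k]] - p[self.j[k]] <= self.x[k] else 0

    def classify(self, patch):
        """Combine the node outputs, in node order, into a binary value and
        return its decimal value in [0, 2**n - 1]."""
        bits = [self.node_output(k, patch) for k in range(len(self.x))]
        return int("".join(str(b) for b in bits), 2)
```

Training in the dynamic-programming style described above would then fit the threshold of node 1, fix it, fit node 2 given the fixed node 1, and so on, so that the prefix of trained nodes is optimal at every stage.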
205. The terminal selects, from the multiple sample regions, the multiple positive sample regions that belong to the target area, and determines the target classification result from the results of classifying these positive sample regions with the classifier.
To find the classification space containing the target area among the multiple classification spaces, the terminal obtains multiple positive sample regions. For each positive sample region, the terminal applies the multiple class nodes to classify it, obtains the classification value output by each class node, combines the values, in the order of the class nodes, into a binary value, and takes the corresponding decimal value as the classification result of the positive sample region. The classification result that occurs most often among the positive sample regions is determined to be the target classification result. Then, a region is determined to belong to the target area only when its classification result equals the target classification result; when its classification result differs from the target classification result, the region is determined not to belong to the target area.
For example, after a positive sample region is input into the classifier, the binary value formed by the classification values output by class node 1 through class node n is 100110, whose decimal value is 38. A sketch of this vote follows.
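A minimal sketch of the vote, assuming the LinearClassifier above and a set of positive sample patches:

```python
from collections import Counter

def target_classification_result(classifier, positive_patches):
    """The classification result that occurs most often among the positive
    sample regions (e.g. 0b100110 -> 38) becomes the target classification
    result; a region later counts as target area only if its code matches."""
    codes = [classifier.classify(p) for p in positive_patches]
    return Counter(codes).most_common(1)[0][0]
```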
206. The terminal tracks the target area in the other images of the video except the first image.
Referring to Fig. 4, for the images of the video that precede the first image in time, the terminal can track backward and determine the target area in those images; for the images that follow the first image in time, the terminal can track forward and determine the target area there.
Specifically, the terminal detects the target area in the first image to obtain multiple feature points, tracks these feature points between any two adjacent frames to determine their positions in the other images, and determines the target area in the other images from the positions of the multiple feature points in those images.
When extracting the feature points, referring to Fig. 5, the terminal may take points on a uniform grid: multiple equal grid cells are laid over the first image and one point is chosen in each cell as a feature point, so that a fixed number of feature points is chosen quickly.
Alternatively, considering that the chosen feature points should effectively reflect the features of the image, algorithms such as FAST (Features from Accelerated Segment Test), Harris (a corner detection algorithm), SURF (Speeded-Up Robust Features) or BRISK (Binary Robust Invariant Scalable Keypoints) may be used to extract feature points from the first image; the extracted feature points, as shown in Fig. 6, reflect the image features of the target area. Both options are sketched below.
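The two extraction options can be sketched as follows (a minimal example; the grid resolution and the point-per-cell choice, here simply the cell centre, are assumptions, and OpenCV's FAST detector stands in for the alternatives named above):

```python
import cv2
import numpy as np

def grid_feature_points(gray, rows=10, cols=10):
    """Uniform-grid mode: one point per equal grid cell gives a fixed
    number of feature points quickly (here each cell centre)."""
    h, w = gray.shape
    ys = (np.arange(rows) + 0.5) * h / rows
    xs = (np.arange(cols) + 0.5) * w / cols
    return np.array([[x, y] for y in ys for x in xs], dtype=np.float32)

def fast_feature_points(gray, max_points=100):
    """Alternative: FAST corners, which better reflect the image features
    of the target area (Harris/SURF/BRISK would serve the same role)."""
    kps = cv2.FastFeatureDetector_create().detect(gray)
    kps = sorted(kps, key=lambda k: -k.response)[:max_points]
    return np.array([k.pt for k in kps], dtype=np.float32)
```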
In a possible implementation, starting from the first image, the terminal tracks the multiple feature points into the next frame and finds the matching feature points there, obtaining the motion information of the multiple feature points; this motion information expresses the change of position of the next frame relative to the first image. The terminal then iterates, from the position of the target area in the first image and the motion information of the feature points, to determine the position of the target area in the next frame and thereby trace the target area. Later images can be tracked in a similar way: from the position of the target area in the previous frame and the motion information of the feature points, the position of the target area in the next frame is determined.
The terminal may obtain the motion information of the feature points with an optical flow matching algorithm, or with other algorithms.
After the terminal obtains the motion information of the multiple feature points, it can determine their positions in the previous frame and in the next frame, and from these the rotation-translation matrix of the next frame relative to the previous frame; the displacement parameters in this matrix are the position change of the next frame relative to the previous frame, and from them the position of the target area in the next frame can be determined. A sketch of this tracking step follows.
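A minimal sketch of the tracking step, using OpenCV's pyramidal Lucas-Kanade optical flow as the optical flow matching algorithm and a partial affine estimate as the inter-frame rotation-translation (the box-update convention, taking the bounding box of the warped corners, is an assumption):

```python
import cv2
import numpy as np

def track_target(prev_gray, next_gray, pts, box):
    """Track feature points between two adjacent frames, estimate the
    rotation-translation between the frames from the matched points, and
    move the target rectangle (x, y, w, h) accordingly."""
    p0 = pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
    ok = status.reshape(-1) == 1
    src, dst = p0.reshape(-1, 2)[ok], p1.reshape(-1, 2)[ok]
    m, _inliers = cv2.estimateAffinePartial2D(src, dst)  # 2x3 rigid motion
    if m is None:                    # too few matches: tracking failed
        return None, dst
    x, y, w, h = box
    corners = np.float32([[x, y], [x + w, y], [x, y + h], [x + w, y + h]])
    moved = cv2.transform(corners.reshape(-1, 1, 2), m).reshape(-1, 2)
    x0, y0 = moved.min(axis=0)
    x1, y1 = moved.max(axis=0)
    return (float(x0), float(y0), float(x1 - x0), float(y1 - y0)), dst
```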
In a possible implementation, for a video shot in real time, the terminal can obtain, through its configured sensors, the attitude information of the camera when each frame is shot; the attitude information expresses the current position and attitude of the camera, and the sensors may include an acceleration sensor, a gyroscope sensor, etc. From the change of attitude information between any two adjacent frames and the position of the target area in the previous frame, an estimated target area in the next frame is obtained. Feature point tracking is then performed only within the estimated target area to determine the position of the target area in the next frame; the regions outside the estimated target area need not be tracked, which removes unnecessary computation and increases tracking speed.
It should be noted that if the position or attitude of the camera changes too much while the video is shot, some images may not contain the target area; in that case the following step 207 can be executed to detect the target area again in later images.
Alternatively, if parameters such as the position or attitude of the camera change too much while the video is shot, the target area may be strongly deformed in some images, making it difficult to trace with the originally extracted feature points. To ensure the target area can still be detected in this situation, in a possible implementation, when tracking reaches a third image, the terminal obtains the tracking error of the third image while tracing the target area in it; when the tracking error exceeds a first preset threshold, indicating that the target area has deformed substantially, the target area traced in the third image is collected as a sample region, and the classifier can be updated according to this sample region to obtain an updated classifier.
The tracking error may be an FB (Forward-Backward) error, an NCC (Normalized Cross-Correlation) error, an SSD (Sum of Squared Differences) error, etc.
In another possible implementation, the terminal may set a first preset threshold and a second preset threshold, the second being larger than the first. When the tracking error of the third image exceeds the first threshold but not the second, the target area traced in the third image is collected as a sample region and the classifier is updated accordingly to obtain an updated classifier. When the tracking error exceeds the second threshold, tracking has failed: the error of the currently tracked region is too large for it to serve as the target area, so it is determined that the third image does not contain the target area, and the following step 207 still needs to be executed to detect the target area again in later images.
Besides the case of large deformation, the terminal may also set a preset duration or a preset number of frames: after every preset duration, or after every preset number of frames, the currently traced target area is taken as a sample region and the classifier is updated accordingly to obtain an updated classifier. In this way, even when the target area itself does not deform but its surroundings change, the change is learned in time, ensuring the updated classifier detects the target area promptly and accurately. This update policy is sketched below.
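The update policy of the last three paragraphs can be sketched as follows (the threshold values, the update interval and the classifier's update method are all illustrative assumptions, not values from the patent):

```python
def maybe_update_classifier(classifier, frame_idx, tracked_region, error,
                            t1=0.3, t2=0.6, every_n=30):
    """Sketch of the update policy: error above the second threshold means
    tracking failed (the region cannot serve as the target area); error
    between the two thresholds means the target deformed, so the tracked
    region is collected as a sample region and the classifier is updated;
    a periodic update learns gradual changes in the surroundings."""
    if error > t2:
        return "tracking_failed"           # go to step 207: detect again
    if error > t1 or frame_idx % every_n == 0:
        classifier.update(tracked_region)  # hypothetical update method
        return "updated"
    return "kept"
```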
207. When it is determined that the currently tracked image does not contain the target area, the terminal applies the trained classifier to classify at least one region of a second image following the currently tracked image, and determines the target area in the second image according to the classification results.
If tracking starts from the first image and the target area cannot be traced when a certain image is reached, it is determined that the currently tracked image does not contain the target area and tracking has failed. The target area then has to be detected again in the images after the currently tracked image before tracking can continue.
When tracking backward, the images after the currently tracked image are the images that precede the current image in time; when tracking forward, they are the images that follow the current image in time.
Taking a second image after the currently tracked image as an example, the terminal can perform region detection on the second image to obtain at least one region, input the at least one region into the trained classifier, and apply the classifier to classify the at least one region, obtaining the classification results, that is, determining which of the regions belong to the target area and which do not, so that the position of the target area in the second image is determined from the classification results and the target area is relocated.
In a possible implementation, for a video shot in real time, the terminal can obtain through its configured sensors the attitude information of the camera when each frame is shot; the attitude information expresses the current position and attitude of the camera, and the sensors may include an acceleration sensor, a gyroscope sensor, etc. From the change of attitude information between any two adjacent frames and the position of the target area in the previous frame, an estimated target area in the next frame is obtained. Region detection is then performed only on the estimated target area to obtain at least one region, and the classifier determines the exact position of the target area after classification; the other regions outside the estimated target area need not be detected, which removes unnecessary computation and increases detection speed.
The coordinate system of the terminal may be as shown in Fig. 7. The displacement of the terminal along the three axes between any two adjacent frames can be obtained through the sensors, and the position X_{t+1} of the target area in the next frame can be estimated from its position X_t in the previous frame using the following formula:

X_{t+1} = K * R * K^{-1} * X_t

where X = (x, y, 1)^T is the homogeneous coordinate of a pixel with two-dimensional coordinates (x, y); K is the camera parameter matrix

K = | fx  0  cx |
    |  0 fy  cy |
    |  0  0   1 |

with fx, fy, cx and cy the parameters of the camera; and R is the rotation-translation matrix between the two frames, which can be determined from the displacement of the terminal along the three axes between the two adjacent frames. A numpy sketch of this estimate follows.
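A minimal numpy sketch of the estimate, assuming the rotation-translation matrix R has already been formed from the sensor readings:

```python
import numpy as np

def estimate_position(x_t, fx, fy, cx, cy, R):
    """X_{t+1} = K * R * K^-1 * X_t: warp the previous homogeneous pixel
    position X_t = (x, y, 1) with the camera parameter matrix K and the
    inter-frame rotation-translation R derived from the sensors."""
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    x_next = K @ R @ np.linalg.inv(K) @ np.asarray(x_t, dtype=float)
    return x_next / x_next[2]   # normalize back to (x', y', 1)
```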
Based on the possible implementation in step 204, for each region in the second image, the terminal can apply the multiple class nodes to classify the region, obtain the classification value output by each class node, combine the values, in the order of the class nodes, into a binary value, take the corresponding decimal value as the classification result of the region, and judge whether this classification result equals the target classification result: when it does, the region is determined to belong to the target area; when it does not, the region is determined not to belong to the target area. In this way it can be determined whether each region of the second image belongs to the target area, and hence the position of the target area in the second image.
In a possible implementation, the regions of the second image can be screened with the above classifier to obtain the multiple regions that may belong to the target area, and the remaining regions can then be screened further with a nearest-neighbor classifier: the similarity between each region and the target area is computed, and when the similarity exceeds a preset similarity, the region is determined to belong to the target area; when it does not, the region is determined not to belong to the target area and is filtered out. After screening, the regions belonging to the target area are determined, and hence the position of the target area in the second image.
In another possible implementation, the regions of the second image can be screened with the above classifier to obtain the multiple regions that may belong to the target area. The descriptors of the feature points in the target area can then be combined into the feature of the target area, and a feature-matching classifier applied: feature points are extracted for each remaining region, the descriptors of the feature points in the region are combined into the feature of the region, and the distance between the feature of the region and the feature of the target area is computed. When the distance is smaller than a preset distance, the region is determined to belong to the target area; when it is not, the region is determined not to belong to the target area and is filtered out. The distance may be a Euclidean distance, a Hamming distance, etc.
In another possible implementation, the linear-structure classifier of step 207, the nearest-neighbor classifier and the feature-matching classifier can be combined into the cascade classifier shown in Fig. 8, and the target area in the second image is detected with the cascade classifier after several rounds of screening. Referring to Fig. 8, region 1, region 2 and region 3 are input into the cascade classifier; the linear-structure classifier determines that region 1 does not belong to the target area while regions 2 and 3 do, so region 1 is filtered out and regions 2 and 3 are input into the nearest-neighbor classifier; the nearest-neighbor classifier determines that region 2 does not belong to the target area while region 3 does, so region 2 is filtered out and region 3 is input into the feature-matching classifier; the feature-matching classifier determines that region 3 belongs to the target area, so the output target area is region 3. A sketch of the cascade follows.
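A minimal sketch of the cascade, assuming the LinearClassifier above and hypothetical nearest-neighbor and feature-matching stages that return True for the regions they accept:

```python
def cascade_detect(regions, linear_clf, target_code, nn_clf, fm_clf):
    """Cascade of Fig. 8: a region survives only if the linear-structure
    classifier maps it to the target classification result, the nearest-
    neighbor classifier accepts it (similarity above a preset value), and
    the feature-matching classifier accepts it (descriptor distance below
    a preset distance). nn_clf and fm_clf are hypothetical callables."""
    survivors = [r for r in regions if linear_clf.classify(r) == target_code]
    survivors = [r for r in survivors if nn_clf(r)]
    return [r for r in survivors if fm_clf(r)]
```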
The images after the second image can then be tracked further; that is, multiple feature points are first extracted from the target area in the second image and tracked, in a manner similar to step 206 above, to find the target area.
In another embodiment, when none of the regions in the second image is determined to belong to the target area, the second image does not contain the target area either; detection then continues on later images until the target area is found in some image.
In the embodiments of the present invention, whenever the target area is tracked or detected in any frame of the video, it can be edited: for example, the target area can be zoomed in or out, a sticker or a photo effect can be added to it, mosaic processing can be applied to it, etc. The specific processing can be set by default in the terminal or by the user. By editing the target area, the user can be helped to produce expressive, richer and livelier videos, enhancing entertainment and fun.
According to the method provided by the embodiments of the present invention, multiple sample regions and their classification results are determined according to the target area selected by the user in the first image of the video, the classification results indicating whether each sample region belongs to the target area; a classifier to be trained is obtained, including multiple class nodes arranged in order; the first class node in the classifier is trained according to the multiple sample regions and their classification results, and each next class node is trained only after the previous one is completed, until all class nodes are trained; the target area is tracked in the other images of the video except the first image, and when it is determined that the currently tracked image does not contain the target area, the trained classifier is applied to classify at least one region of a second image following the currently tracked image, and the target area in the second image is determined from the classification results. Every frame therefore does not have to be detected, which removes unnecessary computation. Moreover, with the dynamic-programming style of training, a class node in the classifier is trained only after the previous class node has finished training, which improves the accuracy of the classifier; when tracking fails after the classifier has been trained, the trained classifier is applied to classify regions and detect the target area again, improving the accuracy of the detected target area.
Furthermore, when the target area undergoes a large deformation, the classifier can be updated according to the deformed target area, learning the new appearance in time, which improves the robustness and reliability of the classifier; the target area can thus be detected promptly and accurately even when the terminal shakes or rotates quickly or the target is occluded, giving good detection results.
Furthermore, if target area tracking relied only on sensor data, a quick and violent shake of the terminal would make the sensor data fluctuate strongly, the position of the target area would drift, and tracking would fail. The embodiments of the present invention instead combine image visual information with sensor data: the position of the target area is estimated from the attitude information provided by the configured sensors, tracking or detection is performed within the estimated target area, and the other regions outside the estimated target area are not tracked or detected. This avoids tracking failures caused by sensor errors, removes unnecessary computation, and increases processing speed.
The operation flow of the embodiments of the present invention may be as shown in Fig. 9. The terminal may include a tracking module, a detection module and a learning module. The tracking module executes step 206 above and supplies the traced target area to the learning module as a positive sample region; the detection module executes steps 201-205 above to obtain the trained classifier and, when the tracking module fails, executes step 207 to detect the target area again, after which the tracking module continues tracking. Moreover, when the traced target area deforms substantially, the learning module can learn the target area and update the classifier.
In the traditional TLD algorithm, the tracking module, detection module and learning module are interleaved: for every frame, the results of the tracking module and the detection module are fused to determine the position of the target area, the determined target area is used as a positive sample region, and the learning module trains the classifier with it, improving the robustness of the detection module. Since the traditional TLD algorithm performs single-target tracking and every frame requires the processing of all three parts, the computation is heavy and processing is slow.
In the method provided by the embodiments of the present invention, by contrast, not every frame has to be detected and learned: detection happens only when tracking fails, and learning only when the target area deforms substantially, so unnecessary computation is avoided.
Moreover, because the classifier applied by the detection module has been fully trained in the dynamic-programming manner before tracking starts, the accuracy of the classifier is improved, and the accuracy of the target area detected when tracking fails is ensured as well.
The traditional TLD algorithm uses a classifier of binary-tree structure: supposing the classifier includes 15 class nodes in total, 4 layers of classification are needed and only 8 class intervals are finally determined; the divided class intervals shrink, so the classification accuracy is not high enough. The embodiments of the present invention use a classifier of linear structure instead: the n class nodes of the classifier classify jointly, and the classifier has 2^n classification spaces, which guarantees that a fixed number of class nodes partitions the classification space as finely as possible and improves classification accuracy.
For 3 test videos, the tracking errors of the method of the embodiments of the present invention and of the traditional TLD algorithm are shown in Table 1 below; as can be seen from Table 1, the embodiments of the present invention significantly reduce the tracking error and achieve higher accuracy.
Table 1

                        Test video 1   Test video 2   Test video 3
The present invention       5.6           10.11          1.24
Traditional TLD             7.1           15.3           1.33
The tracking speeds of the method of the embodiments of the present invention, the CT (Compressive Tracking) algorithm, the traditional TLD algorithm and the ECO (Efficient Convolution Operators for Tracking) algorithm may be as shown in Fig. 10; as can be seen from Fig. 10, the embodiments of the present invention significantly increase the tracking speed and essentially achieve real-time tracking.
Fig. 11 is a structural schematic diagram of a target area detection device provided in an embodiment of the present invention. Referring to Fig. 11, the device includes:
a sample determining module 1101, configured to execute the step of determining multiple sample regions and their classification results in the above embodiment;
an obtaining module 1102, configured to execute the step of obtaining the classifier to be trained in the above embodiment;
a training module 1103, configured to execute the step in the above embodiment of training the first class node in the classifier according to the multiple sample regions and their classification results, training the next class node only after the first class node is trained, and continuing until all class nodes are trained;
a detection module 1104, configured to execute the step in the above embodiment of tracking the target area in the other images of the video except the first image and, when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region of a second image following the currently tracked image and determining the target area in the second image according to the classification results.
Optionally, the sample determining module 1101 includes:
a region detection unit, configured to execute the step of performing region detection on the first image to obtain multiple sample regions in the above embodiment;
a determination unit, configured to execute the step of determining the classification results of the multiple sample regions according to the overlap rates between the multiple sample regions and the target area in the above embodiment.
Optionally, the training module 1103 includes:
an initialization unit, configured to execute the step of initializing the node parameters of the multiple class nodes in the above embodiment;
a training unit, configured to execute the step in the above embodiment of training the node parameters of the first class node in the classifier according to the multiple sample regions and their classification results, obtaining the trained node parameters of the first class node;
the training unit being further configured to execute the step in the above embodiment of continuing to train the node parameters of each next class node according to the multiple sample regions, their classification results and the trained node parameters of the previous class node, obtaining the trained node parameters of the next class node, until all class nodes are trained.
Optionally, when any class node in the classifier outputs the first classification value, it indicates that the region to be classified belongs to the target area; when any class node outputs the second classification value, it indicates that the region to be classified does not belong to the target area;
the device further includes:
a selection module, configured to execute the step of selecting the multiple positive sample regions belonging to the target area from the multiple sample regions in the above embodiment;
a classification module, configured to execute the step in the above embodiment of, for each positive sample region, applying the multiple class nodes to classify the positive sample region and obtaining the classification values output by the multiple class nodes;
a combination module, configured to execute the step in the above embodiment of combining the classification values output by the multiple class nodes, in the order of the class nodes, into a binary value, and taking the corresponding decimal value as the classification result of the positive sample region;
a target determination module, configured to execute the step of determining the classification result that occurs most often among the positive sample regions as the target classification result in the above embodiment.
Optionally, the device further includes:
an error obtaining module, configured to execute the step of obtaining the tracking error when the target area is traced in a third image of the video other than the first image in the above embodiment;
a sample obtaining module, configured to execute the step of taking the target area traced in the third image as a sample region when the tracking error exceeds the first preset threshold in the above embodiment;
an update module, configured to execute the step of updating the classifier according to the sample region to obtain an updated classifier in the above embodiment.
Optionally, the detection module 1104 is configured to execute the step in the above embodiment of, for each region in the second image, applying the multiple class nodes to classify the region and obtaining the classification values output by the multiple class nodes; combining the classification values output by the multiple class nodes, in the order of the class nodes, into a binary value and taking the corresponding decimal value as the classification result of the region; and determining that the region belongs to the target area when the classification result equals the target classification result.
Optionally, the detection module 1104 is configured to execute the step in the above embodiment of detecting the target area in the first image to obtain multiple feature points; tracking the multiple feature points between any two adjacent frames to determine their positions in the other images; and determining the target area in the other images according to the positions of the multiple feature points in those images.
Optionally, the device further includes:
an error obtaining module, configured to execute the step of obtaining the tracking error when the target area is traced in a third image of the video other than the first image in the above embodiment;
a sample collection module, configured to execute the step of taking the target area traced in the third image as a sample region when the tracking error exceeds the first preset threshold in the above embodiment;
an update module, configured to execute the step of updating the classifier according to the sample region to obtain an updated classifier in the above embodiment.
Optionally, the sample collection module is further configured to execute the step in the above embodiment of taking the target area traced in the third image as a sample region when the tracking error exceeds the first preset threshold but not the second preset threshold, the second preset threshold being larger than the first;
the device further includes: a determining module, configured to execute the step of determining that the third image does not contain the target area when the tracking error exceeds the second preset threshold in the above embodiment.
It should be noted that when the target area detection device provided in the above embodiment detects a target area, the division into the above functional modules is only used as an example; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the terminal is divided into different functional modules to complete all or part of the functions described above. In addition, the target area detection device provided in the above embodiment belongs to the same concept as the target area detection method embodiment; for its specific implementation, refer to the method embodiment, which is not repeated here.
Figure 12 shows the structure diagram for the terminal 1200 that an illustrative embodiment of the invention provides.The terminal 1200 can To be portable mobile termianl, such as:Smart mobile phone, tablet computer, MP3 player (Moving Picture Experts Group Audio Layer III, dynamic image expert's compression standard audio level 3), MP4 (Moving Picture Experts Group Audio Layer IV, dynamic image expert's compression standard audio level 4) player, laptop, Desktop computer, headset equipment or any other intelligent terminal.Terminal 1200 is also possible to be referred to as user equipment, portable terminal Other titles such as end, laptop terminal, terminal console.
Generally, the terminal 1200 includes a processor 1201 and a memory 1202.
The processor 1201 may include one or more processing cores, such as a 4-core processor or a 5-core processor. The processor 1201 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor. The main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1201 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1202 may include one or more computer-readable storage media, which may be non-transitory. The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one instruction, and the at least one instruction is executed by the processor 1201 to implement the target area detection method provided by the method embodiments of the present application.
In some embodiments, the terminal 1200 optionally further includes a peripheral device interface 1203 and at least one peripheral device. The processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or a signal line. Each peripheral device may be connected to the peripheral device interface 1203 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1204, a touch display screen 1205, a camera 1206, an audio circuit 1207, a positioning component 1208, and a power supply 1209.
The peripheral device interface 1203 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1204 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with a communication network and other communication devices through electromagnetic signals. The radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 1204 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to a metropolitan area network, mobile communication networks of various generations (2G, 3G, 4G, and 5G), a wireless local area network, and/or a WiFi (Wireless Fidelity) network. In some embodiments, the radio frequency circuit 1204 may also include circuits related to NFC (Near Field Communication), which is not limited in the present application.
The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to capture touch signals on or above its surface. The touch signal may be input to the processor 1201 as a control signal for processing. In this case, the display screen 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1205, arranged on the front panel of the terminal 1200; in other embodiments, there may be at least two display screens 1205, respectively arranged on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display screen 1205 may be a flexible display screen, arranged on a curved or folded surface of the terminal 1200. The display screen 1205 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 1205 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera assembly 1206 is used to capture images or video. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize the background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 1206 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 1207 may include a microphone and a speaker. The microphone is used to capture sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1201 for processing, or input them to the radio frequency circuit 1204 to implement voice communication. For stereo capture or noise reduction purposes, there may be multiple microphones, respectively arranged at different parts of the terminal 1200. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 1207 may also include a headphone jack.
The positioning component 1208 is used to locate the current geographic position of the terminal 1200 to implement navigation or LBS (Location Based Service). The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 1209 is used to supply power to the various components in the terminal 1200. The power supply 1209 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 1209 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also be used to support fast-charging technology.
In some embodiments, the terminal 1200 further includes one or more sensors 1210. The one or more sensors 1210 include but are not limited to an acceleration sensor 1211, a gyro sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
The acceleration sensor 1211 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 1200. For example, the acceleration sensor 1211 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1201 may control the touch display screen 1205 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211. The acceleration sensor 1211 may also be used to collect motion data of a game or a user.
The gyro sensor 1212 can detect the body direction and rotation angle of the terminal 1200, and may cooperate with the acceleration sensor 1211 to capture the user's 3D actions on the terminal 1200. According to the data collected by the gyro sensor 1212, the processor 1201 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 1213 may be arranged on the side frame of the terminal 1200 and/or the lower layer of the touch display screen 1205. When the pressure sensor 1213 is arranged on the side frame of the terminal 1200, the user's grip signal on the terminal 1200 can be detected, and the processor 1201 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 1213. When the pressure sensor 1213 is arranged on the lower layer of the touch display screen 1205, the processor 1201 controls operability controls on the UI according to the user's pressure operation on the touch display screen 1205. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 1214 is used to collect the user's fingerprint, and the processor 1201 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 1201 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1214 may be arranged on the front, back, or side of the terminal 1200. When a physical button or a manufacturer logo is provided on the terminal 1200, the fingerprint sensor 1214 may be integrated with the physical button or the manufacturer logo.
The optical sensor 1215 is used to collect the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the touch display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1205 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 1205 is turned down. In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
The proximity sensor 1216, also called a distance sensor, is generally arranged on the front panel of the terminal 1200. The proximity sensor 1216 is used to capture the distance between the user and the front of the terminal 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually decreases, the processor 1201 controls the touch display screen 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually increases, the processor 1201 controls the touch display screen 1205 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Figure 12 does not constitute a limitation on the terminal 1200, and the terminal may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
An embodiment of the present invention further provides a terminal for detecting a target area. The terminal includes a processor and a memory, and the memory stores at least one instruction, at least one program, a code set, or an instruction set. The instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations performed in the target area detection method of the above embodiments.
An embodiment of the present invention further provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set. The instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed in the target area detection method of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (15)

1. A target area detection method, characterized in that the method comprises:
determining, according to a target area specified by a user in a first image of a video, multiple sample areas and classification results of the multiple sample areas, the classification results being used to indicate whether the sample areas belong to the target area;
obtaining a classifier to be trained, the classifier comprising multiple class nodes arranged in sequence;
training a first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continuing to train the next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
tracking the target area in images of the video other than the first image, and when it is determined that the currently tracked image does not contain the target area, applying the trained classifier to classify at least one region in a second image following the currently tracked image, and determining the target area in the second image according to the classification results.
2. The method according to claim 1, characterized in that determining, according to the target area specified by the user in the first image of the video, the multiple sample areas and the classification results of the multiple sample areas comprises:
performing region detection on the first image to obtain the multiple sample areas; and
determining the classification results of the multiple sample areas according to the overlap ratios between the multiple sample areas and the target area.
3. The method according to claim 1, characterized in that training the first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continuing to train the next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed, comprises:
initializing node parameters of the multiple class nodes;
training the node parameter of the first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, to obtain the trained node parameter of the first class node; and
continuing to train the node parameter of the next class node according to the multiple sample areas, the classification results of the multiple sample areas, and the trained node parameter of the previous class node, to obtain the trained node parameter of the next class node, until the training of the multiple class nodes is completed.
4. The method according to claim 3, characterized in that when any class node in the classifier outputs a first classification value, it indicates that the current region to be classified belongs to the target area, and when any class node outputs a second classification value, it indicates that the current region to be classified does not belong to the target area;
after the training of the multiple class nodes is completed, the method further comprises:
selecting, from the multiple sample areas, multiple positive sample regions that belong to the target area;
for each positive sample region, applying the multiple class nodes to classify the positive sample region separately, to obtain the classification values output by the multiple class nodes;
combining, according to the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and taking the decimal value corresponding to the binary number as the classification result of the positive sample region; and
determining the classification result that occurs most often among the multiple positive sample regions as the target classification result.
5. The method according to claim 4, characterized in that applying the trained classifier to classify at least one region in the second image following the currently tracked image, and determining the target area in the second image according to the classification results, comprises:
for each region in the second image, applying the multiple class nodes to classify the region separately, to obtain the classification values output by the multiple class nodes;
combining, according to the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and taking the decimal value corresponding to the binary number as the classification result of the region; and
when the classification result equals the target classification result, determining that the region belongs to the target area.
6. The method according to claim 1, characterized in that tracking the target area in images of the video other than the first image comprises:
detecting the target area in the first image to obtain multiple feature points;
tracking the multiple feature points in any two adjacent frames to determine the positions of the multiple feature points in the other images; and
determining the target area in the other images according to the positions of the multiple feature points in the other images.
7. The method according to claim 1, characterized in that the method further comprises:
obtaining a tracking error when tracking the target area in a third image of the video other than the first image;
taking the target area tracked in the third image as a sample area when the tracking error exceeds a first preset threshold; and
updating the classifier according to the sample area to obtain an updated classifier.
8. The method according to claim 7, characterized in that taking the target area tracked in the third image as a sample area when the tracking error exceeds the first preset threshold comprises:
taking the target area tracked in the third image as a sample area when the tracking error exceeds the first preset threshold but does not exceed a second preset threshold, the second preset threshold being greater than the first preset threshold; and
determining that the third image does not contain the target area when the tracking error exceeds the second preset threshold.
9. A target area detection device, characterized in that the device comprises:
a sample determining module, configured to determine, according to a target area specified by a user in a first image of a video, multiple sample areas and classification results of the multiple sample areas, the classification results being used to indicate whether the sample areas belong to the target area;
an acquisition module, configured to obtain a classifier to be trained, the classifier comprising multiple class nodes arranged in sequence;
a training module, configured to train a first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, and continue to train the next class node after the training of the first class node is completed, until the training of the multiple class nodes is completed; and
a detection module, configured to track the target area in images of the video other than the first image, and when it is determined that the currently tracked image does not contain the target area, apply the trained classifier to classify at least one region in a second image following the currently tracked image, and determine the target area in the second image according to the classification results.
10. The device according to claim 9, characterized in that the sample determining module comprises:
a region detection unit, configured to perform region detection on the first image to obtain the multiple sample areas; and
a determination unit, configured to determine the classification results of the multiple sample areas according to the overlap ratios between the multiple sample areas and the target area.
11. The device according to claim 9, characterized in that the training module comprises:
an initialization unit, configured to initialize node parameters of the multiple class nodes; and
a training unit, configured to train the node parameter of the first class node in the classifier according to the multiple sample areas and the classification results of the multiple sample areas, to obtain the trained node parameter of the first class node;
the training unit being further configured to continue to train the node parameter of the next class node according to the multiple sample areas, the classification results of the multiple sample areas, and the trained node parameter of the previous class node, to obtain the trained node parameter of the next class node, until the training of the multiple class nodes is completed.
12. The device according to claim 11, characterized in that when any class node in the classifier outputs a first classification value, it indicates that the current region to be classified belongs to the target area, and when any class node outputs a second classification value, it indicates that the current region to be classified does not belong to the target area;
the device further comprises:
a selection module, configured to select, from the multiple sample areas, multiple positive sample regions that belong to the target area;
a classification module, configured to, for each positive sample region, apply the multiple class nodes to classify the positive sample region separately, to obtain the classification values output by the multiple class nodes;
a combination module, configured to combine, according to the order of the multiple class nodes, the classification values output by the multiple class nodes into a binary number, and take the decimal value corresponding to the binary number as the classification result of the positive sample region; and
a target determination module, configured to determine the classification result that occurs most often among the multiple positive sample regions as the target classification result.
13. The device according to claim 9, characterized in that the device further comprises:
an error acquisition module, configured to obtain a tracking error when tracking the target area in a third image of the video other than the first image;
a sample acquisition module, configured to take the target area tracked in the third image as a sample area when the tracking error exceeds a first preset threshold; and
an update module, configured to update the classifier according to the sample area to obtain an updated classifier.
14. A terminal for detecting a target area, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set being loaded and executed by the processor to implement the operations performed in the target area detection method according to any one of claims 1 to 8.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed in the target area detection method according to any one of claims 1 to 8.
CN201810650498.4A 2018-06-22 2018-06-22 Target area detection method, device, terminal and storage medium Active CN108776822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810650498.4A CN108776822B (en) 2018-06-22 2018-06-22 Target area detection method, device, terminal and storage medium


Publications (2)

Publication Number Publication Date
CN108776822A true CN108776822A (en) 2018-11-09
CN108776822B CN108776822B (en) 2020-04-24

Family

ID=64025419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810650498.4A Active CN108776822B (en) 2018-06-22 2018-06-22 Target area detection method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN108776822B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831618A (en) * 2012-07-20 2012-12-19 西安电子科技大学 Hough forest-based video target tracking method
CN103744924A (en) * 2013-12-26 2014-04-23 西安理工大学 Frequent pattern based selective ensemble classification method
KR20160111151A (en) * 2015-03-16 2016-09-26 (주)이더블유비엠 image processing method and apparatus, and interface method and apparatus of gesture recognition using the same
CN106709932A (en) * 2015-11-12 2017-05-24 阿里巴巴集团控股有限公司 Face position tracking method and device and electronic equipment
CN107066990A (en) * 2017-05-04 2017-08-18 厦门美图之家科技有限公司 A kind of method for tracking target and mobile device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HONG LIU 等: "ROBUST HAND TRACKING BASED ON ONLINE LEARNING AND MULTI-CUE FLOCKS OF FEATURES", 《2013 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING》 *
ZDENEK KALAL 等: "Tracking-Learning-Detection", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
姜媚: "监控视频事件检测算法", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241869A (en) * 2018-11-28 2020-06-05 杭州海康威视数字技术股份有限公司 Method and device for checking materials and computer readable storage medium
CN111241869B (en) * 2018-11-28 2024-04-02 杭州海康威视数字技术股份有限公司 Material checking method and device and computer readable storage medium
CN110084204A (en) * 2019-04-29 2019-08-02 北京字节跳动网络技术有限公司 Image processing method, device and electronic equipment based on target object posture
CN110084204B (en) * 2019-04-29 2020-11-24 北京字节跳动网络技术有限公司 Image processing method and device based on target object posture and electronic equipment
CN110245246A (en) * 2019-04-30 2019-09-17 维沃移动通信有限公司 A kind of image display method and terminal device
CN112312203A (en) * 2020-08-25 2021-02-02 北京沃东天骏信息技术有限公司 Video playing method, device and storage medium
CN112312203B (en) * 2020-08-25 2023-04-07 北京沃东天骏信息技术有限公司 Video playing method, device and storage medium
CN113743380A (en) * 2021-11-03 2021-12-03 江苏博子岛智能产业技术研究院有限公司 Active tracking method based on video image dynamic monitoring

Also Published As

Publication number Publication date
CN108776822B (en) 2020-04-24

Similar Documents

Publication Publication Date Title
CN110555883B (en) Repositioning method and device for camera attitude tracking process and storage medium
WO2020216116A1 (en) Action recognition method and apparatus, and human-machine interaction method and apparatus
CN109086709A (en) Feature Selection Model training method, device and storage medium
CN108776822A (en) Target area detection method, device, terminal and storage medium
CN111079576B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
CN110222789B (en) Image recognition method and storage medium
CN110062269A (en) Extra objects display methods, device and computer equipment
WO2019007258A1 (en) Method, apparatus and device for determining camera posture information, and storage medium
CN109815150B (en) Application testing method and device, electronic equipment and storage medium
CN108615247A (en) Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN110555839A (en) Defect detection and identification method and device, computer equipment and storage medium
WO2020221012A1 (en) Method for determining motion information of image feature point, task execution method, and device
CN114648480A (en) Surface defect detection method, device and system
CN108537845A (en) Pose determines method, apparatus and storage medium
CN110650379B (en) Video abstract generation method and device, electronic equipment and storage medium
CN108682036A (en) Pose determines method, apparatus and storage medium
CN108596976A (en) Method for relocating, device, equipment and the storage medium of camera posture tracing process
JP2021524957A (en) Image processing methods and their devices, terminals and computer programs
WO2019024717A1 (en) Anti-counterfeiting processing method and related product
CN110807361A (en) Human body recognition method and device, computer equipment and storage medium
CN110148178A (en) Camera localization method, device, terminal and storage medium
CN111127509B (en) Target tracking method, apparatus and computer readable storage medium
CN110827195B (en) Virtual article adding method and device, electronic equipment and storage medium
CN112749613B (en) Video data processing method, device, computer equipment and storage medium
CN109285178A (en) Image partition method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant