CN110059521A

CN110059521A - target tracking method and device

Info

Publication number: CN110059521A
Application number: CN201810049002.8A
Authority: CN
Inventors: 黄元捷
Original assignee: Zhejiang Uniview Technologies Co Ltd
Current assignee: Zhejiang Uniview Technologies Co Ltd
Priority date: 2018-01-18
Filing date: 2018-01-18
Publication date: 2019-07-26
Anticipated expiration: 2038-01-18
Also published as: CN110059521B

Abstract

The invention provides a target tracking method and a target tracking device, which are applied to a server storing a characteristic model corresponding to each target object. The method comprises the following steps: carrying out target detection on the current video frame image, and extracting corresponding CNN characteristics according to the position information of each object to be detected obtained by detection; calculating to obtain a corresponding similarity matrix according to the position information and the CNN characteristics of each object to be detected and the position information and the characteristic model of each target object in the previous video frame image; performing data association on each object to be detected and each target object based on the similarity matrix to obtain an optimal matching result; and if the optimal matching result has an object to be detected which is successfully matched with the corresponding target object, updating the corresponding feature model according to the CNN feature of the object to be detected, and obtaining a corresponding tracking result based on the object to be detected. The method has the advantages of strong anti-interference capability and high tracking success rate, and can continuously track the target object.

Description

Method for tracking target and device

Technical field

The present invention relates to the Multitarget Tracking fields of video image, in particular to a kind of method for tracking target And device.

Background technique

With the continuous development of monitoring technology, multiple target for being tracked to target objects multiple in monitor video with The application of track technology is more extensive.Existing multiple target tracking scheme passes through during tracking to target object by the mesh Mark CNN (Convolutional Neural Network, convolutional neural networks) feature of the object in current video image with The mode that CNN feature when the target object is tracked successfully recently is compared realizes the tracking to the target object, but this Kind multiple target tracking scheme anti-interference ability is weak, and target following success rate is low, it will usually because target object is tracked into recently CNN feature when function carries the feature of partial occlusion object, and cause CNN feature of the target object in current video image without CNN feature when method is tracked successfully recently with the target object correctly matches, so as to cause tracking failure.

Summary of the invention

In order to overcome above-mentioned deficiency in the prior art, the purpose of the present invention is to provide a kind of method for tracking target and dresses It sets, the method for tracking target strong antijamming capability, target following success rate is high, can persistently track to target object.

For method, preferred embodiments of the present invention provide a kind of method for tracking target, are applied to server, the clothes Business device is stored with the corresponding characteristic model of each target object, wherein each characteristic model includes the history CNN of corresponding target object Feature, which comprises

Target detection carried out to current video frame image, and according in the obtained current video frame image of detection respectively to The location information of test object extracts the corresponding CNN feature of each object to be detected from the current video frame image；

According to the location information of object to be detected each in current video frame image and corresponding CNN feature, with a upper video The location information of each target object and corresponding characteristic model in frame image are calculated each to be detected in current video frame image Similarity matrix in object and a upper video frame images between each target object；

Each object to be detected and each target object are subjected to data correlation based on the similarity matrix, obtain current video Optimum Matching result between frame image and a upper video frame images；

If exist in the Optimum Matching result with the object to be detected of corresponding target object successful match, according to described The corresponding characteristic model of the target object is carried out more with the CNN feature of the object to be detected of corresponding target object successful match Newly, and the object to be detected based on the successful match obtains corresponding tracking result.The method is according to current video frame figure The history that the characteristic model of each target object includes in the CNN feature of each object to be detected as in and a upper video frame images CNN feature is calculated in current video frame image in each object to be detected and a upper video frame images between each target object Optimal similarity matrix, and based on the similarity matrix obtain the current video frame image and a upper video frame images it Between Optimum Matching as a result, with based on to be detected with corresponding target object successful match present in the Optimum Matching result Object obtains corresponding tracking result, to reduce influence of the chaff interferent to target following, improves target following success rate, realizes Lasting tracking to target object.

For device, preferred embodiments of the present invention provide a kind of target tracker, are applied to server, the clothes Business device is stored with the corresponding characteristic model of each target object, wherein each characteristic model includes the history CNN of corresponding target object Feature, described device include:

Detection and Extraction module for carrying out target detection to current video frame image, and is worked as according to detecting and obtain The location information of each object to be detected extracts each object to be detected from the current video frame image in preceding video frame images Corresponding CNN feature；

Matrix computing module, for according to the location information of object to be detected each in current video frame image and corresponding CNN feature is calculated with the location information and corresponding characteristic model of each target object in a upper video frame images and works as forward sight Similarity matrix in frequency frame image in each object to be detected and a upper video frame images between each target object；

Images match module, for each object to be detected and each target object to be carried out data based on the similarity matrix Association, obtains the Optimum Matching result between current video frame image and a upper video frame images；

Tracking module is updated, if to be checked with corresponding target object successful match for existing in the Optimum Matching result When surveying object, according to described corresponding to the target object with the CNN feature of the object to be detected of corresponding target object successful match Characteristic model be updated, and the object to be detected based on the successful match obtains corresponding tracking result.

In terms of existing technologies, the method for tracking target and device that preferred embodiments of the present invention provide have following The utility model has the advantages that the method for tracking target strong antijamming capability, target following success rate is high, can persistently carry out to target object Tracking.The method for tracking target is applied to server, and the server is stored with the corresponding characteristic model of each target object, In each characteristic model include corresponding target object history CNN feature.Firstly, the method passes through to current video frame image Target detection is carried out, obtains each object to be detected in the current video frame image, and is obtained according to detection described current The location information of each object to be detected extracts each object pair to be detected from the current video frame image in video frame images The CNN feature answered；Then, the method is according to the location information of object to be detected each in current video frame image and corresponding CNN feature, it is special with the history CNN that the location information of each target object in a upper video frame images and corresponding characteristic model include Sign is calculated optimal between each target object in each object to be detected and a upper video frame images in current video frame image Similarity matrix；Then, the method is based on the similarity matrix and each object to be detected and each target object is carried out data Association, obtains the Optimum Matching result between current video frame image and a upper video frame images；Finally, the method is described When existing in Optimum Matching result with the object to be detected of corresponding target object successful match, according to described with corresponding target object The CNN feature of the object to be detected of successful match is updated the corresponding characteristic model of the target object, and based on described Corresponding tracking result is obtained with successful object to be detected, to reduce influence of the chaff interferent to target following, improves target Success rate is tracked, realizes the lasting tracking to target object.

To enable the above objects, features and advantages of the present invention to be clearer and more comprehensible, present pre-ferred embodiments are cited below particularly, And cooperate appended attached drawing, it is described in detail below.

Detailed description of the invention

In order to illustrate the technical solution of the embodiments of the present invention more clearly, below will be to needed in the embodiment attached Figure is briefly described, it should be understood that the following drawings illustrates only certain embodiments of the present invention, therefore is not construed as pair The restriction of the claims in the present invention protection scope, for those of ordinary skill in the art, what is do not made the creative labor Under the premise of, it can also be obtained according to these attached drawings other relevant attached drawings.

Fig. 1 is a kind of block diagram for the server that preferred embodiments of the present invention provide.

Fig. 2 is a kind of flow diagram for the method for tracking target that preferred embodiments of the present invention provide.

Fig. 3 is a kind of flow diagram for the sub-step that step S220 shown in Fig. 2 includes.

Fig. 4 is a kind of flow diagram for the sub-step that step S240 shown in Fig. 2 includes.

A kind of block diagram of target tracker shown in Fig. 1 that Fig. 5 provides for preferred embodiments of the present invention.

Fig. 6 is a kind of block diagram of matrix computing module shown in Fig. 5.

Icon: 10- server；11- memory；12- processor；13- communication unit；100- target tracker；110- Detection and Extraction module；120- matrix computing module；130- images match module；140- updates tracking module；121- similarity meter Operator module；122- matrix generates submodule.

Specific embodiment

In order to make the object, technical scheme and advantages of the embodiment of the invention clearer, below in conjunction with the embodiment of the present invention In attached drawing, technical scheme in the embodiment of the invention is clearly and completely described, it is clear that described embodiment is A part of the embodiment of the present invention, instead of all the embodiments.The present invention being usually described and illustrated herein in the accompanying drawings is implemented The component of example can be arranged and be designed with a variety of different configurations.

Therefore, the detailed description of the embodiment of the present invention provided in the accompanying drawings is not intended to limit below claimed The scope of the present invention, but be merely representative of selected embodiment of the invention.Based on the embodiments of the present invention, this field is common Technical staff's every other embodiment obtained without creative efforts belongs to the model that the present invention protects It encloses.

It should also be noted that similar label and letter indicate similar terms in following attached drawing, therefore, once a certain Xiang Yi It is defined in a attached drawing, does not then need that it is further defined and explained in subsequent attached drawing.

With reference to the accompanying drawing, it elaborates to some embodiments of the present invention.In the absence of conflict, following Feature in embodiment and embodiment can be combined with each other.

Fig. 1 is please referred to, is a kind of block diagram for the server 10 that preferred embodiments of the present invention provide.In the present invention In embodiment, the server 10 is used to carry out constantly strong antijamming capability to monitored object each in the monitor video got And the high target following of success rate is tracked, wherein the server 10 may be, but not limited to, cloud formula server, distribution Server, centralized server etc..

In the present embodiment, the server 10 includes target tracker 100, memory 11, processor 12 and communication Unit 13.The memory 11, processor 12 and each element of communication unit 13 are directly or indirectly electrically connected between each other, To realize the transmission or interaction of data.For example, these elements can pass through one or more communication bus or signal wire between each other It realizes and is electrically connected.

Wherein, the memory 11 can be used for storing the corresponding characteristic model of each target object in monitor video, wherein often A characteristic model includes the history CNN feature that corresponding target object is extracted when the server 10 is tracked, the mesh Marking object is the object tracked in the monitor video, and the target object can be people, be also possible to vehicle, It can also be animal and/or plant.The memory 11 may be, but not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), programmable read only memory (Programmable Read-Only Memory, PROM), Erasable Programmable Read Only Memory EPROM (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electric Erasable Programmable Read-Only Memory, EEPROM) etc..Wherein, it can store in the memory 11 There is a software program, the software program can be performed after receiving and executing instruction in the processor 12.

The processor 12 can be a kind of IC chip of processing capacity with signal.The processor 12 can To be general processor, including central processing unit (Central Processing Unit, CPU), network processing unit (Network Processor, NP) etc.；Can also be digital signal processor, specific integrated circuit, ready-made programmable gate array or other Programmable logic device, discrete gate or transistor logic, discrete hardware components.The processor 12 may be implemented or Execute disclosed each method, step and the logic diagram in the embodiment of the present invention.General processor can be microprocessor or The processor is also possible to any conventional processor etc..

The communication unit 13 is used for the communication link established between the server 10 and other external equipments by network It connects, and passes through the network sending and receiving data.Wherein, other described external equipments include monitoring device and display equipment, the clothes Business device 10 obtains the monitor video for needing to carry out target following by the communication unit 13 from monitoring device, and in completion pair After the target following of the monitor video, the monitor video for completing target following can be set in display by the communication unit 13 It is shown on standby.

The target tracker 100 includes that at least one can be stored in the memory in the form of software or firmware Software function module in 11.The processor 12 can be used for executing the target tracker that the memory 11 stores 100 corresponding executable modules, such as software function module included by the target tracker 100 and computer program Deng.In the present embodiment, the strong antijamming capability of the target tracker 100, can be by will be in current video frame image The side that the history CNN feature that the CNN feature of each object to be detected and the characteristic model of each target object include is compared one by one Formula carries out the constantly high target following of target following success rate to target object each in monitor video.

It is understood that block diagram shown in FIG. 1 is only a kind of structure composition schematic diagram of server 10, the server 10 may also include than shown in Fig. 1 more perhaps less component or with the configuration different from shown in Fig. 1.Shown in Fig. 1 Each component can using hardware, software, or its combination realize.

It referring to figure 2., is a kind of block diagram for the method for tracking target that preferred embodiments of the present invention provide.At this In inventive embodiments, the method for tracking target is applied to above-mentioned server 10, for target object each in monitor video It carries out constantly strong antijamming capability and tracks the high target following of success rate, wherein being stored with corresponding prison in the server 10 The characteristic model of each target object in video is controlled, each characteristic model includes the history CNN feature of corresponding target object.It is right below Method for tracking target detailed process and step shown in Fig. 2 are described in detail.

In embodiments of the present invention, the method for tracking target the following steps are included:

Step S210 carries out target detection, and the current video frame obtained according to detection to current video frame image The location information of each object to be detected extracts the corresponding CNN of each object to be detected from the current video frame image in image Feature.

In the present embodiment, the monitor video that the server 10 is got can continuously display shape by multiple video frame images At the server 10 can be by the way that the CNN feature of each target object that may be present in multiple video frame images to be compared Mode, complete to carry out target following to the object that needs to track in the monitor video.The server 10 can be by working as Preceding video frame images carry out the mode of target detection, get the position letter of each object to be detected in the current video frame image Breath, and according to the location information of each object to be detected from the current video frame image corresponding position take off out correspond to it is to be checked The characteristic area of object is surveyed, so that the CNN for extracting corresponding object to be detected from the characteristic area of each object to be detected is special Sign.Wherein the object to be detected is the object detected in current video frame image, wherein the object to be detected can wrap It includes in at least previous video frame image of the current video frame image in the monitor video according to timing arrangement Emerging needs carry out the object of just secondary tracking in the object and the current video frame image persistently tracked.

Step S220, according to the location information of object to be detected each in current video frame image and corresponding CNN feature, with The location information and corresponding characteristic model of each target object, are calculated in current video frame image in a upper video frame images Similarity matrix in each object to be detected and a upper video frame images between each target object.

In the present embodiment, a upper video frame images are in the monitor video according to the described current of timing arrangement The previous video frame image of video frame images, the target object in a upper video frame images are the server 10 right All target objects got from the monitor video before the current video frame image progress target following, described upper one Target object in video frame images includes the target object directly manifested in video frame images on this, and to working as forward sight Frequency frame image carries out the target object directly manifested in video frame images not on this before target detection.

In the present embodiment, the server 10 by by object to be detected each in current video frame image in the video frame The history CNN feature for including in each target object character pair model in CNN feature in image and a upper video frame images into The mode that row compares, by location information of the object to be detected each in current video frame image in the video frame images and a upper view In frequency frame image each target object on this in video frame images respectively should the mode that is compared of corresponding location information, It obtains in current video frame image optimal similar between each object to be detected and each target object in a upper video frame images Spend matrix.The history CNN feature for wherein including in each characteristic model is that corresponding target object regards when being successfully tracked corresponding CNN feature in frequency frame image.For example, the first video frame images, third video frame images, the 5th video frame images and the 7th view Frequency frame image is arranged successively according to timing, and a target object is in the first video frame images, the 5th video frame images and the 7th view It is successfully tracked in frequency frame image, if there is no limit the features of the target object for the history CNN number of features in characteristic model Model includes that the target object is respectively right in first video frame images, the 5th video frame images and the 7th video frame images The CNN feature answered.

It optionally, referring to figure 3., is a kind of flow diagram for the sub-step that step S220 shown in Fig. 2 includes.? In the present embodiment, each object to be detected and a upper video frame images in current video frame image are calculated in the step S220 In similarity matrix between each target object the step of may include sub-step S221, sub-step S222 and sub-step S223:

Sub-step S221 is obtained each to be detected based on the history CNN feature calculation in each target object character pair model Characteristic similarity between object and each target object.

In the present embodiment, the server 10 can by by object to be detected each in the current video frame image respectively Corresponding CNN feature respectively with all history for including in the characteristic model of each target object in a upper video frame images The mode of calculating is compared in CNN feature, gets optimal feature phase between each object to be detected and each target object Like degree.

Optionally, the history CNN feature calculation based in each target object character pair model obtains each to be detected The step of characteristic similarity between object and each target object includes:

Each history CNN feature in CNN feature and each target object character pair model to each object to be detected Between COS distance calculated, obtain each COS distance between the object to be detected and corresponding target object；

The smallest COS distance of numerical value is chosen from each COS distance as the object to be detected and corresponding target Characteristic similarity between object.

Wherein, the optimal feature between each object to be detected and corresponding target object that can be acquired according to above-mentioned steps Similarity, the optimal characteristic similarity can be calculated by following formula:

Wherein, M_iIndicate that target sequence number is the characteristic model of the target object of i, F_i ⁰Indicate the initial CNN of the target object Feature, F_i ⁿIndicate the target object in the history CNN feature of video frame n image, aff_appIndicate object to be detected and corresponding mesh Mark the characteristic similarity between object, F_iIndicate the history CNN feature of corresponding target object,Indicate corresponding object to be detected CNN feature, cosine (F_i,) indicate target object history CNN feature F_iWith the CNN feature of corresponding object to be detected Between COS distance.

Sub-step S222, location information, target size information based on each target object in a upper video frame images, and it is each Location information of the object to be detected in current video frame image, target size information, are calculated each object to be detected and each mesh Mark the space similarity and shape similarity between object；

In the present embodiment, each target object includes corresponding target object at this in the location information of a upper video frame images The X-coordinate information and Y-coordinate information of the top left co-ordinate point in character pair region, each target object exist in a upper video frame images The target size information of a upper video frame images includes corresponding target object character pair region in a video frame images on this Peak width and region height, location information of each object to be detected in current video frame image include correspondence it is to be detected right It is each to be checked as the characteristic area in the current video frame image is in the X-coordinate information and Y-coordinate information of top left co-ordinate point Surveying target size information of the object in current video frame image includes corresponding object to be detected in the current video frame image In characteristic area peak width and region height.The server 10 will be according to the peak width of each object to be detected, area Domain height, X-coordinate information, the X-coordinate information and Y-coordinate information of Y-coordinate information and each target object, calculating acquire each to be checked Survey the space similarity between object and each target object；The server 10 by according to the peak width of each object to be detected, Region height, X-coordinate information, Y-coordinate information and each target object peak width and region height, calculating acquires each to be checked Survey the shape similarity between object and each target object.In the present embodiment, the space similarity is similar to the shape Degree meets the matching criterior of Hungary Algorithm and its expansion algorithm.Wherein, the space similarity is similar to the shape can It is calculated by following formula:

Wherein, trk_iIndicate i-th of target object, det_jIndicate j-th of object to be detected, X, Y, W, H respectively indicate object X coordinate value, y-coordinate value, peak width and the region height of the upper left angle point in character pair region in corresponding video frame images, aff_mot(trk_i, det_j) indicate space similarity between i-th of target object and j-th of object to be detected, aff_shp(trk_i, det_j) indicate shape similarity between i-th of target object and j-th of object to be detected.

Sub-step S223, according between each object to be detected and each target object characteristic similarity, space similarity and Shape similarity is calculated between each object to be detected and each target object and is associated with similarity, and accordingly obtains described Similarity matrix.

In the present embodiment, the server 10 passes through each object to be detected is similar to the feature between each target object Degree, space similarity and shape similarity carry out multiplication operation, obtain best between each object to be detected and each target object Association similarity, and by the optimal association similarity between each object to be detected and each target object with a matrix type into Row arrangement generates optimal similarity matrix between the current video frame image and a upper video frame images.

Referring once again to Fig. 2, step S230, based on the similarity matrix will each object to be detected and each target object into Row data correlation obtains the Optimum Matching result between current video frame image and a upper video frame images.

In the present embodiment, the server 10 is based on the similarity by using Hungary Algorithm or its expansion algorithm Each object to be detected and each target object are carried out data correlation by matrix, to obtain the current video frame image and a upper view Optimum Matching result between frequency frame image.

Step S240, if existing and the object to be detected of corresponding target object successful match in the Optimum Matching result When, according to it is described with the CNN feature of the object to be detected of corresponding target object successful match to the corresponding feature of the target object Model is updated, and the object to be detected based on the successful match obtains corresponding tracking result.

In the present embodiment, the server 10 is obtaining between the current video frame image and a upper video frame images Optimum Matching result when, object division can be carried out to each object to be detected in current video frame image, obtain with it is described on Corresponded in one video frame images target object successful match object to be detected, with correspond to target in a upper video frame images The lower object to be detected of object matching degree, and do not belong to what each target object in a upper video frame images matched Newly occurs the object to be detected tracked in the current video frame image.

In the present embodiment, for newly occurring the object to be detected that is tracked in the current video frame image, The server 10 can be using CNN feature of the object to be detected in the current video frame image as the object to be detected Initial CNN feature, based on the characteristic model of the initial CNN feature-modeling object to be detected, and with Kalman filter pair The object to be detected carries out parameters revision, obtains the tracking result of the object to be detected.Wherein the server 10 is to described It, can be using the object to be detected that creates as monitoring when the latter video frame images of current video frame image carry out target following One target object of video, and target following is carried out with the characteristic model of the object to be detected.

In the present embodiment, target pair is corresponded to a upper video frame images in the current video frame image As the lower object to be detected of matching degree, the server 10 will be based on Kalman filter to the object to be detected previous Location information in video frame images predicted, and judges whether according to prediction result to remove the tracking of the object to be detected Device.If wherein the prediction result shows that the location information of the object to be detected in previous video frame images remains unchanged for a long period of time, and should When prediction total duration of the object to be detected in previous video frame images is greater than preset duration threshold value, the server 10 will removal To the tracker of the object to be detected；If the prediction result shows position letter of the object to be detected in previous video frame images Breath remains unchanged for a long period of time, and prediction total duration of the object to be detected in previous video frame images be less than preset duration threshold value when, institute Parameters revision will be carried out to the object to be detected with Kalman filter by stating server 10, obtain the tracking knot of the object to be detected Fruit.

In the present embodiment, target pair is corresponded to a upper video frame images in the current video frame image As the object to be detected of successful match, the server 10 will be special with CNN of the object to be detected in current video frame image Sign is updated the characteristic model of the successful target object of object matching to be detected, and to be checked to this with Kalman filter It surveys object and carries out parameters revision, obtain the tracking result of the object to be detected, the test object is Corresponding matching at this time Successful target object.

It optionally, referring to figure 4., is a kind of flow diagram for the sub-step that step S240 shown in Fig. 2 includes.? In the present embodiment, according to described with the CNN feature of the object to be detected of corresponding target object successful match in the step S240 The step of being updated to the corresponding characteristic model of the target object includes sub-step S241 and sub-step S242:

Sub-step S241 carries out the number of features of the history CNN feature in the corresponding characteristic model of the target object Statistics obtains corresponding feature sum.

In the present embodiment, the server 10 passes through system when the characteristic model to corresponding target object is updated The mode for counting the number of features of history CNN feature in the characteristic model of the target object obtains corresponding feature sum.

The feature sum is compared by sub-step S242 with default characteristic storage quantity, and will according to comparison result The corresponding characteristic model of the target object is added to the CNN feature of the object to be detected of the target object successful match In.

In the present embodiment, the server 10 according to comparison result by with described in the target object successful match to The step that the CNN feature of test object is added in the corresponding characteristic model of the target object includes:

It, directly will be with the mesh if the comparison result is that the feature sum is less than the default characteristic storage quantity The CNN feature of the successful object to be detected of mark object matching, which is added in character pair model, to be stored；

If the comparison result is that the feature sum is not less than the default characteristic storage quantity, with described to be detected Any one history CNN feature in the CNN feature replacement character pair model of object in addition to initial CNN feature, will be described The CNN feature of object to be detected is added in the corresponding characteristic model of the target object.

Wherein, the default characteristic storage quantity can be 10, is also possible to 15, can also be 25, numerical value Different configurations can be carried out according to actual needs.

Referring to figure 5., be preferred embodiments of the present invention provide Fig. 1 shown in target tracker 100 one kind Block diagram.In embodiments of the present invention, the target tracker 100 includes Detection and Extraction module 110, matrix calculating mould Block 120, images match module 130 and update tracking module 140.

The Detection and Extraction module 110, for carrying out target detection to current video frame image, and obtained according to detection The location information of each object to be detected extracts each to be checked from the current video frame image in the current video frame image Survey the corresponding CNN feature of object.

In the present embodiment, the Detection and Extraction module 110 can execute step S210 shown in Fig. 2, specifically hold Row process can refer to above to the detailed description of step S210.

The matrix computing module 120, for according to the location information of object to be detected each in current video frame image and Corresponding CNN feature is calculated with the location information and corresponding characteristic model of each target object in a upper video frame images Similarity matrix in current video frame image in each object to be detected and a upper video frame images between each target object.

In the present embodiment, the matrix computing module 120 can execute step S220 shown in Fig. 2, specifically hold Row process can refer to above to the detailed description of step S220.

Fig. 6 is please referred to, is a kind of block diagram of matrix computing module 120 shown in Fig. 5.In the present embodiment, The matrix computing module 120 includes that similarity calculation submodule 121 and matrix generate submodule 122.

The similarity calculation submodule 121, for based on the history CNN feature in each target object character pair model The characteristic similarity between each object to be detected and each target object is calculated.

In the present embodiment, the similarity calculation submodule 121 is based on going through in each target object character pair model The mode that history CNN feature calculation obtains the characteristic similarity between each object to be detected and each target object includes:

Wherein, the similarity calculation submodule 121 can execute sub-step S221 shown in Fig. 3, specific to execute Process is referred to the detailed description of above sub-paragraphs S221.

The similarity calculation submodule 121 is also used to believe based on each target object in the position of a upper video frame images Breath, location information in current video frame image of target size information and each object to be detected, target size information, calculate Space similarity and shape similarity between each object to be detected and each target object.

In the present embodiment, sub-step S222 shown in Fig. 3 can also be performed in the similarity calculation submodule 121, Specific implementation procedure is referred to the detailed description of above sub-paragraphs S222.

The matrix generates submodule 122, for similar to the feature between each target object according to each object to be detected Degree, space similarity and shape similarity be calculated between each object to be detected and each target object be associated with it is similar Degree, and accordingly obtain the similarity matrix.

In the present embodiment, the matrix, which generates submodule 122, can execute sub-step S223 shown in Fig. 3, specifically Implementation procedure be referred to the detailed description of above sub-paragraphs S223.

Referring once again to Fig. 5, described image matching module 130 will be each to be detected right for being based on the similarity matrix As carrying out data correlation with each target object, the Optimum Matching knot between current video frame image and a upper video frame images is obtained Fruit.

In the present embodiment, described image matching module 130 can execute step S230 shown in Fig. 2, specifically hold Row process is referred to above to the detailed description of step S230.

The update tracking module 140, if for exist in the Optimum Matching result with corresponding target object match at When the object to be detected of function, according to it is described with the CNN feature of the object to be detected of corresponding target object successful match to the target The corresponding characteristic model of object is updated, and the object to be detected based on the successful match obtains corresponding tracking result.

In the present embodiment, the update tracking module 140 is according to described to be checked with corresponding target object successful match The mode that is updated to the corresponding characteristic model of the target object of CNN feature for surveying object includes:

The number of features of history CNN feature in the corresponding characteristic model of the target object is counted, is obtained pair The feature sum answered；

The feature sum is compared with default characteristic storage quantity, and will be with the target pair according to comparison result As the CNN feature of the object to be detected of successful match is added in the corresponding characteristic model of the target object.

Wherein, the update tracking module 140 according to comparison result by with described in the target object successful match to The mode that the CNN feature of test object is added in the corresponding characteristic model of the target object includes:

In the present embodiment, the update tracking module 140 can execute shown in step S240 and Fig. 4 shown in Fig. 2 Sub-step S241 and sub-step S242, specific implementation procedure can refer to above to step S240, sub-step S241 and son The detailed description of step S242.

In conclusion in the method for tracking target and device that preferred embodiments of the present invention provide, the target following Method strong antijamming capability, target following success rate is high, can persistently track to target object.The method for tracking target Applied to server, the server is stored with the corresponding characteristic model of each target object, wherein each characteristic model includes pair Answer the history CNN feature of target object.Firstly, the method obtains institute by carrying out target detection to current video frame image Each object to be detected in current video frame image is stated, and each to be detected in the obtained current video frame image according to detecting The location information of object extracts the corresponding CNN feature of each object to be detected from the current video frame image；Then, described Location information and corresponding CNN feature of the method according to object to be detected each in current video frame image, with upper video frame figure The history CNN feature that the location information of each target object and corresponding characteristic model include as in, is calculated current video frame Similarity matrix optimal between each target object in each object to be detected and a upper video frame images in image；Then, described Method is based on the similarity matrix and each object to be detected and each target object is carried out data correlation, obtains current video frame figure Picture and the Optimum Matching result between a upper video frame images；Finally, the method in the Optimum Matching result exist with When the object to be detected of corresponding target object successful match, according to described with the object to be detected of corresponding target object successful match CNN feature the corresponding characteristic model of the target object is updated, and the object to be detected based on the successful match obtains To corresponding tracking result, to reduce influence of the chaff interferent to target following, target following success rate is improved, is realized to target The lasting tracking of object.

The foregoing is only a preferred embodiment of the present invention, is not intended to restrict the invention, for the skill of this field For art personnel, the invention may be variously modified and varied.All within the spirits and principles of the present invention, made any to repair Change, equivalent replacement, improvement etc., should all be included in the protection scope of the present invention.

Claims

1. a kind of method for tracking target, which is characterized in that be applied to server, it is corresponding that the server is stored with each target object Characteristic model, wherein each characteristic model includes the history CNN feature of corresponding target object, which comprises

Target detection is carried out to current video frame image, and each to be detected in the obtained current video frame image according to detecting The location information of object extracts the corresponding CNN feature of each object to be detected from the current video frame image；

According to the location information of object to be detected each in current video frame image and corresponding CNN feature, with upper video frame figure The location information and corresponding characteristic model of each target object, are calculated each object to be detected in current video frame image as in With the similarity matrix in a upper video frame images between each target object；

Each object to be detected and each target object are subjected to data correlation based on the similarity matrix, obtain current video frame figure Picture and the Optimum Matching result between a upper video frame images；

If in the Optimum Matching result exist with the object to be detected of corresponding target object successful match, according to it is described with it is right The CNN feature of the object to be detected of target object successful match is answered to be updated the corresponding characteristic model of the target object, and Object to be detected based on the successful match obtains corresponding tracking result.

2. the method according to claim 1, wherein it is described be calculated it is each to be detected in current video frame image The step of similarity matrix in object and a upper video frame images between each target object includes:

Each object to be detected and each target object are obtained based on the history CNN feature calculation in each target object character pair model Between characteristic similarity；

Worked as based on each target object in the location information of a upper video frame images, target size information and each object to be detected The sky between each object to be detected and each target object is calculated in the location information of preceding video frame images, target size information Between similarity and shape similarity；

It is calculated according to characteristic similarity, space similarity and the shape similarity between each object to be detected and each target object It is associated with similarity between each object to be detected and each target object, and accordingly obtains the similarity matrix.

3. according to the method described in claim 2, it is characterized in that, described based on going through in each target object character pair model History CNN feature calculation obtains the step of characteristic similarity between each object to be detected and each target object and includes:

Between each history CNN feature in CNN feature and each target object character pair model to each object to be detected COS distance calculated, obtain each COS distance between the object to be detected and corresponding target object；

The smallest COS distance of numerical value is chosen from each COS distance as the object to be detected and corresponding target object Between characteristic similarity.

4. method described in any one of -3 according to claim 1, which is characterized in that it is described according to corresponding target pair The step of being updated as the CNN feature of the object to be detected of successful match to the corresponding characteristic model of the target object include:

The number of features of history CNN feature in the corresponding characteristic model of the target object is counted, is obtained corresponding Feature sum；

The feature sum is compared with default characteristic storage quantity, and will be with the target object according to comparison result CNN feature with the successful object to be detected is added in the corresponding characteristic model of the target object.

5. according to the method described in claim 4, it is characterized in that, described will match according to comparison result with the target object The step that the CNN feature of the successful object to be detected is added in the corresponding characteristic model of the target object includes:

It, directly will be with the target pair if the comparison result is that the feature sum is less than the default characteristic storage quantity It is stored as the CNN feature of the object to be detected of successful match is added in character pair model；

If the comparison result is that the feature sum is not less than the default characteristic storage quantity, with the object to be detected CNN feature replacement character pair model in any one history CNN feature in addition to initial CNN feature, will be described to be checked The CNN feature for surveying object is added in the corresponding characteristic model of the target object.

6. a kind of target tracker, which is characterized in that be applied to server, it is corresponding that the server is stored with each target object Characteristic model, wherein each characteristic model includes the history CNN feature of corresponding target object, described device includes:

Detection and Extraction module for carrying out target detection to current video frame image, and works as forward sight according to detecting and obtain It is corresponding to extract each object to be detected from the current video frame image for the location information of each object to be detected in frequency frame image CNN feature；

Matrix computing module, for special according to the location information of object to be detected each in current video frame image and corresponding CNN Sign, with the location information and corresponding characteristic model of each target object in a upper video frame images, is calculated current video frame Similarity matrix in image in each object to be detected and a upper video frame images between each target object；

Images match module, for each object to be detected and each target object to be carried out data pass based on the similarity matrix Connection, obtains the Optimum Matching result between current video frame image and a upper video frame images；

Tracking module is updated, if to be detected right with corresponding target object successful match for existing in the Optimum Matching result As when, according to it is described with the CNN feature of the object to be detected of corresponding target object successful match to the corresponding spy of the target object Sign model is updated, and the object to be detected based on the successful match obtains corresponding tracking result.

7. device according to claim 6, which is characterized in that the matrix computing module includes similarity calculation submodule And matrix generates submodule；

The similarity calculation submodule, for being obtained based on the history CNN feature calculation in each target object character pair model Characteristic similarity between each object to be detected and each target object；

The similarity calculation submodule is also used to location information based on each target object in a upper video frame images, target Location information in current video frame image of dimension information and each object to be detected, target size information, are calculated each to be checked Survey the space similarity and shape similarity between object and each target object；

The matrix generates submodule, for according to the characteristic similarity between each object to be detected and each target object, space The similarity that is associated between each object to be detected and each target object is calculated in similarity and shape similarity, and corresponding Obtain the similarity matrix.

8. device according to claim 7, which is characterized in that the similarity calculation submodule is based on each target object pair The history CNN feature calculation in characteristic model is answered to obtain the side of the characteristic similarity between each object to be detected and each target object Formula includes:

9. the device according to any one of claim 6-8, which is characterized in that the update tracking module is according to The corresponding characteristic model of the target object is updated with the CNN feature of the object to be detected of corresponding target object successful match Mode include:

10. device according to claim 9, which is characterized in that the update tracking module will be with institute according to comparison result The CNN feature for stating the object to be detected of target object successful match is added in the corresponding characteristic model of the target object Mode includes: