CN108986143A - Method for detecting and tracking targets in video - Google Patents
Method for detecting and tracking targets in video - Download PDF - Info
- Publication number
- CN108986143A CN108986143A CN201810940035.1A CN201810940035A CN108986143A CN 108986143 A CN108986143 A CN 108986143A CN 201810940035 A CN201810940035 A CN 201810940035A CN 108986143 A CN108986143 A CN 108986143A
- Authority
- CN
- China
- Prior art keywords
- video
- target
- sequence
- image frame
- frame sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Abstract
The invention discloses a method for detecting and tracking targets in video. The method first performs segmented sampling on the video to obtain several video image frame sequences. A neural network model M1 is then used to perform target detection and feature extraction on each video image frame sequence. Next, the correlation matrix of the target feature vectors corresponding to all detection results output within a video sequence is computed, yielding the tracking results of all detected targets within that sequence. Finally, the sampled video image frame sequences are ordered along the time axis, and the target detection tracks and feature matrices of the video image frame sequences are input into a neural network model M2 to obtain the tracking features of each target in each sequence; these tracking features are used to compute the correlation of all targets between two adjacent video image frame sequences, thereby completing target tracking over the entire video segment. The invention effectively reduces the amount of computation needed to complete the target detection and tracking task in video.
Description
Technical field
The invention belongs to the technical field of computer vision and relates to a method for detecting and tracking targets in video.
Background technique
Monitoring devices such as checkpoint cameras, public-security cameras, and various networked video cameras are installed and used in large numbers. The video data these devices collect plays a major role in handling traffic violations and related applications. However, as the number of installed devices grows, the volume of data produced keeps increasing, and storing and using this data poses enormous challenges. Video structuring has therefore become a research hotspot in both academia and industry.
A fundamental problem that no video-structuring scheme can avoid is accurately and efficiently detecting and tracking the key targets in video. In patents such as "Target tracking optimization method based on tracking-learning-detection" (CN107967692A), "Real-time unmanned aerial vehicle video object detection and tracking method" (CN108108697A), and "Multi-target pedestrian detection and tracking method based on deep learning" (CN107563313A), target detection is completed on single frames, features are computed over the regions of the detection results, and these features are then used to match and track targets across nearby frames. In these methods, target detection relies solely on the information of a single frame and cannot use the correlated information between nearby frames in the time series, so the accuracy of the detection results is limited. Meanwhile, the features used during matching and tracking are also extracted from single frames; such features must distinguish many different target individuals, and when similar targets travel together, matching errors easily occur and tracking fails. Finally, to guarantee detection and tracking accuracy, the frame-sampling interval is constrained, which leads to a large amount of computation and low efficiency.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a method for detecting and tracking targets in video.
The technical solution adopted by the present invention to solve the technical problem is as follows:
Step 1: Perform segmented sampling on the video to obtain several video image frame sequences.
Step 2: Use a neural network model M1 to perform target detection and feature extraction on each video image frame sequence. The output information includes: the index within the sequence of the image containing each target, the rectangular bounding box of the target in the image, and the target's feature vector.
Step 3: Compute the correlation matrix of the target feature vectors corresponding to all detection results output within a video sequence, and thereby obtain the tracking results of all detected targets within that sequence.
Step 4: Along the time axis, input the target detection tracks and feature matrices of the video image frame sequences into a neural network model M2 to obtain the tracking features of each target in each video image frame sequence; use these tracking features to compute the correlation of all targets between two adjacent video image frame sequences, thereby completing target tracking over the entire video segment.
Beneficial effects of the present invention:
1. The inter-frame information of time-series images improves the accuracy of the detector.
2. Full use of the spatio-temporal information of time-series images improves target tracking.
3. The amount of computation for detection and tracking is effectively reduced, improving operational efficiency.
4. Detection and tracking are effectively fused, improving the overall detection-and-tracking result.
Detailed description of the invention
Fig. 1 is a flow chart of the method of the present invention.
Specific embodiment
In order to make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments of the invention are described clearly and completely below in conjunction with the accompanying drawings. The described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work shall fall within the protection scope of the present invention.
As shown in Figure 1, the present invention comprises the following steps:
Step 1: Perform segmented sampling on the video to obtain several video image frame sequences.
Step 2: Perform target detection and feature extraction on each video image frame sequence. The output information includes: the index within the sequence of the image containing each target, the rectangular bounding box of the target in the image, and the target's feature vector.
Step 3: Compute the correlation matrix of the target feature vectors corresponding to all detection results output within a video sequence, and thereby obtain the tracking results of all detected targets within that sequence.
Step 4: Along the time axis, use the target detection tracks inside each video image frame sequence (including each target's image index within the sequence and its rectangular bounding box in the image) and the feature matrix (the serial concatenation of feature vectors) to perform matched tracking of targets across adjacent video sequences.
The target detection and feature extraction for each video image frame sequence are computed as follows: execute the inference process of the trained neural network model M1 to directly obtain the index within the sequence of the image containing each target, the rectangular bounding box of the target in the image, and the target's feature vector.
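As a concrete illustration, the per-target inference output of M1 just described can be held in a simple record, and a target's feature matrix assembled from its track. The field names below are assumptions for illustration only, not part of the patent:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    """One target detected by model M1 in a video image frame sequence."""
    frame_index: int                        # index of the image within the sequence
    box: Tuple[float, float, float, float]  # rectangular bounding box (x, y, w, h)
    feature: List[float]                    # the target's feature vector

def feature_matrix(track: List[Detection]) -> List[List[float]]:
    """A track's feature matrix: the serial concatenation of its feature
    vectors, as stated in the description."""
    return [d.feature for d in track]
```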
The neural network model M1 is trained as follows:
Collect annotated video data;
Cut sample video segments to obtain video image frame sequences together with each annotated target's image index within the sequence, rectangular bounding box in the image, and target ID information;
Optimize the network model through multi-task training on the detection and classification of targets in the video image sequences.
The following is an embodiment of the method for detecting and tracking targets in video. The specific steps are as follows:
Train the neural network model M1, which performs target detection and matching-feature computation within a video image frame sequence. The specific steps are:
1. Collect a large number of video segments V. Manually annotate the target positions in the video image sequences and the ID information of each target from its appearance to its disappearance, obtaining the original annotated sample set A = {V1, V2, ..., VL}.
2. Using deep-learning theory and methods, for each video segment Vi in the original annotated sample set A, perform segmented sampling to generate several video image frame sequences Pi, Pi+1, ..., Pi+k ∈ Vi, obtaining the training/test sample set B = {P1, P2, ..., Pi, Pi+1, ..., Pi+k, ..., Pn-k, ..., Pn-1, Pn}.
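The segmented sampling in step 2 above can be sketched as follows. The segment length and stride are illustrative assumptions, since the patent does not fix them:

```python
def segment_sample(num_frames: int, seg_len: int = 8, stride: int = 24):
    """Split a video of num_frames frames into short frame sequences P_i.

    Each sequence takes seg_len consecutive frame indices, and sequences
    start every `stride` frames; the frames skipped between segments are
    what reduce the overall computation.
    """
    segments = []
    start = 0
    while start + seg_len <= num_frames:
        segments.append(list(range(start, start + seg_len)))
        start += stride
    return segments

# Example: a 100-frame video yields sequences starting at frames 0, 24, 48, 72.
```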
3. Using deep-learning theory and methods, combined with the training/test sample set B, train in a multi-task manner to obtain the neural network model M1, which can detect targets and compute target features.
Train the neural network model M2, which computes target-matching tracking features between video image frame sequences. The specific steps are:
1. Using the neural network model M1, obtain the track (the index within the sequence of the image containing the target and the target's rectangular bounding box in the image) and the feature matrix (the serial concatenation of feature vectors) of each target in every video image sequence Pi of the training/test sample set B.
2. Using the target annotations in each video segment Vi and the tracks and feature matrices obtained for each video image frame sequence Pi+j by the neural network model M1, obtain the feature sample set of each target in Vi across the different video image frame sequences: O = {q1, q2, ..., qk}, where qi is generated by M1 in Pi. This yields the training data set C = {O1, O2, ..., Os} for target-matching tracking features between video image sequences.
3. Using deep-learning theory and methods, combined with the training data set C, train to obtain the neural network model M2 for computing target-matching tracking features between video image frame sequences.
Using the neural network models M1 and M2, detect and track targets in video. The specific steps are:
1. Perform segmented sampling on the video to be analyzed, generating several video image frame sequences.
2. For each video image frame sequence, execute the inference process of the neural network model M1 to obtain the index within the sequence of the image containing each target, the rectangular bounding box of the target in the image, and the target's feature vector.
3. Compute the correlation matrix of the target feature vectors corresponding to all detection results output within a video image frame sequence (the correlation can be computed with the Euclidean distance, the Mahalanobis distance, or similar), and thereby obtain the tracking results of all detected targets within the video image frame sequence.
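A minimal sketch of this step, assuming the Euclidean distance as the correlation measure (one of the options named above) and a greedy nearest-neighbor association between detections, which is an assumption for illustration since the patent does not specify the association rule:

```python
import numpy as np

def distance_matrix(feats_a: np.ndarray, feats_b: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between two sets of feature vectors."""
    diff = feats_a[:, None, :] - feats_b[None, :, :]
    return np.linalg.norm(diff, axis=2)

def match_frames(feats_a, feats_b, max_dist=1.0):
    """Greedily match each detection in one frame to its nearest unclaimed
    detection in the next frame, skipping pairs farther apart than max_dist."""
    d = distance_matrix(np.asarray(feats_a, dtype=float),
                        np.asarray(feats_b, dtype=float))
    matches, used = [], set()
    for i in np.argsort(d.min(axis=1)):   # most confident detections first
        for j in np.argsort(d[i]):        # nearest candidates first
            if int(j) not in used and d[i, j] <= max_dist:
                matches.append((int(i), int(j)))
                used.add(int(j))
                break
    return matches
```

Chaining such frame-to-frame matches through a sequence yields each target's within-sequence track.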
4. Order the sampled video image frame sequences along the time axis and, based on the tracks and feature matrices, execute the inference process of the neural network model M2 to obtain the tracking features of each target in each video image frame sequence. Use these features to compute the correlation of all targets between two adjacent video image frame sequences (again, the correlation can be computed with the Euclidean distance, the Mahalanobis distance, or similar), thereby completing target tracking over the entire video segment.
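The cross-sequence association of this step can likewise be sketched. Here the mean of a track's feature matrix stands in for the M2 tracking feature, which is purely an assumption for illustration (the patent computes these features with the trained model M2):

```python
import numpy as np

def track_feature(feature_matrix):
    """Stand-in for the M2 tracking feature: mean of a track's feature vectors."""
    return np.mean(np.asarray(feature_matrix, dtype=float), axis=0)

def link_sequences(tracks_prev, tracks_next, max_dist=1.0):
    """Link the tracks of one video image frame sequence to those of the
    adjacent sequence by nearest tracking feature (Euclidean distance)."""
    links = []
    feats_next = [track_feature(t) for t in tracks_next]
    for i, t in enumerate(tracks_prev):
        f = track_feature(t)
        dists = [float(np.linalg.norm(f - g)) for g in feats_next]
        if dists and min(dists) <= max_dist:
            links.append((i, int(np.argmin(dists))))
    return links
```

Applying this to every pair of adjacent sequences stitches the short within-sequence tracks into tracks over the entire video segment.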
In summary, the present invention is based on video image frame sequence data and combines the information of single frames with the inter-frame correlation within video image frame sequences to realize a method for detecting and tracking targets in video. Compared with target detection methods based on single frames, the present invention incorporates the correlated information between the images of a sequence, improving target detection performance. When machine-learning methods compute target-matching features from single frames, those features must distinguish many similar targets; either computing them is very expensive or their discriminative power is poor, which easily causes matching errors and tracking failure. The tracking-and-matching of the invention is therefore divided into two stages: matched tracking of targets within a video image frame sequence over a short time, and target matching between different video image frame sequences. The matching features within a video image frame sequence rely on the multi-frame image information in the sequence and the correlation within it, and their discriminative power need only cover the targets inside that sequence. Target matching between video image frame sequences mainly uses the within-sequence matching results and the targets' features in the sequences, which very effectively improves tracking accuracy. Compared with other methods, the present invention effectively reduces the amount of computation needed to complete the target detection and tracking task in video.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. It should be understood that the present invention is not limited to the implementations described herein; these implementations are described to help those skilled in the art practice the present invention.
Claims (3)
1. A method for detecting and tracking targets in video, characterized in that the method comprises the following steps:
Step 1: performing segmented sampling on the video to obtain several video image frame sequences;
Step 2: using a neural network model M1 to perform target detection and feature extraction on each video image frame sequence, the output information including: the index within the sequence of the image containing each target, the rectangular bounding box of the target in the image, and the target's feature vector;
Step 3: computing the correlation matrix of the target feature vectors corresponding to all detection results output within a video sequence, thereby obtaining the tracking results of all detected targets within the video sequence;
Step 4: ordering the sampled video image frame sequences along the time axis, inputting the target detection tracks and feature matrices of the video image frame sequences into a neural network model M2 to obtain the tracking features of each target in each video image frame sequence, and using these tracking features to compute the correlation of all targets between two adjacent video image frame sequences, thereby completing target tracking over the entire video segment.
2. The method for detecting and tracking targets in video according to claim 1, characterized in that the neural network model M1 is established in the following manner:
collecting a large number of video segments and manually annotating the target positions in the video image sequences and the ID information of each target from appearance to disappearance, to obtain an original annotated sample set;
using a deep-learning method, performing segmented sampling on each video segment in the original annotated sample set to generate several video image frame sequences, to obtain a training/test sample set;
using a deep-learning method, combined with the training/test sample set, training in a multi-task manner to obtain the neural network model M1.
3. The method for detecting and tracking targets in video according to claim 2, characterized in that the neural network model M2 is established in the following manner:
using the neural network model M1 to obtain the track and feature matrix of each target in every video image sequence of the training/test sample set;
using the target annotations in each video segment and the tracks and feature matrices obtained for each video image frame sequence by the neural network model M1, obtaining the feature sample set of each target across the different video image frame sequences of each video segment, to generate a training data set of target-matching tracking features between video image sequences;
using a deep-learning method, combined with the training data set, training to obtain the neural network model M2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810940035.1A CN108986143B (en) | 2018-08-17 | 2018-08-17 | Target detection tracking method in video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810940035.1A CN108986143B (en) | 2018-08-17 | 2018-08-17 | Target detection tracking method in video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108986143A true CN108986143A (en) | 2018-12-11 |
CN108986143B CN108986143B (en) | 2022-05-03 |
Family
ID=64553984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810940035.1A Active CN108986143B (en) | 2018-08-17 | 2018-08-17 | Target detection tracking method in video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108986143B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
BRPI0806019A2 (en) * | 2008-07-15 | 2010-08-31 | Invisys Sist S De Visao Comput Ltda | counting and tracking of people on the move based on computer vision |
CN102004920A (en) * | 2010-11-12 | 2011-04-06 | 浙江工商大学 | Method for splitting and indexing surveillance videos |
CN102750527A (en) * | 2012-06-26 | 2012-10-24 | 浙江捷尚视觉科技有限公司 | Long-time stable human face detection and tracking method in bank scene and long-time stable human face detection and tracking device in bank scene |
CN104094279A (en) * | 2014-04-30 | 2014-10-08 | 中国科学院自动化研究所 | Large-range-first cross-camera visual target re-identification method |
CN104954743A (en) * | 2015-06-12 | 2015-09-30 | 西安理工大学 | Multi-camera semantic association target tracking method |
CN105574505A (en) * | 2015-12-16 | 2016-05-11 | 深圳大学 | Human body target re-identification method and system among multiple cameras |
CN106920248A (en) * | 2017-01-19 | 2017-07-04 | 博康智能信息技术有限公司上海分公司 | A kind of method for tracking target and device |
US20180183650A1 (en) * | 2012-12-05 | 2018-06-28 | Origin Wireless, Inc. | Method, apparatus, and system for object tracking and navigation |
- 2018
- 2018-08-17 CN CN201810940035.1A patent/CN108986143B/en active Active
Non-Patent Citations (4)
Title |
---|
YINGHAO CAI 等: "Matching tracking sequences across widely separated cameras", 《2008 15TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING》 * |
吴尔杰: "监控视频中多目标检测与跟踪研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 * |
杨秋英: "相关序列小目标图像运动跟踪与仿真研究", 《系统仿真学报》 * |
王乐东 等: "序列帧间双重匹配的红外点目标跟踪算法", 《光电子·激光》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711332A (en) * | 2018-12-26 | 2019-05-03 | 浙江捷尚视觉科技股份有限公司 | A kind of face tracking method and application based on regression algorithm |
CN109711332B (en) * | 2018-12-26 | 2021-03-26 | 浙江捷尚视觉科技股份有限公司 | Regression algorithm-based face tracking method and application |
CN109934096A (en) * | 2019-01-22 | 2019-06-25 | 浙江零跑科技有限公司 | Automatic Pilot visual perception optimization method based on feature timing dependence |
CN109934096B (en) * | 2019-01-22 | 2020-12-11 | 浙江零跑科技有限公司 | Automatic driving visual perception optimization method based on characteristic time sequence correlation |
CN111862145A (en) * | 2019-04-24 | 2020-10-30 | 四川大学 | Target tracking method based on multi-scale pedestrian detection |
CN110503663A (en) * | 2019-07-22 | 2019-11-26 | 电子科技大学 | A kind of random multi-target automatic detection tracking based on pumping frame detection |
CN110503663B (en) * | 2019-07-22 | 2022-10-14 | 电子科技大学 | Random multi-target automatic detection tracking method based on frame extraction detection |
CN113033582A (en) * | 2019-12-09 | 2021-06-25 | 杭州海康威视数字技术股份有限公司 | Model training method, feature extraction method and device |
CN113033582B (en) * | 2019-12-09 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Model training method, feature extraction method and device |
Also Published As
Publication number | Publication date |
---|---|
CN108986143B (en) | 2022-05-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xiao et al. | End-to-end deep learning for person search | |
CN108986143A (en) | Target detection tracking method in a kind of video | |
Chen et al. | Real-time multiple people tracking with deeply learned candidate selection and person re-identification | |
Zheng et al. | Person re-identification in the wild | |
CN106096577B (en) | A kind of target tracking method in camera distribution map | |
Yu et al. | Fast action proposals for human action detection and search | |
CN108986064B (en) | People flow statistical method, equipment and system | |
Ding et al. | Violence detection in video by using 3D convolutional neural networks | |
Ullah et al. | AI-assisted edge vision for violence detection in IoT-based industrial surveillance networks | |
US10679067B2 (en) | Method for detecting violent incident in video based on hypergraph transition | |
CN102890781B (en) | A kind of Highlight recognition methods for badminton game video | |
Long et al. | Stand-alone inter-frame attention in video models | |
CN107145862A (en) | A kind of multiple features matching multi-object tracking method based on Hough forest | |
CN113591968A (en) | Infrared weak and small target detection method based on asymmetric attention feature fusion | |
CN111968152B (en) | Dynamic identity recognition method and device | |
CN103902966A (en) | Video interaction event analysis method and device base on sequence space-time cube characteristics | |
CN109344842A (en) | A kind of pedestrian's recognition methods again based on semantic region expression | |
Su et al. | PCG-TAL: Progressive cross-granularity cooperation for temporal action localization | |
Gao et al. | Osmo: Online specific models for occlusion in multiple object tracking under surveillance scene | |
US20190171899A1 (en) | Automatic extraction of attributes of an object within a set of digital images | |
Fang et al. | Traffic police gesture recognition by pose graph convolutional networks | |
Tsai et al. | Swin-JDE: Joint detection and embedding multi-object tracking in crowded scenes based on swin-transformer | |
Wang et al. | Fast and accurate action detection in videos with motion-centric attention model | |
Yang et al. | C-RPNs: Promoting object detection in real world via a cascade structure of Region Proposal Networks | |
Wu et al. | Track-clustering error evaluation for track-based multi-camera tracking system employing human re-identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 20231030 Address after: Room 319-2, 3rd Floor, Building 2, No. 262 Wantang Road, Xihu District, Hangzhou City, Zhejiang Province, 310012 Patentee after: Zhejiang Jiehuixin Digital Technology Co.,Ltd. Address before: 311121 East Building, building 7, No. 998, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province Patentee before: ZHEJIANG ICARE VISION TECHNOLOGY Co.,Ltd. |