CN106815576B - Target tracking method based on continuous space-time confidence map and semi-supervised extreme learning machine - Google Patents
- Publication number: CN106815576B (application number CN201710047829.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V20/46 — Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/217 — Validation; performance evaluation; active pattern learning techniques
- G06F18/24 — Classification techniques
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
Abstract
The invention discloses a target tracking method based on a continuous spatio-temporal confidence map and a semi-supervised extreme learning machine. The method exploits the fact that video image frames are continuous in time, so the position of the target to be tracked does not jump abruptly, and continuous in space, in that a specific relation exists between the target and its surrounding background; when the target's appearance changes greatly, this relation helps distinguish the target to be tracked from the background region. To address deformation and occlusion, the invention fully considers the information provided by the true target and fully mines the distribution similarity between labeled and unlabeled samples, improving tracking accuracy; a semi-supervised tracking method based on an extreme learning machine is proposed to mine this distribution similarity, and the two methods are combined in a coupled tracking framework.
Description
Technical Field
The invention relates to a target tracking method based on a continuous space-time confidence map and a semi-supervised extreme learning machine, belonging to the technical field of intelligent information processing and target tracking.
Background
Object tracking is an indispensable link in most vision systems. In specific applications (such as video surveillance), automatic, fast and highly robust target tracking is of great concern, with broad application prospects in video surveillance, traffic detection, intelligent robots, and underwater target detection and tracking.
Target tracking is an extremely important part of the field of computer vision. A moving-object tracking algorithm analyzes the image information of each frame in the video sequence to be tracked, learns the target's behavior, and obtains and marks the position of the tracked target in each video frame. Problems such as occlusion and deformation between objects, background clutter, illumination changes, and poor real-time performance and robustness remain to be solved in the tracking process. Classical tracking methods such as Meanshift and particle filtering depend on how rich the target information contained in the video is; in practical image sequences, the information provided by the target is quite limited, so the target cannot be tracked stably; for example, under deformation and occlusion in the scene, these classical algorithms largely fail.
The main problems in the prior art are therefore: (1) poor real-time performance and robustness during tracking, deficient spatio-temporal position information of the target, and indistinct target features; (2) when an occluder appears or the target to be tracked deforms in the scene, in particular when the whole target is occluded or the target deforms greatly, the tracked target may be lost.
Disclosure of Invention
The invention aims to provide a target tracking method based on a continuous space-time confidence map and a semi-supervised extreme learning machine, so as to make up for the defects of the prior art.
Video image frames are continuous in time; this temporal continuity means that the target to be tracked does not change greatly between adjacent frames and its position does not jump abruptly. Video image frames are also continuous in space; this spatial continuity means that a specific relation exists between the target and its surrounding background, and when the target's appearance changes greatly, this relation helps distinguish the target to be tracked from the background region. A tracking method based on continuous spatio-temporal confidence-map learning is therefore proposed to overcome poor real-time performance and robustness, insufficient spatio-temporal position information, indistinct target features, and similar problems. To address deformation and occlusion, the information provided by the true target is fully considered and the distribution similarity between labeled and unlabeled samples is fully mined, improving tracking accuracy; a semi-supervised tracking method based on an extreme learning machine is proposed to mine this distribution similarity, and the two methods are combined in a coupled tracking framework to realize accurate and robust tracking.
In order to achieve the purpose, the specific technical scheme adopted by the invention is realized by the following steps:
Step one: acquire an n-frame video A = {I_1, …, I_i, …, I_n} of the target to be tracked in the specific monitored scene, where I_i denotes the i-th frame of the video sequence to be tracked; preprocess the video sequence with image filtering denoising and contrast enhancement to reduce noise and highlight the region of interest to be tracked;
Step two: in the t-th frame I_t of the video sequence, select the target O to be tracked with a rectangular window and determine its center position o*, where O denotes the new target present in the scene and o denotes its position; define a two-dimensional confidence map model C_t(o) of the target to be tracked. Enlarge the target region by a factor of two to form a local background region; within it, extract the intensity-position feature w(k) = (I(k), k) at each coordinate position k to form the intensity-position feature set {w(k) = (I(k), k)}, where I(k) denotes the image intensity at position k over a neighborhood of o*. Establish the prior model P(w(k)|O) of the target for frame t and compute the frame-t spatio-temporal model;
Step three: sample overlapping region blocks around the center position of the target to be tracked, obtaining N_1 region-block images as positive samples and N_2 region-block images as negative samples; extract the positive and negative sample features x_j, with class label 1 for positive samples and 0 for negative samples, y_j ∈ {1, 0}; establish the labeled sample set X_s and the unlabeled sample set X_u, forming the training sample set X = {X_s, X_u} = {(x_j, y_j)}, j = 1, …, N_1 + N_2;
Step four, training a semi-supervised extreme learning machine network model by using the training sample set X obtained in the step three;
Step five: in frame I_{t+1}, update the model using the frame-t spatio-temporal model obtained in step two and compute the frame-(t+1) spatio-temporal model; convolve I_{t+1} with it to obtain the new target's spatio-temporal confidence map C_{t+1}(o), and maximize C_{t+1}(o) to determine the target position o in frame t+1;
step six, judging whether the target is shielded, if not, entering the step five, otherwise, entering the step seven;
Step seven: in I_{t+1}, take the position o* obtained in I_t as the target position; in the region around o*, perform overlapping sampling according to the size of the target region's rectangular window to obtain N region-block images as candidate targets; extract the candidate-target features and establish the test sample set X_{t+1} of target image blocks to be tracked; input the test set into the semi-supervised extreme learning machine network trained in step four to obtain the frame-(t+1) test output T, and take the position of the maximum classification response of the online semi-supervised extreme learning machine as the target position o in frame t+1;
Step eight: perform the update-threshold judgment of the online semi-supervised extreme learning machine network model on the maximum classification response result; if the model does not need to be updated, go to step five, otherwise go to step nine;
Step nine: take the labeled data set X_s obtained in step three together with the test sample set X_{t+1} obtained in step seven, used as the unlabeled data set X_u = X_{t+1}, and return to step four to retrain the semi-supervised extreme learning machine network model;
and repeating the steps circularly until the tracking is completed on the whole video sequence.
Further, in step three: in the region around the target center o*, perform overlapping sampling according to the size of the target region's rectangular window; let d_j denote the Euclidean distance from the j-th sampling point to the target center. When d_j < r_1, sampling yields N_1 region-block images as positive samples; when r_2 < d_j < r_3, sampling yields N_2 region-block images as negative samples, where r_1, r_2 and r_3 are the sampling radii. Extract the positive and negative sample features x_j and establish the training set of target image blocks, taking the (N_1 + N_2) image blocks as the training set X = {(x_j, y_j)}, j = 1, …, N_1 + N_2, with class label 1 for positive samples and 0 for negative samples, y_j ∈ {1, 0}. Shuffle and rearrange the samples in the training set; take a certain (usually smaller) proportion at the front as the labeled sample set X_s and the remaining (usually larger) proportion as the unlabeled sample set X_u, with X = {X_s, X_u}.
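The overlapping-sampling rule above can be sketched as follows; the grid step, window size, and helper name are illustrative assumptions rather than details fixed by the patent.

```python
import numpy as np

def sample_patches(center, win, r1, r2, r3, step=2):
    """Overlapping sampling around the tracked center (illustrative helper).

    Patch centers within distance r1 of the target center become positive
    samples; those with distance in (r2, r3] become negative samples.
    """
    cx, cy = center
    w, h = win
    pos, neg = [], []
    for dx in range(-r3, r3 + 1, step):
        for dy in range(-r3, r3 + 1, step):
            d = np.hypot(dx, dy)              # Euclidean distance to center
            rect = (cx + dx - w // 2, cy + dy - h // 2, w, h)
            if d < r1:
                pos.append(rect)              # positive: close to the center
            elif r2 < d <= r3:
                neg.append(rect)              # negative: annulus around target
    return pos, neg

pos, neg = sample_patches((100, 100), (24, 40), r1=5, r2=10, r3=20)
```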
In step four: the input weights and hidden-layer biases are set randomly; let (a, b) denote the input weight a and bias b of a hidden node. The training samples comprise the labeled data set {X_s, Y_s} and the unlabeled data set X_u, where X_s and X_u are input samples and Y_s are the output samples corresponding to X_s. The hidden-layer mapping function is g(x), which can take the form g(x) = 1/(1 + e^{-x}); the output weights are denoted β; h(x_i) = [G(a_1, b_1, x_i), …, G(a_m, b_m, x_i)] denotes the hidden-layer output row for the i-th sample, with m hidden-layer nodes; e_i denotes the learning error (residual) of the i-th input node.
The objective function of the semi-supervised extreme learning machine is:

$$\min_{\beta}\; \frac{1}{2}\|\beta\|^2 + \frac{1}{2}\sum_{i=1}^{s} C_i\|e_i\|^2 + \frac{\lambda}{2}\mathrm{Tr}\!\left(F^{T}LF\right), \quad \text{s.t. } h(x_i)\beta = y_i^{T} - e_i^{T},\ i = 1,\dots,s,\quad F = H\beta$$
where C_i denotes the penalty parameter, λ the trade-off parameter, L the graph Laplacian computed from the labeled and unlabeled data, F the output matrix of the network, and Tr(·) the trace operator;
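The patent does not specify how the graph Laplacian L is constructed from the labeled and unlabeled data, so the sketch below assumes a common choice: Gaussian similarities restricted to a k-nearest-neighbour graph (the parameters k and sigma are illustrative).

```python
import numpy as np

def graph_laplacian(X, k=5, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W over all samples.

    W uses Gaussian similarities kept only on the k strongest edges
    per node, then symmetrized (an assumed graph construction).
    """
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    keep = np.argsort(W, axis=1)[:, -k:]      # indices of k largest per row
    M = np.zeros_like(W)
    rows = np.arange(n)[:, None]
    M[rows, keep] = W[rows, keep]
    W = np.maximum(M, M.T)                    # symmetrize the kNN graph
    D = np.diag(W.sum(axis=1))                # degree matrix
    return D - W
```

By construction each row of L sums to zero and L is symmetric, which is what the manifold-regularization term Tr(FᵀLF) in the objective relies on.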
the semi-supervised extreme learning machine objective function is expressed in a matrix form as follows:
whereinIs the first s rows equal to YsThe last u rows equal to zero, C is the first s diagonal elements CiThe rest is a diagonal matrix of zero, and H is a hidden layer output matrix of the network;
the above equation is biased into β:
let the partial derivative be zero, solve to obtain an output weight β of:
when the tag data is greater than the number of nodes in the hidden layer
When the number of labeled samples is less than the number of hidden-layer nodes,

$$\beta = H^{T}\left(I_{s+u} + CHH^{T} + \lambda LHH^{T}\right)^{-1} C\tilde{Y}$$
where H^T is the transpose of the matrix H.
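The two closed-form solutions above can be sketched as follows; shapes and names are illustrative, and the branch simply switches between inverting an m×m or an n×n matrix depending on the sample count, as in the text. The two branches are algebraically equivalent (a push-through identity), which the usage below relies on.

```python
import numpy as np

def sselm_beta(H, Y_tilde, C_diag, L, lam):
    """Closed-form SS-ELM output weights (sketch of the two cases).

    H: (n, m) hidden-layer output matrix; Y_tilde: (n, c) augmented
    targets (zero rows for unlabeled samples); C_diag: (n,) penalty
    diagonal (zero for unlabeled); L: (n, n) graph Laplacian.
    """
    n, m = H.shape
    C = np.diag(C_diag)
    if n >= m:
        # more samples than hidden nodes: invert an m x m matrix
        A = np.eye(m) + H.T @ C @ H + lam * H.T @ L @ H
        return np.linalg.solve(A, H.T @ C @ Y_tilde)
    # fewer samples than hidden nodes: invert an n x n matrix instead
    A = np.eye(n) + C @ H @ H.T + lam * L @ H @ H.T
    return H.T @ np.linalg.solve(A, C @ Y_tilde)
```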
The method for judging whether the target is occluded compares the maximum of the confidence map, max C_{t+1}(o), with the occlusion threshold th_1: if max C_{t+1}(o) < th_1, the target is occluded. th_1 is the critical value for occlusion and can be changed for different scenes; when the algorithm is applied to a new scene, th_1 is adjusted manually and normally fluctuates within a certain range. When the target becomes occluded, max C_{t+1}(o) decreases rapidly, and the value after this rapid decrease is taken as th_1, from which occlusion is judged.
In step eight, the method for judging whether the online semi-supervised extreme learning machine network model needs updating compares the maximum classification response T_max with the update threshold th_2: if T_max > th_2, the network model does not need to be updated.
The invention has the following beneficial effects: it combines a continuous spatio-temporal confidence-map tracking method with a semi-supervised extreme-learning-machine tracking method, addressing poor real-time performance and robustness, deficient spatio-temporal position information, indistinct target features, and loss of the tracked target under deformation and occlusion. By thresholding the continuous spatio-temporal confidence map, a criterion for judging whether the target enters an occluded region is obtained, effectively solving the occlusion-judgment problem; by thresholding the maximum response output by the semi-supervised extreme learning machine network, a criterion for judging whether the network model needs updating is obtained, effectively solving the problem of poor model generalization. The invention greatly improves tracking accuracy and realizes tracking with high accuracy and strong robustness.
Drawings
FIG. 1 is a schematic diagram of an overall tracking process according to the present invention.
FIG. 2 is a region marker map of an object to be tracked in an embodiment.
FIG. 3 is a block diagram of a method for object tracking based on continuous spatiotemporal confidence maps in an exemplary embodiment.
Fig. 4 is a basic framework diagram of a semi-supervised extreme learning machine network.
FIG. 5 is a block diagram of a method for target tracking based on semi-supervised extreme learning in an exemplary embodiment.
Fig. 6 is an example of tracking effect under occlusion in the specific embodiment, where (a) in fig. 6 is a video frame with an object of interest to be tracked, and (b) in fig. 6, (c) in fig. 6, (d) in fig. 6, (e) in fig. 6, and (f) in fig. 6 are video frames for tracking the object of interest after the frame (a) in fig. 6, respectively.
Detailed Description
In order to make the objects, embodiments and advantages of the present invention clearer, the present invention is further illustrated by the following specific examples in conjunction with the accompanying drawings.
The specific flow chart of the invention is shown in fig. 1.
In this embodiment, a segment of classical corridor surveillance video, caviar (384 × 288 pixels, 25 frames per second), is specifically adopted as the video to be tracked.
Step one: preprocess the video sequence to be tracked with image filtering denoising and contrast enhancement, reducing noise and highlighting the region of interest to be tracked. Specifically:
Step 1-1: define the classical corridor surveillance video caviar as A and split it into frames, obtaining a 200-frame video sequence A = {I_1, …, I_i, …, I_200}, where I_i denotes the i-th frame of the corridor surveillance video caviar;
Step 1-2: apply filtering denoising and contrast-enhancement preprocessing to the 200-frame image sequence.
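The preprocessing of step 1-2 can be sketched as follows; the 3×3 mean filter and percentile contrast stretch are assumed stand-ins, as the patent does not fix the specific filter or enhancement.

```python
import numpy as np

def preprocess(frame, lo=2, hi=98):
    """Denoise with a 3x3 box mean and stretch contrast to [0, 255].

    A minimal stand-in for 'filtering denoising and contrast
    enhancement'; the percentile stretch is an assumption.
    """
    f = frame.astype(np.float64)
    # 3x3 mean filter via shifted sums (edges wrap via np.roll)
    k = [-1, 0, 1]
    sm = sum(np.roll(np.roll(f, dy, 0), dx, 1) for dy in k for dx in k) / 9.0
    p_lo, p_hi = np.percentile(sm, [lo, hi])
    out = np.clip((sm - p_lo) / max(p_hi - p_lo, 1e-9), 0.0, 1.0)
    return (out * 255.0).astype(np.uint8)
```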
Step two: in the first frame I_{t=1} of the video sequence to be tracked, select the target O to be tracked and determine its center position o*, where O denotes the new target in the scene and o denotes its position; define the two-dimensional confidence map model C_t(o) of the target to be tracked; establish the prior model P(w(k)|O) of the target for frame t and compute the frame-t spatio-temporal model, as shown in Fig. 3. Specifically:
Step 2-1: in I_{t=1}, the user selects the target of interest O to be tracked with a rectangular window of width w and height h; o denotes the position of the new target. Enlarge the target region by a factor of two to form the local background region, as shown in Fig. 2; within the local background region, extract the intensity-position feature w(k) = (I(k), k) at each coordinate position k to form the intensity-position feature set, where I(k) denotes the image intensity at position k over a neighborhood of o*;
Step 2-2: convert the tracking problem into computing a confidence map over the position of the target to be tracked:

$$C_t(o) = \sum_{k} h_t^{stc}(o - k)\, P(w(k) \mid O)$$

where C_t(o) denotes the confidence map model of frame t: the closer the new target position o is to the old target position o*, the larger the confidence value. h_t^{stc} denotes the spatio-temporal model, describing the relative position and direction between the new target O and a coordinate point k of the local background region; P(w(k)|O) denotes the prior model, describing the intensity and relative position and direction of coordinate point k in the local background region around the old target position, modelling the low-level contour information of the target O to be tracked;
Step 2-5: from the frame t = 1 confidence map model C_t(o) of I_{t=1} and the prior model P(w(k), O), compute the spatio-temporal model of the target of interest for frame t = 1:

$$h_t^{stc} = \mathcal{F}^{-1}\!\left(\frac{\mathcal{F}\!\left(C_t(o)\right)}{\mathcal{F}\!\left(P(w(k) \mid O)\right)}\right)$$

where F(·) denotes the fast Fourier transform and F^{-1}(·) the inverse fast Fourier transform.
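The frequency-domain estimation of the spatio-temporal model can be sketched as follows; the small regularizer eps in the element-wise division is an assumption to avoid division by zero, since the patent only states the FFT and inverse FFT.

```python
import numpy as np

def learn_stc_model(conf_map, prior, eps=1e-6):
    """Estimate the spatio-temporal model in the frequency domain.

    Solves conf = h (*) prior (circular convolution) for h by
    element-wise division of the 2-D FFTs; eps regularizes the
    division (an assumed detail).
    """
    Hf = np.fft.fft2(conf_map) / (np.fft.fft2(prior) + eps)
    return np.real(np.fft.ifft2(Hf))
```

With eps = 0 and a prior whose spectrum has no zeros, this recovers the model exactly from its convolution with the prior.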
Step three: sample overlapping region blocks around the center position of the target to be tracked, obtaining N_1 region-block images as positive samples and N_2 as negative samples; extract the positive and negative sample features x_j; establish the labeled sample set X_s and the unlabeled sample set X_u to form the training set X = {(x_j, y_j)}, j = 1, …, N_1 + N_2, as shown in Fig. 5. Specifically:
Step 3-1: in the region around the center o*, perform overlapping sampling according to the size of the target region's rectangular window; let d_j be the Euclidean distance from the j-th sampling point to the target center. When d_j < r_1, sampling yields 45 region-block images as positive samples; when r_2 < d_j < r_3, sampling yields 31 region-block images as negative samples; the sampling radii r_1, r_2 and r_3 are set to 5, 10 and 20 pixels, respectively;
Step 3-2: extract the positive and negative sample features x_j and establish the training set of target image blocks to be tracked, taking the 76 image blocks as the training set X = {(x_j, y_j)}, j = 1, …, 76, with class label 1 for positive samples and 0 for negative samples, y_j ∈ {1, 0};
Step 3-3: shuffle and rearrange the order of the samples in the training set; take the first 50 samples as the labeled sample set X_s and the remaining 26 samples as the unlabeled sample set X_u, with X = {X_s, X_u}.
Step four: train the online semi-supervised extreme learning machine network model with the training set X obtained in step three. Specifically:
Step 4-1: the semi-supervised extreme learning machine is a single-hidden-layer feedforward neural network, as shown in Fig. 4, consisting of three layers: an input layer, a hidden layer and an output layer. The input weights and hidden-layer biases are set randomly, independent of the training samples, so the algorithm is structurally simple and computationally efficient; (a, b) denotes the input weight a and bias b of a hidden node. The training samples comprise the labeled data set {X_s, Y_s} and the unlabeled data set X_u, where X_s and X_u are input samples and Y_s are the output samples corresponding to X_s. The hidden-layer mapping function is g(x), which can take the form g(x) = 1/(1 + e^{-x}); the output weights are denoted β; h(x_i) = [G(a_1, b_1, x_i), …, G(a_2000, b_2000, x_i)] denotes the hidden-layer output row for the i-th sample, with 2000 hidden-layer nodes; e_i denotes the learning error (residual) of the i-th input node;
Step 4-2: the objective function of the semi-supervised extreme learning machine to be trained is:

$$\min_{\beta}\; \frac{1}{2}\|\beta\|^2 + \frac{1}{2}\sum_{i=1}^{50} C_i\|e_i\|^2 + \frac{\lambda}{2}\mathrm{Tr}\!\left(F^{T}LF\right), \quad \text{s.t. } h(x_i)\beta = y_i^{T} - e_i^{T},\quad F = H\beta$$

where C_i denotes the penalty parameter, λ the trade-off parameter, L the graph Laplacian computed from the labeled and unlabeled data, F the output matrix of the network, and Tr(·) the trace operator;
Step 4-3: the objective function of the semi-supervised extreme learning machine is expressed in matrix form as:

$$\min_{\beta}\; \frac{1}{2}\|\beta\|^2 + \frac{1}{2}\left\|C^{1/2}\left(\tilde{Y} - H\beta\right)\right\|^2 + \frac{\lambda}{2}\mathrm{Tr}\!\left(\beta^{T}H^{T}LH\beta\right)$$

where $\tilde{Y}$ is the augmented output-label matrix whose first 50 rows equal Y_s and whose last 26 rows are zero, and C is a diagonal matrix whose first 50 diagonal elements are C_i and whose remaining entries are zero;
Step 4-4: take the partial derivative of the above with respect to β:

$$\nabla_{\beta} = \beta - H^{T}C\left(\tilde{Y} - H\beta\right) + \lambda H^{T}LH\beta$$

Step 4-5: set the partial derivative to zero and solve to obtain the output weight β:

When the number of labeled samples is greater than the number of hidden-layer nodes,

$$\beta = \left(I_m + H^{T}CH + \lambda H^{T}LH\right)^{-1} H^{T}C\tilde{Y}$$

When the number of labeled samples is less than the number of hidden-layer nodes,

$$\beta = H^{T}\left(I_{s+u} + CHH^{T} + \lambda LHH^{T}\right)^{-1} C\tilde{Y}$$

where H^T is the transpose of the matrix H. This completes the training of the semi-supervised extreme learning machine network model.
Step five: in frame t + 1, update the model using the frame-t spatio-temporal model obtained in step two and compute the frame-(t+1) spatio-temporal model; convolve the image I_{t+1} with it to obtain the new target's spatio-temporal confidence map C_{t+1}(o), and maximize C_{t+1}(o) to determine the target position o in frame t + 1, as shown in Fig. 3. Specifically:
Step 5-1: in I_{t+1}, take a local background region twice the target size around the target position o*, and extract the intensity-position features within the region to form the intensity-position feature set;
Step 5-2: update the spatio-temporal model of the target of interest to be tracked in frame t:

$$H_{t+1}^{stc} = (1 - \rho)\,H_t^{stc} + \rho\, h_t^{stc}$$

where ρ is the learning rate and h_t^{stc} is the spatio-temporal model of the target of interest computed in frame t. In the frequency domain, this update is expressed as

$$F_w = \frac{\rho}{e^{jw} - (1 - \rho)}$$

where j is the imaginary unit;
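The update of step 5-2 is a first-order low-pass (IIR) filter over time; a minimal sketch, with the learning rate value an illustrative assumption:

```python
import numpy as np

def update_stc_model(H_prev, h_curr, rho=0.075):
    """Temporal low-pass update of the spatio-temporal context model.

    H_{t+1} = (1 - rho) * H_t + rho * h_t, i.e. a first-order IIR
    filter whose frequency response is rho / (e^{jw} - (1 - rho));
    rho = 0.075 is an assumed default, not fixed by the patent.
    """
    return (1.0 - rho) * H_prev + rho * h_curr
```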
Step 5-3: compute the frame-(t+1) confidence map of the target of interest to be tracked:

$$C_{t+1}(o) = \mathcal{F}^{-1}\!\left(\mathcal{F}\!\left(H_{t+1}^{stc}\right) \odot \mathcal{F}\!\left(P(w(k) \mid O)\right)\right)$$

Step 5-4: in frame t + 1, the position o of the target of interest maximizes the frame-(t+1) confidence map:

$$o = \arg\max_{o}\, C_{t+1}(o)$$
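The maximization of step 5-4 is a simple argmax over the confidence map; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def locate_target(conf_map):
    """Target position o = argmax of the frame-(t+1) confidence map."""
    idx = np.argmax(conf_map)
    return np.unravel_index(idx, conf_map.shape)  # (row, col)
```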
Step six: judge whether the target is occluded. Specifically:
Step 6-1: compare the maximum confidence value max C_{t+1}(o) obtained in step 5-4 with the occlusion threshold th_1; if max C_{t+1}(o) < th_1, the target is occluded. th_1 is the critical value for occlusion and can be changed for different scenes; when the algorithm is applied to a new scene, th_1 is adjusted manually and normally fluctuates within a certain range. When the target becomes occluded, max C_{t+1}(o) decreases rapidly, and the value after this rapid decrease is taken as th_1; this value is used in this embodiment.
Step 6-2: if max C_{t+1}(o) ≥ th_1, the target is not occluded and step 5-1 is performed; otherwise step 7-1 is performed.
Step seven: in frame t + 1, perform overlapping sampling around the target center tracked in frame t, extract the candidate-target features, establish the test samples of target image blocks to be tracked, and input them into the trained online semi-supervised extreme learning machine; the position of the maximum classification response among the test samples is the predicted new target position, as shown in Fig. 5. Specifically:
Step 7-1: for the frame-(t+1) image, take o* as the target position; in the region around the center o*, perform overlapping sampling according to the size of the target region's rectangular window, with d_j the Euclidean distance from the j-th sampling point to o*. When d_j < r_1, sampling yields 232 region-block images as candidate targets, i.e. test data; extract the sample features and record the test set as X_{t+1}; the sampling radius r_1 is set to 20 pixels;
Step 7-2: the test output is

$$T = H^{*}\beta$$

where β is the output weight computed in frame t and H^{*} is the hidden-layer output matrix of the test samples.
Step 7-3: in frame t + 1, the position o of the target of interest to be tracked is the position of the maximum classification response of the frame-(t+1) semi-supervised extreme learning machine:

$$o = \arg\max T$$

The maximum classification response value is denoted T_max.
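Steps 7-2 through 8-1 can be sketched together: score the candidate patches, take the argmax as the new position, and compare T_max with th_2 to decide whether retraining is needed (the function name and default threshold are illustrative).

```python
import numpy as np

def classify_candidates(H_test, beta, th2=0.0):
    """Score candidate patches with the trained SS-ELM.

    T = H* beta; the candidate with the largest response is the new
    target, and T_max > th2 means the model need not be retrained.
    """
    T = H_test @ beta
    j = int(np.argmax(T))
    T_max = float(T[j])
    return j, T_max, T_max > th2
```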
Step eight: perform the update-threshold judgment of the online semi-supervised extreme learning machine network model on the maximum classification response result. Specifically:
Step 8-1: compare the maximum classification response T_max with the semi-supervised extreme learning machine update threshold th_2; if T_max > th_2, the online semi-supervised extreme learning machine network model does not need to be updated. th_2 denotes the update threshold; in this embodiment, th_2 = 0.
Step 8-2: if T_max > 0, the online semi-supervised extreme learning machine network model does not need to be updated and step 5-1 is performed; otherwise step nine is performed.
Step nine: retrain the online semi-supervised extreme learning machine network model, as shown in Fig. 5, as follows: take the labeled data set X_s obtained in step 3-3 and the test set X_{t+1} obtained in step 7-1 as the unlabeled data set X_u = X_{t+1}, and return to step 4-1 to retrain the online semi-supervised extreme learning machine network model.
And repeating the steps circularly until the tracking of the whole monitoring video sequence to be tracked is completed.
For the surveillance video to be tracked above, Table 1 compares the tracking performance of the particle filter method, the Meanshift method and the method of the invention. The method of the invention is superior to both in the center-position deviation result and in the deviation mean square error result, demonstrating accurate and robust target tracking.
Table 1 shows the comparison of tracking performance of particle filter, Meanshift and the method of the present invention
| | Particle filter | Meanshift | The method of the invention |
|---|---|---|---|
| Center position deviation | 75.4796 | 22.9740 | 10.1834 |
| Deviation mean square error | 47.8903 | 12.2607 | 7.9702 |
Fig. 6 shows an example of the tracking effect under occlusion in this embodiment; the target is tracked accurately through two severe occlusions, further demonstrating the accuracy and robustness of the method of the invention.
The above is the preferred embodiment of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (5)
1. A target tracking method based on a continuous space-time confidence map and a semi-supervised extreme learning machine is characterized by comprising the following steps:
step one, collecting n frames of target video A to be tracked in a specific monitored scene to be tracked, wherein the video A is equal to { I }1,…,Ii,…InIn which IiRepresenting the ith frame to-be-tracked video image sequence, and preprocessing the to-be-tracked video sequence by utilizing image filtering denoising and contrast enhancementNoise reduction and highlighting of the region of interest to be tracked;
step two, in the t-th frame I_t of the video image sequence to be tracked, selecting the target O to be tracked with a rectangular window and determining the target center position o*, where O denotes the presence of the target in the scene and o denotes the target position, and defining a two-dimensional confidence map model C_t(o) of the target O to be tracked; enlarging the target region to be tracked twofold to form a local background region, denoted Ω_c(o*), which is a neighborhood of the coordinate o*; within Ω_c(o*), extracting the intensity-position feature w(k) = (I(k), k) at each coordinate position k to form an intensity-position feature set, where I(k) denotes the image intensity at coordinate position k; establishing the prior model P(w(k) | O) of the target to be tracked in the t-th frame, and computing the t-th frame spatio-temporal model h_t^stc;
step three, performing overlapped sampling in the region around the center position of the target to be tracked to obtain N_1 region-block images as positive samples and N_2 region-block images as negative samples, and extracting the positive and negative sample data features x_j, with class label 1 for positive samples and class label 0 for negative samples, y_j ∈ {1, 0}; establishing the labeled sample set X_s and the unlabeled sample set X_u, forming the training sample set X = {X_s, X_u} = {(x_j, y_j)}, j = 1, …, N_1 + N_2;
step four, training a semi-supervised extreme learning machine network model with the training sample set X obtained in step three;
step five, in step It+1In the second step, the t frame space-time model obtained in the second step is usedUpdating the model, and calculating to obtain the space-time model of the t +1 th frameUsing the obtained t +1 th frame space-time modelConvolution It+1Obtaining a space-time confidence map C of the new targett+1(o), maximize Ct+1(o) determining a target position o in the t +1 th frame;
step six, judging whether the target is occluded; if not, proceeding to step five, otherwise proceeding to step seven;
step seven, in frame I_{t+1}, taking the position o* obtained in I_t as the target position; in the region around o*, performing overlapped sampling according to the size of the target-region rectangular window to obtain N region-block images as candidate targets, and extracting the candidate target data features x_j^{t+1}; establishing the test sample set X_{t+1} of target image blocks to be tracked; inputting the test sample set into the semi-supervised extreme learning machine network model trained in step four to obtain the (t+1)-th frame test output T, and taking the maximum classification response position of the semi-supervised extreme learning machine network model as the target position o in the (t+1)-th frame;
step eight, judging against the update threshold of the semi-supervised extreme learning machine network model according to the maximum classification response result; if the semi-supervised extreme learning machine network model does not need to be updated, proceeding to step five, otherwise proceeding to step nine;
step nine, taking the labeled data set X_s obtained in step three together with the test sample set X_{t+1} obtained in step seven as the unlabeled data set X_u = X_{t+1}, and returning to step four to retrain the semi-supervised extreme learning machine network model;
and repeating the above steps cyclically until tracking is completed over the entire video sequence.
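The convolution-and-maximize localization of step five can be sketched as follows. This is a minimal illustration, assuming the spatio-temporal model and the context prior are already available as 2-D arrays; the function names are ours, not the patent's.

```python
import numpy as np

def confidence_map(h_stc, context_prior):
    """Spatio-temporal confidence map as the circular convolution of the
    spatio-temporal model with the context prior, computed in the Fourier
    domain, where convolution becomes element-wise multiplication."""
    return np.real(np.fft.ifft2(np.fft.fft2(h_stc) * np.fft.fft2(context_prior)))

def locate_target(conf):
    """Target position in the new frame = argmax of the confidence map."""
    return np.unravel_index(np.argmax(conf), conf.shape)
```

With a delta-function model the confidence map reduces to the prior itself, so the located position coincides with the prior's peak, which gives a quick sanity check of the Fourier-domain convolution.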
2. The target tracking method according to claim 1, wherein step three specifically comprises: in the region around the center position o* of the target to be tracked, performing overlapped sampling according to the size of the target-region rectangular window, the Euclidean distance from the j-th sampling point to the target center position being d_j; when d_j ≤ r_1, sampling yields N_1 region-block images as positive samples, and when r_2 ≤ d_j ≤ r_3, sampling yields N_2 region-block images as negative samples, where r_1, r_2 and r_3 are sampling radii; extracting the positive and negative sample data features x_j and establishing a training sample set of target image blocks to be tracked, taking the (N_1 + N_2) target image blocks as the training sample set X = {(x_j, y_j)}, j = 1, …, N_1 + N_2, with class label 1 for positive samples and class label 0 for negative samples, y_j ∈ {1, 0}; shuffling the order of the samples in the training sample set, taking a fixed proportion of the leading samples as the labeled sample set X_s and the remaining samples as the unlabeled sample set X_u, with X = {X_s, X_u}.
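The distance-based labeling rule of claim 2 can be sketched as below. This is a minimal illustration assuming the candidate sample centers are already known; the radii r1 < r2 < r3 are hypothetical values.

```python
import numpy as np

def label_samples(center, points, r1, r2, r3):
    """Label candidate sample centers by Euclidean distance d_j to the
    target center o*: d_j <= r1 -> positive (label 1),
    r2 <= d_j <= r3 -> negative (label 0); anything else is discarded."""
    center = np.asarray(center, dtype=float)
    labeled = []
    for p in points:
        d = np.linalg.norm(np.asarray(p, dtype=float) - center)
        if d <= r1:
            labeled.append((tuple(p), 1))
        elif r2 <= d <= r3:
            labeled.append((tuple(p), 0))
    return labeled
```

The annular gap between r1 and r2 skips ambiguous blocks that partially overlap the target, so positives and negatives stay well separated.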
3. The target tracking method according to claim 1, wherein step four specifically comprises: setting the input weights and hidden-layer biases randomly, (a, b) denoting the input weight a and bias b of a hidden-layer node; taking as training samples the labeled data set {X_s, Y_s} = {(x_i, y_i)}, i = 1, …, s, and the unlabeled data set X_u = {x_i}, i = 1, …, u, where X_s and X_u denote input samples and Y_s denotes the output samples corresponding to X_s; the hidden-layer mapping function is g(x), which may take the form g(x) = 1/(1 + e^{-x}); the output weights are denoted β; h(x_i) = [G(a_1, b_1, x_i), …, G(a_m, b_m, x_i)] denotes the hidden-layer output row vector for the i-th sample, the number of hidden-layer nodes is m, and e_i denotes the learning error of the i-th input node;
the objective function of the semi-supervised extreme learning machine is as follows:

min over β of (1/2)‖β‖² + (1/2) Σ_{i=1}^{s} C_i ‖e_i‖² + (λ/2) Tr(Fᵀ L F)

subject to h(x_i)β = y_i − e_i, i = 1, …, s, and f_i = h(x_i)β, i = 1, …, s + u
wherein C_i denotes a penalty parameter, λ denotes a balance parameter, L is the graph Laplacian computed from the labeled and unlabeled data, F is the output matrix of the network, and Tr(·) denotes the trace operation;
the semi-supervised extreme learning machine objective function is expressed in a matrix form as follows:
whereinIs the first s rows equal to YsThe last u rows equal to zero, C is the first s diagonal elements CiThe remaining is the diagonal matrix of zero; h is the hidden layer output matrix of the network;
the above equation is biased into β:
let the partial derivative be zero, solve to obtain an output weight β of:
when the tag data is greater than the number of nodes in the hidden layer
when the number of labeled data is less than the number of hidden-layer nodes:

β = Hᵀ (I_{s+u} + C H Hᵀ + λ L H Hᵀ)⁻¹ C Ỹ
wherein Hᵀ is the transpose of the matrix H.
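The two closed-form solutions above can be sketched in NumPy as follows. This is an illustrative implementation of the stated formulas, switching between the primal and dual forms on the sample-count versus hidden-node comparison; the variable names are ours, not the patent's.

```python
import numpy as np

def sselm_output_weights(H, Y_tilde, C_diag, L, lam):
    """Closed-form SS-ELM output weights beta.
    H: (s+u, m) hidden-layer output matrix.
    Y_tilde: (s+u, n0) targets, zero rows for unlabeled samples.
    C_diag: (s+u,) penalty parameters C_i, zero for unlabeled samples.
    L: (s+u, s+u) graph Laplacian; lam: balance parameter lambda."""
    su, m = H.shape
    C = np.diag(C_diag)
    if su >= m:
        # primal form: beta = (I_m + H^T C H + lam H^T L H)^-1 H^T C Y~
        A = np.eye(m) + H.T @ C @ H + lam * H.T @ L @ H
        return np.linalg.solve(A, H.T @ C @ Y_tilde)
    # dual form: beta = H^T (I_{s+u} + C H H^T + lam L H H^T)^-1 C Y~
    A = np.eye(su) + C @ H @ H.T + lam * L @ H @ H.T
    return H.T @ np.linalg.solve(A, C @ Y_tilde)
```

Both branches are algebraically equivalent (via the push-through identity (I + HᵀMH)⁻¹Hᵀ = Hᵀ(I + MHHᵀ)⁻¹ with M = C + λL); the switch only changes the size of the linear system being solved, m versus s + u.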
4. The target tracking method according to claim 1, wherein in step six, whether the target is occluded is judged by comparing the confidence-map maximum max C_{t+1}(o) with an occlusion threshold th_1: if max C_{t+1}(o) < th_1, the target is occluded; th_1 denotes the occlusion critical value and can be changed for different scenes, th_1 being adjusted manually when the algorithm is applied to a new scene; max C_{t+1}(o) normally fluctuates within a certain range and decreases rapidly when the target is occluded; the value after the rapid decrease is defined as th_1, and whether the target is occluded is judged according to this value.
5. The target tracking method according to claim 1, wherein in step eight, whether the semi-supervised extreme learning machine network model needs to be updated is judged by comparing the maximum classification response T_max with an update threshold th_2: if T_max > th_2, the semi-supervised extreme learning machine network model does not need to be updated; otherwise, the network model is updated according to the threshold th_2.
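The two threshold tests in claims 4 and 5 amount to simple comparisons that drive the tracker's control flow between steps five, seven, and nine; a minimal sketch, with th1 and th2 as scene-dependent values, per the claims:

```python
def is_occluded(conf_max, th1):
    """Claim 4: the target is judged occluded when the confidence-map
    maximum drops below the occlusion threshold th1."""
    return conf_max < th1

def needs_model_update(t_max, th2):
    """Claim 5: the SS-ELM network model is retrained unless the maximum
    classification response T_max exceeds the update threshold th2."""
    return not (t_max > th2)
```

When `is_occluded` returns False the tracker stays in the STC loop (step five); when it returns True the SS-ELM detector takes over (step seven), and `needs_model_update` then decides whether to retrain (step nine).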
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710047829.0A CN106815576B (en) | 2017-01-20 | 2017-01-20 | Target tracking method based on continuous space-time confidence map and semi-supervised extreme learning machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106815576A CN106815576A (en) | 2017-06-09 |
CN106815576B true CN106815576B (en) | 2020-07-07 |
Family
ID=59111417
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710047829.0A Active CN106815576B (en) | 2017-01-20 | 2017-01-20 | Target tracking method based on continuous space-time confidence map and semi-supervised extreme learning machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106815576B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107423762A (en) * | 2017-07-26 | 2017-12-01 | 江南大学 | Semi-supervised fingerprinting localization algorithm based on manifold regularization |
CN109255321B (en) * | 2018-09-03 | 2021-12-10 | 电子科技大学 | Visual tracking classifier construction method combining history and instant information |
CN110211104B (en) * | 2019-05-23 | 2023-01-06 | 复旦大学 | Image analysis method and system for computer-aided detection of lung mass |
CN110675382A (en) * | 2019-09-24 | 2020-01-10 | 中南大学 | Aluminum electrolysis superheat degree identification method based on CNN-LapseLM |
CN113378673B (en) * | 2021-05-31 | 2022-09-06 | 中国科学技术大学 | Semi-supervised electroencephalogram signal classification method based on consistency regularization |
CN113408984B (en) * | 2021-06-21 | 2024-03-22 | 北京思路智园科技有限公司 | Dangerous chemical transportation tracking system and method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8254633B1 (en) * | 2009-04-21 | 2012-08-28 | Videomining Corporation | Method and system for finding correspondence between face camera views and behavior camera views |
CN102663453B (en) * | 2012-05-03 | 2014-05-14 | 西安电子科技大学 | Human motion tracking method based on second generation Bandlet transform and top-speed learning machine |
CN103942749B (en) * | 2014-02-24 | 2017-01-04 | 西安电子科技大学 | A kind of based on revising cluster hypothesis and the EO-1 hyperion terrain classification method of semi-supervised very fast learning machine |
CN104992453B (en) * | 2015-07-14 | 2018-10-23 | 国家电网公司 | Target in complex environment tracking based on extreme learning machine |
CN106296734B (en) * | 2016-08-05 | 2018-08-28 | 合肥工业大学 | Method for tracking target based on extreme learning machine and boosting Multiple Kernel Learnings |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106815576B (en) | Target tracking method based on continuous space-time confidence map and semi-supervised extreme learning machine | |
CN107016357B (en) | Video pedestrian detection method based on time domain convolutional neural network | |
US20210248378A1 (en) | Spatiotemporal action detection method | |
CN108830285B (en) | Target detection method for reinforcement learning based on fast-RCNN | |
CN106971152B (en) | Method for detecting bird nest in power transmission line based on aerial images | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN108320306B (en) | Video target tracking method fusing TLD and KCF | |
CN111582349B (en) | Improved target tracking algorithm based on YOLOv3 and kernel correlation filtering | |
CN107633226A (en) | A kind of human action Tracking Recognition method and system | |
CN112364865B (en) | Method for detecting small moving target in complex scene | |
CN114241511B (en) | Weak supervision pedestrian detection method, system, medium, equipment and processing terminal | |
CN109685045A (en) | A kind of Moving Targets Based on Video Streams tracking and system | |
CN112884712A (en) | Method and related device for classifying defects of display panel | |
Jiang et al. | A self-attention network for smoke detection | |
de Silva et al. | Towards agricultural autonomy: crop row detection under varying field conditions using deep learning | |
CN115375737B (en) | Target tracking method and system based on adaptive time and serialized space-time characteristics | |
CN112036300B (en) | Moving target detection method based on multi-scale space-time propagation layer | |
CN111242026A (en) | Remote sensing image target detection method based on spatial hierarchy perception module and metric learning | |
CN111091101A (en) | High-precision pedestrian detection method, system and device based on one-step method | |
CN111340842A (en) | Correlation filtering target tracking algorithm based on joint model | |
CN111429485B (en) | Cross-modal filtering tracking method based on self-adaptive regularization and high-reliability updating | |
Wang et al. | An efficient attention module for instance segmentation network in pest monitoring | |
Kadim et al. | Deep-learning based single object tracker for night surveillance. | |
Chen et al. | A unified model sharing framework for moving object detection | |
CN107729811B (en) | Night flame detection method based on scene modeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||