CN110659566A - Target tracking method and system in an occluded state


Info

Publication number
CN110659566A
Authority
CN
China
Prior art keywords
target
picture
occlusion
tracking
candidate
Prior art date
Legal status
Granted
Application number
CN201910754803.9A
Other languages
Chinese (zh)
Other versions
CN110659566B (en)
Inventor
王海华
马福齐
Current Assignee
Chongqing Terminus Technology Co Ltd
Original Assignee
Chongqing Terminus Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Chongqing Terminus Technology Co Ltd
Priority to CN201910754803.9A
Publication of CN110659566A
Application granted
Publication of CN110659566B
Legal status: Active
Anticipated expiration

Classifications

    • G06V20/42: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items, of sport video content
    • G06F18/22: Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06T7/11: Image analysis; region-based segmentation
    • G06T7/136: Image analysis; segmentation or edge detection involving thresholding
    • G06T7/215: Image analysis; motion-based segmentation
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G06T2207/20081: Indexing scheme for image analysis; training or learning
    • G06T2207/20084: Indexing scheme for image analysis; artificial neural networks [ANN]

Abstract

The invention provides a target tracking method for targets in an occluded state, comprising the following steps: S1, selecting a tracking target in the 1st video picture frame, and extracting the picture features of the tracking target in that frame; S2, selecting one or more candidate targets from the continuous video picture frames (frame 2, frame 3, …, frame n); S3, judging whether a candidate target selected in S2 is in an occluded state; S4, when the candidate target is occluded, performing de-occlusion reconstruction on it with a GAN neural network; and S5, when the candidate target is unoccluded, or after its de-occlusion reconstruction, performing general target tracking processing on it and confirming the candidate target in the occluded picture frame as the tracking target. A matching system is designed on the basis of the method. By using a GAN neural network to de-occlude and reconstruct occluded candidate targets, the system keeps target tracking usable in pedestrian and traffic flows.

Description

Target tracking method and system in an occluded state
Technical Field
The invention relates to the technical field of video surveillance, and in particular to a target tracking method and system for targets in an occluded state.
Background
With the continuous expansion of video surveillance systems, target tracking, a basic technology of the video surveillance field, has kept developing and advancing, and is applied in emergency scenarios such as tracking suspicious persons and offending vehicles.
Target tracking means analyzing continuous video picture frames and extracting the same target from them, such as the same person, vehicle or animal; it is a basic technology of the video surveillance field. The general method of target tracking is as follows: first, extract the features of the tracking target, i.e. select the tracking target in the first video picture frame and extract its picture features, such as one or more of color distribution features, texture features and edge features; then extract the features of candidate targets, i.e. select one or more candidate targets in the subsequent continuous picture frames and extract their picture features; finally, compare the picture feature similarity of the tracking target and each candidate target, and if the similarity is greater than a threshold, confirm the candidate target in the subsequent video picture frame as the tracking target.
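As an illustration only, the following Python sketch shows this general pipeline with OpenCV color-histogram features; the function names, the HSV-histogram feature choice and the 0.8 threshold are assumptions made for the example, not values fixed by the patent.

```python
import cv2

def extract_color_histogram(patch, bins=16):
    """Color-distribution feature: a normalized 2-D HSV histogram of the patch."""
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
    cv2.normalize(hist, hist)
    return hist

def is_same_target(target_patch, candidate_patch, threshold=0.8):
    """General tracking decision: the candidate is confirmed as the tracking
    target when the feature similarity (histogram correlation) exceeds the
    threshold."""
    similarity = cv2.compareHist(extract_color_histogram(target_patch),
                                 extract_color_histogram(candidate_patch),
                                 cv2.HISTCMP_CORREL)
    return similarity > threshold, similarity
```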
For the above pipeline, a troublesome problem is that part, or even most, of the tracking target's area may be blocked by other objects; for example, a person or vehicle serving as the tracking target is blocked by other persons or vehicles in pedestrian and traffic flows. Even if a selected candidate target actually is the tracking target, the occluder distorts the extracted picture features, so their similarity to the tracking target's picture features cannot exceed the threshold, which easily causes recognition failure.
At present, target tracking technology can only track people and vehicles whose tracking target appears clearly in the video picture; it cannot track a tracking target that is partially or mostly occluded, yet occlusion of the tracking target is an unavoidable and common phenomenon.
A GAN is a generative adversarial neural network. Compared with other generative models, a GAN uses only back-propagation yet can generate clearer and more realistic samples. It is trained in an unsupervised manner and can be widely used in the fields of unsupervised and semi-supervised learning. For example, when a GAN is applied to scenarios such as image completion, only one reference is needed: a discriminator performs the discrimination and the rest is left to adversarial training, which effectively avoids the difficulty of designing a loss function.
Therefore, how to combine a GAN with target tracking technology to achieve de-occlusion of an occluded tracked target, and thereby make target tracking workable in real pedestrian and traffic flow scenes, is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides a target tracking method and system for the occluded state, which use a GAN neural network to perform de-occlusion reconstruction of an occluded candidate target, then extract the picture features of the reconstructed candidate target and compare their similarity with the picture features of the tracked target, thereby completing target tracking of the occluded candidate target.
In order to achieve the above purpose, the invention adopts the following technical solution:
a target tracking method in an occlusion state comprises the following steps:
s1, selecting a tracking target from the 1 st frame of video picture frame, and extracting the picture characteristics of the tracking target in the picture frame;
s2, selecting one or more candidate targets from the continuous video picture frames of the 2 nd frame, the 3 rd frame, the … th frame and the nth frame;
s3, judging whether the candidate target selected in the S2 is in an occlusion state;
s4, when the candidate target is in an occlusion state, carrying out occlusion removing reconstruction on the candidate target by using a GAN neural network;
and S5, when the candidate target is in an unoccluded state or after the candidate target is subjected to the deblocking reconstruction, performing general target tracking processing on the candidate target, and confirming the candidate target in the blocked picture frame as the tracking target.
Preferably, the picture features extracted in S1 include one or more of color distribution features, texture features and edge features; easily distinguishable features are chosen so as to facilitate the similarity comparison between candidate targets and the tracking target.
Preferably, the specific steps of S3 are as follows:
s31, dividing the candidate target picture into a plurality of sub-regions respectively;
s32, extracting the picture characteristics of each sub-area;
s33, calculating the variation of the picture characteristics of each subregion relative to the adjacent subregions;
and S34, judging whether the variation exceeds a mutation threshold value.
When a candidate target is occluded, the picture features of its picture frame, such as color distribution features, texture features and edge features, show abrupt regional changes in their distribution, whereas the picture features of an unoccluded candidate target's picture frame are distributed fairly uniformly, without such mutations. Therefore the candidate target picture is divided into a plurality of sub-regions, for example 10 × 10 sub-regions arranged as a matrix; picture features such as color distribution, texture and edge features are extracted from each sub-region; the variation of each of the 100 sub-regions' picture features relative to its adjacent sub-regions is calculated; and if a variation is larger than the mutation threshold, the candidate target is considered to show a regional mutation and to be in an occluded state.
Preferably, the specific steps of S4 are as follows:
s41, establishing a training sample library, wherein the training sample library comprises a sample target picture in an occlusion state and a sample target picture in a non-occlusion state;
s42, overlapping the sample target picture in the shielding state with the randomly distributed variable, and generating a sample target picture after de-shielding reconstruction by using a generator;
s43, calculating a loss function value of the sample target picture after the occlusion removal reconstruction relative to the sample target picture in an occlusion-free state by a discriminator;
s44, when the loss function value is not in the allowable range, the feedback generator optimizes the loss function value until the sample target picture output by the generator after the occlusion removal reconstruction is judged to be in the allowable range by the discriminator to obtain a trained GAN;
and S45, substituting the candidate target picture in the shielded state into the trained GAN to obtain a target picture after de-shielding reconstruction.
The GAN neural network comprises a convolutional neural network serving as the generator and a convolutional neural network serving as the discriminator. Convolutional neural networks have self-learning capability, associative storage capability and the ability to search for optimal solutions at high speed; the network serving as the discriminator is trained in advance on a large number of discrimination samples and can be used directly. To train the network serving as the generator, a sample library containing sample target pictures in an occluded state and in an unoccluded state is first established. An occluded sample target picture is superposed with a random distribution variable and passed to the generator, which generates a de-occlusion-reconstructed sample target picture. This reconstruction is passed to the discriminator, which compares it against the unoccluded sample target picture and calculates a loss function value. If the loss function value is within the allowable range, the GAN neural network is trained; if not, the result is fed back to the generator for learning optimization, and the process is repeated until the loss function value falls within the allowable range, at which point the training of the generator, and thus of the GAN neural network, is complete.
Preferably, the specific steps of S5 are as follows:
s51, extracting picture features of the picture frame where the candidate target is located;
s52, comparing the similarity of the picture characteristics of the tracking target and the candidate target;
and S53, judging whether the similarity is greater than a threshold value, and if the similarity is greater than the threshold value, determining that the candidate target in the subsequent video picture frame is the tracking target.
A target tracking system in an occluded state comprises: a tracking target picture feature extraction module, a candidate target selection module, an occlusion state judgment module, a de-occlusion reconstruction module and a general target tracking processing module; wherein:
the tracking target picture feature extraction module is used for selecting a tracking target in the first video picture frame and extracting the picture features of the tracking target in that frame;
the candidate target selection module is used for selecting one or more candidate targets from the continuous video picture frames (frame 2, frame 3, …, frame n);
the occlusion state judgment module is used for judging whether a candidate target selected by the candidate target selection module is in an occluded state;
the de-occlusion reconstruction module is used for performing de-occlusion reconstruction on the candidate target with a GAN neural network when the candidate target is in an occluded state;
and the general target tracking processing module is used for performing general target tracking processing on the candidate target when it is in an unoccluded state or after its de-occlusion reconstruction, and confirming the candidate target in the occluded picture frame as the tracking target.
Preferably, the picture features extracted by the tracking target picture feature extraction module include one or more of color distribution features, texture features and edge features.
Preferably, the occlusion state judgment module includes: a region division unit, a sub-region picture feature extraction unit, a variation calculation unit and a judgment unit; wherein:
the region division unit is used for dividing the candidate target picture into a plurality of sub-regions;
the sub-region picture feature extraction unit is used for extracting the picture features of each sub-region;
the variation calculation unit is used for calculating the variation of each sub-region's picture features relative to its adjacent sub-regions;
and the judgment unit is used for judging whether the variation exceeds the mutation threshold.
Preferably, the de-occlusion reconstruction module includes: a training sample library establishment unit, a de-occlusion-reconstructed sample picture generation unit, a loss function value calculation unit, a generator optimization unit and a de-occlusion-reconstructed target picture generation unit; wherein:
the training sample library establishment unit is used for establishing a training sample library comprising sample target pictures in an occluded state and sample target pictures in an unoccluded state;
the de-occlusion-reconstructed sample picture generation unit is used for superposing an occluded sample target picture with a random distribution variable and generating a de-occlusion-reconstructed sample target picture with the generator;
the loss function value calculation unit is used for calculating, with the discriminator, the loss function value of the de-occlusion-reconstructed sample target picture relative to the unoccluded sample target picture;
the generator optimization unit is used for feeding the loss function value back to the generator for optimization when it is not within the allowable range, until the de-occlusion-reconstructed sample target picture output by the generator is judged by the discriminator to be within the allowable range, thereby obtaining a trained GAN;
and the de-occlusion-reconstructed target picture generation unit is used for substituting the occluded candidate target picture into the trained GAN to obtain a de-occlusion-reconstructed target picture.
Preferably, the general target tracking processing module includes: a candidate target picture feature extraction unit, a similarity comparison unit and a result judgment unit; wherein:
the candidate target picture feature extraction unit is used for extracting picture features of a picture frame where the candidate target is located;
the similarity comparison unit is used for comparing the similarity of the picture characteristics of the tracking target and the candidate target;
the result judging unit is used for judging whether the similarity is greater than a threshold value or not, and if the similarity is greater than the threshold value, the candidate target in the subsequent video picture frame is determined to be the tracking target.
The invention has the following beneficial effects:
based on the above technical scheme, based on the prior art, the invention provides a target tracking method in an occlusion state, and a target tracking system in the occlusion state is designed according to the method, and the GAN neural network is used for realizing the de-occlusion reconstruction of the candidate target, so that the occluded candidate target is efficiently restored, the situation that the candidate target cannot be identified due to occlusion is avoided in the target tracking process, the tracking efficiency of the target tracking technology on suspects and hit vehicles in actual people flow and vehicle flow is enhanced, and the target tracking technology has the feasibility of implementation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in their description are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention; for those skilled in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a flow chart of a method of the present invention;
fig. 2 is a block diagram of the system architecture of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the present invention proposes the following method:
a target tracking method in an occlusion state comprises the following steps:
s1, selecting a tracking target from the 1 st frame of video picture frame, and extracting the picture characteristics of the tracking target in the picture frame;
s2, selecting one or more candidate targets from the continuous video picture frames of the 2 nd frame, the 3 rd frame, the … th frame and the nth frame;
s3, judging whether the candidate target selected in the S2 is in an occlusion state;
s4, when the candidate target is in an occlusion state, carrying out occlusion removing reconstruction on the candidate target by using a GAN neural network;
and S5, when the candidate target is in an unoccluded state or after the candidate target is subjected to the deblocking reconstruction, performing general target tracking processing on the candidate target, and confirming the candidate target in the blocked picture frame as the tracking target.
To facilitate the similarity comparison with candidate targets, the easily distinguishable picture features extracted in S1 include one or more of color distribution features, texture features and edge features.
In order to further optimize the above technical features, the specific steps of S3 are as follows:
s31, dividing the candidate target picture into a plurality of sub-regions respectively;
s32, extracting the picture characteristics of each sub-area;
s33, calculating the variation of the picture characteristics of each subregion relative to the adjacent subregions;
and S34, judging whether the variation exceeds a mutation threshold value.
Specifically, when judging whether the candidate targets in the continuous video picture frames (frame 2, frame 3, …, frame n) are occluded, the principle is that if a candidate target is blocked, its picture feature distribution will show regional mutations, while the picture feature distribution of an unoccluded candidate target is generally uniform and without abrupt changes. Based on this principle, the candidate target picture is divided into 10 × 10 sub-regions arranged as a matrix, and the picture features of the 100 sub-regions, one or more of color distribution, texture and edge features, are extracted separately. The variation of each sub-region's picture features relative to its neighbours is then calculated, and if any variation exceeds the mutation threshold, the candidate target is determined to show a regional mutation; it is therefore in an occluded state and de-occlusion reconstruction should be performed.
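A minimal sketch of this judgment follows, assuming a plain NumPy implementation with mean gray level standing in for the color/texture/edge features; apart from the 10 × 10 grid named in the text, the feature choice and the threshold value are illustrative assumptions.

```python
import numpy as np

def is_occluded(candidate_patch, grid=10, mutation_threshold=0.5):
    """S31-S34: split the candidate picture into a grid x grid matrix of
    sub-regions, compute a per-region feature, and report occlusion when any
    region's feature jumps relative to a neighbour by more than the mutation
    threshold."""
    gray = candidate_patch.mean(axis=2) if candidate_patch.ndim == 3 else candidate_patch
    h, w = gray.shape
    feats = np.zeros((grid, grid))
    for i in range(grid):
        for j in range(grid):
            # Mean intensity of the grid cell, normalized to [0, 1]
            cell = gray[i * h // grid:(i + 1) * h // grid,
                        j * w // grid:(j + 1) * w // grid]
            feats[i, j] = cell.mean() / 255.0
    # Variation of each sub-region relative to its right and lower neighbours
    dx = np.abs(np.diff(feats, axis=1))
    dy = np.abs(np.diff(feats, axis=0))
    return max(dx.max(), dy.max()) > mutation_threshold
```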
In order to further optimize the above technical features, the specific steps of S4 are as follows:
s41, establishing a training sample library, wherein the training sample library comprises a sample target picture in an occlusion state and a sample target picture in a non-occlusion state;
s42, overlapping the sample target picture in the shielding state with the randomly distributed variable, and generating a sample target picture after de-shielding reconstruction by using a generator;
s43, calculating a loss function value of the sample target picture after the occlusion removal reconstruction relative to the sample target picture in an occlusion-free state by a discriminator;
s44, when the loss function value is not in the allowable range, the feedback generator optimizes the loss function value until the sample target picture output by the generator after the occlusion removal reconstruction is judged to be in the allowable range by the discriminator to obtain a trained GAN;
and S45, substituting the candidate target picture in the shielded state into the trained GAN to obtain a target picture after de-shielding reconstruction.
The GAN neural network consists of a generator and a discriminator, both convolutional neural networks with the self-learning capability of neural networks. The convolutional network used as the discriminator in the invention is a pre-trained network with accurate discrimination capability, so the invention establishes a training sample library to train only the convolutional network used as the generator. The library contains sample target pictures in an occluded state and sample target pictures in an unoccluded state. An occluded sample target picture is superposed with a random distribution variable; the generator generates a de-occlusion-reconstructed sample target picture; the discriminator calculates the loss function value of the reconstruction against the unoccluded sample target picture as a reference and feeds the result back to the generator; and the generator keeps self-learning and training, repeating this process until the calculated loss function value lies within the allowed range. At that point the training of the convolutional network used as the generator is complete, the GAN neural network is trained, and occluded target pictures can be de-occlusion-reconstructed.
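The following PyTorch-style sketch mirrors steps S41-S44 under the assumption stated above that the discriminator is already trained and only the generator is optimized. The generator interface generator(occluded, z), the loss mix (adversarial BCE plus an L1 term against the unoccluded reference) and all hyperparameters are assumptions for illustration, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_generator(generator, discriminator, loader,
                    epochs=50, z_dim=64, tolerance=0.05):
    """Schematic S41-S44 loop. `loader` yields (occluded, clean) picture
    pairs from the sample library (S41); the discriminator is assumed to
    output probabilities in [0, 1]."""
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
    bce = nn.BCELoss()
    loss_g = torch.tensor(float("inf"))
    for _ in range(epochs):
        for occluded, clean in loader:
            # S42: superpose the occluded sample with a random distribution variable
            z = torch.randn(occluded.size(0), z_dim)
            fake = generator(occluded, z)   # de-occlusion-reconstructed picture
            # S43: the discriminator scores the reconstruction; an L1 term
            # anchors it to the unoccluded reference picture
            score = discriminator(fake)
            loss_g = bce(score, torch.ones_like(score)) + F.l1_loss(fake, clean)
            # S44: feed the loss back to the generator and optimize
            opt_g.zero_grad()
            loss_g.backward()
            opt_g.step()
        if loss_g.item() < tolerance:       # loss within the allowed range
            break
    return generator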
In order to further optimize the above technical features, the specific steps of S5 are as follows:
s51, extracting picture features of the picture frame where the candidate target is located;
s52, comparing the similarity of the picture characteristics of the tracking target and the candidate target;
and S53, judging whether the similarity is greater than a threshold value, and if the similarity is greater than the threshold value, determining that the candidate target in the subsequent video picture frame is the tracking target.
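Putting the pieces together, a hypothetical driver for the S1-S5 flow might look like the skeleton below; the five callables are placeholders for the modules described in this specification and are not named in the patent.

```python
def track_in_video(frames, pick_target, pick_candidates,
                   occlusion_check, deocclude, same_target):
    """Skeleton of S1-S5. The five callables stand in for the system's
    modules: S1 target selection, S2 candidate selection, S3 occlusion
    judgment, S4 GAN de-occlusion, and S5 similarity-based confirmation."""
    target_patch = pick_target(frames[0])                  # S1
    confirmed = []
    for index, frame in enumerate(frames[1:], start=2):    # S2: frames 2..n
        for candidate_patch in pick_candidates(frame):
            patch = candidate_patch
            if occlusion_check(patch):                     # S3
                patch = deocclude(patch)                   # S4
            matched, score = same_target(target_patch, patch)  # S5
            if matched:
                confirmed.append((index, candidate_patch, score))
    return confirmed
```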
As shown in FIG. 2, a target tracking system in an occluded state comprises: a tracking target picture feature extraction module 1, a candidate target selection module 2, an occlusion state judgment module 3, a de-occlusion reconstruction module 4 and a general target tracking processing module 5; wherein:
the tracking target picture feature extraction module 1 is used for selecting a tracking target in the first video picture frame and extracting the picture features of the tracking target in that frame;
the candidate target selection module 2 is used for selecting one or more candidate targets from the continuous video picture frames (frame 2, frame 3, …, frame n);
the occlusion state judgment module 3 is used for judging whether a candidate target selected by the candidate target selection module 2 is in an occluded state;
the de-occlusion reconstruction module 4 is configured to perform de-occlusion reconstruction on the candidate target with a GAN neural network when the candidate target is in an occluded state;
and the general target tracking processing module 5 is configured to perform general target tracking processing on the candidate target when it is in an unoccluded state or after its de-occlusion reconstruction, and to confirm the candidate target in the occluded picture frame as the tracking target.
In order to further optimize the above technical features, the picture features extracted by the tracking target picture feature extraction module 1 include one or more of color distribution features, texture features and edge features.
In order to further optimize the above technical features, the occlusion state judgment module 3 includes: a region division unit, a sub-region picture feature extraction unit, a variation calculation unit and a judgment unit; wherein:
the region division unit is used for dividing the candidate target picture into a plurality of sub-regions;
the sub-region picture feature extraction unit is used for extracting the picture features of each sub-region;
the variation calculation unit is used for calculating the variation of each sub-region's picture features relative to its adjacent sub-regions;
and the judgment unit is used for judging whether the variation exceeds the mutation threshold.
In order to further optimize the above technical features, the de-occlusion reconstruction module 4 includes: a training sample library establishment unit, a de-occlusion-reconstructed sample picture generation unit, a loss function value calculation unit, a generator optimization unit and a de-occlusion-reconstructed target picture generation unit; wherein:
the training sample library establishment unit is used for establishing a training sample library comprising sample target pictures in an occluded state and sample target pictures in an unoccluded state;
the de-occlusion-reconstructed sample picture generation unit is used for superposing an occluded sample target picture with a random distribution variable and generating a de-occlusion-reconstructed sample target picture with the generator;
the loss function value calculation unit is used for calculating, with the discriminator, the loss function value of the de-occlusion-reconstructed sample target picture relative to the unoccluded sample target picture;
the generator optimization unit is used for feeding the loss function value back to the generator for optimization when it is not within the allowable range, until the de-occlusion-reconstructed sample target picture output by the generator is judged by the discriminator to be within the allowable range, thereby obtaining a trained GAN;
and the de-occlusion-reconstructed target picture generation unit is used for substituting the occluded candidate target picture into the trained GAN to obtain a de-occlusion-reconstructed target picture.
In order to further optimize the above technical features, the general target tracking processing module 5 includes: a candidate target picture feature extraction unit, a similarity comparison unit and a result judgment unit; wherein:
the candidate target picture feature extraction unit is used for extracting picture features of a picture frame where the candidate target is located;
the similarity comparison unit is used for comparing the similarity of the picture characteristics of the tracking target and the candidate target;
and the result judgment unit is used for judging whether the similarity is greater than the threshold, and if so, confirming the candidate target in the subsequent video picture frame as the tracking target.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A target tracking method in an occluded state, characterized by comprising the following steps:
S1, selecting a tracking target in the 1st video picture frame, and extracting the picture features of the tracking target in that frame;
S2, selecting one or more candidate targets from the continuous video picture frames (frame 2, frame 3, …, frame n);
S3, judging whether a candidate target selected in S2 is in an occluded state;
S4, when the candidate target is in an occluded state, performing de-occlusion reconstruction on the candidate target with a GAN neural network;
and S5, when the candidate target is in an unoccluded state, or after its de-occlusion reconstruction, performing general target tracking processing on the candidate target, and confirming the candidate target in the occluded picture frame as the tracking target.
2. The method according to claim 1, wherein the picture features extracted in S1 include one or more of color distribution features, texture features, and edge features.
3. The method for tracking the target under the occlusion state according to claim 1, wherein the specific steps of S3 are as follows:
s31, dividing the candidate target picture into a plurality of sub-regions respectively;
s32, extracting the picture characteristics of each sub-area;
s33, calculating the variation of the picture characteristics of each subregion relative to the adjacent subregions;
and S34, judging whether the variation exceeds a mutation threshold value.
4. The target tracking method in an occluded state according to claim 1, wherein the specific steps of S4 are as follows:
S41, establishing a training sample library comprising sample target pictures in an occluded state and sample target pictures in an unoccluded state;
S42, superposing an occluded sample target picture with a random distribution variable, and generating a de-occlusion-reconstructed sample target picture with the generator;
S43, calculating, with the discriminator, the loss function value of the de-occlusion-reconstructed sample target picture relative to the unoccluded sample target picture;
S44, when the loss function value is not within the allowable range, feeding it back to the generator for optimization, until the de-occlusion-reconstructed sample target picture output by the generator is judged by the discriminator to be within the allowable range, thereby obtaining a trained GAN;
and S45, substituting the occluded candidate target picture into the trained GAN to obtain a de-occlusion-reconstructed target picture.
5. The method for tracking the target under the occlusion state according to claim 1, wherein the specific steps of S5 are as follows:
s51, extracting picture features of the picture frame where the candidate target is located;
s52, comparing the similarity of the picture characteristics of the tracking target and the candidate target;
and S53, judging whether the similarity is greater than a threshold value, and if the similarity is greater than the threshold value, determining that the candidate target in the subsequent video picture frame is the tracking target.
6. A target tracking system in an occluded state, comprising: a tracking target picture feature extraction module (1), a candidate target selection module (2), an occlusion state judgment module (3), a de-occlusion reconstruction module (4) and a general target tracking processing module (5); wherein:
the tracking target picture feature extraction module (1) is used for selecting a tracking target in the first video picture frame and extracting the picture features of the tracking target in that frame;
the candidate target selection module (2) is used for selecting one or more candidate targets from the continuous video picture frames (frame 2, frame 3, …, frame n);
the occlusion state judgment module (3) is used for judging whether a candidate target selected by the candidate target selection module (2) is in an occluded state;
the de-occlusion reconstruction module (4) is used for performing de-occlusion reconstruction on the candidate target with a GAN neural network when the candidate target is in an occluded state;
and the general target tracking processing module (5) is used for performing general target tracking processing on the candidate target when it is in an unoccluded state or after its de-occlusion reconstruction, and confirming the candidate target in the occluded picture frame as the tracking target.
7. The target tracking system in an occluded state according to claim 6, wherein the picture features extracted by the tracking target picture feature extraction module (1) include one or more of color distribution features, texture features and edge features.
8. The target tracking system in an occluded state according to claim 6, wherein the occlusion state judgment module (3) comprises: a region division unit, a sub-region picture feature extraction unit, a variation calculation unit and a judgment unit; wherein:
the region division unit is used for dividing the candidate target picture into a plurality of sub-regions;
the sub-region picture feature extraction unit is used for extracting the picture features of each sub-region;
the variation calculation unit is used for calculating the variation of each sub-region's picture features relative to its adjacent sub-regions;
and the judgment unit is used for judging whether the variation exceeds the mutation threshold.
9. The target tracking system in an occluded state according to claim 6, wherein the de-occlusion reconstruction module (4) comprises: a training sample library establishment unit, a de-occlusion-reconstructed sample picture generation unit, a loss function value calculation unit, a generator optimization unit and a de-occlusion-reconstructed target picture generation unit; wherein:
the training sample library establishment unit is used for establishing a training sample library comprising sample target pictures in an occluded state and sample target pictures in an unoccluded state;
the de-occlusion-reconstructed sample picture generation unit is used for superposing an occluded sample target picture with a random distribution variable and generating a de-occlusion-reconstructed sample target picture with the generator;
the loss function value calculation unit is used for calculating, with the discriminator, the loss function value of the de-occlusion-reconstructed sample target picture relative to the unoccluded sample target picture;
the generator optimization unit is used for feeding the loss function value back to the generator for optimization when it is not within the allowable range, until the de-occlusion-reconstructed sample target picture output by the generator is judged by the discriminator to be within the allowable range, thereby obtaining a trained GAN;
and the de-occlusion-reconstructed target picture generation unit is used for substituting the occluded candidate target picture into the trained GAN to obtain a de-occlusion-reconstructed target picture.
10. The target tracking system in an occluded state according to claim 6, wherein the general target tracking processing module (5) comprises: a candidate target picture feature extraction unit, a similarity comparison unit and a result judgment unit; wherein:
the candidate target picture feature extraction unit is used for extracting the picture features of the picture frame where the candidate target is located;
the similarity comparison unit is used for comparing the similarity of the picture features of the tracking target and the candidate target;
and the result judgment unit is used for judging whether the similarity is greater than the threshold, and if so, confirming the candidate target in the subsequent video picture frame as the tracking target.
CN201910754803.9A, filed 2019-08-15: Target tracking method and system in an occluded state. Status: Active; granted as CN110659566B.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910754803.9A 2019-08-15 2019-08-15 Target tracking method and system in an occluded state (granted as CN110659566B)

Publications (2)

Publication Number Publication Date
CN110659566A 2020-01-07
CN110659566B 2020-12-18

Family

ID=69037498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910754803.9A (filed 2019-08-15) Target tracking method and system in an occluded state; Active; granted as CN110659566B

Country Status (1)

Country Link
CN (1) CN110659566B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101887588A (en) * 2010-08-04 2010-11-17 中国科学院自动化研究所 Appearance block-based occlusion handling method
KR101492059B1 (en) * 2013-05-31 2015-02-11 전자부품연구원 Real Time Object Tracking Method and System using the Mean-shift Algorithm
CN103456030A (en) * 2013-09-08 2013-12-18 西安电子科技大学 Target tracking method based on scattering descriptor
CN105139418A (en) * 2015-08-04 2015-12-09 山东大学 Novel video tracking method based on partitioning policy
CN105335986A (en) * 2015-09-10 2016-02-17 西安电子科技大学 Characteristic matching and MeanShift algorithm-based target tracking method
CN106204638A (en) * 2016-06-29 2016-12-07 西安电子科技大学 A kind of based on dimension self-adaption with the method for tracking target of taking photo by plane blocking process
CN107730458A (en) * 2017-09-05 2018-02-23 北京飞搜科技有限公司 A kind of fuzzy facial reconstruction method and system based on production confrontation network
CN108205659A (en) * 2017-11-30 2018-06-26 深圳市深网视界科技有限公司 Face occluder removes and its method, equipment and the medium of model construction
CN107909061A (en) * 2017-12-07 2018-04-13 电子科技大学 A kind of head pose tracks of device and method based on incomplete feature
CN108549905A (en) * 2018-04-09 2018-09-18 上海方立数码科技有限公司 A kind of accurate method for tracking target under serious circumstance of occlusion
CN109145745A (en) * 2018-07-20 2019-01-04 上海工程技术大学 A kind of face identification method under circumstance of occlusion
CN109711283A (en) * 2018-12-10 2019-05-03 广东工业大学 A kind of joint doubledictionary and error matrix block Expression Recognition algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AGRIM GUPTA et al.: "Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks", arXiv *
姚乃明 et al.: "Robust facial expression recognition based on generative adversarial networks", Acta Automatica Sinica (自动化学报) *
常发亮 et al.: "Research on visual object tracking methods under occlusion", Control and Decision (控制与决策) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489086A (en) * 2020-12-11 2021-03-12 北京澎思科技有限公司 Target tracking method, target tracking device, electronic device, and storage medium
CN113111823A (en) * 2021-04-22 2021-07-13 广东工业大学 Abnormal behavior detection method and related device for building construction site


Similar Documents

Publication Publication Date Title
Cho et al. A neural-based crowd estimation by hybrid global learning algorithm
KR102155182B1 (en) Video recording method, server, system and storage medium
US9189867B2 (en) Adaptive image processing apparatus and method based in image pyramid
CN110659566B (en) Target tracking method and system in shielding state
CN103824066A (en) Video stream-based license plate recognition method
CN112767711B (en) Multi-class multi-scale multi-target snapshot method and system
Casasent et al. Real, imaginary, and clutter Gabor filter fusion for detection with reduced false alarms
CN111914665B (en) Face shielding detection method, device, equipment and storage medium
CN110765841A (en) Group pedestrian re-identification system and terminal based on mixed attention mechanism
CN111753651A (en) Subway group abnormal behavior detection method based on station two-dimensional crowd density analysis
GB2409031A (en) Face detection
Dwivedi et al. Weapon classification using deep convolutional neural network
CN111753732A (en) Vehicle multi-target tracking method based on target center point
US20070104373A1 (en) Method for automatic key posture information abstraction
CN112417955A (en) Patrol video stream processing method and device
CN115331141A (en) High-altitude smoke and fire detection method based on improved YOLO v5
CN108921147B (en) Black smoke vehicle identification method based on dynamic texture and transform domain space-time characteristics
CN109685062A (en) A kind of object detection method, device, equipment and storage medium
CN112307895A (en) Crowd gathering abnormal behavior detection method under community monitoring scene
CN112001387B (en) Method and device for determining focusing area, terminal and storage medium
Suba et al. Violence detection for surveillance systems using lightweight CNN models
CN113657169A (en) Gait recognition method, device, system and computer readable storage medium
CN113938671A (en) Image content analysis method and device, electronic equipment and storage medium
CN104751489A (en) Grid-based relay tracking method and device in online class
CN112560825B (en) Face detection method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant