CN113592902A - Target tracking method and device, computer equipment and storage medium

Info

Publication number: CN113592902A
Application number: CN202110684131.6A
Authority: CN
Inventors: 刘竞爽
Original assignee: Beijing Megvii Technology Co Ltd
Current assignee: Beijing Megvii Technology Co Ltd
Priority date: 2021-06-21
Filing date: 2021-06-21
Publication date: 2021-11-02

Abstract

The application relates to a target tracking method, a target tracking device, computer equipment and a storage medium, wherein the method comprises the following steps: acquiring a candidate target tracking track output by a target tracking detection task; and filtering the candidate target tracking track based on a preset quality filtering condition, and determining the candidate target tracking track meeting the preset quality filtering condition as a final target tracking result of the target object. According to the method, the candidate results output by the target tracking detection task are filtered according to the preset quality filtering condition corresponding to the target object, so that low-quality candidate results can be filtered out; because the target tracking detection task itself is not affected, its speed is preserved.

Description

Target tracking method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a target tracking method and apparatus, a computer device, and a storage medium.

Background

The multi-target tracking task is an important branch of the computer vision field. Its main task is, given an image sequence, to find the moving objects in the sequence, associate the same object across different frames one-to-one, and then output the track of each object. These objects may be arbitrary, such as pedestrians, vehicles, or animals. Pedestrians are typical non-rigid targets and are therefore much more difficult to track than rigid targets; because pedestrian detection and tracking also have greater commercial value in practical applications, there is a great deal of research on pedestrian tracking.

At present, a major challenge in pedestrian tracking is balancing speed and precision. Pedestrian tracking systems actually deployed in industry place high demands on speed in order to run in real time. Increasing the speed brings a certain loss of precision and produces low-quality tracking results, which mainly consist of non-human tracking results (such as animals, non-motor vehicles, garbage cans, and fire hydrants). How to filter out these low-quality tracking results without sacrificing speed is a major challenge for existing technology.

Disclosure of Invention

In view of the above technical problems, it is necessary to provide a target tracking method, an apparatus, a computer device, and a storage medium capable of filtering out low-quality tracking results.

A method of target tracking, the method comprising:

acquiring a candidate target tracking track output by a target tracking detection task;

reading a target feature confidence score of a target object in the candidate target tracking track;

determining whether each candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score;

and determining the candidate target tracking track meeting the preset quality filtering condition as a final target tracking result of the target object.

In one embodiment, the determining whether each candidate target tracking trajectory meets a preset quality filtering condition based on the target feature confidence score includes:

determining a representative confidence score of the candidate target tracking trajectory according to the target feature confidence score;

and if the representative confidence score of the candidate target tracking track is larger than the corresponding preset score threshold, judging that the candidate target tracking track meets the preset quality filtering condition.

In one embodiment, the target object comprises a pedestrian; the target feature confidence score comprises: a face confidence score and a human body confidence score.

In one embodiment, the determining a representative confidence score for the candidate target tracking trajectory from the target feature confidence score comprises:

and determining the highest confidence score, the lowest confidence score and the average confidence score of the candidate target tracking track according to the target feature confidence scores.

In one embodiment, the determining whether each candidate target tracking trajectory meets a preset quality filtering condition based on the target feature confidence score further includes:

reading the track length of the candidate target tracking track;

and if the track length is greater than a preset length threshold value and the representative confidence score of the candidate target tracking track is greater than a corresponding preset score threshold value, judging that the candidate target tracking track meets a preset quality filtering condition.

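In one embodiment, the determining whether each candidate target tracking trajectory meets a preset quality filtering condition based on the target feature confidence score further includes:

reading the first frame position and the last frame position in the candidate target tracking track;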

calculating the position offset between the first frame position and the last frame position;

and if the position offset is greater than a preset offset threshold value and the representative confidence score of the candidate target tracking track is greater than a corresponding preset score threshold value, judging that the candidate target tracking track meets a preset quality filtering condition.

In one embodiment, the calculating the position offset between the first frame position and the last frame position includes:

respectively reading target detection areas corresponding to the target objects in the first frame and the last frame;

calculating the intersection ratio of the target detection area in the first frame and the target detection area in the last frame;

and if the intersection ratio is smaller than a preset intersection ratio threshold value, judging that the position offset is larger than a preset offset threshold value.

A target tracking apparatus, the apparatus comprising:

the candidate track acquisition module is used for acquiring a candidate target tracking track output by the target tracking detection task;

the confidence reading module is used for reading a target feature confidence score of a target object in the candidate target tracking track;

the filtering module is used for determining whether each candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score;

and the result determining module is used for determining the candidate target tracking track meeting the preset quality filtering condition as the final target tracking result of the target object.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the above target tracking method when executing the computer program.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the above target tracking method.

The target tracking method, the target tracking device, the computer equipment and the storage medium acquire the candidate target tracking tracks output by the target tracking detection task, read the target feature confidence score of the target object in each candidate target tracking track, judge whether each candidate target tracking track meets the preset quality filtering condition based on the target feature confidence score, and finally determine the candidate target tracking tracks meeting the condition as the final target tracking result of the target object. Because the judgment relies only on the confidence scores of the target object in the candidate results output by the target tracking detection task, low-quality candidate results can be filtered out quickly; and because the target tracking detection task itself is not affected, its speed is preserved.

Drawings

FIG. 1 is a schematic flow chart diagram of a target tracking method in one embodiment;

FIG. 2 is a schematic flow chart illustrating filtering candidate target tracking tracks based on a predetermined quality filtering condition according to an embodiment;

FIG. 3 is a schematic diagram illustrating a process of filtering candidate target tracking tracks based on a preset quality filtering condition in another embodiment;

FIG. 4 is a schematic diagram illustrating a process of filtering candidate target tracking tracks based on a preset quality filtering condition in another embodiment;

FIG. 5 is a flow diagram illustrating a process for calculating a position offset between a first frame position and a last frame position in one embodiment;

FIG. 6 is a diagram illustrating IOU definition in one embodiment;

FIG. 7 is a block diagram of a target tracking device in one embodiment;

FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In recent years, technical research based on artificial intelligence, such as computer vision, deep learning, machine learning, image processing, and image recognition, has been actively developed. Artificial Intelligence (AI) is an emerging scientific technology for studying and developing theories, methods, techniques and application systems for simulating and extending human Intelligence. The artificial intelligence subject is a comprehensive subject and relates to various technical categories such as chips, big data, cloud computing, internet of things, distributed storage, deep learning, machine learning and neural networks. Computer vision is used as an important branch of artificial intelligence, particularly a machine is used for identifying the world, and the computer vision technology generally comprises the technologies of face identification, living body detection, fingerprint identification and anti-counterfeiting verification, biological feature identification, face detection, pedestrian detection, target detection, pedestrian identification, image processing, image identification, image semantic understanding, image retrieval, character identification, video processing, video content identification, behavior identification, three-dimensional reconstruction, virtual reality, augmented reality, synchronous positioning and map construction (SLAM), computational photography, robot navigation and positioning and the like. 
With the research and progress of artificial intelligence technology, the technology is applied to various fields, such as security, city management, traffic management, building management, park management, face passage, face attendance, logistics management, warehouse management, robots, intelligent marketing, computational photography, mobile phone images, cloud services, smart homes, wearable equipment, unmanned driving, automatic driving, smart medical treatment, face payment, face unlocking, fingerprint unlocking, testimony verification, smart screens, smart televisions, cameras, mobile internet, live webcasts, beauty treatment, medical beauty treatment, intelligent temperature measurement and the like.

In one embodiment, as shown in FIG. 1, a target tracking method is provided, which relates to the field of image processing in artificial intelligence. This embodiment is illustrated by applying the method to a terminal; it can be understood that the method can also be applied to a server, or to a system comprising a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step S110, obtaining candidate target tracking tracks output by the target tracking detection task.

Target tracking means detecting, extracting, identifying and tracking a moving target in an image sequence to obtain the motion parameters of the moving target, and then processing and analyzing those parameters to understand the behavior of the moving target and complete a higher-level detection task. In this embodiment, the target tracking detection task outputs a tracking trajectory of the target object after tracking it, that is, the candidate target tracking trajectory of this embodiment.

In one embodiment, when a target tracking filtering request is acquired, the candidate target tracking track output by the target tracking detection task is acquired based on that request. The target tracking filtering request can be initiated by a user and is used to request filtering of the target tracking tracks; it carries information about the target object to be filtered. In the target tracking process, a target object is searched for in an image sequence, the target object is associated one-to-one across different frames, and the track of the target object detected in the image sequence is output. Tracking detection is usually performed on a target object in surveillance video, and a track of the target object is output. However, to meet real-time speed requirements in practical applications, the target tracking detection task trades away some precision for speed and therefore produces low-quality tracking results; the target tracking filtering request is used to filter these low-quality results out of the target tracking tracks. In one embodiment, the target tracking filtering request is parsed to obtain the information about the target object to be filtered. The result output by the target tracking detection task for that target object is then read and is recorded as the candidate target tracking track in this embodiment.

In a specific embodiment, obtaining the candidate target tracking trajectory output by the target tracking detection task includes: obtaining the confidence score of the target object output by the target tracking detection task, the length of the candidate target track, the position of the target object in each image frame of the candidate target track, and the like.

Step S120, reading the target feature confidence score of the target object in the candidate target tracking track.
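
As an illustration of the information listed above, the following minimal sketch shows one way a candidate track could be represented. The class name `CandidateTrack` and all field names are hypothetical and do not come from the source; the box format `(x1, y1, x2, y2)` is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels. Not specified by the source.
Box = Tuple[float, float, float, float]


@dataclass
class CandidateTrack:
    """Illustrative container for one candidate target tracking track."""
    track_id: int
    confidences: List[float]  # one confidence score per image frame
    boxes: List[Box]          # one detection box per image frame

    @property
    def length(self) -> int:
        """Track length expressed as the number of image frames."""
        return len(self.boxes)

    @property
    def first_box(self) -> Box:
        """Position of the target object in the first frame."""
        return self.boxes[0]

    @property
    def last_box(self) -> Box:
        """Position of the target object in the last frame."""
        return self.boxes[-1]


track = CandidateTrack(
    track_id=7,
    confidences=[0.91, 0.88, 0.93],
    boxes=[(10, 20, 50, 120), (14, 20, 54, 120), (18, 21, 58, 121)],
)
```

Everything the later filtering steps need (scores, length, first and last positions) is read from this record, so the tracker itself does not have to be re-run.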

The target feature confidence score is a confidence score output by the target tracking detection task. Confidence generally denotes the category confidence that a classifier outputs for a target, that is, the probability that the target belongs to category A, the probability that it belongs to category B, and so on; in this embodiment, the target feature confidence score represents the probability that the target belongs to the target feature. Further, the target feature confidence score for the target object may be obtained from the output result of the target tracking detection task for the target object.

The target feature may be one or more features in the target object, and the features may reflect the characteristics of the target object to some extent. The target features may be set according to different target objects, for example, when the target object is a pedestrian, the target features may include a human face, a human body, and the like; if the target object is a pet such as a cat or a dog, the target feature may be set to be a corresponding feature such as a cat or a dog, and so on.

Because the target tracking track comprises a plurality of images, there may also be a plurality of confidence scores corresponding to the target feature; in one embodiment, the target feature confidence score may be determined from the per-image confidence scores of the target feature, for example as their highest value, lowest value, mean, median, or third quartile.

Step S130, determining whether each candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score.

The preset quality filtering condition is set in advance, corresponds to the target object, and is used to evaluate the quality of a candidate target tracking track. After the target feature confidence score is read in the above step, whether each candidate target tracking track meets the preset quality filtering condition is judged according to that score. Specific implementations of this judgment are described in detail in the subsequent embodiments and are not repeated here.
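
The aggregation described above can be sketched with the Python standard library. The function name `summarize_confidences` is hypothetical, and the choice of the "inclusive" quantile method is an assumption, since the source does not say how the third quartile is computed.

```python
import statistics
from typing import Dict, List


def summarize_confidences(scores: List[float]) -> Dict[str, float]:
    """Reduce per-frame confidence scores to a few summary values."""
    if not scores:
        raise ValueError("a track must contain at least one confidence score")
    # statistics.quantiles needs at least two data points.
    if len(scores) >= 2:
        third_quartile = statistics.quantiles(scores, n=4, method="inclusive")[2]
    else:
        third_quartile = scores[0]
    return {
        "highest": max(scores),
        "lowest": min(scores),
        "mean": statistics.fmean(scores),
        "median": statistics.median(scores),
        "third_quartile": third_quartile,
    }


summary = summarize_confidences([0.6, 0.7, 0.8, 0.9, 1.0])
```

Any of these values, or a combination of them, could then serve as the score that is compared against a threshold.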

Step S140, determining the candidate target tracking trajectory satisfying the preset quality filtering condition as a final target tracking result of the target object.

In this embodiment, the candidate target tracking tracks are filtered based on the preset quality filtering condition: only the candidate target tracking tracks meeting the condition are retained and determined as the final target tracking result of the target object, and the candidate target tracking tracks not meeting the condition are filtered out. Understandably, candidate target tracking tracks that do not meet the preset quality filtering condition can be considered to be of lower quality; filtering them out yields the final target tracking result of the target object.

The target tracking method acquires the candidate target tracking tracks output by the target tracking detection task, reads the target feature confidence score of the target object in each candidate target tracking track, judges whether each candidate target tracking track meets the preset quality filtering condition based on the target feature confidence score, and finally determines the candidate target tracking tracks meeting the condition as the final target tracking result of the target object. Because the judgment relies only on the confidence scores of the target object in the candidate results output by the target tracking detection task, low-quality candidate results can be filtered out quickly; and because the target tracking detection task itself is not affected, its speed is preserved.
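
Steps S110 to S140 amount to a post-processing pass over the tracker output. The sketch below is one possible reading under stated assumptions: tracks are plain dictionaries, and `filter_tracks` and `mean_above` are hypothetical names; the mean-score predicate is only a stand-in for whatever preset quality filtering condition is configured.

```python
from typing import Callable, Dict, List

# Hypothetical track record: {"id": int, "confidences": [float, ...]}
Track = Dict[str, object]


def filter_tracks(
    candidates: List[Track],
    meets_quality: Callable[[Track], bool],
) -> List[Track]:
    """Keep only the candidate tracks that satisfy the quality condition."""
    return [track for track in candidates if meets_quality(track)]


def mean_above(threshold: float) -> Callable[[Track], bool]:
    """Build a stand-in quality condition: mean confidence above a threshold."""
    def check(track: Track) -> bool:
        scores = track["confidences"]
        return bool(scores) and sum(scores) / len(scores) > threshold
    return check


candidates = [
    {"id": 1, "confidences": [0.90, 0.95, 0.92]},  # likely a real pedestrian
    {"id": 2, "confidences": [0.30, 0.40, 0.20]},  # likely a false positive
]
final_tracks = filter_tracks(candidates, mean_above(0.5))
```

The filter only reads values the tracker has already produced, which is why it adds little cost on top of the tracking task.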

In one embodiment, as shown in FIG. 2, determining whether each candidate target tracking trajectory satisfies a preset quality filtering condition based on the target feature confidence score includes steps S131 and S132.

Step S131, determining a representative confidence score of the candidate target tracking track according to the target feature confidence score.

For the candidate target tracking track, more than two image frames may be included, each image frame corresponds to one target feature confidence score, and when filtering is performed, judgment and filtering can be performed by using only part of the target feature confidence scores in the candidate target tracking track as a representative.

In this embodiment, the representative confidence score is one or more scores selected or derived from the target feature confidence scores of the target tracking trajectory to stand for the whole trajectory, and the subsequent steps are performed with these scores. In one embodiment, determining a representative confidence score for a candidate target tracking trajectory based on the target feature confidence score includes: determining the highest confidence score, the lowest confidence score and the average confidence score of the candidate target tracking track according to the target feature confidence scores.

It can be understood that the highest confidence score is the highest of the target feature confidence scores in a candidate target tracking trajectory, the lowest confidence score is the lowest of them, and the average confidence score is their mean, which can be calculated once the confidence scores of all target features in the candidate target tracking trajectory have been obtained. In this embodiment, the highest, lowest and average scores are used as the representative confidence scores of the target features of the candidate target tracking track for subsequent filtering. In other embodiments, the representative confidence score may be determined in other ways; for example, it may include the median and/or the third quartile of the confidence scores.

Further, if the target features include more than two target features, a representative confidence score of each target feature is determined from each group of candidate target tracking tracks. In one embodiment, the target object comprises a pedestrian; correspondingly, the target feature confidence scores include: a face confidence score and a human confidence score; in this embodiment, representative confidence scores corresponding to a set of human face features and representative confidence scores corresponding to a set of human body features are determined from a set of candidate target tracking trajectories.

In one embodiment, the face confidence score and the human body confidence score respectively represent the confidence scores of the detection frames corresponding to the face or the human body in the target tracking track. Confidence is also called reliability, confidence level, or confidence coefficient: when a sample is used to estimate a population parameter, the conclusion is always uncertain because of the randomness of the sample. A probabilistic statement, namely interval estimation in mathematical statistics, is therefore used to express the probability that the estimated value lies within a certain allowable error range of the population parameter, and this probability is called the confidence. In one embodiment, the confidence may be understood as the quality of the detection frame corresponding to the target face or human body; the higher the confidence score of a target feature of the target object in the target tracking detection task, the higher the quality of that target feature in the target tracking track, and the higher the accuracy of subsequent identification.

In step S132, if the representative confidence score of the candidate target tracking trajectory is greater than the corresponding preset score threshold, it is determined that the candidate target tracking trajectory satisfies the preset quality filtering condition.

The preset score threshold value can be set according to actual conditions, and each target feature confidence corresponds to one group of preset score threshold values; in one embodiment, the representative confidence score includes two or more confidence scores, the preset score threshold includes two or more, and the representative confidence scores correspond to the preset score thresholds one to one. In a specific embodiment, the representative confidence score includes a highest confidence score, a lowest confidence score and an average confidence score, and the preset score threshold includes a preset score threshold corresponding to the highest confidence score, a preset score threshold corresponding to the lowest confidence score and a preset score threshold corresponding to the average confidence score.

In this embodiment, the representative confidence scores are respectively compared with corresponding preset score thresholds, and if the representative confidence scores are all greater than the preset score thresholds, it is indicated that the candidate target tracking trajectory meets the preset quality filtering condition. In one embodiment, if any representative confidence is smaller than or equal to the corresponding preset score threshold, it indicates that the candidate target tracking trajectory does not satisfy the preset quality filtering condition.

In the above embodiment, the confidence scores of the target features in the candidate target tracking trajectory are read, and partial confidence scores in the candidate target tracking trajectory are selected to determine the representative confidence scores, so that the representative confidence scores are respectively compared with the corresponding preset score thresholds, and if the representative confidence scores are all greater than the corresponding preset score thresholds, it is determined that the candidate target tracking trajectory meets the preset quality filtering condition, and the final target tracking result of the target object can be retained.
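
The comparison in steps S131 and S132 can be sketched for the pedestrian case, with one group of thresholds per target feature. All threshold values below are made up for illustration, and the names `THRESHOLDS`, `representative_scores` and `passes_confidence` are hypothetical. Treating a feature with no scores as a failure is also an assumption; the source does not cover that case.

```python
from typing import Dict, List

# Hypothetical threshold values; the source leaves them to be set per deployment.
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "face": {"highest": 0.8, "lowest": 0.3, "average": 0.6},
    "body": {"highest": 0.9, "lowest": 0.4, "average": 0.7},
}


def representative_scores(scores: List[float]) -> Dict[str, float]:
    """S131: highest, lowest and average confidence of one target feature."""
    return {
        "highest": max(scores),
        "lowest": min(scores),
        "average": sum(scores) / len(scores),
    }


def passes_confidence(
    feature_scores: Dict[str, List[float]],
    thresholds: Dict[str, Dict[str, float]] = THRESHOLDS,
) -> bool:
    """S132: every representative score must exceed its own threshold."""
    for feature, limits in thresholds.items():
        scores = feature_scores.get(feature)
        if not scores:
            return False  # assumption: a missing feature fails the check
        rep = representative_scores(scores)
        if any(rep[name] <= limit for name, limit in limits.items()):
            return False
    return True


good = {"face": [0.85, 0.70, 0.90], "body": [0.95, 0.80, 0.92]}
bad = {"face": [0.85, 0.20, 0.90], "body": [0.95, 0.80, 0.92]}
```

Here `bad` fails only because its lowest face score (0.20) does not exceed the lowest-score threshold (0.3), which matches the rule that any single representative score at or below its threshold rejects the track.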

In one embodiment, as shown in FIG. 3, determining whether each candidate target tracking trajectory satisfies a preset quality filtering condition based on the target feature confidence score further includes steps S310 and S320:

Step S310, reading the track length of the candidate target tracking track.

In one embodiment, the candidate target tracking trajectory comprises more than two image frames, wherein the image frames comprise the image frames in which the target object is detected; in one embodiment, the track length of the candidate target tracking track may be expressed by the number of image frames included, or may also be expressed by the duration of the candidate target tracking track, and the track length of the candidate target tracking track may be selected according to actual situations in different embodiments.

Further, the track length of the candidate target tracking track may be determined according to an output result of the target tracking detection task.

In step S320, if the track length is greater than the preset length threshold and the representative confidence score of the candidate target tracking track is greater than the corresponding preset score threshold, it is determined that the candidate target tracking track meets the preset quality filtering condition.

The preset length threshold can be set according to actual conditions. In this embodiment, the preset quality filtering condition compares not only the representative confidence scores with their corresponding preset score thresholds but also the track length of the candidate target tracking track with the preset length threshold. If the track length is greater than the preset length threshold and the representative confidence scores are all greater than their corresponding preset score thresholds, it is determined that the candidate target tracking track meets the preset quality filtering condition.

In the above embodiment, a candidate target tracking track is determined to satisfy the preset quality filtering condition only if its track length exceeds the preset length threshold and its confidence scores satisfy the corresponding condition; if the track length does not exceed the preset length threshold, or the confidence scores do not satisfy the corresponding condition, the track is determined not to satisfy the preset quality filtering condition. Because a candidate target tracking track with a short track length may correspond to a misrecognized object, this reduces misrecognition.
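
A minimal sketch of the length check in steps S310 and S320, assuming the track length is given either as a frame count or, when a frame rate is supplied, as a duration in seconds. The function names and the `fps` parameter are hypothetical, and the confidence result is passed in as a boolean so the sketch stays independent of how it was computed.

```python
from typing import Optional


def track_length(num_frames: int, fps: Optional[float] = None) -> float:
    """Track length in frames, or in seconds when a frame rate is given."""
    if fps is None:
        return float(num_frames)
    return num_frames / fps


def passes_length_filter(
    num_frames: int,
    confidence_ok: bool,
    length_threshold: float,
    fps: Optional[float] = None,
) -> bool:
    """S320: length strictly above the threshold AND confidence condition met."""
    return track_length(num_frames, fps) > length_threshold and confidence_ok


# Frame-count form: 30 frames against a threshold of 10 frames.
keep_long = passes_length_filter(30, True, length_threshold=10)
# Duration form: 50 frames at 25 fps is 2.0 s, against a threshold of 1.5 s.
keep_by_duration = passes_length_filter(50, True, length_threshold=1.5, fps=25.0)
```

The threshold must be expressed in the same unit as the length, which is why the duration form takes its threshold in seconds.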

In another embodiment, as shown in FIG. 4, determining whether each candidate target tracking trajectory satisfies a preset quality filtering condition based on the target feature confidence score further includes steps S410 to S430.

Step S410, reading the first frame position and the last frame position in the candidate target tracking track.

The first frame position represents the position of the target object in the first frame image of the candidate target tracking track, and the last frame position represents the position of the target object in the last frame image of the candidate target tracking track. Further, in one embodiment, the first frame position and the last frame position may be determined according to an output result of the target tracking detection task.

In step S420, a position offset between the first frame position and the last frame position is calculated.

After the first frame position and the last frame position are read, the position offset between the first frame and the last frame is calculated.

Further, as shown in FIG. 5, in one embodiment, calculating the position offset between the first frame position and the last frame position includes steps S421 to S423.

Step S421, respectively reading target detection areas corresponding to the target objects in the first frame and the last frame.

The target detection area corresponding to the target object is the area of the image frame in which the target object is detected in the candidate target tracking track. In one embodiment, the target detection area includes a circumscribed rectangle containing the target object; the target detection area corresponding to the target object may be determined according to an output result of the target tracking detection task.

Step S422, calculating the intersection ratio between the target detection region in the first frame and the target detection region in the last frame.

The intersection ratio, or Intersection over Union (IoU), is the ratio of the intersection to the union of the target detection areas in the two images. In one embodiment, as shown in FIG. 6, the IoU of the dashed box and the solid box is calculated by dividing the area of their overlap by the area of their union. A larger IoU indicates a higher degree of coincidence, and a smaller IoU indicates a lower degree of coincidence.
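
For axis-aligned rectangles, the intersection-over-union ratio can be computed as in the sketch below. The box format `(x1, y1, x2, y2)` and the function name `iou` are assumptions made for illustration; the source only states that the detection area may be a circumscribed rectangle.

```python
from typing import Tuple

# Hypothetical box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = sum of both areas minus the overlap, so it is not counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
```

Note that the denominator is the union, not the plain sum of the two box areas; subtracting the overlap once is what keeps the ratio at 1.0 for identical boxes.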

In step S423, if the intersection ratio is smaller than the preset intersection ratio threshold, it is determined that the position offset is larger than the preset offset threshold.

The preset intersection ratio threshold value can be set according to actual conditions. In this embodiment, the intersection ratio of the target detection areas in the first frame and the last frame of the target object is used to represent the position offset of the target object in the candidate target tracking track.
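
Steps S421 to S423 use the intersection ratio as an inverse proxy for movement: a low overlap between the first and last boxes means a large offset. A sketch under stated assumptions follows; the box format `(x1, y1, x2, y2)`, the function names and the default threshold of 0.3 are all hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # hypothetical (x1, y1, x2, y2) format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def position_offset_exceeded(
    first_box: Box,
    last_box: Box,
    iou_threshold: float = 0.3,  # made-up value for illustration
) -> bool:
    """S423: an IoU below the threshold is treated as a large position offset."""
    return box_iou(first_box, last_box) < iou_threshold


# A box that barely moves (for example a fire hydrant) versus one that crosses the frame.
static_object = position_offset_exceeded((100, 100, 150, 200), (102, 100, 152, 200))
moving_object = position_offset_exceeded((100, 100, 150, 200), (400, 100, 450, 200))
```

One design consequence worth noting: because only the first and last frames are compared, a target that walks away and returns to its starting position would look static under this check.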

In step S430, if the position offset is greater than the preset offset threshold and the representative confidence score of the candidate target tracking trajectory is greater than the corresponding preset score threshold, it is determined that the candidate target tracking trajectory satisfies the preset quality filtering condition.

The preset offset threshold can be set according to actual conditions. If the position offset between the first frame and the last frame of the candidate target tracking track is small, the target object has not moved, or has moved only slightly, within the track. The target object may then be a static object, which in tasks such as pedestrian tracking suggests that it was misrecognized; in this case, it is determined that the candidate target tracking track does not meet the preset quality filtering condition.

In another embodiment, if the representative confidence score of the candidate target tracking track is greater than the corresponding preset score threshold, but the position offset of the target object between the first frame and the last frame of the track is less than or equal to the preset offset threshold, prompt information is generated and sent to the relevant personnel, and their feedback on the prompt information is received. If the feedback indicates that the candidate target tracking track is normal, it is determined that the candidate target tracking track meets the preset quality filtering condition.

In the above embodiment, the position offset of the target object between the first frame and the last frame of the candidate target tracking track is included in the preset quality filtering condition. If the position offset is greater than the preset offset threshold and the confidence score satisfies the corresponding condition, it is determined that the preset quality filtering condition is met; if the position offset does not reach the preset offset threshold, or the confidence score does not satisfy the corresponding condition, it is determined that the preset quality filtering condition is not met. Because a candidate target tracking track with a small position offset may correspond to a misrecognized object, this reduces the number of misrecognized objects in the result.
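The decision made in steps S423 and S430, together with the manual-review branch of the alternative embodiment, might be sketched as follows. The names `Verdict`, `judge_track` and `manual_review` are illustrative assumptions and do not appear in the application; the sketch takes an already computed first-frame/last-frame IoU as input.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"   # track meets the preset quality filtering condition
    REJECT = "reject"   # track is filtered out
    REVIEW = "review"   # track is sent to relevant personnel for feedback


def judge_track(first_last_iou, representative_score,
                iou_threshold, score_threshold, manual_review=False):
    """Combine the position-offset check with the confidence-score check.

    All parameter names are hypothetical. `manual_review` enables the
    alternative embodiment in which a confident but static track is not
    rejected outright but flagged for human feedback.
    """
    # Step S423: a small IoU between first and last boxes means a large offset.
    moved = first_last_iou < iou_threshold
    confident = representative_score > score_threshold

    if moved and confident:
        return Verdict.ACCEPT
    if confident and not moved and manual_review:
        return Verdict.REVIEW
    return Verdict.REJECT
```

In this sketch a `REVIEW` verdict only marks the track; sending the prompt information and handling the feedback are left to the caller.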

In one embodiment, the above target tracking method is described in detail, taking its application to a pedestrian tracking task as an example. The method comprises the following steps:

Acquire a video stream containing pedestrians, track the pedestrians using a target tracking algorithm, and obtain the track candidate results of target tracking, namely the candidate target tracking tracks. In one embodiment, the candidate target tracking tracks are placed in a cache.

Read the related information in each candidate target tracking track, and evaluate the face filtering condition, the human body filtering condition and the general filtering condition respectively.

1. Face filtering condition (triggered when the candidate target tracking track contains a face)

1) Highest face score of the track: if the track contains a face, the highest face confidence score of the track is taken, the face confidence scores being those output by the target tracking detection task. When the highest face confidence score of the track is greater than the highest-score preset threshold, the face filtering condition may be passed.

2) Lowest face score of the track: when the lowest face confidence score of the track is greater than the lowest-score preset threshold, the face filtering condition may be passed.

3) Average face score of the track: when the average face confidence score of the track is greater than the average-score preset threshold, the face filtering condition may be passed.

2. Human body filtering condition

1) Highest human body score of the track: when the highest human body confidence score of the candidate target tracking track is greater than the highest-score preset threshold, the human body filtering condition may be passed.

2) Lowest human body score of the track: when the lowest human body confidence score of the candidate target tracking track is greater than the lowest-score preset threshold, the human body filtering condition may be passed.

3) Average human body score of the track: when the average human body confidence score of the candidate target tracking track is greater than the average-score preset threshold, the human body filtering condition may be passed.

3. General filtering condition

1) Track length: when the track length is greater than a preset length threshold, the general filtering condition may be passed.

2) Position offset between the first frame and the last frame of the track: when the position of a candidate target tracking track barely changes, the track tends to correspond to a static object, such as a garbage can, a fire hydrant or a tree, rather than a pedestrian. A filtering condition is therefore set for such tracks. Specifically, the human body detection box in the first frame of the track and the human body detection box in the last frame of the track are taken, and the IoU (Intersection over Union) between the two boxes is calculated to measure the position offset. When the calculated IoU is less than the preset IoU threshold, the general filtering condition may be passed.

In the target tracking method, the candidate target tracking tracks meeting all the filtering conditions are pushed out, where pushing out means taking a track out of the cache and adding it to the final result set. If any filtering condition is not met, the track cannot be pushed out; only when all filtering conditions are met is the track pushed out as the final target tracking result of the target object. In this way, low-quality results among the candidate results can be filtered out quickly, and because the target tracking detection task itself is not affected, its speed is preserved.
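The three groups of filtering conditions and the push-out step could be sketched as below. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the `Track` and `Thresholds` classes, their field names, and all default threshold values are hypothetical, and the first-frame/last-frame IoU is assumed to have been computed beforehand.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class Track:
    length: int                 # track length, assumed to be a frame count
    body_scores: List[float]    # human body confidence scores
    first_last_iou: float       # IoU of first-frame and last-frame body boxes
    face_scores: List[float] = field(default_factory=list)  # empty if no face


@dataclass
class Thresholds:
    # All values are placeholders; the application leaves them to be preset.
    face_max: float = 0.8
    face_min: float = 0.3
    face_mean: float = 0.5
    body_max: float = 0.8
    body_min: float = 0.3
    body_mean: float = 0.5
    min_length: int = 10
    iou: float = 0.7


def scores_pass(scores, t_max, t_min, t_mean):
    """Highest, lowest and average score must each exceed its threshold."""
    return max(scores) > t_max and min(scores) > t_min and mean(scores) > t_mean


def passes_all(track, th):
    # 1. Face filtering condition: only triggered when the track has a face.
    if track.face_scores and not scores_pass(
            track.face_scores, th.face_max, th.face_min, th.face_mean):
        return False
    # 2. Human body filtering condition.
    if not track.body_scores or not scores_pass(
            track.body_scores, th.body_max, th.body_min, th.body_mean):
        return False
    # 3. General filtering condition: long enough and not static.
    return track.length > th.min_length and track.first_last_iou < th.iou


def push_out(cache, th):
    """Return the tracks in the cache that satisfy every filtering condition."""
    return [t for t in cache if passes_all(t, th)]
```

Because the filter only reads per-track summary values from the cache, it runs after the tracking detection task and does not change that task's own processing.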

It should be understood that, although the steps in the flowcharts involved in the above embodiments are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 7, there is provided a target tracking apparatus including: a candidate trajectory acquisition module 710, a confidence reading module 720, a filtering module 730, and a result determination module 740, wherein:

a candidate trajectory acquisition module 710, configured to acquire a candidate target tracking trajectory output by the target tracking detection task;

a confidence reading module 720, configured to read a target feature confidence score of a target object in a candidate target tracking trajectory;

a filtering module 730, configured to determine whether each candidate target tracking track meets a preset quality filtering condition based on the target feature confidence score;

and the result determining module 740 is configured to determine the candidate target tracking trajectory meeting the preset quality filtering condition as a final target tracking result of the target object.

The target tracking device acquires a candidate target tracking track output by a target tracking detection task; and reading the target feature confidence score of the target object in each candidate target tracking track, judging whether the candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score, and finally determining the candidate target tracking track meeting the preset quality filtering condition as a final target tracking result of the target object. According to the method, for the candidate results output by the target tracking detection task, whether the preset quality filtering condition is met or not is judged according to the confidence scores of the target objects in the candidate results, low-quality results in the candidate results can be filtered out quickly, and the method can still ensure the speed of the target tracking detection task due to no influence on the target tracking detection task.

In one embodiment, the filtering module 730 of the above apparatus comprises: the representative score determining unit is used for determining a representative confidence score of the candidate target tracking track according to the target feature confidence score; and the judging unit is used for judging that the candidate target tracking track meets the preset quality filtering condition if the representative confidence score of the candidate target tracking track is larger than the corresponding preset score threshold.

In one embodiment, the target object comprises a pedestrian; the target feature confidence scores include: a face confidence score and a human confidence score.

In one embodiment, the representative score determining unit of the above apparatus is further configured to: and determining the highest confidence score, the lowest confidence score and the average confidence score of the candidate target tracking track according to the target feature confidence scores.

In one embodiment, the filtering module 730 of the above apparatus further comprises: a track length reading unit for reading the track length of the candidate target tracking track; the judging unit is further used for judging that the candidate target tracking track meets the preset quality filtering condition if the track length is larger than the preset length threshold and the representative confidence score of the candidate target tracking track is larger than the corresponding preset score threshold.

In one embodiment, the filtering module 730 of the above apparatus further comprises: the position reading unit is used for reading the position of the first frame and the position of the last frame in the candidate target tracking track; an offset calculation unit for calculating a position offset between the first frame position and the last frame position; the judging unit is further configured to judge that the candidate target tracking trajectory meets a preset quality filtering condition if the position deviation is greater than a preset deviation threshold and the representative confidence score of the candidate target tracking trajectory is greater than a corresponding preset score threshold.

In one embodiment, the offset calculation unit of the above apparatus includes: a target detection area reading subunit, used for respectively reading the target detection areas corresponding to the target object in the first frame and the last frame; and an intersection-over-union calculating subunit, used for calculating the intersection over union of the target detection area in the first frame and the target detection area in the last frame. The judging unit is further used for judging that the position deviation is larger than the preset deviation threshold if the intersection over union is smaller than the preset intersection-over-union threshold.

For specific limitations of the target tracking device, reference may be made to the above limitations of the target tracking method, which are not described herein again. The modules in the target tracking device can be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a method of object tracking. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:

acquiring a candidate target tracking track output by a target tracking detection task; reading a target feature confidence score of a target object in a candidate target tracking track; determining whether each candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score; and determining the candidate target tracking track meeting the preset quality filtering condition as a final target tracking result of the target object.

In one embodiment, the processor, when executing the computer program, further performs the steps of: determining a representative confidence score of the candidate target tracking trajectory according to the target feature confidence score; and if the representative confidence score of the candidate target tracking track is larger than the corresponding preset score threshold, judging that the candidate target tracking track meets the preset quality filtering condition.

In one embodiment, the processor, when executing the computer program, further performs the steps of: and determining the highest confidence score, the lowest confidence score and the average confidence score of the candidate target tracking track according to the target feature confidence scores.

In one embodiment, the processor, when executing the computer program, further performs the steps of: reading the track length of a candidate target tracking track; and if the track length is greater than the preset length threshold value and the representative confidence score of the candidate target tracking track is greater than the corresponding preset score threshold value, judging that the candidate target tracking track meets the preset quality filtering condition.

In one embodiment, the processor, when executing the computer program, further performs the steps of: reading the first frame position and the last frame position in the candidate target tracking track; calculating the position offset between the first frame position and the last frame position; and if the position deviation is greater than a preset deviation threshold value and the representative confidence score of the candidate target tracking track is greater than a corresponding preset score threshold value, judging that the candidate target tracking track meets a preset quality filtering condition.

In one embodiment, the processor, when executing the computer program, further performs the steps of: respectively reading target detection areas corresponding to the target object in the first frame and the last frame; calculating the intersection over union of the target detection area in the first frame and the target detection area in the last frame; and if the intersection over union is smaller than a preset intersection-over-union threshold, judging that the position deviation is larger than a preset deviation threshold.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:

acquiring a candidate target tracking track output by a target tracking detection task; reading a target feature confidence score of a target object in a candidate target tracking track; determining whether each candidate target tracking track meets a preset quality filtering condition or not based on the target feature confidence score; and determining the candidate target tracking track meeting the preset quality filtering condition as a final target tracking result of the target object.

In one embodiment, the computer program when executed by the processor further performs the steps of: determining a representative confidence score of the candidate target tracking trajectory according to the target feature confidence score; and if the representative confidence score of the candidate target tracking track is larger than the corresponding preset score threshold, judging that the candidate target tracking track meets the preset quality filtering condition.

In one embodiment, the computer program when executed by the processor further performs the steps of: and determining the highest confidence score, the lowest confidence score and the average confidence score of the candidate target tracking track according to the target feature confidence scores.

In one embodiment, the computer program when executed by the processor further performs the steps of: reading the track length of a candidate target tracking track; and if the track length is greater than the preset length threshold value and the representative confidence score of the candidate target tracking track is greater than the corresponding preset score threshold value, judging that the candidate target tracking track meets the preset quality filtering condition.

In one embodiment, the computer program when executed by the processor further performs the steps of: reading the first frame position and the last frame position in the candidate target tracking track; calculating the position offset between the first frame position and the last frame position; and if the position deviation is greater than a preset deviation threshold value and the representative confidence score of the candidate target tracking track is greater than a corresponding preset score threshold value, judging that the candidate target tracking track meets a preset quality filtering condition.

In one embodiment, the computer program when executed by the processor further performs the steps of: respectively reading target detection areas corresponding to the target object in the first frame and the last frame; calculating the intersection over union of the target detection area in the first frame and the target detection area in the last frame; and if the intersection over union is smaller than a preset intersection-over-union threshold, judging that the position deviation is larger than a preset deviation threshold.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of the present specification.

The above examples only express several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A method of target tracking, the method comprising:

2. The method of claim 1, wherein determining whether each of the candidate target tracking trajectories satisfies a predetermined quality filtering condition based on a target feature confidence score comprises:

3. The method of claim 2, wherein:

the target object comprises a pedestrian; the target feature confidence score comprises: a face confidence score and a human confidence score.

4. The method of claim 2, wherein the determining a representative confidence score for the candidate target tracking trajectory from the target feature confidence score comprises:

5. The method according to any one of claims 2 to 4, wherein the determining whether each candidate target tracking trajectory satisfies a preset quality filtering condition based on the target feature confidence score further comprises:

reading the track length of the candidate target tracking track;

6. The method according to any one of claims 2 to 4, wherein the determining whether each candidate target tracking trajectory satisfies a preset quality filtering condition based on the target feature confidence score further comprises:

7. The method of claim 6, wherein calculating the position offset between the first frame position and the last frame position comprises:

8. An object tracking apparatus, characterized in that the apparatus comprises:

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.