WO2021017291A1 - Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort - Google Patents

Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort Download PDF

Info

Publication number
WO2021017291A1
WO2021017291A1 (application PCT/CN2019/117801, CN2019117801W)
Authority
WO
WIPO (PCT)
Prior art keywords
target
darkflow
deepsort
target tracking
model
Prior art date
Application number
PCT/CN2019/117801
Other languages
English (en)
French (fr)
Inventor
王义文
郑权
王健宗
曹靖康
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021017291A1 publication Critical patent/WO2021017291A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • This application relates to the technical field of intelligent decision-making, and more specifically, to a method, device and storage medium for tracking and detecting multiple targets based on Darkflow-DeepSort.
  • Visual target tracking methods are widely used in human-computer interaction, unmanned driving and other fields. Tracking methods based on correlation filters (Correlation Filter) and convolutional neural networks (CNN) now dominate the target tracking field.
  • Among existing multi-target tracking methods, the SORT method (Simple Online and Realtime Tracking) has achieved good results. Its biggest feature is that it implements target detection efficiently and uses a Kalman filter for filtering and the Hungarian algorithm for tracking.
  • DeepSort is an improvement on SORT target tracking. The original DeepSort trains a high-performance Faster-RCNN model for target detection; compared with SORT it reduces ID switches by 45% and incorporates deep appearance information, which greatly improves the tracking of occluded targets. It raises FP and achieves state-of-the-art online tracking performance. However, DeepSort tracking with this method reaches at most about 15 fps, the average is only about 10 fps, and real-time tracking is stable at only about 8 fps.
  • the purpose of this application is to provide a method, device and storage medium for tracking and detecting multiple targets based on Darkflow-DeepSort.
  • A multi-target tracking detection method based on Darkflow-DeepSort, applied to an electronic device, includes the following steps: S110, use the YOLOv3 algorithm to train a Darkflow-based target detection model; S120, input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video; S130, input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, where the target tracking model is trained on the multi-target detection data set MOT16 Challenge; S140, use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
  • A multi-target tracking detection system based on Darkflow-DeepSort includes a Darkflow-based target detection model acquisition unit, a DeepSort-based target tracking model acquisition unit, and a tracking result acquisition unit. The Darkflow-based target detection model acquisition unit is used to train a Darkflow-based target detection model with the YOLOv3 algorithm and to input the detection image into the trained model to obtain the apparent features of multiple targets. The DeepSort-based target tracking model acquisition unit is used to input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge. The tracking result acquisition unit is used to apply the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video.
  • An electronic device includes a memory, a processor, and a computer program that is stored in the memory and can run on the processor to perform the Darkflow-DeepSort-based multi-target tracking detection method. When the computer program is executed by the processor, the following steps are implemented: S110, use the YOLOv3 algorithm to train a Darkflow-based target detection model; S120, input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video; S130, input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the model being trained on the multi-target detection data set MOT16 Challenge; S140, use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
  • A computer-readable storage medium stores a computer program that includes a Darkflow-DeepSort-based multi-target tracking detection program; when this program is executed by a processor, the steps of the above Darkflow-DeepSort-based multi-target tracking detection method are implemented.
  • This application uses the Kalman filter of the single-hypothesis tracking method and frame-by-frame data association to realize multi-target tracking in surveillance video, and combines the YOLOv3 algorithm with the Kalman filter. This tracks multiple targets with high accuracy while avoiding the drawback of multi-hypothesis algorithms, whose computational load grows exponentially with the number of measurements and targets.
  • The target detection image model locates the moving target in the collected moving-target images. For continuously obtained frames, that is, for video, locating the moving target in every frame achieves tracking and detection of moving-target behavior. Because the YOLOv3 algorithm processes pictures very quickly, under the same conditions a target detection model trained on the YOLOv3 algorithm processes images faster than models trained with existing convolutional neural network algorithms (for example, about 1000 times faster than R-CNN and 100 times faster than Fast-RCNN).
  • The YOLOv3 algorithm is easy to port and can be implemented under various operating systems. It has relatively low requirements for terminal hardware, so the target detection model can easily run on lightweight devices.
  • Extracting the apparent features of the targets to be tracked for nearest-neighbor matching improves tracking under occlusion and reduces target ID switching. When tracking targets in video with this method, a 25 fps source video can be processed at 15 fps without frame skipping; with every-third-frame sampling, more than 20 fps can be reached at best without losing the tracked targets, and real-time camera tracking can reach more than 14 fps, increasing detection speed 100-fold while preserving accuracy.
  • For real-time recording and broadcasting scenarios, this application can accurately locate and rapidly recognize moving-target features at the same accuracy, improving recognition speed and precision in the video field and reducing the delay and lag of the recording and broadcasting system.
  • FIG. 1 is a flowchart of a multi-target tracking detection method based on Darkflow-DeepSort according to an embodiment of the present application;
  • FIG. 2 is a flowchart of the tracking method of the target tracking model according to an embodiment of the present application;
  • FIG. 3 is a schematic diagram of the process of converting the Darknet network structure into a Python model structure according to an embodiment of the present application;
  • FIG. 4 is a schematic diagram of the Darkflow-based network structure according to an embodiment of the present application;
  • FIG. 5 is a flow-framework diagram of an existing tracking method for target tracking according to an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a multi-target tracking detection system based on Darkflow-DeepSort according to an embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of an electronic device for multi-target tracking detection based on Darkflow-DeepSort according to an embodiment of the present application.
  • The multi-target tracking detection method based on Darkflow-DeepSort includes a target detection stage and a target tracking stage and involves two models, Darkflow and DeepSort. The Darkflow model is mainly used to train samples for pedestrian detection; only the tracking part of the DeepSort model is used, such as the Kalman filter and trajectory confirmation. On the basis of extracting spatial features with a convolutional neural network, the method uses a Kalman filter to learn the motion pattern of the target, fuses the target's features, predicts the target's location, and combines temporal and spatial information to compute target similarity for matching, thereby achieving target tracking.
  • FIG. 1 shows the flow of the method for tracking and detecting multiple targets based on Darkflow-DeepSort in an embodiment of the present application.
  • the multi-target tracking detection method based on Darkflow-DeepSort includes the following steps:
  • S110: use the YOLOv3 algorithm to train a Darkflow-based target detection model. The Darkflow model is based on the YOLOv3 algorithm and is trained with a defined binary cross-entropy. The loss function of the YOLOv3 algorithm has two parts: the first is the category error and the second is the object position error, both defined as binary cross-entropy; the sum of the squared differences of the two types of errors is taken as the total error function.
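  • As a minimal illustration of this two-part objective, the Python sketch below (not from the patent itself) combines a category term and a position term, each expressed as binary cross-entropy; the box encoding and the way the two terms are squared and summed are assumptions, since the patent does not spell them out.

      import numpy as np

      def binary_cross_entropy(y_true, y_pred, eps=1e-7):
          # standard BCE, clipped for numerical stability
          y_pred = np.clip(y_pred, eps, 1 - eps)
          return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

      def yolo_total_loss(cls_true, cls_pred, box_true, box_pred):
          # category error and object position error, both as BCE; the patent
          # describes taking the sum of the squared differences of the two
          # error types as the total error function
          cls_err = binary_cross_entropy(cls_true, cls_pred)
          box_err = binary_cross_entropy(box_true, box_pred)
          return cls_err ** 2 + box_err ** 2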
  • YOLOv3 (You Only Look Once v3) is a target detection algorithm based on Darknet-53. Compared with other deep learning algorithms, its biggest performance gain is faster detection speed, which is why this method is adopted for target detection in the multi-target tracking detection of this application.
  • In the target detection stage, the Darkflow model trained with the YOLOv3 algorithm performs detection, with the Darkflow network structure as the detection framework; in the target tracking stage, a Python model completes the tracking.
  • The detection image is obtained by decoding the surveillance video. An example: the customary way to decode video is at fixed frame intervals. On the basis of extracting 4 frames per second, if the video is 24 fps the interval is 6 frames, and OpenCV decodes images from the video in real time according to this interval.
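  • A minimal OpenCV sketch of this interval decoding, assuming the cv2 package and a default of 4 extracted frames per second (the function name and defaults are illustrative, not from the patent):

      import cv2

      def decode_every_n(video_path, frames_per_second=4):
          cap = cv2.VideoCapture(video_path)
          fps = cap.get(cv2.CAP_PROP_FPS) or 24.0            # e.g. a 24 fps source
          interval = max(1, round(fps / frames_per_second))  # 24 / 4 = 6
          frames, idx = [], 0
          while True:
              ok, frame = cap.read()
              if not ok:
                  break
              if idx % interval == 0:                        # keep every 6th frame
                  frames.append(frame)
              idx += 1
          cap.release()
          return frames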
  • Apparent features are position information and spatial features. Further, the Darkflow-based target detection model is a Python model; under the target detection module sits a deep feature descriptor, and this deep feature descriptor extracts the apparent features.
  • Illustratively, 8 parameters (u, v, γ, h, u̇, v̇, γ̇, ḣ) describe the motion state, where (u, v) is the center coordinate of the bounding box, γ is the aspect ratio and h is the height; the remaining four variables are the corresponding velocities in the image coordinate system.
  • the apparent feature of the bounding box is a 128-dimensional feature obtained through a deep network.
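  • A minimal constant-velocity Kalman filter over this eight-dimensional state can be sketched as below (Python with numpy; the noise covariances Q and R are left to the caller, and dt = 1 frame is an assumption):

      import numpy as np

      dt = 1.0                        # one frame per step
      # state x = [u, v, g, h, du, dv, dg, dh]^T, with g the aspect ratio
      F = np.eye(8)
      for i in range(4):
          F[i, i + 4] = dt            # constant-velocity transition
      H = np.eye(4, 8)                # only (u, v, g, h) is observed

      def kf_predict(x, P, Q):
          return F @ x, F @ P @ F.T + Q

      def kf_update(x, P, z, R):
          S = H @ P @ H.T + R                    # innovation covariance
          K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
          x = x + K @ (z - H @ x)
          P = (np.eye(8) - K @ H) @ P
          return x, P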
  • The target tracking model determines the position information of the target in video frames. Relatively stable statistical features or certain invariant features must be extracted through a corresponding apparent-feature description method, and the filter response over the target candidate region is used as the criterion for judging the target position.
  • the target tracking model based on DeepSort is trained on the public multi-target detection data set MOT16 Challenge.
  • the training set itself is the competition data provided by MOT16 Challenge.
  • the data set MOT16 Challenge is divided into a training set and a test set according to a ratio of 8:2.
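  • The 8:2 split can be done at the sequence level, for example with the hedged sketch below (the sequence-level granularity and the fixed seed are assumptions; MOT16 itself ships with predefined train and test sequences):

      import random

      def split_sequences(sequences, train_ratio=0.8, seed=0):
          seqs = list(sequences)
          random.Random(seed).shuffle(seqs)      # reproducible shuffle
          cut = int(len(seqs) * train_ratio)     # 8:2 split point
          return seqs[:cut], seqs[cut:]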
  • The DeepSort-based target tracking model is built on a Kalman filter: the Kalman filter builds the tracking model, Darkflow performs target detection to determine the apparent features for matching, and the positioning information is fed into the Kalman filter for tracking.
  • Fig. 2 shows the flow of the working method of the DeepSort-based target tracking model in an embodiment of the present application.
  • As shown in Fig. 2, the core idea of DeepSort is single-hypothesis tracking, which uses a recursive Kalman filter and frame-by-frame data association to realize the multi-target tracking process.
  • DeepSort introduces a deep learning model trained offline on a pedestrian re-identification data set (a ReID data set containing more than 1.1 million images of 1,261 people, well suited to pedestrian tracking).
  • The DeepSort-based target tracking model in this application uses the DeepSort model with its target detection part removed.
  • In the target tracking stage, the process framework of the tracking method is shown in Figure 5.
  • the prediction of the target is achieved through the Kalman filter.
  • A typical application of the Kalman filter is predicting the coordinates and velocity of an object's position from a finite sequence of noisy (possibly biased) observations of that position. It appears in many engineering applications (such as radar and computer vision) and is also an important topic in control theory and control systems engineering. For radar, for example, the interest is in tracking targets, but measurements of a target's position, velocity and acceleration are noisy at all times. The Kalman filter uses the target's dynamic information to remove the influence of the noise and obtain a good estimate of the target's position. This estimate can be of the current position (filtering), a future position (prediction), or a past position (interpolation or smoothing).
  • The apparent features obtained by the target detection model are matched by nearest neighbor via the Kalman filter. Nearest-neighbor matching finds the closest feature by feature distance to complete the match: the Kalman filter predicts the feature's position, which is then matched against the actually detected target position.
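  • A greedy nearest-neighbor sketch over the 128-dimensional appearance features (cosine distance and the 0.4 threshold are assumptions; DeepSort proper solves the assignment with the Hungarian algorithm rather than greedily):

      import numpy as np

      def nearest_neighbor_match(track_feats, det_feats, max_dist=0.4):
          # rows: tracks / detections, columns: 128-d appearance features
          t = track_feats / np.linalg.norm(track_feats, axis=1, keepdims=True)
          d = det_feats / np.linalg.norm(det_feats, axis=1, keepdims=True)
          dist = 1.0 - t @ d.T                   # cosine distance matrix
          matches = []
          for ti in range(dist.shape[0]):
              di = int(np.argmin(dist[ti]))      # nearest detection
              if dist[ti, di] < max_dist:
                  matches.append((ti, di))
          return matches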
  • S210: obtain the motion matching degree and the apparent feature matching degree of the multiple targets, where the motion matching degree is computed from the motion similarity of the multiple targets obtained by the Kalman filter, and the apparent feature matching degree is computed from the apparent features of the multiple targets. S220: using the motion matching degree and the apparent feature matching degree, perform frame-by-frame data association on the surveillance video to obtain the matching degree of each target frame. S230: select the target frames whose final matching degree reaches the preset matching parameters as the target tracking result. That is, targets are paired with trackers, successfully and unsuccessfully paired tracks are updated, and tracks that fail the conditions are deleted; the targets are then counted and their trajectories drawn, completing the tracking action.
  • the tracking object of the target frame can be a person, an animal or other moving objects.
  • the target frame can be called a human body frame.
  • Multi-target tracking is completed by judging the matching degree of the target frame. The judgment has two parts: IOU matching, performed between two successive detections, and apparent feature matching, in which a network extracts an apparent feature vector and the current tracked target is compared with potential matches by taking the minimum of the normalized average distances between the two successive feature vectors; the apparent feature matching value equals 1 minus this minimum.
  • the final matching degree is equal to the average value of the IOU matching value and the apparent feature matching value, that is, the final matching degree is equal to (IOU matching value + apparent feature matching value)/2.
  • the preset matching parameters are that the final matching degree is greater than 0.5 and the IOU matching value is greater than 0.5; if the preset matching parameters are reached, the matching is successful and used for tracking; otherwise, the matching is determined to be unsuccessful.
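  • The combination rule above can be sketched as follows (boxes in (x1, y1, x2, y2) corner format are an assumption):

      def iou(a, b):
          # intersection-over-union of two corner-format boxes
          ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
          ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
          inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
          union = ((a[2] - a[0]) * (a[3] - a[1])
                   + (b[2] - b[0]) * (b[3] - b[1]) - inter)
          return inter / union if union > 0 else 0.0

      def match_ok(iou_value, appearance_value):
          final = (iou_value + appearance_value) / 2   # final matching degree
          return final > 0.5 and iou_value > 0.5       # preset matching parameters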
  • As a concrete example of updating matched and unmatched tracks and deleting tracks that fail the conditions: a standard Kalman filter predicts the target motion state, where the Kalman filter is based on a constant-velocity model (velocity is assumed constant by default, i.e., a model without acceleration) and a linear observation model. The Kalman prediction is (u, v, γ, h). For each tracked target, the number of frames a_k since its detection result last matched its tracking result is recorded; once a target's detection result is correctly associated with its tracking result, a_k is reset to 0. Recording here means an external recorder or array stores the tracking data of every target in every frame.
  • The Kalman filter relies only on the input positions of the targets to make predictions. Each prediction is compared with the actual detected value; if the observation differs too much from the prediction, the prediction cannot represent the observation.
  • Amax is an upper limit, and a_k is the number of frames in which the Kalman prediction has failed to match the observation. If a_k exceeds Amax, the Kalman filter is no longer tracking well: the tracking process for that target is considered ended and is not continued. In other words, tracking ends when, after a target has been tracked, the Kalman filter can no longer accurately predict its new position.
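  • A small track-lifecycle sketch of the a_k / Amax rule (the max_age value is an assumption; the patent does not fix Amax):

      class Track:
          def __init__(self, max_age=30):       # Amax, an assumed upper bound
              self.age_since_update = 0         # a_k in the text
              self.max_age = max_age

          def mark_matched(self):
              self.age_since_update = 0         # reset a_k on correct association

          def mark_missed(self):
              self.age_since_update += 1
              # False means a_k > Amax: stop tracking this target
              return self.age_since_update <= self.max_age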
  • A new target is judged to have appeared when a target in some detection result cannot be associated with any existing tracker (an existing tracker is one that was created from an earlier detection and is currently tracking a target).
  • For the apparent features of the multiple targets obtained in step S120, targets whose number of appearances exceeds a set threshold are screened out and given priority through cascade matching. The threshold on the number of appearances is generally set to 3.
  • Giving priority to frequently appearing targets through cascade matching addresses targets that are occluded for a long time: during a long occlusion, the uncertainty of the Kalman prediction grows greatly and observability in the state space drops greatly. If two trackers then compete for the same detection, the trajectory with the longer occlusion often has the smaller Mahalanobis distance, making the detection more likely to associate with the longer-occluded trajectory; this undesirable effect often destroys the continuity of tracking.
  • Cascade matching means combining several matching methods (such as IOU matching or feature matching) and matching in cascade (one matching method after another); or, alternatively, first adding a selection criterion and then performing the corresponding matching.
  • The second cascade style is adopted here: a selection criterion is added first, and then the corresponding matching is performed. A time-point sequence is added so that frequently appearing targets are selected first and enter the matching mechanism; targets occluded for a long time are thus harder to match first, i.e., more frequently appearing targets receive priority.
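  • A hedged sketch of such an age-ordered cascade (the match_fn interface and max_age are assumptions; in DeepSort the per-level matcher is the gated Hungarian assignment):

      def matching_cascade(tracks, detections, match_fn, max_age=30):
          unmatched = list(range(len(detections)))
          matches = []
          for age in range(max_age + 1):         # recently updated tracks first
              cand = [i for i, t in enumerate(tracks)
                      if t.age_since_update == age]
              if not cand or not unmatched:
                  continue
              # match_fn pairs candidate tracks with the remaining detections
              m, unmatched = match_fn(cand, unmatched)
              matches.extend(m)
          return matches, unmatched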
  • Figure 3 shows the process of converting the Darknet network structure into a Python model structure in an embodiment of the present application: the process by which Darkflow translates Darknet into the flow used by TensorFlow. Through Cython, the original C-based Darknet network structure is converted into a Python model structure convenient for DeepSort to use; a pb model structure usable by TensorFlow can also be generated for other algorithms.
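  • In terms of the darkflow Python package, loading a Darknet .cfg and .weights pair and running detection looks roughly as follows (the file paths and threshold are illustrative; TFNet and return_predict are the package's documented entry points):

      import cv2
      from darkflow.net.build import TFNet      # thtrieu/darkflow package

      options = {"model": "cfg/yolo.cfg",       # Darknet network definition
                 "load": "bin/yolo.weights",    # original Darknet weights
                 "threshold": 0.5}
      tfnet = TFNet(options)                    # builds the TensorFlow graph

      result = tfnet.return_predict(cv2.imread("frame.jpg"))
      # each item: {"label": ..., "confidence": ...,
      #             "topleft": {...}, "bottomright": {...}}
      tfnet.savepb()                            # optionally export a .pb model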
  • Figure 4 shows the Darkflow-based network structure of an embodiment of the application
  • In the Darkflow network structure, the padding of all convolutional layers is 1 and all pooling layers are max pooling.
  • Other parameters such as stride, kernel size and number of filters are shown in the figure.
  • The network begins with a convolutional layer of 32 (3×3) filters, followed by max pooling with stride 2 and pool size 2; then a convolutional layer of 64 (3×3) filters, again followed by max pooling with stride 2 and size 2. The subsequent structures are similar: first a convolutional layer with N (3×3) filters, where N is twice the filter count of the previous large convolution block, then a (1×1) convolution with N/2 filters, then another (3×3) convolution with N filters, and finally a max pooling, forming one large convolution block. This block is repeated 4 times in total; in the last repetition the pooling layer is dropped and two corresponding convolutional layers are appended.
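  • A rough Keras sketch of this pattern (the input size, per-block filter counts, and the two appended layers are assumptions; Darknet-style blocks also use batch normalization and leaky ReLU):

      from tensorflow.keras import layers, models

      def conv(x, filters, size):
          # padding=1 for 3x3 kernels corresponds to 'same' padding
          x = layers.Conv2D(filters, size, padding="same", use_bias=False)(x)
          x = layers.BatchNormalization()(x)
          return layers.LeakyReLU(0.1)(x)

      inp = layers.Input((416, 416, 3))
      x = conv(inp, 32, 3)
      x = layers.MaxPool2D(2, 2)(x)              # stride 2, pool size 2
      x = conv(x, 64, 3)
      x = layers.MaxPool2D(2, 2)(x)
      for n in (128, 256, 512, 1024):            # N doubles each large block
          x = conv(x, n, 3)
          x = conv(x, n // 2, 1)
          x = conv(x, n, 3)
          if n != 1024:                          # last block drops the pooling
              x = layers.MaxPool2D(2, 2)(x)
      x = conv(x, 1024, 3)                       # two appended conv layers
      x = conv(x, 1024, 3)
      model = models.Model(inp, x)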
  • Fig. 6 is a schematic structural diagram of a multi-target tracking and detection system based on Darkflow-DeepSort provided by an embodiment of the present application.
  • Corresponding to the above method, the present application also includes a Darkflow-DeepSort-based multi-target tracking detection system 600, comprising a Darkflow-based target detection model acquisition unit 610, a DeepSort-based target tracking model acquisition unit 620, and a tracking result acquisition unit 630. The Darkflow-based target detection model acquisition unit 610 is used to train a Darkflow-based target detection model with the YOLOv3 algorithm and to input the detection image into the trained model to obtain the apparent features of multiple targets; the DeepSort-based target tracking model acquisition unit 620 is used to input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the model being trained on the multi-target detection data set MOT16 Challenge; the tracking result acquisition unit 630 is used to apply the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video. The Darkflow-based target detection model acquisition unit 610 includes a network structure conversion module 611, which converts the Darknet network structure through Cython to obtain the Darkflow-based target detection model.
  • The tracking result acquisition unit 630 includes a first matching degree acquisition module 631, a second matching degree acquisition module 632, and a tracking result determination module 633. The first matching degree acquisition module 631 obtains the motion matching degree and the apparent feature matching degree of the multiple targets, the motion matching degree being computed from the motion similarity obtained by the Kalman filter and the apparent feature matching degree from the targets' apparent features. The second matching degree acquisition module 632 uses these two matching degrees and the frame-by-frame data association over the surveillance video to obtain the IOU matching value and the apparent feature matching value, from which it computes the final matching degree of each target frame. The tracking result determination module 633 selects the target frames whose final matching degree reaches the preset matching parameters as the target tracking result.
  • The DeepSort-based target tracking model acquisition unit 620 includes an apparent feature acquisition module 623 and a priority assignment module 624. The apparent feature acquisition module 623 inputs the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets; the priority assignment module 624 screens, from the obtained apparent features, the targets whose number of appearances exceeds the set threshold and gives them priority through cascade matching.
  • The padding of the convolutional layers of the Darkflow-based target detection model is all 1, and all pooling layers are max pooling.
  • The apparent feature acquisition module 623 includes an apparent feature description sub-module, which describes the apparent features with (u, v, γ, h, u̇, v̇, γ̇, ḣ), where (u, v) is the center coordinate of the detected target, γ is the aspect ratio, h is the height, and the remaining four variables are the detected target's corresponding velocities in the image coordinate system.
  • The DeepSort-based target tracking model acquisition unit 620 includes a DeepSort-based target tracking model training module 621 and a DeepSort-based target tracking model testing module 622. The data set MOT16 Challenge is divided into a training set and a test set at a ratio of 8:2; the training module 621 trains the DeepSort-based target tracking model on the training set, and the testing module 622 tests it on the test set.
  • The specific functions of the multi-target detection unit and the multi-target tracking unit correspond one-to-one to the steps of the Darkflow-DeepSort-based multi-target tracking detection method of the embodiment and are not described one by one here.
  • FIG. 7 is a schematic diagram of the logical structure of an electronic device provided by an embodiment of the present application.
  • the electronic device 70 of this embodiment includes a processor 71, a memory 72 and a computer program 73 stored in the memory 72 and running on the processor 71.
  • the processor 71 executes the computer program 73, each step of the multi-target tracking detection method based on Darkflow-DeepSort in the embodiment is implemented, such as steps S110 to S140 shown in FIG. 1.
  • the processor 71 implements the functions of the modules/units in the foregoing device embodiments when executing the multi-target tracking detection method based on Darkflow-DeepSort.
  • the computer program 73 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 72 and executed by the processor 71 to complete the application.
  • One or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 73 in the electronic device 70.
  • the computer program 73 can be divided into a multi-target detection unit and a multi-target tracking unit, and its functions are described in detail in the embodiments, and will not be repeated here.
  • the electronic device 70 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the electronic device 70 may include, but is not limited to, a processor 71 and a memory 72.
  • FIG. 7 is only an example of the electronic device 70 and does not constitute a limitation on it; the device may include more or fewer components than shown, combine certain components, or use different components.
  • the electronic device may also include input and output devices, network access devices, buses, etc.
  • The processor 71 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 72 may be an internal storage unit of the electronic device 70, such as a hard disk or a memory of the electronic device 70.
  • The memory 72 may also be an external storage device of the electronic device 70, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the electronic device 70.
  • the memory 72 may also include both an internal storage unit of the electronic device 70 and an external storage device.
  • the memory 72 is used to store computer programs and other programs and data required by the electronic device.
  • the memory 72 can also be used to temporarily store data that has been output or will be output.
  • This embodiment provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium.
  • When the computer program is executed by a processor, it implements the Darkflow-DeepSort-based multi-target tracking detection method of the embodiment; to avoid repetition, the details are not restated here. Alternatively, when executed by the processor, the computer program realizes the functions of each module/unit in the Darkflow-DeepSort-based multi-target tracking detection system; again, the details are not restated here.
  • the disclosed device and method may be implemented in other ways.
  • The device embodiments described above are merely illustrative. The division into modules or units is only a logical functional division; in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • All or part of the processes in the methods of the above embodiments of this application may also be completed by a computer program instructing the relevant hardware.
  • the computer program can be stored in a computer-readable storage medium. When the program is executed by the processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.
  • The content contained in the computer-readable medium can be appropriately added or deleted in accordance with the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort, relating to the technical field of intelligent decision-making. The method includes the following steps: S110, use the YOLOv3 algorithm to train a Darkflow-based target detection model; S120, input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets; S130, input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge; S140, use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video. The method can increase the speed of multi-target tracking detection and complete multi-target tracking without losing detection accuracy.

Description

Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort
This application claims priority to the patent application with application number 201910701678.5, filed on July 31, 2019 and entitled "Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort".
Technical Field
This application relates to the technical field of intelligent decision-making, and more specifically to a multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort.
Background
Visual target tracking methods are widely used in human-computer interaction, unmanned driving and other fields. Tracking methods based on correlation filters (Correlation Filter) and convolutional neural networks (CNN) now dominate the target tracking field.
Among existing multi-target tracking methods, the SORT method (Simple Online and Realtime Tracking) has achieved good results. Its biggest feature is that it implements target detection efficiently and uses a Kalman filter for filtering and the Hungarian algorithm for tracking.
The applicant has realized that DeepSort is an improvement on SORT target tracking: the original DeepSort trains a high-performance Faster-RCNN model for target detection, reduces ID switches by 45% relative to the SORT algorithm, and incorporates deep appearance information, greatly improving the tracking of occluded targets; it raises FP and achieves state-of-the-art online tracking performance. However, DeepSort tracking with this method reaches at most about 15 fps, the average is only about 10 fps, and real-time tracking is stable at only about 8 fps.
Therefore, a multi-target tracking detection method that increases detection speed without losing detection accuracy is urgently needed.
Summary
In order to solve the above problems, the purpose of this application is to provide a multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort.
A multi-target tracking detection method based on Darkflow-DeepSort, applied to an electronic device, includes the following steps:
S110: use the YOLOv3 algorithm to train a Darkflow-based target detection model;
S120: input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video;
S130: input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge;
S140: use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
A multi-target tracking detection system based on Darkflow-DeepSort includes a Darkflow-based target detection model acquisition unit, a DeepSort-based target tracking model acquisition unit, and a tracking result acquisition unit. The Darkflow-based target detection model acquisition unit is used to train a Darkflow-based target detection model with the YOLOv3 algorithm and to input the detection image into the trained model to obtain the apparent features of multiple targets; the DeepSort-based target tracking model acquisition unit is used to input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge; the tracking result acquisition unit is used to apply the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video.
An electronic device includes a memory, a processor, and a computer program that is stored in the memory and can run on the processor to perform the Darkflow-DeepSort-based multi-target tracking detection method. When the computer program is executed by the processor, the following steps are implemented: S110, use the YOLOv3 algorithm to train a Darkflow-based target detection model; S120, input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video; S130, input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the model being trained on the multi-target detection data set MOT16 Challenge; S140, use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
According to another aspect of this application, a computer-readable storage medium is provided. The storage medium stores a computer program that includes a Darkflow-DeepSort-based multi-target tracking detection program; when this program is executed by a processor, the steps of the above Darkflow-DeepSort-based multi-target tracking detection method are implemented.
Using the above multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort, the following effects can be achieved:
1. This application uses the Kalman filter of the single-hypothesis tracking method and frame-by-frame data association to realize multi-target tracking in surveillance video, and combines the YOLOv3 algorithm with the Kalman filter, which both tracks multiple targets with high accuracy and avoids the drawback of multi-hypothesis algorithms, whose computational load grows exponentially with the number of measurements and targets.
2. The target detection image model locates the moving target in the collected moving-target images. For continuously obtained frames, that is, for video, locating the moving target in every frame achieves tracking and detection of moving-target behavior. Because the YOLOv3 algorithm processes pictures very quickly, under the same conditions a target detection model trained on the YOLOv3 algorithm processes images faster than models trained with existing convolutional neural network algorithms (for example, about 1000 times faster than R-CNN and 100 times faster than Fast-RCNN).
3. The YOLOv3 algorithm is easy to port, can be implemented under various operating systems, has relatively low requirements for terminal hardware, and can easily run the target detection model on lightweight devices.
4. Extracting the apparent features of the targets to be tracked for nearest-neighbor matching improves tracking under occlusion and reduces target ID switching.
5. When tracking targets in video with the method of this application, a video with an original frame rate of 25 fps can be processed at 15 fps without frame skipping; with every-third-frame sampling, more than 20 fps can be reached at best without losing the tracked targets. Real-time camera tracking can also reach more than 14 fps, increasing detection speed 100-fold while preserving accuracy.
6. For real-time recording and broadcasting scenarios, this application can accurately locate and rapidly recognize moving-target features at the same accuracy, improving recognition speed and precision in the video field and reducing the delay and lag of the recording and broadcasting system.
In order to achieve the above and related purposes, one or more aspects of this application include the features described in detail below. The following description and the accompanying drawings set forth certain exemplary aspects of this application in detail. However, these aspects indicate only some of the various ways in which the principles of this application may be used. In addition, this application is intended to include all such aspects and their equivalents.
Brief Description of the Drawings
Other objects and results of this application will become clearer and easier to understand with reference to the following description taken in conjunction with the accompanying drawings, and as this application is more fully understood. In the drawings:
FIG. 1 is a flowchart of a multi-target tracking detection method based on Darkflow-DeepSort according to an embodiment of this application;
FIG. 2 is a flowchart of the tracking method of the target tracking model according to an embodiment of this application;
FIG. 3 is a schematic diagram of the process of converting the Darknet network structure into a Python model structure according to an embodiment of this application;
FIG. 4 is a schematic diagram of the Darkflow-based network structure according to an embodiment of this application;
FIG. 5 is a flow-framework diagram of an existing tracking method for target tracking according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of a multi-target tracking detection system based on Darkflow-DeepSort according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of an electronic device for multi-target tracking detection based on Darkflow-DeepSort according to an embodiment of this application.
The same reference numerals throughout the drawings indicate similar or corresponding features or functions.
Detailed Description
In the following description, for purposes of illustration, many specific details are set forth to provide a thorough understanding of one or more embodiments. It is obvious, however, that these embodiments can also be implemented without these specific details. In other examples, well-known structures and devices are shown in block-diagram form to facilitate the description of one or more embodiments.
This application provides a multi-target tracking detection method, electronic device and storage medium based on Darkflow-DeepSort. The multi-target tracking detection method includes a target detection stage and a target tracking stage and involves two models, Darkflow and DeepSort: the Darkflow model is mainly used to train samples for pedestrian detection, while only the tracking part of the DeepSort model is used, such as the Kalman filter and trajectory confirmation. On the basis of extracting spatial features with a convolutional neural network, the method provided in this application uses a Kalman filter to learn the motion pattern of the target, fuses the target's features, predicts the target's position, and combines temporal and spatial information to compute target similarity for matching, achieving the purpose of target tracking.
FIG. 1 shows the flow of the multi-target tracking detection method based on Darkflow-DeepSort according to an embodiment of this application.
As shown in FIG. 1, the method includes the following steps:
S110: use the YOLOv3 algorithm to train a Darkflow-based target detection model. The Darkflow model is based on the YOLOv3 algorithm and is trained with a defined binary cross-entropy. The loss function of the YOLOv3 algorithm has two parts, the first being the category error and the second the object position error, both defined as binary cross-entropy; the sum of the squared differences of the two types of errors is taken as the total error function.
It should be noted that YOLOv3 (You Only Look Once v3) is a target detection algorithm based on Darknet-53. Compared with other deep learning algorithms, its biggest performance gain is its faster detection speed, which is why this method is adopted for target detection in the multi-target tracking detection of this application. In the target detection stage, the Darkflow model trained with the YOLOv3 algorithm performs detection, with the Darkflow network structure as the detection framework; in the target tracking stage, a Python model completes the tracking.
S120: input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets.
The detection image is obtained by decoding the surveillance video. An example: the customary way to decode video is at fixed frame intervals. On the basis of extracting 4 frames per second, if the video is 24 fps the interval is 6 frames, and OpenCV decodes images from the video in real time according to this interval.
Apparent features are position information and spatial features. Further, the Darkflow-based target detection model is a Python model; under the target detection module sits a deep feature descriptor, which extracts the apparent features. Illustratively, 8 parameters (u, v, γ, h, u̇, v̇, γ̇, ḣ) describe the motion state, where (u, v) is the center coordinate of the bounding box, γ is the aspect ratio and h is the height; the remaining four variables are the corresponding velocities in the image coordinate system. The apparent feature of a bounding box is a 128-dimensional feature obtained through a deep network.
S130: input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge. The target tracking model uses the DeepSort model with the target detection part removed, i.e., it comprises the Kalman filter and the subsequent cascade matching.
The target tracking model determines the position information of the target in video frames; relatively stable statistical features or certain invariant features must be extracted through a corresponding apparent-feature description method, and the filter response over the target candidate region is used as the criterion for judging the target position.
The DeepSort-based target tracking model is trained on the public multi-target detection data set MOT16 Challenge. The training set itself is the competition data provided by MOT16 Challenge, and the data set is divided into a training set and a test set at a ratio of 8:2.
Further, the DeepSort-based target tracking model is built on a Kalman filter: the Kalman filter builds the tracking model, Darkflow performs detection to determine the apparent features for matching, and the positioning information is fed into the Kalman filter for tracking.
S140: use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video. Multi-target tracking with the Kalman filter of the single-hypothesis tracking method both tracks multiple targets with high accuracy and avoids the drawback of multi-hypothesis algorithms, whose computational load grows exponentially with the number of measurements and targets.
FIG. 2 shows the flow of the working method of the DeepSort-based target tracking model according to an embodiment of this application.
As shown in FIG. 2, the core idea of DeepSort is single-hypothesis tracking, which uses a recursive Kalman filter and frame-by-frame data association to realize the multi-target tracking process. It should be noted that DeepSort introduces a deep learning model trained offline on a pedestrian re-identification data set (a ReID data set containing more than 1.1 million images of 1,261 people, well suited to pedestrian tracking). The DeepSort-based target tracking model in this application uses the DeepSort model with its target detection part removed.
In the target tracking stage, the process framework of the tracking method is shown in FIG. 5. A target frame is first initialized; many candidate frames are generated in the next frame; the features of these candidate frames are extracted and the candidates are scored; finally, the highest-scoring candidate is taken as the predicted target, or several predictions are fused to obtain a better predicted target. In this application, the prediction of the target is achieved through the Kalman filter.
A typical application of the Kalman filter is predicting the coordinates and velocity of an object's position from a finite sequence of noisy (possibly biased) observations of that position. It appears in many engineering applications (such as radar and computer vision) and is also an important topic in control theory and control systems engineering. For radar, for example, the interest lies in tracking targets, but measurements of the target's position, velocity and acceleration are noisy at all times; the Kalman filter uses the target's dynamic information to remove the influence of the noise and obtain a good estimate of the target's position. This estimate can be of the current position (filtering), a future position (prediction), or a past position (interpolation or smoothing). The apparent features of the targets obtained by the target detection model are matched by nearest neighbor through the Kalman filter; nearest-neighbor matching finds the closest feature by feature distance to complete the match. The position of the feature is predicted by the Kalman filter and then matched against the actually detected target position.
S210: obtain the motion matching degree and the apparent feature matching degree of the multiple targets, where the motion matching degree is computed from the motion similarity of the multiple targets obtained by the Kalman filter, and the apparent feature matching degree is computed from the apparent features of the multiple targets. S220: using the motion matching degree and the apparent feature matching degree of the multiple targets, perform frame-by-frame data association on the surveillance video to obtain the matching degree of each target frame. S230: select the target frames whose final matching degree reaches the preset matching parameters as the target tracking result. That is, targets are paired with trackers, successfully and unsuccessfully paired tracks are updated, tracks that fail the conditions are deleted, and the targets are then counted and their trajectories drawn, completing the tracking action.
It should be noted that the tracked object of a target frame can be a person, an animal, or another moving object; when the tracked object is a person, the target frame can be called a human-body frame.
In a specific embodiment, multi-target tracking is completed by judging the matching degree of the target frame. The judgment has two parts: IOU matching and apparent feature matching. IOU matching is performed between two successive detections; for apparent feature matching, a network extracts an apparent feature vector and the current tracked target is compared with potential matches by computing the minimum of the average distances between the two successive apparent feature vectors, where the apparent feature matching value equals 1 minus the minimum normalized average distance.
The final matching degree equals the average of the IOU matching value and the apparent feature matching value, i.e., final matching degree = (IOU matching value + apparent feature matching value)/2.
The preset matching parameters are that the final matching degree is greater than 0.5 and the IOU matching value is greater than 0.5. If the preset matching parameters are reached, the match succeeds and is used for tracking; otherwise the match is judged unsuccessful.
As a concrete example of updating matched and unmatched tracks and deleting tracks that fail the conditions: a standard Kalman filter predicts the target motion state, where the Kalman filter is based on a constant-velocity model (velocity is assumed constant by default, i.e., a model without acceleration) and a linear observation model.
The Kalman prediction is (u, v, γ, h). For each tracked target, the number of frames a_k since its detection result last matched its tracking result is recorded; once a target's detection result is correctly associated with its tracking result, a_k is reset to 0. Recording here means an external recorder or array stores the tracking data of every target in every frame; the Kalman filter relies only on the input positions of the targets to make predictions.
It should be noted that the Kalman prediction is compared with the actual detected value; if the observation differs too much from the prediction, the prediction cannot represent the observation.
That is, Amax is an upper limit and a_k is the number of frames in which the Kalman prediction has failed to match the observation. If a_k exceeds Amax, the Kalman filter is no longer tracking well: the tracking process for that target is considered ended and is not continued. In other words, tracking ends when, after a target has been tracked, the Kalman filter can no longer accurately predict its new position.
A new target is judged to have appeared when a target in some detection result cannot be associated with any existing tracker (an existing tracker is one that was created from an earlier detection and is currently tracking a target).
If, in 3 consecutive frames, a potential new tracker (a new tracker is one created for a newly appeared target; if it can associate its predicted results with the detection results for three consecutive frames, it is considered a new tracker) correctly associates its position predictions with the detection results, a new moving target is confirmed.
If this requirement is not met, a "false alarm" is assumed and the moving target is deleted; that is, if a target detected by the detection model cannot be matched within three consecutive frames, it is considered not to be a tracking target (possibly arising from detection error) and is deleted. A newly appearing target is first matched against existing trackers to see whether it belongs to a target already being tracked; if not, a new target may have appeared and a new tracker must be created. In a specific embodiment of this application, for the apparent features of the multiple targets obtained in step S120, targets whose number of appearances exceeds a set threshold are screened out and given priority through cascade matching; the threshold on the number of appearances is generally set to 3.
Further, in the final stage of cascade matching, to alleviate large changes caused by sudden appearance changes or partial occlusion, IOU-based matching can be performed on unconfirmed and unmatched trajectories of age = 1. Giving priority to frequently appearing targets through cascade matching addresses the situation of a target occluded for a long time: during a long occlusion the uncertainty of the Kalman prediction grows greatly and observability in the state space drops greatly. If two trackers then compete for the match to the same detection, the trajectory with the longer occlusion often has the smaller Mahalanobis distance, making the detection more likely to associate with the longer-occluded trajectory; this undesirable effect often destroys the continuity of tracking.
That is, supposing the covariance originally describes a normal distribution, consecutive predictions without updates make the variance of this distribution larger and larger, so a point far from the mean in Euclidean distance can obtain the same Mahalanobis distance as a nearby point under the earlier distribution. Matching Cascade is therefore used in this application to give priority to more frequently appearing targets.
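A hedged numpy sketch of the Mahalanobis gating behind this discussion (the gate value 9.4877 is the 0.95 chi-square quantile at 4 degrees of freedom used in the DeepSort paper; H, R and the state layout follow the constant-velocity model described above):

    import numpy as np

    def mahalanobis_gate(z, x, P, H, R, gate=9.4877):
        S = H @ P @ H.T + R                        # innovation covariance
        d = z - H @ x                              # innovation
        m2 = float(d.T @ np.linalg.inv(S) @ d)     # squared Mahalanobis distance
        return m2, m2 <= gate                      # True: association admissible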
It should be noted that cascade matching means combining several matching methods (such as IOU matching or feature matching) and matching in cascade (one matching method after another); or, alternatively, first adding a selection criterion and then performing the corresponding matching.
In a specific embodiment, the second cascade style is adopted: a selection criterion is added first, and then the corresponding matching is performed. A time-point sequence is added so that frequently appearing targets are selected first and enter the matching mechanism; targets occluded for a long time are thus harder to match first, i.e., more frequently appearing targets receive priority.
For the specific algorithm, see the paper "Simple Online and Realtime Tracking with a Deep Association Metric", Nicolai Wojke, Alex Bewley, Dietrich Paulus, University of Koblenz-Landau and Queensland University of Technology; the details are not repeated here.
FIG. 3 shows the process of converting the Darknet network structure into a Python model structure according to an embodiment of this application.
As shown in FIG. 3, the conversion process is the process by which Darkflow translates Darknet into the flow used by TensorFlow.
Through Cython, the original C-based Darknet network structure is converted into a Python model structure convenient for DeepSort to use. At the same time, a pb model structure used by TensorFlow can be generated for other algorithms.
FIG. 4 shows the Darkflow-based network structure of an embodiment of this application.
As shown in FIG. 4, the Darkflow network structure is as follows:
In the Darkflow network structure, the padding of all convolutional layers is 1 and all pooling layers are max pooling; other parameters such as stride, kernel size and number of filters are as shown in the figure. The network begins with a convolutional layer of 32 (3×3) filters, followed by max pooling with stride 2 and pool size 2; then a convolutional layer of 64 (3×3) filters, followed by max pooling with stride 2 and size 2. The subsequent structures are similar: first a convolutional layer with N (3×3) filters, where N is twice the filter count of the previous large convolution block, then a (1×1) convolution with N/2 filters, then a (3×3) convolution with N filters, and finally a max pooling, forming one large convolution block. This block is repeated 4 times in total; in the last repetition the pooling layer is removed and two corresponding convolutional layers are appended.
FIG. 6 is a schematic structural diagram of the multi-target tracking detection system based on Darkflow-DeepSort provided by an embodiment of this application.
As shown in FIG. 6, corresponding to the above method, this application also includes a Darkflow-DeepSort-based multi-target tracking detection system 600, comprising a Darkflow-based target detection model acquisition unit 610, a DeepSort-based target tracking model acquisition unit 620, and a tracking result acquisition unit 630. The Darkflow-based target detection model acquisition unit 610 is used to train a Darkflow-based target detection model with the YOLOv3 algorithm and to input the detection image into the trained model to obtain the apparent features of multiple targets; the DeepSort-based target tracking model acquisition unit 620 is used to input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the model being trained on the multi-target detection data set MOT16 Challenge; the tracking result acquisition unit 630 is used to apply the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video. The Darkflow-based target detection model acquisition unit 610 includes a network structure conversion module 611, which converts the Darknet network structure through Cython to obtain the Darkflow-based target detection model.
It should be noted that the tracking result acquisition unit 630 includes a first matching degree acquisition module 631, a second matching degree acquisition module 632, and a tracking result determination module 633. The first matching degree acquisition module 631 obtains the motion matching degree and the apparent feature matching degree of the multiple targets, the motion matching degree being computed from the motion similarity of the multiple targets obtained by the Kalman filter and the apparent feature matching degree from the apparent features of the multiple targets. The second matching degree acquisition module 632 uses the motion matching degree and the apparent feature matching degree, through frame-by-frame data association over the surveillance video, to obtain the IOU matching value and the apparent feature matching value, from which the final matching degree of each target frame is computed. The tracking result determination module 633 selects the target frames whose final matching degree reaches the preset matching parameters as the target tracking result.
Specifically, the DeepSort-based target tracking model acquisition unit 620 includes an apparent feature acquisition module 623 and a priority assignment module 624. The apparent feature acquisition module 623 inputs the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets; the priority assignment module 624 screens, from the obtained apparent features, the targets whose number of appearances exceeds the set threshold and gives them priority through cascade matching. The padding of the convolutional layers of the Darkflow-based target detection model is all 1, and all pooling layers are max pooling.
The apparent feature acquisition module 623 includes an apparent feature description sub-module, which describes the apparent features with (u, v, γ, h, u̇, v̇, γ̇, ḣ), where (u, v) is the center coordinate of the detected target, γ is the aspect ratio, h is the height, and the remaining four variables are the detected target's corresponding velocities in the image coordinate system.
The DeepSort-based target tracking model acquisition unit 620 includes a DeepSort-based target tracking model training module 621 and a DeepSort-based target tracking model testing module 622. The data set MOT16 Challenge is divided into a training set and a test set at a ratio of 8:2; the training module 621 is used to train the DeepSort-based target tracking model on the training set, and the testing module 622 is used to test the DeepSort-based target tracking model on the test set. The specific functions of the multi-target detection unit and the multi-target tracking unit correspond one-to-one to the steps of the Darkflow-DeepSort-based multi-target tracking detection method of the embodiment and are not described in detail one by one here.
FIG. 7 is a schematic diagram of the logical structure of an electronic device provided by an embodiment of this application.
As shown in FIG. 7, the electronic device 70 of this embodiment includes a processor 71, a memory 72, and a computer program 73 stored in the memory 72 and runnable on the processor 71. When the processor 71 executes the computer program 73, the steps of the Darkflow-DeepSort-based multi-target tracking detection method of the embodiment are implemented, such as steps S110 to S140 shown in FIG. 1. Alternatively, when the processor 71 executes the method, the functions of the modules/units in the above device embodiments are realized.
Illustratively, the computer program 73 can be divided into one or more modules/units, which are stored in the memory 72 and executed by the processor 71 to complete this application. The one or more modules/units can be a series of computer program instruction segments capable of completing specific functions, the segments being used to describe the execution of the computer program 73 in the electronic device 70. For example, the computer program 73 can be divided into a multi-target detection unit and a multi-target tracking unit, whose functions are described in detail in the embodiments and are not restated here.
The electronic device 70 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The electronic device 70 may include, but is not limited to, the processor 71 and the memory 72. Those skilled in the art will understand that FIG. 7 is only an example of the electronic device 70 and does not constitute a limitation on it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device may also include input and output devices, network access devices, buses, and so on.
The processor 71 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 72 may be an internal storage unit of the electronic device 70, such as its hard disk or memory. The memory 72 may also be an external storage device of the electronic device 70, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the electronic device 70. Further, the memory 72 may include both an internal storage unit and an external storage device of the electronic device 70. The memory 72 is used to store the computer program and other programs and data required by the electronic device, and may also temporarily store data that has been or will be output.
This embodiment provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the Darkflow-DeepSort-based multi-target tracking detection method of the embodiment; to avoid repetition, the details are not restated here. Alternatively, when executed by the processor, the computer program realizes the functions of each module/unit in the above Darkflow-DeepSort-based multi-target tracking detection system; again, the details are not restated here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods for each particular application to implement the described functions, but such implementations should not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a logical functional division, and in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may physically exist alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, this application may implement all or part of the processes of the above method embodiments by instructing the relevant hardware through a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the program can implement the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, and so on. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately added or deleted according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The multi-target tracking detection method, electronic device and storage medium based on Darkflow-DeepSort according to this application have been described above by way of example with reference to FIGS. 1-7. However, those skilled in the art should understand that various improvements can be made to the method, device and storage medium proposed above without departing from the content of this application. Therefore, the scope of protection of this application should be determined by the content of the appended claims.

Claims (20)

  1. A multi-target tracking detection method based on Darkflow-DeepSort, applied to an electronic device, characterized by including the following steps:
    S110: use the YOLOv3 algorithm to train a Darkflow-based target detection model;
    S120: input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video;
    S130: input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge;
    S140: use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
  2. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 1, characterized in that the Darkflow-based target detection model is a Python model obtained by converting the Darknet network structure through Cython.
  3. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 1, characterized in that step S140 includes:
    S210: obtain the motion matching degree and the apparent feature matching degree of the multiple targets, where the motion matching degree is computed from the motion similarity of the multiple targets obtained by the Kalman filter, and the apparent feature matching degree is computed from the apparent features of the multiple targets;
    S220: using the motion matching degree and the apparent feature matching degree of the multiple targets, obtain the IOU matching value and the apparent feature matching value through frame-by-frame data association over the surveillance video, and compute the final matching degree of each target frame from the IOU matching value and the apparent feature matching value;
    S230: select the target frames whose final matching degree reaches the preset matching parameters as the target tracking result.
  4. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 1, characterized in that, for the apparent features of the multiple targets obtained in step S120, targets whose number of appearances exceeds a set threshold are screened out and given priority through cascade matching.
  5. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 2, characterized in that in the Darkflow network structure the padding of all convolutional layers is 1 and all pooling layers are max pooling.
  6. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 1, characterized in that the apparent features of the multiple targets obtained in step S120 are described with (u, v, γ, h, u̇, v̇, γ̇, ḣ), where (u, v) is the center coordinate of the detected target, γ is the aspect ratio, h is the height, and the remaining four variables are the detected target's corresponding velocities in the image coordinate system.
  7. The multi-target tracking detection method based on Darkflow-DeepSort according to claim 1, characterized in that the data set MOT16 Challenge is divided into a training set and a test set at a ratio of 8:2.
  8. A multi-target tracking detection system based on Darkflow-DeepSort, characterized by including a Darkflow-based target detection model acquisition unit, a DeepSort-based target tracking model acquisition unit, and a tracking result acquisition unit, wherein:
    the Darkflow-based target detection model acquisition unit is used to train a Darkflow-based target detection model with the YOLOv3 algorithm, and to input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets;
    the DeepSort-based target tracking model acquisition unit is used to input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge;
    the tracking result acquisition unit is used to apply the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video, realizing multi-target tracking in the surveillance video.
  9. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 8, characterized in that the Darkflow-based target detection model acquisition unit includes a network structure conversion module, which converts the Darknet network structure through Cython to obtain the Darkflow-based target detection model.
  10. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 8, characterized in that the tracking result acquisition unit includes a first matching degree acquisition module, a second matching degree acquisition module, and a tracking result determination module, wherein:
    the first matching degree acquisition module is used to obtain the motion matching degree and the apparent feature matching degree of the multiple targets, the motion matching degree being computed from the motion similarity of the multiple targets obtained by the Kalman filter, and the apparent feature matching degree being computed from the apparent features of the multiple targets;
    the second matching degree acquisition module uses the motion matching degree and the apparent feature matching degree to obtain the IOU matching value and the apparent feature matching value through frame-by-frame data association over the surveillance video, and computes the final matching degree of each target frame from the IOU matching value and the apparent feature matching value;
    the tracking result determination module selects the target frames whose final matching degree reaches the preset matching parameters as the target tracking result.
  11. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 8, characterized in that the DeepSort-based target tracking model acquisition unit includes an apparent feature acquisition module and a priority assignment module, wherein:
    the apparent feature acquisition module is used to input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets;
    the priority assignment module is used to screen, from the obtained apparent features of the multiple targets, the targets whose number of appearances exceeds a set threshold, and to give them priority through cascade matching.
  12. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 9, characterized in that the padding of the convolutional layers of the Darkflow-based target detection model is all 1 and all pooling layers are max pooling.
  13. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 11, characterized in that the apparent feature acquisition module includes an apparent feature description sub-module, which describes the apparent features with (u, v, γ, h, u̇, v̇, γ̇, ḣ), where (u, v) is the center coordinate of the detected target, γ is the aspect ratio, h is the height, and the remaining four variables are the detected target's corresponding velocities in the image coordinate system.
  14. The multi-target tracking detection system based on Darkflow-DeepSort according to claim 8, characterized in that the DeepSort-based target tracking model acquisition unit further includes a DeepSort-based target tracking model training module and a DeepSort-based target tracking model testing module; the data set MOT16 Challenge is divided into a training set and a test set at a ratio of 8:2; the training module is used to train the DeepSort-based target tracking model on the training set, and the testing module is used to test the DeepSort-based target tracking model on the test set.
  15. An electronic device, characterized by including a memory, a processor, and a computer program stored in the memory and runnable on the processor for a Darkflow-DeepSort-based multi-target tracking detection method; when the computer program is executed by the processor, the following steps are implemented:
    S110: use the YOLOv3 algorithm to train a Darkflow-based target detection model;
    S120: input the detection image into the trained Darkflow-based target detection model to obtain the apparent features of multiple targets, where the detection image is obtained by decoding the surveillance video;
    S130: input the apparent features of the multiple targets into a trained DeepSort-based target tracking model, the target tracking model being trained on the multi-target detection data set MOT16 Challenge;
    S140: use the Kalman filter of the target tracking model to perform frame-by-frame data association on the surveillance video to achieve multi-target tracking in the surveillance video.
  16. The electronic device according to claim 15, characterized in that the Darkflow-based target detection model is a Python model obtained by converting the Darknet network structure through Cython.
  17. The electronic device according to claim 15, characterized in that step S140 includes:
    S210: obtain the motion matching degree and the apparent feature matching degree of the multiple targets, where the motion matching degree is computed from the motion similarity of the multiple targets obtained by the Kalman filter, and the apparent feature matching degree is computed from the apparent features of the multiple targets;
    S220: using the motion matching degree and the apparent feature matching degree of the multiple targets, obtain the IOU matching value and the apparent feature matching value through frame-by-frame data association over the surveillance video, and compute the final matching degree of each target frame from the IOU matching value and the apparent feature matching value;
    S230: select the target frames whose final matching degree reaches the preset matching parameters as the target tracking result.
  18. The electronic device according to claim 15, characterized in that, for the apparent features of the multiple targets obtained in step S120, targets whose number of appearances exceeds a set threshold are screened out and given priority through cascade matching.
  19. The electronic device according to claim 15, characterized in that in the Darkflow network structure the padding of all convolutional layers is 1 and all pooling layers are max pooling.
  20. A computer-readable storage medium storing a computer program, the computer program including a Darkflow-DeepSort-based multi-target tracking detection program; when the program is executed by a processor, the steps of the multi-target tracking detection method based on Darkflow-DeepSort according to any one of claims 1 to 7 are implemented.
PCT/CN2019/117801 2019-07-31 2019-11-13 Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort WO2021017291A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910701678.5A CN110516556B (zh) 2019-07-31 2019-07-31 Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort
CN201910701678.5 2019-07-31

Publications (1)

Publication Number Publication Date
WO2021017291A1 true WO2021017291A1 (zh) 2021-02-04

Family

ID=68624348

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117801 WO2021017291A1 (zh) 2019-07-31 2019-11-13 Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort

Country Status (2)

Country Link
CN (1) CN110516556B (zh)
WO (1) WO2021017291A1 (zh)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113341391A (zh) * 2021-06-01 2021-09-03 电子科技大学 Multi-frame joint detection method for radar targets in unknown environments based on deep learning
CN113343836A (zh) * 2021-06-02 2021-09-03 禾麦科技开发(深圳)有限公司 Floor elevator waiting-crowd detection system and method based on a convolutional neural network
CN113420679A (zh) * 2021-06-26 2021-09-21 南京搜文信息技术有限公司 Artificial-intelligence cross-camera multi-target tracking system and tracking algorithm
CN113470078A (zh) * 2021-07-15 2021-10-01 浙江大华技术股份有限公司 Target tracking method, device and system
CN113674321A (zh) * 2021-08-25 2021-11-19 燕山大学 Cloud-based method for multi-target tracking in surveillance video
CN113701758A (zh) * 2021-08-23 2021-11-26 中国北方工业有限公司 Multi-target data association method and system based on a biological search algorithm
CN113781521A (zh) * 2021-07-12 2021-12-10 山东建筑大学 Bionic robotic fish detection and tracking method based on improved YOLO-DeepSort
CN113822153A (zh) * 2021-08-11 2021-12-21 桂林电子科技大学 UAV tracking method based on an improved DeepSORT algorithm
CN113962282A (zh) * 2021-08-19 2022-01-21 大连海事大学 Real-time ship engine-room fire detection system and method based on improved YOLOv5L+DeepSort
CN114022812A (zh) * 2021-11-01 2022-02-08 大连理工大学 DeepSort multi-target tracking method for floating objects on water based on a lightweight SSD
CN114240996A (zh) * 2021-11-16 2022-03-25 灵译脑科技(上海)有限公司 Multi-target tracking method based on target motion prediction
CN114820699A (zh) * 2022-03-29 2022-07-29 小米汽车科技有限公司 Multi-target tracking method, apparatus, device and medium
CN114913212A (zh) * 2022-06-24 2022-08-16 成都云擎科技有限公司 DeepSORT target tracking method based on feature sharing
CN116132818A (zh) * 2023-02-01 2023-05-16 辉羲智能科技(上海)有限公司 Image processing method and system for autonomous driving
CN116777950A (zh) * 2023-04-19 2023-09-19 长沙理工大学 Multi-target visual tracking method, apparatus, device and medium based on camera parameters
CN117739994A (zh) * 2024-02-20 2024-03-22 广东电网有限责任公司阳江供电局 Underwater target recognition and tracking method and system for a vision robot
CN117830399A (zh) * 2023-12-14 2024-04-05 华中科技大学 Positioning method and device during autonomous docking of an underwater vehicle

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178208B (zh) * 2019-12-20 2023-08-15 华瑞新智科技(北京)有限公司 Pedestrian detection method, device and medium based on deep learning
CN111222574B (zh) * 2020-01-07 2022-04-05 西北工业大学 Warship and civilian-ship target detection and classification method based on multi-model decision-level fusion
CN111260628A (zh) * 2020-01-15 2020-06-09 北京林业大学 Method and electronic device for counting seedlings in large nurseries based on video images
CN111340764A (zh) * 2020-02-20 2020-06-26 江苏东印智慧工程技术研究院有限公司 DeepSort-based automatic counting method for apparent defects of cables
CN111259868B (zh) * 2020-03-10 2023-12-12 以萨技术股份有限公司 Wrong-way vehicle detection method, system and medium based on a convolutional neural network
CN111382704B (zh) * 2020-03-10 2023-12-15 以萨技术股份有限公司 Method, device and storage medium for judging vehicle line-crossing violations based on deep learning
CN111401285B (zh) * 2020-03-23 2024-02-23 北京迈格威科技有限公司 Target tracking method, device and electronic equipment
CN111428642A (zh) * 2020-03-24 2020-07-17 厦门市美亚柏科信息股份有限公司 Multi-target tracking algorithm, electronic device and computer-readable storage medium
CN111428644A (zh) * 2020-03-25 2020-07-17 北京以萨技术股份有限公司 Zebra-crossing area monitoring method, system and medium based on a deep neural network
CN111460968B (zh) * 2020-03-27 2024-02-06 上海大学 Video-based UAV recognition and tracking method and device
CN111723664A (zh) * 2020-05-19 2020-09-29 烟台市广智微芯智能科技有限责任公司 Pedestrian counting method and system for open areas
CN112241974B (zh) * 2020-05-29 2024-05-10 北京国家新能源汽车技术创新中心有限公司 Traffic accident detection and processing method, system and storage medium
CN111708380B (zh) * 2020-06-29 2023-11-10 北京御航智能科技有限公司 Detection method, platform, UAV and system for appearance defects of wind turbines
CN112950671B (zh) * 2020-08-06 2024-02-13 中国人民解放军32146部队 Real-time high-precision parameter measurement method for moving targets by UAV
CN112036271B (zh) * 2020-08-18 2023-10-10 汇纳科技股份有限公司 Pedestrian re-identification method, system, medium and terminal based on Kalman filtering
CN112200021B (zh) * 2020-09-22 2022-07-01 燕山大学 Target crowd tracking and monitoring method within a limited-range scene
CN112329521A (zh) * 2020-09-24 2021-02-05 上海品览数据科技有限公司 Deep-learning-based multi-target tracking video store-inspection method
CN112668432A (zh) * 2020-12-22 2021-04-16 上海幻维数码创意科技股份有限公司 Human detection and tracking method in a ground interactive projection system based on YoloV5 and DeepSort
CN112785625B (zh) * 2021-01-20 2023-09-22 北京百度网讯科技有限公司 Target tracking method, device, electronic equipment and storage medium
CN112767711B (zh) * 2021-01-27 2022-05-27 湖南优美科技发展有限公司 Multi-category, multi-scale, multi-target snapshot method and system
CN112985439B (zh) * 2021-02-08 2023-10-17 青岛大学 Pedestrian congestion state prediction method based on YOLOv3 and Kalman filtering
CN112883871B (zh) * 2021-02-19 2022-06-10 北京三快在线科技有限公司 Method and device for model training and determining the motion strategy of an unmanned vehicle
CN112926649A (zh) * 2021-02-24 2021-06-08 北京优创新港科技股份有限公司 Method and device for recognizing repeated weighing of tobacco frames
CN113139442A (zh) * 2021-04-07 2021-07-20 青岛以萨数据技术有限公司 Image tracking method, device, storage medium and electronic equipment
CN113076899B (zh) * 2021-04-12 2023-04-07 华南理工大学 Foreign-object detection method for high-voltage transmission lines based on a target tracking algorithm
CN113158995A (зh) * 2021-05-21 2021-07-23 西安建筑科技大学 Multi-target tracking detection method, system, device and storage medium
CN113256690B (zh) * 2021-06-16 2021-09-17 中国人民解放军国防科技大学 Pedestrian multi-target tracking method based on video surveillance
CN113591577A (зh) * 2021-06-30 2021-11-02 安徽省国维通信工程有限责任公司 Artificial-intelligence video image detection system for communication engineering
CN114155275A (зh) * 2021-11-17 2022-03-08 深圳职业技术学院 Fish tracking method and device based on IOU-Tracker
CN114524339B (zh) * 2022-01-06 2024-02-09 广东博智林机器人有限公司 Elevator car safe-operation detection method, device, equipment and storage medium
CN115049924B (зh) * 2022-06-06 2023-04-14 四川大学 Building earthquake damage assessment method based on damage identification of non-structural components under video surveillance
CN115082526B (zh) * 2022-07-26 2023-02-03 复亚智能科技(太仓)有限公司 Target tracking method and device
CN116069801B (zh) * 2023-03-06 2023-06-30 山东华夏高科信息股份有限公司 Traffic video structured data generation method, device and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679455A (zh) * 2017-08-29 2018-02-09 Target tracking device and method, and computer-readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286774A1 (en) * 2016-04-04 2017-10-05 Xerox Corporation Deep data association for online multi-class multi-object tracking
CN109816690A (zh) * 2018-12-25 2019-05-28 Multi-target tracking method and system based on deep features
CN109871763A (zh) * 2019-01-16 2019-06-11 YOLO-based specific target tracking method
CN110047095A (zh) * 2019-03-06 2019-07-23 Tracking method and device based on target detection, and terminal device
CN109977818A (zh) * 2019-03-14 2019-07-05 Action recognition method and system based on spatial features and multi-target detection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A PERSISTENT LITTLE ROOKIE: "Multi-target tracking algorithm: deep-sort", 24 June 2017 (2017-06-24), pages 1 - 8, XP055776542, Retrieved from the Internet <URL:https://www.cnblogs.com/YiXiaoZhou/p/7074037.html> *
NICOLAI WOJKE; ALEX BEWLEY; DIETRICH PAULUS: "Simple Online and Realtime Tracking with a Deep Association Metric", ARXIV.ORG, 21 March 2017 (2017-03-21), pages 1 - 5, XP080758706, DOI: 10.1109/ICIP.2017.8296962 *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113341391B (zh) * 2021-06-01 2022-05-10 Multi-frame joint detection method for radar targets in unknown environments based on deep learning
CN113341391A (zh) * 2021-06-01 2021-09-03 Multi-frame joint detection method for radar targets in unknown environments based on deep learning
CN113343836A (zh) * 2021-06-02 2021-09-03 System and method for detecting crowds waiting for floor elevators based on a convolutional neural network
CN113420679A (zh) * 2021-06-26 2021-09-21 Artificial intelligence cross-camera multi-target tracking system and tracking algorithm
CN113420679B (zh) * 2021-06-26 2024-04-26 Artificial intelligence cross-camera multi-target tracking system and tracking method
CN113781521A (zh) * 2021-07-12 2021-12-10 Bionic robotic fish detection and tracking method based on improved YOLO-DeepSort
CN113781521B (zh) * 2021-07-12 2023-08-08 Bionic robotic fish detection and tracking method based on improved YOLO-DeepSort
CN113470078A (zh) * 2021-07-15 2021-10-01 Target tracking method, device and system
CN113822153A (zh) * 2021-08-11 2021-12-21 Unmanned aerial vehicle tracking method based on an improved DeepSORT algorithm
CN113962282B (zh) * 2021-08-19 2024-04-16 Real-time ship engine room fire detection system and method based on improved YOLOv5L+DeepSort
CN113962282A (zh) * 2021-08-19 2022-01-21 Real-time ship engine room fire detection system and method based on improved YOLOv5L+DeepSort
CN113701758A (zh) * 2021-08-23 2021-11-26 Multi-target data association method and system based on a biological search algorithm
CN113701758B (zh) * 2021-08-23 2024-06-07 Multi-target data association method and system based on a biological search algorithm
CN113674321B (zh) * 2021-08-25 2024-05-17 Cloud-based method for multi-target tracking in surveillance video
CN113674321A (zh) * 2021-08-25 2021-11-19 Cloud-based method for multi-target tracking in surveillance video
CN114022812A (zh) * 2021-11-01 2022-02-08 DeepSort multi-target tracking method for floating objects on water surfaces based on a lightweight SSD
CN114022812B (zh) * 2021-11-01 2024-05-10 DeepSort multi-target tracking method for floating objects on water surfaces based on a lightweight SSD
CN114240996A (zh) * 2021-11-16 2022-03-25 Multi-target tracking method based on target motion prediction
CN114240996B (zh) * 2021-11-16 2024-05-07 Multi-target tracking method based on target motion prediction
CN114820699A (zh) * 2022-03-29 2022-07-29 Multi-target tracking method, device, equipment and medium
CN114913212A (zh) * 2022-06-24 2022-08-16 DeepSORT target tracking method based on feature sharing
CN116132818A (zh) * 2023-02-01 2023-05-16 Image processing method and system for autonomous driving
CN116132818B (zh) * 2023-02-01 2024-05-24 Image processing method and system for autonomous driving
CN116777950B (zh) * 2023-04-19 2024-05-03 Multi-target visual tracking method, device, equipment and medium based on camera parameters
CN116777950A (zh) * 2023-04-19 2023-09-19 Multi-target visual tracking method, device, equipment and medium based on camera parameters
CN117830399A (zh) * 2023-12-14 2024-04-05 Positioning method and device for autonomous docking of underwater vehicles
CN117739994B (zh) * 2024-02-20 2024-04-30 Underwater target recognition and tracking method and system for a vision robot
CN117739994A (zh) * 2024-02-20 2024-03-22 Underwater target recognition and tracking method and system for a vision robot

Also Published As

Publication number Publication date
CN110516556B (zh) 2023-10-31
CN110516556A (zh) 2019-11-29

Similar Documents

Publication Publication Date Title
WO2021017291A1 (zh) Darkflow-DeepSort-based multi-target tracking and detection method, device and storage medium
Ciaparrone et al. Deep learning in video multi-object tracking: A survey
Feng et al. Multi-object tracking with multiple cues and switcher-aware classification
CN113284168A (zh) Target tracking method and device, electronic device and storage medium
US9767570B2 (en) Systems and methods for computer vision background estimation using foreground-aware statistical models
CN109035304B (zh) Target tracking method, medium, computing device and apparatus
Huang et al. Robust object tracking by hierarchical association of detection responses
Xu et al. Deepmot: A differentiable framework for training multiple object trackers
CN111080673B (zh) Anti-occlusion target tracking method
JP2018523877A (ja) System and method for object tracking
CN113191180B (zh) Target tracking method and device, electronic device and storage medium
Bashar et al. Multiple object tracking in recent times: A literature review
Wang et al. Multi-Target Video Tracking Based on Improved Data Association and Mixed Kalman/H∞ Filtering
KR101913648B1 (ko) Multiple object tracking method
JP2019194758A (ja) Information processing device, information processing method, and program
CN115953434B (zh) Trajectory matching method and device, electronic device and storage medium
Belmouhcine et al. Robust deep simple online real-time tracking
Khan et al. Foreground detection using motion histogram threshold algorithm in high-resolution large datasets
Phadke et al. Improved mean shift for multi-target tracking
Song et al. Online multi-object tracking and segmentation with GMPHD filter and mask-based affinity fusion
Liu et al. A simplified swarm optimization for object tracking
Tao et al. Adaptive spatio-temporal model based multiple object tracking in video sequences considering a moving camera
CN111860261A (zh) Passenger flow statistics method, device, equipment and medium
CN108346158B (zh) Multi-target tracking method and system based on main-block data association
CN112001252A (zh) Multi-target tracking method based on heterogeneous graph networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19939785

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19939785

Country of ref document: EP

Kind code of ref document: A1
