WO2023065395A1 - Work vehicle detection and tracking method and system - Google Patents

Work vehicle detection and tracking method and system

Info

Publication number
WO2023065395A1
Authority
WO
WIPO (PCT)
Prior art keywords
work vehicle
tracking
matching
image
detection
Prior art date
Application number
PCT/CN2021/127840
Other languages
French (fr)
Chinese (zh)
Inventor
刘世望
袁希文
林军
康高强
游俊
王泉东
丁驰
袁浩
徐阳翰
岳伟
熊群芳
Original Assignee
中车株洲电力机车研究所有限公司
Priority date
Filing date
Publication date
Application filed by 中车株洲电力机车研究所有限公司
Publication of WO2023065395A1

Classifications

    • G06N 3/0464 — Computing arrangements based on biological models; neural networks; convolutional networks [CNN, ConvNet]
    • G06N 3/084 — Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06T 7/246 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/277 — Image analysis; analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06V 10/42 — Extraction of image or video features; global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V 10/62 — Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; pattern tracking
    • G06V 10/762 — Image or video recognition using pattern recognition or machine learning; clustering, e.g. of similar faces in social networks
    • G06V 10/764 — Image or video recognition using pattern recognition or machine learning; classification, e.g. of video objects
    • G06V 10/80 — Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 — Image or video recognition using pattern recognition or machine learning; using neural networks
    • G06V 20/54 — Scene-specific elements; surveillance or monitoring of traffic, e.g. cars on the road, trains or boats
    • Y02T 10/40 — Climate change mitigation technologies related to transportation; internal combustion engine [ICE] based road transport; engine management systems

Definitions

  • The invention relates to the technical field of visual detection and tracking, and in particular to a detection and tracking method and system for a work vehicle.
  • The roadside sensing module is required to support traffic flow statistics and detection of vehicle intrusion, stopping, wrong-way driving, deceleration, and lane changes.
  • Radar, however, cannot obtain visual information such as environmental color and texture, which limits its ability to judge target types.
  • Vision-based multi-target real-time detection and tracking technology can be applied to roadside traffic flow statistics and to vehicle intrusion, stopping, and wrong-way detection, and can also compensate for the limited perception of on-board radar. Owing to its low cost, rich perceptual information, and vision comparable to a human driver's, it has become an important component of the mining-truck unmanned driving system and an indispensable core technology for the system's intelligent perception.
  • Multi-target real-time detection and tracking has long been a hot research topic in autonomous driving, industrial inspection, and related fields, and researchers have studied it extensively.
  • Taking the 2012 introduction of convolutional neural networks as a watershed, multi-target real-time detection and tracking technology can be divided into two main directions: traditional visual analysis and visual deep learning.
  • In the traditional visual analysis direction, multi-target detection and tracking is carried out by manually selecting or designing image features, combined with machine learning and other methods.
  • The main methods are: 1) target-model-based methods, which model the target's appearance and then search for it in subsequent frames; examples include region matching, feature-point tracking, active contours, and optical flow. The most commonly used is feature matching: target features (e.g., SIFT, SURF, Harris corners) are extracted, and the most similar features are then located in subsequent frames for target positioning. 2) Search-based methods: researchers found that model-based methods must process the entire frame, resulting in poor real-time performance.
  • a prediction algorithm is added to search for targets close to the predicted value, narrowing the search range, and improving the real-time tracking performance.
  • Commonly used prediction algorithms include Kalman filter and particle filter.
  • Another method to narrow the search range is the kernel method: it applies the principle of steepest descent, iterating over the target template along the gradient-descent direction until the optimal position is reached; mean-shift and CamShift are examples.
  • However, manually selected or designed image features have poor robustness, and machine learning methods themselves have inherent defects.
  • Traditional visual analysis is easily affected by image quality, occlusion, target rotation, and many other factors, and has limited practical use; in complex mine environments in particular, vehicle targets closely resemble the image background, and traditional visual analysis technologies cannot effectively distinguish vehicles from it.
  • In the visual deep learning direction, convolutional neural networks are generally used to extract image features, which largely overcomes the shortcomings of manual feature selection.
  • Optimizing the network parameters with the backpropagation algorithm and training the deep network model on massive image data can effectively reduce the impact of image quality, occlusion, and target rotation.
  • To overcome the adverse effects of occlusion, target rotation, and camera shake, researchers proposed a tracking method based on multi-level convolution filtering features.
  • The algorithm uses principal component analysis feature vectors obtained by hierarchical learning, evaluates the similarity between features with the Bhattacharyya distance, and finally performs target tracking with a particle filter. However, lacking real-time detection information, its tracking error cannot be corrected in time and gradually grows, resulting in poor tracking stability and persistence.
  • The deep-learning "detect first, then track" framework has therefore gradually become mainstream:
  • a detection model first obtains target bounding boxes, and trajectory prediction and tracking are then performed from the relationship between consecutive frames.
  • The classic representative of this framework is DeepSORT, which performs detection with a candidate-region-based framework, adds deep-learning features on top of SORT's fast IOU matching, and performs target tracking with a similarity metric computed as the cosine distance between detection and tracking features.
  • However, the detection network it uses is structurally complex and deep, so its real-time performance is poor.
  • To improve real-time tracking, researchers proposed, on the basis of the end-to-end detection framework YOLO, a multi-vehicle detection and tracking method that incorporates appearance features: Kalman filtering tracks each target's motion state, and association matching is completed by computing position and feature losses. This method improves tracking speed to a certain extent, but real-time performance still cannot meet the requirements.
  • Meanwhile, current deep-learning target tracking methods mainly track multiple targets of a single type, or multiple targets without distinguishing categories, whereas the mining-truck unmanned driving system requires simultaneous tracking of multiple categories of work vehicles, placing even higher demands on the tracker.
  • The purpose of the present invention is to solve the above problems by providing a work vehicle detection and tracking method and system (hereinafter sometimes abbreviated as the work vehicle detection and tracking method and system), aimed at unstructured roads with complex surfaces, small differences between targets and the image background, and vehicles of variable size and diverse type.
  • Through gamma image enhancement, multi-scale fusion prediction, multi-source information cascade matching, and other means, it realizes real-time detection and tracking of work vehicles and obtains information on vehicle category, size, position, quantity, and trajectory.
  • The present invention provides a work vehicle detection and tracking method: an image is acquired, and an image enhancement method performs image enhancement processing on it;
  • the enhanced image is input into a pre-trained work vehicle detection model for target detection, obtaining target detection results.
  • The work vehicle detection model adopts a deep learning target detection framework, extracting work vehicle image features through a convolutional neural network to obtain multi-type work vehicle detection results, which are input into the work vehicle tracking model;
  • the work vehicle tracking model obtains tracking targets and tracking trajectories through a work vehicle tracking method based on cascade matching of motion information and appearance features, realizing multi-type work vehicle tracking.
  • When analyzing images of the complex mine environment, the image enhancement method is used to enhance images in which the work vehicle resembles the mine background. This effectively solves the problem of low image recognition accuracy caused by high target-background similarity and inconspicuous grayscale contrast in complex mine environments, improving the efficiency of work vehicle detection and recognition and the real-time performance of work vehicle tracking.
  • When detecting work vehicles, the detection model uses a convolutional neural network to extract work vehicle image features from the image, which avoids the errors introduced by manual feature selection; moreover, the model can be trained on massive image data, so that the trained detection model better matches work vehicle image features, improving detection efficiency and accuracy.
  • During target tracking, the work vehicle tracking model can perform cascade matching of motion information and appearance features for multiple vehicle types according to their detection results, realizing multi-type target tracking and ultimately enabling the present invention to track work vehicles of different sizes and types in a complex mine environment.
  • The work vehicle detection model uses the YOLO framework as its deep learning target detection framework, applies a genetic algorithm to optimize the network hyperparameters, and outputs multi-layer prediction modules; the model constructs its regression loss function with DIOU and obtains work vehicle detection boxes through the K-means clustering algorithm.
  • Through the end-to-end YOLO deep learning framework, the work vehicle detection model can directly output the position and type information of detected targets, improving detection speed and hence the real-time performance of work vehicle tracking.
  • The genetic algorithm optimizes the network hyperparameters and multi-layer prediction modules are output for work vehicles of different sizes, which meets the detection requirements of differently sized vehicles and improves detection efficiency; gradient changes are integrated into the work vehicle feature maps, reducing the model weights and greatly improving the accuracy of work vehicle image feature recognition.
  • The work vehicle tracking model uses a Kalman filter to predict and update work vehicle tracking trajectories; on the basis of IOU matching, the cascade matching of motion information and appearance features performs work vehicle motion-information association and appearance-feature association. Motion-information association uses the Mahalanobis distance to evaluate the degree of motion-state correlation, appearance association uses the cosine distance to evaluate the degree of appearance-feature correlation, and the two distances are combined by a comprehensive metric formula into a cascade matching metric that evaluates the overall association.
  • the tracking trajectory of the working vehicle is predicted and updated through the Kalman filter, and the tracking trajectory is matched with the working vehicle through IOU matching.
  • The appearance feature vector of the work vehicle is introduced for matching, and a work vehicle is judged correctly associated with its tracking track only when the similarity measures of both the Mahalanobis distance and the cosine distance are satisfied. This reduces incorrect matches between tracking tracks and occluders when a work vehicle is occluded for a long time in a complex mine environment, greatly improving the accuracy of association between detection targets and their tracking trajectories.
  • IOU matching can solve the short-term occlusion problem of the working vehicle during the tracking process.
  • If the work vehicle tracking model fails to match a predicted work vehicle trajectory within the predefined maximum frame-count threshold, tracking of that vehicle is terminated: the lost vehicle's track is deleted, improving work vehicle tracking efficiency.
  • The work vehicle tracking model stores the feature maps of successfully associated work vehicles in the corresponding work vehicle feature image library, and uses the work vehicle feature network to extract feature vectors from these successfully associated feature maps; the feature image library has a fixed storage threshold, and feature maps are updated according to the data association time.
  • In this way, cosine distances can be calculated from the feature vectors of work vehicles successfully matched to tracking trajectories, which speeds up the matching of trajectories to vehicles and improves the real-time performance of work vehicle tracking.
  • the work vehicle tracking method based on cascade matching of motion information and appearance features further includes:
  • the image enhancement method is gamma transformation or histogram equalization.
  • Gamma transformation is used to enhance the image, mapping low gray values in a narrow range to high gray values in a wide range, so that the gray distribution of the enhanced image is more balanced and the details of dark regions are richer.
  • This mitigates the high similarity between image targets and background and the weak gray contrast in complex mine images, improving the recognition rate of image feature extraction.
  • the present invention also provides a work vehicle detection and tracking system, including a work vehicle detection module for obtaining work vehicle detection results and a work vehicle tracking module for tracking multiple types of work vehicles;
  • The work vehicle detection module includes an image processing unit, an image feature extraction unit, and a work vehicle detection unit; the vehicle tracking module includes a trajectory tracking unit, a data association unit, and a feature image storage unit;
  • the image processing unit applies an image enhancement method to the input image and transmits the enhanced image to the image feature extraction unit;
  • the image feature extraction unit extracts work vehicle image features from the image through a convolutional neural network and transmits them to the work vehicle detection unit;
  • the work vehicle detection unit performs target detection on these image features and transmits the obtained work vehicle detection results to the trajectory tracking unit;
  • the trajectory tracking unit uses the detection results to predict and update the tracking trajectory of the corresponding work vehicle and transmits the trajectories to the data association unit for cascade matching;
  • the feature image storage unit has work vehicle feature image libraries of different types of work vehicles, and the work vehicle feature image library is provided with a fixed storage threshold, according to data association time The feature map of the work vehicle is updated.
  • The vehicle tracking module further includes a work vehicle feature vector extraction unit, which uses the work vehicle feature network to extract work vehicle feature vectors from the feature maps held in the feature image storage unit.
  • The work vehicle tracking method based on cascade matching of motion information and appearance features further includes: obtaining the target detection results and predicting trajectories with a Kalman filter;
  • the present invention has the following beneficial effects: it provides a method and system for detecting and tracking an operating vehicle that combines an operating vehicle detection model based on deep learning with an operating vehicle tracking model based on cascade matching of motion information and appearance features.
  • The image enhancement method enhances images of the complex mine environment to improve image clarity and resolution, and in turn the accuracy and timeliness of work vehicle detection.
  • The work vehicle detection model uses an improved end-to-end YOLO as its deep learning target detection framework, constructs the vehicle box loss function with DIOU, and clusters work vehicle predicted boxes through K-means to obtain detection boxes that conform to work vehicle image characteristics.
  • Real-time detection performance of work vehicles is further improved by optimizing the network hyperparameters through a genetic algorithm.
  • The work vehicle tracking model in this application combines the work vehicle's motion information with multi-layer deep appearance features to perform cascade matching, and uses Kalman filtering and the IOU-matching-based Hungarian algorithm to associate work vehicles with tracking trajectories, thereby realizing real-time tracking of multiple types of work vehicles.
  • The proposed method can adapt to the image scene and effectively complete real-time detection and tracking of multiple types of work vehicles in complex mine environments.
  • FIG. 1 is a system configuration diagram showing an embodiment of a work vehicle detection and tracking system of the present invention.
  • Fig. 2 is a flow chart showing an embodiment of the work vehicle detection and tracking method of the present invention.
  • FIG. 3 is a comparison diagram showing the effects of gamma transform enhanced images.
  • FIG. 4 is a network configuration diagram showing a work vehicle detection model.
  • FIG. 5 is a flow chart illustrating a work vehicle tracking method.
  • FIG. 6 is a network parameter table showing a characteristic network of a work vehicle.
  • An embodiment of a work vehicle detection and tracking system is disclosed herein; as shown in FIG. 1, it includes a work vehicle detection module and a work vehicle tracking module.
  • the work vehicle detection module includes an image processing unit, an image feature extraction unit and a work vehicle detection unit for obtaining work vehicle detection results;
  • the work vehicle tracking module includes a trajectory tracking unit, a data association unit, and a feature image storage unit for tracking detected work vehicles.
  • the working vehicle detection module and the working vehicle tracking module cooperate with each other, so as to realize the detection and tracking of the working vehicle in the complex mine environment.
  • FIG. 2 is a flow chart showing an embodiment of the method for detecting and tracking a work vehicle of the present invention. This embodiment will be further described below in conjunction with FIG. 1 and FIG. 2 .
  • the image processing unit uses gamma transformation to enhance the image in the complex mine environment, and maps the low gray value in a narrow range to the high gray value in a wide range.
  • Figure 3 is a comparison of images before and after gamma-transform enhancement. Comparing the grayscale distribution and pixel distribution before and after the transformation in Figure 3, it is clear that after gamma transformation the grayscale distribution of the image is more balanced, the pixel distribution is denser, and dark-region details are richer, thereby reducing the similarity between the work vehicle and the background image and improving the accuracy of work vehicle detection.
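As an illustrative sketch of this enhancement step (Python with OpenCV; the gamma value 0.5 and the file name are assumptions, not values given in the source), the transform can be implemented with a per-pixel lookup table:

```python
import cv2
import numpy as np

def gamma_enhance(image_bgr, gamma=0.5):
    """Gamma-transform enhancement: s = r**gamma on normalized gray values.

    gamma < 1 maps a narrow range of low gray values onto a wider range
    of high gray values, brightening dark regions of mine images.
    """
    # Build a 256-entry lookup table once; cv2.LUT applies it per pixel.
    table = np.array([((i / 255.0) ** gamma) * 255.0 for i in range(256)],
                     dtype=np.uint8)
    return cv2.LUT(image_bgr, table)

# Example: enhance one mine-area frame (path is illustrative).
frame = cv2.imread("mine_frame.jpg")
enhanced = gamma_enhance(frame, gamma=0.5)
```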
  • The enhanced mine-area image can be stored not only on the vehicle side but also on the ground server for image feature extraction and model training.
  • The image feature extraction unit uses the convolutional neural network to extract work vehicle image features from the enhanced image; the work vehicle detection unit then performs target detection based on the extracted features, obtaining the work vehicle detection results and outputting them to the work vehicle tracking model.
  • The image feature extraction unit is involved in both model training and actual operation: the module is first trained to learn the characteristics of work vehicles, and the trained module is then used in actual operation.
  • Figure 4 shows a schematic diagram of the network structure of the work vehicle detection model, which is further described below in conjunction with FIG. 4.
  • The work vehicle detection model includes a backbone network (Backbone) and a neck (Neck).
  • The input image is resized to 608×608×3.
  • The Focus structure then slices the adjusted image into a feature map of size 304×304×12.
  • This feature map is further processed to obtain feature maps at three different scales.
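A minimal sketch of Focus-style slicing, assuming the standard sample-every-second-pixel layout (the patent does not spell out the slice order):

```python
import numpy as np

def focus_slice(img):
    """Focus-style slicing: (H, W, C) -> (H/2, W/2, 4C).

    Samples every second pixel in a 2x2 grid and stacks the four
    sub-images on the channel axis, e.g. 608x608x3 -> 304x304x12.
    """
    return np.concatenate(
        [img[0::2, 0::2], img[1::2, 0::2], img[0::2, 1::2], img[1::2, 1::2]],
        axis=-1)

x = np.zeros((608, 608, 3), dtype=np.float32)
assert focus_slice(x).shape == (304, 304, 12)
```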
  • The neck of the work vehicle detection model performs convolution and concatenation operations on these feature maps to extract work vehicle image features.
  • A cross-stage partial network is used to alleviate the heavy computation, improving the real-time performance of image recognition, and gradient changes are integrated into the feature map, reducing the deep-learning weights while maintaining accuracy.
  • the neck of the work vehicle detection model adopts the structure of FPN and PAN.
  • The feature maps at three scales are convolved and concatenated to obtain prediction outputs of sizes 76×76×33, 38×38×33, and 19×19×33, and this three-scale model realizes the detection of different types of work vehicles.
  • In order to perform stable regression to the work vehicle ground-truth boxes and avoid training divergence of the detection model, DIOU is used to construct the regression loss function of the work vehicle.
  • The K-means clustering algorithm is used to cluster the predicted box sizes to obtain detection boxes that conform to work vehicle characteristics.
  • The ground-truth box is the box annotated on the image by a human; the predicted box is the box predicted by the network model.
  • The regression loss function for the work vehicle ground-truth boxes is constructed using DIOU (Distance-IoU loss).
  • DIOU can still provide a movement direction for the predicted box even when it does not overlap the ground-truth box.
  • DIOU loss can directly minimize the distance between two vehicle boxes, so it converges faster than GIOU loss.
  • A non-maximum suppression algorithm is used to filter the predicted boxes to obtain the final position and category of the work vehicle.
  • The formula for DIOU is as follows:

    DIOU = IOU − ρ²(b, b_gt) / c²

  • where b and b_gt represent the center points of the predicted box and the ground-truth box respectively, ρ(·) represents the Euclidean distance between the two center points, and c represents the diagonal length of the smallest enclosing region that contains both the predicted box and the ground-truth box.
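The DIOU computation can be sketched as follows — a minimal Python implementation of the formula above, where the (x1, y1, x2, y2) box format and the epsilon guards are assumptions:

```python
def diou(pred, gt):
    """DIOU between two boxes given as (x1, y1, x2, y2).

    DIOU = IOU - rho^2(b, b_gt) / c^2, where rho is the distance between
    box centers and c the diagonal of the smallest enclosing box.
    """
    # Intersection and union areas for the IOU term.
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + 1e-9)

    # Squared center distance rho^2.
    cpx, cpy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cgx, cgy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cpx - cgx) ** 2 + (cpy - cgy) ** 2

    # Squared diagonal c^2 of the smallest enclosing box.
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    return iou - rho2 / (c2 + 1e-9)

# The DIOU regression loss for a box pair is then 1 - diou(pred, gt).
```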
  • The regression loss function of the work vehicle detection model consists of a prediction-box loss (first row), target confidence losses (second and third rows), and a classification loss (fourth row).
  • Binary cross entropy with logits loss is used for the confidence and classification terms.
  • The prediction module has size S×S×B, where S×S is the number of prediction grid cells and B is the module depth.
  • The DIOU regression loss function is used to measure the error between the predicted box and the ground-truth box, and predicted boxes that do not conform to work vehicle image characteristics are filtered out. Specifically, a threshold is set and the error between each predicted box and the ground-truth box is calculated: if the error exceeds the threshold the box is filtered out, otherwise it is retained. During model training, training is completed once it meets the expected standard; for example, training ends when the model loss falls below an expected value (e.g., 1). The threshold is chosen according to the actual situation and experience.
  • Predicted boxes that meet the conditions are used as sample data for learning, so as to obtain detection boxes that conform to work vehicle characteristics; these detection boxes are output as the target detection result.
  • The target detection results include information such as the center coordinates, width, height, and category of the work vehicle in the detection box.
  • Preset vehicle category numbers identify the type of work vehicle: for example, an output of 1 indicates a truck and 2 indicates a command vehicle, and the type of the work vehicle in the detection box is identified by the output category number.
  • Conventionally, predicted boxes are obtained by multi-scale sliding-window traversal or selective search followed by positioning, or their sizes are set manually for position regression, but these approaches are often inefficient and ineffective.
  • In this embodiment, the K-means clustering algorithm is applied to mine-area work vehicle images, clustering with the IOU (Intersection over Union, obtained by dividing the overlap of two regions by their union; here the intersection ratio of the ground-truth box and the candidate box) of the ground-truth boxes as the "distance", so as to obtain predicted box sizes that conform to the feature distribution of mining vehicle images.
  • the K-means clustering steps are as follows:
  • Step 1: Take the width and height of each sample ground-truth box as a sample point (w_n, h_n), n ∈ {1, 2, …, N}, with box center (x_n, y_n), n ∈ {1, 2, …, N}; all sample points form the data set.
  • Step 2 Randomly select K sample points in the data set as cluster centers.
  • Step 3: Calculate the distance d between every sample point in the data set and each of the K cluster centers, and assign each sample point to the cluster center with the smallest d, yielding K point clusters; that is, all ground-truth boxes are classified into K categories. The distance d is computed from the intersection over union:

    d(box, centroid) = 1 − IOU(box, centroid)
  • Step 4: Recalculate the K cluster centers of the K point clusters, where N_m represents the number of sample points (i.e., ground-truth boxes) in the m-th cluster; the new center is the mean size of that cluster:

    W_m = (1/N_m) Σ w_i,  H_m = (1/N_m) Σ h_i

    Steps 3 and 4 are repeated until the cluster centers no longer change.
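A compact sketch of the clustering steps above, using d = 1 − IOU on width/height pairs; k, the iteration cap, and the random seed are illustrative choices, not values from the source:

```python
import numpy as np

def wh_iou(boxes, centroids):
    """IOU between (N, 2) width/height samples and (K, 2) centroids,
    with boxes aligned at a common corner (only sizes matter)."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / (union + 1e-9)

def kmeans_anchors(wh, k=9, iters=100, seed=0):
    """K-means over ground-truth box sizes with d = 1 - IOU as distance."""
    rng = np.random.default_rng(seed)
    centroids = wh[rng.choice(len(wh), k, replace=False)]   # step 2
    for _ in range(iters):
        d = 1.0 - wh_iou(wh, centroids)           # step 3: distance d
        assign = d.argmin(axis=1)                 # nearest cluster center
        new = np.array([wh[assign == m].mean(axis=0) if np.any(assign == m)
                        else centroids[m] for m in range(k)])   # step 4
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids
```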
  • In this embodiment, the work vehicle detection and tracking system detects the input mine video images through the work vehicle detection module and, after obtaining the work vehicle detection results, inputs them into the work vehicle tracking module, which tracks the detected work vehicles.
  • The trajectory tracking unit predicts the work vehicle's next motion trajectory through Kalman filtering according to the vehicle's current motion state, described by eight parameters (u, v, r, h, x′, y′, r′, h′): (u, v) are the center coordinates of the detection box, r is the ratio of the target's ordinate to its abscissa, h is the height, and x′, y′, r′, h′ are the corresponding velocities in image coordinates.
  • The Kalman filter takes the four parameters u, v, r, and h as observed variables and, adopting a uniform-velocity model and a linear observation model, observes the detected work vehicles and predicts their trajectories in the next frame of images.
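A minimal constant-velocity Kalman filter over the eight-parameter state, assuming unit time steps and illustrative noise covariances (the patent does not specify Q and R):

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter over (u, v, r, h) plus
    their velocities; only (u, v, r, h) is observed."""

    def __init__(self):
        self.F = np.eye(8)                 # state transition
        self.F[:4, 4:] = np.eye(4)         # x_{t+1} = x_t + v_t (dt = 1)
        self.H = np.eye(4, 8)              # observe (u, v, r, h) only
        self.Q = np.eye(8) * 1e-2          # process noise (assumed)
        self.R = np.eye(4) * 1e-1          # measurement noise (assumed)

    def initiate(self, z):
        x = np.zeros(8)
        x[:4] = z                          # zero initial velocity
        return x, np.eye(8)

    def predict(self, x, P):
        x = self.F @ x
        P = self.F @ P @ self.F.T + self.Q
        return x, P

    def update(self, x, P, z):
        S = self.H @ P @ self.H.T + self.R           # innovation covariance
        K = P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        x = x + K @ (z - self.H @ x)
        P = (np.eye(8) - K @ self.H) @ P
        return x, P
```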
  • When the system tracks a work vehicle, the tracked vehicle serves as the detection target; a tracking track is defined as track k, a parameter a_k counts the number of image frames since track k was last matched to a detection target, and a maximum frame-count threshold A_max serves as the maximum life cycle of a track.
  • While the Kalman filter tracks the work vehicle in real time, all tracks k are kept in a track set, and each a_k is incremented as frames pass without a match for the corresponding track k. If track k successfully matches a detection target, track k is set to the confirmed state and a_k is reset to 0.
  • If track k fails to match a detection target, it is set to the unconfirmed state; when a_k exceeds the predefined maximum frame-count threshold A_max, track k is deleted from the track set and the Kalman filter re-predicts the work vehicle's trajectory. A newly predicted motion trajectory is treated as tentative during its first three frames; if no detection target is successfully matched within those three frames, the tentative tracks are deleted and tracking of the work vehicle is terminated.
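The track life-cycle bookkeeping described above might look like the following sketch; A_MAX = 30 is an assumed value, while the three-frame tentative window follows the text:

```python
from enum import Enum

class TrackState(Enum):
    TENTATIVE = 0   # first three frames after a new prediction
    CONFIRMED = 1
    DELETED = 2

class Track:
    """Track bookkeeping: a_k counts frames since the last match,
    A_MAX bounds the track life cycle, N_INIT confirms after 3 hits."""
    A_MAX, N_INIT = 30, 3   # A_MAX value is illustrative

    def __init__(self, track_id):
        self.track_id, self.a_k, self.hits = track_id, 0, 0
        self.state = TrackState.TENTATIVE

    def on_frame(self, matched):
        if matched:
            self.a_k, self.hits = 0, self.hits + 1
            if self.hits >= self.N_INIT:
                self.state = TrackState.CONFIRMED
        else:
            self.a_k += 1
            if self.state is TrackState.TENTATIVE or self.a_k > self.A_MAX:
                self.state = TrackState.DELETED
```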
  • The data association unit introduces vehicle depth-feature metrics when matching motion trajectories to detection targets, performs cascade matching combining the work vehicle's motion information and appearance features, and stores all confirmed matched work vehicle feature vectors in the feature image storage unit.
  • In cascade matching, the cosine distance between the detection target and the corresponding work vehicle feature vectors is calculated as the appearance correlation measure. When a detection target is occluded for a period of time, the uncertainty of the Kalman filter prediction grows greatly and observability in the state space becomes very low; the Mahalanobis distance then tends to favor the trajectory with greater uncertainty.
  • Fig. 5 is a flow chart of the working vehicle tracking method.
  • The work vehicle tracking method cascades the matching of motion state and appearance features in series with IOU matching and Kalman filtering to track the work vehicle.
  • the cascading matching of motion information and appearance features includes the association of motion information of the work vehicle and the association of appearance features of the work vehicle.
  • The data association unit determines that a detection target is correctly associated with a predicted trajectory only when both the cosine metric and the Mahalanobis metric are satisfied. If trajectory k successfully matches the detection target, the tracking model marks trajectory k as confirmed, outputs the tracked work vehicle and the corresponding trajectory k, and then updates the parameters of trajectory k.
  • Otherwise, the tracking model sets trajectory k to the unconfirmed state, re-performs the cascade matching of motion state and appearance features, and performs IOU matching among the unconfirmed trajectory k, unmatched trajectories, and unmatched detection targets, using the Hungarian algorithm again for assignment confirmation.
  • The work vehicle motion information association computes, through the Kalman filter, the Mahalanobis distance between the predicted motion state and the motion state of the work vehicle detected at the current moment, with the formula:

    l_m(i, j) = (d_j − y_i)ᵀ S_i⁻¹ (d_j − y_i)

  • where l_m represents the Mahalanobis distance, T the matrix transpose, i the i-th trajectory, S_i the covariance matrix of the Kalman filter's observation space at the current moment, y_i the predicted value for trajectory i at the current moment, and d_j the j-th detection; the motion state of the work vehicle is (u, v, r, h).
  • The Mahalanobis distance expresses the uncertainty of the state estimate by measuring how many standard deviations the detected position lies from the mean track, and the 0.95 quantile of the inverse chi-square distribution (that is, the quantile at a probability of 0.95) is used as a threshold t_m to filter out weak associations, with the filter function:

    g_m(i, j) = 1 if l_m(i, j) ≤ t_m, and 0 otherwise

  • Here the mean track is the mean of each Kalman-filter trajectory, and the Mahalanobis distance between the mean track and the actually detected vehicle box determines whether the box and the trajectory coincide: the farther the detection lies in standard deviations, the greater the uncertainty of the state estimate.
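A sketch of the Mahalanobis association and its 0.95-quantile gate, using scipy's chi-square quantile with 4 degrees of freedom for the (u, v, r, h) measurement — the usual reading of this gate, stated here as an assumption:

```python
import numpy as np
from scipy.stats import chi2

# 0.95 quantile of the chi-square distribution with 4 degrees of
# freedom (the (u, v, r, h) measurement space) as the gate threshold.
T_M = chi2.ppf(0.95, df=4)   # ~9.4877

def mahalanobis_gate(y_i, S_i, d_j):
    """l_m(i, j) = (d_j - y_i)^T S_i^{-1} (d_j - y_i) and its gate g_m."""
    diff = d_j - y_i
    l_m = float(diff.T @ np.linalg.inv(S_i) @ diff)
    return l_m, l_m <= T_M   # (distance, g_m(i, j))
```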
  • When associating detection targets with motion trajectories, the Mahalanobis distance is a good correlation measure provided the target's motion uncertainty is low.
  • However, camera motion during vehicle movement can invalidate the Mahalanobis distance measurement. Therefore, in this embodiment, when the Mahalanobis distance is used to associate work vehicle motion information, the vehicle's appearance features are also introduced, with the cosine distance serving as a similarity measure of appearance correlation; the two jointly measure the association between the detection target and the tracking track.
  • In summary, the Mahalanobis distance measures the position deviation from the mean track in standard deviations, representing the motion-state correlation between the work vehicle detection box and the tracking box, while the cosine distance computes the cosine values between the appearance feature vectors of the predicted trajectory and the detection result and takes the minimum as the degree of appearance relevance.
  • The feature image storage unit constructs a work vehicle appearance feature library for each tracked vehicle, storing the most recent successfully associated work vehicle feature vectors r_k^i for each vehicle, where k indexes the frame and at most the latest 125 frames are kept.
  • The stored work vehicle feature vectors are used to calculate the appearance correlation between the i-th predicted trajectory and the j-th detection result of the current frame, where the detection result refers to the appearance feature vector f_j of the detection target; the appearance correlation function and the corresponding filter function are:

    l_f(i, j) = min{ 1 − f_jᵀ r_k^i | r_k^i ∈ R_i }

    g_f(i, j) = 1 if l_f(i, j) ≤ t_f, and 0 otherwise

  • Here the cosine distance 1 − f_jᵀ r_k^i is computed between the two appearance feature vectors corresponding to the i-th predicted trajectory and the j-th detection result, and the appearance correlation function extracts the minimum cosine distance as the degree of appearance relevance; f denotes an appearance feature vector and l_f the appearance-feature distance.
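A sketch of the appearance metric — the minimum cosine distance between a detection's feature vector and a track's stored gallery; the gate threshold value is illustrative, since the patent leaves thresholds scene-dependent:

```python
import numpy as np

T_F = 0.2   # appearance gate threshold, chosen per scene (illustrative)

def appearance_distance(f_j, gallery_R_i):
    """l_f(i, j): minimum cosine distance between detection feature f_j
    and the stored gallery {r_k^i} of track i (vectors unit-normalized)."""
    f = f_j / np.linalg.norm(f_j)
    R = gallery_R_i / np.linalg.norm(gallery_R_i, axis=1, keepdims=True)
    l_f = float((1.0 - R @ f).min())
    return l_f, l_f <= T_F   # (distance, g_f(i, j))
```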
  • the feature vector of the work vehicle is extracted using the feature network of the work vehicle
  • FIG. 6 is a network parameter table of the feature network of the work vehicle.
  • the work vehicle feature network uses a residual network, which includes a convolutional layer, a maximum pooling layer, and six residual modules.
  • A global feature vector of dimension 128 is computed by the dense layer.
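A PyTorch sketch consistent with this description — one convolution, one max-pooling layer, six residual blocks, and a 128-dimensional dense embedding. Channel widths and strides are assumptions; FIG. 6's exact parameters are not reproduced here:

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    """Basic residual block; channel or stride changes use a 1x1 projection."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False),
            nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, 1, 1, bias=False),
            nn.BatchNorm2d(c_out))
        self.skip = (nn.Identity() if c_in == c_out and stride == 1 else
                     nn.Conv2d(c_in, c_out, 1, stride, bias=False))

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

class VehicleFeatureNet(nn.Module):
    """Conv + max-pool + six residual blocks + 128-d dense embedding."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2))
        self.blocks = nn.Sequential(
            Residual(32, 32), Residual(32, 32),
            Residual(32, 64, 2), Residual(64, 64),
            Residual(64, 128, 2), Residual(128, 128))
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(128, 128))

    def forward(self, x):
        z = self.head(self.blocks(self.stem(x)))
        return nn.functional.normalize(z, dim=1)   # unit-norm embedding
```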
  • The Mahalanobis distance and the cosine distance are integrated to obtain the cascade matching metric I_{i,j}; the cascade matching metric function and the corresponding filter function are:

    I_{i,j} = λ · l_m(i, j) + (1 − λ) · l_f(i, j)

    g_{i,j} = Π_{x ∈ {m, f}} g_x(i, j)

  • where λ is the weight coefficient and x in the filter function ranges over the Mahalanobis metric m and the cosine metric f.
  • The threshold of the filter function is set according to the actual scene and experience; it is not a fixed value.
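Combining the two gated distances into the cascade metric I_{i,j}; λ = 0.5 is an illustrative weight, which the text leaves unspecified:

```python
LAMBDA = 0.5   # weight coefficient, tuned per scene (illustrative)

def cascade_metric(l_m, g_m, l_f, g_f, lam=LAMBDA):
    """I_{i,j} = lam * l_m + (1 - lam) * l_f, admissible only when both
    the Mahalanobis gate g_m and the cosine gate g_f pass."""
    admissible = g_m and g_f               # g_{i,j} = g_m * g_f
    cost = lam * l_m + (1.0 - lam) * l_f
    return cost, admissible
```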
  • The modules described herein may be implemented with a DSP (digital signal processor), an ASIC (application-specific integrated circuit), or an FPGA (field-programmable gate array).
  • a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in cooperation with a DSP core, or any other such configuration.
  • a software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may be integrated into the processor.
  • the processor and storage medium can reside in an ASIC.
  • the ASIC may reside in a user terminal.
  • the processor and storage medium may reside as discrete components in the user terminal.
  • The functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code.
  • Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • a storage media may be any available media that can be accessed by a computer.
  • Such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can carry or store desired program code and can be accessed by a computer. Any connection is also properly termed a computer-readable medium.
  • Disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where a disk usually reproduces data magnetically, while a disc reproduces data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Abstract

A work vehicle detection and tracking method and system. An image enhancement method performs enhancement processing on images of a complex mine environment; a work vehicle detection model based on a deep-learning target detection framework obtains multi-type work vehicle detection results; and a work vehicle tracking model applies a tracking method based on cascade matching of motion information and appearance features to perform multi-type target tracking according to the detection results.

Description

Work vehicle detection and tracking method and system

Field of the Invention

The invention relates to the technical field of visual detection and tracking, and in particular to a detection and tracking method and system for a work vehicle.
Background

In 2020, eight ministries and commissions including the National Development and Reform Commission and the Ministry of Industry and Information Technology issued the "Guiding Opinions on Accelerating the Intelligent Development of Coal Mines", which explicitly calls for developing unmanned driving systems for open-pit mining trucks and strives to achieve "intelligent perception, intelligent decision-making, and automatic execution" by 2035. In the intelligent perception link of the mining-truck unmanned driving system, the roadside sensing module is required to support traffic flow statistics and detection of vehicle intrusion, stopping, wrong-way driving, deceleration, and lane changes. Meanwhile, in the on-board perception module, radar cannot obtain visual information such as environmental color and texture, which limits its ability to judge target types. Vision-based multi-target real-time detection and tracking technology can be applied to roadside traffic flow statistics and to vehicle intrusion, stopping, and wrong-way detection, and can also compensate for the limited perception of on-board radar. Owing to its low cost, rich perceptual information, and vision comparable to a human driver's, it has become an important component of the mining-truck unmanned driving system and an indispensable core technology for the system's intelligent perception.

Multi-target real-time detection and tracking has long been a hot research topic in autonomous driving, industrial inspection, and related fields, and researchers have studied it extensively. Taking the 2012 introduction of convolutional neural networks as a watershed, the technology can be divided into two main directions: traditional visual analysis and visual deep learning.

In the traditional visual analysis direction, multi-target detection and tracking is carried out by manually selecting or designing image features, combined with machine learning and other methods. The main methods are: 1) target-model-based methods, which model the target's appearance and then search for it in subsequent frames; examples include region matching, feature-point tracking, active contours, and optical flow. The most commonly used is feature matching: target features (e.g., SIFT, SURF, Harris corners) are extracted, and the most similar features are then located in subsequent frames for target positioning. 2) Search-based methods: researchers found that model-based methods must process the entire frame, resulting in poor real-time performance. A prediction algorithm is therefore added to search only for targets near the predicted value, narrowing the search range and improving real-time tracking; commonly used prediction algorithms include the Kalman filter and the particle filter. Another way to narrow the search is the kernel method, which applies the principle of steepest descent, iterating over the target template along the gradient direction until the optimal position is reached; mean-shift and CamShift are examples. However, manually selected or designed image features have poor robustness, and machine learning methods themselves have inherent defects. Traditional visual analysis is easily affected by image quality, occlusion, target rotation, and many other factors, and has limited practical use; in complex mine environments in particular, vehicle targets closely resemble the image background, and traditional methods cannot effectively distinguish vehicles from it.

In the visual deep learning direction, convolutional neural networks are generally used to extract image features, which largely overcomes the shortcomings of manual feature selection. Optimizing network parameters with backpropagation and training the deep model on massive image data effectively reduces the impact of image quality, occlusion, and target rotation. To overcome the adverse effects of occlusion, target rotation, and camera shake, researchers proposed a tracking method based on multi-level convolution filtering features: principal component analysis feature vectors obtained by hierarchical learning are compared with the Bhattacharyya distance to evaluate feature similarity, and a particle filter then performs target tracking. However, lacking real-time detection information, its tracking error cannot be corrected in time and gradually grows, resulting in poor tracking stability and persistence. For this reason, the deep-learning "detect first, then track" framework has gradually become mainstream: a detection model first obtains target bounding boxes, and trajectories are then predicted and tracked from the relationship between consecutive frames. The classic representative is DeepSORT, which performs detection with a candidate-region-based framework, adds deep-learning features on top of SORT's fast IOU matching, and tracks targets with a similarity metric computed as the cosine distance between detection and tracking features. Its detection network is structurally complex and deep, so real-time performance is poor. To improve real-time tracking, researchers proposed, based on the end-to-end detection framework YOLO, a multi-vehicle detection and tracking method that incorporates appearance features: Kalman filtering tracks each target's motion state, and association matching is completed by computing position and feature losses. This improves tracking speed to some extent, but real-time performance still cannot meet the requirements. Moreover, current deep-learning tracking methods mainly track multiple targets of a single type, or multiple targets without distinguishing categories, whereas the mining-truck unmanned driving system requires simultaneous tracking of multiple categories of work vehicles, placing even higher demands on the tracker.

At present there is no mature method for detecting and tracking work vehicles in complex mine environments. Mine scenes are complex: road surfaces are unstructured, vehicles vary widely in size and type, and targets differ little from the image background. Traditional visual analysis methods therefore struggle with work vehicle detection and tracking in complex mine environments, while existing deep-learning multi-target detection and tracking methods have complex network structures and low real-time performance.
Summary of the Invention

A brief summary of one or more aspects is presented below to provide a basic understanding of these aspects. This summary is not an exhaustive overview of all contemplated aspects; it is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description presented later.
本发明的目的在于解决上述问题,提供了一种作业车辆检测与跟踪方法和系统(以下有时简称作业车辆检测与跟踪方法和系统),针对非结构化道路路面复杂、目标与图像背景差异较小、车辆尺寸多变种类多样问题,通过伽马图像增强、多尺度融合预测、多源信息级联匹配等手段,实现了作业车辆的实时检测与跟踪,并获取了车辆类别、尺寸、位置、数量、轨迹等信息。The purpose of the present invention is to solve the above problems, and provides a method and system for detection and tracking of work vehicles (hereinafter sometimes referred to as a method and system for detection and tracking of work vehicles), which is aimed at unstructured roads with complex pavement and small differences between targets and image backgrounds. 、Variable size and variety of vehicles. Through gamma image enhancement, multi-scale fusion prediction, multi-source information cascade matching and other means, the real-time detection and tracking of operating vehicles are realized, and the vehicle type, size, location, and quantity are obtained. , trajectory and other information.
The technical solution of the present invention is as follows. The present invention provides a work vehicle detection and tracking method: an image is acquired and enhanced with an image enhancement method; the enhanced image is input to a pre-trained work vehicle detection model for target detection to obtain target detection results. The work vehicle detection model adopts a deep learning target detection framework, extracts work vehicle image features through a convolutional neural network, obtains detection results for multiple types of work vehicles, and inputs them to the work vehicle tracking model; the work vehicle tracking model obtains tracking targets and tracking trajectories through a work vehicle tracking method based on cascade matching of motion information and appearance features, realizing multi-type work vehicle tracking.
According to the present invention, when analyzing images of a complex mine environment, an image enhancement method is applied to images in which the work vehicles resemble the mine background. This effectively addresses the low recognition accuracy caused by high target-background similarity and weak grayscale contrast in such images, improving detection and recognition efficiency as well as the real-time performance of tracking. During detection, the work vehicle detection model uses a convolutional neural network to extract work vehicle image features, which not only avoids the errors introduced by hand-picked features but also allows the detection model to be trained on massive image data, so that the trained model better fits the image characteristics of work vehicles and detection efficiency and accuracy improve. During target tracking, the work vehicle tracking model can cascade-match motion information and appearance features across multiple vehicle types according to the detection results for each type, realizing multi-class target tracking, so that the invention can track work vehicles of various sizes and types in a complex mine environment.
According to an embodiment of the work vehicle detection and tracking method of the present invention, the work vehicle detection model uses the YOLO framework as its deep learning target detection framework, optimizes network hyperparameters with a genetic algorithm, and outputs multi-layer prediction modules; the model constructs its regression loss function with DIOU and obtains the work vehicle detection boxes through the K-means clustering algorithm. With this, the detection model can directly output the position and class information of each target through the end-to-end YOLO framework, improving detection speed and hence the real-time performance of tracking. When analyzing images of complex mine environments, building the regression loss with DIOU yields stable regression against the ground-truth boxes of work vehicles and avoids divergence during training. Qualified prediction boxes are taken as sample data and clustered with the K-means algorithm, yielding detection box sizes that match the feature distribution of work vehicle images and improving detection accuracy. In addition, optimizing network hyperparameters with a genetic algorithm and outputting multi-layer prediction modules for different vehicle sizes not only meets the detection requirements of vehicles of different sizes and improves detection efficiency, but also integrates gradient changes into the work vehicle feature maps, reducing the weights and greatly improving the accuracy of work vehicle image feature recognition.
According to an embodiment of the work vehicle detection and tracking method of the present invention, the work vehicle tracking model predicts and updates the tracking trajectory of each work vehicle with a Kalman filter, and the cascade matching of motion information and appearance features performs work vehicle motion information association and feature information association on the basis of IOU matching. Motion information association evaluates the motion state association degree with the Mahalanobis distance, feature information association evaluates the appearance feature association degree with the cosine distance, and the two distances are combined through a comprehensive metric formula into a cascade matching metric that evaluates the overall association. With this, when a work vehicle is tracked, its trajectory is predicted and updated by the Kalman filter and matched to the vehicle through IOU matching. When the Mahalanobis distance is used to evaluate the correlation between the motion state of the vehicle in the detection box and the vehicle's motion trajectory, the vehicle's appearance feature vector is also brought into the match, and a vehicle is only considered correctly associated with a trajectory when the similarity criteria of both the Mahalanobis distance and the cosine distance are satisfied. This reduces the impact of trajectories being wrongly matched to occluders when vehicles are occluded for long periods in complex mine environments, and greatly improves the accuracy with which detected targets are associated with their trajectories. In addition, IOU matching handles short-term occlusion during tracking: when the tracking model fails to match a vehicle's predicted trajectory within a predefined maximum frame threshold, tracking of that vehicle is terminated and the lost vehicle is deleted, improving tracking efficiency.
According to an embodiment of the work vehicle detection and tracking method of the present invention, the work vehicle tracking model stores successfully associated work vehicle feature maps in the corresponding work vehicle feature image library, and extracts work vehicle feature vectors from these feature maps through a work vehicle feature network; the feature image library has a fixed storage threshold, and the feature maps are updated according to the data association time. By computing the cosine distance with feature vectors of vehicles that have already been successfully matched to a trajectory, the matching speed between trajectories and vehicles improves, and with it the real-time performance of tracking.
According to an embodiment of the work vehicle detection and tracking method of the present invention, the work vehicle tracking method based on cascade matching of motion information and appearance features further comprises:
obtaining target detection results and predicting trajectories with a Kalman filter;
performing cascade matching that combines motion information and appearance features;
judging whether the cascade matching succeeds: if the trajectory is matched successfully, updating and tracking the trajectory with the Kalman filter; if the trajectory match fails or the target match fails, performing IOU matching;
judging whether the IOU matching succeeds: if the trajectory is matched successfully or the target match fails, updating the tracking trajectory with the Kalman filter; if the trajectory match fails, judging whether to delete the trajectory;
judging whether the trajectory is in the confirmed state: if not, deleting the trajectory; if so, judging whether the trajectory exceeds the maximum frame number threshold: if not, deleting the trajectory; if so, updating and tracking the trajectory with the Kalman filter;
judging whether the trajectory updated by the Kalman filter is in the confirmed state: if not, performing IOU matching; if so, performing cascade matching that combines motion information and appearance features, or outputting the target and the trajectory.
According to an embodiment of the work vehicle detection and tracking method of the present invention, the image enhancement method is gamma transformation or histogram equalization. Using the gamma transform to enhance an image maps low gray values in a narrow range to high gray values over a wide range, so that the enhanced image has a more balanced gray distribution and richer dark detail. This alleviates the high target-background similarity and weak grayscale contrast of images in complex mine environments and improves the feature extraction and recognition rate.
The present invention also provides a work vehicle detection and tracking system, including a work vehicle detection module for obtaining work vehicle detection results and a work vehicle tracking module for tracking multiple types of work vehicles. The detection module includes an image processing unit, an image feature extraction unit, and a work vehicle detection unit; the tracking module includes a trajectory tracking unit, a data association unit, and a feature image storage unit. The image processing unit enhances the input image with an image enhancement method and passes the enhanced image to the image feature extraction unit; the image feature extraction unit extracts work vehicle image features from the image through a convolutional neural network and passes them to the work vehicle detection unit; the work vehicle detection unit performs target detection on these features and passes the obtained detection results to the trajectory tracking unit; the trajectory tracking unit predicts and updates the tracking trajectory of the corresponding work vehicle from the detection results and passes the trajectory to the data association unit for cascade matching; the data association unit performs cascade matching based on the work vehicle tracking method that cascade-matches motion information and appearance features, and the trajectory tracking unit performs target tracking according to the matching result; the feature image storage unit stores the feature maps of successfully matched work vehicles.
According to an embodiment of the work vehicle detection and tracking system of the present invention, the feature image storage unit has work vehicle feature image libraries for different work vehicle types; each library has a fixed storage threshold, and the work vehicle feature maps are updated according to the data association time.
According to an embodiment of the work vehicle detection and tracking system of the present invention, the vehicle tracking module further includes a work vehicle feature vector extraction unit, which extracts work vehicle feature vectors from the feature maps stored in the feature image storage unit through the work vehicle feature network.
According to an embodiment of the work vehicle detection and tracking system of the present invention, the work vehicle tracking method based on cascade matching of motion information and appearance features further comprises: obtaining target detection results and predicting trajectories with a Kalman filter;
performing cascade matching that combines motion information and appearance features;
judging whether the cascade matching succeeds: if the trajectory is matched successfully, updating and tracking the trajectory with the Kalman filter; if the trajectory match fails or the target match fails, performing IOU matching;
judging whether the IOU matching succeeds: if the trajectory is matched successfully or the target match fails, updating the tracking trajectory with the Kalman filter; if the trajectory match fails, judging whether to delete the trajectory;
judging whether the trajectory is in the confirmed state: if not, deleting the trajectory; if so, judging whether the trajectory exceeds the maximum frame number threshold: if not, deleting the trajectory; if so, updating and tracking the trajectory with the Kalman filter;
judging whether the trajectory updated by the Kalman filter is in the confirmed state: if not, performing IOU matching; if so, performing cascade matching that combines motion information and appearance features, or outputting the target and the trajectory.
Compared with the prior art, the present invention has the following beneficial effects. It provides a work vehicle detection and tracking method and system that combine a deep-learning-based detection model with a tracking model based on cascade matching of motion information and appearance features. To address the high similarity between work vehicles and the background and the weak grayscale contrast in complex mine environments, an image enhancement method is applied to the images, improving their clarity and resolution and thereby the accuracy and real-time performance of detection. The detection model uses an improved end-to-end YOLO as its deep learning framework, builds the vehicle box loss function with DIOU, clusters the work vehicle prediction boxes with K-means to obtain detection boxes that match work vehicle image characteristics, and optimizes network hyperparameters with a genetic algorithm, improving real-time detection performance. In addition, the tracking model of the present application combines vehicle motion information with multi-layer deep appearance features for cascade matching, and associates vehicles with trajectories using a Kalman filter and the Hungarian algorithm based on IOU matching, realizing real-time tracking of multiple types of work vehicles. The proposed method adapts to the image scene and effectively accomplishes real-time detection and tracking of multiple types of work vehicles in complex mine environments.
Brief Description of the Drawings
The above features and advantages of the present invention can be better understood after reading the detailed description of embodiments of the present disclosure in conjunction with the following drawings. In the drawings, components are not necessarily drawn to scale, and components with similar properties or features may share the same or similar reference numerals.
Fig. 1 is a system structure diagram showing an embodiment of the work vehicle detection and tracking system of the present invention.
Fig. 2 is a flowchart showing an embodiment of the work vehicle detection and tracking method of the present invention.
Fig. 3 is a comparison diagram showing the effect of gamma-transform image enhancement.
Fig. 4 is a network structure diagram of the work vehicle detection model.
Fig. 5 is a flowchart of the work vehicle tracking method.
Fig. 6 is a network parameter table of the work vehicle feature network.
Detailed Description of the Invention
The present invention is described in detail below in conjunction with the drawings and specific embodiments. Note that the aspects described below in conjunction with the drawings and specific embodiments are merely exemplary and should not be construed as limiting the protection scope of the present invention.
In recent years, convolutional neural networks, the basic structure of target detection models in most scenes, have achieved results comparable to human vision. Mainstream detection algorithms fall into direct detection and indirect detection. The representative of the direct approach is YOLO (You Only Look Once, a one-step target detection algorithm that needs only a single glance at the image to identify the objects in it), and the representative of the indirect approach is Faster RCNN. Faster RCNN uses a two-step structure that extracts object candidate regions for localization and recognition, while YOLO directly outputs position and class information without candidate regions. Research shows that the indirect approach is more time-consuming while the direct approach is more real-time and better suited to practical engineering needs, so this application chooses end-to-end, one-step direct detection as its detection algorithm to improve the system's detection speed.
An embodiment of a work vehicle detection and tracking system is disclosed here. As shown in Fig. 1, it includes a work vehicle detection module and a work vehicle tracking module. The detection module includes an image processing unit, an image feature extraction unit, and a work vehicle detection unit, and is used to obtain work vehicle detection results; the tracking module includes a trajectory tracking unit, a data association unit, and a feature image storage unit, and is used to track the detected work vehicles. The two modules cooperate to detect and track work vehicles in a complex mine environment. Fig. 2 is a flowchart of an embodiment of the work vehicle detection and tracking method of the present invention; this embodiment is further detailed below in conjunction with Figs. 1 and 2.
In this embodiment, after the work vehicle detection and tracking system acquires an image, the image processing unit enhances the image of the complex mine environment with a gamma transform, mapping low gray values in a narrow range to high gray values over a wide range. Fig. 3 compares the grayscale and pixel distributions of the image before and after the gamma transform: after the transform, the gray distribution is visibly more balanced, the pixel distribution denser, and the dark detail richer, which reduces the similarity between work vehicles and the background image and improves the accuracy of work vehicle detection.
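As a concrete illustration, the following is a minimal sketch of this kind of gamma enhancement, assuming an 8-bit input image; the exponent value 0.5 is an illustrative assumption rather than a value specified by this document.

import numpy as np

def gamma_enhance(image: np.ndarray, gamma: float = 0.5) -> np.ndarray:
    # Build a 256-entry lookup table for s = 255 * (r / 255) ** gamma;
    # gamma < 1 stretches the narrow dark (low-gray) range into a wider
    # bright range, as described above.
    table = np.round(255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return table[image]  # apply the mapping per pixel via fancy indexing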
After the image processing unit completes enhancement, the enhanced mine-area image can be stored both on the vehicle side and on a ground server for feature extraction and model training. The image feature extraction unit extracts work vehicle image features from the enhanced image with a convolutional neural network, and the work vehicle detection unit then performs target detection on the extracted features, obtaining detection results that are output to the work vehicle tracking model. The image feature extraction unit participates in both model training and actual operation: the feature extraction module is first trained to learn work vehicle features, and the trained module is then used at run time. Fig. 4 shows the network structure of the work vehicle detection model; this embodiment is further explained below in conjunction with Fig. 4.
Specifically, as shown in Fig. 4, the work vehicle detection model includes a backbone and a neck. After a 514×640×3 image is input to the model, it is resized to 608×608×3 and then sliced by the Focus structure into a 304×304×12 feature map. To detect work vehicles of different sizes and types, the feature map is further sliced into feature maps of three different sizes, on which the neck performs convolution and concatenation operations to extract work vehicle image features. In addition, in this embodiment, when feature maps are passed between the backbone and the neck for feature extraction, a cross-stage partial network is used to relieve the heavy computation, improving the real-time performance of image recognition, and gradient changes are integrated into the feature maps, which reduces the deep learning weights while maintaining accuracy. The neck adopts the FPN and PAN structure; for large, medium, and small work vehicles, the three feature maps of different sizes are convolved and concatenated into outputs of 76×76×33, 38×38×33, and 19×19×33, realizing the detection of different types of work vehicles.
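The Focus slicing step can be read as a space-to-depth rearrangement; the following is a minimal sketch under that reading, with an assumed channels-last layout.

import numpy as np

def focus_slice(x: np.ndarray) -> np.ndarray:
    # Take the four pixel phases of each 2x2 neighborhood and stack them on
    # the channel axis: H x W x C -> (H/2) x (W/2) x 4C,
    # e.g. 608 x 608 x 3 -> 304 x 304 x 12 as described above.
    return np.concatenate(
        [x[::2, ::2], x[1::2, ::2], x[::2, 1::2], x[1::2, 1::2]], axis=2)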
Further, in this embodiment, to achieve stable regression against the ground-truth boxes of work vehicles and avoid divergence during training of the detection model, DIOU is used to construct the regression loss function of the work vehicle. Meanwhile, to speed up the training of the convolutional neural network and improve its detection accuracy, the K-means clustering algorithm clusters the prediction box sizes to obtain detection boxes matching work vehicle characteristics. Here the ground-truth box is the box annotated manually on the image, and the prediction box is the box predicted by the network model.
Specifically, in this embodiment, DIOU is used to construct the regression loss against the ground-truth boxes of work vehicles. DIOU (Distance-IoU loss) considers the distance, overlap rate, and scale factor between the ground-truth and predicted boxes; like GIOU, DIOU can still provide a moving direction for the predicted box even when it does not overlap the ground-truth box. The DIOU loss directly minimizes the distance between the two vehicle boxes, so it converges faster than the GIOU loss. Finally, non-maximum suppression filters the prediction boxes to obtain the final positions and classes of the work vehicles. The DIOU formula is as follows:
$$\mathrm{DIOU} = \mathrm{IoU} - \frac{\rho^{2}\!\left(b,\, b^{gt}\right)}{c^{2}}$$
where b and b^gt denote the center points of the predicted box and the ground-truth box respectively, ρ denotes the Euclidean distance between the two center points, and c denotes the diagonal length of the smallest closed region that contains both the predicted box and the ground-truth box. After the DIOU value is computed, it is substituted into the regression loss function of the work vehicle detection model to evaluate the model's accuracy in detecting work vehicles in complex mine environments.
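As a concrete reading of this formula, the following is a minimal sketch that computes DIOU for two boxes given as (center x, center y, width, height); the helper names are illustrative, and the corresponding box loss would be 1 − DIOU.

def diou(box_pred, box_gt):
    # Corner coordinates of a (cx, cy, w, h) box.
    def corners(b):
        cx, cy, w, h = b
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    x1a, y1a, x2a, y2a = corners(box_pred)
    x1b, y1b, x2b, y2b = corners(box_gt)

    # Intersection and union areas for the plain IoU term.
    iw = max(0.0, min(x2a, x2b) - max(x1a, x1b))
    ih = max(0.0, min(y2a, y2b) - max(y1a, y1b))
    inter = iw * ih
    union = (x2a - x1a) * (y2a - y1a) + (x2b - x1b) * (y2b - y1b) - inter
    iou = inter / union

    # rho^2: squared distance between the two centers;
    # c^2: squared diagonal of the smallest box enclosing both boxes.
    rho2 = (box_pred[0] - box_gt[0]) ** 2 + (box_pred[1] - box_gt[1]) ** 2
    cw = max(x2a, x2b) - min(x1a, x1b)
    ch = max(y2a, y2b) - min(y1a, y1b)
    c2 = cw ** 2 + ch ** 2
    return iou - rho2 / c2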
Specifically, the regression loss function of the work vehicle detection model consists of the prediction box loss (first row), the target confidence losses (second and third rows), and the classification loss (fourth row), with the following formula:
$$
\begin{aligned}
L ={}& \lambda_{box} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left(1 - \mathrm{DIOU}_{ij}\right) \\
&+ \lambda_{obj} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}\, \mathrm{BCE}\!\left(C_{ij}, \hat{C}_{ij}\right) \\
&+ \lambda_{noobj} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj}\, \mathrm{BCE}\!\left(C_{ij}, \hat{C}_{ij}\right) \\
&+ \sum_{i=0}^{S^{2}} \mathbb{1}_{i}^{obj} \sum_{c \in classes} \mathrm{BCE}\!\left(p_{i}(c), \hat{p}_{i}(c)\right)
\end{aligned}
$$
where binary cross entropy with logits is used for the confidence and classification terms, and the prediction module has size S×S×B, with S×S the number of prediction grids and B the module depth.
During actual operation, the work vehicle detection model uses the DIOU regression loss to measure the error between the predicted and ground-truth boxes and filters out prediction boxes that do not match work vehicle image characteristics. Specifically, a threshold is set and the error between the predicted and ground-truth boxes is computed: if the error exceeds the set threshold, the prediction box is filtered out; if it is below the threshold, the prediction box is kept. During model training, training completes when the expected standard is met, i.e., training ends once the model loss falls below the expected level (for example, 1). The threshold is chosen according to the actual situation and personal experience. Qualified prediction boxes are used as sample data for machine learning, yielding detection boxes that match the characteristics of the work vehicle, and the detection boxes are output as the target detection results, which include the center coordinates, width and height, and class of the work vehicle in each box. Specifically, in this embodiment, preset vehicle class numbers identify the work vehicle class: for example, an output of 1 indicates a truck and 2 a command vehicle, and the type of work vehicle in the detection box is identified from the output class number.
In traditional target detection methods, prediction boxes are generally obtained by multi-scale sliding windows or selective search and then localized, or prediction box sizes are set manually for position regression, but these methods are often inefficient and ineffective. In this embodiment, the K-means clustering algorithm analyzes the work vehicle images of the mining area, clustering with the IOU (Intersection over Union, the overlap of two regions divided by their union, i.e., the intersection-over-union of the ground-truth and candidate boxes) of the ground-truth boxes as the "distance", so as to obtain prediction box sizes that match the feature distribution of mining-area work vehicle images. The K-means clustering steps are as follows:
Step 1: take the height and width of each ground-truth box as a sample point (w_n, h_n), n ∈ {1, 2, …, N}, with ground-truth box center (x_n, y_n), n ∈ {1, 2, …, N}, and form all sample points into a data set.
Step 2: randomly select K sample points from the data set as cluster centers.
Step 3: compute the distance value d from every sample point in the data set to each of the K cluster centers, and assign each sample point to the cluster center with the smallest d, obtaining K clusters, i.e., classifying all ground-truth boxes into K classes, where d is computed as:
$$d = 1 - \mathrm{IOU}\!\left[(x_{n}, y_{n}, w_{n}, h_{n}),\ (x_{n}, y_{n}, W_{m}, H_{m})\right]$$
Step 4: recompute the K cluster centers. Let N_m denote the number of sample points (i.e., ground-truth boxes) in the m-th cluster S_m; the new center is computed as:

$$W_{m} = \frac{1}{N_{m}} \sum_{(w_{n}, h_{n}) \in S_{m}} w_{n}, \qquad H_{m} = \frac{1}{N_{m}} \sum_{(w_{n}, h_{n}) \in S_{m}} h_{n}$$

Finally, repeat Step 3 and Step 4 until the K cluster centers stop moving; the resulting K centers are taken as the widths and heights of the work vehicle prediction boxes, thereby obtaining the prediction boxes of the work vehicles.
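A minimal sketch of Steps 1 to 4 under the stated d = 1 − IOU distance follows, assuming boxes are compared with aligned centers (so only widths and heights matter); the iteration cap and seed are illustrative assumptions.

import numpy as np

def iou_wh(wh: np.ndarray, centers: np.ndarray) -> np.ndarray:
    # IoU between N (w, h) samples and K (W, H) centers for center-aligned boxes.
    inter = (np.minimum(wh[:, None, 0], centers[None, :, 0])
             * np.minimum(wh[:, None, 1], centers[None, :, 1]))
    union = ((wh[:, 0] * wh[:, 1])[:, None]
             + (centers[:, 0] * centers[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(wh: np.ndarray, k: int, iters: int = 100, seed: int = 0):
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]  # Step 2
    for _ in range(iters):
        # Step 3: assign each sample to the nearest center under d = 1 - IoU.
        assign = np.argmin(1.0 - iou_wh(wh, centers), axis=1)
        # Step 4: recompute each center as the mean (W_m, H_m) of its cluster.
        new_centers = np.array([wh[assign == m].mean(axis=0)
                                if np.any(assign == m) else centers[m]
                                for m in range(k)])
        if np.allclose(new_centers, centers):
            break  # centers stopped moving
        centers = new_centers
    return centers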
In this embodiment, the work vehicle detection and tracking system detects the input mine video images through the work vehicle detection module; after the detection results are obtained, they are input to the work vehicle tracking module, which tracks the detected work vehicles according to their motion information and appearance features.
Specifically, after the detection module obtains the work vehicle detection results, the trajectory tracking unit predicts each work vehicle's subsequent motion trajectory with a Kalman filter based on the vehicle's current motion state. In this embodiment, eight parameters (u, v, r, h, x', y', r', h') describe the motion state of a trajectory at a given moment: (u, v) are the center coordinates of the work vehicle detection box, r is the ratio of the box's vertical dimension to its horizontal dimension, h is the height, and x', y', r', h' are the corresponding velocities of the work vehicle in image coordinates. As the work vehicle moves, these state parameters change continuously; based on the parameters at a given moment, the Kalman filter takes the four parameters u, v, r, h as observed variables, observes the detected vehicle with a constant-velocity model and a linear observation model, and predicts the vehicle's motion trajectory in the next frame of the image.
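A minimal sketch of such a filter over the eight-dimensional state follows; dt and the noise covariances Q and R are assumptions, not values fixed by this document.

import numpy as np

dt = 1.0
F = np.eye(8)
F[:4, 4:] = dt * np.eye(4)                    # position += velocity * dt
H = np.hstack([np.eye(4), np.zeros((4, 4))])  # only (u, v, r, h) is observed

def kf_predict(x, P, Q):
    # Propagate the state mean and covariance one frame forward.
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, R):
    # Correct the prediction with a detection z = (u, v, r, h).
    S = H @ P @ H.T + R                # innovation covariance (also reused by
    K = P @ H.T @ np.linalg.inv(S)     # the Mahalanobis gate further below)
    x = x + K @ (z - H @ x)
    P = (np.eye(8) - K @ H) @ P
    return x, P, S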
In this embodiment, when the work vehicle tracking module tracks a work vehicle, the tracked vehicle is taken as the detection target and the tracking trajectory is defined as track k. A parameter a_k records, for each track k, the number of image frames since the track was last matched to the detection target, and the maximum frame number threshold A_max serves as the track's maximum lifetime. When the Kalman filter tracks work vehicles in real time, all tracks k are kept in a track set, and a_k increments for every frame in which the corresponding track k goes unmatched. If track k is successfully matched to the detection target, it is set to the confirmed state, and whenever the track matches the detection target again, a_k is reset to 0. If track k fails to match the detection target, it is set to the unconfirmed state; once its a_k exceeds the predefined maximum frame number threshold A_max, the track is deleted from the track set and the Kalman filter re-predicts the vehicle's motion trajectory. Newly predicted trajectories are classified as tentative during their first three frames; if they are not successfully matched to a detection target within those three frames, the tentative trajectories are deleted and tracking of that work vehicle is terminated.
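A minimal sketch of this track lifecycle follows; the state names and the value A_MAX = 30 are illustrative assumptions, while the three-frame tentative period comes from the text above.

TENTATIVE, CONFIRMED, DELETED = range(3)
A_MAX = 30   # assumed maximum number of frames a track may go unmatched
N_INIT = 3   # frames a new track stays tentative, per the text

class Track:
    def __init__(self, track_id):
        self.track_id = track_id
        self.state = TENTATIVE
        self.a_k = 0    # frames since the last successful match
        self.hits = 0   # consecutive matched frames while tentative

    def mark_matched(self):
        self.a_k = 0                     # reset on every re-match
        self.hits += 1
        if self.state == TENTATIVE and self.hits >= N_INIT:
            self.state = CONFIRMED

    def mark_missed(self):
        self.a_k += 1
        if self.state == TENTATIVE or self.a_k > A_MAX:
            self.state = DELETED         # drop lost or unconfirmed tracks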
Further, in this embodiment, for more stable target tracking, the data association unit introduces a vehicle deep-feature metric while matching motion trajectories to detection targets, performs cascade matching that combines work vehicle motion information with work vehicle appearance features, and stores all confirmed, successfully matched work vehicle feature vectors in the feature image storage unit. During cascade matching, the cosine distance between the detection target and the corresponding work vehicle feature vectors is computed as the appearance association metric. After a detection target has been occluded for some time, the uncertainty of the Kalman prediction increases greatly and the observability of the state space becomes very low, while the Mahalanobis distance tends to favor trajectories with greater uncertainty. Therefore, in the cascade matching of motion state and appearance features, when IOU matching assigns confirmed trajectories, higher priority is given to the most recently matched trajectories, and lower priority to trajectories whose feature maps have gone unmatched against the detection target for many consecutive frames.
Fig. 5 is a flowchart of the work vehicle tracking method, which tracks work vehicles through cascade matching of motion state and appearance features chained with IOU matching and Kalman filtering. The cascade matching of motion information and appearance features comprises work vehicle motion information association and work vehicle appearance feature association; the data association unit only considers a detection target correctly associated with a predicted trajectory when both the cosine metric and the Mahalanobis metric are satisfied. If track k is successfully matched to the detection target, the tracking model treats track k as a confirmed trajectory, outputs the tracked work vehicle and its corresponding track k, and then updates the track's parameters; whenever the track matches the detection target again, its lifetime counter a_k is reset to 0. If the match between track k and the detection target fails, the tracking model sets track k to the unconfirmed state, re-runs the cascade matching of motion state and appearance features, performs IOU matching on the unconfirmed track k, the unmatched trajectories, and the unmatched detection targets, and again uses the Hungarian algorithm to assign confirmed tracking trajectories.
Specifically, work vehicle motion information association computes the Mahalanobis distance between the motion state predicted by the Kalman filter and the detected motion state of the work vehicle observed at the current moment, as follows:
$$l_{m}(i, j) = \left(d_{j} - y_{i}\right)^{T} S_{i}^{-1} \left(d_{j} - y_{i}\right)$$
where l_m denotes the Mahalanobis distance, T denotes matrix transposition, i indexes the i-th trajectory, S_i is the covariance matrix of the Kalman filter's observation space at the current moment, y_i is the predictor at the current moment, and d_j is the motion state (u, v, r, h) of the j-th detection. The Mahalanobis distance expresses the uncertainty of the state estimate by measuring how many standard deviations a position lies from the mean track; the 0.95 quantile of the chi-squared distribution (the inverse chi-squared CDF evaluated at 0.95) is used as the threshold to filter out weak associations, with the following filter function:
$$b_{i,j}^{(1)} = \mathbb{1}\!\left[\, l_{m}(i, j) \le t^{(1)} \right]$$
Specifically, the mean track is the average of each Kalman-filtered trajectory; the Mahalanobis distance between the track average and the actually detected vehicle box determines whether the box and the trajectory coincide. Coincidence indicates a match; the farther apart they are, the worse the match, and the larger the standard deviation, the greater the uncertainty of the state estimate.
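A minimal sketch of this gating step follows, assuming SciPy is available for the chi-squared quantile; with four observed dimensions (u, v, r, h) the 0.95 quantile is roughly 9.4877.

import numpy as np
from scipy.stats import chi2  # assumed available for the quantile

T1 = chi2.ppf(0.95, df=4)  # 0.95 chi-squared quantile, df = 4, ~9.4877

def mahalanobis_gate(y_i: np.ndarray, S_i: np.ndarray, d_j: np.ndarray):
    # l_m(i, j) = (d_j - y_i)^T S_i^{-1} (d_j - y_i), gated against T1.
    diff = d_j - y_i
    l_m = float(diff @ np.linalg.inv(S_i) @ diff)
    return l_m, l_m <= T1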
When associating detection targets with motion trajectories, the Mahalanobis distance is a good association metric if the motion uncertainty of the target is low. In practice, however, camera motion during vehicle movement can invalidate the Mahalanobis measurement, so in this embodiment, when the Mahalanobis distance is used to associate work vehicle motion information, work vehicle appearance features are also introduced, and a cosine-distance similarity measure expresses the association degree of the appearance features; together the two measures quantify the association between the detection target and the tracking trajectory.
Here the Mahalanobis distance expresses the association of motion states between the work vehicle detection box and the tracking box by measuring standard deviations from the mean track position, while the cosine distance is obtained by computing the cosine values of the two appearance feature vectors corresponding to a predicted trajectory and a detection result and taking the minimum cosine value as the appearance association degree. Specifically, the feature image storage unit builds an appearance feature library for each tracked work vehicle, storing the 125 most recently associated work vehicle feature vectors r_k^(i) for that vehicle, where k indexes the frame with a maximum of 125. The stored feature vectors are used to compute the appearance association between the i-th predicted trajectory and the j-th detection result of the current frame, where the detection result refers to the appearance feature vector of the detection target. The appearance association function and its corresponding filter function are as follows:
$$l_{f}(i, j) = \min\left\{\, 1 - r_{j}^{T} r_{k}^{(i)} \;\middle|\; r_{k}^{(i)} \in R_{i} \right\}$$

$$b_{i,j}^{(2)} = \mathbb{1}\!\left[\, l_{f}(i, j) \le t^{(2)} \right]$$
Specifically, the cosine distance is the cosine value between the two appearance feature vectors corresponding to the i-th predicted trajectory and the j-th detection result, and the appearance association function extracts the minimum cosine value as the appearance association degree. In the corresponding filter function, f denotes the appearance feature vector and l the distance over the appearance feature vectors f; this filter removes trajectories that do not reach the appearance association threshold.
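A minimal sketch of this appearance metric follows, assuming L2-normalized feature vectors (so cosine similarity reduces to a dot product); the threshold value is an illustrative assumption.

import numpy as np

def appearance_metric(track_gallery: np.ndarray, r_j: np.ndarray, t2: float = 0.2):
    # track_gallery: up to 125 stored unit feature vectors of track i,
    # shape (K, D); r_j: unit appearance feature of detection j, shape (D,).
    # l_f(i, j) is the minimum cosine distance over the gallery, gated by t2.
    l_f = float(np.min(1.0 - track_gallery @ r_j))
    return l_f, l_f <= t2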
Further, in this embodiment, a work vehicle feature network extracts the work vehicle feature vectors; Fig. 6 is the network parameter table of this feature network. As shown in Fig. 6, the feature network is a residual network comprising one convolutional layer, one max-pooling layer, and six residual modules; the 128-dimensional global feature map is finally computed in a dense layer, and normalization projects the features into the vehicle feature vector.
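The parameter table of Fig. 6 is not reproduced here, so the following PyTorch sketch only mirrors the stated topology (one convolution, one max-pool, six residual modules, a dense 128-dimensional head with L2 normalization); all channel counts and strides are assumptions.

import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # A plain 3x3-3x3 residual block; channel/stride choices are assumptions.
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(c_out)
        self.conv2 = nn.Conv2d(c_out, c_out, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(c_out)
        self.down = (nn.Conv2d(c_in, c_out, 1, stride, bias=False)
                     if (stride != 1 or c_in != c_out) else nn.Identity())

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.down(x))

class VehicleFeatureNet(nn.Module):
    # One conv, one max-pool, six residual blocks, then a dense layer whose
    # 128-d output is L2-normalized into the appearance feature vector.
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1, bias=False), nn.BatchNorm2d(32),
            nn.ReLU(), nn.MaxPool2d(3, 2, 1))
        self.blocks = nn.Sequential(
            ResidualBlock(32, 32), ResidualBlock(32, 32),
            ResidualBlock(32, 64, 2), ResidualBlock(64, 64),
            ResidualBlock(64, 128, 2), ResidualBlock(128, 128))
        self.head = nn.Linear(128, 128)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        x = x.mean(dim=(2, 3))                    # global average pool to 128-d
        return F.normalize(self.head(x), dim=1)   # unit-length feature vector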
In this embodiment, after the data association unit obtains the Mahalanobis metric for motion information association and the cosine metric for appearance feature association, it combines the Mahalanobis and cosine distances into the cascade matching metric I_{i,j}. The cascade matching metric function and its corresponding filter function are as follows:
$$I_{i,j} = \lambda\, l_{m}(i, j) + (1 - \lambda)\, l_{f}(i, j)$$
$$b_{i,j} = \prod_{g \in \{m,\, f\}} \mathbb{1}\!\left[\, l_{g}(i, j) \le t_{g} \right]$$
where λ is the weight coefficient, and g in the filter function ranges over the Mahalanobis metric m and the cosine metric f; the thresholds of the filter function are set according to the actual scene and personal experience rather than being fixed values.
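A short sketch combining the two gated metrics into the cascade cost exactly as in the formulas above; λ and the thresholds are scene-dependent assumptions.

def cascade_cost(l_m, l_f, lam=0.3, t1=9.4877, t2=0.2):
    # I_{i,j} = lambda * l_m + (1 - lambda) * l_f
    cost = lam * l_m + (1.0 - lam) * l_f
    # b_{i,j}: admissible only if both the Mahalanobis and cosine gates pass.
    admissible = (l_m <= t1) and (l_f <= t2)
    return cost, admissible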
Although the methods above are illustrated and described as a series of acts for simplicity of explanation, it should be understood and appreciated that these methods are not limited by the order of the acts, because according to one or more embodiments some acts may occur in a different order and/or concurrently with other acts illustrated and described herein, or not illustrated and described herein but understandable by those skilled in the art.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in cooperation with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integrated into the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example and not limitation, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is also properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the present disclosure is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to the present disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the present disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

  1. A work vehicle detection and tracking method, characterized in that the method comprises:
    acquiring an image, and performing image enhancement processing on the image using an image enhancement method;
    inputting the enhanced image into a work vehicle detection model for target detection to obtain a target detection result, wherein the work vehicle detection model uses a deep learning object detection framework, extracts work vehicle image features through a convolutional neural network, and inputs the obtained multi-type work vehicle detection results into a work vehicle tracking model for target tracking;
    the work vehicle tracking model obtaining a tracking target and a tracking trajectory through a work vehicle tracking method based on cascade matching of motion information and appearance features, and outputting the tracking target and the target motion trajectory.
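For orientation, the claimed enhance-detect-track loop could be wired together as in the following Python sketch. The names `enhance`, `detector`, and `tracker` (and the `box`/`track_id` attributes on track objects) are hypothetical callables standing in for the units the claim names, not an API defined by this disclosure.

```python
import cv2

def run_pipeline(video_path, enhance, detector, tracker):
    """One pass of the claimed loop: enhance each frame, detect work
    vehicles, then hand the detections to the cascade-matching tracker."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        enhanced = enhance(frame)               # claim 7: gamma or histogram equalization
        detections = detector(enhanced)         # per-frame boxes, scores, vehicle types
        tracks = tracker.update(detections, enhanced)  # claim 6: cascade + IOU matching
        for t in tracks:                        # draw IDs for inspection
            x1, y1, x2, y2 = map(int, t.box)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f"id={t.track_id}", (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    cap.release()
```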
  2. The work vehicle detection and tracking method according to claim 1, wherein the work vehicle detection model uses YOLO as the deep learning object detection framework, optimizes the grid hyperparameters with a genetic algorithm, and outputs a multi-layer prediction module; the work vehicle detection model constructs its regression loss function with DIOU, and obtains the detection boxes of the work vehicle through a K-means clustering algorithm.
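The claim names DIOU (DIoU) as the regression loss. A minimal sketch of that loss for axis-aligned boxes follows; the `1e-9` stabilizers are illustrative, and the K-means anchor selection step is not shown.

```python
def diou_loss(box, gt):
    """DIoU regression loss for boxes given as (x1, y1, x2, y2):
    L = 1 - IoU + rho^2 / c^2, where rho is the distance between box
    centers and c is the diagonal of the smallest enclosing box."""
    # intersection area
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_b + area_g - inter + 1e-9)
    # squared distance between box centers
    cb = ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)
    cg = ((gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2)
    rho2 = (cb[0] - cg[0]) ** 2 + (cb[1] - cg[1]) ** 2
    # squared diagonal of the smallest enclosing box
    ex1, ey1 = min(box[0], gt[0]), min(box[1], gt[1])
    ex2, ey2 = max(box[2], gt[2]), max(box[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
    return 1.0 - iou + rho2 / c2
```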
  3. The work vehicle detection and tracking method according to claim 1, wherein the work vehicle tracking model uses a Kalman filter to predict and update the tracking trajectory of the work vehicle, and the cascade matching of motion information and appearance features performs work vehicle motion information association and work vehicle feature information association on the basis of IOU matching.
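A minimal constant-velocity Kalman filter over a bounding-box state, in the spirit of the predict/update cycle the claim describes. The 8-dimensional state layout and the noise magnitudes below are assumptions, not prescribed by the claim.

```python
import numpy as np

class BoxKalman:
    """Constant-velocity filter over a 4-dim box observation z = (cx, cy, a, h),
    with state x = [z, z_dot]; noise settings are illustrative."""
    def __init__(self, box):
        self.x = np.zeros(8); self.x[:4] = box  # initial state from first detection
        self.P = np.eye(8) * 10.0               # state covariance
        self.F = np.eye(8)                      # transition: position += velocity
        self.F[:4, 4:] = np.eye(4)
        self.H = np.eye(4, 8)                   # we observe the box components only
        self.Q = np.eye(8) * 1e-2               # process noise
        self.R = np.eye(4) * 1e-1               # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R    # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(8) - K @ self.H) @ self.P
```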
  4. The work vehicle detection and tracking method according to claim 3, wherein the work vehicle motion information association uses the Mahalanobis distance to evaluate the motion state association degree, the work vehicle feature information association uses the cosine distance to evaluate the appearance feature association degree, and a combined metric of the Mahalanobis distance and the cosine distance is computed, via a combined metric formula, to evaluate the cascade matching association degree.
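A sketch of this combined metric: Mahalanobis distance for the motion term, cosine distance for the appearance term, and a weighted sum as the cascade-matching cost. The claim specifies the combination only abstractly, so the weight `lam` below is an assumption.

```python
import numpy as np

def mahalanobis2(z, mean, cov):
    """Squared Mahalanobis distance between a detection z and a predicted state."""
    d = z - mean
    return float(d @ np.linalg.inv(cov) @ d)

def cosine_distance(a, b):
    """Cosine distance between L2-normalised appearance embeddings."""
    a = a / (np.linalg.norm(a) + 1e-9)
    b = b / (np.linalg.norm(b) + 1e-9)
    return 1.0 - float(a @ b)

def combined_cost(z, mean, cov, feat_det, feat_track, lam=0.5):
    """Cascade-matching cost c = lam * d_mahalanobis + (1 - lam) * d_cosine;
    lam is illustrative, not taken from the claim."""
    return lam * mahalanobis2(z, mean, cov) + (1.0 - lam) * cosine_distance(feat_det, feat_track)
```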
  5. The work vehicle detection and tracking method according to claim 4, wherein the work vehicle tracking model stores the feature maps of successfully associated work vehicles in a corresponding work vehicle feature image library, and extracts work vehicle feature vectors from the work vehicle feature maps through a work vehicle feature network; the work vehicle feature image library is provided with a fixed storage threshold and a work vehicle type, and the work vehicle feature maps are updated according to the data association time and the work vehicle type.
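The fixed-threshold feature image library could be modelled as a bounded per-track, per-type gallery. The FIFO eviction (oldest association first) and the budget of 100 entries below are assumptions standing in for the claim's "fixed storage threshold" and time-based update.

```python
from collections import deque

class FeatureGallery:
    """Appearance gallery keyed by (track id, work vehicle type) with a
    fixed storage budget; the deque drops the oldest entry when full."""
    def __init__(self, budget=100):
        self.budget = budget
        self.store = {}  # (track_id, vehicle_type) -> deque of feature maps

    def add(self, track_id, vehicle_type, feature):
        key = (track_id, vehicle_type)
        if key not in self.store:
            self.store[key] = deque(maxlen=self.budget)
        self.store[key].append(feature)  # newest association last

    def features(self, track_id, vehicle_type):
        return list(self.store.get((track_id, vehicle_type), []))
```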
  6. The work vehicle detection and tracking method according to claim 1, wherein the work vehicle tracking method based on cascade matching of motion information and appearance features further comprises:
    obtaining a target detection result, and predicting a trajectory using a Kalman filter;
    performing cascade matching by combining motion information and appearance features;
    judging whether the cascade matching succeeds: if the trajectory matching succeeds, updating the tracked trajectory using the Kalman filter; if the trajectory matching or the target matching fails, performing IOU matching;
    judging whether the IOU matching succeeds: if the trajectory matching succeeds or the target matching fails, updating the tracked trajectory using the Kalman filter; if the trajectory matching fails, judging whether to delete the trajectory;
    judging whether the trajectory is in a confirmed state: if not, deleting the trajectory; if so, judging whether the trajectory exceeds the maximum frame number threshold: if not, deleting the trajectory; if so, updating the tracked trajectory using the Kalman filter;
    judging whether the trajectory updated by the Kalman filter is in a confirmed state: if not, performing IOU matching; if so, performing cascade matching by combining motion information and appearance features, or outputting the target and the trajectory.
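Each matching stage in the flow above (cascade matching first, then IOU matching for the leftovers) reduces to a gated linear-assignment problem. The sketch below solves one such stage with SciPy's Hungarian solver; the gate value is an assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cost, gate=1.0):
    """Solve one matching stage as gated linear assignment.

    cost: (n_tracks, n_detections) matrix from the combined metric, or
    1 - IOU for the fallback stage. Pairs whose cost exceeds `gate` are
    rejected, mirroring the success/failure branches of the claim."""
    if cost.size == 0:  # nothing to match: everything is unmatched
        return [], list(range(cost.shape[0])), list(range(cost.shape[1]))
    rows, cols = linear_sum_assignment(cost)
    matches = []
    un_tracks, un_dets = set(range(cost.shape[0])), set(range(cost.shape[1]))
    for r, c in zip(rows, cols):
        if cost[r, c] <= gate:
            matches.append((r, c))
            un_tracks.discard(r)
            un_dets.discard(c)
    return matches, sorted(un_tracks), sorted(un_dets)
```

Matched pairs would feed the Kalman update; unmatched tracks and detections fall through to the IOU stage or to the confirmation/deletion checks.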
  7. The work vehicle detection and tracking method according to any one of claims 1 to 6, wherein the image enhancement method is gamma transformation or histogram equalization.
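Both named enhancement options are short in OpenCV terms; in the sketch below, the gamma value and the choice to equalize only the luma channel are illustrative, not prescribed by the claim.

```python
import cv2
import numpy as np

def gamma_transform(img, gamma=1.5):
    """Gamma transformation via a 256-entry lookup table."""
    table = np.array([((i / 255.0) ** (1.0 / gamma)) * 255 for i in range(256)],
                     dtype=np.uint8)
    return cv2.LUT(img, table)

def hist_equalize(img):
    """Histogram equalization on the luma channel so colour is preserved."""
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```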
  8. A work vehicle detection and tracking system, characterized by comprising a work vehicle detection module for obtaining work vehicle detection results and a work vehicle tracking module for tracking multiple types of work vehicles;
    the work vehicle detection module comprising an image processing unit, an image feature extraction unit and a work vehicle detection unit;
    the vehicle tracking module comprising a trajectory tracking unit, a data association unit and a feature image storage unit; wherein
    the image processing unit performs image enhancement processing on an input image using an image enhancement method, and transmits the enhanced image to the image feature extraction unit;
    the image feature extraction unit extracts work vehicle image features from the image through a convolutional neural network, and transmits the work vehicle image features to the work vehicle detection unit;
    the work vehicle detection unit performs target detection using the work vehicle image features, and transmits the obtained work vehicle detection results to the trajectory tracking unit;
    the trajectory tracking unit predicts and updates the trajectory of the work vehicle according to the work vehicle detection results, and transmits the trajectory to the data association unit for cascade matching;
    the data association unit performs cascade matching through the work vehicle tracking method based on cascade matching of motion information and appearance features, and the trajectory tracking unit performs target tracking according to the cascade matching result;
    the feature image storage unit is configured to store the feature maps of work vehicles for which cascade matching succeeded.
  9. The work vehicle detection and tracking system according to claim 8, wherein the feature image storage unit holds work vehicle feature image libraries for different work vehicle types, each work vehicle feature image library is provided with a fixed storage threshold, and the work vehicle feature maps are updated according to the data association time.
  10. The work vehicle detection and tracking system according to claim 9, wherein the vehicle tracking module further comprises a work vehicle feature vector extraction unit, which extracts work vehicle feature vectors, through a work vehicle feature network, from the work vehicle feature maps stored in the feature image storage unit.
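Read together, claims 8 to 10 describe a data path that could be wired as the following skeleton. All class, attribute, and method names are hypothetical, chosen only to mirror the claimed units.

```python
class WorkVehicleSystem:
    """Hypothetical wiring of the claimed modules."""
    def __init__(self, enhancer, feature_net, detector, tracker, gallery):
        self.enhancer = enhancer        # image processing unit
        self.feature_net = feature_net  # image feature extraction unit
        self.detector = detector        # work vehicle detection unit
        self.tracker = tracker          # trajectory tracking + data association units
        self.gallery = gallery          # feature image storage unit

    def step(self, frame):
        enhanced = self.enhancer(frame)
        feats = self.feature_net(enhanced)      # CNN features for detection
        detections = self.detector(feats)
        tracks = self.tracker.update(detections, enhanced)
        for t in tracks:                        # store successfully matched vehicles
            self.gallery.add(t.track_id, t.vehicle_type, t.feature)
        return tracks
```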
  11. A work vehicle tracking method based on cascade matching of motion information and appearance features, comprising the following steps:
    obtaining a target detection result, and predicting a trajectory using a Kalman filter;
    performing cascade matching by combining motion information and appearance features;
    judging whether the cascade matching succeeds: if the trajectory matching succeeds, updating the tracked trajectory using the Kalman filter; if the trajectory matching or the target matching fails, performing IOU matching;
    judging whether the IOU matching succeeds: if the trajectory matching succeeds or the target matching fails, updating the tracked trajectory using the Kalman filter; if the trajectory matching fails, judging whether to delete the trajectory;
    judging whether the trajectory is in a confirmed state: if not, deleting the trajectory; if so, judging whether the trajectory exceeds the maximum frame number threshold: if not, deleting the trajectory; if so, updating the tracked trajectory using the Kalman filter;
    judging whether the trajectory updated by the Kalman filter is in a confirmed state: if not, performing IOU matching; if so, performing cascade matching by combining motion information and appearance features, or outputting the target and the trajectory.
PCT/CN2021/127840 2021-10-18 2021-11-01 Work vehicle detection and tracking method and system WO2023065395A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111208735.X 2021-10-18
CN202111208735.XA CN115995063A (en) 2021-10-18 2021-10-18 Work vehicle detection and tracking method and system

Publications (1)

Publication Number Publication Date
WO2023065395A1 true WO2023065395A1 (en) 2023-04-27

Family

ID=85990657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127840 WO2023065395A1 (en) 2021-10-18 2021-11-01 Work vehicle detection and tracking method and system

Country Status (2)

Country Link
CN (1) CN115995063A (en)
WO (1) WO2023065395A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116993776B (en) * 2023-06-30 2024-02-13 中信重工开诚智能装备有限公司 Personnel track tracking method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160063330A1 (en) * 2014-09-03 2016-03-03 Sharp Laboratories Of America, Inc. Methods and Systems for Vision-Based Motion Estimation
CN111768430A (en) * 2020-06-23 2020-10-13 重庆大学 Expressway outfield vehicle tracking method based on multi-feature cascade matching
CN111914664A (en) * 2020-07-06 2020-11-10 同济大学 Vehicle multi-target detection and track tracking method based on re-identification
CN112686923A (en) * 2020-12-31 2021-04-20 浙江航天恒嘉数据科技有限公司 Target tracking method and system based on double-stage convolutional neural network
CN113160274A (en) * 2021-04-19 2021-07-23 桂林电子科技大学 Improved deep sort target detection tracking method based on YOLOv4

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI-SHENG JIN; HUA QIANG; GUO BAI-CANG; XIE XIAN-YI; YAN FU-GANG; QU BO-TAO: "基于优化DeepSort的前方车辆多目标跟踪 (Multi-target tracking of vehicles based on optimized DeepSort)", Journal of Zhejiang University (Engineering Science), vol. 55, no. 6, 30 June 2021, pages 1056-1064, ISSN: 1008-973X, DOI: 10.3785/j.issn.1008.973X.2021.06.005 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703983A (en) * 2023-06-14 2023-09-05 石家庄铁道大学 Combined shielding target detection and target tracking method
CN116703983B (en) * 2023-06-14 2023-12-19 石家庄铁道大学 Combined shielding target detection and target tracking method
CN116469059A (en) * 2023-06-20 2023-07-21 松立控股集团股份有限公司 Parking lot entrance and exit vehicle backlog detection method based on DETR
CN116612642A (en) * 2023-07-19 2023-08-18 长沙海信智能系统研究院有限公司 Vehicle continuous lane change detection method and electronic equipment
CN116612642B (en) * 2023-07-19 2023-10-17 长沙海信智能系统研究院有限公司 Vehicle continuous lane change detection method and electronic equipment
CN116828398A (en) * 2023-08-29 2023-09-29 中国信息通信研究院 Tracking behavior recognition method and device, electronic equipment and storage medium
CN116828398B (en) * 2023-08-29 2023-11-28 中国信息通信研究院 Tracking behavior recognition method and device, electronic equipment and storage medium
CN117437261A (en) * 2023-10-08 2024-01-23 南京威翔科技有限公司 Tracking method suitable for edge-end remote target
CN117456407A (en) * 2023-10-11 2024-01-26 中国人民解放军军事科学院系统工程研究院 Multi-target image tracking method and device
CN117456407B (en) * 2023-10-11 2024-04-19 中国人民解放军军事科学院系统工程研究院 Multi-target image tracking method and device
CN117523379A (en) * 2023-11-20 2024-02-06 广东海洋大学 Underwater photographic target positioning method and system based on AI
CN117523379B (en) * 2023-11-20 2024-04-30 广东海洋大学 Underwater photographic target positioning method and system based on AI
CN117689907A (en) * 2024-02-04 2024-03-12 福瑞泰克智能系统有限公司 Vehicle tracking method, device, computer equipment and storage medium
CN117689907B (en) * 2024-02-04 2024-04-30 福瑞泰克智能系统有限公司 Vehicle tracking method, device, computer equipment and storage medium
CN117746304A (en) * 2024-02-21 2024-03-22 浪潮软件科技有限公司 Refrigerator food material identification and positioning method and system based on computer vision
CN117746304B (en) * 2024-02-21 2024-05-14 浪潮软件科技有限公司 Refrigerator food material identification and positioning method and system based on computer vision

Also Published As

Publication number Publication date
CN115995063A (en) 2023-04-21

Similar Documents

Publication Publication Date Title
WO2023065395A1 (en) Work vehicle detection and tracking method and system
CN108304798B (en) Street level order event video detection method based on deep learning and motion consistency
CN109360226B (en) Multi-target tracking method based on time series multi-feature fusion
Tsintotas et al. Assigning visual words to places for loop closure detection
CN108470354B (en) Video target tracking method and device and implementation device
Jana et al. YOLO based Detection and Classification of Objects in video records
WO2020215492A1 (en) Multi-bernoulli multi-target video detection and tracking method employing yolov3
CN103295242B (en) A kind of method for tracking target of multiple features combining rarefaction representation
CN107145862B (en) Multi-feature matching multi-target tracking method based on Hough forest
CN110781262B (en) Semantic map construction method based on visual SLAM
Lin et al. Integrating graph partitioning and matching for trajectory analysis in video surveillance
CN104303193A (en) Clustering-based object classification
CN112052802B (en) Machine vision-based front vehicle behavior recognition method
CN104424634A (en) Object tracking method and device
CN112836639A (en) Pedestrian multi-target tracking video identification method based on improved YOLOv3 model
CN111046856B (en) Parallel pose tracking and map creating method based on dynamic and static feature extraction
CN112241969A (en) Target detection tracking method and device based on traffic monitoring video and storage medium
CN112651995A (en) On-line multi-target tracking method based on multifunctional aggregation and tracking simulation training
Sharma et al. Vehicle identification using modified region based convolution network for intelligent transportation system
Doulamis Coupled multi-object tracking and labeling for vehicle trajectory estimation and matching
CN116402850A (en) Multi-target tracking method for intelligent driving
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
Dadgar et al. Multi-view data fusion in multi-object tracking with probability density-based ordered weighted aggregation
Yang et al. Probabilistic projective association and semantic guided relocalization for dense reconstruction
Zhu et al. (Retracted) Transfer learning-based YOLOv3 model for road dense object detection