CN112380997A - Model identification and undercarriage retraction and extension detection method based on deep learning - Google Patents


Info

Publication number
CN112380997A
Authority
CN
China
Prior art keywords
detection
target
yolov3
target tracking
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011277840.4A
Other languages
Chinese (zh)
Inventor
陈海峰
朱学伟
刘青
贾昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Joho Technology Co ltd
Original Assignee
Wuhan Joho Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Joho Technology Co ltd
Priority to CN202011277840.4A
Publication of CN112380997A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/467 Encoded features or binary features, e.g. local binary patterns [LBP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of flight safety, in particular to a deep-learning-based method for aircraft model identification and landing gear retraction/extension detection. A YOLOv3 target tracking thread and a KCF target tracking thread are designed; after the YOLOv3 thread detects the aircraft or the retraction/extension of the landing gear, the detected category information and retraction information are sent to the KCF thread. The KCF thread then performs target position detection on the position information supplied by the YOLOv3 thread, calculates the responses between samples, takes the detection box with the maximum response value as the target box, and acquires its confidence information. The two results are fused and compared: if the calculated position difference is within a set threshold, the mean of the two threads' position information and confidences is output. By exploiting the good tracking performance of KCF, the method reduces false detections of the YOLOv3 algorithm caused by sudden environmental changes and overcomes YOLOv3's excessive dependence on training samples.

Description

Model identification and undercarriage retraction and extension detection method based on deep learning
Technical Field
The invention relates to the technical field of flight safety, in particular to a model identification and undercarriage retraction detection method based on deep learning.
Background
Detection-based target tracking is a common target tracking approach: tracking over a video sequence is completed by detecting and identifying the target in each frame of the image. Laser & Infrared (1997, No. 03) discloses an artificial-intelligence monitoring system for landing gear retraction, in which pre-processed video signals are fed into an artificial neural network together with distance signals acquired by a laser range finder. Patent No. 201610554460.8 discloses an automatic identification system for aircraft landing gear retraction that achieves full-time, high-definition binocular observation of the aircraft landing area under multispectral conditions. Patent No. 201811313628.1 discloses a method for detecting the landing gear retraction state of multiple aircraft types from a ground-based view angle, which automatically determines the landing gear state through feature analysis and multi-frame comprehensive decision processing.
These comparison documents disclose technical schemes in which an artificial neural network identifies a target image. In the prior art, the deep-learning YOLOv3 algorithm performs well in target detection, but it places high requirements on the training samples prepared in advance: once a shot contains a target or background not covered by the training samples, YOLOv3 cannot detect the target, and tracking fails. In addition, target tracking algorithms suffer reduced accuracy under adverse effects such as illumination changes and deformation. For these reasons, a model identification and undercarriage retraction detection method based on deep learning is provided.
Disclosure of Invention
The invention aims to provide a model identification and undercarriage retraction detection method based on deep learning, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme: the model identification and undercarriage retraction detection method based on deep learning comprises the following steps:
S1, designing a YOLOv3 target tracking thread and a KCF target tracking thread; after the YOLOv3 target tracking thread detects the aircraft or the retraction/extension of the landing gear, sending the detected category information and retraction information to the KCF target tracking thread;
S2, carrying out target position detection on the target position information detected by the YOLOv3 target tracking thread using the KCF target tracking thread, calculating the responses between samples, taking the detection box with the maximum response value as the target box, and acquiring its confidence information;
S3, fusing and comparing the information acquired by the KCF target tracking thread with the detection result of the YOLOv3 target tracking thread: if the calculated position difference is within a set threshold, outputting the mean of the two threads' position information and confidences; if the results differ greatly, not outputting the information and updating the KCF template.
Preferably, the network building and model training process of YOLOv3 is as follows:
S11, image scaling and segmentation: the input image is first divided into an S × S grid of equal-sized cells, which is then processed in two respects;
S12, bounding box prediction: in this step, YOLO gives two prediction boxes for each grid cell, anchored at the cell's center point with self-defined sizes; each cell predicts B bounding boxes, and each bounding box has four coordinates and a confidence, so the final prediction is an S × S × (B × 5 + C) tensor, where S is the number of grid divisions, B is the number of targets each cell is responsible for, and C is the number of classes;
S13, class probability map prediction: each cell is responsible for classification, and the predicted result is placed in the final S × S × (B × 5 + C) result;
S14, passing the image through a fully convolutional neural network: the Darknet-53 multi-scale classification model with four convolutional layers and two fully connected networks;
S15, setting the loss function as the sum of squares of the bounding box coordinate error, the IOU error and the classification error;
S16, obtaining the optimal box as the regression result through a non-maximum suppression algorithm;
S17, correcting the network parameters through multiple iterations.
Preferably, the target detection algorithm flow of the detector of the KCF target tracking thread is as follows:
S21, inputting a video and extracting a single frame;
S22, judging whether the image is the first frame; if so, initializing the target rectangle position and constructing training samples from the target position via a circulant matrix; if not, constructing detection samples at the target position by cyclic shifts;
S23, extracting HOG features of the image at the search rectangle, converting sample training into a ridge regression problem via the Fourier transform, performing the discrete Fourier transform, calculating the weight coefficients of the training samples, and updating the parameters; then judging whether there is further video input: if so, looping back to step S21, otherwise the target detection process is complete.
Preferably, the parameter updating process of step S23 is: first, HOG features are extracted from the detection sample and Fourier-transformed; second, the cross-correlation matrix of the detection samples is calculated; then, the response value of the detection sample is calculated, taken as the confidence, and the position information is updated; finally, whether the response value is greater than 0.75 is judged: if so, HOG features of the image at the search rectangle are extracted; otherwise, no parameter update is performed.
Preferably, YOLOv3 is accelerated in the response (inference) phase by an accelerator for the neural network response phase, shortening the response time.
Preferably, the model generated by the YOLOv3 algorithm is subjected to pruning ("branch reduction"): training, pruning and fine-tuning of the pruned model are performed in a loop. During pruning, the scaling factor γ in batch normalization is used as the importance factor, i.e. the smaller γ is, the less important the corresponding network layer is and the more safely it can be cut. To constrain the magnitude of γ, a λ-weighted regularization term on γ is added to the objective function, so that automatic pruning is achieved during training.
Compared with the prior art, the invention has the following beneficial effects: a deep learning model for aircraft model identification and landing gear retraction identification is constructed on the basis of the YOLOv3 detection algorithm; against YOLOv3's excessive dependence on training samples, a model/landing-gear detection system based on the KCF and YOLOv3 algorithms is adopted, exploiting KCF's good tracking performance to reduce false detections of the YOLOv3 algorithm caused by sudden environmental changes; meanwhile, the YOLOv3 model is pruned and compressed, increasing the response speed in engineering applications. Early tests verified that, under the same hardware conditions, the detection speed can be improved by 50%, the precision can be improved by about 4% compared with a conventional deep-learning target detection algorithm, and the mAP improves by about 2% over YOLOv2.
Drawings
FIG. 1 is a flow chart of a detection algorithm based on KCF and YOLOv3 in accordance with the present invention;
FIG. 2 is a plot of the LOSS and IOU curves against batch and iteration count for this training;
FIG. 3 is a bar graph of the effect of the choice of λ on γ;
FIG. 4 is a line graph of pruning proportion versus accuracy for the present invention;
FIG. 5 is an R-P graph for the aircraft type and the landing gear of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort shall fall within the protection scope of the present invention.
The invention provides the following technical scheme. First, a YOLOv3 detector is designed; each time a target is detected, it is judged whether the target is the same as one held by an existing KCF detector; otherwise a new KCF detector is generated to track the target. The specific method is as follows: design a YOLOv3 target tracking thread and a KCF target tracking thread; after YOLOv3 detects the aircraft or the retraction/extension of the landing gear, send the detected category information and retraction information to the KCF target tracking thread; then use the KCF thread to perform target position detection on the target position information detected by YOLOv3, calculate the responses between samples, take the detection box with the maximum response value as the target box, and acquire its confidence information; finally, fuse and compare this with the detection result of the YOLOv3 thread: if the calculated position difference is within a certain threshold, output the mean of the two threads' position information and confidence values; if the results differ greatly, do not output the information and update the KCF template. A minimal sketch of this fusion rule is given below.
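As an illustration, the following Python sketch implements the fusion rule just described, under stated assumptions: the (x, y, w, h) box format, the Euclidean distance between box centers as the "position difference", the threshold value, and the KCF re-initialization call are all hypothetical, since the scheme specifies only the compare/average/update behavior.

import numpy as np

POS_THRESHOLD = 20.0  # pixels; hypothetical value for the "set threshold"

def fuse_results(yolo_box, yolo_conf, kcf_box, kcf_conf, kcf_tracker):
    # Fuse one YOLOv3 detection with the KCF result for the same target.
    yolo_box = np.asarray(yolo_box, dtype=float)  # assumed (x, y, w, h)
    kcf_box = np.asarray(kcf_box, dtype=float)
    # Position difference between the two threads' box centers.
    diff = np.linalg.norm(yolo_box[:2] - kcf_box[:2])
    if diff <= POS_THRESHOLD:
        # Agreement: output the mean position and the mean confidence.
        return (yolo_box + kcf_box) / 2.0, (yolo_conf + kcf_conf) / 2.0
    # Large disagreement: output nothing and update the KCF template from
    # the YOLOv3 detection (hypothetical re-initialization interface).
    kcf_tracker.reinit(yolo_box)
    return None, None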
In this scheme, targets such as the five aircraft types and their landing gear are detected on the premise of an improved YOLOv3 model. Compared with traditional target detection methods, this has advantages at the detection and recognition level of a single-frame image and obtains higher detection accuracy and speed. The KCF target tracking algorithm trains a target detector online during tracking, using the target and the target trajectory of the current frame; in the next frame, the detector examines the target at the motion-trajectory position predicted by the tracker, judges whether it is the target to be detected, and updates the original detector according to the detection result. Targets inside the track are typically recorded as positive samples and the remaining environment as negative samples. However, the KCF algorithm still has shortcomings in scale change, feature extraction, target loss and the like.
To address the shortcomings of the existing methods, this scheme combines the KCF algorithm with the YOLO algorithm, overcoming the adverse effects of illumination, deformation and the like on the target tracking algorithm and improving its accuracy, robustness and adaptability. The specific flow is shown in FIG. 1.
Implementation flow of YOLOv3 detection method
Data preparation and environment construction scheme
To recognize the two types of targets, a convolutional neural network is used to build the recognition model. This requires a large number of aircraft images (visible-light and infrared) and landing gear images; data collection and labeling of 20,000 target instances across 5 classes were completed as the training data set. The labeling tool LabelImg was used to annotate the aircraft and the landing gear in each image in turn, producing calibration files.
A VOC data set is made from the large number of collected samples. The training hardware platform of the experiments in this scheme is a 9700K CPU + Titan X GPU with CUDA and CUDNN; the test hardware platform is an Nvidia Jetson series development board; the software environment is Ubuntu 16.04 + OpenCV 2.4.9 + Python 3. All environments are built on CUDA 9.0 and CUDNN 7.0; the DarkNet framework is downloaded and built, the YOLOv3 network structure design is completed, and the training files and training environment are configured. A sketch of converting the LabelImg (VOC) annotations into YOLO's training format is given below.
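As a small illustration of the data preparation just described, the following Python sketch converts one LabelImg/VOC XML annotation into YOLO's "class cx cy w h" text format (normalized to [0, 1]); the class names are hypothetical, since the actual 5 class labels are not given in the text.

import xml.etree.ElementTree as ET

CLASSES = ["type_a", "type_b", "type_c", "type_d", "type_e"]  # hypothetical names for the 5 classes

def voc_to_yolo(xml_path):
    root = ET.parse(xml_path).getroot()
    w = float(root.findtext("size/width"))
    h = float(root.findtext("size/height"))
    lines = []
    for obj in root.findall("object"):
        cls = CLASSES.index(obj.findtext("name"))
        x1, y1, x2, y2 = (float(obj.findtext("bndbox/" + k))
                          for k in ("xmin", "ymin", "xmax", "ymax"))
        # YOLO format: normalized box center and box size.
        cx, cy = (x1 + x2) / 2.0 / w, (y1 + y2) / 2.0 / h
        bw, bh = (x2 - x1) / w, (y2 - y1) / h
        lines.append("%d %.6f %.6f %.6f %.6f" % (cls, cx, cy, bw, bh))
    return lines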
Network building and model training scheme
The network building and model training algorithm flow adopted by the scheme is as follows:
① Image scaling and segmentation: the input image is first divided into an S × S grid of equal-sized cells, which is subsequently processed in two respects.
② Bounding box prediction: in this step, YOLO gives two prediction boxes for each grid cell. The prediction boxes are anchored at the cell's center point, with self-defined sizes. Each cell predicts B bounding boxes, and each bounding box has four coordinates and a confidence, so the final prediction is an S × S × (B × 5 + C) tensor, where S is the number of grid divisions, B is the number of targets each cell is responsible for, and C is the number of classes.
③ Class probability map prediction: each cell is responsible for classification, and the predictions are likewise placed in the final S × S × (B × 5 + C) result. This means: each cell corresponds to B bounding boxes whose width and height may range over the whole image, and the bounding box of an object is sought with the cell as its center. Each bounding box corresponds to a score representing whether an object exists at that position and how accurate the localization is:
score = Pr(Object) × IOU(pred, truth)
Each cell also corresponds to C probability values; the class with the maximum probability P(Class | Object) is found, and the cell is considered to contain that object or a portion of that object.
④ Passing the image through the fully convolutional neural network: the Darknet-53 multi-scale classification model with four convolutional layers and two fully connected networks.
⑤ Setting the loss function as the sum of squares of the bounding box coordinate error, the IOU error and the classification error.
⑥ Obtaining the optimal box as the regression result through a non-maximum suppression algorithm.
⑦ Iteratively correcting the network parameters. A sketch of decoding the S × S × (B × 5 + C) output is given below.
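To make the S × S × (B × 5 + C) layout concrete, here is a small decoding sketch in Python; the tensor layout (B boxes of 5 values first, then C class scores per cell) and the S, B, C values are assumptions for illustration, since the text fixes only the overall shape.

import numpy as np

S, B, C = 7, 2, 5  # illustrative grid size, boxes per cell, class count

def decode(pred):
    # Decode an S x S x (B*5 + C) YOLO output tensor (assumed layout).
    pred = pred.reshape(S, S, B * 5 + C)
    boxes = pred[..., :B * 5].reshape(S, S, B, 5)  # x, y, w, h, confidence
    class_probs = pred[..., B * 5:]                # P(Class | Object) per cell
    # Final score per box and class: box confidence x class probability.
    scores = boxes[..., 4:5] * class_probs[:, :, None, :]
    row, col, b, cls = np.unravel_index(np.argmax(scores), scores.shape)
    return boxes[row, col, b, :4], cls, scores[row, col, b, cls]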
The labeled image files and data files are input into the built network and trained with the above algorithm. The initial learning rate is set to 0.0001; training visualization is implemented in code, and the IOU (intersection over union), the recall rate and the loss function curve are observed during training, as shown in FIG. 2. The loss curve shows the trend of the loss function during training: images were fed in groups of 64, with 8 images per training batch, for ten thousand iterations in total. The decline of the loss function essentially stabilizes by seven thousand iterations, with no obvious change between seven and ten thousand, so the model is regarded as converged at seven thousand iterations. The experiment was run in an environment with a Titan X GPU.
Design scheme of KCF tracking flow
To counter the YOLOv3 algorithm's excessive dependence on the data set, and to enhance the robustness, adaptability and accuracy of the model, this scheme adds a KCF detection/target-tracking thread for miss checking and position correction of the YOLOv3 detection algorithm. The KCF target detection algorithm flow in this scheme is as follows:
① Input the video and extract a single frame.
② Judge whether the image is the first frame; if so, initialize the target rectangle position and construct training samples from the target position via a circulant matrix; if not, construct detection samples at the target position by cyclic shifts.
③ Extract HOG features of the image at the search rectangle, convert sample training into a ridge regression problem via the Fourier transform, perform the discrete Fourier transform, calculate the weight coefficients of the training samples, and update the parameters; then judge whether there is further video input: if so, loop back to step ①, otherwise the target detection process is complete.
④ Construct detection samples at the target position by cyclic shifts.
⑤ Update the parameters: first, extract HOG features from the detection sample and apply the Fourier transform; second, calculate the cross-correlation matrix of the detection samples; then, calculate the response value of the detection sample, take it as the confidence, and update the position information; finally, judge whether the response value is greater than 0.75: if so, extract HOG features of the image at the search rectangle; otherwise, perform no parameter update. A minimal sketch of the KCF core used by this flow is given below.
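The following is a minimal sketch of the KCF core that the flow above relies on: Gaussian-kernel ridge regression trained and evaluated in the Fourier domain over cyclic shifts. HOG extraction is abstracted away (single-channel patches are used here), and the sigma and regularization values are assumptions.

import numpy as np

def gaussian_kernel(x1, x2, sigma=0.5):
    # Kernel correlation over all cyclic shifts, computed via the FFT.
    c = np.fft.ifft2(np.fft.fft2(x1) * np.conj(np.fft.fft2(x2))).real
    d = (x1 ** 2).sum() + (x2 ** 2).sum() - 2.0 * c
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x1.size))

def train(x, y, lam=1e-4):
    # Ridge regression in the Fourier domain: alpha_f = y_f / (k_xx_f + lam).
    k = gaussian_kernel(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def detect(alphaf, x, z):
    # Response map over all cyclic shifts of the detection sample z; the
    # maximum response gives the new target position and the confidence.
    k = gaussian_kernel(x, z)
    resp = np.fft.ifft2(alphaf * np.fft.fft2(k)).real
    pos = np.unravel_index(np.argmax(resp), resp.shape)
    return pos, resp.max()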
Optimization scheme of YOLOv3
TensorRT acceleration
Practical deployment of deep learning often faces the problem that the network model's response time is too long, so an inference-phase acceleration method is needed. TensorRT is an accelerator provided by NVIDIA for the neural network inference (response) phase. Compared with training, the model structure and parameters are fixed during inference, the batch size is generally small, and the precision requirement is lower than in training, so there is a large optimization space. This scheme adopts TensorRT, which optimizes in the following respects:
① Merging certain layers
Sometimes the dominant cost is not computation but memory reads and writes. TensorRT merges the operations of multiple layers into a single layer, which reduces kernel launches and memory access to a certain extent; operations such as convolution and activation are done in one pass. In addition, layers with the same input and the same convolution kernel size are merged into one layer, and repeated computation on the same input is eliminated using pre-allocated buffers and the like.
② Support for FP16 or INT8 data types
Training requires high numerical precision because of gradients and the like, but the inference phase can use lower-precision data types to accelerate computation and reduce the model size. A hedged sketch of enabling FP16 is given below.
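As an illustration of enabling reduced precision, here is a sketch using the TensorRT Python API of the 7.x era; the ONNX export of the YOLOv3 model and the file name are assumptions, and exact builder calls vary across TensorRT versions.

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(flags)
parser = trt.OnnxParser(network, logger)

with open("yolov3.onnx", "rb") as f:  # hypothetical ONNX export of the model
    parser.parse(f.read())

config = builder.create_builder_config()
config.max_workspace_size = 1 << 28   # 256 MiB of builder scratch space
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 inference

engine = builder.build_engine(network, config)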
Automatic adjustment of kernel
TensorRT has optimization of a write algorithm level aiming at different hyper-parameters, for example, which algorithm is used for convolution operation can be determined according to the hyper-parameters such as the size of a convolution kernel and the input size.
Fourthly, dynamic tensor memory
TensorRT reduces the memory overhead through optimization, and improves the reloading of the memory.
Multiple parallel operations
For the case of CUDA support, parallel operations may be performed for multiple branches of the same input.
Model compression
The model generated by the YOLOv3 algorithm has 106 layers in total; its parameter count and network structure are very complex, which greatly affects the response speed in engineering applications. Experiments show that the original model reaches only 5 FPS on an Nvidia Jetson Nano. Redundant layers exist in the network, so the network structure needs to be pruned. The pruning ("branch reduction") flow is as follows:
① The scaling factor γ in batch normalization is used as the importance factor: the smaller γ is, the less important the corresponding network layer is, and it can be cut.
② A λ-weighted regularization term on γ is added to the objective function to constrain the magnitude of γ, so that automatic pruning is achieved during training, which conventional model compression does not achieve. The procedure has three parts: first, training; second, pruning; third, fine-tuning the pruned model; these are executed in a loop.
The specific operational details are as follows. λ is chosen experimentally, usually 0.00001 or 0.0001 as the case requires. After the γ values are obtained, a method similar to the energy-ratio criterion in PCA is used: the γ values of the current layer are summed, sorted from large to small, and the larger portion is kept; this scheme keeps about 70%. The effect of the choice of λ on γ is shown in FIG. 3. A sketch of the γ-sparsity training step is given below.
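A minimal sketch of the γ-sparsity training step, in the network-slimming style that the description matches, is given below; the λ value and the per-layer 70% keep-ratio follow the text, while the use of PyTorch and the helper names are assumptions.

import torch
import torch.nn as nn

LAM = 1e-4  # lambda, chosen experimentally (0.00001 or 0.0001 per the scheme)

def add_gamma_sparsity_grad(model, lam=LAM):
    # Add the subgradient of lam * sum(|gamma|) to every BN scale factor,
    # so gamma values of unimportant channels are driven toward zero.
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.weight.grad.add_(lam * torch.sign(m.weight.detach()))

def channels_to_keep(bn, keep_ratio=0.7):
    # Keep the largest ~70% of gamma values in this layer, as in the scheme.
    gammas = bn.weight.detach().abs()
    k = max(1, int(keep_ratio * gammas.numel()))
    return torch.topk(gammas, k).indices

# Usage inside the training loop (sketch):
#   loss.backward()
#   add_gamma_sparsity_grad(model)  # inject the sparsity subgradient
#   optimizer.step()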
When λ = 0, the objective function does not penalize γ. When λ = 0.00001, more than 450 γ values are found to be close to 0 overall. When λ = 0.0001, the sparsity constraint on γ is stronger, and nearly 2000 γ values lie around 0.0.
Pruning percentage: the more is cut, the smaller the model, but cutting too much loses precision. These goals conflict, so this scheme ran experimental comparisons and found that precision drops sharply once pruning exceeds 80%, as shown in FIG. 4 (baseline; sparsity-trained; pruned; fine-tuned; test error; pruned channels). In application, accuracy and speed are traded off according to the engineering environment and requirements. This scheme adopts a 35% pruning optimization of the model, raising the speed to 30 fps.
Verification experiment results and analysis of the scheme and algorithms
The design of the dual-thread model is completed with the above algorithms. After training, the model is tested on images and videos; the test video has 31,925 frames, of which 18,723 contain targets, and the aircraft and the landing gear coexist in the video.
The detection results for the first video are visualized in code, and the confidence of each frame's detection result is given in Table 1. With the IOU threshold set to 0.7, R-P curves are drawn for the aircraft type and the landing gear respectively, as shown in FIG. 5. The algorithm is evaluated statistically by mAP; the experimental result indices are shown in Table 2, and a sketch of the standard AP computation is given after Table 2.
TABLE 1 Per-frame detection results and confidence statistics
(The contents of Table 1 appear as images in the original publication and are not reproduced here.)
Table 2 evaluation of the algorithm of this scheme
(The contents of Table 2 appear as images in the original publication and are not reproduced here.)
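mAP is the mean of the per-class average precision. As a reference for how the mAP in Table 2 can be obtained from the R-P curves of FIG. 5, the following sketches the standard PASCAL-VOC-style AP computation (area under the monotonic precision envelope); the recall/precision arrays are assumed to come from detections matched at the IOU threshold of 0.7.

import numpy as np

def average_precision(recall, precision):
    # Area under the monotonically decreasing envelope of the R-P curve,
    # with inputs sorted by descending detection confidence.
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(p) - 2, -1, -1):   # enforce a decreasing envelope
        p[i] = max(p[i], p[i + 1])
    step = np.where(r[1:] != r[:-1])[0]   # indices where recall changes
    return float(np.sum((r[step + 1] - r[step]) * p[step + 1]))

# mAP over the aircraft-type and landing-gear classes (hypothetical curves):
# mAP = np.mean([average_precision(r_c, p_c) for r_c, p_c in per_class_curves])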
In terms of accuracy, the algorithm of this scheme is superior to traditional detection methods for aircraft type and landing gear recognition. Among deep-learning-based detection algorithms, it improves on a LeNet-5-based landing gear detection algorithm by 6.3% and on a method based on moving target detection with ResNet-v2 by 2.9%; at the same time, it lowers the false detection rate and the missed detection rate. In terms of running speed, the algorithm reaches 56 fps in the training environment; transplanted to an RTX 2060 GPU system and using model pruning and compression, it reaches 30 frames per second.
On the premise of the improved YOLOv3 model, the method detects targets such as the five aircraft types and their landing gear; compared with traditional target detection methods, it has advantages at the single-frame detection and recognition level and obtains higher detection accuracy and speed.
The prior-art KCF target tracking algorithm trains a target detector online during tracking, using the target and the target trajectory of the current frame; in the next frame, the detector examines the target at the motion-trajectory position predicted by the tracker, judges whether it is the target to be detected, and updates the original detector according to the detection result. Targets inside the track are typically recorded as positive samples and the remaining environment as negative samples. However, the KCF algorithm still has shortcomings in scale change, feature extraction, target loss and the like. This scheme combines the KCF and YOLO algorithms, overcoming the adverse effects of illumination, deformation and the like on the target tracking algorithm, improving its accuracy, robustness and adaptability, and achieving unexpected technical effects.
The above description covers only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent substitution or change of the technical solution and its inventive concept, made by a person skilled in the art within the technical scope disclosed by the present invention, shall fall within the protection scope of the present invention.

Claims (6)

1. A model identification and undercarriage retraction detection method based on deep learning, characterized by comprising the following steps:
S1, designing a YOLOv3 target tracking thread and a KCF target tracking thread; after the YOLOv3 target tracking thread detects the aircraft or the retraction/extension of the landing gear, sending the detected category information and retraction information to the KCF target tracking thread;
S2, carrying out target position detection on the target position information detected by the YOLOv3 target tracking thread using the KCF target tracking thread, calculating the responses between samples, taking the detection box with the maximum response value as the target box, and acquiring its confidence information;
S3, fusing and comparing the information acquired by the KCF target tracking thread with the detection result of the YOLOv3 target tracking thread: if the calculated position difference is within a set threshold, outputting the mean of the two threads' position information and confidences; if the results differ greatly, not outputting the information and updating the KCF template.
2. The deep learning-based model identification and undercarriage retraction detection method according to claim 1, wherein the network building and model training process of YOLOv3 is as follows:
S11, image scaling and segmentation: the input image is first divided into an S × S grid of equal-sized cells, which is then processed in two respects;
S12, bounding box prediction: in this step, YOLO gives two prediction boxes for each grid cell, anchored at the cell's center point with self-defined sizes; each cell predicts B bounding boxes, and each bounding box has four coordinates and a confidence, so the final prediction is an S × S × (B × 5 + C) tensor, where S is the number of grid divisions, B is the number of targets each cell is responsible for, and C is the number of classes;
S13, class probability map prediction: each cell is responsible for classification, and the predicted result is placed in the final S × S × (B × 5 + C) result;
S14, passing the image through a fully convolutional neural network: the Darknet-53 multi-scale classification model with four convolutional layers and two fully connected networks;
S15, setting the loss function as the sum of squares of the bounding box coordinate error, the IOU error and the classification error;
S16, obtaining the optimal box as the regression result through a non-maximum suppression algorithm;
S17, correcting the network parameters through multiple iterations.
3. The deep learning-based model identification and undercarriage retraction detection method according to claim 1, wherein the target detection algorithm flow of the detector of the KCF target tracking thread is as follows:
S21, inputting a video and extracting a single frame;
S22, judging whether the image is the first frame; if so, initializing the target rectangle position and constructing training samples from the target position via a circulant matrix; if not, constructing detection samples at the target position by cyclic shifts;
S23, extracting HOG features of the image at the search rectangle, converting sample training into a ridge regression problem via the Fourier transform, performing the discrete Fourier transform, calculating the weight coefficients of the training samples, and updating the parameters; then judging whether there is further video input: if so, looping back to step S21, otherwise the target detection process is complete.
4. The deep learning-based model identification and undercarriage retraction detection method according to claim 3, wherein the parameter updating process of step S23 is: first, HOG features are extracted from the detection sample and Fourier-transformed; second, the cross-correlation matrix of the detection samples is calculated; then, the response value of the detection sample is calculated, taken as the confidence, and the position information is updated; finally, whether the response value is greater than 0.75 is judged: if so, HOG features of the image at the search rectangle are extracted; otherwise, no parameter update is performed.
5. The deep learning-based model identification and undercarriage retraction detection method according to claim 1, wherein YOLOv3 is accelerated in the response (inference) phase by an accelerator for the neural network response phase, shortening the response time.
6. The deep learning-based model identification and undercarriage retraction detection method according to claim 1, wherein the model generated by the YOLOv3 algorithm is subjected to pruning ("branch reduction"): training, pruning and fine-tuning of the pruned model are performed in a loop; during pruning, the scaling factor γ in batch normalization is used as the importance factor, i.e. the smaller γ is, the less important the corresponding network layer is and the more safely it can be cut; to constrain the magnitude of γ, a λ-weighted regularization term on γ is added to the objective function, so that automatic pruning is achieved during training.
CN202011277840.4A 2020-11-16 2020-11-16 Model identification and undercarriage retraction and extension detection method based on deep learning Pending CN112380997A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011277840.4A CN112380997A (en) 2020-11-16 2020-11-16 Model identification and undercarriage retraction and extension detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011277840.4A CN112380997A (en) 2020-11-16 2020-11-16 Model identification and undercarriage retraction and extension detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN112380997A (en) 2021-02-19

Family

ID=74584717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011277840.4A Pending CN112380997A (en) 2020-11-16 2020-11-16 Model identification and undercarriage retraction and extension detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN112380997A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147807A (en) * 2019-01-04 2019-08-20 上海海事大学 A kind of ship intelligent recognition tracking
CN110321853A (en) * 2019-07-05 2019-10-11 杭州巨骐信息科技股份有限公司 Distribution cable external force damage prevention system based on video intelligent detection
CN110706211A (en) * 2019-09-17 2020-01-17 中国矿业大学(北京) Convolutional neural network-based real-time detection method for railway roadbed disease radar map
CN110706266A (en) * 2019-12-11 2020-01-17 北京中星时代科技有限公司 Aerial target tracking method based on YOLOv3
CN111325342A (en) * 2020-02-19 2020-06-23 深圳中兴网信科技有限公司 Model compression method and device, target detection equipment and storage medium
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111832607A (en) * 2020-05-28 2020-10-27 东南大学 Bridge disease real-time detection method based on model pruning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113138382A (en) * 2021-04-27 2021-07-20 中国电子科技集团公司第二十八研究所 Fully-automatic approach landing monitoring method for civil and military airport
CN113138382B (en) * 2021-04-27 2021-11-02 中国电子科技集团公司第二十八研究所 Fully-automatic approach landing monitoring method for civil and military airport
CN114627339A (en) * 2021-11-09 2022-06-14 昆明物理研究所 Intelligent recognition and tracking method for border crossing personnel in dense jungle area and storage medium
CN114627339B (en) * 2021-11-09 2024-03-29 昆明物理研究所 Intelligent recognition tracking method and storage medium for cross border personnel in dense jungle area
CN114596335A (en) * 2022-03-01 2022-06-07 广东工业大学 Unmanned ship target detection tracking method and system
CN114596335B (en) * 2022-03-01 2023-10-31 广东工业大学 Unmanned ship target detection tracking method and system

Similar Documents

Publication Publication Date Title
KR102382693B1 (en) Learning method and learning device of pedestrian detector for robust surveillance based on image analysis by using gan and testing method and testing device using the same
Zhang et al. Identification of maize leaf diseases using improved deep convolutional neural networks
CN112380997A (en) Model identification and undercarriage retraction and extension detection method based on deep learning
CN109613002B (en) Glass defect detection method and device and storage medium
CN109117876A (en) A kind of dense small target deteection model building method, model and detection method
CN109241982A (en) Object detection method based on depth layer convolutional neural networks
CN106295613A (en) A kind of unmanned plane target localization method and system
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN111160407A (en) Deep learning target detection method and system
CN112417981B (en) Efficient recognition method for complex battlefield environment targets based on improved FasterR-CNN
CN116824413A (en) Aerial image target detection method based on multi-scale cavity convolution
CN116385958A (en) Edge intelligent detection method for power grid inspection and monitoring
CN115965862A (en) SAR ship target detection method based on mask network fusion image characteristics
CN115984543A (en) Target detection algorithm based on infrared and visible light images
Wang et al. Deep learning model for target detection in remote sensing images fusing multilevel features
CN115994900A (en) Unsupervised defect detection method and system based on transfer learning and storage medium
Chen et al. Object detection using deep learning: Single shot detector with a refined feature-fusion structure
CN117746077B (en) Chip defect detection method, device, equipment and storage medium
Mittel et al. Vision-based crack detection using transfer learning in metal forming processes
CN113177956A (en) Semantic segmentation method for unmanned aerial vehicle remote sensing image
Ouyang et al. Aerial target detection based on the improved YOLOv3 algorithm
Liu et al. Object detection algorithm based on lightweight YOLOv4 for UAV
Li et al. Research on textile defect detection based on improved cascade R-CNN
CN115018884B (en) Visible light infrared visual tracking method based on multi-strategy fusion tree
CN114240822A (en) Cotton cloth flaw detection method based on YOLOv3 and multi-scale feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210219)