CN115661595A - Multi-model dynamic fusion method in deep learning object detection and storage medium

Info

Publication number: CN115661595A
Application number: CN202211301088.1A
Authority: CN
Inventors: 赵李强; 孙倩; 赵云; 高雪林
Original assignee: Kunming Enersun Technology Co Ltd
Current assignee: Kunming Enersun Technology Co Ltd
Priority date: 2022-10-24
Filing date: 2022-10-24
Publication date: 2023-01-31

Abstract

The invention discloses a multi-model dynamic fusion method in deep learning object detection and a storage medium. The method comprises the following steps: step 1, obtaining an object detection model, either by training a deep learning-based object detection model on an existing data set or by directly integrating an existing object detection model; step 2, integrating the object detection models by loading them dynamically according to a configuration file; and step 3, applying a multi-model fusion algorithm that comprises: (1) loading and parsing models of different series, and performing forward prediction on an input image to obtain dense prediction boxes and class outputs; (2) adding the output boxes of the different models to a list; and (3) inputting all the dense prediction boxes into a non-maximum suppression module to obtain an optimal object detection result. The invention realizes the dynamic fusion of deep learning object detection models of different series, increases the reusability of deep learning models, and greatly improves the overall performance of the object detection models.

Description

Multi-model dynamic fusion method in deep learning object detection and storage medium

Technical Field

The invention relates to an object detection method, in particular to a multi-model dynamic fusion method and a storage medium in deep learning object detection.

Background

Object detection is a machine vision technique that uses a computer and related algorithms to classify and locate the objects contained in an image. Object detection techniques fall into two types: traditional techniques and deep learning-based techniques. Traditional techniques rely on manually designed features and focus on feature design and classifier training; they struggle to obtain an effective feature representation of the target object in an image and are particularly sensitive to environmental noise, which severely restricts their engineering application. Deep learning-based techniques automatically extract an effective feature representation of the target object from training data and are more robust to environmental noise, so deep learning-based object detection algorithms have been adopted in engineering applications to a certain extent. In recent years, as these algorithms have been applied in engineering, problems of low detection and recognition accuracy and weak generalization have gradually been exposed, and how to improve the recognition performance of deep learning object detection algorithms is a problem to be solved urgently.

At present, deep learning-based object detection algorithms depend mainly on the strong fitting capability of neural networks, which can automatically learn the features of the object to be detected from complex image data. However, model capacity cannot be increased simply by making the model structure more complex. The capacity of a deep learning model grows with the size of the network structure and the number of layers, but saturates beyond a certain point: the recognition performance of the model stops rising once it reaches a certain value, and the model overfits the training data. In addition, once model training is finished, the target classes fixed in the object detection model cannot be changed unless a new model is trained, which reduces the flexibility with which the model can be applied.

Disclosure of Invention

The invention aims to overcome the above defects and provides a multi-model dynamic fusion method in deep learning object detection and a storage medium.

According to a first aspect, the invention provides a method for dynamic fusion of multiple models in deep learning object detection, which comprises the following steps:

Step 1, object detection model acquisition

The model used in the present invention can come from two sources:

(1) Training an object detection model based on deep learning through an existing data set to obtain a model;

(2) The existing object detection model is directly integrated.

Step 2, integrating object detection models

2.1 Dynamic model loading

In existing deep learning-based object detection models, once model training is finished, the target classes fixed in the model cannot be changed unless the model is retrained, which reduces the application flexibility of the model and makes it poorly reusable.

2.2. Multi-model fusion algorithm

The method aims to solve the problems that existing deep learning-based object detection algorithms cannot have their classes customized after training is completed and have poor reusability. The invention provides a method for fusing a plurality of models: opencv is used to load and parse models of different series (yolo, ssd, fast rcnn), and forward prediction is performed on an input image to obtain dense prediction boxes and class outputs; the output boxes of the different models are then added to a list; finally, all the dense prediction boxes are input into a non-maximum suppression module to obtain an optimal object detection result.
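The list-then-suppress procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the box tuple layout, the function names `iou` and `fuse_detections`, the threshold values, and the choice to suppress only boxes of the same class are all hypothetical, since the text does not specify them.

```python
from typing import List, Tuple

# Hypothetical common box format: (x1, y1, x2, y2, score, class_name)
Box = Tuple[float, float, float, float, float, str]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(per_model_boxes: List[List[Box]],
                    score_threshold: float = 0.25,
                    iou_threshold: float = 0.5) -> List[Box]:
    """Pool the boxes of all models, then apply greedy per-class NMS."""
    # Add every model's output boxes to one list, dropping low scores.
    pooled = [box for boxes in per_model_boxes for box in boxes
              if box[4] >= score_threshold]
    # Visit boxes from highest to lowest score; keep a box only if it does
    # not overlap an already kept box of the same class too strongly.
    pooled.sort(key=lambda box: box[4], reverse=True)
    kept: List[Box] = []
    for box in pooled:
        if all(k[5] != box[5] or iou(k, box) <= iou_threshold for k in kept):
            kept.append(box)
    return kept
```

With two models that both report the same object, only the higher-scoring box survives, while boxes for classes that only one model knows pass through unchanged. That pass-through is how fusion can extend the set of detectable classes without retraining.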

According to a second aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, the computer program being executable by a processor to implement the steps of the method for dynamic fusion of multiple models in deep learning object detection according to the present invention.

The invention has the beneficial effects that:

The object detection models to be fused are specified dynamically through a configuration file. The specified models are loaded with the dnn module in opencv, forward inference is run on the picture to obtain dense detection prediction boxes, and the dense prediction boxes are post-processed with a non-maximum suppression algorithm to output the object detection result. This realizes the dynamic fusion of deep learning object detection models of different series (the Yolo series, the SSD series and the Rcnn series), increases the reusability of deep learning models, and greatly improves the overall performance of the object detection models.
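Loading and forward inference with the dnn module might look roughly like the sketch below. `cv2.dnn.readNet`, `cv2.dnn.blobFromImage`, `setInput`, `getUnconnectedOutLayersNames` and `forward` are existing OpenCV calls. The wrapper function names, the 640x640 input size and the preprocessing values are assumptions for illustration. Each model family generally needs its own preprocessing and output decoding, which the text does not detail.

```python
from typing import Any, List, Sequence


def load_network(weights_path: str, config_path: str = "") -> Any:
    """Load one model file with OpenCV's dnn module."""
    import cv2  # imported lazily; requires the opencv-python package
    # readNet chooses the importer from the file extensions.
    return cv2.dnn.readNet(weights_path, config_path)


def make_blob(image: Any, size: Sequence[int] = (640, 640)) -> Any:
    """Convert a BGR image into a network input blob."""
    import cv2
    # Assumption: scaling to [0, 1] and BGR->RGB swapping, as many YOLO
    # exports expect; SSD and R-CNN models usually need other values.
    return cv2.dnn.blobFromImage(image, scalefactor=1 / 255.0,
                                 size=tuple(size), swapRB=True, crop=False)


def run_forward(net: Any, blob: Any) -> List[Any]:
    """Run one forward pass and return the raw output tensors."""
    net.setInput(blob)
    return list(net.forward(net.getUnconnectedOutLayersNames()))
```

The raw outputs still have to be decoded into boxes, scores and class labels per model family before they can be pooled for non-maximum suppression.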

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 illustrates the dynamic loading of object detection models according to a configuration file.

Detailed Description

As shown in FIG. 1, a method for dynamic fusion of multiple models in deep learning object detection includes the following steps:

Step 1, object detection model acquisition

The model used in the present invention can come from two sources:

(1) Training a deep learning-based object detection model on an existing data set. A model obtained this way is highly customizable, and the classes it contains can be controlled according to the specific application scenario;

(2) Directly integrating an existing object detection model. The main advantage is that no new object detection network needs to be trained, so an application can be built quickly; the drawback is that the model cannot be customized, and the classes it contains cannot be changed as needed.

Step 2, integrating object detection models

The basic idea of the invention is to boost weak-performing object detection models into a strong-performing object detection model. In machine learning theory, a weak detector achieves results only slightly better than random guessing, while a strong detector can come very close to the optimal detector. In actual object detection work, various object detection models such as Yolo, fast rcnn and ssd can be trained from the labeled data. Without fine-tuning, the detection performance of the different model types does not differ significantly. For models of comparable complexity, the performance of the Yolo series object detection models on the COCO2017 val data set is shown in Table 1.

TABLE 1 Comparison of model performance in the Yolo series

Model	YOLOv5-L	YOLOv6-L	YOLOv7-L
Image size	640	640	640
APval	49%	52.5%	51.2%

When the performance of a single model has no further room for improvement, the way to improve overall recognition performance is multi-model integration.

2.1 Dynamic model loading

In existing deep learning-based object detection models, once model training is finished, the target classes fixed in the model cannot be changed unless the model is retrained, which reduces the application flexibility of the model and makes it poorly reusable. In the present invention, a configuration file specifies the models that the fusion algorithm needs to load, as shown in FIG. 2. For example: model 1 can detect A, B, C, D, model 2 can detect E, F, G, model 3 can detect H, I, and configuration file config.
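The configuration-driven selection can be illustrated with a small sketch. The actual configuration format is not reproduced in this text, so the JSON layout, the field names (`models`, `name`, `weights`, `classes`, `enabled`), the file names and the helper functions below are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical configuration contents; every field name is illustrative.
EXAMPLE_CONFIG = """
{
  "models": [
    {"name": "model1", "weights": "model1.onnx",
     "classes": ["A", "B", "C", "D"], "enabled": true},
    {"name": "model2", "weights": "model2.onnx",
     "classes": ["E", "F", "G"], "enabled": true},
    {"name": "model3", "weights": "model3.onnx",
     "classes": ["H", "I"], "enabled": false}
  ]
}
"""


def select_models(config_text: str) -> List[Dict]:
    """Return the model entries that the configuration asks to load."""
    config = json.loads(config_text)
    return [m for m in config["models"] if m.get("enabled", True)]


def detectable_classes(models: List[Dict]) -> List[str]:
    """Union of the classes of the selected models, in first-seen order."""
    seen: List[str] = []
    for model in models:
        for cls in model["classes"]:
            if cls not in seen:
                seen.append(cls)
    return seen
```

Under this layout, enabling model 3 in the configuration would add H and I to the detectable classes without retraining any model, which is the flexibility the dynamic loading step is meant to provide.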

2.2. Multi-model fusion algorithm

Claims

1. A multi-model dynamic fusion method in deep learning object detection is characterized by comprising the following steps:

step 1, object detection model acquisition

Training an object detection model based on deep learning through an existing data set to obtain a model; or directly integrating the existing object detection model;

step 2, integrating the object detection models by loading them dynamically according to a configuration file;

step 3, a multi-model fusion algorithm comprises the following steps:

(1) loading and parsing models of different series, and performing forward prediction on an input image to obtain dense prediction boxes and class outputs;

(2) adding the output boxes of the different models to a list;

(3) inputting all the dense prediction boxes into a non-maximum suppression module to obtain an optimal object detection result.

2. The method for multi-model dynamic fusion in deep learning object detection according to claim 1, characterized in that:

the object detection models include yolo, ssd, and fast rcnn.

3. The method for multi-model dynamic fusion in deep learning object detection according to claim 1, wherein in step (3):

the specified object detection model is loaded by using the dnn module in opencv.