WO2021185121A1 - Model generation method, target detection method, apparatus, device, and storage medium - Google Patents
Model generation method, target detection method, apparatus, device, and storage medium
- Publication number
- WO2021185121A1 (PCT/CN2021/079690)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- detection model
- pruned
- target
- model
- coefficients
- Prior art date
Classifications
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/0495—Quantised networks; Sparse networks; Compressed networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V2201/07—Target detection
- G06V2201/08—Detecting or categorising vehicles
Definitions
- The embodiments of the present application relate to the field of computer application technology, for example, to a model generation method, a target detection method, an apparatus, a device, and a storage medium.
- Object detection technology is the basis of many computer vision tasks; it can be used to determine whether an object of interest is present in an image to be detected and to accurately locate that object.
- Target detection technology can be combined with target tracking, target re-identification, and other technologies, and applied to artificial intelligence systems, autonomous driving systems, intelligent robots, intelligent logistics, and other fields.
- The embodiments of the present application provide a model generation method, a target detection method, an apparatus, a device, and a storage medium, so as to improve the detection speed of the model through model compression.
- An embodiment of the present application provides a model generation method, which may include:
- acquiring multiple scaling coefficients of the batch normalization layer in an intermediate detection model after preliminary training, where the intermediate detection model is obtained by training an original detection model on multiple training samples, and each training sample includes a sample image and a sample annotation result of a known target in the sample image;
- filtering the coefficients to be pruned from the multiple scaling coefficients according to their numerical values;
- selecting, from the multiple channels of the intermediate detection model, the channels to be pruned corresponding to the coefficients to be pruned, and performing channel pruning on those channels to generate the target detection model.
- An embodiment of the present application also provides a target detection method, which may include: acquiring an image to be detected and a target detection model generated according to the above model generation method;
- inputting the image to be detected into the target detection model, and obtaining the target detection result of the target to be detected in the image to be detected according to the output result of the target detection model.
- An embodiment of the present application also provides a model generation device, which may include:
- a first acquisition module, configured to acquire multiple scaling coefficients of the batch normalization layer in an intermediate detection model after preliminary training, where the intermediate detection model is obtained by training an original detection model on multiple training samples, and each training sample includes a sample image and a sample annotation result of a known target in the sample image;
- a first screening module, configured to filter the coefficients to be pruned from the multiple scaling coefficients according to their numerical values;
- a model generation module, configured to select, from the multiple channels of the intermediate detection model, the channels to be pruned corresponding to the coefficients to be pruned, and to perform channel pruning on those channels to generate the target detection model.
- an embodiment of the present application also provides a target detection device, which may include:
- the second acquisition module is configured to acquire the image to be detected and the target detection model generated according to any one of the above methods;
- the target detection module is configured to input the image to be detected into the target detection model, and obtain the target detection result of the target to be detected in the image to be detected according to the output result of the target detection model.
- An embodiment of the present application also provides a device, which may include:
- at least one processor;
- a memory configured to store at least one program;
- when the at least one program is executed by the at least one processor, the at least one processor implements the model generation method or the target detection method provided in any embodiment of the present application.
- An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the model generation method or target detection method provided by any embodiment of the present application is implemented.
- FIG. 1 is a flowchart of a model generation method in Embodiment 1 of the present application.
- FIG. 2 is a flowchart of a model generation method in Embodiment 2 of the present application.
- FIG. 3 is a flowchart of a target detection method in Embodiment 3 of the present application.
- FIG. 4a is a flowchart of model compression in a target detection method in Embodiment 3 of the present application.
- FIG. 4b is a flowchart of model pruning in a target detection method in Embodiment 3 of the present application.
- FIG. 5 is a structural block diagram of a model generation device in Embodiment 4 of the present application.
- FIG. 6 is a structural block diagram of a target detection device in Embodiment 5 of the present application.
- FIG. 7 is a schematic structural diagram of a device in Embodiment 6 of the present application.
- FIG. 8 is a schematic structural diagram of the target detection system in Embodiment 9 of the present application.
- FIG. 9 is a schematic structural diagram of the unmanned vehicle in Embodiment 10 of the present application.
- FIG. 1 is a flowchart of a model generation method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of compressing the deep learning model in the target detection technology.
- the method may be executed by the model generation apparatus provided in the embodiment of the present application, and the apparatus may be implemented by at least one of software and hardware, and the apparatus may be integrated on various electronic devices.
- the method of the embodiment of the present application includes steps S110 to S130.
- the untrained original detection model is obtained.
- The original detection model is a deep learning model used for visual detection. Such models can be divided into anchor-based models, anchor-free models, and fusions of the two; the difference between them is whether anchors are used to extract candidate boxes.
- An anchor, also called an anchor box, is one of a set of rectangular boxes obtained by running a clustering algorithm on the training samples before model training.
- Anchor-based original detection models include Faster R-CNN, SSD (Single Shot MultiBox Detector), YOLOv2, YOLOv3, etc.; anchor-free original detection models include CornerNet, ExtremeNet, CenterNet, FCOS, etc.;
- original detection models that fuse anchor-based branches and anchor-free branches include FSAF, SFace, GA-RPN, and so on.
- SSD is a one-stage detection model: it has no region proposal stage and directly generates the category probabilities and position coordinates of the targets to be detected, so it has a clear advantage in detection speed and can run well on unmanned delivery vehicles and mobile terminals. Therefore, as an alternative example, the original detection model may be an SSD, and on this basis the backbone network of the SSD may use an inception_v3 structure.
- Each training sample can include a sample image and a sample annotation result of a known target in the sample image.
- The sample image can be a frame of an image, a video sequence, etc., and the sample annotation result can be category probabilities and position coordinates.
- BN (Batch Normalization) refers to the batch normalization layer.
- The BN layer includes scaling coefficients (gamma coefficients) and offset coefficients (beta coefficients).
- Each scaling factor corresponds to a channel in the adjacent convolutional layer. For example, if a certain BN layer has 32 scaling factors, the convolutional layer immediately adjacent to that BN layer includes 32 channels, and the BN layer also includes 32 channels.
- Each scaling factor is multiplied with the output of the corresponding channel of the convolutional layer, so whether a scaling factor is zero directly determines whether the corresponding channel has any effect. Therefore, the multiple scaling factors of the batch normalization layers in the intermediate detection model can be obtained, and the channels of the intermediate detection model on which to perform channel pruning can be determined according to these scaling factors.
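To make the role of the scaling factor concrete, the per-channel BN affine transform can be sketched in plain Python (a minimal illustration, not the patent's implementation; the helper name is invented):

```python
def bn_channel(x_norm, gamma, beta):
    """Apply the BN affine transform y = gamma * x_norm + beta to one
    channel's already-normalized activations."""
    return [gamma * v + beta for v in x_norm]

acts = [0.5, -1.25, 2.0]  # normalized activations of one channel

# A non-zero gamma lets the channel contribute to the next layer.
print(bn_channel(acts, gamma=2.0, beta=0.0))  # [1.0, -2.5, 4.0]

# A zero gamma silences the channel entirely: it is a pruning candidate.
print(bn_channel(acts, gamma=0.0, beta=0.0))  # [0.0, 0.0, 0.0]
```

A channel whose gamma is (near) zero produces a constant output, which is why the scaling factors can serve as per-channel importance scores for pruning.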
- S120: Filter the coefficients to be pruned from the multiple scaling coefficients according to their numerical values.
- For example, the numerical values of the multiple scaling factors can be sorted, and the coefficients to be pruned can be selected according to the sorting result.
- Alternatively, a pruning threshold can be computed from the numerical values of the multiple scaling factors together with a preset pruning rate, and the coefficients to be pruned can then be selected from the multiple scaling factors according to that threshold; a coefficient to be pruned may be a scaling factor whose value is less than or equal to the pruning threshold; and so on.
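The threshold-based screening described above can be sketched as follows (a pure-Python illustration under assumed details; the patent does not prescribe this exact procedure or these function names):

```python
def pruning_threshold(gammas, prune_rate):
    """Rank scaling factors by absolute value and return the value at the
    prune-rate quantile; factors at or below it are pruned."""
    ranked = sorted(abs(g) for g in gammas)
    cut = int(len(ranked) * prune_rate)  # how many coefficients to prune
    return ranked[cut - 1] if cut > 0 else float("-inf")

def coeffs_to_prune(gammas, prune_rate):
    """Indices of scaling factors whose magnitude is <= the threshold."""
    thr = pruning_threshold(gammas, prune_rate)
    return [i for i, g in enumerate(gammas) if abs(g) <= thr]

gammas = [0.9, 0.01, 0.4, 0.0, 0.75, 0.02, 1.3, 0.05]
print(coeffs_to_prune(gammas, 0.5))  # the 4 smallest: [1, 3, 5, 7]
```

With a pruning rate of 0.5, half of the coefficients (the smallest in magnitude) fall at or below the threshold and become coefficients to be pruned.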
- The multiple scaling factors correspond one-to-one with the multiple channels of a convolutional layer, and the channels of a convolutional layer also correspond one-to-one with the channels of the BN layer immediately adjacent to it. Therefore, the channels to be pruned can be filtered from the multiple channels of the intermediate detection model according to the coefficients to be pruned.
- The channels to be pruned are the channels of lower importance; each may be a channel in a convolutional layer or a channel in a BN layer.
- the channel to be pruned can be pruned to generate a target detection model, thereby achieving the effect of model compression.
- Channel pruning simplifies the model by deleting redundant channels, and is a structured compression method; moreover, after channel pruning is performed on the channels to be pruned, the convolution kernels corresponding to the pruned channels are also deleted, so the amount of convolution computation is reduced as well.
- For example, if a convolutional layer has 32 channels, the BN layer immediately adjacent to it also has 32 channels.
- Each channel in the BN layer includes a scaling factor and an offset factor.
- Because the coefficients to be pruned are screened from the scaling factors, it is possible to determine from the coefficients to be pruned which channels in the BN layer are channels to be pruned, and correspondingly which channels in the convolutional layer are channels to be pruned.
- The foregoing channel pruning may be implemented as follows: from the multiple channels of the multiple convolutional layers of the intermediate detection model, filter out the output channel of the current convolutional layer that corresponds to the current coefficient to be pruned, together with the matching input channel of the next convolutional layer, and use both as channels to be pruned. This is because the output channels of the current convolutional layer are the input channels of the next convolutional layer.
- For example, if the output channels of the current convolutional layer are numbered 1-32, the input channels of the next convolutional layer are also numbered 1-32.
- If output channel 17 of the current convolutional layer corresponds to the current coefficient to be pruned and is therefore a channel to be pruned, then input channel 17 of the next convolutional layer is also a channel to be pruned.
- In the technical solution of this embodiment, the coefficients to be pruned can be selected from the multiple scaling factors according to their numerical values; because the coefficients to be pruned correspond to the channels to be pruned, the channels to be pruned can be selected from the multiple channels of the intermediate detection model, and channel pruning can be performed on them to generate the target detection model.
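The paired deletion of an output channel and the matching input channel of the next layer can be sketched as follows (a hypothetical list-based layer representation for illustration only):

```python
def prune_channels(out_kernels, next_in_kernels, prune_idx):
    """out_kernels: per-output-channel kernels of the current layer.
    next_in_kernels: per-input-channel kernel slices of the next layer.
    prune_idx: set of channel indices removed from BOTH, since output
    channel k of layer i feeds input channel k of layer i+1."""
    keep = [i for i in range(len(out_kernels)) if i not in prune_idx]
    return ([out_kernels[i] for i in keep],
            [next_in_kernels[i] for i in keep])

cur = ["k0", "k1", "k2", "k3"]  # 4 output channels of the current layer
nxt = ["w0", "w1", "w2", "w3"]  # 4 matching input slices of the next layer
cur2, nxt2 = prune_channels(cur, nxt, {1, 3})
print(cur2, nxt2)  # ['k0', 'k2'] ['w0', 'w2']
```

Deleting channels 1 and 3 shrinks both layers consistently, which is what makes the pruned model directly runnable without a special sparse-kernel library.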
- The above technical solution combines channel pruning with the intermediate detection model: channel pruning can be performed on the intermediate detection model according to the scaling factors of the preliminarily trained intermediate detection model, thereby improving the detection speed of the model through model compression.
- The above model generation method may further include: filtering out prunable convolutional layers from the multiple convolutional layers of the intermediate detection model, where the prunable convolutional layers are the convolutional layers other than the 1*1 convolutional layers and/or the convolutional layers in the classification regression branch; and selecting, from the multiple scaling factors, the scaling factors corresponding to the prunable convolutional layers as target scaling factors. Accordingly, filtering the coefficients to be pruned from the multiple scaling factors according to their numerical values may include: filtering the coefficients to be pruned from the multiple target scaling factors according to the numerical values of the target scaling factors.
- The original detection model usually includes two parts: the backbone network and the classification regression branch.
- The backbone network can be used to extract feature maps.
- The classification regression branch consists of a classification branch and a regression branch that fork off the backbone network to classify or regress the extracted feature maps. Since the categories of the classification regression are usually fixed, the convolutional layers in the classification regression branch can be kept unchanged as far as possible, which guarantees a fixed output dimension and simplifies the execution code. As a result, the convolutional layers other than at least one of the 1*1 convolutional layers and the convolutional layers in the classification regression branch can be used as prunable convolutional layers, and the coefficients to be pruned are filtered from the target scaling factors of these prunable convolutional layers.
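The prunable-layer screening just described can be sketched with hypothetical layer metadata (the tuple layout and layer names below are invented for illustration):

```python
def prunable_layers(layers):
    """layers: list of (name, kernel_size, in_cls_reg_branch) tuples.
    Skip 1*1 convolutions and layers in the classification regression
    branch; everything else is a candidate for channel pruning."""
    return [name for name, ksize, in_branch in layers
            if ksize != 1 and not in_branch]

layers = [
    ("conv1", 3, False),
    ("conv2_1x1", 1, False),  # 1*1 conv: kept intact
    ("conv3", 3, False),
    ("cls_head", 3, True),    # classification regression branch: kept intact
]
print(prunable_layers(layers))  # ['conv1', 'conv3']
```

Leaving the heads untouched keeps the output dimensions fixed, matching the rationale given above.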
- In one embodiment, a pruned detection model can be generated first; then the pruned detection model can be fine-tuned to generate the target detection model.
- That is, the simplified pruned detection model obtained after channel pruning can be fine-tuned, thereby restoring the detection performance; in other words, while the model is compressed, its original performance is maintained as much as possible.
- The fine-tuning process can be: acquiring historical images and the historical annotation results of known targets in those images, and using each historical image and its annotation result as a historical sample; then training the pruned detection model on multiple historical samples to obtain the target detection model.
- In one embodiment, the historical samples and the above training samples are the same sample data; that is, during fine-tuning the historical images and the sample images can be the same images, and the historical annotation results and the sample annotation results can be the same annotation results.
- FIG. 2 is a flowchart of a model generation method provided in Embodiment 2 of the present application. This embodiment is a refinement of the above technical solutions.
- The above model generation method may further include: obtaining multiple training samples, and performing BN-layer-based sparsity training on the original detection model using the multiple training samples to obtain the intermediate detection model.
- explanations of terms that are the same as or corresponding to those in the foregoing embodiments will not be repeated here.
- the method of this embodiment may include steps S210 to S230.
- In this way, an intermediate detection model with a sparse BN layer can be obtained; that is, sparsity is introduced into the dense connections of the original detection model.
- An optional scheme for BN-layer-based sparsity training is to apply an L1 regularization constraint to each scaling factor in the original detection model, so that the original detection model adjusts its parameters in the direction of structural sparsity.
- The scaling factor (gamma coefficient) in the BN layer acts as a switch coefficient for the information flow channel, controlling whether the channel is open or closed.
- The reason for this setting is that, during model training, imposing the L1 regularization constraint on the scaling factors drives more of them to zero. In both the training stage and the application stage, because each scaling factor is multiplied with the corresponding channel of the convolutional layer, when more scaling factors are 0 the corresponding channels no longer have any effect; strongly compressing the scaling factors therefore already plays the role of channel pruning. On this basis, when the channels to be pruned are filtered according to the preset pruning rate, the more scaling factors in the intermediate detection model have the value 0, the lower the probability that channels corresponding to non-zero scaling factors are pruned, and the more closely the network structure of the generated target detection model matches that of the intermediate detection model. The detection performance of the two is then also more consistent; that is, the effect of model compression is achieved while the detection performance is guaranteed.
- In one embodiment, the objective loss function of the original detection model may be composed of the original loss function and the L1 regularization function, where the L1 regularization function is a loss term that imposes the L1 constraint on the multiple scaling factors. That is, on the basis of the original loss function, an L1 regularization term on the BN-layer scaling factors is introduced. During training, the objective loss function is minimized and the multiple parameter values in the model are adjusted according to the result.
- The objective loss function L can be expressed by the following formula:
- L = Σ_(x,y) l(f(x, W), y) + λ · Σ_(γ∈Γ) g(γ)
- where x is the sample image, y is the sample annotation result of the sample image, W denotes the parameter values of the original detection model, f(x, W) is the sample prediction result for the known targets in the sample image, γ is a scaling factor, l(·) is the original loss function, g(·) is the L1 regularization function (e.g., g(γ) = |γ|), λ is the coefficient balancing the two terms, and Γ is the set of all scaling factors in the original detection model.
- Accordingly, during training the gradient of each scaling factor is updated as γ_grad ← γ_grad + λ·sign(γ); that is, the L1 regularization term contributes the subgradient λ·sign(γ).
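This per-coefficient update can be sketched in plain Python (an assumed training-loop detail for illustration; function names are invented):

```python
def sign(v):
    """Sign function: returns 1, -1, or 0."""
    return (v > 0) - (v < 0)

def add_l1_subgradient(gammas, grads, lam):
    """Add the L1 penalty's subgradient lam * sign(gamma) to the gradients
    computed from the original loss, pushing small gammas toward zero."""
    return [g + lam * sign(gamma) for gamma, g in zip(gammas, grads)]

gammas = [0.8, -0.3, 0.0]
grads = [0.10, -0.05, 0.02]  # gradients from the original loss
print(add_l1_subgradient(gammas, grads, lam=0.01))
```

Each gradient is nudged by λ in the direction that shrinks the corresponding scaling factor's magnitude, which is how the sparsity training concentrates near-zero gammas for later thresholding.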
- S220 Obtain multiple scaling coefficients of the batch normalization layer in the intermediate detection model, and filter the coefficients to be pruned from the multiple scaling coefficients according to the numerical value of the multiple scaling coefficients.
- The technical solution of this embodiment performs BN-layer-based sparsity training on the original detection model using multiple training samples to obtain an intermediate detection model with a sparse BN layer. The network structure of the target detection model generated by channel pruning of this intermediate detection model is then more consistent with that of the intermediate detection model, and the detection performance of the two is correspondingly consistent; that is, the effect of model compression is achieved while the detection performance is guaranteed.
- FIG. 3 is a flowchart of a target detection method provided in Embodiment 3 of the present application. This embodiment can be applied to a situation where a target detection model generated based on the method described in any of the above embodiments is used to perform target detection on an image to be detected.
- the method may be executed by the target detection device provided in the embodiment of the present application, the device may be implemented by at least one of software and hardware, and the device may be integrated on various electronic devices.
- the method of the embodiment of the present application includes steps S310 to S320.
- The image to be detected may be a frame of an image, a video sequence, etc.
- the target detection model may be a visual detection model generated according to the method described in any of the foregoing embodiments.
- a method for generating a target detection model may be as shown in Figure 4a:
- The inception_v3 structure is used as the backbone network.
- L1 regularization constraints are imposed on the gamma coefficients in the BN layers adjacent to the convolutional layers, so that the model adjusts its parameters in the direction of structural sparsity, thereby achieving BN-layer sparsity.
- The intermediate detection model obtained from preliminary training can then be channel-pruned according to the BN-layer scaling factors and the preset pruning rate, which streamlines the model and increases the detection speed.
- the above-mentioned channel pruning process can be shown in Figure 4b.
- The 1*1 convolutional layers and the convolutional layers in the classification regression branch are not pruned, which ensures that the dimensionality of the output remains unchanged.
- Next, collect the gamma coefficients of the BN layers corresponding to the remaining prunable convolutional layers, sort all of the collected gamma coefficients, and compute the pruning threshold of the gamma coefficients according to the preset pruning rate.
- S320 Input the image to be detected into the target detection model, and obtain the target detection result of the target to be detected in the image to be detected according to the output result of the target detection model.
- the above-mentioned target detection method can be applied to visual target detection on unmanned delivery vehicles in the field of intelligent logistics.
- Because the on-board processors of unmanned delivery vehicles are based on the Xavier platform, whose computing resources are relatively limited, the target detection model involved in the above target detection method is small in scale and fast in detection; even under such computing-resource constraints, it can still support truly unmanned operation of unmanned delivery vehicles.
- Moreover, the structured pruning operation is implemented at the channel level, and the resulting streamlined model can run directly on mature frameworks such as PyTorch, MXNet, and TensorFlow, or on hardware platforms such as the Graphics Processing Unit (GPU) and the Field Programmable Gate Array (FPGA), without the support of a special algorithm library, making the application more convenient.
- This method was applied to a 5-category subset (car, pedestrian, truck, bus, rider) of the Berkeley DeepDrive (BDD) dataset from the University of California, Berkeley, to test detection accuracy; the quantitative results are shown in the following two tables. The data show that the structured-pruning target detection method of this embodiment of the present application achieves a clear compression effect while leaving part of the convolutional layers and BN layers intact, with only a slight drop in the detection metric mAP (averaged over the 5 categories).
- The technical solution of this embodiment of the present application can perform target detection on the image to be detected based on the generated target detection model. Because the target detection model is a streamlined model obtained through model compression, it can effectively improve the detection speed for the target to be detected in the image to be detected while preserving the original performance of the model as much as possible.
- FIG. 5 is a structural block diagram of a model generation device provided in Embodiment 4 of the application, and the device is configured to execute the model generation method provided in any of the foregoing embodiments.
- This device and the model generation method of the foregoing embodiments belong to the same inventive concept.
- the device may include: a first acquisition module 410, a first screening module 420, and a model generation module 430.
- the first acquisition module 410 is configured to acquire multiple scaling coefficients of the batch normalization layer in the intermediate detection model after preliminary training.
- the intermediate detection model is obtained after training the original detection model based on multiple training samples.
- a training sample includes the sample image and the sample annotation result of the known target in the sample image;
- The first screening module 420 is configured to filter out the coefficients to be pruned from the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients;
- the model generation module 430 is configured to filter the channels to be pruned corresponding to the coefficients to be pruned from the multiple channels of the intermediate detection model, and perform channel pruning on the channels to be pruned to generate the target detection model.
- the first screening module 420 can be set to:
- The pruning threshold for the multiple scaling factors is obtained according to their numerical values and the preset pruning rate, and the coefficients to be pruned are selected from the multiple scaling factors according to the pruning threshold.
- the device may further include:
- the second screening module is set to filter the prunable convolutional layer from the multiple convolutional layers of the intermediate detection model.
- The prunable convolutional layers include convolutional layers other than at least one of: the 1×1 convolutional layers and the convolutional layers in the classification-regression branch;
- The third screening module is configured to screen out, from the multiple scaling factors, the scaling factors corresponding to the prunable convolutional layers as target scaling factors;
- the first screening module 420 may be set as:
- The coefficients to be pruned are selected from the multiple target scaling factors according to their numerical values.
- model generation module 430 may include:
- The to-be-pruned channel filtering unit is configured to select, from the multiple channels of the multiple convolutional layers of the intermediate detection model, the output channel of the current convolutional layer corresponding to the current pruning coefficient among the multiple coefficients to be pruned, as well as the input channel of the convolutional layer following the current convolutional layer, and to use the output channel and the input channel as channels to be pruned.
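The pairing of output and input channels described by this unit can be illustrated with plain NumPy weight tensors (the shapes, helper name, and pruned channel indices here are hypothetical):

```python
import numpy as np

def prune_channel_pair(w_cur: np.ndarray, w_next: np.ndarray, keep: np.ndarray):
    """Given conv weights of shape (out_ch, in_ch, kH, kW), drop the pruned
    output channels of the current layer and the matching input channels of
    the next layer, so the two tensors stay dimensionally consistent."""
    w_cur_p = w_cur[keep]       # remove output channels (axis 0)
    w_next_p = w_next[:, keep]  # remove the matching input channels (axis 1)
    return w_cur_p, w_next_p

w_cur = np.random.randn(8, 3, 3, 3)    # current conv: 8 output channels
w_next = np.random.randn(16, 8, 3, 3)  # next conv consumes those 8 channels
keep = np.ones(8, dtype=bool)
keep[[1, 5]] = False                   # channels 1 and 5 are to be pruned
a, b = prune_channel_pair(w_cur, w_next, keep)
```

Pruning the same indices on both tensors is what keeps the pruned network structurally valid.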
- the device may further include:
- The third acquisition module is configured to acquire multiple training samples and, based on the multiple training samples, to perform sparsity training based on the batch normalization layer on the original detection model to obtain the intermediate detection model.
- the target loss function in the original detection model is composed of an original loss function and an L1 regular constraint function
- the L1 regular constraint function includes a loss function that performs L1 regular constraints on multiple scaling factors.
- The objective loss function L is expressed by the following formula: L = Σ_{(x,y)} l(f(x, W), y) + λ·Σ_{γ∈Γ} g(γ), where λ is a balance coefficient between the two terms.
- x is the sample image
- y is the sample annotation result of the sample image
- W is the parameter value in the original detection model
- f(x, W) is the sample prediction result of the known target in the sample image
- ⁇ is the scaling factor
- l() is the original loss function
- g() is the L1 regular constraint function
- ⁇ represents the set of all scaling coefficients in the original detection model.
- model generation module 430 may include:
- the channel pruning unit is set to perform channel pruning on the channel to be pruned to obtain a pruning detection model
- the fine-tuning training unit is set to fine-tune the pruning detection model to generate a target detection model.
- The model generation device obtains, through the first acquisition module, multiple scaling factors of the batch normalization layer in the preliminarily trained intermediate detection model; the first screening module can filter out the coefficients to be pruned from the multiple scaling factors according to their numerical values; and because there is a correspondence between the coefficients to be pruned and the channels to be pruned, the model generation module can screen out the channels to be pruned from the multiple channels of the intermediate detection model and perform channel pruning on them to generate the target detection model.
- The above device combines channel pruning with the intermediate detection model: according to the scaling factors in the preliminarily trained intermediate detection model, it can perform channel pruning on that model, thereby compressing the model and improving its detection speed.
- the model generation device provided in the embodiment of the present application can execute the model generation method provided in any embodiment of the present application, and has functional modules corresponding to the execution method.
- FIG. 6 is a structural block diagram of a target detection device provided in Embodiment 5 of this application.
- the device is configured to execute the target detection method provided in any of the foregoing embodiments.
- This device belongs to the same inventive concept as the target detection method of the foregoing embodiments.
- the device may include: a second acquisition module 510 and a target detection module 520.
- The second acquisition module 510 is configured to acquire the image to be detected and the target detection model generated according to the method of any one of the first and second embodiments;
- the target detection module 520 is configured to input the image to be detected into the target detection model, and obtain the target detection result of the target to be detected in the image to be detected according to the output result of the target detection model.
- the second acquisition module and the target detection module cooperate with each other to perform target detection on the image to be detected based on the generated target detection model.
- Because the target detection model is a streamlined model obtained through model compression, the detection speed for the target to be detected in the image to be detected can be effectively improved, while the original performance of the model is preserved as much as possible.
- the target detection device provided in the embodiment of the present application can execute the target detection method provided in any embodiment of the present application, and has functional modules corresponding to the execution method.
- The various units and modules included above are divided only according to functional logic, but are not limited to this division, as long as the corresponding functions can be realized; in addition, the names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of this application.
- FIG. 7 is a schematic structural diagram of a device provided in Embodiment 6 of this application.
- the device includes a memory 610, a processor 620, an input device 630, and an output device 640.
- the number of processors 620 in the device may be at least one.
- One processor 620 is taken as an example; the memory 610, the processor 620, the input device 630, and the output device 640 in the device may be connected by a bus or other means; in FIG. 7, connection via the bus 650 is taken as an example.
- the memory 610 can be configured to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the model generation method in the embodiment of the present application (for example, the first part of the model generation device).
- the processor 620 executes various functional applications and data processing of the device by running software programs, instructions, and modules stored in the memory 610, that is, implements the aforementioned model generation method or target detection method.
- the memory 610 may mainly include a program storage area and a data storage area.
- the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the device, and the like.
- the memory 610 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
- the memory 610 may include a memory remotely provided with respect to the processor 620, and these remote memories may be connected to the device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
- the input device 630 may be configured to receive input numeric or character information, and generate key signal input related to user settings and function control of the device.
- the output device 640 may include a display device such as a display screen.
- the seventh embodiment of the present application provides a storage medium containing computer-executable instructions, which are used to execute a model generation method when the computer-executable instructions are executed by a computer processor, and the method includes:
- the intermediate detection model is obtained after training the original detection model based on multiple training samples.
- Each training sample includes a sample image and sample annotation results of known targets in the sample image;
- From the multiple channels of the intermediate detection model, the channels to be pruned corresponding to the coefficients to be pruned are screened out, and channel pruning is performed on the channels to be pruned to generate the target detection model.
- a storage medium containing computer-executable instructions provided by an embodiment of the present application is not limited to the above-mentioned method operations, and can also execute any of the model generation methods provided in any embodiment of the present application. Related operations.
- the eighth embodiment of the present application provides a storage medium containing computer-executable instructions, which are used to execute a target detection method when the computer-executable instructions are executed by a computer processor, and the method includes:
- the image to be detected is input into the target detection model, and the target detection result of the target to be detected in the image to be detected is obtained according to the output result of the target detection model.
- An embodiment of the present application also provides a target detection system.
- the system includes a collection device 710, a computing device 720, and a storage device 730.
- The storage device 730 stores the target detection model generated in the first embodiment or the second embodiment, and the collection device 710 is configured to collect images to be detected.
- The computing device 720 is configured to load the image to be detected and the target detection model, input the image to be detected into the target detection model, and obtain the target detection result of the target to be detected in the image to be detected according to the output result of the target detection model.
- the unmanned vehicle includes a driving device 810, a path planning device 820, and the device 830 described in the seventh embodiment.
- The driving device 810 is configured to drive the unmanned vehicle to run along the path planned by the path planning device 820.
- The device 830 described in the sixth embodiment is configured to detect the target to be detected in the image to be detected, and the path planning device 820 is configured to plan the path of the unmanned vehicle according to the detection result, produced by the device described in the sixth embodiment, of the target to be detected in the image to be detected.
- this application can be implemented by software and necessary general-purpose hardware, and of course, it can also be implemented by hardware.
- The technical solution of this application, in essence or in the part that contributes to the related technology, can be embodied in the form of a software product; the computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, Read-Only Memory (ROM), Random Access Memory (RAM), flash memory (FLASH), hard disk, or optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in each embodiment of the present application.
Claims (17)
- A model generation method, comprising: acquiring multiple scaling coefficients of a batch normalization layer in an intermediate detection model for which preliminary training has been completed, wherein the intermediate detection model is obtained by training an original detection model based on multiple training samples, and each training sample comprises a sample image and sample annotation results of known targets in the sample image; filtering out coefficients to be pruned from the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients; and screening out, from multiple channels of the intermediate detection model, channels to be pruned corresponding to the coefficients to be pruned, and performing channel pruning on the channels to be pruned to generate a target detection model.
- The method according to claim 1, wherein filtering out the coefficients to be pruned from the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients comprises: obtaining a pruning threshold for the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients and a preset pruning rate, and selecting the coefficients to be pruned from the multiple scaling coefficients according to the pruning threshold.
- The method according to claim 1, further comprising: screening out prunable convolutional layers from multiple convolutional layers of the intermediate detection model, wherein the prunable convolutional layers comprise convolutional layers other than at least one of 1×1 convolutional layers and convolutional layers in a classification-regression branch; and screening out, from the multiple scaling coefficients, the scaling coefficients corresponding to the prunable convolutional layers, the scaling coefficients corresponding to the prunable convolutional layers being target scaling coefficients; wherein filtering out the coefficients to be pruned from the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients comprises: selecting the coefficients to be pruned from the multiple target scaling coefficients according to the numerical values of the multiple target scaling coefficients.
- The method according to claim 1, wherein there are multiple coefficients to be pruned, and screening out the channels to be pruned corresponding to the coefficients to be pruned from the multiple channels of the intermediate detection model comprises: screening out, from multiple channels of multiple convolutional layers of the intermediate detection model, an output channel of a current convolutional layer corresponding to a current pruning coefficient among the multiple coefficients to be pruned, as well as an input channel of the convolutional layer following the current convolutional layer, and using the output channel and the input channel as channels to be pruned.
- The method according to claim 1, further comprising: acquiring the multiple training samples, and performing, based on the multiple training samples, sparsity training based on the batch normalization layer on the original detection model to obtain the intermediate detection model.
- The method according to claim 5, wherein a target loss function in the original detection model consists of an original loss function and an L1 regularization constraint function, the L1 regularization constraint function comprising a loss function that imposes an L1 regularization constraint on the multiple scaling coefficients.
- The method according to claim 1, wherein performing channel pruning on the channels to be pruned to generate the target detection model comprises: performing channel pruning on the channels to be pruned to obtain a pruned detection model; and performing fine-tuning training on the pruned detection model to generate the target detection model.
- The method according to claim 1, wherein the original detection model comprises a Single Shot MultiBox Detector (SSD).
- The method according to claim 9, wherein a backbone network of the SSD comprises an inception_v3 structure.
- A target detection method, comprising: acquiring an image to be detected and a target detection model generated according to the method of any one of claims 1-10; and inputting the image to be detected into the target detection model, and obtaining a target detection result of a target to be detected in the image to be detected according to an output result of the target detection model.
- A model generation apparatus, comprising: a first acquisition module configured to acquire multiple scaling coefficients of a batch normalization layer in an intermediate detection model for which preliminary training has been completed, wherein the intermediate detection model is obtained by training an original detection model based on multiple training samples, and each training sample comprises a sample image and sample annotation results of known targets in the sample image; a first screening module configured to filter out coefficients to be pruned from the multiple scaling coefficients according to the numerical values of the multiple scaling coefficients; and a model generation module configured to screen out, from multiple channels of the intermediate detection model, channels to be pruned corresponding to the coefficients to be pruned, and to perform channel pruning on the channels to be pruned to generate a target detection model.
- A target detection apparatus, comprising: a second acquisition module configured to acquire an image to be detected and a target detection model generated according to the method of any one of claims 1-10; and a target detection module configured to input the image to be detected into the target detection model and to obtain a target detection result of a target to be detected in the image to be detected according to an output result of the target detection model.
- A device, comprising: at least one processor; and a memory configured to store at least one program; wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the model generation method according to any one of claims 1-10.
- A device, comprising: at least one processor; and a memory configured to store at least one program; wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the target detection method according to claim 11.
- A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model generation method according to any one of claims 1-10 or the target detection method according to claim 11.
- An unmanned vehicle, comprising a driving device, a path planning device, and the device according to claim 15, wherein the driving device is configured to drive the unmanned vehicle to run along a path planned by the path planning device, the device according to claim 15 is configured to detect a target to be detected in an image to be detected, and the path planning device is configured to plan the path of the unmanned vehicle according to a detection result, produced by the device according to claim 15, of the target to be detected in the image to be detected.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227026698A KR20220116061A (ko) | 2020-03-17 | 2021-03-09 | 모델 생성 방법, 오브젝트 검출 방법, 장치, 기기 및 저장매체 |
EP21771737.0A EP4080408A4 (en) | 2020-03-17 | 2021-03-09 | MODEL GENERATION METHOD AND APPARATUS, OBJECT DETECTION METHOD AND APPARATUS, APPARATUS, AND STORAGE MEDIUM |
US17/912,342 US20230131518A1 (en) | 2020-03-17 | 2021-03-09 | Model Generation Method and Apparatus, Object Detection Method and Apparatus, Device, and Storage Medium |
JP2022544673A JP2023527489A (ja) | 2020-03-17 | 2021-03-09 | モデル生成方法、オブジェクト検出方法、装置、機器、及び記憶媒体 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010188303.6A CN113408561A (zh) | 2020-03-17 | 2020-03-17 | 模型生成方法、目标检测方法、装置、设备及存储介质 |
CN202010188303.6 | 2020-03-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021185121A1 true WO2021185121A1 (zh) | 2021-09-23 |
Family
ID=77677171
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/079690 WO2021185121A1 (zh) | 2020-03-17 | 2021-03-09 | 模型生成方法、目标检测方法、装置、设备及存储介质 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230131518A1 (zh) |
EP (1) | EP4080408A4 (zh) |
JP (1) | JP2023527489A (zh) |
KR (1) | KR20220116061A (zh) |
CN (1) | CN113408561A (zh) |
WO (1) | WO2021185121A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115170902A (zh) * | 2022-06-20 | 2022-10-11 | 美的集团(上海)有限公司 | 图像处理模型的训练方法 |
CN115265881A (zh) * | 2022-09-28 | 2022-11-01 | 宁波普瑞均胜汽车电子有限公司 | 压力检测方法和装置 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115552477A (zh) * | 2020-05-01 | 2022-12-30 | 奇跃公司 | 采用施加的分层归一化的图像描述符网络 |
CN115169556B (zh) * | 2022-07-25 | 2023-08-04 | 美的集团(上海)有限公司 | 模型剪枝方法及装置 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107368885A (zh) * | 2017-07-13 | 2017-11-21 | 北京智芯原动科技有限公司 | 基于多粒度剪枝的网络模型压缩方法及装置 |
CN107728620A (zh) * | 2017-10-18 | 2018-02-23 | 江苏卡威汽车工业集团股份有限公司 | 一种新能源汽车的无人驾驶系统及方法 |
CN109344921A (zh) * | 2019-01-03 | 2019-02-15 | 湖南极点智能科技有限公司 | 一种基于深度神经网络模型的图像识别方法、装置及设备 |
CN110263841A (zh) * | 2019-06-14 | 2019-09-20 | 南京信息工程大学 | 一种基于滤波器注意力机制和bn层缩放系数的动态结构化网络剪枝方法 |
CN111062382A (zh) * | 2019-10-30 | 2020-04-24 | 北京交通大学 | 用于目标检测网络的通道剪枝方法 |
CN111325342A (zh) * | 2020-02-19 | 2020-06-23 | 深圳中兴网信科技有限公司 | 模型的压缩方法、装置、目标检测设备和存储介质 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6811736B2 (ja) * | 2018-03-12 | 2021-01-13 | Kddi株式会社 | 情報処理装置、情報処理方法、及びプログラム |
US10936913B2 (en) * | 2018-03-20 | 2021-03-02 | The Regents Of The University Of Michigan | Automatic filter pruning technique for convolutional neural networks |
CN108764046A (zh) * | 2018-04-26 | 2018-11-06 | 平安科技(深圳)有限公司 | 车辆损伤分类模型的生成装置、方法及计算机可读存储介质 |
JP7047612B2 (ja) * | 2018-06-08 | 2022-04-05 | 沖電気工業株式会社 | ニューラルネットワーク軽量化装置、情報処理装置、ニューラルネットワーク軽量化方法およびプログラム |
CN110084181B (zh) * | 2019-04-24 | 2021-04-20 | 哈尔滨工业大学 | 一种基于稀疏MobileNetV2网络的遥感图像舰船目标检测方法 |
CN110633747A (zh) * | 2019-09-12 | 2019-12-31 | 网易(杭州)网络有限公司 | 目标检测器的压缩方法、装置、介质以及电子设备 |
CN110619391B (zh) * | 2019-09-19 | 2023-04-18 | 华南理工大学 | 一种检测模型压缩方法、装置和计算机可读存储介质 |
CN110796168B (zh) * | 2019-09-26 | 2023-06-13 | 江苏大学 | 一种基于改进YOLOv3的车辆检测方法 |
Application timeline
- 2020-03-17: CN application CN202010188303.6A (publication CN113408561A; status: active, pending)
- 2021-03-09: KR application KR1020227026698A (publication KR20220116061A; status: active, search and examination)
- 2021-03-09: EP application EP21771737.0A (publication EP4080408A4; status: active, pending)
- 2021-03-09: JP application JP2022544673A (publication JP2023527489A; status: active, pending)
- 2021-03-09: US application US17/912,342 (publication US20230131518A1; status: active, pending)
- 2021-03-09: WO application PCT/CN2021/079690 (publication WO2021185121A1; status: unknown)
Non-Patent Citations (2)
Title |
---|
See also references of EP4080408A4 |
SONG, FEIYANG ET AL.: "Optimization of Structural Pruning Based on MobileNetV3", AUTOMATION & INFORMATION ENGINEERING, vol. 40, no. 6, 15 December 2019 (2019-12-15), pages 20-25, XP055852229 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115170902A (zh) * | 2022-06-20 | 2022-10-11 | 美的集团(上海)有限公司 | 图像处理模型的训练方法 |
CN115170902B (zh) * | 2022-06-20 | 2024-03-08 | 美的集团(上海)有限公司 | 图像处理模型的训练方法 |
CN115265881A (zh) * | 2022-09-28 | 2022-11-01 | 宁波普瑞均胜汽车电子有限公司 | 压力检测方法和装置 |
CN115265881B (zh) * | 2022-09-28 | 2022-12-20 | 宁波普瑞均胜汽车电子有限公司 | 压力检测方法和装置 |
Also Published As
Publication number | Publication date |
---|---|
CN113408561A (zh) | 2021-09-17 |
EP4080408A1 (en) | 2022-10-26 |
EP4080408A4 (en) | 2023-12-27 |
JP2023527489A (ja) | 2023-06-29 |
US20230131518A1 (en) | 2023-04-27 |
KR20220116061A (ko) | 2022-08-19 |
Legal Events
- 121: the EPO has been informed by WIPO that EP was designated in this application (ref document number: 21771737; country: EP; kind code: A1)
- ENP: entry into the national phase (ref document number: 2022544673; country: JP; kind code: A)
- ENP: entry into the national phase (ref document number: 2021771737; country: EP; effective date: 2022-07-18)
- ENP: entry into the national phase (ref document number: 20227026698; country: KR; kind code: A)
- NENP: non-entry into the national phase (ref country code: DE)