CN115690704B - LG-CenterNet model-based complex road scene target detection method and device
- Publication number
- CN115690704B (application CN202211179337.4A)
- Authority
- CN
- China
- Prior art keywords
- module
- model
- road scene
- target
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a method and device for detecting targets in complex road scenes based on an LG-CenterNet model. Original road images are collected to build a data set, and an LG-CenterNet network model is constructed. A 50-layer Backbone (Mresneit50) is used to extract features; a hierarchical guided attention mechanism guides features from different levels, fusing the feature maps of different scales produced by the Backbone network and enlarging the receptive field. The feature map processed by the hierarchical guided attention mechanism is input into a Scales Encoder module for processing; a deconvolution module restores the feature resolution; and a new feature enhancement module enhances the restored features to address the loss of feature information during resolution restoration. Finally, the enhanced feature map is input to a Center points prediction module for road target category identification and localization. The average recognition precision on the self-built complex road scene data set is 86.93%, and the detection speed on road scene target images reaches 50 frames/s, which meets the requirements of accurate and real-time road scene detection.
Description
Technical Field
The invention belongs to the fields of semantic segmentation, image processing and intelligent driving, and particularly relates to a complex road scene target detection method and device based on an LG-CenterNet model.
Background
In recent years the steady growth in the number of automobiles has led to frequent traffic accidents, which seriously threaten people's lives. With the development of autonomous driving technology, research has shifted from passive safety technology to active safety technology for automobiles. Advanced technical means are needed so that the vehicle can complete part of the driving task automatically. Intelligent detection of road scene targets using deep learning is key to automotive active safety. Current target detection networks extract features mainly through a backbone network but give little consideration to the underlying multi-scale problem, which may result in insufficient multi-scale target detection capability.
Disclosure of Invention
Object of the invention: existing target detection methods perform poorly in complex road scenes and cannot meet the detection requirements of real road environments. To address this problem, the invention provides a complex road scene target detection method and device based on the LG-CenterNet model.
The technical scheme is as follows: the invention provides a complex road scene target detection method based on an LG-CenterNet model, which specifically comprises the following steps:
(1) Processing images of complex road scenes to obtain road target images containing multiple categories, annotating the categories and positions of the road targets in the images, constructing a complex road scene data set and preprocessing it;
(2) Constructing the LG-CenterNet target detection model, and training it on the road target data set to obtain a model S; the LG-CenterNet model comprises a Backbone module, a hierarchical guided attention module, a Scales Encoder module, a deconvolution module, a feature enhancement module and a Center points prediction module;
(3) Using the trained model S, performing target localization, bounding-box size prediction and category prediction for complex road targets by means of heatmaps through the Center points prediction module, and displaying the results on the input video or image.
Further, the preprocessing of the road scene data set in step (1) normalizes complex road scene images of different resolutions to a size of 512×512 pixels, and obtains uniformly distributed feature target samples through batch normalization, a ReLU activation function and a max pooling operation.
Further, the implementation process of the step (2) is as follows:
(21) The LG-CenterNet model uses a newly proposed Mresneit50 as the Backbone module. Mresneit50 consists of several groups of residual blocks: the feature map extracted by the group of 4 residual blocks is denoted E1 and has 512 channels; the feature map extracted by the group of 6 residual blocks is denoted E2 and has 1024 channels; the feature map extracted by the group of 3 residual blocks is denoted E3 and has 2048 channels;
(22) The feature maps E1, E2 and E3 extracted by the Backbone are input into the hierarchical guided attention module, whose main structure comprises two branches: a global pooling branch and a hierarchical guiding branch. The feature map E1 with 512 channels is input into the global pooling branch, and EC1 is obtained through a global max pooling layer and an up-sampling layer; the feature maps E1, E2 and E3 with 512, 1024 and 2048 channels are input into the hierarchical guiding branch, and EC2 is obtained through a series of average pooling and convolution operations combined with up-sampling; EC1 and EC2 are fused by element-wise addition (add) to obtain EC3, which reduces the number of parameters to be computed;
(23) The extracted EC3 is input into the Scales Encoder module, and EC4 is obtained through a series of convolution and residual module operations;
(24) The extracted EC4 is input into the deconvolution module, which consists of 3 deconv groups; each deconv group enlarges the feature map and reduces the number of channels, yielding a feature map of size 128×64, denoted EC5;
(25) The feature map EC5 is input into the feature enhancement module (P-FEM) for convolution to obtain a feature map EC6 of size 128×64; the P-FEM consists of a 3×3 Poly-Scale Convolution, batch normalization, a ReLU activation function and a Sigmoid activation function, and serves mainly to improve the correlation of local information in the feature map and strengthen its feature representation capability.
Further, the implementation process of the step (3) is as follows:
Using the trained model S, the Center points prediction module performs classification prediction on the input image and generates from the original image a heatmap whose scale matches the size of EC6. The heatmap loss $L_k$, the target size (width and height) loss $L_s$ and the center point offset loss $L_f$ are then computed to determine the position and size of each target and to generate the final classified and localized heatmap; the overall network loss is:
$L_d = L_k + \lambda_s L_s + \lambda_f L_f$
where $\lambda_s = 0.1$ and $\lambda_f = 1$; for an input image of size 512×512, the feature map generated by the network is H×W×C, and $L_k$, $L_s$ and $L_f$ are computed from the following quantities:
$A_{HWC}$ is the ground-truth value of the annotated target in the image, $A'_{HWC}$ is the predicted value, $\alpha$ and $\beta$ are 2 and 4 respectively, N is the number of key points in the image, $s'_{pk}$ is the predicted size, $s_k$ is the true size, and p is the position of the target's center point in the image.
Based on the same inventive concept, the invention also provides a complex road scene target detection device based on the LG-CenterNet model, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the computer program realizes the complex road scene target detection method based on the LG-CenterNet model when being loaded to the processor.
Beneficial effects: compared with the prior art, the invention has the following advantages: 1. the backbone network of the LG-CenterNet model is improved, with Mresneit50 proposed to strengthen feature extraction; 2. a hierarchical guided attention module is proposed to fuse the feature maps extracted by the backbone network; 3. a new Scales Encoder module and a feature enhancement module are proposed that focus on extracting local features, avoiding the feature loss that occurs in the deconvolution module; 4. the mAP (mean Average Precision) of the improved LG-CenterNet target detection model is 5 percentage points higher than that of the original CenterNet framework; 5. the invention achieves higher detection precision in complex road scenes.
Drawings
FIG. 1 is a flow chart of a complex road scene object detection method based on the LG-CenterNet model;
FIG. 2 is a schematic diagram of an LG-CenterNet-based target detection model according to the present invention;
FIG. 3 is a schematic diagram of the residual block (Mblock) structure proposed by the present invention;
FIG. 4 is a schematic diagram of the hierarchical guided attention module structure;
FIG. 5 is a schematic diagram of the structure of the Scales Encoder module;
FIG. 6 is a schematic diagram of a feature enhancement module architecture;
FIG. 7 is a graph showing the detection effect obtained by using the LG-CenterNet target detection model.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
This embodiment involves a number of variables, which are described in Table 1.
Table 1. Variable descriptions
Variable | Description |
---|---|
S | 3×3 convolution kernel with 1024 channels |
E1 | Feature map extracted by the 4 residual blocks in the Backbone module |
E2 | Feature map extracted by the 6 residual blocks in the Backbone module |
E3 | Feature map extracted by the 3 residual blocks in the Backbone module |
EC1 | Feature map obtained from E1 through the global pooling branch |
EC2 | Feature map obtained from E1, E2 and E3 through the hierarchical guiding branch |
EC3 | Feature map obtained by adding EC1 and EC2 |
EC4 | Feature map obtained from EC3 through the Scales Encoder module |
EC5 | Feature map obtained from EC4 through the deconvolution module |
EC6 | Feature map obtained from EC5 through the feature enhancement module |
The invention provides a complex road scene target detection method based on the LG-CenterNet model. Images of different targets in road scenes are collected and annotated to build a complex road scene data set. The proposed Mresneit50 serves as the backbone network for feature extraction; the feature maps of different scales extracted by the backbone are input into the hierarchical guided attention module; features with multiple receptive fields are then obtained through the Scales Encoder module; the deconvolution module restores the feature resolution; and a feature enhancement module built from Poly-Scale Convolution (PSConv for short) improves the information correlation of local features. Finally, the Center points prediction module predicts the position of each target's center point, the size of the prediction box and the center point offset, and identifies the target category. As shown in FIG. 1, the method comprises the following steps:
Step 1: Process the images of complex road scenes, preprocess the resulting road target images containing multiple categories, and annotate the categories and positions of the road targets in the images to construct a complex road scene data set.
Preprocessing of the road scene data set mainly consists of normalizing complex road scene images of different resolutions to a size of 512×512 pixels, and obtaining target samples that are uniformly distributed in the images through batch normalization (Batch Normalization), a ReLU activation function and a max pooling operation.
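The size normalization in Step 1 can be illustrated with a minimal NumPy sketch. The nearest-neighbour resize and the per-image, per-channel standardization are assumptions chosen for brevity; the patent does not state which interpolation or statistics are used, and the function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(image: np.ndarray, size: int = 512) -> np.ndarray:
    """Resize an H x W x 3 uint8 image to size x size and standardise it.

    Assumption: nearest-neighbour resizing and per-image statistics are
    illustrative choices, not details taken from the patent.
    """
    h, w = image.shape[:2]
    # For every output pixel, pick the index of the source row/column.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image[rows][:, cols].astype(np.float32) / 255.0
    # Per-channel standardisation to zero mean and unit variance.
    mean = resized.mean(axis=(0, 1), keepdims=True)
    std = resized.std(axis=(0, 1), keepdims=True) + 1e-6
    return (resized - mean) / std
```

Images of any resolution come out as 512×512×3 arrays, so batches fed to the Backbone have a uniform shape.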
Step 2: Construct the LG-CenterNet target detection model, whose structure is shown in FIG. 2, and train it on the road target data set to obtain a model S. The LG-CenterNet network mainly comprises a Backbone module, a hierarchical guided attention module (Levels guide attention, LGA for short), a Scales Encoder module, a deconvolution module, a feature enhancement module (P-Feature enhancement module, P-FEM) and a Center points prediction module.
(21) The LG-CenterNet model uses a newly proposed Mresneit50 as the Backbone module. Mresneit50 is composed of several groups of Mblock residual blocks; the Mblock structure is shown in FIG. 3. The feature map extracted by the group of 4 residual blocks is denoted E1 and has 512 channels; the feature map extracted by the group of 6 residual blocks is denoted E2 and has 1024 channels; the feature map extracted by the group of 3 residual blocks is denoted E3 and has 2048 channels.
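As a rough shape check, the sketch below lists the E1, E2 and E3 shapes for a 512×512 input. The block counts and channel numbers come from the text; the cumulative strides of 8, 16 and 32 are an assumption borrowed from a standard ResNet-50 layout, since the patent does not state the strides of Mresneit50.

```python
def backbone_shapes(input_size: int = 512) -> dict:
    """Return {name: (height, width, channels)} for E1, E2 and E3.

    Assumption: the cumulative strides (8, 16, 32) follow a standard
    ResNet-50; they are not given in the patent text.
    """
    # (name, number of residual blocks, output channels, assumed stride)
    stages = [("E1", 4, 512, 8), ("E2", 6, 1024, 16), ("E3", 3, 2048, 32)]
    return {
        name: (input_size // stride, input_size // stride, channels)
        for name, _blocks, channels, stride in stages
    }
```

Under these assumptions, E1 is 64×64×512, E2 is 32×32×1024 and E3 is 16×16×2048.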
(22) The feature maps E1, E2 and E3 extracted by the Backbone are input into the hierarchical guided attention module (Levels guide attention, LGA for short), whose structure is shown in FIG. 4. Its main structure comprises two branches: a global pooling branch and a hierarchical guiding branch. The feature map E1 with 512 channels is input into the global pooling branch, and EC1 is obtained through a global max pooling layer and an up-sampling layer; the feature maps E1, E2 and E3 with 512, 1024 and 2048 channels are input into the hierarchical guiding branch, and EC2 is obtained through a series of average pooling and convolution operations combined with up-sampling. EC1 and EC2 are fused by element-wise addition (add) to obtain EC3, which reduces the number of parameters to be computed.
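One way to see why fusing by add saves parameters: addition keeps the channel count unchanged, whereas concatenation would double it and so double the weights of the next convolution. The sketch below compares the two under illustrative assumptions (a following 3×3 convolution and a hypothetical channel count of 256, neither taken from the patent).

```python
import numpy as np

def conv_params(in_ch: int, out_ch: int, kernel: int = 3) -> int:
    """Weights plus biases of a single kernel x kernel convolution."""
    return in_ch * out_ch * kernel * kernel + out_ch

def fuse_add(ec1: np.ndarray, ec2: np.ndarray) -> np.ndarray:
    """Element-wise addition; both inputs must share the same shape."""
    assert ec1.shape == ec2.shape
    return ec1 + ec2

def following_conv_cost(channels: int) -> tuple:
    """Parameters of the next conv after add-fusion vs. concat-fusion."""
    after_add = conv_params(channels, channels)
    after_concat = conv_params(2 * channels, channels)
    return after_add, after_concat
```

With 256 channels, the next convolution needs 590,080 parameters after add and 1,179,904 after concatenation.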
(23) The extracted EC3 is input into the Scales Encoder module, whose structure is shown in FIG. 5, and EC4 is obtained through a series of convolution and residual module operations.
(24) The extracted EC4 is input into the deconvolution module, which consists of 3 deconv groups; each deconv group enlarges the feature map and reduces the number of channels, yielding a feature map of size 128×64, denoted EC5.
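The growth of the feature map through the three deconv groups can be traced with the standard transposed-convolution output-size formula. The kernel size 4, stride 2, padding 1, the 16×16 input and the channel sequence 256, 128, 64 are assumptions borrowed from the original CenterNet design, and the stated 128×64 is read here as a 128×128 map with 64 channels; none of these details are confirmed by the patent text.

```python
def deconv_out(size: int, kernel: int = 4, stride: int = 2,
               padding: int = 1, output_padding: int = 0) -> int:
    """Output size of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

def deconv_module_shapes(size: int = 16, channels=(256, 128, 64)) -> list:
    """Shapes after each deconv group (assumed hyper-parameters)."""
    shapes = []
    for ch in channels:
        size = deconv_out(size)  # each group doubles height and width
        shapes.append((size, size, ch))
    return shapes
```

Starting from 16×16, the three groups give 32×32×256, 64×64×128 and 128×128×64, which is one quarter of the 512×512 input resolution.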
(25) The feature map EC5 is input into the P-FEM for convolution to obtain a feature map EC6 of size 128×64. The P-FEM consists of a 3×3 Poly-Scale Convolution (PSConv for short), batch normalization (Batch Normalization), a ReLU activation function and a Sigmoid activation function, and serves mainly to improve the correlation of local information in the feature map and strengthen its feature representation capability. The P-FEM structure is shown in FIG. 6.
Step 3: Using the trained model S, perform target localization, bounding-box size prediction and category prediction for road scene targets by means of heatmaps through the Center points prediction module, and display the results on the input video or image.
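Heatmap-based localization can be sketched as picking local maxima in each class channel. The 3×3 neighbourhood, the score threshold and the top-k limit below follow common CenterNet-style decoding and are assumptions, as is the function name `decode_heatmap`; the patent does not describe its decoding procedure.

```python
import numpy as np

def decode_heatmap(heatmap: np.ndarray, k: int = 10, threshold: float = 0.3) -> list:
    """Return up to k peaks as (score, row, col, class), best first.

    heatmap: H x W x C float array of per-class center scores in [0, 1].
    """
    h, w, _ = heatmap.shape
    padded = np.pad(heatmap, ((1, 1), (1, 1), (0, 0)), constant_values=-np.inf)
    # Stack the nine shifted views to get each 3x3 neighbourhood maximum.
    neighbours = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    local_max = neighbours.max(axis=0)
    # Keep cells that are the maximum of their neighbourhood and score high enough.
    keep = (heatmap == local_max) & (heatmap >= threshold)
    rows, cols, classes = np.nonzero(keep)
    scores = heatmap[rows, cols, classes]
    order = np.argsort(-scores)[:k]
    return [
        (float(scores[i]), int(rows[i]), int(cols[i]), int(classes[i]))
        for i in order
    ]
```

Each peak gives a target's class and its center on the output grid; the size and offset predictions at that location would then turn the center into a bounding box in the input image.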
Using the trained model S, the Center points prediction module performs classification prediction on the input image and generates from the original image a heatmap whose scale matches the size of EC6. The heatmap loss $L_k$, the target size (width and height) loss $L_s$ and the center point offset loss $L_f$ are then computed to determine the position and size of each target and to generate the final classified and localized heatmap. The overall network loss is $L_d$:
$L_d = L_k + \lambda_s L_s + \lambda_f L_f$
where $\lambda_s = 0.1$ and $\lambda_f = 1$. For an input image of size 512×512, the feature map generated by the network is H×W×C, and $L_k$, $L_s$ and $L_f$ are computed from the following quantities:
$A_{HWC}$ is the ground-truth value of the annotated target in the image, $A'_{HWC}$ is the predicted value, $\alpha$ and $\beta$ are 2 and 4 respectively, N is the number of key points in the image, $s'_{pk}$ is the predicted size, $s_k$ is the true size, and p is the position of the target's center point in the image.
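Assuming the three terms follow the standard CenterNet formulation, which the symbols defined above match, they can be sketched as a penalty-reduced focal loss for the heatmap and L1 losses for size and offset. The symbols $O'_{\tilde p}$ (predicted offset), $R$ (output stride, which would be 4 for a 512×512 input and a 128×128 output) and $\tilde p = \lfloor p / R \rfloor$ are assumptions introduced for illustration and do not appear in the text.

```latex
L_k = -\frac{1}{N}\sum_{h,w,c}
\begin{cases}
\left(1 - A'_{hwc}\right)^{\alpha}\,\log A'_{hwc}, & A_{hwc} = 1 \\
\left(1 - A_{hwc}\right)^{\beta}\,\left(A'_{hwc}\right)^{\alpha}\,\log\left(1 - A'_{hwc}\right), & \text{otherwise}
\end{cases}

L_s = \frac{1}{N}\sum_{k=1}^{N}\left| s'_{pk} - s_k \right|

L_f = \frac{1}{N}\sum_{p}\left| O'_{\tilde p} - \left(\frac{p}{R} - \tilde p\right) \right|
```

With $\lambda_s = 0.1$ and $\lambda_f = 1$, the size term contributes one tenth of its raw value to $L_d$, while the heatmap and offset terms enter at full weight.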
Based on the same inventive concept, the invention also provides a complex road scene target detection device based on the LG-CenterNet model, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the computer program realizes the complex road scene target detection method based on the LG-CenterNet model when being loaded to the processor. As shown in fig. 7.
The self-built complex scene data set is used to train the LG-CenterNet network, yielding a model capable of identifying targets in complex scenes; model performance is verified on the validation set of the data set, as shown in FIG. 7. The average recognition precision on the self-built complex road scene data set is 86.93%, and the detection speed on road scene target images reaches 50 frames/s, which meets the requirements of accurate and real-time road scene detection.
The evaluation metrics are Precision, Recall, average precision (AP), mean average precision (mAP) and frames per second (FPS), where t is the time taken to detect a single picture. The data set contains multiple sample categories (e.g., car, person), and n denotes the number of samples. TP (True Positives) is the number of positive samples that the model identifies as positive (i.e., samples that are car and are identified as car); TN (True Negatives) is the number of negative samples that the model identifies as negative; FP (False Positives) is the number of negative samples that the model identifies as positive (i.e., samples that are not car but are identified as car); FN (False Negatives) is the number of positive samples that the model identifies as negative.
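The metrics named above can be written out using their standard definitions. These are the usual textbook formulas and are assumed here rather than copied from the patent; the function names are illustrative.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Share of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def mean_average_precision(ap_per_class: dict) -> float:
    """mAP as the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

def frames_per_second(t: float) -> float:
    """FPS from the time t (in seconds) needed to detect one picture."""
    return 1.0 / t
```

For example, a detection time of 0.02 s per picture corresponds to the reported 50 frames/s.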
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the spirit of the present invention.
Claims (4)
1. The complex road scene target detection method based on the LG-CenterNet model is characterized by comprising the following steps of:
(1) Processing the image of the complex road scene to obtain a road target image containing various categories, marking the categories and positions of the road targets in the image, constructing a complex road scene data set and preprocessing the complex road scene data set;
(2) Constructing the LG-CenterNet target detection model, and training it on the road target data set to obtain a model S; the LG-CenterNet model comprises a Backbone module, a hierarchical guided attention module, a Scales Encoder module, a deconvolution module, a feature enhancement module and a Center points prediction module;
(3) Using the trained model S, performing target localization, bounding-box size prediction and category prediction for complex road targets by means of heatmaps through the Center points prediction module, and displaying the results on the input video or image;
the implementation process of the step (2) is as follows:
(21) The LG-CenterNet model uses a newly proposed Mresneit50 as the Backbone module, wherein Mresneit50 consists of several groups of residual blocks: the feature map extracted by the group of 4 residual blocks is denoted E1 and has 512 channels; the feature map extracted by the group of 6 residual blocks is denoted E2 and has 1024 channels; the feature map extracted by the group of 3 residual blocks is denoted E3 and has 2048 channels;
(22) The feature maps E1, E2 and E3 extracted by the Backbone are input into the hierarchical guided attention module, whose main structure comprises two branches: a global pooling branch and a hierarchical guiding branch; the feature map E1 with 512 channels is input into the global pooling branch, and EC1 is obtained through a global max pooling layer and an up-sampling layer; the feature maps E1, E2 and E3 with 512, 1024 and 2048 channels are input into the hierarchical guiding branch, and EC2 is obtained through a series of average pooling and convolution operations combined with up-sampling; EC1 and EC2 are fused by element-wise addition (add) to obtain EC3, thereby reducing the number of parameters to be computed;
(23) The extracted EC3 is input into the Scales Encoder module, and EC4 is obtained through a series of convolution and residual module operations;
(24) The extracted EC4 is input into the deconvolution module, which consists of 3 deconv groups; each deconv group enlarges the feature map and reduces the number of channels, yielding a feature map of size 128×64, denoted EC5;
(25) The feature map EC5 is input into the feature enhancement module (P-FEM) for convolution to obtain a feature map EC6 of size 128×64; the P-FEM consists of a 3×3 Poly-Scale Convolution, batch normalization, a ReLU activation function and a Sigmoid activation function, and serves mainly to improve the correlation of local information in the feature map and strengthen its feature representation capability.
2. The complex road scene target detection method based on the LG-CenterNet model as claimed in claim 1, wherein the preprocessing of the complex road scene data set in step (1) normalizes complex road scene images of different resolutions to a size of 512×512 pixels, and obtains uniformly distributed feature target samples through batch normalization, a ReLU activation function and a max pooling operation.
3. The complex road scene target detection method based on the LG-CenterNet model as claimed in claim 1, wherein the implementation process of step (3) is as follows:
using the trained model S, the Center points prediction module performs classification prediction on the input image and generates from the original image a heatmap whose scale matches the size of EC6; the heatmap loss $L_k$, the target size (width and height) loss $L_s$ and the center point offset loss $L_f$ are then computed to determine the position and size of each target and to generate the final classified and localized heatmap; wherein the overall network loss is:
$L_d = L_k + \lambda_s L_s + \lambda_f L_f$
wherein $\lambda_s = 0.1$ and $\lambda_f = 1$; for an input image of size 512×512, the feature map generated by the network is H×W×C, and $L_k$, $L_s$ and $L_f$ are computed from the following quantities:
$A_{HWC}$ is the ground-truth value of the annotated target in the image, $A'_{HWC}$ is the predicted value, $\alpha$ and $\beta$ are 2 and 4 respectively, N is the number of key points in the image, $s'_{pk}$ is the predicted size, $s_k$ is the true size, and p is the position of the target's center point in the image.
4. A complex road scene target detection device based on the LG-CenterNet model, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when loaded into the processor, implements the complex road scene target detection method based on the LG-CenterNet model according to any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211179337.4A CN115690704B (en) | 2022-09-27 | 2022-09-27 | LG-CenterNet model-based complex road scene target detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115690704A (en) | 2023-02-03 |
CN115690704B (en) | 2023-08-22 |
Family
ID=85063352
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211179337.4A (Active, CN115690704B) | 2022-09-27 | 2022-09-27 | LG-CenterNet model-based complex road scene target detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115690704B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117690165A (en) * | 2024-02-02 | 2024-03-12 | 四川泓宝润业工程技术有限公司 | Method and device for detecting personnel passing between drill rod and hydraulic pliers |
Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717537A (en) * | 2018-05-30 | 2018-10-30 | 淮阴工学院 | A kind of face identification method and system of the complex scene based on pattern-recognition |
CN110543895A (en) * | 2019-08-08 | 2019-12-06 | 淮阴工学院 | image classification method based on VGGNet and ResNet |
CN111382714A (en) * | 2020-03-13 | 2020-07-07 | Oppo广东移动通信有限公司 | Image detection method, device, terminal and storage medium |
CN111709895A (en) * | 2020-06-17 | 2020-09-25 | 中国科学院微小卫星创新研究院 | Image blind deblurring method and system based on attention mechanism |
CN111814889A (en) * | 2020-07-14 | 2020-10-23 | 大连理工大学人工智能大连研究院 | Single-stage target detection method using anchor-frame-free module and enhanced classifier |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN112329800A (en) * | 2020-12-03 | 2021-02-05 | 河南大学 | Salient object detection method based on global information guiding residual attention |
CN112580443A (en) * | 2020-12-02 | 2021-03-30 | 燕山大学 | Pedestrian detection method based on embedded device improved CenterNet |
CN112686207A (en) * | 2021-01-22 | 2021-04-20 | 北京同方软件有限公司 | Urban street scene target detection method based on regional information enhancement |
CN112700444A (en) * | 2021-02-19 | 2021-04-23 | 中国铁道科学研究院集团有限公司铁道建筑研究所 | Bridge bolt detection method based on self-attention and central point regression model |
CN113378815A (en) * | 2021-06-16 | 2021-09-10 | 南京信息工程大学 | Model for scene text positioning recognition and training and recognition method thereof |
CN113408498A (en) * | 2021-08-05 | 2021-09-17 | 广东众聚人工智能科技有限公司 | Crowd counting system and method, equipment and storage medium |
CN113657326A (en) * | 2021-08-24 | 2021-11-16 | 陕西科技大学 | Weed detection method based on multi-scale fusion module and feature enhancement |
WO2021244621A1 (en) * | 2020-06-04 | 2021-12-09 | 华为技术有限公司 | Scenario semantic parsing method based on global guidance selective context network |
CN114359153A (en) * | 2021-12-07 | 2022-04-15 | 湖北工业大学 | Insulator defect detection method based on improved CenterNet |
CN114419589A (en) * | 2022-01-17 | 2022-04-29 | 东南大学 | Road target detection method based on attention feature enhancement module |
CN114581866A (en) * | 2022-01-24 | 2022-06-03 | 江苏大学 | Multi-target visual detection algorithm for automatic driving scene based on improved CenterNet |
CN114638836A (en) * | 2022-02-18 | 2022-06-17 | 湖北工业大学 | Urban street view segmentation method based on highly effective drive and multi-level feature fusion |
CN114863368A (en) * | 2022-07-05 | 2022-08-05 | 城云科技(中国)有限公司 | Multi-scale target detection model and method for road damage detection |
CN115035361A (en) * | 2022-05-11 | 2022-09-09 | 中国科学院声学研究所南海研究站 | Target detection method and system based on attention mechanism and feature cross fusion |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220204035A1 (en) * | 2020-12-28 | 2022-06-30 | Hyundai Mobis Co., Ltd. | Driver management system and method of operating same |
US11915500B2 (en) * | 2021-01-28 | 2024-02-27 | Salesforce, Inc. | Neural network based scene text recognition |
- 2022-09-27: CN application CN202211179337.4A, patent CN115690704B (en), status Active
Patent Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717537A (en) * | 2018-05-30 | 2018-10-30 | 淮阴工学院 | Face recognition method and system for complex scenes based on pattern recognition |
CN110543895A (en) * | 2019-08-08 | 2019-12-06 | 淮阴工学院 | Image classification method based on VGGNet and ResNet |
CN111382714A (en) * | 2020-03-13 | 2020-07-07 | Oppo广东移动通信有限公司 | Image detection method, device, terminal and storage medium |
WO2021244621A1 (en) * | 2020-06-04 | 2021-12-09 | 华为技术有限公司 | Scenario semantic parsing method based on global guidance selective context network |
CN111709895A (en) * | 2020-06-17 | 2020-09-25 | 中国科学院微小卫星创新研究院 | Image blind deblurring method and system based on attention mechanism |
CN111814889A (en) * | 2020-07-14 | 2020-10-23 | 大连理工大学人工智能大连研究院 | Single-stage target detection method using anchor-free module and enhanced classifier |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN112580443A (en) * | 2020-12-02 | 2021-03-30 | 燕山大学 | Pedestrian detection method based on embedded device improved CenterNet |
CN112329800A (en) * | 2020-12-03 | 2021-02-05 | 河南大学 | Salient object detection method based on global information guiding residual attention |
CN112686207A (en) * | 2021-01-22 | 2021-04-20 | 北京同方软件有限公司 | Urban street scene target detection method based on regional information enhancement |
CN112700444A (en) * | 2021-02-19 | 2021-04-23 | 中国铁道科学研究院集团有限公司铁道建筑研究所 | Bridge bolt detection method based on self-attention and central point regression model |
CN113378815A (en) * | 2021-06-16 | 2021-09-10 | 南京信息工程大学 | Model for scene text positioning recognition and training and recognition method thereof |
CN113408498A (en) * | 2021-08-05 | 2021-09-17 | 广东众聚人工智能科技有限公司 | Crowd counting system and method, equipment and storage medium |
CN113657326A (en) * | 2021-08-24 | 2021-11-16 | 陕西科技大学 | Weed detection method based on multi-scale fusion module and feature enhancement |
CN114359153A (en) * | 2021-12-07 | 2022-04-15 | 湖北工业大学 | Insulator defect detection method based on improved CenterNet |
CN114419589A (en) * | 2022-01-17 | 2022-04-29 | 东南大学 | Road target detection method based on attention feature enhancement module |
CN114581866A (en) * | 2022-01-24 | 2022-06-03 | 江苏大学 | Multi-target visual detection algorithm for automatic driving scene based on improved CenterNet |
CN114638836A (en) * | 2022-02-18 | 2022-06-17 | 湖北工业大学 | Urban street view segmentation method based on highly effective drive and multi-level feature fusion |
CN115035361A (en) * | 2022-05-11 | 2022-09-09 | 中国科学院声学研究所南海研究站 | Target detection method and system based on attention mechanism and feature cross fusion |
CN114863368A (en) * | 2022-07-05 | 2022-08-05 | 城云科技(中国)有限公司 | Multi-scale target detection model and method for road damage detection |
Non-Patent Citations (1)
Title |
---|
改进CenterNet的交通标志检测算法;成怡 等;《信号处理》;第38卷(第3期);511-518 * |
Also Published As
Publication number | Publication date |
---|---|
CN115690704A (en) | 2023-02-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107563372B (en) | | License plate positioning method based on deep learning SSD framework |
CN107239778B (en) | | Efficient and accurate license plate recognition method |
CN111340123A (en) | | Image score label prediction method based on deep convolutional neural network |
CN111968150B (en) | | Weakly supervised video target segmentation method based on full convolution neural network |
CN112560656A (en) | | Pedestrian multi-target tracking method combining attention mechanism and end-to-end training |
CN111738363B (en) | | Alzheimer's disease classification method based on improved 3D CNN network |
CN115131760B (en) | | Lightweight vehicle tracking method based on improved feature matching strategy |
CN114266794B (en) | | Pathological section image cancer region segmentation system based on full convolution neural network |
CN115690704B (en) | | LG-CenterNet model-based complex road scene target detection method and device |
CN115880529A (en) | | Fine-grained bird classification method and system based on attention and decoupled knowledge distillation |
CN113066089B (en) | | Real-time image semantic segmentation method based on attention guidance mechanism |
CN113610760A (en) | | Cell image segmentation and tracing method based on U-shaped residual neural network |
CN111507279B (en) | | Palm print recognition method based on UNet++ network |
CN111178370B (en) | | Vehicle searching method and related device |
CN110211150B (en) | | Real-time visual target identification method with scale coordination mechanism |
CN116228795A (en) | | Ultrahigh resolution medical image segmentation method based on weakly supervised learning |
CN113269734B (en) | | Tumor image detection method and device based on meta-learning feature fusion strategy |
CN112348011B (en) | | Vehicle damage assessment method and device and storage medium |
CN114743257A (en) | | Method for detecting and identifying image target behaviors |
CN115359091A (en) | | Armor plate detection and tracking method for mobile robot |
CN115170414A (en) | | Knowledge distillation-based single image rain removal method and system |
CN114972171A (en) | | Prostate MR image semi-supervised segmentation method based on deep learning |
Zheng et al. | | A novel strategy for global lane detection based on key-point regression and multi-scale feature fusion |
CN113012167A (en) | | Combined segmentation method for cell nucleus and cytoplasm |
CN112487927A (en) | | Indoor scene recognition implementation method and system based on object-associated attention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20230203; Assignee: Jiangsu Kesheng Xuanyi Technology Co.,Ltd.; Assignor: HUAIYIN INSTITUTE OF TECHNOLOGY; Contract record no.: X2023980048436; Denomination of invention: Method and device for complex road scene object detection based on LG CenterNet model; Granted publication date: 20230822; License type: Common License; Record date: 20231129 |