CN112132032A - Traffic sign detection method and device, electronic equipment and storage medium - Google Patents
Traffic sign detection method and device, electronic equipment and storage medium
- Publication number: CN112132032A (application number CN202011009444.3A)
- Authority
- CN
- China
- Prior art keywords
- network
- traffic sign
- model
- feature map
- convolution operation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/54 — Surveillance or monitoring of traffic activities, e.g. cars on the road, trains or boats
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/23 — Clustering techniques
- G06F18/25 — Fusion techniques
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06V2201/07 — Target detection
Abstract
The application relates to the technical field of artificial intelligence and can be applied to smart traffic scenes to promote the construction of smart cities. The application provides a traffic sign detection method and device, electronic equipment and a storage medium. The method comprises the following steps: acquiring an image to be detected; preprocessing the image to be detected to obtain a preprocessed image; inputting the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, wherein the traffic sign detection model is obtained by improving the network structure of a YOLOv3 network model; and outputting the detection result. The invention can improve the identification accuracy of traffic signs.
Description
Technical Field
The invention relates to the technical field of image detection, in particular to a method and a device for detecting a traffic sign board, electronic equipment and a storage medium.
Background
At present, traffic signboards of roads can be identified through a YOLOv3 network model, but in practice, it is found that the YOLOv3 network model has low identification accuracy on the traffic signboards in complex environments such as haze and night, and road driving safety is affected.
Therefore, how to improve the identification accuracy of the traffic sign is a technical problem which needs to be solved urgently.
Disclosure of Invention
In view of the above, it is necessary to provide a traffic sign detecting method capable of improving the accuracy of identifying a traffic sign.
A first aspect of the present invention provides a traffic sign detection method, the method comprising:
acquiring an image to be detected;
preprocessing the image to be detected to obtain a preprocessed image;
inputting the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, wherein the traffic sign detection model is obtained by improving the network structure of a YOLOv3 network model, and the improvement of the network structure comprises fusing an EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a fusion network, and replacing the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model with residual connections between the fusion network and the feature maps;
and outputting the detection result.
In a possible implementation manner, before the obtaining of the image to be detected, the method for detecting a traffic sign further includes:
acquiring a preset YOLOv3 network model;
fusing an EfficientNet network with a DarkNet53 network in the YOLOv3 network model to obtain a fused network;
replacing the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model with residual connections between the fusion network and the feature maps to obtain an initial model;
acquiring a preset training set;
and training the initial model by using the preset training set to obtain a trained traffic sign board detection model.
In a possible implementation manner, the fusing the EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a fusion network includes:
carrying out channel merging on the EfficientNet network and a DarkNet53 network in the YOLOv3 network model to obtain a merged network;
compressing the merged network to obtain a compressed network;
recombining the compressed network through two preset fully-connected neural layers to obtain an attention network;
and carrying out attention weighting on the attention network and the merged network to obtain the fusion network.
In a possible implementation manner, the inputting the preprocessed image into a traffic sign detection model trained in advance, and obtaining a detection result includes:
inputting the preprocessed image into a pre-trained traffic sign detection model, acquiring a first residual block of the preprocessed image through the EfficientNet network, and acquiring a second residual block of the preprocessed image through the DarkNet53 network;
generating a first feature map, a second feature map and a third feature map according to the first residual block, the second residual block and the fusion network;
and generating the detection result according to the first feature map, the second feature map and the third feature map.
In a possible implementation manner, the generating the detection result according to the first feature map, the second feature map, and the third feature map includes:
predicting on the first feature map, the second feature map and the third feature map through a plurality of prior boxes obtained by preset clustering to obtain the coordinates of the traffic sign;
and determining the coordinates of the traffic sign board as the detection result.
In a possible implementation manner, the generating a first feature map, a second feature map, and a third feature map according to the first residual block, the second residual block, and the fusion network includes:
inputting the first residual block and the second residual block into the fusion network, and performing a first convolution operation to obtain the first feature map;
performing a first splicing operation on the first residual block and the result of the first convolution operation to obtain a first splicing result, and performing a second convolution operation on the first splicing result to obtain the second feature map;
and performing a second splicing operation on the second residual block and the result of the second convolution operation to obtain a second splicing result, and performing a third convolution operation on the second splicing result to obtain the third feature map.
In one possible implementation, the first convolution operation, the second convolution operation and the third convolution operation each consist of three convolution sub-operations: the first sub-operation includes 5 convolution layers, 1 normalization layer and 1 activation layer; the second sub-operation includes 1 convolution layer, 1 normalization layer and 1 activation layer; and the third sub-operation includes 1 convolution layer.
A second aspect of the present invention provides a traffic sign detecting device including:
the acquisition module is used for acquiring an image to be detected;
the preprocessing module is used for preprocessing the image to be detected to obtain a preprocessed image;
an input module, configured to input the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, where the traffic sign detection model is obtained by improving the network structure of a YOLOv3 network model, and the improvement of the network structure comprises fusing an EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a fusion network, and replacing the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model with residual connections between the fusion network and the feature maps;
and the output module is used for outputting the detection result.
A third aspect of the invention provides an electronic device comprising a processor and a memory, the processor being configured to implement the traffic sign detection method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the traffic sign detecting method.
By the technical scheme, the preprocessed image can be input into a pre-trained traffic sign detection model. When the traffic sign detection model extracts image features, a neural network fusing multiple networks is used for feature extraction, and the extracted image features are more accurate than those extracted by the original YOLOv3 network model through a single network, so that the final recognition result of the model is more accurate. Meanwhile, the original YOLOv3 network model adopts a multi-scale fusion mode for prediction, so that it can predict objects on feature maps of different scales. However, because those feature maps of different scales are obtained by performing residual connection on intermediate-state feature maps generated by the same network, the features corresponding to different scales are likely to be similar, the model cannot distinguish the features of objects of different scales, and the prediction result is inaccurate. In the present scheme, residual connection is performed with intermediate-state feature maps generated by different networks, so that the features corresponding to different scales are more clearly distinguished and the recall rate of the model is improved.
Drawings
Fig. 1 is a flow chart of a preferred embodiment of a traffic sign detection method disclosed in the present invention.
Fig. 2 is a functional block diagram of a traffic sign detecting apparatus according to a preferred embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device for implementing the traffic sign detection method according to the preferred embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The traffic sign detection method of the embodiment of the invention is applied to an electronic device, and can also be applied to a hardware environment formed by an electronic device and a server connected to the electronic device through a network, in which case the method is executed by the server and the electronic device together. Networks include, but are not limited to: a wide area network, a metropolitan area network, or a local area network.
A server may refer to a computer system that provides services to other devices (e.g., electronic devices) in a network. A personal computer may also be called a server if it can externally provide a File Transfer Protocol (FTP) service. In a narrow sense, a server refers to a high-performance computer, which can provide services to the outside through a network, and compared with a common personal computer, the server has higher requirements on stability, security, performance and the like, and therefore, hardware such as a CPU, a chipset, a memory, a disk system, a network and the like is different from that of the common personal computer.
The electronic device is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction, and the hardware thereof includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like. The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network device, a server group consisting of a plurality of network devices, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network devices, wherein the Cloud Computing is one of distributed Computing, and is a super virtual computer consisting of a group of loosely coupled computers. The user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), or the like.
Referring to fig. 1, fig. 1 is a flow chart of a traffic sign detection method according to a preferred embodiment of the present invention. The order of the steps in the flowchart may be changed, and some steps may be omitted. The execution subject of the traffic sign detection method may be an electronic device.
And S11, acquiring an image to be detected.
Wherein, the image to be detected can contain a traffic sign board.
As an optional implementation manner, before the acquiring the image to be detected, the method for detecting a traffic sign further includes:
acquiring a preset YOLOv3 network model;
fusing an EfficientNet network with a DarkNet53 network in the YOLOv3 network model to obtain a fused network;
replacing the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model with residual connections between the fusion network and the feature maps to obtain an initial model;
acquiring a preset training set;
and training the initial model by using the preset training set to obtain a trained traffic sign board detection model.
The YOLOv3 network model is the third version of the YOLO (You Only Look Once) series of target detection algorithms and adopts a multi-scale fusion mode for prediction. YOLOv3 is a typical one-stage deep learning target detection network based on regression: for a given input image, it directly regresses the target boxes and target categories at multiple positions of the image. On the basis of YOLOv2, YOLOv3 improves the applicability of the traditional network: drawing on ResNet (residual network), it provides the stronger Darknet53 network for feature extraction and adopts multi-scale fusion for target detection, so that the YOLOv3 network model has excellent detection performance.
The EfficientNet network is a network based on model compound scaling; it balances the three dimensions of network resolution, network depth and network width to optimize the network for both efficiency and accuracy.
The Darknet53 network is a convolutional network basically composed of 1 × 1 and 3 × 3 convolutions, with 53 convolution layers in total. It can be used to extract feature information from a picture.
A residual connection means that the output is expressed as the linear superposition of the input and a non-linear transformation of the input.
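As a minimal illustrative sketch (not taken from the patent), a residual connection in NumPy, where the output is the input plus a non-linear transformation of the input; the weights and shapes here are arbitrary toy values:

```python
import numpy as np

def relu(x):
    # Simple non-linearity used inside the residual branch
    return np.maximum(x, 0.0)

def residual_block(x, weight):
    # Residual connection: output = input + F(input),
    # where F is a (linear -> non-linear) transformation of the input.
    return x + relu(x @ weight)

x = np.ones((4, 8))          # a toy feature batch
w = np.full((8, 8), 0.1)     # hypothetical weights
y = residual_block(x, w)
print(y.shape)               # (4, 8) -- same shape as the input
```

Because the input is added back unchanged, gradients can flow through the identity path, which is the property the DarkNet53 design borrows from ResNet.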
In this optional embodiment, the EfficientNet network may be fused with the DarkNet53 network in the preset YOLOv3 network model to obtain a fusion network; the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model are replaced with residual connections between the fusion network and the feature maps to obtain an initial model; and the initial model is then trained with the preset training set to obtain a trained traffic sign detection model.
Specifically, the fusing the EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a fusion network includes:
carrying out channel merging on the EfficientNet network and a DarkNet53 network in the YOLOv3 network model to obtain a merged network;
compressing the merged network to obtain a compressed network;
recombining the compressed network through two preset fully-connected neural layers to obtain an attention network;
and carrying out attention weighting on the attention network and the merged network to obtain the fusion network.
In this optional embodiment, the EfficientNet network and the DarkNet53 network are channel-merged to obtain a merged network; the merged network is then compressed to obtain a compressed network; the compressed network is recombined through two preset fully-connected neural layers to obtain an attention network; and the attention network and the merged network are attention-weighted to obtain the fusion network. Operations such as compression, recombination and attention weighting belong to a channel attention mechanism, which allows the model to automatically focus on important channels (for example, important channels are given larger weights and unimportant channels smaller weights). For example, assume the output matrix of a backbone1 network has size 16 × 128 × 16 and that of a backbone2 network has size 16 × 128 × 32. After the backbone1 network and the backbone2 network perform a channel merging (concat) operation, the merged network matrix has size 16 × 128 × 48. A squeeze compression then yields a compressed network matrix of size 16 × 48, and two fully-connected neural layers yield an attention network matrix whose size is still 16 × 48. Finally, an element-level multiplication of the 16 × 48 attention matrix with the 16 × 128 × 48 merged network matrix, namely the attention weighting, yields a fusion network matrix of size 16 × 128 × 48.
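The matrix-size walkthrough above can be sketched in NumPy. The shapes follow the example in the text; the backbone outputs and fully-connected weights are random placeholders, not the patent's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backbone outputs with the example's shapes: (batch, spatial, channels)
b1 = rng.standard_normal((16, 128, 16))   # backbone1 branch
b2 = rng.standard_normal((16, 128, 32))   # backbone2 branch

# 1. Channel merge (concat): 16 x 128 x 48
merged = np.concatenate([b1, b2], axis=-1)

# 2. Squeeze: global average over the spatial axis -> 16 x 48
squeezed = merged.mean(axis=1)

# 3. Two fully-connected layers (illustrative placeholder weights)
w1 = rng.standard_normal((48, 12)) * 0.1  # reduction layer
w2 = rng.standard_normal((12, 48)) * 0.1  # restoration layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

attention = sigmoid(np.maximum(squeezed @ w1, 0.0) @ w2)  # 16 x 48

# 4. Attention weighting: element-wise multiply, broadcast over the spatial axis
fused = merged * attention[:, None, :]    # 16 x 128 x 48

print(merged.shape, squeezed.shape, attention.shape, fused.shape)
```

The sigmoid keeps each channel weight in (0, 1), so important channels pass through nearly unchanged while unimportant channels are suppressed.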
And S12, preprocessing the image to be detected to obtain a preprocessed image.
The preprocessing may be scaling or cropping the image to be detected to obtain a preprocessed image. The size of the preprocessed image may be fixed at a preset size, so that the preprocessed image meets the input requirement of the subsequent traffic sign detection model.
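A minimal preprocessing sketch under assumed parameters: the 416 × 416 target size is the conventional YOLOv3 input resolution, which the patent does not fix, and the nearest-neighbour resize is one simple choice of scaling:

```python
import numpy as np

def preprocess(image, size=416):
    # Nearest-neighbour resize to a fixed square input size (assumed 416,
    # the conventional YOLOv3 input resolution).
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image[rows][:, cols]
    # Scale pixel values to [0, 1], as is typical before feeding a detector
    return resized.astype(np.float32) / 255.0

img = np.zeros((720, 1280, 3), dtype=np.uint8)  # a dummy road-scene image
out = preprocess(img)
print(out.shape, out.dtype)   # (416, 416, 3) float32
```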
S13, inputting the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, wherein the traffic sign detection model is obtained by improving the network structure of a YOLOv3 network model, and the improvement of the network structure comprises fusing an EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a fusion network, and replacing the residual connections between the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model with residual connections between the fusion network and the feature maps.
Specifically, the inputting the preprocessed image into a traffic sign detection model trained in advance, and the obtaining of the detection result includes:
inputting the preprocessed image into a pre-trained traffic sign detection model, acquiring a first residual block of the preprocessed image through the EfficientNet network, and acquiring a second residual block of the preprocessed image through the DarkNet53 network;
generating a first feature map, a second feature map and a third feature map according to the first residual block, the second residual block and the fusion network;
and generating the detection result according to the first feature map, the second feature map and the third feature map.
The first residual block may be a result of an operation of the intermediate layer of the EfficientNet network on the preprocessed image, and the second residual block may be a result of an operation of the intermediate layer of the DarkNet53 network on the preprocessed image.
In this optional embodiment, the preprocessed image may be input into a pre-trained traffic sign detection model, a first residual block of the preprocessed image is obtained through the EfficientNet network, a second residual block of the preprocessed image is obtained through the DarkNet53 network, a first feature map, a second feature map and a third feature map are generated according to the first residual block, the second residual block and the fusion network, and the detection result is generated according to the first feature map, the second feature map and the third feature map.
Specifically, the generating the detection result according to the first feature map, the second feature map, and the third feature map includes:
predicting on the first feature map, the second feature map and the third feature map through a plurality of prior boxes obtained by preset clustering to obtain the coordinates of the traffic sign;
and determining the coordinates of the traffic sign board as the detection result.
In the YOLOv3 network model, three prior boxes are preset for the feature map of each scale, and nine prior boxes in total are obtained by clustering. For example, assume the 9 prior boxes are: (10 × 13), (16 × 30), (33 × 23), (30 × 61), (62 × 45), (59 × 119), (116 × 90), (156 × 198) and (373 × 326). The larger prior boxes (116 × 90), (156 × 198) and (373 × 326) are applied on the smallest 13 × 13 feature map and are suitable for detecting larger objects; the medium prior boxes (30 × 61), (62 × 45) and (59 × 119) are applied on the medium 26 × 26 feature map and are suitable for detecting medium-sized objects; and the smaller prior boxes (10 × 13), (16 × 30) and (33 × 23) are applied on the larger 52 × 52 feature map and are suitable for detecting smaller objects.
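The assignment of the nine clustered prior boxes to the three feature-map scales can be written out directly; the box values are the example values given above:

```python
# The nine clustered prior boxes from the example, sorted by area, are split
# into three groups of three; each group is assigned to one prediction scale.
priors = [(10, 13), (16, 30), (33, 23),
          (30, 61), (62, 45), (59, 119),
          (116, 90), (156, 198), (373, 326)]

scale_assignment = {
    52: priors[0:3],   # large 52x52 feature map -> small priors -> small objects
    26: priors[3:6],   # medium 26x26 feature map -> medium priors
    13: priors[6:9],   # small 13x13 feature map -> large priors -> large objects
}

for grid, boxes in scale_assignment.items():
    print(grid, boxes)
```

The inverse relation (coarse grid, large boxes) holds because each cell of the 13 × 13 map covers a large image region, so it is responsible for large objects.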
Specifically, the generating a first feature map, a second feature map, and a third feature map according to the first residual block, the second residual block, and the fusion network includes:
inputting the first residual block and the second residual block into the fusion network, and performing a first convolution operation to obtain the first feature map;
performing a first splicing operation on the first residual block and the result of the first convolution operation to obtain a first splicing result, and performing a second convolution operation on the first splicing result to obtain the second feature map;
and performing a second splicing operation on the second residual block and the result of the second convolution operation to obtain a second splicing result, and performing a third convolution operation on the second splicing result to obtain the third feature map.
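A shape-only sketch of this splice-and-convolve flow. All shapes and channel counts are illustrative assumptions, and a stand-in 1 × 1 projection replaces the real convolution/normalization/activation stacks:

```python
import numpy as np

def conv_op(x, out_channels):
    # Stand-in for a convolution operation: a 1x1 channel projection so the
    # shape bookkeeping is visible; real layers would also normalise/activate.
    w = np.ones((x.shape[-1], out_channels)) / x.shape[-1]
    return x @ w

# Hypothetical residual blocks and fusion output, shape (batch, H, W, C)
r1 = np.zeros((1, 13, 13, 256))       # first residual block (EfficientNet branch)
r2 = np.zeros((1, 13, 13, 256))       # second residual block (DarkNet53 branch)
fused = conv_op(np.concatenate([r1, r2], axis=-1), 512)

fmap1 = conv_op(fused, 255)                         # first convolution -> first feature map
splice1 = np.concatenate([r1, fmap1], axis=-1)      # first splicing
fmap2 = conv_op(splice1, 255)                       # second convolution -> second feature map
splice2 = np.concatenate([r2, fmap2], axis=-1)      # second splicing
fmap3 = conv_op(splice2, 255)                       # third convolution -> third feature map

print(fmap1.shape, fmap2.shape, fmap3.shape)
```

The key point the sketch shows is that each later feature map mixes a residual block from one backbone with the convolved output of the previous stage, so the three maps draw on different networks.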
The first convolution operation, the second convolution operation and the third convolution operation each consist of three convolution sub-operations: the first sub-operation includes 5 convolution layers, 1 normalization layer and 1 activation layer; the second sub-operation includes 1 convolution layer, 1 normalization layer and 1 activation layer; and the third sub-operation includes 1 convolution layer.
The splicing operation may be tensor splicing (concatenation); tensor splicing increases the number of features without increasing the information of each individual feature.
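A small NumPy illustration of tensor splicing, with hypothetical shapes: concatenating along the channel axis grows the channel count while leaving every original channel untouched:

```python
import numpy as np

a = np.zeros((1, 52, 52, 128))   # hypothetical residual-block features
b = np.zeros((1, 52, 52, 64))    # hypothetical convolution output

# Tensor splicing along the channel axis: the number of feature channels
# grows (128 + 64 = 192), but each individual channel carries exactly the
# same information as before the splice.
spliced = np.concatenate([a, b], axis=-1)
print(spliced.shape)             # (1, 52, 52, 192)
```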
And S14, outputting the detection result.
In the method flow described in fig. 1, the preprocessed image may be input into a traffic sign detection model trained in advance. When the traffic sign detection model extracts image features, a neural network fusing multiple networks is used for feature extraction, and the extracted image features are more accurate than those extracted by the original YOLOv3 network model through a single network, so that the final recognition result of the model is more accurate. Meanwhile, the original YOLOv3 network model adopts a multi-scale fusion mode for prediction, so that it can predict objects on feature maps of different scales. However, because those feature maps of different scales are obtained by performing residual connection on intermediate-state feature maps generated by the same network, the features corresponding to different scales are likely to be similar, the model cannot distinguish the features of objects of different scales, and the prediction result is inaccurate. In the present scheme, residual connection is performed with intermediate-state feature maps generated by different networks, so that the features corresponding to different scales are more clearly distinguished and the recall rate of the model is improved.
Fig. 2 is a functional block diagram of a traffic sign detecting apparatus according to a preferred embodiment of the present invention.
Referring to fig. 2, the traffic sign detection device 20 may be operated in an electronic device. The traffic sign detection device 20 may comprise a plurality of functional modules consisting of program code segments. The program code of each segment in the traffic sign detection device 20 may be stored in the memory and executed by at least one processor to perform some or all of the steps of the traffic sign detection method described in fig. 1.
In this embodiment, the traffic sign detection device 20 may be divided into a plurality of functional modules according to the functions performed by the device. The functional module may include: the device comprises an acquisition module 201, a preprocessing module 202, an input module 203 and an output module 204. The module referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in memory.
The acquiring module 201 is configured to acquire an image to be detected.
Wherein, the image to be detected can contain a traffic sign board.
The preprocessing module 202 is configured to preprocess the image to be detected to obtain a preprocessed image.
The preprocessing may be scaling or cropping the image to be detected to obtain a preprocessed image. The size of the preprocessed image may be fixed at a preset size, so that the preprocessed image meets the input requirement of the subsequent traffic sign detection model.
An input module 203, configured to input the preprocessed image into a traffic sign detection model trained in advance, to obtain a detection result, where the traffic sign detection model is obtained by improving a network structure based on a YOLOv3 network model, and the improvement of the network structure includes merging an EfficientNet network with a DarkNet53 network in the YOLOv3 network model and replacing residual connections of feature maps of different scales extracted from the DarkNet53 network and the YOLOv3 network model with residual connections of the merged network and the feature maps.
And the output module 204 is configured to output the detection result.
As an optional implementation manner, the obtaining module 201 is further configured to obtain a preset YOLOv3 network model before obtaining an image to be detected;
the traffic sign detection device further includes:
the fusion module is used for fusing the EfficientNet network with a DarkNet53 network in the YOLOv3 network model to obtain a fusion network;
a replacing module, configured to replace residual connections of the DarkNet53 network and feature maps of different scales extracted in the YOLOv3 network model with residual connections of the fusion network and the feature maps, so as to obtain an initial model;
the obtaining module 201 is further configured to obtain a preset training set;
and the training module is used for training the initial model by using the preset training set to obtain a trained traffic sign board detection model.
The YOLOv3 network model is the third version of the YOLO (You Only Look Once) series of target detection algorithms and predicts in a multi-scale fusion manner. YOLOv3 is a typical one-stage, regression-based deep learning target detection network: using the regression idea, for a given input image it directly regresses a target frame and a target category at multiple positions of the image. Compared with YOLOv2, YOLOv3 makes several applicability improvements: drawing on ResNet (the residual network), it provides the stronger Darknet53 network for feature extraction, and it performs target detection in a multi-scale fusion manner, so the YOLOv3 network model has excellent detection performance.
The EfficientNet network is a network for model compound scaling, which balances the three dimensions of network resolution, network depth and network width to optimize the network for both efficiency and accuracy.
Wherein, the Darknet53 network is a convolutional network basically composed of 1 × 1 and 3 × 3 convolutions, with 53 convolutional layers in the network. It can be used to extract the feature information of a picture.
Wherein, the residual connection may refer to expressing the output as a linear superposition of the input and one non-linear transformation of the input.
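A minimal numerical sketch of this definition, y = x + F(x); the choice of ReLU as the non-linear transformation here is purely illustrative:

```python
import numpy as np

def residual_connect(x: np.ndarray, transform) -> np.ndarray:
    """Residual connection: the output is the linear superposition of the
    input and one non-linear transformation F of the input, y = x + F(x)."""
    return x + transform(x)

x = np.array([1.0, -2.0, 3.0])
y = residual_connect(x, lambda v: np.maximum(v, 0.0))  # F = ReLU (illustrative)
print(y)   # [ 2. -2.  6.]
```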
In this optional embodiment, an EfficientNet network may be fused with the DarkNet53 network in the preset YOLOv3 network model to obtain a fusion network; the residual connections of the DarkNet53 network and the feature maps of different scales extracted in the YOLOv3 network model are replaced with residual connections of the fusion network and the feature maps to obtain an initial model; and the initial model is then trained with the preset training set to obtain the trained traffic sign detection model.
As an optional implementation manner, the way in which the fusion module fuses the EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain the fusion network is specifically:
carrying out channel merging on the EfficientNet network and a DarkNet53 network in the YOLOv3 network model to obtain a merged network;
compressing the merged network to obtain a compressed network;
recombining the compression network through two preset fully-connected neural layers to obtain an attention network;
and carrying out attention weighting on the attention network and the merging network to obtain the fusion network.
In this optional embodiment, the EfficientNet network and the DarkNet53 network are channel-merged to obtain a merged network; the merged network is then compressed to obtain a compressed network; the compressed network is reorganized through two preset fully-connected neural layers to obtain an attention network; and the attention network and the merged network are attention-weighted to obtain the fusion network. The compression, reorganization and attention-weighting operations belong to a channel-attention mechanism, which lets the model automatically focus on important channels (important channels are weighted up and unimportant channels are weighted down). For example, assume the feature matrix of a backbone1 network has size 16 × 128 × 16 and that of a backbone2 network has size 16 × 128 × 32. After the backbone1 network and the backbone2 network perform the channel merging (concat) operation, the merged network matrix has size 16 × 128 × 48. Squeeze compression then yields a compressed network matrix of size 16 × 48, and the two fully-connected neural layers yield an attention network matrix whose size is still 16 × 48. Finally, the 16 × 48 attention matrix and the 16 × 128 × 48 merged network matrix are multiplied element-wise, i.e. attention weighting is performed, giving a fusion network matrix of size 16 × 128 × 48.
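The concat → squeeze → two-fully-connected-layers → weighting pipeline, with the example sizes above, can be sketched in NumPy. The random weights, the pooling axis chosen for the squeeze, and the ReLU/sigmoid activations are illustrative assumptions, not the patent's exact parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical intermediate feature maps with the sizes from the example:
# backbone1 -> 16 x 128 x 16, backbone2 -> 16 x 128 x 32 (last axis = channels).
feat1 = rng.standard_normal((16, 128, 16))
feat2 = rng.standard_normal((16, 128, 32))

# 1. Channel merge (concat) -> 16 x 128 x 48.
merged = np.concatenate([feat1, feat2], axis=-1)

# 2. Squeeze compression: average over the middle axis -> 16 x 48.
squeezed = merged.mean(axis=1)

# 3. Two fully-connected neural layers reorganize the squeezed descriptor
#    into channel-attention weights (weight matrices are placeholders).
w1 = rng.standard_normal((48, 12)) * 0.1   # bottleneck layer
w2 = rng.standard_normal((12, 48)) * 0.1   # expansion layer
relu = lambda x: np.maximum(x, 0.0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
attention = sigmoid(relu(squeezed @ w1) @ w2)          # 16 x 48

# 4. Attention weighting: element-level multiply, broadcasting the 16 x 48
#    weights across the middle axis of the 16 x 128 x 48 merged map.
fused = merged * attention[:, None, :]

print(merged.shape, squeezed.shape, attention.shape, fused.shape)
# (16, 128, 48) (16, 48) (16, 48) (16, 128, 48)
```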
As an optional implementation manner, the input module 203 inputs the preprocessed image into a traffic sign detection model trained in advance, and the manner of obtaining the detection result specifically includes:
inputting the preprocessed image into a pre-trained traffic sign detection model, acquiring a first residual block of the preprocessed image through the EfficientNet network, and acquiring a second residual block of the preprocessed image through the DarkNet53 network;
generating a first feature map, a second feature map and a third feature map according to the first residual block, the second residual block and the fusion network;
and generating the detection result according to the first feature map, the second feature map and the third feature map.
The first residual block may be a result of an operation of the intermediate layer of the EfficientNet network on the preprocessed image, and the second residual block may be a result of an operation of the intermediate layer of the DarkNet53 network on the preprocessed image.
In this optional embodiment, the preprocessed image may be input into a pre-trained traffic sign detection model, a first residual block of the preprocessed image is obtained through the EfficientNet network, a second residual block of the preprocessed image is obtained through the DarkNet53 network, a first feature map, a second feature map and a third feature map are generated according to the first residual block, the second residual block and the fusion network, and the detection result is generated according to the first feature map, the second feature map and the third feature map.
As an optional implementation manner, the manner of generating the detection result by the input module 203 according to the first feature map, the second feature map, and the third feature map specifically is:
predicting the first characteristic diagram, the second characteristic diagram and the third characteristic diagram through a plurality of prior frames obtained by preset clustering to obtain the coordinates of the traffic sign board;
and determining the coordinates of the traffic sign board as the detection result.
Wherein, the YOLOv3 network model presets three prior frames for the feature map of each scale, so nine prior frames are obtained by clustering in total. For example, assume the 9 prior frames are: (10x13), (16x30), (33x23), (30x61), (62x45), (59x119), (116x90), (156x198), (373x326). The larger prior frames (116x90), (156x198), (373x326) are applied on the smallest, 13 x 13 feature map and are suitable for detecting larger objects; the medium prior frames (30x61), (62x45), (59x119) are applied on the medium 26 x 26 feature map and are suitable for detecting medium-sized objects; and the smaller prior frames (10x13), (16x30), (33x23) are applied on the larger 52 x 52 feature map and are suitable for detecting smaller objects.
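The scale-to-prior assignment described above can be written out directly; this is a plain illustration of the mapping, not code from the patent:

```python
# The nine clustered prior frames (w, h) from the example, smallest to largest.
priors = [(10, 13), (16, 30), (33, 23),       # small
          (30, 61), (62, 45), (59, 119),      # medium
          (116, 90), (156, 198), (373, 326)]  # large

# Three priors per prediction scale: large priors on the coarse 13x13 map
# (large objects), medium priors on the 26x26 map, small priors on the
# fine 52x52 map (small objects).
priors_for_scale = {13: priors[6:9], 26: priors[3:6], 52: priors[0:3]}

print(priors_for_scale[13])   # [(116, 90), (156, 198), (373, 326)]
```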
As an optional implementation manner, the manner of generating the first feature map, the second feature map, and the third feature map by the input module 203 according to the first residual block, the second residual block, and the fusion network specifically is:
inputting the first residual block and the second residual block into the fusion network, and performing a first convolution operation to obtain the first feature map;
performing a first splicing operation on the first residual block and the first convolution operation to obtain a first splicing result, and performing a second convolution operation on the first splicing result to obtain a second feature map;
and performing second splicing operation on the second residual block and a result obtained by performing second convolution operation on the first splicing result to obtain a second splicing result, and performing third convolution operation on the second splicing result to obtain a third feature map.
The first convolution operation, the second convolution operation and the third convolution operation each comprise a first sub-convolution, a second sub-convolution and a third sub-convolution, where the first sub-convolution comprises 5 convolution layers, 1 normalization layer and 1 activation layer, the second sub-convolution comprises 1 convolution layer, 1 normalization layer and 1 activation layer, and the third sub-convolution comprises 1 convolution layer.
The splicing operation may be tensor concatenation; tensor concatenation increases the number of features without increasing the information of each individual feature.
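A one-line NumPy illustration of what the splicing operation does (the shapes are arbitrary examples): the channel counts add up, while each original feature value is left unchanged.

```python
import numpy as np

a = np.ones((13, 13, 8))    # 8 feature channels
b = np.ones((13, 13, 16))   # 16 feature channels

# Tensor splicing (concatenation) along the channel axis.
c = np.concatenate([a, b], axis=-1)
print(c.shape)   # (13, 13, 24)
```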
In the traffic sign detection apparatus 20 depicted in fig. 2, the preprocessed image may be input into a traffic sign detection model trained in advance. When extracting image features, the traffic sign detection model uses a neural network that fuses multiple networks, so the extracted image features are more accurate than those of the original YOLOv3 network model, which uses a single network, and the final recognition result obtained by the model is therefore more accurate. Meanwhile, although the original YOLOv3 network model predicts in a multi-scale fusion manner and can predict objects on feature maps of different scales, those feature maps are obtained by residual connection with intermediate-state feature maps generated by the same network, which may make the features corresponding to different scales too similar. The apparatus instead performs residual connection with intermediate-state feature maps generated by different networks, so that the features corresponding to different scales are more clearly distinguished and the recall rate of the model is improved.
As shown in fig. 3, fig. 3 is a schematic structural diagram of an electronic device for implementing a traffic sign detection method according to a preferred embodiment of the present invention. The electronic device 3 comprises a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
Those skilled in the art will appreciate that the schematic diagram shown in fig. 3 is merely an example of the electronic device 3 and does not constitute a limitation of it; the electronic device 3 may include more or fewer components than those shown, combine some components, or use different components. For example, the electronic device 3 may further include an input/output device, a network access device, and the like.
The electronic device 3 may be, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, and the like. The network where the electronic device 3 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
The at least one Processor 32 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a transistor logic device, a discrete hardware component, etc. The processor 32 may be a microprocessor or the processor 32 may be any conventional processor or the like, and the processor 32 is a control center of the electronic device 3 and connects various parts of the whole electronic device 3 by various interfaces and lines.
The memory 31 may be used to store the computer program 33 and/or the module/unit, and the processor 32 may implement various functions of the electronic device 3 by running or executing the computer program and/or the module/unit stored in the memory 31 and calling data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the electronic device 3, and the like. In addition, the memory 31 may include volatile and non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one disk storage device, a Flash memory device, and so forth.
With reference to fig. 1, the memory 31 in the electronic device 3 stores a plurality of instructions to implement a traffic sign detection method, and the processor 32 executes the plurality of instructions to implement:
acquiring an image to be detected;
preprocessing the image to be detected to obtain a preprocessed image;
inputting the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, wherein the traffic sign detection model is obtained by improving a network structure based on a YOLOv3 network model, and the improvement of the network structure comprises the steps of fusing an EfficientNet network with a DarkNet53 network in the YOLOv3 network model and replacing residual connection of feature maps of different scales extracted from the DarkNet53 network and the YOLOv3 network model with residual connection of the fusion network and the feature maps;
and outputting the detection result.
Specifically, the processor 32 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the electronic device 3 depicted in fig. 3, the preprocessed image may be input into a traffic sign detection model trained in advance. When the traffic sign detection model extracts image features, it uses a neural network that fuses multiple networks, so the extracted image features are more accurate than those of the original YOLOv3 network model, which uses a single network, and the final recognition result obtained by the model is therefore more accurate. Meanwhile, although the original YOLOv3 network model predicts in a multi-scale fusion manner and can predict objects on feature maps of different scales, those feature maps are obtained by residual connection with intermediate-state feature maps generated by the same network. This may make the features corresponding to different scales too similar, so that the model cannot distinguish the features of different scales from each other and the prediction result is inaccurate. The electronic device instead performs residual connection with intermediate-state feature maps generated by different networks, so that the features corresponding to different scales are more clearly distinguished and the recall rate of the model is improved.
The integrated modules/units of the electronic device 3 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program code may be in source code form, object code form, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), etc.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from its spirit or essential attributes. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (10)
1. A traffic sign detection method, characterized by comprising:
acquiring an image to be detected;
preprocessing the image to be detected to obtain a preprocessed image;
inputting the preprocessed image into a traffic sign detection model trained in advance to obtain a detection result, wherein the traffic sign detection model is obtained by improving a network structure based on a YOLOv3 network model, and the improvement of the network structure comprises the steps of fusing an EfficientNet network with a DarkNet53 network in the YOLOv3 network model and replacing residual connection of feature maps of different scales extracted from the DarkNet53 network and the YOLOv3 network model with residual connection of the fusion network and the feature maps;
and outputting the detection result.
2. The traffic sign detecting method according to claim 1, wherein before the image to be detected is acquired, the traffic sign detecting method further comprises:
acquiring a preset YOLOv3 network model;
fusing an EfficientNet network with a DarkNet53 network in the YOLOv3 network model to obtain a fused network;
replacing residual connection of the DarkNet53 network and feature maps of different scales extracted from the YOLOv3 network model with residual connection of the fusion network and the feature maps to obtain an initial model;
acquiring a preset training set;
and training the initial model by using the preset training set to obtain a trained traffic sign board detection model.
3. The method of claim 2, wherein the merging the EfficientNet network with the DarkNet53 network in the YOLOv3 network model to obtain a merged network comprises:
carrying out channel merging on the EfficientNet network and a DarkNet53 network in the YOLOv3 network model to obtain a merged network;
compressing the merged network to obtain a compressed network;
recombining the compression network through two preset fully-connected neural layers to obtain an attention network;
and carrying out attention weighting on the attention network and the merging network to obtain the fusion network.
4. The method as claimed in claim 2, wherein the inputting the pre-processed image into a pre-trained traffic sign detection model, and the obtaining the detection result comprises:
inputting the preprocessed image into a pre-trained traffic sign detection model, acquiring a first residual block of the preprocessed image through the EfficientNet network, and acquiring a second residual block of the preprocessed image through the DarkNet53 network;
generating a first feature map, a second feature map and a third feature map according to the first residual block, the second residual block and the fusion network;
and generating the detection result according to the first feature map, the second feature map and the third feature map.
5. The method of claim 4, wherein the generating the detection result according to the first feature map, the second feature map, and the third feature map comprises:
predicting the first characteristic diagram, the second characteristic diagram and the third characteristic diagram through a plurality of prior frames obtained by preset clustering to obtain the coordinates of the traffic sign board;
and determining the coordinates of the traffic sign board as the detection result.
6. The method of claim 4, wherein the generating a first feature map, a second feature map, and a third feature map based on the first residual block, the second residual block, and the fusion network comprises:
inputting the first residual block and the second residual block into the fusion network, and performing a first convolution operation to obtain the first feature map;
performing a first splicing operation on the first residual block and the first convolution operation to obtain a first splicing result, and performing a second convolution operation on the first splicing result to obtain a second feature map;
and performing second splicing operation on the second residual block and a result obtained by performing second convolution operation on the first splicing result to obtain a second splicing result, and performing third convolution operation on the second splicing result to obtain a third feature map.
7. The traffic sign detection method of claim 6, wherein the first convolution operation, the second convolution operation and the third convolution operation each comprise a first sub-convolution, a second sub-convolution and a third sub-convolution, the first sub-convolution comprising 5 convolution layers, 1 normalization layer and 1 activation layer, the second sub-convolution comprising 1 convolution layer, 1 normalization layer and 1 activation layer, and the third sub-convolution comprising 1 convolution layer.
8. A traffic sign detecting device, characterized in that it comprises:
the acquisition module is used for acquiring an image to be detected;
the preprocessing module is used for preprocessing the image to be detected to obtain a preprocessed image;
an input module, configured to input the preprocessed image into a traffic sign detection model trained in advance, to obtain a detection result, where the traffic sign detection model is obtained by improving a network structure based on a YOLOv3 network model, and the improvement of the network structure includes merging an EfficientNet network with a DarkNet53 network in the YOLOv3 network model and replacing residual connections of feature maps of different scales extracted from the DarkNet53 network and the YOLOv3 network model with residual connections of the merged network and the feature maps;
and the output module is used for outputting the detection result.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the traffic sign detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements a traffic sign detection method as recited in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011009444.3A CN112132032B (en) | 2020-09-23 | 2020-09-23 | Traffic sign board detection method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011009444.3A CN112132032B (en) | 2020-09-23 | 2020-09-23 | Traffic sign board detection method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112132032A true CN112132032A (en) | 2020-12-25 |
CN112132032B CN112132032B (en) | 2024-07-12 |
Family
ID=73841184
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011009444.3A Active CN112132032B (en) | 2020-09-23 | 2020-09-23 | Traffic sign board detection method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112132032B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112712036A (en) * | 2020-12-31 | 2021-04-27 | 广西综合交通大数据研究院 | Traffic sign recognition method and device, electronic equipment and computer storage medium |
CN112819756A (en) * | 2021-01-15 | 2021-05-18 | 江苏理工学院 | PCB surface defect detection device and method |
CN113076800A (en) * | 2021-03-03 | 2021-07-06 | 惠州市博实结科技有限公司 | Road sign board detection method and device |
CN113506278A (en) * | 2021-07-19 | 2021-10-15 | 同济大学 | Road sign auxiliary detection control method based on YOLO-V3 |
CN113850283A (en) * | 2021-06-16 | 2021-12-28 | 中国联合网络通信集团有限公司 | Method and device for identifying violation of RCS (Rich client System) message |
CN113963333A (en) * | 2021-11-15 | 2022-01-21 | 北京联合大学 | Traffic sign board detection method based on improved YOLOF model |
CN114170421A (en) * | 2022-02-10 | 2022-03-11 | 青岛海尔工业智能研究院有限公司 | Image detection method, device, equipment and storage medium |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109284704A (en) * | 2018-09-07 | 2019-01-29 | 中国电子科技集团公司第三十八研究所 | Complex background SAR vehicle target detection method based on CNN |
CN109447034A (en) * | 2018-11-14 | 2019-03-08 | 北京信息科技大学 | Traffic mark detection method in automatic Pilot based on YOLOv3 network |
CN109902609A (en) * | 2019-02-22 | 2019-06-18 | 淮阴工学院 | A kind of road traffic sign detection and recognition methods based on YOLOv3 |
CN110852347A (en) * | 2019-10-17 | 2020-02-28 | 天津大学 | Fire detection method using improved YOLO v3 |
WO2020048265A1 (en) * | 2018-09-06 | 2020-03-12 | 北京市商汤科技开发有限公司 | Methods and apparatuses for multi-level target classification and traffic sign detection, device and medium |
CN111079584A (en) * | 2019-12-03 | 2020-04-28 | 东华大学 | Rapid vehicle detection method based on improved YOLOv3 |
CN111191608A (en) * | 2019-12-30 | 2020-05-22 | 浙江工业大学 | Improved traffic sign detection and identification method based on YOLOv3 |
CN111274970A (en) * | 2020-01-21 | 2020-06-12 | 南京航空航天大学 | Traffic sign detection method based on improved YOLO v3 algorithm |
US20200193244A1 (en) * | 2018-12-14 | 2020-06-18 | Industrial Technology Research Institute | Neural network model fusion method and electronic device using the same |
CN111415338A (en) * | 2020-03-16 | 2020-07-14 | 城云科技(中国)有限公司 | Method and system for constructing target detection model |
CN111428550A (en) * | 2019-11-29 | 2020-07-17 | 长沙理工大学 | Vehicle detection method based on improved YO L Ov3 |
CN111460968A (en) * | 2020-03-27 | 2020-07-28 | 上海大学 | Video-based unmanned aerial vehicle identification and tracking method and device |
CN111553406A (en) * | 2020-04-24 | 2020-08-18 | 上海锘科智能科技有限公司 | Target detection system, method and terminal based on improved YOLO-V3 |
CN111738212A (en) * | 2020-07-20 | 2020-10-02 | 平安国际智慧城市科技股份有限公司 | Traffic signal lamp identification method, device, equipment and medium based on artificial intelligence |
2020-09-23: application CN202011009444.3A filed in China; granted as CN112132032B (status: Active)
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020048265A1 (en) * | 2018-09-06 | 2020-03-12 | 北京市商汤科技开发有限公司 | Methods and apparatuses for multi-level target classification and traffic sign detection, device and medium |
CN109284704A (en) * | 2018-09-07 | 2019-01-29 | 中国电子科技集团公司第三十八研究所 | Complex background SAR vehicle target detection method based on CNN |
CN109447034A (en) * | 2018-11-14 | 2019-03-08 | 北京信息科技大学 | Traffic mark detection method in automatic Pilot based on YOLOv3 network |
US20200193244A1 (en) * | 2018-12-14 | 2020-06-18 | Industrial Technology Research Institute | Neural network model fusion method and electronic device using the same |
CN109902609A (en) * | 2019-02-22 | 2019-06-18 | 淮阴工学院 | A kind of road traffic sign detection and recognition methods based on YOLOv3 |
CN110852347A (en) * | 2019-10-17 | 2020-02-28 | 天津大学 | Fire detection method using improved YOLO v3 |
CN111428550A (en) * | 2019-11-29 | 2020-07-17 | 长沙理工大学 | Vehicle detection method based on improved YO L Ov3 |
CN111079584A (en) * | 2019-12-03 | 2020-04-28 | 东华大学 | Rapid vehicle detection method based on improved YOLOv3 |
CN111191608A (en) * | 2019-12-30 | 2020-05-22 | 浙江工业大学 | Improved traffic sign detection and identification method based on YOLOv3 |
CN111274970A (en) * | 2020-01-21 | 2020-06-12 | 南京航空航天大学 | Traffic sign detection method based on improved YOLO v3 algorithm |
CN111415338A (en) * | 2020-03-16 | 2020-07-14 | 城云科技(中国)有限公司 | Method and system for constructing target detection model |
CN111460968A (en) * | 2020-03-27 | 2020-07-28 | 上海大学 | Video-based unmanned aerial vehicle identification and tracking method and device |
CN111553406A (en) * | 2020-04-24 | 2020-08-18 | 上海锘科智能科技有限公司 | Target detection system, method and terminal based on improved YOLO-V3 |
CN111738212A (en) * | 2020-07-20 | 2020-10-02 | 平安国际智慧城市科技股份有限公司 | Traffic signal lamp identification method, device, equipment and medium based on artificial intelligence |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112712036A (en) * | 2020-12-31 | 2021-04-27 | 广西综合交通大数据研究院 | Traffic sign recognition method and device, electronic equipment and computer storage medium |
CN112819756A (en) * | 2021-01-15 | 2021-05-18 | 江苏理工学院 | PCB surface defect detection device and method |
CN112819756B (en) * | 2021-01-15 | 2023-07-11 | 江苏理工学院 | PCB surface defect detection device and method |
CN113076800A (en) * | 2021-03-03 | 2021-07-06 | 惠州市博实结科技有限公司 | Road sign board detection method and device |
CN113850283A (en) * | 2021-06-16 | 2021-12-28 | 中国联合网络通信集团有限公司 | Method and device for identifying violations in RCS (Rich Communication Services) messages |
CN113506278A (en) * | 2021-07-19 | 2021-10-15 | 同济大学 | Road sign auxiliary detection control method based on YOLO-V3 |
CN113506278B (en) * | 2021-07-19 | 2022-06-14 | 同济大学 | Road sign auxiliary detection control method based on YOLO-V3 |
CN113963333A (en) * | 2021-11-15 | 2022-01-21 | 北京联合大学 | Traffic sign board detection method based on improved YOLOF model |
CN113963333B (en) * | 2021-11-15 | 2022-05-31 | 北京联合大学 | Traffic sign board detection method based on improved YOLOF model |
CN114170421A (en) * | 2022-02-10 | 2022-03-11 | 青岛海尔工业智能研究院有限公司 | Image detection method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112132032B (en) | 2024-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112132032B (en) | Traffic sign board detection method and device, electronic equipment and storage medium | |
CN111104962A (en) | Semantic segmentation method and device for image, electronic equipment and readable storage medium | |
CN113792871B (en) | Neural network training method, target identification device and electronic equipment | |
CN111832440B (en) | Face feature extraction model construction method, computer storage medium and equipment | |
CN112257578B (en) | Face key point detection method and device, electronic equipment and storage medium | |
JP2023527100A (en) | Event detection method, device, electronic device, readable storage medium, and computer program | |
CN107832794A (en) | Convolutional neural network generation method, vehicle model recognition method and computing device | |
CN112232203B (en) | Pedestrian recognition method and device, electronic equipment and storage medium | |
CN110689046A (en) | Image recognition method, image recognition device, computer device, and storage medium | |
CN112132033B (en) | Vehicle type recognition method and device, electronic equipment and storage medium | |
CN113536856A (en) | Image recognition method and system, and data processing method | |
CN115331048A (en) | Image classification method, device, equipment and storage medium | |
CN114943937A (en) | Pedestrian re-identification method and device, storage medium and electronic equipment | |
CN113705468B (en) | Digital image recognition method based on artificial intelligence and related equipment | |
CN118097341A (en) | Target detection method, model training method and related device | |
CN112132037B (en) | Pavement detection method, device, equipment and medium based on artificial intelligence | |
CN112396060B (en) | Identification card recognition method based on an identification card segmentation model, and related device | |
CN117132950A (en) | Vehicle tracking method, system, equipment and storage medium | |
CN112036439A (en) | Dependency relationship classification method and related equipment | |
CN116468043A (en) | Nested entity identification method, device, equipment and storage medium | |
CN113139490B (en) | Image feature matching method and device, computer equipment and storage medium | |
CN112132031B (en) | Vehicle style identification method and device, electronic equipment and storage medium | |
CN112016503B (en) | Pavement detection method, device, computer equipment and storage medium | |
CN114385878A (en) | Visual display method and device for government affair data and terminal equipment | |
CN114764890A (en) | Pedestrian passageway environment assessment method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||