CN111274942A - Traffic cone identification method and device based on cascade network - Google Patents

Traffic cone identification method and device based on cascade network

Info

Publication number
CN111274942A
CN111274942A
Authority
CN
China
Prior art keywords
traffic cone
network
traffic
model
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010060114.0A
Other languages
Chinese (zh)
Inventor
王鹤
李润泽
王博
高嵩
徐月云
刘洋
张�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoqi Beijing Intelligent Network Association Automotive Research Institute Co ltd
Original Assignee
Guoqi Beijing Intelligent Network Association Automotive Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoqi Beijing Intelligent Network Association Automotive Research Institute Co ltd filed Critical Guoqi Beijing Intelligent Network Association Automotive Research Institute Co ltd
Priority to CN202010060114.0A priority Critical patent/CN111274942A/en
Publication of CN111274942A publication Critical patent/CN111274942A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a traffic cone identification method and device based on a cascade network. The traffic cone identification method based on the cascade network comprises the following steps: acquiring a compression and activation network SENet and a dense convolutional network DenseNet; determining a target network structure based on the SENet, the DenseNet and a preset target detection model; and training the target network structure based on a plurality of original traffic cone scene images containing traffic cones to obtain a traffic cone recognition model. The embodiment of the invention can provide a traffic cone recognition model with higher recognition accuracy, so that traffic cones can be recognized more accurately.

Description

Traffic cone identification method and device based on cascade network
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a traffic cone identification method and device based on a cascade network, electronic equipment and a computer storage medium.
Background
In recent years, unmanned driving technology has gradually matured, and safety problems are highly emphasized by governments, scientific research institutions and automobile manufacturers. The traffic cone, as one of the important traffic signs, is usually placed on a road or a sidewalk to temporarily change traffic directions, block off areas, or warn of roadside construction, lane accidents and the like, and plays a vital role in ensuring safe driving. It is therefore particularly important to design a traffic cone identification and detection system with good real-time performance and high accuracy, which can help reduce traffic accidents.
Because traffic cones on roads or sidewalks are temporary and movable, they cannot be marked on a high-definition map and must instead be identified by vehicle-mounted sensors. For traffic cone detection and identification, the currently adopted approach is traditional image processing: the complex background is removed through image processing, and the traffic cones are extracted and detected through morphological processing.
Compared with traditional image processing methods, a target detection method based on deep learning can independently learn features at different levels, and the richer the learned features are, the higher the accuracy. Such methods mainly use a preset target detection model (for example, a Cascade R-CNN model) to perform target recognition, but the target detection model is relatively simple, which limits the accuracy of target recognition.
Therefore, how to provide a traffic cone recognition model with higher recognition accuracy, so that traffic cones can be recognized more accurately, is a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The embodiments of the invention provide a traffic cone identification method and device based on a cascade network, an electronic device, and a computer storage medium, which can provide a traffic cone recognition model with higher recognition accuracy so that traffic cones can be recognized more accurately.
In a first aspect, a traffic cone identification method based on a cascade network is provided, which includes:
acquiring a compression and activation network SEnet and a dense convolution network DenseNet;
determining a target network structure based on the SENEt, the DenseNet and a preset target detection model;
and training the target network structure based on a plurality of original traffic cone scene images containing traffic cones to obtain a traffic cone recognition model.
Optionally, determining the target network structure based on the SENet, the DenseNet and a preset target detection model includes:
determining a target convolutional layer in the DenseNet; wherein, the target convolution layer outputs a target characteristic diagram;
adding SENEt after each target convolution layer in the DenseNet to obtain a backbone network;
and replacing the original backbone network in the target detection model by using the backbone network to obtain a target network structure.
Optionally, training the target network structure based on a plurality of original traffic cone scene images including traffic cones to obtain a traffic cone recognition model includes:
acquiring a plurality of original traffic cone scene images;
performing data enhancement on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image;
establishing a label file corresponding to each first traffic cone scene image to obtain a traffic cone scene image data set;
dividing a traffic cone scene image data set to obtain a training set, a verification set and a test set;
and determining a traffic cone recognition model based on the training set, the verification set and the target network structure.
Optionally, performing data enhancement on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image, including:
and performing at least one of rotation, flipping, contrast enhancement, cropping, brightness adjustment and affine transformation on the original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to that original traffic cone scene image.
Optionally, determining a traffic cone recognition model based on the training set, the validation set, and the target network structure includes:
training a target network structure by using a training set to obtain an initial traffic cone recognition model;
determining the accuracy and/or loss value of the initial traffic cone identification model by using the verification set;
and when the accuracy is greater than the accuracy threshold and/or the loss value is less than the loss value threshold, determining the initial traffic cone identification model as the traffic cone identification model.
Optionally, after determining that the initial traffic cone identification model is the traffic cone identification model, the method further includes:
and verifying the accuracy and the recognition speed of the traffic cone recognition model by using the test set.
In a second aspect, a method of using a traffic cone recognition model is provided, where the traffic cone recognition model is obtained by the cascade network-based traffic cone identification method of the first aspect, and the method includes:
acquiring an image to be identified;
inputting the image to be recognized into the traffic cone recognition model, and outputting a recognition result; the recognition result indicates either that the image to be recognized contains a traffic cone, together with the position of the traffic cone in the image, or that the image contains no traffic cone.
In a third aspect, a device of a traffic cone identification method based on a cascade network is provided, which includes:
the acquisition module is used for acquiring a compression and activation network SEnet and a dense convolution network DenseNet;
the determining module is used for determining a target network structure based on the SEnet, the DenseNet and a preset target detection model;
and the training module is used for training the target network structure based on a plurality of original traffic cone scene images containing traffic cones to obtain a traffic cone recognition model.
Optionally, the determining module is configured to determine a target convolutional layer in the DenseNet; wherein, the target convolution layer outputs a target characteristic diagram; adding SENEt after each target convolution layer in the DenseNet to obtain a backbone network; and replacing the original backbone network in the target detection model by using the backbone network to obtain a target network structure.
Optionally, the training module is configured to acquire a plurality of original traffic cone scene images; performing data enhancement on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image; establishing a label file corresponding to each first traffic cone scene image to obtain a traffic cone scene image data set; dividing a traffic cone scene image data set to obtain a training set, a verification set and a test set; and determining a traffic cone recognition model based on the training set, the verification set and the target network structure.
Optionally, the training module is configured to perform at least one of rotation, flipping, contrast enhancement, clipping, brightness adjustment, and affine transformation on any one of the original traffic cone scene images to obtain at least one first traffic cone scene image corresponding to any one of the original traffic cone scene images.
Optionally, the training module is configured to train the target network structure by using a training set to obtain an initial traffic cone recognition model; determining the accuracy and/or loss value of the initial traffic cone identification model by using the verification set; and when the accuracy is greater than the accuracy threshold and/or the loss value is less than the loss value threshold, determining the initial traffic cone identification model as the traffic cone identification model.
Optionally, the training module is further configured to verify the accuracy and the recognition speed of the traffic cone recognition model by using the test set.
In a fourth aspect, a device based on a traffic cone identification model obtained by using the cascade network-based traffic cone identification method of the first aspect is provided, and the device includes:
the acquisition module is used for acquiring an image to be identified;
the output module is used for inputting the image to be recognized into the traffic cone recognition model and outputting a recognition result; the recognition result indicates either that the image to be recognized contains a traffic cone, together with the position of the traffic cone in the image, or that the image contains no traffic cone.
In a fifth aspect, an electronic device is provided, the electronic device comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the cascade network-based traffic cone identification method of the first aspect.
In a sixth aspect, a computer storage medium is provided, on which computer program instructions are stored, which, when executed by a processor, implement the cascade network-based traffic cone identification method of the first aspect.
The traffic cone identification method and device, the electronic device, and the computer storage medium based on the cascade network can provide a traffic cone recognition model with higher recognition accuracy, so that traffic cones can be recognized more accurately. The traffic cone recognition method based on the cascade network determines a target network structure based on SENet, DenseNet and a preset target detection model. SENet models the correlation among feature map channels, enhancing features related to the preset target while weakening irrelevant features, thereby realizing feature recalibration and solving the problem that a large amount of redundant information exists in the features extracted by existing target detection models. The traffic cone recognition model obtained by training the target network structure on a plurality of original traffic cone scene images containing traffic cones can therefore recognize traffic cones more accurately.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments are briefly described below; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a traffic cone identification method based on a cascade network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a SENET network structure provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a connection between a SE-Dense Block module and a SE-Transition module according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of an improved Cascade R-CNN algorithm provided in the embodiments of the present invention;
FIG. 5 is a flow chart of another traffic cone identification method based on a cascade network according to an embodiment of the present invention;
FIG. 6 is a flow chart illustrating a method for using a traffic cone recognition model according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an apparatus of a traffic cone identification method based on a cascade network according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an apparatus based on a traffic cone identification model using method according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
For traffic cone detection and identification, the currently adopted approach is traditional image processing: the complex background is removed through image processing, and the traffic cones are extracted and detected through morphological processing. Compared with traditional image processing methods, a target detection method based on deep learning can independently learn features at different levels, and the richer the learned features are, the higher the accuracy. Such methods mainly use a preset target detection model (for example, a Cascade R-CNN model) to perform target recognition, but the target detection model is relatively simple, which limits the accuracy of target recognition.
In order to solve the problems in the prior art, embodiments of the present invention provide a method, an apparatus, an electronic device, and a computer storage medium for identifying a traffic cone based on a cascade network. First, a traffic cone identification method based on a cascade network provided by the embodiment of the invention is described below.
Fig. 1 is a schematic flow chart of a traffic cone identification method based on a cascade network according to an embodiment of the present invention. As shown in fig. 1, the method for identifying a traffic cone based on a cascade network includes:
s101, acquiring a compression and activation network SEnet and a dense convolution network DenseNet.
S102, determining a target network structure based on the SENET, the DenseNet and a preset target detection model.
The compression and activation network (SENet) is not a complete network structure but a substructure that can be embedded into other classification or detection models. Its core idea is to learn feature weights through the network according to a loss function, enhancing target-related features while compressing irrelevant ones. Essentially, it introduces dynamic adaptability to the input image, which helps strengthen the discriminative power of the features and realizes feature recalibration. Feature recalibration here means changing the feature values output by the network with differentiated weights, so as to markedly highlight the effective features.
The structure of SENet is shown in fig. 2, where H, W and C in fig. 2 denote height, width and channel respectively, so H × W × C denotes height × width × channel. For an input feature U, the first step is the squeeze operation F_sq(·): the SENet structure compresses the feature along the spatial dimensions by global pooling, turning each two-dimensional feature channel into a single real number. This real number has, to some extent, a global receptive field, and the output dimension matches the number of input feature channels. It characterizes the global distribution of responses over the feature channels and gives even layers close to the input a global receptive field, which is very useful in many tasks. Next is the excitation operation F_ex(·, W), inspired by the gate mechanism of Long Short-Term Memory (LSTM) networks, in which parameters W are learned to explicitly model the correlation between feature channels. Finally, the feature weights are reset: the output of the excitation operation is taken as the importance of each feature channel after feature selection, and the scale operation F_scale(·, ·) multiplies these weights channel by channel onto the previous feature to obtain the recalibrated feature X̃, completing the recalibration of the original feature in the channel dimension.
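The squeeze, excitation and scale operations described above can be sketched in a few lines of NumPy. This is an illustration only: the two fully connected weight matrices W1/W2 and the reduction ratio r are assumptions, since the patent does not specify them.

```python
import numpy as np

def se_block(U, W1, W2):
    """Sketch of the SENet recalibration: squeeze -> excitation -> scale.

    U  : input feature map of shape (H, W, C)
    W1 : excitation weights, shape (C, C // r)  -- bottleneck FC layer
    W2 : excitation weights, shape (C // r, C)  -- expansion FC layer
    """
    # Squeeze F_sq: global average pooling collapses each HxW channel
    # into a single real number with a global receptive field.
    z = U.mean(axis=(0, 1))                      # shape (C,)
    # Excitation F_ex(., W): two FC layers with ReLU then a sigmoid gate,
    # explicitly modelling the correlation between feature channels.
    s = np.maximum(z @ W1, 0.0)                  # ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ W2)))          # sigmoid, shape (C,)
    # Scale F_scale: channel-by-channel reweighting of the original feature.
    return U * s                                 # broadcasts over H and W

rng = np.random.default_rng(0)
H, W, C, r = 8, 8, 16, 4
U = rng.standard_normal((H, W, C))
W1 = rng.standard_normal((C, C // r)) * 0.1
W2 = rng.standard_normal((C // r, C)) * 0.1
X_tilde = se_block(U, W1, W2)                    # the recalibrated feature
```

Because the sigmoid gate lies strictly between 0 and 1, every channel of the output is a damped copy of the input channel, which is exactly the "importance weighting" the text describes.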
In order to determine a target network structure with higher accuracy, in one embodiment, the determining the target network structure based on the SENet, the DenseNet and a preset target detection model may generally include: determining a target convolutional layer in the DenseNet; wherein, the target convolution layer outputs a target characteristic diagram; adding SENEt after each target convolution layer in the DenseNet to obtain a backbone network; and replacing the original backbone network in the target detection model by using the backbone network to obtain a target network structure.
Alternatively, in one embodiment, the densnet may be embodied as densnet-169, with SENet added after each target convolutional layer in densnet-169, resulting in a backbone network (i.e., SE-densnet-169 network). The SENet can make up the deficiency of the DenseNet-169 in the feature superposition multiplexing in the dimension of a channel (channel), the characterization capability of a model is improved by modeling the channel correlation among the convolutional layer features through display, the feature recalibration is realized by selectively enhancing the target related features and simultaneously compressing the non-related features by using global information, so that the network potential of the DenseNet-169 is further mined, and the feature extraction effect is improved.
In one embodiment, the SE-DenseNet-169 network adopts a "SE-Dense Block + SE-Transition" structure, comprising 4 SE-Dense Blocks connected by SE-Transition modules. The network structure of the SE-DenseNet-169 network is shown in table 1:
TABLE 1
[Table 1, giving the layer-by-layer configuration of the SE-DenseNet-169 network, is reproduced only as an image in the original publication.]
As shown in FIG. 3, FIG. 3 details the SE-Dense Block and SE-Transition modules. The SE-Dense Block module adopts the ideas of feature multiplexing and shortcut connections, connecting all the convolutional layers with one another.
The four SE-Dense Block modules of the SE-DenseNet-169 network have 6, 12, 32 and 32 layers respectively, i.e. 6, 12, 32 and 32 repetitions of the "BN-ReLU-1x1 Conv-BN-ReLU-3x3 Conv" structure. Here BN (Batch Normalization) denotes normalization, and ReLU (Rectified Linear Unit) denotes the activation function.
The input and output of each part inside the SE-Dense Block and the SE-Transition Block are feature maps (feature maps), and the input and output between the SE-Dense Block and the SE-Transition Block are also feature maps (feature maps).
The input of each layer is the feature maps of all previous layers spliced (Concat, see C in fig. 3) along the channel dimension; to make this splicing operation possible, the spatial size of the feature map at every layer must be the same. A SENet structure is added between two layers: according to the channel correlation between the convolutional-layer features, it adds a weight to each channel of the feature map, selectively enhancing useful (related) features with global information while compressing useless (unrelated) features, thereby realizing feature recalibration.
A SENet structure is also added to the SE-Transition module. The SE-Transition module mainly performs feature dimension reduction, preventing the spliced features from growing without bound and effectively suppressing overfitting. By making the output channels of the 1x1 convolutional layer (BN-ReLU-1x1 Conv) in the SE-Transition module half its input channels, the number of channels passed to the next SE-Dense Block module is halved. After the last SE-Dense Block module, a global average pooling layer and a normalized exponential function (softmax) classification layer are concatenated, where the average pooling layer changes the spatial size of the feature map.
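The channel arithmetic implied by this layout can be traced directly. The block depths (6, 12, 32, 32) and the channel-halving transitions come from the text; the growth rate of 32 and the 64-channel stem are the standard DenseNet-169 values and are assumed here, since the patent does not state them.

```python
# Channel bookkeeping for the "SE-Dense Block + SE-Transition" layout.
GROWTH_RATE = 32      # assumed: standard DenseNet-169 growth rate
STEM_CHANNELS = 64    # assumed: standard DenseNet-169 stem width
BLOCK_DEPTHS = (6, 12, 32, 32)  # from the text

def channel_trace(depths, growth=GROWTH_RATE, stem=STEM_CHANNELS):
    trace = []
    c = stem
    for i, n_layers in enumerate(depths):
        # Each layer concatenates `growth` new channels onto all previous ones.
        c += n_layers * growth
        trace.append(("block%d_out" % (i + 1), c))
        if i < len(depths) - 1:
            # SE-Transition: the 1x1 conv halves the channel count.
            c //= 2
            trace.append(("transition%d_out" % (i + 1), c))
    return trace

for name, channels in channel_trace(BLOCK_DEPTHS):
    print(name, channels)
```

Under these assumptions the final block emits 1664 channels, matching the published DenseNet-169 feature width, which is consistent with SE-DenseNet-169 changing only the channel weighting, not the channel counts.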
In one embodiment, the SE-DenseNet-169 network is used for replacing a ResNet50 network in a Cascade R-CNN model, the obtained target network structure is an improved Cascade R-CNN network structure, and the SE-DenseNet-169 network is a backbone network (backbone) in the improved Cascade R-CNN network structure.
In an embodiment, the improved Cascade R-CNN algorithm flow corresponding to the improved Cascade R-CNN network structure is shown in fig. 4. The backbone network of the improved Cascade R-CNN end-to-end cascade detection network is the SE-DenseNet-169 network. An image is input into the SE-DenseNet-169 network, which outputs a feature map; this feature map is passed through a Feature Pyramid Network (FPN), whose output is fed into a Region Proposal Network (RPN), which outputs scored candidate boxes (proposals), i.e. the target candidate regions. Then, 2000 target candidate regions are selected by non-maximum suppression, and the generated target candidate regions are placed into a Region of Interest (ROI) pooling layer, which performs region pooling on the feature map output by the feature mapping layer to generate region features of fixed size; that is, each ROI is mapped to the corresponding position of the feature map, and feature maps of the same size are output.
The target candidate regions passing through the ROI pooling layer are put into a fully connected layer, then into the subsequent normalized exponential function (softmax) classification layer and bounding box (bbox) regression layer, which perform target classification and target bounding box regression correction. In fig. 4, C0, C1, C2 and C3 respectively represent softmax target classification at different stages, and B1, B2 and B3 respectively represent bounding box regression at different stages.
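The fixed-size region pooling step above can be sketched minimally in NumPy: crop the ROI from the feature map and max-pool it into a fixed grid, so every candidate region yields a feature of the same size regardless of its shape. The 7×7 output size is an assumption (a common choice for ROI pooling), not a value stated in the patent.

```python
import numpy as np

def roi_pool(feature_map, roi, out_size=7):
    """Minimal ROI pooling sketch.

    feature_map : array of shape (H, W, C)
    roi         : (x0, y0, x1, y1) in feature-map coordinates
    returns     : array of shape (out_size, out_size, C)
    """
    x0, y0, x1, y1 = roi
    crop = feature_map[y0:y1, x0:x1, :]
    h, w, c = crop.shape
    out = np.zeros((out_size, out_size, c))
    # Split the crop into an out_size x out_size grid of roughly equal cells
    # and take the per-channel maximum of each cell.
    ys = np.linspace(0, h, out_size + 1).astype(int)
    xs = np.linspace(0, w, out_size + 1).astype(int)
    for i in range(out_size):
        for j in range(out_size):
            cell = crop[ys[i]:max(ys[i + 1], ys[i] + 1),
                        xs[j]:max(xs[j + 1], xs[j] + 1), :]
            out[i, j] = cell.max(axis=(0, 1))
    return out

fmap = np.random.default_rng(1).standard_normal((50, 80, 8))
pooled = roi_pool(fmap, (10, 5, 34, 45))   # a 24x40 ROI -> 7x7x8 feature
```

Whatever rectangle the RPN proposes, the output is always the same shape, which is what lets the subsequent fully connected layer accept every candidate region.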
It is well known to those skilled in the art that a high Intersection over Union (IoU) threshold tends to yield high-quality samples, but reduces their number and thus causes problems such as overfitting. In this embodiment, three cascaded R-CNN networks are trained with different IoU thresholds; through iterative bounding box regression, the bounding box coordinates obtained by the regression of the previous detection model initialize the bounding boxes of the next detection model, and regression then continues, gradually improving the detection accuracy of the bounding boxes.
The intersection over union represents the degree of overlap between a candidate box and the original ground-truth box, i.e. the ratio of the intersection to the union of the two boxes, and is used to distinguish positive samples from negative samples in the training stage. By setting a different IoU threshold at each stage, the accuracy of each network's output is raised a little and used as the input of the next, higher-accuracy network, so the output accuracy improves step by step. The scheme balances running speed against detection accuracy, the network structure is simple to implement, and it meets the performance requirements of a real software engineering environment.
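The IoU computation and the per-stage positive/negative split can be sketched as follows. The specific thresholds 0.5 / 0.6 / 0.7 are the values commonly used with Cascade R-CNN and are an assumption here; the embodiment only states that the three stages use different IoU thresholds.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_candidates(candidates, gt_box, threshold):
    """Split candidate boxes into positives/negatives at one cascade stage."""
    pos = [c for c in candidates if iou(c, gt_box) >= threshold]
    neg = [c for c in candidates if iou(c, gt_box) < threshold]
    return pos, neg

gt = (10, 10, 50, 50)
cands = [(10, 10, 50, 50),   # exact match, IoU = 1.0
         (12, 12, 52, 52),   # small shift, IoU ~ 0.82
         (15, 15, 55, 55),   # larger shift, IoU ~ 0.62
         (30, 30, 70, 70),   # weak overlap, IoU ~ 0.14
         (60, 60, 90, 90)]   # no overlap, IoU = 0.0
for t in (0.5, 0.6, 0.7):    # assumed per-stage thresholds
    pos, neg = label_candidates(cands, gt, t)
```

As the threshold rises from stage to stage, fewer but better-localized boxes count as positives, which is the mechanism by which each stage trains on higher-quality samples than the previous one.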
S103, training a target network structure to obtain a traffic cone recognition model based on a plurality of original traffic cone scene images containing traffic cones.
In order to obtain a more accurate traffic cone recognition model, in one embodiment, training the target network structure based on a plurality of original traffic cone scene images including traffic cones to obtain the traffic cone recognition model may generally include: acquiring a plurality of original traffic cone scene images; performing data enhancement on each original traffic cone scene image to obtain at least one corresponding first traffic cone scene image; establishing a label file corresponding to each first traffic cone scene image to obtain a traffic cone scene image data set; dividing the traffic cone scene image data set into a training set, a verification set and a test set; and determining the traffic cone recognition model based on the training set, the verification set and the target network structure.
In order to obtain a greater number of first traffic cone scene images, in an embodiment of the present invention, performing data enhancement on any original traffic cone scene image to obtain at least one corresponding first traffic cone scene image may generally include: performing at least one of rotation, flipping, contrast enhancement, cropping, brightness adjustment and affine transformation on the original traffic cone scene image to obtain the at least one corresponding first traffic cone scene image.
To obtain a more accurate traffic cone recognition model, in one embodiment of the present invention, determining a traffic cone recognition model based on a training set, a validation set, and a target network structure may generally include: training a target network structure by using a training set to obtain an initial traffic cone recognition model; determining the accuracy and/or loss value of the initial traffic cone identification model by using the verification set; and when the accuracy is greater than the accuracy threshold and/or the loss value is less than the loss value threshold, determining the initial traffic cone identification model as the traffic cone identification model.
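The acceptance rule above — promote the initial model only once its verification-set metrics clear the thresholds — can be sketched as follows (the threshold values and the metric history are illustrative placeholders, not from the patent):

```python
def accept_model(val_accuracy, val_loss,
                 acc_threshold=0.90, loss_threshold=0.10):
    """Promote the candidate to the final traffic-cone recognition model
    when validation accuracy exceeds its threshold and/or validation
    loss falls below its threshold (threshold values are illustrative)."""
    return val_accuracy > acc_threshold or val_loss < loss_threshold

# (accuracy, loss) measured on the verification set as training proceeds
history = [(0.62, 0.80), (0.81, 0.35), (0.93, 0.08)]
decisions = [accept_model(a, l) for a, l in history]
```

Only the last checkpoint, with accuracy 0.93 and loss 0.08, passes the gate and becomes the traffic cone identification model.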
In order to ensure the practicability of the traffic cone identification model, in one embodiment of the present invention, after determining that the initial traffic cone identification model is the traffic cone identification model, the method may further generally include: and verifying the accuracy and the recognition speed of the traffic cone recognition model by using the test set.
The above is illustrated by an embodiment, as follows (see fig. 5):
S1, establishing an image library: scene images containing traffic cones under different scenes, weather and illumination conditions are collected by a high-definition camera mounted on the front windshield of a vehicle at a depression angle of 8 degrees; the images are saved at 10 frames per second and normalized to a size of 1920 x 1080 pixels.
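The frame-rate subsampling and size normalization of S1 can be sketched in numpy (the 30 fps camera rate and the nearest-neighbour resize are illustrative assumptions; a real pipeline would use an image library for resizing):

```python
import numpy as np

def normalize_frame(frame, out_h=1080, out_w=1920):
    """Nearest-neighbour resize of a captured frame (H x W x 3) to the
    normalized 1920 x 1080 library size; this numpy version is only a
    sketch of what an image library would do."""
    h, w = frame.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return frame[rows][:, cols]

camera_fps = 30                  # hypothetical camera frame rate
keep_every = camera_fps // 10    # keep every 3rd frame -> 10 frames per second
kept = [i for i in range(90) if i % keep_every == 0]  # 3 s of video -> 30 frames

frame = np.zeros((720, 1280, 3), dtype=np.uint8)
normalized = normalize_frame(frame)
```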
S2, data enhancement: data enhancement mainly includes rotation, flipping, contrast enhancement, cropping, brightness adjustment and affine transformation. Experiments show that setting the random rotation angle to between minus 5 and plus 5 degrees, randomly flipping about 10% of the images, and randomly changing brightness and contrast to a small degree preserve the meaning of the images. For effective data enhancement, the pictures are finally resized to a fixed size, expanding the number of effective data samples and enhancing the diversity of the data set.
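The flip and brightness/contrast jitter of S2 can be sketched in numpy as follows (the 10% flip probability matches the text; the 0.9–1.1 jitter range is an illustrative reading of "a small degree", and rotation, cropping and affine warps are omitted because they would normally use an image library):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """One randomized copy of a traffic-cone image (H x W x 3, floats in
    [0, 1]): horizontal flip for roughly 10% of images plus mild
    brightness and contrast jitter."""
    out = img.copy()
    if rng.random() < 0.10:                            # flip ~10% of images
        out = out[:, ::-1, :]
    out = out * rng.uniform(0.9, 1.1)                  # small brightness change
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.9, 1.1) + mean  # small contrast change
    return np.clip(out, 0.0, 1.0)

img = rng.random((108, 192, 3))
copies = [augment(img) for _ in range(4)]  # expand the effective sample count
```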
S3, establishing a label library: labels corresponding to the images are made with labeling software according to the COCO-format json label file standard.
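A COCO-format label file has the shape sketched below (the file name, ids and box values are illustrative placeholders, not taken from the patent's data set):

```python
import json

# skeleton of a COCO-format json label file for one traffic-cone image
coco_label = {
    "images": [
        {"id": 1, "file_name": "cone_0001.jpg", "width": 1920, "height": 1080}
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [840.0, 620.0, 90.0, 160.0],  # [x, y, width, height]
         "area": 90.0 * 160.0, "iscrowd": 0}
    ],
    "categories": [{"id": 1, "name": "traffic_cone"}],
}
label_json = json.dumps(coco_label, indent=2)
```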
S4, establishing a data set: the prepared image library and its label library are divided into a training set, a verification set and a test set at a ratio of 7:2:1.
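The 7:2:1 split can be sketched as follows (the shuffle seed and function name are illustrative; `samples` stands for the paired image/label entries):

```python
import random

def split_dataset(samples, ratios=(7, 2, 1), seed=42):
    """Shuffle the samples and divide them into training, verification
    and test sets at the stated 7:2:1 ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train_set, val_set, test_set = split_dataset(range(1000))
```

For a 1000-image library this yields 700 training, 200 verification and 100 test samples, with every sample landing in exactly one subset.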
S5, setting parameters of the Cascade R-CNN model: the Cascade R-CNN model is set to 4 stages, namely an RPN and three R-CNN networks with gradually increasing IoU thresholds, trained stage by stage. The RPN anchors (a set of fixed-size reference windows) are divided into five scale levels of 4, 8, 16, 32 and 64, with three aspect-ratio levels of 1:1, 1:2 and 2:1. The three IoU thresholds are 0.5, 0.6 and 0.7 respectively. The classification loss adopts the softmax function and the regression loss adopts the smooth L1 loss function, and the bounding-box coordinates are normalized to prevent regression-scale problems caused by the size and position of the bounding box.
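The S5 parameters can be gathered into one configuration sketch, together with the smooth L1 loss they name (the key names are illustrative — a real Cascade R-CNN configuration file spells these out differently — and `beta=1.0` is the conventional default, not stated in the patent):

```python
import numpy as np

cascade_cfg = {
    "rpn": {
        "anchor_scales": [4, 8, 16, 32, 64],  # five anchor levels
        "anchor_ratios": [1.0, 0.5, 2.0],     # aspect ratios 1:1, 1:2, 2:1
    },
    "rcnn_stages": [{"iou_threshold": t} for t in (0.5, 0.6, 0.7)],
    "cls_loss": "softmax",
    "reg_loss": "smooth_l1",
    "normalize_bbox_targets": True,  # guards against regression-scale problems
}

def smooth_l1(x, beta=1.0):
    """Smooth L1 regression loss: quadratic for residuals below beta,
    linear above, so large bounding-box errors do not dominate."""
    x = np.abs(x)
    return np.where(x < beta, 0.5 * x ** 2 / beta, x - 0.5 * beta)
```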
S6, training the traffic cone recognition model: the training set is trained with the improved Cascade R-CNN network structure, processing 8 pictures per batch. A smaller learning rate is used in the first 11 training epochs; after the model stabilizes, the learning rate is 0.02, and the anchor ratios and sizes and the loss-function parameters are kept unchanged. The performance of the model increases with the number of training iterations, and the model is saved every fixed number of iterations; when the verification-set accuracy and the training loss become stable, i.e., the model has converged, the model is saved and training stops. In addition, stochastic gradient descent (SGD) is used for parameter updating during training, and a warmup learning-rate schedule accelerates the convergence of the model so that a better model is obtained; training runs for 200,000 steps.
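The warmup learning-rate schedule of S6 can be sketched as follows (only the 0.02 base rate comes from the embodiment; the warmup length and starting value are illustrative assumptions, and the lower early-epoch rate and any later decay are omitted):

```python
def learning_rate(step, base_lr=0.02, warmup_steps=500, warmup_start=0.002):
    """Warmup schedule for SGD: ramp the learning rate linearly from a
    small starting value up to the base rate, then hold it."""
    if step < warmup_steps:
        return warmup_start + (step / warmup_steps) * (base_lr - warmup_start)
    return base_lr

schedule = [learning_rate(s) for s in (0, 250, 500, 200_000)]
```

The small initial rate keeps the first SGD updates from diverging while the batch statistics settle, which is what accelerates convergence in practice.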
S7, testing the traffic cone recognition model: the detection accuracy on the traffic cone test set reaches 95%, and the detection speed is 9 FPS (including picture-reading time). Compared with the original Cascade R-CNN model, the improved Cascade R-CNN model achieves higher detection accuracy and the trained model has better robustness, giving this traffic cone detection and recognition method a clear advantage over the original algorithm.
The traffic cone recognition method and device can recognize traffic cones quickly, promptly and effectively, reduce the influence of factors such as illumination changes, color fading, motion blur and complex backgrounds on traffic cone recognition, resist interference, and achieve high recognition accuracy, so that the potential safety hazards of unmanned driving can be eliminated in time. Meanwhile, the embodiment of the invention also uses a data enhancement technique to expand the effective data samples when their number is insufficient, enhancing the diversity of the data set and improving the detection accuracy of the traffic cone recognition model.
The embodiment also provides a traffic cone identification method based on a traffic cone identification model, as shown in fig. 6, where the traffic cone identification model is obtained by the cascade network-based traffic cone identification method of the embodiment shown in fig. 1. The method includes:
S601, acquiring an image to be identified.
S602, inputting the image to be recognized into a traffic cone recognition model, and outputting a recognition result; and the identification result is that the image to be identified has the traffic cone and the position of the traffic cone in the image to be identified or the image to be identified has no traffic cone.
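The two-step identification flow above can be sketched as follows (`model` stands for the trained traffic cone recognition model as any callable returning boxes and scores; the function names and the 0.5 score threshold are illustrative assumptions):

```python
def recognize(image, model, score_threshold=0.5):
    """Report either the traffic-cone positions in the image or that no
    traffic cone is present."""
    boxes, scores = model(image)
    positions = [b for b, s in zip(boxes, scores) if s >= score_threshold]
    return {"has_cone": bool(positions), "positions": positions}

# stand-in model: one confident cone detection and one low-score box
fake_model = lambda img: ([[100, 100, 150, 200], [400, 300, 440, 380]],
                          [0.9, 0.3])
result = recognize(None, fake_model)
```

The result either carries the positions of the detected cones in the image or reports that the image contains no traffic cone, matching the two recognition outcomes described above.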
The embodiment also provides a device for the traffic cone identification method based on the cascade network; this device and the cascade network-based traffic cone identification method described above can be cross-referenced. As shown in fig. 7, the device for the cascade network-based traffic cone identification method includes:
an obtaining module 701, configured to obtain a compressed and activated network SENet and a dense convolutional network DenseNet;
a determining module 702, configured to determine a target network structure based on the SENet, the DenseNet, and a preset target detection model;
the training module 703 is configured to train a target network structure to obtain a traffic cone recognition model based on a plurality of original traffic cone scene images including traffic cones.
Optionally, a determining module 702, configured to determine a target convolutional layer in the DenseNet, wherein the target convolutional layer outputs a target feature map; adding SENet after each target convolutional layer in the DenseNet to obtain a backbone network; and replacing the original backbone network in the target detection model with this backbone network to obtain a target network structure.
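The channel recalibration that the added SENet performs on each target feature map can be sketched in numpy as follows (the weights, channel count and reduction ratio are random placeholders, not the trained network):

```python
import numpy as np

rng = np.random.default_rng(0)

def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation applied to one target feature map
    (C x H x W): global average pool (squeeze), two small fully
    connected layers with ReLU then sigmoid (excitation), and a
    channel-wise rescale of the feature map."""
    squeeze = feature_map.mean(axis=(1, 2))      # C-dim channel summary
    hidden = np.maximum(0.0, w1 @ squeeze)       # bottleneck + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # per-channel gate in (0, 1)
    return feature_map * gate[:, None, None]     # recalibrated channels

channels, reduction = 16, 4
w1 = rng.standard_normal((channels // reduction, channels))  # squeeze weights
w2 = rng.standard_normal((channels, channels // reduction))  # excite weights
fmap = rng.standard_normal((channels, 8, 8))
out = se_block(fmap, w1, w2)
```

Because the sigmoid gate lies strictly between 0 and 1, the block can only attenuate uninformative channels while preserving the feature map's shape, which is why it can be inserted after every target convolutional layer without altering the DenseNet topology.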
Optionally, the training module 703 is configured to obtain a plurality of original traffic cone scene images; performing data enhancement on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image; establishing a label file corresponding to each first traffic cone scene image to obtain a traffic cone scene image data set; dividing a traffic cone scene image data set to obtain a training set, a verification set and a test set; and determining a traffic cone recognition model based on the training set, the verification set and the target network structure.
Optionally, the training module 703 is configured to perform at least one of rotation, flipping, contrast enhancement, clipping, brightness adjustment, and affine transformation on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image.
Optionally, the training module 703 is configured to train the target network structure by using a training set to obtain an initial traffic cone identification model; determining the accuracy and/or loss value of the initial traffic cone identification model by using the verification set; and when the accuracy is greater than the accuracy threshold and/or the loss value is less than the loss value threshold, determining the initial traffic cone identification model as the traffic cone identification model.
Optionally, the training module 703 is further configured to verify the accuracy and the recognition speed of the traffic cone recognition model by using the test set.
Each module in the device of the traffic cone identification method based on the cascade network provided in fig. 7 has a function of implementing each step in the example shown in fig. 1, and achieves the same technical effect as the traffic cone identification method based on the cascade network shown in fig. 1, and for brevity, the details are not repeated here.
The embodiment also provides a device based on the method of using the traffic cone identification model, wherein the traffic cone identification model is obtained by the cascade network-based traffic cone identification method of the embodiment shown in fig. 1; this device and the above-described method of using the traffic cone identification model can be cross-referenced. As shown in fig. 8, the device based on the method of using the traffic cone recognition model includes:
an obtaining module 801, configured to obtain an image to be identified;
the output module 802 is used for inputting the image to be recognized into the traffic cone recognition model and outputting a recognition result; and the identification result is that the image to be identified has the traffic cone and the position of the traffic cone in the image to be identified or the image to be identified has no traffic cone.
Each module in the device based on the traffic cone identification model using method provided in fig. 8 has a function of implementing each step in the example shown in fig. 6, and achieves the same technical effect as the traffic cone identification model using method shown in fig. 6, and for brevity, details are not repeated here.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
The electronic device may comprise a processor 901 and a memory 902 storing computer program instructions.
Specifically, the processor 901 may include a central processing unit (CPU), or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present invention.
Memory 902 may include mass storage for data or instructions. By way of example, and not limitation, memory 902 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 902 may include removable or non-removable (or fixed) media, where appropriate. The memory 902 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 902 is a non-volatile solid-state memory. In a particular embodiment, the memory 902 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 901 reads and executes the computer program instructions stored in the memory 902 to implement the cascade network-based traffic cone identification method in the embodiment shown in fig. 1 among the above-described embodiments.
In one example, the electronic device can also include a communication interface 903 and a bus 910. As shown in fig. 9, the processor 901, the memory 902, and the communication interface 903 are connected via a bus 910 to complete communication with each other.
The communication interface 903 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present invention.
Bus 910 includes hardware, software, or both to couple the components of the electronic device to each other. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low pin count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 910 can include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.
In addition, in combination with the traffic cone identification method based on the cascade network in the above embodiments, the embodiments of the present invention may provide a computer storage medium to implement. The computer storage medium having computer program instructions stored thereon; the computer program instructions, when executed by the processor, implement the cascade network based traffic cone identification method in the embodiment shown in fig. 1.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
As described above, only the specific embodiments of the present invention are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (11)

1. A traffic cone identification method based on a cascade network is characterized by comprising the following steps:
acquiring a compression and activation network SENet and a dense convolution network DenseNet;
determining a target network structure based on the SENet, the DenseNet and a preset target detection model;
and training the target network structure to obtain a traffic cone recognition model based on a plurality of original traffic cone scene images containing traffic cones.
2. The cascade network-based traffic cone recognition method of claim 1, wherein the determining a target network structure based on the SENet, the DenseNet and a preset target detection model comprises:
determining a target convolutional layer in the DenseNet; wherein the target convolutional layer outputs the target feature map;
adding the SENet after each target convolution layer in the DenseNet to obtain a backbone network;
and replacing the original backbone network in the target detection model by using the backbone network to obtain the target network structure.
3. The cascade network-based traffic cone recognition method according to claim 1 or 2, wherein training the target network structure to obtain a traffic cone recognition model based on a plurality of original traffic cone scene images containing traffic cones comprises:
acquiring a plurality of original traffic cone scene images;
performing data enhancement on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image;
establishing a label file corresponding to each first traffic cone scene image to obtain a traffic cone scene image data set;
dividing the traffic cone scene image data set to obtain a training set, a verification set and a test set;
determining the traffic cone recognition model based on the training set, the validation set, and the target network structure.
4. The cascade network-based traffic cone identification method according to claim 3, wherein the data enhancement of any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image comprises:
and performing at least one of rotation, turnover, contrast enhancement, cutting, brightness adjustment and affine transformation on any original traffic cone scene image to obtain at least one first traffic cone scene image corresponding to any original traffic cone scene image.
5. The cascade network-based traffic cone recognition method of claim 3, wherein the determining the traffic cone recognition model based on the training set, the validation set, and the target network structure comprises:
training the target network structure by using the training set to obtain an initial traffic cone recognition model;
determining an accuracy and/or loss value of the initial traffic cone identification model using the validation set;
when the accuracy is greater than an accuracy threshold and/or the loss value is less than a loss value threshold, determining that the initial traffic cone identification model is the traffic cone identification model.
6. The cascade network-based traffic cone identification method of claim 5, wherein after determining that the initial traffic cone identification model is the traffic cone identification model, further comprising:
and verifying the accuracy and the recognition speed of the traffic cone recognition model by using the test set.
7. A method for using a traffic cone identification model, wherein the traffic cone identification model is obtained by using the cascade network-based traffic cone identification method of any one of claims 1 to 6, and comprises the following steps:
acquiring an image to be identified;
inputting the image to be recognized into the traffic cone recognition model, and outputting a recognition result; and the identification result is that the image to be identified has a traffic cone and the position of the traffic cone in the image to be identified or the image to be identified has no traffic cone.
8. A device of a traffic cone identification method based on a cascade network is characterized by comprising the following steps:
the acquisition module is used for acquiring a compression and activation network SENet and a dense convolution network DenseNet;
the determining module is used for determining a target network structure based on the SENet, the DenseNet and a preset target detection model;
and the training module is used for training the target network structure to obtain a traffic cone recognition model based on a plurality of original traffic cone scene images containing traffic cones.
9. An apparatus based on a traffic cone identification model using method, wherein the traffic cone identification model is obtained by using the cascade network based traffic cone identification method of any one of claims 1 to 6, and the apparatus comprises:
the acquisition module is used for acquiring an image to be identified;
the output module is used for inputting the image to be recognized into the traffic cone recognition model and outputting a recognition result; and the identification result is that the image to be identified has a traffic cone and the position of the traffic cone in the image to be identified or the image to be identified has no traffic cone.
10. An electronic device, characterized in that the electronic device comprises: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements a cascade network based traffic cone identification method according to any one of claims 1-6.
11. A computer storage medium having computer program instructions stored thereon, which when executed by a processor, implement the cascade network-based traffic cone identification method according to any one of claims 1 to 6.