CN118097625A - Obstacle recognition method and device
- Publication number
- CN118097625A CN118097625A CN202410498072.7A CN202410498072A CN118097625A CN 118097625 A CN118097625 A CN 118097625A CN 202410498072 A CN202410498072 A CN 202410498072A CN 118097625 A CN118097625 A CN 118097625A
- Authority
- CN
- China
- Prior art keywords
- obstacle
- category
- sample data
- power set
- identification information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06V10/764—Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/82—Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
Abstract
The application provides an obstacle recognition method and device, wherein the method comprises the following steps: pre-constructing an obstacle recognition model and a tag power set; acquiring current image data through a vehicle-mounted camera; processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result; and determining a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result and the current obstacle classification result, wherein the final obstacle recognition result includes obstacle existence information and obstacle category information. The method and the device can therefore identify the specific category of an obstacle in real time based on the tag power set with greater accuracy, providing the vehicle with more comprehensive environment perception and improving the driving experience.
Description
Technical Field
The application relates to the technical field of intelligent driving, in particular to a method and a device for identifying obstacles.
Background
Currently, existing obstacle recognition methods generally use a millimeter-wave radar to transmit electromagnetic waves and analyze the echo signals to obtain the position and speed of an obstacle. A camera then captures an image of the field of view, pattern matching is performed near the obstacle located by the radar to identify the obstacle type, and the outline and size of the obstacle are obtained. In practice, however, such methods use a single obstacle recognition label and cannot accurately identify the obstacle type, which degrades the driving experience.
Disclosure of Invention
The embodiments of the application aim to provide an obstacle recognition method and device, which can identify the specific category of an obstacle in real time based on a tag power set with greater accuracy, thereby providing the vehicle with more comprehensive environment perception and improving the driving experience.
The first aspect of the present application provides an obstacle identifying method, including:
Pre-constructing an obstacle recognition model and a tag power set;
Acquiring current image data through a vehicle-mounted camera;
Processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result;
Determining a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
Further, the pre-constructing the obstacle recognition model and the tag power set includes:
acquiring image sample data of an on-board camera;
Constructing a deep network structure based on a convolutional neural network;
Processing the image sample data through the deep network structure to obtain obstacle position identification information and obstacle category identification information;
Constructing an obstacle category set;
Constructing a tag power set according to the obstacle position identification information, the obstacle category identification information and the obstacle category set;
Training the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model;
and optimizing the target recognition model to obtain an obstacle recognition model.
Further, the deep network structure comprises a convolution layer, a pooling layer and a full connection layer, wherein an activation function of the deep network structure is a nonlinear ReLU activation function, and a loss function of the deep network structure is a binary cross entropy loss function;
the obstacle category set at least comprises a motor vehicle category, a non-motor vehicle category, a pedestrian category, a traffic sign category, a traffic signal light category, a roadblock category, an animal category, a road facility category, a building category, a construction area category and a rain and snow accumulation category.
Further, the processing the image sample data through the deep network structure to obtain obstacle position identification information and obstacle category identification information includes:
performing obstacle labeling processing on the image sample data to obtain first labeling sample data;
Preprocessing the first labeling sample data to obtain sample data to be processed;
Performing feature capturing processing on the sample data to be processed through the convolution layer to obtain a feature map;
And carrying out dimension reduction processing on the feature map through the pooling layer to obtain obstacle position identification information and obstacle category identification information.
Further, the constructing a tag power set according to the obstacle position identification information, the obstacle category identification information and the obstacle category set includes:
Combining the labels in the obstacle class set to obtain a plurality of combined labels; wherein each of the combined tags is a proper subset of the set of obstacle categories;
Performing tag statistical analysis processing according to the obstacle position identification information and the obstacle category identification information to obtain the correlation between tags;
performing label elimination processing on the labels in the combined labels according to the correlation among the labels to obtain target combined labels;
And constructing a tag power set according to the target combination tag.
Further, the training the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model includes:
labeling the image sample data through the label power set to obtain second labeled sample data;
Training the deep network structure through a preset stochastic gradient descent optimization algorithm, the loss function and the second labeling sample data to obtain a trained target recognition model.
A second aspect of the present application provides an obstacle identifying apparatus, comprising:
The construction unit is used for constructing an obstacle recognition model and a tag power set in advance;
The acquisition unit is used for acquiring current image data through the vehicle-mounted camera;
The processing unit is used for processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result;
A determining unit, configured to determine a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result, and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
Further, the construction unit includes:
the acquisition subunit is used for acquiring image sample data of the vehicle-mounted camera;
The first construction subunit is used for constructing a deep network structure based on the convolutional neural network;
the processing subunit is used for processing the image sample data through the deep network structure to obtain obstacle position identification information and obstacle category identification information;
A second construction subunit for constructing a set of obstacle categories;
A third construction subunit, configured to construct a tag power set according to the obstacle location identification information, the obstacle category identification information, and the obstacle category set;
the training subunit is used for training the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model;
and the optimizing subunit is used for optimizing the target recognition model to obtain an obstacle recognition model.
Further, the deep network structure comprises a convolution layer, a pooling layer and a full connection layer, wherein an activation function of the deep network structure is a nonlinear ReLU activation function, and a loss function of the deep network structure is a binary cross entropy loss function;
the obstacle category set at least comprises a motor vehicle category, a non-motor vehicle category, a pedestrian category, a traffic sign category, a traffic signal light category, a roadblock category, an animal category, a road facility category, a building category, a construction area category and a rain and snow accumulation category.
Further, the processing subunit includes:
the first labeling processing module is used for performing obstacle labeling processing on the image sample data to obtain first labeling sample data;
The preprocessing module is used for preprocessing the first marked sample data to obtain sample data to be processed;
The feature capturing module is used for performing feature capturing processing on the sample data to be processed through the convolution layer to obtain a feature map;
And the dimension reduction module is used for performing dimension reduction processing on the feature map through the pooling layer to obtain obstacle position identification information and obstacle category identification information.
Further, the third building subunit includes:
the combination module is used for combining the labels in the obstacle category set to obtain a plurality of combined labels; wherein each of the combined tags is a proper subset of the set of obstacle categories;
the statistical analysis module is used for carrying out tag statistical analysis processing according to the obstacle position identification information and the obstacle category identification information to obtain the correlation among tags;
The label elimination module is used for performing label elimination processing on the labels in the combined labels according to the correlation among the labels to obtain target combined labels;
and the construction module is used for constructing a label power set according to the target combination label.
Further, the training subunit comprises:
The second labeling processing module is used for labeling the image sample data through the label power set to obtain second labeling sample data;
The training module is used for training the deep network structure through a preset stochastic gradient descent optimization algorithm, the loss function and the second labeling sample data to obtain a trained target recognition model.
A third aspect of the present application provides an electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the obstacle recognition method of any one of the first aspects of the application.
A fourth aspect of the application provides a computer readable storage medium storing computer program instructions which, when read and executed by a processor, perform the obstacle recognition method of any one of the first aspects of the application.
The beneficial effects of the application are as follows: the method and the device can identify specific categories of the obstacles in real time based on the tag power set, and the identification is more accurate, so that more comprehensive environment perception can be provided for the vehicle, and driving experience is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered limiting of the scope; other related drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
Fig. 1 is a schematic flow chart of an obstacle recognition method according to an embodiment of the present application;
fig. 2 is a flow chart of another obstacle identifying method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an exemplary convolutional neural network according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of a label combination according to an embodiment of the present application;
FIG. 5 is a schematic view of an exemplary obstacle according to an embodiment of the present application;
FIG. 6 is a schematic view of another example of an obstacle according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an obstacle identifying apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of another obstacle identifying apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to fig. 1, fig. 1 is a flow chart of an obstacle identifying method according to the present embodiment. The obstacle identification method comprises the following steps:
S101, constructing an obstacle recognition model and a tag power set in advance.
S102, acquiring current image data through a vehicle-mounted camera.
S103, processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result.
S104, determining a final obstacle recognition result based on the tag power set, a preset confidence threshold, a current obstacle detection result and a current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, by implementing the obstacle recognition method described in the embodiment, specific categories of the obstacle can be recognized in real time based on the tag power set, recognition is more accurate, and accordingly more comprehensive environment perception can be provided for the vehicle, and driving experience is improved.
Example 2
Referring to fig. 2, fig. 2 is a flow chart of an obstacle identifying method according to the present embodiment. The obstacle identification method comprises the following steps:
s201, acquiring image sample data of an on-board camera.
In this embodiment, the method may collect image data acquired by the vehicle-mounted camera and label the image data, so as to improve recognition accuracy of the obstacle recognition model (i.e., the subsequent target recognition model) based on real and effective image data.
S202, constructing a deep network structure based on a convolutional neural network.
In this embodiment, the deep network structure includes a convolution layer, a pooling layer and a full connection layer, the activation function of the deep network structure is a nonlinear ReLU activation function, and the loss function of the deep network structure is a binary cross entropy loss function.
S203, performing obstacle labeling processing on the image sample data to obtain first labeling sample data.
In this embodiment, the labeling is aimed at determining the location and class of the obstacle in the image. The method can manually label the obstacles in the image by using an image labeling tool, and assign corresponding category labels to each obstacle.
In the present embodiment, the multi-label classification described above means that one obstacle may belong to a plurality of categories, such as pedestrian, vehicle and bicycle. Meanwhile, when the bounding-box selections in a picture differ, the labels of that picture differ accordingly.
S204, preprocessing the first marked sample data to obtain sample data to be processed.
In this embodiment, the method may preprocess the image data before performing the multi-label classification. The preprocessing includes resizing, color space conversion and data normalization of the image, so that the input data meets the requirements of the model; this improves the robustness and performance of the trained target detection model.
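For example, such a preprocessing step might be sketched as follows (an illustrative sketch only; the 224x224 target size and the [0, 1] normalization are assumed values, not specified by this embodiment):

import cv2
import numpy as np

def preprocess(image, size=(224, 224)):
    # Resizing (the 224x224 target is an assumed example)
    image = cv2.resize(image, size)
    # Color space conversion, e.g. BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Data normalization to the [0, 1] range
    return image.astype(np.float32) / 255.0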
S205, performing feature capturing processing on sample data to be processed through a convolution layer to obtain a feature map.
S206, performing dimension reduction processing on the feature map through the pooling layer to obtain obstacle position identification information and obstacle category identification information.
In this embodiment, the method may establish a deep convolutional neural network (DCNN) based on a convolutional neural network (CNN). CNNs perform well in image classification tasks and can be extended to the multi-label classification problem, so the method builds a deep convolutional neural network on top of a common CNN as the model structure.
By implementing this embodiment, the DCNN can process complex features with high performance and high accuracy, and at the same time provides transfer learning capability for those complex features (corresponding to the multi-label setting).
For example, the method may first capture local features of an image in the DCNN using convolution layers. Each convolution layer is composed of a plurality of convolution kernels, and each kernel performs a convolution operation on the input image through a sliding window, generating a feature map. Then, the pooling layer reduces the size of the feature map while preserving the important features; this method uses max pooling, which selects the maximum (or, for average pooling, the mean) value within a particular region to reduce the dimension of the feature map:
output_vector = max_pool(input_vector)
Fully connected layer: y = f(W·x + b)
(where W·x denotes matrix multiplication and f is the ReLU activation function)
Further, the use of a nonlinear ReLU as an activation function for DCNN introduces nonlinear characteristics that enable the network to learn more complex feature representations.
Referring to fig. 3, fig. 3 shows an exemplary schematic structure of a convolutional neural network. The DCNN is composed of a plurality of convolution layers, pooling layers and full connection layers, forming a deep network structure. Such a deep structure allows the network to learn higher-level abstract features. When the CNN is extended into the DCNN, the convolution layers share parameters over the entire input image through weight sharing, i.e., each convolution kernel is reused across all spatial positions. This approach reduces the parameter count of the model and improves its training efficiency and generalization ability.
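As a minimal sketch of such a deep network structure (convolution layers, max pooling layers, full connection layers with ReLU activations, and a sigmoid output giving one independent probability per label for multi-label classification; the layer sizes and num_labels are assumptions for illustration, not values from this embodiment):

import tensorflow as tf

def build_dcnn(num_labels, input_shape=(224, 224, 3)):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),  # convolution layer
        tf.keras.layers.MaxPooling2D(),                    # pooling layer (max pooling)
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),     # full connection layer, y = relu(Wx + b)
        tf.keras.layers.Dense(num_labels, activation="sigmoid"),  # one probability per label
    ])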
S207, constructing an obstacle category set.
In this embodiment, the set of obstacle categories includes at least a motor vehicle category, a non-motor vehicle category, a pedestrian category, a traffic sign category, a traffic light category, a roadblock category, an animal category, a road facility category, a building category, a construction area category, and a rain and snow accumulation category.
S208, combining the labels in the obstacle class set to obtain a plurality of combined labels; wherein each combined tag is a proper subset of the set of obstacle categories.
Referring to fig. 4, fig. 4 shows a schematic flow chart of label combining.
For example, when vehicle obstacle detection is involved, one obstacle may belong to multiple categories at the same time, and the following is a specific example:
(1) Vehicles and pedestrians: an obstacle may be a vehicle parked on a sidewalk with a person seated inside, and thus may belong to both the vehicle and the pedestrian category;
(2) Pedestrians and bicycles: an obstacle may be a cyclist and thus may belong to both the pedestrian and the bicycle categories;
(3) Pedestrian and traffic sign: an obstacle may be a pedestrian standing in an area with traffic signs and thus may belong to both the pedestrian and traffic sign categories;
(4) Vehicles and road barriers: an obstacle may be a vehicle that is parked in a road construction area and thus may belong to both the vehicle and the road barrier categories.
It can be seen that the purpose of building a power set in multi-label classification is to identify and classify the multiple features and attributes of obstacles, providing more comprehensive information and decision support.
S209, performing label statistical analysis processing according to the barrier position identification information and the barrier category identification information to obtain the correlation between labels.
S210, performing label eliminating processing on labels in the combined labels according to the correlation among the labels to obtain target combined labels.
In this embodiment, the method may divide the tag set into tag subsets according to factors such as the occurrence frequency of actual obstacles. The method combines the established tag sets (assuming m combined tag power sets in total) such that each combination result is a proper subset of the full set of tag classes (each tag subset lacks at least one tag).
It should be noted that, in order to ensure the effectiveness of the tag power set (i.e., a better classification effect), the labeled image dataset needs to be statistically analyzed according to the above steps to obtain the correlation between tags; when the tag power set is built, tags with high correlation within a power set are removed so that the classification result is not disturbed by highly correlated tags.
In addition, the method can also calculate the co-occurrence frequency among the labels and take the co-occurrence frequency as a correlation measurement index. Wherein co-occurrence frequency means a frequency at which two tags appear simultaneously in the same image.
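A sketch of the co-occurrence statistic and the elimination of highly correlated labels follows (illustrative only; the correlation threshold is an assumed parameter, and a real system may use a different correlation index):

from collections import Counter
from itertools import combinations

def cooccurrence_counts(image_label_sets):
    # image_label_sets: one set of labels per annotated image
    counts = Counter()
    for labels in image_label_sets:
        for a, b in combinations(sorted(labels), 2):
            counts[(a, b)] += 1  # how often two labels appear in the same image
    return counts

def prune_combined_label(combined_label, counts, threshold=50):
    # Remove one label of each highly correlated pair from a combined label
    pruned = set(combined_label)
    for a, b in combinations(sorted(combined_label), 2):
        if counts.get((a, b), 0) >= threshold:
            pruned.discard(b)
    return pruned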
S211, constructing a label power set according to the target combination label.
S212, labeling the image sample data through the label power set to obtain second labeled sample data.
In this embodiment, the method may label each tag set in the tag power set in the image sample data, so as to label different tag categories of the same obstacle.
For example, a pedestrian standing under a traffic sign may be labeled as a "traffic sign" by the power set 1 (no pedestrians are included in the class of the power set) and also labeled as a "pedestrian" by the power set 2 (no traffic sign is included in the class of the power set).
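Expressed as set operations, this example reads as follows (the label names are illustrative):

power_set_1 = {"traffic sign", "vehicle"}     # contains no pedestrian class
power_set_2 = {"pedestrian", "vehicle"}       # contains no traffic sign class
true_labels = {"pedestrian", "traffic sign"}  # pedestrian standing under a traffic sign
print(true_labels & power_set_1)              # {'traffic sign'}
print(true_labels & power_set_2)              # {'pedestrian'}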
By implementing this embodiment, different inference results can be obtained with the same classification model by using data labeled with different label sets.
S213, training the deep network structure through a preset stochastic gradient descent optimization algorithm, a loss function and the second labeling sample data to obtain a trained target recognition model.
In this embodiment, the method adopts the stochastic gradient descent (SGD) optimization algorithm, combined with a loss function, to optimize the model parameters; this choice is general-purpose. The method also uses a binary cross entropy loss function to handle the relations among multiple labels: the binary cross entropy is calculated on each label and then summed to obtain the final loss value. It will be appreciated that this training step yields m sets of parameter information (corresponding to the number of power sets described above).
Based on the above, reference code follows (TensorFlow shown; PyTorch is similar), computing the loss from the parameters y_true and y_pred, with the import added so the snippet runs as written:

import tensorflow as tf

def multi_label_bce_loss(y_true, y_pred):
    # Convert the labels to a floating point tensor
    y_true = tf.cast(y_true, tf.float32)
    # Binary cross entropy computed element-wise, i.e. on each label
    per_label_loss = tf.keras.backend.binary_crossentropy(y_true, y_pred)
    # Total loss: sum the losses over all labels
    total_loss = tf.reduce_sum(per_label_loss)
    return total_loss
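Combining this loss with the SGD optimizer, training might be wired up as follows (a sketch reusing the build_dcnn and multi_label_bce_loss snippets above; the label count, learning rate, batch size and epoch count are assumed values):

model = build_dcnn(num_labels=8)
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # stochastic gradient descent
    loss=multi_label_bce_loss,  # binary cross entropy summed over labels
)
# x_train: preprocessed images, y_train: multi-hot label vectors (assumed variables)
# model.fit(x_train, y_train, batch_size=32, epochs=10)
# Repeated once per label power set, this yields the m trained parameter sets.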
And S214, optimizing the target recognition model to obtain an obstacle recognition model.
In this embodiment, the method may further optimize the model obtained by training, so as to improve accuracy and robustness of obstacle detection and classification. At the same time, the training data set can be expanded to increase the generalization capability of the model.
S215, acquiring current image data through the vehicle-mounted camera.
In this embodiment, the method may deploy the optimized model to an embedded system of the vehicle, acquire current image data through the designated on-board camera, and transmit the data to the vehicle-mounted computing platform for real-time obstacle detection and classification.
S216, processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result.
In this embodiment, during real-time detection, the method inputs the image acquired by the vehicle-mounted camera into the model, extracts the position and category information of the obstacle, performs multi-label classification, and determines the existence and category of the obstacle using a preset confidence threshold (0.8-1.0).
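The thresholding described above might be sketched as follows (hypothetical helper names; preprocess refers to the earlier sketch, and 0.8 is one value from the stated 0.8-1.0 range):

def classify_with_threshold(model, image, labels, threshold=0.8):
    # One sigmoid probability per label; keep labels above the confidence threshold
    probs = model.predict(preprocess(image)[None, ...])[0]
    present = {lab for lab, p in zip(labels, probs) if p >= threshold}
    return present if present else {"no obstacle identified"}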
S217, determining a final obstacle recognition result based on the tag power set, a preset confidence threshold, a current obstacle detection result and a current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
In this embodiment, the adopted multi-label training model DCNN can identify a plurality of results for the same acquired image.
Referring to fig. 5 and 6, exemplary schematic diagrams of the obstacle based on the diagrams shown in fig. 5 and 6 are as follows:
the content acquired by the vehicle-mounted camera is "a pedestrian pushing a bicycle", and the model identifies the bicycle and the pedestrian respectively through different power sets (for example, power set i of the model does not contain the pedestrian class).
Then the m labels produced as the results of the m power sets are combined and de-duplicated. If the combined result contains only "no obstacle identified", the final label combination result is "no obstacle identified"; if the combined label set contains any label other than "no obstacle identified", such as "pedestrian", the "no obstacle identified" label is removed to obtain the result.
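This combination and de-duplication rule can be written directly as a small helper (a sketch following the rule described above; the function name is hypothetical):

def merge_power_set_results(results):
    # results: the m label sets produced by the m power-set models
    merged = set().union(*results)  # combine and de-duplicate
    if merged == {"no obstacle identified"}:
        return merged  # nothing but "no obstacle identified" remains
    merged.discard("no obstacle identified")  # drop it when any real label is present
    return merged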
In this embodiment, the method can also feed the obstacle detection result back to an automatic driving system, an adaptive cruise system, an intelligent parking assistance system and the like, providing more comprehensive environment perception for the vehicle, so that the vehicle runs more safely and intelligently, improving driving comfort, efficiency and safety.
In this embodiment, obstacle detection and multi-label classification form a continuously improving and optimizing task. By continuously collecting data, retraining the model, and performing performance evaluation and verification, the accuracy and robustness of obstacle detection and classification can be steadily improved.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, by implementing the obstacle recognition method described in this embodiment, labels and a label power set can be custom-built from real data, so that a plurality of different types of obstacles can be detected and distinguished simultaneously (the obstacle types can be customized for different scenes and areas), and the more flexible labels improve the recognition effect in complex and special scenes. Meanwhile, the same image information can be multi-classified and combined over labels through one DCNN model, yielding a more accurate recognition result. Finally, the multi-label result can be fed back to systems such as automatic driving, adaptive cruise and intelligent parking assistance, providing more comprehensive environment perception for the vehicle, so that the vehicle runs more safely and intelligently, improving driving comfort, efficiency and safety.
Example 3
Referring to fig. 7, fig. 7 is a schematic structural diagram of an obstacle identifying apparatus according to the present embodiment. As shown in fig. 7, the obstacle recognizing apparatus includes:
a construction unit 310 for constructing an obstacle recognition model and a tag power set in advance;
An obtaining unit 320, configured to obtain current image data through the vehicle-mounted camera;
A processing unit 330, configured to process the current image data according to the obstacle recognition model, so as to obtain a current obstacle detection result and a current obstacle classification result;
A determining unit 340, configured to determine a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result, and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
In this embodiment, the explanation of the obstacle identifying apparatus may refer to the description in embodiment 1 or embodiment 2, and the description is not repeated in this embodiment.
Therefore, the obstacle recognition device described by the embodiment can recognize specific categories of the obstacles in real time based on the tag power set, recognition is more accurate, and accordingly more comprehensive environment perception can be provided for the vehicle, and driving experience is improved.
Example 4
Referring to fig. 8, fig. 8 is a schematic structural diagram of an obstacle identifying apparatus according to the present embodiment. As shown in fig. 8, the obstacle recognizing apparatus includes:
a construction unit 310 for constructing an obstacle recognition model and a tag power set in advance;
An obtaining unit 320, configured to obtain current image data through the vehicle-mounted camera;
A processing unit 330, configured to process the current image data according to the obstacle recognition model, so as to obtain a current obstacle detection result and a current obstacle classification result;
A determining unit 340, configured to determine a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result, and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
As an alternative embodiment, the construction unit 310 includes:
an acquisition subunit 311, configured to acquire image sample data of the vehicle-mounted camera;
a first construction subunit 312, configured to construct a deep network structure based on the convolutional neural network;
A processing subunit 313, configured to process the image sample data through a deep network structure to obtain obstacle location identification information and obstacle category identification information;
a second construction subunit 314 for constructing a set of obstacle categories;
a third construction subunit 315 configured to construct a tag power set according to the obstacle location identification information, the obstacle category identification information, and the obstacle category set;
The training subunit 316 is configured to train the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data, so as to obtain a trained target recognition model;
And the optimizing subunit 317 is configured to perform optimizing processing on the target recognition model to obtain an obstacle recognition model.
In this embodiment, the deep network structure includes a convolution layer, a pooling layer and a full connection layer, the activation function of the deep network structure is a nonlinear ReLU activation function, and the loss function of the deep network structure is a binary cross entropy loss function;
The set of obstacle categories includes at least a motor vehicle category, a non-motor vehicle category, a pedestrian category, a traffic sign category, a traffic light category, a roadblock category, an animal category, a roadway facility category, a building category, a construction area category, and a rain and snow accumulation category.
As an alternative embodiment, the processing subunit 313 includes:
The first labeling processing module is used for performing obstacle labeling processing on the image sample data to obtain first labeling sample data;
the preprocessing module is used for preprocessing the first marked sample data to obtain sample data to be processed;
the feature capturing module is used for performing feature capturing processing on the sample data to be processed through the convolution layer to obtain a feature map;
the dimension reduction module is used for performing dimension reduction processing on the feature map through the pooling layer to obtain obstacle position identification information and obstacle category identification information.
As an alternative embodiment, the third construction subunit 315 includes:
The combination module is used for combining the labels in the obstacle class set to obtain a plurality of combined labels; wherein each combined tag is a proper subset of the set of obstacle categories;
The statistical analysis module is used for carrying out tag statistical analysis processing according to the obstacle position identification information and the obstacle category identification information to obtain the correlation between the tags;
The label elimination module is used for performing label elimination processing on the labels in the combined labels according to the correlation among the labels to obtain target combined labels;
And the construction module is used for constructing a label power set according to the target combination label.
As an alternative embodiment, training subunit 316 includes:
The second labeling processing module is used for labeling the image sample data through the label power set to obtain second labeling sample data;
The training module is used for training the deep network structure through a preset stochastic gradient descent optimization algorithm, a loss function and the second labeling sample data to obtain a trained target recognition model.
In this embodiment, the explanation of the obstacle identifying apparatus may refer to the description in embodiment 1 or embodiment 2, and the description is not repeated in this embodiment.
Therefore, the obstacle recognition device described in this embodiment can custom-build labels and a label power set based on real data, so that a plurality of different types of obstacles can be detected and distinguished simultaneously (the obstacle types can be customized for different scenes and areas), and the more flexible labels improve the recognition effect in complex and special scenes. Meanwhile, the same image information can be multi-classified and combined over labels through one DCNN model, yielding a more accurate recognition result. Finally, the multi-label result can be fed back to systems such as automatic driving, adaptive cruise and intelligent parking assistance, providing more comprehensive environment perception for the vehicle, so that the vehicle runs more safely and intelligently, improving driving comfort, efficiency and safety.
An embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute the obstacle identifying method in embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer-readable storage medium storing computer program instructions that, when read and executed by a processor, perform the obstacle identifying method of embodiment 1 or embodiment 2 of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Claims (10)
1. A method of identifying an obstacle, comprising:
Pre-constructing an obstacle recognition model and a tag power set;
Acquiring current image data through a vehicle-mounted camera;
Processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result;
Determining a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
2. The obstacle recognition method of claim 1, wherein the pre-constructing the obstacle recognition model and the tag power set comprises:
acquiring image sample data of an on-board camera;
Constructing a deep network structure based on a convolutional neural network;
Processing the image sample data through the deep network structure to obtain obstacle position identification information and obstacle category identification information;
Constructing an obstacle category set;
Constructing a tag power set according to the obstacle position identification information, the obstacle category identification information and the obstacle category set;
Training the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model;
and optimizing the target recognition model to obtain an obstacle recognition model.
3. The obstacle recognition method of claim 2, wherein the deep network structure comprises a convolutional layer, a pooling layer, and a fully-connected layer, an activation function of the deep network structure is a nonlinear ReLU activation function, and a loss function of the deep network structure is a binary cross entropy loss function;
the obstacle category set at least comprises a motor vehicle category, a non-motor vehicle category, a pedestrian category, a traffic signal light category, a traffic sign category, a roadblock category, an animal category, a road facility category, a building category, a construction area category and a rain and snow accumulation category.
4. The obstacle identifying method according to claim 3, wherein the processing the image sample data through the deep network structure to obtain obstacle position identifying information and obstacle category identifying information includes:
performing obstacle labeling processing on the image sample data to obtain first labeling sample data;
Preprocessing the first labeling sample data to obtain sample data to be processed;
Performing feature capturing processing on the sample data to be processed through the convolution layer to obtain a feature map;
And carrying out dimension reduction processing on the feature map through the pooling layer to obtain obstacle position identification information and obstacle category identification information.
5. The obstacle recognition method according to claim 3, wherein the constructing a tag power set from the obstacle position recognition information, the obstacle category recognition information, and the obstacle category set includes:
Combining the labels in the obstacle class set to obtain a plurality of combined labels; wherein each of the combined tags is a proper subset of the set of obstacle categories;
Performing tag statistical analysis processing according to the obstacle position identification information and the obstacle category identification information to obtain the correlation between tags;
performing label elimination processing on the labels in the combined labels according to the correlation among the labels to obtain target combined labels;
And constructing a tag power set according to the target combination tag.
6. The obstacle recognition method according to claim 3, wherein the training the deep network structure by a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model includes:
labeling the image sample data through the label power set to obtain second labeled sample data;
Training the deep network structure through a preset stochastic gradient descent optimization algorithm, the loss function and the second labeling sample data to obtain a trained target recognition model.
7. An obstacle recognition device, characterized in that the obstacle recognition device comprises:
The construction unit is used for constructing an obstacle recognition model and a tag power set in advance;
The acquisition unit is used for acquiring current image data through the vehicle-mounted camera;
The processing unit is used for processing the current image data according to the obstacle recognition model to obtain a current obstacle detection result and a current obstacle classification result;
A determining unit, configured to determine a final obstacle recognition result based on the tag power set, a preset confidence threshold, the current obstacle detection result, and the current obstacle classification result; wherein the final obstacle recognition result includes obstacle existence information and obstacle category information.
8. The obstacle recognition device according to claim 7, wherein the construction unit includes:
the acquisition subunit is used for acquiring image sample data of the vehicle-mounted camera;
The first construction subunit is used for constructing a deep network structure based on the convolutional neural network;
the processing subunit is used for processing the image sample data through the deep network structure to obtain obstacle position identification information and obstacle category identification information;
A second construction subunit for constructing a set of obstacle categories;
A third construction subunit, configured to construct a tag power set according to the obstacle location identification information, the obstacle category identification information, and the obstacle category set;
the training subunit is used for training the deep network structure through a preset stochastic gradient descent optimization algorithm, the tag power set and the image sample data to obtain a trained target recognition model;
and the optimizing subunit is used for optimizing the target recognition model to obtain an obstacle recognition model.
9. An electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the obstacle recognition method of any one of claims 1 to 6.
10. A readable storage medium, wherein computer program instructions are stored in the readable storage medium, which computer program instructions, when read and executed by a processor, perform the obstacle recognition method of any one of claims 1 to 6.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202410498072.7A (CN118097625B) | 2024-04-24 | 2024-04-24 | Obstacle recognition method and device
Publications (2)

Publication Number | Publication Date
---|---
CN118097625A | 2024-05-28
CN118097625B | 2024-08-09

Family
ID=91144367

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202410498072.7A (Active) | Obstacle recognition method and device | 2024-04-24 | 2024-04-24

Country Status (1)

Country | Link
---|---
CN | CN118097625B (en)
Patent Citations (7)

Publication number | Priority date | Publication date | Title
---|---|---|---
WO2022077264A1 | 2020-10-14 | 2022-04-21 | Object recognition method, object recognition apparatus, and electronic device
CN112464921A | 2021-02-02 | 2021-03-09 | Obstacle detection information generation method, apparatus, device and computer readable medium
CN113128419A | 2021-04-23 | 2021-07-16 | Obstacle identification method and device, electronic equipment and storage medium
US20230150530A1 | 2021-11-18 | 2023-05-18 | Violation Inspection System Based on Visual Sensing of Self-Driving Vehicle and Method Thereof
CN116311157A | 2023-02-15 | 2023-06-23 | Obstacle recognition method and obstacle recognition model training method
CN117292352A | 2023-09-11 | 2023-12-26 | Obstacle recognition and avoidance method and trolley system for open world target detection
CN117315624A | 2023-10-08 | 2023-12-29 | Obstacle detection method, vehicle control method, device, apparatus, and storage medium
Also Published As

Publication Number | Publication Date
---|---
CN118097625B | 2024-08-09
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant