WO2021018144A1 - Indication lamp detection method, apparatus and device, and computer-readable storage medium - Google Patents

Indication lamp detection method, apparatus and device, and computer-readable storage medium

Info

Publication number
WO2021018144A1
WO2021018144A1 (PCT/CN2020/105223, CN2020105223W)
Authority
WO
WIPO (PCT)
Prior art keywords
classification
indicator light
indicator
image
display state
Prior art date
Application number
PCT/CN2020/105223
Other languages
French (fr)
Chinese (zh)
Inventor
何哲琪
马佳彬
王坤
曾星宇
Original Assignee
浙江商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江商汤科技开发有限公司
Priority to KR1020217020933A (KR20210097782A)
Priority to JP2021538967A (JP2022516183A)
Publication of WO2021018144A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular to an indicator light detection method, device, equipment and computer-readable storage medium.
  • Indicator light detection is a very important part of autonomous vehicle driving and autonomous robot driving. In the process of vehicle autonomous driving and robot autonomous driving, it is necessary to detect the indicator light in the road image captured by the camera, and determine its status and meaning, in order to make the correct decision in compliance with the traffic rules for safe driving.
  • the present disclosure provides an indicator light detection method, device, equipment and computer-readable storage medium.
  • an indicator light detection method, which includes: recognizing a collected road image to obtain a candidate bounding box of the indicator light in the road image; predicting, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light to obtain prediction results of the multiple classifications of the indicator light, where the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification; and determining the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
  • an indicator light detection device, which includes: an identification unit for recognizing the collected road image to obtain a candidate bounding box of the indicator light in the road image; a prediction unit for predicting, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light to obtain prediction results of the multiple classifications of the indicator light, where the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification; and a determining unit for determining the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
  • in a third aspect, an indicator light detection device is provided, which includes a memory and a processor; the memory is used to store computer instructions that can be run on the processor, and the processor is used to implement the above-described method when executing the computer instructions.
  • a computer-readable storage medium on which a computer program is stored, and the program is executed by a processor to implement the above-mentioned method.
  • a computer program including computer-readable code, which, when executed by a computer, implements the method described in the embodiments of the present disclosure.
  • the indicator light detection method, device, equipment, and computer-readable storage medium of one or more embodiments of the present disclosure detect and classify indicator lights using road images collected by a camera, without relying on high-precision sensors, which can effectively reduce the equipment hardware cost required for indicator light detection. In the process of recognizing the image area corresponding to the candidate bounding box of the indicator light in the road image, the indicator lights are divided with clear classification logic and classified from multiple aspects and multiple dimensions, so that the obtained prediction results of the multiple classifications can cover the indicator lights in their respective situations as much as possible. This is helpful for judging the display state of the indicator lights in various situations and can effectively improve the comprehensiveness and accuracy of indicator light detection.
  • Fig. 1 is a schematic flowchart of a method for detecting an indicator light according to an exemplary embodiment of the present disclosure
  • Fig. 2 is a logical schematic diagram showing a classification of indicator lights according to an exemplary embodiment of the present disclosure
  • Fig. 3A is a schematic structural diagram of a neural network model shown in an exemplary embodiment of the present disclosure
  • FIG. 3B is a schematic flow diagram of a variety of classification and prediction methods for indicator lights using the neural network model in FIG. 3A;
  • Fig. 4 is a schematic diagram showing a detection result of an indicator light according to an exemplary embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of a method for judging whether a detected indicator light is an indicator light at the same position according to an exemplary embodiment of the present disclosure
  • Fig. 6 is a schematic flowchart of a method for training a neural network model according to an exemplary embodiment of the present disclosure
  • Fig. 7 is a schematic structural diagram of an indicator light detection device according to an exemplary embodiment of the present disclosure.
  • Fig. 8 is a structural diagram of an indicator light detection device according to an exemplary embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of a method for detecting an indicator light according to an embodiment of the disclosure. As shown in FIG. 1, the method in this embodiment includes steps 110 to 130.
  • In step 110, the collected road image is recognized to obtain the candidate bounding box of the indicator light in the road image.
  • At least one image acquisition device (such as a camera, etc.) arranged on or around the smart device is used to collect road images around the smart device.
  • By recognizing the collected road image, for example by inputting it into a pre-trained neural network model, the candidate bounding box of the indicator light in the road image can be obtained.
  • the indicator light includes, for example, a traffic signal light, a railway signal light, etc., and the present disclosure does not limit the type of the indicator light.
  • In step 120, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light are predicted to obtain prediction results of the multiple classifications of the indicator light.
  • the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification.
  • the multiple classifications refer to the classification of the indicator lights in multiple aspects and multiple dimensions.
  • the indicator light classification logic can be designed so that the indicator light classification covers various types of indicator lights.
  • At least two of the multiple classifications are predicted, and the sub-category of each of these two classifications needs to be predicted.
  • In step 130, the display state of the indicator light is determined according to the prediction results of the multiple classifications of the indicator light.
  • the display state of the indicator light is determined according to the prediction result of each of the at least two categories.
  • Using the road image collected by the camera to detect and classify the indicator light, instead of relying on high-precision sensors, can effectively reduce the hardware cost of the equipment required for indicator light detection. The indicator lights are divided with clear classification logic and classified from multiple aspects and multiple dimensions, so that the prediction results of the multiple classifications can cover the indicator lights in various situations as much as possible, which is helpful for judging the display state of the indicator lights in various situations, thereby effectively improving the comprehensiveness and accuracy of indicator light detection.
  • Figure 2 shows an exemplary indicator light classification logic.
  • The use classification may include, for example, whether the indicator light is used to indicate pedestrians or to indicate vehicles. In the case that the indicator light is used to indicate vehicles, the shape classification may include, for example, whether the indicator light is a full-screen light (which may also be called a circular light) or an arrow light; the arrangement classification may include, for example, whether the indicator light belongs to a horizontal arrangement, a vertical arrangement, or is a single indicator light; the function classification may include, for example, whether the indicator light is an ordinary indicator light, a warning light, or an Electronic Toll Collection (ETC) indicator light; the color classification may include, for example, whether the indicator light is red, yellow, green, or of unknown color (corresponding to the case where it is not lit); and in the case that the indicator light is an arrow light, the orientation classification may include, for example, left, right, front, front-left, or front-right.
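  • For illustration, this Fig. 2 classification logic can be written out as a plain data structure; the sketch below is only one possible rendering, and the English label strings are the editor's assumptions rather than the patent's own terms.

```python
# Illustrative taxonomy of the six classifications described above.
INDICATOR_TAXONOMY = {
    "use":         ["pedestrian", "vehicle"],
    "shape":       ["full-screen (circular)", "arrow"],       # applies to vehicle lights
    "arrangement": ["horizontal", "vertical", "single"],
    "function":    ["ordinary", "warning", "ETC"],
    "color":       ["red", "yellow", "green", "unknown"],      # unknown = not lit
    "orientation": ["left", "right", "front", "front-left", "front-right"],  # arrow lights only
}
```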
  • The neural network model can be trained in advance using road images with label information (which may be called sample images); the training process will be detailed later. The trained neural network model can identify the indicator lights from the input road image and predict the identified indicator lights on multiple classifications to obtain prediction results of the multiple classifications.
  • multiple sub-network branches included in the neural network model may be used to respectively predict multiple categories of the indicator light, and obtain prediction results of multiple categories of the indicator light.
  • the number of sub-network branches is the same as the number of the multiple categories, and each sub-network branch is used to identify a sub-category of one of the multiple categories.
  • FIG. 3A shows a schematic diagram of the network structure of a neural network model provided by at least one embodiment of the present disclosure.
  • the neural network model includes a feature extraction layer 301, a region candidate (Region Proposal Network) layer 302, a pooling layer 303, and a fully connected layer 304.
  • the fully connected layer 304 includes a regression branch 3041 and multiple sub-network branches 3042.
  • the fully connected layer 304 further includes a convolutional layer, and the convolutional layer is connected to the pooling layer 303.
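  • As a concrete but purely illustrative reading of this structure, the PyTorch sketch below shows a fully connected head with one regression branch and one classification branch per classification, fed by ROI features pooled from a backbone feature map. The layer sizes, sub-category counts, and feature-map dimensions are assumptions, not the patent's actual configuration, and the region proposal layer is assumed to supply the candidate boxes.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class IndicatorLightHead(nn.Module):
    """Sketch of fully connected layer 304: one regression branch (3041) plus one
    classification branch (sub-network branch 3042) per classification."""
    def __init__(self, in_channels=256, roi_size=7,
                 subcategory_counts=(2, 2, 3, 3, 4, 5)):
        # subcategory_counts: use, shape, arrangement, function, color, orientation (assumed)
        super().__init__()
        self.roi_size = roi_size
        feat_dim = in_channels * roi_size * roi_size
        self.shared = nn.Sequential(nn.Linear(feat_dim, 1024), nn.ReLU())
        self.regression = nn.Linear(1024, 4)                       # refined box coordinates
        self.branches = nn.ModuleList(nn.Linear(1024, n) for n in subcategory_counts)

    def forward(self, feature_map, rois):
        # rois: Tensor[K, 5] of (batch_index, x1, y1, x2, y2) from the region proposal step
        pooled = roi_align(feature_map, rois, output_size=(self.roi_size, self.roi_size))
        x = self.shared(pooled.flatten(1))
        box_deltas = self.regression(x)
        logits = [branch(x) for branch in self.branches]            # one tensor per classification
        return box_deltas, logits

head = IndicatorLightHead()
boxes, logits = head(torch.randn(1, 256, 100, 168),
                     torch.tensor([[0., 30., 12., 46., 40.]]))
print(boxes.shape, [l.shape for l in logits])   # torch.Size([1, 4]) and six [1, n] tensors
```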
  • FIG. 3B shows a schematic flowchart of a method for applying the neural network model in FIG. 3A to perform various classification predictions of indicator lights. As shown in FIG. 3B, the method includes steps 310-340.
  • In step 310, the feature map of the road image is obtained through the feature extraction layer.
  • The feature extraction layer 301 is used to extract features of the input road image. It can be a convolutional neural network, for example an existing Visual Geometry Group (VGG) network, a residual network (ResNet), or a densely connected network (DenseNet), or other convolutional neural network structures can be adopted.
  • the present disclosure does not limit the specific structure of the feature extraction layer 301.
  • The feature extraction layer 301 may include network units such as convolutional layers, activation layers, and pooling layers, stacked in a certain manner. Among them, a convolutional layer can extract different features from the input road image through multiple convolution kernels to obtain multiple feature maps. A pooling layer is located after a convolutional layer and can perform local averaging and down-sampling operations on the feature maps to reduce their resolution. As the number of convolutional layers and pooling layers increases, the number of feature maps gradually increases, and the resolution of the feature maps gradually decreases.
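  • Purely for illustration, a stand-in stack of convolution and pooling blocks of this kind might look as follows; the channel counts and input size are arbitrary assumptions, not the patent's feature extraction layer 301.

```python
import torch
import torch.nn as nn

# Each block: convolution (more feature maps) followed by pooling (lower resolution).
backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(128, 256, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
feature_map = backbone(torch.randn(1, 3, 800, 1344))
print(feature_map.shape)   # torch.Size([1, 256, 100, 168]) -- more maps, lower resolution
```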
  • In step 320, the feature map is processed through the region candidate layer to generate the candidate bounding box of the indicator light in the road image.
  • the area candidate layer 302 is used to predict the candidate bounding box of the indicator light, that is, to generate prediction information of the candidate bounding box.
  • the regional candidate layer 302 may be a regional candidate network (RPN, Region Proposal Network), and the present disclosure does not limit the specific structure of the regional candidate layer 302.
  • the regional candidate layer 302 may include network units such as a convolutional layer, a classification layer, and a regression layer, which are formed by stacking the foregoing network units in a certain manner.
  • The convolutional layer uses a sliding window (for example, 3×3) to convolve the input feature map; each window position corresponds to multiple anchor boxes, and each window generates a fully connected vector for the classification layer and the regression layer.
  • the classification layer is used to determine whether the image area in the candidate bounding box generated by the anchor point box is the foreground or the background.
  • The regression layer is used to obtain the approximate position of the candidate bounding box. Based on the output results of the classification layer and the regression layer, the candidate bounding box containing the indicator light can be predicted, and the probabilities that the image area in the candidate bounding box is foreground or background, as well as the position parameters of the candidate bounding box, are output.
  • the generated candidate bounding box may be called Region Proposal, and in subsequent steps, the region may be called Region of Interest (ROI).
  • In step 330, an image feature of a set size corresponding to the candidate bounding box in the feature map is obtained through the pooling layer.
  • The pooling layer 303 is used to extract a feature map of a set size (a fixed size, for example 7×7) for each ROI; that is, the ROIs of different sizes obtained in step 320 are mapped into regions of the same size. This process can be called ROI Pooling.
  • the pooling layer 303 may also be used to perform feature extraction on regions of the same size to obtain the image features corresponding to the ROI.
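  • A minimal illustration of this step, using torchvision's roi_align as a stand-in for the pooling layer 303 (the feature-map size and ROI coordinates are made up): ROIs of different sizes are mapped to the same fixed 7×7 feature.

```python
import torch
from torchvision.ops import roi_align

feature_map = torch.randn(1, 256, 100, 168)            # assumed backbone output
rois = torch.tensor([[0., 30., 12., 46., 40.],          # (batch_idx, x1, y1, x2, y2)
                     [0., 80., 20., 90., 55.]])         # two ROIs of different sizes
fixed = roi_align(feature_map, rois, output_size=(7, 7))
print(fixed.shape)                                       # torch.Size([2, 256, 7, 7])
```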
  • In step 340, the prediction results of the multiple classifications of the indicator light are obtained through the fully connected layer 304.
  • the fully connected layer 304 includes a regression branch 3041 and multiple sub-network branches 3042.
  • the regression branch 3041 and each sub-network branch 3042 can perform further feature extraction through a 1 ⁇ 1 convolution kernel respectively.
  • the 1 ⁇ 1 convolution kernel has different parameters for each channel, which is equivalent to a fully connected function.
  • the regression branch 3041 regresses the aforementioned candidate bounding box, and corrects the position of the candidate bounding box, so as to more accurately locate the bounding box of the indicator light.
  • the regression branch 3041 uses the conversion relationship between the candidate bounding box and the real bounding box obtained through learning in the training process to predict the bounding box of the indicator light, that is, predict the position information of the bounding box of the indicator light.
  • The position information can be expressed as (x1, y1, x2, y2), where (x1, y1) are the coordinates of the upper-left corner of the predicted bounding box and (x2, y2) are the coordinates of the lower-right corner; the position information can also be expressed as (x, y, w, h), where (x, y) are the coordinates of the center point of the predicted bounding box, and w and h are the width and height of the predicted bounding box, respectively.
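  • The two position formats are interchangeable; a small sketch of the conversion, using the coordinate conventions described above:

```python
# Corner form (x1, y1, x2, y2) <-> center form (x, y, w, h).
def corners_to_center(x1, y1, x2, y2):
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1)

def center_to_corners(x, y, w, h):
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)

assert center_to_corners(*corners_to_center(10, 20, 50, 60)) == (10.0, 20.0, 50.0, 60.0)
```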
  • each sub-network branch 3042 is used to identify a sub-category of one of the above-mentioned multiple categories.
  • The sub-network branches can obtain the prediction results of the multiple classifications of the indicator light in the following manner: using the image feature corresponding to the candidate bounding box and a first sub-network branch of the multiple sub-network branches, predict a first classification among the multiple classifications of the indicator light to obtain the predicted probabilities of at least two sub-categories corresponding to the first classification; and mark the sub-category with the highest predicted probability among the at least two sub-categories as the sub-category of the indicator light in the first classification.
  • the first sub-network branch may be any one of a plurality of sub-network branches, and the first category may be any one of multiple categories.
  • the first sub-network branch is a sub-network branch for shape classification, which can obtain the predicted probabilities of the two sub-categories (full-screen lights and arrow lights) corresponding to the shape classification.
  • For example, the predicted probability of the full-screen light is 90%, and the predicted probability of the arrow light is 10%. The sub-network branch then marks the sub-category with the highest predicted probability, that is, the full-screen light, as the sub-category of the indicator light under the shape classification.
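  • In code form this amounts to taking the arg-max over the branch's predicted probabilities; a trivial sketch with the example numbers above:

```python
# Pick the sub-category with the highest predicted probability for the shape branch.
shape_probs = {"full-screen": 0.9, "arrow": 0.1}
predicted_shape = max(shape_probs, key=shape_probs.get)
print(predicted_shape)   # "full-screen"
```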
  • the following methods are used to determine the display state of the indicator light:
  • In the case where the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate a vehicle, and the prediction result of the shape classification indicates that the indicator light is a circular light, the prediction results corresponding to the arrangement classification, the function classification, and the color classification are combined to obtain the first display state of the indicator light;
  • In the case where the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate a vehicle, and the prediction result of the shape classification indicates that the indicator light is an arrow light, the prediction results corresponding to the color classification and the orientation classification are combined to obtain the second display state of the indicator light;
  • In the case where the multiple classifications include the use classification, and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, the third display state of the indicator light is obtained.
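  • A hedged sketch of these three combination rules as a plain function; the dictionary keys and state labels are illustrative stand-ins, not the patent's actual data structures.

```python
def display_state(pred: dict) -> dict:
    """Combine the per-classification predictions into a display state."""
    if pred.get("use") == "pedestrian":
        # Third display state: pedestrian light.
        return {"state": "third", "color": pred.get("color")}
    if pred.get("use") == "vehicle" and pred.get("shape") == "full-screen":
        # First display state: combine arrangement, function and color.
        return {"state": "first", "arrangement": pred.get("arrangement"),
                "function": pred.get("function"), "color": pred.get("color")}
    if pred.get("use") == "vehicle" and pred.get("shape") == "arrow":
        # Second display state: combine color and orientation.
        return {"state": "second", "color": pred.get("color"),
                "orientation": pred.get("orientation")}
    return {"state": "unknown"}

print(display_state({"use": "vehicle", "shape": "arrow",
                     "color": "green", "orientation": "left"}))
```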
  • Figure 4 shows an exemplary output result of the indicator light detection.
  • The detection result includes the predicted bounding box of the indicator light and the prediction results of the multiple classifications of the indicator light, and also includes the confidence score of the bounding box obtained by this prediction.
  • the confidence score comprehensively reflects the possibility of an indicator light in the predicted bounding box and the accuracy of the predicted bounding box position.
  • the bounding boxes of the three predicted indicators are output.
  • the classification prediction result of the indicator in the black rectangular bounding box is "red horizon", which means that the indicator is a horizontally arranged, red ordinary indicator, and the confidence score of the predicted bounding box is 1.0. Since the classification prediction result of the ordinary indicator light is set to not be displayed (hidden), the display state of the indicator light corresponds to the first display state.
  • The classification prediction result of the indicator light in the black square bounding box is "unknown color alone", that is, the indicator light is a single, unlit indicator light. Since these two classification prediction results are set not to be displayed, they are not shown on the image; this also corresponds to the first display state. The confidence score of the bounding box obtained by this prediction is 0.98.
  • The classification prediction result of the indicator light in the white square is "green arrow left", which means that the indicator light is a single, arrow-shaped indicator light whose color is green and which points to the left; the confidence score of the bounding box obtained by this prediction is 0.99.
  • the display state of the indicator light corresponds to the second display state.
  • Determining the display status of the indicator light also includes judging whether the indicator light is always on or flashing, so as to better guide the smart device's decision-making for automatic driving, so that the smart device can comply with traffic rules and realize safe driving.
  • The indicator lights at the same position in the multiple road images collected within a set time, and the prediction results of the multiple classifications of these indicator lights, are obtained; the display state of the indicator light is then determined according to the prediction results of the multiple classifications of the indicator light at the same position in the multiple road images.
  • the indicator lights at the same position in the multiple road images may be obtained by collecting consecutive frames of road images.
  • the continuous frame image may be a continuously captured multi-frame image, or may be a target frame selected every few frames from the continuously captured multi-frame image, and the continuously selected multiple target frames are regarded as continuous frames.
  • Fig. 5 shows a schematic flowchart of a method for judging whether the detected indicator light is an indicator light at the same position. As shown in Fig. 5, the method includes:
  • In step 510, the position of the indicator light in the first frame of the continuous frames of road images is obtained.
  • the predicted position of the indicator lamp in the initial frame within the set time is obtained.
  • The position of the indicator light in the image can be obtained through the neural network model, that is, the position parameters of the predicted indicator light bounding box can be obtained; alternatively, the position of the detected indicator light in the image can be obtained through image processing.
  • In step 520, according to the position of the indicator light in the first frame of image, the movement speed of the device that captured the road images, and its shooting frequency, the first position of the indicator light in each frame of the continuous frames of road images other than the first frame is calculated.
  • the device that takes the road image is the image acquisition device (such as a camera, etc.).
  • the movement speed of the device is the same as the movement speed of an autonomous smart device.
  • The shooting frequency can be preset, or it can be obtained by reading the configuration of the device.
  • Since the position of the indicator light in the image in the initial frame is known, the theoretical position of the indicator light in each subsequent frame (the other frames, that is, the images in the continuous frames other than the first frame) can be calculated according to the movement speed of the device and the shooting frequency. This position is called the first position to distinguish it from the positions in the subsequent steps.
  • In step 530, the second position of the indicator light in the other frames of the continuous frames of images is obtained.
  • In the other frames, the position of the detected indicator light in the image can be predicted by the neural network model, or the position of the detected indicator light in the image can be obtained through image processing; this position is called the second position.
  • In step 540, for each frame of the other frames of images, in the case where the difference between the second position and the first position is less than a set value, it is determined that the indicator lights detected in the continuous frames of road images are the indicator lights at the same position.
  • The second position of the indicator light in the image detected in step 530 should be close to the first position calculated in step 520. Therefore, for each subsequent frame, when the difference between the second position and the first position is less than the set value, it can be determined that the detected indicator lights in the consecutive frames of road images are the indicator lights at the same position; otherwise, it is judged that they are not the indicator lights at the same position. In the case that the detected indicator light is not the indicator light at the same position, there is no need to perform the subsequent step of determining the status of the indicator light.
  • the above set value can be set according to the required detection accuracy.
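  • A simplified sketch of steps 510-540 under strong assumptions: the mapping from vehicle motion to pixel displacement is reduced to a single made-up calibration constant and the light is assumed to drift along one image axis; a real implementation would use the camera geometry.

```python
def expected_positions(first_pos, speed_mps, frame_rate_hz, num_frames, pixels_per_meter=2.0):
    """First positions: extrapolate from the first-frame position using device speed and frame rate."""
    x, y = first_pos
    dt = 1.0 / frame_rate_hz
    # Assumption: the light drifts vertically in the image as the vehicle approaches it.
    return [(x, y + speed_mps * dt * (i + 1) * pixels_per_meter) for i in range(num_frames)]

def same_light(first_pos, detected, speed_mps, frame_rate_hz, tol=15.0):
    """Compare second (detected) positions with first (expected) positions against a set value."""
    expected = expected_positions(first_pos, speed_mps, frame_rate_hz, len(detected))
    return all(abs(dx - ex) <= tol and abs(dy - ey) <= tol
               for (dx, dy), (ex, ey) in zip(detected, expected))

print(same_light((320, 100), [(321, 102), (319, 105)], speed_mps=10, frame_rate_hz=10))  # True
```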
  • After obtaining the indicator lights at the same position in the multiple road images, whether the indicator light is always on or flashing can be judged by comparing whether the prediction results of the multiple classifications of the indicator light at the same position in the multiple road images remain unchanged or have changed.
  • If the color classification prediction results of the indicator lights at the same position are the same, it means that the color of the indicator light has not changed within the set time (for example, 3 seconds), so it can be determined that the indicator light is always on. It should be noted that the same color classification prediction result here does not include the unknown color.
  • If the color classification prediction results of the indicator lights at the same position change at intervals, for example, the prediction result of the color classification is green for a period of time and unknown for another period of time (or no light can be detected at this position), and the two situations appear alternately, it means that the color of the indicator light has changed alternately within the set time, so it can be judged that the indicator light is flashing.
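  • A minimal sketch of this steady/flashing judgment over the color predictions collected within the set time window; the label strings and the extra "off"/"changed" outcomes are illustrative additions.

```python
def judge_display(colors):
    """colors: per-frame color predictions for the same-position light within the set time."""
    known = [c for c in colors if c != "unknown"]
    if not known:
        return "off"
    if len(set(colors)) == 1:                          # same non-unknown color throughout
        return "always on"
    if len(set(known)) == 1 and "unknown" in colors:   # one color alternating with unknown
        return "flashing"
    return "changed"                                   # e.g. switched from green to red

print(judge_display(["green"] * 6))                              # always on
print(judge_display(["green", "unknown", "green", "unknown"]))   # flashing
```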
  • FIG. 6 is a schematic flowchart of a training method of a neural network model according to an embodiment of the disclosure. As shown in Figure 6, the method in this embodiment includes:
  • In step 610, the sample image containing the indicator light is input to the neural network model to obtain the prediction results of the multiple classifications and the bounding box prediction result of the indicator light.
  • The sample image input to the neural network model may be a road image containing indicator lights, and the sample image is pre-labeled with indicator light information; the label information contains the true bounding box information of the indicator light, for example, the position of the bounding box, and the label information also includes the multiple classification information of the indicator light.
  • Inputting the sample image to the initialized neural network model can predict multiple classification prediction results of the indicator lights in the sample image and the bounding box prediction results.
  • In step 620, the loss value of the loss function is calculated according to the prediction results of the multiple classifications and the bounding box prediction result, as well as the multiple classification information and the true bounding box information.
  • the loss value of the loss function represents the difference between the predicted multiple classification results and the predicted bounding box, and the pre-labeled multiple classification information and the true bounding box information.
  • In step 630, the network parameters of the neural network model are adjusted according to the loss value.
  • The loss value determined based on the loss function is passed back to the neural network model to adjust the network parameters, for example adjusting the values of the convolution kernels of each layer, the weight parameters of each layer, and so on.
  • The training samples can be divided into multiple image subsets (batches). In each training iteration, one image subset is input to the neural network model in turn, and the network parameters are adjusted according to the loss value calculated from the prediction results of the samples in that image subset. After this iteration is completed, the next image subset is input to the neural network model for the next iteration of training.
  • the training samples included in different image subsets are at least partially different.
  • Training ends when a predetermined ending condition is met; the predetermined ending condition may be, for example, that the loss value is reduced to a certain threshold, or that a predetermined number of iterations of the neural network model is reached.
  • The neural network model training method of this implementation uses sample images pre-labeled with the classification information and the true bounding boxes of the indicator lights to train the neural network model, so that the trained neural network model can detect the indicator lights in the input image and predict the multiple classifications of the indicator lights.
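  • A generic PyTorch-style sketch of one training pass over steps 610-630; the model interface (boxes plus one logit tensor per classification branch), the dataset fields, and the simple loss sum are assumptions for illustration, not the patent's actual loss function.

```python
import torch.nn.functional as F
from torch.utils.data import DataLoader

def train_one_epoch(model, dataset, optimizer, batch_size=8):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)   # image subsets (batches)
    for images, gt_boxes, gt_labels in loader:        # gt_labels: one label tensor per classification
        pred_boxes, logits = model(images)             # step 610: forward pass on sample images
        loss = F.smooth_l1_loss(pred_boxes, gt_boxes)  # step 620: bounding-box regression loss
        for branch_logits, branch_labels in zip(logits, gt_labels):
            loss = loss + F.cross_entropy(branch_logits, branch_labels)  # one loss per branch
        optimizer.zero_grad()
        loss.backward()                                # step 630: pass the loss value back
        optimizer.step()                               # adjust the network parameters
```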
  • the neural network model to be trained is the neural network model used in the above embodiment of the indicator light detection method. Its structure is as shown in FIG. 3A. The only difference is that the input image is a sample image containing annotation information.
  • Obtaining the prediction results of the indicator light based on the sample image may include: obtaining a feature map of the sample image through the feature extraction layer; processing the feature map through the region candidate layer to generate the candidate bounding box of the indicator light in the sample image; obtaining the image feature of the set size corresponding to the candidate bounding box in the feature map through the pooling layer; and obtaining the prediction results of the multiple classifications of the indicator light and the bounding box prediction result through the fully connected layer.
  • the indicator light prediction process in the training process is similar to the indicator light prediction process in the indicator light detection method described above, and the detailed process can refer to the description in the indicator light detection method embodiment.
  • FIG. 7 provides an indicator light detection device. As shown in FIG. 7, the device may include: an identification unit 701, a prediction unit 702, and a determination unit 703.
  • the recognition unit 701 is used to recognize the collected road image to obtain the candidate bounding box of the indicator in the road image;
  • the prediction unit 702 is used to predict, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light to obtain prediction results of the multiple classifications of the indicator light, where the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification;
  • the determining unit 703 is configured to determine the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
  • The prediction unit 702 is configured to: use a neural network model to perform feature extraction on the image region corresponding to the candidate bounding box to obtain the image feature corresponding to the candidate bounding box; and use the image feature corresponding to the candidate bounding box and the multiple sub-network branches included in the neural network model to respectively predict the multiple classifications of the indicator light, to obtain prediction results of the multiple classifications of the indicator light; wherein the number of the multiple sub-network branches is the same as the number of the multiple classifications, and each sub-network branch is used to identify a sub-category of one of the multiple classifications.
  • The prediction unit 702 is configured to: use the image feature corresponding to the candidate bounding box and a first sub-network branch of the multiple sub-network branches to predict a first classification among the multiple classifications of the indicator light, to obtain the predicted probabilities of at least two sub-categories corresponding to the first classification; and mark the sub-category with the highest predicted probability among the at least two sub-categories as the sub-category of the indicator light in the first classification.
  • The determining unit 703 is configured to: in the case where the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate a vehicle, and the prediction result of the shape classification indicates that the indicator light is a circular light, combine the prediction results corresponding to the arrangement classification, the function classification, and the color classification to obtain the first display state of the indicator light; or, in the case where the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate a vehicle, and the prediction result of the shape classification indicates that the indicator light is an arrow light, combine the prediction results corresponding to the color classification and the orientation classification to obtain the second display state of the indicator light; or, in the case where the multiple classifications include the use classification, and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, obtain the third display state of the indicator light.
  • The determining unit 703 is configured to: obtain the indicator lights at the same position in the multiple road images collected within a set time and the prediction results of the multiple classifications of these indicator lights; and determine the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light at the same position in the multiple road images.
  • The multiple road images are continuous frames of road images; the determining unit 703 is configured to: obtain the position of the indicator light in the first frame of the continuous frames of road images; according to the position of the indicator light in the first frame of image, the movement speed of the device that captured the road images, and its shooting frequency, calculate the first position of the indicator light in each frame of the continuous frames of road images other than the first frame; obtain the second position of the indicator light in the other frames of the continuous frames of images; and, for each frame of the other frames of images, in the case where the difference between the second position and the first position is less than a set value, determine that the indicator lights detected in the continuous frames of road images are the indicator lights at the same position.
  • The display state of the indicator light includes: always on or flashing. The determining unit 703 is configured to: in the case where the color classification prediction results of the indicator lights at the same position in the multiple road images are the same, determine that the display state of the indicator light is always on; and in the case where the color classification prediction results of the indicator lights at the same position in the multiple images change at intervals, determine that the display state of the indicator light is flashing.
  • Fig. 8 is an indicator light detection device provided by at least one embodiment of the present disclosure.
  • the device includes a memory and a processor.
  • the memory is used to store computer instructions that can run on the processor.
  • The processor is used to implement the indicator light detection method described in any embodiment of this specification when executing the computer instructions.
  • At least one embodiment of this specification also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the indicator light detection method described in any embodiment of this specification is implemented.
  • the embodiment of the present disclosure provides a computer program, including computer readable code, which, when executed by a computer, implements the indicator light detection method described in any embodiment of the present disclosure.
  • The writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible inner logic.
  • the functions or modules contained in the device provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments.
  • the computer-readable storage medium may be in various forms.
  • The machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard drive), a solid state drive, any type of storage disk (such as an optical disc or DVD), or a similar storage medium, or a combination thereof.
  • The computer-readable medium may also be paper or another suitable medium on which the program can be printed. From such media, the program can be obtained electronically (for example, by optical scanning), then compiled, interpreted, and processed in a suitable manner, and then stored in a computer medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present disclosure relates to an indication lamp detection method, apparatus and device, and a computer-readable storage medium. The method comprises: identifying a collected road image to obtain candidate boundary boxes of an indication lamp in the road image; predicting, according to image regions corresponding to the candidate boundary boxes in the road image, various classifications for the indication lamps to obtain a prediction result of the classifications of the indication lamp, wherein the classifications comprise at least two of the following: a usage classification, a shape classification, an arrangement classification, a function classification, a color classification and an orientation classification; and determining an exhibition state of the indication lamp according to the prediction result of the classifications of the indication lamp.

Description

Indicator lamp detection method, device, equipment and computer-readable storage medium
Cross-reference to related applications
This disclosure claims the priority of a Chinese patent application filed with the Chinese Patent Office on July 31, 2019, with application number CN2019107037635 and the invention title "Indicator lamp detection method, device, equipment and computer-readable storage medium", the entire content of which is incorporated into this disclosure by reference.
Technical field
The present disclosure relates to the field of computer vision technology, and in particular to an indicator light detection method, device, equipment and computer-readable storage medium.
Background
Indicator light detection is a very important part of autonomous vehicle driving and autonomous robot driving. In the process of vehicle autonomous driving and robot autonomous driving, it is necessary to detect the indicator lights in the road images captured by the camera and determine their status and meaning, in order to make correct decisions in compliance with the traffic rules for safe driving.
Summary of the invention
In order to overcome the problems in the related art, the present disclosure provides an indicator light detection method, device, equipment and computer-readable storage medium.
In a first aspect, an indicator light detection method is provided, which includes: recognizing a collected road image to obtain a candidate bounding box of the indicator light in the road image; predicting, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light to obtain prediction results of the multiple classifications of the indicator light, where the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification; and determining the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
In a second aspect, an indicator light detection device is provided, which includes: an identification unit for recognizing the collected road image to obtain a candidate bounding box of the indicator light in the road image; a prediction unit for predicting, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light to obtain prediction results of the multiple classifications of the indicator light, where the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification; and a determining unit for determining the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
In a third aspect, an indicator light detection device is provided. The device includes a memory and a processor; the memory is used to store computer instructions that can be run on the processor, and the processor is used to implement the above-described method when executing the computer instructions.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, the above-described method is implemented.
In a fifth aspect, a computer program is provided, including computer-readable code, which, when executed by a computer, implements the method described in the embodiments of the present disclosure.
The indicator light detection method, device, equipment, and computer-readable storage medium of one or more embodiments of the present disclosure detect and classify indicator lights using road images collected by a camera, without relying on high-precision sensors, which can effectively reduce the equipment hardware cost required for indicator light detection. In the process of recognizing the image area corresponding to the candidate bounding box of the indicator light in the road image, the indicator lights are divided with clear classification logic and classified from multiple aspects and multiple dimensions, so that the obtained prediction results of the multiple classifications can cover the indicator lights in their respective situations as much as possible, which is helpful for judging the display state of the indicator lights in various situations, and can effectively improve the comprehensiveness and accuracy of indicator light detection.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and cannot limit the present disclosure.
Description of the drawings
The drawings here are incorporated into the specification and constitute a part of the specification; they show embodiments conforming to the specification and are used together with the specification to explain its principles.
Fig. 1 is a schematic flowchart of a method for detecting an indicator light according to an exemplary embodiment of the present disclosure;
Fig. 2 is a schematic diagram of an indicator light classification logic according to an exemplary embodiment of the present disclosure;
Fig. 3A is a schematic structural diagram of a neural network model according to an exemplary embodiment of the present disclosure;
Fig. 3B is a schematic flowchart of a method for performing multiple classification predictions of indicator lights using the neural network model in Fig. 3A;
Fig. 4 is a schematic diagram of an indicator light detection result according to an exemplary embodiment of the present disclosure;
Fig. 5 is a schematic flowchart of a method for judging whether detected indicator lights are indicator lights at the same position according to an exemplary embodiment of the present disclosure;
Fig. 6 is a schematic flowchart of a method for training a neural network model according to an exemplary embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an indicator light detection device according to an exemplary embodiment of the present disclosure;
Fig. 8 is a structural diagram of an indicator light detection device according to an exemplary embodiment of the present disclosure.
Detailed description
Exemplary embodiments will be described in detail here, and examples thereof are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings indicate the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with the present disclosure; rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The term "and/or" in this document only describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean that A exists alone, A and B exist at the same time, or B exists alone. In addition, the term "at least two" in this document means any two of multiple items, or any combination of at least two of multiple items; for example, including at least two of A, B, and C may mean including any two or more elements selected from the set formed by A, B, and C.
FIG. 1 is a schematic flowchart of a method for detecting an indicator light according to an embodiment of the disclosure. As shown in FIG. 1, the method in this embodiment includes steps 110 to 130.
In step 110, the collected road image is recognized to obtain the candidate bounding box of the indicator light in the road image.
During the driving process of smart devices such as vehicles or robots, at least one image acquisition device (such as a camera) arranged on or around the smart device is used to collect road images around the smart device.
By recognizing the collected road image, for example by inputting it into a pre-trained neural network model, the candidate bounding box of the indicator light in the road image can be obtained. The indicator light includes, for example, a traffic signal light, a railway signal light, etc.; the present disclosure does not limit the type of the indicator light.
In step 120, according to the image area corresponding to the candidate bounding box in the road image, multiple classifications of the indicator light are predicted to obtain prediction results of the multiple classifications of the indicator light.
The multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and orientation classification.
The multiple classifications refer to the classification of the indicator light in multiple aspects and multiple dimensions. The indicator light classification logic can be designed so that the classification covers various types of indicator lights.
In the embodiment of the present disclosure, at least two of the multiple classifications are predicted, and the sub-category of each of these two classifications needs to be predicted.
In step 130, the display state of the indicator light is determined according to the prediction results of the multiple classifications of the indicator light.
Corresponding to the different situations of the prediction results of each classification, different display states of the indicator light can be determined.
In the embodiment of the present disclosure, the display state of the indicator light is determined according to the prediction result of each of the at least two classifications.
本实施例通过利用摄像头采集的道路图像进行指示灯的检测与分类,而可以不必依赖高精度的传感器等,可以有效降低实现指示灯检测所需的设备硬件成本;在通过对道路图像中指示灯的候选边界框对应的图像区域进行识别的过程中,通过对指示灯进行了清晰的分类逻辑的划分,从多个方面、多个维度对指示灯进行的分类,使得到的多种分类的预测结果能够尽可能覆盖各种情况下的指示灯,有利于判断各种情况下的指示灯的展示状态,从而可以有效提高指示灯检测的全面性和准确性。In this embodiment, the road image collected by the camera is used to detect and classify the indicator light, instead of relying on high-precision sensors, etc., it can effectively reduce the hardware cost of the equipment required to achieve indicator light detection; In the process of identifying the image area corresponding to the candidate bounding box, the indicator is clearly classified logically, and the indicator is classified from multiple aspects and multiple dimensions, so that the prediction of multiple categories As a result, the indicator lights in various situations can be covered as much as possible, which is helpful for judging the display status of the indicator lights in various situations, thereby effectively improving the comprehensiveness and accuracy of indicator light detection.
如下的描述中,将对指示灯检测方法进行更详细的描述。In the following description, the indicator detection method will be described in more detail.
图2示出了一种示例性指示灯分类逻辑。如图2所示,用途分类例如可以包括该指示灯是用于指示行人/用于指示车辆;在指示灯是用于指示车辆的情况下,形状分类例如可以包括该指示灯属于全屏灯(也可被称为圆形灯)/箭头灯;排列分类例如可以包括该指示灯属于水平排列/竖直排列/单独的指示灯;功能分类例如可以包括该指示灯属于普通指示灯/警示灯/电子不停车收费(Electronic Toll Collection,ETC)指示灯;颜色分类例如可以包括该指示灯属于红灯/黄灯/绿灯/颜色未知(对应于不亮的情况);在该指示灯是箭头灯的情况下,指向分类例如可以包括左/右/前/左前/右前。本领域技术人员应当理解,指示灯的分类种类并不限于以上所述,还可以包括其他方面或者维度的分类。Figure 2 shows an exemplary indicator light classification logic. As shown in Figure 2, the use classification may include, for example, that the indicator light is used to indicate pedestrians/used to indicate vehicles; in the case that the indicator light is used to indicate vehicles, the shape classification may include, for example, the indicator light is a full-screen light (also It can be called a circular light)/arrow light; the arrangement classification can include, for example, the indicator light belongs to a horizontal arrangement/vertical arrangement/individual indicator light; the function classification can include, for example, the indicator light belongs to a normal indicator light/warning light/electronic Electronic Toll Collection (ETC) indicator; color classification can include, for example, the indicator is red/yellow/green/unknown in color (corresponding to the case of not lighting); when the indicator is an arrow light Down, the pointing category may include, for example, left/right/front/left front/right front. Those skilled in the art should understand that the types of indicator lights are not limited to those described above, and may also include other aspects or dimensions.
In order to obtain the prediction results of the multiple classifications of the indicator lights, a neural network model may be trained in advance with road images carrying annotation information (which may be called sample images); the training process is described in detail later. The trained neural network model can identify indicator lights in an input road image and predict the multiple classifications of the identified indicator lights to obtain the prediction results of the multiple classifications.
在一些实施例中,可以利用神经网络模型中包括的多个子网络分支,分别对所述指示灯的多种分类进行预测,获得所述指示灯的多种分类的预测结果,其中,所述多个子 网络分支的数量与所述多种分类的数量相同,每个子网络分支用于识别所述多种分类中其中一个分类的子类别。In some embodiments, multiple sub-network branches included in the neural network model may be used to respectively predict multiple categories of the indicator light, and obtain prediction results of multiple categories of the indicator light. The number of sub-network branches is the same as the number of the multiple categories, and each sub-network branch is used to identify a sub-category of one of the multiple categories.
FIG. 3A shows a schematic diagram of the network structure of a neural network model provided by at least one embodiment of the present disclosure. As shown in FIG. 3A, the neural network model includes a feature extraction layer 301, a region proposal (Region Proposal Network) layer 302, a pooling layer 303, and a fully connected layer 304. The fully connected layer 304 includes a regression branch 3041 and multiple sub-network branches 3042.
在一种可选的实施方式中,全连接层304还包括卷积层,该卷积层与池化层303连接。In an alternative embodiment, the fully connected layer 304 further includes a convolutional layer, and the convolutional layer is connected to the pooling layer 303.
图3B示出应用图3A中的神经网络模型进行指示灯的多种分类预测的方法的流程示意图,如图3B所示,该方法包括步骤310-340。FIG. 3B shows a schematic flowchart of a method for applying the neural network model in FIG. 3A to perform various classification predictions of indicator lights. As shown in FIG. 3B, the method includes steps 310-340.
在步骤310中,通过所述特征提取层获得所述道路图像的特征图。In step 310, the feature map of the road image is obtained through the feature extraction layer.
The feature extraction layer 301 is used to extract features of the input road image. It may be a convolutional neural network, for example an existing Visual Geometry Group (VGG) network, a residual network (ResNet), a densely connected network (DenseNet), or another convolutional neural network structure. The present disclosure does not limit the specific structure of the feature extraction layer 301. In an optional implementation, the feature extraction layer 301 may include network units such as convolutional layers, activation layers, and pooling layers, stacked in a certain manner. The convolutional layers extract different features from the input road image through multiple convolution kernels to obtain multiple feature maps; a pooling layer located after a convolutional layer performs local averaging and down-sampling on the feature maps to reduce their resolution. As the number of convolutional and pooling layers increases, the number of feature maps gradually increases and their resolution gradually decreases.
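As a hedged illustration of such a feature extraction layer, the sketch below builds a backbone from a standard torchvision ResNet by keeping only its convolutional stages; the choice of ResNet-50 and the resulting 1/32 output stride are assumptions made for illustration, not requirements of the feature extraction layer 301.

```python
import torch
import torch.nn as nn
import torchvision

class FeatureExtractor(nn.Module):
    """Illustrative feature extraction layer: a ResNet trunk without the
    classification head, returning a spatially down-sampled feature map."""
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50()  # weights omitted in this sketch
        # Drop the global average pooling and the fully connected classifier.
        self.trunk = nn.Sequential(*list(resnet.children())[:-2])

    def forward(self, road_image: torch.Tensor) -> torch.Tensor:
        # road_image: (N, 3, H, W) -> feature map: (N, 2048, H/32, W/32)
        return self.trunk(road_image)
```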
在步骤320中,通过所述区域候选层对所述特征图进行处理,生成所述道路图像中指示灯的候选边界框。In step 320, the feature map is processed through the area candidate layer to generate a candidate bounding box of the indicator lamp in the road image.
The region proposal layer 302 is used to predict candidate bounding boxes of indicator lights, that is, to generate prediction information of the candidate bounding boxes. The region proposal layer 302 may be a Region Proposal Network (RPN); the present disclosure does not limit its specific structure. In an optional implementation, the region proposal layer 302 may include network units such as a convolutional layer, a classification layer, and a regression layer, stacked in a certain manner. The convolutional layer convolves the input feature map with a sliding window (for example, 3×3); each window corresponds to multiple anchor boxes, and each window produces a vector that is fully connected to the classification layer and the regression layer. The classification layer is used to determine whether the image area in a candidate bounding box generated from an anchor box is foreground or background, and the regression layer is used to obtain the approximate position of the candidate bounding box. Based on the outputs of the classification layer and the regression layer, a candidate bounding box containing an indicator light can be predicted, and the probabilities that the image area in the candidate bounding box is foreground or background, together with the position parameters of the candidate bounding box, are output.
在该步骤中,所生成的候选边界框可以被称为Region Proposal,在后续步骤中,该区域可以被称为感兴趣区域(Region of Interest,ROI)。In this step, the generated candidate bounding box may be called Region Proposal, and in subsequent steps, the region may be called Region of Interest (ROI).
在步骤330中,通过所述池化层获得所述候选边界框在特征图中对应的、设定大小的图像特征。In step 330, an image feature of a set size corresponding to the candidate bounding box in the feature map is obtained through the pooling layer.
In this step, the pooling layer 303 is used to extract a feature map of a set size (a fixed size, for example 7×7) for each ROI; that is, the ROIs of different sizes obtained in step 320 are mapped into regions of the same size. This process may be called ROI pooling. The pooling layer 303 may also be used to perform feature extraction on the regions of the same size to obtain the image features corresponding to each ROI.
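A minimal sketch of this ROI pooling step, using the torchvision operator as a stand-in for the pooling layer 303; the spatial scale of 1/32 simply matches the illustrative backbone sketched earlier and is an assumption.

```python
import torch
from torchvision.ops import roi_pool

# feature_map: (1, C, H/32, W/32) from the feature extraction layer.
# rois: one tensor per image, each row (x1, y1, x2, y2) in input-image coordinates,
# e.g. the candidate bounding boxes produced by the region proposal layer.
feature_map = torch.randn(1, 2048, 25, 80)
rois = [torch.tensor([[100.0, 40.0, 130.0, 110.0],
                      [600.0, 35.0, 640.0, 105.0]])]

# Map ROIs of different sizes onto fixed 7x7 regions of the feature map.
roi_features = roi_pool(feature_map, rois, output_size=(7, 7), spatial_scale=1.0 / 32)
print(roi_features.shape)   # torch.Size([2, 2048, 7, 7])
```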
在步骤340中,经过所述全连接层304获得所述指示灯的多种分类的预测结果。该全连接层304包括回归分支3041和多个子网络分支3042。回归分支3041和每个子网络分支3042可以分别通过1×1的卷积核进行进一步的特征提取。该1×1的卷积核对各个通道具有不同的参数,相当于全连接的功能。In step 340, multiple classification prediction results of the indicator light are obtained through the fully connected layer 304. The fully connected layer 304 includes a regression branch 3041 and multiple sub-network branches 3042. The regression branch 3041 and each sub-network branch 3042 can perform further feature extraction through a 1×1 convolution kernel respectively. The 1×1 convolution kernel has different parameters for each channel, which is equivalent to a fully connected function.
The regression branch 3041 performs regression on the above candidate bounding box and corrects its position so as to locate the bounding box of the indicator light more accurately. Using the transformation relationship between candidate bounding boxes and ground-truth bounding boxes learned during training, the regression branch 3041 predicts the bounding box of the indicator light, that is, the position information of the indicator light bounding box. The position information can be expressed as (x1, y1, x2, y2), where x1, y1 are the coordinates of the top-left corner of the predicted bounding box and x2, y2 are the coordinates of its bottom-right corner; the position information can also be expressed as (x, y, w, h), where x, y are the coordinates of the center point of the predicted bounding box, and w, h are its width and height, respectively.
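The two position encodings mentioned above are interchangeable; the short helper below converts between them and is only a convenience sketch, not part of the claimed method.

```python
def corners_to_center(x1, y1, x2, y2):
    """(x1, y1, x2, y2) corner form -> (x, y, w, h) center form."""
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2.0, y1 + h / 2.0, w, h

def center_to_corners(x, y, w, h):
    """(x, y, w, h) center form -> (x1, y1, x2, y2) corner form."""
    return x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0
```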
其中,每个子网络分支3042用于识别上述的多种分类其中一个分类的子类别。Among them, each sub-network branch 3042 is used to identify a sub-category of one of the above-mentioned multiple categories.
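The sketch below illustrates one way such a fully connected layer with a regression branch and per-classification sub-network branches could look; the use of simple linear heads over the pooled ROI feature is an assumption made for clarity, and `NUM_CLASSES_PER_BRANCH` refers to the illustrative taxonomy sketched earlier.

```python
import torch
import torch.nn as nn

class IndicatorHead(nn.Module):
    """Illustrative head 304: one box-regression branch and one classification
    branch per classification dimension (use, shape, arrangement, ...)."""
    def __init__(self, in_features: int, num_classes_per_branch: dict):
        super().__init__()
        self.regression = nn.Linear(in_features, 4)      # branch 3041: box parameters
        self.branches = nn.ModuleDict({                   # branches 3042, one per classification
            name: nn.Linear(in_features, n) for name, n in num_classes_per_branch.items()
        })

    def forward(self, roi_feature: torch.Tensor):
        # roi_feature: (num_rois, in_features), e.g. flattened 7x7 pooled features.
        box_deltas = self.regression(roi_feature)
        logits = {name: branch(roi_feature) for name, branch in self.branches.items()}
        return box_deltas, logits
```

With the earlier taxonomy, `IndicatorHead(2048 * 7 * 7, NUM_CLASSES_PER_BRANCH)` would create six branches, matching the requirement that the number of sub-network branches equals the number of classifications.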
In an example, the sub-network branches may obtain the prediction results of the multiple classifications of the indicator light in the following manner: using the image features corresponding to the candidate bounding box and a first sub-network branch among the multiple sub-network branches, a first classification among the multiple classifications of the indicator light is predicted to obtain the predicted probabilities of at least two sub-categories corresponding to the first classification; and the sub-category with the highest predicted probability among the at least two sub-categories is marked as the sub-category of the indicator light under the first classification.
其中,所述第一子网络分支,可以是多个子网络分支中的任一个,所述第一分类, 可以是多种分类中的任一类。Wherein, the first sub-network branch may be any one of a plurality of sub-network branches, and the first category may be any one of multiple categories.
For example, the first sub-network branch is a sub-network branch for shape classification, which obtains the predicted probabilities of the two sub-categories (full-screen light and arrow light) corresponding to the shape classification, for example a predicted probability of 90% for the full-screen light and 10% for the arrow light. The sub-network branch then marks the sub-category with the highest predicted probability, namely the full-screen light, as the sub-category under the shape classification.
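As a small illustration of this sub-category selection, assuming the branch outputs raw logits that are converted to probabilities with a softmax:

```python
import torch

def predict_subcategory(logits: torch.Tensor, subcategories: list) -> tuple:
    """Softmax over one branch's logits and pick the highest-probability sub-category."""
    probs = torch.softmax(logits, dim=-1)
    idx = int(torch.argmax(probs))
    return subcategories[idx], float(probs[idx])

# e.g. shape branch: about 90% full-screen light vs 10% arrow light
label, p = predict_subcategory(torch.tensor([2.197, 0.0]), ["full_screen", "arrow"])
# label == "full_screen", p is approximately 0.90
```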
由于关于所述指示灯的多种分类之间是有层次的,也即是有逻辑关系的,因此需要根据指示灯的多种分类的预测结果来共同确定指示灯的展示状态。Since the various categories of the indicator lights are hierarchical, that is, there is a logical relationship, it is necessary to jointly determine the display state of the indicator lights according to the prediction results of the multiple categories of the indicator lights.
In some embodiments, the display state of the indicator light is determined in the following manner (a minimal sketch of this decision logic is given after the list below):
In the case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is a circular light, the prediction results respectively corresponding to the arrangement classification, the function classification, and the color classification are combined to obtain a first display state of the indicator light.
In the case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is an arrow light, the prediction results respectively corresponding to the color classification and the pointing classification are combined to obtain a second display state of the indicator light.
In the case that the multiple classifications include the use classification and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, a third display state of the indicator light is obtained.
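A minimal sketch of the decision logic just listed, assuming each prediction is available as a sub-category string; the returned display-state strings are illustrative placeholders rather than a prescribed output format.

```python
def determine_display_state(pred: dict) -> str:
    """Combine per-classification predictions into a display state.
    `pred` maps classification names to predicted sub-categories, e.g.
    {"use": "vehicle", "shape": "full_screen", "arrangement": "horizontal",
     "function": "ordinary", "color": "red"}."""
    if pred.get("use") == "pedestrian":
        return "third_state:pedestrian"
    if pred.get("use") == "vehicle" and pred.get("shape") == "full_screen":
        # first display state: arrangement / function / color combined
        return "first_state:" + "/".join(pred[k] for k in ("arrangement", "function", "color"))
    if pred.get("use") == "vehicle" and pred.get("shape") == "arrow":
        # second display state: color / pointing combined
        return "second_state:" + "/".join(pred[k] for k in ("color", "pointing"))
    return "unknown"
```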
Figure 4 shows an exemplary output of indicator light detection. As shown in Figure 4, the detection result includes the predicted bounding box of the indicator light, the prediction results of the multiple classifications of the indicator light, and a confidence score of the predicted bounding box. The confidence score comprehensively reflects the likelihood that an indicator light is present in the predicted bounding box and the accuracy of the predicted bounding box position.
As shown in Figure 4, three predicted indicator light bounding boxes are output for the road image (only part of the road image is shown, so that the output prediction results can be displayed clearly). The classification prediction result of the indicator light in the black rectangular bounding box is "red horizon", meaning that it is a horizontally arranged, red, ordinary indicator light; the confidence score of the predicted bounding box is 1.0. Since the classification prediction result for ordinary indicator lights is set not to be displayed (hidden), the display state of this indicator light corresponds to the first display state.
The classification prediction result of the indicator light in the black square bounding box is "color unknown, individual", that is, it is an individual, unlit indicator light. Since these two classification prediction results are set not to be displayed, they are not shown on the image; this also corresponds to the first display state. The confidence score of the predicted bounding box is 0.98.
The classification prediction result of the indicator light in the white square is "green arrow left", meaning that it is an individual arrow-shaped indicator light, green in color and pointing to the left; the confidence score of the predicted bounding box is 0.99. The display state of this indicator light corresponds to the second display state.
确定指示灯的展示状态,还包括对指示灯是常亮或者闪烁的状态进行判断,以更好地指导智能设备自动行驶的决策,以使智能设备能够遵守交通规则,实现安全行驶。Determining the display status of the indicator light also includes judging whether the indicator light is always on or flashing, so as to better guide the smart device's decision-making for automatic driving, so that the smart device can comply with traffic rules and realize safe driving.
In some embodiments, the indicator light at the same position in multiple road images collected within a set time, as well as the prediction results of the multiple classifications of that indicator light, are obtained; and the display state of the indicator light is judged according to the prediction results of the multiple classifications of the indicator light at the same position in the multiple road images.
在一个可选的实施方式中,可以通过采集连续帧道路图像,来获得所述多张道路图像中同一位置的所述指示灯。所述连续帧图像,可以是连续拍摄的多帧图像,也可以是从连续拍摄的多帧图像中每间隔几帧选择一个目标帧,将连续选择的多个目标帧视为连续帧。In an optional implementation manner, the indicator lights at the same position in the multiple road images may be obtained by collecting consecutive frames of road images. The continuous frame image may be a continuously captured multi-frame image, or may be a target frame selected every few frames from the continuously captured multi-frame image, and the continuously selected multiple target frames are regarded as continuous frames.
图5示出了一种判断所检测到的指示灯是否为同一位置的指示灯的方法的流程示意图,如图5所示,该方法包括:Fig. 5 shows a schematic flowchart of a method for judging whether the detected indicator light is an indicator light at the same position. As shown in Fig. 5, the method includes:
在步骤510中,获得所述指示灯在所述连续帧道路图像的第一帧图像中的位置。In step 510, the position of the indicator light in the first frame of the continuous frame of road images is obtained.
也即,获得在设定时间内,起始帧中预测得到的指示灯的位置。That is, the predicted position of the indicator lamp in the initial frame within the set time is obtained.
After an indicator light in a road image is detected by the above indicator light detection method, the position of the indicator light in the image may be obtained through the neural network model, that is, the position parameters of the predicted indicator light bounding box; the position of the detected indicator light in the image may also be obtained by image processing.
In step 520, according to the position of the indicator light in the first frame of image, the movement speed of the device that captures the road images, and the shooting frequency, a first position of the indicator light in each frame of the consecutive frames of road images other than the first frame is calculated.
The device that captures the road images is an image acquisition apparatus (for example, a camera); its movement speed is the same as that of the autonomously driving smart device, and the shooting frequency may be preset or obtained by reading the configuration of the device. With the position of the indicator light in the image known for the initial frame, the theoretical position of the indicator light in each subsequent frame (the other frames, that is, the frames of the consecutive frame images other than the first frame) can be calculated from the movement speed of the device and the shooting frequency. This position is called the first position, to distinguish it from the positions in the subsequent steps.
在步骤530中,获得所述指示灯在所述连续帧图像中所述其他各帧图像中的第二位置。In step 530, the second position of the indicator light in the other frames of the image in the continuous frame is obtained.
For each subsequent frame, the position of the detected indicator light in the image may be predicted by the neural network model, or obtained by image processing; this position is called the second position.
In step 540, for each frame of the other frames, in the case that the difference between the second position and the first position is less than a set value, it is determined that the indicator lights detected in the consecutive frames of road images are indicator lights at the same position.
For indicator lights at the same position, the second position of the indicator light detected in step 530 should be close to the first position calculated in step 520. Therefore, for each subsequent frame, if the difference between the second position and the first position is less than the set value, it can be determined that the indicator lights detected in the consecutive frames of road images are indicator lights at the same position; otherwise, they are judged not to be indicator lights at the same position. If the detected indicator lights are not indicator lights at the same position, the subsequent step of judging the indicator light state does not need to be performed. Those skilled in the art should understand that the above set value can be set according to the required detection accuracy.
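The sketch below illustrates this matching under simplifying assumptions: `predict_shift` is a hypothetical helper that turns the device's movement speed and shooting frequency into an expected image-plane displacement, since the actual mapping depends on the camera geometry and is not specified here.

```python
def same_indicator(first_pos, later_positions, speed_mps, fps, predict_shift, tol=20.0):
    """Decide whether detections in consecutive frames are the same indicator light.

    first_pos:        (x, y) centre of the light in the first frame.
    later_positions:  detected (x, y) centres in the following frames (second positions).
    predict_shift:    assumed helper mapping (frame_index, speed, fps) to the expected
                      image-plane shift (dx, dy); left abstract in this sketch.
    tol:              set value for the allowed position difference, in pixels.
    """
    for k, (x2, y2) in enumerate(later_positions, start=1):
        dx, dy = predict_shift(k, speed_mps, fps)
        x1, y1 = first_pos[0] + dx, first_pos[1] + dy   # first position in frame k
        if abs(x2 - x1) > tol or abs(y2 - y1) > tol:
            return False                                # not the same indicator light
    return True
```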
After the indicator light at the same position in the multiple road images is obtained, whether the indicator light is in an always-on state or a flashing state can be judged by comparing whether the prediction results of the multiple classifications of the indicator light at the same position remain unchanged or change across the multiple road images.
在一个可选的实施方式中,在所述多张道路图像中同一位置的指示灯的颜色分类预测结果相同的情况下,判断所述指示灯的展示状态为常亮;In an optional implementation manner, when the color classification prediction results of the indicator lights at the same position in the multiple road images are the same, it is determined that the display state of the indicator lights is always on;
在所述多张图像中同一位置的指示灯的颜色分类预测结果间隔变化的情况下,判断所述指示灯的展示状态为闪烁。In the case where the color classification prediction result interval of the indicator lamps at the same position in the multiple images changes, it is determined that the display state of the indicator lamps is blinking.
For example, if the color classification prediction results of the indicator light at the same position are the same in the above multiple road images, the color of the indicator light has not changed within the set time (for example 3 seconds), and it can therefore be judged that the indicator light is always on. Note that a color classification with identical prediction results here does not include the unknown color.
If, in the above multiple road images, the color classification prediction results of the indicator light at the same position change at intervals, for example the color classification prediction is green for a period of time and unknown for another period (or no light can be detected at that position), and the two situations alternate, the color of the indicator light has changed alternately within the set time, and it can therefore be judged that the indicator light is flashing.
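A simple sketch of this always-on/flashing judgment over the per-frame color predictions; treating any color other than "unknown" as "lit" is an assumption made for illustration.

```python
def judge_display_state(colors_per_frame: list) -> str:
    """colors_per_frame: color classification results of the same indicator light
    across the frames collected in the set time, e.g. ["green", "unknown", "green"]."""
    known = [c for c in colors_per_frame if c != "unknown"]
    if not known:
        return "undetermined"
    if len(set(colors_per_frame)) == 1:        # same known color in every frame
        return "always_on:" + known[0]
    lit = [c != "unknown" for c in colors_per_frame]
    if True in lit and False in lit:           # lit and unlit frames alternate over time
        return "flashing:" + known[0]
    return "undetermined"
```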
在如下的描述中,将说明如何训练神经网络模型。图6为本公开实施例神经网络模型的训练方法的一个流程示意图。如图6所示,该实施例方法包括:In the following description, how to train the neural network model will be explained. FIG. 6 is a schematic flowchart of a training method of a neural network model according to an embodiment of the disclosure. As shown in Figure 6, the method in this embodiment includes:
在步骤610中,将包含指示灯的样本图像输入所述神经网络模型,得到指示灯的多种分类预测结果和边界框预测结果。In step 610, the sample image containing the indicator light is input to the neural network model to obtain multiple classification prediction results and bounding box prediction results of the indicator light.
在进行训练之前,首先对神经网络模型进行初始化,确定初始化的网络参数。Before training, first initialize the neural network model and determine the initial network parameters.
The sample image input to the neural network model may be a road image containing an indicator light, and the sample image is pre-annotated with annotation information of the indicator light. The annotation information contains the ground-truth bounding box information of the indicator light, for example the coordinates of the top-left vertex and the bottom-left vertex of the bounding box; the annotation information also includes the multiple classification information of the indicator light.
将所述样本图像输入至经初始化的神经网络模型,可以预测得到所述样本图像中指示灯的多种分类预测结果,以及边界框预测结果。Inputting the sample image to the initialized neural network model can predict multiple classification prediction results of the indicator lights in the sample image and the bounding box prediction results.
在步骤620中,根据所述多种分类预测结果和所述边界框预测结果,以及所述多种分类信息和所述真实边界框信息,计算损失函数的损失值。In step 620, the loss value of the loss function is calculated according to the multiple classification prediction results and the bounding box prediction result, as well as the multiple classification information and the real bounding box information.
该损失函数的损失值,表示了预测得到的多种分类结果和预测得到的边界框,与预先标注的多种分类信息和真实边界框信息之间的差异。The loss value of the loss function represents the difference between the predicted multiple classification results and the predicted bounding box, and the pre-labeled multiple classification information and the true bounding box information.
在步骤630中,根据所述损失值对所述神经网络模型的网络参数进行调整。In step 630, the network parameters of the neural network model are adjusted according to the loss value.
在一种可选的实施方式中,将基于该损失函数确定的损失值反向回传该神经网络模型,以调整网络参数,例如调整各层的卷积核的取值、各层的权重参数等等。In an optional implementation, the loss value determined based on the loss function is passed back to the neural network model to adjust network parameters, such as adjusting the value of the convolution kernel of each layer and the weight parameter of each layer and many more.
When training the neural network model, the training samples may be divided into multiple image subsets (batches). In each iteration of training, one image subset is input to the neural network model in turn, and the network parameters are adjusted according to the loss values of the prediction results of the samples in that image subset. After this iteration of training is completed, the next image subset is input to the neural network model for the next iteration. Different image subsets include at least partially different training samples. When a predetermined end condition is reached, training of the neural network model is completed. The predetermined end condition may be, for example, that the loss value has dropped below a certain threshold, or that a predetermined number of iterations of the neural network model has been reached.
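For illustration, the sketch below shows one possible training iteration combining a bounding-box regression loss with one cross-entropy loss per classification branch; the model interface, loss weighting, optimizer, and hyper-parameters are assumptions and are not prescribed by this embodiment.

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, images, targets):
    """One illustrative iteration over an image subset (batch).

    `model` is assumed to return predicted box parameters and a dict of per-classification
    logits; `targets` carries the annotated ground-truth boxes and the annotated
    sub-category index for each classification."""
    optimizer.zero_grad()
    box_pred, logits = model(images)

    # Bounding-box branch: smooth L1 between predicted and ground-truth boxes.
    loss = F.smooth_l1_loss(box_pred, targets["boxes"])

    # One cross-entropy term per classification branch (use, shape, color, ...).
    for name, branch_logits in logits.items():
        loss = loss + F.cross_entropy(branch_logits, targets["labels"][name])

    loss.backward()          # back-propagate the loss value through the network
    optimizer.step()         # adjust the network parameters
    return float(loss)

# e.g. optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```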
The neural network model training method of this embodiment trains the neural network model with sample images pre-annotated with the classification information and ground-truth bounding boxes of indicator lights, so that the trained neural network model can detect the indicator lights in an input image and predict the multiple classifications of those indicator lights.
The neural network model to be trained is the neural network model used in the above embodiments of the indicator light detection method, for example with the structure shown in FIG. 3A; the only difference is that the input image is a sample image containing annotation information. For the neural network model shown in FIG. 3A, obtaining the prediction results of the indicator light based on a sample image may include: obtaining a feature map of the sample image through the feature extraction layer; processing the feature map through the region proposal layer to generate candidate bounding boxes of indicator lights in the sample image; obtaining, through the pooling layer, image features of a set size corresponding to the candidate bounding boxes in the feature map; and obtaining, through the fully connected layer, the prediction results of the multiple classifications of the indicator light and the prediction results of the bounding boxes.
在训练过程中的指示灯预测过程,与上述指示灯检测方法中指示灯的预测过程相似,详细过程可以参考指示灯检测方法实施例中的描述。The indicator light prediction process in the training process is similar to the indicator light prediction process in the indicator light detection method described above, and the detailed process can refer to the description in the indicator light detection method embodiment.
图7提供了一种指示灯检测装置,如图7所示,该装置可以包括:识别单元701、预测单元702和确定单元703。FIG. 7 provides an indicator light detection device. As shown in FIG. 7, the device may include: an identification unit 701, a prediction unit 702, and a determination unit 703.
The recognition unit 701 is configured to recognize a collected road image to obtain candidate bounding boxes of indicator lights in the road image; the prediction unit 702 is configured to predict multiple classifications of an indicator light according to the image area corresponding to the candidate bounding box in the road image, and obtain prediction results of the multiple classifications of the indicator light, wherein the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and pointing classification; and the determining unit 703 is configured to determine the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light.
In another embodiment, the prediction unit 702 is configured to: use a neural network model to perform feature extraction on the image area corresponding to the candidate bounding box to obtain the image features corresponding to the candidate bounding box; and use the image features corresponding to the candidate bounding box and multiple sub-network branches included in the neural network model to respectively predict the multiple classifications of the indicator light and obtain the prediction results of the multiple classifications of the indicator light, wherein the number of the sub-network branches is the same as the number of the multiple classifications, and each sub-network branch is used to identify the sub-categories of one of the multiple classifications.
In another embodiment, the prediction unit 702 is configured to: use the image features corresponding to the candidate bounding box and a first sub-network branch among the multiple sub-network branches to predict a first classification among the multiple classifications of the indicator light, and obtain the predicted probabilities of at least two sub-categories corresponding to the first classification; and mark the sub-category with the highest predicted probability among the at least two sub-categories as the sub-category of the indicator light under the first classification.
In another embodiment, the determining unit 703 is configured to: in the case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is a circular light, combine the prediction results respectively corresponding to the arrangement classification, the function classification, and the color classification to obtain a first display state of the indicator light; or, in the case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is an arrow light, combine the prediction results respectively corresponding to the color classification and the pointing classification to obtain a second display state of the indicator light; or, in the case that the multiple classifications include the use classification and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, obtain a third display state of the indicator light.
In another embodiment, the determining unit 703 is configured to: obtain the indicator light at the same position in multiple road images collected within a set time, as well as the prediction results of the multiple classifications of the indicator light; and judge the display state of the indicator light according to the prediction results of the multiple classifications of the indicator light at the same position in the multiple road images.
In another embodiment, the multiple road images are consecutive frames of road images, and the determining unit 703 is configured to: obtain the position of the indicator light in the first frame of the consecutive frames of road images; calculate, according to the position of the indicator light in the first frame, the movement speed of the device that captures the road images, and the shooting frequency, a first position of the indicator light in each frame of the consecutive frames of road images other than the first frame; obtain a second position of the indicator light in each of the other frames of the consecutive frames of images; and, for each of the other frames, in the case that the difference between the second position and the first position is less than a set value, determine that the indicator lights detected in the consecutive frames of road images are indicator lights at the same position.
In another embodiment, the display state of the indicator light includes always on or flashing, and the determining unit 703 is configured to: in the case that the color classification prediction results of the indicator light at the same position in the multiple road images are the same, judge that the display state of the indicator light is always on; and, in the case that the color classification prediction results of the indicator light at the same position in the multiple images change at intervals, judge that the display state of the indicator light is flashing.
Fig. 8 shows an indicator light detection device provided by at least one embodiment of the present disclosure. The device includes a memory and a processor; the memory is used to store computer instructions executable on the processor, and the processor is used to implement the indicator light detection method described in any embodiment of this specification when executing the computer instructions.
本说明书至少一个实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,所述程序被处理器执行时实现本说明书任一实施例所述的指示灯检测方法。At least one embodiment of this specification also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the indicator light detection method described in any embodiment of this specification is implemented.
本公开实施例提供了一种计算机程序,包括计算机可读代码,所述计算机可读代码被计算机执行时实现本公开任一实施例所述的指示灯检测方法。The embodiment of the present disclosure provides a computer program, including computer readable code, which, when executed by a computer, implements the indicator light detection method described in any embodiment of the present disclosure.
上文对各个实施例的描述倾向于强调各个实施例之间的不同之处,其相同或相似之处可以互相参考,为了简洁,本文不再赘述。The foregoing description of the various embodiments tends to emphasize the differences between the various embodiments, and the same or similarities can be referred to each other, and for the sake of brevity, details are not repeated herein.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
In some embodiments, the functions or modules of the apparatus provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments; for the specific implementation, reference may be made to the description of the above method embodiments, which is not repeated here for brevity.
In the embodiments of the present disclosure, the computer-readable storage medium may take various forms. For example, in different examples, the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard disk drive), a solid-state drive, any type of storage disk (such as an optical disc or DVD), a similar storage medium, or a combination thereof. In particular, the computer-readable medium may also be paper or another suitable medium on which a program can be printed. Using these media, the program can be acquired electronically (for example, by optical scanning), compiled, interpreted, and processed in a suitable manner, and then stored in a computer medium.
The above are only some embodiments of the present disclosure and are not intended to limit the present disclosure. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (17)

  1. 一种指示灯检测方法,包括:An indicator light detection method, including:
    对采集的道路图像进行识别,获得所述道路图像中指示灯的候选边界框;Recognizing the collected road image to obtain the candidate bounding box of the indicator in the road image;
    predicting multiple classifications of an indicator light according to an image area corresponding to the candidate bounding box in the road image, and obtaining prediction results of the multiple classifications of the indicator light, wherein the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and pointing classification;
    根据所述指示灯的多种分类的预测结果,确定所述指示灯的展示状态。The display state of the indicator light is determined according to the prediction results of the multiple classifications of the indicator light.
  2. The method according to claim 1, characterized in that predicting the multiple classifications of the indicator light respectively according to the image area corresponding to the candidate bounding box in the road image, and obtaining the prediction results of the multiple classifications of the indicator light, comprises:
    利用神经网络模型,对所述候选边界框对应的图像区域进行特征提取,得到所述候选边界框对应的图像特征;Using a neural network model to perform feature extraction on the image region corresponding to the candidate bounding box to obtain the image feature corresponding to the candidate bounding box;
    利用所述候选边界框对应的图像特征和所述神经网络模型中包括的多个子网络分支,分别对所述指示灯的多种分类进行预测,获得所述指示灯的多种分类的预测结果;Using the image features corresponding to the candidate bounding box and the multiple sub-network branches included in the neural network model to respectively predict multiple categories of the indicator light, and obtain prediction results of the multiple categories of the indicator light;
    其中,所述多个子网络分支的数量与所述多种分类的数量相同,每个子网络分支用于识别所述多种分类中其中一种分类的子类别。Wherein, the number of the multiple sub-network branches is the same as the number of the multiple categories, and each sub-network branch is used to identify a subcategory of one of the multiple categories.
  3. The method according to claim 2, characterized in that using the image features corresponding to the candidate bounding box and the multiple sub-network branches included in the neural network model to respectively predict the multiple classifications of the indicator light, and obtaining the prediction results of the multiple classifications of the indicator light, comprises:
    using the image features corresponding to the candidate bounding box and a first sub-network branch among the multiple sub-network branches to predict a first classification among the multiple classifications of the indicator light, and obtaining predicted probabilities of at least two sub-categories corresponding to the first classification;
    marking the sub-category with the highest predicted probability among the at least two sub-categories as the sub-category of the indicator light under the first classification.
  4. 根据权利要求1-3中任一项所述的方法,其特征在于,根据所述指示灯的多种分类的预测结果,确定所述指示灯的展示状态,包括:The method according to any one of claims 1 to 3, wherein determining the display state of the indicator light according to the prediction results of multiple classifications of the indicator light comprises:
    in a case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is a circular light, combining the prediction results respectively corresponding to the arrangement classification, the function classification, and the color classification to obtain a first display state of the indicator light; or,
    in a case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is an arrow light, combining the prediction results respectively corresponding to the color classification and the pointing classification to obtain a second display state of the indicator light; or,
    in a case that the multiple classifications include the use classification and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, obtaining a third display state of the indicator light.
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,根据所述指示灯的多种分类的预测结果,确定所述指示灯的展示状态,包括:The method according to any one of claims 1 to 4, wherein determining the display state of the indicator light according to the prediction results of multiple classifications of the indicator light comprises:
    获得设定时间内所采集的多张道路图像中同一位置的所述指示灯以及所述指示灯的多种分类的预测结果;Obtaining the indicator light at the same position in the multiple road images collected within a set time and the prediction results of multiple classifications of the indicator light;
    根据所述多张道路图像中同一位置的所述指示灯的多种分类的预测结果,判断所述指示灯的展示状态。The display state of the indicator light is determined according to the prediction results of multiple classifications of the indicator light at the same position in the multiple road images.
  6. 根据权利要求5所述的方法,其特征在于,所述多张道路图像为连续帧道路图像;The method according to claim 5, wherein the multiple road images are consecutive frames of road images;
    获得设定时间内所采集的多张道路图像中同一位置的所述指示灯,包括:Obtaining the indicator lights at the same position in multiple road images collected within a set time includes:
    获得所述指示灯在所述连续帧道路图像的第一帧图像中的位置;Obtaining the position of the indicator light in the first frame of the continuous frame of road images;
    calculating, according to the position of the indicator light in the first frame of image, the movement speed of the device that captures the road images, and the shooting frequency, a first position of the indicator light in each frame of the consecutive frames of road images other than the first frame;
    获得所述指示灯在所述连续帧图像中其他各帧图像中的第二位置;Obtaining the second position of the indicator light in the other frames of the continuous frame image;
    for each of the other frames of images, in a case that the difference between the second position and the first position is less than a set value, determining that the indicator lights detected in the consecutive frames of road images are indicator lights at the same position.
  7. 根据权利要求5或6所述的方法,其特征在于,所述指示灯的展示状态包括:常亮或闪烁;The method according to claim 5 or 6, wherein the display state of the indicator light comprises: always on or flashing;
    根据所述多张道路图像中同一位置的所述指示灯的多种分类的预测结果,判断所述指示灯的展示状态,包括:According to the prediction results of multiple classifications of the indicator lights at the same position in the multiple road images, determining the display state of the indicator lights includes:
    在所述多张道路图像中同一位置的指示灯的颜色分类预测结果相同的情况下,判断所述指示灯的展示状态为常亮;In the case where the color classification prediction results of the indicator lights at the same position in the multiple road images are the same, determining that the display state of the indicator lights is always on;
    在所述多张图像中同一位置的指示灯的颜色分类预测结果间隔变化的情况下,判断所述指示灯的展示状态为闪烁。In the case where the color classification prediction result interval of the indicator lamps at the same position in the multiple images changes, it is determined that the display state of the indicator lamps is blinking.
  8. 一种指示灯检测装置,包括:An indicator light detection device, including:
    识别单元,用于对采集的道路图像进行识别,获得所述道路图像中指示灯的候选边界框;A recognition unit, configured to recognize the collected road image and obtain the candidate bounding box of the indicator lamp in the road image;
    a prediction unit, configured to predict multiple classifications of an indicator light according to an image area corresponding to the candidate bounding box in the road image, and obtain prediction results of the multiple classifications of the indicator light, wherein the multiple classifications include at least two of the following: use classification, shape classification, arrangement classification, function classification, color classification, and pointing classification;
    确定单元,用于根据所述指示灯的多种分类的预测结果,确定所述指示灯的展示状态。The determining unit is configured to determine the display state of the indicator light according to the prediction results of multiple classifications of the indicator light.
  9. 根据权利要求8所述的装置,其特征在于,所述预测单元用于:The device according to claim 8, wherein the prediction unit is configured to:
    利用神经网络模型,对所述候选边界框对应的图像区域进行特征提取,得到所述候选边界框对应的图像特征;Using a neural network model to perform feature extraction on the image region corresponding to the candidate bounding box to obtain the image feature corresponding to the candidate bounding box;
    using the image features corresponding to the candidate bounding box and the multiple sub-network branches included in the neural network model to respectively predict the multiple classifications of the indicator light, and obtain the prediction results of the multiple classifications of the indicator light;
    其中,所述多个子网络分支的数量与所述多种分类的数量相同,每个子网络分支用于识别所述多种分类中其中一种分类的子类别。Wherein, the number of the multiple sub-network branches is the same as the number of the multiple categories, and each sub-network branch is used to identify a subcategory of one of the multiple categories.
  10. 根据权利要求9所述的装置,其特征在于,所述预测单元用于:The apparatus according to claim 9, wherein the prediction unit is configured to:
    using the image features corresponding to the candidate bounding box and a first sub-network branch among the multiple sub-network branches to predict a first classification among the multiple classifications of the indicator light, and obtain predicted probabilities of at least two sub-categories corresponding to the first classification;
    marking the sub-category with the highest predicted probability among the at least two sub-categories as the sub-category of the indicator light under the first classification.
  11. 根据权利要求8-10中任一项所述的装置,其特征在于,所述确定单元用于:The device according to any one of claims 8-10, wherein the determining unit is configured to:
    in a case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is a circular light, combine the prediction results respectively corresponding to the arrangement classification, the function classification, and the color classification to obtain a first display state of the indicator light; or,
    in a case that the multiple classifications include the use classification and the shape classification, the prediction result of the use classification is that the indicator light is used to indicate vehicles, and the prediction result of the shape classification indicates that the indicator light is an arrow light, combine the prediction results respectively corresponding to the color classification and the pointing classification to obtain a second display state of the indicator light; or,
    in a case that the multiple classifications include the use classification and the prediction result of the use classification is that the indicator light is used to indicate pedestrians, obtain a third display state of the indicator light.
  12. 根据权利要求8-11中任一项所述的装置,其特征在于,所述确定单元用于:The device according to any one of claims 8-11, wherein the determining unit is configured to:
    获得设定时间内所采集的多张道路图像中同一位置的所述指示灯以及所述指示灯的多种分类的预测结果;Obtaining the indicator light at the same position in the multiple road images collected within a set time and the prediction results of multiple classifications of the indicator light;
    根据所述多张道路图像中同一位置的所述指示灯的多种分类的预测结果,判断所述指示灯的展示状态。The display state of the indicator light is determined according to the prediction results of multiple classifications of the indicator light at the same position in the multiple road images.
  13. 根据权利要求12所述的装置,其特征在于,所述多张道路图像为连续帧道路图像;The device according to claim 12, wherein the multiple road images are consecutive frames of road images;
    所述确定单元用于:The determining unit is used for:
    获得所述指示灯在所述连续帧道路图像的第一帧图像中的位置;Obtaining the position of the indicator light in the first frame of the continuous frame of road images;
    calculate, according to the position of the indicator light in the first frame of image, the movement speed of the device that captures the road images, and the shooting frequency, a first position of the indicator light in each frame of the consecutive frames of road images other than the first frame;
    获得所述指示灯在所述连续帧图像中其他各帧图像中的第二位置;Obtaining the second position of the indicator light in the other frames of the continuous frame image;
    for each of the other frames of images, in a case that the difference between the second position and the first position is less than a set value, determine that the indicator lights detected in the consecutive frames of road images are indicator lights at the same position.
  14. 根据权利要求12或13所述的装置,其特征在于,所述指示灯的展示状态包括:常亮或闪烁;The device according to claim 12 or 13, wherein the display state of the indicator light comprises: always on or flashing;
    所述确定单元用于:The determining unit is used for:
    在所述多张道路图像中同一位置的指示灯的颜色分类预测结果相同的情况下,判断 所述指示灯的展示状态为常亮;In the case where the color classification prediction results of the indicator lights at the same position in the multiple road images are the same, determining that the display state of the indicator lights is always on;
    在所述多张图像中同一位置的指示灯的颜色分类预测结果间隔变化的情况下,判断所述指示灯的展示状态为闪烁。In the case where the color classification prediction result interval of the indicator lamps at the same position in the multiple images changes, it is determined that the display state of the indicator lamps is blinking.
  15. An indicator light detection device, comprising a memory and a processor, wherein the memory is configured to store computer instructions executable on the processor, and the processor is configured to implement the method according to any one of claims 1 to 7 when executing the computer instructions.
  16. 一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现权利要求1至7中任一所述的方法。A computer-readable storage medium having a computer program stored thereon, and when the computer program is executed by a processor, the method according to any one of claims 1 to 7 is realized.
  17. 一种计算机程序,包括计算机可读代码,当所述计算机可读代码被计算机执行时实现权利要求1至7任一项所述的方法。A computer program comprising computer readable code, which when executed by a computer, implements the method according to any one of claims 1 to 7.
PCT/CN2020/105223 2019-07-31 2020-07-28 Indication lamp detection method, apparatus and device, and computer-readable storage medium WO2021018144A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217020933A KR20210097782A (en) 2019-07-31 2020-07-28 Indicator light detection method, apparatus, device and computer-readable recording medium
JP2021538967A JP2022516183A (en) 2019-07-31 2020-07-28 Indicator light detection method, device, device, and computer readable recording medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910703763.5 2019-07-31
CN201910703763.5A CN112307840A (en) 2019-07-31 2019-07-31 Indicator light detection method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2021018144A1 true WO2021018144A1 (en) 2021-02-04

Family

ID=74229503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105223 WO2021018144A1 (en) 2019-07-31 2020-07-28 Indication lamp detection method, apparatus and device, and computer-readable storage medium

Country Status (4)

Country Link
JP (1) JP2022516183A (en)
KR (1) KR20210097782A (en)
CN (1) CN112307840A (en)
WO (1) WO2021018144A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468868A (en) * 2023-04-27 2023-07-21 广州小鹏自动驾驶科技有限公司 Traffic signal lamp graph building method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017171659A1 (en) * 2016-03-31 2017-10-05 Agency For Science, Technology And Research Signal light detection
JP2019053619A (en) * 2017-09-15 2019-04-04 株式会社東芝 Signal identification device, signal identification method, and driving support system
CN108875608B (en) * 2018-06-05 2021-12-17 合肥湛达智能科技有限公司 Motor vehicle traffic signal identification method based on deep learning
CN110069986B (en) * 2019-03-13 2021-11-02 北京联合大学 Traffic signal lamp identification method and system based on hybrid model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9070305B1 (en) * 2010-01-22 2015-06-30 Google Inc. Traffic light detecting system and method
CN102176287A (en) * 2011-02-28 2011-09-07 无锡中星微电子有限公司 Traffic signal lamp identifying system and method
CN108804983A (en) * 2017-05-03 2018-11-13 腾讯科技(深圳)有限公司 Traffic signal light condition recognition methods, device, vehicle-mounted control terminal and motor vehicle

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113506244A (en) * 2021-06-05 2021-10-15 北京超维世纪科技有限公司 Indicator light detection and color identification generalization capability improvement algorithm based on deep learning
CN114359844A (en) * 2022-03-21 2022-04-15 广州银狐科技股份有限公司 AED equipment state monitoring method and system based on color recognition
CN114820676A (en) * 2022-05-30 2022-07-29 深圳市科荣软件股份有限公司 Equipment running state identification method and device
CN114820616A (en) * 2022-06-29 2022-07-29 北京蒙帕信创科技有限公司 Equipment state detection method and device for flashing mode indicator light
CN114820616B (en) * 2022-06-29 2023-01-10 北京蒙帕信创科技有限公司 Equipment state detection method and device for flashing mode indicator light

Also Published As

Publication number Publication date
JP2022516183A (en) 2022-02-24
KR20210097782A (en) 2021-08-09
CN112307840A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
WO2021018144A1 (en) Indication lamp detection method, apparatus and device, and computer-readable storage medium
Hasegawa et al. Robust Japanese road sign detection and recognition in complex scenes using convolutional neural networks
JP6180482B2 (en) Methods, systems, products, and computer programs for multi-queue object detection and analysis (multi-queue object detection and analysis)
US9489586B2 (en) Traffic sign recognizing apparatus and operating method thereof
CN101900567B (en) No-texture clear path detection based on pixel
CN101900566B (en) Pixel-based texture-rich clear path detection
CN108681693B (en) License plate recognition method based on trusted area
CN102076531B (en) Vehicle clear path detection
US8487991B2 (en) Clear path detection using a vanishing point
CN110619279B (en) Road traffic sign instance segmentation method based on tracking
JP7206082B2 (en) Systems and methods for recognizing traffic signs
US20090309966A1 (en) Method of detecting moving objects
US20110081081A1 (en) Method for recognizing objects in images
KR20070027768A (en) Method for traffic sign detection
CN103366179B (en) Top-down view classification in clear path detection
De Paula et al. Real-time detection and classification of road lane markings
JP5931662B2 (en) Road condition monitoring apparatus and road condition monitoring method
WO2008020544A1 (en) Vehicle detection device, vehicle detection method, and vehicle detection program
Singh et al. Vehicle detection and accident prediction in sand/dust storms
Fusek et al. Adaboost for parking lot occupation detection
Ahsan et al. A detailed study on Bangladeshi road sign detection and recognition
Álvarez et al. Perception advances in outdoor vehicle detection for automatic cruise control
CN111402185A (en) Image detection method and device
Chen et al. Context-aware lane marking detection on urban roads
Hsieh et al. Recognising daytime and nighttime driving images using Bayes classifier

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20846538

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021538967

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20217020933

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20846538

Country of ref document: EP

Kind code of ref document: A1