CN112785595B - Target attribute detection, neural network training and intelligent driving method and device - Google Patents

Target attribute detection, neural network training and intelligent driving method and device

Info

Publication number
CN112785595B
Authority
CN
China
Prior art keywords
attribute
target
image
road
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911081216.4A
Other languages
Chinese (zh)
Other versions
CN112785595A (en
Inventor
林培文
程光亮
石建萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN201911081216.4A
Priority to JP2021533200A
Priority to PCT/CN2020/114109 (WO2021088505A1)
Priority to KR1020217016723A
Publication of CN112785595A
Application granted
Publication of CN112785595B

Classifications

    • G06V20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588: Recognition of the road, e.g. of lane markings; recognition of the vehicle driving pattern in relation to the road
    • G06T7/10: Segmentation; edge detection
    • G06N3/04: Neural network architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G06T2207/20081: Training; learning
    • G06T2207/30256: Lane; road marking

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiments disclose a target attribute detection method, a neural network training method, an intelligent driving method, corresponding apparatuses, an electronic device, and a computer storage medium. The target attribute detection method comprises the following steps: performing semantic segmentation on an image to be processed and determining a mask map of the image to be processed, where the mask map represents the position of a target in the image; determining, according to the mask map, the attribute features belonging to the target in an attribute feature map of the image, where the attribute feature map characterizes the attributes of the image; and determining the attribute of the target according to the attribute features of the target. In this way, because the region where the target is located does not need to be extracted from the image, and the attribute features of the target are instead determined with the more discriminative mask map obtained by semantic segmentation, the accuracy of target attribute detection is improved.

Description

Target attribute detection, neural network training and intelligent driving method and device
Technical Field
The present disclosure relates to computer vision processing technologies, and in particular, to a target attribute detection method, a neural network training method, an intelligent driving method and apparatus, an electronic device, and a computer storage medium.
Background
With the continuous development of computer vision technology, identifying the attributes of targets in images has gradually become a research hotspot; for example, identifying lane line attributes benefits lane division, path planning, collision early warning, and the like. In the related art, how to accurately identify the attributes of a target in an image remains a technical problem to be solved urgently.
Disclosure of Invention
The embodiment of the disclosure is expected to provide a technical scheme for target attribute detection.
The embodiment of the disclosure provides a target attribute detection method, which comprises the following steps:
performing semantic segmentation on an image to be processed, and determining a mask map of the image to be processed, wherein the mask map represents the position of a target in the image to be processed;
determining, according to the mask map, attribute features belonging to the target in the attribute feature map of the image to be processed, where the attribute feature map of the image to be processed characterizes the attributes of the image to be processed;
and determining the attribute of the target according to the attribute characteristic of the target.
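The three steps above can be sketched as follows. This is an illustrative NumPy sketch, not part of the disclosed embodiments: `classify` is a hypothetical stand-in for any attribute classifier, and the mean pooling is one possible way to aggregate the selected features.

```python
import numpy as np

def detect_target_attribute(mask, attr_map, classify):
    """mask: (H, W) binary map from semantic segmentation, 1 marks target pixels.
    attr_map: (C, H, W) attribute feature map of the whole image.
    classify: hypothetical classifier mapping a pooled feature vector to an attribute.
    """
    ys, xs = np.nonzero(mask)              # positions of the target in the image
    target_feats = attr_map[:, ys, xs].T   # (N, C): attribute features belonging to the target
    pooled = target_feats.mean(axis=0)     # aggregate into one vector for the target
    return classify(pooled)
```

Note that no image region is cropped: the mask only selects which columns of the attribute feature map belong to the target.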
Optionally, the determining the attribute of the target according to the attribute feature of the target includes:
converting the attribute features of the target into a feature of a preset length;
and determining the attribute of the target according to the converted feature of the preset length.
Optionally, the converting the attribute feature of the target into a feature of a preset length includes:
dividing the points corresponding to the attribute features of the target into k parts;
calculating the average of the attribute features over the points in each part, obtaining k averages;
repeating the above two steps n times, with a different value of k each time, where each k is less than the maximum possible number of points corresponding to the attribute features of the target, and n is an integer greater than 1;
and forming the feature of the preset length from all the obtained averages.
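A minimal sketch of this fixed-length conversion, assuming NumPy; the specific values ks = (1, 2, 4) (so n = 3) are illustrative assumptions, since the disclosure only requires distinct values of k, each below the maximum point count:

```python
import numpy as np

def to_fixed_length(point_feats, ks=(1, 2, 4)):
    """point_feats: (N, C) attribute features at the N points of the target.
    For each k in ks (n = len(ks) repetitions, each with a different k),
    split the N points into k parts, average the features over each part,
    and concatenate all averages into a feature of fixed length sum(ks) * C.
    """
    averages = []
    for k in ks:
        for part in np.array_split(point_feats, k, axis=0):
            averages.append(part.mean(axis=0))
    return np.concatenate(averages)
```

Regardless of how many points N a target occupies, the output length is sum(ks) * C, so targets of different sizes yield comparable fixed-length features.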
Optionally, the target attribute detection method is a lane line attribute detection method, and the image to be processed is a road image;
the semantic segmentation is carried out on the image to be processed, and the mask diagram of the image to be processed is determined, wherein the semantic segmentation comprises the following steps:
extracting features of the road image, and determining a region feature map of the road image and an attribute feature map of the road image;
determining a mask map of the lane lines in the road image according to the region feature map of the road image;
according to the mask image, determining the attribute features belonging to the target in the attribute feature image of the image to be processed, including:
determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image;
determining the attribute of the target according to the attribute feature of the target, wherein the determining of the attribute of the target comprises the following steps:
and determining the attribute of the lane line according to the attribute characteristic of the lane line.
Optionally, after determining the attribute of the lane line, the method further comprises:
and determining the lane lines in the road image according to the road image, the determined mask map of the lane lines in the road image and the determined attributes of the lane lines.
Optionally, the target attribute detection method is performed by a neural network, and the neural network is trained by using a sample image, a mask map of an annotation of the sample image, and an attribute of the annotation of the target of the sample image.
The embodiment of the present disclosure further provides a training method of a neural network, including:
determining attribute features belonging to a target in an attribute feature map of a sample image according to an annotated mask map of the sample image, where the annotated mask map characterizes the position of the target in the sample image, and the attribute feature map of the sample image characterizes the attributes of the sample image;
determining the attribute of the target according to the attribute characteristic of the target;
and adjusting the network parameter values of the neural network according to the difference between the determined attribute of the target and the annotated attribute of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
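A minimal sketch of such a joint objective; the cross-entropy form and equal weighting are assumptions, since the disclosure only requires that both differences drive the parameter update:

```python
import numpy as np

def cross_entropy(probs, label):
    # negative log-likelihood of the correct class
    return -np.log(probs[label])

def joint_loss(attr_probs, attr_label, pixel_probs, pixel_labels, seg_weight=1.0):
    """attr_probs: predicted attribute distribution; attr_label: annotated attribute.
    pixel_probs / pixel_labels: per-pixel segmentation predictions vs. the annotated mask.
    Both differences contribute to one scalar used to adjust the network parameters.
    """
    attr_term = cross_entropy(attr_probs, attr_label)
    seg_term = np.mean([cross_entropy(p, y) for p, y in zip(pixel_probs, pixel_labels)])
    return attr_term + seg_weight * seg_term
```

With perfect predictions both terms vanish; any mismatch in either the attribute or the mask raises the loss.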
Optionally, the determining the attribute of the target according to the attribute feature of the target includes:
converting the attribute features of the target into a feature of a preset length;
and determining the attribute of the target according to the converted feature of the preset length.
Optionally, the converting the attribute feature of the target into a feature of a preset length includes:
dividing the points corresponding to the attribute features of the target into k parts;
calculating the average of the attribute features over the points in each part, obtaining k averages;
repeating the above two steps n times, with a different value of k each time, where each k is less than the maximum possible number of points corresponding to the attribute features of the target, and n is an integer greater than 1;
and forming the feature of the preset length from all the obtained averages.
Optionally, the neural network is used for detecting attributes of lane lines, the sample image is a road sample image, and the target is a lane line;
the determining the attribute features belonging to the target in the attribute feature map of the sample image according to the labeled mask map of the sample image comprises:
according to the labeled mask map of the road sample image, determining the attribute features belonging to the lane lines in the attribute feature map of the road sample image, wherein the labeled mask map of the road sample image represents the positions of the lane lines in the road sample image;
the determining the attribute of the target according to the attribute feature of the target comprises:
determining the attribute of the lane line according to the attribute characteristic of the lane line;
the adjusting the network parameter value of the neural network according to the determined difference between the attribute of the target and the labeled attribute of the target and the difference between the labeled mask map and the mask map of the sample image determined after semantic segmentation of the sample image comprises:
and adjusting the network parameter values of the neural network according to the difference between the determined attribute of the lane line and the annotated attribute of the lane line, and the difference between the annotated mask map of the road sample image and the mask map of the lane line determined according to the region feature map of the road sample image.
The embodiment of the present disclosure further provides an intelligent driving method, including:
detecting the attribute of the lane line in the road image acquired by the intelligent driving equipment by using any one target attribute detection method;
and indicating the intelligent driving equipment to drive on the road corresponding to the road image according to the detected attribute of the lane line.
The embodiment of the present disclosure also provides a target attribute detection apparatus, which includes a first processing module, a second processing module, and a third processing module, wherein:
the first processing module is used for performing semantic segmentation on an image to be processed and determining a mask map of the image to be processed, wherein the mask map represents the position of a target in the image to be processed;
the second processing module is configured to determine, according to the mask map, attribute features belonging to the target in the attribute feature map of the image to be processed, where the attribute feature map of the image to be processed characterizes the attributes of the image to be processed;
and the third processing module is used for determining the attribute of the target according to the attribute characteristic of the target.
Optionally, the third processing module is configured to: converting the attribute characteristics of the target into characteristics with preset length; and determining the attribute of the target according to the converted attribute characteristic of the target with the preset length.
Optionally, in terms of converting the attribute features of the target into a feature of a preset length, the third processing module is configured to:
divide the points corresponding to the attribute features of the target into k parts; calculate the average of the attribute features over the points in each part, obtaining k averages; repeat the above two steps n times, with a different value of k each time, where each k is less than the maximum possible number of points corresponding to the attribute features of the target, and n is an integer greater than 1; and form the feature of the preset length from all the obtained averages.
Optionally, the target attribute detection device is a lane line attribute detection device, and the image to be processed is a road image;
the first processing module is configured to: extract features of the road image, and determine a region feature map of the road image and an attribute feature map of the road image; and determine a mask map of the lane lines in the road image according to the region feature map of the road image;
the second processing module is configured to: determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image;
the third processing module is configured to: and determining the attribute of the lane line according to the attribute characteristic of the lane line.
Optionally, the third processing module is further configured to, after determining the attribute of the lane line, determine a lane line in the road image according to the road image, the determined mask map of the lane line in the road image, and the determined attribute of the lane line.
Optionally, the target attribute detection apparatus is implemented based on a neural network, and the neural network is trained by using a sample image, an annotated mask map of the sample image, and an annotated attribute of a target of the sample image.
The disclosed embodiment also provides a neural network training device, which comprises a fourth processing module, a fifth processing module and an adjusting module, wherein,
the fourth processing module is configured to determine, according to the annotated mask map of the sample image, attribute features belonging to the target in the attribute feature map of the sample image, where the annotated mask map characterizes the position of the target in the sample image, and the attribute feature map of the sample image characterizes the attributes of the sample image;
the fifth processing module is used for determining the attribute of the target according to the attribute characteristic of the target;
and the adjusting module is configured to adjust the network parameter values of the neural network according to the difference between the determined attribute of the target and the annotated attribute of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
Optionally, the fifth processing module is configured to: converting the attribute characteristics of the target into characteristics with preset length; and determining the attribute of the target according to the converted attribute characteristic of the target with the preset length.
Optionally, in terms of converting the attribute features of the target into a feature of a preset length, the fifth processing module is configured to: divide the points corresponding to the attribute features of the target into k parts; calculate the average of the attribute features over the points in each part, obtaining k averages; repeat the above two steps n times, with a different value of k each time, where each k is less than the maximum possible number of points corresponding to the attribute features of the target, and n is an integer greater than 1; and form the feature of the preset length from all the obtained averages.
Optionally, the neural network is used for detecting attributes of lane lines, the sample image is a road sample image, and the target is a lane line;
the fourth processing module is configured to: according to the labeled mask map of the road sample image, determining the attribute features belonging to the lane line in the attribute feature map of the road sample image, wherein the labeled mask map of the road sample image represents the position of the lane line in the road sample image;
the fifth processing module is configured to: determining the attribute of the lane line according to the attribute characteristics of the lane line;
the adjusting module is configured to: adjust the network parameter values of the neural network according to the difference between the determined attribute of the lane line and the annotated attribute of the lane line, and the difference between the annotated mask map of the road sample image and the mask map of the lane line determined according to the region feature map of the road sample image.
The embodiment of the present disclosure further provides an intelligent driving device, which includes a detection module and an indication module, wherein,
the detection module is used for detecting the attribute of the lane line in the road image acquired by the intelligent driving equipment by using any one target attribute detection method;
and the indicating module is used for indicating the intelligent driving equipment to drive on the road corresponding to the road image according to the detected attribute of the lane line.
An embodiment of the present disclosure also provides an electronic device, including a processor and a memory for storing a computer program capable of running on the processor, wherein:
the processor is configured to execute any one of the above target attribute detection methods, any one of the above neural network training methods, or any one of the above intelligent driving methods when the computer program is executed.
The embodiment of the present disclosure further provides a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the above target attribute detection methods, or any one of the above neural network training methods, or any one of the above intelligent driving methods.
In the target attribute detection method, neural network training method, intelligent driving method, apparatuses, electronic device, and computer storage medium provided by the embodiments of the present disclosure, semantic segmentation is performed on an image to be processed to determine a mask map of the image, where the mask map represents the position of a target in the image; attribute features belonging to the target are determined in an attribute feature map of the image according to the mask map, where the attribute feature map characterizes the attributes of the image; and the attribute of the target is determined according to the attribute features of the target. In this way, the target attribute detection method provided by the embodiments of the present disclosure divides target attribute detection into two steps: first determining the position of the target in the image to be processed, then determining the attribute features of the target by combining that position with the attribute feature map of the image, and finally determining the attribute of the target from its attribute features.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of a target attribute detection method of an embodiment of the present disclosure;
FIG. 2 is a flowchart of lane line attribute detection according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a neural network training method of an embodiment of the present disclosure;
fig. 4 is a flowchart of an intelligent driving method according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a target attribute detection apparatus according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a structure of a neural network training device according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of the intelligent driving device according to the embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
The present disclosure will be described in further detail below with reference to the drawings and examples. It is to be understood that the examples provided herein are merely illustrative of the present disclosure and are not intended to limit the present disclosure. In addition, the following embodiments are provided for implementing part of the embodiments of the present disclosure, not for implementing all the embodiments of the present disclosure, and the technical solutions described in the embodiments of the present disclosure may be implemented in any combination without conflict.
It should be noted that, in the embodiments of the present disclosure, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus including a series of elements includes not only the explicitly recited elements but also other elements not explicitly listed or inherent to the method or apparatus. Without further limitation, the phrase "comprising a ..." does not exclude the presence of other elements (e.g., steps in a method or units in a device, such as parts of circuits, processors, programs, or software) in the method or device that includes the recited element.
For example, the target attribute detection method, the neural network training method, and the intelligent driving method provided in the embodiments of the present disclosure include a series of steps, but the target attribute detection method, the neural network training method, and the intelligent driving method provided in the embodiments of the present disclosure are not limited to the described steps, and similarly, the target attribute detection device, the neural network training device, and the intelligent driving device provided in the embodiments of the present disclosure include a series of modules, but the device provided in the embodiments of the present disclosure is not limited to include the explicitly described modules, and may include modules that are required to acquire relevant information or perform processing based on the information.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the term "at least one" herein means any one of a variety or any combination of at least two of a variety; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
The disclosed embodiments may be implemented in a computer system comprised of terminals and servers and may be operational with numerous other general purpose or special purpose computing system environments or configurations. Here, the terminal may be a thin client, a thick client, a hand-held or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronics, a network personal computer, a small computer system, etc., and the server may be a server computer system, a small computer system, a mainframe computer system, a distributed cloud computing environment including any of the above, etc.
The electronic devices of the terminal, server, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
In the related art, attribute detection for a target such as a lane line generally adopts either a target classification method or a semantic segmentation method. The target classification method extracts the region where the target is located from the image and inputs that region into a target classification network, which outputs the attribute of the target; its main problem is that the target occupies a small, poorly distinguishable image region, so the classification network struggles to learn useful discriminative features, and the accuracy of the identified attribute is low. The semantic segmentation method predicts an attribute for every pixel of the target and then determines the attribute of the whole target by mode selection, i.e., the attribute occurring most often among the per-pixel attributes is taken as the attribute of the whole target; its main problem is that the attribute belongs to the target as a whole, and per-pixel prediction breaks this integral relation, again leading to low accuracy.
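The mode-selection step of the related-art semantic segmentation approach can be illustrated with a short sketch (illustrative only, not the method disclosed here):

```python
import numpy as np

def vote_attribute(pixel_attrs):
    """Takes the per-pixel attribute predictions of one target and returns the
    attribute that occurs most often (the mode) as the whole-target attribute."""
    values, counts = np.unique(pixel_attrs, return_counts=True)
    return values[np.argmax(counts)]
```

Each pixel votes independently, which is exactly the loss of the whole-target relation that the present disclosure avoids.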
In view of the above technical problems, in some embodiments of the present disclosure, a target attribute detection method is provided, and the embodiments of the present disclosure may be applied to scenes such as image classification, lane line attribute identification, and automatic driving.
Fig. 1 is a flowchart of a target attribute detection method according to an embodiment of the present disclosure, and as shown in fig. 1, the flowchart may include:
step 101: and performing semantic segmentation on the image to be processed, and determining a mask map of the image to be processed, wherein the mask map represents the position of a target in the image to be processed.
Here, the image to be processed is an image for which object attribute recognition is required, and for example, the object in the image to be processed may be a lane line or another object.
For example, the image to be processed may be obtained from a local storage area or a network, and its format may be Joint Photographic Experts Group (JPEG), Bitmap (BMP), Portable Network Graphics (PNG), or another format; it should be noted that the format and source of the image to be processed are merely exemplified here, and the embodiments of the present disclosure do not limit either of them.
In the embodiment of the disclosure, the number of targets in the image to be processed is not limited, and one or more targets may be in the image to be processed; for example, in the case where the target is a lane line, a plurality of lane lines may exist in the image to be processed.
Obviously, in the case that a plurality of objects exist in the image to be processed, the positions of the respective objects in the image to be processed are characterized based on the mask map obtained in step 101.
In practical application, an image to be processed can be input into a trained semantic segmentation network, and in the semantic segmentation network, a mask map of the image to be processed is extracted from the image to be processed.
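As a minimal sketch of this step, the mask map can be read off the segmentation network's per-pixel output by taking the highest-scoring class at each pixel; the function name and the assumption that class 0 is background are illustrative, not part of the disclosure:

```python
import numpy as np

def mask_map_from_logits(seg_logits):
    """Convert per-pixel segmentation logits (H, W, C) into a mask map (H, W).

    Each mask value is the predicted class index; class 0 is assumed to be
    background, and positive values mark pixels belonging to a target
    (e.g. a particular lane line)."""
    return np.argmax(seg_logits, axis=-1)

# Toy 2x3 image with 3 classes: background (0) and two targets (1, 2).
logits = np.array([
    [[5.0, 1.0, 0.0], [0.1, 4.0, 0.2], [0.0, 0.1, 3.0]],
    [[6.0, 0.0, 0.0], [0.2, 5.0, 0.1], [0.1, 0.0, 4.0]],
])
mask = mask_map_from_logits(logits)
print(mask.tolist())  # [[0, 1, 2], [0, 1, 2]]
```

In a real pipeline the logits would come from the trained semantic segmentation network rather than being hand-written as above.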
Step 102: determining attribute features belonging to a target in an attribute feature map of the image to be processed according to the mask map; the attribute feature map of the image to be processed characterizes the attribute of the image to be processed.
In the embodiment of the disclosure, the attribute of the image to be processed can represent the characteristics of the image, such as color, texture, surface roughness, and the like, and the attribute of the image to be processed can be obtained from the attribute of each pixel of the image to be processed; the attribute of the pixel of the image to be processed may represent information such as a color of the pixel of the image to be processed. Similarly, the attribute features of the target may characterize the color, texture, surface roughness, etc. of the target. For example, the attribute feature of the target may be represented as a feature map of a set number of channels, and the set number of channels may be set according to the effect of target attribute identification, for example, the set number of channels is 5, 6 or 7.
Obviously, since the mask map can represent the position of the target in the image to be processed, according to the mask map, the attribute features belonging to the target in the attribute feature map of the image to be processed can be determined.
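A small numpy sketch of step 102, assuming the mask map stores an integer target id per pixel and the attribute feature map has one C-channel vector per pixel; all names here are hypothetical:

```python
import numpy as np

def gather_target_features(attr_map, mask_map, target_id):
    """Collect attribute feature vectors for the pixels of one target.

    attr_map: (H, W, C) attribute feature map of the whole image.
    mask_map: (H, W) mask map; pixels equal to target_id belong to the target.
    Returns an (N, C) array, one C-channel feature per target pixel."""
    ys, xs = np.nonzero(mask_map == target_id)
    return attr_map[ys, xs]

# 2x2 image, 3 attribute channels; target 1 occupies the left column.
attr = np.arange(12, dtype=float).reshape(2, 2, 3)
mask = np.array([[1, 0], [1, 0]])
feats = gather_target_features(attr, mask, target_id=1)
print(feats.shape)  # (2, 3)
```

With several targets in the mask map, calling the function once per target id yields the attribute features of each target separately.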
Step 103: and determining the attribute of the target according to the attribute characteristic of the target.
Here, the attribute of the object may represent information of a color, a size, a shape, and the like of the object in the image to be processed, for example, in the case where the object is a lane line, the attribute of the lane line may represent information of a color, a line width, a line type, and the like of the lane line.
It can be seen that, in the case of multiple objects existing in the image to be processed, the attributes of the respective objects in the image to be processed can be obtained by performing step 103.
In practical application, the attribute features of the target can be input into a trained target classification network, and the attribute features of the target are classified by using the target classification network to obtain the attributes of the target in the image to be processed.
In practical applications, steps 101 to 103 may be implemented by a Processor in an electronic Device, where the Processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), an FPGA, a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
In the embodiments of the present disclosure, a semantic segmentation method is first used to obtain a mask map of the image to be processed; the attribute features of the target are then determined according to the mask map, and the attribute of the target is determined from those attribute features. The target attribute detection method provided by the embodiments of the present disclosure thus proceeds in two steps: first, the position of the target is determined from the image to be processed; then, the attribute features of the target are determined based on that position, and the attribute of the target is determined according to those attribute features. Compared with a scheme that determines the features of the region where the target is located directly from the pixels at the target's position and classifies the target according to those features, no separate feature extraction for classification is needed, and the attribute features extracted by the method provided herein are more discriminative, so the category of the target is determined more accurately. In addition, the method provided herein classifies the target as a whole; compared with a scheme that performs target attribute detection only through semantic segmentation, detecting the attribute from the whole target allows the target attribute to be obtained accurately.
For the implementation mode of determining the attribute of the target according to the attribute feature of the target, exemplarily, the attribute feature of the target may be converted into a feature with a preset length; determining the attribute of the target according to the converted attribute characteristic of the target with the preset length; in the embodiment of the present disclosure, the preset length may be set according to an actual application scenario.
Specifically, one or more targets may exist in the image to be processed, and the attribute features of the one or more targets are converted into features of a preset length; and according to the characteristics of the preset length of the one or more targets, carrying out target classification on the one or more targets to obtain the attributes of the one or more targets.
In addition, when a plurality of objects exist in the image to be processed and the length of the attribute features of the plurality of objects is the same, the objects may be classified directly according to the attribute features of the plurality of objects in the image to be processed to obtain the attributes of the plurality of objects.
It can be understood that some target classification methods need to be implemented on the basis of features of the same length; therefore, in the embodiments of the present disclosure, converting the variable-length attribute features of the target into features of the preset length facilitates target classification and, in turn, facilitates determining the attribute of the target.
For the implementation of converting the attribute features of the target into a feature of a preset length, exemplarily, the points corresponding to the attribute features of the target may be divided into k parts, where k is an integer greater than or equal to 1; the average value of the target's attribute features over the points in each part is calculated, yielding k average values; the above steps are repeated n times, with a different value of k in each execution, where each k is smaller than the maximum possible number of points corresponding to a target's attribute features and n is an integer greater than 1; the obtained average values are then combined to form the feature of the preset length.
Specifically, n divisions are performed on the attribute features of each target, where the i-th division divides the points of the target's attribute features into Ki parts, with i running from 1 to n and Ki denoting the value of k in the i-th division; in the embodiments of the present disclosure, the Ki parts obtained may be of equal or unequal length. The Ki parts obtained in the i-th division are each average-pooled, yielding the average value of the target's attribute features over the points in each part. The features of lengths K1 through Kn obtained in this way are then concatenated into a feature of length P, where

P = K1 + K2 + … + Kn,

and P represents the preset length.
In the embodiments of the present disclosure, k may be set according to the actual situation; for example, if the maximum possible number of points of a target's attribute features is 30, the value of k is less than or equal to 30.
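The n-division procedure above can be sketched in numpy as follows; the split set (1, 2, 3, 6, 8, 10, 12), which yields P = 42, is taken from the lane-line example later in the text, the function name is illustrative, and the sketch assumes each k does not exceed the number of points, as the text requires:

```python
import numpy as np

def fixed_length_feature(points, splits=(1, 2, 3, 6, 8, 10, 12)):
    """Convert a variable-length sequence of per-point attribute values into
    a fixed-length feature by repeated division and average pooling.

    points: (N,) attribute values of one target (N varies per target).
    splits: the values of k used in the n divisions; the output length is
    P = sum(splits) regardless of N (42 for the default split set)."""
    points = np.asarray(points, dtype=float)
    out = []
    for k in splits:
        # Divide the N points into k nearly equal parts and average each part.
        for part in np.array_split(points, k):
            out.append(part.mean())
    return np.array(out)

feat = fixed_length_feature(np.random.rand(30))
print(len(feat))  # 42
```

Because the output length depends only on the split set, targets with different numbers of points all map to features of the same preset length P.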
In a specific example, the target attribute detection method is a lane line attribute detection method, the image to be processed is a road image, and the target is a lane line. Therefore, the feature extraction can be carried out on the road image, and the region feature map of the road image and the attribute feature map of the road image are determined; determining a mask map of a lane line in the road image according to the regional characteristic map of the road image; determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image; and determining the attribute of the lane line according to the attribute characteristics of the lane line.
In the embodiment of the present disclosure, the area feature map of the road image characterizes positions of the lane lines in the road image, and thus, a mask map of the lane lines in the road image can be obtained according to the area feature map of the road image.
Fig. 2 is a flowchart of detecting lane line attributes according to an embodiment of the present disclosure, and referring to fig. 2, in the embodiment of the present disclosure, a road image may be input into a trained semantic segmentation network, in the semantic segmentation network, a lane line segmentation result may be obtained by using the semantic segmentation network, and in fig. 2, the lane line segmentation result may be represented as a region feature map of the road image; moreover, the attribute feature map of the road image can be obtained by utilizing the semantic segmentation network; therefore, a mask map of the lane line can be obtained according to the regional characteristic map of the road image; according to the mask map of the lane line and the attribute feature map of the road image, the attribute features belonging to the lane line in the attribute feature map of the road image can be obtained.
In practical applications, lane lines usually differ in length and angle, so the attribute features of the lane lines obtained in the embodiments of the present disclosure usually differ in length; when the target classification process must be implemented on features of the same length, the attribute features of lane lines of different lengths can first be converted into features of the same length.
In a specific implementation, referring to fig. 2, the attribute features of each lane line may be input to a fixed-length feature extraction module, and the fixed-length feature extraction module may be configured to perform the following steps: dividing points corresponding to the attribute characteristics of each lane line into k parts, wherein k is an integer greater than or equal to 1; calculating the average value of the attribute characteristics of the corresponding lane lines in each share to obtain k average values; repeatedly executing the steps for n times, wherein the values of k are different in the process of executing any two times, k is smaller than the possible maximum number of points corresponding to the attribute characteristics of the target, and n is an integer larger than 1; and using the obtained average value to form the characteristic of the preset length.
In a specific example, the attribute features of one lane line may first be pooled as a whole to obtain 1 feature value; the points of the lane line's attribute features are then divided six further times: into two parts, averaging the values in each part to obtain 2 feature values; into three parts to obtain 3 feature values; into six parts to obtain 6 feature values; into eight parts to obtain 8 feature values; into ten parts to obtain 10 feature values; and into twelve parts to obtain 12 feature values. Combining the obtained 1, 2, 3, 6, 8, 10, and 12 feature values yields a feature of length 42, that is, the feature of the preset length for that lane line.
That is to say, for the attribute features of each lane line, the fixed-length feature extraction module can be used to obtain pixel attribute features of the same length (length 42).
Referring to fig. 2, after the fixed-length feature extraction module is used to convert the attribute features of each lane line into features of the same length, the features of the same length may be input to a trained target classification network, and the input features are subjected to target classification by using the target classification network, so as to obtain the attributes of each lane line.
In one specific example, the target classification network may include two fully-connected layers, where the input of the first fully-connected layer is the pixel attribute feature of the same length (e.g., length 42) corresponding to each target; the first fully-connected layer has 256 nodes, the second fully-connected layer has 128 nodes, and the second fully-connected layer outputs the attribute of each target.
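A minimal numpy sketch of such a classification head, using the stated layer sizes (42 -> 256 -> 128); the number of output attribute classes (4 here), the ReLU activations, and the random initialization are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class AttributeClassifier:
    """Two fully-connected layers (42 -> 256 -> 128) followed by an output
    layer mapping to attribute classes. Layer sizes follow the example in
    the text; the output size is a hypothetical choice."""
    def __init__(self, in_dim=42, h1=256, h2=128, n_classes=4):
        self.w1 = rng.normal(0, 0.1, (in_dim, h1))
        self.w2 = rng.normal(0, 0.1, (h1, h2))
        self.w3 = rng.normal(0, 0.1, (h2, n_classes))

    def forward(self, x):
        h = relu(x @ self.w1)
        h = relu(h @ self.w2)
        return h @ self.w3  # one score per candidate attribute class

clf = AttributeClassifier()
scores = clf.forward(np.ones((1, 42)))
print(scores.shape)  # (1, 4)
```

The argmax over the output scores would give the predicted attribute of each lane line; in practice this head would be trained jointly with the rest of the network rather than used with random weights.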
It should be noted that fig. 2 only illustrates a specific application scenario of target attribute detection, that is, lane line attribute identification is performed; the embodiments of the present disclosure are not limited to detecting attributes of lane lines, and for example, may detect attributes of other types of targets.
Further, after the attribute of the lane line is determined, the lane line in the road image may be determined according to the road image, the determined mask map of the lane line in the road image, and the determined attribute of the lane line.
It can be understood that, by determining the lane lines of the road image, assistance to vehicle driving is facilitated, and safety of vehicle driving is improved.
Optionally, the target attribute detection method is executed by a neural network, and the neural network is obtained by training using the sample image, a mask map of the annotation of the sample image, and the attribute of the annotation of the target of the sample image.
Here, the sample image is a predetermined image containing an object, for example, a lane line or other object.
Illustratively, the sample image may be obtained from a local storage area or a network, and the format of the sample image may be JPEG, BMP, PNG, or another format; it should be noted that the format and the source of the sample image given here are merely examples, and the embodiments of the present disclosure do not limit the format or the source of the sample image.
In the embodiment of the present disclosure, the number of targets in the sample image is not limited, and the number of targets in the sample image may be one or multiple; for example, in the case where the target is a lane line, a plurality of lane lines may exist in the sample image.
In practical applications, the labeled mask map of the sample image can be preset. Since the labeled mask map of the sample image characterizes the position of the target in the sample image, the attribute features belonging to the target in the attribute feature map of the sample image can be determined according to the labeled mask map; the neural network being trained can thereby determine the attribute features of the target, and further determine the attribute of the target according to those attribute features.
The training process of the neural network is exemplarily described below with reference to the drawings.
Fig. 3 is a flowchart of a neural network training method according to an embodiment of the present disclosure, and as shown in fig. 3, the flowchart may include:
step 301: determining attribute features belonging to a target in the attribute feature map of the sample image according to the labeled mask map of the sample image; the annotated mask map characterizes the position of the target in the sample image; a property profile of a sample image characterizes a property of the sample image.
In the embodiment of the disclosure, the attribute of the sample image can represent the characteristics of the image, such as color, texture, surface roughness, and the like, and the attribute of the sample image can be obtained from the attribute of each pixel of the sample image; the attribute of the pixel of the sample image may represent information such as a color of the pixel of the sample image. Similarly, the object attribute features may characterize the object color, texture, surface roughness, and the like. For example, the attribute feature of the target may be represented as a feature map of a set number of channels, and the set number of channels may be set according to the effect of target attribute identification, for example, the set number of channels is 5, 6 or 7.
Step 302: and determining the attribute of the target according to the attribute characteristic of the target.
The implementation of this step has already been described in the foregoing description, and is not described herein again.
Step 303: and adjusting the network parameter value of the neural network according to the difference between the determined attribute of the target and the labeled attribute of the target and the difference between the labeled mask image and the mask image of the sample image determined after the semantic segmentation is carried out on the sample image.
For the implementation of this step, for example, the loss of the initial neural network may be calculated according to the difference between the determined attribute of the target and the labeled attribute of the target, and the difference between the labeled mask map and the mask map of the sample image determined after semantic segmentation of the sample image; and adjusting network parameters of the neural network according to the loss of the neural network.
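A hedged numpy sketch of such a loss: one cross-entropy term for the attribute-classification difference plus an averaged per-pixel cross-entropy term for the segmentation difference. The use of cross-entropy and the equal weighting are assumptions; the disclosure only requires that both differences contribute to the loss used to adjust the network parameters:

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy of one prediction against an integer label."""
    z = logits - logits.max()  # stabilize before exponentiating
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def joint_loss(attr_logits, attr_label, seg_pixel_logits, seg_pixel_labels,
               seg_weight=1.0):
    """Total loss = attribute-classification loss + weighted segmentation
    loss (mean cross-entropy over the sample image's pixels)."""
    cls_loss = cross_entropy(attr_logits, attr_label)
    seg_loss = np.mean([cross_entropy(l, y)
                        for l, y in zip(seg_pixel_logits, seg_pixel_labels)])
    return cls_loss + seg_weight * seg_loss

# Both predictions match their labels, so the loss is small.
low = joint_loss(np.array([9.0, 0.0]), 0, [np.array([5.0, 0.0])], [0])
print(float(low))
```

Gradients of this scalar with respect to the network parameters would then drive the parameter adjustment of step 303; a framework with automatic differentiation would normally compute them.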
Step 304: judging whether the processing of the sample image by the neural network after the network parameter adjustment meets a preset requirement; if not, repeat steps 301 to 304; if so, perform step 305.
Optionally, the preset requirement may be that the loss of the neural network after the network parameter adjustment is smaller than a set loss value; in the embodiment of the present disclosure, the set loss value may be preset according to actual requirements.
Step 305: and taking the neural network after the network parameters are adjusted as the trained neural network.
In practical applications, steps 301 to 305 may be implemented by a processor in an electronic device, where the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
In a specific embodiment, the neural network is used for detecting attributes of a lane line, the sample image is a road sample image, and the target is the lane line; in this way, first, the attribute feature belonging to the lane line in the attribute feature map of the road sample image may be determined according to the labeled mask map of the road sample image, where the labeled mask map of the road sample image represents the position of the lane line in the road sample image; then, determining the attribute of the lane line according to the attribute characteristic of the lane line; finally, the network parameter values of the neural network may be adjusted according to the difference between the determined attributes of the lane lines and the labeled attributes of the lane lines, and the difference between the labeled mask map of the road sample image and the mask map of the lane lines (the mask map of the lane lines may be detected by the semantic segmentation network) determined according to the area feature map of the road sample image.
It can be seen that, in the neural network training process, the attribute features of the target are determined according to the labeled mask map, and the attribute of the target is then determined. This is because, in the training stage, the segmentation network within the neural network is not yet well trained; using a mask map of the lane line predicted by an untrained network would prevent the subsequent classification network in the neural network from converging, so the labeled mask map is used to determine the attribute features of the target during training.
In the neural network training process, the target attribute is determined in two steps: first, the attribute features of the target are determined according to the labeled mask map; then, the attribute of the target is determined according to those attribute features. Compared with determining the features of the region where the target is located from the pixels at the target's position in the image to be processed and classifying the target according to those features, more discriminative attribute features can be extracted, so the classification is learned better and the trained neural network detects target attributes more accurately. Moreover, during training, the target is classified as a whole when its attribute is determined; compared with a scheme that detects target attributes only through semantic segmentation, detecting the attribute from the whole target obtains the target attribute more accurately, so the trained neural network has higher detection accuracy.
On the basis of the target attribute detection method provided in the foregoing embodiment, the embodiment of the present disclosure further provides an intelligent Driving method, which may be applied to an intelligent Driving device, where the intelligent Driving device includes, but is not limited to, an automatic Driving vehicle, a vehicle equipped with an Advanced Driving Assistance System (ADAS), a robot equipped with an ADAS, and the like.
Fig. 4 is a flowchart of an intelligent driving method according to an embodiment of the present disclosure, and as shown in fig. 4, the flowchart may include:
step 401: and under the condition that the target attribute detection method is a lane line attribute detection method and the image to be processed is a road image, detecting the lane line attribute in the road image acquired by the intelligent driving equipment by using any one of the above target attribute detection methods.
Here, the lane line attribute includes, but is not limited to, a type of line, a color of the line, a line width, etc., and the type of line may be a single line, a double line, a solid line, or a dotted line; the color of the line may be, for example, white, yellow or blue, or a combination of two colors, etc.
Step 402: and indicating the intelligent driving equipment to drive on the road corresponding to the road image according to the detected attribute of the lane line.
In practical applications, the intelligent driving device can be directly controlled to drive (automatic driving and robot), and the instructions can also be sent to the driver, so that the driver can control the vehicle (such as the vehicle with the ADAS) to drive.
Therefore, the lane line attribute can be obtained based on the lane line attribute detection method, so that the assistance for vehicle driving is facilitated, and the safety of vehicle driving is improved.
It will be understood by those skilled in the art that, in the above method of this embodiment, the written order of the steps does not imply a strict execution order and should not be construed as limiting the implementation process; the specific execution order of the steps should be determined by their functions and possible inherent logic.
On the basis of the target attribute detection method provided by the foregoing embodiment, the embodiment of the present disclosure provides a target attribute detection apparatus.
Fig. 5 is a schematic structural diagram of a target attribute detection apparatus according to an embodiment of the present disclosure, and as shown in fig. 5, the apparatus includes: a first processing module 501, a second processing module 502, and a third processing module 503, wherein,
the first processing module 501 is configured to perform semantic segmentation on an image to be processed, and determine a mask map of the image to be processed, where the mask map represents a position of a target in the image to be processed;
a second processing module 502, configured to determine, according to the mask map, attribute features belonging to the target in an attribute feature map of the image to be processed; the attribute feature map of the image to be processed characterizes the attribute of the image to be processed;
a third processing module 503, configured to determine an attribute of the target according to the attribute feature of the target.
Optionally, the third processing module 503 is configured to: converting the attribute characteristics of the target into characteristics with preset length; and determining the attribute of the target according to the converted attribute characteristic of the target with the preset length.
Optionally, the third processing module 503 is configured to, in terms of converting the attribute feature of the target into a feature with a preset length, be configured to: dividing points corresponding to the attribute characteristics of the target into k parts; calculating the average value of the attribute characteristics of the target corresponding to each point to obtain k average values; repeatedly executing the steps for n times, wherein the values of k are different in the process of executing any two times, k is smaller than the possible maximum number of points corresponding to the attribute characteristics of the target, and n is an integer larger than 1; and using the obtained average value to form the characteristic of the preset length.
Optionally, the target attribute detection device is a lane line attribute detection device, and the image to be processed is a road image;
the first processing module 501 is configured to: extracting the characteristics of the road image, and determining a region characteristic diagram of the road image and an attribute characteristic diagram of the road image; determining a mask map of a lane line in the road image according to the regional characteristic map of the road image;
the second processing module 502 is configured to: determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image;
the third processing module 503 is configured to: and determining the attribute of the lane line according to the attribute characteristic of the lane line.
Optionally, the third processing module 503 is further configured to, after determining the attribute of the lane line, determine a lane line in the road image according to the road image, the determined mask map of the lane line in the road image, and the determined attribute of the lane line.
Optionally, the target attribute detection apparatus is implemented based on a neural network, and the neural network is trained by using a sample image, a mask map of an annotation of the sample image, and an attribute of an annotation of a target of the sample image.
In practical applications, the first processing module 501, the second processing module 502, and the third processing module 503 may all be implemented by a processor in an electronic device, and the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
Fig. 6 is a schematic structural diagram of a neural network training device according to an embodiment of the present disclosure, and as shown in fig. 6, the device includes: a fourth processing module 601, a fifth processing module 602, and an adjustment module 603, wherein,
a fourth processing module 601, configured to determine, according to the labeled mask map of the sample image, attribute features belonging to the target in the attribute feature map of the sample image; the labeled mask map characterizes the position of the target in the sample image; the attribute feature map of the sample image characterizes the attribute of the sample image;
a fifth processing module 602, configured to determine an attribute of the target according to the attribute feature of the target;
an adjusting module 603, configured to adjust a network parameter value of the neural network according to a difference between the determined attribute of the target and the labeled attribute of the target, and a difference between the labeled mask map and the mask map of the sample image determined after performing semantic segmentation on the sample image.
Optionally, the fifth processing module 602 is configured to: converting the attribute characteristics of the target into characteristics with preset length; and determining the attribute of the target according to the converted attribute characteristic of the target with the preset length.
Optionally, the fifth processing module 602 is configured to, in terms of converting the attribute feature of the target into a feature with a preset length, be configured to: dividing points corresponding to the attribute characteristics of the target into k parts; calculating the average value of the attribute characteristics of the target corresponding to each point to obtain k average values; repeatedly executing the steps for n times, wherein the values of k are different in the process of executing any two times, k is smaller than the possible maximum number of points corresponding to the attribute characteristics of the target, and n is an integer larger than 1; and using the obtained average value to form the characteristic of the preset length.
Optionally, the neural network is used for detecting attributes of lane lines, the sample image is a road sample image, and the target is a lane line;
the fourth processing module 601 is configured to: according to the labeled mask map of the road sample image, determining the attribute features belonging to the lane line in the attribute feature map of the road sample image, wherein the labeled mask map of the road sample image represents the position of the lane line in the road sample image;
the fifth processing module 602 is configured to: determining the attribute of the lane line according to the attribute characteristic of the lane line;
the adjusting module 603 is configured to: adjust the network parameter values of the neural network according to the difference between the determined attribute of the lane line and the labeled attribute of the lane line, and the difference between the labeled mask map of the road sample image and the mask map of the lane line determined according to the regional feature map of the road sample image.
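The two-term adjustment performed by the adjusting module 603 can be sketched as a training loss with one attribute term and one mask term. The sketch below is an illustration only: the NumPy implementation, the function names, and the use of softmax cross-entropy for both terms are assumptions, not taken from the disclosure.

```python
import numpy as np

def softmax_xent(logits, labels):
    """Mean cross-entropy between logits of shape (N, C) and integer
    labels of shape (N,). Used here for both loss terms."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def joint_loss(pred_attr, gt_attr, pred_mask_pixels, gt_mask_pixels,
               seg_weight=1.0):
    """Two-term objective: attribute difference plus mask difference.

    pred_attr: (N, num_attr_classes) predicted attribute logits per target
    gt_attr:   (N,) labeled attribute indices
    pred_mask_pixels: (P, num_seg_classes) per-pixel segmentation logits
    gt_mask_pixels:   (P,) labeled mask classes per pixel
    seg_weight: assumed relative weight of the segmentation term
    """
    attr_loss = softmax_xent(pred_attr, gt_attr)
    mask_loss = softmax_xent(pred_mask_pixels, gt_mask_pixels)
    return attr_loss + seg_weight * mask_loss
```

Minimizing this joint objective adjusts the shared network parameters so that both the predicted attributes and the predicted mask map move toward their annotations, which matches the two differences named in the module description.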
In practical applications, the fourth processing module 601, the fifth processing module 602, and the adjusting module 603 may all be implemented by a processor in an electronic device, where the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
Fig. 7 is a schematic structural diagram of an intelligent driving device according to an embodiment of the present disclosure. As shown in fig. 7, the intelligent driving device includes: a detection module 701 and an indication module 702, wherein,
the detection module 701 is configured to, when the target attribute detection method is a lane line attribute detection method and the image to be processed is a road image, detect a lane line attribute in the road image acquired by the intelligent driving device by using any one of the above-mentioned target attribute detection methods;
the indication module 702 is configured to indicate the intelligent driving device to drive on the road corresponding to the road image according to the detected attribute of the lane line.
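As an illustration of how detected lane line attributes might be turned into a driving indication, the sketch below maps per-side line types to lane-change permissions. The attribute vocabulary (`"dashed"`, `"solid"`) and the function names are assumptions for illustration only; the disclosure does not specify this mapping.

```python
# Hypothetical mapping from detected lane-line attributes to a simple
# driving indication for the intelligent driving device.
def lane_change_allowed(line_type: str) -> bool:
    """Return whether crossing a lane line of the given type is permitted.
    Dashed lines generally permit lane changes; solid lines do not."""
    return line_type in {"dashed", "dashed_white", "dashed_yellow"}

def driving_indication(left_line: str, right_line: str) -> dict:
    """Combine the detected attributes of the left and right lane lines
    into per-side lane-change permissions."""
    return {
        "may_change_left": lane_change_allowed(left_line),
        "may_change_right": lane_change_allowed(right_line),
    }

indication = driving_indication("dashed", "solid")
```

In this sketch, a vehicle bounded by a dashed line on the left and a solid line on the right would be permitted to change lanes only to the left.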
In practical applications, the detection module 701 and the indication module 702 may be implemented by a processor in the intelligent driving device, where the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
In addition, the functional modules in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
Based on this understanding, the technical solution of the present embodiment, in essence, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of the present embodiment. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to any one of the target attribute detection method, the neural network training method, or the intelligent driving method in the present embodiment may be stored in a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions in the storage medium corresponding to any one of these methods are read and executed by an electronic device, the target attribute detection method, the neural network training method, or the intelligent driving method of the foregoing embodiments is implemented.
Based on the same technical concept as the foregoing embodiments, referring to fig. 8, an electronic device 80 provided in an embodiment of the present disclosure is illustrated, which may include: a memory 81 and a processor 82; wherein,
the memory 81 for storing computer programs and data;
the processor 82 is configured to execute the computer program stored in the memory to implement any one of the target attribute detection methods of the foregoing embodiments or any one of the neural network training methods described above or any one of the intelligent driving methods described above.
In practical applications, the memory 81 may be a volatile memory such as a RAM, or a non-volatile memory such as a ROM, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD), or a combination of the above types of memories, and provides instructions and data to the processor 82.
The processor 82 may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor. It can be understood that the electronic device for implementing the above processor functions may also be another device, and the embodiments of the present disclosure are not particularly limited in this respect.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and for specific implementation, reference may be made to the description of the above method embodiments, and for brevity, details are not described here again.
The foregoing description of the various embodiments is intended to highlight the differences between the embodiments; the same or similar parts may be referred to each other and are not repeated herein for brevity.
The methods disclosed in the method embodiments provided by the present application can be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in various product embodiments provided by the application can be combined arbitrarily to obtain new product embodiments without conflict.
The features disclosed in the various method or apparatus embodiments provided herein may be combined in any combination to arrive at new method or apparatus embodiments without conflict.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, it is not limited to those embodiments, which are illustrative rather than restrictive; those skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (16)

1. A method for detecting a target attribute, the method comprising:
performing semantic segmentation on an image to be processed, and determining a mask map of the image to be processed, wherein the mask map represents the position of a target in the image to be processed;
according to the mask map, determining attribute features belonging to the target in an attribute feature map of the image to be processed; wherein the attribute feature map of the image to be processed characterizes attributes of the image to be processed;
determining the attribute of the target according to the attribute characteristic of the target;
the determining the attribute of the target according to the attribute feature of the target comprises:
dividing points corresponding to the attribute feature of the target into k parts; calculating the average value of the attribute features of the target corresponding to the points in each part to obtain k average values; repeatedly executing the above steps n times, wherein the value of k differs between any two executions, k is smaller than the maximum possible number of points corresponding to the attribute feature of the target, and n is an integer greater than 1; forming an attribute feature of the target with a preset length by using the obtained average values; and determining the attribute of the target according to the attribute feature of the target with the preset length.
2. The method according to claim 1, wherein the target attribute detection method is a lane line attribute detection method, and the image to be processed is a road image;
wherein the performing semantic segmentation on the image to be processed and determining the mask map of the image to be processed comprises:
extracting features of the road image, and determining a regional feature map of the road image and an attribute feature map of the road image;
determining a mask map of a lane line in the road image according to the regional feature map of the road image;
the determining, according to the mask map, the attribute features belonging to the target in the attribute feature map of the image to be processed comprises:
determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image;
the determining the attribute of the target according to the attribute feature of the target comprises:
determining the attribute of the lane line according to the attribute feature of the lane line.
3. The method of claim 2, wherein after determining the attribute of the lane line, the method further comprises:
and determining the lane lines in the road image according to the road image, the determined mask map of the lane lines in the road image and the determined attributes of the lane lines.
4. The method according to any one of claims 1 to 3, wherein the target attribute detection method is performed by a neural network, and the neural network is trained using a sample image, an annotated mask map of the sample image, and annotated attributes of a target of the sample image.
5. A method of training a neural network, comprising:
determining attribute features belonging to a target in an attribute feature map of a sample image according to an annotated mask map of the sample image; wherein the annotated mask map characterizes a location of the target in the sample image, and the attribute feature map of the sample image characterizes attributes of the sample image;
determining the attribute of the target according to the attribute characteristic of the target;
adjusting network parameter values of the neural network according to a difference between the determined attribute of the target and the labeled attribute of the target, and a difference between the labeled mask map and a mask map of the sample image determined after semantic segmentation is performed on the sample image;
the determining the attribute of the target according to the attribute feature of the target comprises:
dividing points corresponding to the attribute feature of the target into k parts; calculating the average value of the attribute features of the target corresponding to the points in each part to obtain k average values; repeatedly executing the above steps n times, wherein the value of k differs between any two executions, k is smaller than the maximum possible number of points corresponding to the attribute feature of the target, and n is an integer greater than 1; forming an attribute feature of the target with a preset length by using the obtained average values; and determining the attribute of the target according to the attribute feature of the target with the preset length.
6. The method of claim 5, wherein the neural network is used for lane line attribute detection, the sample image is a road sample image, and the target is a lane line;
the determining the attribute features belonging to the target in the attribute feature map of the sample image according to the labeled mask map of the sample image comprises:
according to the labeled mask map of the road sample image, determining the attribute features belonging to the lane line in the attribute feature map of the road sample image, wherein the labeled mask map of the road sample image represents the position of the lane line in the road sample image;
the determining the attribute of the target according to the attribute feature of the target comprises:
determining the attribute of the lane line according to the attribute characteristic of the lane line;
the adjusting the network parameter values of the neural network according to the difference between the determined attribute of the target and the labeled attribute of the target and the difference between the labeled mask map and the mask map of the sample image determined after semantic segmentation of the sample image comprises:
adjusting the network parameter values of the neural network according to the difference between the determined attribute of the lane line and the labeled attribute of the lane line, and the difference between the labeled mask map of the road sample image and the mask map of the lane line determined according to the regional feature map of the road sample image.
7. An intelligent driving method, comprising:
detecting lane line attributes in a road image acquired by an intelligent driving device by using the method of any one of claims 2 to 4;
and indicating the intelligent driving device to drive on the road corresponding to the road image according to the detected lane line attribute.
8. An object property detection apparatus, characterized in that the apparatus comprises a first processing module, a second processing module and a third processing module, wherein,
the first processing module is used for performing semantic segmentation on an image to be processed and determining a mask map of the image to be processed, wherein the mask map represents the position of a target in the image to be processed;
the second processing module is used for determining the attribute features belonging to the target in the attribute feature map of the image to be processed according to the mask map; the attribute characteristic chart of the image to be processed characterizes the attribute of the image to be processed;
the third processing module is configured to divide points corresponding to the attribute feature of the target into k parts; calculate the average value of the attribute features of the target corresponding to the points in each part to obtain k average values; repeatedly execute the above steps n times, wherein the value of k differs between any two executions, k is smaller than the maximum possible number of points corresponding to the attribute feature of the target, and n is an integer greater than 1; form an attribute feature of the target with a preset length by using the obtained average values; and determine the attribute of the target according to the attribute feature of the target with the preset length.
9. The apparatus according to claim 8, wherein the target attribute detection means is a lane line attribute detection means, and the image to be processed is a road image;
the first processing module is configured to: extracting the characteristics of the road image, and determining a region characteristic diagram of the road image and an attribute characteristic diagram of the road image; determining a mask map of a lane line in the road image according to the regional characteristic map of the road image;
the second processing module is configured to: determining attribute features belonging to the lane lines in the attribute feature map of the road image according to the mask map of the lane lines in the road image;
the third processing module is configured to: determine the attribute of the lane line according to the attribute feature of the lane line.
10. The apparatus of claim 9, wherein the third processing module is further configured to determine the lane lines in the road image according to the road image, the determined mask map of the lane lines in the road image, and the determined attributes of the lane lines after determining the attributes of the lane lines.
11. The apparatus according to any one of claims 8 to 10, wherein the target attribute detection apparatus is implemented based on a neural network, and the neural network is trained using a sample image, an annotated mask map of the sample image, and annotated attributes of a target of the sample image.
12. A neural network training device, comprising a fourth processing module, a fifth processing module and an adjusting module, wherein,
the fourth processing module is configured to determine attribute features belonging to a target in an attribute feature map of a sample image according to an annotated mask map of the sample image; wherein the annotated mask map characterizes a location of the target in the sample image, and the attribute feature map of the sample image characterizes attributes of the sample image;
the fifth processing module is configured to divide points corresponding to the attribute feature of the target into k parts; calculate the average value of the attribute features of the target corresponding to the points in each part to obtain k average values; repeatedly execute the above steps n times, wherein the value of k differs between any two executions, k is smaller than the maximum possible number of points corresponding to the attribute feature of the target, and n is an integer greater than 1; form an attribute feature of the target with a preset length by using the obtained average values; and determine the attribute of the target according to the attribute feature of the target with the preset length;
the adjusting module is configured to adjust the network parameter values of the neural network according to the difference between the determined attribute of the target and the labeled attribute of the target, and the difference between the labeled mask map and the mask map of the sample image determined after semantic segmentation is performed on the sample image.
13. The apparatus of claim 12, wherein the neural network is used for lane line attribute detection, the sample image is a road sample image, and the target is a lane line;
the fourth processing module is configured to: according to the labeled mask map of the road sample image, determining the attribute features belonging to the lane line in the attribute feature map of the road sample image, wherein the labeled mask map of the road sample image represents the position of the lane line in the road sample image;
the fifth processing module is configured to: determining the attribute of the lane line according to the attribute characteristic of the lane line;
the adjusting module is configured to: adjust the network parameter values of the neural network according to the difference between the determined attribute of the lane line and the labeled attribute of the lane line, and the difference between the labeled mask map of the road sample image and the mask map of the lane line determined according to the regional feature map of the road sample image.
14. The intelligent driving device is characterized by comprising a detection module and an indication module, wherein,
the detection module is configured to detect the attribute of the lane line in the road image acquired by the intelligent driving device by using the method of any one of claims 2 to 4;
the indication module is configured to indicate the intelligent driving device to drive on the road corresponding to the road image according to the detected attribute of the lane line.
15. An electronic device comprising a processor and a memory for storing a computer program operable on the processor; wherein,
the processor is configured to execute the target property detection method according to any one of claims 1 to 4, the neural network training method according to any one of claims 5 to 6, or the intelligent driving method according to claim 7 when the computer program is executed.
16. A computer storage medium having stored thereon a computer program, characterized in that the computer program, when being executed by a processor, implements the target property detection method of any one of claims 1 to 4 or the neural network training method of any one of claims 5 to 6 or the intelligent driving method of claim 7.
CN201911081216.4A 2019-11-07 2019-11-07 Target attribute detection, neural network training and intelligent driving method and device Active CN112785595B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201911081216.4A CN112785595B (en) 2019-11-07 2019-11-07 Target attribute detection, neural network training and intelligent driving method and device
JP2021533200A JP2022513781A (en) 2019-11-07 2020-09-08 Target attribute detection, neural network training and intelligent driving methods, equipment
PCT/CN2020/114109 WO2021088505A1 (en) 2019-11-07 2020-09-08 Target attribute detection, neural network training and intelligent driving methods and apparatuses
KR1020217016723A KR20210087496A (en) 2019-11-07 2020-09-08 Object property detection, neural network training and intelligent driving method, device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911081216.4A CN112785595B (en) 2019-11-07 2019-11-07 Target attribute detection, neural network training and intelligent driving method and device

Publications (2)

Publication Number Publication Date
CN112785595A CN112785595A (en) 2021-05-11
CN112785595B true CN112785595B (en) 2023-02-28

Family

ID=75747824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911081216.4A Active CN112785595B (en) 2019-11-07 2019-11-07 Target attribute detection, neural network training and intelligent driving method and device

Country Status (4)

Country Link
JP (1) JP2022513781A (en)
KR (1) KR20210087496A (en)
CN (1) CN112785595B (en)
WO (1) WO2021088505A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023146286A1 (en) * 2022-01-28 2023-08-03 삼성전자 주식회사 Electronic device and method for improving quality of image
CN115661556B (en) * 2022-10-20 2024-04-12 南京领行科技股份有限公司 Image processing method and device, electronic equipment and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN105718870A (en) * 2016-01-15 2016-06-29 武汉光庭科技有限公司 Road marking line extracting method based on forward camera head in automatic driving
CN107729880A (en) * 2017-11-15 2018-02-23 北京小米移动软件有限公司 Method for detecting human face and device
CN108764137A (en) * 2018-05-29 2018-11-06 福州大学 Vehicle traveling lane localization method based on semantic segmentation
CN110163069A (en) * 2019-01-04 2019-08-23 深圳市布谷鸟科技有限公司 Method for detecting lane lines for assisting driving
US20190311202A1 (en) * 2018-04-10 2019-10-10 Adobe Inc. Video object segmentation by reference-guided mask propagation
CN110414428A (en) * 2019-07-26 2019-11-05 厦门美图之家科技有限公司 A method of generating face character information identification model

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
JP3352655B2 (en) * 1999-09-22 2002-12-03 富士重工業株式会社 Lane recognition device
JP4292250B2 (en) * 2004-07-02 2009-07-08 トヨタ自動車株式会社 Road environment recognition method and road environment recognition device
JP5664152B2 (en) * 2009-12-25 2015-02-04 株式会社リコー Imaging device, in-vehicle imaging system, and object identification device
JP6569280B2 (en) * 2015-04-15 2019-09-04 日産自動車株式会社 Road marking detection device and road marking detection method
CN105260699B (en) * 2015-09-10 2018-06-26 百度在线网络技术(北京)有限公司 A kind of processing method and processing device of lane line data
CN105956122A (en) * 2016-05-03 2016-09-21 无锡雅座在线科技发展有限公司 Object attribute determining method and device
JP6802756B2 (en) * 2017-05-18 2020-12-16 株式会社デンソーアイティーラボラトリ Recognition system, common feature extraction unit, and recognition system configuration method
US10679351B2 (en) * 2017-08-18 2020-06-09 Samsung Electronics Co., Ltd. System and method for semantic segmentation of images
CN108229386B (en) * 2017-12-29 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, and medium for detecting lane line
KR102541561B1 (en) * 2018-02-12 2023-06-08 삼성전자주식회사 Method of providing information for driving vehicle and apparatus thereof

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN105718870A (en) * 2016-01-15 2016-06-29 武汉光庭科技有限公司 Road marking line extracting method based on forward camera head in automatic driving
CN107729880A (en) * 2017-11-15 2018-02-23 北京小米移动软件有限公司 Method for detecting human face and device
US20190311202A1 (en) * 2018-04-10 2019-10-10 Adobe Inc. Video object segmentation by reference-guided mask propagation
CN108764137A (en) * 2018-05-29 2018-11-06 福州大学 Vehicle traveling lane localization method based on semantic segmentation
CN110163069A (en) * 2019-01-04 2019-08-23 深圳市布谷鸟科技有限公司 Method for detecting lane lines for assisting driving
CN110414428A (en) * 2019-07-26 2019-11-05 厦门美图之家科技有限公司 A method of generating face character information identification model

Also Published As

Publication number Publication date
KR20210087496A (en) 2021-07-12
CN112785595A (en) 2021-05-11
JP2022513781A (en) 2022-02-09
WO2021088505A1 (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN108304835B (en) character detection method and device
CN108038474B (en) Face detection method, convolutional neural network parameter training method, device and medium
US11416710B2 (en) Feature representation device, feature representation method, and program
WO2023138300A1 (en) Target detection method, and moving-target tracking method using same
CN110176024B (en) Method, device, equipment and storage medium for detecting target in video
US20150154471A1 (en) Image processing device and method, and computer readable medium
CN110795976A (en) Method, device and equipment for training object detection model
CN111461260B (en) Target detection method, device and equipment based on feature fusion and storage medium
CN111814753A (en) Target detection method and device under foggy weather condition
CN108182421A (en) Methods of video segmentation and device
US11887346B2 (en) Systems and methods for image feature extraction
CN112785595B (en) Target attribute detection, neural network training and intelligent driving method and device
US11538238B2 (en) Method and system for performing image classification for object recognition
CN115578616A (en) Training method, segmentation method and device of multi-scale object instance segmentation model
CN111325265B (en) Detection method and device for tampered image
CN112241736B (en) Text detection method and device
CN115131634A (en) Image recognition method, device, equipment, storage medium and computer program product
CN112712066B (en) Image recognition method and device, computer equipment and storage medium
CN112560856B (en) License plate detection and identification method, device, equipment and storage medium
CN112287905A (en) Vehicle damage identification method, device, equipment and storage medium
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
CN115984671A (en) Model online updating method and device, electronic equipment and readable storage medium
Acun et al. D3NET (divide and detect drivable area net): deep learning based drivable area detection and its embedded application
CN110689481A (en) Vehicle type identification method and device
CN114399657A (en) Vehicle detection model training method and device, vehicle detection method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant