CN110395257B - Lane line instance detection method and device, and autonomous vehicle


Info

Publication number: CN110395257B
Authority: CN (China)
Prior art keywords: lane line, data, network, processing, pixel
Legal status: Active (granted)
Application number: CN201910295868.1A
Other languages: Chinese (zh)
Other versions: CN110395257A (en)
Inventors: 李天�, 王泮渠, 陈鹏飞
Current Assignee: Tusimple Inc
Original Assignee: Tusimple Inc
Priority claimed from: US15/959,167 (US10970564B2)
Application filed by: Tusimple Inc
Publication of application: CN110395257A
Publication of grant: CN110395257B

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00: Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units, or advanced driver assistance systems for ensuring comfort, stability and safety or drive control systems for propelling or retarding the vehicle
    • B60W30/18: Propelling the vehicle
    • B60W30/18009: Propelling the vehicle related to particular drive situations
    • B60W40/00: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit, e.g. by using mathematical models
    • B60W40/02: Estimation or calculation of driving parameters related to ambient conditions
    • B60W40/06: Road conditions

Abstract

The embodiments of the present application provide a lane line instance detection method and device, and an autonomous vehicle. The method comprises: identifying lane line pixels in feature data to obtain lane line pixel data; identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; and comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.

Description

Lane line instance detection method and device, and autonomous vehicle
Technical Field
The invention relates to the field of machine vision, and in particular to a lane line instance detection method and device, a storage medium, and an autonomous vehicle.
Background
In autonomous driving, a control system typically includes multiple processing subsystems, such as decision-making, path-planning, image-processing, and control subsystems. These subsystems must process large volumes of sensor data and other input data in real time, so the processing burden is heavy and the available processing time is short. The safety and usefulness of an autonomous vehicle, as well as the safety of its passengers, depend on these subsystems operating as required. The processing burden could be handled by equipping the vehicle with power-hungry, expensive data processing systems; however, low cost, low weight, low power consumption, low operating temperature, and a high degree of adaptability and customizability are long-standing expectations for autonomous vehicles. Conventional autonomous vehicle control systems fail to meet these demands while also achieving fast, highly reliable, and highly effective autonomous control.
One significant processing burden of an autonomous vehicle control system is processing the images acquired by onboard cameras. Image processing typically includes semantic segmentation, which identifies image regions corresponding to objects by assigning a semantic class to every pixel in the image. Unlike object detection, which only locates objects, semantic segmentation labels each pixel with the class of the object to which it belongs. Semantic segmentation plays an important role in image analysis and autonomous driving systems. However, it assigns the same class label to every instance of the same object class, so it cannot distinguish different object instances of the same class in an image; for example, it cannot separate multiple lane line instances on a road.
Disclosure of Invention
In view of this, embodiments of the present application provide a lane line instance detection method and device, a storage medium, and an autonomous vehicle, to solve the prior-art problem that lane line instances cannot be effectively identified from image data.
According to one aspect of the embodiments of the present application, a lane line instance detection method is provided, comprising:
during processing, a lane line instance detection device performs feature extraction processing and lane line instance generation processing; wherein
the feature extraction processing comprises extracting feature data from one frame of image data, the feature data including lane line information;
the lane line instance generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line instance extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.
According to another aspect of the embodiments of the present application, a lane line instance detection device is provided, comprising a processor and at least one memory storing at least one machine-executable instruction; the processor executes the at least one machine-executable instruction to implement lane line instance detection processing, which includes:
during processing, performing feature extraction processing and lane line instance generation processing; wherein
the feature extraction processing comprises extracting feature data from one frame of image data, the feature data including lane line information;
the lane line instance generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line instance extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.
According to another aspect of the embodiments of the present application, a non-volatile storage medium is provided, storing at least one machine-executable instruction; a processor executes the at least one machine-executable instruction to implement lane line instance detection processing, which includes:
during processing, performing feature extraction processing and lane line instance generation processing; wherein
the feature extraction processing comprises extracting feature data from one frame of image data, the feature data including lane line information;
the lane line instance generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line instance extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.
According to another aspect of the embodiments of the present application, an autonomous vehicle is provided, comprising a vehicle-mounted camera and an autonomous driving control system, where the autonomous driving control system includes a lane line instance detection device;
the vehicle-mounted camera is configured to acquire image data of the driving environment of the autonomous vehicle;
the lane line instance detection device is configured to perform feature extraction processing and lane line instance generation processing on the image data acquired by the vehicle-mounted camera to obtain lane line instance data; wherein
the feature extraction processing comprises extracting feature data from one frame of image data, the feature data including lane line information;
the lane line instance generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line instance extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category;
and the autonomous driving control system controls the vehicle according to the lane line instance data output by the lane line instance detection device.
In the technical solution provided by the embodiments of the present application, the image data is subjected to both lane line pixel identification processing and lane line category identification processing: lane line pixels and lane line position categories are identified from the image data, the identified lane line pixels carrying lane line class information and the identified position categories carrying lane line position information. The lane line pixel data is then compared with the lane line position category data, each lane line pixel is assigned its corresponding lane line position category, and the lane line pixels sharing the same position category form one lane line instance. This solution can effectively identify lane line instances from image data; and by executing the lane line pixel identification and lane line category identification in parallel, it improves the speed and efficiency of lane line instance identification. It thereby solves the prior-art problem that lane line instances cannot be effectively identified.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
Fig. 1 is a processing flow chart of a lane line instance detection method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of the deep neural network of a multi-classification task model according to an embodiment of the present application;
Fig. 3a is another processing flow chart of the lane line instance detection method according to an embodiment of the present application;
Fig. 3b is a schematic structural diagram of the deep neural network of a multi-classification task model according to an embodiment of the present application;
Fig. 3c is an example of lane line instances detected by the processing of fig. 1 or fig. 3a;
Fig. 3d is another example of lane line instances detected by the processing of fig. 1 or fig. 3a;
Fig. 4a is a flowchart of pre-training a deep neural network to obtain a multi-classification task model according to an embodiment of the present application;
Fig. 4b is another flowchart of pre-training a deep neural network to obtain a multi-classification task model according to an embodiment of the present application;
Fig. 5 is a flowchart of checking whether training of a branch network is complete according to an embodiment of the present application;
Fig. 6 is a flowchart of iteratively modifying the parameters of a branch network in step 502 of fig. 5;
Fig. 7 is a flowchart of iteratively modifying the parameters of the backbone network in step 502 of fig. 5;
Fig. 8 is a block diagram of a lane line instance detection device according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solution of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
In autonomous driving, large volumes of image data acquired by cameras must be processed in real time to identify lane line instances. In the technical solution provided by the embodiments of the present application, a multi-classification task model is obtained by pre-training. Operating on input image data, the model produces the results of two classification tasks in a single pass: a lane line pixel identification task and a lane line category identification task, yielding lane line pixel data and lane line position category data. The model then compares the lane line pixel data with the lane line position category data and assigns to each lane line pixel its corresponding lane line position category, so that the lane line pixels corresponding to the same position category form one lane line instance; the model outputs at least one detected lane line instance. This solution can effectively identify lane line instances from image data, and by executing the lane line pixel identification and lane line category identification in parallel, it improves the speed and efficiency of lane line instance identification, thereby solving the prior-art problem that lane line instances cannot be effectively identified.
The lane line instance detection method provided by the embodiments of the present application can be applied to autonomous vehicles. An autonomous vehicle typically includes a vehicle-mounted camera for acquiring image data of the vehicle's surroundings. The lane line instance detection device provided by the embodiments of the present application may be located in the autonomous driving control system of an autonomous vehicle, or in an autonomous driving simulation or test device.
Fig. 1 shows the processing flow of a lane line instance detection method provided by an embodiment of the present application. The method includes:
Step 102: during processing, the lane line instance detection device performs feature extraction processing, which comprises extracting feature data from image data, the feature data including lane line information.
During real-time prediction, the image data can be acquired by the vehicle-mounted camera of the autonomous vehicle.
Step 103: the lane line instance detection device performs lane line instance generation processing, which comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line instance extraction processing. The lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.
Both the lane line pixel identification processing and the lane line category identification processing are image multi-classification processing.
In the embodiment shown in fig. 1, the image data undergoes lane line pixel identification and lane line category identification: lane line pixels and lane line position categories are identified from the image data, with the lane line pixels carrying lane line class information and the position categories carrying lane line position information. Comparing the lane line pixel data with the lane line position category data and assigning each lane line pixel its corresponding position category, the lane line pixels with the same position category form one lane line instance.
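For illustration only, the lane line instance extraction step described above can be sketched as follows. This is a minimal sketch, not taken from the patent, assuming the two branch outputs are available as a binary lane-pixel mask and a per-pixel position-category map, with 0 as an assumed background category:

```python
import numpy as np

def extract_lane_instances(lane_pixel_mask: np.ndarray,
                           position_class_map: np.ndarray) -> dict:
    """Combine the two branch outputs into lane line instances.

    lane_pixel_mask:    (H, W) bool array, True where a pixel was
                        identified as a lane line pixel.
    position_class_map: (H, W) int array giving each pixel's lane line
                        position category (e.g. 1..n left to right,
                        0 assumed to mean background).

    Returns a dict mapping each position category to the (row, col)
    coordinates of its lane line pixels; each entry is one instance.
    """
    instances = {}
    # Assign each lane line pixel the position category predicted at the
    # same location; pixels sharing a category form one instance.
    categories = position_class_map[lane_pixel_mask]
    coords = np.argwhere(lane_pixel_mask)
    for cat in np.unique(categories):
        if cat == 0:  # background position category, not a lane line
            continue
        instances[int(cat)] = coords[categories == cat]
    return instances
```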
In some embodiments, the processing shown in fig. 1 may be performed by a multi-classification task model. As shown in fig. 2, the lane line instance detection device inputs image data into a multi-classification task model 20, which performs the feature extraction processing and the lane line instance generation processing and outputs the lane line instance data.
The multi-classification task model comprises a deep neural network consisting of a backbone network 200, a first branch network 201, a second branch network 202, and an output layer 203. The backbone network 200 performs the feature extraction processing to obtain feature data; the first branch network 201 performs the lane line pixel identification processing, identifying lane line pixels in the feature data to obtain lane line pixel data; the second branch network 202 performs the lane line category identification processing, identifying the lane line position category to which each pixel in the feature data belongs to obtain lane line position category data; and the output layer 203 performs the lane line instance extraction processing, comparing the lane line pixel data with the lane line position category data and assigning each lane line pixel its corresponding lane line position category to obtain lane line instance data.
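The topology of fig. 2 maps naturally onto a shared-backbone, two-head network. The following PyTorch sketch is illustrative only: the convolution sizes, class counts, and the use of 1x1 convolutions as branch heads are assumptions, not details from the patent:

```python
import torch
import torch.nn as nn

class LaneInstanceNet(nn.Module):
    """Backbone 200 + branch 201 (lane pixels) + branch 202 (position categories)."""

    def __init__(self, num_pixel_classes: int = 2, num_position_classes: int = 5):
        super().__init__()
        # Backbone: shared feature extraction (layer sizes are illustrative).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        # First branch: per-pixel lane / non-lane logits.
        self.pixel_branch = nn.Conv2d(64, num_pixel_classes, 1)
        # Second branch: per-pixel lane line position category logits.
        self.position_branch = nn.Conv2d(64, num_position_classes, 1)

    def forward(self, image: torch.Tensor):
        features = self.backbone(image)
        pixel_logits = self.pixel_branch(features)
        position_logits = self.position_branch(features)
        # Output layer 203 at prediction time: argmax over raw logits
        # (assuming class 1 means "lane line pixel"); no softmax needed,
        # consistent with the deployed model described further below.
        lane_mask = pixel_logits.argmax(dim=1) == 1
        position_map = position_logits.argmax(dim=1)
        return lane_mask, position_map
```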
The multi-classification task model shown in fig. 2 may be trained in advance, and may then perform lane line instance detection during real-time operation of an autonomous vehicle, or in an autonomous driving simulation or test scenario.
Before the processing shown in fig. 1, the lane line instance detection processing provided by the embodiments of the present application may further include pre-training to obtain the multi-classification task model, as shown in fig. 3a:
Step 100: during pre-training, set up the structure of a deep neural network used to learn the lane line instance detection processing.
The structure of the deep neural network comprises a backbone network, a first branch network, a second branch network, and an output layer, where the backbone network and both branch networks each comprise multiple network layers. The input data of the deep neural network is the input of the backbone network; the output of the backbone network is the input of each branch network; the outputs of the branch networks are the inputs of the output layer; and the output of the output layer is the output of the deep neural network.
The backbone network learns the task features common to the lane line pixel identification task and the lane line category identification task, namely the feature extraction processing. Each branch network learns the classification features specific to its own task: the first branch network learns the lane line pixel identification processing, and the second branch network learns the lane line category identification processing. The output layer learns the lane line instance extraction processing.
Further, as shown in fig. 3b, a gradient balancing layer may be placed as the last layer of the backbone network. The gradient balancing layer is used only during backpropagation, to determine the gradient passed back from the branch networks into the backbone network, so that the backbone network learns the tasks in a more balanced way.
Step 101: iteratively train the configured deep neural network with a supervised method on the training data to obtain the multi-classification task model.
Step 102 and step 103 are the same as the processing in fig. 1, and are not described again here.
When training the first and second branch networks in the above process: the first branch network learns pixel-class identification as an image multi-classification task; in a typical operating environment of an autonomous vehicle, the pixel classes in an image may include lane line pixels, road edge pixels, background pixels, and others. The second branch network learns lane line position category identification as an image multi-classification task; for training, lane line positions can be categorized in advance, for example n lane line positions numbered 1 to n from left to right in the image. The second branch network thus learns the lane line position category to which each image pixel belongs.
In the technical solution provided by the embodiments of the present application, the multi-classification processing performed by the first branch network effectively yields the lane line pixels in the image data, and the multi-classification processing performed by the second branch network effectively yields the lane line position category of each pixel. Combining the two gives the position category of each lane line pixel, and all lane line pixels corresponding to one position category form the lane line instance at that position. The embodiments of the present application can therefore accurately extract the lane line pixels in image data, determine the lane line position each pixel belongs to, and extract lane line instances from the lane line pixels of each position.
Fig. 3c and fig. 3d show lane line instances output by the processing of fig. 1 or fig. 3a in some examples. In fig. 3c, 510 shows lane line instances annotated in the ground truth data, and 520 shows 510 superimposed on the original image for visualization; in fig. 3d, 610 and 620 are the corresponding ground truth annotation and its overlay. 530 in fig. 3c and 630 in fig. 3d are lane line instances output by the processing of fig. 1 or fig. 3a, and 540 and 640 show them superimposed on the original images for visualization. As the examples in fig. 3c and fig. 3d show, the technical solution provided by the embodiments of the present application extracts lane line instances from image data accurately and effectively.
The pre-training process for obtaining the multi-classification task model is described in detail below. As shown in fig. 4a, it includes:
Step 401: set up a first teacher network and a second teacher network, where the first teacher network corresponds to the lane line pixel identification task and the second teacher network corresponds to the lane line category identification task.
Step 402: train the first and second teacher networks on the ground truth data in the training data, to obtain a first teacher model and a second teacher model that perform the corresponding tasks.
Two teacher networks are set up to perform their respective multi-classification tasks; giving a teacher network a more complex structure yields a teacher model with better performance.
Here, the ground truth data is image data with accurate labels for all categories, for example manually annotated image data.
Step 403: feed partially labeled image data into the first and second teacher models as input; the models each output learning image data for training the corresponding task, where the learning image data is fully labeled image data. Save the learning image data into the training data set.
This step serves two purposes. On the one hand, the teacher networks can predict full labels for partially labeled images; since partially labeled image data cannot be used directly to train the multi-classification task model, and a large amount of it would otherwise be wasted, this expands the training data, whose set then comprises the ground truth data plus the fully labeled learning data predicted from partially labeled data. On the other hand, teacher models trained for the different multi-classification tasks provide a training reference for subsequently obtaining a multi-classification task model fast enough for real-time prediction.
In step 402 and/or step 403, the training data provided to the first teacher network can be chosen to have accurate lane line pixel labels, so that the first teacher network learns more accurate lane line pixel identification; likewise, the training data provided to the second teacher network can be chosen to have accurate lane line position category labels, so that it learns more accurate lane line position category identification.
In specific application scenarios where fully labeled image data (e.g. ground truth data) is plentiful, step 403 can be omitted, i.e. processing jumps from step 402 to step 404, as shown in fig. 4b.
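A hedged sketch of the pseudo-labelling in step 403: `teacher_model` stands for either trained teacher model, assumed to be a standard segmentation network returning per-pixel class scores, and the function and variable names are illustrative, not from the patent:

```python
import torch

@torch.no_grad()
def expand_training_set(teacher_model, partially_labeled_images, training_set):
    """Step 403: a trained teacher model predicts full labels for partially
    labeled images; the fully labeled results join the training set."""
    teacher_model.eval()
    for image in partially_labeled_images:
        logits = teacher_model(image.unsqueeze(0))    # (1, C, H, W) scores
        full_label = logits.argmax(dim=1).squeeze(0)  # (H, W) pseudo-label
        training_set.append((image, full_label))      # now fully labeled
    return training_set
```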
Step 404: set up a student network whose structure comprises a backbone network and multiple branch networks, with a gradient balancing layer as the last layer of the backbone network. Step 404 corresponds to step 100 above.
Step 405: train the student network to learn the teacher models' understanding of their respective tasks, obtaining the multi-classification task model. Step 405 corresponds to step 101 above.
Because the student network has a comparatively simple structure, it achieves a fast processing speed while learning the two teacher models' understanding of their multi-classification tasks, meeting the demands of real-time prediction.
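One common way to let a student network learn a teacher model's "understanding" is knowledge distillation on the branch logits. The patent does not fix the distillation loss, so the following is only an assumed sketch using a temperature-scaled KL divergence:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Per-pixel soft-label loss: the student branch matches the output
    distribution of the teacher model for the same task."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
```

In practice this would be applied per branch, each student branch against its own teacher model, optionally combined with the supervised loss on the labeled training data.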
There are several ways to train the deep neural network, i.e. the student network, in step 405 to obtain the multi-classification task model, including the following:
Mode one: randomly initialize the parameters of the deep neural network (the student network), then iteratively train it with a supervised method on the training data, learning the lane line pixel identification task and the lane line category identification task simultaneously, to obtain the multi-classification task model.
Mode two: initialize the backbone network and the branch network corresponding to the hardest task with the parameters of a predetermined existing model for that task, and randomly initialize the parameters of the other branch network; then iteratively train the other branch network with a supervised method on the training data to learn its task, obtaining the multi-classification task model.
The relative difficulty of the tasks can be determined in advance, for example from training experience.
In mode one, training the deep neural network to learn both tasks simultaneously can make the network's loss value (see step 501) hard to converge to an ideal value. In mode two, because the backbone and one branch are initialized from an existing model and only the other branch is randomly initialized and then trained, the loss value can reach a lower, more ideal value and the training effect is better.
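What the mode-two initialization could look like, assuming parameter names that match the `LaneInstanceNet` sketch above (the checkpoint layout and branch names are assumptions for illustration):

```python
def init_mode_two(student, existing_state_dict, hardest_branch="pixel_branch"):
    """Mode two: copy the backbone and hardest-task branch parameters from
    an existing model; the other branch keeps its random initialization."""
    own = student.state_dict()
    for name, tensor in existing_state_dict.items():
        # Keep only backbone parameters and the branch for the hardest task.
        if name.startswith("backbone.") or name.startswith(hardest_branch):
            own[name] = tensor  # reuse pretrained parameters
    student.load_state_dict(own)
```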
Other equivalent or alternative training modes can be derived by a person of ordinary skill in the art from mode one or mode two; they are not enumerated here.
During training in mode one or mode two, the method provided by the embodiments of the present application checks separately whether each branch network has finished training. As shown in fig. 5, this involves:
Step 501: for a branch network, input the predicted values output by the branch network and the true values from the training data into a loss function predefined for that branch network; the loss function outputs a loss value.
Step 502: if the loss value is greater than a predetermined confidence value, iteratively modify the parameters of the branch network and the backbone network; if the loss value is less than or equal to the confidence value, determine that the iterative training of the branch network is complete.
For example, for the branch network corresponding to the lane line pixel identification task, the predicted pixel classes output by the branch network and the corresponding true values in the training data (e.g. manually annotated object classes) are input into the loss function, which outputs a loss value reflecting the distance, or difference, between prediction and truth.
A loss value greater than the predetermined confidence value means the difference between the predicted values and the true values is still large and the predictions have not converged; a loss value less than or equal to the confidence value means the difference is small, the predictions have converged, and the branch network's iterative training is complete.
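A minimal sketch of the convergence check of fig. 5; `model`, `branch_loss_fn`, and `loader` are hypothetical stand-ins for one branch's prediction path, its predefined loss function, and the training data:

```python
def train_branch(model, branch_loss_fn, loader, optimizer, confidence_value):
    """Fig. 5: keep iterating while the branch's loss value exceeds the
    predetermined confidence value; stop once it converges below it."""
    for image, target in loader:
        prediction = model(image)                  # branch's predicted values
        loss = branch_loss_fn(prediction, target)  # step 501: loss value
        if loss.item() <= confidence_value:        # step 502: converged
            return True                            # branch training complete
        optimizer.zero_grad()
        loss.backward()        # gradients reach the branch and the backbone
        optimizer.step()       # step 502: iteratively modify parameters
    return False               # not yet converged after this pass
```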
In step 502, iteratively modifying the parameters of a branch network, as shown in fig. 6, includes:
Step 5021a: during backpropagation in one training iteration, determine the learning rate of a branch network as the product of the preset network learning rate and the preset weight value of that branch network.
For example, for the branch network i corresponding to the lane line pixel identification task, if the preset weight value of branch network i is a and the network learning rate is α_n, then the learning rate of branch network i is α_i = a × α_n.
Step 5022a: modify the parameters of the branch network according to the learning rate determined for it.
Setting a branch network's learning rate to the product of its weight value and the network learning rate lets the weight value control the learning speed of that branch's parameters.
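The rule α_i = a × α_n maps directly onto per-parameter-group learning rates. Reusing the `LaneInstanceNet` sketch from above, with illustrative weight values:

```python
import torch
from torch import optim

model = LaneInstanceNet()   # the sketch defined after fig. 2 above
alpha_n = 1e-3              # preset network learning rate
a, b = 1.0, 0.5             # preset weight values of the two branches (illustrative)

# Step 5021a: each branch learns at (weight value) x (network learning rate);
# the backbone keeps the plain network learning rate.
optimizer = optim.SGD([
    {"params": model.backbone.parameters(), "lr": alpha_n},
    {"params": model.pixel_branch.parameters(), "lr": a * alpha_n},
    {"params": model.position_branch.parameters(), "lr": b * alpha_n},
], lr=alpha_n, momentum=0.9)
```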
In step 502, iteratively modifying the parameters of the backbone network, as shown in fig. 7, includes:
Step 5021b: during backpropagation in one training iteration, determine the gradient of the gradient balancing layer as the sum of the products of each branch network's gradient and that branch network's preset weight value, and pass the determined gradient value back to the next network layer in the backbone network.
For example, for the first branch network i and the second branch network j, let the gradients the two branch networks pass back to the backbone be gi and gj, and let their preset weight values be a and b (the same weight values used in step 5021a). The gradient balancing layer then determines the gradient as g = a × gi + b × gj.
Step 5022b: for each network layer in the backbone network other than the gradient balancing layer, modify the layer's parameters according to the gradient passed back by the previous layer.
With a gradient balancing layer in the backbone network whose gradient is the weighted sum of the gradients returned by the branch networks, the branch weight values can steer the backbone's learning bias and balance its learning speed across tasks.
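One way to realize the weighted gradient g = a × gi + b × gj with automatic differentiation is a layer that is the identity in the forward pass and scales the gradient in the backward pass; autograd then sums the two branch paths at the backbone output. This is an assumed realization, not the patent's own code:

```python
import torch

class ScaleGrad(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by `weight`
    in the backward pass."""

    @staticmethod
    def forward(ctx, x, weight):
        ctx.weight = weight
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return ctx.weight * grad_output, None  # no gradient for `weight`

def gradient_balanced_forward(model, image):
    features = model.backbone(image)
    # Each branch sees the same features, but its gradient contribution to
    # the backbone is scaled; autograd sums the two paths, so the backbone
    # receives g = a*gi + b*gj, as in step 5021b.
    a, b = 1.0, 0.5  # preset branch weight values (illustrative)
    pixel_logits = model.pixel_branch(ScaleGrad.apply(features, a))
    position_logits = model.position_branch(ScaleGrad.apply(features, b))
    return pixel_logits, position_logits
```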
The above processing yields a model that simultaneously performs lane line pixel identification and lane line position category identification, i.e. the multi-classification task model.
Further, since the multi-classification task model is used for real-time prediction and there is no backpropagation during prediction, the gradient balancing layer in the backbone network can be left out of the deployed model.
Further, the multi-classification task model may also omit the softmax layer. Softmax performs exponential operations, which consume considerable processing resources and slow down the whole prediction pipeline. Omitting the softmax layer significantly improves processing speed without substantially affecting the prediction results, so the model better meets the speed and efficiency demands of real-time prediction.
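Why dropping softmax is harmless at prediction time: softmax is strictly monotonic per pixel, so taking the argmax of the raw logits selects the same category without any exponential operations. A quick check (tensor shape is illustrative):

```python
import torch

logits = torch.randn(1, 5, 4, 4)  # a branch's raw output, (N, C, H, W)
with_softmax = torch.softmax(logits, dim=1).argmax(dim=1)
without_softmax = logits.argmax(dim=1)  # same categories, no exp() cost
assert torch.equal(with_softmax, without_softmax)
```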
In the method provided by the embodiments of the present application, a multi-classification task model is obtained by pre-training. Operating on input image data, the model produces the results of both classification tasks, the lane line pixel identification task and the lane line category identification task, in a single pass, yielding lane line pixel data and lane line position category data. The model then compares the two and assigns to each lane line pixel its corresponding lane line position category, so that the lane line pixels of one position category form one lane line instance, and the model outputs at least one detected lane line instance. This solution can effectively identify lane line instances from image data, and by executing the two identification processes in parallel it improves the speed and efficiency of lane line instance identification, solving the prior-art problem that lane line instances cannot be effectively identified.
Based on the same inventive concept, an embodiment of the present application further provides a lane line instance detection device.
As shown in fig. 8, the lane line instance detection device provided by the embodiments of the present application comprises a processor 801 and at least one memory 802. The at least one memory 802 stores at least one machine-executable instruction, and the processor 801 executes the at least one machine-executable instruction to implement the lane line instance detection processing shown in fig. 1 or fig. 3a, which is not repeated here.
Based on the same inventive concept, an embodiment of the present application further provides a non-volatile storage medium storing at least one machine-executable instruction; a processor executes the at least one machine-executable instruction to implement the lane line instance detection processing shown in fig. 1 or fig. 3a, which is not repeated here.
Based on the same inventive concept, an embodiment of the present application further provides an autonomous vehicle.
The term "vehicle" is to be interpreted broadly in this application to include any moving object including, for example, aircraft, watercraft, spacecraft, automobiles, trucks, vans, semi-trailers, motorcycles, golf carts, off-road vehicles, warehouse or farm vehicles, and vehicles traveling on rails such as trams or trains and other rail vehicles. The "vehicle" in the present application may generally include: power systems, sensor systems, control systems, peripheral devices, and computer systems. In other embodiments, the vehicle may include more, fewer, or different systems.
The power system provides powered motion for the vehicle and includes an engine/motor, a transmission, wheels/tires, and a power unit.
The control system may comprise a combination of devices that control the vehicle and its components, such as a steering unit, a throttle, and a brake unit.
Peripheral devices let the vehicle interact with external sensors, other vehicles, external computing devices, and/or users, and may include wireless communication systems, touch screens, microphones, and/or speakers.
In addition to the vehicle described above, an autonomous vehicle is further provided with a sensor system and an autonomous driving control device.
The sensor system may include several sensors for sensing information about the vehicle's environment, plus one or more actuators that change the positions and/or orientations of the sensors. The sensors may include any combination of global positioning system sensors, inertial measurement units, radio detection and ranging (RADAR) units, cameras, laser rangefinders, light detection and ranging (LIDAR) units, and/or acoustic sensors; the sensor system may also include sensors that monitor the vehicle's internal systems (e.g. an O2 monitor, fuel gauge, or engine thermometer).
The autonomous driving control device may include a processor and a memory, the memory storing at least one machine-executable instruction; the processor executes the at least one machine-executable instruction to implement functions including a map engine, a positioning module, a perception module, a navigation or path module, and an automatic control module. The map engine and positioning module provide map information and positioning information. The perception module perceives objects in the vehicle's environment based on the information acquired by the sensor system and the map information from the map engine. The navigation or path module plans a driving path for the vehicle based on the outputs of the map engine, the positioning module, and the perception module. The automatic control module takes in and analyzes the decision information from the navigation or path module and other modules, converts it into control commands for the vehicle control system, and sends the commands over the onboard network (for example an in-vehicle electronic network implemented with a CAN (controller area network) bus, local interconnect network, or media-oriented systems transport) to the corresponding components of the vehicle control system, realizing automatic control of the vehicle; the automatic control module can also acquire information from the vehicle's components over the onboard network.
In the embodiments of the present application, the perception module includes a lane line instance detection device, which may be the device shown in fig. 8. The perception module outputs the lane line instance data produced by the lane line instance detection device, either directly or after further processing, and provides the result to the navigation or path module, which plans a driving route or path for the vehicle based on the outputs of the map engine, the positioning module, and the perception module.
The autonomous vehicle provided by the embodiments of the present application can effectively detect the lane line instances in its driving environment and plan routes or paths according to the detected lane line instances, achieving safe and effective autonomous driving, with fast and efficient lane line instance detection processing.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (16)

1. A lane line instance detection method, characterized by comprising:
during processing, a lane line instance detection device performs feature extraction processing and lane line instance generation processing; wherein
the feature extraction processing comprises extracting feature data from one frame of image data, the feature data including lane line information;
the lane line instance generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, and lane line instance extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises classifying the lane line positions of the image data within the feature data and identifying the lane line position category to which each pixel belongs, to obtain lane line position category data, where one lane line position category corresponds to a predetermined lane line position; the lane line instance extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel its corresponding lane line position category, to obtain lane line instance data, where the lane line instance data comprises at least one lane line instance, and each lane line instance comprises a plurality of lane line pixels having the same lane line position category.
2. The method according to claim 1, wherein the lane line pixel identification processing and the lane line category identification processing comprise image multi-classification processing.
3. The method according to claim 1, wherein the lane line instance detection device inputs the image data into a multi-classification task model, which performs the feature extraction processing and the lane line instance generation processing and outputs the lane line instance data;
the multi-classification task model comprises a deep neural network comprising a backbone network, a first branch network, a second branch network, and an output layer; the backbone network performs the feature extraction processing; the first branch network performs the lane line pixel identification processing; the second branch network performs the lane line category identification processing; and the output layer performs the lane line instance extraction processing.
4. The method of claim 1, further comprising a pre-training process comprising:
setting up a structure of a deep neural network used to learn a lane line pixel identification task, a lane line category identification task, and a lane line instance extraction task, realizing the lane line instance detection processing;
wherein the structure of the deep neural network comprises a backbone network, a first branch network, a second branch network, and an output layer, the backbone network, the first branch network, and the second branch network each comprising a plurality of network layers; the input data of the deep neural network is the input data of the backbone network, the output data of the backbone network is the input data of each branch network, the output data of each branch network is the input data of the output layer, and the output data of the output layer is the output data of the deep neural network;
the backbone network is used for learning the task processing common to the lane line pixel identification task and the lane line category identification task, the common task processing comprising the feature extraction processing; each branch network is used for learning the classification task processing specific to its corresponding task, where the specific classification task processing of the first branch network comprises the lane line pixel identification processing and that of the second branch network comprises the lane line category identification processing; the output layer is used for learning the lane line instance extraction task, realizing the lane line instance extraction processing;
and iteratively training the set deep neural network using a supervised method according to the training data, to obtain a multi-classification task model.
5. The method of claim 4, wherein iteratively training the set deep neural network using a supervised method according to the training data comprises:
randomly initializing parameters of the deep neural network;
and iteratively training the deep neural network using a supervised method according to the training data while simultaneously learning the lane line pixel identification task and the lane line category identification task.
6. The method of claim 4, wherein iteratively training the set deep neural network using a supervised method according to the training data comprises:
initializing the parameters of the backbone network of the deep neural network and of the branch network corresponding to the task with the greatest learning difficulty using the predetermined parameters of an existing model for that task, and randomly initializing the parameters of the other branch network;
and iteratively training the other branch network of the deep neural network using a supervised method according to the training data, learning the task corresponding to that branch network.
7. The method of claim 4, wherein iteratively training the set deep neural network using a supervised approach based on the training data comprises:
for a branch network, inputting a predicted value output by the branch network and a real value in training data into a predefined loss function, wherein the loss function outputs a loss value;
iteratively modifying parameters in the branch network and the backbone network if the loss value is greater than a predetermined confidence value; determining that the iterative training of the branch network is complete if the loss value is less than or equal to the predetermined confidence value.
8. The method of claim 7, wherein iteratively modifying parameters in the branch network comprises:
in the back propagation process of one iteration training, determining the learning rate of one branch network as the product of the preset network learning rate and the preset weight value of the branch network;
and modifying the parameters in the branch network according to the determined learning rate of the branch network.
9. The method of claim 7, wherein iteratively modifying the parameters in the backbone network comprises:
the last layer of the backbone network comprises a gradient balancing layer;
in the back propagation process of one training iteration, determining the gradient of the gradient balancing layer as the sum of the products of each branch network's gradient and that branch network's preset weight value, and passing the determined gradient value back to the next network layer in the backbone network;
for each network layer except the gradient balancing layer in the backbone network, modifying the parameters of the layer according to the gradient returned by the previous layer.
10. The method of claim 9, wherein the trained multi-classification task model does not include a gradient balancing layer.
11. The method of claim 4, further comprising:
setting up a first teacher network and a second teacher network, wherein the first teacher network corresponds to the lane line pixel identification task and the second teacher network corresponds to the lane line category identification task;
training the first teacher network and the second teacher network respectively according to the ground truth data in the training data, to obtain a first teacher model and a second teacher model that perform the corresponding tasks;
providing partially labeled image data as input to the first teacher model and the second teacher model respectively, the first teacher model and the second teacher model each outputting learning image data used for training the corresponding task, wherein the learning image data is fully labeled image data;
and saving the learning image data into the set of training data.
12. The method of claim 11, further comprising:
the set deep neural network is a student network;
and iteratively training the set deep neural network using a supervised method according to the training data to obtain the multi-classification task model comprises:
training the student network to learn the plurality of teacher models' understanding of the corresponding tasks, obtaining the multi-classification task model.
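A minimal distillation step consistent with claim 12, in which the student's branches imitate the teachers' softened outputs; the temperature and the KL-divergence objective are standard knowledge-distillation choices assumed here, not taken from the claims:

    import torch.nn.functional as F

    T = 4.0                                   # softmax temperature (illustrative)
    feat = backbone(images)
    loss_pixel = F.kl_div(
        F.log_softmax(pixel_branch(feat) / T, dim=1),
        F.softmax(first_teacher(images) / T, dim=1),
        reduction="batchmean") * (T * T)
    loss_category = F.kl_div(
        F.log_softmax(category_branch(feat) / T, dim=1),
        F.softmax(second_teacher(images) / T, dim=1),
        reduction="batchmean") * (T * T)
    loss = loss_pixel + loss_category         # student learns both teachers' understanding

Note that the softmax here lives only in the loss computation, not in the model itself, which fits claim 13's statement that the trained model carries no softmax layer.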
13. The method of claim 4, wherein the trained multi-classification task model does not include a softmax layer.
14. A lane line example detection device, comprising a processor and at least one memory, the at least one memory storing at least one machine-executable instruction, and the processor executing the at least one machine-executable instruction to implement lane line example detection processing; the lane line example detection processing comprises:
in the course of processing, performing feature extraction processing and lane line example generation processing; wherein
the feature extraction processing comprises extracting feature data from image data, the feature data including lane line information;
the lane line example generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line example extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises classifying, in the feature data, the lane line positions in the image data, identifying the lane line position category to which each pixel belongs, and obtaining lane line position category data, one lane line position category corresponding to a predetermined lane line position; the lane line example extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel the lane line position category corresponding to that pixel, obtaining lane line example data, the lane line example data comprising at least one lane line example, each lane line example comprising a plurality of lane line pixels having the same lane line position category.
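The lane line example extraction processing, which compares the pixel mask with the position category map and groups pixels that share a category, can be sketched as follows (NumPy is an assumption, and 0 is taken to mean "no lane position"):

    import numpy as np

    def extract_lane_instances(pixel_mask, category_map):
        # pixel_mask: HxW bool array, True at detected lane line pixels.
        # category_map: HxW int array of lane line position categories.
        # Returns {category id: (N, 2) array of (row, col) coordinates};
        # each entry is one lane line example.
        masked = np.where(pixel_mask, category_map, 0)
        instances = {}
        for cat in np.unique(masked):
            if cat == 0:
                continue
            ys, xs = np.nonzero(masked == cat)
            instances[int(cat)] = np.stack([ys, xs], axis=1)
        return instances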
15. A non-transitory storage medium having at least one machine-executable instruction stored therein, wherein execution of the at least one machine-executable instruction by a processor implements lane line example detection processing, the processing comprising:
in the course of processing, performing feature extraction processing and lane line example generation processing; wherein
the feature extraction processing comprises extracting feature data from image data, the feature data including lane line information;
the lane line example generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line example extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises classifying, in the feature data, the lane line positions in the image data, identifying the lane line position category to which each pixel belongs, and obtaining lane line position category data, one lane line position category corresponding to a predetermined lane line position; the lane line example extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel the lane line position category corresponding to that pixel, obtaining lane line example data, the lane line example data comprising at least one lane line example, each lane line example comprising a plurality of lane line pixels having the same lane line position category.
16. An automatic driving vehicle, comprising a vehicle-mounted camera and an automatic driving control system, wherein the automatic driving control system comprises a lane line example detection device;
the vehicle-mounted camera is configured to acquire image data of the driving environment of the automatic driving vehicle;
the lane line example detection device is configured to perform feature extraction processing and lane line example generation processing on the image data acquired by the vehicle-mounted camera to obtain lane line example data; wherein
the feature extraction processing comprises extracting feature data from image data, the feature data including lane line information;
the lane line example generation processing comprises lane line pixel identification processing and lane line category identification processing executed in parallel, followed by lane line example extraction processing; wherein
the lane line pixel identification processing comprises identifying lane line pixels in the feature data to obtain lane line pixel data; the lane line category identification processing comprises classifying, in the feature data, the lane line positions in the image data, identifying the lane line position category to which each pixel belongs, and obtaining lane line position category data, one lane line position category corresponding to a predetermined lane line position; the lane line example extraction processing comprises comparing the lane line pixel data with the lane line position category data and assigning to each lane line pixel the lane line position category corresponding to that pixel, obtaining lane line example data, the lane line example data comprising at least one lane line example, each lane line example comprising a plurality of lane line pixels having the same lane line position category;
and the automatic driving control system performs automatic driving control of the vehicle according to the lane line example data output by the lane line example detection device.
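Putting the pieces together, an end-to-end inference pass of the detection device might look like the following sketch (names are reused from the earlier sketches and remain assumptions throughout):

    def detect_lane_instances(image):
        # image: 3xHxW tensor from the vehicle-mounted camera.
        with torch.no_grad():
            feat = backbone(image.unsqueeze(0))                # feature extraction
            pixel_mask = pixel_branch(feat).argmax(1)[0] == 1  # lane pixel identification
            category_map = category_branch(feat).argmax(1)[0]  # lane position categories
        return extract_lane_instances(
            pixel_mask.cpu().numpy(), category_map.cpu().numpy())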
CN201910295868.1A 2018-04-20 2019-04-12 Lane line example detection method and device and automatic driving vehicle Active CN110395257B (en)

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US15/959,167 (US10970564B2) | 2017-09-30 | 2018-04-20 | System and method for instance-level lane detection for autonomous vehicle control
US15/959,167 | 2018-04-20 | |

Publications (2)

Publication Number | Publication Date
CN110395257A | 2019-11-01
CN110395257B | 2021-04-23

Family

ID=68322261

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN201910295868.1A | Lane line example detection method and device and automatic driving vehicle | 2018-04-20 | 2019-04-12 | Active

Country Status (1)

Country Link
CN (1) CN110395257B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111169473B (en) * 2019-12-19 2021-06-25 北京联合大学 Vehicle body language interaction data fusion method and system based on GroudTruth
CN111292366B (en) * 2020-02-17 2023-03-10 华侨大学 Visual driving ranging algorithm based on deep learning and edge calculation
CN113428177B (en) * 2021-07-16 2023-03-14 中汽创智科技有限公司 Vehicle control method, device, equipment and storage medium
CN115294548B (en) * 2022-07-28 2023-05-02 烟台大学 Lane line detection method based on position selection and classification method in row direction

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101567086A (en) * 2009-06-03 2009-10-28 北京中星微电子有限公司 Method of lane line detection and equipment thereof
CN102663356A (en) * 2012-03-28 2012-09-12 柳州博实唯汽车科技有限公司 Method for extraction and deviation warning of lane line
CN103605977A (en) * 2013-11-05 2014-02-26 奇瑞汽车股份有限公司 Extracting method of lane line and device thereof
CN105069415A (en) * 2015-07-24 2015-11-18 深圳市佳信捷技术股份有限公司 Lane line detection method and device
CN106548190A (en) * 2015-09-18 2017-03-29 三星电子株式会社 Model training method and equipment and data identification method
CN107452035A (en) * 2016-06-01 2017-12-08 纬创资通股份有限公司 Method and apparatus for analyzing lane line image and computer readable medium thereof
CN106529402A (en) * 2016-09-27 2017-03-22 中国科学院自动化研究所 Multi-task learning convolutional neural network-based face attribute analysis method
CN107045629A (en) * 2017-04-19 2017-08-15 南京理工大学 A kind of Multi-lane Lines Detection method
CN107704866A (en) * 2017-06-15 2018-02-16 清华大学 Multitask Scene Semantics based on new neural network understand model and its application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"一种基于残差网络的多任务模型";陈良甫等;《中国集成电路设计》;20170831(第219期);第64-70页 *

Also Published As

Publication number Publication date
CN110395257A (en) 2019-11-01

Similar Documents

Publication Publication Date Title
CN110395257B (en) Lane line example detection method and device and automatic driving vehicle
US11932249B2 (en) Methods and devices for triggering vehicular actions based on passenger actions
US10282623B1 (en) Depth perception sensor data processing
US20210276587A1 (en) Systems and Methods for Autonomous Vehicle Systems Simulation
US11783568B2 (en) Object classification using extra-regional context
US10671068B1 (en) Shared sensor data across sensor processing pipelines
WO2020094033A1 (en) Method and system for converting point cloud data for use with 2d convolutional neural networks
CN111613091A (en) Enhancing mobile device operation with external driver data
CN107368890A (en) A kind of road condition analyzing method and system based on deep learning centered on vision
CN113366496A (en) Neural network for coarse and fine object classification
EP4066171A1 (en) Vehicle intent prediction neural network
CN111527013A (en) Vehicle lane change prediction
US11455538B2 (en) Correctness preserving optimization of deep neural networks
CN111033298A (en) Compensating for sensor defects in heterogeneous sensor arrays
CN111062405A (en) Method and device for training image recognition model and image recognition method and device
CN115257768A (en) Intelligent driving vehicle environment sensing method, system, equipment and medium
US11551459B1 (en) Ambiguous lane detection event miner
EP3674972A1 (en) Methods and systems for generating training data for neural network
CN115683135A (en) System and method for determining a travelable space
Marques et al. YOLOv3: Traffic Signs & Lights Detection and Recognition for Autonomous Driving.
Krishnarao et al. Enhancement of advanced driver assistance system (ADAS) using machine learning
Feng et al. Autonomous RC-car for education purpose in iSTEM projects
US20230410469A1 (en) Systems and methods for image classification using a neural network combined with a correlation structure
Ravishankaran Impact on how AI in automobile industry has affected the type approval process at RDW
US20240051557A1 (en) Perception fields for autonomous driving

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant