CN113739811B

CN113739811B - Method and equipment for training key point detection model and generating high-precision map lane line

Info

Publication number: CN113739811B
Application number: CN202111033468.7A
Authority: CN
Inventors: 何雷
Original assignee: Apollo Intelligent Technology Beijing Co Ltd
Current assignee: Apollo Intelligent Technology Beijing Co Ltd
Filing date: 2021-09-03
Publication date: 2024-06-11
Anticipated expiration: 2041-09-03

Abstract

The disclosure provides a method, a device and equipment for training a key point detection model and generating a high-precision map lane line, and relates to the fields of automatic driving, artificial intelligence, intelligent traffic, deep learning and the like. The specific implementation scheme comprises the following steps: extracting features of a semantic map sample and features of coordinates of start and stop points of a lane line sample by using an initial network; according to the characteristics of the semantic map sample and the characteristics of the coordinates of the start and stop points of the lane line sample, obtaining the predicted coordinates of at least one intermediate point of the lane line sample by using an initial network; and adjusting parameters of the initial network according to the predicted coordinates and the true value coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start point and a stop point pair of the target lane line. The technical scheme of the present disclosure can automatically generate lane lines conforming to the driving habit of a human.

Description

Method and equipment for training key point detection model and generating high-precision map lane line

Technical Field

The present disclosure relates to the field of computer technology, in particular to the fields of automatic driving, artificial intelligence, intelligent transportation, deep learning, and the like, and more specifically to a method, apparatus, device, storage medium, and computer program product for training a key point detection model and generating a high-precision map lane line.

Background

Lane lines are a core element of the experience layer of a high-precision map. In the prior art, lane lines are mostly generated semi-automatically from traffic signs, and lane lines generated in this way do not conform to human driving habits. Alternatively, human-like lane lines can be generated by aggregating the perceived trajectories of obstacle vehicles, but because the number of obstacle vehicles deployed is limited, some lane lines may be missing.

Disclosure of Invention

The present disclosure provides a method, apparatus, device, storage medium and computer program product for training a keypoint detection model and generating a high-precision map lane line.

According to a first aspect of the present disclosure, there is provided a training method of a keypoint detection model, including:

extracting features of a semantic map sample and features of coordinates of start and stop points of a lane line sample by using an initial network;

according to the characteristics of the semantic map sample and the characteristics of the coordinates of the start and stop points of the lane line sample, obtaining the predicted coordinates of at least one intermediate point of the lane line sample by using an initial network;

adjusting parameters of the initial network according to the predicted coordinates and the true value coordinates of the at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of a target lane line according to the coordinates of a start-stop point pair of the target lane line.

According to a second aspect of the present disclosure, there is provided a method for generating a high-precision map lane line, including:

Acquiring a semantic map corresponding to the high-precision map;

inputting the semantic map and the coordinates of a start-stop point pair of a target lane line into a key point detection model to predict the coordinates of at least one intermediate point of the target lane line, wherein the key point detection model is obtained by the training method described above;

generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

According to a third aspect of the present disclosure, there is provided a training apparatus of a keypoint detection model, comprising:

the feature extraction module is used for extracting features of the semantic map sample and features of coordinates of start and stop points of the lane line sample by using the initial network;

The prediction coordinate determining module is used for obtaining the prediction coordinate of at least one middle point of the lane line sample by utilizing the initial network according to the characteristics of the semantic map sample and the characteristics of the coordinates of the start point and the stop point of the lane line sample;

The parameter adjustment module is used for adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, and the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start point and a stop point pair of the target lane line.

According to a fourth aspect of the present disclosure, there is provided a high-precision map lane line generation apparatus, including:

the semantic map acquisition module is used for acquiring a semantic map corresponding to the high-precision map;

The prediction module is used for inputting the semantic map and the coordinates of the start-stop point pair of the target lane line into the key point detection model so as to predict the coordinates of at least one intermediate point of the target lane line; the key point detection model is obtained by the training device;

The generating module is used for generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

According to a fifth aspect of the present disclosure, there is provided a training apparatus of a keypoint detection model, comprising:

at least one processor; and

A memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method provided by any of the embodiments of the present disclosure.

According to a sixth aspect of the present disclosure, there is provided a high-precision map lane line generation apparatus including:

at least one processor; and

A memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the generation method provided by any of the embodiments of the present disclosure.

According to a seventh aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method provided by any of the embodiments of the present disclosure.

According to an eighth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method provided by any of the embodiments of the present disclosure.

According to a ninth aspect of the present disclosure, there is provided an autonomous vehicle including the high-precision map lane line generating apparatus provided by any embodiment of the present disclosure or the high-precision map lane line generating device provided by any embodiment of the present disclosure.

The technical scheme of the embodiment of the disclosure can automatically generate the lane line conforming to the driving habit of the human.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is an example diagram of a target lane line to be generated in accordance with an embodiment of the present disclosure;

FIG. 2 is a flow chart of a method of training a keypoint detection model in accordance with an embodiment of the present disclosure;

FIG. 3 is an example diagram of a high-precision map in accordance with an embodiment of the present disclosure;

FIG. 4 is an example diagram of a semantic map according to an embodiment of the present disclosure;

FIG. 5 is an example diagram of a vectorized lane line in accordance with an embodiment of the present disclosure;

FIG. 6 is a flowchart of a method of generating high-precision map lane lines according to an embodiment of the present disclosure;

FIG. 7 is an application example diagram of a high-precision map lane line generation method according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of an application scenario according to an embodiment of the present disclosure;

FIG. 9 is a block diagram of a training apparatus of a keypoint detection model in accordance with an embodiment of the present disclosure;

FIG. 10 is a block diagram of a high-precision map lane line generation apparatus according to an embodiment of the present disclosure;

FIG. 11 is a block diagram of an electronic device for implementing the methods of embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The experience layer of a high-precision map can provide a reference for the Planning and Control (PNC) module of an autonomous vehicle. Virtual lane lines can be provided in the experience layer as references for vehicle travel. For example, in current L4 driverless technology, when the vehicle turns or makes a U-turn at an intersection, the PNC module refers to the steering curve marked in the high-precision map. The steering curve can be regarded as a virtual lane line, as shown in fig. 1. The embodiments of the present disclosure aim to provide a training method that yields a key point detection model, and to use the key point detection model to automatically generate the virtual lane line between a start-stop point pair according to the high-precision map and the coordinates of the start-stop point pair.

FIG. 2 illustrates a flow chart of a method of training a keypoint detection model in accordance with an embodiment of the present disclosure. As shown in fig. 2, the training method includes:

step S201: extracting features of a semantic map sample and features of coordinates of start and stop points of a lane line sample by using an initial network;

Step S202: according to the characteristics of the semantic map sample and the characteristics of the coordinates of the start and stop points of the lane line sample, obtaining the predicted coordinates of at least one intermediate point of the lane line sample by using an initial network;

Step S203: and adjusting parameters of the initial network according to the predicted coordinates and the true value coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start point and a stop point pair of the target lane line.

The training samples of the initial network include a large number of lane line samples. For example, these lane line samples are collected from the driving track of a human-driven vehicle, and thus conform to the human driving habit.

The lane line sample includes a start-stop point pair, i.e., a start point and an end point. The features of the coordinates of the start-stop point pair are extracted through the constructed initial network. For example, the coordinates of the start-stop point pair may be encoded in advance, and the features of the encoded coordinates may then be extracted.

The high-precision map of the area where the lane line sample is located is taken as a high-precision map sample and encoded as a semantic map sample, and the features of the encoded semantic map sample are then extracted through the initial network. Fig. 3 is an exemplary diagram of an intersection region in a high-precision map, and fig. 4 is an exemplary diagram of an intersection region in a semantic map.

The initial network outputs the predicted coordinates of at least one intermediate point between the starting point and the end point according to the characteristics of the semantic map sample and the characteristics of the coordinates of the starting point and the ending point; and adjusting parameters of the initial network according to the predicted coordinates and the true value coordinates of each intermediate point until the convergence condition of the loss function is met, and obtaining a trained network, namely a key point detection model. It should be noted that, the present embodiment does not limit the type and structure of the initial network.
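The adjust-until-convergence procedure can be sketched as follows. This is only a toy illustration under stated assumptions: the initial network is replaced by a hypothetical three-parameter model (an interpolation factor `t` and an offset `(ox, oy)`), the semantic map input is ignored, the loss is a mean squared error, and the learning rate, tolerance, and sample data are invented for the example. The disclosure itself does not limit the network type or the convergence condition.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[Point, Point, Point]  # (start, end, true intermediate point)


def predict(params: List[float], start: Point, end: Point) -> Point:
    """Toy stand-in for the network: a point on the chord plus a learned offset."""
    t, ox, oy = params
    return (start[0] + t * (end[0] - start[0]) + ox,
            start[1] + t * (end[1] - start[1]) + oy)


def loss_and_grad(params: List[float], samples: List[Sample]):
    """Mean squared error between predicted and true points, with its gradient."""
    n = len(samples)
    loss, grad = 0.0, [0.0, 0.0, 0.0]
    for start, end, truth in samples:
        px, py = predict(params, start, end)
        rx, ry = px - truth[0], py - truth[1]  # residual of this sample
        loss += (rx * rx + ry * ry) / n
        grad[0] += 2.0 * (rx * (end[0] - start[0]) + ry * (end[1] - start[1])) / n
        grad[1] += 2.0 * rx / n
        grad[2] += 2.0 * ry / n
    return loss, grad


def train(samples: List[Sample], lr: float = 0.1, tol: float = 1e-12, max_steps: int = 10_000):
    """Gradient descent until the loss change drops below `tol` (assumed criterion)."""
    params = [0.0, 0.0, 0.0]
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_steps):
        loss, grad = loss_and_grad(params, samples)
        if abs(prev_loss - loss) < tol:
            break
        params = [p - lr * g for p, g in zip(params, grad)]
        prev_loss = loss
    return params, loss


# Synthetic samples: the true point is the chord midpoint shifted by (0.2, -0.1).
pairs = [((0.0, 0.0), (1.0, 1.0)), ((0.0, 1.0), (1.0, 0.0)), ((0.0, 0.0), (1.0, 0.0))]
samples = [(s, e, ((s[0] + e[0]) / 2 + 0.2, (s[1] + e[1]) / 2 - 0.1)) for s, e in pairs]
trained_params, final_loss = train(samples)
```

In a real implementation the parameters would be the weights of the network, and the gradient would normally be obtained by automatic differentiation rather than written by hand.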

Illustratively, as shown in fig. 5, the curve A'C'B' represents a lane line sample, where A' is the start point of the lane line sample, B' is the end point of the lane line sample, and C' is an intermediate point of the lane line sample. The initial network may output predicted coordinates of A', B', and C', which are represented by A, B, and C in fig. 5, respectively. The predicted coordinates of the start point A are assigned the true value coordinates of A', the predicted coordinates of the end point B are assigned the true value coordinates of B', and the predicted lane line ACB is then generated.
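The endpoint assignment described for fig. 5 can be illustrated with a short sketch. The function name `pin_endpoints`, the `[A, C, B]` ordering of the network output, and the coordinate values are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pin_endpoints(network_output: List[Point], true_start: Point, true_end: Point) -> List[Point]:
    """Replace the predicted start/end (A, B) with the true A', B'; keep the predicted C."""
    if len(network_output) < 3:
        raise ValueError("expected a start point, at least one intermediate point, and an end point")
    return [true_start, *network_output[1:-1], true_end]


# Hypothetical network output ordered as [A, C, B].
predicted = [(0.1, -0.2), (6.4, 2.1), (9.8, 10.3)]
lane_line_acb = pin_endpoints(predicted, true_start=(0.0, 0.0), true_end=(10.0, 10.0))
```

Only the intermediate point C remains a free prediction here, which is why the training losses are defined on the intermediate points.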

Given the coordinates of the start-stop point pair of an input target lane line, the trained key point detection model can output the coordinates of at least one intermediate point between the start point and the end point; the human-like target lane line is then generated according to the coordinates of the start-stop point pair and the coordinates of the at least one intermediate point.

The generated target lane line conforms to human driving habits. It can be marked in the experience layer of a high-precision map so that the experience layer is generated automatically, which improves production efficiency. The target lane line can also support the PNC module, which improves the success rate of passing through intersections. The method can further be used for planning the driving trajectory between a start point and an end point in real time while the autonomous vehicle is driving.

In one embodiment, in step S201, it may include: extracting features of a semantic map sample by using a first network in the initial network; and extracting the characteristic of the coordinates of the start and stop points of the lane line sample by using a second network in the initial network.

That is, the initial network adopts two network structures to extract features from the semantic map sample and to extract features from the start-stop point coordinates, so as to learn global features and local features respectively, and improve training efficiency and prediction accuracy.

Illustratively, the first network may include a convolutional neural network (CNN) and a multilayer perceptron (MLP) made up of fully connected layers, for example one CNN and a two-layer MLP. The CNN extracts semantic map features from the semantic map sample, and these features are then fed into the two-layer MLP, so that global features of the lane line sample in the semantic map sample can be extracted. Illustratively, the second network may include a three-layer MLP.
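A minimal sketch of the two feature extractors is given below, using only the Python standard library so that the data flow is explicit. The layer widths, the 6x6 toy semantic grid, the single 3x3 kernel, and the random weights are all assumptions made for illustration; the disclosure only states that the first network may be one CNN plus a two-layer MLP and the second network a three-layer MLP.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, bias)


def init_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    """Randomly initialised fully connected layer (illustrative values only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp(layers: List[Layer], x: Vector) -> Vector:
    """Apply fully connected layers with ReLU between them (none after the last)."""
    for index, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def conv2d_relu(image: Matrix, kernel: Matrix) -> Vector:
    """Single-channel 'valid' cross-correlation followed by ReLU, flattened to a vector."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        for c in range(len(image[0]) - kw + 1):
            acc = sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            out.append(max(0.0, acc))
    return out


rng = random.Random(0)
kernel = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
# First network: CNN stage (above) followed by a two-layer MLP.
first_network_mlp = [init_layer(rng, 16, 8), init_layer(rng, 8, 8)]
# Second network: three-layer MLP over the four start/end coordinates.
second_network = [init_layer(rng, 4, 8), init_layer(rng, 8, 8), init_layer(rng, 8, 8)]

semantic_map = [[float((r + c) % 3) for c in range(6)] for r in range(6)]  # toy 6x6 class grid
start, end = (0.0, 0.0), (5.0, 5.0)

global_feature = mlp(first_network_mlp, conv2d_relu(semantic_map, kernel))
local_feature = mlp(second_network, [*start, *end])
```

A 3x3 kernel over a 6x6 grid yields 16 values, which is why the first MLP layer takes 16 inputs in this sketch.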

In one embodiment, in step S202, it may include: splicing the characteristics of the semantic map sample and the characteristics of the coordinates of the start and stop points of the lane line sample; and decoding the spliced features by using a third network in the initial network to obtain the predicted coordinates of at least one intermediate point of the lane line sample.

Illustratively, the third network may include a three-layer MLP to decode the stitched global features and local features, ultimately outputting predicted coordinates of one or more intermediate points. The network has a simple structure and light weight, can reduce the operation amount and improve the prediction efficiency.
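The splice-and-decode step can be sketched as below. The feature vectors, layer widths, random weights, and the choice of three intermediate points are assumptions for illustration only; the sketch shows one way a three-layer MLP could map the concatenated features to `(x, y)` pairs.

```python
import random
from typing import List, Tuple

Vector = List[float]
Point = Tuple[float, float]
Layer = Tuple[List[List[float]], Vector]  # (weights, bias)


def init_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    """Randomly initialised fully connected layer (illustrative values only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)], [0.0] * n_out


def decode(third_network: List[Layer], global_feature: Vector, local_feature: Vector) -> List[Point]:
    """Concatenate both features and decode them into intermediate-point coordinates."""
    x = [*global_feature, *local_feature]  # splicing = concatenation
    for index, (weights, bias) in enumerate(third_network):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if index < len(third_network) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers only
    # The flat output is read as x1, y1, x2, y2, ...
    return list(zip(x[0::2], x[1::2]))


rng = random.Random(1)
num_points = 3  # assumed number of intermediate points
third_network = [init_layer(rng, 16, 16), init_layer(rng, 16, 16), init_layer(rng, 16, 2 * num_points)]

global_feature = [rng.random() for _ in range(8)]  # placeholder for the first network's output
local_feature = [rng.random() for _ in range(8)]   # placeholder for the second network's output
intermediate_points = decode(third_network, global_feature, local_feature)
```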

Preferably, the predicted coordinates of the plurality of intermediate points may be output. The intermediate points to be predicted can be set between the starting point and the end point at intervals of preset distance, and then the coordinates of the series of fixed intermediate points are obtained.
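One way to obtain intermediate points at a preset spacing, for example when preparing true value coordinates from a lane line sample, is to walk along the sample polyline and emit a point every fixed arc length. The sketch below assumes the spacing is measured along the polyline and that the start and end points themselves are excluded; the function name and both choices are illustrative assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_intermediate_points(polyline: List[Point], spacing: float) -> List[Point]:
    """Return points every `spacing` units of arc length, excluding both endpoints."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    segments = list(zip(polyline, polyline[1:]))
    lengths = [math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in segments]
    total = sum(lengths)
    points: List[Point] = []
    target = spacing   # arc length of the next point to emit
    travelled = 0.0    # arc length at the start of the current segment
    for (p, q), seg in zip(segments, lengths):
        while seg > 0 and target <= travelled + seg and target < total - 1e-9:
            t = (target - travelled) / seg
            points.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            target += spacing
        travelled += seg
    return points


straight = resample_intermediate_points([(0.0, 0.0), (10.0, 0.0)], spacing=2.5)
corner = resample_intermediate_points([(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)], spacing=3.0)
```

Because the spacing is fixed, the number of intermediate points depends on the length of the lane line; a network with a fixed output size would instead need a fixed point count, which is a design decision the text leaves open.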

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample includes: constructing a regression loss function according to the difference between the predicted coordinates and the true coordinates of the intermediate points; parameters of the initial network are adjusted until the regression loss function converges.

Illustratively, the regression loss function L_disp may be represented by an equation over the intermediate points (the equation is not reproduced in this text),

wherein the coordinates of C_i are the predicted coordinates of the i-th intermediate point; the coordinates of C_i' are the true value coordinates of the i-th intermediate point; and N is the number of intermediate points, N being an integer greater than or equal to 1 in the regression loss function.

The regression loss function is used for training the network, so that the predicted coordinates of the output intermediate points are more approximate to the true coordinates.
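As the exact expression for L_disp is not available in this text, the following is only a plausible sketch: it assumes the loss is the mean Euclidean distance between each predicted point C_i and its true value C_i'. Other choices, such as a squared or L1 distance, would fit the description equally well.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def regression_loss(predicted: List[Point], truth: List[Point]) -> float:
    """Assumed form of L_disp: mean Euclidean distance over the N intermediate points."""
    if not predicted or len(predicted) != len(truth):
        raise ValueError("need the same non-zero number of predicted and true points")
    distances = [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(predicted, truth)]
    return sum(distances) / len(distances)


# One exact point and one point that is 5 units off (a 3-4-5 triangle).
example_l_disp = regression_loss([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```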

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample includes: according to the predicted coordinates of the first intermediate point and the second intermediate point of the lane line sample, determining a predicted vector of a first line segment between the first intermediate point and the second intermediate point; determining a true value vector of the first line segment according to the true value coordinates of the first intermediate point and the second intermediate point; constructing a local structure loss function according to the difference between the prediction vector and the true value vector of the first line segment; parameters of the initial network are adjusted until the local structure loss function converges.

Illustratively, the local structure loss function L_local may be represented by an equation over the first line segments (the equation is not reproduced in this text),

wherein the equation compares the prediction vector of the i-th first line segment with the true value vector of the i-th first line segment; N is the number of intermediate points, N being an integer greater than 1 in the local structure loss function; and i is less than or equal to N.

By training the network with the local structure loss function, the curve between the output intermediate points can be smoother, and the generated lane lines can be smoother.

Preferably, the first intermediate point and the second intermediate point are adjacent intermediate points, so as to improve the smoothness of the local structure.
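A sketch of a local structure loss over adjacent intermediate points follows. The exact expression for L_local is not available in this text, so the form used here, the mean Euclidean norm of the difference between predicted and true segment vectors, is an assumption.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def segment_vectors(points: List[Point]) -> List[Point]:
    """Vectors of the line segments between adjacent points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def local_structure_loss(predicted: List[Point], truth: List[Point]) -> float:
    """Assumed form of L_local: mean norm of (predicted vector - true vector)."""
    if len(predicted) != len(truth) or len(predicted) < 2:
        raise ValueError("need at least two intermediate points in both lists")
    pred_vecs, true_vecs = segment_vectors(predicted), segment_vectors(truth)
    diffs = [math.hypot(p[0] - t[0], p[1] - t[1]) for p, t in zip(pred_vecs, true_vecs)]
    return sum(diffs) / len(diffs)


truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
shifted = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]  # same shape, translated
bent = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]     # different direction

loss_shifted = local_structure_loss(shifted, truth)
loss_bent = local_structure_loss(bent, truth)
```

With this form a pure translation of the curve costs nothing, so the term constrains shape only and leaves position to the regression loss.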

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true value coordinates of the at least one intermediate point of the lane line sample includes: determining a prediction vector of a second line segment between a third intermediate point and an endpoint of the lane line sample according to the predicted coordinates of the third intermediate point and the true value coordinates of the endpoint; determining a true value vector of the second line segment according to the true value coordinates of the third intermediate point and the true value coordinates of the endpoint; constructing a global structure loss function according to the difference between the prediction vector and the true value vector of the second line segment; and adjusting parameters of the initial network until the global structure loss function converges. The endpoint is either the start point or the end point of the lane line sample.

Illustratively, the global structure loss function L_global may be represented by an equation over the second line segments (the equation is not reproduced in this text),

wherein the equation compares the prediction vector of the i-th second line segment with the true value vector of the i-th second line segment, and M is the number of second line segments.

The network is trained through the global structure loss function, so that the overall shape of the generated lane line is more reasonable.

Preferably, the third intermediate point is the intermediate point furthest from the start point or the end point, so that the global features are captured to the greatest extent.
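A sketch of a global structure loss is given below. The exact expression for L_global is not available in this text. Because the prediction vector and the true value vector share the same true endpoint, a plain vector difference would reduce to the position error of the intermediate point, so this sketch assumes a direction-based measure instead: one minus the cosine similarity, averaged over the chosen endpoints. This choice and the function name are assumptions, not the formula of the disclosure.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def global_structure_loss(predicted_point: Point, true_point: Point,
                          true_endpoints: List[Point]) -> float:
    """Assumed form of L_global: mean (1 - cosine similarity) over M endpoint segments."""
    if not true_endpoints:
        raise ValueError("need at least one endpoint")
    losses = []
    for ex, ey in true_endpoints:
        pvx, pvy = predicted_point[0] - ex, predicted_point[1] - ey  # prediction vector
        tvx, tvy = true_point[0] - ex, true_point[1] - ey            # true value vector
        norm = math.hypot(pvx, pvy) * math.hypot(tvx, tvy)
        if norm == 0.0:
            raise ValueError("intermediate point coincides with an endpoint")
        losses.append(1.0 - (pvx * tvx + pvy * tvy) / norm)
    return sum(losses) / len(losses)


endpoints = [(0.0, 0.0), (10.0, 0.0)]  # true start and end points, so M = 2
loss_exact = global_structure_loss((5.0, 5.0), (5.0, 5.0), endpoints)
loss_mirrored = global_structure_loss((5.0, -5.0), (5.0, 5.0), endpoints)
```

In the mirrored case the predicted curve bulges to the wrong side of the chord, and both segment directions are perpendicular to their true directions.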

In one embodiment, a loss function L including a regression loss function, a local structure loss function, and a global structure loss function may be constructed, and in step S203, parameters of the initial network are adjusted until the loss function L converges, that is:

L = L_disp + L_local + L_global.

Fig. 6 shows a flowchart of a method of generating a high-precision map lane line according to an embodiment of the present disclosure. As shown in fig. 6, the generating method includes:

Step S601: acquiring a semantic map corresponding to the high-precision map;

Step S602: inputting the semantic map and the coordinates of a start-stop point pair of a target lane line into a key point detection model to predict the coordinates of at least one intermediate point of the target lane line;

Step S603: generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

In one example, as shown in fig. 7, the coordinates of the start-stop point pair A and B of a target lane line and the semantic map corresponding to a high-precision map are input into a key point detection model composed of a first network, a second network, and a third network to obtain the coordinates of at least one intermediate point C_i of the target lane line. The coordinates of the start-stop point pair A and B and the coordinates of the at least one intermediate point C_i are then fitted to generate the target lane line.
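The fitting step is not specified further in the text. As one possible realisation, the sketch below passes a uniform Catmull-Rom spline through A, the predicted points C_i, and B; the spline type, the function name `fit_lane_line`, and the sampling density are assumptions for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_lane_line(start: Point, intermediate: List[Point], end: Point,
                  samples_per_segment: int = 10) -> List[Point]:
    """Interpolate A, C_1..C_N, B with a uniform Catmull-Rom spline."""
    points = [start, *intermediate, end]
    # Repeat the first and last point so the end segments have four control points.
    padded = [points[0], *points, points[-1]]
    curve: List[Point] = []
    for k in range(len(points) - 1):
        p0, p1, p2, p3 = padded[k:k + 4]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t * t
                       + (3 * b - a - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    curve.append(points[-1])  # close the curve exactly at the end point
    return curve


# Hypothetical start-stop pair A, B and one predicted intermediate point C.
target_lane_line = fit_lane_line((0.0, 0.0), [(6.0, 2.0)], (10.0, 10.0), samples_per_segment=5)
```

An interpolating spline keeps the generated lane line passing through every predicted key point; a smoothing fit such as a least-squares polynomial would be an alternative if the predictions are noisy.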

Fig. 8 is a schematic diagram of an application scenario of an embodiment of the present disclosure. As shown in fig. 8, the terminal 801 may be hardware, such as a mobile phone, a tablet, a vehicle-mounted terminal, a portable computer, or another electronic device having a display screen. When the terminal 801 is software or an application (APP), it can be installed in such an electronic device. The server 802 may provide various services, such as support for applications installed on the terminal 801. The training method and the generating method provided in the embodiments of the present disclosure may be executed by the server 802 or by the terminal 801, and the apparatus corresponding to each method may be set in the terminal 801 or in the server 802. Any number of terminals, networks, and servers may be configured as needed.

Fig. 9 shows a block diagram of a training apparatus of a keypoint detection model according to an embodiment of the present disclosure, as shown in fig. 9, the training apparatus including:

The feature extraction module 901 is used for extracting features of the semantic map sample and features of coordinates of start and stop points of the lane line sample by using an initial network;

the predicted coordinate determining module 902 is configured to obtain, according to the features of the semantic map sample and the features of the coordinates of the start and stop points of the lane line sample, predicted coordinates of at least one intermediate point of the lane line sample by using the initial network;

The parameter adjustment module 903 is configured to adjust parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample, so as to obtain a key point detection model, where the key point detection model is configured to predict coordinates of at least one intermediate point of the target lane line according to coordinates of a start-stop point pair of the target lane line.

In one embodiment, the feature extraction module 901 includes:

the first extraction sub-module is used for extracting the characteristics of the semantic map sample by utilizing a first network in the initial network;

and the second extraction sub-module is used for extracting the characteristic of the coordinates of the start point and the stop point of the lane line sample by using a second network in the initial network.

In one embodiment, the predicted coordinate determination module 902 includes:

the splicing sub-module is used for splicing the characteristics of the semantic map sample and the characteristics of the coordinates of the start and stop points of the lane line sample;

and the decoding sub-module is used for decoding the spliced characteristics by utilizing a third network in the initial network to obtain the predicted coordinates of at least one middle point of the lane line sample.

In one embodiment, the parameter adjustment module 903 includes:

the regression loss function construction submodule is used for constructing a regression loss function according to the difference between the predicted coordinates and the true coordinates of the intermediate points;

and the first adjusting sub-module is used for adjusting the parameters of the initial network until the regression loss function converges.

In one embodiment, the parameter adjustment module 903 includes:

The first prediction vector determination submodule is used for determining a prediction vector of a first line segment between a first intermediate point and a second intermediate point according to the prediction coordinates of the first intermediate point and the second intermediate point of the lane line sample;

The first truth vector determining submodule is used for determining the truth vector of the first line segment according to the truth coordinates of the first intermediate point and the second intermediate point;

the local structure loss function construction submodule is used for constructing a local structure loss function according to the difference between the prediction vector and the true value vector of the first line segment;

and the second adjusting sub-module is used for adjusting the parameters of the initial network until the local structure loss function converges.

In one embodiment, the parameter adjustment module 903 includes:

the second prediction vector determination submodule is used for determining a prediction vector of a second line segment between the third intermediate point and the endpoint according to the prediction coordinate of the third intermediate point of the lane line sample and the true value coordinate of the endpoint of the lane line sample; wherein the endpoints include a start point or an end point;

The second truth vector determining submodule is used for determining the truth vector of the second line segment according to the truth coordinates of the third intermediate point and the truth coordinates of the end points;

The global structure loss function construction submodule is used for constructing a global structure loss function according to the difference between the prediction vector and the true value vector of the second line segment;

And the third adjusting sub-module is used for adjusting the parameters of the initial network until the global structure loss function converges.

Fig. 10 shows a block diagram of a structure of a high-precision map lane line generating apparatus according to an embodiment of the present disclosure. As shown in fig. 10, the generating device includes:

A semantic map acquisition module 1001, configured to acquire a semantic map corresponding to a high-precision map;

A prediction module 1002, configured to input a semantic map and coordinates of a start-stop point pair of a target lane line into a key point detection model to predict coordinates of at least one intermediate point of the target lane line; the key point detection model is obtained by the training device;

The generating module 1003 is configured to generate the target lane line according to the start-stop point pair of the target lane line and coordinates of at least one intermediate point.

The functions of each module in each apparatus of the embodiments of the present disclosure may be referred to the corresponding descriptions in the above methods, which are not repeated herein.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product. The electronic device can be a high-precision map lane line generating device or a key point detection model training device.

Fig. 11 illustrates a schematic block diagram of an example electronic device 1100 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 11, the apparatus 1100 includes a computing unit 1101 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data required for the operation of the device 1100 can also be stored. The computing unit 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to bus 1104.

Various components in device 1100 are connected to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, etc.; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, etc.; and a communication unit 1109 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 1101 may be any of a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1101 performs the respective methods and processes described above. For example, in some embodiments, the various methods described above may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1108. In some embodiments, some or all of the computer programs may be loaded and/or installed onto device 1100 via ROM 1102 and/or communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the respective methods described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the methods described above by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or a trackball) through which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.

According to an embodiment of the present disclosure, the present disclosure further provides an automatic driving vehicle, including the high-precision map lane line generating apparatus provided by any embodiment of the present disclosure or the high-precision map lane line generating device provided by any embodiment of the present disclosure.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A training method of a key point detection model comprises the following steps:

extracting features of a semantic map sample and features of coordinates of start and stop points of a lane line sample by using an initial network; the lane line sample comprises a starting point coordinate and an ending point coordinate of virtual lane change corresponding to a running track of a user driving vehicle; the semantic map sample is obtained based on a high-precision map of the area where the lane line sample is located;

according to the features of the semantic map sample and the features of the coordinates of the start and stop points of the lane line sample, obtaining the predicted coordinates of at least one intermediate point of the lane line sample by using the initial network;

and adjusting parameters of the initial network according to the predicted coordinates and the truth coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start point and a stop point of the target lane line so as to generate a virtual lane line between the start point and the stop point of the target lane line.

2. The training method of claim 1, wherein extracting features of the semantic map samples and features of the start-stop point pair coordinates of the lane line samples using the initial network comprises:

extracting features of the semantic map sample by using a first network in the initial network;

and extracting features of the coordinates of the start and stop points of the lane line sample by using a second network in the initial network.

3. The training method of claim 1, wherein deriving predicted coordinates of at least one intermediate point of the lane line sample using the initial network from the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample comprises:

splicing the features of the semantic map sample and the features of the coordinates of the start and stop points of the lane line sample;

and decoding the spliced features by using a third network in the initial network to obtain the predicted coordinates of at least one intermediate point of the lane line sample.
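By way of illustration only, the splicing and decoding steps can be sketched as follows. The feature vectors are taken as given (the first and second networks are not modelled), and `decode` is a toy stand-in for the third network: a single linear layer, which is an assumption and not the architecture of the disclosure. The function names, weights, and bias values are hypothetical.

```python
from typing import List, Sequence, Tuple


def splice(map_features: Sequence[float],
           endpoint_features: Sequence[float]) -> List[float]:
    """Concatenate the two feature vectors into one (the splicing step)."""
    return list(map_features) + list(endpoint_features)


def decode(features: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[Tuple[float, float]]:
    """Toy decoder (assumed): one linear layer whose 2*K outputs are
    read as K (x, y) intermediate points."""
    out = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    if len(out) % 2:
        raise ValueError("decoder must emit an even number of values")
    return [(out[i], out[i + 1]) for i in range(0, len(out), 2)]


# Hypothetical features: two from the semantic map, one from the endpoints.
spliced = splice([1.0, 2.0], [3.0])
# Two output rows give one predicted intermediate point.
points = decode(
    spliced,
    weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    bias=[0.5, -1.0],
)
```

In practice the three networks would be trained jointly, so that gradients from the loss flow back through the decoder into both feature extractors.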

4. A training method according to any one of claims 1 to 3, wherein adjusting parameters of the initial network in accordance with predicted coordinates and truth coordinates of at least one intermediate point of the lane line sample comprises:

constructing a regression loss function according to the difference between the predicted coordinates and the truth coordinates of the intermediate points;

and adjusting parameters of the initial network until the regression loss function converges.
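As a minimal sketch of such a regression loss, the example below uses the mean squared Euclidean distance between predicted and truth intermediate points. The claim only requires a loss built from the coordinate difference, so the squared-error form and the name `regression_loss` are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def regression_loss(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean squared distance between predicted and truth points (assumed form)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    total = sum(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred, truth)
    )
    return total / len(pred)


# Squared errors are 1.0 and 4.0, so the mean is 2.5.
loss = regression_loss([(1.0, 1.0), (2.0, 2.0)], [(1.0, 2.0), (4.0, 2.0)])
```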

5. A training method according to any one of claims 1 to 3, wherein adjusting parameters of the initial network in accordance with predicted coordinates and truth coordinates of at least one intermediate point of the lane line sample comprises:

according to the predicted coordinates of a first intermediate point and a second intermediate point of the lane line sample, determining a predicted vector of a first line segment between the first intermediate point and the second intermediate point;

determining a truth vector of the first line segment according to the truth coordinates of the first intermediate point and the second intermediate point;

constructing a local structure loss function according to the difference between the prediction vector and the truth vector of the first line segment;

and adjusting parameters of the initial network until the local structure loss function converges.
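The local structure loss can be illustrated as below. The sketch forms the vector of each segment between adjacent intermediate points from the predicted coordinates and from the truth coordinates, then penalises the squared difference of the two vectors. Averaging over every adjacent pair, rather than a single first and second point, and the squared-difference form are assumptions. The function names are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def segment_vector(a: Point, b: Point) -> Point:
    """Vector pointing from a to b."""
    return (b[0] - a[0], b[1] - a[1])


def local_structure_loss(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean squared difference between predicted and truth segment vectors
    over adjacent intermediate points (assumed form)."""
    if len(pred) != len(truth) or len(pred) < 2:
        raise ValueError("need at least two matching intermediate points")
    total = 0.0
    for i in range(len(pred) - 1):
        pv = segment_vector(pred[i], pred[i + 1])
        tv = segment_vector(truth[i], truth[i + 1])
        total += (pv[0] - tv[0]) ** 2 + (pv[1] - tv[1]) ** 2
    return total / (len(pred) - 1)


truth_pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
bent = local_structure_loss([(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)], truth_pts)
# A rigidly shifted copy of the truth has identical segment vectors.
shifted = local_structure_loss([(5.0, 5.0), (6.0, 5.0), (7.0, 5.0)], truth_pts)
```

Because a uniform translation leaves every segment vector unchanged, a loss of this form constrains shape but not position, which suggests it would be combined with a position-sensitive term such as the regression loss.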

6. A training method according to any one of claims 1 to 3, wherein adjusting parameters of the initial network in accordance with predicted coordinates and truth coordinates of at least one intermediate point of the lane line sample comprises:

determining a prediction vector of a second line segment between a third intermediate point of the lane line sample and an endpoint of the lane line sample according to the predicted coordinates of the third intermediate point and the truth coordinates of the endpoint, wherein the endpoint comprises a start point or an end point;

determining a truth vector of the second line segment according to the truth coordinates of the third intermediate point and the truth coordinates of the endpoint;

constructing a global structure loss function according to the difference between the prediction vector and the truth vector of the second line segment;

and adjusting parameters of the initial network until the global structure loss function converges.
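A possible form of the global structure loss is sketched below. Each vector runs from an endpoint (truth coordinates) to an intermediate point, computed once with the predicted point and once with the truth point. The difference measure used here is cosine distance between the two vectors, averaged over both endpoints and all intermediate points. That measure, the averaging, and the function names are assumptions; the claim only requires a loss built from the difference between the two vectors.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]


def cosine_distance(u: Point, v: Point) -> float:
    """1 - cos(angle between u and v); undefined for zero-length vectors."""
    nu, nv = hypot(*u), hypot(*v)
    if nu == 0.0 or nv == 0.0:
        raise ValueError("zero-length segment")
    return 1.0 - (u[0] * v[0] + u[1] * v[1]) / (nu * nv)


def global_structure_loss(pred: Sequence[Point], truth: Sequence[Point],
                          start: Point, end: Point) -> float:
    """Mean cosine distance between endpoint-to-point vectors built from
    predicted and from truth intermediate points (assumed form)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, truth):
        for e in (start, end):  # endpoints use truth coordinates
            pv = (p[0] - e[0], p[1] - e[1])
            tv = (t[0] - e[0], t[1] - e[1])
            total += cosine_distance(pv, tv)
    return total / (2 * len(pred))


start, end = (0.0, 0.0), (4.0, 0.0)
exact = global_structure_loss([(2.0, 0.0)], [(2.0, 0.0)], start, end)
off = global_structure_loss([(2.0, 2.0)], [(2.0, 0.0)], start, end)
```

A directional measure is used in this sketch because the plain squared difference of the two vectors reduces to the squared point error: both vectors share the same endpoint, so subtracting them cancels it.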

7. A method for generating a lane line of a high-precision map comprises the following steps:

acquiring a semantic map corresponding to the high-precision map;

inputting the semantic map and the coordinates of the start-stop point pair of the target lane line into a key point detection model to predict the coordinates of at least one intermediate point of the target lane line, wherein the key point detection model is obtained by the training method of any one of claims 1 to 6;

and generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

8. A training device for a keypoint detection model, comprising:

the feature extraction module is used for extracting features of the semantic map sample and features of coordinates of start and stop points of the lane line sample by using the initial network; the lane line sample comprises a starting point coordinate and an ending point coordinate of virtual lane change corresponding to a running track of a user driving vehicle; the semantic map sample is obtained based on a high-precision map of the area where the lane line sample is located;

and the parameter adjustment module is used for adjusting parameters of the initial network according to the predicted coordinates and the truth coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start-stop point pair of the target lane line, so as to generate a virtual lane line between the start point and the stop point of the target lane line.

9. The training device of claim 8, wherein the feature extraction module comprises:

a first extraction sub-module, configured to extract features of the semantic map sample using a first network of the initial networks;

and the second extraction submodule is used for extracting the characteristic of the coordinates of the start point and the stop point of the lane line sample by utilizing a second network in the initial network.

10. The training device of claim 8, wherein the predicted coordinate determination module comprises:

11. Training device according to any of the claims 8-10, wherein the parameter adjustment module comprises:

a first adjustment sub-module, configured to adjust the parameters of the initial network until the regression loss function converges.

12. Training device according to any of the claims 8-10, wherein the parameter adjustment module comprises:

a first prediction vector determining sub-module, configured to determine a prediction vector of a first line segment between a first intermediate point and a second intermediate point of the lane line sample according to prediction coordinates of the first intermediate point and the second intermediate point;

a first truth vector determining submodule, configured to determine a truth vector of the first line segment according to the truth coordinates of the first intermediate point and the second intermediate point;

13. Training device according to any of the claims 8-10, wherein the parameter adjustment module comprises:

a second prediction vector determining sub-module, configured to determine a prediction vector of a second line segment between a third intermediate point of the lane line sample and an endpoint of the lane line sample according to the predicted coordinates of the third intermediate point and the truth coordinates of the endpoint, wherein the endpoint comprises a start point or an end point;

a second truth vector determining sub-module, configured to determine a truth vector of the second line segment according to the truth coordinates of the third intermediate point and the truth coordinates of the endpoint;

and a third adjustment sub-module, configured to adjust the parameters of the initial network until the global structure loss function converges.

14. A high-precision map lane line generation device, comprising:

the prediction module is used for inputting the semantic map and the coordinates of the start-stop point pair of the target lane line into the key point detection model so as to predict the coordinates of at least one intermediate point of the target lane line, wherein the key point detection model is obtained by the training device of any one of claims 8 to 13;

and the generating module is used for generating the target lane line according to the start and stop point pair of the target lane line and the coordinates of at least one middle point.

15. A training apparatus for a keypoint detection model, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.

16. A high-precision map lane line generation apparatus comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of claim 7.

17. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.

18. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.

19. An autonomous vehicle comprising the high-precision map lane line generating apparatus according to claim 14 or the high-precision map lane line generating device according to claim 16.