CN113739811A

CN113739811A - Method and device for training key point detection model and generating high-precision map lane line

Info

Publication number: CN113739811A
Application number: CN202111033468.7A
Authority: CN
Inventors: 何雷
Original assignee: Apollo Intelligent Technology Beijing Co Ltd
Current assignee: Apollo Intelligent Technology Beijing Co Ltd
Priority date: 2021-09-03
Filing date: 2021-09-03
Publication date: 2021-12-03

Abstract

The disclosure provides a method, an apparatus, and a device for training a key point detection model and for generating high-precision map lane lines, and relates to the fields of automatic driving, artificial intelligence, intelligent transportation, deep learning, and the like. A specific implementation includes: extracting, with an initial network, features of a semantic map sample and features of the start-stop point pair coordinates of a lane line sample; obtaining, with the initial network, predicted coordinates of at least one intermediate point of the lane line sample according to the features of the semantic map sample and the features of the start-stop point pair coordinates; and adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used to predict the coordinates of at least one intermediate point of a target lane line according to the coordinates of the start-stop point pair of the target lane line. The technical solution of the present disclosure can automatically generate lane lines that conform to human driving habits.

Description

Method and device for training key point detection model and generating high-precision map lane line

Technical Field

The present disclosure relates to the field of computer technologies, in particular to the fields of automatic driving, artificial intelligence, intelligent transportation, deep learning, and the like, and more specifically to a method, an apparatus, a device, a storage medium, and a computer program product for training a key point detection model and generating high-precision map lane lines.

Background

The lane line is a core element of the experience layer of a high-precision map. In the prior art, most lane lines are generated semi-automatically from traffic signs, and lane lines generated in this way do not conform to human driving habits. Anthropomorphic lane lines can also be generated by aggregating the perceived trajectories of obstacle vehicles, but because the number of deployed obstacle vehicles is limited, some lane lines may be missing.

Disclosure of Invention

The present disclosure provides a method, apparatus, device, storage medium, and computer program product for training of a keypoint detection model and generation of high-precision map lane lines.

According to a first aspect of the present disclosure, there is provided a method for training a keypoint detection model, including:

extracting the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinates of the lane line sample by using an initial network;

obtaining a predicted coordinate of at least one intermediate point of the lane line sample by using an initial network according to the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinate of the lane line sample;

and adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of the start-stop point pair of the target lane line.

According to a second aspect of the present disclosure, there is provided a method for generating a high-precision map lane line, including:

obtaining a semantic map corresponding to the high-precision map;

inputting the semantic map and the coordinates of the starting point and ending point pairs of the target lane line into the key point detection model to predict the coordinates of at least one intermediate point of the target lane line; wherein, the key point detection model is obtained by the training method;

and generating the target lane line according to the coordinates of the starting point and the ending point pairs of the target lane line and the at least one intermediate point.

According to a third aspect of the present disclosure, there is provided a training apparatus for a keypoint detection model, comprising:

the characteristic extraction module is used for extracting the characteristics of the semantic map sample and the characteristics of the start-stop point coordinates of the lane line sample by using the initial network;

the predicted coordinate determination module is used for obtaining the predicted coordinate of at least one intermediate point of the lane line sample by utilizing an initial network according to the characteristics of the semantic map sample and the characteristics of the start-stop point coordinates of the lane line sample;

and the parameter adjusting module is used for adjusting the parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, and the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of the starting point pair and the ending point pair of the target lane line.

According to a fourth aspect of the present disclosure, there is provided a high-precision map lane line generation apparatus including:

the semantic map acquisition module is used for acquiring a semantic map corresponding to the high-precision map;

the prediction module is used for inputting the semantic map and the coordinates of the starting point-ending point pair of the target lane line into the key point detection model so as to predict the coordinates of at least one intermediate point of the target lane line; wherein, the key point detection model is obtained by the training device;

and the generating module is used for generating the target lane line according to the coordinates of the starting point and the ending point pairs of the target lane line and the at least one intermediate point.

According to a fifth aspect of the present disclosure, there is provided a training apparatus for a keypoint detection model, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a training method provided by any of the embodiments of the present disclosure.

According to a sixth aspect of the present disclosure, there is provided a high-precision map lane line generation device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the generation method provided by any of the embodiments of the present disclosure.

According to a seventh aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided by any of the embodiments of the present disclosure.

According to an eighth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method provided by any of the embodiments of the present disclosure.

According to a ninth aspect of the present disclosure, there is provided an autonomous vehicle including the high-precision map lane line generation apparatus provided in any of the embodiments of the present disclosure or the high-precision map lane line generation device provided in any of the embodiments of the present disclosure.

The technical scheme of the embodiment of the disclosure can automatically generate the lane line which accords with the driving habits of human beings.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is an example diagram of a target lane line to be generated in accordance with an embodiment of the present disclosure;

FIG. 2 is a flow chart of a method of training a keypoint detection model according to an embodiment of the disclosure;

FIG. 3 is an exemplary diagram of a high precision map in accordance with an embodiment of the present disclosure;

FIG. 4 is an exemplary diagram of a semantic map in accordance with an embodiment of the present disclosure;

FIG. 5 is an example diagram of a vectorized lane line in accordance with an embodiment of the present disclosure;

FIG. 6 is a flow chart of a method of generating high precision map lane lines according to an embodiment of the present disclosure;

FIG. 7 is a diagram of an application example of a method of generating high-precision map lane lines according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of an application scenario in accordance with an embodiment of the present disclosure;

FIG. 9 is a block diagram of a training apparatus for a keypoint detection model according to an embodiment of the present disclosure;

FIG. 10 is a block diagram of a high-precision map lane line generation apparatus according to an embodiment of the present disclosure;

FIG. 11 is a block diagram of an electronic device used to implement methods of embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The experience layer of the high-precision map may provide a reference for the Planning and Control (PNC) module of an autonomous vehicle. Virtual lane lines can be provided in the experience layer as a reference for vehicle driving. For example, in current L4 driverless technology, when the vehicle turns at an intersection, the PNC module refers to the steering curve labeled in the high-precision map. This steering curve can be regarded as a virtual lane line, as shown in FIG. 1. The embodiments of the present application aim to provide a training method that yields a key point detection model, which is then used to automatically generate the virtual lane line between a start-stop point pair from the high-precision map and the coordinates of that start-stop point pair.

FIG. 2 shows a flow diagram of a method of training a keypoint detection model according to an embodiment of the disclosure. As shown in fig. 2, the training method includes:

step S201: extracting the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinates of the lane line sample by using an initial network;

step S202: obtaining a predicted coordinate of at least one intermediate point of the lane line sample by using an initial network according to the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinate of the lane line sample;

step S203: and adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of the start-stop point pair of the target lane line.

The training samples of the initial network comprise a large number of lane line samples. These lane line samples are illustratively collected from the driving trajectory of a human-driven vehicle, and thus conform to human driving habits.

The lane line samples include a start-stop point pair, i.e., a start point and an end point. The features of the start-stop point pair coordinates are extracted by the constructed initial network. For example, the start-stop point pair coordinates may be encoded in advance, and the features of the encoded coordinates are then extracted.
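The disclosure does not say how the start-stop point pair coordinates are encoded. As a minimal sketch, assuming the encoding is a simple normalization of map coordinates into the range [0, 1] relative to the extent of the map sample, it could look as follows. The function name `encode_start_stop_pair` and the parameters `map_origin` and `map_size` are hypothetical.

```python
def encode_start_stop_pair(start, end, map_origin, map_size):
    """Scale start and end (x, y) into [0, 1] relative to the map extent.

    Assumed encoding, not taken from the source. Returns a flat list
    [sx, sy, ex, ey] that can be fed to a feature extraction network.
    """
    (ox, oy), (width, height) = map_origin, map_size
    if width <= 0 or height <= 0:
        raise ValueError("map_size must be positive")
    return [
        (start[0] - ox) / width, (start[1] - oy) / height,
        (end[0] - ox) / width, (end[1] - oy) / height,
    ]


encoded = encode_start_stop_pair(
    start=(110.0, 220.0), end=(150.0, 260.0),
    map_origin=(100.0, 200.0), map_size=(100.0, 100.0),
)
```

Normalizing in this way keeps the inputs of the coordinate branch on a scale comparable to the image branch, which is one common reason for encoding coordinates before feature extraction.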

The high-precision map of the area where the lane line sample is located is taken as a high-precision map sample and encoded into a semantic map sample, and the features of the encoded semantic map sample are then extracted by the initial network. FIG. 3 is an example diagram of an intersection region in a high-precision map, and FIG. 4 is an example diagram of an intersection region in a semantic map.
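How the high-precision map is encoded into a semantic map is not detailed here. The following is a minimal sketch, assuming the semantic map is a two-dimensional grid of integer class labels and the map elements are polylines in metres. The class table `CLASS_IDS`, the function `rasterize`, and the grid resolution are all hypothetical.

```python
# Hypothetical class labels; the source does not enumerate semantic classes.
CLASS_IDS = {"background": 0, "lane_boundary": 1, "stop_line": 2}


def rasterize(elements, width, height, resolution=0.5):
    """Encode map polylines into a grid of class IDs (one cell = `resolution` metres).

    `elements` is a list of (class_name, [(x, y), ...]) tuples in map metres,
    with the grid origin at (0, 0). Returns a height x width list of lists.
    Later elements overwrite earlier ones where they share a cell.
    """
    grid = [[CLASS_IDS["background"]] * width for _ in range(height)]
    for class_name, points in elements:
        label = CLASS_IDS[class_name]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            # Sample each segment densely enough that no cell is skipped.
            length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
            steps = max(1, int(2 * length / resolution))
            for k in range(steps + 1):
                t = k / steps
                col = int((x0 + t * (x1 - x0)) / resolution)
                row = int((y0 + t * (y1 - y0)) / resolution)
                if 0 <= row < height and 0 <= col < width:
                    grid[row][col] = label
    return grid


semantic_map = rasterize(
    [("lane_boundary", [(0.0, 1.0), (4.0, 1.0)]),
     ("stop_line", [(2.0, 0.0), (2.0, 3.0)])],
    width=10, height=8,
)
```

A grid of this kind can be passed to a convolutional network as a single-channel image or expanded into one channel per class.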

The initial network outputs the predicted coordinates of at least one intermediate point between the start point and the end point according to the features of the semantic map sample and the features of the start-stop point pair coordinates. The parameters of the initial network are then adjusted according to the predicted coordinates and the true coordinates of each intermediate point until the convergence condition of the loss function is met, which yields the trained network, i.e., the key point detection model. It should be noted that the present embodiment does not limit the type or structure of the initial network.

Illustratively, as shown in FIG. 5, the curve A'C'B' represents a lane line sample, where A' is the start point of the lane line sample, B' is its end point, and C' is an intermediate point. The initial network may output predicted coordinates for A', B', and C', which are denoted A, B, and C in FIG. 5, respectively. The predicted coordinates of the start point A are then set to the true coordinates of A', and the predicted coordinates of the end point B are set to the true coordinates of B', after which the predicted lane line ACB is generated.
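The endpoint assignment described for FIG. 5 can be sketched as follows. This is only an illustration; the function name `assemble_predicted_lane_line` and the list layout `[A, C_1, ..., C_n, B]` are assumptions.

```python
def assemble_predicted_lane_line(pred_points, true_start, true_end):
    """Build the predicted lane line A-C...-B from network outputs.

    `pred_points` is [A, C_1, ..., C_n, B] as predicted (x, y) tuples.
    The predicted start and end are overwritten with the true coordinates of
    A' and B', so only the intermediate points carry prediction error.
    """
    if len(pred_points) < 3:
        raise ValueError("need a start point, at least one intermediate point, and an end point")
    intermediates = list(pred_points[1:-1])
    return [tuple(true_start)] + intermediates + [tuple(true_end)]


lane_line = assemble_predicted_lane_line(
    pred_points=[(0.1, 0.2), (1.0, 1.1), (2.2, 1.9)],
    true_start=(0.0, 0.0),
    true_end=(2.0, 2.0),
)
```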

The trained key point detection model can output the coordinates of at least one intermediate point between the start point and the end point according to the input coordinates of the start-stop point pair of the target lane line, and the anthropomorphic target lane line is then generated from the coordinates of the start-stop point pair and the coordinates of the at least one intermediate point.

The generated target lane line conforms to human driving habits and can be labeled in the experience layer of the high-precision map, which enables automatic generation of the experience layer and improves production efficiency. The target lane line can also empower the PNC module, improving the success rate of passing through intersections, and it can be used to plan the driving trajectory between a start point and an end point in real time while the autonomous vehicle is driving.

In one embodiment, step S201 may include: extracting the characteristics of the semantic map sample by utilizing a first network in the initial network; and extracting the characteristics of the start-stop point pair coordinates of the lane line sample by using a second network in the initial network.

That is, the initial network adopts two network structures to extract features from semantic map samples and from start and stop point coordinates respectively, so that global features and local features are learned respectively, and training efficiency and prediction accuracy are improved.

Illustratively, the first network may include a convolutional neural network (CNN) and fully connected layers (a multilayer perceptron, MLP), for example one CNN and a two-layer MLP. The CNN extracts semantic map features from the semantic map sample; these features are then fed into the two-layer MLP, which extracts the global features of the lane line sample in the semantic map sample. Illustratively, the second network may include a three-layer MLP.

In one embodiment, step S202 may include: splicing the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinates of the lane line sample; and decoding the spliced characteristics by using a third network in the initial network to obtain the predicted coordinates of at least one intermediate point of the lane line sample.

Illustratively, the third network may include a three-layer MLP that decodes the spliced global and local features and finally outputs the predicted coordinates of one or more intermediate points. The network is structurally simple and lightweight, which reduces the amount of computation and improves prediction efficiency.
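The data flow through the second and third networks can be sketched in plain Python as below. This is a minimal illustration with random, untrained weights: the layer widths, the ReLU activation, and the names `make_mlp`, `run_mlp`, and `predict_intermediate_points` are assumptions, and the output of the first network (CNN plus two-layer MLP) is replaced by a stand-in feature vector.

```python
import random


def make_mlp(sizes, rng):
    """Create weights for a fully connected network with the given layer sizes."""
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def run_mlp(layers, x):
    """Forward pass with ReLU between layers and a linear final layer."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def predict_intermediate_points(map_features, start, end, second_net, third_net, n_points):
    """Splice global and local features, then decode n_points (x, y) pairs."""
    local_features = run_mlp(second_net, [*start, *end])  # second network
    spliced = list(map_features) + local_features         # splicing step
    flat = run_mlp(third_net, spliced)                    # third network
    return [(flat[2 * i], flat[2 * i + 1]) for i in range(n_points)]


rng = random.Random(0)
N_POINTS, MAP_DIM, LOCAL_DIM = 5, 16, 8  # hypothetical sizes
second_net = make_mlp([4, 32, 32, LOCAL_DIM], rng)                      # three layers
third_net = make_mlp([MAP_DIM + LOCAL_DIM, 64, 64, 2 * N_POINTS], rng)  # three layers
map_features = [rng.random() for _ in range(MAP_DIM)]  # stand-in for the first network's output
points = predict_intermediate_points(
    map_features, (0.0, 0.0), (10.0, 5.0), second_net, third_net, N_POINTS
)
```

In practice the three networks would be built and trained jointly in a deep learning framework; the sketch only shows how the features are spliced and decoded.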

Preferably, the predicted coordinates of a plurality of intermediate points are output. By placing an intermediate point to be predicted at every preset distance interval between the start point and the end point, the coordinates of a series of fixed intermediate points can be obtained.
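One way to obtain intermediate points at a preset distance interval is to resample a lane line sample by arc length. The sketch below assumes that the distance is measured along the sample polyline, which the source does not state; `sample_intermediate_points` and `spacing` are hypothetical names.

```python
import math


def sample_intermediate_points(polyline, spacing):
    """Return points placed every `spacing` metres along `polyline`.

    The start and end vertices are excluded, so the result contains only
    intermediate points. `polyline` is a list of (x, y) tuples.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = []
    target = spacing   # arc length at which the next point is due
    travelled = 0.0    # arc length at the start of the current segment
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and target < travelled + seg - 1e-9:
            t = (target - travelled) / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            target += spacing
        travelled += seg
    return points


straight = sample_intermediate_points([(0.0, 0.0), (10.0, 0.0)], spacing=2.5)
corner = sample_intermediate_points([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)], spacing=3.0)
```

Applied to the true lane line samples, the same resampling would give true coordinates that correspond one-to-one with the points the network is asked to predict.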

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample includes: constructing a regression loss function according to the difference between the predicted coordinate and the true coordinate of the intermediate point; the parameters of the initial network are adjusted until the regression loss function converges.

Illustratively, the regression loss function $L_{disp}$ can be expressed by a formula (the formula image is not reproduced in this text) over the following quantities: the coordinates of $C_i$, i.e., the predicted coordinates of the i-th intermediate point; and the coordinates of $C_i'$, i.e., the true coordinates of the i-th intermediate point. N denotes the number of intermediate points, and in the regression loss function N is an integer greater than or equal to 1.

The network is trained through a regression loss function, so that the predicted coordinates of the output intermediate points can be closer to the true coordinates.
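Because the exact formula is not available in this text, the following is only a sketch of one plausible regression loss, assuming the difference is measured as the mean Euclidean distance between predicted and true intermediate points. The function name `regression_loss` is hypothetical.

```python
import math


def regression_loss(pred_points, true_points):
    """Mean Euclidean distance between predicted and true intermediate points.

    Assumed form; the source only states that the loss is built from the
    difference between predicted and true coordinates.
    """
    if not pred_points or len(pred_points) != len(true_points):
        raise ValueError("expected two non-empty point lists of equal length")
    total = sum(math.dist(p, t) for p, t in zip(pred_points, true_points))
    return total / len(pred_points)


loss = regression_loss([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```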

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample includes: determining a prediction vector of a first line segment between a first intermediate point and a second intermediate point according to the prediction coordinates of the first intermediate point and the second intermediate point of the lane line sample; determining a truth value vector of the first line section according to the truth value coordinates of the first intermediate point and the second intermediate point; constructing a local structure loss function according to the difference between the prediction vector and the truth value vector of the first line segment; the parameters of the initial network are adjusted until the local structure loss function converges.

Illustratively, the local structure loss function $L_{local}$ can be expressed by a formula (the formula image is not reproduced in this text) over the following quantities: the prediction vector of the i-th first line segment and the truth value vector of the i-th first line segment. N denotes the number of intermediate points; in the local structure loss function N is an integer greater than 1, and i is less than or equal to N.

The network is trained through the local structure loss function, so that the curve between the output intermediate points can be smoother, and the generated lane line can be smoother.

Preferably, the first intermediate point and the second intermediate point are adjacent intermediate points, thereby improving the smoothness of the local structure.
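A local structure loss over adjacent intermediate points might be sketched as follows. The exact formula is not available in this text, so the sketch assumes the mean Euclidean norm of the difference between predicted and true segment vectors, taken over the N - 1 segments that join adjacent intermediate points. The name `local_structure_loss` is hypothetical.

```python
import math


def local_structure_loss(pred_points, true_points):
    """Mean distance between predicted and true vectors of adjacent-point segments."""
    if len(pred_points) != len(true_points) or len(pred_points) < 2:
        raise ValueError("need at least two intermediate points in each list")

    def segment_vectors(points):
        # Vector from each intermediate point to the next one.
        return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(points, points[1:])]

    pairs = list(zip(segment_vectors(pred_points), segment_vectors(true_points)))
    return sum(math.dist(p, t) for p, t in pairs) / len(pairs)


flat_vs_diagonal = local_structure_loss(
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
)
shifted_copy = local_structure_loss([(5.0, 5.0), (6.0, 6.0)], [(0.0, 0.0), (1.0, 1.0)])
```

Because only segment vectors are compared, a prediction that is a pure translation of the true points has zero local structure loss, so a term of this kind constrains shape rather than position.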

In one embodiment, in step S203, adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample includes: determining a prediction vector of a second line segment between a third intermediate point and an endpoint according to the predicted coordinates of the third intermediate point of the lane line sample and the true coordinates of the endpoint of the lane line sample; determining a truth value vector of the second line segment according to the true coordinates of the third intermediate point and the true coordinates of the endpoint; constructing a global structure loss function according to the difference between the prediction vector and the truth value vector of the second line segment; and adjusting the parameters of the initial network until the global structure loss function converges. Here, the endpoint is the start point or the end point of the lane line sample.

Illustratively, the global structure loss function $L_{global}$ can be expressed by a formula (the formula image is not reproduced in this text) over the following quantities: the prediction vector of the i-th second line segment and the truth value vector of the i-th second line segment, where M is the number of second line segments.

The network is trained through the global structure loss function, so that the overall shape of the generated lane line is more reasonable.

Preferably, the third intermediate point is the intermediate point farthest from the start point or the end point, so that the global feature can be maximized.
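The global structure loss can be illustrated with the sketch below. The exact formula is not available in this text. If the difference were taken as the norm of the vector difference, the shared true endpoint would cancel and the term would reduce to a point-to-point distance, so the sketch instead compares directions using 1 minus the cosine of the angle between the two vectors. That choice, and the name `global_structure_loss`, are assumptions.

```python
import math


def global_structure_loss(pred_point, true_point, true_start, true_end):
    """Mean direction mismatch of the segments joining one intermediate point to each endpoint.

    Each term is 1 - cos(angle) between the predicted vector (true endpoint ->
    predicted point) and the true vector (true endpoint -> true point).
    """
    def direction_gap(u, v):
        norm = math.hypot(*u) * math.hypot(*v)
        if norm == 0.0:
            return 0.0  # degenerate segment: no direction to compare
        return 1.0 - (u[0] * v[0] + u[1] * v[1]) / norm

    gaps = []
    for endpoint in (true_start, true_end):
        pred_vec = (pred_point[0] - endpoint[0], pred_point[1] - endpoint[1])
        true_vec = (true_point[0] - endpoint[0], true_point[1] - endpoint[1])
        gaps.append(direction_gap(pred_vec, true_vec))
    return sum(gaps) / len(gaps)


perfect = global_structure_loss((5.0, 5.0), (5.0, 5.0), (0.0, 0.0), (10.0, 0.0))
off_axis = global_structure_loss((0.0, 5.0), (5.0, 0.0), (0.0, 0.0), (10.0, 0.0))
```

In this sketch there are two second line segments per intermediate point (one to the start point and one to the end point), so M equals 2.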

In one embodiment, a loss function L including a regression loss function, a local structure loss function and a global structure loss function may be constructed, and in step S203, the parameters of the initial network are adjusted until the loss function L converges, that is:

$L = L_{disp} + L_{local} + L_{global}$.
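For reference, one possible explicit form of the combined loss is written out below. The individual formulas are not available in this text, so the choice of norms and normalizations is an assumption: $p_{C_i}$ and $p_{C_i'}$ denote the predicted and true coordinates of the i-th intermediate point, $\vec{v}_i$ and $\vec{v}_i'$ the predicted and true vectors between adjacent intermediate points, and $\vec{g}_i$ and $\vec{g}_i'$ the predicted and true vectors of the i-th second line segment.

```latex
\begin{aligned}
L_{disp}   &= \frac{1}{N}\sum_{i=1}^{N}\bigl\lVert p_{C_i} - p_{C_i'} \bigr\rVert_2, \\
L_{local}  &= \frac{1}{N-1}\sum_{i=1}^{N-1}\bigl\lVert \vec{v}_i - \vec{v}_i' \bigr\rVert_2,
             \qquad \vec{v}_i = p_{C_{i+1}} - p_{C_i}, \\
L_{global} &= \frac{1}{M}\sum_{i=1}^{M}\Bigl(1 - \cos\angle\bigl(\vec{g}_i, \vec{g}_i'\bigr)\Bigr), \\
L          &= L_{disp} + L_{local} + L_{global}.
\end{aligned}
```

The three terms are summed without weights, as in the expression for L above; any per-term weighting would be an additional design choice not described in the source.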

FIG. 6 illustrates a flowchart of a method of generating a high-precision map lane line according to an embodiment of the present disclosure. As shown in FIG. 6, the generation method includes:

step S601: obtaining a semantic map corresponding to the high-precision map;

step S602: inputting the semantic map and the coordinates of the starting point and ending point pairs of the target lane line into the key point detection model to predict the coordinates of at least one intermediate point of the target lane line;

step S603: and generating the target lane line according to the coordinates of the starting point and the ending point pairs of the target lane line and the at least one intermediate point.

In one example, as shown in FIG. 7, the semantic map corresponding to the high-precision map and the coordinates of the start-stop point pair A and B of the target lane line are input into a key point detection model composed of a first network, a second network, and a third network, which yields at least one intermediate point $C_i$ of the target lane line. Fitting a curve through the coordinates of the start-stop point pair A and B and the at least one intermediate point $C_i$ then generates the target lane line.
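The fitting method is not specified. As a minimal sketch, assuming a uniform Catmull-Rom spline is an acceptable way to pass a smooth curve through A, the intermediate points, and B, the target lane line could be generated as follows. The function name `catmull_rom_lane_line` and the parameter `samples_per_segment` are hypothetical.

```python
def catmull_rom_lane_line(points, samples_per_segment=10):
    """Interpolate a smooth polyline through `points` = [A, C_1, ..., C_n, B]."""
    if len(points) < 2:
        raise ValueError("need at least the start and end points")
    # Duplicate the endpoints so the curve starts at A and ends at B.
    padded = [points[0]] + list(points) + [points[-1]]
    curve = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1], padded[i], padded[i + 1], padded[i + 2]
        for k in range(samples_per_segment):
            t = k / samples_per_segment
            # Uniform Catmull-Rom basis, evaluated per coordinate.
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t ** 2
                       + (3 * b - a - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    curve.append(tuple(float(v) for v in points[-1]))
    return curve


key_points = [(0.0, 0.0), (4.0, 1.0), (6.0, 3.0), (7.0, 7.0)]  # A, C_1, C_2, B
target_lane_line = catmull_rom_lane_line(key_points)
```

An interpolating spline is used here so that the generated lane line passes exactly through the predicted key points; a least-squares polynomial or Bezier fit would be an equally plausible reading of "fitting".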

FIG. 8 is a schematic view of an application scenario according to an embodiment of the present disclosure. As shown in FIG. 8, the terminal 801 may be hardware, that is, an electronic device with a display screen such as a mobile phone, a tablet, a vehicle-mounted terminal, or a portable computer. When the terminal 801 is software or an application (APP), it may be installed in such an electronic device. The server 802 may provide various services, such as support for applications installed on the terminal 801. The training method and the generation method provided by the embodiments of the present disclosure may be executed by the server 802 or by the terminal 801, and the apparatus corresponding to each method may be disposed in the terminal 801 or in the server 802. Any number of terminals, networks, and servers may be configured for implementation.

Fig. 9 is a block diagram of a training apparatus for a keypoint detection model according to an embodiment of the disclosure, as shown in fig. 9, the training apparatus comprising:

a feature extraction module 901, configured to extract features of the semantic map sample and features of the start-stop point pair coordinates of the lane line sample by using an initial network;

a predicted coordinate determination module 902, configured to obtain predicted coordinates of at least one intermediate point of the lane line sample by using the initial network according to the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample;

a parameter adjusting module 903, configured to adjust a parameter of the initial network according to the predicted coordinate and the true coordinate of the at least one intermediate point of the lane line sample, to obtain a key point detection model, where the key point detection model is configured to predict the coordinate of the at least one intermediate point of the target lane line according to the coordinates of the start-stop point pair of the target lane line.

In one embodiment, the feature extraction module 901 includes:

the first extraction submodule is used for extracting the characteristics of the semantic map sample by utilizing a first network in the initial network;

and the second extraction submodule is used for extracting the characteristics of the start-stop point pair coordinates of the lane line sample by utilizing a second network in the initial network.

In one embodiment, the predicted coordinate determination module 902 includes:

the splicing submodule is used for splicing the characteristics of the semantic map sample and the characteristics of the start-stop point pair coordinates of the lane line sample;

and the decoding submodule is used for decoding the spliced characteristics by using a third network in the initial network to obtain the predicted coordinates of at least one intermediate point of the lane line sample.

In one embodiment, the parameter tuning module 903 comprises:

the regression loss function building submodule is used for building a regression loss function according to the difference between the predicted coordinate and the true coordinate of the intermediate point;

and the first adjusting submodule is used for adjusting the parameters of the initial network until the regression loss function converges.

In one embodiment, the parameter tuning module 903 comprises:

the first prediction vector determination submodule is used for determining a prediction vector of a first line segment between a first intermediate point and a second intermediate point according to the prediction coordinates of the first intermediate point and the second intermediate point of the lane line sample;

the first truth value vector determining submodule is used for determining a truth value vector of the first line section according to the truth value coordinates of the first intermediate point and the second intermediate point;

the local structure loss function constructing submodule is used for constructing a local structure loss function according to the difference between the prediction vector and the true value vector of the first line segment;

and the second adjusting submodule is used for adjusting the parameters of the initial network until the local structure loss function is converged.

In one embodiment, the parameter tuning module 903 comprises:

the second prediction vector determination submodule is used for determining the prediction vector of a second line segment between a third intermediate point and an endpoint according to the predicted coordinates of the third intermediate point of the lane line sample and the true coordinates of the endpoint of the lane line sample, wherein the endpoint is the start point or the end point of the lane line sample;

the second truth value vector determination submodule is used for determining the truth value vector of the second line segment according to the true coordinates of the third intermediate point and the true coordinates of the endpoint;

the global structure loss function constructing submodule is used for constructing a global structure loss function according to the difference between the prediction vector and the truth value vector of the second line segment;

and the third adjusting submodule is used for adjusting the parameters of the initial network until the global structure loss function is converged.

Fig. 10 is a block diagram showing the configuration of a high-precision map lane line generation device according to an embodiment of the present disclosure. As shown in fig. 10, the generating means includes:

a semantic map acquisition module 1001 configured to acquire a semantic map corresponding to the high-precision map;

a prediction module 1002, configured to input the semantic map and coordinates of a start-stop point pair of the target lane line into the key point detection model, so as to predict coordinates of at least one intermediate point of the target lane line; wherein, the key point detection model is obtained by the training device;

the generating module 1003 is configured to generate a target lane line according to the coordinates of the start-stop point pair and the at least one intermediate point of the target lane line.

The functions of each module in each apparatus in the embodiments of the present disclosure may refer to the corresponding description in the above method, and are not described herein again.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure. The electronic equipment can be generation equipment of a high-precision map lane line and also can be training equipment of a key point detection model.

FIG. 11 shows a schematic block diagram of an example electronic device 1100 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 11, the device 1100 comprises a computing unit 1101, which may perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data necessary for the operation of the device 1100 may also be stored. The computing unit 1101, the ROM 1102, and the RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.

A number of components in device 1100 connect to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, and the like; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108 such as a magnetic disk, optical disk, or the like; and a communication unit 1109 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 1101 can be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 1101 performs the respective methods and processes described above. For example, in some embodiments, the various methods described above may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1108. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the respective methods described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured by any other suitable means (e.g., by means of firmware) to perform the various methods described above.

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

According to an embodiment of the present disclosure, the present disclosure further provides an autonomous vehicle including the high-precision map lane line generation device provided in any of the embodiments of the present disclosure, or the electronic device for generating a high-precision map lane line provided in any of the embodiments of the present disclosure.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

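1. A method for training a keypoint detection model, comprising:

extracting features of a semantic map sample and features of the start-stop point pair coordinates of a lane line sample by using an initial network;

obtaining predicted coordinates of at least one intermediate point of the lane line sample by using the initial network according to the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample;

and adjusting parameters of the initial network according to the predicted coordinates and the true coordinates of the at least one intermediate point of the lane line sample to obtain a key point detection model, wherein the key point detection model is used for predicting the coordinates of at least one intermediate point of a target lane line according to the coordinates of the start-stop point pair of the target lane line.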

2. The training method of claim 1, wherein extracting the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample by using the initial network comprises:

extracting the features of the semantic map sample by using a first network in the initial network;

and extracting the features of the start-stop point pair coordinates of the lane line sample by using a second network in the initial network.

3. The training method according to claim 1, wherein obtaining the predicted coordinates of at least one intermediate point of the lane line sample by using the initial network according to the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample comprises:

splicing the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample;

and decoding the spliced features by using a third network in the initial network to obtain the predicted coordinates of at least one intermediate point of the lane line sample.
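By way of illustration only, the splicing and decoding steps recited above might be sketched as below. Splicing is taken here to mean concatenating two flat feature vectors, and the third network is replaced by a single linear layer; both choices, and the names `splice` and `decode_linear`, are assumptions of this sketch and are not specified by the claims.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def splice(map_features: Sequence[float], pair_features: Sequence[float]) -> List[float]:
    """Concatenate semantic-map features with start-stop pair features."""
    return list(map_features) + list(pair_features)


def decode_linear(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[Point]:
    """Stand-in for the third network: one linear layer producing (x, y) pairs.

    `weights` has one row per output value; an even number of outputs is
    required so that they can be grouped into coordinates.
    """
    outputs = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    if len(outputs) % 2 != 0:
        raise ValueError("decoder must emit an even number of values")
    # Group consecutive outputs as (x, y) coordinates of intermediate points.
    return [(outputs[i], outputs[i + 1]) for i in range(0, len(outputs), 2)]
```

A real third network would typically contain several learned layers; the linear layer only shows how a spliced feature vector maps to a fixed number of coordinate pairs.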

4. The training method according to any one of claims 1 to 3, wherein adjusting the parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample comprises:

constructing a regression loss function according to the difference between the predicted coordinate and the true coordinate of the intermediate point;

adjusting parameters of the initial network until the regression loss function converges.
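For illustration, one possible regression loss of the kind recited above is the mean squared Euclidean distance between predicted and true intermediate points. The claims do not fix the exact form of the difference, so the squared-distance choice and the name `regression_loss` are assumptions of this sketch.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def regression_loss(pred_points: Sequence[Point], true_points: Sequence[Point]) -> float:
    """Mean squared distance between predicted and true intermediate points."""
    if len(pred_points) != len(true_points) or not pred_points:
        raise ValueError("need the same non-zero number of predicted and true points")
    total = sum(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred_points, true_points)
    )
    return total / len(pred_points)
```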

5. The training method according to any one of claims 1 to 3, wherein adjusting the parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample comprises:

determining a prediction vector of a first line segment between a first intermediate point and a second intermediate point of the lane line sample according to the predicted coordinates of the first intermediate point and the second intermediate point;

determining a truth vector of the first line segment according to the true coordinates of the first intermediate point and the second intermediate point;

constructing a local structure loss function according to the difference between the prediction vector and the truth vector of the first line segment;

adjusting parameters of the initial network until the local structure loss function converges.
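As a non-limiting sketch, the local structure loss might compare the vectors of line segments between adjacent intermediate points. The claim recites a single first line segment; applying the comparison to every adjacent pair and measuring the difference by squared length are assumptions of this sketch, as are the names `segment_vectors` and `local_structure_loss`.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def segment_vectors(points: Sequence[Point]) -> List[Point]:
    """Vectors of the line segments between consecutive points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def local_structure_loss(pred_points: Sequence[Point], true_points: Sequence[Point]) -> float:
    """Mean squared difference between predicted and true segment vectors."""
    if len(pred_points) != len(true_points) or len(pred_points) < 2:
        raise ValueError("need at least two intermediate points on each side")
    pred_vecs = segment_vectors(pred_points)
    true_vecs = segment_vectors(true_points)
    total = sum(
        (pu - tu) ** 2 + (pv - tv) ** 2
        for (pu, pv), (tu, tv) in zip(pred_vecs, true_vecs)
    )
    return total / len(pred_vecs)
```

Because only differences between neighbouring points enter this term, a prediction that is uniformly shifted from the truth incurs no local structure loss; such an offset would be penalised by a separate term such as a regression loss.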

6. The training method according to any one of claims 1 to 3, wherein adjusting the parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample comprises:

determining a prediction vector of a second line segment between a third intermediate point of the lane line sample and an endpoint of the lane line sample according to the predicted coordinates of the third intermediate point and the true coordinates of the endpoint; wherein the endpoint is the start point or the end point of the lane line sample;

determining a truth vector of the second line segment according to the true coordinates of the third intermediate point and the true coordinates of the endpoint;

constructing a global structure loss function according to the difference between the prediction vector and the truth vector of the second line segment;

adjusting parameters of the initial network until the global structure loss function converges.
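For illustration, the global structure loss might compare the direction of the segment from an endpoint to each predicted intermediate point with the direction of the corresponding true segment. The claim does not state how the vector difference is measured; the cosine-based measure below, and the name `global_structure_loss`, are assumptions of this sketch. A plain squared vector difference is not used because the shared endpoint would cancel and the term would coincide with a squared coordinate error.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _cosine_distance(u: Point, v: Point) -> float:
    """1 - cos(angle between u and v); 0.0 if either vector has zero length."""
    nu, nv = math.hypot(u[0], u[1]), math.hypot(v[0], v[1])
    if nu == 0.0 or nv == 0.0:
        return 0.0  # direction undefined; contributes nothing in this sketch
    return 1.0 - (u[0] * v[0] + u[1] * v[1]) / (nu * nv)


def global_structure_loss(
    pred_points: Sequence[Point],
    true_points: Sequence[Point],
    true_endpoint: Point,
) -> float:
    """Mean direction mismatch of endpoint-to-intermediate-point segments."""
    if len(pred_points) != len(true_points) or not pred_points:
        raise ValueError("need the same non-zero number of predicted and true points")
    ex, ey = true_endpoint
    total = 0.0
    for (px, py), (tx, ty) in zip(pred_points, true_points):
        pred_vec = (px - ex, py - ey)  # endpoint -> predicted intermediate point
        true_vec = (tx - ex, ty - ey)  # endpoint -> true intermediate point
        total += _cosine_distance(pred_vec, true_vec)
    return total / len(pred_points)
```

The function could be evaluated once with the start point and once with the end point as `true_endpoint`, and the two values combined, since the claim allows either endpoint.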

7. A high-precision map lane line generation method comprises the following steps:

obtaining a semantic map corresponding to the high-precision map;

inputting the semantic map and the coordinates of the start-stop point pair of the target lane line into a key point detection model to predict the coordinates of at least one intermediate point of the target lane line; wherein the key point detection model is obtained by the training method of any one of claims 1 to 6;

and generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

8. A training apparatus for a keypoint detection model, comprising:

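the feature extraction module is used for extracting the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample by utilizing an initial network;

the predicted coordinate determination module is used for obtaining the predicted coordinates of at least one intermediate point of the lane line sample by utilizing the initial network according to the features of the semantic map sample and the features of the start-stop point pair coordinates of the lane line sample;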

and the parameter adjusting module is used for adjusting the parameters of the initial network according to the predicted coordinates and the true coordinates of at least one intermediate point of the lane line sample to obtain a key point detection model, and the key point detection model is used for predicting the coordinates of at least one intermediate point of the target lane line according to the coordinates of a start-stop point pair of the target lane line.

9. The training apparatus of claim 8, wherein the feature extraction module comprises:

10. The training device of claim 8, wherein the predicted coordinate determination module comprises:

and the decoding submodule is used for decoding the spliced characteristics by utilizing a third network in the initial network to obtain the predicted coordinate of at least one intermediate point of the lane line sample.

11. The training apparatus of any one of claims 8 to 10, wherein the parameter adjustment module comprises:

a first adjusting submodule, configured to adjust a parameter of the initial network until the regression loss function converges.

12. The training apparatus of any one of claims 8 to 10, wherein the parameter adjustment module comprises:

a first prediction vector determination submodule configured to determine a prediction vector of a first line segment between a first intermediate point and a second intermediate point of the lane line sample according to prediction coordinates of the first intermediate point and the second intermediate point;

a first true value vector determining submodule, configured to determine a true value vector of the first line segment according to true value coordinates of the first intermediate point and the second intermediate point;

13. The training apparatus of any one of claims 8 to 10, wherein the parameter adjustment module comprises:

a second prediction vector determination submodule configured to determine a prediction vector of a second line segment between a third intermediate point of the lane line sample and an endpoint of the lane line sample according to the predicted coordinates of the third intermediate point and the true coordinates of the endpoint; wherein the endpoint is the start point or the end point of the lane line sample;

a second truth vector determination submodule, configured to determine a truth vector of the second line segment according to the true coordinates of the third intermediate point and the true coordinates of the endpoint;

14. A high-precision map lane line generation apparatus comprising:

the semantic map acquisition module is used for acquiring a semantic map corresponding to the high-precision map;

the prediction module is used for inputting the semantic map and the coordinates of the start-stop point pair of the target lane line into a key point detection model so as to predict the coordinates of at least one intermediate point of the target lane line; wherein the key point detection model is obtained by the training device of any one of claims 8 to 13;

and the generating module is used for generating the target lane line according to the coordinates of the start-stop point pair and of the at least one intermediate point of the target lane line.

15. A training apparatus for a keypoint detection model, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.

16. A high-precision map lane line generation apparatus comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of claim 7.

17. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1 to 7.

18. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.

19. An autonomous vehicle comprising the high-precision map lane line generation device of claim 14 or the high-precision map lane line generation apparatus of claim 16.