CN113706705A - Image processing method, device and equipment for high-precision map and storage medium - Google Patents
- Publication number: CN113706705A
- Application number: CN202111032344.7A
- Authority: CN (China)
- Prior art keywords: lane line, determining, coefficient, expression, sample
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/05—Geographic models
Abstract
The present disclosure provides an image processing method, apparatus, device, and storage medium for high-precision maps, relating to the field of computer technology and in particular to autonomous driving. The implementation scheme is as follows: perform feature extraction on an image to be detected to obtain element features, where the element features include first element features corresponding to lane line elements; determine, based on the first element features, a plurality of candidate expressions corresponding to the lane line elements, where each candidate expression represents a fitted curve of the corresponding lane line element; determine the confidence corresponding to each candidate expression; and determine a target expression from the candidate expressions based on the confidences. With the disclosed technique, lane line elements in the image to be detected can be detected accurately.
Description
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to the field of automated driving technology.
Background
A high-precision map mainly comprises signboard elements such as traffic lights, which have regular geometric shapes, and road surface elements such as lane lines, which have irregular shapes. In the related art, lane line elements are usually detected with a semantic segmentation and post-processing strategy, which suffers from drawbacks such as a large computational load and low recognition accuracy.
Disclosure of Invention
The present disclosure provides an image processing method, apparatus, device, and storage medium for high-precision maps.
According to an aspect of the present disclosure, there is provided an image processing method for a high-precision map, including:
performing feature extraction on an image to be detected to obtain element features, wherein the element features comprise first element features corresponding to lane line elements;
determining a plurality of candidate expressions corresponding to the lane line elements based on the first element features, wherein each candidate expression represents a fitted curve of the corresponding lane line element;
determining the confidence corresponding to each candidate expression;
and determining a target expression from the plurality of candidate expressions based on the confidences.
According to another aspect of the present disclosure, there is provided a model training method, including:
carrying out feature extraction on the lane line elements in the sample image to obtain sample features;
determining a sample coefficient group corresponding to a sample expression of the lane line element in the sample image;
inputting the sample characteristics into a coefficient prediction model to be trained to obtain a prediction coefficient group;
and determining the difference between the prediction coefficient group and the sample coefficient group, and training the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
According to another aspect of the present disclosure, there is provided an image processing apparatus including:
a feature extraction module, configured to perform feature extraction on an image to be detected to obtain element features, the element features comprising first element features corresponding to lane line elements;
a candidate expression determining module, configured to determine, based on the first element features, a plurality of candidate expressions corresponding to the lane line elements, each candidate expression representing a fitted curve of the corresponding lane line element;
a confidence determining module, configured to determine the confidence corresponding to each candidate expression;
and a target expression determining module, configured to determine a target expression from the plurality of candidate expressions based on the confidences.
According to another aspect of the present disclosure, there is provided a model training apparatus including:
the sample feature extraction module is used for extracting features of the lane line elements in the sample image to obtain sample features;
the system comprises a sample coefficient group determining module, a sample coefficient group determining module and a processing module, wherein the sample coefficient group determining module is used for determining a sample coefficient group corresponding to a sample expression of a lane line element in a sample image;
the prediction coefficient group generation module is used for inputting the sample characteristics into a coefficient prediction model to be trained to obtain a prediction coefficient group;
and the training module is used for determining the difference between the prediction coefficient group and the sample coefficient group, and training the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method in any of the embodiments of the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method in any of the embodiments of the present disclosure.
According to the disclosed technique, the first element features corresponding to the lane line elements in the image to be detected are extracted, a plurality of candidate expressions corresponding to the lane line elements and their confidences are determined from the first element features, and the target expression of each lane line element is determined based on the confidences. The lane line elements in the image to be detected can thus be detected accurately, and the output expressions of the fitted curves corresponding to the lane line elements are more accurate; on this basis, the construction precision of the high-precision map is improved, and the accuracy and reliability of path planning for autonomous driving are improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. In the drawings:
fig. 1 shows a flow chart of an image processing method for high precision maps according to an embodiment of the present disclosure;
fig. 2 shows a specific flowchart of determining a plurality of expressions to be selected in a method according to an embodiment of the present disclosure;
FIG. 3 illustrates a detailed flow chart for determining confidence in a method according to an embodiment of the present disclosure;
FIG. 4 illustrates a detailed flow chart of determining attribute description information for a lane line element in a method according to an embodiment of the disclosure;
FIG. 5 illustrates a detailed flow chart of determining attribute description information of a non-lane line in a method according to an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a method according to an embodiment of the present disclosure;
FIG. 7 shows a flow diagram of a model training method according to an embodiment of the present disclosure;
fig. 8 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 9 shows a block diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 10 is a block diagram of an electronic device for implementing an image processing method for a high-precision map according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the disclosed embodiments are included to assist understanding and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and constructions are omitted below for clarity and conciseness.
An image processing method for a high-precision map according to an embodiment of the present disclosure is described below with reference to fig. 1 to 6.
Fig. 1 illustrates a flowchart of an image processing method for a high-precision map according to an embodiment of the present disclosure. As shown in fig. 1, the method specifically includes the following steps:
s101: performing feature extraction on an image to be detected to obtain element features, wherein the element features comprise first element features corresponding to lane line elements;
s102: determining a plurality of candidate expressions corresponding to the lane line elements based on the first element features, wherein each candidate expression represents a fitted curve of the corresponding lane line element;
s103: determining the confidence corresponding to each candidate expression;
s104: and determining a target expression from the plurality of candidate expressions based on the confidences.
The image processing method for high-precision maps according to the embodiments of the present disclosure can be applied to the field of autonomous driving. In particular, with this method, the lane line elements contained in an image to be detected can be detected and identified, and target expressions of the lane line elements can be generated. The target expressions of the lane line elements can then be used to construct a high-precision map.
The image to be detected may be an image of the road surface ahead of the autonomous vehicle captured by the vision sensor during driving. The image to be detected may include a plurality of road surface elements, for example, lane line elements and non-lane line elements. The non-lane line elements may include traffic light marks, ground marks, fences, traffic signboards, and the like.
Illustratively, in step S101, the element features in the image to be detected can be extracted through the feature extraction layer of a pre-trained convolutional neural network.
In addition, in step S101, the extracted element features are not limited to the first element features corresponding to the lane line elements; for example, second element features corresponding to non-lane line elements may also be extracted.
In one example, the feature extraction layer of the convolutional neural network may include six downsampling modules and a stack network, which perform feature extraction on the image to be detected in sequence, converting it into feature images of decreasing size; the smallest feature image is finally output through the stack network as the first element feature.
Specifically, the six downsampling modules have 32, 64, 128, 256, 512, and 1024 channels respectively, and each residual unit halves the spatial size of its input. For example, assume the size of the image to be detected is h × w × 3, where h is the image height, w is the image width, and 3 is the number of image channels (RGB). After processing by the downsampling modules and the stack network of the feature extraction module, the finally output element feature image has size (h/2^10) × (w/2^10) × C, where C is the number of channels of the feature image. The first two downsampling modules perform coarse-grained feature extraction on the image, and the last four extract finer image features.
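As a rough sketch only (the patent publishes no source code, so the module and layer design below are assumptions), a backbone of this shape could look as follows in PyTorch; the sketch covers only the six halving modules, i.e. a 2^6 spatial reduction, with any further reduction left to the stack network:

```python
import torch
import torch.nn as nn

class DownsampleBlock(nn.Module):
    """One downsampling module: halves the spatial size and changes the
    channel count. A minimal stand-in; the patent does not detail the
    internal residual-unit design."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(x)

class FeatureExtractor(nn.Module):
    """Six downsampling modules with 32, 64, 128, 256, 512, 1024 channels."""
    def __init__(self):
        super().__init__()
        chans = [3, 32, 64, 128, 256, 512, 1024]
        self.stages = nn.Sequential(
            *[DownsampleBlock(chans[i], chans[i + 1]) for i in range(6)]
        )

    def forward(self, x):        # x: (B, 3, h, w)
        return self.stages(x)    # (B, 1024, h/64, w/64) after six halvings

feat = FeatureExtractor()(torch.randn(1, 3, 512, 512))
print(feat.shape)  # torch.Size([1, 1024, 8, 8])
```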
For example, in step S102, a preset number of anchor frames and their corresponding anchor frame parameters may be determined from the first element features, and a preset number of candidate expressions may be determined based on each anchor frame and its parameters. A candidate expression may be a bivariate equation relating the abscissa and the ordinate of the lane line element in the image to be detected.
For example, in step S103, the confidence of each candidate expression may be determined by inputting it into a confidence prediction unit of the convolutional neural network. The confidence prediction unit may be trained with a confidence loss function.
For example, in step S104, the candidate expression with the highest confidence may be selected as the target expression according to the confidences of the candidate expressions, thereby determining the target fitted curve of the lane line element.
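A trivial illustration of this selection step, with made-up coefficient groups and confidences:

```python
# Hypothetical candidates for one lane line element: each entry pairs a
# coefficient group (a, b, c, d) with the confidence predicted for it.
candidates = [
    {"coeffs": (1.2e-6, -3.0e-4, 0.02, 1.5), "confidence": 0.91},
    {"coeffs": (0.9e-6, -2.8e-4, 0.01, 1.4), "confidence": 0.77},
]

# Step S104: the highest-confidence candidate becomes the target expression.
target = max(candidates, key=lambda cand: cand["confidence"])
a, b, c, d = target["coeffs"]
print(f"target expression: y = {a}*x^3 + {b}*x^2 + {c}*x + {d}")
```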
With the image processing method for high-precision maps according to the embodiments of the present disclosure, the first element features corresponding to the lane line elements in the image to be detected are extracted, the candidate expressions corresponding to the lane line elements and their confidences are determined from the first element features, and the target expression of each lane line element is determined based on the confidences. The lane line elements in the image to be detected can thus be detected accurately, and the output expressions of the fitted curves corresponding to the lane line elements are more accurate; on this basis, the construction precision of the high-precision map is improved, and the accuracy and reliability of path planning for autonomous driving are improved.
As shown in FIG. 2, in one embodiment, the first element features include multiple pieces of anchor frame information, and step S102 includes:
s201: inputting the multiple pieces of anchor frame information into a coefficient prediction model to obtain a candidate coefficient group corresponding to each piece of anchor frame information;
s202: determining a corresponding candidate expression from each candidate coefficient group.
In one example, the feature extraction layer of the convolutional neural network further includes a lane line prediction module. Based on the first element features output by the feature extraction layer, the lane line prediction module locates each feature image and regresses feature images with different channel counts down to 2 channels through a convolutional layer, where the 2 channels represent the category confidence dimension and the position dimension of the feature image. A Boolean variable indicates whether the pixel semantics of the image to be detected corresponding to a feature image contain a lane line feature: 1 if they do and 0 if they do not. Accordingly, the feature images output as 1 are stacked to obtain the corresponding anchor frames and anchor frame information.
The anchor frame information may include the size of the anchor frame and its position information, where the position information represents the position coordinates of the anchor frame in the image to be detected. The coefficient prediction model may be a fully connected layer of the convolutional neural network: the anchor frame information is input into the fully connected layer, and a preset number of candidate coefficient groups output by the fully connected layer are received.
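As an illustrative sketch (the layer sizes and the 4-number anchor encoding below are assumptions, not the patent's actual design), such a fully connected coefficient head might look like:

```python
import torch
import torch.nn as nn

class CoefficientHead(nn.Module):
    """Fully connected head mapping anchor frame information (assumed here
    to be 4 numbers: size and position) to one candidate coefficient group
    (a, b, c, d)."""
    def __init__(self, anchor_dim: int = 4, hidden: int = 128, n_coeffs: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(anchor_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, n_coeffs),
        )

    def forward(self, anchors):   # anchors: (num_anchors, anchor_dim)
        return self.fc(anchors)   # (num_anchors, 4): one coefficient group per anchor

coeffs = CoefficientHead()(torch.randn(3, 4))  # 3 anchor frames -> 3 candidate groups
print(coeffs.shape)  # torch.Size([3, 4])
```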
In one specific example, each candidate coefficient group includes four coefficients a, b, c, and d, which define the corresponding candidate expression:
y = ax³ + bx² + cx + d,
where x and y represent the abscissa and ordinate of the fitted curve on the ground, and a, b, c, and d represent the cubic, quadratic, linear, and constant coefficients, respectively.
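Evaluating such an expression is straightforward; a minimal helper follows (the coefficient values below are invented for illustration):

```python
import numpy as np

def lane_curve(coeffs, x):
    """Evaluate the fitted lane curve y = a*x^3 + b*x^2 + c*x + d."""
    a, b, c, d = coeffs
    return a * x**3 + b * x**2 + c * x + d

xs = np.linspace(0.0, 50.0, 6)                    # sample abscissas along the road
print(lane_curve((1e-5, -2e-3, 0.05, 1.2), xs))   # ordinates of the fitted curve
```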
In this way, a preset number of candidate expressions corresponding to the lane line elements can be determined from the anchor frame information in the first element features.
As shown in fig. 3, in an embodiment, determining the confidence corresponding to each candidate expression includes:
s301: determining the offset of each candidate coefficient included in each candidate coefficient group;
s302: calculating, based on the offsets of the candidate coefficients, the confidence corresponding to each candidate coefficient group using a cross-entropy function.
For example, the offset of each candidate coefficient may be calculated by the first branch convolutional layer of the convolutional neural network, and the confidence corresponding to each candidate coefficient group is then calculated from these offsets using a cross-entropy loss function.
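The patent does not spell out how the offsets map to confidences; one plausible reading, sketched below under that stated assumption, is that small coefficient offsets yield high confidence, with a binary cross-entropy term applied during training:

```python
import torch
import torch.nn.functional as F

def candidate_confidence(offsets):
    """One plausible reading (an assumption, not the patent's definition):
    small coefficient offsets -> high confidence. Maps the mean absolute
    offset of each candidate group through a sigmoid."""
    return torch.sigmoid(-offsets.abs().mean(dim=-1))

offsets = torch.randn(3, 4)               # offsets of 4 coefficients x 3 candidates
conf = candidate_confidence(offsets)      # (3,) confidence per candidate group

# During training, a cross-entropy term compares confidences to 0/1 labels
# marking which candidate matches the ground-truth lane line.
labels = torch.tensor([1.0, 0.0, 0.0])
loss = F.binary_cross_entropy(conf, labels)
print(conf, loss.item())
```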
Through this embodiment, the confidence of each candidate coefficient group can be determined, so that the target coefficient group can be selected from the candidate coefficient groups according to the confidences, yielding the target expression corresponding to the lane line element.
As shown in fig. 4, in one embodiment, the method further comprises:
s401: based on the target expression, attribute description information of the lane line element is determined, the attribute description information including at least one of a color attribute, a line type attribute, and a boundary attribute of the lane line element.
For example, the target expression may be input into a second branch convolutional layer of the convolutional neural network, and the attribute description information of the lane line element received from that layer.
In a specific example, for a lane line element, the feature extraction layer of a convolutional neural network may be used to perform feature extraction on the input image to be detected, obtaining the first element features corresponding to the lane line element. The first element features are then input into the coefficient prediction model of the convolutional neural network to obtain a preset number of candidate coefficient groups, from which a corresponding number of candidate expressions are determined. The candidate expressions are input to the first branch convolutional layer of the convolutional neural network, the confidence corresponding to each is calculated through the cross-entropy function, and the target expression is determined from the candidate expressions according to the confidences. Finally, the target expression is input into the second branch convolutional layer of the convolutional neural network to determine the attribute description information of the lane line element.
Through this embodiment, not only the target expression of the lane line element but also its color attribute, line type attribute, and boundary attribute can be determined, so that the output lane line elements carry a certain degree of semantic understanding.
As shown in fig. 5, in one embodiment, the element features further include a second element feature corresponding to a non-lane line element, and the method further includes:
s501: inputting the second element features into a single-stage detector to obtain attribute description information of the non-lane line elements, wherein the attribute description information of the non-lane line elements includes the type attributes and/or the position attributes of the non-lane line elements.
The non-lane line elements may include traffic light marks, ground marks, fences, traffic signboards, and the like.
Illustratively, the single-stage detector may employ a YOLO (You Only Look Once) model. Specifically, the YOLO model works as follows: the feature image corresponding to the non-lane line elements extracted by the feature extraction layer is divided into S×S grid cells, and if the center of a non-lane line element falls into a grid cell, that cell is responsible for predicting the element. Each grid cell predicts several bounding boxes, and each bounding box predicts, in addition to its position, a confidence score. If there is no object (element) in the grid cell, the confidence score is 0; otherwise, the confidence score equals the IoU (Intersection over Union) between the predicted box and the ground truth. Each bounding box thus predicts 5 values, namely (u, v, h, w) and the confidence, and each grid cell additionally predicts class information, denoted as C categories. With S×S grid cells, each predicting B bounding boxes and C category scores, the final output is an S × S × (5×B + C) tensor of attribute description information.
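For concreteness, the output tensor shape implied by this description can be checked as follows (the grid size, box count, and class count are illustrative values, not from the patent):

```python
import torch

S, B, C = 13, 3, 8   # grid size, boxes per cell, class count (illustrative values)

# Each box predicts (u, v, h, w) plus a confidence; each cell also predicts
# C class scores, giving S x S x (5*B + C) outputs in total.
yolo_output = torch.randn(S, S, 5 * B + C)
print(yolo_output.shape)  # torch.Size([13, 13, 23])
```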
Through the above embodiment, the image processing method for the high-precision map according to the embodiment of the disclosure can not only extract the lane line elements in the image to be detected, but also extract the non-lane line elements therein, and can output the attribute description information of the non-lane line elements, so that the finally extracted non-lane line elements have a certain semantic understanding capability.
In one particular example, as shown in fig. 6, the method of embodiments of the present disclosure may be implemented by a convolutional neural network.
Specifically, the convolutional neural network comprises a feature extraction layer, a lane line element decoding network, and a non-lane line element decoding network. The image to be detected is input into the feature extraction layer, which extracts first element features from the lane line elements in the image and second element features from the non-lane line elements.
For the anchor frame information in the first element features, a coefficient prediction model in the lane line element decoding network produces a plurality of candidate coefficient groups corresponding to the lane line elements, from which a plurality of candidate expressions are determined. The candidate expressions are then input to the first branch convolutional layer in the lane line decoding network, the confidence of each candidate expression is calculated, and the target expression is determined from the candidate expressions based on the confidences. Finally, the target expression is input to the second branch convolutional layer to obtain the attribute description information of the lane line element. The final output of the lane line element decoding network is:
N × 3 × (num_class1 + 5),
where N is the number of lane line elements in the image to be detected, 3 is the number of anchor frames corresponding to each lane line element, num_class1 characterizes the attribute description information of the lane line elements, and 5 covers the target coefficient group (a, b, c, d) together with its confidence.
For the second element features, the non-lane line elements may be extracted using the non-lane line decoding network, which may be a single-stage detector such as a YOLO model. The anchor frame information contained in the second element features is input into the single-stage detector, whose final output is:
3 × (num_class2 + 5),
where 3 is the number of anchor frames in the second element features, num_class2 characterizes the attribute description information of the non-lane line elements, and 5 covers the bounding box coordinates (u, v, h, w) together with the confidence.
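Putting the two decoding heads side by side, the output shapes can be checked with a short snippet (the element and class counts are invented for illustration):

```python
import torch

N, num_class1, num_class2 = 4, 6, 10   # lane line count and class counts (illustrative)

# Lane line head: N lane lines x 3 anchor frames x (attributes + (a,b,c,d) + confidence)
lane_out = torch.randn(N, 3, num_class1 + 5)
# Non-lane line head: 3 anchor frames x (attributes + (u,v,h,w) + confidence)
non_lane_out = torch.randn(3, num_class2 + 5)

print(lane_out.shape, non_lane_out.shape)  # (4, 3, 11) and (3, 15)
```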
According to the embodiment of the disclosure, a model training method is also provided.
As shown in fig. 7, the model training method includes the following steps:
s701: carrying out feature extraction on the lane line elements in the sample image to obtain sample features;
s702: determining a sample coefficient group corresponding to a sample expression of the lane line element in the sample image;
s703: inputting the sample characteristics into a coefficient prediction model to be trained to obtain a prediction coefficient group;
s704: and determining the difference between the prediction coefficient group and the sample coefficient group, and training the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
In step S701, historically acquired lane line images and their labels may be preprocessed, and the training data in the resulting training data set used as sample images.
With the model training method of the embodiments of the present disclosure, the trained coefficient prediction model can predict, for the first element features corresponding to lane line elements, a plurality of corresponding coefficient groups, thereby obtaining a plurality of candidate expressions for the lane line elements.
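A minimal sketch of this training loop follows, assuming a mean squared error as the "difference" measure and an arbitrary tolerance (the patent fixes neither, so both are assumptions):

```python
import torch
import torch.nn as nn

# Stand-in coefficient prediction model; the real model's architecture and
# feature dimension are not published, so both are assumptions here.
model = nn.Linear(256, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()      # assumed measure of the "difference"
tolerance = 1e-3              # assumed allowable range

sample_features = torch.randn(32, 256)   # S701: features of lane line elements
sample_coeffs = torch.randn(32, 4)       # S702: labelled coefficient groups (a, b, c, d)

for step in range(10_000):
    pred_coeffs = model(sample_features)             # S703: predicted coefficient groups
    diff = criterion(pred_coeffs, sample_coeffs)     # S704: difference to sample groups
    if diff.item() < tolerance:                      # stop once within the allowable range
        break
    optimizer.zero_grad()
    diff.backward()
    optimizer.step()
```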
According to an embodiment of the present disclosure, there is also provided an image processing apparatus.
As shown in fig. 8, the image processing apparatus includes:
the first feature extraction module 801 is configured to perform feature extraction on the lane line elements in an image to be detected to obtain first element features;
a candidate expression determining module 802, configured to determine, based on the first element features, a plurality of candidate expressions corresponding to the lane line elements, each candidate expression representing a fitted curve of the corresponding lane line element;
a confidence determining module 803, configured to determine the confidence corresponding to each candidate expression;
and a target expression determining module 804, configured to determine a target expression from the candidate expressions based on the confidences.
In one embodiment, the first element features include multiple pieces of anchor frame information; the candidate expression determining module 802 includes:
a candidate coefficient group determining submodule, configured to input the multiple pieces of anchor frame information into a coefficient prediction model to obtain a candidate coefficient group corresponding to each piece of anchor frame information;
and a candidate expression determining submodule, configured to determine a corresponding candidate expression from each candidate coefficient group.
In one embodiment, the confidence determination module 803 includes:
the offset determining submodule is used for determining the offset of each candidate coefficient contained in each candidate coefficient group;
and the confidence coefficient calculation submodule is used for calculating the confidence coefficient corresponding to each coefficient group to be selected by utilizing a cross entropy function based on the offset of each coefficient to be selected.
In one embodiment, the apparatus further comprises:
and the attribute description information determining module is used for determining attribute description information of the lane line element based on the target expression, wherein the attribute description information comprises at least one of color attribute, line type attribute and boundary attribute of the lane line element.
In one embodiment, the apparatus further comprises:
the second feature extraction module is used for extracting features of non-lane line features in the image to be detected to obtain second feature features;
and the attribute description information generation module is used for inputting the second element characteristics into the single-stage detector to obtain attribute description information of the non-lane line elements, wherein the attribute description information of the non-lane line elements comprises the type attributes and/or the position attributes of the lane line elements.
According to the embodiment of the disclosure, a model training device is also provided.
As shown in fig. 9, the model training apparatus includes:
a sample feature extraction module 901, configured to perform feature extraction on lane line elements in a sample image to obtain sample features;
a sample coefficient group determining module 902, configured to determine a sample coefficient group corresponding to a sample expression of a lane line element in a sample image;
a prediction coefficient group generation module 903, configured to input sample characteristics into a coefficient prediction model to be trained, so as to obtain a prediction coefficient group;
and the training module 904 is configured to determine a difference between the prediction coefficient set and the sample coefficient set, and train the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
The functions of each module, sub-module, or unit in the image processing apparatus and the model training apparatus in the embodiments of the present disclosure may refer to the corresponding description in the above method embodiments, and are not described herein again.
In the technical scheme of the disclosure, the acquisition, storage, application and the like of the personal information of the related user all accord with the regulations of related laws and regulations, and do not violate the good customs of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 10 illustrates a schematic block diagram of an example electronic device 1000 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 10, the device 1000 includes a computing unit 1001, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 1002 or loaded from a storage unit 1008 into a random access memory (RAM) 1003. The RAM 1003 can also store various programs and data necessary for the operation of the device 1000. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
A number of components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and a communication unit 1009 such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 1009 allows the device 1000 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (16)
1. An image processing method for high-precision maps, comprising:
performing feature extraction on an image to be detected to obtain element features, wherein the element features comprise first element features corresponding to lane line elements;
determining a plurality of candidate expressions corresponding to the lane line elements based on the first element features, wherein each candidate expression represents a fitted curve of the corresponding lane line element;
determining the confidence corresponding to each candidate expression;
and determining a target expression from the candidate expressions based on the confidences.
2. The method of claim 1, wherein the first element features include multiple pieces of anchor frame information;
determining a plurality of candidate expressions corresponding to the lane line elements based on the first element features includes:
inputting the multiple pieces of anchor frame information into a coefficient prediction model to obtain a candidate coefficient group corresponding to each piece of anchor frame information;
and determining a corresponding candidate expression from each candidate coefficient group.
3. The method of claim 2, wherein determining the confidence corresponding to each candidate expression includes:
determining the offset of each candidate coefficient included in each candidate coefficient group;
and calculating, based on the offsets of the candidate coefficients, the confidence corresponding to each candidate coefficient group using a cross-entropy function.
4. The method of claim 1, further comprising:
determining attribute description information of the lane line element based on the target expression, the attribute description information including at least one of a color attribute, a line type attribute, and a boundary attribute of the lane line element.
5. The method of claim 1, wherein the element features further comprise second element features corresponding to non-lane line elements, the method further comprising:
inputting the second element features into a single-stage detector to obtain attribute description information of the non-lane line elements, wherein the attribute description information of the non-lane line elements includes the type attributes and/or the position attributes of the non-lane line elements.
6. A model training method, comprising:
carrying out feature extraction on the lane line elements in the sample image to obtain sample features;
determining a sample coefficient group corresponding to a sample expression of a lane line element in the sample image;
inputting the sample characteristics into a coefficient prediction model to be trained to obtain a prediction coefficient group;
and determining the difference between the prediction coefficient set and the sample coefficient set, and training the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
7. An image processing apparatus comprising:
a feature extraction module, configured to perform feature extraction on an image to be detected to obtain element features, wherein the element features comprise first element features corresponding to lane line elements;
a candidate expression determining module, configured to determine, based on the first element features, a plurality of candidate expressions corresponding to the lane line elements, wherein each candidate expression represents a fitted curve of the corresponding lane line element;
a confidence determining module, configured to determine the confidence corresponding to each candidate expression;
and a target expression determining module, configured to determine a target expression from the candidate expressions based on the confidences.
8. The apparatus of claim 7, wherein the first element features comprise multiple pieces of anchor frame information, and the candidate expression determining module comprises:
a candidate coefficient group determining submodule, configured to input the multiple pieces of anchor frame information into a coefficient prediction model to obtain a candidate coefficient group corresponding to each piece of anchor frame information;
and a candidate expression determining submodule, configured to determine a corresponding candidate expression from each candidate coefficient group.
9. The apparatus of claim 8, wherein the confidence determination module comprises:
an offset determining submodule, configured to determine the offset of each candidate coefficient included in each candidate coefficient group;
and a confidence calculation submodule, configured to calculate, based on the offsets of the candidate coefficients, the confidence corresponding to each candidate coefficient group using a cross-entropy function.
10. The apparatus of claim 7, further comprising:
a first attribute description information generation module, configured to generate attribute description information of the lane line element based on the target expression, where the attribute description information includes at least one of a color attribute, a line type attribute, and a boundary attribute of the lane line element.
11. The apparatus of claim 7, wherein the feature extraction module is further configured to extract second element features corresponding to non-lane line elements, and the apparatus further comprises:
a second attribute description information generation module, configured to input the second element features into a single-stage detector to obtain the attribute description information of the non-lane line elements, wherein the attribute description information of the non-lane line elements comprises the type attributes and/or the position attributes of the non-lane line elements.
12. A model training apparatus comprising:
the sample feature extraction module is used for extracting features of the lane line elements in the sample image to obtain sample features;
the sample coefficient group determining module is used for determining a sample coefficient group corresponding to a sample expression of the lane line element in the sample image;
the prediction coefficient group generation module is used for inputting the sample characteristics into a coefficient prediction model to be trained to obtain a prediction coefficient group;
and the training module is used for determining the difference between the prediction coefficient group and the sample coefficient group, and training the coefficient prediction model to be trained according to the difference until the difference is within an allowable range.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1 to 6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 5.
16. An autonomous vehicle comprising an image processing apparatus according to any one of claims 7 to 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202111032344.7A | 2021-09-03 | 2021-09-03 | Image processing method, device, equipment and storage medium for high-precision map
Publications (2)
Publication Number | Publication Date
---|---
CN113706705A | 2021-11-26
CN113706705B | 2023-09-26
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114742935A (en) * | 2022-04-22 | 2022-07-12 | 阿波罗智联(北京)科技有限公司 | Method, apparatus, electronic device, and medium for processing map data |
CN116189145A (en) * | 2023-02-15 | 2023-05-30 | 清华大学 | Extraction method, system and readable medium of linear map elements |
CN116189145B (en) * | 2023-02-15 | 2024-06-11 | 清华大学 | Extraction method, system and readable medium of linear map elements |
Also Published As
Publication number | Publication date |
---|---|
CN113706705B (en) | 2023-09-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |