CN116399360A - Vehicle path planning method

Info

Publication number: CN116399360A
Application number: CN202310265297.3A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Prior art keywords: target, network, feature, features, path
Inventors: 贾凡, 汪天才, 李帅霖
Original assignee: Beijing Megvii Technology Co Ltd
Current assignee: Beijing Megvii Technology Co Ltd
Application filed by Beijing Megvii Technology Co Ltd; priority to CN202310265297.3A

Classifications

    • G01C21/3407 Route searching; route guidance specially adapted for specific applications
    • G01C21/3446 Details of route searching algorithms, e.g. Dijkstra, A*, arc-flags, using precalculated routes
    • G06N3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N3/08 Learning methods (computing arrangements based on biological models; neural networks)
    • G06V10/467 Encoded features or binary features, e.g. local binary patterns [LBP]
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • Y02T10/40 Engine management systems (climate change mitigation technologies related to transportation)

Abstract

The application discloses a vehicle path planning method, comprising the following steps: acquiring captured images; extracting a two-dimensional image feature from each captured image; inputting a plurality of initial features and each two-dimensional image feature into an encoding network of a prediction model, and performing feature encoding fusion through the encoding network to obtain a target feature corresponding to each initial feature; and inputting the first target feature, which takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects, together with the forward-direction information of the target vehicle, into a first path prediction network of the prediction model to obtain a movement path of the target vehicle. Through the feature encoding fusion operation, target features characterizing the positions of objects in the environment around the vehicle can be constructed, and the first target feature among them is used for path planning, which ensures the accuracy of the path planning result; and because two-dimensional image features are easy to extract, the amount of computation is reduced and the path planning efficiency is improved.

Description

Vehicle path planning method
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a vehicle path planning method, a computer readable storage medium, an electronic device, and a computer program product.
Background
Autonomous driving is a field in which artificial intelligence technology is widely applied, and vehicle path planning is a key link in realizing autonomous driving functions.
At present, a high-definition bird's-eye view (BEV) can be used as the input of a planning model: three-dimensional BEV features are extracted, and path planning is performed based on these BEV features.
However, BEV features are difficult to extract, and using them in path planning incurs an excessive computational load, which reduces path planning efficiency.
Disclosure of Invention
The embodiments of the present application provide a vehicle path planning method, a computer readable storage medium, an electronic device, and a computer program product, so as to reduce the amount of computation and improve path planning efficiency.
According to a first aspect of the present application, a vehicle path planning method is disclosed, comprising:
acquiring captured images of different viewing angles collected by image capture devices mounted on a target vehicle;
extracting a two-dimensional image feature from each captured image;
inputting a plurality of initial features and each two-dimensional image feature into an encoding network of a prediction model, and performing feature encoding fusion through the encoding network to obtain a target feature corresponding to each initial feature; the target features include: a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects;
inputting the first target feature and forward-direction information of the target vehicle into a first path prediction network of the prediction model, and performing path prediction for the target vehicle through the first path prediction network to obtain a movement path of the target vehicle.
According to a second aspect of the present application, an electronic device is disclosed, comprising: a memory, a processor, and a program stored on the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the vehicle path planning method of the first aspect.
According to a third aspect of the present application, a computer readable storage medium is disclosed, having stored thereon a program which, when executed by a processor, implements the steps of the vehicle path planning method of the first aspect.
According to a fourth aspect of the present application, a computer program product is disclosed, comprising a computer program which, when executed by a processor, implements the steps of the vehicle path planning method of the first aspect.
According to the embodiments of the present application, two-dimensional image features can be extracted from the captured images taken by image capture devices mounted on a target vehicle, and feature encoding fusion can be performed on preset initial features and the two-dimensional image features to obtain a target feature corresponding to each initial feature. The target features include a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects. By inputting the first target feature and the forward direction of the target vehicle into a path prediction network, a movement path along which the target vehicle moves while avoiding the surrounding objects can be obtained. In this way, target features characterizing the positions of objects in the environment around the vehicle are constructed from the initial features and the two-dimensional image features through the feature encoding fusion operation, and the first target feature among them is then used for path planning, which ensures the accuracy of the path planning result.
Drawings
FIG. 1 is a flow chart of a vehicle path planning method of some embodiments of the present application;
FIG. 2 is a flow chart of another vehicle path planning method of some embodiments of the present application;
FIG. 3 is a top view of a vehicle of some embodiments of the present application;
FIG. 4 is a schematic illustration of an implementation of a vehicle path planning method according to some embodiments of the present application;
FIG. 5 is a schematic illustration of an autopilot navigation interface of some embodiments of the present application;
FIG. 6 is a schematic structural view of a vehicle path planning apparatus of some embodiments of the present application;
fig. 7 is a block diagram of an electronic device of some embodiments of the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application may be more readily understood, a more particular description is given below with reference to specific embodiments illustrated in the appended drawings.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments and that the acts referred to are not necessarily required by the embodiments of the present application.
In recent years, artificial-intelligence-based research in computer vision, deep learning, machine learning, image processing, and image recognition has advanced significantly. Artificial intelligence (AI) is an emerging science and technology that studies and develops theories, methods, techniques, and application systems for simulating and extending human intelligence. AI is a comprehensive discipline involving many technical areas, such as chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning, and neural networks. Computer vision is an important branch of AI; specifically, it aims to enable machines to understand the visual world. Computer vision technologies typically include face recognition, vehicle path planning, fingerprint recognition and anti-counterfeiting verification, biometric feature recognition, face detection, pedestrian detection, object detection, pedestrian recognition, image processing, image recognition, image semantic understanding, image retrieval, text recognition, video processing, video content recognition, behavior recognition, three-dimensional reconstruction, virtual reality, augmented reality, simultaneous localization and mapping (SLAM), computational photography, and robot navigation and positioning. With continued research and progress, AI technology has been applied in many fields, such as security, city management, traffic management, building management, park management, face-based access control, face-based attendance, logistics management, warehouse management, robotics, intelligent marketing, computational photography, mobile-phone imaging, cloud services, smart homes, wearable devices, unmanned and autonomous driving, intelligent healthcare, face payment, face unlocking, fingerprint unlocking, identity verification, smart screens, smart televisions, cameras, the mobile Internet, live streaming, beauty and makeup applications, medical aesthetics, and intelligent temperature measurement.
The methods provided in the present application are mainly applied to vehicle path planning for autonomous driving. Autonomous driving technology aims to plan a movement path for a vehicle over a future time range according to the surrounding environment information collected by the vehicle, so as to realize automatic driving of the vehicle; the planned movement path must avoid the objects around the vehicle to achieve safe driving.
Of course, the foregoing is merely an exemplary list of possible application scenarios of the methods provided by the embodiments of the present application, and is not intended to limit those embodiments.
The embodiments of the present application provide a vehicle path planning method. Two-dimensional image features can be extracted from the captured images taken by image capture devices mounted on a target vehicle, and feature encoding fusion can be performed on preset initial features and the two-dimensional image features to obtain target features, including a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects. By inputting the first target feature and the forward direction of the target vehicle into a path prediction network, a movement path along which the target vehicle moves while avoiding the surrounding objects can be obtained. In this way, target features characterizing the positions of objects in the environment around the vehicle are constructed from the initial features and the two-dimensional image features through the feature encoding fusion operation, and the first target feature among them is then used for path planning, which ensures the accuracy of the path planning result.
Referring to fig. 1, a flowchart of a vehicle path planning method according to an embodiment of the present application is shown. As shown in fig. 1, the vehicle path planning method includes steps 101-104.
Step 101, acquiring captured images of different viewing angles collected by image capture devices mounted on a target vehicle.
In the embodiments of the present application, the captured images taken by the image capture devices provide the features of the vehicle's surrounding environment required for path planning. The image capture devices can be arranged on the vehicle, and the shooting direction of each device can be fixed toward a certain direction (the shooting direction can also be adjusted according to actual requirements) so as to capture images in that direction. Multiple image capture devices with different shooting directions can be arranged on the vehicle, so that captured images in every direction around the vehicle can be obtained.
Step 102, extracting the two-dimensional image feature of each captured image.
In the embodiments of the present application, the prediction model includes a feature extraction network that extracts the two-dimensional image feature corresponding to each captured image. The two-dimensional image features can characterize the environmental background and the objects in the captured images, which allows the prediction model to perceive the conditions in the vehicle's surrounding environment and supports the subsequent path planning.
Specifically, a two-dimensional feature extraction network (backbone) such as ResNet (a residual network) or Swin Transformer (a hierarchical feature extraction structure that extracts visual features at different levels from an image, making it well suited to tasks such as detection and segmentation) may be used to perform feature extraction on each captured image to obtain the two-dimensional image features. Compared with the complex extraction of BEV features in the related art, two-dimensional image features are easy to extract and require little computation, which greatly improves the response speed of autonomous driving.
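The application itself gives no code; purely as an illustration, the sketch below shows what this per-view two-dimensional feature extraction could look like with a ResNet-50 backbone from torchvision. The backbone choice, the 256-dimensional projection, and the six-view input layout are assumptions, not specifics of the application.

```python
# Minimal sketch of step 102, assuming a ResNet-50 backbone (torchvision).
# Input: captured images from N surround-view cameras; output: one 2D feature
# map per view. All shapes and the 256-d projection are illustrative.
import torch
import torch.nn as nn
import torchvision

class TwoDFeatureExtractor(nn.Module):
    def __init__(self, out_dim: int = 256):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        # Keep everything up to the last residual stage (drop pool/fc head).
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.proj = nn.Conv2d(2048, out_dim, kernel_size=1)  # match query dim

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # images: (B, N_views, 3, H, W) -> fold views into the batch dim
        b, n, c, h, w = images.shape
        feats = self.backbone(images.flatten(0, 1))  # (B*N, 2048, H/32, W/32)
        feats = self.proj(feats)                     # (B*N, 256, H/32, W/32)
        return feats.unflatten(0, (b, n))            # (B, N, 256, H/32, W/32)

# Example: six surround-view cameras, 224x224 crops
imgs = torch.randn(1, 6, 3, 224, 224)
print(TwoDFeatureExtractor()(imgs).shape)  # torch.Size([1, 6, 256, 7, 7])
```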
Step 103, inputting a plurality of initial features and the two-dimensional image features into the encoding network of the prediction model, and performing feature encoding fusion through the encoding network to obtain the target feature corresponding to each initial feature.
The target features include: a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects.
In the embodiments of the present application, in order to accurately predict the movement path of the target vehicle, the first path prediction network of the prediction model needs to receive information about the position of the target vehicle relative to the surrounding objects, so that it can subsequently use this information to plan a movement path along which the target vehicle moves while avoiding the surrounding objects.
Therefore, the encoding network of the prediction model can perform attention-based feature encoding fusion on the two-dimensional image features and the preset initial features to obtain target features that characterize the positions of the objects in the environment around the target vehicle. There are multiple target features, and each target feature can characterize the position of one object (the target vehicle or any object around it). Among the target features, the feature that takes the target vehicle as a reference and characterizes its position relative to the surrounding objects is the first target feature, which can be input into the first path prediction network to obtain the path plan of the target vehicle.
Specifically, feature encoding fusion works as follows: the positional feature of an object is defined through an initial feature (query) in a three-dimensional coordinate system; a position information encoder generates random position information, which is added to the initial feature (query); the initial feature with the added position information is input into the encoder; local features matching the position information are looked up in the two-dimensional image features; and the values of the initial feature (query) are updated with the retrieved local features to obtain the target feature (updated query), so that the target feature can characterize the position of an object in the environment where the target vehicle is located.
Step 104, inputting the first target feature and the forward-direction information of the target vehicle into the first path prediction network of the prediction model, and performing path prediction for the target vehicle through the first path prediction network to obtain the movement path of the target vehicle.
In the embodiments of the present application, since the first target feature takes the target vehicle as a reference and characterizes the position of the target vehicle relative to the surrounding objects, the first path prediction network can take the first target feature and the forward direction of the target vehicle as inputs and, for the movement path prediction task, predict a movement path along which the target vehicle moves while avoiding the surrounding objects. The forward direction of the target vehicle can be obtained from the current steering instruction of the target vehicle; for example, if the current steering instruction is a 30-degree right turn, the forward direction is a 30-degree right turn. The movement path is the path formed by the predicted position coordinates of the target vehicle at each moment within a preset future time range (e.g., 4 s).
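As an illustration of such a head, a minimal sketch is given below: the ego query (first target feature) is concatenated with an embedding of the driving command and mapped to waypoints. The two-layer MLP, the three command classes, and the 0.5 s prediction step are assumptions, since the application does not specify the internal structure of the first path prediction network.

```python
# Illustrative sketch of the first path prediction network (step 104):
# ego query (first target feature) + a forward-direction embedding -> waypoints.
# The MLP layout, 3 command classes, and 8 steps over 4 s are assumptions.
import torch
import torch.nn as nn

class EgoPathHead(nn.Module):
    def __init__(self, dim: int = 256, steps: int = 8, num_commands: int = 3):
        super().__init__()
        self.cmd_embed = nn.Embedding(num_commands, dim)  # e.g. left/straight/right
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, steps * 2),  # an (x, y) waypoint per future timestep
        )
        self.steps = steps

    def forward(self, ego_query: torch.Tensor, command: torch.Tensor) -> torch.Tensor:
        # ego_query: (B, dim); command: (B,) integer forward-direction code
        x = torch.cat([ego_query, self.cmd_embed(command)], dim=-1)
        return self.mlp(x).view(-1, self.steps, 2)  # (B, steps, 2)

head = EgoPathHead()
waypoints = head(torch.randn(1, 256), torch.tensor([2]))  # e.g. "turn right"
print(waypoints.shape)  # torch.Size([1, 8, 2]): one (x, y) every 0.5 s over 4 s
```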
After the movement path of the target vehicle is obtained, the autonomous driving function of the target vehicle can control the vehicle to travel along the movement path. In addition, the movement path can be displayed in the autonomous driving interface on the central control display of the target vehicle, so that the user can perceive the path planned by the autonomous driving system.
In summary, according to the vehicle path planning method provided by the embodiments of the present application, two-dimensional image features can be extracted from the captured images taken by image capture devices mounted on a target vehicle, and feature encoding fusion can be performed on preset initial features and the two-dimensional image features to obtain target features, including a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects. By inputting the first target feature and the forward-direction information of the target vehicle into a path prediction network, a movement path along which the target vehicle moves while avoiding the surrounding objects can be obtained. In this way, target features characterizing the positions of objects in the environment around the vehicle are constructed from the initial features and the two-dimensional image features through the feature encoding fusion operation, and the first target feature among them is then used for path planning, which ensures the accuracy of the path planning result.
The embodiments of the present application further provide another vehicle path planning method. Referring to fig. 2, this vehicle path planning method includes steps 201-210.
Step 201, acquiring captured images of different viewing angles collected by image capture devices mounted on a target vehicle.
This step may refer to step 101 and is not repeated here.
Optionally, the image capture devices include one or more of: a front camera that shoots the front direction of the target vehicle, a left-front camera that shoots the left-front direction, a right-front camera that shoots the right-front direction, a rear camera that shoots the rear direction, a right-rear camera that shoots the right-rear direction, and a left-rear camera that shoots the left-rear direction.
In the embodiments of the present application, referring to fig. 3, which shows a top view of a vehicle, in order to improve the target vehicle's perception of the surrounding environment, six image capture devices may be arranged on the target vehicle: a front camera 11 that shoots the front direction of the target vehicle, a left-front camera 12 that shoots the left-front direction, a right-front camera 13 that shoots the right-front direction, a rear camera 14 that shoots the rear direction, a right-rear camera 15 that shoots the right-rear direction, and a left-rear camera 16 that shoots the left-rear direction. As shown in fig. 3, the six cameras shoot in different directions, and their shooting angles combine into a surround view of the environment around the target vehicle. Captured images obtained with this layout can completely cover the features in the surrounding environment of the target vehicle, which improves the precision of the subsequent path planning.
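Purely for illustration, such a six-camera rig could be described by a configuration like the one below. The camera names and mounting yaw angles are assumed values resembling a typical surround-view rig (such as the one used in the nuScenes dataset mentioned later), not values given by the application.

```python
# Hypothetical six-camera surround-view rig for fig. 3. All yaw angles
# (degrees; 0 = straight ahead, positive = clockwise) are assumed values.
SURROUND_CAMERAS = [
    {"id": 11, "name": "front",       "yaw_deg": 0},
    {"id": 12, "name": "front_left",  "yaw_deg": -55},
    {"id": 13, "name": "front_right", "yaw_deg": 55},
    {"id": 14, "name": "rear",        "yaw_deg": 180},
    {"id": 15, "name": "rear_right",  "yaw_deg": 110},
    {"id": 16, "name": "rear_left",   "yaw_deg": -110},
]
```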
Step 202, extracting the two-dimensional image feature of each captured image.
This step may refer to step 102 and is not repeated here.
Step 203, obtaining the target feature corresponding to each initial feature through repeated iterative operations, according to a plurality of preset initial features, the encoding network and the two-dimensional image features.
In the embodiments of the present application, the feature encoding fusion operation can be performed repeatedly through multiple iterations of the encoding network. Each iteration performs one round of feature encoding fusion, and the output of each iteration serves as the input of the next; the final target features are obtained after multiple iterations. Iterating multiple times greatly improves the precision with which the target features characterize object positions, which effectively improves the precision of the subsequent path planning.
Preferably, the number of iterations may be 6; it can also be adjusted according to actual requirements.
Optionally, for each iteration, step 203 may specifically include:
Sub-step 2031, determining the initial features of the current iteration according to the target features output by the previous iteration, inputting the determined initial features of the current iteration and each two-dimensional image feature into the encoding network of the prediction model, and performing feature encoding fusion through the encoding network to obtain the target feature corresponding to each initial feature output by the current iteration.
The initial features of the first iteration are randomly generated features.
Referring to fig. 4, which shows a schematic diagram of an implementation of the vehicle path planning method, the implementation involves the feature extraction network, the encoding network, the first path prediction network, the second path prediction network and the three-dimensional object detection network of the prediction model.
The feature extraction network extracts features from the input captured images to obtain the two-dimensional image features.
The encoding network obtains, through multiple iterations, the target feature (updated query) corresponding to each of a plurality of preset initial features (queries), based on the initial features and the two-dimensional image features.
The first path prediction network takes the first target feature a' among the target features as input and obtains a movement path along which the target vehicle moves while avoiding the surrounding objects.
The second path prediction network takes the target features other than the first target feature a' as input and obtains the predicted paths of the objects around the target vehicle.
The three-dimensional object detection network takes the target features other than the first target feature a' as input and obtains the three-dimensional detection boxes of the objects around the target vehicle.
The target features output by the current iteration in sub-step 2031 can be input into the next iteration, where the encoding network again performs feature encoding fusion with the two-dimensional image features to obtain the target features output by that iteration. Through multiple iterations, the target features output by the last iteration characterize object positions more accurately, which effectively improves the precision of the subsequent path planning.
Optionally, sub-step 2031 may specifically include:
Sub-step A1, generating random three-dimensional coordinates using the position information encoder of the encoding network.
Sub-step A2, converting the three-dimensional coordinates into a feature vector, and fusing the feature vector with the target features output by the previous iteration to obtain the initial features of the current iteration.
In this embodiment, referring to fig. 4, for sub-steps A1-A2: in the first iteration, the initial features can be defined directly in three-dimensional space, that is, initial features (queries) are created in three-dimensional space. 901 initial features may be created (1 initial feature represents the target vehicle, and at least some of the remaining 900 initial features represent the objects around the target vehicle), forming a set of initial features (queries). Each initial feature (query) can be defined as a 256-dimensional feature vector (for the first iteration, all 256 dimensions are 0). To enable the initial feature to subsequently characterize the position of an object, each initial feature is assigned a learnable reference point, which represents the three values (x0, y0, z0) of one three-dimensional coordinate to be predicted. The position information encoder can therefore generate a 256-dimensional feature vector corresponding to a random three-dimensional coordinate and add it to the 256-dimensional feature vector that defines the initial feature, yielding the initial feature usable in the current iteration.
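A minimal sketch of this query initialization is shown below. Modeling the position information encoder as a small two-layer MLP over learnable reference points is an assumption (a common choice in query-based detectors), not a detail given by the application.

```python
# Illustrative sketch of sub-steps A1-A2: 901 queries, each a 256-d vector
# paired with a learnable 3D reference point (x0, y0, z0). Treating the
# position information encoder as a 2-layer MLP is an assumption.
from typing import Optional
import torch
import torch.nn as nn

NUM_QUERIES, DIM = 901, 256  # 1 ego query + 900 object queries

class QueryInitializer(nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable reference points in [0, 1]^3 (normalized 3D coordinates).
        self.ref_points = nn.Parameter(torch.rand(NUM_QUERIES, 3))
        self.pos_encoder = nn.Sequential(  # position information encoder
            nn.Linear(3, DIM), nn.ReLU(), nn.Linear(DIM, DIM),
        )

    def forward(self, prev_queries: Optional[torch.Tensor] = None) -> torch.Tensor:
        pos = self.pos_encoder(self.ref_points)   # (901, 256) position code
        if prev_queries is None:                  # first iteration: all zeros
            prev_queries = torch.zeros(NUM_QUERIES, DIM)
        return prev_queries + pos                 # initial features for this pass

init = QueryInitializer()
queries = init()         # first iteration
queries = init(queries)  # later iterations reuse the previous output
```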
Optionally, step 203 may specifically include:
Sub-step 2032, using each initial feature, obtaining from the two-dimensional image features a local feature matching the three-dimensional coordinate represented by that initial feature.
Sub-step 2033, updating the initial feature with the local feature, and performing attention-based encoding fusion on the updated initial feature to obtain the target feature corresponding to that initial feature.
Further, referring to fig. 4, through the position information encoding, each initial feature contains a random three-dimensional coordinate feature. The initial features can then be input into the encoding network: the two-dimensional image features extracted from the captured images serve as the key-value pairs, the initial features serve as the query keys, the local features matching the random three-dimensional coordinates in the initial features are looked up in the two-dimensional image features, and the values of the initial features are updated with the vector values of those local features. After the updated initial features pass through the attention computation, the target features (query') output by the first iteration are obtained, and these target features form a set (queries').
After the 901 initial features have each been queried and matched against the two-dimensional image features, 1 initial feature is matched to the target vehicle and forms the first target feature, which takes the target vehicle as a reference and characterizes the position of the target vehicle relative to the surrounding objects; among the remaining 900 initial features, n initial features are matched to the n objects around the target vehicle and form n second target features, each taking one object as a reference and characterizing the position of that object relative to other objects, where n is less than or equal to 900; initial features not matched to any object retain their original values.
In a non-first iteration, the position information encoder generates a 256-dimensional feature vector corresponding to a random three-dimensional coordinate, which is added to the 256-dimensional feature vector of the target feature output by the previous iteration to obtain the initial features usable in the current iteration. The initial features are then input into the encoding network, with the two-dimensional image features extracted from the captured images as key-value pairs and the initial features as query keys; the local features matching the random three-dimensional coordinates in the initial features are looked up in the two-dimensional image features, the values of the initial features are updated with the vector values of those local features, and the updated initial features pass through the attention computation to obtain the target features (query') output by the current iteration. After multiple iterations, the target features (query') output by the last iteration have higher precision and can be used for the subsequent prediction tasks.
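The sketch below illustrates one way a single such iteration could look as cross-attention between the queries and the flattened image features, repeated six times as described above. Using a single nn.MultiheadAttention layer plus a feed-forward block per pass is an assumption about the layer layout, not a detail given by the application.

```python
# Illustrative sketch of the iterative encoding fusion (sub-steps 2032-2033):
# queries attend to the flattened 2D image features as keys/values; each of
# the 6 passes refines the queries. One attention layer per pass is assumed.
import torch
import torch.nn as nn

class FusionLayer(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, queries: torch.Tensor, img_feats: torch.Tensor) -> torch.Tensor:
        # queries: (B, 901, 256); img_feats: (B, tokens, 256), i.e. the
        # per-view feature maps flattened into one token sequence.
        upd, _ = self.attn(queries, img_feats, img_feats)  # look up local features
        queries = self.norm1(queries + upd)                # update query values
        return self.norm2(queries + self.ffn(queries))

layers = nn.ModuleList(FusionLayer() for _ in range(6))    # 6 iterations
queries, img_feats = torch.randn(1, 901, 256), torch.randn(1, 6 * 49, 256)
for layer in layers:
    queries = layer(queries, img_feats)   # output of one pass feeds the next
```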
Step 204, inputting the first target feature and a control instruction representing the forward-direction information of the target vehicle into the first path prediction network to obtain the predicted position information of the target vehicle at each predicted moment within a future target duration.
In the embodiments of the present application, when predicting the movement path of the target vehicle, the first path prediction network needs to obtain the first target feature, which takes the target vehicle as a reference and characterizes the position of the target vehicle relative to the surrounding objects, and the forward direction of the target vehicle. The forward direction can be represented by a control instruction indicating the forward direction of the target vehicle; for example, if the current steering instruction of the vehicle is a 30-degree right turn, the forward direction of the target vehicle is a 30-degree right turn.
Based on the above information, the first path prediction network can output the position coordinates of the target vehicle at each predicted moment within a preset future time range (e.g., 4 s).
Step 205, arranging and combining all the predicted position information in time order to obtain the movement path of the target vehicle.
In the embodiments of the present application, the position coordinates of the target vehicle at each predicted moment within the preset future time range (e.g., 4 s), as output by the first path prediction network, can be arranged and combined in time order to obtain a path, and this path is the movement path of the target vehicle.
Optionally, the target features further include: second target features, each taking one object around the target vehicle as a reference and characterizing the position of that object relative to other objects. The method further includes:
Step 206, inputting all the second target features into the three-dimensional object detection network of the prediction model, and performing detection through the three-dimensional object detection network to obtain the three-dimensional detection boxes of the target vehicle and of the objects around the target vehicle.
In the embodiments of the present application, in addition to the first target feature, which takes the target vehicle as a reference and characterizes the position of the target vehicle relative to the surrounding objects, the target features may include second target features, each taking one object around the target vehicle as a reference and characterizing the position of that object relative to other objects. The first target feature can be used for the prediction task of predicting the movement path of the target vehicle, while the second target features can be used for the three-dimensional object detection task. The latter is implemented by the three-dimensional object detection network of the prediction model, which takes the second target features as input and outputs the three-dimensional detection boxes of the target vehicle and of the objects around it.
On top of providing accurate three-dimensional object localization for the autonomous driving task, the three-dimensional detection of the target vehicle and the surrounding objects provides richer reference information: the detected surrounding objects can be displayed in the autonomous driving interface on the central control display of the target vehicle, giving the user a more intuitive perception of the surrounding environment and also improving how well the target vehicle avoids the surrounding objects during autonomous driving.
Optionally, step 206 may specifically include:
Sub-step 2061, inputting all the second target features into the three-dimensional object detection network, and obtaining the three-dimensional coordinate offset, volume information and yaw angle of each object through the three-dimensional object detection network.
Sub-step 2062, constructing the three-dimensional detection box of each object based on the three-dimensional coordinate offset, volume information and yaw angle of that object.
The three-dimensional coordinate offset determines the center-point coordinate of the three-dimensional detection box, the volume information of the object determines the size of the box, and the yaw angle of the object determines the orientation of the box.
In this embodiment, for sub-steps 2061-2062: after the second target features are input into the three-dimensional object detection network, the network obtains the three-dimensional coordinate offset (Δx, Δy, Δz), the volume information (w, h, l) and the yaw angle θ corresponding to each second target feature. Combining the offset (Δx, Δy, Δz) with the three-dimensional coordinate (x0, y0, z0) represented by the second target feature gives the center-point coordinate (x0+Δx, y0+Δy, z0+Δz) of the three-dimensional detection box of the corresponding object; the volume information (w, h, l) determines the size of the box; and the yaw angle θ determines the orientation of the box.
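As a worked illustration of sub-step 2062 (the dataclass layout and the example values are assumptions made for illustration):

```python
# Illustrative construction of a 3D detection box from the detection head's
# outputs (sub-step 2062). The Box3D layout is an assumed representation.
from dataclasses import dataclass

@dataclass
class Box3D:
    cx: float; cy: float; cz: float   # center point (x0+dx, y0+dy, z0+dz)
    w: float; h: float; l: float      # volume information (size)
    yaw: float                        # orientation of the box

def build_box(ref, offset, size, yaw):
    x0, y0, z0 = ref        # reference point (x0, y0, z0) of the target feature
    dx, dy, dz = offset     # predicted three-dimensional coordinate offset
    w, h, l = size          # predicted volume information
    return Box3D(x0 + dx, y0 + dy, z0 + dz, w, h, l, yaw)

# Example: reference point (10, 2, 0), offset (0.3, -0.1, 0.2), car-sized box
print(build_box((10.0, 2.0, 0.0), (0.3, -0.1, 0.2), (1.9, 1.6, 4.5), 0.1))
```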
Step 207, inputting all the second target features into the second path prediction network of the prediction model, and predicting the movement paths of the objects around the target vehicle through the second path prediction network to obtain the predicted paths of those objects.
In the embodiments of the present application, the second target features, each taking one object around the target vehicle as a reference and characterizing the position of that object relative to other objects, can also be used for the path planning task for the objects around the target vehicle. This task is implemented by the second path prediction network of the prediction model, which takes the second target features as input and outputs the predicted paths of the objects around the target vehicle.
The path prediction function for the objects around the target vehicle, implemented by the second path prediction network, further enriches the information available to the autonomous driving task, thereby improving driving efficiency and safety. The predicted paths of the surrounding objects can be displayed in the autonomous driving interface on the central control display of the target vehicle, further improving the user's perception of how the objects in the surrounding environment are moving and further improving how well the target vehicle avoids the surrounding objects during autonomous driving.
It should be noted that, referring to fig. 4, the path planning function of the target vehicle implemented by the first path prediction network, the path planning function of other objects implemented by the second path prediction network, and the three-dimensional object detection function implemented by the three-dimensional object detection network are three functions that can run in parallel. According to actual requirements, all three functions can be enabled at the same time, or only one or two of them.
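Putting the pieces together, the sketch below shows one assumed end-to-end layout of fig. 4, with the three parallel heads sitting on the shared backbone and encoding network. It reuses the illustrative modules sketched above (TwoDFeatureExtractor, QueryInitializer, FusionLayer, EgoPathHead) and is an assumed composition, not the application's exact architecture.

```python
# Assumed end-to-end layout of fig. 4: shared backbone + encoding network,
# with three parallel heads. Reuses the illustrative classes defined above;
# the two linear heads for agent paths and detection are simplifications.
import torch
import torch.nn as nn

class PredictionModel(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.backbone = TwoDFeatureExtractor(dim)   # 2D image features
        self.init = QueryInitializer()              # 901 queries
        self.encoder = nn.ModuleList(FusionLayer(dim) for _ in range(6))
        self.ego_head = EgoPathHead(dim)            # first path prediction network
        self.agent_head = nn.Linear(dim, 8 * 2)     # second path prediction network
        self.det_head = nn.Linear(dim, 7)           # (dx, dy, dz, w, h, l, yaw)

    def forward(self, images: torch.Tensor, command: torch.Tensor):
        b = images.shape[0]
        feats = self.backbone(images)               # (B, N, C, h, w)
        tokens = feats.permute(0, 1, 3, 4, 2).reshape(b, -1, feats.shape[2])
        queries = self.init().unsqueeze(0).expand(b, -1, -1)
        for layer in self.encoder:                  # 6 fusion iterations
            queries = layer(queries, tokens)
        ego, agents = queries[:, 0], queries[:, 1:]  # first / second target features
        return (self.ego_head(ego, command),                 # ego movement path
                self.agent_head(agents).view(b, -1, 8, 2),   # surrounding paths
                self.det_head(agents))                       # 3D box parameters
```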
Optionally, the method further comprises:
Step 208, displaying, with the target vehicle as a reference, the movement path of the target vehicle, the three-dimensional detection boxes of the objects around the target vehicle, and the predicted paths of those objects in the in-vehicle navigation interface of the target vehicle.
Referring to fig. 5, which shows a schematic diagram of an autonomous driving navigation interface provided by an embodiment of the present application: when the path planning function of the target vehicle implemented by the first path prediction network, the path planning function of other objects implemented by the second path prediction network, and the three-dimensional object detection function implemented by the three-dimensional object detection network are enabled at the same time, the central control display of the target vehicle can display the interface shown in fig. 5, which includes: the three-dimensional detection box of the target vehicle 21 and its movement path over the future time range, the three-dimensional detection box of the surrounding vehicle 22 and its movement path over the future time range, and the three-dimensional detection box of the surrounding vehicle 23 and its movement path over the future time range, thereby realizing a more intelligent and efficient autonomous driving function.
Optionally, the method further comprises:
step 209, acquiring training data. The training data includes acquired raw image data.
And step 210, training network parameters of a feature extraction network, a coding network, a first path prediction network, a second path prediction network and a three-dimensional object detection network of the prediction model by using the training data to obtain a trained prediction model.
In the embodiment of the application, three prediction tasks based on parallel design can be: the prediction tasks of the first path prediction network, the prediction tasks of the second path prediction network and the prediction tasks of the three-dimensional object detection network are trained by utilizing training data, so that when each prediction task is trained, a loss function can be reversely transmitted to the feature extraction network and the coding network, parameters of the feature extraction network and the coding network are optimized, the overall precision of a prediction model is improved, in addition, the prediction tasks of the first path prediction network, the prediction tasks of the second path prediction network and the prediction tasks of the three-dimensional object detection network are mutually and parallelly operated, an end-to-end model architecture is formed, mutual interference among the three prediction tasks is avoided, and the prediction tasks of the first path prediction network can be used as a training core.
Optionally, step 210 may specifically include:
and 2101, inputting the training data into a feature extraction network to obtain training data features.
And step 2102, according to a plurality of preset initial training features and the training data features, performing feature coding fusion through the coding network to obtain target training features corresponding to each initial training feature.
Substep 2103 trains network parameters of the feature extraction network, the encoding network, the first path prediction network, the second path prediction network, and the three-dimensional object detection network using the target training features.
In this embodiment, for sub-steps 2101-2103, the training data is standard data, and the training data may be marked with a three-dimensional detection frame and a future travel path of the target vehicle, and may also be marked with a three-dimensional detection frame and a future travel path of an object around the target vehicle.
After the training data is input into the prediction model, firstly, the training data characteristics are obtained through the characteristic extraction network extraction, then, a plurality of preset initial training characteristics and the training data characteristics are subjected to characteristic coding fusion through the coding network, the target training characteristics corresponding to each initial training characteristic are obtained, and finally, the target training characteristics can be input into the first path prediction network, the second path prediction network and the three-dimensional object detection network through the input of the first path prediction network, the second path prediction network, so that the purposes of the training characteristic extraction network, the coding network, the first path prediction network, the second path prediction network and the network parameters of the three-dimensional object detection network are achieved.
Optionally, sub-step 2103 may specifically include:
Sub-step B1, inputting the first target training feature among the target training features into the first path prediction network, and training the network parameters of the feature extraction network, the encoding network and the first path prediction network according to the output value of the first path prediction network, the annotation values of the training data and a first loss function.
Sub-step B2, inputting the second target training features among the target training features into the second path prediction network, and training the network parameters of the feature extraction network, the encoding network and the second path prediction network according to the output value of the second path prediction network, the annotation values of the training data and a second loss function.
Sub-step B3, inputting the second target training features among the target training features into the three-dimensional object detection network, and training the network parameters of the feature extraction network, the encoding network and the three-dimensional object detection network according to the output value of the three-dimensional object detection network, the annotation values of the training data and a third loss function.
The first target training feature takes the current vehicle as a reference and characterizes the position of the current vehicle relative to the surrounding objects; each second target training feature takes one object around the current vehicle as a reference and characterizes the position of that object relative to other objects.
In this embodiment, for sub-steps B1-B3: when the first path prediction network is trained, the first target training feature, which takes the current vehicle as a reference and characterizes the position of the current vehicle relative to the surrounding objects, can be input into the first path prediction network; a loss value can be computed from the output value of the first path prediction network and the annotation value of the first target training feature in combination with the first loss function; and this loss value can be used to train the network parameters of the first path prediction network and, through back-propagation, the network parameters of the feature extraction network and the encoding network, thereby improving the precision of the prediction model.
In addition, the training tasks of the second path prediction network and the three-dimensional object detection network serve as auxiliary supervision that improves the overall training effect of the prediction model. Specifically, when the second path prediction network is trained, the second target training features, each taking one object around the current vehicle as a reference and characterizing the position of that object relative to other objects, can be input into the second path prediction network; a loss value can be computed from the output value of the second path prediction network and the annotation values of the second target training features in combination with the second loss function; and this loss value can be used to train the network parameters of the second path prediction network and, through back-propagation, the network parameters of the feature extraction network and the encoding network, so that the auxiliary supervision improves the accuracy of the feature extraction network and the encoding network.
When the three-dimensional object detection network is trained, the second target training features can be input into the three-dimensional object detection network; a loss value can be computed from the output value of the three-dimensional object detection network and the annotation values of the second target training features in combination with the third loss function; and this loss value can be used to train the network parameters of the three-dimensional object detection network and, through back-propagation, the network parameters of the feature extraction network and the encoding network, so that the auxiliary supervision improves the accuracy of the feature extraction network and the encoding network.
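A minimal sketch of one such joint training step is shown below. Summing the three losses with equal weights and using smooth-L1 terms are assumptions made only for illustration; the application does not specify the loss functions, and the matching between object queries and annotations is omitted.

```python
# Illustrative joint training step (sub-steps B1-B3): the losses of all three
# heads back-propagate into the shared backbone and encoding network. Equal
# weights and smooth-L1 terms are assumed; query-to-annotation matching is
# omitted for brevity.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, command,
               gt_ego_path, gt_agent_paths, gt_boxes):
    ego_path, agent_paths, boxes = model(images, command)  # three parallel heads
    loss_ego = F.smooth_l1_loss(ego_path, gt_ego_path)            # first loss
    loss_agents = F.smooth_l1_loss(agent_paths, gt_agent_paths)   # second loss
    loss_det = F.smooth_l1_loss(boxes, gt_boxes)                  # third loss
    loss = loss_ego + loss_agents + loss_det  # auxiliary supervision of shared parts
    optimizer.zero_grad()
    loss.backward()   # gradients reach backbone + encoding network from all heads
    optimizer.step()
    return loss.item()
```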
If only the first target training feature were input into the first path prediction network to train the prediction model, the training process would repeatedly explore within a small number of first target training features, making it easy to fall into a local optimum and requiring more time to reach the training target. Training experiments on the nuScenes dataset using the training scheme provided by the path planning method of the embodiments of the present application achieved better performance.
In summary, according to the vehicle path planning method provided by the embodiments of the present application, two-dimensional image features can be extracted from the captured images taken by image capture devices mounted on a target vehicle, and feature encoding fusion can be performed on preset initial features and the two-dimensional image features to obtain target features, including a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects. By inputting the first target feature and the forward direction of the target vehicle into a path prediction network, a movement path along which the target vehicle moves while avoiding the surrounding objects can be obtained. In this way, target features characterizing the positions of objects in the environment around the vehicle are constructed from the initial features and the two-dimensional image features through the feature encoding fusion operation, and the first target feature among them is then used for path planning, which ensures the accuracy of the path planning result.
Fig. 6 is a schematic structural diagram of a vehicle path planning apparatus according to some embodiments of the present application. As shown in fig. 6, the vehicle path planning apparatus may include: an acquisition module 301 and a processing module 302.
The acquisition module 301 is configured to acquire captured images of different viewing angles collected by image capture devices mounted on a target vehicle.
The processing module 302 is configured to: extract the two-dimensional image feature of each captured image; input a plurality of initial features and each two-dimensional image feature into the encoding network of a prediction model, and perform feature encoding fusion through the encoding network to obtain the target feature corresponding to each initial feature, the target features including a first target feature that takes the target vehicle as a reference and characterizes the position of the target vehicle relative to surrounding objects; and input the first target feature and the forward-direction information of the target vehicle into the first path prediction network of the prediction model, and perform path prediction for the target vehicle through the first path prediction network to obtain the movement path of the target vehicle.
Optionally, the target feature further comprises: the second target feature is used for representing the azimuth of any object around the target vehicle relative to other object objects by taking the object as a reference;
The processing module 302 is further configured to:
inputting all the second target features into a three-dimensional object detection network of the prediction model, and performing detection through the three-dimensional object detection network to obtain three-dimensional detection boxes of the target vehicle and of the objects around the target vehicle;
and inputting all the second target features into a second path prediction network of the prediction model, and predicting the moving paths of the objects around the target vehicle through the second path prediction network to obtain the predicted paths of those objects.
Optionally, the processing module 302 is further configured to:
and displaying, in an in-vehicle navigation interface of the target vehicle and with the target vehicle as the reference, the moving path of the target vehicle, the three-dimensional detection boxes of the objects around the target vehicle, and the predicted paths of the objects around the target vehicle.
Optionally, the step of inputting a plurality of initial features and each two-dimensional image feature into the coding network of the prediction model and performing feature-code fusion through the coding network of the prediction model to obtain the target feature corresponding to each initial feature is completed through a plurality of iterative operations; the processing module 302 is specifically configured to:
determine the initial features of the current iteration according to the target features output by the previous iteration, input the determined initial features of the current iteration and the two-dimensional image features into the coding network of the prediction model, and perform feature-code fusion through the coding network to obtain the target feature corresponding to each initial feature output by the current iteration;
wherein the initial features of the first iteration are randomly generated features (an illustrative sketch follows).
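A minimal sketch of this iterative operation, assuming a transformer-style coding network, placeholder feature dimensions, and three iterations; none of these choices are disclosed by the embodiment:

```python
import torch
import torch.nn as nn

dim = 128
# Assumed transformer-style coding network; the embodiment does not
# disclose the actual architecture.
coding_network = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)

def iterative_encode(image_features, num_iterations=3, num_queries=32):
    """Iteration 1 starts from randomly generated initial features; each
    later iteration derives its initial features from the target features
    output by the previous iteration."""
    b = image_features.shape[0]
    target = torch.randn(b, num_queries, dim)      # random initial features
    for _ in range(num_iterations):
        initial = target                           # seed from previous output
        target = coding_network(initial, image_features)
    return target

targets = iterative_encode(torch.randn(2, 1000, dim))
```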
Optionally, the processing module 302 is specifically configured to:
generating random three-dimensional coordinates by using a position information encoder of the coding network;
and converting the three-dimensional coordinates into feature vectors, and fusing the feature vectors with the target features output by the previous iteration to obtain the initial features of the current iteration (an illustrative sketch follows).
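The following sketch illustrates this step under the assumption that the position information encoder is a learned linear lift from coordinates to features and that the fusion is elementwise addition; neither detail is specified by the embodiment.

```python
import torch
import torch.nn as nn

dim = 128
# Assumed position information encoder: a learned lift from 3D coordinates
# to the feature dimension.
coord_to_feature = nn.Linear(3, dim)

def next_initial_features(prev_target_features):
    """Sample random 3D coordinates, convert them to feature vectors, and
    fuse them (here, by addition, an assumption) with the previous
    iteration's target features."""
    b, n, _ = prev_target_features.shape
    coords = torch.rand(b, n, 3)                 # random three-dimensional coordinates
    pos_features = coord_to_feature(coords)      # coordinates -> feature vectors
    return prev_target_features + pos_features   # fused initial features

init = next_initial_features(torch.randn(2, 32, dim))
```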
Optionally, the processing module 302 is specifically configured to:
using each initial feature to obtain, from the two-dimensional image features, the local features that match the three-dimensional coordinates represented by that initial feature;
and updating the initial feature with the local features, and performing attention-based code fusion on the updated initial feature to obtain the target feature corresponding to the initial feature (an illustrative sketch follows).
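One plausible realization, assuming the three-dimensional coordinates have already been projected to normalized image positions and that grid sampling retrieves the matching local features, is sketched below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 128
attention = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

def fuse_with_local_features(initial_features, positions, image_features):
    """positions: (B, N, 2) image locations in [-1, 1], standing in for the
    projected three-dimensional coordinates of each initial feature;
    image_features: (B, dim, H, W). Sample the matching local feature,
    update the query, then apply attention-based fusion."""
    grid = positions.unsqueeze(2)                            # (B, N, 1, 2)
    local = F.grid_sample(image_features, grid,
                          align_corners=False)               # (B, dim, N, 1)
    local = local.squeeze(-1).transpose(1, 2)                # (B, N, dim)
    updated = initial_features + local                       # update with local features
    target, _ = attention(updated, updated, updated)         # attention-based code fusion
    return target

out = fuse_with_local_features(torch.randn(2, 32, dim),
                               torch.rand(2, 32, 2) * 2 - 1,
                               torch.randn(2, dim, 56, 56))
```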
Optionally, the processing module 302 is specifically configured to:
inputting the first target feature and a control instruction representing the heading information of the target vehicle into the first path prediction network to obtain predicted position information for each prediction moment of the target vehicle within a future target duration;
and arranging all the predicted position information in chronological order to obtain the moving path of the target vehicle (an illustrative sketch follows).
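As a simple illustration of this assembly step (the per-moment output format of the first path prediction network is an assumption of this sketch):

```python
from dataclasses import dataclass

@dataclass
class Waypoint:
    t: float  # prediction moment, in seconds from now (assumed unit)
    x: float  # ego-frame position (assumed convention)
    y: float

def assemble_path(predictions):
    """Arrange per-moment position predictions in chronological order to
    form the moving path."""
    return [(p.x, p.y) for p in sorted(predictions, key=lambda p: p.t)]

path = assemble_path([Waypoint(1.0, 1.9, 0.1),
                      Waypoint(0.5, 1.0, 0.0),
                      Waypoint(1.5, 2.7, 0.3)])
print(path)  # [(1.0, 0.0), (1.9, 0.1), (2.7, 0.3)]
```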
Optionally, the processing module 302 is specifically configured to:
inputting all the second target features into the three-dimensional object detection network, and obtaining, through the three-dimensional object detection network, a three-dimensional coordinate offset, volume information, and a yaw angle for each object;
constructing the three-dimensional detection box of each object according to its three-dimensional coordinate offset, volume information, and yaw angle;
wherein the three-dimensional coordinate offset is used to determine the center-point coordinates of the three-dimensional detection box, the volume information of the object is used to determine the size of the three-dimensional detection box, and the yaw angle of the object is used to determine the heading of the three-dimensional detection box (an illustrative sketch follows).
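A sketch of this box construction, assuming an axis convention where the yaw angle rotates the box footprint in the ground plane; the conventions and units are assumptions, not values from the embodiment:

```python
import math

def build_detection_box(reference_point, offset, size, yaw):
    """Center = reference point + coordinate offset; dimensions come from
    the volume information; yaw rotates the footprint in the ground plane.
    All axis and unit conventions here are assumptions."""
    cx = reference_point[0] + offset[0]
    cy = reference_point[1] + offset[1]
    cz = reference_point[2] + offset[2]
    w, l, h = size
    corners = []  # footprint corners of the box, rotated by yaw
    for sx, sy in [(1, 1), (1, -1), (-1, -1), (-1, 1)]:
        dx, dy = sx * l / 2, sy * w / 2
        corners.append((cx + dx * math.cos(yaw) - dy * math.sin(yaw),
                        cy + dx * math.sin(yaw) + dy * math.cos(yaw)))
    return {"center": (cx, cy, cz), "size": (w, l, h),
            "yaw": yaw, "footprint": corners}

box = build_detection_box((0.0, 0.0, 0.0), (5.2, -1.1, 0.4),
                          (1.8, 4.5, 1.6), math.radians(15))
```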
Optionally, the processing module 302 is further configured to:
The prediction model is obtained through training in the following process:
acquiring training data; the training data comprises collected original image data;
inputting the training data into a feature extraction network to obtain training data features;
performing feature-code fusion through the coding network according to the plurality of initial training features and the training data features to obtain a target training feature corresponding to each initial training feature;
and training network parameters of the feature extraction network, the coding network, the first path prediction network, the second path prediction network and the three-dimensional object detection network by using the target training features.
Optionally, the processing module 302 is specifically configured to:
inputting a first target training feature among the target training features into the first path prediction network, and training the network parameters of the feature extraction network, the coding network, and the first path prediction network according to an output value of the first path prediction network, a labeled value of the training data, and a first loss function;
inputting a second target training feature among the target training features into the second path prediction network, and training the network parameters of the feature extraction network, the coding network, and the second path prediction network according to an output value of the second path prediction network, a labeled value of the training data, and a second loss function;
inputting a second target training feature among the target training features into the three-dimensional object detection network, and training the network parameters of the feature extraction network, the coding network, and the three-dimensional object detection network according to an output value of the three-dimensional object detection network, a labeled value of the training data, and a third loss function;
wherein the first target training feature represents, with the current vehicle as the reference, the azimuth of the current vehicle relative to surrounding objects, and the second target training feature represents, with any object around the current vehicle as the reference, the azimuth of that object relative to the other objects (an illustrative sketch follows).
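One common way to realize this three-branch training is a weighted sum of the three loss functions optimized jointly. The weights, the L1 form of each loss, and joint (rather than alternating) optimization are all assumptions of the sketch below.

```python
import torch
import torch.nn as nn

# Stand-in heads over shared target training features (dim = 128 assumed).
first_path_network = nn.Linear(128, 12)   # ego path head
second_path_network = nn.Linear(128, 12)  # surrounding-object path head
detection_network = nn.Linear(128, 7)     # 3D detection head

first_loss = nn.L1Loss()   # assumed forms of the first, second,
second_loss = nn.L1Loss()  # and third loss functions
third_loss = nn.L1Loss()

def joint_loss(first_feat, second_feat, ego_label, agent_label, box_label,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three branch losses; back-propagating it trains
    the heads and the shared networks upstream of the features."""
    return (weights[0] * first_loss(first_path_network(first_feat), ego_label)
            + weights[1] * second_loss(second_path_network(second_feat), agent_label)
            + weights[2] * third_loss(detection_network(second_feat), box_label))

loss = joint_loss(torch.randn(4, 128), torch.randn(4, 128),
                  torch.randn(4, 12), torch.randn(4, 12), torch.randn(4, 7))
```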
Optionally, the image acquisition device includes one or more of: a front camera for capturing the area directly ahead of the target vehicle, a front-left camera for capturing the front-left area of the target vehicle, a front-right camera for capturing the front-right area of the target vehicle, a rear camera for capturing the area directly behind the target vehicle, a rear-right camera for capturing the rear-right area of the target vehicle, and a rear-left camera for capturing the rear-left area of the target vehicle.
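For reference, a hypothetical configuration enumerating these six views; the names and yaw angles are illustrative assumptions, not values from the embodiment:

```python
# Hypothetical configuration enumerating the six views; the names and yaw
# angles are illustrative assumptions, not values from the embodiment.
CAMERAS = {
    "front":       {"yaw_deg":    0},
    "front_left":  {"yaw_deg":  -55},
    "front_right": {"yaw_deg":   55},
    "rear":        {"yaw_deg":  180},
    "rear_left":   {"yaw_deg": -110},
    "rear_right":  {"yaw_deg":  110},
}

def selected_views(names):
    """The method accepts any non-empty subset of the six views."""
    return {name: CAMERAS[name] for name in names}

views = selected_views(["front", "front_left", "front_right"])
```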
In summary, the vehicle path planning apparatus provided by the embodiments of the present application can extract two-dimensional image features from captured images taken by an image acquisition device mounted on the target vehicle, and can perform feature-code fusion on preset initial features and the two-dimensional image features to obtain target features, where the target features include a first target feature that represents, with the target vehicle as the reference, the azimuth of the target vehicle relative to surrounding objects. Inputting the first target feature and the heading of the target vehicle into the path prediction network then yields a moving path along which the target vehicle avoids the surrounding objects. In this way, the feature-code fusion operation constructs, from the initial features and the two-dimensional image features, target features that represent the azimuths of the objects in the environment around the vehicle, and path planning with the first target feature among these target features ensures the accuracy of the path planning result.
As for the apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively brief; for relevant details, reference is made to the description of the method embodiments.
In addition, referring to Fig. 7, an embodiment of the present application further provides an electronic device 700, which includes a processor 710, a memory 720, and a computer program stored in the memory 720 and executable on the processor 710. When executed by the processor 710, the computer program implements each process of the foregoing vehicle path planning method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements each process of the above vehicle path planning method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
An embodiment of the present application further provides a computer program product including a computer program that, when executed by a processor, implements each process of the vehicle path planning method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the embodiments may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all alterations and modifications that fall within the scope of the embodiments of the present application.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that comprises that element.
The vehicle path planning methods, apparatuses, electronic devices, and computer storage media provided herein have been described in detail above. Specific examples are used in this description to illustrate the principles and implementations of the present application, and the above embodiments are intended only to help in understanding the methods and core ideas of the present application. Meanwhile, those skilled in the art may make changes to the specific implementations and the scope of application in accordance with the ideas of the present application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (13)

1. A vehicle path planning method, comprising:
acquiring captured images of different perspectives collected by an image acquisition device mounted on a target vehicle;
respectively extracting the two-dimensional image features of each captured image;
inputting a plurality of initial features and each two-dimensional image feature into a coding network of a prediction model, and performing feature-code fusion through the coding network of the prediction model to obtain a target feature corresponding to each initial feature, the target features comprising: a first target feature that represents, with the target vehicle as the reference, positional information of the target vehicle relative to surrounding objects;
inputting the first target feature and the heading information of the target vehicle into a first path prediction network of the prediction model, and predicting the path of the target vehicle through the first path prediction network to obtain a moving path of the target vehicle.
2. The method of claim 1, wherein the target features further comprise: a second target feature that represents, with any object around the target vehicle as the reference, azimuth information of that object relative to the other objects;
The method further comprises the steps of:
inputting all the second target features into a three-dimensional object detection network of the prediction model, and performing detection through the three-dimensional object detection network to obtain three-dimensional detection boxes of the target vehicle and of the objects around the target vehicle;
and inputting all the second target features into a second path prediction network of the prediction model, and predicting the moving paths of the objects around the target vehicle through the second path prediction network to obtain predicted paths of those objects.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and displaying, in an in-vehicle navigation interface of the target vehicle and with the target vehicle as the reference, the moving path of the target vehicle, the three-dimensional detection boxes of the objects around the target vehicle, and the predicted paths of the objects around the target vehicle.
4. The method according to claim 1 or 2, wherein the step of inputting a plurality of initial features and each two-dimensional image feature into the coding network of the prediction model and performing feature-code fusion through the coding network of the prediction model to obtain the target feature corresponding to each initial feature is completed through a plurality of iterative operations;
Each of the iterative operations includes:
determining the initial features of the current iteration according to the target features output by the previous iteration, inputting the determined initial features of the current iteration and the two-dimensional image features into the coding network of the prediction model, and performing feature-code fusion through the coding network to obtain the target feature corresponding to each initial feature output by the current iteration;
wherein the initial features of the first iteration are randomly generated features.
5. The method of claim 4, wherein determining the initial features of the current iteration according to the target features output by the previous iteration comprises:
generating random three-dimensional coordinates by using a position information encoder of the coding network;
and converting the three-dimensional coordinates into feature vectors, and fusing the feature vectors with the target features output by the previous iteration to obtain the initial features of the current iteration.
6. The method according to claim 1 or 2, wherein the initial features represent three-dimensional coordinates of objects, and performing feature-code fusion through the coding network of the prediction model to obtain the target feature corresponding to each initial feature comprises:
obtaining, from the two-dimensional image features and for each initial feature, the local features that match the three-dimensional coordinates represented by that initial feature;
and updating the initial feature with the local features, and performing attention-based code fusion on the updated initial feature to obtain the target feature corresponding to the initial feature.
7. The method according to claim 1 or 2, wherein inputting the first target feature and the heading information of the target vehicle into the first path prediction network of the prediction model and predicting the path of the target vehicle through the first path prediction network to obtain the moving path of the target vehicle comprises:
inputting the first target feature and a control instruction representing the heading information of the target vehicle into the first path prediction network to obtain predicted position information for each prediction moment of the target vehicle within a future target duration;
and arranging all the predicted position information in chronological order to obtain the moving path of the target vehicle.
8. The method according to claim 2, wherein inputting all the second target features into the three-dimensional object detection network of the prediction model and performing detection through the three-dimensional object detection network to obtain the three-dimensional detection boxes of the target vehicle and of the objects around the target vehicle comprises:
inputting all the second target features into the three-dimensional object detection network, and obtaining, through the three-dimensional object detection network, a three-dimensional coordinate offset, volume information, and a yaw angle for each object;
constructing the three-dimensional detection box of each object according to its three-dimensional coordinate offset, volume information, and yaw angle;
wherein the three-dimensional coordinate offset is used to determine the center-point coordinates of the three-dimensional detection box, the volume information of the object is used to determine the size of the three-dimensional detection box, and the yaw angle of the object is used to determine the heading of the three-dimensional detection box.
9. The method according to claim 1 or 2, wherein the prediction model comprises a feature extraction network, a coding network, a first path prediction network, a second path prediction network, and a three-dimensional object detection network, the feature extraction network being used to extract the two-dimensional image features of the captured images;
the prediction model is obtained through training in the following process:
acquiring training data; the training data comprises collected original image data;
inputting the training data into a feature extraction network to obtain training data features;
performing feature-code fusion through the coding network according to the plurality of initial training features and the training data features to obtain a target training feature corresponding to each initial training feature;
and training network parameters of the feature extraction network, the coding network, the first path prediction network, the second path prediction network and the three-dimensional object detection network by using the target training features.
10. The method of claim 9, wherein the training network parameters of the feature extraction network, the encoding network, the first path prediction network, the second path prediction network, and the three-dimensional object detection network using the target training features comprises:
inputting a first target training feature among the target training features into the first path prediction network, and training the network parameters of the feature extraction network, the coding network, and the first path prediction network according to an output value of the first path prediction network, a labeled value of the training data, and a first loss function;
inputting a second target training feature among the target training features into the second path prediction network, and training the network parameters of the feature extraction network, the coding network, and the second path prediction network according to an output value of the second path prediction network, a labeled value of the training data, and a second loss function;
inputting a second target training feature among the target training features into the three-dimensional object detection network, and training the network parameters of the feature extraction network, the coding network, and the three-dimensional object detection network according to an output value of the three-dimensional object detection network, a labeled value of the training data, and a third loss function;
wherein the first target training feature represents, with the current vehicle as the reference, the azimuth of the current vehicle relative to surrounding objects, and the second target training feature represents, with any object around the current vehicle as the reference, the azimuth of that object relative to the other objects.
11. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the vehicle path planning method according to any one of claims 1 to 10.
12. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program implementing the steps of the vehicle path planning method of any one of claims 1 to 10 when executed by the processor.
13. A computer program product, characterized in that the computer program product stores a computer program which, when executed by a processor, implements the steps of the vehicle path planning method according to any one of claims 1 to 10.
CN202310265297.3A 2023-03-13 2023-03-13 Vehicle path planning method Pending CN116399360A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310265297.3A CN116399360A (en) 2023-03-13 2023-03-13 Vehicle path planning method


Publications (1)

Publication Number Publication Date
CN116399360A true CN116399360A (en) 2023-07-07

Family

ID=87008294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310265297.3A Pending CN116399360A (en) 2023-03-13 2023-03-13 Vehicle path planning method

Country Status (1)

Country Link
CN (1) CN116399360A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117494921A (en) * 2023-12-29 2024-02-02 湖南工商大学 Multi-target type path model solving method and device
CN117494921B (en) * 2023-12-29 2024-04-12 湖南工商大学 Multi-target type path model solving method and device

Similar Documents

Publication Publication Date Title
Yang et al. Fast depth prediction and obstacle avoidance on a monocular drone using probabilistic convolutional neural network
Chen et al. Brain-inspired cognitive model with attention for self-driving cars
Xu et al. Cobevt: Cooperative bird's eye view semantic segmentation with sparse transformers
Yuan et al. RGGNet: Tolerance aware LiDAR-camera online calibration with geometric deep learning and generative model
CN109214986A (en) High-resolution 3-D point cloud is generated from the low resolution LIDAR 3-D point cloud and camera review of down-sampling
CN109214987A (en) High-resolution 3-D point cloud is generated from the low resolution LIDAR 3-D point cloud and camera review of up-sampling
CN109215067A (en) High-resolution 3-D point cloud is generated based on CNN and CRF model
Devo et al. Deep reinforcement learning for instruction following visual navigation in 3D maze-like environments
CN112489119B (en) Monocular vision positioning method for enhancing reliability
CN113569627B (en) Human body posture prediction model training method, human body posture prediction method and device
CN112257645B (en) Method and device for positioning key points of face, storage medium and electronic device
Hu et al. How simulation helps autonomous driving: A survey of sim2real, digital twins, and parallel intelligence
Luo et al. Depth and video segmentation based visual attention for embodied question answering
CN114494395A (en) Depth map generation method, device and equipment based on plane prior and storage medium
Lange et al. Lopr: Latent occupancy prediction using generative models
CN111008622B (en) Image object detection method and device and computer readable storage medium
Ahn et al. Visually grounding language instruction for history-dependent manipulation
CN115641581A (en) Target detection method, electronic device, and storage medium
CN116343191A (en) Three-dimensional object detection method, electronic device and storage medium
Kang et al. ETLi: Efficiently annotated traffic LiDAR dataset using incremental and suggestive annotation
CN114119757A (en) Image processing method, apparatus, device, medium, and computer program product
Nguyen et al. Uncertainty-aware visually-attentive navigation using deep neural networks
Jin et al. TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
CN112862840A (en) Image segmentation method, apparatus, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination