CN116430404A - Method and device for determining relative position, storage medium and electronic device - Google Patents

Method and device for determining relative position, storage medium and electronic device

Info

Publication number
CN116430404A
CN116430404A (application CN202310401378.1A)
Authority
CN
China
Prior art keywords
information
relative position
lane line
tsr
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310401378.1A
Other languages
Chinese (zh)
Inventor
张川峰
林乾浩
陈锦明
杨烨
梁恒恒
李海珠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foss Hangzhou Intelligent Technology Co Ltd
Original Assignee
Foss Hangzhou Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foss Hangzhou Intelligent Technology Co Ltd filed Critical Foss Hangzhou Intelligent Technology Co Ltd
Priority to CN202310401378.1A
Publication of CN116430404A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00: Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86: Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G01S17/02: Systems using the reflection of electromagnetic waves other than radio waves
    • G01S17/06: Systems determining position data of a target

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application discloses a method and a device for determining a relative position, a storage medium and an electronic device. The method comprises: performing semantic recognition on a first point cloud sensed at the current time by a laser radar on a target vehicle, to obtain first lane line information and first traffic sign recognition (TSR) information; acquiring second lane line information and second TSR information output at the current time by a semantic camera on the target vehicle; and determining, according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, the target relative position corresponding to the laser radar and the semantic camera at the current time. The technical scheme solves problems in the related art, such as the low accuracy of matching between objects identified by a semantic camera and objects identified by a laser radar, and achieves the technical effect of improving that matching accuracy.

Description

Method and device for determining relative position, storage medium and electronic device
Technical Field
The present invention relates to the field of automobiles, and in particular, to a method and apparatus for determining a relative position, a storage medium, and an electronic apparatus.
Background
Unmanned vehicles typically perceive objects on the road through on-board sensors such as cameras, laser radars and millimeter-wave radars. A camera can identify the type of a scanned object well, but cannot obtain information such as its distance and speed; a laser radar can obtain the position and speed of a scanned object in real time, but cannot identify its type well. Multi-sensor perception fusion is therefore a popular direction in current perception: by combining the characteristics of each sensor, the position, speed and type of a scanned object can be identified more reliably. To better combine the characteristics of the camera and the laser radar, the relative positions of the same objects identified by the laser radar and the semantic camera usually need to be jointly calibrated, so that the type, position and speed of a scanned object can be acquired accurately.
In the prior art, when the point cloud scanned by the camera is matched in real time with the point cloud scanned by the laser radar, static calibration based on a calibration plate is usually performed, which requires the assistance of external calibration objects. In addition, semantic information is often difficult to extract from the laser radar, which results in poor calibration accuracy.
For the problems in the related art, such as the low accuracy of matching between an object identified by a semantic camera and an object identified by a laser radar, no effective solution has yet been proposed.
Disclosure of Invention
The embodiments of the present application provide a method and a device for determining a relative position, a storage medium and an electronic device, so as to at least solve problems in the related art such as the low accuracy of matching between an object identified by a semantic camera and an object identified by a laser radar.
According to an embodiment of the present application, there is provided a method for determining a relative position, including: semantic recognition is carried out on a first point cloud sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information, wherein the first lane line information is used for identifying lane lines recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying traffic identification objects recognized by the laser radar at the current time in the space where the target vehicle is located; acquiring second lane line information and second TSR information output by a semantic camera on the target vehicle at the current time, wherein the second lane line information is used for identifying a lane line identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying a traffic identification object identified by the semantic camera at the current time in the space where the target vehicle is located; and determining the target relative positions corresponding to the laser radar and the semantic camera in the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, wherein the target relative positions are used for representing the relative positions between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object comprises a lane line or a traffic sign object.
Optionally, the semantic recognition of the first point cloud sensed by the laser radar on the target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information includes: dividing each point in the first point cloud into a first candidate point cloud and a second candidate point cloud, wherein the position of each point in the first candidate point cloud under a first coordinate system where the laser radar is located is in the same plane with the target vehicle, and the position of each point in the second candidate point cloud under the first coordinate system is in a different plane with the target vehicle; and acquiring the first lane line information according to the first candidate point cloud, and acquiring the first TSR information according to the second candidate point cloud.
Optionally, the acquiring the first lane line information according to the first candidate point cloud includes: obtaining the reflection intensity of each point in the first candidate point cloud, wherein the reflection intensity of each point is carried in the first point cloud; determining points meeting preset matching conditions between the reflection intensity and a preset reflection intensity threshold value in the first candidate point cloud to obtain a second point cloud; fitting the second point cloud to obtain the first lane line information.
Optionally, the obtaining the first TSR information according to the second candidate point cloud includes: when the second candidate point cloud comprises N points and N is equal to 2, one point of the N points is used as a circle center, a first reference point with the distance smaller than or equal to a target radius between the second candidate point cloud and the one point is determined in the second candidate point cloud, and a third point cloud is obtained, wherein the first reference point and the one point are used for identifying a target traffic identification object of a target type; fitting the third point cloud to obtain the first TSR information; and/or under the condition that the second candidate point cloud comprises the N points and N is a positive integer greater than 2, acquiring TSR information corresponding to the ith point by executing the following steps, wherein the first TSR information comprises TSR information corresponding to each point in the N points: determining a fourth point cloud containing the ith point in the second candidate point cloud, wherein the distance between each point in the fourth point cloud and at least one point in the fourth point cloud is smaller than or equal to the target radius, and each point in the fourth point cloud is used for identifying a target traffic identification object of the target type; fitting the fourth point cloud to obtain TSR information corresponding to the ith point, wherein i is a positive integer which is greater than or equal to 1 and less than or equal to N.
Optionally, the determining, according to the first lane line information, the first TSR information, the second lane line information, and the second TSR information, the target relative position corresponding to the laser radar and the semantic camera at the current time includes: determining a first preliminary relative position corresponding to the laser radar and the semantic camera in the current time according to the first lane line information and the second lane line information, wherein the first preliminary relative position is used for representing the preliminary relative position of the target object identified by the laser radar and the target object identified by the semantic camera on a horizontal plane, and the horizontal plane is a plane where the semantic camera is located; and according to the first TSR information and the second TSR information, the first preliminary relative position is adjusted to obtain a first relative position corresponding to the laser radar and the semantic camera in the current time, and according to the first TSR information and the second TSR information, a second relative position corresponding to the laser radar and the semantic camera in the current time is determined, wherein the target relative position comprises the first relative position and the second relative position, the first relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the horizontal plane, the second relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the vertical plane, and the vertical plane is a plane perpendicular to the horizontal plane.
Optionally, the determining, according to the first lane line information and the second lane line information, a first preliminary relative position corresponding to the laser radar and the semantic camera at the current time includes: starting from a preset first initial relative position, adjusting a first current relative position until a first target condition is met, and determining the first current relative position when the first target condition is met as the first preliminary relative position; wherein the first target condition includes: the sum of the loss function values of M points is smaller than or equal to a preset first target threshold, wherein the M points are M points of the lane line recognized by the laser radar in the space where the target vehicle is located, and the M points are obtained by projecting the first lane line information to a second coordinate system where the semantic camera is located according to the first current relative position; wherein the loss function values of the M points are obtained by: inputting a first projection position of a jth point in the M points under the second coordinate system and a first reference position of the jth point under the second coordinate system into a preset first loss function to obtain a loss function value of the jth point, wherein the loss function value of the jth point is used for representing the position difference between the projection position of the jth point under the second coordinate system and the reference position of the jth point under the second coordinate system, M is a positive integer greater than or equal to 1, and j is a positive integer greater than or equal to 1 and less than or equal to M.
Optionally, the adjusting the first preliminary relative position according to the first TSR information and the second TSR information to obtain a first relative position corresponding to the lidar and the semantic camera at the current time includes: starting from the first preliminary relative position, adjusting the first current relative position until a second target condition is met, and determining the first current relative position when the second target condition is met as the first relative position; wherein the second target condition includes: the sum of the loss function values of Q points is smaller than or equal to the second target threshold, wherein the Q points are Q points of a traffic recognition object recognized by the laser radar in the space where the target vehicle is located, and the Q points are obtained by projecting the first TSR information to the second coordinate system according to the first current relative position; wherein the loss function values of the Q points are obtained by: and inputting a second projection position of a p-th point in the Q points under the second coordinate system and a second reference position of the p-th point under the second coordinate system into a preset second loss function to obtain a loss function value of the p-th point, wherein the loss function value of the p-th point is used for representing the position difference between the projection position of the p-th point under the second coordinate system and the reference position of the p-th point under the second coordinate system, Q is a positive integer greater than or equal to 1, and p is a positive integer greater than or equal to 1 and less than or equal to Q.
Optionally, the adjusting, according to the first TSR information and the second TSR information, a second initial relative position between the laser radar and the semantic camera to obtain the second relative position includes: starting from a preset second initial relative position, adjusting a second current relative position until a third target condition is met, and determining the second current relative position when the third target condition is met as the second relative position; wherein the third target condition includes: the sum of loss function values of L points is smaller than or equal to a third target threshold, wherein the L points are L points of a traffic recognition object recognized by the laser radar in a space where the target vehicle is located, and the L points are obtained by projecting the first TSR information to the second coordinate system according to the second current relative position; wherein the loss function values of the L points are obtained by the following steps: and inputting a second projection position of a q-th point in the L points under the second coordinate system and a second reference position of the q-th point under the second coordinate system into a preset third loss function to obtain a loss function value of the q-th point, wherein the loss function value of the q-th point is used for representing the position difference between the projection position of the q-th point under the second coordinate system and the reference position of the q-th point under the second coordinate system, L is a positive integer greater than or equal to 1, and q is a positive integer greater than or equal to 1 and less than or equal to L.
Optionally, after determining the relative positions of the targets corresponding to the laser radar and the semantic camera at the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, the method further includes: performing semantic recognition on a fifth point cloud sensed by the laser radar on the target vehicle to obtain fourth lane line information and fourth TSR information, wherein the fourth lane line information is used for identifying a lane line identified by the laser radar in the space where the target vehicle is located at the next time of the current time, and the fourth TSR information is used for identifying a traffic sign object identified by the laser radar in the space where the target vehicle is located at the next time; obtaining fifth lane line information and fifth TSR information output by the semantic camera on the target vehicle, wherein the fifth lane line information is used for identifying a lane line identified by the semantic camera at the next time in a space where the target vehicle is located, and the fifth TSR information is used for identifying a traffic identification object identified by the semantic camera at the next time in the space where the target vehicle is located; and determining the next relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information and the first TSR information.
Optionally, the determining the next relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information, and the first TSR information includes: performing union operation on the point of the traffic recognition object in the first TSR information and the point of the traffic recognition object in the fourth TSR information to obtain sixth TSR information; determining a second preliminary relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information and the fifth lane line information; and adjusting the second preliminary relative position according to the fifth TSR information and the sixth TSR information to obtain a third relative position corresponding to the laser radar and the semantic camera in the next time, and determining a fourth relative position corresponding to the laser radar and the semantic camera in the next time according to the fifth TSR information and the sixth TSR information, wherein the next relative position comprises the third relative position and the fourth relative position.
According to another embodiment of the present application, there is also provided a device for determining a relative position, including: a first identification module, configured to perform semantic recognition on a first point cloud sensed at the current time by a laser radar on a target vehicle, to obtain first lane line information and first traffic sign recognition (TSR) information, wherein the first lane line information is used for identifying a lane line identified by the laser radar at the current time in the space where the target vehicle is located, and the first TSR information is used for identifying a traffic sign object identified by the laser radar at the current time in the space where the target vehicle is located; a first acquisition module, configured to acquire second lane line information and second TSR information output at the current time by a semantic camera on the target vehicle, wherein the second lane line information is used for identifying a lane line identified by the semantic camera at the current time in the space where the target vehicle is located, and the second TSR information is used for identifying a traffic sign object identified by the semantic camera at the current time in the space where the target vehicle is located; and a first determining module, configured to determine, according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, a target relative position corresponding to the laser radar and the semantic camera at the current time, wherein the target relative position is used for representing the relative position between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object includes a lane line or a traffic sign object.
According to a further aspect of embodiments of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-described method of determining relative position when run.
According to still another aspect of the embodiments of the present application, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the above-mentioned method for determining a relative position by using the computer program.
In the embodiments of the present application, the relative position between the same lane line or the same traffic sign object identified by the laser radar and by the semantic camera can be determined according to the lane line information and the TSR information identified by the laser radar together with the lane line information and the TSR information output by the semantic camera. The point cloud scanned by the laser radar better reflects the position and distance of an identified object, while the information output by the semantic camera better reflects its type; in this way, the position, distance and other attributes of the objects scanned by the laser radar are accurately matched to the objects captured by the semantic camera. This solves problems in the related art such as the low accuracy of matching between an object identified by a semantic camera and an object identified by a laser radar, and achieves the technical effect of improving that matching accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a hardware block diagram of a mobile terminal according to a method for determining a relative position according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of determining relative position according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a first point cloud according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a second point cloud according to an embodiment of the present application;
FIG. 5 is a schematic illustration of a fitted lane line according to an embodiment of the present application;
FIG. 6 is a schematic diagram of extracting TSR information and lane line information according to an embodiment of the present application;
FIG. 7 is a flow chart of a point cloud stitching local map according to lidar output according to an embodiment of the present application;
FIG. 8 is a schematic diagram of semantic information matching of a laser radar and semantic information of a semantic camera according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a loss function according to an embodiment of the present application;
FIG. 10 is a schematic diagram of matching semantic information of a lidar and semantic information of a semantic camera according to an embodiment of the present application;
FIG. 11 is a flow chart of a dynamic joint calibration algorithm according to an embodiment of the present application;
FIG. 12 is a schematic illustration of a semantic registration according to an embodiment of the present application;
fig. 13 is a block diagram of a relative position determining apparatus according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will be made in detail and with reference to the accompanying drawings in the embodiments of the present application, it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method embodiments provided in the embodiments of the present application may be performed on a computer terminal, a device terminal, or a similar computing apparatus. Taking a computer terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal for a method of determining a relative position according to an embodiment of the present application. As shown in fig. 1, the mobile terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a microprocessor (MCU), a programmable logic device (FPGA) or another processing device) and a memory 104 for storing data. In an exemplary embodiment, the computer terminal may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and does not limit the configuration of the computer terminal described above; for example, the computer terminal may include more or fewer components than shown in fig. 1, or have a configuration with functions equivalent to or different from those shown in fig. 1.
The memory 104 may be used to store computer programs, such as software programs of application software and modules, such as computer programs corresponding to the method for determining relative positions in the embodiments of the present invention, and the processor 102 executes the computer programs stored in the memory 104 to perform various functional applications and data processing, i.e., to implement the above-described methods. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
In this embodiment, a method for determining a relative position is provided and applied to the computer terminal, and fig. 2 is a flowchart of a method for determining a relative position according to an embodiment of the present application, as shown in fig. 2, where the flowchart includes the following steps:
step S202, carrying out semantic recognition on first point clouds sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information, wherein the first lane line information is used for identifying lane lines recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying traffic sign objects recognized by the laser radar at the current time in the space where the target vehicle is located;
step S204, second lane line information and second TSR information which are output by a semantic camera on the target vehicle at the current time are obtained, wherein the second lane line information is used for identifying lane lines which are identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying traffic identification objects which are identified by the semantic camera at the current time in the space where the target vehicle is located;
Step S206, determining a target relative position corresponding to the laser radar and the semantic camera at the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, where the target relative position is used to represent a relative position between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object includes a lane line or a traffic sign object.
Through the above steps, the relative position between the same lane line or the same traffic sign object identified by the laser radar and by the semantic camera can be determined according to the lane line information and the TSR information identified by the laser radar together with the lane line information and the TSR information output by the semantic camera. The point cloud scanned by the laser radar better reflects the position and distance of an identified object, while the information output by the semantic camera better reflects its type; in this way, the position, distance and other attributes of the objects scanned by the laser radar are accurately matched to the objects captured by the semantic camera. This solves problems in the related art such as the low accuracy of matching between an object identified by a semantic camera and an object identified by a laser radar, and achieves the technical effect of improving that matching accuracy.
In the technical solution provided in step S202, the laser radar on the target vehicle may include, but is not limited to, different types of laser radar, such as a low-beam-count mechanical laser radar. Semantic recognition may be, but is not limited to being, performed on the first point cloud sensed at the current time by the laser radar on the target vehicle, to obtain the first lane line information and the first traffic sign recognition system (TSR, Traffic Sign Recognition) information.
Optionally, in this embodiment, the first TSR information may be, but is not limited to, used for identifying a traffic sign object identified by the laser radar at the current time in the space where the target vehicle is located, such as a traffic signal device (e.g., a traffic light), a traffic sign, a street lamp, and the like.
Optionally, in this embodiment, the first lane line information may include, but is not limited to, the type of lane line identified by the laser radar at the current time in the space where the target vehicle is located, such as a stop line, a lane line of the lane in which the target vehicle is traveling, or the permitted travel manner of the lane in which the target vehicle is traveling.
Fig. 3 is a schematic diagram of a first point cloud according to an embodiment of the present application. As shown in fig. 3, the point cloud sensed by the laser radar may include, but is not limited to, a point cloud for identifying the lane line of the road on which the target vehicle is located and a point cloud for identifying the first TSR information. The point cloud for identifying the first TSR information may include, but is not limited to, a point cloud for identifying street lamp 1 and a point cloud for identifying street lamp 2; the point cloud for identifying street lamp 1 may include, but is not limited to, points a, b, c and d, and the point cloud for identifying street lamp 2 may include, but is not limited to, points e, f and g.
In one exemplary embodiment, the first lane line information and the first traffic sign recognition system TSR information may be obtained, but are not limited to, by: dividing each point in the first point cloud into a first candidate point cloud and a second candidate point cloud, wherein the position of each point in the first candidate point cloud under a first coordinate system where the laser radar is located is in the same plane with the target vehicle, and the position of each point in the second candidate point cloud under the first coordinate system is in a different plane with the target vehicle; and acquiring the first lane line information according to the first candidate point cloud, and acquiring the first TSR information according to the second candidate point cloud.
Alternatively, in the present embodiment, each point in the first point cloud identified by the laser radar may be, but is not limited to, divided into a first candidate point cloud, which may be, but is not limited to, a ground point cloud (may be, but is not limited to, including a point cloud for identifying a lane line), and a second candidate point cloud, which may be, but is not limited to, a non-ground point cloud (may be, but is not limited to, including a point cloud for identifying a traffic identification object), first lane line information is acquired according to the first candidate point cloud, and first TSR information is acquired according to the second candidate point cloud.
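For illustration only, the following Python sketch shows one possible way to split the first point cloud into the first candidate (ground) point cloud and the second candidate (non-ground) point cloud by RANSAC plane fitting; the array layout [x, y, z, intensity], the thresholds and the function name are assumptions, not the embodiment's implementation.

```python
import numpy as np

def split_ground_nonground(points, n_iters=200, dist_thresh=0.15, seed=0):
    """Split an (N, 4) lidar point array [x, y, z, intensity] into a ground
    (first candidate) and a non-ground (second candidate) point cloud by
    RANSAC plane fitting; thresholds and layout are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    xyz = points[:, :3]
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = xyz[rng.choice(len(xyz), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                      # degenerate (collinear) sample
        normal /= norm
        d = -normal @ sample[0]
        inliers = np.abs(xyz @ normal + d) < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers        # keep the plane with the most support
    return points[best_inliers], points[~best_inliers]   # ground, non-ground
```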
In one exemplary embodiment, the first lane line information may be acquired from a first candidate point cloud by, but is not limited to, the following: obtaining the reflection intensity of each point in the first candidate point cloud, wherein the reflection intensity of each point is carried in the first point cloud; determining points meeting preset matching conditions between the reflection intensity and a preset reflection intensity threshold value in the first candidate point cloud to obtain a second point cloud; fitting the second point cloud to obtain the first lane line information.
Optionally, in this embodiment, the points whose reflection intensity satisfies the preset matching condition with the preset reflection intensity threshold may be, but are not limited to being, determined in the first candidate point cloud. Fig. 4 is a schematic diagram of a second point cloud according to an embodiment of the present application. As shown in fig. 4, the second point cloud for identifying the lane line may be obtained by extracting, from the first candidate point cloud, the points for which the difference between their reflection intensity and the reflection intensity of the asphalt pavement (or of another road surface on which the target vehicle travels; this application is not limited thereto) is less than or equal to the reflection intensity threshold.
Optionally, in this embodiment, when the second point cloud is obtained, a lane line may be, but is not limited to being, fitted from the second point cloud. Fig. 5 is a schematic diagram of a fitted lane line according to an embodiment of the present application. As shown in fig. 5, the lane line identified by the second point cloud may be, but is not limited to being, fitted by a RANSAC algorithm (or another algorithm; this application is not limited thereto), and the lane line semantic information (i.e., the first lane line information) is then output, where the lane line semantic information carries the position of the points representing the lane line relative to the target vehicle (for example, to the left of the target vehicle or to the right of the target vehicle).
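For illustration, a minimal Python sketch of the intensity-based selection and a RANSAC-style line fit is given below; approximating the asphalt reflection intensity by the median intensity, the thresholds and the function names are assumptions rather than the embodiment's implementation.

```python
import numpy as np

def extract_lane_points(ground, intensity_thresh=20.0):
    """Select ground points whose reflection intensity satisfies the matching
    condition with the road-surface intensity; here the asphalt intensity is
    approximated by the median intensity (an assumption for illustration)."""
    road_intensity = np.median(ground[:, 3])
    keep = np.abs(ground[:, 3] - road_intensity) <= intensity_thresh
    return ground[keep]

def fit_lane_line_ransac(points_xy, n_iters=300, dist_thresh=0.1, seed=0):
    """RANSAC fit of a 2D line a*x + b*y + c = 0 over lane-marking points."""
    rng = np.random.default_rng(seed)
    best_model, best_count = None, 0
    for _ in range(n_iters):
        p1, p2 = points_xy[rng.choice(len(points_xy), size=2, replace=False)]
        direction = p2 - p1
        if np.linalg.norm(direction) < 1e-9:
            continue
        normal = np.array([-direction[1], direction[0]])
        normal /= np.linalg.norm(normal)
        c = -normal @ p1
        count = int((np.abs(points_xy @ normal + c) < dist_thresh).sum())
        if count > best_count:
            best_model, best_count = (normal[0], normal[1], c), count
    return best_model   # coefficients (a, b, c) of the fitted lane line
```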
In one exemplary embodiment, acquiring the first TSR information may include, but is not limited to, at least one of:
in a first case, when the second candidate point cloud includes N points and N is equal to 2, determining a first reference point with a distance smaller than or equal to a target radius from one point in the second candidate point cloud by taking one point of the N points as a circle center, and obtaining a third point cloud, wherein the first reference point and the one point are used for identifying a target traffic identification object of a target type; fitting the third point cloud to obtain the first TSR information.
Optionally, in this embodiment, in the case where the second candidate point cloud includes only 2 points, a first reference point whose distance from one of the points is less than or equal to the target radius may be, but is not limited to being, determined in the second candidate point cloud; when such a first reference point is determined, the third point cloud is obtained, and the third point cloud is fitted to obtain the first TSR information.
In a second case, when the second candidate point cloud includes the N points and N is a positive integer greater than 2, the TSR information corresponding to the ith point is obtained by executing the following steps, where the first TSR information includes TSR information corresponding to each of the N points: determining a fourth point cloud containing the ith point in the second candidate point cloud, wherein the distance between each point in the fourth point cloud and at least one point in the fourth point cloud is smaller than or equal to the target radius, and each point in the fourth point cloud is used for identifying a target traffic identification object of the target type; fitting the fourth point cloud to obtain TSR information corresponding to the ith point, wherein i is a positive integer which is greater than or equal to 1 and less than or equal to N.
Optionally, in this embodiment, in the case where the second candidate point cloud includes more than 2 points, one point may be selected from the non-ground point cloud (i.e., the second candidate point cloud) and it is checked whether there are other points within 1 m of that point (i.e., the target radius, which may be adjusted according to actual requirements; this application is not limited thereto). If so, those points are put together and classified into one class; it is then checked, for each point already classified into the class, whether there are further points within 1 m of it, until the number of points in the class no longer increases, which indicates that the clustering of this class of point cloud is completed. Another point is then selected from the points of the second candidate point cloud that have not yet been classified, and whether there are other points within 1 m of it is checked in the same way, until all points in the second candidate point cloud have been processed.
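The region-growing clustering described above can be sketched as follows (illustrative only); the brute-force neighbour search and the function name are assumptions, and in practice a KD-tree would normally be used for the radius queries.

```python
import numpy as np

def euclidean_cluster(points_xyz, radius=1.0):
    """Region-growing Euclidean clustering: starting from an unclustered point,
    keep adding every point lying within `radius` (the target radius, 1 m here)
    of any point already in the cluster until the cluster stops growing.
    Returns a list of clusters, each a list of point indices."""
    unassigned = list(range(len(points_xyz)))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        cluster, frontier = [seed], [seed]
        while frontier and unassigned:
            idx = frontier.pop()
            remaining = np.asarray(unassigned)
            dists = np.linalg.norm(points_xyz[remaining] - points_xyz[idx], axis=1)
            neighbours = remaining[dists <= radius].tolist()
            for nb in neighbours:
                unassigned.remove(nb)
            cluster.extend(neighbours)
            frontier.extend(neighbours)
        clusters.append(cluster)
    return clusters
```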
Optionally, in this embodiment, the points for identifying the traffic sign object may be, but are not limited to being, extracted from the non-ground point cloud (i.e., the second candidate point cloud described above) by a Euclidean clustering algorithm and a RANSAC algorithm (or other algorithms; this application is not limited thereto). Fig. 6 is a schematic diagram of extracting TSR information and lane line information according to an embodiment of the present application. As shown in fig. 6, the point cloud sensed by the laser radar (i.e., the first point cloud) may be, but is not limited to being, acquired; the laser odometer pose obtained by SLAM (simultaneous localization and mapping) and the GNSS (Global Navigation Satellite System) pose are fused and optimized to obtain accurate mileage information; and the laser radar ground point cloud (i.e., the first candidate point cloud) and the non-ground point cloud (i.e., the second candidate point cloud) are combined according to the mileage information to stitch a ground point cloud map and a non-ground point cloud map.
FIG. 7 is a flowchart of stitching a local map from the point clouds output by the laser radar according to an embodiment of the present application. As shown in fig. 7, de-distortion compensation may be, but is not limited to being, performed on the first point cloud sensed by the lidar through an IMU (Inertial Measurement Unit); the ground point cloud and the non-ground point cloud are segmented by ground fitting, and laser mileage information is obtained by registration of the corrected point cloud; the laser odometer pose and the GNSS pose are fused and optimized to obtain accurate mileage information; and the lidar ground point cloud and non-ground point cloud are combined according to the mileage information to stitch a ground point cloud map and a non-ground point cloud map.
The point cloud (second point cloud) for representing the lane lines can be extracted from the ground point cloud, and the second point cloud is fitted to obtain the lane line semantic information (namely, the first lane line information). The point cloud for representing the traffic identification object can be clustered and segmented from the non-ground point cloud, and the point cloud for identifying the traffic identification object is fitted to obtain first TSR information of the traffic identification object including a street lamp (such as a lamp post of the street lamp), a traffic indicator lamp, a traffic sign and the like.
In the technical solution provided in step S204, the second lane line information and the second TSR information that are output by the semantic camera on the target vehicle at the current time may be but not limited to be obtained, where the second lane line information is used to identify the type of lane line (such as a stop line, a lane line of a lane on which the target vehicle is traveling, etc.) that is identified by the semantic camera in the space where the target vehicle is located at the current time, and the second TSR information is used to identify the traffic sign object (which may include, but is not limited to, a street lamp (such as a lamp post of a street lamp), a traffic indicator, a traffic sign, etc.) that is identified by the semantic camera in the space where the target vehicle is located at the current time.
In the technical solution provided in step S206, the relative position between the lane line or the traffic sign object identified by the laser radar and the lane line or the traffic sign object identified by the semantic camera may be determined according to, but not limited to, the first lane line information and the first TSR information identified by the laser radar, and the second lane line information and the second TSR information identified by the semantic camera.
Optionally, in this embodiment, the relative position between the target object identified by the laser radar and the target object identified by the semantic camera may be, but is not limited to being, identified by the external parameters (extrinsics) between the laser radar and the semantic camera. In other words, by determining the target relative position corresponding to the laser radar and the semantic camera at the current time, the position and distance of a target object identified by the laser radar can be accurately projected onto the target object identified by the semantic camera; in this way, the safety of the automatic driving process of the target vehicle is improved.
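A minimal sketch of such a projection is given below, assuming a pinhole camera model; the intrinsic matrix K, the extrinsic values and the function name are placeholders rather than values from the embodiment.

```python
import numpy as np

def project_lidar_point_to_image(p_lidar, R_cl, t_cl, K):
    """Project a 3D point given in the lidar frame into camera pixel coordinates
    using the lidar-to-camera extrinsics (R_cl, t_cl), i.e. the target relative
    position, and the camera intrinsic matrix K (camera looks along +z)."""
    p_cam = R_cl @ p_lidar + t_cl      # lidar frame -> camera frame
    u, v, w = K @ p_cam                # perspective projection
    return np.array([u / w, v / w])

# Placeholder extrinsics/intrinsics for illustration only.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R_cl = np.eye(3)                       # assumed rotation (lidar -> camera)
t_cl = np.array([0.1, -0.2, 0.0])      # assumed translation in metres
print(project_lidar_point_to_image(np.array([2.0, 1.0, 15.0]), R_cl, t_cl, K))
```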
In one exemplary embodiment, the relative positions of the targets corresponding to the lidar and the semantic camera at the current time may be determined, but are not limited to, by: determining a first preliminary relative position corresponding to the laser radar and the semantic camera in the current time according to the first lane line information and the second lane line information, wherein the first preliminary relative position is used for representing the preliminary relative position of the target object identified by the laser radar and the target object identified by the semantic camera on a horizontal plane, and the horizontal plane is a plane where the semantic camera is located; and according to the first TSR information and the second TSR information, the first preliminary relative position is adjusted to obtain a first relative position corresponding to the laser radar and the semantic camera in the current time, and according to the first TSR information and the second TSR information, a second relative position corresponding to the laser radar and the semantic camera in the current time is determined, wherein the target relative position comprises the first relative position and the second relative position, the first relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the horizontal plane, the second relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the vertical plane, and the vertical plane is a plane perpendicular to the horizontal plane.
Optionally, in this embodiment, the lane lines on the road on which the target vehicle is traveling may be considered to be the same for both sensors, so the lane line information identified by the laser radar may be, but is not limited to being, matched first with the lane line information output by the semantic camera. The lane lines identified by the laser radar are classified against the lane lines output by the semantic camera, so that the left lane line and the right lane line can be better distinguished, and a coarse alignment is thereby performed.
Alternatively, in this embodiment, the first preliminary relative position corresponding to the laser radar and the semantic camera at the current time may be determined, but not limited to, according to the first lane line information and the second lane line information, and the first preliminary relative position may be identified by, but not limited to, y, z, yaw (yaw angle) external parameters, that is, preliminary adjustment may be performed on the external parameters y, z, yaw between the laser radar and the semantic camera according to the first lane line information and the second lane line information, to obtain preliminary values of the external parameters y, z, yaw.
Optionally, in this embodiment, after the lane line information identified by the laser radar and the lane line information output by the semantic camera are matched to obtain part of the parameters (i.e., y, z, yaw), these parameters are substituted into the matching of the TSR information identified by the laser radar with the TSR information output by the semantic camera (the TSR information identified by the laser radar may be aligned longitudinally, and the TSR information output by the semantic camera may be distributed over pixels), so as to adjust the external parameters between the laser radar and the semantic camera, which may include, but are not limited to, x, roll, pitch, y, z and yaw.
Optionally, in this embodiment, when the first preliminary relative position is obtained, the external parameters y, z, and yaw for identifying the first preliminary relative position may be adjusted according to, but not limited to, the first TSR information and the second TSR information, to obtain a first relative position corresponding to the laser radar and the semantic camera at the current time, and the second relative position corresponding to the laser radar and the semantic camera at the current time may be determined according to the first TSR information and the second TSR information, and may be identified by, but not limited to, x, roll, pitch external parameters.
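For illustration, the sketch below composes an extrinsic transform from the six parameters handled in the two stages above; the Euler-angle convention and the numeric values are assumptions.

```python
import numpy as np

def extrinsic_from_params(x, y, z, roll, pitch, yaw):
    """Compose a lidar-to-camera extrinsic (R, t) from the six parameters.
    In the staged scheme, (y, z, yaw) are first adjusted with the lane line
    matches and the remaining parameters are then refined with the TSR
    matches; the Z(yaw)*Y(pitch)*X(roll) Euler order is an assumption."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx, np.array([x, y, z])

# Example: coarse-stage output (y, z, yaw) kept while x, roll, pitch are refined.
R, t = extrinsic_from_params(x=0.0, y=0.12, z=-0.05, roll=0.0, pitch=0.0, yaw=0.01)
```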
Fig. 8 is a schematic diagram of matching the semantic information of the laser radar with the semantic information of the semantic camera according to an embodiment of the present application. As shown in fig. 8, within time T0 the laser radar produces scans scan_0, scan_1, ..., scan_n (at times t0, t1, ..., tn), which are stitched into the local_map of time T0, while the semantic camera outputs image_0 at time T0; scan_0, scan_1, ..., scan_n may be, but are not limited to being, matched with image_0 to obtain the external parameters between the semantic camera and the laser radar at time T0. Similarly, within time Tn the laser radar produces scans scan_n+1, scan_n+2, ..., scan_n+m, which are stitched into the local_map of time Tn, while the semantic camera outputs the corresponding image at time Tn; these scans are matched with that image to obtain the external parameters between the semantic camera and the laser radar at time Tn.
In one exemplary embodiment, the first preliminary relative position corresponding to the lidar and semantic camera at the current time may be determined, but is not limited to, by: starting from a preset first initial relative position, adjusting a first current relative position until a first target condition is met, and determining the first current relative position when the first target condition is met as the first preliminary relative position; wherein the first target condition includes: the sum of the loss function values of M points is smaller than or equal to a preset first target threshold, wherein the M points are M points of the lane line recognized by the laser radar in the space where the target vehicle is located, and the M points are obtained by projecting the first lane line information to a second coordinate system where the semantic camera is located according to the first current relative position; wherein the loss function values of the M points are obtained by: inputting a first projection position of a jth point in the M points under the second coordinate system and a first reference position of the jth point under the second coordinate system into a preset first loss function to obtain a loss function value of the jth point, wherein the loss function value of the jth point is used for representing the position difference between the projection position of the jth point under the second coordinate system and the reference position of the jth point under the second coordinate system, M is a positive integer greater than or equal to 1, and j is a positive integer greater than or equal to 1 and less than or equal to M.
Alternatively, in the present embodiment, the first preliminary relative position may be, but is not limited to, a relative position between a position of the laser radar disposed on the target vehicle and a position of the semantic camera disposed on the target vehicle. The distance from the point to the line can be used as an optimization variable, the lane line information identified by the laser radar is projected to a coordinate system where the lane line information output by the semantic camera is located according to the current external parameter (namely the first preliminary relative position), the lane line information identified by the projected laser radar is scattered into M points, then the distance projection is carried out on the lane line information output by the semantic camera by the M points, and when the sum of the distances is minimum, the algorithm converges. FIG. 9 is a schematic diagram of a loss function according to an embodiment of the present application, as shown in FIG. 9, projection points of points i, l, j in a coordinate system where a semantic camera is located are X (M, i), X (M, l), X (M, j); x (M, i), X (M, l), X (M, j) are visual semantics, and the calculation formula for obtaining the loss function (i.e. the first loss function) is as follows:
loss(j) = |(X(M,j) − X(M,i)) × (X(M,j) − X(M,l))| / |X(M,i) − X(M,l)|, that is, the distance from the projected point X(M,j) to the line through X(M,i) and X(M,l).
After the loss function (i.e. the first loss function) is obtained, nonlinear optimization is performed by using the Levenberg-Marquardt (LM) method.
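A minimal sketch of this first optimization stage is given below, assuming SciPy is available. The parameterization (adjusting only y, z and yaw while the remaining extrinsic components stay at nominal values), the association of each projected point with its two nearest camera lane-line points, and all function names are illustrative assumptions rather than the implementation of the present application:

```python
import numpy as np
from scipy.optimize import least_squares


def yaw_to_R(yaw: float) -> np.ndarray:
    """Rotation about the z axis; x, roll and pitch are kept at nominal values here."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def point_to_line(p, a, b):
    """Distance from point p to the line through a and b (the first loss function)."""
    return np.linalg.norm(np.cross(p - a, p - b)) / np.linalg.norm(a - b)


def residuals(params, lidar_pts, cam_line):
    """One residual per lidar lane-line point: its point-to-line distance after
    projection into the coordinate system of the semantic camera."""
    y, z, yaw = params
    R, t = yaw_to_R(yaw), np.array([0.0, y, z])
    res = []
    for p in lidar_pts:
        q = R @ p + t                                   # projected point X(M, j)
        idx = np.argsort(np.linalg.norm(cam_line - q, axis=1))[:2]
        res.append(point_to_line(q, cam_line[idx[0]], cam_line[idx[1]]))
    return np.asarray(res)


def refine_part1(lidar_pts, cam_line, init=(0.0, 0.0, 0.0)):
    """lidar_pts: (M, 3) lane-line points from the point cloud; cam_line: (N, 3)
    lane-line points output by the semantic camera. Returns adjusted y, z, yaw."""
    sol = least_squares(residuals, np.asarray(init), args=(lidar_pts, cam_line),
                        method="lm")
    return sol.x
```

In this sketch, the role of the first target condition is played by the LM solver terminating once it can no longer reduce the sum of squared point-to-line distances.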
In an exemplary embodiment, the first preliminary relative position may be adjusted, but is not limited to, by: starting from the first preliminary relative position, adjusting the first current relative position until a second target condition is met, and determining the first current relative position when the second target condition is met as the first relative position; wherein the second target condition includes: the sum of the loss function values of Q points is smaller than or equal to the second target threshold, wherein the Q points are Q points of a traffic recognition object recognized by the laser radar in the space where the target vehicle is located, and the Q points are obtained by projecting the first TSR information to the second coordinate system according to the first current relative position; wherein the loss function values of the Q points are obtained by: and inputting a second projection position of a p-th point in the Q points under the second coordinate system and a second reference position of the p-th point under the second coordinate system into a preset second loss function to obtain a loss function value of the p-th point, wherein the loss function value of the p-th point is used for representing the position difference between the projection position of the p-th point under the second coordinate system and the reference position of the p-th point under the second coordinate system, Q is a positive integer greater than or equal to 1, and p is a positive integer greater than or equal to 1 and less than or equal to Q.
Alternatively, in the present embodiment, the second loss function may be, but is not limited to, the same loss function as the first loss function, or a different loss function; the first target threshold may be, but is not limited to being, the same threshold as the second target threshold, or a different threshold.
Optionally, in this embodiment, the external parameters y, z and yaw, preliminarily obtained by matching the lane line information identified by the laser radar with the lane line information output by the semantic camera, may be, but are not limited to being, further adjusted according to the first TSR information and the second TSR information. In this way, the traffic recognition objects scanned by the laser radar and the traffic recognition objects identified by the semantic camera are fully used, and the accuracy of the external parameters y, z and yaw between the laser radar and the semantic camera is improved.
In one exemplary embodiment, the second initial relative position between the lidar and the semantic camera may be adjusted, but is not limited to, by: starting from a preset second initial relative position, adjusting a second current relative position until a third target condition is met, and determining the second current relative position when the third target condition is met as the second relative position; wherein the third target condition includes: the sum of loss function values of L points is smaller than or equal to a third target threshold, wherein the L points are L points of a traffic recognition object recognized by the laser radar in a space where the target vehicle is located, and the L points are obtained by projecting the first TSR information to the second coordinate system according to the second current relative position; wherein the loss function values of the L points are obtained by the following steps: and inputting a second projection position of a q-th point in the L points under the second coordinate system and a second reference position of the q-th point under the second coordinate system into a preset third loss function to obtain a loss function value of the q-th point, wherein the loss function value of the q-th point is used for representing the position difference between the projection position of the q-th point under the second coordinate system and the reference position of the q-th point under the second coordinate system, L is a positive integer greater than or equal to 1, and q is a positive integer greater than or equal to 1 and less than or equal to L.
Alternatively, in the present embodiment, the third loss function may be, but is not limited to, the same loss function as the first loss function and the second loss function, or a different loss function; the third target threshold may be, but is not limited to, the same threshold as the first target threshold and/or the second target threshold, or a different threshold.
Optionally, in this embodiment, the parameters x, roll and pitch between the laser radar and the semantic camera may be, but are not limited to being, adjusted according to the first TSR information and the second TSR information until the third target condition is met; meeting the third target condition indicates that the parameters x, roll and pitch have been adjusted. Once all six parameters y, z, yaw, x, roll and pitch between the laser radar and the semantic camera are adjusted, the target object identified by the laser radar (e.g., the position of the target object and the distance between the target object and the target vehicle) can be accurately matched with the target object identified by the semantic camera (e.g., the type of the target object), thereby improving the accuracy of matching the target object identified by the laser radar with the target object identified by the semantic camera.
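The following sketch illustrates how a TSR-based second stage could be stacked on top of the lane-line stage: it starts from the part_1 values y, z and yaw and lets the TSR correspondences refine them while also determining x, roll and pitch. The nearest-point residual, the Euler-angle convention and all names are assumptions made for illustration only:

```python
import numpy as np
from scipy.optimize import least_squares


def euler_to_R(roll: float, pitch: float, yaw: float) -> np.ndarray:
    """Z-Y-X Euler angles to a rotation matrix."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return Rz @ Ry @ Rx


def tsr_residuals(params, lidar_tsr, cam_tsr):
    """Nearest-point distances between projected lidar TSR points and camera TSR points."""
    x, y, z, roll, pitch, yaw = params
    R, t = euler_to_R(roll, pitch, yaw), np.array([x, y, z])
    res = []
    for p in lidar_tsr:
        q = R @ p + t
        res.append(np.min(np.linalg.norm(cam_tsr - q, axis=1)))
    return np.asarray(res)


def calibrate_part2(part1, lidar_tsr, cam_tsr):
    """Stage 2: start from part_1 (y, z, yaw obtained from the lane lines) and let
    the TSR correspondences refine y, z, yaw and determine x, roll and pitch."""
    y0, z0, yaw0 = part1
    init = np.array([0.0, y0, z0, 0.0, 0.0, yaw0])
    sol = least_squares(tsr_residuals, init, args=(lidar_tsr, cam_tsr), method="lm")
    return sol.x   # adjusted extrinsics part_2: x, y, z, roll, pitch, yaw
```

Convergence of these residuals corresponds to the residual-convergence check, described below, that ends the calibration.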
In order to better understand the above-mentioned determination process of the relative position, the following description will further explain the above-mentioned determination procedure of the relative position with reference to the alternative embodiment, but is not used to limit the technical solution of the embodiment of the present application.
Fig. 10 is a schematic diagram of matching semantic information of a laser radar with semantic information of a semantic camera according to an embodiment of the present application. As shown in Fig. 10, the semantic information identified by the laser radar and the semantic information output by the semantic camera may be, but are not limited to being, semantically associated: a lane line is fitted from the semantic information identified by the laser radar and the semantic information output by the semantic camera, a corresponding loss function (i.e. the first loss function) is constructed, and nonlinear optimization is performed through the Levenberg-Marquardt (LM) method to obtain a preliminarily adjusted external parameter part_1 (i.e. the y, z and yaw identifying the first preliminary relative position).
The preliminarily adjusted external parameter part_1 is then brought into the matching of the TSR information identified by the laser radar and the TSR information output by the semantic camera: TSR information (which may include, but is not limited to, lamp poles, signboards, stop lines and the like) is fitted from the semantic information identified by the laser radar and the semantic information output by the semantic camera, a corresponding loss function is constructed, and nonlinear optimization is performed through the Levenberg-Marquardt (LM) method to obtain the adjusted external parameter part_2 (namely the y, z and yaw identifying the first relative position and the x, roll and pitch identifying the second relative position). Under the condition of residual convergence, it is determined that the adjusted external parameters between the laser radar and the semantic camera have been obtained.
In one exemplary embodiment, the next relative position corresponding to the laser radar and the semantic camera at the next time may be determined, but is not limited to, by: Performing semantic recognition on a fifth point cloud sensed by the laser radar on the target vehicle to obtain fourth lane line information and fourth TSR information, wherein the fourth lane line information is used for identifying a lane line identified by the laser radar in the space where the target vehicle is located at the next time of the current time, and the fourth TSR information is used for identifying a traffic sign object identified by the laser radar in the space where the target vehicle is located at the next time; obtaining fifth lane line information and fifth TSR information output by the semantic camera on the target vehicle, wherein the fifth lane line information is used for identifying a lane line identified by the semantic camera at the next time in a space where the target vehicle is located, and the fifth TSR information is used for identifying a traffic identification object identified by the semantic camera at the next time in the space where the target vehicle is located; and determining the next relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information and the first TSR information.
Alternatively, in the present embodiment, in the case where the next relative position of the laser radar and the semantic camera at the next time needs to be determined, the next relative position corresponding to the laser radar and the semantic camera at the next time is determined according to the fourth lane line information and the fourth TSR information recognized by the laser radar at the next time of the current time, the fifth lane line information and the fifth TSR information output by the semantic camera, and the first TSR information recognized by the laser radar at the current time. In this way, the TSR information recognized by the laser radar over a plurality of times is accumulated, so that when the next relative position corresponding to the laser radar and the semantic camera at the next time is determined, the TSR information historically recognized by the laser radar can be used to describe the traffic identification objects more accurately, further improving the accuracy of determining the next relative position corresponding to the laser radar and the semantic camera at the next time.
Alternatively, in this embodiment, the current time may, but is not limited to, be a time instant, a time period, or the like; according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information and the first TSR information, the next relative position corresponding to the laser radar and the semantic camera may be determined for the next time period of the current time period, or for the next time instant of the current time instant.
In one exemplary embodiment, the next relative position of the lidar and the semantic camera for the next time may be determined, but is not limited to, by: performing union operation on the point of the traffic recognition object in the first TSR information and the point of the traffic recognition object in the fourth TSR information to obtain sixth TSR information; determining a second preliminary relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information and the fifth lane line information; and adjusting the second preliminary relative position according to the fifth TSR information and the sixth TSR information to obtain a third relative position corresponding to the laser radar and the semantic camera in the next time, and determining a fourth relative position corresponding to the laser radar and the semantic camera in the next time according to the fifth TSR information and the sixth TSR information, wherein the next relative position comprises the third relative position and the fourth relative position.
Alternatively, in the present embodiment, the TSR information accumulated with the TSR information recognized by the laser radar at the next time may be, but is not limited to, the TSR information recognized by the laser radar at a time adjacent to the next time (for example, the current time). In this way, TSR information with a large time span is prevented from being accumulated, excessive storage space is not occupied, and the utilization rate of the storage space is improved.
Alternatively, in this embodiment, a union operation may be performed on the TSR information recognized by the laser radar at the current time and the TSR information recognized by the laser radar at the next time, or the two may be directly accumulated (for example, directly spliced together) without deleting the points where the TSR information recognized at the current time coincides with the TSR information recognized at the next time. In this way, the difficulty of extracting traffic sign objects such as traffic signs and street lamp poles is reduced, and the accuracy of extracting such traffic sign objects is improved.
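A minimal sketch of the two accumulation options (union versus plain splicing) follows; the voxel-based duplicate removal and the voxel size are assumptions used only to make the union operation concrete:

```python
import numpy as np


def union_tsr(points_current: np.ndarray, points_next: np.ndarray,
              voxel: float = 0.05) -> np.ndarray:
    """Union of two TSR point sets: points of the current and the next time that
    fall into the same voxel cell are kept only once; the rest are concatenated."""
    merged = np.vstack([points_current, points_next])
    keys = np.floor(merged / voxel).astype(np.int64)
    _, keep = np.unique(keys, axis=0, return_index=True)
    return merged[np.sort(keep)]


def accumulate_tsr(points_current: np.ndarray, points_next: np.ndarray) -> np.ndarray:
    """Plain accumulation: simply splice the two sets without removing overlapping points."""
    return np.vstack([points_current, points_next])
```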
In order to better understand the above-mentioned determination process of the relative position, the following description will further explain the above-mentioned determination procedure of the relative position with reference to the alternative embodiment, but is not used to limit the technical solution of the embodiment of the present application.
FIG. 11 is a flowchart of a dynamic joint calibration algorithm according to an embodiment of the present application. As shown in FIG. 11, the semantic information identified by the laser radar may be, but is not limited to being, time-aligned with the semantic information output by the semantic camera. On the one hand, the point cloud sensed by the laser radar can be divided into a ground point cloud and a non-ground point cloud, and the corrected point cloud acquires laser odometry information through registration; the laser odometry pose and the GNSS pose are fused and optimized to obtain accurate odometry information; the ground point cloud and the non-ground point cloud of the laser radar are combined according to the odometry information, and a ground point cloud map and a non-ground point cloud map are spliced. Lane line semantics and stop line semantics of the lane where the target vehicle is located are extracted from the ground point cloud, and TSR information of the space where the target vehicle is located is extracted from the non-ground point cloud. On the other hand, a lane line semantic sliding window and a TSR semantic sliding window are extracted from the semantic information output by the semantic camera.
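To illustrate the ground-related part of this pipeline, the following sketch splits a point cloud by height, keeps high-reflectivity ground points as lane-line candidates and fits a polynomial lane line; the height tolerance, the intensity threshold and the polynomial model are assumptions made for illustration, not parameters of the present application:

```python
import numpy as np


def split_ground(points: np.ndarray, z_ground: float = 0.0, tol: float = 0.15):
    """points: (N, 4) array of x, y, z and reflection intensity in the vehicle frame.
    A very rough ground / non-ground split by height; a real pipeline would fit
    the ground plane (e.g. with RANSAC) instead."""
    mask = np.abs(points[:, 2] - z_ground) < tol
    return points[mask], points[~mask]


def extract_lane_points(ground: np.ndarray, intensity_thresh: float = 0.6) -> np.ndarray:
    """Lane paint reflects strongly, so keep ground points whose reflection
    intensity exceeds a preset threshold."""
    return ground[ground[:, 3] > intensity_thresh]


def fit_lane_line(lane_points: np.ndarray, degree: int = 3) -> np.ndarray:
    """Fit y = f(x) as a polynomial to obtain a lane-line description."""
    return np.polyfit(lane_points[:, 0], lane_points[:, 1], degree)
```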
Semantic matching can be performed on the semantic information identified by the laser radar and the semantic information output by the semantic camera: lane lines are fitted from the semantic information identified by the laser radar and the semantic information output by the semantic camera, a corresponding loss function (namely the first loss function) is constructed, and nonlinear optimization is performed through the Levenberg-Marquardt (LM) method to obtain a preliminarily adjusted external parameter part_1 (namely the y, z and yaw identifying the first preliminary relative position).
The preliminarily adjusted external parameter part_1 is then substituted into the matching of the TSR information identified by the laser radar and the TSR information output by the semantic camera: TSR information (which may include, but is not limited to, a lamp post, a guideboard, a stop line and the like) is fitted from the semantic information identified by the laser radar and the semantic information output by the semantic camera, a corresponding loss function is constructed, and nonlinear optimization is performed through the Levenberg-Marquardt (LM) method to obtain the adjusted external parameter part_2 (namely the y, z and yaw identifying the first relative position and the x, roll and pitch identifying the second relative position). Under the condition of residual convergence, it is determined that the adjusted external parameters between the laser radar and the semantic camera have been obtained, and the calibration ends.
Through the adjusted external parameters between the laser radar and the semantic camera, the position of the target object identified by the laser radar in the space where the target vehicle is located and the distance between the target object and the target vehicle can be accurately projected onto the target object output by the semantic camera. FIG. 12 is a schematic diagram of semantic registration according to an embodiment of the present application. As shown in FIG. 12, before registration, when the target object identified by the laser radar in the space where the target vehicle is located is projected into the coordinate system where the semantic camera is located (with O as the origin and three projection axes x, y and z), the projected target object identified by the laser radar (which may include, but is not limited to, a driving direction mark of the lane where the target vehicle is located, a street lamp and a traffic sign, represented by dotted lines) cannot coincide with the target object output by the semantic camera (represented by solid lines). Through the method in the embodiment of the present application and the calibrated external parameters, the projected target object identified by the laser radar can substantially coincide with the target object output by the semantic camera, and the point cloud sensed by the laser radar is fully utilized.
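The projection itself can be sketched as follows, assuming a pinhole camera with intrinsic matrix K and the calibrated extrinsics (R, t); the camera-axis convention and the function name are assumptions of this example:

```python
import numpy as np


def project_to_image(points_lidar: np.ndarray, R: np.ndarray, t: np.ndarray,
                     K: np.ndarray) -> np.ndarray:
    """Project 3D lidar points into the semantic camera image using the calibrated
    extrinsics (R, t) and a 3x3 pinhole intrinsic matrix K; the camera frame is
    assumed to have its z axis pointing forward."""
    pts_cam = (R @ points_lidar.T).T + t      # lidar frame -> camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]      # keep only points in front of the camera
    uvw = (K @ pts_cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]           # pixel coordinates (u, v)
```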
According to the embodiment of the present application, when the external parameters between the semantic camera and the laser radar are calibrated, on the one hand, no calibration board is needed, which saves manpower and material resources; on the other hand, the method is applicable to 3D laser radars of different types and low line numbers, and the lane line information and the traffic sign object semantic information carried in the point cloud sensed by the laser radar can be well extracted. Furthermore, different parameters are calculated by using different semantics, the roles of the lane line information and the TSR information identified by the laser radar are fully exploited, and the external parameters between the laser radar and the semantic camera are acquired more accurately.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
FIG. 13 is a block diagram of a relative position determining device according to an embodiment of the present application; as shown in FIG. 13, the device includes:
the first recognition module 1302 is configured to perform semantic recognition on a first point cloud sensed by a laser radar on a target vehicle at a current time to obtain first lane line information and first traffic sign recognition system TSR information, where the first lane line information is used to identify a lane line recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used to identify a traffic sign object recognized by the laser radar at the current time in the space where the target vehicle is located;
a first obtaining module 1304, configured to obtain second lane line information and second TSR information output by a semantic camera on the target vehicle at the current time, where the second lane line information is used to identify a lane line identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used to identify a traffic sign object identified by the semantic camera at the current time in the space where the target vehicle is located;
A first determining module 1306, configured to determine, according to the first lane line information, the first TSR information, the second lane line information, and the second TSR information, a target relative position corresponding to the laser radar and the semantic camera at the current time, where the target relative position is used to represent a relative position between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object includes a lane line or a traffic sign object.
Through the above embodiment, the relative positions of the same lane lines or the same traffic sign objects identified by the laser radar and the semantic camera can be determined according to the lane line information and the TSR information identified by the laser radar and the lane line information and the TSR information output by the semantic camera. The point cloud scanned by the laser radar better reflects the position, distance and the like of the identified objects, while the semantic information output by the semantic camera better reflects the type and the like of the identified objects; in this way, accurate matching of the position, distance and the like between the objects scanned by the laser radar and the objects captured by the semantic camera is realized. By adopting the above technical scheme, the problems in the related art, such as the low accuracy of matching between the object identified by the semantic camera and the object identified by the laser radar, are solved, and the technical effect of improving the accuracy of matching between the object identified by the semantic camera and the object identified by the laser radar is achieved.
In one exemplary embodiment, the first identification module includes:
a dividing unit, configured to divide each point in the first point cloud into a first candidate point cloud and a second candidate point cloud, where a position of each point in the first candidate point cloud under a first coordinate system where the laser radar is located is in the same plane as the target vehicle, and a position of each point in the second candidate point cloud under the first coordinate system is in a different plane from the target vehicle;
the acquisition unit is used for acquiring the first lane line information according to the first candidate point cloud and acquiring the first TSR information according to the second candidate point cloud.
In an exemplary embodiment, the acquiring unit is configured to:
obtaining the reflection intensity of each point in the first candidate point cloud, wherein the reflection intensity of each point is carried in the first point cloud;
determining points meeting preset matching conditions between the reflection intensity and a preset reflection intensity threshold value in the first candidate point cloud to obtain a second point cloud;
fitting the second point cloud to obtain the first lane line information.
In an exemplary embodiment, the acquisition unit is further configured to:
When the second candidate point cloud comprises N points and N is equal to 2, using one of the N points as a circle center, determining, in the second candidate point cloud, a first reference point whose distance to the one point is smaller than or equal to a target radius, so as to obtain a third point cloud, wherein the first reference point and the one point are used for identifying a target traffic identification object of a target type; fitting the third point cloud to obtain the first TSR information; and/or
Under the condition that the second candidate point cloud comprises the N points and N is a positive integer greater than 2, acquiring TSR information corresponding to the ith point by executing the following steps, wherein the first TSR information comprises TSR information corresponding to each point in the N points: determining a fourth point cloud containing the ith point in the second candidate point cloud, wherein the distance between each point in the fourth point cloud and at least one point in the fourth point cloud is smaller than or equal to the target radius, and each point in the fourth point cloud is used for identifying a target traffic identification object of the target type; fitting the fourth point cloud to obtain TSR information corresponding to the ith point, wherein i is a positive integer which is greater than or equal to 1 and less than or equal to N.
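As an illustration of this radius rule, the following sketch groups the points of the second candidate point cloud with a simple single-linkage flood fill; the grouping strategy and the default radius are assumptions made only for the example:

```python
import numpy as np


def radius_groups(points: np.ndarray, target_radius: float = 0.5) -> np.ndarray:
    """Group non-ground points into candidate TSR objects: a point joins a group
    if it lies within target_radius of at least one point already in the group."""
    n = len(points)
    labels = -np.ones(n, dtype=int)
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            dist = np.linalg.norm(points - points[j], axis=1)
            for k in np.where((dist <= target_radius) & (labels == -1))[0]:
                labels[k] = current
                stack.append(k)
        current += 1
    return labels   # one label per point; each label marks one candidate object
```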
In one exemplary embodiment, the first determining module includes:
the first determining unit is used for determining a first preliminary relative position corresponding to the laser radar and the semantic camera in the current time according to the first lane line information and the second lane line information, wherein the first preliminary relative position is used for representing the preliminary relative position of the target object identified by the laser radar and the target object identified by the semantic camera on a horizontal plane, and the horizontal plane is a plane where the semantic camera is located;
the first processing unit is configured to adjust the first preliminary relative position according to the first TSR information and the second TSR information, obtain a first relative position corresponding to the lidar and the semantic camera in the current time, and determine a second relative position corresponding to the lidar and the semantic camera in the current time according to the first TSR information and the second TSR information, where the target relative position includes the first relative position and the second relative position, the first relative position is used to represent a relative position of the target object identified by the lidar and the target object identified by the semantic camera on the horizontal plane, and the second relative position is used to represent a relative position of the target object identified by the lidar and the target object identified by the semantic camera on a vertical plane, and the vertical plane is a plane perpendicular to the horizontal plane.
In an exemplary embodiment, the first determining unit is configured to:
starting from a preset first initial relative position, adjusting a first current relative position until a first target condition is met, and determining the first current relative position when the first target condition is met as the first preliminary relative position;
wherein the first target condition includes: the sum of the loss function values of M points is smaller than or equal to a preset first target threshold, wherein the M points are M points of the lane line recognized by the laser radar in the space where the target vehicle is located, and the M points are obtained by projecting the first lane line information to a second coordinate system where the semantic camera is located according to the first current relative position;
wherein the loss function values of the M points are obtained by:
inputting a first projection position of a jth point in the M points under the second coordinate system and a first reference position of the jth point under the second coordinate system into a preset first loss function to obtain a loss function value of the jth point, wherein the loss function value of the jth point is used for representing the position difference between the projection position of the jth point under the second coordinate system and the reference position of the jth point under the second coordinate system, M is a positive integer greater than or equal to 1, and j is a positive integer greater than or equal to 1 and less than or equal to M.
In an exemplary embodiment, the first processing unit is configured to:
starting from the first preliminary relative position, adjusting the first current relative position until a second target condition is met, and determining the first current relative position when the second target condition is met as the first relative position;
wherein the second target condition includes: the sum of the loss function values of Q points is smaller than or equal to the second target threshold, wherein the Q points are Q points of a traffic recognition object recognized by the laser radar in the space where the target vehicle is located, and the Q points are obtained by projecting the first TSR information to the second coordinate system according to the first current relative position;
wherein the loss function values of the Q points are obtained by:
and inputting a second projection position of a p-th point in the Q points under the second coordinate system and a second reference position of the p-th point under the second coordinate system into a preset second loss function to obtain a loss function value of the p-th point, wherein the loss function value of the p-th point is used for representing the position difference between the projection position of the p-th point under the second coordinate system and the reference position of the p-th point under the second coordinate system, Q is a positive integer greater than or equal to 1, and p is a positive integer greater than or equal to 1 and less than or equal to Q.
In an exemplary embodiment, the first processing unit is configured to:
starting from a preset second initial relative position, adjusting a second current relative position until a third target condition is met, and determining the second current relative position when the third target condition is met as the second relative position;
wherein the third target condition includes: the sum of loss function values of L points is smaller than or equal to a third target threshold, wherein the L points are L points of a traffic recognition object recognized by the laser radar in a space where the target vehicle is located, and the L points are obtained by projecting the first TSR information to the second coordinate system according to the second current relative position;
wherein the loss function values of the L points are obtained by the following steps:
and inputting a second projection position of a q-th point in the L points under the second coordinate system and a second reference position of the q-th point under the second coordinate system into a preset third loss function to obtain a loss function value of the q-th point, wherein the loss function value of the q-th point is used for representing the position difference between the projection position of the q-th point under the second coordinate system and the reference position of the q-th point under the second coordinate system, L is a positive integer greater than or equal to 1, and q is a positive integer greater than or equal to 1 and less than or equal to L.
In one exemplary embodiment, the apparatus further comprises:
the second identifying module is configured to, after determining, according to the first lane line information, the first TSR information, the second lane line information, and the second TSR information, a target relative position corresponding to the lidar and the semantic camera at the current time, perform semantic identification on a fifth point cloud sensed by the lidar on the target vehicle, so as to obtain fourth lane line information and fourth TSR information, where the fourth lane line information is used to identify a lane line identified by the lidar at a time next to the current time in a space where the target vehicle is located, and the fourth TSR information is used to identify a traffic sign object identified by the lidar at the time next in the space where the target vehicle is located;
the second acquisition module is used for acquiring fifth lane line information and fifth TSR information output by the semantic camera on the target vehicle, wherein the fifth lane line information is used for identifying a lane line identified by the semantic camera at the next time in a space where the target vehicle is located, and the fifth TSR information is used for identifying a traffic identification object identified by the semantic camera at the next time in the space where the target vehicle is located;
The second determining module is configured to determine a next relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information, and the first TSR information.
In one exemplary embodiment, the second determining module includes:
the union taking unit is used for carrying out union taking operation on the point of the traffic identification object in the first TSR information and the point of the traffic identification object in the fourth TSR information to obtain sixth TSR information;
the second determining unit is used for determining a second preliminary relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information and the fifth lane line information;
the second processing unit is configured to adjust the second preliminary relative position according to the fifth TSR information and the sixth TSR information, obtain a third relative position corresponding to the laser radar and the semantic camera at the next time, and determine a fourth relative position corresponding to the laser radar and the semantic camera at the next time according to the fifth TSR information and the sixth TSR information, where the next relative position includes the third relative position and the fourth relative position.
Embodiments of the present application also provide a storage medium including a stored program, wherein the program performs the method of any one of the above when run.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store program code for performing the steps of:
S11, carrying out semantic recognition on a first point cloud sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information, wherein the first lane line information is used for identifying a lane line recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying a traffic sign object recognized by the laser radar at the current time in the space where the target vehicle is located;
S12, obtaining second lane line information and second TSR information output by a semantic camera on the target vehicle at the current time, wherein the second lane line information is used for identifying lane lines identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying traffic identification objects identified by the semantic camera at the current time in the space where the target vehicle is located;
S13, determining the target relative positions corresponding to the laser radar and the semantic camera in the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, wherein the target relative positions are used for representing the relative positions between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object comprises a lane line or a traffic sign object.
Alternatively, in the present embodiment, the above-described storage medium may be further configured to store program code for performing the steps of:
S21, carrying out semantic recognition on first point clouds sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information, wherein the first lane line information is used for identifying lane lines recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying traffic sign objects recognized by the laser radar at the current time in the space where the target vehicle is located;
S22, obtaining second lane line information and second TSR information output by a semantic camera on the target vehicle at the current time, wherein the second lane line information is used for identifying lane lines identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying traffic identification objects identified by the semantic camera at the current time in the space where the target vehicle is located;
S23, determining the target relative positions corresponding to the laser radar and the semantic camera in the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, wherein the target relative positions are used for representing the relative positions between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object comprises a lane line or a traffic sign object.
Embodiments of the present application also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing program code.
Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the present application described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of multiple computing devices; optionally, they may be implemented by program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices, and in some cases the steps shown or described may be performed in an order different from that shown or described herein; alternatively, they may be fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing is merely a preferred embodiment of the present application, and it should be noted that a person skilled in the art may make several improvements and modifications without departing from the principles of the present application, and such improvements and modifications shall also fall within the protection scope of the present application.

Claims (13)

1. A method of determining a relative position, comprising:
semantic recognition is carried out on a first point cloud sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign recognition system TSR information, wherein the first lane line information is used for identifying lane lines recognized by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying traffic identification objects recognized by the laser radar at the current time in the space where the target vehicle is located;
acquiring second lane line information and second TSR information output by a semantic camera on the target vehicle at the current time, wherein the second lane line information is used for identifying a lane line identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying a traffic identification object identified by the semantic camera at the current time in the space where the target vehicle is located;
And determining the target relative positions corresponding to the laser radar and the semantic camera in the current time according to the first lane line information, the first TSR information, the second lane line information and the second TSR information, wherein the target relative positions are used for representing the relative positions between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object comprises a lane line or a traffic sign object.
2. The method according to claim 1, wherein the semantic recognition of the first point cloud sensed by the lidar on the target vehicle at the current time to obtain the first lane line information and the first traffic sign recognition system TSR information includes:
dividing each point in the first point cloud into a first candidate point cloud and a second candidate point cloud, wherein the position of each point in the first candidate point cloud under a first coordinate system where the laser radar is located is in the same plane with the target vehicle, and the position of each point in the second candidate point cloud under the first coordinate system is in a different plane with the target vehicle;
And acquiring the first lane line information according to the first candidate point cloud, and acquiring the first TSR information according to the second candidate point cloud.
3. The method of claim 2, wherein the obtaining the first lane line information from the first candidate point cloud comprises:
obtaining the reflection intensity of each point in the first candidate point cloud, wherein the reflection intensity of each point is carried in the first point cloud;
determining points meeting preset matching conditions between the reflection intensity and a preset reflection intensity threshold value in the first candidate point cloud to obtain a second point cloud;
fitting the second point cloud to obtain the first lane line information.
4. The method of claim 2, wherein the obtaining the first TSR information from the second candidate point cloud comprises:
when the second candidate point cloud comprises N points and N is equal to 2, using one of the N points as a circle center, determining, in the second candidate point cloud, a first reference point whose distance to the one point is smaller than or equal to a target radius, so as to obtain a third point cloud, wherein the first reference point and the one point are used for identifying a target traffic identification object of a target type; fitting the third point cloud to obtain the first TSR information; and/or
Under the condition that the second candidate point cloud comprises the N points and N is a positive integer greater than 2, acquiring TSR information corresponding to the ith point by executing the following steps, wherein the first TSR information comprises TSR information corresponding to each point in the N points: determining a fourth point cloud containing the ith point in the second candidate point cloud, wherein the distance between each point in the fourth point cloud and at least one point in the fourth point cloud is smaller than or equal to the target radius, and each point in the fourth point cloud is used for identifying a target traffic identification object of the target type; fitting the fourth point cloud to obtain TSR information corresponding to the ith point, wherein i is a positive integer which is greater than or equal to 1 and less than or equal to N.
5. The method of claim 1, wherein determining the relative positions of the targets corresponding to the lidar and the semantic camera at the current time based on the first lane line information, the first TSR information, the second lane line information, and the second TSR information comprises:
determining a first preliminary relative position corresponding to the laser radar and the semantic camera in the current time according to the first lane line information and the second lane line information, wherein the first preliminary relative position is used for representing the preliminary relative position of the target object identified by the laser radar and the target object identified by the semantic camera on a horizontal plane, and the horizontal plane is a plane where the semantic camera is located;
And according to the first TSR information and the second TSR information, the first preliminary relative position is adjusted to obtain a first relative position corresponding to the laser radar and the semantic camera in the current time, and according to the first TSR information and the second TSR information, a second relative position corresponding to the laser radar and the semantic camera in the current time is determined, wherein the target relative position comprises the first relative position and the second relative position, the first relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the horizontal plane, the second relative position is used for representing the relative position of the target object identified by the laser radar and the target object identified by the semantic camera on the vertical plane, and the vertical plane is a plane perpendicular to the horizontal plane.
6. The method of claim 5, wherein determining a first preliminary relative position of the lidar and the semantic camera at the current time based on the first lane line information and the second lane line information comprises:
Starting from a preset first initial relative position, adjusting a first current relative position until a first target condition is met, and determining the first current relative position when the first target condition is met as the first preliminary relative position;
wherein the first target condition includes: the sum of the loss function values of M points is smaller than or equal to a preset first target threshold, wherein the M points are M points of the lane line recognized by the laser radar in the space where the target vehicle is located, and the M points are obtained by projecting the first lane line information to a second coordinate system where the semantic camera is located according to the first current relative position;
wherein the loss function values of the M points are obtained by:
inputting a first projection position of a jth point in the M points under the second coordinate system and a first reference position of the jth point under the second coordinate system into a preset first loss function to obtain a loss function value of the jth point, wherein the loss function value of the jth point is used for representing the position difference between the projection position of the jth point under the second coordinate system and the reference position of the jth point under the second coordinate system, M is a positive integer greater than or equal to 1, and j is a positive integer greater than or equal to 1 and less than or equal to M.
7. The method of claim 6, wherein the adjusting the first preliminary relative position according to the first TSR information and the second TSR information to obtain the first relative position corresponding to the lidar and the semantic camera at the current time includes:
starting from the first preliminary relative position, adjusting the first current relative position until a second target condition is met, and determining the first current relative position when the second target condition is met as the first relative position;
wherein the second target condition includes: the sum of the loss function values of Q points is smaller than or equal to the second target threshold, wherein the Q points are Q points of a traffic recognition object recognized by the laser radar in the space where the target vehicle is located, and the Q points are obtained by projecting the first TSR information to the second coordinate system according to the first current relative position;
wherein the loss function values of the Q points are obtained by:
and inputting a second projection position of a p-th point in the Q points under the second coordinate system and a second reference position of the p-th point under the second coordinate system into a preset second loss function to obtain a loss function value of the p-th point, wherein the loss function value of the p-th point is used for representing the position difference between the projection position of the p-th point under the second coordinate system and the reference position of the p-th point under the second coordinate system, Q is a positive integer greater than or equal to 1, and p is a positive integer greater than or equal to 1 and less than or equal to Q.
8. The method of claim 5, wherein adjusting a second initial relative position between the lidar and the semantic camera based on the first TSR information and the second TSR information to obtain the second relative position comprises:
starting from a preset second initial relative position, adjusting a second current relative position until a third target condition is met, and determining the second current relative position when the third target condition is met as the second relative position;
wherein the third target condition includes: the sum of loss function values of L points is smaller than or equal to a third target threshold, wherein the L points are L points of a traffic recognition object recognized by the laser radar in a space where the target vehicle is located, and the L points are obtained by projecting the first TSR information to the second coordinate system according to the second current relative position;
wherein the loss function values of the L points are obtained by the following steps:
and inputting a second projection position of a q-th point in the L points under the second coordinate system and a second reference position of the q-th point under the second coordinate system into a preset third loss function to obtain a loss function value of the q-th point, wherein the loss function value of the q-th point is used for representing the position difference between the projection position of the q-th point under the second coordinate system and the reference position of the q-th point under the second coordinate system, L is a positive integer greater than or equal to 1, and q is a positive integer greater than or equal to 1 and less than or equal to L.
9. The method of claim 1, wherein after the determining the relative positions of the targets corresponding to the lidar and the semantic camera at the current time based on the first lane line information, the first TSR information, the second lane line information, and the second TSR information, the method further comprises:
performing semantic recognition on a fifth point cloud sensed by the laser radar on the target vehicle to obtain fourth lane line information and fourth TSR information, wherein the fourth lane line information is used for identifying a lane line identified by the laser radar in the space where the target vehicle is located at the next time of the current time, and the fourth TSR information is used for identifying a traffic sign object identified by the laser radar in the space where the target vehicle is located at the next time;
obtaining fifth lane line information and fifth TSR information output by the semantic camera on the target vehicle, wherein the fifth lane line information is used for identifying a lane line identified by the semantic camera at the next time in a space where the target vehicle is located, and the fifth TSR information is used for identifying a traffic identification object identified by the semantic camera at the next time in the space where the target vehicle is located;
And determining the next relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information, the fourth TSR information, the fifth lane line information, the fifth TSR information and the first TSR information.
10. The method of claim 9, wherein the determining the next relative position of the lidar and the semantic camera at the next time based on the fourth lane line information, the fourth TSR information, and the fifth lane line information, the fifth TSR information, and the first TSR information comprises:
performing union operation on the point of the traffic recognition object in the first TSR information and the point of the traffic recognition object in the fourth TSR information to obtain sixth TSR information;
determining a second preliminary relative position corresponding to the laser radar and the semantic camera at the next time according to the fourth lane line information and the fifth lane line information;
and adjusting the second preliminary relative position according to the fifth TSR information and the sixth TSR information to obtain a third relative position corresponding to the laser radar and the semantic camera in the next time, and determining a fourth relative position corresponding to the laser radar and the semantic camera in the next time according to the fifth TSR information and the sixth TSR information, wherein the next relative position comprises the third relative position and the fourth relative position.
11. A relative position determining apparatus, comprising:
the system comprises a first identification module, a first traffic sign identification system TSR (traffic sign recognition) and a second identification module, wherein the first identification module is used for carrying out semantic identification on a first point cloud sensed by a laser radar on a target vehicle at the current time to obtain first lane line information and first traffic sign identification system TSR information, the first lane line information is used for identifying a lane line identified by the laser radar at the current time in a space where the target vehicle is located, and the first TSR information is used for identifying a traffic sign object identified by the laser radar at the current time in the space where the target vehicle is located;
the first acquisition module is used for acquiring second lane line information and second TSR information output by the semantic camera on the target vehicle at the current time, wherein the second lane line information is used for identifying lane lines identified by the semantic camera at the current time in a space where the target vehicle is located, and the second TSR information is used for identifying traffic identification objects identified by the semantic camera at the current time in the space where the target vehicle is located;
the first determining module is configured to determine, according to the first lane line information, the first TSR information, the second lane line information, and the second TSR information, a target relative position corresponding to the laser radar and the semantic camera at the current time, where the target relative position is used to represent a relative position between a target object identified by the laser radar and the target object identified by the semantic camera, and the target object includes a lane line or a traffic sign object.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program, when run, performs the method of any one of claims 1 to 10.
13. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, the processor being arranged to perform the method of any of claims 1 to 10 by means of the computer program.
CN202310401378.1A 2023-04-13 2023-04-13 Method and device for determining relative position, storage medium and electronic device Pending CN116430404A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310401378.1A CN116430404A (en) 2023-04-13 2023-04-13 Method and device for determining relative position, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310401378.1A CN116430404A (en) 2023-04-13 2023-04-13 Method and device for determining relative position, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN116430404A true CN116430404A (en) 2023-07-14

Family

ID=87092291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310401378.1A Pending CN116430404A (en) 2023-04-13 2023-04-13 Method and device for determining relative position, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN116430404A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576652A (en) * 2024-01-19 2024-02-20 福思(杭州)智能科技有限公司 Road object identification method and device, storage medium and electronic equipment
CN117576652B (en) * 2024-01-19 2024-04-26 福思(杭州)智能科技有限公司 Road object identification method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110148144B (en) Point cloud data segmentation method and device, storage medium and electronic device
CN108303103B (en) Method and device for determining target lane
US11085775B2 (en) Methods and systems for generating and using localisation reference data
CN108345822B (en) Point cloud data processing method and device
US20240271952A1 (en) Method, apparatus, computer program, and computer-readable recording medium for producing high-definition map
CN105793669B (en) Vehicle position estimation system, device, method, and camera device
US20220343758A1 (en) Data Transmission Method and Apparatus
CN110859043A (en) System and method for updating highly automated driving map
CN110795819A (en) Method and device for generating automatic driving simulation scene and storage medium
CN112740225B (en) Method and device for determining road surface elements
CN113885062A (en) Data acquisition and fusion equipment, method and system based on V2X
CN111190199B (en) Positioning method, positioning device, computer equipment and readable storage medium
CN111353522A (en) Method and system for determining road signs in the surroundings of a vehicle
CN111353453B (en) Obstacle detection method and device for vehicle
US20200279395A1 (en) Method and system for enhanced sensing capabilities for vehicles
CN113743171A (en) Target detection method and device
CN114639085A (en) Traffic signal lamp identification method and device, computer equipment and storage medium
CN117576652B (en) Road object identification method and device, storage medium and electronic equipment
CN110765224A (en) Processing method of electronic map, vehicle vision repositioning method and vehicle-mounted equipment
CN115294544A (en) Driving scene classification method, device, equipment and storage medium
CN116430404A (en) Method and device for determining relative position, storage medium and electronic device
CN113988197A (en) Multi-camera and multi-laser radar based combined calibration and target fusion detection method
CN112689234A (en) Indoor vehicle positioning method and device, computer equipment and storage medium
CN115471574B (en) External parameter determination method and device, storage medium and electronic device
EP3273408B1 (en) Device and method for determining a distance to a vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination