WO2023283929A1 - Method and apparatus for calibrating external parameters of binocular camera - Google Patents


Info

Publication number
WO2023283929A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
straight lines
reconstructed
binocular camera
error
Prior art date
Application number
PCT/CN2021/106747
Other languages
French (fr)
Chinese (zh)
Inventor
黄海晖
何启盛
张建军
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2021/106747
Priority to CN202180094173.2A
Publication of WO2023283929A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Definitions

  • the present application relates to the field of data processing, in particular to a method and device for calibrating external parameters of a binocular camera.
  • Camera calibration refers to the process of obtaining camera parameters.
  • the camera parameters include internal parameters and external parameters.
  • the internal parameters are the parameters of the camera itself, and the external parameters are parameters related to the installation position of the camera, such as pitch angle, roll angle, and yaw angle.
  • the binocular camera can obtain dense depth information and realize functions such as distance measurement of the target object.
  • a small angular change in the extrinsic parameters significantly affects the output of the binocular camera, for example the distance measurement results. Therefore, the accuracy of the extrinsic calibration directly affects the accuracy of the binocular camera's results.
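The sensitivity described above is easy to quantify with the classic pinhole stereo model Z = f·B/d. The sketch below uses hypothetical rig parameters (f = 1000 px, B = 0.12 m; not taken from this publication) to show how a yaw miscalibration of only a few hundredths of a degree biases the disparity and therefore the measured distance:

```python
import math

def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Classic pinhole stereo depth: Z = f * B / d."""
    return f_px * baseline_m / disparity_px

def depth_error_from_yaw(f_px: float, baseline_m: float, true_depth_m: float,
                         yaw_error_rad: float) -> float:
    """Approximate depth bias caused by a small yaw miscalibration.

    A yaw error of delta radians shifts one image horizontally by roughly
    f * delta pixels, which biases the disparity by the same amount
    (small-angle approximation; illustrative model only).
    """
    true_disparity = f_px * baseline_m / true_depth_m
    biased_disparity = true_disparity + f_px * yaw_error_rad
    return depth_from_disparity(f_px, baseline_m, biased_disparity) - true_depth_m

# On this hypothetical rig, a yaw error of only 0.05 degrees shifts a
# 50 m depth estimate by more than 10 m.
err = depth_error_from_yaw(1000.0, 0.12, 50.0, math.radians(0.05))
```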
  • the external parameter calibration of the binocular camera on the traditional production line usually relies on the target, for example, a checkerboard calibration board or a QR code calibration board.
  • One solution is to adjust the angle of the target with a mechanical arm or manually, so that the binocular camera can acquire images of the checkerboard calibration board at various angles and complete the calibration of the extrinsic parameters. This solution is complex, time-consuming, and relies on expensive equipment.
  • Another solution is to arrange a large number of targets; the binocular camera is mounted on a vehicle, and the extrinsic calibration is completed while the vehicle is driven. This solution requires maintaining a large number of targets, and high-precision placement of the targets is difficult to achieve.
  • in addition, feature point extraction and matching between the two cameras can also be used to calibrate the extrinsic parameters through the epipolar constraint.
  • however, this solution cannot guarantee that each calibration result is accurate, that is, the stability of the calibration results is difficult to guarantee, which affects the takt time of the production line.
  • in view of this, the present application provides a method and device for calibrating the extrinsic parameters of a binocular camera, which adjust the extrinsic parameters through geometric constraints among multiple straight lines in three-dimensional space, thereby improving the calibration accuracy of the extrinsic parameters of the binocular camera.
  • in a first aspect, a method for calibrating the extrinsic parameters of a binocular camera is provided, including: acquiring a first image and a second image, the first image being obtained by shooting a shooting scene with a first camera of the binocular camera and the second image by shooting the same scene with a second camera of the binocular camera; extracting m straight lines from each of the first image and the second image, where m is an integer greater than 1 and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image; reconstructing, based on the extrinsic parameters of the binocular camera, n of the m straight lines in the first image and the corresponding n straight lines in the second image into three-dimensional space to obtain n reconstructed straight lines, where the n lines in the first image and the n lines in the second image are projections of n straight lines in the shooting scene, 1 < n ≤ m, and n is an integer; and adjusting the extrinsic parameters of the binocular camera according to a reconstruction error, the reconstruction error being determined from the positional relationship between the n reconstructed lines and the positional relationship between the n lines in the shooting scene.
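The steps of this aspect can be illustrated with a toy reconstruction: project two parallel 3D lines into a synthetic stereo pair, triangulate them back under candidate extrinsics, and keep the extrinsic value that minimises the parallelism error of the reconstructed lines. Everything below (the intrinsics, baseline, a single pitch parameter as the only unknown, and a grid search instead of a real optimiser) is an illustrative assumption, not the claimed procedure:

```python
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])   # hypothetical intrinsics
BASELINE = 0.12                   # metres; hypothetical rig

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

P_LEFT = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # left camera = world frame

def right_projection(pitch):
    """Right-camera projection matrix for a candidate pitch extrinsic."""
    R = rot_x(pitch)                    # right-camera orientation in the world
    C = np.array([BASELINE, 0.0, 0.0])  # right-camera centre
    return K @ np.hstack([R.T, (-R.T @ C).reshape(3, 1)])

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) two-view triangulation of one point."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

# Two parallel vertical "poles" at different depths, each given by endpoints.
LINES = [(np.array([-1.0, -1.0, 10.0]), np.array([-1.0, 1.0, 10.0])),
         (np.array([1.0, -1.0, 12.0]), np.array([1.0, 1.0, 12.0]))]
TRUE_PITCH = 0.005  # the unknown extrinsic error to recover (rad)

def parallelism_error(candidate_pitch):
    """Angle between the two reconstructed line directions (0 when parallel)."""
    P_obs = right_projection(TRUE_PITCH)        # how the images were really formed
    P_cand = right_projection(candidate_pitch)  # extrinsics under evaluation
    dirs = []
    for a, b in LINES:
        pa = triangulate(P_LEFT, P_cand, project(P_LEFT, a), project(P_obs, a))
        pb = triangulate(P_LEFT, P_cand, project(P_LEFT, b), project(P_obs, b))
        d = pb - pa
        dirs.append(d / np.linalg.norm(d))
    return float(np.arccos(np.clip(abs(dirs[0] @ dirs[1]), 0.0, 1.0)))

def calibrate_pitch():
    """Grid-search the pitch that minimises the parallelism (reconstruction) error."""
    return min((i * 0.001 for i in range(11)), key=parallelism_error)
```

With the correct extrinsic, the parallelism error vanishes; with a wrong one, the two lines reconstruct at depth-dependent tilts, which is exactly the geometric constraint the method exploits.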
  • the first image and the second image are two images in the binocular image, that is, two images captured synchronously by two cameras in the binocular camera.
  • the shooting scene includes one or more calibration objects, and the calibration objects refer to the objects photographed by the binocular camera. That is, the first image and the second image include imaging of the calibration object.
  • the calibration object includes a horizontal object or a vertical object and the like.
  • horizontal objects include road markings.
  • vertical objects include rods or pillars and the like.
  • the m straight lines of the first image and the m straight lines of the second image are the projections of the m straight lines in the shooting scene.
  • the projection of the m straight lines in the shooting scene can also be understood as the imaging of the m straight lines in the shooting scene in the binocular camera.
  • m can be the same or different. That is to say, the number of straight lines extracted in different binocular images may be the same or different.
  • n can be the same or different. That is to say, the number of reconstructed straight lines in different binocular images may be the same or different.
  • Obtaining the reconstructed straight lines can also be understood as obtaining the spatial positions of the n reconstructed straight lines.
  • the binocular camera may be a vehicle-mounted camera, and the reconstructed spatial positions of the n straight lines may be represented by the coordinates of the n straight lines in the ego vehicle coordinate system.
  • Adjusting the extrinsic parameters of the binocular camera according to the reconstruction error may be adjusting the extrinsic parameters of the binocular camera according to the reconstruction error of one or more frames of binocular images.
  • in the solution of the embodiment of the present application, the positional relationship between the straight lines in the shooting scene is used as a geometric constraint: the extrinsic parameters of the binocular camera are adjusted according to the difference between the positional relationship of the n reconstructed lines and the positional relationship of the n lines in the shooting scene. There is no need to precisely locate the three-dimensional position coordinates of the lines in the shooting scene, which avoids the impact of the accuracy of those coordinates on the calibration result and improves the calibration accuracy of the extrinsic parameters of the binocular camera.
  • the solution of the embodiment of the present application can further improve the calibration accuracy by adding geometric constraints.
  • the solutions in the embodiments of the present application have strong generalization ability and are applicable to various calibration scenarios.
  • the calibration objects can use common elements on open roads, such as street light poles or road markings, without pre-arranging the calibration scene. For example, there is no need to preset targets, which reduces costs.
  • adjusting the extrinsic parameters of the binocular camera according to the reconstruction error includes: adjusting the extrinsic parameters of the binocular camera according to the sum of the reconstruction errors of multiple frames of binocular images.
  • adjusting the extrinsic parameters of the binocular camera according to the reconstruction error includes: adjusting the extrinsic parameters of the binocular camera according to the average value of the reconstruction errors of multiple frames of binocular images.
  • in the above solution, the extrinsic parameters of the binocular camera are adjusted through the accumulation of the reconstruction errors of multiple frames of binocular images, which can reduce the impact of line detection errors and improve the accuracy of the extrinsic calibration.
  • the reconstruction error includes at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines. The angle error is determined from the difference between the angle between at least two of the n reconstructed lines and the angle between the corresponding lines among the n lines in the shooting scene; the distance error is determined from the difference between the distance between at least two of the n reconstructed lines and the distance between the corresponding lines among the n lines in the shooting scene.
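The two error terms can be sketched as follows, assuming each reconstructed line is represented by a point and a direction vector (the weighting and the way the terms are combined are illustrative choices, not specified by the publication):

```python
import numpy as np

def line_angle(p0, d0, p1, d1) -> float:
    """Angle in radians between two 3D line directions (orientation-agnostic)."""
    d0 = d0 / np.linalg.norm(d0)
    d1 = d1 / np.linalg.norm(d1)
    return float(np.arccos(np.clip(abs(d0 @ d1), 0.0, 1.0)))

def line_distance(p0, d0, p1, d1, parallel_tol=1e-9) -> float:
    """Shortest distance between two 3D lines (handles the parallel case)."""
    d0 = d0 / np.linalg.norm(d0)
    d1 = d1 / np.linalg.norm(d1)
    n = np.cross(d0, d1)
    if np.linalg.norm(n) < parallel_tol:  # parallel: point-to-line distance
        w = p1 - p0
        return float(np.linalg.norm(w - (w @ d0) * d0))
    return float(abs((p1 - p0) @ n) / np.linalg.norm(n))

def reconstruction_error(lines_3d, expected_angle, expected_dist,
                         w_angle=1.0, w_dist=1.0) -> float:
    """Weighted sum of angle and distance residuals against the known scene.

    `lines_3d` is a list of two (point, direction) pairs; `expected_angle`
    and `expected_dist` come from the shooting scene, e.g. 0 rad and a
    measured gap for two parallel lane markings. Weights are illustrative.
    """
    (p0, d0), (p1, d1) = lines_3d[0], lines_3d[1]
    e_ang = abs(line_angle(p0, d0, p1, d1) - expected_angle)
    e_dst = abs(line_distance(p0, d0, p1, d1) - expected_dist)
    return w_angle * e_ang + w_dist * e_dst
```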
  • the at least two straight lines in the shooting scene include at least two parallel straight lines.
  • Adjusting the external parameters according to the parallel error and the distance error can ensure the accuracy of the external parameters.
  • the solution of the embodiment of the present application needs only two parallel lines with a known distance in the shooting scene to calibrate the extrinsic parameters of the binocular camera, which reduces the number of error terms in the reconstruction error, thereby reducing the amount of calculation and speeding up the adjustment of the extrinsic parameters, that is, improving the calibration efficiency.
  • extracting m straight lines from the first image and the second image respectively includes: performing instance segmentation on the first image and the second image respectively to obtain the instances in the first image and the instances in the second image; and extracting m straight lines from the instances in the first image and the instances in the second image respectively, where the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined from the correspondence between the instances in the first image and the instances in the second image.
  • in the above solution, the correspondence between the straight lines is determined through the correspondence between the instances in the two images, which improves the accuracy of line matching and hence of the extrinsic calibration, reduces computational complexity, and improves calibration efficiency.
  • extracting m straight lines from the instances in the first image and the instances in the second image respectively includes: extracting a plurality of original straight lines from the instances in each image; fitting the multiple original straight lines of a same-side edge of an instance in the first image into one target straight line for that edge; and fitting the multiple original straight lines of a same-side edge of an instance in the second image into one target straight line for that edge, where the m straight lines belong to the target straight lines.
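Fitting the original lines of a same-side edge into a single target line can be sketched as a total-least-squares fit over the pooled edge points. The PCA formulation below is one standard way to do this (and is stable for the near-vertical edges of poles, where slope-intercept fitting degenerates); it is an illustration, not necessarily the fitting used in the publication:

```python
import numpy as np

def fit_target_line(segments):
    """Fit one target line to the points of several roughly collinear segments.

    Each segment is an (N, 2) array of image points sampled from the
    same-side edge of an instance (e.g. the left edge of a pole).
    Total least squares via PCA; returns (centroid, unit direction).
    """
    pts = np.vstack(segments).astype(float)
    centroid = pts.mean(axis=0)
    # Principal direction of the centred points = direction of the fitted line.
    _, _, Vt = np.linalg.svd(pts - centroid)
    return centroid, Vt[0]
```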
  • in the above solution, the original straight lines on the same-side edge of an instance are fitted to obtain a more accurate target straight line, and the target straight lines are used to calibrate the extrinsic parameters of the binocular camera, which helps improve the accuracy of the calibration result.
  • instance segmentation is performed on the first image and the second image respectively to obtain the instances in the first image and the instances in the second image, including: performing semantic segmentation on the first image and the second image respectively to obtain the semantic segmentation result of the first image and the semantic segmentation result of the second image, where the semantic segmentation result of the first image includes the horizontal objects or vertical objects in the first image and the semantic segmentation result of the second image includes the horizontal objects or vertical objects in the second image; performing instance segmentation on the first image based on the semantic segmentation result of the first image to obtain the instances in the first image; and performing instance segmentation on the second image based on the semantic segmentation result of the second image to obtain the instances in the second image.
  • in the above solution, the horizontal objects and vertical objects in the image are distinguished through semantic segmentation, so that geometric constraints between the horizontal or vertical objects in the shooting scene, such as perpendicularity constraints, can be used to adjust the extrinsic parameters of the binocular camera.
  • when the binocular camera is a vehicle-mounted camera, road markings and poles are common on open roads, so calibration objects on an open road can be used to adjust the extrinsic parameters of the binocular camera without pre-arranging a calibration site, which reduces costs.
  • the method further includes: controlling the display to display the calibration status of the extrinsic parameters of the binocular camera.
  • the display may be a vehicle-mounted display.
  • the current calibration situation can be displayed in real time, which is beneficial for the user to know the current calibration progress and improves the user experience.
  • the calibration situation of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction error, or p reconstructed straight lines, where the p reconstructed lines are obtained by reconstructing p of the m straight lines in the first image and the corresponding p straight lines in the second image into three-dimensional space based on the current extrinsic parameters of the binocular camera, 1 < p ≤ m, and p is an integer.
  • the current extrinsic parameters of the binocular camera may be adjusted extrinsic parameters of the binocular camera.
  • the current extrinsic parameters of the binocular camera may be the optimal extrinsic parameters of the binocular camera during the adjustment process.
  • the optimal external parameter of the binocular camera during the adjustment process may be the external parameter that minimizes the reconstruction error during the adjustment process.
  • the calibration result is visualized and the three-dimensional space position of the reconstructed straight line is displayed, which is helpful for the user to intuitively experience the current calibration situation.
  • the current calibration progress includes at least one of the following: current extrinsic parameters of the binocular camera or current calibration completion.
  • the situation of the current reconstruction error includes at least one of the following: the current reconstruction error, the current distance error, or the current angle error.
  • in a second aspect, a device for calibrating the extrinsic parameters of a binocular camera is provided, including: an acquisition unit configured to acquire a first image and a second image, the first image being obtained by shooting a shooting scene with a first camera of the binocular camera and the second image by shooting the same scene with a second camera of the binocular camera; and a processing unit configured to: extract m straight lines from each of the first image and the second image, where m is an integer greater than 1 and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image; reconstruct, based on the extrinsic parameters of the binocular camera, n of the m straight lines in the first image and the corresponding n straight lines in the second image into three-dimensional space to obtain n reconstructed straight lines, where the n lines in the first image and the n lines in the second image are projections of n straight lines in the shooting scene, 1 < n ≤ m, and n is an integer; and adjust the extrinsic parameters of the binocular camera according to a reconstruction error, the reconstruction error being determined from the positional relationship between the n reconstructed lines and the positional relationship between the n lines in the shooting scene.
  • the reconstruction error includes at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines. The angle error is determined from the difference between the angle between at least two of the n reconstructed lines and the angle between the corresponding lines among the n lines in the shooting scene; the distance error is determined from the difference between the distance between at least two of the n reconstructed lines and the distance between the corresponding lines among the n lines in the shooting scene.
  • the at least two straight lines in the shooting scene include at least two parallel straight lines.
  • the processing unit is specifically configured to: perform instance segmentation on the first image and the second image respectively to obtain the instances in the first image and the instances in the second image; and extract m straight lines from the instances in the first image and the instances in the second image respectively, where the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined from the correspondence between the instances in the first image and the instances in the second image.
  • the processing unit is specifically configured to: perform semantic segmentation on the first image and the second image respectively to obtain the semantic segmentation result of the first image and the semantic segmentation result of the second image, where the semantic segmentation result of the first image includes the horizontal objects or vertical objects in the first image and the semantic segmentation result of the second image includes the horizontal objects or vertical objects in the second image; perform instance segmentation on the first image based on the semantic segmentation result of the first image to obtain the instances in the first image; and perform instance segmentation on the second image based on the semantic segmentation result of the second image to obtain the instances in the second image.
  • the device further includes: a display unit, configured to display the calibration situation of the extrinsic parameters of the binocular camera.
  • the calibration situation of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the situation of the current reconstruction error, or p reconstructed straight lines, where the p reconstructed lines are obtained by reconstructing p of the m straight lines in the first image and the corresponding p straight lines in the second image into three-dimensional space based on the current extrinsic parameters of the binocular camera, 1 < p ≤ m, and p is an integer.
  • the current calibration progress includes at least one of the following: current extrinsic parameters of the binocular camera or current calibration completion.
  • the situation of the current reconstruction error includes at least one of the following: the current reconstruction error, the current distance error, or the current angle error.
  • the binocular camera is a vehicle-mounted camera, and the vehicle on which the binocular camera is carried may be in a stationary state or in a moving state.
  • a device for calibrating external parameters of a binocular camera includes a processor, the processor is coupled with a memory, the memory is used to store computer programs or instructions, and the processor is used to execute the computer programs or instructions stored in the memory , so that the method in the first aspect or any implementation manner in the first aspect is executed.
  • the device includes one or more processors.
  • the device may further include a memory coupled to the processor.
  • the device may include one or more memories.
  • the memory can be integrated with the processor, or set separately.
  • the device may also include a data interface.
  • in a fourth aspect, a computer-readable medium is provided, which stores program code for execution by a device, the program code including instructions for executing the method in the above-mentioned first aspect or any implementation manner of the first aspect.
  • in a fifth aspect, a computer program product including instructions is provided; when the computer program product is run on a computer, it causes the computer to execute the method in the above first aspect or any one of the implementation manners of the first aspect.
  • in a sixth aspect, a chip is provided, including a processor and a data interface, where the processor reads, through the data interface, instructions stored in a memory, and executes the method in the above-mentioned first aspect or any implementation manner of the first aspect.
  • the chip may further include a memory, the memory stores instructions, the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the The processor is configured to execute the method in the foregoing first aspect or any implementation manner of the first aspect.
  • in a seventh aspect, a terminal is provided, including the device in the second aspect or any one implementation manner of the second aspect.
  • the terminal further includes a binocular camera.
  • the terminal may be a vehicle.
  • FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a method for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of the imaging principle of a binocular camera provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of another method for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a calibration site provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of a semantic segmentation result of a binocular image provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of an instance labeling result of a binocular image provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of target lines in a binocular image provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of the spatial positions of reconstructed straight lines provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of the current calibration situation provided by an embodiment of the present application;
  • FIG. 11 is a schematic diagram of a device for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application;
  • FIG. 12 is a schematic diagram of another device for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application.
  • Smart device refers to any kind of equipment, apparatus or machine with computing power.
  • the smart devices in the embodiments of the present application may be robots, autonomous vehicles, intelligent assisted driving vehicles, unmanned aerial vehicles, intelligent assisted aircraft, smart home devices, and the like. This application does not impose any limitation on the smart device. Any device that can be equipped with a binocular camera can be included in the scope of smart devices of this application.
  • the method provided by the embodiment of the present application can be applied to automatic driving, drone navigation, robot navigation, industrial non-contact detection, 3D reconstruction, virtual reality and other scenarios that require binocular camera calibration.
  • the method in the embodiment of the present application can be applied in an automatic driving scenario, and the automatic driving scenario is briefly introduced below.
  • vehicle 110 may be configured in a fully or partially autonomous driving mode.
  • vehicle 110 may control itself while in an autonomous driving mode: it may determine the current state of the vehicle and its surrounding environment, determine the likely behavior of at least one other vehicle in the surrounding environment, determine a confidence level corresponding to the likelihood of that other vehicle performing the possible behavior, and control vehicle 110 based on the determined information.
  • the vehicle 110 may be set to operate without human interaction.
  • the mobile data center (MDC) 120 is an automatic driving computing platform, which is used to process various sensor data and provide decision support for automatic driving.
  • Vehicle 110 includes a sensor system.
  • the sensor system includes several sensors that sense information about the environment around the vehicle 110 .
  • the sensor system may include a positioning system (for example, a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU), a radar, a laser range finder, and a binocular camera.
  • the calibration module 121 is used to determine the external parameters of the sensor.
  • the sensor system includes a binocular camera, and the calibration module 121 is used to determine external parameters of the binocular camera.
  • the calibration module 121 can calibrate the extrinsic parameters of the binocular camera according to the binocular image collected by the binocular camera.
  • the upper layer function module 122 can realize corresponding functions based on the external parameters of the binocular camera.
  • the external parameter calibration results of the binocular camera can be provided to the upper-level business of automatic driving.
  • the ranging function module can determine the distance between the obstacle and the vehicle from the images collected by the binocular camera according to the external parameters of the binocular camera.
  • the obstacle avoidance function module can identify, evaluate, and avoid or otherwise negotiate potential obstacles in the environment of the vehicle from the images collected by the binocular camera, according to the external parameters of the binocular camera.
  • the current calibration situation can be displayed through a human machine interface (human machine interface, HMI) 130 on the vehicle.
  • HMI 130 may be a vehicle display.
  • FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, and modules shown in the figure does not constitute any limitation. For example, in FIG. 1 the calibration module 121 and the upper-layer function module 122 are located in the MDC 120; in other cases, the calibration module 121 or the upper-layer function module can also be placed in other processors of the vehicle 110. Alternatively, some of the processes described above are performed by a processor disposed within the vehicle 110 while others are performed by a remote processor.
  • the MDC 120 is located outside the vehicle 110 and can communicate with the vehicle 110 wirelessly. In other cases, the MDC may be located inside the vehicle 110 . HMI 130 may be located inside vehicle 110 .
  • the solution of the embodiment of the present application can be applied to the calibration module 121 .
  • the solution in the embodiment of the present application may be executed by the calibration module 121 .
  • the solution of the embodiment of the present application can calibrate the extrinsic parameters of the binocular camera, improve the calibration efficiency, and update the calibrated extrinsic values in the system, providing high-precision extrinsic parameters for the upper-level business, which improves the accuracy of that business and, in turn, the performance of automatic driving.
  • M represents the transformation matrix between a three-dimensional space point X W and a two-dimensional image point X P , and may be called the projection matrix.
  • some elements of the projection matrix M represent camera parameters; camera calibration amounts to obtaining the projection matrix M.
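In homogeneous coordinates this relation is X P ~ M · [X W; 1] with M = K [R | t], where K holds the intrinsic parameters and [R | t] the extrinsic ones. A small numeric sketch with hypothetical intrinsics and identity extrinsics (values chosen only for illustration):

```python
import numpy as np

# Hypothetical camera parameters, for illustration only.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])      # intrinsics: focal length, principal point
R = np.eye(3)                         # extrinsic rotation: camera aligned with world
t = np.zeros((3, 1))                  # extrinsic translation: camera at world origin

M = K @ np.hstack([R, t])             # 3x4 projection matrix

def project(X_w: np.ndarray) -> np.ndarray:
    """Map a 3D world point X_W to a 2D image point X_P via x ~ M [X_W; 1]."""
    x = M @ np.append(X_w, 1.0)
    return x[:2] / x[2]               # perspective division
```

A point on the optical axis, e.g. (0, 0, 10), lands on the principal point (640, 360); moving it 1 m sideways shifts it by f/Z = 100 pixels.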
  • Camera parameters include intrinsic and extrinsic parameters.
  • the intrinsic parameters are parameters of the camera itself, for example, the focal length and the like.
  • the external parameters are parameters related to the installation position of the camera, such as pitch angle (pitch), roll angle (roll) and yaw angle (yaw).
  • the conversion matrix corresponding to the intrinsic parameters may be called the intrinsic matrix, and the conversion matrix corresponding to the extrinsic parameters may be called the extrinsic matrix.
  • Camera calibration generally requires a calibration reference object (also called a calibration object or a reference object).
  • the calibration reference object indicates the object captured by the camera during the camera calibration process.
  • the three-dimensional space point X W may be the coordinates of the calibration reference object in the world coordinate system
  • the two-dimensional image point X P may be the two-dimensional coordinates of the calibration reference object on the image plane of the camera.
  • the calibration of camera parameters is a very critical link.
  • the accuracy of the calibration results directly affects the accuracy of the results produced by the camera.
  • A binocular camera may also be called a stereo camera or a binocular sensor.
  • a binocular camera includes two cameras: a left camera and a right camera.
  • the binocular camera can obtain the depth information of the scene, and can reconstruct the three-dimensional shape and position of the surrounding scenery.
  • the purpose of binocular camera calibration is mainly to obtain the internal and external parameters of the left and right cameras.
  • the extrinsic parameters of the left and right cameras refer to the relative positional relationship between the left and right cameras, for example, the translation vector and rotation matrix of the right camera relative to the left camera.
  • the rotation matrix can also be expressed as pitch angle (pitch), roll angle (roll) and yaw angle (yaw).
  • the calibration of the external parameters of the binocular camera is a key link in image measurement or machine vision applications.
  • the accuracy of the calibration results directly affects the accuracy of the results produced by the binocular camera.
  • Embodiments of the present application provide a method and device for calibrating extrinsic parameters of a binocular camera, which can improve the accuracy of calibrating extrinsic parameters of a binocular camera.
  • FIG. 2 shows a schematic diagram of a method 200 for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application.
  • the method 200 includes steps S210 to S240.
  • the method 200 can be applied to the calibration of a vehicle-mounted camera, and the method 200 can be executed by a calibration module, which can be located on a vehicle-mounted computer platform.
  • the binocular image in the embodiment of the present application refers to two images captured synchronously by two cameras in a binocular camera.
  • Two images captured simultaneously can also be understood as two images captured at the same moment.
  • two images in a binocular image have the same timestamp.
  • acquiring a binocular image taken by a binocular camera can also be understood as acquiring two images with the same time stamp taken by a binocular camera.
  • Any binocular image in at least one frame of binocular images includes a first image and a second image.
  • the first image is obtained by shooting the shooting scene with the first camera in the binocular camera
  • the second image in the binocular image is obtained by shooting the shooting scene with the second camera in the binocular camera.
  • the shooting scene includes one or more calibration objects, and the calibration objects refer to the objects photographed by the binocular camera. That is, the first image and the second image include imaging of the calibration object.
  • the calibration object includes a horizontal object or a vertical object and the like.
  • horizontal objects include road markings.
  • vertical objects include rods or pillars and the like.
  • the first camera may be a left camera, and the second camera may be a right camera.
  • the first camera may be a right camera, and the second camera may be a left camera, which is not limited in this embodiment of the present application.
  • An image captured by the left camera may also be called a left-eye image, and an image captured by the right camera may also be called a right-eye image.
  • The terms "first" and "second" in "first image" and "second image" in the embodiment of the present application are only used to distinguish the two images in one frame of binocular images, and impose no other limitation.
  • the first images in different binocular images are different images, and the second images in different binocular images are different images.
  • At least one frame of binocular images may be obtained from shooting different shooting scenes, or may be obtained from the same shooting scene.
  • The same shooting scene means that the calibration objects in the shooting scenes are the same, and different shooting scenes mean that the calibration objects in the shooting scenes are different. That is to say, the calibration objects in different binocular images can be the same or different.
  • the binocular camera may be a vehicle-mounted camera, and the at least one frame of binocular images may be multiple frames of binocular images captured while the vehicle is driving on the calibration site.
  • the straight lines in the shooting scene are the straight lines in the three-dimensional space.
  • the imaging of the straight line in the three-dimensional space in the image coordinate system is the straight line in the image.
  • the straight line in the image is the projection of the straight line in the three-dimensional space.
  • The m straight lines of the first image and the m straight lines of the second image are the projections of the m straight lines in the shooting scene.
  • the projection of the m straight lines in the shooting scene can also be understood as the imaging of the m straight lines in the shooting scene in the binocular camera.
  • the multiple straight lines in the shooting scene can be understood as the straight lines in the calibration object in the shooting scene.
  • the multiple straight lines may be straight lines in one calibration object, or may be straight lines in multiple calibration objects.
  • step S220 can be understood as extracting a straight line from each frame of binocular images in the multiple frames of binocular images. In other words, step S220 is performed on the multiple frames of binocular images.
  • The value of m may be the same or different for different frames. That is to say, the number of straight lines extracted from different binocular images may be the same or different, which is not limited in this embodiment of the present application.
  • The n straight lines in the first image and the n straight lines in the second image are projections of the n straight lines in the shooting scene, where 1 ≤ n ≤ m and n is an integer.
  • the n straight lines in the first image belong to the m straight lines in the first image.
  • the n lines in the second image belong to the m lines in the second image.
  • Obtaining the reconstructed straight lines can also be understood as obtaining the spatial positions of the n reconstructed straight lines.
  • the binocular camera may be a vehicle-mounted camera, and the reconstructed spatial positions of the n straight lines may be represented by the coordinates of the n straight lines in the ego vehicle coordinate system.
  • the image obtained by shooting the straight line with the camera includes the imaging of the straight line, that is, the straight line in the image.
  • A straight line in three-dimensional space, its image in the image plane, and the optical center of the camera lie on the same plane. If the two cameras of a binocular camera capture a straight line in three-dimensional space at the same time, then the straight line, its image in the left-eye image, and the optical center of the left camera lie in plane 1#, while the straight line, its image in the right-eye image, and the optical center of the right camera lie in plane 2#.
  • The intersection of plane 1# and plane 2# is the straight line in three-dimensional space.
  • Since the left-eye image and the right-eye image contain projections of the same straight line, the plane defined by the straight line in the left-eye image and the optical center of the left camera, and the plane defined by the straight line in the right-eye image and the optical center of the right camera, can be obtained based on the intrinsic and extrinsic parameters of the binocular camera; the line along which these two planes intersect is the reconstructed straight line.
  • This is the process of reconstructing the straight lines in the images into three-dimensional space. The more accurate the extrinsic parameters of the binocular camera, the closer the spatial position of the reconstructed straight line is to that of the straight line in three-dimensional space.
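The two-plane intersection described above can be sketched in a few lines of linear algebra. This is an illustrative sketch only, not the claimed method: the function names, camera positions, and point coordinates are hypothetical, and a real implementation would obtain the plane points by back-projecting image-line points through the calibrated cameras.

```python
import numpy as np

def plane_through_points(p0, p1, p2):
    """Plane n·x + d = 0 through three non-collinear 3-D points."""
    p0 = np.asarray(p0, dtype=float)
    n = np.cross(np.asarray(p1, dtype=float) - p0,
                 np.asarray(p2, dtype=float) - p0)
    n = n / np.linalg.norm(n)
    return n, -float(np.dot(n, p0))

def intersect_planes(n1, d1, n2, d2):
    """Line (point, unit direction) along which two non-parallel planes meet."""
    direction = np.cross(n1, n2)
    direction = direction / np.linalg.norm(direction)
    # A point on the line: satisfy both plane equations plus an anchoring
    # constraint that picks the point closest to the origin along the line.
    A = np.vstack([n1, n2, direction])
    b = np.array([-d1, -d2, 0.0])
    point = np.linalg.solve(A, b)
    return point, direction

# Left optical center at the origin, right optical center 1 m to its right;
# both planes contain the scene line {(0, t, 2)} (coordinates are illustrative).
nL, dL = plane_through_points([0, 0, 0], [0, -1, 2], [0, 1, 2])
nR, dR = plane_through_points([1, 0, 0], [0, -1, 2], [0, 1, 2])
point, direction = intersect_planes(nL, dL, nR, dR)
# `point` lies on the scene line (x = 0, z = 2); `direction` is parallel to the y axis.
```

With accurate extrinsics the two back-projected planes intersect exactly along the scene line; with inaccurate extrinsics the recovered line drifts, which is what the reconstruction error below measures.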
  • the n straight lines in the shooting scene include straight line 1# and straight line 2#
  • the projections of straight line 1# and straight line 2# in the first image are respectively straight line 1# in the first image and straight line 2 in the first image #.
  • the projections of straight line 1# and straight line 2# in the second image are respectively called straight line 1# in the second image and straight line 2# in the second image.
  • By reconstructing the straight line 1# in the first image and the straight line 1# in the second image into space, the reconstructed straight line 1# can be obtained.
  • Likewise, by reconstructing the straight line 2# in the first image and the straight line 2# in the second image into space, the reconstructed straight line 2# can be obtained.
  • step S230 can be understood as reconstructing the straight lines in the multiple frames of binocular images into space based on the external parameters of the binocular cameras to obtain the multi-frame binocular The reconstructed straight line in the target image. In other words, step S230 is performed on the multiple frames of binocular images.
  • The value of n may be the same or different for different frames. That is to say, the number of reconstructed straight lines in different binocular images may be the same or different, which is not limited in this embodiment of the present application.
  • the reconstruction error is determined according to the positional relationship between the reconstructed n straight lines and the positional relationship between the n straight lines in the shooting scene.
  • Step S240 can be understood as adjusting the extrinsic parameters of the binocular camera with the goal of reducing the reconstruction error, or in other words, with the goal of minimizing the reconstruction error. That is to say, the extrinsic parameters of the binocular camera are used as independent variables to construct the equation of the reconstruction error, and the goal is to obtain the extrinsic parameters of the binocular camera that minimize the reconstruction error.
  • By using the positional relationship between the straight lines in the shooting scene as a geometric constraint, and adjusting the extrinsic parameters of the binocular camera according to the difference between the positional relationship of the reconstructed n straight lines and that of the n straight lines in the shooting scene, there is no need to precisely locate the three-dimensional position coordinates of the straight lines in the shooting scene. This avoids the impact of the accuracy of those position coordinates on the calibration result and improves the calibration accuracy of the extrinsic parameters of the binocular camera.
  • the solution of the embodiment of the present application can further improve the calibration accuracy by adding geometric constraints.
  • the solutions in the embodiments of the present application have strong generalization ability and are applicable to various calibration scenarios.
  • the calibration objects can use common elements on open roads, such as street light poles or road markings, without pre-arranging the calibration scene. For example, there is no need to preset targets, which reduces costs.
  • Adjusting the extrinsic parameters of the binocular camera according to the reconstruction error may be adjusting the extrinsic parameters of the binocular camera according to the reconstruction error of one or more frames of binocular images.
  • the reconstruction error of a frame of binocular image will be described below.
  • The reconstruction error of the frame of binocular image is used to indicate the difference between the positional relationship of the reconstructed n straight lines and the positional relationship of the n straight lines in the shooting scene.
  • The smaller the reconstruction error, the more closely the positional relationship of the reconstructed n straight lines conforms to that of the n straight lines in the shooting scene, and the higher the accuracy of the current extrinsic parameters of the binocular camera.
  • the adjusted extrinsic parameters of the binocular camera may be used as calibration values of the extrinsic parameters of the binocular camera.
  • The extrinsic parameters of the binocular camera in step S230 can be updated to the adjusted extrinsic parameters, and steps S230 to S240 repeated until extrinsic parameters of the binocular camera that meet a preset condition are obtained.
  • Those extrinsic parameters are then used as the calibration values of the extrinsic parameters of the binocular camera.
  • The preset condition may be that the reconstruction error is less than or equal to an error threshold.
  • the extrinsic parameters of the binocular camera that minimize the reconstruction error may be searched in the pose space of the extrinsic parameters of the binocular camera, and the searched extrinsic parameters may be used as calibration values of the extrinsic parameters of the binocular camera.
  • the extrinsic parameters of the binocular camera can be adjusted in a non-linear optimization manner, and the adjusted extrinsic parameters can be used as calibration values of the extrinsic parameters of the binocular camera.
  • the positional relationship between the n straight lines in the shooting scene can be used as a geometric constraint between the n straight lines.
  • step S240 can be understood as adjusting the extrinsic parameters of the binocular camera so that the reconstructed n straight lines meet the geometric constraints among the n straight lines in the shooting scene as much as possible. In other words, adjust the extrinsic parameters of the binocular camera according to the geometric constraints between n straight lines in the shooting scene.
  • the specific form of the geometric constraint is related to the positional relationship between n straight lines in the shooting scene.
  • For example, suppose the positional relationship between two straight lines in the shooting scene is that they intersect at an angle of 60 degrees.
  • The geometric constraints satisfied by the two straight lines may then include: the two straight lines intersect, with an included angle of 60 degrees.
  • The extrinsic parameters of the binocular camera can be adjusted so that the two reconstructed straight lines satisfy this geometric constraint as closely as possible.
  • the reconstruction error includes at least one of the following: an angle error between the reconstructed n straight lines or a distance error between the reconstructed n straight lines.
  • The angle error between the reconstructed n straight lines is determined according to the difference between the angle between at least two of the reconstructed n straight lines and the angle between the corresponding at least two of the n straight lines in the shooting scene.
  • the reconstructed at least two straight lines correspond to at least two straight lines in the shooting scene.
  • the angle error between the reconstructed n straight lines is used to constrain the angle between the reconstructed n straight lines. That is, the angular error can be used as an angular constraint.
  • the angle error of the reconstructed two straight lines is determined according to the difference between the angle between the reconstructed two straight lines and the angle between the two straight lines in the shooting scene.
  • For example, if the angle between two straight lines in the shooting scene is a, the angle error of the two reconstructed straight lines is the absolute value of the difference between the angle between the two reconstructed straight lines and a.
  • the angle error between the n reconstructed straight lines may be the angle error of the reconstructed two straight lines.
  • For example, two straight lines can be selected from the reconstructed n straight lines, and the angle error of those two straight lines can be used as the angle error between the n reconstructed straight lines.
  • Alternatively, the angle error between the reconstructed n straight lines may be the sum, or the average value, of the angle errors of the reconstructed at least two straight lines.
  • For example, three or more straight lines can be selected from the reconstructed n straight lines, and the sum, or the average value, of the angle errors between the selected straight lines is used as the angle error between the reconstructed n straight lines.
  • the at least two reconstructed straight lines include reconstructed straight line 1#, reconstructed straight line 2# and reconstructed straight line 3#.
  • the angle error between the reconstructed straight line 1# and the reconstructed straight line 2# is the angle error 1#
  • the angle error between the reconstructed straight line 1# and the reconstructed straight line 3# is the angle error 2#.
  • the angle error of the reconstructed n straight lines may be the sum of angle error 1# and angle error 2#, or the angle error may be the average value of angle error 1# and angle error 2#.
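The angle error over pairs of reconstructed line directions could be computed as in the following sketch. It is illustrative only: the mapping from a line pair (i, j) to its known scene angle is a hypothetical bookkeeping convention, not something specified by the embodiment.

```python
import numpy as np

def angle_deg(d1, d2):
    """Acute angle in degrees between two line directions."""
    c = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

def angle_error(recon_dirs, scene_angles):
    """Sum over line pairs of |reconstructed angle - known scene angle|.
    `scene_angles` maps a pair (i, j) to the known angle between scene
    lines i and j (hypothetical convention)."""
    return sum(abs(angle_deg(recon_dirs[i], recon_dirs[j]) - a)
               for (i, j), a in scene_angles.items())

# Two reconstructed directions that happen to be 90 degrees apart, compared
# against scene lines known to meet at 60 degrees: the angle error is 30.
dirs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
err = angle_error(dirs, {(0, 1): 60.0})
```

Averaging instead of summing, as the text allows, would divide by the number of pairs.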
  • The distance error between the reconstructed n straight lines is determined according to the difference between the distance between at least two of the reconstructed n straight lines and the distance between the corresponding at least two of the n straight lines in the shooting scene.
  • the reconstructed at least two straight lines correspond to at least two straight lines in the shooting scene.
  • the at least two straight lines in the shooting scene used when calculating the distance error include at least two parallel straight lines.
  • The at least two straight lines in the shooting scene may all be parallel to each other, or they may include multiple groups of parallel straight lines, where the straight lines within a group are parallel to each other and straight lines in different groups are not parallel; this is not limited in this embodiment of the present application.
  • The at least two straight lines used for the distance error between the reconstructed n straight lines may be the same as or different from the at least two straight lines used for the angle error between the reconstructed n straight lines.
  • the distance error between the reconstructed n straight lines is used to constrain the distance between the reconstructed straight lines. That is, the distance error can be used as a distance constraint.
  • the distance error of the reconstructed two straight lines is determined according to the difference between the distance between the reconstructed two straight lines and the distance between the two straight lines in the shooting scene.
  • For example, if the distance between two straight lines in the shooting scene is b, the distance error of the two reconstructed straight lines may be determined according to the difference between the distance between the two reconstructed straight lines and b.
  • the distance between the reconstructed two straight lines may be determined according to the distance between one or more points on one of the straight lines and the other straight line.
  • the one or more points can be set as required, for example, the one or more points are determined according to the depth value. For example, select a point on one of the lines at a depth of 0 meters and a point at a depth of 30 meters.
  • the average value of multiple distances between multiple points on one of the reconstructed straight lines and the other straight line is taken as the distance between the reconstructed two straight lines.
  • the distance error between the n reconstructed straight lines may be the distance error of the reconstructed two straight lines.
  • Alternatively, the distance error between the reconstructed n straight lines may be the sum, or the average value, of the distance errors of the reconstructed at least two straight lines.
  • the at least two reconstructed straight lines include reconstructed straight line 1#, reconstructed straight line 2# and reconstructed straight line 3#.
  • the distance error between the reconstructed straight line 1# and the reconstructed straight line 2# is the distance error 1#
  • the distance error between the reconstructed straight line 1# and the reconstructed straight line 3# is the distance error 2#.
  • The distance error between the reconstructed n straight lines may be the sum of the distance error 1# and the distance error 2#, or the average value of the distance error 1# and the distance error 2#.
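The line-to-line distance described above (sampling points on one reconstructed line and averaging their distances to the other line) could be sketched as follows. This is illustrative only: the sample parameters stand in for the depth values mentioned in the text (e.g. 0 m and 30 m), and all coordinates are hypothetical.

```python
import numpy as np

def point_line_distance(p, q, d):
    """Distance from point p to the line through q with unit direction d."""
    v = np.asarray(p, dtype=float) - q
    return float(np.linalg.norm(v - np.dot(v, d) * d))

def line_distance(p1, d1, p2, d2, params=(0.0, 30.0)):
    """Average distance from sample points on line 1 to line 2.
    Samples are taken at the given parameters along line 1's unit direction."""
    d1 = np.asarray(d1, dtype=float); d1 = d1 / np.linalg.norm(d1)
    d2 = np.asarray(d2, dtype=float); d2 = d2 / np.linalg.norm(d2)
    p2 = np.asarray(p2, dtype=float)
    samples = [np.asarray(p1, dtype=float) + t * d1 for t in params]
    return float(np.mean([point_line_distance(s, p2, d2) for s in samples]))

# Two reconstructed parallel lines 3 m apart, compared against a known
# scene distance of 3.5 m: the distance error is 0.5.
dist = line_distance([0, 0, 0], [0, 0, 1], [3, 0, 0], [0, 0, 1])
error = abs(dist - 3.5)
```

Sampling at two well-separated points, rather than one, also penalizes reconstructed lines that are tilted relative to the true line.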
  • the at least two straight lines in the shooting scene include at least two parallel straight lines.
  • In this case, the angle error includes a parallel error.
  • the reconstruction error includes a parallel error and a distance error.
  • the parallel error is used to constrain the parallel relationship between the reconstructed lines. That is, the parallel error can be used as a parallel constraint.
  • the angle between two parallel straight lines may be zero.
  • the parallel error may be the angle between the reconstructed straight lines.
  • the distance between two parallel straight lines in the shooting scene may be determined through a high-precision map.
  • the distance between two parallel straight lines in the shooting scene can also be measured by other sensors.
  • the method for determining the distance between the straight lines in the shooting scene in the embodiment of the present application is not limited.
  • For example, the at least two straight lines in the shooting scene may be two parallel straight lines. That is to say, the reconstructed straight lines are constrained by the positional relationship between two parallel straight lines in the shooting scene.
  • the reconstruction error may include a parallel error and a distance error between the two reconstructed straight lines.
  • Adjusting the external parameters according to the parallel error and the distance error can ensure the accuracy of the external parameters.
  • The solution of the embodiment of the present application needs only two parallel lines with a known distance in the shooting scene to calibrate the extrinsic parameters of the binocular camera, which reduces the number of error terms in the reconstruction error, thereby reducing the amount of computation and increasing the speed at which the extrinsic parameters are adjusted, that is, improving the calibration efficiency of the extrinsic parameters.
  • the at least two straight lines in the shooting scene include at least two mutually perpendicular straight lines.
  • In this case, the angle error includes a vertical error.
  • the vertical error is used to constrain the vertical relationship between the reconstructed lines. That is, the vertical error can be used as a vertical constraint.
  • The vertical error reflects the difference between the angle between the two reconstructed straight lines and the angle between the two mutually perpendicular straight lines in the shooting scene.
  • For example, the vertical error term between the two reconstructed straight lines can be the absolute value of the difference between the angle between the two reconstructed straight lines and 90 degrees.
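The parallel and vertical errors are the angle error specialized to target angles of 0 and 90 degrees. A minimal sketch (illustrative only; the direction vectors are hypothetical):

```python
import numpy as np

def acute_angle_deg(d1, d2):
    """Acute angle in degrees between two line directions."""
    c = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

def parallel_error(d1, d2):
    """Scene lines are parallel, so the target angle is 0 degrees."""
    return acute_angle_deg(d1, d2)

def vertical_error(d1, d2):
    """Scene lines are perpendicular, so the target angle is 90 degrees."""
    return abs(acute_angle_deg(d1, d2) - 90.0)

# Residual angle between two reconstructed lines that should be parallel,
# and the vertical error for a pair that is reconstructed exactly at 90 degrees.
pe = parallel_error([1.0, 0.0, 0.0], [1.0, 0.1, 0.0])
ve = vertical_error([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```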
  • step S240 may be to adjust the extrinsic parameters of the binocular camera according to the reconstruction errors of the multi-frame binocular images.
  • step S240 includes: adjusting the extrinsic parameters of the binocular camera according to the sum of reconstruction errors of multiple frames of binocular images.
  • step S240 includes: adjusting the extrinsic parameters of the binocular camera according to the average value of the reconstruction errors of multiple frames of binocular images.
  • Step S240 can be understood as taking the extrinsic parameters of the binocular camera as variables, constructing the equation of the reconstruction error over the multiple frames of binocular images, solving for the extrinsic parameters of the binocular camera that minimize that reconstruction error, and using those extrinsic parameters as the calibration values of the extrinsic parameters of the binocular camera.
  • the reconstruction error of each frame of binocular images can be calculated according to the description above, and will not be repeated here.
  • Adjusting the extrinsic parameters of the binocular camera through the accumulation of reconstruction errors over multiple frames of binocular images can reduce the impact of line detection errors and improve the accuracy of the extrinsic parameter calibration of the binocular camera.
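The multi-frame accumulation and the search in the pose space of the extrinsic parameters could be sketched as a per-axis grid search, as below. This is a simplified illustration, not the claimed method: `error_fn` is a hypothetical stand-in for the real per-frame reconstruction error, the search is a single pass per axis, and a real implementation might instead use nonlinear optimization as the text notes.

```python
import numpy as np

def calibrate(frames, error_fn, init, deltas=np.linspace(-0.01, 0.01, 21)):
    """Per-axis grid search over (pitch, roll, yaw) minimising the
    reconstruction error summed over all frames.

    `error_fn(extrinsics, frame)` is assumed to return one frame's
    reconstruction error; a real implementation would reconstruct the
    frame's straight lines with `extrinsics` and evaluate the
    angle/distance error terms.
    """
    best = np.asarray(init, dtype=float)
    best_cost = sum(error_fn(best, f) for f in frames)
    for axis in range(3):            # pitch, roll, yaw
        center = best[axis]
        for d in deltas:             # candidate offsets around the current value
            cand = best.copy()
            cand[axis] = center + d
            cost = sum(error_fn(cand, f) for f in frames)
            if cost < best_cost:     # keep the extrinsics with the smallest error
                best, best_cost = cand, cost
    return best, best_cost

# Toy stand-in error whose minimum lies at pitch = 0.005 rad in every frame.
toy_error = lambda extr, frame: (extr[0] - 0.005) ** 2 + extr[1] ** 2 + extr[2] ** 2
best, cost = calibrate(frames=[1, 2], error_fn=toy_error, init=[0.0, 0.0, 0.0])
```

Summing over frames is what gives the accumulation effect: a line-detection outlier in one frame is outvoted by the other frames.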
  • the method 200 further includes: controlling the display to display the calibration situation of the extrinsic parameters of the binocular camera.
  • the display may be a vehicle-mounted display.
  • the on-board display can display the calibration of the external parameters of the binocular camera in real time.
  • the calibration conditions of the extrinsic parameters of the binocular camera include at least one of the following: current calibration progress, current reconstruction error conditions, or reconstructed p straight lines.
  • The reconstructed p straight lines are obtained by reconstructing, based on the current extrinsic parameters of the binocular camera, the p straight lines among the m straight lines of the first image and the p straight lines among the m straight lines of the second image into three-dimensional space, where 1 ≤ p ≤ m and p is an integer.
  • The p straight lines in the first image and the p straight lines in the second image are matched straight lines, and they are projected into three-dimensional space.
  • the current extrinsic parameters of the binocular camera may be adjusted extrinsic parameters of the binocular camera.
  • the current extrinsic parameters of the binocular camera may be the optimal extrinsic parameters of the binocular camera during the adjustment process.
  • the optimal external parameter of the binocular camera during the adjustment process may be the external parameter that minimizes the reconstruction error during the adjustment process.
  • the calibration results are visualized by reconstructing the spatial positions of the straight lines.
  • The straight lines reconstructed when visualizing the calibration results and the straight lines reconstructed when adjusting the extrinsic parameters may correspond to the same straight lines in the shooting scene, or to different straight lines in the shooting scene.
  • the embodiment of the application does not limit this.
  • The calibration result is visualized by displaying the three-dimensional spatial positions of the reconstructed straight lines, which helps the user intuitively perceive the current calibration situation.
  • the current calibration progress includes at least one of the following: current extrinsic parameters of the binocular camera or current calibration completion.
  • the extrinsic parameters of the current binocular camera can be expressed in the form of yaw, pitch and roll.
  • the extrinsic parameters of the current binocular camera can be represented in the form of a rotation matrix. This embodiment of the present application does not limit it.
  • the current calibration completion degree can be determined according to the current reconstruction error.
  • the current reconstruction error refers to the value of the reconstruction error corresponding to the extrinsic parameters of the current binocular camera, that is, the value of the reconstruction error obtained based on the extrinsic parameters of the current binocular camera.
  • the current extrinsic parameters of the binocular camera may be the extrinsic parameters of the optimal binocular camera during the adjustment process, and the current reconstruction error is the smallest reconstruction error during the adjustment process.
  • the current calibration completion degree can be determined according to the difference between the current reconstruction error and the error threshold.
  • the current calibration completion degree may be the difference, or a percentage determined according to the difference. That is to say, the smaller the difference between the current reconstruction error and the error threshold, the higher the current calibration completion.
  • the current calibration completion degree can be determined according to the current search times of the external parameters.
  • the extrinsic parameters of the binocular camera that minimize the reconstruction error may be searched in the pose space of the extrinsic parameters of the binocular camera.
  • the current calibration completion degree can be determined according to the current search times and the search times threshold. The closer the current search times are to the search times threshold, the higher the current calibration completion.
  • the current calibration completion degree may be determined according to the number of currently processed binocular image frames.
  • the extrinsic parameters of the binocular camera can be adjusted according to the reconstruction errors of multiple frames of binocular images.
  • the current calibration completion degree can be determined according to the number of frames of currently processed binocular images and the total number of frames of binocular images that need to be processed. The closer the number of frames of currently processed binocular images is to the total number of frames of binocular images that need to be processed, the higher the current calibration completion. For example, if the total number of binocular images to be processed is 50 frames, and 30 frames of the 50 images have been processed, the current calibration completion degree may be 60%.
  • the current calibration completion degree may be the current reconstruction error.
  • the situation of the current reconstruction error may include at least one of the following: a current reconstruction error, a current distance error, or a current angle error.
  • constraints currently used for calibration can be displayed in the context of the current reconstruction error.
  • the reconstruction error is determined according to the angle error and the distance error
  • The situation of the current reconstruction error may include the current reconstruction error, the current angle error, and the current distance error.
  • the current reconstruction error is a value determined according to the current angle error and the current distance error.
  • the current calibration progress and the current reconstruction error can quantitatively display the current calibration situation.
  • the current calibration situation can be displayed in real time, which is beneficial for the user to know the current calibration progress and improves the user experience.
  • In step S221, the projections of the multiple straight lines in the shooting scene in the first image and the second image can be acquired.
  • In step S221, the correspondence between the straight lines in the shooting scene, the straight lines in the first image, and the straight lines in the second image can also be obtained.
  • step S221 includes step S2211 to step S2213 (not shown in the figure).
  • Steps S2211 to S2213 will be described below.
  • only one frame of binocular image is used as an example for illustration in steps S2211 to S2213, and straight lines can also be extracted in the same way in other binocular images, which will not be repeated here.
  • S2211. Perform instance segmentation on the first image and the second image respectively, to obtain instances in the first image and instances in the second image.
  • Instance segmentation is performed on an image to obtain different instances in the image.
  • the instance segmentation of the image can obtain the instance to which the pixel in the image belongs.
  • step S2211 may be implemented through step 11) and step 12).
  • Step 11 performing semantic segmentation on the first image and the second image respectively, to obtain the semantic segmentation result of the first image and the semantic segmentation result of the second image.
  • the semantic segmentation result of the image includes the semantic information corresponding to the pixels in the image.
  • the semantic information corresponding to a pixel can also be understood as the category to which the pixel belongs.
  • each image may be processed through a semantic segmentation network to obtain a semantic segmentation result.
  • the semantic segmentation network can adopt the existing neural network model, for example, deeplabv3 and so on.
  • Semantic segmentation networks can be trained using public datasets. The specific training process is the prior art, and will not be repeated here.
•   the categories output by the semantic segmentation network may include horizontal objects and vertical objects. That is, the semantic segmentation network is able to distinguish whether a pixel in an image belongs to a horizontal object or a vertical object.
  • the semantic segmentation result of the first image includes horizontal objects or vertical objects in the first image.
  • the semantic segmentation result of the second image includes horizontal objects or vertical objects in the second image.
  • semantic segmentation is performed on the first image to obtain pixels belonging to horizontal objects and pixels belonging to vertical objects in the first image.
  • Perform semantic segmentation on the second image to obtain pixels belonging to horizontal objects and pixels belonging to vertical objects in the second image.
  • the binocular camera in the embodiment of the present application may be a vehicle-mounted camera.
•   the horizontal objects may include road markings.
  • the road markings may include solid road markings or dashed road markings.
  • Vertical objects may include rods or columns, among others.
  • poles may include street light poles and the like.
•   the horizontal objects or vertical objects in the image are distinguished through semantic segmentation, so that the geometric constraints between the horizontal objects or the vertical objects in the shooting scene, such as perpendicularity constraints, can be used to adjust the extrinsic parameters of the binocular camera.
•   when the binocular camera is a vehicle-mounted camera, road markings or poles are common on open roads, so calibration objects on an open road can be used to adjust the external parameters of the binocular camera; there is no need to arrange a calibration site in advance, which reduces costs.
  • the semantic information may be set according to the category of the calibration object.
  • the semantic segmentation network can be trained to output other types of semantic information.
  • the semantic segmentation network can also be trained to distinguish between triangular objects or square objects, which is not limited in this embodiment of the present application.
  • Step 12 according to the semantic segmentation result of the first image, the first image is instance-segmented to obtain the instance in the first image; according to the semantic segmentation result of the second image, the second image is instance-segmented to obtain the instance in the second image instance.
  • Instance segmentation of an image according to the result of semantic segmentation refers to distinguishing different individuals among the pixels of the same semantics, that is, distinguishing the instance to which the pixel in the image belongs.
  • An instance represents an individual.
  • the input can be the coordinates of all pixels with the same semantics
  • the output can be the instance to which the pixel belongs.
  • a clustering method can be used to distinguish different individuals in pixels of the same semantics.
  • different individuals can be distinguished by density-based spatial clustering of applications with noise (Dbscan), etc.
•   if two pixels are grouped into the same cluster, the two pixels belong to the same instance.
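The density-based clustering step above can be sketched with a minimal DBSCAN over pixel coordinates. This is an illustrative stand-in, not the embodiment's implementation; parameter values and the distance metric are assumptions.

```python
import math

def dbscan(points, eps=2.0, min_pts=3):
    """Minimal DBSCAN: returns one cluster label per point (-1 = noise).
    Pixels grouped into the same cluster are treated as one instance."""
    labels = [None] * len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisional noise
            continue
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: absorb, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs_j = neighbors(j)
            if len(nbrs_j) >= min_pts:
                seeds.extend(nbrs_j)  # core point: keep expanding
        cluster += 1
    return labels
```

Two well-separated groups of pixel coordinates come back with two distinct labels, i.e. two instances.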
  • the correspondence between the instance in the first image and the instance in the second image may be determined according to the location of the instance in the first image and the location of the instance in the second image.
  • Instances with a corresponding relationship correspond to the same calibration object in the shooting scene.
  • the instances with corresponding relationship in the image are projections of the same calibration object in the shooting scene.
  • the position of the instance in the image may be the absolute position of the instance in the image, for example, the coordinates of the instance in the image.
  • the position of an instance in the image may also be a relative position among multiple instances in the image.
  • the first image and the second image are images taken by the two cameras of the binocular camera for the same shooting scene, and the difference between the two images is relatively small. That is to say, the positions of the projections of the same calibration object in the two images are close. Therefore, the correspondence between instances in two images can be determined by location.
  • instance annotation may be performed on the first image and the second image respectively, to obtain the annotation information of the instance in the first image and the annotation information of the instance in the second image.
  • Instance labeling of images refers to labeling instances in images. Different annotation information in an image is used to indicate different instances in the image.
  • the annotation information may be an instance number. Different instance numbers in an image are used to indicate different instances in the image.
  • instances in the image are labeled according to their locations in the image.
  • instances with the same relative position are labeled with the same instance number.
  • the corresponding relationship between the instance in the first image and the instance in the second image can be indicated by the annotation information of the instance. For example, there is a correspondence between instances with the same instance number in the first image and the second image.
  • the corresponding relationship between the instance in the first image and the instance in the second image can be obtained.
  • matching the instance in the first image with the instance in the second image can be realized by respectively performing instance annotation on the first image and the second image.
•   the correspondence between the instances in the first image and the instances in the second image may be a correspondence between all the instances in the first image and all the instances in the second image, or a correspondence between some of the instances in the first image and some of the instances in the second image.
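One possible reading of the position-based matching above is to label instances by their relative left-to-right order in each image and pair equal labels; because the two views differ only slightly, the order is preserved. This sketch is illustrative; the patent does not fix a concrete matching rule.

```python
def match_instances(left_centroids, right_centroids):
    """Pair instances between the two images by relative position:
    rank each image's instance centroids left to right, then pair
    instances with equal rank. Returns (left_index, right_index) pairs."""
    def left_to_right(cents):
        # indices sorted by centroid x coordinate
        return sorted(range(len(cents)), key=lambda i: cents[i][0])
    return list(zip(left_to_right(left_centroids),
                    left_to_right(right_centroids)))
```

For example, three pillars appearing in the same order in both views are paired index-to-index even if their absolute pixel positions differ.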
  • the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to the correspondence between the instances in the first image and the instances in the second image.
  • step S2212 may be implemented through steps 21) to 23).
  • a plurality of original straight lines may be extracted in an instance by using a machine vision method.
  • the original straight line can be extracted in the instance by the hough transform.
•   a small region of interest (region of interest, ROI) may be set; the instance edge pixels in the ROI are projected into a straight-line parameter space, and by setting a threshold on the number of points in the parameter space, the straight lines in the instance, namely the original straight lines, are extracted.
  • the extracted instance information of the original straight line in the instance is used to indicate the position of the original straight line in the instance, for example, on the left or right side of the instance.
  • the position of the original straight line in the example is related to the setting manner of the ROI, which is not limited in this embodiment of the present application.
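The Hough-transform voting described above can be sketched as follows: each edge pixel votes for every (theta, rho) parameter bin it could lie on, and bins whose vote count reaches a threshold are reported as lines. Bin sizes and the threshold here are illustrative assumptions.

```python
import math

def hough_lines(points, theta_steps=180, rho_res=1.0, threshold=4):
    """Minimal Hough transform over edge-pixel coordinates.
    A line is parameterized as rho = x*cos(theta) + y*sin(theta);
    bins with at least `threshold` votes are returned as (theta, rho)."""
    votes = {}
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (t, round(rho / rho_res))   # quantize the parameter space
            votes[key] = votes.get(key, 0) + 1
    return [(math.pi * t / theta_steps, r * rho_res)
            for (t, r), v in votes.items() if v >= threshold]
```

Four collinear edge pixels on the vertical line x = 5 all vote into the bin (theta = 0, rho = 5), so that line is detected.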
•   multiple original straight lines on the same-side edge of an instance in the first image are fitted into one target straight line for that side edge of the instance in the first image; multiple original straight lines on the same-side edge of the instance in the second image are fitted into one target straight line for that side edge of the instance in the second image.
  • Multiple original straight lines may be extracted from one edge of an instance, for example, multiple original straight lines may be extracted from the left side of an instance.
•   multiple original straight lines extracted from one side edge of the instance can be fitted into one straight line, and the fitted straight line is the target straight line of that side edge of the instance.
  • Step 22) is an optional step.
  • an original straight line may also be selected from a plurality of original straight lines extracted from one side edge of the instance as the target straight line of the side edge of the instance.
  • the original straight line can be used as the target straight line of the side edge of the instance.
  • the fitting may be performed in a random sample consensus (RANSAC) manner.
  • two points are randomly selected from the boundary points of multiple original straight lines on the same side of an instance, and a straight line is determined according to the two points.
•   the straight line can be expressed by the slope k and the intercept b, for example, the straight line can be represented as (k, b).
  • a number of midpoints of the plurality of original straight lines lying on the straight line is determined.
  • the number of midpoints of the multiple original straight lines located on the straight line can also be understood as the number of midpoints of the multiple original straight lines that the straight line passes through.
  • the straight line passing through the midpoints of the plurality of original straight lines with the largest number may be used as the target straight line.
•   the straight line passing through the largest number of midpoints of the plurality of original straight lines may also be further processed to obtain the target straight line.
•   the straight line passing through the largest number of midpoints of the multiple original straight lines is called the intermediate straight line.
•   for example, take the point in each row of the intermediate straight line as the center point, and in each row determine the point with the largest pixel gradient among the 5 pixels to the left and the 5 pixels to the right of the center point; the pixel point with the largest gradient in each row is used as an original point.
  • two points are randomly selected from the original points
  • a straight line is determined according to the two points
  • the number of original points passed by the straight line is determined.
  • the straight line with the largest number of original points can be used as the target straight line.
  • the pixel gradient at the boundary is usually large, and searching for the pixel point with the largest pixel gradient is beneficial to find a more accurate boundary and improve the accuracy of line extraction.
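The RANSAC fitting loop described above can be sketched as follows: repeatedly sample two boundary points, form the candidate line (k, b), and keep the line passing through the largest number of segment midpoints. The iteration count and tolerance are illustrative; vertical candidates are skipped in this sketch for simplicity.

```python
import random

def ransac_line(midpoints, boundary_points, iters=200, tol=1.0, seed=0):
    """RANSAC-style line fit: two boundary points are drawn at random,
    define y = k*x + b, and the candidate through (within tol of) the
    most midpoints wins. Returns ((k, b), inlier_count)."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(boundary_points, 2)
        if x1 == x2:
            continue  # skip vertical candidates in this sketch
        k = (y2 - y1) / (x2 - x1)
        b = y1 - k * x1
        count = sum(abs(y - (k * x + b)) <= tol for x, y in midpoints)
        if count > best_count:
            best, best_count = (k, b), count
    return best, best_count
```

With boundary points mostly on y = 2x + 1 plus one outlier, the fitted target line recovers (k, b) = (2, 1) and passes through all four midpoints.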
  • the extracted instance information of the target straight line of the instance is used to indicate the position of the target straight line in the instance, for example, the target straight line is located on the left or right side of the instance.
  • the corresponding relationship between the target straight line in the first image and the target straight line in the second image is determined.
  • the two straight lines having a corresponding relationship are projections of the same straight line in the shooting scene on the first image and the second image.
  • step S2213 all target straight lines in the first image may be matched with all target straight lines in the second image.
  • part of the target straight line in the first image may also be matched with the target straight line in the second image, which is not limited in this embodiment of the present application.
  • the m straight lines in the first image and the second image are the target straight lines having a corresponding relationship between the first image and the second image.
  • the corresponding relationship between the target straight line in the first image and the target straight line in the second image is determined according to the instance information of the target straight line.
  • the corresponding relationship between the target straight line in the first image and the target straight line in the second image is determined according to the corresponding relationship between the instances in the first image and the second image and the position of the target straight line in the instance.
  • the target straight line with the same position is the straight line with the corresponding relationship between the first image and the second image.
  • instance 1# in the first image corresponds to instance 1# in the second image.
  • the instance information of line a in the first image indicates that line a is located on the left side of instance 1# in the first image
•   the instance information of line b in the second image indicates that line b is located on the left side of instance 1# in the second image.
  • the straight line a in the first image and the straight line b in the second image are straight lines with a corresponding relationship.
  • steps S2211 to S2213 are only one possible line matching method, and other methods in the prior art can also be used to determine the correspondence between the lines in the two images, which is not limited in the embodiment of the present application .
  • the correspondence between the straight lines is determined through the correspondence between the instances in the two images, which can improve the accuracy of straight line matching, improve the accuracy of external parameter calibration, reduce computational complexity, and improve The efficiency of external reference calibration is improved.
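The instance-information-based line matching above can be sketched as follows: a target line in the first image is paired with the target line in the second image that has the same (instance label, side-within-instance) key. The field names `instance`, `side`, and `name` are illustrative, not from the embodiment.

```python
def match_lines(left_lines, right_lines):
    """Pair target lines across the two images: lines attached to
    corresponding instances and located at the same position within
    the instance are projections of the same 3D line."""
    right_by_key = {(ln["instance"], ln["side"]): ln for ln in right_lines}
    pairs = []
    for ln in left_lines:
        key = (ln["instance"], ln["side"])
        if key in right_by_key:
            pairs.append((ln["name"], right_by_key[key]["name"]))
    return pairs
```

In the example above, line a (left side of instance 1# in the first image) is paired with line b (left side of instance 1# in the second image).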
  • the corresponding relationship between the straight lines in the image and the straight lines in the shooting scene can be determined by the relative positions between the straight lines in the shooting scene and the relative positions between the straight lines in the image, which will not be repeated here.
  • FIG. 4 shows another method 400 for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application.
•   the method 400 may be regarded as a specific implementation manner of the method 200. Therefore, for the content not described in detail in the method 400, reference may be made to the method 200 above, and for the sake of brevity, descriptions are appropriately omitted in the following.
  • the method for calibrating the extrinsic parameters of the binocular camera provided in the embodiment of the present application can be applied to the vehicle camera system.
  • the binocular camera in the embodiment of the present application is a vehicle-mounted camera, and the vehicle where the binocular camera is located may be in a stationary state or in a moving state.
  • the method for calibrating the extrinsic parameters of the binocular camera is described by taking the binocular camera as a vehicle-mounted camera as an example, which does not limit the application scenario of the embodiment of the present application.
  • Fig. 5 shows a schematic diagram of a calibration site.
•   the calibration field in Fig. 5 is provided with road lines and vertical poles. The road lines or vertical poles can be used as calibration objects for the binocular camera.
  • the method 400 includes step S410 to step S440. Steps S410 to S440 will be described below.
  • At least one frame of binocular images captured by a binocular camera is acquired.
  • the left-eye image in the binocular image is captured by the left camera in the binocular camera
  • the right-eye image in the binocular image is captured by the right camera in the binocular camera.
  • the at least one frame of binocular images may be multiple frames of binocular images captured by the vehicle while driving on the calibration site.
  • the vehicle can acquire multiple frames of binocular images after driving a short distance, and complete the calibration of the extrinsic parameters of the binocular camera.
  • Step S410 corresponds to step S210, and for a specific description, refer to step S210, which will not be repeated here.
  • step S420 includes step S421 to step S425.
  • semantic segmentation is performed on the left-eye image and the right-eye image respectively, and a semantic segmentation result of the left-eye image and a semantic segmentation result of the right-eye image are obtained.
  • the left-eye image and the right-eye image are respectively processed by using the semantic segmentation network to obtain a semantic segmentation result of the left-eye image and a semantic segmentation result of the right-eye image.
  • the output of a semantic segmentation network includes two types of semantics: road lines or vertical objects. That is to say, the semantic segmentation network can distinguish whether the pixel points of objects in the image belong to road lines or vertical objects.
  • Fig. 6(a) shows the semantic segmentation result of the left-eye image
  • Fig. 6(b) shows the semantic segmentation result of the right-eye image.
  • the vertical objects and road lines in the left-eye image and right-eye image are distinguished through semantic segmentation.
  • Step S421 corresponds to step 11) in step S2211.
  • instance annotation is performed on the left-eye image according to the semantic segmentation result of the left-eye image, and the annotation information of the instances in the left-eye image is obtained.
  • instance annotation is performed on the right-eye image, and the annotation information of the instance in the right-eye image is obtained.
  • instance segmentation is performed on the first image according to the semantic segmentation result of the first image to obtain the instance in the first image; an instance in the first image is annotated according to the position of the instance in the first image to obtain the annotation of the instance information.
  • the second image is instance-segmented according to the semantic segmentation result of the second image to obtain instances in the second image; the instances in the second image are annotated according to the positions of the instances in the second image to obtain annotation information of the instances.
  • the labeled information of the example includes: left 1 pillar, left 2 pillar, right 1 pillar, right 2 pillar, lane line 1, lane line 2, lane line 3, lane line 4 and lane line 5.
  • the annotation information of instances with corresponding relations in the left-eye image and the right-eye image is the same. That is to say, the annotation information of the instance in the left-eye image and the annotation information of the instance in the right-eye image are used to indicate the correspondence between the instance in the left-eye image and the instance in the right-eye image.
•   only some instances are marked in step S422, and more or fewer instances can be marked as required in practical applications.
  • Step S422 corresponds to step 12) in step S2211.
  • a plurality of original straight lines are extracted from the instances of the left-eye image and the right-eye image respectively.
  • step S423 straight line extraction may be performed on the instances marked in step S422.
  • line extraction is performed on instances with annotation information.
  • Step S423 corresponds to step 21) in step S2212.
  • multiple original straight lines on the same-side edge of the instance in the left-eye image are fitted to a target straight line on the side edge of the instance in the left-eye image.
  • FIG. 8( b ) shows the target straight line in the right-eye image.
  • straight lines on both sides of the example may be extracted, or only straight lines on one side of the example may be extracted, which is not limited in the present application.
  • Step S424 corresponds to step 22) in step S2212, and the specific description may refer to the previous description, which will not be repeated here.
  • m may be 12. That is, the corresponding relationship between the 12 straight lines in (a) in FIG. 8 and the 12 straight lines in (b) in FIG. 8 is obtained.
  • Step S425 corresponds to step S2213, and the specific description may refer to the previous description, which will not be repeated here.
  • n straight lines in the left-eye image and the n straight lines in the right-eye image are projections of the n straight lines in the shooting scene.
  • n is 12 as shown in FIG. 9 .
  • the straight lines in the left-eye image and right-eye image in Figure 8 are reconstructed into space, and the spatial positions of the reconstructed 12 straight lines are obtained.
  • the reconstruction error is determined according to the positional relationship between the reconstructed n straight lines and the positional relationship between the n straight lines in the shooting scene.
  • the reconstruction error is determined according to the positional relationship between two straight lines in the reconstructed n straight lines and the positional relationship between two straight lines in the n straight lines in the shooting scene.
  • the reconstruction error includes at least one of the following: angle error or distance error.
  • the reconstruction error will be described below by taking the reconstructed horizontal line l1 and horizontal line l4 as examples.
  • the reconstruction error satisfies the following formula:
  • f i (yaw, pitch, roll) represents the reconstruction error of the i-th binocular image obtained based on the external parameters of the binocular camera
•   yaw, pitch, and roll are the Euler-angle representation of the external parameters of the binocular camera.
  • the above formula only takes the horizontal line l1 and the horizontal line l4 in the shooting scene as an example to illustrate the reconstruction error, and the reconstruction error can also be calculated according to other parallel lines in the shooting scene, for example, according to the vertical line L1 and Vertical line L5 calculates the reconstruction error.
  • the above formula can also be used for the reconstruction error between other parallel lines, as long as the straight line in the above formula is replaced with the corresponding straight line.
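A per-pair reconstruction error of the kind described above can be sketched as the angle between the two reconstructed direction vectors plus the deviation of the reconstructed spacing from the known spacing (d1 for l1 and l4). The exact terms and weights of the patent's formula are not reproduced; this is a hedged illustration only.

```python
import math

def pair_error(dir1, dir2, p1, p2, expected_dist):
    """Illustrative error for two reconstructed lines that are parallel
    in the shooting scene: angle between directions + |distance - d|.
    dir1/dir2 are direction vectors; p1/p2 are points on each line."""
    def unit(v):
        n = math.sqrt(sum(c * c for c in v))
        return tuple(c / n for c in v)
    d1, d2 = unit(dir1), unit(dir2)
    cosang = min(1.0, abs(sum(a * b for a, b in zip(d1, d2))))
    angle_err = math.acos(cosang)   # zero when the lines stay parallel
    # distance between parallel lines: component of (p2 - p1) orthogonal to d1
    w = tuple(b - a for a, b in zip(p1, p2))
    proj = sum(a * b for a, b in zip(w, d1))
    ortho = tuple(c - proj * e for c, e in zip(w, d1))
    dist = math.sqrt(sum(c * c for c in ortho))
    return angle_err + abs(dist - expected_dist)
```

Two reconstructed lines that remain parallel at the expected spacing give zero error; any rotation or spacing drift introduced by wrong extrinsics increases it.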
•   the reconstruction error may also be determined according to the positional relationships among q of the reconstructed n straight lines and the positional relationships among the corresponding q straight lines in the shooting scene.
  • q is an integer greater than 2 and less than or equal to n.
  • the reconstruction error satisfies the following formula:
  • the first term is used to calculate the angular error between the reconstructed horizontal lines.
•   the angle error between the four reconstructed horizontal lines is the sum of the angle errors between each of the reconstructed l 2 , l 3 , l 4 and the reconstructed l 1 .
•   the first item in the above formula is only an example, and the angle error between the reconstructed horizontal lines can also be calculated in other ways; for example, the angle error between the four reconstructed horizontal lines can also be the average value of the angle errors between the other reconstructed horizontal lines and the reconstructed l 1 .
  • the angle errors between the four reconstructed horizontal lines may also be the sum of the angle errors between the reconstructed horizontal lines and the reconstructed l2 .
  • the present application does not limit the specific calculation method of the angular error between the reconstructed horizontal lines, as long as the angular error between the reconstructed horizontal lines can constrain the parallel relationship between the reconstructed horizontal lines.
  • the second term is used to calculate the angular error between the reconstructed perpendiculars.
•   the angle error between the 8 reconstructed vertical lines is the sum of the angle errors between each of the other reconstructed vertical lines, such as the reconstructed L 2 , L 3 , L 4 , L 5 , L 6 and L 7 , and the reconstructed L 1 .
•   the second item in the above formula is only an example, and the angle error between the reconstructed vertical lines can also be calculated in other ways; for example, it can also be the average value of the angle errors between the other reconstructed vertical lines and the reconstructed L 1 .
•   the angle error between the reconstructed vertical lines may also be the sum of the angle errors between the other reconstructed vertical lines and the reconstructed L 2 .
  • the present application does not limit the specific calculation method of the angle error between the reconstructed perpendicular lines, as long as the angle error between the reconstructed perpendicular lines can constrain the parallel relationship between the reconstructed perpendicular lines.
  • the third term is used to calculate the angular error between the reconstructed vertical and horizontal lines.
  • the angle error between the reconstructed vertical line and the horizontal line is the angle error between the reconstructed horizontal line l 1 and the reconstructed L 1 .
•   the third item in the above formula is only an example, and the angle error between the reconstructed vertical line and the horizontal line can also be calculated in other ways; for example, it can also be the angle error between another reconstructed vertical line and another reconstructed horizontal line.
  • the angle error between the reconstructed vertical line and the horizontal line may also be the sum of the angle errors between each vertical line and each horizontal line after reconstruction.
  • the reconstructed angle error between the vertical line and the horizontal line may also be an average value of the angle errors between the reconstructed vertical lines and each horizontal line.
•   This application does not limit the specific calculation method of the angle error between the reconstructed vertical line and the horizontal line, as long as this angle error can constrain the perpendicular relationship between the reconstructed vertical line and the horizontal line.
  • the fourth term is used to calculate the distance error between the reconstructed horizontal lines.
  • the distance error between the reconstructed horizontal lines is the distance error between the reconstructed horizontal line l 1 and the reconstructed l 4 .
•   the fourth item in the above formula is only an example, and the distance error between the reconstructed horizontal lines can also be calculated in other ways; for example, it can also be the distance error between other reconstructed horizontal lines.
  • the distance error between the horizontal lines may also be the sum of the distance errors between the reconstructed horizontal lines.
  • the distance error between the reconstructed horizontal lines may also be an average value of the distance errors between the reconstructed horizontal lines.
  • the present application does not limit the specific calculation method of the distance error between the reconstructed horizontal lines, as long as the distance error between the reconstructed horizontal lines can constrain the distance between the reconstructed horizontal lines.
  • the fifth term is used to calculate the distance error between the reconstructed vertical lines.
  • the distance error between the reconstructed vertical lines is the distance error between the reconstructed vertical line L 1 and the reconstructed L 3 .
  • d2 represents the distance between L1 and L3 in the shooting scene.
•   the fifth item in the above formula is only an example, and the distance error between the reconstructed vertical lines can also be calculated in other ways; for example, it can also be the distance error between other reconstructed vertical lines.
  • the distance error between the reconstructed vertical lines may also be the sum of the distance errors between the reconstructed vertical lines.
  • the distance error between the reconstructed vertical lines may also be an average value of the distance errors between the reconstructed vertical lines.
  • the present application does not limit the specific calculation method of the distance error between the reconstructed vertical lines, as long as the distance error between the reconstructed vertical lines can constrain the distance between the reconstructed vertical lines.
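The five terms described above can be composed into a single per-frame reconstruction error f_i. The patent leaves the exact weighting open, so an unweighted sum is assumed in this illustrative sketch.

```python
def frame_error(horiz_angle_errs, vert_angle_errs, cross_angle_err,
                horiz_dist_err, vert_dist_err):
    """Illustrative per-frame reconstruction error f_i combining:
    term 1: angle errors between reconstructed horizontal lines,
    term 2: angle errors between reconstructed vertical lines,
    term 3: angle error between a vertical and a horizontal line,
    term 4: distance error between reconstructed horizontal lines,
    term 5: distance error between reconstructed vertical lines."""
    return (sum(horiz_angle_errs)
            + sum(vert_angle_errs)
            + cross_angle_err
            + horiz_dist_err
            + vert_dist_err)
```

Each term vanishes when the reconstructed lines satisfy the corresponding parallel, perpendicular, or spacing constraint of the shooting scene.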
  • step S440 the extrinsic parameters of the binocular camera may be adjusted according to the reconstruction error of one or more frames of binocular images.
  • the reconstruction errors of multiple frames of binocular images are accumulated, and the external parameters of the binocular cameras are adjusted according to the accumulated reconstruction errors, which can reduce the influence of the accuracy of straight line extraction on the calibration results and improve Accuracy of calibration results.
  • the extrinsic parameters of the binocular camera are adjusted with the goal of reducing the reconstruction error of multi-frame binocular images.
  • the extrinsic parameters of the binocular camera that minimize the reconstruction error of multiple frames of binocular images are used as the calibration value of the extrinsic parameters of the binocular camera.
  • the minimum reconstruction error of multiple frames of binocular images may be the minimum sum of reconstruction errors of multiple frames of binocular images.
•   the reconstruction error of multi-frame binocular images satisfies the following formula: F(yaw, pitch, roll) = ∑ i f i (yaw, pitch, roll)
•   F(yaw, pitch, roll) represents the reconstruction error of multi-frame binocular images, and f i (yaw, pitch, roll) is the reconstruction error of the i-th frame of binocular image.
  • the smallest reconstruction error of multiple frames of binocular images may also be the smallest average value of reconstruction errors of multiple frames of binocular images.
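The multi-frame accumulation and minimization above can be sketched as summing the per-frame errors f_i at the same extrinsics and selecting the extrinsics with the smallest total. A candidate-set search stands in here for the unspecified optimizer; all names are illustrative.

```python
def total_error(frame_errors, params):
    """F(yaw, pitch, roll): sum of per-frame reconstruction errors f_i,
    each evaluated at the same (yaw, pitch, roll) extrinsics."""
    return sum(f(params) for f in frame_errors)

def calibrate(frame_errors, candidates):
    """Return the candidate extrinsics minimizing the accumulated
    multi-frame reconstruction error F."""
    return min(candidates, key=lambda p: total_error(frame_errors, p))
```

With two toy per-frame errors that are minimized at yaw = 1 and pitch = 2 respectively, the candidate (1, 2, 0) is selected as the calibration value.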
  • the external parameter of the binocular camera in the system can be updated.
•   the calibrated external parameters may be provided to other functional modules of the automatic driving system.
•   S450, controlling the vehicle-mounted display to display the calibration situation of the external parameters of the binocular camera.
  • the calibration situation of the extrinsic parameters of the binocular camera includes the current calibration progress, the current reconstruction error situation or the reconstructed p straight lines.
  • FIG. 10 shows a schematic diagram of calibration of extrinsic parameters of a binocular camera.
  • the calibration of the external parameters of the binocular camera includes the current calibration progress, the current reconstruction error, or the 12 reconstructed straight lines.
  • the current calibration progress includes the current calibration completion and the current extrinsic parameters of the binocular camera.
  • the current reconstruction error conditions include the current distance error and the current angle error.
  • four reconstructed straight lines L 1 , L 5 , l 1 , and l 4 are selected during the calibration process to calculate the reconstruction error.
• the angle error includes: the angle error between L 1 and L 5 , the angle error between l 1 and l 4 , and the angle error between L 1 and l 1 .
  • the distance error includes: the distance error between L 1 and L 5 .
• the reconstruction error situation in FIG. 10 is only an example. The current reconstruction error situation may also include the current reconstruction error itself, that is, the reconstruction error calculated from the current distance error and the current angle error, for example, the value of f i (yaw, pitch, roll). Alternatively, other straight lines may be used to calculate the reconstruction error, or other angle errors or distance errors may be used, which is not limited in this embodiment of the present application.
• (a) and (b) of FIG. 10 show two calibration situations.
• In (a), the calibration completion rate is 25%, the reconstruction error is relatively large, and the 12 straight lines reconstructed based on the current extrinsic parameters are relatively distorted, which does not conform to the shooting scene in the real world.
• In (b), the calibration completion rate is 80%, the reconstruction error is relatively small, and the 12 straight lines reconstructed based on the current extrinsic parameters conform more closely to the shooting scene in the real world.
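The distance and angle errors displayed in a calibration view like FIG. 10 can be computed, for example, as below. This is a hedged sketch: each reconstructed line is represented by a point and a direction vector, and the known scene relationships (parallel lane lines 3.5 m apart) are assumed example values, not values from the embodiment.

```python
import numpy as np

def angle_between(d1, d2):
    """Angle (radians) between two line directions, ignoring orientation sign."""
    c = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def distance_between_parallel(p1, p2, d):
    """Distance between two (nominally) parallel 3D lines through p1 and p2
    with shared direction d, via the cross-product formula."""
    d = np.asarray(d) / np.linalg.norm(d)
    diff = np.asarray(p2, float) - np.asarray(p1, float)
    return float(np.linalg.norm(np.cross(diff, d)))

# Two reconstructed lane lines that should be parallel and 3.5 m apart.
p_a, d_a = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
p_b, d_b = np.array([3.4, 0.0, 0.0]), np.array([0.02, 0.0, 1.0])

angle_error = angle_between(d_a, d_b) - 0.0                  # known angle: 0 (parallel)
dist_error = distance_between_parallel(p_a, p_b, d_a) - 3.5  # known lane width: 3.5 m
```

A reconstruction error term f_i could then combine such angle and distance residuals, e.g. as a weighted sum of squares; the weighting is an implementation choice the document does not fix.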
  • the embodiment of the present application may be applicable to dynamic calibration of a camera, and may also be applicable to static calibration of a camera.
  • the camera in the embodiment of the present application is a vehicle-mounted camera, and the vehicle on which the camera is carried is in a moving state.
  • the landmarks can be any type of road features, and are not strictly limited to lane lines.
• the calibration reference object in the solution provided by the present application may be any of the following road features: lane lines, signboards, pole-like objects, road signs, and traffic lights.
• the signboards are, for example, traffic signboards or pole-mounted boards, and the pole-like objects are, for example, street light poles.
  • the solution provided in this application can be applied to both dynamic camera calibration and static camera calibration.
  • the embodiment of the present application may use elements in the open road as calibration objects. Therefore, the solution provided by this application has good versatility.
  • the initial calibration value of the external parameters of the binocular camera may be obtained by manual measurement during the object assembly process.
• the solution of the embodiment of the present application can be used to adjust the extrinsic parameters of the binocular camera and update them in the system, providing high-precision extrinsic parameters for upper-level services to improve their accuracy and thereby the driving performance.
  • FIG. 11 is a device 600 for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application.
  • the device 600 includes an acquisition unit 610 and a processing unit 620 .
• the acquiring unit 610 is configured to acquire a first image and a second image, where the first image is obtained by the first camera of the binocular camera shooting a scene, and the second image is obtained by the second camera of the binocular camera shooting the same scene.
• the processing unit 620 is configured to: extract m straight lines from the first image and from the second image respectively, where m is an integer greater than 1 and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image; reconstruct, based on the extrinsic parameters of the binocular camera, n of the m straight lines of the first image and n of the m straight lines of the second image into three-dimensional space to obtain n reconstructed straight lines, where the n straight lines of the first image and the n straight lines of the second image are projections of n straight lines in the shooting scene, 1 < n ≤ m, and n is an integer; and adjust the extrinsic parameters of the binocular camera according to a reconstruction error, the reconstruction error being determined according to the positional relationship between the n reconstructed straight lines and the positional relationship between the n straight lines in the shooting scene.
• the reconstruction error includes at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines, where:
• the angle error between the n reconstructed straight lines is determined according to the difference between the angle between at least two of the n reconstructed straight lines and the angle between the corresponding at least two of the n straight lines in the shooting scene;
• the distance error between the n reconstructed straight lines is determined according to the difference between the distance between at least two of the n reconstructed straight lines and the distance between the corresponding at least two of the n straight lines in the shooting scene.
  • the at least two straight lines in the shooting scene include at least two parallel straight lines.
  • the processing unit 620 is specifically configured to:
• Semantic segmentation is performed on the first image and the second image respectively to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image.
• The semantic segmentation result of the first image includes horizontal objects or vertical objects in the first image, and the semantic segmentation result of the second image includes horizontal objects or vertical objects in the second image.
• Instance segmentation is performed on the first image based on the semantic segmentation result of the first image to obtain instances in the first image, and instance segmentation is performed on the second image based on the semantic segmentation result of the second image to obtain instances in the second image.
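The semantic-then-instance segmentation step can be illustrated with a toy example: given a binary semantic mask marking, say, "vertical object" pixels, splitting it into connected components yields one instance per separate object (one pole, one lane-line blob). A real system would use learned segmentation networks; the BFS labeling below only illustrates the semantic-to-instance step and all data are made up.

```python
from collections import deque

def instances_from_semantic_mask(mask):
    """Label 4-connected components of a binary semantic mask.

    `mask` is a list of lists of 0/1; returns (label_map, instance_count),
    where each connected region of 1s receives a distinct positive id.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not labels[i][j]:
                next_id += 1
                labels[i][j] = next_id
                q = deque([(i, j)])
                while q:                      # flood-fill one instance
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_id
                            q.append((ny, nx))
    return labels, next_id

# Two separate "pole" columns in a toy 4x5 semantic mask.
mask = [[0, 1, 0, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 0, 0, 0, 1]]
labels, count = instances_from_semantic_mask(mask)
```

Matching instance ids between the first and second image (e.g. by class and image position) is what then induces the line-to-line correspondence used in the calibration.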
  • the device further includes: a display unit, configured to display the calibration situation of the extrinsic parameters of the binocular camera.
• the calibration status of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction error situation, or the reconstructed p straight lines, where the reconstructed p straight lines are obtained by reconstructing p of the m straight lines in the first image and p of the m straight lines in the second image based on the current extrinsic parameters of the binocular camera, 1 < p ≤ m, and p is an integer.
  • the current calibration progress includes at least one of the following:
  • the current reconstruction error situation includes at least one of the following:
  • the binocular camera is a vehicle-mounted camera, and the vehicle on which the binocular camera is carried may be in a stationary state or in a moving state.
  • the apparatus 3000 may include at least one processor 3002 and a communication interface 3003 .
  • the apparatus 3000 may further include at least one of a memory 3001 and a bus 3004 .
• any two or all three of the memory 3001, the processor 3002 and the communication interface 3003 may be connected to each other through the bus 3004.
  • the memory 3001 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 3001 can store a program.
  • the processor 3002 and the communication interface 3003 are used to execute each step of the method for calibrating the external parameters of the binocular camera according to the embodiment of the present application. That is to say, the processor 3002 can acquire stored instructions from the memory 3001 through the communication interface 3003, so as to execute each step of the method for calibrating the external parameters of the binocular camera according to the embodiment of the present application.
  • the memory 3001 may implement the above function of storing programs.
  • the processor 3002 may adopt a general-purpose CPU, a microprocessor, an ASIC, a graphics processing unit (graphic processing unit, GPU) or one or more integrated circuits for executing related programs, so as to implement the embodiments of the present application.
  • the processor 3002 may implement the above-mentioned function of executing related programs.
  • the processor 3002 may also be an integrated circuit chip, which has a signal processing capability.
  • each step of the control method in the embodiment of the present application may be completed by an integrated logic circuit of hardware in a processor or instructions in the form of software.
• the above-mentioned processor 3002 may also be a general-purpose processor, a digital signal processor (digital signal processing, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
• the storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, completes the functions required by the units included in the calibration device of the embodiment of the present application, or performs the steps of the method for calibrating the extrinsic parameters of the binocular camera in the embodiment of the present application.
  • the communication interface 3003 can use a transceiver device such as but not limited to a transceiver to implement communication between the device and other devices or communication networks, for example, the communication interface 3003 can be used to acquire binocular images.
  • the communication interface 3003 may also be an interface circuit, for example.
  • Bus 3004 may include pathways for transferring information between various components of the device (eg, memory, processor, communication interface).
  • the embodiments of the present application also provide a computer program product including instructions, which, when executed by a computer, enable the computer to implement the methods in the above method embodiments.
  • An embodiment of the present application also provides a terminal, which includes any of the above-mentioned calibration devices, such as the device shown in FIG. 11 or FIG. 12 .
  • the terminal may be a vehicle, a drone, or a robot.
  • the above-mentioned calibration device can be installed on the terminal or be independent from the terminal.
• An embodiment of the present application further provides a computer-readable medium, where the computer-readable medium stores program code for execution by a device, and the program code includes instructions for executing the methods of the above embodiments.
  • the embodiment of the present application also provides a computer program product containing instructions, and when the computer program product is run on a computer, the computer is made to execute the method of the above embodiment.
  • the embodiment of the present application also provides a chip, the chip includes a processor and a data interface, and the processor reads the instructions stored in the memory through the data interface, and executes the method of the above embodiment.
  • the chip may further include a memory, in which instructions are stored, and the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in the foregoing embodiments.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
• the division of the units is only a logical function division; in actual implementation, there may be other division methods.
• multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the functions described above are realized in the form of software function units and sold or used as independent products, they can be stored in a computer-readable storage medium.
• the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
• the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disc.

Abstract

The present application relates to the field of artificial intelligence, and particularly relates to the field of autonomous driving; provided are a method and an apparatus for calibrating the external parameters of a binocular camera, the method comprising: acquiring a binocular image photographed by a binocular camera; respectively extracting m straight lines from a first image and a second image of the binocular image; on the basis of the external parameters of the binocular camera, reconstructing a plurality of straight lines that match in the first image and the second image in a three-dimensional space to obtain reconstructed straight lines, wherein the plurality of straight lines that match in the binocular image are projections of a plurality of straight lines in the photographed scene; and, on the basis of a reconstruction error, adjusting the external parameters of the binocular camera, wherein the reconstruction error is determined on the basis of the positional relationship between the reconstructed straight lines and the positional relationship between the straight lines in the photographed scene. In the solution of the present application, the external parameters of the binocular camera are adjusted by means of the geometric constraints between multiple straight lines in a three-dimensional space, increasing the precision of calibrating the external parameters of a binocular camera.

Description

Method and device for extrinsic calibration of a binocular camera

Technical field

The present application relates to the field of data processing, and in particular to a method and device for calibrating the extrinsic parameters of a binocular camera.

Background

Camera calibration refers to the process of obtaining camera parameters. Camera parameters include intrinsic parameters and extrinsic parameters: the intrinsic parameters are parameters of the camera itself, and the extrinsic parameters are parameters related to the installation position of the camera, such as the pitch angle, roll angle, and yaw angle.

A binocular camera can obtain dense depth information and realize functions such as ranging of target objects. However, because the binocular baseline is short, a small change in the angles among the extrinsic parameters significantly affects the results produced by the binocular camera, for example, the ranging results. Therefore, the accuracy of extrinsic calibration directly affects the accuracy of the binocular camera's output.

Extrinsic calibration of binocular cameras on a traditional production line usually relies on targets, for example, a checkerboard calibration board or a QR-code calibration board. One solution is to adjust the angle of the target by a mechanical arm or manually, so that the binocular camera acquires images of the checkerboard calibration board at various angles to complete the calibration of the extrinsic parameters. This solution is complex and time-consuming to operate and relies on expensive equipment. Another solution is to arrange a large number of targets; the binocular camera is mounted on a vehicle, and the extrinsic calibration is completed while the vehicle is running. This solution requires maintaining a large number of targets, and high-precision calibration of the targets is difficult to achieve.

In addition, existing solutions can also use feature-point extraction and matching between the two cameras and calibrate the extrinsic parameters through the epipolar constraint. However, when there are few feature points in the environment, or the feature points are unstable and uncontrollable, this solution can hardly guarantee the accuracy of the calibration result. For example, when the environment in the factory is uncontrolled, the solution cannot guarantee that every calibration yields an accurate result, that is, it is difficult to guarantee the stability of the calibration result, which affects the takt time of the production line.

Therefore, how to improve the accuracy of extrinsic calibration of binocular cameras has become an urgent problem to be solved.
Summary of the invention

The present application provides a method and device for calibrating the extrinsic parameters of a binocular camera, which adjust the extrinsic parameters through geometric constraints between multiple straight lines in three-dimensional space, thereby improving the accuracy of the extrinsic calibration.

In a first aspect, a method for calibrating extrinsic parameters of a binocular camera is provided, the method comprising: acquiring a first image and a second image, the first image being obtained by a first camera of the binocular camera shooting a scene, and the second image being obtained by a second camera of the binocular camera shooting the same scene; extracting m straight lines from the first image and from the second image respectively, where m is an integer greater than 1 and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image; reconstructing, based on the extrinsic parameters of the binocular camera, n of the m straight lines of the first image and n of the m straight lines of the second image into three-dimensional space to obtain n reconstructed straight lines, where the n straight lines of the first image and the n straight lines of the second image are projections of n straight lines in the shooting scene, 1 < n ≤ m, and n is an integer; and adjusting the extrinsic parameters of the binocular camera according to a reconstruction error, the reconstruction error being determined according to the positional relationship between the n reconstructed straight lines and the positional relationship between the n straight lines in the shooting scene.

The first image and the second image are the two images of a binocular image, that is, two images captured synchronously by the two cameras of the binocular camera.

The shooting scene includes one or more calibration objects; a calibration object is an object photographed by the binocular camera. That is, the first image and the second image include imagery of the calibration object.

Exemplarily, the calibration objects include horizontal objects, vertical objects, and the like.

For example, horizontal objects include lane lines.

For example, vertical objects include poles, pillars, and the like.

The correspondence between the m straight lines of the first image and the m straight lines of the second image means that both sets of lines are projections of the same m straight lines in the shooting scene. The projection of the m straight lines in the shooting scene can also be understood as the imaging of those m straight lines in the binocular camera.

For different binocular images, m may be the same or different. That is, the number of straight lines extracted from different binocular images may be the same or different.

For different binocular images, n may be the same or different. That is, the number of straight lines reconstructed from different binocular images may be the same or different.

Obtaining the reconstructed straight lines can also be understood as obtaining the spatial positions of the n reconstructed straight lines. Exemplarily, the binocular camera may be a vehicle-mounted camera, and the spatial positions of the n reconstructed straight lines may be expressed by the coordinates of those lines in the ego-vehicle coordinate system.
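For a rectified binocular pair, reconstructing a matched image line into the 3D camera frame can be sketched by triangulating two of its endpoints from their disparities (depth Z = f·B/d) and joining the resulting 3D points. This is a minimal sketch only; the intrinsics (fx, fy, cx, cy), the baseline, and the disparity values below are assumed example numbers, not parameters from the embodiment.

```python
import numpy as np

def triangulate(u, v, disparity, fx, fy, cx, cy, baseline):
    """Back-project a rectified left-image pixel with known disparity to 3D
    (camera frame): Z = fx * B / d, then X and Y by the pinhole model."""
    Z = fx * baseline / disparity
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return np.array([X, Y, Z])

def reconstruct_line(pt_top, pt_bot, disp_top, disp_bot,
                     K=(700.0, 700.0, 640.0, 360.0), baseline=0.12):
    """3D line (point, unit direction) from two endpoints of a matched image line."""
    fx, fy, cx, cy = K
    a = triangulate(*pt_top, disp_top, fx, fy, cx, cy, baseline)
    b = triangulate(*pt_bot, disp_bot, fx, fy, cx, cy, baseline)
    d = b - a
    return a, d / np.linalg.norm(d)

# A vertical pole edge seen at u = 900 px with a constant disparity of 8.4 px.
point, direction = reconstruct_line((900.0, 100.0), (900.0, 600.0), 8.4, 8.4)
```

A further rigid transform would map such camera-frame lines into the ego-vehicle coordinate system mentioned above.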
Adjusting the extrinsic parameters of the binocular camera according to the reconstruction error may mean adjusting them according to the reconstruction error of one or more frames of binocular images.

In the embodiments of the present application, the positional relationship between the straight lines in the shooting scene is used as a geometric constraint, and the extrinsic parameters of the binocular camera are adjusted according to the difference between the positional relationship of the n reconstructed straight lines and the positional relationship of the n straight lines in the shooting scene. There is no need to precisely locate the position coordinates of the straight lines in the shooting scene in three-dimensional space, which prevents the accuracy of those position coordinates from affecting the calibration result and improves the calibration accuracy of the extrinsic parameters of the binocular camera.

Moreover, in the solution of the embodiments of the present application, the calibration can be completed with as little as the positional relationship between two straight lines in the shooting scene, which reduces the amount of calculation and improves the calibration efficiency.

In addition, the solution of the embodiments of the present application can further improve the calibration accuracy by adding geometric constraints.

Furthermore, the solution of the embodiments of the present application generalizes well and is applicable to a variety of calibration scenarios. Taking a vehicle-mounted binocular camera as an example, the calibration objects can be common elements on open roads, such as street light poles or lane lines, and the calibration scene does not need to be arranged in advance; for example, no preset targets are required, which reduces the cost.
With reference to the first aspect, in some implementations of the first aspect, adjusting the extrinsic parameters of the binocular camera according to the reconstruction error includes: adjusting the extrinsic parameters of the binocular camera according to the sum of the reconstruction errors of multiple frames of binocular images.

Alternatively, adjusting the extrinsic parameters of the binocular camera according to the reconstruction error includes: adjusting the extrinsic parameters of the binocular camera according to the average of the reconstruction errors of multiple frames of binocular images.

Since line detection may contain a certain error, in the embodiments of the present application the extrinsic parameters of the binocular camera are adjusted based on the accumulated reconstruction errors of multiple frames of binocular images, which can reduce the influence of line-detection errors and improve the accuracy of the extrinsic calibration.

With reference to the first aspect, in some implementations of the first aspect, the reconstruction error includes at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines. The angle error between the n reconstructed straight lines is determined according to the difference between the angle between at least two of the n reconstructed straight lines and the angle between the corresponding at least two of the n straight lines in the shooting scene; the distance error between the n reconstructed straight lines is determined according to the difference between the distance between at least two of the n reconstructed straight lines and the distance between the corresponding at least two of the n straight lines in the shooting scene.

With reference to the first aspect, in some implementations of the first aspect, the at least two straight lines in the shooting scene include at least two mutually parallel straight lines.

Adjusting the extrinsic parameters according to the parallelism error and the distance error can ensure the accuracy of the extrinsic parameters. At the same time, the solution of the embodiments of the present application only requires two parallel lines with a known distance in the shooting scene to calibrate the extrinsic parameters of the binocular camera, which reduces the number of error terms in the reconstruction error, thereby reducing the amount of calculation and increasing the adjustment speed of the extrinsic parameters, that is, improving the calibration efficiency. In addition, since only two parallel lines with a known distance are required, there is no need to precisely locate the positions of the lines in three-dimensional space and no need for preset targets, which lowers the requirements on the shooting scene and reduces the calibration cost.
With reference to the first aspect, in some implementations of the first aspect, extracting m straight lines from the first image and the second image respectively includes: performing instance segmentation on the first image and the second image respectively to obtain instances in the first image and instances in the second image; and extracting m straight lines from the instances in the first image and from the instances in the second image respectively, where the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to the correspondence between the instances in the first image and the instances in the second image.

In the embodiments of the present application, determining the correspondence between straight lines through the correspondence between instances in the two images can improve the accuracy of line matching and hence of the extrinsic calibration, while reducing the computational complexity and improving the calibration efficiency.

With reference to the first aspect, in some implementations of the first aspect, extracting m straight lines from the instances in the first image and the instances in the second image includes: extracting multiple original straight lines from the instances of the first image and of the second image respectively; fitting the multiple original straight lines on one side edge of an instance in the first image into one target straight line of that side edge of the instance; and fitting the multiple original straight lines on the same side edge of an instance in the second image into one target straight line of that side edge of the instance, the m straight lines belonging to the target straight lines.
在本申请实施例的方案中,对实例同侧的原始直线进行拟合,能够得到更准确的目标直线,利用目标直线对双目相机外参进行标定,有利于提高双目相机外参的标定结果的准确性。In the solution of the embodiment of this application, the original straight line on the same side of the example is fitted to obtain a more accurate target straight line, and the target straight line is used to calibrate the external parameters of the binocular camera, which is conducive to improving the calibration of the external parameters of the binocular camera the accuracy of the results.
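As an illustration of fitting several same-side original line segments into one target line, a total-least-squares fit over the pooled pixels of all fragments could look as follows; the fitting criterion is an assumption for illustration, and the application does not mandate a specific method:

```python
import numpy as np

def fit_target_line(segments):
    """Fit one target line to the points of several original line
    segments that lie along the same instance edge.

    segments : list of (N_i, 2) arrays of pixel coordinates.
    Returns (point_on_line, unit_direction) in the total-least-squares
    sense: the direction of largest variance of the pooled points.
    """
    pts = np.vstack(segments)
    centroid = pts.mean(axis=0)
    # First right singular vector of the centered points gives the
    # direction that minimizes perpendicular distances.
    _, _, Vt = np.linalg.svd(pts - centroid)
    return centroid, Vt[0]

# Two fragments of the same edge y = 0.5 * x + 1 with a gap between them.
xs1 = np.linspace(0, 4, 20)
xs2 = np.linspace(6, 10, 20)
seg1 = np.column_stack([xs1, 0.5 * xs1 + 1])
seg2 = np.column_stack([xs2, 0.5 * xs2 + 1])

p0, d = fit_target_line([seg1, seg2])
slope = d[1] / d[0]   # recovered slope of the target line
```

Pooling the fragments before fitting is what makes the target line more stable than any single original segment.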
With reference to the first aspect, in some implementations of the first aspect, performing instance segmentation on the first image and the second image to obtain the instances in the first image and the instances in the second image includes: performing semantic segmentation on the first image and on the second image to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image, where the semantic segmentation result of the first image includes horizontal objects or vertical objects in the first image and the semantic segmentation result of the second image includes horizontal objects or vertical objects in the second image; and performing instance segmentation on the first image based on its semantic segmentation result to obtain the instances in the first image, and performing instance segmentation on the second image based on its semantic segmentation result to obtain the instances in the second image.

In the embodiments of the present application, semantic segmentation distinguishes the horizontal or vertical objects in the image, so that geometric constraints between the straight lines of horizontal or vertical objects in the shooting scene, for example perpendicularity constraints, can be used to adjust the extrinsic parameters of the binocular camera. If the binocular camera is a vehicle-mounted camera, since lane markings, poles, and the like are common on open roads, such objects on the open road can serve as calibration references, eliminating the need to prepare a calibration site in advance and reducing cost.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: controlling a display to show the calibration status of the extrinsic parameters of the binocular camera.

For example, the display may be a vehicle-mounted display.

In the solution of the embodiments of the present application, the current calibration status can be displayed in real time, which helps the user follow the calibration progress and improves the user experience.

With reference to the first aspect, in some implementations of the first aspect, the calibration status of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction error situation, or p reconstructed straight lines, where the p reconstructed straight lines are obtained by reconstructing p of the m straight lines in the first image and the corresponding p of the m straight lines in the second image into three-dimensional space based on the current extrinsic parameters of the binocular camera, 1 < p ≤ m, and p is an integer.

Exemplarily, the current extrinsic parameters of the binocular camera may be the adjusted extrinsic parameters of the binocular camera.

Alternatively, the current extrinsic parameters of the binocular camera may be the optimal extrinsic parameters found during the adjustment process, that is, the extrinsic parameters that minimize the reconstruction error during the adjustment process.

In the embodiments of the present application, visualizing the calibration result and displaying the three-dimensional positions of the reconstructed straight lines helps the user intuitively perceive the current calibration status.

With reference to the first aspect, in some implementations of the first aspect, the current calibration progress includes at least one of the following: the current extrinsic parameters of the binocular camera or the current degree of calibration completion.

With reference to the first aspect, in some implementations of the first aspect, the current reconstruction error situation includes at least one of the following: the current reconstruction error, the current distance error, or the current angle error.
In a second aspect, an apparatus for calibrating the extrinsic parameters of a binocular camera is provided. The apparatus includes: an obtaining unit, configured to obtain a first image and a second image, where the first image is captured by a first camera of the binocular camera shooting a scene and the second image is captured by a second camera of the binocular camera shooting the same scene; and a processing unit, configured to extract m straight lines from each of the first image and the second image, where m is an integer greater than 1 and there is a correspondence between the m straight lines in the first image and the m straight lines in the second image; reconstruct n of the m straight lines in the first image and the corresponding n of the m straight lines in the second image into three-dimensional space based on the extrinsic parameters of the binocular camera to obtain n reconstructed straight lines, where the n straight lines in the first image and the n straight lines in the second image are projections of n straight lines in the shooting scene, 1 < n ≤ m, and n is an integer; and adjust the extrinsic parameters of the binocular camera according to a reconstruction error, where the reconstruction error is determined from the positional relationship among the n reconstructed straight lines and the positional relationship among the n straight lines in the shooting scene.

Optionally, as an embodiment, the reconstruction error includes at least one of the following: an angle error among the n reconstructed straight lines or a distance error among the n reconstructed straight lines, where the angle error among the n reconstructed straight lines is determined from the difference between the angle between at least two of the n reconstructed straight lines and the angle between the corresponding at least two of the n straight lines in the shooting scene, and the distance error among the n reconstructed straight lines is determined from the difference between the distance between at least two of the n reconstructed straight lines and the distance between the corresponding at least two of the n straight lines in the shooting scene.

Optionally, as an embodiment, the at least two straight lines in the shooting scene include at least two straight lines that are parallel to each other.
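The angle-error and distance-error terms described above can be sketched as follows for two reconstructed lines given in point-plus-direction form; the 3.5 m spacing is a hypothetical known lane-marking distance, not a value from this application:

```python
import numpy as np

def angle_between(d1, d2):
    """Angle in radians between two 3-D line directions (sign-agnostic)."""
    c = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return np.arccos(np.clip(c, -1.0, 1.0))

def distance_between_parallel(p1, d1, p2):
    """Distance between two (nominally) parallel lines, computed as the
    component of p2 - p1 perpendicular to the direction d1."""
    d = d1 / np.linalg.norm(d1)
    return np.linalg.norm(np.cross(p2 - p1, d))

# Reconstructed lane-marking pair: known to be parallel, 3.5 m apart.
p1, d1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
p2, d2 = np.array([3.6, 0.0, 0.0]), np.array([0.0, 0.02, 1.0])

angle_error = angle_between(d1, d2) - 0.0              # scene lines parallel
distance_error = distance_between_parallel(p1, d1, p2) - 3.5
reconstruction_error = angle_error**2 + distance_error**2
```

Driving `reconstruction_error` toward zero by varying the extrinsic parameters is one way the adjustment described above could be posed.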
Optionally, as an embodiment, the processing unit is specifically configured to: perform instance segmentation on the first image and on the second image to obtain instances in the first image and instances in the second image; and extract the m straight lines from the instances in the first image and from the instances in the second image respectively, where the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to the correspondence between the instances in the first image and the instances in the second image.

Optionally, as an embodiment, the processing unit is specifically configured to: perform semantic segmentation on the first image and on the second image to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image, where the semantic segmentation result of the first image includes horizontal objects or vertical objects in the first image and the semantic segmentation result of the second image includes horizontal objects or vertical objects in the second image; and perform instance segmentation on the first image based on its semantic segmentation result to obtain the instances in the first image, and perform instance segmentation on the second image based on its semantic segmentation result to obtain the instances in the second image.

Optionally, as an embodiment, the apparatus further includes: a display unit, configured to display the calibration status of the extrinsic parameters of the binocular camera.

Optionally, as an embodiment, the calibration status of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction error situation, or p reconstructed straight lines, where the p reconstructed straight lines are obtained by reconstructing p of the m straight lines in the first image and the corresponding p of the m straight lines in the second image into three-dimensional space based on the current extrinsic parameters of the binocular camera, 1 < p ≤ m, and p is an integer.

Optionally, as an embodiment, the current calibration progress includes at least one of the following: the current extrinsic parameters of the binocular camera or the current degree of calibration completion.

Optionally, as an embodiment, the current reconstruction error situation includes at least one of the following: the current reconstruction error, the current distance error, or the current angle error.

Optionally, the binocular camera is a vehicle-mounted camera, and the vehicle carrying the binocular camera may be either stationary or moving.
In a third aspect, an apparatus for calibrating the extrinsic parameters of a binocular camera is provided. The apparatus includes a processor coupled to a memory, where the memory is configured to store computer programs or instructions and the processor is configured to execute the computer programs or instructions stored in the memory, so that the method in the first aspect or any implementation of the first aspect is performed.

Optionally, the apparatus includes one or more processors.

Optionally, the apparatus may further include a memory coupled to the processor.

Optionally, the apparatus may include one or more memories.

Optionally, the memory may be integrated with the processor or provided separately.

Optionally, the apparatus may further include a data interface.

In a fourth aspect, a computer-readable medium is provided. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for performing the method in the first aspect or any implementation of the first aspect.

In a fifth aspect, a computer program product including instructions is provided. When the computer program product runs on a computer, the computer is caused to perform the method in the first aspect or any implementation of the first aspect.

In a sixth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to perform the method in the first aspect or any implementation of the first aspect.

Optionally, as an implementation, the chip may further include a memory, the memory stores instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to perform the method in the first aspect or any implementation of the first aspect.

In a seventh aspect, a terminal is provided. The terminal includes the apparatus in the second aspect or any implementation of the second aspect.

Optionally, the terminal further includes a binocular camera.

Exemplarily, the terminal may be a vehicle.
Description of the drawings

FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a method for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of the imaging principle of a binocular camera provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of another method for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a calibration site provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of the semantic segmentation result of a binocular image provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of the instance annotation result of a binocular image provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of target straight lines in a binocular image provided by an embodiment of the present application;

FIG. 9 is a schematic diagram of the spatial positions of reconstructed straight lines provided by an embodiment of the present application;

FIG. 10 is a schematic diagram of the current calibration status provided by an embodiment of the present application;

FIG. 11 is a schematic diagram of an apparatus for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application;

FIG. 12 is a schematic diagram of another apparatus for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application.
Detailed description

The technical solutions in the present application are described below with reference to the accompanying drawings.

The solutions of the embodiments of the present application can be applied to smart devices. A smart device is any device, instrument, or machine with computing and processing capability. The smart device in the embodiments of the present application may be a robot, an autonomous vehicle, an intelligent driver-assistance vehicle, an unmanned aerial vehicle (UAV), an intelligent assisted aircraft, a smart home device, or the like. The present application places no limitation on the smart device; any device on which a binocular camera can be mounted falls within the scope of the smart device of the present application.

The method provided by the embodiments of the present application can be applied to autonomous driving, UAV navigation, robot navigation, industrial non-contact inspection, three-dimensional reconstruction, virtual reality, and other scenarios that require binocular camera calibration. Specifically, the method of the embodiments of the present application can be applied to autonomous driving scenarios, which are briefly introduced below.

As shown in FIG. 1, the vehicle 110 may be configured in a fully or partially autonomous driving mode. For example, the vehicle 110 may control itself while in the autonomous driving mode: the current state of the vehicle and its surrounding environment may be determined, the possible behavior of at least one other vehicle in the surrounding environment may be determined together with a confidence level corresponding to the likelihood that the other vehicle performs that behavior, and the vehicle 110 may be controlled based on the determined information. While the vehicle 110 is in the autonomous driving mode, it may be set to operate without human interaction.

The mobile data center (MDC) 120 is an autonomous driving computing platform used to process data from various sensors and provide decision support for autonomous driving.

The vehicle 110 includes a sensor system. The sensor system includes several sensors that sense information about the environment around the vehicle 110. For example, the sensor system may include a positioning system (which may be a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU), a radar, a laser rangefinder, a binocular camera, and the like. It should be understood that although only the binocular camera 111 is shown in FIG. 1, the vehicle 110 may also include other sensors.

To obtain better perception results, the information of several sensors needs to be fused. Specifically, different sensors can be unified into the same coordinate system through the extrinsic parameters between the sensors, thereby fusing the information of the multiple sensors.

The calibration module 121 is configured to determine the extrinsic parameters of the sensors. In the embodiments of the present application, the sensor system includes a binocular camera, and the calibration module 121 is configured to determine the extrinsic parameters of the binocular camera. As shown in FIG. 1, the calibration module 121 can calibrate the extrinsic parameters of the binocular camera based on the binocular images captured by the binocular camera.

The upper-layer function module 122 can implement corresponding functions based on the extrinsic parameters of the binocular camera. That is, the extrinsic calibration result of the binocular camera can be provided to upper-layer autonomous driving services. For example, a ranging function module can determine the distance between an obstacle and the vehicle from the images captured by the binocular camera according to the extrinsic parameters of the binocular camera. As another example, an obstacle avoidance function module can, according to the extrinsic parameters of the binocular camera, identify, evaluate, and avoid or otherwise negotiate potential obstacles in the vehicle's environment from the images captured by the binocular camera.

Further, the current calibration status can be displayed through a human machine interface (HMI) 130 on the vehicle.

Exemplarily, the HMI 130 may be a vehicle-mounted display.

It should be noted that FIG. 1 is merely a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationships among the devices, components, and modules shown in the figure do not constitute any limitation. For example, in FIG. 1, the calibration module 121 and the upper-layer function module 122 are located in the MDC 120; in other cases, the calibration module 121 or the upper-layer function module may also be placed in another processor of the vehicle 110. Alternatively, some of the above processes may be performed by a processor arranged inside the vehicle 110 while others are performed by a remote processor. As another example, in FIG. 1, the MDC 120 is located outside the vehicle 110 and can communicate wirelessly with the vehicle 110; in other cases, the MDC may be located inside the vehicle 110. The HMI 130 may be located inside the vehicle 110.

The solution of the embodiments of the present application can be applied to, and executed by, the calibration module 121. With the solution of the embodiments of the present application, the extrinsic parameters of the binocular camera can be calibrated, the efficiency of extrinsic calibration can be improved, and the extrinsic calibration values of the binocular camera in the system can be updated, providing high-precision extrinsic parameters for upper-layer services, improving the accuracy of those services, and in turn improving the performance of autonomous driving.

To facilitate understanding of the embodiments of the present application, the concepts involved in the embodiments are first introduced below.
Camera calibration

Based on the imaging principle of a camera, there is a correspondence between the three-dimensional points in space and the two-dimensional image points on the image plane in the geometric model of camera imaging, and this correspondence is determined by the parameters of the camera. The process of obtaining the parameters of the camera is called camera calibration. The imaging principle of a camera is prior art and is not described in detail herein.

As an example, denote the three-dimensional space point in the geometric model of camera imaging as X_W and the two-dimensional image point on the image plane as X_P. The relationship between the three-dimensional space point X_W and the two-dimensional image point X_P can be expressed as follows:

X_P = M X_W

where M denotes the transformation matrix between the three-dimensional space point X_W and the two-dimensional image point X_P, and may be called the projection matrix. Some elements of the projection matrix M represent the parameters of the camera. Camera calibration aims to obtain this projection matrix M.
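The relation X_P = M X_W can be checked numerically in homogeneous coordinates with a toy pinhole projection matrix; the intrinsic values below are assumptions for illustration:

```python
import numpy as np

# Toy projection matrix M = K [R | t] with pinhole intrinsics K,
# identity rotation, and zero translation (camera at the world origin).
K = np.array([[800.0, 0, 320],
              [0, 800.0, 240],
              [0, 0, 1]])
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

X_W = np.array([1.0, 0.5, 4.0, 1.0])   # homogeneous 3-D point
x = M @ X_W                             # homogeneous image point
X_P = x[:2] / x[2]                      # pixel coordinates (u, v)
```

The division by the third homogeneous coordinate is what makes the mapping a perspective projection rather than a linear one.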
The parameters of a camera include intrinsic parameters and extrinsic parameters. The intrinsic parameters are parameters of the camera itself, for example, the focal length. The extrinsic parameters are parameters related to the installation position of the camera, such as the pitch angle, the roll angle, and the yaw angle.

The transformation matrix corresponding to the intrinsic parameters may be called the intrinsic transformation matrix, and the transformation matrix corresponding to the extrinsic parameters may be called the extrinsic transformation matrix.

Camera calibration generally requires a calibration reference object (which may also be called a calibration object or a reference object). The calibration reference object is the object photographed by the camera during the calibration process.

For example, in the example above, the three-dimensional space point X_W may be the coordinates of the calibration reference object in the world coordinate system, and the two-dimensional image point X_P may be the two-dimensional coordinates of the calibration reference object on the image plane of the camera.

Whether in image measurement or in machine vision applications, the calibration of camera parameters is a critical step, and the accuracy of the calibration result directly affects the accuracy of the results produced by the camera.

A binocular camera may also be called a stereo camera or a binocular sensor. A binocular camera includes two cameras: a left camera and a right camera. A binocular camera can obtain the depth information of a scene and can reconstruct the three-dimensional shape and position of the surrounding scenery. The purpose of binocular camera calibration is mainly to obtain the intrinsic and extrinsic parameters of the left and right cameras. The extrinsic parameters of the two cameras refer to their relative positional relationship, for example, the translation vector and the rotation matrix of the right camera relative to the left camera. The rotation matrix can also be expressed as a pitch angle, a roll angle, and a yaw angle.
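Expressing the rotation part of the extrinsic parameters as pitch, roll, and yaw angles can be sketched as follows; the Z-Y-X composition order and axis assignment are an assumed convention for illustration, since conventions differ between systems:

```python
import numpy as np

def rotation_from_euler(pitch, roll, yaw):
    """Build a rotation matrix from pitch (about x), roll (about y),
    and yaw (about z) angles in radians, composed as
    R = Rz(yaw) @ Ry(roll) @ Rx(pitch).  The convention is an
    assumption for illustration."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(roll), np.sin(roll)
    cz, sz = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Small angles typical of a nearly aligned stereo pair (values are toy).
R = rotation_from_euler(0.01, -0.02, 0.5)
# A valid rotation matrix is orthonormal with determinant +1, and for
# this composition yaw can be recovered as arctan2(R[1, 0], R[0, 0]).
```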
Extrinsic calibration of a binocular camera is a key step in image measurement and machine vision applications, and the accuracy of the calibration result directly affects the accuracy of the results produced by the binocular camera.

The embodiments of the present application provide a method and an apparatus for calibrating the extrinsic parameters of a binocular camera, which can improve the accuracy of the calibration.

FIG. 2 shows a schematic diagram of a method 200 for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application. The method 200 includes steps S210 to S240.

Exemplarily, the method 200 can be applied to the calibration of a vehicle-mounted camera. The method 200 can be executed by a calibration module, and the calibration module can be located on a vehicle-mounted computing platform.

S210: Obtain at least one frame of binocular images captured by the binocular camera.

A binocular image in the embodiments of the present application refers to two images captured synchronously by the two cameras of the binocular camera. Two synchronously captured images can also be understood as two images captured at the same moment; for example, the two images of a binocular image have the same timestamp. In this case, obtaining a binocular image captured by the binocular camera can also be understood as obtaining two images with the same timestamp captured by the binocular camera.
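Selecting a synchronized stereo pair by timestamp can be sketched as follows; real systems often match within a small time tolerance rather than requiring exact equality, and the frame lists here are hypothetical:

```python
def pair_stereo_frames(left_frames, right_frames):
    """Pair left and right frames that share the same timestamp.

    left_frames, right_frames : lists of (timestamp, image) tuples.
    Returns a list of (timestamp, left_image, right_image) triples,
    one per binocular image, in left-frame order.
    """
    right_by_ts = {ts: img for ts, img in right_frames}
    return [(ts, img, right_by_ts[ts])
            for ts, img in left_frames if ts in right_by_ts]

# Toy frame streams: timestamps in milliseconds, strings standing in
# for image data.
left = [(100, "L0"), (133, "L1"), (166, "L2")]
right = [(100, "R0"), (166, "R2"), (200, "R3")]
pairs = pair_stereo_frames(left, right)
# Only the frames at t = 100 and t = 166 form binocular images.
```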
至少一帧双目图像中的任一双目图像包括第一图像和第二图像。第一图像是由双目相机中的第一摄像头对拍摄场景拍摄得到的,该双目图像中的第二图像是由双目相机中的第二摄像头拍摄对该拍摄场景拍摄得到的。Any binocular image in at least one frame of binocular images includes a first image and a second image. The first image is obtained by shooting the shooting scene with the first camera in the binocular camera, and the second image in the binocular image is obtained by shooting the shooting scene with the second camera in the binocular camera.
拍摄场景中包括一个或多个标定物,标定物指的是双目相机拍摄的对象。也就是说,第一图像和第二图像包括对该标定物的成像。The shooting scene includes one or more calibration objects, and the calibration objects refer to the objects photographed by the binocular camera. That is, the first image and the second image include imaging of the calibration object.
示例性地,标定物包括水平物或竖直物等。Exemplarily, the calibration object includes a horizontal object or a vertical object and the like.
例如,水平物包括行道线。For example, horizontal objects include road markings.
例如,竖直物包括杆状物或柱状物等。For example, vertical objects include rods or pillars and the like.
第一摄像头可以为左摄像头,第二摄像头可以为右摄像头。或者,第一摄像头可以为右摄像头,第二摄像头可以为左摄像头,本申请实施例对此不作限定。左摄像头拍摄的图像也可以称为左目图像,右摄像头拍摄的图像也可以称为右目图像。The first camera may be a left camera, and the second camera may be a right camera. Alternatively, the first camera may be a right camera, and the second camera may be a left camera, which is not limited in this embodiment of the present application. An image captured by the left camera may also be called a left-eye image, and an image captured by the right camera may also be called a right-eye image.
需要说明的是,本申请实施例中的“第一图像”和“第二图像”中的“第一”和“第二”仅用于区分一帧双目图像中的不同图像,不具有其他限定作用。不同的双目图像中的第一图像为不同的图像,不同的双目图像中的第二图像为不同的图像。It should be noted that the terms "first" and "second" in "first image" and "second image" in the embodiments of the present application are only used to distinguish the different images within one frame of binocular images, and have no other limiting effect. The first images in different binocular images are different images, and the second images in different binocular images are different images.
至少一帧双目图像可以是对不同的拍摄场景拍摄得到的,也可以是相同的拍摄场景得到的。At least one frame of binocular images may be obtained from shooting different shooting scenes, or may be obtained from the same shooting scene.
本申请实施例中,拍摄场景相同指的是拍摄场景中的标定物相同,拍摄场景不同指的是拍摄场景中的标定物不同。也就是说,不同双目图像中的标定物可以相同,也可以不同。In the embodiment of the present application, the same shooting scene means that the calibration objects in the shooting scenes are the same, and the different shooting scenes means that the calibration objects in the shooting scenes are different. That is to say, the calibration objects in different binocular images can be the same or different.
示例性地,双目相机可以是车载摄像头,该至少一帧双目图像可以是车辆在标定场地上行驶过程中拍摄得到的多帧双目图像。Exemplarily, the binocular camera may be a vehicle-mounted camera, and the at least one frame of binocular images may be multiple frames of binocular images captured while the vehicle is driving on the calibration site.
为了描述简洁和清楚,在步骤S220至步骤S240中仅以一帧双目图像为例进行说明,其他双目图像可以采用相同的方式进行处理。For simplicity and clarity of description, only one frame of binocular image is used as an example for illustration in steps S220 to S240 , and other binocular images can be processed in the same manner.
S220,分别在第一图像和第二图像中提取m条直线,m为大于1的整数。S220. Extract m straight lines from the first image and the second image respectively, where m is an integer greater than 1.
第一图像的m条直线和第二图像的m条直线之间具有对应关系。There is a corresponding relationship between the m straight lines of the first image and the m straight lines of the second image.
拍摄场景中的直线为三维空间中的直线。三维空间中的直线在图像坐标系中的成像即为图像中的直线。或者说,图像中的直线为三维空间中的直线的投影。The straight lines in the shooting scene are the straight lines in the three-dimensional space. The imaging of the straight line in the three-dimensional space in the image coordinate system is the straight line in the image. In other words, the straight line in the image is the projection of the straight line in the three-dimensional space.
第一图像的m条直线和第二图像的m条直线之间具有对应关系,指的是,第一图像的m条直线和第二图像的m条直线是拍摄场景中的m条直线的投影。拍摄场景中的m条直线的投影,也可以理解为,拍摄场景中的m条直线在双目相机中的成像。There is a correspondence between the m straight lines of the first image and the m straight lines of the second image, meaning that the m straight lines of the first image and the m straight lines of the second image are projections of m straight lines in the shooting scene. The projections of the m straight lines in the shooting scene can also be understood as the imaging of the m straight lines in the shooting scene in the binocular camera.
也就是说,第一图像的m条直线、第二图像的m条直线以及拍摄场景中的m条直线之间具有对应关系。That is to say, there is a correspondence between the m straight lines in the first image, the m straight lines in the second image, and the m straight lines in the shooting scene.
拍摄场景中的多条直线可以理解为拍摄场景中的标定物中的直线。该多条直线可以是一个标定物中的直线,也可以是多个标定物中的直线。The multiple straight lines in the shooting scene can be understood as the straight lines in the calibration object in the shooting scene. The multiple straight lines may be straight lines in one calibration object, or may be straight lines in multiple calibration objects.
若该至少一帧双目图像包括多帧双目图像,则步骤S220可以理解为在该多帧双目图像中的每帧双目图像中提取直线。或者说,对该多帧双目图像执行步骤S220。If the at least one frame of binocular images includes multiple frames of binocular images, step S220 can be understood as extracting a straight line from each frame of binocular images in the multiple frames of binocular images. In other words, step S220 is performed on the multiple frames of binocular images.
应理解,对于不同的双目图像,m可以相同,也可以不同。也就是说,在不同双目图像中提取的直线的数量可以相同,也可以不同,本申请实施例对此不做限定。It should be understood that for different binocular images, m may be the same or different. That is to say, the number of straight lines extracted in different binocular images may be the same or different, which is not limited in this embodiment of the present application.
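Step S220 does not fix a particular line-extraction algorithm. As one hedged illustration, an image line can be fitted to detected edge pixels by total least squares; the choice of fitting method is an assumption made here, not part of the application:

```python
import numpy as np

def fit_line_2d(points):
    """Fit a 2D image line to edge pixels by total least squares (PCA).

    Returns (centroid, unit_direction): the line passes through the
    centroid of the points along the principal direction of the
    centered point cloud.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The first right-singular vector of the centered points is the
    # direction minimizing the sum of squared perpendicular distances.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0] / np.linalg.norm(vt[0])
    return centroid, direction
```
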
S230,基于双目相机的外参将第一图像中的n条直线和第二图像中的n条直线重构到三维空间中,得到重构后的n条直线。第一图像的n条直线和第二图像中的n条直线为拍摄场景中的n条直线的投影。1<n≤m,n为整数。S230. Reconstruct the n straight lines in the first image and the n straight lines in the second image into a three-dimensional space based on the external parameters of the binocular camera, to obtain n straight lines after reconstruction. The n straight lines in the first image and the n straight lines in the second image are projections of the n straight lines in the shooting scene. 1<n≤m, n is an integer.
第一图像中的n条直线属于第一图像的m条直线。第二图像中的n条直线属于第二图像的m条直线。The n straight lines in the first image belong to the m straight lines in the first image. The n lines in the second image belong to the m lines in the second image.
得到重构后的直线也可以理解为得到重构后的n条直线的空间位置。示例性地,双目相机可以为车载摄像头,重构后的n条直线的空间位置可以由自车坐标系中的n条直线的坐标表示。Obtaining the reconstructed straight lines can also be understood as obtaining the spatial positions of the n reconstructed straight lines. Exemplarily, the binocular camera may be a vehicle-mounted camera, and the reconstructed spatial positions of the n straight lines may be represented by the coordinates of the n straight lines in the ego vehicle coordinate system.
对于三维空间中的一条直线,由摄像头拍摄该直线得到的图像中包括该直线的成像,即图像中的直线。根据摄像头的成像原理可知,三维空间中的该直线、图像中的该直线以及摄像头的光心处于同一个平面。若双目相机中的两个摄像头同时拍摄三维空间中的一条直线,则三维空间中的该直线、左摄像头拍摄的左目图像中的该直线以及左摄像头的光心处于平面1#,三维空间中的该直线、右摄像头拍摄的右目图像中的该直线以及右摄像头的光心处于平面2#。平面1#和平面2#的交线即为三维空间中的该直线。For a straight line in three-dimensional space, the image obtained by a camera shooting the line includes the imaging of the line, i.e., the straight line in the image. According to the imaging principle of the camera, the straight line in three-dimensional space, the straight line in the image, and the optical center of the camera lie in the same plane. If the two cameras of the binocular camera simultaneously shoot a straight line in three-dimensional space, then the straight line in three-dimensional space, the straight line in the left-eye image captured by the left camera, and the optical center of the left camera lie in plane 1#, while the straight line in three-dimensional space, the straight line in the right-eye image captured by the right camera, and the optical center of the right camera lie in plane 2#. The intersection of plane 1# and plane 2# is the straight line in three-dimensional space.
如图3所示,若双目相机中的两个摄像头同时拍摄三维空间中的一条直线,则左目图像和右目图像中包括同一条直线的投影,基于双目相机的内参和外参,可以得到左目图像中的直线与左摄像头的光心所在的平面,以及右目图像中的直线与右摄像头的光心所在的平面,两个平面相交得到的直线即为重构后的直线。该过程即为将图像中的直线重构到三维空间中的过程。双目相机的外参越准确,重构后的直线的空间位置与三维空间中的该直线的空间位置越接近。As shown in FIG. 3, if the two cameras of the binocular camera simultaneously shoot a straight line in three-dimensional space, the left-eye image and the right-eye image include projections of the same straight line. Based on the intrinsic and extrinsic parameters of the binocular camera, the plane containing the straight line in the left-eye image and the optical center of the left camera can be obtained, as well as the plane containing the straight line in the right-eye image and the optical center of the right camera; the straight line obtained by intersecting the two planes is the reconstructed straight line. This process is the process of reconstructing the straight lines in the images into three-dimensional space. The more accurate the extrinsic parameters of the binocular camera, the closer the spatial position of the reconstructed straight line is to the spatial position of the straight line in three-dimensional space.
例如,拍摄场景中的n条直线包括直线1#和直线2#,直线1#和直线2#在第一图像中的投影分别为第一图像中的直线1#和第一图像中的直线2#。直线1#和直线2#在第二图像中的投影分别称为第二图像中的直线1#和第二图像中的直线2#。For example, the n straight lines in the shooting scene include straight line 1# and straight line 2#; the projections of straight line 1# and straight line 2# in the first image are straight line 1# in the first image and straight line 2# in the first image, respectively. The projections of straight line 1# and straight line 2# in the second image are called straight line 1# in the second image and straight line 2# in the second image, respectively.
将第一图像中的直线1#和第二图像中的直线1#重构到空间中,可以得到重构后的直线1#。同理,将第一图像中的直线2#和第二图像中的直线2#重构到空间中,可以得到重构后的直线2#。Reconstructing the straight line 1# in the first image and the straight line 1# in the second image into space, the reconstructed straight line 1# can be obtained. Similarly, by reconstructing the straight line 2# in the first image and the straight line 2# in the second image into space, the reconstructed straight line 2# can be obtained.
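The plane-intersection reconstruction described above (plane 1# spanned by the left-eye line and the left optical center, plane 2# by the right-eye line and the right optical center) can be sketched as follows; representing a plane as n·x = d, and lines as a point plus a direction, are assumptions made for illustration:

```python
import numpy as np

def backprojection_plane(optical_center, p1, p2):
    """Plane through a camera's optical center and two 3D points on the
    back-projected image line; returned as (normal n, offset d), n.x = d."""
    c = np.asarray(optical_center, float)
    n = np.cross(np.asarray(p1, float) - c, np.asarray(p2, float) - c)
    return n, float(np.dot(n, c))

def intersect_planes(n1, d1, n2, d2):
    """Intersect plane 1# with plane 2#; the intersection line is the
    reconstructed 3D line, returned as (point, unit_direction)."""
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    direction = np.cross(n1, n2)
    norm = np.linalg.norm(direction)
    if norm < 1e-12:
        raise ValueError("planes are (nearly) parallel")
    # The minimum-norm solution of the 2x3 linear system lies on the
    # intersection line, so pinv gives one point on it.
    A = np.stack([n1, n2])
    point = np.linalg.pinv(A) @ np.array([d1, d2], float)
    return point, direction / norm
```
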
若该至少一帧双目图像包括多帧双目图像,则步骤S230可以理解为基于双目相机的外参将该多帧双目图像中的直线分别重构到空间中,得到该多帧双目图像中的重构后的直线。或者说,对该多帧双目图像执行步骤S230。If the at least one frame of binocular images includes multiple frames of binocular images, step S230 can be understood as reconstructing the straight lines in the multiple frames of binocular images into space based on the extrinsic parameters of the binocular camera, to obtain the reconstructed straight lines of the multiple frames of binocular images. In other words, step S230 is performed on the multiple frames of binocular images.
应理解,对于不同的双目图像,n可以相同,也可以不同。也就是说,在不同双目图像中被重构的直线的数量可以相同,也可以不同,本申请实施例对此不做限定。It should be understood that for different binocular images, n may be the same or different. That is to say, the number of reconstructed straight lines in different binocular images may be the same or different, which is not limited in this embodiment of the present application.
S240,根据重构误差调整双目相机的外参,重构误差是根据重构后的n条直线之间的位置关系与拍摄场景中的n条直线之间的位置关系确定的。S240. Adjust the external parameters of the binocular camera according to the reconstruction error. The reconstruction error is determined according to the positional relationship between the reconstructed n straight lines and the positional relationship between the n straight lines in the shooting scene.
步骤S240可以理解为,以减小重构误差为目标,或者说,以最小化重构误差为目标调整双目相机的外参。也就是说,以双目相机的外参作为自变量,构建重构误差的方程,目标为得到使重构误差最小的双目相机的外参。Step S240 can be understood as adjusting the extrinsic parameters of the binocular camera with the goal of reducing the reconstruction error, or in other words, with the goal of minimizing the reconstruction error. That is to say, the extrinsic parameters of the binocular camera are used as independent variables to construct the equation of the reconstruction error, and the goal is to obtain the extrinsic parameters of the binocular camera that minimize the reconstruction error.
本申请实施例中,将拍摄场景中的直线之间的位置关系作为几何约束,根据重构后的n条直线之间的位置关系与拍摄场景中的n条直线之间的位置关系之间的差异调整双目相机的外参,无需精确定位拍摄场景中的直线在三维空间中的位置坐标,避免拍摄场景中的直线的位置坐标的准确度对标定结果产生影响,提高了双目相机的外参的标定精度。In the embodiment of the present application, the positional relationship between the straight lines in the shooting scene is used as a geometric constraint, and the extrinsic parameters of the binocular camera are adjusted according to the difference between the positional relationship between the reconstructed n straight lines and the positional relationship between the n straight lines in the shooting scene. There is no need to precisely locate the position coordinates of the straight lines in the shooting scene in three-dimensional space, which prevents the accuracy of those position coordinates from affecting the calibration result and improves the calibration accuracy of the extrinsic parameters of the binocular camera.
而且,本申请实施例的方案中,最少仅需要拍摄场景中的两条直线的位置关系即可完成标定,减少了计算量,提高了标定的效率。Moreover, in the solution of the embodiment of the present application, the calibration can be completed using as few as two straight lines in the shooting scene whose positional relationship is known, which reduces the amount of calculation and improves the calibration efficiency.
此外,本申请实施例的方案可以通过增加几何约束,进一步提高标定精度。In addition, the solution of the embodiment of the present application can further improve the calibration accuracy by adding geometric constraints.
此外,本申请实施例的方案的泛化能力强,适用于多种标定场景。以车载双目相机为例,标定物可以采用开放道路中的常见元素,例如,路灯杆或行道线等,无需对标定场景预先布置,例如,无需预设靶标,降低了成本。In addition, the solutions in the embodiments of the present application have strong generalization ability and are applicable to various calibration scenarios. Taking the vehicle-mounted binocular camera as an example, the calibration objects can use common elements on open roads, such as street light poles or road markings, without pre-arranging the calibration scene. For example, there is no need to preset targets, which reduces costs.
根据重构误差调整双目相机的外参可以为根据一帧或多帧双目图像的重构误差调整双目相机的外参。为了便于描述,下面先对一帧双目图像的重构误差进行说明。Adjusting the extrinsic parameters of the binocular camera according to the reconstruction error may be adjusting the extrinsic parameters of the binocular camera according to the reconstruction error of one or more frames of binocular images. For the convenience of description, the reconstruction error of a frame of binocular image will be described below.
具体地,对于一帧双目图像而言,该帧双目图像的重构误差用于指示重构后的n条直线之间的位置关系与拍摄场景中的n条直线之间的位置关系之间的差异。重构误差越小,说明重构后的n条直线之间的位置关系越符合拍摄场景中的n条直线之间的位置关系,当前双目相机的外参的准确性越高。Specifically, for one frame of binocular images, the reconstruction error of the frame is used to indicate the difference between the positional relationship between the reconstructed n straight lines and the positional relationship between the n straight lines in the shooting scene. The smaller the reconstruction error, the better the positional relationship between the reconstructed n straight lines conforms to the positional relationship between the n straight lines in the shooting scene, and the higher the accuracy of the current extrinsic parameters of the binocular camera.
在一种实现方式中,可以将调整后的双目相机的外参作为双目相机的外参标定值。In an implementation manner, the adjusted extrinsic parameters of the binocular camera may be used as calibration values of the extrinsic parameters of the binocular camera.
在另一种实现方式中,可以将步骤S230中的双目相机的外参更新为调整后的双目相机的外参,重复执行步骤S230至步骤S240,直至得到满足预设条件的双目相机的外参,将该双目相机的外参作为双目相机的外参标定值。例如,预设条件可以为重构误差小于或等于误差阈值。In another implementation, the extrinsic parameters of the binocular camera in step S230 may be updated to the adjusted extrinsic parameters, and steps S230 to S240 are repeated until extrinsic parameters of the binocular camera satisfying a preset condition are obtained; those extrinsic parameters are used as the calibration values of the extrinsic parameters of the binocular camera. For example, the preset condition may be that the reconstruction error is less than or equal to an error threshold.
示例性地,可以在双目相机的外参姿态空间搜索使重构误差最小的双目相机的外参,并将搜索到的外参作为双目相机的外参标定值。Exemplarily, the extrinsic parameters of the binocular camera that minimize the reconstruction error may be searched in the pose space of the extrinsic parameters of the binocular camera, and the searched extrinsic parameters may be used as calibration values of the extrinsic parameters of the binocular camera.
可替换地,可以通过非线性优化的方式调整双目相机的外参,并将调整后的外参作为双目相机的外参标定值。Alternatively, the extrinsic parameters of the binocular camera can be adjusted in a non-linear optimization manner, and the adjusted extrinsic parameters can be used as calibration values of the extrinsic parameters of the binocular camera.
应理解,以上仅为示例,其他求解最优解的方式也适用于本申请实施例的方案。It should be understood that the above is only an example, and other ways of finding the optimal solution are also applicable to the solution of the embodiment of the present application.
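As one hedged illustration of the pose-space search mentioned above, a coarse grid search over pitch/roll/yaw can be sketched as follows; the error function `recon_error` is assumed to be supplied by the calibration module, and the search range and step count are illustrative assumptions:

```python
import itertools
import numpy as np

def search_extrinsics(recon_error, init=(0.0, 0.0, 0.0),
                      span_deg=1.0, steps=5):
    """Coarse grid search over (pitch, roll, yaw) offsets, in degrees,
    around an initial extrinsic guess; keeps the pose with the smallest
    reconstruction error.

    `recon_error(pitch, roll, yaw) -> float` is assumed to evaluate the
    reconstruction error of the given extrinsics over the binocular frames.
    """
    offsets = np.linspace(-span_deg, span_deg, steps)
    best_pose, best_err = tuple(init), recon_error(*init)
    for dp, dr, dy in itertools.product(offsets, repeat=3):
        cand = (init[0] + dp, init[1] + dr, init[2] + dy)
        err = recon_error(*cand)
        if err < best_err:
            best_pose, best_err = cand, err
    return best_pose, best_err
```

In practice the grid search would be refined iteratively or replaced by a nonlinear optimizer, as the alternative above notes.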
拍摄场景中的n条直线之间的位置关系可以作为该n条直线之间的几何约束。重构误差越小,说明重构后的n条直线越能满足该几何约束。The positional relationship between the n straight lines in the shooting scene can be used as a geometric constraint between the n straight lines. The smaller the reconstruction error, the better the reconstructed n straight lines can satisfy the geometric constraint.
在该情况下,步骤S240可以理解为,调整双目相机的外参以使重构后的n条直线尽量满足拍摄场景中的n条直线之间的几何约束。或者说,根据拍摄场景中的n条直线之间的几何约束调整双目相机的外参。In this case, step S240 can be understood as adjusting the extrinsic parameters of the binocular camera so that the reconstructed n straight lines meet the geometric constraints among the n straight lines in the shooting scene as much as possible. In other words, adjust the extrinsic parameters of the binocular camera according to the geometric constraints between n straight lines in the shooting scene.
该几何约束的具体表现形式与拍摄场景中的n条直线之间的位置关系有关。The specific form of the geometric constraint is related to the positional relationship between n straight lines in the shooting scene.
以拍摄场景中的两条直线为例,拍摄场景中的两条直线之间的位置关系为夹角为60度的两条直线,该两条直线满足的几何约束可以包括:该两条直线相交,夹角为60度。在该情况下,可以调整双目相机的外参以使重构后的两条直线尽量满足该几何约束。Taking two straight lines in the shooting scene as an example, the positional relationship between the two straight lines is that they intersect at an included angle of 60 degrees; the geometric constraint satisfied by the two straight lines may include: the two straight lines intersect, with an included angle of 60 degrees. In this case, the extrinsic parameters of the binocular camera can be adjusted so that the two reconstructed straight lines satisfy this geometric constraint as closely as possible.
可选地,重构误差包括以下至少一项:重构后的n条直线之间的角度误差或重构后的n条直线之间的距离误差。Optionally, the reconstruction error includes at least one of the following: an angle error between the reconstructed n straight lines or a distance error between the reconstructed n straight lines.
重构后的n条直线之间的角度误差是根据重构后的n条直线中的至少两条直线之间的角度与拍摄场景中的n条直线中的至少两条直线之间的角度之间的差值确定的。重构后的至少两条直线与拍摄场景中的至少两条直线是对应的。The angle error between the reconstructed n straight lines is determined according to the difference between the angle between at least two of the reconstructed n straight lines and the angle between the corresponding at least two of the n straight lines in the shooting scene. The reconstructed at least two straight lines correspond to the at least two straight lines in the shooting scene.
重构后的n条直线之间的角度误差用于约束重构后的n条直线之间的角度。即角度误差可以作为一种角度约束。The angle error between the reconstructed n straight lines is used to constrain the angle between the reconstructed n straight lines. That is, the angular error can be used as an angular constraint.
重构后的两条直线的角度误差是根据重构后的两条直线之间的角度与拍摄场景中的两条直线之间的角度之间的差值确定的。The angle error of the reconstructed two straight lines is determined according to the difference between the angle between the reconstructed two straight lines and the angle between the two straight lines in the shooting scene.
例如,拍摄场景中的两条直线之间的角度为a。重构后的两条直线的角度误差为重构后的两条直线之间的角度与a之间的差值的绝对值。For example, the angle between two straight lines in the shooting scene is a. The angle error of the reconstructed two straight lines is the absolute value of the difference between the angle between the reconstructed two straight lines and a.
示例性地,若重构后的至少两条直线数量为2,则重构后的n条直线之间的角度误差可以为该重构后的两条直线的角度误差。Exemplarily, if the number of at least two reconstructed straight lines is 2, the angle error between the n reconstructed straight lines may be the angle error of the reconstructed two straight lines.
也就是说,可以从重构后的n条直线中选择两条直线,并将该两条直线的重构误差作为重构后的n条直线之间的角度误差。That is to say, two straight lines can be selected from the reconstructed n straight lines, and the reconstruction error of the two straight lines can be used as the angle error between the n reconstructed straight lines.
示例性地,若重构后的至少两条直线的数量大于2,则重构后的n条直线之间的角度误差可以为该重构后的至少两条直线的角度误差的总和,或者,该重构后的至少两条直线的角度误差的平均值。Exemplarily, if the number of the reconstructed at least two straight lines is greater than 2, the angle error between the reconstructed n straight lines may be the sum of the angle errors of the reconstructed at least two straight lines, or the average of the angle errors of the reconstructed at least two straight lines.
也就是说,可以从重构后的n条直线中选择3条或3条以上的直线,并将被选择的直线之间的角度误差的总和,或者,被选择的直线之间的角度误差的平均值作为重构后的n条直线之间的角度误差。That is to say, three or more straight lines may be selected from the reconstructed n straight lines, and either the sum of the angle errors between the selected straight lines or the average of the angle errors between the selected straight lines is used as the angle error between the reconstructed n straight lines.
例如,重构后的至少两条直线包括重构后的直线1#、重构后的直线2#和重构后的直线3#。重构后的直线1#与重构后的直线2#的角度误差为角度误差1#,重构后的直线1#和重构后的直线3#的角度误差为角度误差2#。重构后的n条直线的角度误差可以为角度误差1#和角度误差2#的总和,或者,角度误差可以为角度误差1#和角度误差2#的平均值。For example, the at least two reconstructed straight lines include reconstructed straight line 1#, reconstructed straight line 2# and reconstructed straight line 3#. The angle error between the reconstructed straight line 1# and the reconstructed straight line 2# is the angle error 1#, and the angle error between the reconstructed straight line 1# and the reconstructed straight line 3# is the angle error 2#. The angle error of the reconstructed n straight lines may be the sum of angle error 1# and angle error 2#, or the angle error may be the average value of angle error 1# and angle error 2#.
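The angle-error computation described above can be sketched as follows for reconstructed lines given by direction vectors (representing a line by its direction vector is an assumption for illustration):

```python
import numpy as np

def angle_between_lines(d1, d2):
    """Angle in degrees between two 3D lines given their direction
    vectors; lines are unoriented, so the result is in [0, 90]."""
    d1, d2 = np.asarray(d1, float), np.asarray(d2, float)
    c = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

def angle_error(d1, d2, scene_angle_deg):
    """|angle between the reconstructed lines - angle between the
    corresponding scene lines|; e.g. scene angle 0 for parallel lines,
    90 for perpendicular lines."""
    return abs(angle_between_lines(d1, d2) - scene_angle_deg)
```
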
重构后的n条直线之间的距离误差是根据重构后的n条直线中的至少两条直线之间的距离与拍摄场景中的n条直线中的至少两条直线之间的距离之间的差值确定的。重构后的至少两条直线与拍摄场景中的至少两条直线是对应的。The distance error between the reconstructed n straight lines is determined according to the difference between the distance between at least two of the reconstructed n straight lines and the distance between the corresponding at least two of the n straight lines in the shooting scene. The reconstructed at least two straight lines correspond to the at least two straight lines in the shooting scene.
在计算距离误差时所采用的拍摄场景中的该至少两条直线包括至少两条平行的直线。例如,拍摄场景中的该至少两条直线可以相互平行,或者,该至少两条直线包括多组平行的直线,同一组内的多条直线相互平行,不同组的直线不平行,本申请实施例对此不做限定。The at least two straight lines in the shooting scene used when calculating the distance error include at least two parallel straight lines. For example, the at least two straight lines in the shooting scene may all be parallel to each other; alternatively, the at least two straight lines may include multiple groups of parallel straight lines, where the straight lines within one group are parallel to each other and the straight lines of different groups are not parallel, which is not limited in this embodiment of the present application.
应理解,重构后的n条直线之间的距离误差所采用的至少两条直线与重构后的n条直线之间的角度误差所采用的至少两条直线可以相同,也可以不同。It should be understood that the at least two straight lines used for the distance error between the n straight lines after reconstruction and the at least two straight lines used for the angle error between the n straight lines after reconstruction may be the same or different.
重构后的n条直线之间的距离误差用于约束重构后的直线之间的距离。即距离误差可以作为一种距离约束。The distance error between the reconstructed n straight lines is used to constrain the distance between the reconstructed straight lines. That is, the distance error can be used as a distance constraint.
重构后的两条直线的距离误差是根据重构后的两条直线之间的距离与拍摄场景中的两条直线之间的距离之间的差值确定的。The distance error of the reconstructed two straight lines is determined according to the difference between the distance between the reconstructed two straight lines and the distance between the two straight lines in the shooting scene.
例如,拍摄场景中的两条直线之间的距离为b,重构后的两条直线的距离误差可以是根据重构后的两条直线之间的距离与b之间的差值确定的。For example, the distance between two straight lines in the shooting scene is b, and the distance error of the reconstructed two straight lines may be determined according to the difference between the distance between the reconstructed two straight lines and b.
示例性地,重构后的两条直线之间的距离可以根据其中一条直线上的一个或多个点和另一条直线之间的距离确定。Exemplarily, the distance between the reconstructed two straight lines may be determined according to the distance between one or more points on one of the straight lines and the other straight line.
该一个或多个点可以根据需要设置,示例性地,根据深度值确定该一个点或多个点。例如,选择其中一条直线上深度为0米的点和深度为30米的点。The one or more points can be set as required, for example, the one or more points are determined according to the depth value. For example, select a point on one of the lines at a depth of 0 meters and a point at a depth of 30 meters.
例如,将重构后的两条直线中的一条直线上的多个点与另一条直线之间的多个距离的平均值作为重构后的两条直线之间的距离。For example, the average value of multiple distances between multiple points on one of the reconstructed straight lines and the other straight line is taken as the distance between the reconstructed two straight lines.
以两个点为例,从重构后的两条直线中的直线1#上取两个点,分别计算这两个点到另一条直线2#的距离,将这两个点到另一条直线2#的距离的平均值作为重构后的两条直线之间的距离。或者,从重构后的两条直线中的直线1#上取一个点,计算该点到另一条直线2#的距离,从直线2#上取一个点,计算该点到直线1#的距离,将该两个距离的平均值作为重构后的两条直线之间的距离。Taking two points as an example: take two points on straight line 1# of the two reconstructed straight lines, calculate the distance from each of the two points to the other straight line 2#, and use the average of the two distances as the distance between the two reconstructed straight lines. Alternatively, take a point on straight line 1# of the two reconstructed straight lines and calculate its distance to the other straight line 2#, take a point on straight line 2# and calculate its distance to straight line 1#, and use the average of the two distances as the distance between the two reconstructed straight lines.
应理解,以上仅为示例,还可以通过其他方式确定重构后的两条直线之间的距离,本申请实施例对此不作限定。It should be understood that the above is only an example, and the distance between the reconstructed two straight lines may also be determined in other ways, which is not limited in this embodiment of the present application.
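One such way, following the point-sampling description above, can be sketched as follows; the sampling depths of 0 m and 30 m follow the earlier example, and representing a line as a point plus a direction is an assumption for illustration:

```python
import numpy as np

def point_to_line_dist(p, q, d):
    """Perpendicular distance from 3D point p to the line through point q
    with direction d."""
    p, q, d = (np.asarray(a, float) for a in (p, q, d))
    return float(np.linalg.norm(np.cross(p - q, d)) / np.linalg.norm(d))

def line_to_line_dist(q1, d1, q2, d2, depths=(0.0, 30.0)):
    """Distance between two reconstructed (nominally parallel) lines:
    average of the distances from sample points on line 1 -- taken here
    at the depths 0 m and 30 m of the earlier example -- to line 2."""
    q1, d1 = np.asarray(q1, float), np.asarray(d1, float)
    u = d1 / np.linalg.norm(d1)
    samples = [q1 + t * u for t in depths]
    return float(np.mean([point_to_line_dist(p, q2, d2) for p in samples]))
```

The resulting value minus the known scene distance gives the distance-error term for this pair of lines.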
示例性地,若重构后的至少两条直线数量为2,则重构后的n条直线之间的距离误差可以为该重构后的两条直线的距离误差。Exemplarily, if the number of at least two reconstructed straight lines is 2, the distance error between the n reconstructed straight lines may be the distance error of the reconstructed two straight lines.
示例性地,若重构后的至少两条直线的数量大于2,则重构后的n条直线之间的距离误差可以为该重构后的至少两条直线的距离误差的总和,或者,该重构后的至少两条直线的距离误差的平均值。Exemplarily, if the number of the reconstructed at least two straight lines is greater than 2, the distance error between the reconstructed n straight lines may be the sum of the distance errors of the reconstructed at least two straight lines, or the average of the distance errors of the reconstructed at least two straight lines.
例如,重构后的至少两条直线包括重构后的直线1#、重构后的直线2#和重构后的直线3#。重构后的直线1#与重构后的直线2#的距离误差为距离误差1#,重构后的直线1#和重构后的直线3#的距离误差为距离误差2#。重构后的n条直线之间的距离误差可以为距离误差1#和距离误差2#的总和,或者距离误差可以为距离误差1#和距离误差2#的平均值。For example, the at least two reconstructed straight lines include reconstructed straight line 1#, reconstructed straight line 2# and reconstructed straight line 3#. The distance error between the reconstructed straight line 1# and the reconstructed straight line 2# is the distance error 1#, and the distance error between the reconstructed straight line 1# and the reconstructed straight line 3# is the distance error 2#. The distance error between the reconstructed n straight lines may be the sum of the distance error 1# and the distance error 2#, or the distance error may be the average value of the distance error 1# and the distance error 2#.
可选地,拍摄场景中的至少两条直线包括至少两条相互平行的直线。Optionally, the at least two straight lines in the shooting scene include at least two parallel straight lines.
在该情况下,角度误差包括平行误差。In this case, angular errors include parallel errors.
示例性地,重构误差包括平行误差和距离误差。Exemplarily, the reconstruction error includes a parallel error and a distance error.
平行误差用于约束重构后的直线之间的平行关系。即平行误差可以作为一种平行约束。The parallel error is used to constrain the parallel relationship between the reconstructed lines. That is, the parallel error can be used as a parallel constraint.
示例性地,两条相互平行的直线之间的角度可以为0。平行误差可以为该重构后的直线之间的角度。Exemplarily, the angle between two parallel straight lines may be zero. The parallel error may be the angle between the reconstructed straight lines.
示例性地,拍摄场景中的两条相互平行的直线之间的距离可以通过高精地图确定。Exemplarily, the distance between two parallel straight lines in the shooting scene may be determined through a high-precision map.
或者,拍摄场景中的两条相互平行的直线之间的距离也可以通过其他传感器测量得到。本申请实施例对拍摄场景中的直线之间的距离的确定方法不做限定。Alternatively, the distance between two parallel straight lines in the shooting scene can also be obtained through measurement by other sensors. The method for determining the distance between the straight lines in the shooting scene is not limited in this embodiment of the present application.
例如,拍摄场景中的至少两条直线为两条相互平行的直线。也就是说,采用拍摄场景中两条相互平行的直线之间的位置关系约束重构后的直线。重构误差可以包括重构后的该两条直线之间的平行误差和距离误差。For example, at least two straight lines in the shooting scene are two parallel straight lines. That is to say, the reconstructed straight line is constrained by the positional relationship between two parallel straight lines in the shooting scene. The reconstruction error may include a parallel error and a distance error between the two reconstructed straight lines.
根据平行误差和距离误差调整外参,能够保证外参的准确度。同时,本申请实施例的方案仅需要拍摄场景中的两条已知距离的平行线即可实现双目相机外参的标定,能够减少重构误差中的误差项,进而减少计算量,提高外参的调整速度,即提高外参的标定效率。此外,由于本申请实施例的方案仅需要拍摄场景中的两条已知距离的平行线即可实现双目相机外参的标定,无需精确定位出直线在三维空间中的位置,无需预设靶标,降低了对拍摄场景的要求,减少了标定成本。Adjusting the extrinsic parameters according to the parallel error and the distance error can ensure the accuracy of the extrinsic parameters. At the same time, the solution of the embodiment of the present application needs only two parallel lines with a known distance in the shooting scene to calibrate the extrinsic parameters of the binocular camera, which can reduce the number of error terms in the reconstruction error, thereby reducing the amount of calculation and increasing the adjustment speed of the extrinsic parameters, i.e., improving the calibration efficiency. In addition, since only two parallel lines with a known distance are needed, there is no need to precisely locate the position of the straight lines in three-dimensional space and no need to preset targets, which lowers the requirements on the shooting scene and reduces the calibration cost.
进一步地,拍摄场景中的至少两条直线包括至少两条相互垂直的直线。Further, the at least two straight lines in the shooting scene include at least two mutually perpendicular straight lines.
在该情况下,角度误差包括垂直误差。In this case, angular errors include vertical errors.
垂直误差用于约束重构后的直线之间的垂直关系。即垂直误差可以作为一种垂直约束。The vertical error is used to constrain the vertical relationship between the reconstructed lines. That is, the vertical error can be used as a vertical constraint.
具体地,垂直误差用于计算重构后的两条直线之间的角度与拍摄场景中的两条相互垂直的直线之间的角度之间的差异。Specifically, the vertical error is used to calculate the difference between the angle between the reconstructed two straight lines and the angle between the two mutually perpendicular straight lines in the shooting scene.
例如,两条相互垂直的直线之间的角度为90度,则重构后的两条直线之间的垂直误差项可以为重构后的两条直线之间的角度与90度之间的差值。For example, the angle between two mutually perpendicular straight lines is 90 degrees, so the vertical error term between the two reconstructed straight lines may be the difference between the angle between the two reconstructed straight lines and 90 degrees.
As mentioned above, step S240 may be adjusting the extrinsic parameters of the binocular camera according to the reconstruction errors of multiple frames of binocular images.
Exemplarily, step S240 includes: adjusting the extrinsic parameters of the binocular camera according to the sum of the reconstruction errors of the multiple frames of binocular images.
Alternatively, step S240 includes: adjusting the extrinsic parameters of the binocular camera according to the average of the reconstruction errors of the multiple frames of binocular images.
In this case, step S240 can be understood as follows: taking the extrinsic parameters of the binocular camera as variables, an equation for the reconstruction error of the multiple frames of binocular images is constructed, the extrinsic parameters that minimize the reconstruction error of the multiple frames of binocular images are computed, and those extrinsic parameters are used as the calibrated extrinsic parameters of the binocular camera.
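For illustration only, the minimization described above can be sketched as a brute-force search over yaw/pitch/roll offsets that minimizes the total reconstruction error over all frames. Here `reconstruction_error` is a placeholder for the per-frame error built from the angle and distance terms, and the search range and step size are arbitrary assumptions:

```python
import itertools

def calibrate_extrinsics(frames, reconstruction_error, search_range=2.0, step=0.5):
    """Search yaw/pitch/roll offsets (degrees) that minimize the sum of
    per-frame reconstruction errors; returns the best offsets and error."""
    grid = [i * step - search_range for i in range(int(2 * search_range / step) + 1)]
    best, best_err = None, float("inf")
    for yaw, pitch, roll in itertools.product(grid, repeat=3):
        total = sum(reconstruction_error(f, yaw, pitch, roll) for f in frames)
        if total < best_err:
            best, best_err = (yaw, pitch, roll), total
    return best, best_err
```

In practice a gradient-based or coarse-to-fine search over the pose space would be preferred over an exhaustive grid; the sketch only shows the "pick the extrinsics with the smallest accumulated error" idea.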
The reconstruction error of each frame of binocular images can be calculated as described above, and details are not repeated here.
Since line detection may involve a certain amount of error, in the embodiments of the present application the extrinsic parameters of the binocular camera are adjusted based on the accumulated reconstruction errors of multiple frames of binocular images. This reduces the impact of line-detection errors and improves the accuracy of the extrinsic calibration of the binocular camera.
Optionally, the method 200 further includes: controlling a display to show the calibration status of the extrinsic parameters of the binocular camera.
Exemplarily, the display may be a vehicle-mounted display.
That is, the vehicle-mounted display can show the calibration status of the extrinsic parameters of the binocular camera in real time.
This helps the user learn the calibration status in real time and improves the user experience.
Optionally, the calibration status of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction-error status, or the reconstructed p straight lines.
The reconstructed p straight lines are obtained by reconstructing, based on the current extrinsic parameters of the binocular camera, p of the m straight lines in the first image and p of the m straight lines in the second image into three-dimensional space, where 1&lt;p≤m and p is an integer.
The p straight lines in the first image and the p straight lines in the second image are matched straight lines, and they are projected into three-dimensional space.
Exemplarily, the current extrinsic parameters of the binocular camera may be the adjusted extrinsic parameters of the binocular camera.
Alternatively, the current extrinsic parameters of the binocular camera may be the optimal extrinsic parameters found during the adjustment process, that is, the extrinsic parameters that yield the smallest reconstruction error during the adjustment.
In other words, the calibration result is visualized through the spatial positions of the reconstructed straight lines.
It should be noted that the straight lines reconstructed when visualizing the calibration result and the straight lines reconstructed when adjusting the extrinsic parameters may correspond to the same straight lines in the shooting scene or to different straight lines; this is not limited in the embodiments of the present application.
In existing calibration schemes, the reprojection error is usually provided only after calibration is complete, and the current calibration status cannot be provided in real time.
In the embodiments of the present application, the calibration result is visualized and the three-dimensional positions of the reconstructed straight lines are displayed, which helps the user intuitively perceive the current calibration status.
Optionally, the current calibration progress includes at least one of the following: the current extrinsic parameters of the binocular camera or the current degree of calibration completion.
For example, the current extrinsic parameters of the binocular camera may be expressed in the form of yaw, pitch, and roll, or in the form of a rotation matrix; this is not limited in the embodiments of the present application.
Exemplarily, the current degree of calibration completion can be determined according to the current reconstruction error.
The current reconstruction error refers to the value of the reconstruction error corresponding to the current extrinsic parameters of the binocular camera, that is, the reconstruction error obtained based on the current extrinsic parameters. For example, if the current extrinsic parameters are the optimal extrinsic parameters found during the adjustment process, the current reconstruction error is the smallest reconstruction error obtained during the adjustment.
For example, the current degree of calibration completion can be determined according to the difference between the current reconstruction error and an error threshold. The degree of completion may be that difference itself, or a percentage determined from the difference. In other words, the smaller the difference between the current reconstruction error and the error threshold, the higher the current degree of calibration completion.
Exemplarily, the current degree of calibration completion can be determined according to the current number of extrinsic-parameter searches.
As mentioned above, in step S240, the extrinsic parameters that minimize the reconstruction error may be searched for in the pose space of the extrinsic parameters of the binocular camera. The current degree of calibration completion can be determined according to the current number of searches and a search-count threshold: the closer the current number of searches is to the threshold, the higher the current degree of calibration completion.
Exemplarily, the current degree of calibration completion can be determined according to the number of binocular image frames processed so far.
As mentioned above, in step S240, the extrinsic parameters of the binocular camera can be adjusted according to the reconstruction errors of multiple frames of binocular images. The current degree of calibration completion can be determined according to the number of frames processed so far and the total number of frames to be processed: the closer the number of processed frames is to the total, the higher the degree of completion. For example, if 30 of the 50 frames to be processed have been processed, the current degree of calibration completion may be 60%.
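For illustration only, two of the completion metrics described above can be sketched as follows; the function names and the clamping behavior are illustrative assumptions:

```python
def completion_from_frames(processed, total):
    """Calibration completion as a percentage of processed frames."""
    return 100.0 * processed / total

def completion_from_error(current_error, start_error, threshold):
    """Completion as how far the reconstruction error has dropped from its
    starting value toward the target threshold, clamped to [0, 100]."""
    span = start_error - threshold
    done = start_error - current_error
    return max(0.0, min(100.0, 100.0 * done / span))
```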
As another example, the current degree of calibration completion may be the current reconstruction error itself.
Optionally, the current reconstruction-error status may include at least one of the following: the current reconstruction error, the current distance error, or the current angle error.
That is, the constraint terms currently used for calibration can be shown as part of the current reconstruction-error status.
For example, if the reconstruction error is determined from the angle error and the distance error, the current reconstruction-error status may include the current reconstruction error, the current angle error, and the current distance error, where the current reconstruction error is the value determined from the current angle error and the current distance error.
The current calibration progress and the current reconstruction-error status can quantitatively show the current calibration status.
In the solution of the embodiments of the present application, the current calibration status can be displayed in real time, which helps the user follow the calibration progress and improves the user experience.
It should be understood that the above is merely an example; other display items may also be configured as needed during the calibration process, and this is not limited in the embodiments of the present application.
Through step S221, the projections of multiple straight lines in the shooting scene onto the first image and the second image can be obtained. In other words, step S221 yields the correspondence between the straight lines in the shooting scene and the straight lines in the first image and in the second image.
Optionally, step S221 includes steps S2211 to S2213 (not shown in the figure), which are described below. For brevity and clarity, only one frame of binocular images is used as an example in steps S2211 to S2213; straight lines can be extracted from other binocular images in the same way, and details are not repeated here.
S2211: Perform instance segmentation on the first image and the second image respectively, to obtain the instances in the first image and the instances in the second image.
Performing instance segmentation on an image yields the different instances in the image; in other words, it yields the instance to which each pixel in the image belongs.
Optionally, step S2211 may be implemented through steps 11) and 12).
Step 11): perform semantic segmentation on the first image and the second image respectively, to obtain the semantic segmentation result of the first image and the semantic segmentation result of the second image.
The semantic segmentation result of an image includes the semantic information corresponding to the pixels in the image. The semantic information of a pixel can also be understood as the category to which the pixel belongs.
Exemplarily, each image may be processed by a semantic segmentation network to obtain the semantic segmentation result.
The semantic segmentation network may be an existing neural network model, for example, DeepLabv3.
The semantic segmentation network may be trained on public datasets. The specific training process belongs to the prior art and is not repeated here.
Inputting an image into the semantic segmentation network yields the semantic information corresponding to the pixels in the image.
Exemplarily, the categories output by the semantic segmentation network may include horizontal objects and vertical objects. That is, the semantic segmentation network can distinguish whether a pixel in the image belongs to a horizontal object or a vertical object.
Optionally, the semantic segmentation result of the first image includes the horizontal objects or vertical objects in the first image, and the semantic segmentation result of the second image includes the horizontal objects or vertical objects in the second image.
In other words, semantic segmentation of the first image yields the pixels belonging to horizontal objects and the pixels belonging to vertical objects in the first image, and semantic segmentation of the second image yields the pixels belonging to horizontal objects and the pixels belonging to vertical objects in the second image.
As mentioned above, the binocular camera in the embodiments of the present application may be a vehicle-mounted camera. In this case, horizontal objects may include road markings, for example, solid or dashed lane lines, and vertical objects may include poles or columns, for example, street-light poles.
In the embodiments of the present application, horizontal and vertical objects in the image are distinguished through semantic segmentation, so that geometric constraints between the straight lines of horizontal or vertical objects in the shooting scene, for example, perpendicularity constraints, can be used to adjust the extrinsic parameters of the binocular camera. If the binocular camera is a vehicle-mounted camera, road markings, poles, and the like are common on open roads, so calibration objects found on open roads can be used to adjust the extrinsic parameters of the binocular camera without arranging a calibration site in advance, which reduces cost.
It should be understood that the above semantic segmentation result is merely an example; the semantic information may be set according to the category of the calibration object. If the calibration site includes other types of calibration objects, the semantic segmentation network can be trained to output other types of semantic information, for example, to distinguish triangular or square objects; this is not limited in the embodiments of the present application.
Step 12): perform instance segmentation on the first image according to the semantic segmentation result of the first image, to obtain the instances in the first image; perform instance segmentation on the second image according to the semantic segmentation result of the second image, to obtain the instances in the second image.
Performing instance segmentation on an image according to the semantic segmentation result means distinguishing different individuals among pixels of the same semantic category, that is, determining the instance to which each pixel in the image belongs. One instance represents one individual.
That is to say, in step 12), the input may be the coordinates of all pixels with the same semantics, and the output may be the instance to which each pixel belongs.
Exemplarily, a clustering method may be used to distinguish different individuals among pixels of the same semantic category, for example, density-based spatial clustering of applications with noise (DBSCAN).
Specifically, if the distance between two pixels with the same semantics is less than or equal to an interval threshold, the two pixels belong to the same instance.
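For illustration only, the threshold-based grouping described above can be sketched as follows. This is a simplified, DBSCAN-style clustering without the minimum-points density check; two pixels end up in the same instance whenever they are connected through a chain of neighbors within the interval threshold:

```python
from collections import deque

def cluster_pixels(pixels, eps=2.0):
    """Group same-semantic pixels into instances: pixels whose distance is
    at most `eps` (transitively) receive the same instance label."""
    labels = [-1] * len(pixels)
    next_label = 0
    for i in range(len(pixels)):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        queue = deque([i])
        while queue:
            j = queue.popleft()
            xj, yj = pixels[j]
            for k in range(len(pixels)):
                if labels[k] == -1:
                    xk, yk = pixels[k]
                    if (xj - xk) ** 2 + (yj - yk) ** 2 <= eps ** 2:
                        labels[k] = next_label
                        queue.append(k)
        next_label += 1
    return labels
```

For example, two pole instances whose pixel columns are far apart receive different labels even though all their pixels share the "vertical object" semantics.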
It should be understood that other instance segmentation methods in the prior art may also be used to perform instance segmentation on the images; the embodiments of the present application do not limit the specific implementation of instance segmentation.
The correspondence between the instances in the first image and the instances in the second image may be determined according to the positions of the instances in the first image and the positions of the instances in the second image.
Corresponding instances correspond to the same calibration object in the shooting scene; in other words, corresponding instances in the two images are projections of the same calibration object in the shooting scene.
The position of an instance in an image may be its absolute position, for example, its coordinates in the image, or it may be its position relative to the other instances in the image.
The first image and the second image are captured by the two cameras of the binocular camera for the same shooting scene, so the difference between the two images is small. That is, the projections of the same calibration object in the two images are close in position. Therefore, the correspondence between the instances in the two images can be determined from their positions.
For example, the leftmost of the vertical objects in the first image corresponds to the leftmost of the vertical objects in the second image.
Further, instance annotation may be performed on the first image and the second image respectively, to obtain the annotation information of the instances in the first image and the annotation information of the instances in the second image.
Performing instance annotation on an image means labeling the instances in the image; different annotation information in an image indicates different instances in the image.
For example, the annotation information may be an instance number; different instance numbers in an image indicate different instances in the image.
Specifically, the instances in an image are labeled according to their positions in the image.
For example, in the first image and the second image, instances with the same relative position are given the same instance number.
In this way, the correspondence between the instances in the first image and the instances in the second image can be indicated by the annotation information of the instances. For example, instances with the same instance number in the first image and the second image correspond to each other.
By matching the instances in the first image with the instances in the second image, the correspondence between the instances in the two images can be obtained.
That is to say, matching the instances in the first image with the instances in the second image can be achieved by performing instance annotation on the first image and the second image respectively.
It should be understood that the correspondence may cover all the instances in the first image and all the instances in the second image, or only some of the instances in the first image and some of the instances in the second image.
S2212: Extract m straight lines from the instances in the first image and from the instances in the second image, respectively.
The correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to the correspondence between the instances in the first image and the instances in the second image.
Optionally, step S2212 may be implemented through steps 21) to 23).
21) Extract multiple original straight lines from the instances of the first image and of the second image, respectively.
Exemplarily, the multiple original straight lines may be extracted from an instance by machine-vision methods.
For example, the original straight lines may be extracted from an instance by the Hough transform.
Specifically, a small region of interest (ROI) is set at the edges of each instance in the image, for example, on the left and right sides of each instance; the instance edge pixels in the ROI are projected into the straight-line parameter space, and by setting a threshold on the number of points in the parameter space, the straight lines in the instance are extracted; these are the original straight lines of the instance.
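For illustration only, the Hough voting described above can be sketched as follows. Edge pixels vote in the (theta, rho) parameter space with rho = x·cos(theta) + y·sin(theta), and the dominant bin gives the extracted line; the bin sizes and function name are illustrative assumptions:

```python
import math
from collections import Counter

def hough_peak(points, theta_step_deg=1, rho_step=1.0):
    """Vote edge points into (theta, rho) parameter space and return the
    dominant line as (theta in degrees, rho, vote count)."""
    votes = Counter()
    for x, y in points:
        for t in range(0, 180, theta_step_deg):
            theta = math.radians(t)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    (theta_deg, rho_bin), count = votes.most_common(1)[0]
    return theta_deg, rho_bin * rho_step, count
```

In practice the thresholding step described above would keep every bin whose vote count exceeds the threshold, yielding one original line per sufficiently supported bin rather than only the single peak.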
The instance information of an original straight line extracted from an instance indicates the position of the original straight line within that instance, for example, on the left or right side of the instance.
It should be understood that the position of an original straight line within an instance depends on how the ROI is set, and this is not limited in the embodiments of the present application.
22) Fit the multiple original straight lines on one side edge of an instance in the first image to a single target straight line for that side edge of the instance in the first image; fit the multiple original straight lines on one side edge of an instance in the second image to a single target straight line for that side edge of the instance in the second image.
Multiple original straight lines may be extracted on one side edge of an instance; for example, multiple original straight lines may be extracted on the left side of an instance. To further improve the accuracy of the straight lines needed for calibration, the multiple original straight lines extracted on one side edge of the instance can be fitted to a single straight line; the straight line obtained by the fitting is the target straight line of that side edge of the instance.
Step 22) is optional. If step S2212 does not include step 22), one original straight line may instead be selected from the multiple original straight lines extracted on one side edge of the instance as the target straight line of that side edge.
If only one original straight line is extracted on one side edge of an instance, that original straight line can be used as the target straight line of that side edge.
Exemplarily, the fitting may be performed by random sample consensus (RANSAC).
For example, two points are randomly chosen from the boundary points of the multiple original straight lines on the same side of an instance, and a straight line is determined from these two points; the line can be represented by its slope k and intercept b, for example, as (k, b). The number of midpoints of the multiple original straight lines that lie on this line is then determined; this can also be understood as the number of midpoints of the multiple original straight lines through which the line passes. The line passing through the largest number of such midpoints may be taken as the target straight line.
In this way, fitting the original straight lines on the same side of an instance yields a more accurate target straight line, and using the target straight lines to calibrate the extrinsic parameters of the binocular camera helps improve the accuracy of the calibration result.
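For illustration only, the RANSAC procedure described above can be sketched as follows, assuming the lines are not vertical so that the (k, b) form is valid; the tolerance, iteration count, and function name are illustrative assumptions:

```python
import random

def ransac_fit(boundary_points, midpoints, iters=200, tol=1.0, seed=0):
    """Repeatedly pick two boundary points, form the line y = k*x + b through
    them, and keep the line that passes within `tol` of the most midpoints."""
    rng = random.Random(seed)
    best_line, best_count = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(boundary_points, 2)
        if x1 == x2:  # skip degenerate vertical samples in this (k, b) sketch
            continue
        k = (y2 - y1) / (x2 - x1)
        b = y1 - k * x1
        count = sum(1 for (x, y) in midpoints if abs(k * x + b - y) <= tol)
        if count > best_count:
            best_line, best_count = (k, b), count
    return best_line, best_count
```

A boundary point that does not lie on the true edge (an outlier) rarely produces a high midpoint count, so the consensus step suppresses its influence on the fitted target line.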
Further, the line passing through the largest number of midpoints of the multiple original straight lines may be processed further to obtain the target straight line.
For ease of description, the line passing through the largest number of midpoints of the multiple original straight lines is referred to as the intermediate line.
Each row of the intermediate line is traversed, with the point of the intermediate line in that row taken as the center point. Among the pixels to the left and right of each row's center point, the pixel with the largest pixel gradient is determined. Taking the pixel with the largest gradient in each row as an original point, fitting is performed again by RANSAC to obtain the target straight line.
For example, taking the point of the intermediate line in each row as the center point, determine the point with the largest pixel gradient among the 5 pixels to the left and the 5 pixels to the right of the center point in that row, and take the pixel with the largest gradient in each row as an original point. Then randomly choose two of the original points, determine a line from these two points, and count the number of original points through which the line passes; the line passing through the largest number of original points may be taken as the target straight line.
The pixel gradient at a boundary is usually large, so searching for the pixel with the largest pixel gradient helps find a more accurate boundary and improves the accuracy of line extraction.
The instance information of the target straight line of an instance indicates the position of the target straight line within that instance, for example, whether the target straight line is on the left or right side of the instance.
It should be understood that the above uses only one instance as an example; straight lines can be extracted from other instances in the same way.
S2213: Match the target straight lines in the first image with the target straight lines in the second image, to obtain the correspondence between the m straight lines in the first image and the m straight lines in the second image.
That is, the correspondence between the target straight lines in the first image and the target straight lines in the second image is determined. Two corresponding straight lines are the projections of the same straight line in the shooting scene onto the first image and the second image.
In other words, the projections of the same straight line in the shooting scene onto the first image and the second image are matched.
It should be understood that in step S2213, all the target straight lines in the first image may be matched with all the target straight lines in the second image; alternatively, only some of the target straight lines in the first image may be matched with the target straight lines in the second image. This is not limited in the embodiments of the present application.
That is to say, in the embodiments of the present application, it is not necessary to determine the correspondence between every target straight line in the first image and every target straight line in the second image; it suffices to determine the correspondence between the m target straight lines in the first image and the m target straight lines in the second image.
The m straight lines in the first image and in the second image are the target straight lines that have a correspondence between the first image and the second image.
The correspondence between the target straight lines in the first image and the target straight lines in the second image is determined according to the instance information of the target straight lines.
Specifically, the correspondence between the target straight lines in the first image and the target straight lines in the second image is determined according to the correspondence between the instances in the first image and the second image and the positions of the target straight lines within the instances.
Target straight lines at the same position within corresponding instances of the first image and the second image are the corresponding straight lines of the first image and the second image.
For example, instance 1# in the first image corresponds to instance 1# in the second image. The instance information of straight line a in the first image indicates that line a is on the left side of instance 1# in the first image, and the instance information of straight line b in the second image indicates that line b is on the left side of instance 1# in the second image; then straight line a in the first image and straight line b in the second image are corresponding straight lines.
应理解,步骤S2211至步骤S2213仅为一种可能的直线匹配方式,还可以通过现有技术中的其他方式确定两张图像中的直线之间的对应关系,本申请实施例对此不做限定。It should be understood that steps S2211 to S2213 are only one possible line matching method, and other methods in the prior art can also be used to determine the correspondence between the lines in the two images, which is not limited in the embodiment of the present application .
本申请实施例中,通过两张图像中的实例之间的对应关系确定直线之间的对应关系,能够提高直线匹配的准确度,提高了外参标定的准确度,同时减少计算复杂度,提高了外参标定的效率。In the embodiment of the present application, the correspondence between the straight lines is determined through the correspondence between the instances in the two images, which can improve the accuracy of straight line matching, improve the accuracy of external parameter calibration, reduce computational complexity, and improve The efficiency of external reference calibration is improved.
类似地,图像中的直线与拍摄场景中的直线之间的对应关系可以通过拍摄场景中的直线之间的相对位置以及图像中的直线之间的相对位置确定,此处不再赘述。Similarly, the corresponding relationship between the straight lines in the image and the straight lines in the shooting scene can be determined by the relative positions between the straight lines in the shooting scene and the relative positions between the straight lines in the image, which will not be repeated here.
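The instance-based matching of step S2213 can be sketched as follows. This is an illustrative Python sketch, not part of the embodiment itself; it assumes each target straight line is keyed by its instance label together with its position (side) within the instance, so that lines sharing the same key in both images are projections of the same scene line:

```python
def match_lines_by_instance(left_lines, right_lines):
    """Match target lines across a stereo pair using instance information.

    Each entry maps an (instance_label, side) key, e.g. ("pillar_1", "left")
    meaning "the line on the left side of instance pillar_1", to the line's
    parameters in that image. Keys present in both images give line pairs.
    """
    matches = []
    for key, left_line in left_lines.items():
        right_line = right_lines.get(key)
        if right_line is not None:  # keep only lines visible in both images
            matches.append((left_line, right_line))
    return matches

# Hypothetical example: lines described by (slope, intercept) in pixel coordinates.
left = {("pillar_1", "left"): (0.02, 310.0), ("lane_1", "left"): (1.45, -120.0)}
right = {("pillar_1", "left"): (0.02, 285.0), ("lane_2", "left"): (1.50, -90.0)}
print(match_lines_by_instance(left, right))  # only pillar_1's left side matches
```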
FIG. 4 shows another method 400 for calibrating the extrinsic parameters of a binocular camera provided by an embodiment of the present application. The method 400 may be regarded as a specific implementation of the method 200. Therefore, for content not described in detail in the method 400, reference may be made to the method 200 above; for brevity, such descriptions are appropriately omitted below.
As mentioned above, the method for calibrating the extrinsic parameters of a binocular camera provided in the embodiments of the present application can be applied to a vehicle-mounted camera system. For example, the binocular camera in the embodiments of the present application is a vehicle-mounted camera, and the vehicle carrying the binocular camera may be stationary or moving. In the method 400, the calibration method is described by taking a vehicle-mounted binocular camera as an example, which does not limit the application scenarios of the embodiments of the present application.
After the vehicle enters the calibration site, the method 400 is executed to complete the calibration of the extrinsic parameters of the binocular camera, and the extrinsic parameters of the binocular camera are updated with the calibration values. This provides high-precision extrinsic parameters for upper-layer services, improving their accuracy and thus the performance of autonomous driving. FIG. 5 shows a schematic diagram of a calibration site. The calibration site in FIG. 5 is provided with lane lines and vertical poles, either of which can serve as calibration objects for the binocular camera.
The method 400 includes steps S410 to S440, which are described below.
S410. Acquire binocular images.
Specifically, at least one frame of binocular images captured by the binocular camera is acquired. The left-eye image in a binocular image is captured by the left camera of the binocular camera, and the right-eye image is captured by the right camera of the binocular camera.
Exemplarily, the at least one frame of binocular images may be multiple frames of binocular images captured while the vehicle drives through the calibration site.
The vehicle can acquire multiple frames of binocular images after driving only a short distance, completing the calibration of the extrinsic parameters of the binocular camera.
Step S410 corresponds to step S210; for a detailed description, refer to step S210, which is not repeated here.
S420. Extract m straight lines from each of the left-eye image and the right-eye image, where m is an integer greater than 1.
There is a correspondence between the m straight lines of the left-eye image and the m straight lines of the right-eye image.
Exemplarily, step S420 includes steps S421 to S425.
S421. Perform semantic segmentation on the binocular images to obtain semantic segmentation results of the binocular images.
Specifically, semantic segmentation is performed on the left-eye image and the right-eye image respectively, to obtain a semantic segmentation result of the left-eye image and a semantic segmentation result of the right-eye image.
Exemplarily, a semantic segmentation network is used to process the left-eye image and the right-eye image respectively, to obtain the semantic segmentation result of each image.
For example, the output of the semantic segmentation network includes two semantic classes: lane line or vertical object. That is to say, the semantic segmentation network can determine whether the pixels of an object in the image belong to a lane line or a vertical object. (a) of FIG. 6 shows the semantic segmentation result of the left-eye image, and (b) of FIG. 6 shows the semantic segmentation result of the right-eye image. As shown in FIG. 6, the vertical objects and lane lines in the left-eye image and the right-eye image are distinguished through semantic segmentation.
Step S421 corresponds to step 11) in step S2211; for details, refer to the foregoing description, which is not repeated here.
S422. Perform instance labeling according to the semantic segmentation results.
Specifically, instance labeling is performed on the left-eye image according to the semantic segmentation result of the left-eye image, to obtain the labeling information of the instances in the left-eye image; likewise, instance labeling is performed on the right-eye image according to the semantic segmentation result of the right-eye image, to obtain the labeling information of the instances in the right-eye image.
Specifically, instance segmentation is performed on the first image according to the semantic segmentation result of the first image, to obtain the instances in the first image; the instances in the first image are then labeled according to their positions in the first image, to obtain the labeling information of the instances.
Instance segmentation is performed on the second image according to the semantic segmentation result of the second image, to obtain the instances in the second image; the instances in the second image are then labeled according to their positions in the second image, to obtain the labeling information of the instances.
(a) of FIG. 7 shows the labeling information of the instances in the left-eye image, and (b) of FIG. 7 shows the labeling information of the instances in the right-eye image. As shown in FIG. 7, the labeling information of the instances includes: left pillar 1, left pillar 2, right pillar 1, right pillar 2, lane line 1, lane line 2, lane line 3, lane line 4, and lane line 5. Corresponding instances in the left-eye image and the right-eye image have the same labeling information. That is to say, the labeling information of the instances in the left-eye image and the labeling information of the instances in the right-eye image indicate the correspondence between the instances in the left-eye image and the instances in the right-eye image.
As shown in FIG. 7, only some of the instances are labeled in step S422; in practical applications, more or fewer instances may be labeled as required.
Step S422 corresponds to step 12) in step S2211; for details, refer to the foregoing description, which is not repeated here.
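The position-based labeling of step S422 can be illustrated with a minimal Python sketch. The label strings and the centroid-based ordering below are assumptions for illustration only; the embodiment does not prescribe a specific labeling rule, only that corresponding instances receive the same label in both images:

```python
def label_instances(pillar_xs, lane_xs, image_width):
    """Label instances by their horizontal position in the image.

    pillar_xs / lane_xs: centroid x-coordinates of detected vertical-object
    and lane-line instances. Returns {label: x}. Because labels depend only
    on relative position, the same scene object receives the same label in
    the left-eye and right-eye images, enabling instance matching.
    """
    labels = {}
    mid = image_width / 2
    # Pillars left of the image centre are numbered outward from the centre.
    left = sorted([x for x in pillar_xs if x < mid], reverse=True)
    right = sorted([x for x in pillar_xs if x >= mid])
    for i, x in enumerate(left, 1):
        labels[f"left pillar {i}"] = x
    for i, x in enumerate(right, 1):
        labels[f"right pillar {i}"] = x
    # Lane lines are simply numbered left to right.
    for i, x in enumerate(sorted(lane_xs), 1):
        labels[f"lane line {i}"] = x
    return labels

print(label_instances([100, 300, 900, 1100], [400, 640, 880], 1280))
```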
S423. Extract multiple original straight lines from the instances of the binocular images.
Specifically, multiple original straight lines are extracted from the instances of the left-eye image and from the instances of the right-eye image respectively.
In step S423, line extraction may be performed on the instances labeled in step S422, that is, on the instances having labeling information.
Step S423 corresponds to step 21) in step S2212; for details, refer to the foregoing description, which is not repeated here.
S424. Perform line fitting on the multiple original straight lines to obtain target straight lines.
Specifically, the multiple original straight lines on the same edge of an instance in the left-eye image are fitted into a single target straight line for that edge of the instance in the left-eye image. Likewise, the multiple original straight lines on the same edge of an instance in the right-eye image are fitted into a single target straight line for that edge of the instance in the right-eye image.
(a) of FIG. 8 shows the target straight lines in the left-eye image, and (b) of FIG. 8 shows the target straight lines in the right-eye image.
As shown in FIG. 8, for an instance, straight lines may be extracted from the edges on both sides of the instance, or from the edge on only one side; this is not limited in the present application.
Step S424 corresponds to step 22) in step S2212; for details, refer to the foregoing description, which is not repeated here.
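The fitting in step S424 merges several detected fragments of the same instance edge into one target line. The embodiment does not specify a fitting method; an ordinary least-squares fit over the fragments' endpoints, as sketched below, is one common choice (a production fitter would additionally handle near-vertical edges, such as poles, by fitting x as a function of y):

```python
def fit_target_line(segments):
    """Fit one target line through the endpoints of several raw segments.

    segments: list of ((x1, y1), (x2, y2)) line segments detected on the
    same instance edge. Returns (a, b) of the least-squares line y = a*x + b.
    """
    pts = [p for seg in segments for p in seg]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    # Normal equations of ordinary least squares.
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Two collinear fragments of the edge y = 2x + 1:
print(fit_target_line([((0, 1), (1, 3)), ((2, 5), (3, 7))]))  # → (2.0, 1.0)
```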
S425. Match the target straight lines in the left-eye image with the target straight lines in the right-eye image, to obtain the correspondence between the m straight lines in the left-eye image and the m straight lines in the right-eye image.
In other words, the projections of m straight lines in the shooting scene onto the left-eye image and the right-eye image are obtained.
For example, as shown in FIG. 8, m may be 12; that is, the correspondence between the 12 straight lines in (a) of FIG. 8 and the 12 straight lines in (b) of FIG. 8 is obtained.
Step S425 corresponds to step S2213; for details, refer to the foregoing description, which is not repeated here.
S430. Reconstruct the n straight lines in the left-eye image and the n straight lines in the right-eye image into three-dimensional space based on the extrinsic parameters of the binocular camera, to obtain n reconstructed straight lines. The n straight lines in the left-eye image and the n straight lines in the right-eye image are projections of n straight lines in the shooting scene.
For example, as shown in FIG. 9, n is 12. The straight lines in the left-eye image and the right-eye image in FIG. 8 are reconstructed into space, yielding the spatial positions of the 12 reconstructed straight lines.
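For orientation, the reconstruction of step S430 can be sketched under the simplest possible assumptions: an ideal rectified stereo pair with focal length f, principal point (cx, cy), and a known baseline, where a matched pixel pair yields a 3-D point from its disparity. This is an illustrative model only; the embodiment's actual reconstruction depends on the camera model and the current extrinsic parameters being calibrated:

```python
def triangulate_point(u_l, v_l, u_r, f, cx, cy, baseline):
    """Back-project one matched pixel pair from a rectified stereo pair.

    Assumes an ideal rectified binocular camera: the match lies on the same
    scanline, so the right-image pixel is (u_r, v_l). Units: f in pixels,
    baseline in metres; returns (x, y, z) in metres.
    """
    d = u_l - u_r            # disparity in pixels
    z = f * baseline / d     # depth
    x = (u_l - cx) * z / f
    y = (v_l - cy) * z / f
    return x, y, z

def triangulate_line(p1_l, p1_r, p2_l, p2_r, f, cx, cy, baseline):
    """A reconstructed straight line, represented by two triangulated endpoints."""
    a = triangulate_point(p1_l[0], p1_l[1], p1_r[0], f, cx, cy, baseline)
    b = triangulate_point(p2_l[0], p2_l[1], p2_r[0], f, cx, cy, baseline)
    return a, b

# f = 700 px, principal point (640, 360), baseline 0.12 m, disparity 70 px:
print(triangulate_point(710, 360, 640, 700.0, 640.0, 360.0, 0.12))
```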
S440. Adjust the extrinsic parameters of the binocular camera according to the reconstruction error.
The reconstruction error is determined according to the positional relationship among the n reconstructed straight lines and the positional relationship among the n straight lines in the shooting scene.
Exemplarily, the reconstruction error is determined according to the positional relationship between two of the n reconstructed straight lines and the positional relationship between the corresponding two of the n straight lines in the shooting scene.
As mentioned above, the reconstruction error includes at least one of the following: an angle error or a distance error.
The reconstruction error is described below by taking the reconstructed horizontal lines l1 and l4 as an example.
The reconstruction error satisfies the following formula:
f_i(yaw, pitch, roll) = ∠(l1, l4) + |dis(l1, l4) − d1|
where f_i(yaw, pitch, roll) denotes the reconstruction error of the i-th frame of binocular images obtained based on the extrinsic parameters of the binocular camera, and yaw, pitch, roll are the Euler-angle representation of the extrinsic parameters of the binocular camera. ∠(l1, l4) is the angle between the reconstructed horizontal line l1 and the reconstructed horizontal line l4, and is used to calculate the angle error between them (the two lines are parallel in the scene, so this angle should ideally be zero). |dis(l1, l4) − d1| is used to calculate the distance error between the reconstructed l1 and l4, where dis() computes the distance between two reconstructed straight lines (for example, dis(l1, l4) is the distance between the reconstructed l1 and l4), and d1 denotes the distance between the horizontal lines l1 and l4 in the shooting scene, that is, their actual distance.
It should be understood that the above formula describes the reconstruction error only by taking the horizontal lines l1 and l4 in the shooting scene as an example. The reconstruction error may also be calculated from other parallel lines in the shooting scene, for example, from the vertical lines L1 and L5. The same formula applies to the reconstruction error between other parallel lines, as long as the straight lines in the formula are replaced with the corresponding ones.
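A minimal Python sketch of this per-pair error term follows, representing each reconstructed line by a point and a direction vector. The representation and the point-to-line distance used for dis() are illustrative assumptions (the distance below is valid when the two reconstructed lines are nearly parallel, as the formula expects):

```python
import math

def angle_between(d1, d2):
    """Angle (radians) between the direction vectors of two reconstructed lines."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    c = max(-1.0, min(1.0, abs(dot) / (n1 * n2)))  # orientation-agnostic
    return math.acos(c)

def parallel_line_distance(p1, d, p2):
    """Distance from point p2 to the line through p1 with direction d."""
    v = [b - a for a, b in zip(p1, p2)]
    dn2 = sum(a * a for a in d)
    t = sum(a * b for a, b in zip(v, d)) / dn2
    perp = [a - t * b for a, b in zip(v, d)]  # component of v orthogonal to d
    return math.sqrt(sum(a * a for a in perp))

def pair_error(line1, line2, expected_distance):
    """f_i contribution of one pair of reconstructed parallel lines:
    angle error plus a |dis(l1, l4) - d1|-style distance error."""
    (p1, d1), (p2, d2) = line1, line2
    return angle_between(d1, d2) + abs(
        parallel_line_distance(p1, d1, p2) - expected_distance
    )

# Two perfectly reconstructed parallel lines 3.5 m apart give zero error:
l1 = ((0.0, 0.0, 5.0), (1.0, 0.0, 0.0))
l4 = ((0.0, 3.5, 5.0), (1.0, 0.0, 0.0))
print(pair_error(l1, l4, 3.5))  # → 0.0
```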
Further, the reconstruction error may also be determined according to the positional relationship among q of the n reconstructed straight lines and the positional relationship among the corresponding q of the n straight lines in the shooting scene, where q is an integer greater than 2 and less than or equal to n.
The reconstruction error is described below by taking the 12 straight lines in FIG. 9 as an example, that is, q is 12.
The reconstruction error satisfies the following formula:
f_i(yaw, pitch, roll) = Σ_{j=2..4} ∠(lj, l1) + Σ_{k=2..8} ∠(Lk, L1) + |∠(l1, L1) − 90°| + |dis(l1, l4) − d1| + |dis(L1, L3) − d2|
where the first term is used to calculate the angle error among the reconstructed horizontal lines. Specifically, in the above formula, the angle error among the 4 reconstructed horizontal lines is the sum of the angle errors between each of the reconstructed l2, l3, and l4 and the reconstructed l1. It should be understood that the first term in the above formula is only an example; the angle error among the reconstructed horizontal lines may also be calculated in other ways. For example, it may be the average of the angle errors between each reconstructed horizontal line and the reconstructed l1, or the sum of the angle errors between each reconstructed horizontal line and the reconstructed l2. The present application does not limit the specific calculation method of the angle error among the reconstructed horizontal lines, as long as this angle error constrains the parallel relationship among the reconstructed horizontal lines.
The second term is used to calculate the angle error among the reconstructed vertical lines. Specifically, in the above formula, the angle error among the 8 reconstructed vertical lines is the sum of the angle errors between each of the reconstructed vertical lines L2 to L8 and the reconstructed L1. It should be understood that the second term in the above formula is only an example; the angle error among the reconstructed vertical lines may also be calculated in other ways. For example, it may be the average of the angle errors between each reconstructed vertical line and the reconstructed L1, or the sum of the angle errors between each reconstructed vertical line and the reconstructed L2. The present application does not limit the specific calculation method of the angle error among the reconstructed vertical lines, as long as this angle error constrains the parallel relationship among the reconstructed vertical lines.
The third term is used to calculate the angle error between a reconstructed vertical line and a reconstructed horizontal line. Specifically, in the above formula, this angle error is the angle error between the reconstructed horizontal line l1 and the reconstructed vertical line L1. It should be understood that the third term in the above formula is only an example; this angle error may also be calculated in other ways. For example, it may be the angle error between another pair of reconstructed vertical and horizontal lines, the sum of the angle errors between each reconstructed vertical line and each reconstructed horizontal line, or the average of those angle errors. The present application does not limit the specific calculation method of the angle error between the reconstructed vertical and horizontal lines, as long as this angle error constrains the perpendicular relationship between them.
The fourth term is used to calculate the distance error between the reconstructed horizontal lines. Specifically, in the above formula, this distance error is the distance error between the reconstructed horizontal lines l1 and l4. It should be understood that the fourth term in the above formula is only an example; the distance error between the reconstructed horizontal lines may also be calculated in other ways. For example, it may be the distance error between another pair of reconstructed horizontal lines, the sum of the distance errors between the reconstructed horizontal lines, or the average of those distance errors. The present application does not limit the specific calculation method of the distance error between the reconstructed horizontal lines, as long as this distance error constrains the distance between the reconstructed horizontal lines.
The fifth term is used to calculate the distance error between the reconstructed vertical lines. Specifically, in the above formula, this distance error is the distance error between the reconstructed vertical lines L1 and L3, where d2 denotes the distance between L1 and L3 in the shooting scene. It should be understood that the fifth term in the above formula is only an example; the distance error between the reconstructed vertical lines may also be calculated in other ways. For example, it may be the distance error between another pair of reconstructed vertical lines, the sum of the distance errors between the reconstructed vertical lines, or the average of those distance errors. The present application does not limit the specific calculation method of the distance error between the reconstructed vertical lines, as long as this distance error constrains the distance between the reconstructed vertical lines.
In step S440, the extrinsic parameters of the binocular camera may be adjusted according to the reconstruction error of one or more frames of binocular images.
In the embodiments of the present application, the reconstruction errors of multiple frames of binocular images are accumulated, and the extrinsic parameters of the binocular camera are adjusted according to the accumulated reconstruction error. This reduces the influence of line-extraction accuracy on the calibration result and improves the accuracy of the calibration result.
That is to say, the extrinsic parameters of the binocular camera are adjusted with the goal of reducing the reconstruction error of the multiple frames of binocular images.
For example, the extrinsic parameters that minimize the reconstruction error of the multiple frames of binocular images are used as the calibration values of the extrinsic parameters of the binocular camera.
Exemplarily, minimizing the reconstruction error of the multiple frames of binocular images may mean minimizing the sum of the reconstruction errors of the frames.
For example, the reconstruction error of multiple frames of binocular images satisfies the following formula:
F(yaw, pitch, roll) = Σ_i f_i(yaw, pitch, roll)
where F(yaw, pitch, roll) denotes the reconstruction error of the multiple frames of binocular images.
Alternatively, minimizing the reconstruction error of the multiple frames of binocular images may mean minimizing the average of the reconstruction errors of the frames.
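The accumulation and minimization of F(yaw, pitch, roll) can be sketched as follows. The naive coordinate descent below is for illustration only; a practical implementation would typically use a nonlinear least-squares solver such as Levenberg-Marquardt. The per-frame error functions in the example are hypothetical quadratics standing in for the f_i of the embodiment:

```python
def total_error(params, frame_errors):
    """F(yaw, pitch, roll) = sum over frames i of f_i(yaw, pitch, roll)."""
    return sum(f(params) for f in frame_errors)

def calibrate(frame_errors, init=(0.0, 0.0, 0.0), step=0.01, iters=200):
    """Minimize the accumulated error by naive coordinate descent
    over the three Euler angles, halving the step when stuck."""
    params = list(init)
    best = total_error(params, frame_errors)
    for _ in range(iters):
        improved = False
        for k in range(3):
            for delta in (step, -step):
                trial = list(params)
                trial[k] += delta
                e = total_error(trial, frame_errors)
                if e < best:
                    params, best, improved = trial, e, True
        if not improved:
            step /= 2.0
    return tuple(params), best

# Hypothetical per-frame errors with a minimum at yaw=0.1, pitch=-0.05, roll=0.0:
frames = [lambda p: (p[0] - 0.1) ** 2 + (p[1] + 0.05) ** 2 + p[2] ** 2
          for _ in range(3)]
params, err = calibrate(frames)
print(params, err)
```

The returned params are the calibration values in the sense of the embodiment: the extrinsic parameters that minimize the accumulated reconstruction error.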
After the calibration values of the extrinsic parameters of the binocular camera are obtained, the extrinsic parameters of the binocular camera in the system can be updated, for example, provided to other functional modules of the autonomous driving system.
S450. Control the vehicle-mounted display to show the calibration status of the extrinsic parameters of the binocular camera.
Exemplarily, the calibration status of the extrinsic parameters of the binocular camera includes the current calibration progress, the current reconstruction error, or p reconstructed straight lines.
FIG. 10 shows a schematic diagram of the calibration status of the extrinsic parameters of a binocular camera. As shown in FIG. 10, the calibration status includes the current calibration progress, the current reconstruction error, or the 12 reconstructed straight lines.
The current calibration progress includes the current degree of completion of the calibration and the current extrinsic parameters of the binocular camera.
It should be understood that representing the extrinsic parameters in FIG. 10 in the form of Euler angles is only an example; the extrinsic parameters of the binocular camera may also be represented in other forms.
The current reconstruction error includes the current distance error and the current angle error. In FIG. 10, the four reconstructed straight lines L1, L5, l1, and l4 are selected during the calibration process to calculate the reconstruction error. For example, as shown in FIG. 10, the angle errors include: the angle error between L1 and L5, the angle error between l1 and l4, and the angle error between L1 and l1. The distance errors include: the distance error between L1 and L5.
It should be understood that the reconstruction error shown in FIG. 10 is only an example. The current reconstruction error status may also include the current reconstruction error itself, that is, the reconstruction error calculated from the current distance error and the current angle error, for example, the value of f_i(yaw, pitch, roll). Alternatively, other straight lines may be used to calculate the reconstruction error, or other angle errors or distance errors may be used; this is not limited in the embodiments of the present application.
(a) and (b) of FIG. 10 show two calibration states. In (a) of FIG. 10, the calibration is 25% complete; the reconstruction error is relatively large, and the 12 straight lines reconstructed based on the current extrinsic parameters are noticeably distorted and do not match the shooting scene in the real world. In (b) of FIG. 10, the calibration is 80% complete; the reconstruction error is relatively small, and the 12 straight lines reconstructed based on the current extrinsic parameters better match the shooting scene in the real world.
本申请实施例可以适用于摄像头动态标定,也可以适用于摄像头静态标定。The embodiment of the present application may be applicable to dynamic calibration of a camera, and may also be applicable to static calibration of a camera.
例如,本申请实施例中的摄像头为车载摄像头,且摄像头承载于的车辆处于移动状态。For example, the camera in the embodiment of the present application is a vehicle-mounted camera, and the vehicle on which the camera is carried is in a moving state.
在本申请提供的方案中,标定物可以为任何类型的道路特征物,并没有严格限制其为车道线。例如,本申请提供的方案中的标定参照物可以为下列车道特征物中的任一种:车道线、标识牌、杆类物体、路面标识、交通灯。其中,标识牌例如交通标识牌或杆牌,杆类物体例如为路灯杆等。 In the solution provided by this application, the calibration object may be any type of road feature, and is not strictly limited to a lane line. For example, the calibration reference object in the solution provided by this application may be any one of the following road features: a lane line, a sign, a pole-like object, a road marking, or a traffic light. A sign is, for example, a traffic sign or a pole-mounted sign, and a pole-like object is, for example, a street lamp pole.
此外,本申请提供的方案既可以适用于摄像头动态标定,也可以适用于摄像头静态标定。并且,本申请实施例可以利用开放道路中的元素作为标定物。因此,本申请提供的方案具有很好的通用性。 In addition, the solution provided by this application is applicable to both dynamic camera calibration and static camera calibration. Moreover, the embodiments of this application can use elements on an open road as calibration objects. Therefore, the solution provided by this application has good versatility.
应理解,本申请提供的方案可以应用于自动驾驶车辆装配下线的摄像头参数标定环节,不必限于固定的标定车间。 It should be understood that the solution provided by this application can be applied to the camera parameter calibration step when an autonomous vehicle rolls off the assembly line, and is not limited to a fixed calibration workshop.
还应理解,本申请提供的方案也可以应用于车辆出厂后的初始标定场景,以及在使用过程中导致外参变化、需要实时在线修正或定期校准的场景。 It should also be understood that the solution provided by this application can also be applied to an initial calibration scenario after the vehicle leaves the factory, and to scenarios in which the extrinsic parameters change during use and require real-time online correction or periodic calibration.
例如,双目相机的外参的初始标定值可能是物体装配过程中人工测量得到的,在车辆出厂后可以利用本申请实施例的方案调整双目相机的外参,并更新系统中的双目相机的外参,为上层业务提供高精度的外参,以提高上层业务的准确性,进而提高了驾驶的性能。 For example, the initial calibration values of the extrinsic parameters of the binocular camera may be obtained by manual measurement during assembly. After the vehicle leaves the factory, the solution of the embodiments of this application can be used to adjust the extrinsic parameters of the binocular camera and update them in the system, providing high-precision extrinsic parameters for upper-layer services. This improves the accuracy of the upper-layer services and, in turn, the driving performance.
还应理解,本申请提供的方案可以极大地降低对特定的标定场地的依赖,实现随时随地(即实时在线)的对车载摄像头外参的高精度标定。 It should also be understood that the solution provided by this application can greatly reduce the dependence on a specific calibration site, enabling high-precision calibration of the extrinsic parameters of the vehicle-mounted camera anytime and anywhere (that is, online in real time).
本文中描述的各个实施例可以为独立的方案,也可以根据内在逻辑进行组合,这些方案都落入本申请的保护范围中。The various embodiments described herein may be independent solutions, or may be combined according to internal logic, and these solutions all fall within the protection scope of the present application.
上文描述了本申请提供的方法实施例,下文将描述本申请提供的装置实施例。应理解,装置实施例的描述与方法实施例的描述相互对应,因此,未详细描述的内容可以参见上文方法实施例,为了简洁,这里不再赘述。The method embodiments provided by the present application are described above, and the device embodiments provided by the present application will be described below. It should be understood that the descriptions of the device embodiments correspond to the descriptions of the method embodiments. Therefore, for details that are not described in detail, reference may be made to the method embodiments above. For brevity, details are not repeated here.
图11为本申请实施例提供的双目相机外参标定的装置600。装置600包括获取单元610与处理单元620。FIG. 11 is a device 600 for calibrating extrinsic parameters of a binocular camera provided by an embodiment of the present application. The device 600 includes an acquisition unit 610 and a processing unit 620 .
获取单元610,用于获取第一图像和第二图像,第一图像是由双目相机中的第一摄像头对拍摄场景拍摄得到的,第二图像是由双目相机中的第二摄像头对拍摄场景拍摄得到的。 The acquisition unit 610 is configured to acquire a first image and a second image, where the first image is obtained by a first camera in the binocular camera photographing a shooting scene, and the second image is obtained by a second camera in the binocular camera photographing the shooting scene.
处理单元620,用于分别在第一图像和第二图像中提取m条直线,m为大于1的整数,第一图像的m条直线和第二图像的m条直线之间具有对应关系;基于双目相机的外参将第一图像的m条直线中的n条直线和第二图像的m条直线中的n条直线重构到三维空间中,得到重构后的n条直线,第一图像的n条直线和第二图像的n条直线为拍摄场景中的n条直线的投影,1<n≤m,n为整数;根据重构误差调整双目相机的外参,重构误差是根据重构后的n条直线之间的位置关系和拍摄场景中的n条直线之间的位置关系确定的。 The processing unit 620 is configured to: extract m straight lines from each of the first image and the second image, where m is an integer greater than 1, and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image; reconstruct, based on the extrinsic parameters of the binocular camera, n of the m straight lines of the first image and n of the m straight lines of the second image into three-dimensional space to obtain n reconstructed straight lines, where the n straight lines of the first image and the n straight lines of the second image are projections of n straight lines in the shooting scene, 1<n≤m, and n is an integer; and adjust the extrinsic parameters of the binocular camera according to a reconstruction error, where the reconstruction error is determined according to the positional relationship among the n reconstructed straight lines and the positional relationship among the n straight lines in the shooting scene.
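The extract–reconstruct–adjust loop performed by the processing unit can be sketched as follows. The DLT triangulation of line endpoints, the Euler-angle parameterization of the rotation, and the use of known inter-line angles as the error term are illustrative assumptions, not the application's prescribed implementation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one pixel correspondence into 3D."""
    A = np.stack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def reconstruction_error(angles, K, t, pairs, expected_angles):
    """f(yaw, pitch, roll): reconstruct each matched line from its two
    endpoint correspondences, then penalize deviation from the known
    real-world angles between the lines in the shooting scene."""
    R = Rotation.from_euler('zyx', angles, degrees=True).as_matrix()
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera
    P2 = K @ np.hstack([R, t.reshape(3, 1)])            # second camera
    dirs = []
    for (a1, b1), (a2, b2) in pairs:   # endpoints in first/second image
        A = triangulate(P1, P2, a1, a2)
        B = triangulate(P1, P2, b1, b2)
        dirs.append((B - A) / np.linalg.norm(B - A))
    err = 0.0
    for (i, j), exp_deg in expected_angles.items():
        c = np.clip(abs(dirs[i] @ dirs[j]), 0.0, 1.0)
        err += (np.degrees(np.arccos(c)) - exp_deg) ** 2
    return err

# the extrinsic rotation could then be refined numerically, e.g.:
# from scipy.optimize import minimize
# res = minimize(reconstruction_error, x0=[0.0, 0.0, 0.0],
#                args=(K, t, pairs, expected_angles))
```

Here `pairs` holds the matched line endpoints in the two images and `expected_angles` maps line-index pairs to their known angles in the scene (0° for parallel lane lines); minimizing `reconstruction_error` over (yaw, pitch, roll) corresponds to adjusting the extrinsic parameters according to the reconstruction error.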
可选地,作为一个实施例,重构误差包括以下至少一项:重构后的n条直线之间的角度误差或重构后的n条直线之间的距离误差,Optionally, in an embodiment, the reconstruction error includes at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines, where:
重构后的n条直线之间的角度误差是根据重构后的n条直线中的至少两条直线之间的角度与拍摄场景中的n条直线中的至少两条直线之间的角度之间的差值确定的;The angle error between the n reconstructed straight lines is determined according to the difference between the angle between at least two of the n reconstructed straight lines and the angle between at least two of the n straight lines in the shooting scene; and
重构后的n条直线之间的距离误差是根据重构后的n条直线中的至少两条直线之间的距离与拍摄场景中的n条直线中的至少两条直线之间的距离之间的差值确定的。The distance error between the n reconstructed straight lines is determined according to the difference between the distance between at least two of the n reconstructed straight lines and the distance between at least two of the n straight lines in the shooting scene.
可选地,作为一个实施例,拍摄场景中的至少两条直线包括至少两条相互平行的直线。Optionally, as an embodiment, the at least two straight lines in the shooting scene include at least two parallel straight lines.
可选地,作为一个实施例,处理单元620具体用于:Optionally, as an embodiment, the processing unit 620 is specifically configured to:
对第一图像和第二图像分别进行实例分割,得到第一图像中的实例和第二图像中的实例;performing instance segmentation on the first image and the second image respectively, to obtain an instance in the first image and an instance in the second image;
分别在第一图像中的实例和第二图像中的实例中提取m条直线,第一图像中的m条直线和第二图像中的m条直线之间的对应关系是根据第一图像中的实例和第二图像中的实例之间的对应关系确定的。 The m straight lines are extracted from the instances in the first image and the instances in the second image respectively, and the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to the correspondence between the instances in the first image and the instances in the second image.
可选地,作为一个实施例,处理单元620具体用于:Optionally, as an embodiment, the processing unit 620 is specifically configured to:
对第一图像和第二图像分别进行语义分割,得到第一图像的语义分割结果和第二图像的语义分割结果,第一图像的语义分割结果包括第一图像中的水平物或竖直物,第二图像的语义分割结果包括第二图像中的水平物或竖直物;Semantic segmentation is performed on the first image and the second image respectively to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image, where the semantic segmentation result of the first image includes a horizontal object or a vertical object in the first image, and the semantic segmentation result of the second image includes a horizontal object or a vertical object in the second image;
基于第一图像的语义分割结果对第一图像进行实例分割,得到第一图像中的实例,基于第二图像的语义分割结果对第二图像进行实例分割,得到第二图像中的实例。Instance segmentation is performed on the first image based on the semantic segmentation result of the first image to obtain instances in the first image, and instance segmentation is performed on the second image based on the semantic segmentation results of the second image to obtain instances in the second image.
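One plausible way to realize the step from segmented instances to straight lines is to fit a line to the pixels of each instance mask by principal component analysis; the sketch below illustrates this under our own assumptions (the function name and the boolean-mask representation are not specified by this application):

```python
import numpy as np

def fit_line_to_instance(mask):
    """Fit a 2D image line (centroid + unit direction) to one instance mask.
    The dominant PCA axis of the instance's pixel cloud gives the line
    direction; the pixel centroid gives a point on the line."""
    ys, xs = np.nonzero(mask)                    # pixel coordinates of the instance
    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid)     # PCA via SVD of centered points
    direction = Vt[0]                            # dominant axis
    return centroid, direction / np.linalg.norm(direction)
```

Running this for every instance in the first image and the second image yields the m straight lines of each image; because the instances themselves are matched across the two images, the extracted lines inherit that correspondence.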
可选地,作为一个实施例,装置还包括:显示单元,用于显示双目相机外参的标定情况。Optionally, as an embodiment, the device further includes: a display unit, configured to display the calibration situation of the extrinsic parameters of the binocular camera.
可选地,作为一个实施例,双目相机外参的标定情况包括以下至少一项:当前的标定进度、当前的重构误差的情况或重构后的p条直线,重构后的p条直线是基于当前的双目相机的外参将第一图像的m条直线中的p条直线和第二图像的m条直线中的p条直线重构到三维空间中得到的,1<p≤m,p为整数。 Optionally, in an embodiment, the calibration situation of the extrinsic parameters of the binocular camera includes at least one of the following: the current calibration progress, the current reconstruction error situation, or p reconstructed straight lines, where the p reconstructed straight lines are obtained by reconstructing, based on the current extrinsic parameters of the binocular camera, p of the m straight lines of the first image and p of the m straight lines of the second image into three-dimensional space, 1<p≤m, and p is an integer.
可选地,作为一个实施例,当前的标定进度包括以下至少一项:Optionally, as an embodiment, the current calibration progress includes at least one of the following:
当前的双目相机的外参或当前的标定完成度。The extrinsic parameters of the current binocular camera or the current calibration completion.
可选地,作为一个实施例,当前的重构误差的情况包括以下至少一项:Optionally, as an embodiment, the current reconstruction error situation includes at least one of the following:
当前的重构误差、当前的距离误差或当前的角度误差。Current reconstruction error, current range error, or current angle error.
可选地,双目相机为车载摄像头,双目相机承载于的车辆可以处于静止状态,也可以处于移动状态。Optionally, the binocular camera is a vehicle-mounted camera, and the vehicle on which the binocular camera is carried may be in a stationary state or in a moving state.
如图12所示,该装置3000可以包括至少一个处理器3002和通信接口3003。As shown in FIG. 12 , the apparatus 3000 may include at least one processor 3002 and a communication interface 3003 .
可选地,该装置3000还可以包括存储器3001和总线3004中的至少一项。其中,存储器3001、处理器3002和通信接口3003中的任意两项之间或全部三项之间均可以通过总线3004实现彼此之间的通信连接。Optionally, the apparatus 3000 may further include at least one of a memory 3001 and a bus 3004 . Among them, any two or all three of the memory 3001 , the processor 3002 and the communication interface 3003 can be connected to each other through the bus 3004 .
可选地,存储器3001可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器3001可以存储程序,当存储器3001中存储的程序被处理器3002执行时,处理器3002和通信接口3003用于执行本申请实施例的双目相机外参标定的方法的各个步骤。也就是说,处理器3002可以通过通信接口3003从存储器3001获取存储的指令,以执行本申请实施例的双目相机外参标定的方法的各个步骤。Optionally, the memory 3001 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM). The memory 3001 can store a program. When the program stored in the memory 3001 is executed by the processor 3002, the processor 3002 and the communication interface 3003 are used to execute each step of the method for calibrating the external parameters of the binocular camera according to the embodiment of the present application. That is to say, the processor 3002 can acquire stored instructions from the memory 3001 through the communication interface 3003, so as to execute each step of the method for calibrating the external parameters of the binocular camera according to the embodiment of the present application.
可选地,存储器3001可以实现上述存储程序的功能。可选地,处理器3002可以采用通用的CPU、微处理器、ASIC、图形处理器(graphic processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的装置中的处理单元所需执行的功能,或者执行本申请实施例的双目相机外参标定的方法的各个步骤。 Optionally, the memory 3001 may implement the foregoing function of storing a program. Optionally, the processor 3002 may be a general-purpose CPU, a microprocessor, an ASIC, a graphics processing unit (GPU), or one or more integrated circuits, configured to execute a related program to implement the functions that need to be performed by the processing unit in the apparatus of the embodiments of this application, or to perform the steps of the method for calibrating the extrinsic parameters of the binocular camera in the embodiments of this application.
可选地,处理器3002可以实现上述执行相关程序的功能。Optionally, the processor 3002 may implement the above-mentioned function of executing related programs.
可选地,处理器3002还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的控制方法的各个步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。Optionally, the processor 3002 may also be an integrated circuit chip, which has a signal processing capability. During implementation, each step of the control method in the embodiment of the present application may be completed by an integrated logic circuit of hardware in a processor or instructions in the form of software.
可选地,上述处理器3002还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成本申请实施例的标定装置中包括的单元所需执行的功能,或者执行本申请实施例的双目相机外参标定的方法的各个步骤。 Optionally, the processor 3002 may alternatively be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a mature storage medium in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, implements the functions that need to be performed by the units included in the calibration apparatus of the embodiments of this application, or performs the steps of the method for calibrating the extrinsic parameters of the binocular camera in the embodiments of this application.
可选地,通信接口3003可以使用例如但不限于收发器一类的收发装置,来实现装置与其他设备或通信网络之间的通信,例如,通信接口3003可以用于获取双目图像。该通信接口3003例如还可以是接口电路。Optionally, the communication interface 3003 can use a transceiver device such as but not limited to a transceiver to implement communication between the device and other devices or communication networks, for example, the communication interface 3003 can be used to acquire binocular images. The communication interface 3003 may also be an interface circuit, for example.
总线3004可包括在装置各个部件(例如,存储器、处理器、通信接口)之间传送信息的通路。Bus 3004 may include pathways for transferring information between various components of the device (eg, memory, processor, communication interface).
本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得该计算机实现上述方法实施例中的方法。The embodiments of the present application also provide a computer program product including instructions, which, when executed by a computer, enable the computer to implement the methods in the above method embodiments.
本申请实施例还提供一种终端,该终端包括上述任意一种标定装置,例如图11或图12所示装置等。An embodiment of the present application also provides a terminal, which includes any of the above-mentioned calibration devices, such as the device shown in FIG. 11 or FIG. 12 .
示例性地,该终端可以为车辆、无人机或机器人等。Exemplarily, the terminal may be a vehicle, a drone, or a robot.
上述标定装置既可以是安装在终端上的,又可以是独立于终端的。 The above-mentioned calibration apparatus may be installed on the terminal, or may be independent of the terminal.
本申请实施例还提供一种计算机可读介质,该计算机可读介质存储用于设备执行的程序代码,该程序代码包括用于执行上述实施例的方法。 An embodiment of this application further provides a computer-readable medium, where the computer-readable medium stores program code to be executed by a device, and the program code is used to perform the methods of the foregoing embodiments.
本申请实施例还提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述实施例的方法。The embodiment of the present application also provides a computer program product containing instructions, and when the computer program product is run on a computer, the computer is made to execute the method of the above embodiment.
本申请实施例还提供一种芯片,该芯片包括处理器与数据接口,处理器通过数据接口读取存储器上存储的指令,执行上述实施例的方法。The embodiment of the present application also provides a chip, the chip includes a processor and a data interface, and the processor reads the instructions stored in the memory through the data interface, and executes the method of the above embodiment.
可选地,作为一种实现方式,该芯片还可以包括存储器,存储器中存储有指令,处理器用于执行存储器上存储的指令,当指令被执行时,处理器用于执行上述实施例中的方法。Optionally, as an implementation manner, the chip may further include a memory, in which instructions are stored, and the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in the foregoing embodiments.
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中在本申请的说明书中所使用的术语只是为了描述具体的实施例的目的,不是旨在于限制本申请。Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. The terms used herein in the specification of the application are only for the purpose of describing specific embodiments, and are not intended to limit the application.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。Those skilled in the art can appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the above-described system, device and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。 In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division into units is merely a logical function division, and there may be other division manners in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。 If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of this application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。 The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (22)

  1. 一种双目相机外参标定的方法,其特征在于,包括:A method for calibrating external parameters of a binocular camera, comprising:
    获取第一图像和第二图像,所述第一图像是由双目相机中的第一摄像头对拍摄场景拍摄得到的,所述第二图像是由所述双目相机中的第二摄像头对所述拍摄场景拍摄得到的;acquiring a first image and a second image, wherein the first image is obtained by a first camera in a binocular camera photographing a shooting scene, and the second image is obtained by a second camera in the binocular camera photographing the shooting scene;
    分别在所述第一图像和所述第二图像中提取m条直线,m为大于1的整数,所述第一图像的m条直线和所述第二图像的m条直线之间具有对应关系;extracting m straight lines from each of the first image and the second image, wherein m is an integer greater than 1, and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image;
    基于所述双目相机的外参将所述第一图像的m条直线中的n条直线和所述第二图像的m条直线中的n条直线重构到三维空间中,得到重构后的n条直线,所述第一图像的n条直线和所述第二图像的n条直线为所述拍摄场景中的n条直线的投影,1<n≤m,n为整数;reconstructing, based on extrinsic parameters of the binocular camera, n of the m straight lines of the first image and n of the m straight lines of the second image into three-dimensional space to obtain n reconstructed straight lines, wherein the n straight lines of the first image and the n straight lines of the second image are projections of n straight lines in the shooting scene, 1<n≤m, and n is an integer; and
    根据重构误差调整所述双目相机的外参,所述重构误差是根据所述重构后的n条直线之间的位置关系和所述拍摄场景中的n条直线之间的位置关系确定的。adjusting the extrinsic parameters of the binocular camera according to a reconstruction error, wherein the reconstruction error is determined according to a positional relationship among the n reconstructed straight lines and a positional relationship among the n straight lines in the shooting scene.
  2. 根据权利要求1所述的方法,其特征在于,所述重构误差包括以下至少一项:所述重构后的n条直线之间的角度误差或所述重构后的n条直线之间的距离误差,The method according to claim 1, wherein the reconstruction error comprises at least one of the following: an angle error between the n reconstructed straight lines or a distance error between the n reconstructed straight lines, wherein:
    所述重构后的n条直线之间的角度误差是根据所述重构后的n条直线中的至少两条直线之间的角度与所述拍摄场景中的n条直线中的至少两条直线之间的角度之间的差值确定的;the angle error between the n reconstructed straight lines is determined according to a difference between an angle between at least two of the n reconstructed straight lines and an angle between at least two of the n straight lines in the shooting scene; and
    所述重构后的n条直线之间的距离误差是根据所述重构后的n条直线中的至少两条直线之间的距离与所述拍摄场景中的n条直线中的至少两条直线之间的距离之间的差值确定的。the distance error between the n reconstructed straight lines is determined according to a difference between a distance between at least two of the n reconstructed straight lines and a distance between at least two of the n straight lines in the shooting scene.
  3. 根据权利要求2所述的方法,其特征在于,所述拍摄场景中的至少两条直线包括至少两条相互平行的直线。The method according to claim 2, wherein the at least two straight lines in the shooting scene include at least two parallel straight lines.
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,所述分别在所述第一图像和所述第二图像中提取m条直线包括:The method according to any one of claims 1 to 3, wherein said extracting m straight lines in said first image and said second image respectively comprises:
    对所述第一图像和所述第二图像分别进行实例分割,得到所述第一图像中的实例和所述第二图像中的实例;respectively performing instance segmentation on the first image and the second image to obtain instances in the first image and instances in the second image;
    分别在所述第一图像中的实例和所述第二图像中的实例中提取m条直线,所述第一图像中的m条直线和所述第二图像中的m条直线之间的对应关系是根据所述第一图像中的实例和所述第二图像中的实例之间的对应关系确定的。extracting m straight lines from the instances in the first image and the instances in the second image respectively, wherein the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined according to a correspondence between the instances in the first image and the instances in the second image.
  5. 根据权利要求4所述的方法,其特征在于,所述对所述第一图像和所述第二图像分别进行实例分割,得到所述第一图像中的实例和所述第二图像中的实例,包括:The method according to claim 4, wherein the performing instance segmentation on the first image and the second image respectively to obtain the instances in the first image and the instances in the second image comprises:
    对所述第一图像和所述第二图像分别进行语义分割,得到所述第一图像的语义分割结果和所述第二图像的语义分割结果,所述第一图像的语义分割结果包括所述第一图像中的水平物或竖直物,所述第二图像的语义分割结果包括所述第二图像中的水平物或竖直物;performing semantic segmentation on the first image and the second image respectively to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image, wherein the semantic segmentation result of the first image comprises a horizontal object or a vertical object in the first image, and the semantic segmentation result of the second image comprises a horizontal object or a vertical object in the second image; and
    基于所述第一图像的语义分割结果对所述第一图像进行实例分割,得到所述第一图像中的实例,基于所述第二图像的语义分割结果对所述第二图像进行实例分割,得到所述第二图像中的实例。performing instance segmentation on the first image based on the semantic segmentation result of the first image to obtain the instances in the first image, and performing instance segmentation on the second image based on the semantic segmentation result of the second image to obtain the instances in the second image.
  6. 根据权利要求1至5中任一项所述的方法,其特征在于,所述方法还包括:控制显示器显示双目相机外参的标定情况。The method according to any one of claims 1 to 5, further comprising: controlling the display to display the calibration of the extrinsic parameters of the binocular camera.
  7. 根据权利要求6所述的方法,其特征在于,所述双目相机外参的标定情况包括以下至少一项:当前的标定进度、当前的重构误差的情况或重构后的p条直线,所述重构后的p条直线是基于当前的所述双目相机的外参将所述第一图像的m条直线中的p条直线和所述第二图像的m条直线中的p条直线重构到三维空间中得到的,1<p≤m,p为整数。The method according to claim 6, wherein the calibration situation of the extrinsic parameters of the binocular camera comprises at least one of the following: a current calibration progress, a current reconstruction error situation, or p reconstructed straight lines, wherein the p reconstructed straight lines are obtained by reconstructing, based on current extrinsic parameters of the binocular camera, p of the m straight lines of the first image and p of the m straight lines of the second image into three-dimensional space, 1<p≤m, and p is an integer.
  8. 根据权利要求7所述的方法,其特征在于,所述当前的标定进度包括以下至少一项:The method according to claim 7, wherein the current calibration progress includes at least one of the following:
    当前的所述双目相机的外参或当前的标定完成度。The current extrinsic parameters of the binocular camera or the current calibration completion degree.
  9. 根据权利要求8所述的方法,其特征在于,所述当前的重构误差的情况包括以下至少一项:The method according to claim 8, wherein the current reconstruction error situation includes at least one of the following:
    当前的重构误差、当前的距离误差或当前的角度误差。Current reconstruction error, current range error, or current angle error.
  10. 一种双目相机外参标定的装置,其特征在于,包括:A device for calibrating external parameters of a binocular camera, characterized in that it comprises:
    获取单元,用于获取第一图像和第二图像,所述第一图像是由双目相机中的第一摄像头对拍摄场景拍摄得到的,所述第二图像是由所述双目相机中的第二摄像头对所述拍摄场景拍摄得到的;an acquisition unit, configured to acquire a first image and a second image, wherein the first image is obtained by a first camera in a binocular camera photographing a shooting scene, and the second image is obtained by a second camera in the binocular camera photographing the shooting scene; and
    处理单元,用于:processing unit for:
    分别在所述第一图像和所述第二图像中提取m条直线,m为大于1的整数,所述第一图像的m条直线和所述第二图像的m条直线之间具有对应关系;extract m straight lines from each of the first image and the second image, wherein m is an integer greater than 1, and there is a correspondence between the m straight lines of the first image and the m straight lines of the second image;
    基于所述双目相机的外参将所述第一图像的m条直线中的n条直线和所述第二图像的m条直线中的n条直线重构到三维空间中,得到重构后的n条直线,所述第一图像的n条直线和所述第二图像的n条直线为所述拍摄场景中的n条直线的投影,1<n≤m,n为整数;reconstruct, based on extrinsic parameters of the binocular camera, n of the m straight lines of the first image and n of the m straight lines of the second image into three-dimensional space to obtain n reconstructed straight lines, wherein the n straight lines of the first image and the n straight lines of the second image are projections of n straight lines in the shooting scene, 1<n≤m, and n is an integer; and
    根据重构误差调整所述双目相机的外参,所述重构误差是根据所述重构后的n条直线之间的位置关系和所述拍摄场景中的n条直线之间的位置关系确定的。adjust the extrinsic parameters of the binocular camera according to a reconstruction error, wherein the reconstruction error is determined according to a positional relationship among the n reconstructed straight lines and a positional relationship among the n straight lines in the shooting scene.
  11. The apparatus according to claim 10, wherein the reconstruction error comprises at least one of the following: an angle error among the n reconstructed straight lines or a distance error among the n reconstructed straight lines, wherein:
    the angle error among the n reconstructed straight lines is determined from the difference between the angle between at least two of the n reconstructed straight lines and the angle between the corresponding at least two of the n straight lines in the captured scene; and
    the distance error among the n reconstructed straight lines is determined from the difference between the distance between at least two of the n reconstructed straight lines and the distance between the corresponding at least two of the n straight lines in the captured scene.
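The angle and distance errors of claim 11 can be illustrated with a short sketch: compute the angle between the direction vectors of two reconstructed 3D lines and the distance between two near-parallel reconstructed lines, then compare them against the known scene geometry (e.g. lane markings that are parallel and a known width apart). The function names and the two-endpoint line representation are assumptions for illustration only.

```python
import numpy as np

def angle_between(l1, l2):
    """Angle in radians between the direction vectors of two 3D segments,
    each given as a (2, 3) array of endpoints."""
    d1 = l1[1] - l1[0]
    d2 = l2[1] - l2[0]
    c = abs(d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return np.arccos(np.clip(c, -1.0, 1.0))

def parallel_distance(l1, l2):
    """Distance from line l2 to line l1, valid for (near-)parallel lines:
    the component of the offset vector orthogonal to l1's direction."""
    d = l1[1] - l1[0]
    d = d / np.linalg.norm(d)
    v = l2[0] - l1[0]
    return np.linalg.norm(v - (v @ d) * d)

def reconstruction_error(recon, pairs, expected_angles, expected_dists):
    """Sum of |angle - expected| + |distance - expected| over line pairs,
    one possible scalar form of the error described in claim 11."""
    err = 0.0
    for (i, j), ea, ed in zip(pairs, expected_angles, expected_dists):
        err += abs(angle_between(recon[i], recon[j]) - ea)
        err += abs(parallel_distance(recon[i], recon[j]) - ed)
    return err
```

Under claim 12's parallel-line case, the expected angle is zero and the expected distance is the known spacing, so a correctly calibrated camera drives this error toward zero.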
  12. The apparatus according to claim 11, wherein the at least two straight lines in the captured scene comprise at least two mutually parallel straight lines.
  13. The apparatus according to any one of claims 10 to 12, wherein the processing unit is specifically configured to:
    perform instance segmentation on the first image and the second image respectively to obtain instances in the first image and instances in the second image; and
    extract the m straight lines from the instances in the first image and the instances in the second image respectively, where the correspondence between the m straight lines in the first image and the m straight lines in the second image is determined from the correspondence between the instances in the first image and the instances in the second image.
  14. The apparatus according to claim 13, wherein the processing unit is specifically configured to:
    perform semantic segmentation on the first image and the second image respectively to obtain a semantic segmentation result of the first image and a semantic segmentation result of the second image, where the semantic segmentation result of the first image includes horizontal or vertical objects in the first image and the semantic segmentation result of the second image includes horizontal or vertical objects in the second image; and
    perform instance segmentation on the first image based on the semantic segmentation result of the first image to obtain the instances in the first image, and perform instance segmentation on the second image based on the semantic segmentation result of the second image to obtain the instances in the second image.
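The segmentation pipeline of claims 13 and 14 can be sketched as: a semantic mask of, say, vertical structures is split into per-object instances by connected-component labelling, and a straight line is fitted to each instance's pixels. This is an illustrative sketch only, not the claimed implementation; scipy's connected-component labelling and a PCA-style least-squares line fit stand in for whatever segmentation network and line extractor an actual apparatus would use.

```python
import numpy as np
from scipy import ndimage

def instances_from_semantic(mask):
    """Split a binary semantic mask (e.g. pixels labelled 'vertical
    structure') into per-object instance masks via connected components."""
    labels, n = ndimage.label(mask)
    return [labels == k for k in range(1, n + 1)]

def fit_line(instance_mask):
    """Least-squares line fit to an instance's pixels.

    Returns (centroid, unit_direction): the dominant direction of the
    pixel cloud, obtained from the SVD of the centered (x, y) points.
    """
    ys, xs = np.nonzero(instance_mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid)
    return centroid, Vt[0]  # first right singular vector = line direction
```

Because each instance in the left image is matched to an instance in the right image, the lines fitted inside matched instances inherit the correspondence required by claim 13 without any separate line-matching step.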
  15. The apparatus according to any one of claims 10 to 14, further comprising: a display unit configured to display the calibration status of the extrinsic parameters of the binocular camera.
  16. The apparatus according to claim 15, wherein the calibration status of the extrinsic parameters of the binocular camera comprises at least one of the following: the current calibration progress, the current reconstruction error status, or p reconstructed straight lines, where the p reconstructed straight lines are obtained by reconstructing, based on the current extrinsic parameters of the binocular camera, p of the m straight lines of the first image and p of the m straight lines of the second image into three-dimensional space, 1 < p ≤ m, and p is an integer.
  17. The apparatus according to claim 16, wherein the current calibration progress comprises at least one of the following:
    the current extrinsic parameters of the binocular camera or the current degree of calibration completion.
  18. The apparatus according to claim 17, wherein the current reconstruction error status comprises at least one of the following:
    the current reconstruction error, the current distance error, or the current angle error.
  19. A chip, comprising at least one processor and an interface circuit, wherein the at least one processor obtains, through the interface circuit, instructions stored in a memory to perform the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium, wherein the computer-readable medium stores program code for execution by a device, the program code comprising instructions for performing the method according to any one of claims 1 to 9.
  21. A terminal, comprising the apparatus according to any one of claims 10 to 18.
  22. The terminal according to claim 21, further comprising a binocular camera.
PCT/CN2021/106747 2021-07-16 2021-07-16 Method and apparatus for calibrating external parameters of binocular camera WO2023283929A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/106747 WO2023283929A1 (en) 2021-07-16 2021-07-16 Method and apparatus for calibrating external parameters of binocular camera
CN202180094173.2A CN116917936A (en) 2021-07-16 2021-07-16 External parameter calibration method and device for binocular camera

Publications (1)

Publication Number Publication Date
WO2023283929A1 true WO2023283929A1 (en) 2023-01-19

Family

ID=84919002

Country Status (2)

Country Link
CN (1) CN116917936A (en)
WO (1) WO2023283929A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117173257A (en) * 2023-11-02 2023-12-05 安徽蔚来智驾科技有限公司 3D target detection and calibration parameter enhancement method, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103745452A (en) * 2013-11-26 2014-04-23 理光软件研究所(北京)有限公司 Camera external parameter assessment method and device, and camera external parameter calibration method and device
US20150103147A1 (en) * 2013-10-14 2015-04-16 Etron Technology, Inc. Image calibration system and calibration method of a stereo camera
US20180316906A1 (en) * 2017-05-01 2018-11-01 Panasonic Intellectual Property Management Co., Ltd. Camera parameter set calculation apparatus, camera parameter set calculation method, and recording medium
CN111462249A (en) * 2020-04-02 2020-07-28 北京迈格威科技有限公司 Calibration data acquisition method, calibration method and device for traffic camera
CN112184830A (en) * 2020-09-22 2021-01-05 深研人工智能技术(深圳)有限公司 Camera internal parameter and external parameter calibration method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN116917936A (en) 2023-10-20

Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21949713; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 202180094173.2; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)