CN110880189B - Combined calibration method, combined calibration device thereof, and electronic equipment
- Publication number: CN110880189B
- Application number: CN201811036954.2A
- Authority: CN (China)
- Prior art keywords: image, inertial, sequence, unit, visual
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
Abstract
The invention provides a combined calibration method, a combined calibration device and electronic equipment. The combined calibration method is used for calibrating a vision-inertia combined device, wherein the vision-inertia combined device comprises an image acquisition unit and an inertia measurement unit. The combined calibration method comprises the following steps: processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit; fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertial timestamps of the inertia measurement unit; analyzing the cross-correlation between the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and obtaining an initial pose value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the visual calculation data and the inertial data sequence.
Description
Technical Field
The invention relates to the technical field of machine vision, in particular to a combined calibration method, a combined calibration device and electronic equipment.
Background
In simultaneous localization and mapping (SLAM) schemes, a camera and an Inertial Measurement Unit (IMU) are often combined to exploit the advantages of the two sensors and overcome the disadvantages of a single sensor. In algorithm schemes combining a camera and an IMU, data synchronization and alignment among the sensors are important prerequisites that directly influence the algorithm's effectiveness: time synchronization means that the timestamps of the sensor data refer to the same clock, and time alignment means that the data of the sensor with the high output frequency are paired with the data of the sensor with the low output frequency.
On the one hand, regarding the time synchronization problem, most algorithms assume that the data between the camera and the IMU are synchronized, while in practice multi-sensor synchronization is challenging. At present, two methods are mainly used. One is hardware synchronization, in which a single processor is used and, when the signal line of the camera or the IMU changes, the processor detects the generation time of the signal through an interrupt so as to acquire synchronized multi-sensor data. The other is software synchronization, in which a software algorithm converts the camera clock and the IMU clock to the same clock before output. On the other hand, regarding the sensor time alignment problem, a common approach is to pre-integrate the IMU data between the timestamps of adjacent image frames and then fuse the visual calculation result with the IMU pre-integration result. There are also algorithms that directly find the IMU data whose timestamp differs least from the image timestamp and use it as the aligned data.
However, although both hardware synchronization and software synchronization in the above time synchronization schemes can achieve a good synchronization effect, hardware synchronization requires special hardware support, such as a high precision positioning module, which increases the complexity of hardware integration. Likewise, software synchronization requires that each device be supported by a corresponding algorithm module, which also needs to be specifically tailored. Therefore, there is not yet a time synchronization method with a wide application range for solving the multi-sensor synchronization problem.
In addition, although the existing data alignment methods have good applicability, they all require down-sampling of the IMU data, which causes a great loss of IMU data; moreover, once the nearest-neighbor matching approach encounters a time delay, the calculation accuracy is greatly reduced.
Disclosure of Invention
One of the main advantages of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which can calibrate an initial time delay value and an initial pose value between an image capturing unit and an inertia measuring unit for the application of a subsequent algorithm.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which can solve the problem that an image timestamp of the image capturing unit is not synchronized with an inertia timestamp of the inertia measuring unit, that is, which can solve the problem that an image data output time is not synchronized with an inertia data output time.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which do not need to down-sample the inertial data measured by the inertial measurement unit to ensure the integrity of the inertial data, and are helpful to improve the accuracy of the calibration result.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which can accelerate the joint calibration so as to save calibration time and calibration cost while ensuring the stability and accuracy of the calibration results.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which can align low-frame-rate image data with high-frame-rate inertial data at the high frame rate by spline fitting, and thus can improve the accuracy of the calibration result.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, which can optimize pose parameters of the image capturing unit and the inertial measurement unit in advance, and provide reliable initial values for bundle set optimization, so as to obtain more stable and reliable calibration results.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration apparatus and an electronic device thereof, wherein the joint calibration method is simple to implement, has high applicability, and is significant for reducing industrial production cost, reducing development period, and improving accuracy of a later-stage algorithm.
Another advantage of the present invention is to provide a joint calibration method, a joint calibration device and an electronic device thereof, which have great application values to various intelligent systems or devices based on multi-sensor fusion, and are suitable for being applied to the fields of augmented reality, positioning and tracking, etc.
Additional advantages and features of the invention will be set forth in the detailed description which follows and in part will be apparent from the description, or may be learned by practice of the invention as set forth hereinafter.
In accordance with one aspect of the present invention, the foregoing and other objects and advantages are achieved by a combined calibration method for calibrating a vision-inertia combined apparatus, wherein the vision-inertia combined apparatus includes an image acquisition unit and an inertia measurement unit, comprising the steps of:
processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit;
fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertia time stamp of the inertia measurement unit;
analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
obtaining an initial pose value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence.
According to an embodiment of the present invention, the joint calibration method further includes the steps of:
jointly acquiring the image data sequence and the inertial data sequence by the vision-inertia joint device, wherein the image data sequence comprises a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence comprises a set of inertial data with the inertial time stamps measured by the inertial measurement unit.
According to an embodiment of the present invention, the step of jointly acquiring the image data sequence and the inertial data sequence by the visual inertia joint apparatus, wherein the image data sequence includes a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with the inertial time stamps measured by the inertial measurement unit, includes the steps of:
moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit;
measuring a set of the inertial data within the predetermined time by the inertial measurement unit, wherein each of the inertial data includes a measured angular velocity of the inertial measurement unit rotating about a body coordinate system of the inertial measurement unit at the corresponding inertial timestamp; and
acquiring a set of the image data within the predetermined time by the image acquisition unit, wherein each of the image data comprises a target image obtained by photographing the target with the image acquisition unit at the corresponding image timestamp.
According to an embodiment of the present invention, said jointly acquiring, by the visual inertia joint apparatus, the image data sequence and the inertial data sequence, wherein the image data sequence includes a set of image data with image time stamps acquired by the image acquisition unit, the inertial data sequence includes a set of inertial data with the inertial time stamps measured by the inertial measurement unit, and further includes:
designing a predetermined motion trajectory and programming the predetermined motion trajectory into a mechanical arm, so that the visual-inertial combination device is moved by the mechanical arm according to the predetermined motion trajectory to activate the acceleration of the visual-inertial combination device in all directions, so that the inertial measurement unit is activated in all directions.
According to an embodiment of the present invention, the step of obtaining a sequence of visual poses of the image capturing unit by processing the acquired sequence of image data comprises the steps of:
detecting the coordinates of the corner points of each image data in the image data sequence based on the image data sequence;
solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and
obtaining a pose of the image capture unit relative to the virtual target at each of the image timestamps by decomposing the homography matrix at each of the image timestamps to obtain the sequence of visual poses, wherein the sequence of visual poses includes the pose of the image capture unit relative to the virtual target at each of the image timestamps.
According to an embodiment of the present invention, the step of obtaining a sequence of visual poses of the image capturing unit by processing the acquired sequence of image data further includes the steps of:
calculating the corner point coordinates of the virtual target based on the input target parameters.
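As a concrete illustration of these corner-related steps, the sketch below generates a virtual target corner grid from assumed target parameters and detects the corresponding image corners with OpenCV. The target dimensions, square size, image file name, and the choice of a chessboard detector are hypothetical illustrations, not values or functions prescribed by the patent.

```python
import cv2
import numpy as np

# Assumed target parameters (hypothetical): a 9x6 chessboard with 25 mm squares.
cols, rows, square_size = 9, 6, 0.025

# Corner point coordinates of the virtual target, computed from the target parameters.
virtual_corners = np.array(
    [[c * square_size, r * square_size] for r in range(rows) for c in range(cols)],
    dtype=np.float64,
)

# Corner point coordinates detected in one image of the image data sequence.
image = cv2.imread("target_frame_000.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
found, corners = cv2.findChessboardCorners(image, (cols, rows))
if found:
    corners = cv2.cornerSubPix(
        image, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3),
    )
    image_corners = corners.reshape(-1, 2)          # one (x, y) pair per detected corner
```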
According to an embodiment of the present invention, the step of obtaining the visual calculation data sequence corresponding to the inertial timestamp of the inertial measurement unit by fitting the visual pose sequence to a continuous spline comprises the steps of:
substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
after obtaining the spline parameters, the sequence of visual computation data is solved by processing the spline segment fitting model, wherein the sequence of visual computation data includes a set of active rotation angular velocities corresponding to the inertial time stamps.
According to an embodiment of the present invention, the step of processing the spline curve segment fitting model after obtaining the spline parameters to solve the vision calculation data sequence, wherein the vision calculation data sequence includes a set of active rotation angular velocities corresponding to the inertial timestamps, includes the steps of:
obtaining a passive rotation angular velocity at each inertial timestamp by solving a first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is an angular velocity at which the image acquisition unit rotates around the body coordinate system of the virtual target; and
converting the passive rotation angular velocity into the active rotation angular velocity, wherein the active rotation angular velocity is an angular velocity at which the image capturing unit rotates around a body coordinate system of the image capturing unit at the time of the inertial timestamp.
According to an embodiment of the present invention, the step of obtaining an initial value of a time delay between the image acquisition unit and the inertial measurement unit by analyzing the degree of cross-correlation between the acquired inertial data sequence and the visual calculation data sequence includes the steps of:
obtaining a correlation value between each measured angular velocity of the inertial data sequence and each active rotation angular velocity of the vision calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence comprises a set of correlation values corresponding to the correlation coefficients;
obtaining an optimal correlation coefficient by comparing the magnitudes of all the correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and
solving a time delay value between the image acquisition unit and the inertial measurement unit based on the optimal correlation coefficient, to serve as the initial time delay value.
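To make the cross-correlation step concrete, the following sketch estimates the delay from the norms of the two angular-velocity sequences with NumPy; comparing magnitudes and the variable names are illustrative choices rather than the patent's prescription.

```python
import numpy as np

def estimate_initial_delay(measured_gyro, predicted_gyro, imu_rate_hz):
    """Initial time delay between camera and IMU from cross-correlation.

    measured_gyro:  (N, 3) angular velocities measured by the inertial measurement unit
    predicted_gyro: (N, 3) active rotation angular velocities from the spline, sampled
                    at the same inertial timestamps
    imu_rate_hz:    IMU output rate, used to convert the best lag into seconds
    """
    # Compare rotation magnitudes so the result does not depend on the (still unknown)
    # camera-IMU rotation.
    a = np.linalg.norm(measured_gyro, axis=1)
    b = np.linalg.norm(predicted_gyro, axis=1)
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)

    # Correlation value for every candidate shift; the optimal shift is the one with
    # the largest correlation value.
    corr = np.correlate(a, b, mode="full")           # lags from -(N-1) to +(N-1)
    best_lag = np.argmax(corr) - (len(b) - 1)
    return best_lag / imu_rate_hz                    # initial time delay in seconds

# Example with synthetic data: the "measured" sequence lags the prediction by 25 samples.
t = np.linspace(0.0, 10.0, 2000)
predicted = np.stack([np.sin(2 * t), np.cos(3 * t), np.sin(t)], axis=1)
measured = np.roll(predicted, 25, axis=0)
print(estimate_initial_delay(measured, predicted, imu_rate_hz=200.0))  # roughly 0.125 s
```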
According to an embodiment of the present invention, the step of obtaining an initial pose value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence comprises the steps of:
converting each active rotation angular velocity of the vision calculation data sequence into an angular velocity rotating around a body coordinate system of the inertial measurement unit to obtain a converted vision calculation data sequence;
differencing each measured angular velocity of the inertial data sequence with a corresponding transformed angular velocity of the transformed sequence of vision calculation data to obtain an error term between each measured angular velocity and the corresponding transformed angular velocity; and
solving, by using a linear solving library, the rotation from the inertial measurement unit to the image acquisition unit that corresponds to the minimized error, to serve as the rotation component of the initial pose value between the image acquisition unit and the inertial measurement unit.
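The patent only states that a linear solving library is used to minimize this error; one common concrete realisation is to solve the underlying orthogonal Procrustes (Wahba) problem with an SVD, as sketched below under that assumption.

```python
import numpy as np

def initial_rotation_imu_to_camera(gyro_imu, gyro_cam):
    """Rotation R (IMU frame -> camera frame) minimising sum ||gyro_cam - R @ gyro_imu||^2.

    gyro_imu: (N, 3) measured angular velocities in the IMU body frame
    gyro_cam: (N, 3) active rotation angular velocities in the camera body frame,
              already aligned in time using the initial time delay value
    """
    H = gyro_imu.T @ gyro_cam                        # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T                            # proper rotation (det = +1)

# Synthetic check: rotate a random angular-velocity sequence by a known rotation.
rng = np.random.default_rng(0)
w_imu = rng.normal(size=(500, 3))
angle = 0.4
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
w_cam = w_imu @ R_true.T
print(np.allclose(initial_rotation_imu_to_camera(w_imu, w_cam), R_true))  # True (noise-free)
```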
According to an embodiment of the present invention, the joint calibration method further includes the steps of:
obtaining, based on the initial time delay value and the initial pose value, the optimal time delay and the optimal pose between the image acquisition unit and the inertial measurement unit through bundle set optimization.
According to an embodiment of the present invention, the step of obtaining the optimal time delay and the optimal pose between the image capturing unit and the inertial measurement unit by bundle set optimization based on the initial time delay value and the initial pose value includes the steps of:
based on the inertia timestamp and the initial time delay value, acquiring the pose from the virtual target to the inertia measurement unit;
based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, transforming the world corner coordinates of the virtual target through an observation model to obtain the corner coordinates of a new target image; and
solving errors between the corner coordinates of the new target image and the corner coordinates of the target image in the image data sequence through an error model, and optimizing all the errors in a bundle to obtain the optimal time delay and the optimal pose.
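To illustrate the structure of this bundled refinement, the toy sketch below builds synthetic corner observations from an assumed IMU trajectory and then refines the time delay and the camera-IMU extrinsics with `scipy.optimize.least_squares`. The trajectory, intrinsics, observation model, and all numeric values are illustrative assumptions; the patent's observation and error models are richer than this sketch.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

# Assumed planar target corners and pinhole intrinsics (illustrative values only).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
world_corners = 0.03 * np.array([[x, y, 0.0] for x in range(6) for y in range(5)])

def imu_pose_in_target(t):
    """Assumed smooth IMU trajectory: IMU-to-target rotation and IMU position in the target frame."""
    R = Rotation.from_rotvec([0.05 * np.sin(t), 0.15 * t, 0.0]).as_matrix()
    p = np.array([0.05 * t, 0.02 * np.sin(2.0 * t), -0.5])
    return R, p

def predict_corners(t_image, delay, R_ci, t_ci):
    """Observation model: project the target corners into the camera at the delay-corrected time."""
    R_wi, p_wi = imu_pose_in_target(t_image + delay)
    pts_imu = (world_corners - p_wi) @ R_wi          # corners expressed in the IMU frame
    pts_cam = pts_imu @ R_ci.T + t_ci                # corners expressed in the camera frame
    uv = pts_cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def residuals(params, frame_times, observed):
    """Error model: reprojection error of every corner in every frame, stacked into one vector."""
    delay, R_ci, t_ci = params[0], Rotation.from_rotvec(params[1:4]).as_matrix(), params[4:7]
    errs = [predict_corners(t, delay, R_ci, t_ci) - obs for t, obs in zip(frame_times, observed)]
    return np.concatenate([e.ravel() for e in errs])

# Synthetic observations generated with a "true" delay and camera-IMU pose.
frame_times = np.linspace(0.0, 3.0, 30)
true_delay = 0.015
R_true = Rotation.from_rotvec([0.02, -0.01, 0.03]).as_matrix()
t_true = np.array([0.02, 0.0, 0.01])
observed = [predict_corners(t, true_delay, R_true, t_true) for t in frame_times]

# Bundled optimisation over all errors, starting from rough initial values (zeros here).
result = least_squares(residuals, x0=np.zeros(7), args=(frame_times, observed))
print("optimal delay:", result.x[0], "optimal rotation vector:", result.x[1:4])
```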
According to an embodiment of the present invention, the joint calibration method further includes the steps of:
obtaining the camera parameters of the image acquisition unit by calibrating the image acquisition unit.
According to an embodiment of the present invention, in the step of obtaining the camera parameters of the image capturing unit by calibrating the image capturing unit:
calibrating the image acquisition unit by binocular calibration to obtain the internal parameters and the external parameters of the camera parameters of the image acquisition unit.
According to an embodiment of the present invention, in the step of obtaining the camera parameters of the image capturing unit by calibrating the image capturing unit:
obtaining the internal parameters of the camera parameters of the image acquisition unit by monocular calibration of the image acquisition unit.
According to another aspect of the present invention, there is further provided a combined calibration apparatus for calibrating a vision-inertia combined apparatus, wherein the vision-inertia combined apparatus comprises an image acquisition unit and an inertia measurement unit, wherein the combined calibration apparatus comprises:
the visual pose acquisition unit is used for processing the acquired image data sequence to acquire a visual pose sequence of the image acquisition unit;
a spline curve fitting unit for fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertia time stamp of the inertia measurement unit;
the cross-correlation degree analysis unit is used for analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
the minimized error analysis unit is used for obtaining an initial pose value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence.
According to an embodiment of the present invention, the joint calibration apparatus further includes a joint acquisition unit for jointly acquiring the image data sequence and the inertial data sequence by the visual-inertial joint apparatus, wherein the image data sequence includes a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with inertial time stamps measured by the inertial measurement unit.
According to an embodiment of the present invention, the joint obtaining unit is further configured to:
moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit;
measuring a set of the inertial data within the predetermined time by the inertial measurement unit, wherein each of the inertial data includes a measured angular velocity of the inertial measurement unit rotating about a body coordinate system of the inertial measurement unit at the corresponding inertial timestamp; and
acquiring a set of the image data within the predetermined time by the image acquisition unit, wherein each of the image data comprises a target image obtained by photographing the target with the image acquisition unit at the corresponding image timestamp.
According to an embodiment of the present invention, the joint acquisition unit is further configured to design a predetermined motion trajectory and program the predetermined motion trajectory into a robot arm, so that the visual-inertial combination unit is moved by the robot arm according to the predetermined motion trajectory to activate the acceleration of the visual-inertial combination unit in each direction, so that the inertial measurement unit is activated in each direction.
According to an embodiment of the present invention, the visual pose obtaining unit is further configured to:
detecting the coordinates of the corner points of each image data in the image data sequence based on the image data sequence;
solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and
obtaining the sequence of visual poses by decomposing the homography matrix at each of the image timestamps to obtain the pose of the image capture unit relative to the virtual target at each of the image timestamps, wherein the sequence of visual poses includes the pose of the image capture unit relative to the virtual target at each of the image timestamps.
According to an embodiment of the present invention, the visual pose obtaining unit is further configured to calculate the corner coordinates of the virtual target based on the input target parameters.
According to an embodiment of the invention, the spline curve fitting unit is further configured to:
substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
after obtaining the spline parameters, solving the vision calculation data sequence by processing the spline curve segment fitting model, wherein the vision calculation data sequence comprises a set of active rotation angular velocities corresponding to the inertia time stamps.
According to an embodiment of the invention, the spline curve fitting unit is further configured to:
obtaining a passive rotation angular velocity at each inertial timestamp by solving a first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is an angular velocity at which the image acquisition unit rotates around the body coordinate system of the virtual target; and
converting the passive rotation angular velocity into the active rotation angular velocity, wherein the active rotation angular velocity is an angular velocity at which the image capturing unit rotates around a body coordinate system of the image capturing unit at the time of the inertial timestamp.
According to an embodiment of the present invention, the cross-correlation degree analyzing unit is further configured to:
obtaining a correlation value between each measured angular velocity of the inertial data sequence and each active rotation angular velocity of the vision calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence comprises a set of correlation values corresponding to the correlation coefficients;
obtaining an optimal correlation coefficient by comparing the magnitudes of all the correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and
solving a time delay value between the image acquisition unit and the inertial measurement unit based on the optimal correlation coefficient, to serve as the initial time delay value.
According to an embodiment of the present invention, the minimization of error analysis unit is further configured to:
converting each active rotation angular velocity of the vision calculation data sequence into an angular velocity rotating around a body coordinate system of the inertial measurement unit to obtain a converted vision calculation data sequence;
differencing each measured angular velocity of the inertial data sequence with a corresponding transformed angular velocity of the transformed sequence of vision calculation data to obtain an error term between each measured angular velocity and the corresponding transformed angular velocity; and
solving, by using a linear solving library, the rotation from the inertial measurement unit to the image acquisition unit that corresponds to the minimized error, to serve as the rotation component of the initial pose value between the image acquisition unit and the inertial measurement unit.
According to an embodiment of the present invention, the joint calibration apparatus further includes a bundling optimization unit, configured to obtain an optimal time delay and an optimal pose between the image capturing unit and the inertial measurement unit through bundling optimization based on the initial time delay value and the initial pose value.
According to an embodiment of the present invention, the bundle set optimizing unit is further configured to:
based on the inertia timestamp and the initial time delay value, acquiring the pose from the virtual target to the inertia measurement unit;
based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, transforming the world corner coordinates of the virtual target through an observation model to obtain the corner coordinates of a new target image; and
solving errors between the corner coordinates of the new target image and the corner coordinates of the target image in the image data sequence through an error model, and optimizing all the errors in a bundle to obtain the optimal time delay and the optimal pose.
According to an embodiment of the present invention, the combined calibration apparatus further includes a camera parameter calibration unit, configured to obtain the camera parameters of the image capturing unit by calibrating the image capturing unit.
According to an embodiment of the present invention, the camera parameter calibration unit is further configured to perform binocular calibration on the image capturing unit, so as to obtain the internal parameters and the external parameters of the camera parameters of the image capturing unit.
According to another aspect of the present invention, the present invention further provides an electronic device comprising
At least one processor; and
at least one memory having computer program instructions stored therein, which when executed by the processor, cause the processor to perform the joint calibration method described above.
According to another aspect of the present invention, there is further provided a computer readable storage medium, wherein the computer readable storage medium has stored thereon computer program instructions operable to perform the above-mentioned joint calibration method when the computer program instructions are executed by a computing device.
Further objects and advantages of the invention will be fully apparent from the ensuing description and drawings.
These and other objects, features and advantages of the present invention will become more fully apparent from the following detailed description, the accompanying drawings and the claims.
Drawings
These and/or other aspects and advantages of the present invention will become more apparent and more readily appreciated from the following detailed description of the embodiments of the invention, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a joint calibration method according to a preferred embodiment of the present invention.
Fig. 2 is a schematic process diagram of the joint acquisition of data in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 3 is a schematic process diagram of the acquisition of the visual pose in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 4 is a schematic process diagram of spline curve fitting in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 5A is a schematic process diagram of cross-correlation analysis in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 5B shows a schematic diagram of a cross-correlation coefficient variation graph.
FIG. 6 is a schematic process diagram of the analysis of the minimum error in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 7 is a schematic process diagram of bundle set optimization in the joint calibration method according to the above preferred embodiment of the present invention.
Fig. 8 is a block diagram of a joint calibration apparatus according to the above preferred embodiment of the present invention.
FIG. 9 is a block diagram of an electronic device according to the above preferred embodiment of the present invention.
Detailed Description
The following description is presented to disclose the invention so as to enable any person skilled in the art to practice the invention. The preferred embodiments described below are by way of example only, and other obvious variations will occur to those skilled in the art. The underlying principles of the invention, as defined in the following description, may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
It should be understood that the terms "a" and "an" indicate that the number of an element may be one in one embodiment while being plural in another embodiment; these terms should not be interpreted as limiting the number of elements.
Localization and mapping are two fundamental problems in the field of robot navigation and control research, and simultaneous localization and mapping (SLAM) technology has always been a subject of intense research in the fields of robotics and computer vision. In recent years, these technologies have increasingly been applied to small platforms such as small drones, handheld mobile devices, and augmented reality devices. Since the vision-inertia combined device fixedly couples two sensors, namely an image acquisition unit (such as a camera) and an inertial measurement unit (such as a gyroscope), it can integrate the respective advantages of the two sensors and compensate for the disadvantages of a single sensor, and has therefore gradually become a focus in the field of localization and mapping.
As is well known, in the algorithm scheme combining vision and inertia, time synchronization and time alignment between image data acquired by the image acquisition unit and inertial data measured by the inertial measurement unit are important prerequisites for directly affecting the algorithm effect, that is, in order to obtain an accurate algorithm result, the synchronization and alignment between the output time of the image data and the output time of the inertial data must be solved. On one hand, however, hardware synchronization in the existing time synchronization method needs special hardware support, which increases complexity of hardware integration, and software synchronization also requires that each device needs to be specially customized to be supported by a corresponding algorithm module, which cannot adapt to a wider application range; on the other hand, the existing time alignment methods all need to perform downsampling on the inertial data, so that the inertial data has a large loss, and the calculation accuracy is greatly reduced.
Aiming at the above technical problem, the basic idea of the invention is to design a calibration method for jointly calibrating the visual-inertial joint device by deeply analyzing the working principles of camera imaging and of the inertial-visual joint positioning system, so as to estimate the initial time delay value and the initial extrinsic parameter value between the image acquisition unit and the inertial measurement unit based on spline interpolation. The method first fits the poses obtained by pure vision into a continuous spline curve; secondly, it initializes the time delay value between the image acquisition unit and the inertial measurement unit through the cross-correlation between the vision calculation data solved from the spline and the measured inertial data, and uses this value as the initial time delay value between the image acquisition unit and the inertial measurement unit; finally, it obtains a good initial pose value between the image acquisition unit and the inertial measurement unit by minimizing the error between the vision calculation data and the inertial data. The calibration method is simple to operate, the corner detection is quick and effective, and the calibration result is stable, which is of great significance for the practical application of vision-inertia integrated simultaneous localization and mapping systems.
Based on the above, the invention provides a combined calibration method, a combined calibration device and an electronic device thereof, which are used for calibrating a vision-inertia combined device, wherein the vision-inertia combined device comprises an image acquisition unit and an inertia measurement unit. A visual pose sequence of the image acquisition unit is obtained by processing the acquired image data sequence; then, the visual pose sequence is fitted into a continuous spline curve to obtain a vision calculation data sequence corresponding to the inertial timestamps of the inertia measurement unit; then, the cross-correlation between the obtained inertial data sequence and the vision calculation data sequence is analyzed to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; finally, an initial pose value between the image acquisition unit and the inertial measurement unit is obtained by minimizing the error between the coordinate-converted vision calculation data and the inertial data sequence. Because the visual pose sequence is fitted into a continuous spline curve, the alignment and fusion of the image data sequence and the inertial data sequence can be completed without down-sampling the inertial data sequence; this ensures the integrity of the inertial data sequence, enables the combined calibration method to obtain a high-precision calibration result, and improves the calculation precision of subsequent algorithms.
Having described the general principles of the present invention, various non-limiting embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
It should be understood that in the present invention, the image acquisition unit of the visual-inertial combination unit may be implemented as, but not limited to, a monocular camera, a binocular camera, or a multi-view camera; the Inertial Measurement Unit (IMU) of the visual-inertial combination unit is implemented as a device for measuring the three-axis attitude angles (or angular velocities) and accelerations of an object.
Illustrative method
Fig. 1 shows a flow chart of a joint calibration method for jointly calibrating a visual-inertial joint apparatus, wherein the visual-inertial joint apparatus includes an image-capturing unit and an inertial measurement unit, according to a preferred embodiment of the present invention, wherein the joint calibration method includes the steps of:
S220: processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit;
S230: obtaining a vision calculation data sequence corresponding to the inertial timestamps of the inertial measurement unit by fitting the visual pose sequence into a continuous spline curve;
S240: analyzing the cross-correlation between the obtained inertial data sequence and the vision calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
S250: obtaining an initial pose value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence.
Further, as shown in fig. 1, before the step S220, the combined calibration method further includes the step of:
s210: jointly acquiring, by the visual-inertial joint apparatus, the image data sequence and the inertial data sequence, wherein the image data sequence includes a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with inertial time stamps measured by the inertial measurement unit.
It should be understood that the image timestamp in the image data sequence is the time when the image data is output by the image acquisition unit, which is also called the camera timestamp or camera time; that is, each image data in the image data sequence corresponds to an image timestamp. The inertial timestamp in the inertial data sequence is the time when the inertial measurement unit outputs the inertial data, which is also referred to as the IMU timestamp or IMU time; that is, each inertial data in the inertial data sequence corresponds to an inertial timestamp.
Specifically, as shown in fig. 2, the step S210 includes the steps of:
S211: moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit;
S212: measuring, by the inertial measurement unit, a set of the inertial data over the predetermined time, wherein each of the inertial data includes a measured angular velocity of the inertial measurement unit rotating about a body coordinate system of the inertial measurement unit at the respective inertial timestamp; and
S213: acquiring a set of the image data within the predetermined time by the image acquisition unit, wherein each of the image data includes a target image acquired by photographing the target by the image acquisition unit at the corresponding image timestamp.
More specifically, the step S211 includes the steps of:
designing a predetermined motion trajectory and programming the predetermined motion trajectory into a mechanical arm, so that the visual-inertial combination device is moved by the mechanical arm according to the predetermined motion trajectory to activate the acceleration of the visual-inertial combination device in all directions, so that the inertial measurement unit is activated in all directions.
Preferably, the predetermined motion trajectory is optimized so as to avoid blurring of the target image due to an excessive motion speed of the visual-inertial combination unit while ensuring activation of the inertial measurement unit in all directions.
More preferably, the predetermined time is between 30 seconds and 45 seconds to ensure that the visual-inertial combination device acquires a sufficient amount of the image data sequence and the inertial data sequence. It should be understood that, when data are collected jointly, the various rotations and accelerations of the inertial measurement unit are activated in the shortest possible time through specific velocity changes and pose planning, so that the speed of joint calibration is increased and high stability and high precision of the calibration result are ensured.
Exemplarily, an optimal motion trajectory is first computed and programmed into a mechanical arm, such that within 30-45 s the trajectory excites acceleration in every direction and rotation about every axis of the IMU while the motion speed does not cause image blurring; then, during data collection, a target is statically placed on a calibration table, and the vision-inertia combined device is moved by the mechanical arm according to the programmed motion trajectory to change its pose, while image data and IMU data (inertial data) are collected by the vision-inertia combined device.
According to the preferred embodiment of the present invention, in the step S220, a sequence of visual poses of the image capturing unit is obtained by processing the acquired image data sequence. Here, in the field of image processing, it is common to decompose a homography matrix between the sequence of image data and a virtual target to obtain the sequence of visual poses, wherein the sequence of visual poses includes a set of poses of the image capture unit relative to the virtual target at the time of the image timestamp. It can be understood by those skilled in the art that the coordinates of the corner points on the virtual target can be calculated from the target parameters of the target photographed by the image capturing unit, and will not be described herein again.
Specifically, as shown in fig. 3, the step S220 includes the steps of:
S221: detecting the corner coordinates of each image data in the image data sequence based on the image data sequence;
S222: solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image timestamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and
S223: obtaining the sequence of visual poses by decomposing the homography matrix at each of the image timestamps to obtain the poses of the image capture unit relative to the virtual target at each of the image timestamps, wherein the sequence of visual poses comprises the poses of the image capture unit relative to the virtual target at each of the image timestamps.
Further, the step S220 further includes the step of: calculating the coordinates of the corner points of the virtual target based on the input target parameters.
It is worth mentioning that the image acquisition unit of the visual inertial coupling device may be implemented as, but not limited to, a monocular camera, a binocular camera or a multi-ocular camera. As can be understood by those skilled in the art, since the image timestamp of the image acquisition unit is implemented as a timestamp of the monocular camera when the image acquisition unit is implemented as the monocular camera, it is necessary to calibrate an initial value of a time delay between the image acquisition unit and the inertial measurement unit by analyzing a value of a time delay between the monocular camera and the inertial measurement unit. Accordingly, the image capturing unit captures only one target image at each image time stamp, that is, each image time stamp corresponds to each target image one-to-one, so that the image data sequence includes only one set of the image data captured by the monocular camera and each image data includes one target image corresponding to the corresponding image time stamp, and therefore, all the image data in the image data sequence need to be processed in step S220 to obtain the visual pose sequence of the image capturing unit, wherein the visual pose sequence includes the pose of the monocular camera with respect to the virtual target at each image time stamp.
Here, although the image capturing unit may also be implemented as a binocular camera or a multi-view camera, timestamps between at least two cameras of the binocular camera or the multi-view camera are uniform so that the image timestamp of the image capturing unit is implemented as a timestamp of either one of the binocular camera or the multi-view camera. Therefore, the initial value of the time delay between the image acquisition unit and the inertial measurement unit can be calibrated only by analyzing the time delay value between any one of the binocular camera or the multi-view camera and the inertial measurement unit. Accordingly, when the image capturing unit is implemented as a binocular camera or a multi-view camera, the image capturing unit obtains at least two target images per the image time stamp, that is, each of the image time stamps corresponds to at least two target images, so that the image data sequence includes at least two sets of the image data captured by the binocular camera or the multi-view camera, and each of the image data in each set of the image data includes one target image captured by a corresponding one of the binocular camera or the multi-view camera corresponding to the corresponding image time stamp.
However, since only the time delay value between any one of the binocular camera or the multi-view camera and the inertial measurement unit needs to be analyzed when the image capturing unit is implemented as a binocular camera or a multi-view camera, the initial value of the time delay between the image capturing unit and the inertial measurement unit can be calibrated. In addition, because the relative pose between at least two cameras in the binocular camera or the multi-view camera is not changed, the pose between any camera in the binocular camera or the multi-view camera and the inertial measurement unit can be obtained through the poses between other cameras and the image acquisition unit, and therefore the initial pose value between the image acquisition unit and the inertial measurement unit can be calibrated only by analyzing the pose value between any camera in the binocular camera or the multi-view camera and the inertial measurement unit.
Therefore, in the step S220 of the joint calibration method of the present invention, it is not necessary to process all the image data in the image data sequence, and it is only necessary to process the image data acquired by any one of the binocular camera or the multi-view camera in the image data sequence. This helps to reduce the amount of calculations in the overall calibration process.
Specifically, in the step S220 of the combined calibration method of the present invention: when the image acquisition unit is implemented as a binocular camera or a multi-view camera, processing only a certain set of image data in the sequence of image data to obtain a sequence of visual poses of the image acquisition unit, wherein the sequence of visual poses includes poses of cameras corresponding to the certain set of image data in the image acquisition unit relative to the virtual target at each of the image time stamps.
For example, when the image capture unit is implemented as a binocular camera, first capturing the target at each of the image time stamps by a left camera of the binocular camera to capture a left target image, while capturing the target at each of the image time stamps by a right camera of the binocular camera to capture a right target image, such that the sequence of image data includes a set of left image data and a set of right image data, wherein each of the left image data includes the left target image captured by the left camera at the corresponding image time stamp, and the right image data includes the right target image captured by the right camera at the corresponding image time stamp; all of the left image data in the sequence of image data is then processed to obtain a sequence of visual poses of the binocular cameras, wherein the sequence of visual poses includes a pose of the left camera in the binocular cameras relative to the virtual target at each of the image timestamps to reduce the amount of computation by one-half. It is to be appreciated that in other examples of the invention, only all of the right image data in the sequence of image data may be processed to obtain a sequence of visual poses of the binocular cameras, wherein the sequence of visual poses includes the pose of the right camera of the binocular cameras relative to the virtual target at each of the image timestamps.
Next, the characteristics of step S220 in the joint calibration method according to the present invention will be described by taking the processing of the left image data acquired by the left camera of the binocular camera as an example.
In detail, first, the corner point coordinates $p_{wi} = (x_{wi}, y_{wi})$ of the virtual target are calculated according to the input target parameters; then, the corner point coordinates $p_{ci} = (x_{ci}, y_{ci})$ of the left target image are detected from the collected left image data; next, the corresponding pairs of corner point coordinates of the virtual target and the left target image are substituted into a plane transformation model (such as Equation 1) to obtain a system of linear equations, which is solved to obtain the homography matrix $H$ between each left image and the virtual target; finally, each homography matrix $H$ is decomposed to obtain the pose of the left camera relative to the virtual target at each image timestamp. When all the homography matrices $H$ solved from the left image data are decomposed, the poses $T_{w,cl}$ of the left camera at all image timestamps are obtained. By expressing the rotation of each left camera pose $T_{w,cl}$ with a rotation vector, the visual pose sequence $p$ can be derived, wherein the visual pose sequence $p$ comprises, at each image timestamp $t_{i}$, the pose of the left camera relative to the virtual target; the first three components are position coordinates and the last three are a rotation vector.
Specifically, the plane transformation model between the target image plane and the virtual target plane is:

$$s\,p_{ci}^{T} = H\,(x_{wi},\ y_{wi},\ 1)^{T} \qquad (1)$$

In the formula: $H$ represents the mapping matrix (i.e., the homography matrix) between the target image plane and the virtual target plane; the superscript $T$ denotes the transpose; $s$ is a non-zero scale factor; $p_{ci} = (x_{ci}, y_{ci}, 1)$ represents the homogeneous coordinates of the $i$-th corner point in the target image plane; and $p_{wi} = (x_{wi}, y_{wi})$ is the coordinate of the corresponding $i$-th corner point in the virtual target plane.
According to the preferred embodiment of the present invention, as shown in fig. 4, the step S230 of the joint calibration method includes the steps of:
S231: substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
S232: solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
S233: after obtaining the spline parameters, solving for the sequence of visual computation data by processing the spline curve segment fitting model, wherein the sequence of visual computation data includes a set of active rotation angular velocities corresponding to the inertial timestamps.
Specifically, the spline curve segment fitting model is:

$$c(u) = \sum_{j=i-k+1}^{i} B_{j}(u)\,V_{j}, \qquad \big[B_{i-k+1}(u),\ \dots,\ B_{i}(u)\big] = \big[1,\ u,\ u^{2},\ \dots,\ u^{k-1}\big]\,M \qquad (2)$$

In the formula: $B$ represents a basis function; $c$ represents a spline curve segment formed by a linear combination of the basis functions $B$; $t$ represents a normalized timestamp; the knots $t_{i}$ are obtained by evenly dividing the range of the normalized image timestamps; $u$ is obtained by mapping the normalized timestamp $t$ into its knot interval $[t_{i}, t_{i+1})$; $M$ is the basis matrix used to compute the spline; $N$ is an element of the matrix $M$; $k$ represents the highest degree of the basis functions $B$ constituting the curve; and $V$ represents the control vector (i.e., the spline parameter vector).

It will be appreciated that the B-spline is made up of a series of curve segments $c$ that are linear combinations of the basis functions $B$. The shape of the spline is determined by three quantities, namely the knot vector, the degree, and the control vector. The knot vector $\{t_{0}, t_{1}, t_{2}, \dots, t_{m}\}$ is obtained by evenly dividing the normalized image timestamp range, namely $t_{0}=0$, $t_{m}=1$, and $t_{1}-t_{0}=t_{i+1}-t_{i}$; the degree $k$ represents the highest degree of the basis functions constituting the curve; and the control vector $V$ represents the combining weights when the basis functions are linearly combined.

Thus, assuming that the normalized timestamp of a certain visual pose is $t$, mapping $t$ into the knot interval $[t_{i}, t_{i+1})$ in which it lies yields $u$ in Equation 2. From the definition of the basis function $B$ (Equation 2.1), the basis function $B$ is only related to $u$ and $M$, and the element $N$ of the matrix $M$ is obtained recursively from the de Boor-Cox iteration formula (Equations 2.3 and 2.4) and the spline basis transformation (Equation 2.5), and is thus also a quantity related only to the timestamps. In addition, Equation 2 shows that the spline curve segment in a knot interval is only related to the $(i-k+1)$-th to $i$-th basis functions. Therefore, the existing visual pose sequence $p$ is substituted into Equation 2, and the normalized timestamp sequence corresponding to the visual pose sequence is converted into $u$ to compute the basis functions $B$, forming a system of linear equations with the control vector $V$ as the unknown. This linear equation system is then solved using the linear solving library to obtain the spline parameters; in other words, the discrete pose sequence is expressed as a continuous curve, from which the pose at any time can be obtained.
According to the composition analysis of the pose, under pure vision the angular velocity in the virtual calibration-plate coordinate system is directly related to the first derivative of the last three dimensions of the pose computed from the spline; that is, by differentiating each element of $U = [1,\ u,\ u^{2},\ \dots,\ u^{k-1}]$, the passive rotation angular velocity of the image acquisition unit rotating around the body coordinate system of the virtual target is obtained. The obtained passive rotation angular velocity is then converted into the active rotation angular velocity of the image acquisition unit rotating around its own body coordinate system.
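A simplified sketch of this spline step is given below. It uses SciPy's per-dimension B-spline fit in place of the patent's explicit matrix formulation (Equation 2), which is an assumption about implementation rather than the patent's own construction, but it reproduces the same idea: fit the discrete visual poses to a continuous curve, evaluate it at the inertial timestamps, and differentiate the rotation-vector part to obtain the passive rotation angular velocity.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

def fit_pose_spline(image_timestamps, visual_poses, degree=3):
    """Fit each of the 6 pose dimensions (3 position + 3 rotation-vector) with a B-spline."""
    t = np.asarray(image_timestamps, dtype=float)
    t_norm = (t - t[0]) / (t[-1] - t[0])                   # normalised timestamps in [0, 1]
    return make_interp_spline(t_norm, np.asarray(visual_poses), k=degree)

def passive_angular_velocity(spline, image_timestamps, imu_timestamps):
    """First derivative of the rotation-vector part of the spline at the inertial timestamps."""
    t0, t1 = image_timestamps[0], image_timestamps[-1]
    u = (np.asarray(imu_timestamps) - t0) / (t1 - t0)
    u = np.clip(u, 0.0, 1.0)                               # stay inside the fitted interval
    d_pose = spline.derivative()(u) / (t1 - t0)            # chain rule: d/dt = (1/(t1-t0)) d/du
    return d_pose[:, 3:]                                   # last three dimensions

# Synthetic example (values assumed): 20 Hz visual poses, 200 Hz IMU timestamps.
cam_t = np.arange(0.0, 2.0, 0.05)
poses = np.column_stack([np.sin(cam_t), np.cos(cam_t), cam_t,               # position
                         0.3 * np.sin(2 * cam_t), 0.1 * cam_t, 0 * cam_t])  # rotation vector
imu_t = np.arange(0.0, 2.0, 0.005)
spline = fit_pose_spline(cam_t, poses)
w_passive = passive_angular_velocity(spline, cam_t, imu_t)   # (400, 3), one sample per IMU timestamp
```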
Specifically, as shown in fig. 4, the step S233 further includes the steps of:
S2331: obtaining a passive rotation angular velocity at each inertial timestamp by solving the first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is the angular velocity at which the image acquisition unit rotates around the body coordinate system of the virtual target; and
S2332: converting the passive rotation angular velocity into the active rotation angular velocity, wherein each of the active rotation angular velocities is the angular velocity at which the image acquisition unit rotates around the body coordinate system of the image acquisition unit at each of the inertial timestamps.
In other words, the passive rotation angular velocity is converted into the active rotation angular velocity by an angular velocity conversion model, so that the obtained vision calculation data sequence comprises a set of active rotation angular velocities corresponding to the inertial timestamps, wherein each active rotation angular velocity is the angular velocity at which the image acquisition unit rotates around its own body coordinate system at the corresponding inertial timestamp. It should be appreciated that the alignment of the image data sequence with the inertial data sequence is obtained by spline fitting and spline derivation, so that image data and inertial data with a large frame-rate difference can be aligned at the higher frame rate (the inertial timestamps).
More specifically, the passive rotation angular velocity is converted in the step S2332 by a passive rotation angular velocity conversion model, which is as follows:
In the formula: w_prediction is the active rotation angular velocity; the model further involves the first derivative of the rotation vector, the angle of the rotation vector, the axis representation a of the rotation vector (i.e., the rotation vector normalized by its angle), and the skew-symmetric matrix a× formed from the elements of a.
Exemplarily, taking the left camera of a binocular camera as an example, for any inertial timestamp the corresponding u value can be obtained according to the spline node division in step S230; substituting u into the spline curve segment fitting model (formula 2) obtained in step S230 gives the spline expression, from which the visual pose at that inertial timestamp can be obtained; the spline expression is then differentiated at the u value to obtain the first derivative of the visual pose, in which the first three dimensions are the linear velocity of the left camera and the last three dimensions are the first derivative of the rotation of the left camera around the virtual target coordinate system, i.e., the passive rotation angular velocity; finally, the passive rotation angular velocity can be converted by the angular velocity conversion model (formula 3) into the active rotation angular velocity w_prediction of the left camera rotating around its own body coordinate system.
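The conversion can also be sketched numerically. Instead of the closed-form angular velocity conversion model (formula 3), the following sketch approximates the active (body-frame) angular velocity by a finite difference of the rotation part of the fitted spline; `pose_spline` and `t_range` come from the previous sketch, and the assumption that the last three pose dimensions form a rotation vector of the camera relative to the virtual target is carried over.

import numpy as np
from scipy.spatial.transform import Rotation

def active_angular_velocity(pose_spline, t_range, inertial_times, eps=1e-4):
    # Finite-difference approximation of the body-frame (active) angular velocity
    # from the rotation-vector part of the fitted pose spline.
    t0, t1 = t_range
    u = (np.asarray(inertial_times, dtype=float) - t0) / (t1 - t0)
    rotvec_a = pose_spline(u)[:, 3:]          # last three dims: rotation vector
    rotvec_b = pose_spline(u + eps)[:, 3:]
    dt = eps * (t1 - t0)                      # normalized step converted back to seconds
    R_a = Rotation.from_rotvec(rotvec_a)
    R_b = Rotation.from_rotvec(rotvec_b)
    # R(t + dt) ~= R(t) * Exp(w_body * dt)  =>  w_body ~= log(R(t)^-1 * R(t + dt)) / dt
    return (R_a.inv() * R_b).as_rotvec() / dt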
According to the step S240 of the preferred embodiment of the present invention, after the vision calculation data sequence is obtained, an initial value of a time delay between the image acquisition unit and the inertia measurement unit is obtained by performing a cross-correlation degree analysis on the obtained inertia data sequence and the vision calculation data sequence. Here, the inertial data sequence includes a set of measured angular velocities corresponding to the inertial time stamps measured by the inertial measurement unit, wherein each of the measured angular velocities is an angular velocity at which the image acquisition unit rotates about a body coordinate system of the inertial measurement unit at each inertial time stamp.
Specifically, as shown in fig. 5A, the step S240 includes the steps of:
S241: obtaining a correlation value between each of the measured angular velocities of the inertial data sequence and each of the active rotation angular velocities of the vision calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence comprises a set of the correlation values corresponding to correlation coefficients;
S242: obtaining an optimal correlation coefficient by comparing the magnitudes of all the correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and
S243: solving a time delay value between the image acquisition unit and the inertial measurement unit based on the optimal correlation coefficient, to serve as the initial time delay value.
For example, the correlation value between the measured angular velocity and the active rotation angular velocity is generally obtained by a cross-correlation coefficient model (formula 4), wherein the cross-correlation coefficient model is as follows:
corr(1) = w_measurement(1) * w_prediction(M)
corr(2) = w_measurement(1) * w_prediction(M-1) + w_measurement(2) * w_prediction(M)
...
corr(2M-1) = w_measurement(M) * w_prediction(1)    (formula 4)
In the formula: corr represents the correlation value; w_measurement represents the measured angular velocity in the inertial data sequence; w_prediction represents the active rotation angular velocity in the vision calculation data sequence; M represents the sequence length.
It should be understood that the length of the inertial data sequence is the number of the measured angular velocities in the inertial data sequence; the length of the vision calculation data sequence is the number of the active rotation angular velocities in the vision calculation data sequence. In addition, when the length of the inertial data sequence and the length of the visual calculation data sequence are both M, the correlation coefficient id of the cross-correlation coefficient model is between 0 and 2M-1.
Then, the magnitudes of the calculated correlation values are compared to pick out the largest correlation value, and the correlation coefficient id corresponding to the largest correlation value is taken as the optimal correlation coefficient x. Of course, the approximate range of the correlation coefficient id corresponding to the maximum correlation (i.e., the optimal correlation coefficient x) can also be read from the cross-correlation coefficient variation graph (as shown in fig. 5B); for example, in the cross-correlation coefficient variation graph shown in fig. 5B, the optimal correlation coefficient x lies between 6000 and 8000.
Finally, the initial time delay value d_0 is solved through an initial time delay value solving model, which is as follows:
In the formula: d_0 represents the initial value of the time delay; x is the optimal correlation coefficient; M is the sequence length; the remaining quantity in the model represents the time interval between two adjacent frames of images in the image data sequence.
It will be appreciated that, from the solution process of the cross-correlation coefficient, in the case of a perfect alignment of the two sequence time stamps, the maximum correlation value between the inertial data sequence and the visual calculation data sequence should correspond to the case of a perfect overlap of the two sequences, i.e. a correlation coefficient id of M-1, i.e. the optimum correlation coefficient x of M-1.
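A minimal sketch of this delay initialization follows, assuming the three-axis angular-velocity sequences are reduced to scalar signals through their norms (the text only specifies the scalar correlation form, so this reduction is an assumption). NumPy's full cross-correlation has length 2M-1 and places full overlap at index M-1, matching the reasoning above; the name `sample_interval` stands for the time interval between two adjacent frames described in the model and is an assumed parameter name.

import numpy as np

def initial_time_delay(w_measurement, w_prediction, sample_interval):
    # w_measurement, w_prediction: M x 3 angular-velocity sequences on the
    # same inertial timestamps, reduced to scalar signals via their norms.
    a = np.linalg.norm(np.asarray(w_measurement, dtype=float), axis=1)
    b = np.linalg.norm(np.asarray(w_prediction, dtype=float), axis=1)
    corr = np.correlate(a, b, mode="full")    # formula 4: length 2M - 1
    x = int(np.argmax(corr))                  # optimal correlation coefficient
    # Perfectly aligned sequences overlap fully at index M - 1, so the offset
    # (x - (M - 1)) scaled by the sampling interval gives the initial delay d0.
    # Depending on the argument order the sign may need to be flipped.
    return (x - (len(a) - 1)) * sample_interval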
According to the joint calibration method of the preferred embodiment of the present invention, in step S250, a pose initial value between the image acquisition unit and the inertial measurement unit is obtained by minimizing the error between the vision calculation data sequence and the inertial data sequence. Here, each active rotation angular velocity in the vision calculation data sequence is the angular velocity at which the image acquisition unit rotates around the body coordinate system of the image acquisition unit at each inertial timestamp, whereas each measured angular velocity in the inertial data sequence is the angular velocity at which the image acquisition unit rotates around the body coordinate system of the inertial measurement unit at each inertial timestamp; that is, the active rotation angular velocities of the vision calculation data sequence and the measured angular velocities of the inertial data sequence refer to different body coordinate systems. Therefore, when minimizing the error between the vision calculation data sequence and the inertial data sequence, each active rotation angular velocity of the vision calculation data sequence must first be converted into an angular velocity of rotation around the body coordinate system of the inertial measurement unit, to obtain a converted vision calculation data sequence, wherein the converted vision calculation data sequence comprises a set of converted angular velocities, each of which is the angular velocity at which the image acquisition unit rotates around the body coordinate system of the inertial measurement unit at the corresponding inertial timestamp.
Specifically, as shown in fig. 6, the step S250 includes the steps of:
S251: converting each of the active rotation angular velocities of the vision calculation data sequence into an angular velocity of rotation around the body coordinate system of the inertial measurement unit to obtain a converted vision calculation data sequence;
S252: differencing each measured angular velocity in the inertial data sequence with the corresponding converted angular velocity in the converted vision calculation data sequence to obtain an error term between each measured angular velocity and the corresponding converted angular velocity; and
S253: solving, with a linear solution library, the rotation value from the inertial measurement unit to the image acquisition unit corresponding to the minimized error, to serve as the rotation initial value of the pose initial value between the image acquisition unit and the inertial measurement unit.
It is understood that the error between the vision calculation data sequence and the inertial data sequence is the sum of the error terms between each converted angular velocity in the converted vision calculation data sequence and the corresponding measured angular velocity in the inertial data sequence. In addition, the pose initial value includes a position initial value and a rotation initial value; the rotation initial value is obtained in step S253 of the present invention, while the position initial value is directly assigned a preset initial value. For example, the position initial value may be, but is not limited to being, assigned to 0, and the present invention is not limited in this respect.
More specifically, when converting each of the active rotation angular velocities of the vision calculation data sequence into an angular velocity rotating around the body coordinate system of the inertial measurement unit, the active rotation angular velocity needs to be converted by an active rotation angular velocity conversion model, wherein the active rotation angular velocity conversion model is as follows:
w' = T × w_prediction    (formula 5)
In the formula: w' is the converted angular velocity; w_prediction is the active rotation angular velocity; T is the rotation value from the inertial measurement unit to the image acquisition unit.
Subsequently, an error term between each of the measured angular velocities and the corresponding converted angular velocity is obtained by an error term construction model, wherein the error term construction model is as follows:
e = w_measurement - w'    (formula 6)
In the formula: e is the error term between the measured angular velocity and the converted angular velocity; w_measurement represents the measured angular velocity; w' is the converted angular velocity.
Finally, the sum of all the error terms is solved with respect to the corresponding rotation value T using a linear solving library, and the value corresponding to the minimum sum of error terms is taken as the pose initial value T_0 between the image acquisition unit and the inertial measurement unit.
Exemplarily, taking the left camera of the binocular camera as an example, the angular velocity of the vision calculation data sequence rotating around the body coordinate system of the left camera (the active rotation angular velocity) is first converted, using formula 5, into the angular velocity rotating around the body coordinate system of the inertial measurement unit (the converted angular velocity), to obtain the converted vision calculation data sequence; then, the error term e between each measured angular velocity and the corresponding converted angular velocity is obtained using formula 6; finally, the rotation value between the inertial measurement unit and the left camera corresponding to the smallest sum of error terms is solved with a linear solution library, giving the rotation initial value of the pose initial value T_0 between the image acquisition unit and the inertial measurement unit.
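The rotation initial value can be sketched as follows. The embodiment minimizes the error terms of formula 6 with a linear solving library; the sketch below instead solves the equivalent least-squares rotation alignment in closed form via SVD (the Kabsch/Wahba solution), which is a substitute technique rather than necessarily the method of the embodiment.

import numpy as np

def rotation_initial_value(w_measurement, w_prediction):
    # Rotation T minimizing sum_i || w_measurement_i - T @ w_prediction_i ||^2
    # (closed-form Kabsch/Wahba solution via SVD).
    A = np.asarray(w_measurement, dtype=float)    # N x 3 measured angular velocities
    B = np.asarray(w_prediction, dtype=float)     # N x 3 active rotation angular velocities
    H = B.T @ A                                   # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T                         # rotation initial value (IMU <- camera)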
It is worth mentioning that, after the initial time delay value d_0 and the pose initial value T_0 are obtained, the joint calibration method of the preferred embodiment of the present invention can further calibrate the optimal time delay and the optimal pose between the image acquisition unit and the inertial measurement unit.
Specifically, as shown in fig. 1, the joint calibration method further includes the steps of:
S260: obtaining the optimal time delay and the optimal pose between the image acquisition unit and the inertial measurement unit through bundle set optimization, based on the initial time delay value and the pose initial value.
It should be noted that solving the correlation in step S240 provides a more reliable initial time delay value for the bundle set optimization of the time delay parameter between the image acquisition unit and the inertial measurement unit, which helps the bundle set optimization obtain a stable and reliable calibration result (i.e., the optimal time delay). In addition, by minimizing the error between the vision calculation data sequence and the inertial data sequence in step S250, the pose parameter between the image acquisition unit and the inertial measurement unit is optimized in advance, providing a reliable pose initial value for the bundle set optimization of the pose parameter, so that the bundle set optimization algorithm, which is sensitive to initial values, obtains a more stable and reliable calibration result (i.e., the optimal pose). As will be understood by those skilled in the art, bundle set optimization refers to simultaneously optimizing all objects to be optimized using all of the image information.
Specifically, as shown in fig. 7, the step S260 includes the steps of:
S261: obtaining the pose from the virtual target to the inertial measurement unit based on the inertial timestamp and the initial time delay value;
S262: transforming the world corner coordinates of the virtual target through an observation model, based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, to obtain the corner coordinates of a new target image; and
S263: solving the errors between the corner coordinates of the new target image and the corner coordinates of the target images in the image data sequence through an error model, and bundle-optimizing all the errors to obtain the optimal time delay and the optimal pose.
Illustratively, based on the initial time delay value d_0 and the pose initial value T_0 solved in the previous steps, an error model composed of the following formula 7 and formula 8 can be constructed according to the observation equation, so that an error err is obtained for each corner point on each frame of image; then, all the errors err are optimized through bundle set optimization to obtain the optimal time delay d_B and the optimal pose T_B.
In particular, the error model is as follows:
p'_c = h(T^T T_i,w(t + d) P_w)    (formula 7)
In the formula: p'_c is the corner coordinates of the new target image; h is the observation model; T is the pose parameter between the image acquisition unit and the inertial measurement unit; the superscript T denotes transposition; T_i,w is the pose from the virtual target to the inertial measurement unit; t is the inertial timestamp; d is the time delay parameter between the image acquisition unit and the inertial measurement unit; P_w represents the world corner coordinates (i.e., the three-dimensional coordinates of the corner points) in the virtual target.
err = p_c - p'_c    (formula 8)
In the formula: err is the error; p_c is the corner coordinates of the target images in the image data sequence; p'_c is the corner coordinates of the new target image.
It should be understood that formula 7 expresses the following: first, the three-dimensional corner coordinates P_w in the virtual target are transformed into the body coordinate system of the inertial measurement unit through the pose transformation T_i,w; they are then transformed into the body coordinate system of the image acquisition unit through the pose transformation T; finally, the three-dimensional points in space are re-projected onto the camera at each pose through the camera observation model h to obtain the new image coordinates p'_c. Formula 8 expresses that the error between the corner coordinates p'_c of the new target image obtained by re-projection and the corner coordinates p_c detected from the captured image is solved.
Here, the initial value of the pose parameter T between the image acquisition unit and the inertial measurement unit is the pose initial value T_0 calculated in step S250; the initial value of the pose T_i,w from the virtual target to the inertial measurement unit is set to the camera pose, where the camera pose is obtained by first computing the u value of formula (2.1) from the inertial timestamp t and the time delay parameter d, and then substituting u into the spline curve of step S220; the initial value of the time delay parameter d is the initial time delay value d_0 calculated in step S240. P_w is a point in the world coordinate system; for example, the whole calibration process takes the lower left corner of the virtual target constructed from the target as the origin of the world coordinate system, so the three-dimensional coordinates of any corner point on the virtual target are known.
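A sketch of the per-corner residual built from formulas 7 and 8 is given below; the residuals could then be handed to a generic nonlinear least-squares or bundle-optimization solver. The helpers `pose_target_to_imu` (T_i,w evaluated from the spline at the shifted timestamp) and `project` (the pinhole observation model h), as well as the rotation-vector parameterization of T, are assumptions for illustration only.

import numpy as np
from scipy.spatial.transform import Rotation

def project(K, points_cam):
    # Pinhole observation model h: camera-frame 3D points -> pixel coordinates.
    uvw = (K @ points_cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]

def reprojection_residual(params, t, P_w, p_c, K, pose_target_to_imu):
    # params = [rotation vector of T (3), translation of T (3), time delay d (1)].
    R_T = Rotation.from_rotvec(params[:3]).as_matrix()
    t_T = params[3:6]
    d = params[6]
    R_iw, t_iw = pose_target_to_imu(t + d)        # T_i,w evaluated at the shifted timestamp
    P_imu = (R_iw @ P_w.T).T + t_iw               # target corners -> IMU body frame
    P_cam = (R_T.T @ (P_imu - t_T).T).T           # IMU body frame -> camera frame (uses T^T)
    p_proj = project(K, P_cam)                    # formula 7: new corner coordinates p'_c
    return (p_c - p_proj).ravel()                 # formula 8: err = p_c - p'_c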
Furthermore, as can be seen from the error model, the camera parameters in the observation model h could also be further adjusted; however, since camera calibration techniques are relatively mature, fixed camera parameters can be used, and only the time delay d and the pose T between the image acquisition unit and the inertial measurement unit are optimized in step S260.
Therefore, according to the preferred embodiment of the present invention, the joint calibration method may further calibrate the image capturing unit to obtain the internal and external parameters of the image capturing unit, so as to determine the observation model h in step S260. Specifically, as shown in fig. 1, the joint calibration method further includes the steps of:
S200: obtaining the camera parameters of the image acquisition unit by calibrating the image acquisition unit.
It should be understood that step S200 may be performed before step S210, or between step S210 and step S260. In addition, when the image acquisition unit is a monocular camera, monocular calibration needs to be performed on the image acquisition unit to obtain the camera intrinsic parameters of the image acquisition unit; when the image acquisition unit is a binocular camera, binocular calibration needs to be performed on the image acquisition unit to obtain the left and right camera intrinsic parameters and the left-right camera extrinsic parameters of the image acquisition unit.
Illustratively, taking a binocular camera as an example, a non-reflective, diffusely reflective high-precision target is customized, the visual-inertial combination device is kept statically placed on a support frame, the pose of the target is changed, and 10-20 fixed-point images are collected, such that the positions of the target in the target images cover a nine-square grid and the angles and scales of the target differ as much as possible from image to image. Binocular calibration is then performed to obtain the left and right camera intrinsic parameters K_l, K_r, the left and right camera distortions d_l, d_r, and the left-right camera extrinsic parameter T_r,l, so that the observation model of the image acquisition unit is determined by the left and right camera parameters.
It is worth noting that, when performing the binocular calibration, a high-precision calibration plate is used as the target and the fixed-point shooting method is adopted, so that the movement, rotation and coverage of the calibration plate are sufficient and the calibration data do not cause overfitting; this speeds up the calibration and provides good conditions for the accuracy of the subsequent joint calibration. In addition, in the joint calibration process, an AprilTag target board, which is superior to a checkerboard, is preferably adopted as the target of the present invention; through the precise size information of the AprilTag target, the mapping relation of the two-dimensional point pairs, and the transformation principle of the Euclidean coordinate system, the intrinsic and extrinsic parameters between the binocular cameras are accurately estimated, so that the calibration result obtained by the joint calibration method has higher precision.
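For reference, a minimal sketch of this binocular calibration step using OpenCV is shown below, assuming the target corners have already been detected in each of the 10-20 fixed-point image pairs; the variable names and the choice of flags are assumptions, and the AprilTag detection itself is outside this sketch.

import cv2

def binocular_calibrate(object_points, corners_left, corners_right, image_size):
    # object_points / corners_*: lists (one entry per image pair) of N x 3 / N x 2 float32 arrays.
    # 1. Intrinsics and distortion of each camera separately.
    _, K_l, d_l, _, _ = cv2.calibrateCamera(object_points, corners_left, image_size, None, None)
    _, K_r, d_r, _, _ = cv2.calibrateCamera(object_points, corners_right, image_size, None, None)
    # 2. Left-right extrinsics with the intrinsics held fixed.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6)
    _, K_l, d_l, K_r, d_r, R, T, E, F = cv2.stereoCalibrate(
        object_points, corners_left, corners_right,
        K_l, d_l, K_r, d_r, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC, criteria=criteria)
    return K_l, d_l, K_r, d_r, R, T               # (R, T) is the left-right extrinsic T_r,l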
Schematic device
Fig. 8 is a block diagram schematically illustrating a combined calibration apparatus according to the preferred embodiment of the present invention. As shown in fig. 8, the combined calibration apparatus 300 according to the preferred embodiment of the present invention includes: a visual pose obtaining unit 320, configured to obtain a visual pose sequence of the image acquisition unit by processing the acquired image data sequence; a spline curve fitting unit 330 for obtaining a visual calculation data sequence corresponding to the inertial time stamp of the inertial measurement unit by fitting the visual pose sequence to a continuous spline curve; a cross-correlation degree analysis unit 340, configured to perform cross-correlation degree analysis on the obtained inertial data sequence and the visual calculation data sequence to obtain an initial value of a time delay between the image acquisition unit and the inertial measurement unit; and a minimized error analyzing unit 350 for obtaining an initial value of a pose between the image capturing unit and the inertial measurement unit by analyzing a minimized error between the vision calculation data and the inertial data sequence.
Further, as shown in fig. 8, the joint calibration apparatus 300 further includes a joint acquisition unit 310 for jointly acquiring the image data sequence and the inertial data sequence by the visual-inertial joint apparatus, wherein the image data sequence includes a set of image data with image time stamp acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with inertial time stamp measured by the inertial measurement unit.
Preferably, as shown in fig. 8, the joint calibration apparatus 300 further includes a bundling optimization unit 360, configured to obtain an optimal time delay and an optimal pose between the image capturing unit and the inertial measurement unit through bundling optimization based on the initial time delay value and the initial pose value.
More preferably, as shown in fig. 8, the combined calibration apparatus 300 further includes a camera parameter calibration unit 370, configured to obtain camera parameters of the image capturing unit by calibrating the image capturing unit.
In an example, the joint obtaining unit 310 of the joint calibration apparatus 300 is configured to: moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit; measuring, by the inertial measurement unit, a set of the inertial data over the predetermined time to obtain the inertial data sequence, wherein each of the inertial data comprises an angular velocity of the inertial measurement unit rotating about a body coordinate system of the inertial measurement unit at the corresponding inertial timestamp; and acquiring, by the image acquisition unit, a set of the image data within the predetermined time to acquire the image data sequence, wherein each of the image data includes a target image acquired by photographing the target by the image acquisition unit at the corresponding image time stamp.
In an example, the joint obtaining unit 310 of the joint calibration apparatus 300 is further configured to: designing a preset motion track and coding the preset motion track to a mechanical arm, so that the visual inertial combination unit is moved by the mechanical arm according to the preset motion track to activate the acceleration of the visual inertial combination unit in all directions, and the inertial measurement unit is activated in all directions.
In an example, the joint obtaining unit 310 of the joint calibration apparatus 300 is further configured to: optimize the predetermined motion trajectory to avoid blurring of the target image caused by too fast a movement speed of the visual-inertial combination device, while ensuring that the inertial measurement unit is activated in all directions.
In an example, the visual pose obtaining unit 320 of the joint calibration apparatus 300 is configured to: detecting the corner coordinates of each image data in the image data sequence based on the image data sequence; solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and obtaining the pose of the image capture unit relative to the virtual target at each of the image timestamps by decomposing the homography matrix at each of the image timestamps to obtain the sequence of visual poses, wherein the sequence of visual poses comprises the pose of the image capture unit relative to the virtual target at each of the image timestamps.
In an example, the visual pose obtaining unit 320 of the joint calibration apparatus 300 is further configured to: and calculating the corner point coordinates of the virtual target based on the input target parameters.
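A sketch of the visual pose computation performed by this unit follows. The description above solves and decomposes a homography; the sketch substitutes OpenCV's planar PnP solver, which recovers the same camera-target pose from the virtual-target corner coordinates and the detected image corners, so it should be read as an equivalent alternative rather than the unit's exact procedure. `K` and `dist` are assumed to be the camera parameters from the calibration step.

import cv2
import numpy as np

def visual_pose(target_corners_3d, image_corners_2d, K, dist):
    # Pose of the virtual target expressed in the camera frame (rotation vector, translation).
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(target_corners_3d, dtype=np.float64),
        np.asarray(image_corners_2d, dtype=np.float64),
        K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PnP failed for this frame")
    return rvec.ravel(), tvec.ravel()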
In one example, the spline curve fitting unit 330 of the joint calibration apparatus 300 is configured to: substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set; solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and after the spline parameters are obtained, processing the spline curve segment fitting model to solve the visual calculation data sequence.
In one example, the spline curve fitting unit 330 of the joint calibration apparatus 300 is further configured to: obtaining a passive rotation angular velocity at each inertial timestamp by solving a first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is an angular velocity at which the image acquisition unit rotates around a body coordinate system of the virtual target; and converting the passive rotational angular velocity into the active rotational angular velocity to obtain the sequence of vision calculation data.
In an example, the cross-correlation degree analyzing unit 340 of the joint calibration apparatus 300 is configured to: obtaining a correlation value between each of the measured angular velocities of the inertial data sequence and each of the active rotational angular velocities of the vision calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence includes a set of the correlation values corresponding to correlation coefficients; obtaining an optimal correlation coefficient by comparing the magnitudes of all correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and solving a time delay value between the image acquisition unit and the inertia measurement unit based on the optimal correlation coefficient to serve as the initial time delay value.
In one example, the minimized error analysis unit 350 of the joint calibration apparatus 300 is configured to: converting each of the active rotational angular velocities of the vision calculation data sequence into an angular velocity of rotation about a body coordinate system of the inertial measurement unit to obtain a converted vision calculation data sequence; differencing each said measured angular velocity in said inertial data sequence with a corresponding said converted angular velocity in said converted vision calculation data sequence to obtain an error term between each said measured angular velocity and a corresponding said converted angular velocity; and solving a rotation value from the inertial measurement unit to the image acquisition unit corresponding to the minimum error by using a linear solution library to serve as a rotation initial value of the pose initial value between the image acquisition unit and the inertial measurement unit.
In one example, the bundle optimization unit 360 of the joint calibration apparatus 300 is configured to: obtaining the pose from the virtual target to the inertial measurement unit based on the inertial timestamp and the initial time delay value; based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, transforming the world corner coordinates of the virtual target through an observation model to obtain the corner coordinates of a new target image; and solving errors between the corner coordinates of the new target image and the corner coordinates of the target image in the inertial data sequence through an error model, and bundling and optimizing all the errors to obtain the optimal time delay and the optimal pose.
In one example, the camera parameter calibration unit 370 of the joint calibration apparatus 300 is configured to: and performing monocular calibration on the image acquisition unit to obtain the camera internal reference of the image acquisition unit.
In one example, the camera parameter calibration unit 370 of the joint calibration apparatus 300 is configured to: and carrying out binocular calibration on the image acquisition unit to obtain left and right camera internal parameters and left and right camera external parameters of the image acquisition unit.
Illustrative electronic device
Next, an electronic apparatus according to the preferred embodiment of the present invention is described with reference to fig. 9.
As shown in fig. 9, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 11 to implement the joint calibration method of the various embodiments of the present invention described above and/or other desired functions.
In one example, as shown in fig. 9, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
The input means 13 may be, for example, a keyboard or the like for inputting the parameters of a target.
The output device 14 can output various information including the calibration result and the like to the outside. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 10 relevant to the present invention are shown in fig. 9, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Illustrative computer program product
In addition to the above-described methods and apparatus, embodiments of the present invention may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform some of the steps of the joint calibration method according to various embodiments of the present invention described in the "exemplary methods" section above of this specification.
The computer program product may write program code for carrying out operations of embodiments of the present invention in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++ and the like, as well as conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform some of the steps of the joint calibration method according to various embodiments of the present invention described in the above "exemplary method" section of this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above with reference to specific embodiments, but it should be noted that the advantages, effects, etc. mentioned in the present invention are only examples and are not limiting, and the advantages, effects, etc. must not be considered to be possessed by various embodiments of the present invention. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the invention is not limited to the specific details described above.
The block diagrams of devices, apparatuses, and systems involved in the present invention are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. As used herein, the word "or" refers to, and is used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to."
It should also be noted that in the apparatus, devices and methods of the present invention, the components or steps may be broken down and/or re-combined. These decompositions and/or recombinations are to be regarded as equivalents of the present invention.
It will be appreciated by persons skilled in the art that the embodiments of the invention described above and shown in the drawings are given by way of example only and are not limiting of the invention. The objects of the invention have been fully and effectively accomplished. The functional and structural principles of the present invention have been shown and described in the examples, and any variations or modifications of the embodiments of the present invention may be made without departing from the principles.
Claims (32)
1. A combined calibration method is used for calibrating a vision inertia combined device, wherein the vision inertia combined device comprises an image acquisition unit and an inertia measurement unit, and is characterized by comprising the following steps:
processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit;
fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertia timestamp of the inertia measurement unit;
analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
obtaining a pose initial value between the image acquisition unit and the inertial measurement unit by analyzing a minimized error between the vision calculation data and the inertial data sequence;
the combined calibration method further comprises the following steps: jointly acquiring the image data sequence and the inertial data sequence by the vision-inertia joint device, wherein the image data sequence comprises a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence comprises a set of inertial data with the inertial time stamps measured by the inertial measurement unit;
wherein, the image data sequence and the inertial data sequence are jointly acquired by the vision-inertia joint apparatus, wherein the image data sequence includes a set of image data with image time stamp collected by the image collecting unit, and the inertial data sequence includes a set of inertial data with the inertial time stamp measured by the inertial measuring unit, including the steps of:
moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit;
measuring, by the inertial measurement unit, a set of the inertial data within the predetermined time, wherein each of the inertial data includes a measured angular velocity of the inertial measurement unit rotating around a body coordinate system of the inertial measurement unit at the corresponding inertial timestamp; and
and acquiring a set of image data within the predetermined time by the image acquisition unit, wherein each set of image data comprises a target image acquired by shooting the target through the image acquisition unit at the corresponding image time stamp.
2. A joint calibration method according to claim 1, wherein said jointly acquiring, by the visual-inertial joint apparatus, the image data sequence and the inertial data sequence, wherein the image data sequence includes a set of image data with image time stamps acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with the inertial time stamps measured by the inertial measurement unit, further comprises the steps of:
designing a preset motion track and coding the preset motion track to a mechanical arm, so that the visual inertia combination device is moved by the mechanical arm according to the preset motion track to activate the acceleration of the visual inertia combination device in all directions, and the inertia measurement unit is activated in all directions.
3. A joint calibration method according to claim 2, wherein the step of obtaining a sequence of visual poses of the image acquisition unit by processing the acquired sequence of image data comprises the steps of:
detecting the coordinates of the corner points of each image data in the image data sequence based on the image data sequence;
solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target;
and obtaining the pose of the image capture unit relative to the virtual target at each of the image timestamps by decomposing the homography matrix at each of the image timestamps to obtain the sequence of visual poses, wherein the sequence of visual poses includes the pose of the image capture unit relative to the virtual target at each of the image timestamps.
4. A joint calibration method according to claim 3, wherein the step of obtaining a sequence of visual poses of the image acquisition unit by processing the acquired sequence of image data further comprises the steps of:
and calculating the corner point coordinates of the virtual target based on the input target parameters.
5. A joint calibration method according to claim 4, wherein the step of obtaining a sequence of visual calculation data corresponding to inertial time stamps of the inertial measurement unit by fitting the sequence of visual poses to a continuous spline comprises the steps of:
substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
after obtaining the spline parameters, the sequence of visual computation data is solved by processing the spline segment fitting model, wherein the sequence of visual computation data includes a set of active rotation angular velocities corresponding to the inertial time stamps.
6. A joint calibration method according to claim 5, wherein said step of solving the sequence of visual calculation data by processing the spline segment fitting model after obtaining the spline parameters, wherein the sequence of visual calculation data includes a set of active rotation angular velocities corresponding to the inertial time stamps, comprises the steps of:
obtaining a passive rotation angular velocity at each inertial timestamp by solving a first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is an angular velocity at which the image acquisition unit rotates around the body coordinate system of the virtual target; and
converting the passive rotation angular velocity into the active rotation angular velocity, wherein the active rotation angular velocity is an angular velocity at which the image capturing unit rotates around a body coordinate system of the image capturing unit at the inertial timestamp.
7. The joint calibration method according to claim 6, wherein the step of obtaining an initial value of a time delay between the image acquisition unit and the inertial measurement unit by performing cross-correlation analysis on the obtained inertial data sequence and the visual calculation data sequence comprises the steps of:
obtaining a correlation value between each measured angular velocity of the inertial data sequence and each active rotational angular velocity of the vision calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence comprises a set of the correlation values corresponding to correlation coefficients;
obtaining an optimal correlation coefficient by comparing the magnitudes of all the correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and
and solving a time delay value between the image acquisition unit and the inertial measurement unit based on the optimal correlation coefficient to serve as the initial time delay value.
8. The joint calibration method as claimed in claim 7, wherein the step of obtaining an initial value of an attitude between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence comprises the steps of:
converting each active rotation angular velocity of the vision calculation data sequence into an angular velocity rotating around a body coordinate system of the inertial measurement unit to obtain a converted vision calculation data sequence;
differentiating each measured angular velocity of the inertial data sequence from a corresponding transformed angular velocity of the transformed visual calculation data sequence to obtain an error term between each measured angular velocity and the corresponding transformed angular velocity; and
and solving a rotation value from the inertial measurement unit to the image acquisition unit corresponding to the minimized error by using a linear solution library to serve as a rotation initial value in the pose initial value between the image acquisition unit and the inertial measurement unit.
9. A joint calibration method according to any one of claims 1 to 7, further comprising the steps of: and based on the initial time delay value and the initial pose value, obtaining the optimal time delay and the optimal pose between the image acquisition unit and the inertial measurement unit through bundling set optimization.
10. A joint calibration method according to claim 8, further comprising the steps of: and based on the initial time delay value and the initial pose value, obtaining the optimal time delay and the optimal pose between the image acquisition unit and the inertial measurement unit through bundle set optimization.
11. The joint calibration method according to claim 10, wherein the step of obtaining the optimal time delay and the optimal pose between the image capturing unit and the inertial measurement unit by bundle optimization based on the initial time delay value and the initial pose value comprises the steps of:
based on the inertia timestamp and the initial time delay value, acquiring the pose from the virtual target to the inertia measurement unit;
based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, transforming the world corner coordinates of the virtual target through an observation model to obtain the corner coordinates of a new target image; and
and solving errors between the corner coordinates of the new target image and the corner coordinates of the target image in the inertial data sequence through an error model, and bundling and optimizing all the errors to obtain the optimal time delay and the optimal pose.
12. A joint calibration method according to any one of claims 1 to 8, further comprising the steps of: and obtaining the camera parameters of the image acquisition unit by calibrating the image acquisition unit.
13. A joint calibration method according to claim 11, further comprising the steps of: and obtaining the camera parameters of the image acquisition unit by calibrating the image acquisition unit.
14. A joint calibration method according to claim 13, wherein, in the step of obtaining camera parameters of the image acquisition unit by calibrating the image acquisition unit:
and calibrating the image acquisition unit through a binocular to obtain the internal parameters and the external parameters of the camera parameters of the image acquisition unit.
15. A joint calibration method according to claim 14, wherein, in the step of obtaining camera parameters of the image acquisition unit by calibrating the image acquisition unit:
and obtaining the internal parameters of the camera parameters of the image acquisition unit by calibrating the image acquisition unit through a single eye.
16. A combined calibration method is used for calibrating a vision inertia combined device, wherein the vision inertia combined device comprises an image acquisition unit and an inertia measurement unit, and is characterized by comprising the following steps:
processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit;
fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertia time stamp of the inertia measurement unit;
analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
obtaining a pose initial value between the image acquisition unit and the inertial measurement unit by analyzing a minimized error between the vision calculation data and the inertial data sequence;
the step of obtaining a visual pose sequence of the image acquisition unit by processing the acquired image data sequence comprises the steps of:
detecting the coordinates of the corner points of each image data in the image data sequence based on the image data sequence;
solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and
obtaining the sequence of visual poses by decomposing the homography matrix at each of the image timestamps to obtain the pose of the image capture unit relative to the virtual target at each of the image timestamps, wherein the sequence of visual poses includes the pose of the image capture unit relative to the virtual target at each of the image timestamps.
17. A combined calibration method for calibrating a visual-inertial combined unit, wherein the visual-inertial combined unit comprises an image acquisition unit and an inertial measurement unit, comprising the steps of:
processing the acquired image data sequence to obtain a visual pose sequence of the image acquisition unit;
fitting the visual pose sequence into a continuous spline curve to obtain a visual calculation data sequence corresponding to the inertia time stamp of the inertia measurement unit;
analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
obtaining a pose initial value between the image acquisition unit and the inertial measurement unit by analyzing a minimized error between the vision calculation data and the inertial data sequence;
wherein the step of obtaining a visual computation data sequence corresponding to the inertial timestamp of the inertial measurement unit by fitting the visual pose sequence to a continuous spline curve comprises the steps of:
substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
after obtaining the spline parameters, the sequence of visual computation data is solved by processing the spline segment fitting model, wherein the sequence of visual computation data includes a set of active rotation angular velocities corresponding to the inertial time stamps.
18. A combined calibration device for calibrating a visual-inertial combined apparatus, wherein the visual-inertial combined apparatus comprises an image acquisition unit and an inertial measurement unit, wherein the combined calibration device comprises:
the visual pose acquisition unit is used for processing the acquired image data sequence to acquire a visual pose sequence of the image acquisition unit;
a spline curve fitting unit for obtaining a visual calculation data sequence corresponding to the inertial time stamp of the inertial measurement unit by fitting the visual pose sequence into a continuous spline curve;
the cross-correlation degree analysis unit is used for analyzing the cross-correlation degree of the obtained inertial data sequence and the visual calculation data sequence to obtain an initial time delay value between the image acquisition unit and the inertial measurement unit; and
a minimized error analysis unit for obtaining a pose initial value between the image acquisition unit and the inertial measurement unit by analyzing the minimized error between the vision calculation data and the inertial data sequence;
the joint calibration device further comprises a joint acquisition unit, configured to jointly acquire the image data sequence and the inertial data sequence by the visual-inertial joint device, wherein the image data sequence includes a set of image data with an image time stamp acquired by the image acquisition unit, and the inertial data sequence includes a set of inertial data with an inertial time stamp measured by the inertial measurement unit;
wherein the joint obtaining unit is further configured to:
moving the visual-inertial combination unit within a predetermined time to activate the inertial measurement unit and maintain a target always in the field of view of the image acquisition unit;
measuring a set of the inertial data within the predetermined time by the inertial measurement unit, wherein each of the inertial data includes a measured angular velocity of the inertial measurement unit rotating about a body coordinate system of the inertial measurement unit at the corresponding inertial timestamp; and
and acquiring a set of image data within the predetermined time by the image acquisition unit, wherein each set of image data comprises a target image acquired by shooting the target through the image acquisition unit at the corresponding image time stamp.
19. A joint calibration apparatus according to claim 18, wherein the joint acquisition unit is further configured to design a predetermined motion trajectory and encode the predetermined motion trajectory to a robotic arm, so as to move the visual inertial combination apparatus according to the predetermined motion trajectory by the robotic arm, so as to activate the acceleration of the visual inertial combination apparatus in each direction, so that the inertial measurement unit is activated in each direction.
20. The joint calibration apparatus as defined in claim 19, wherein the visual pose obtaining unit is further configured to:
detecting the coordinates of the corner points of each image data in the image data sequence based on the image data sequence;
solving a homography matrix between the virtual target and an image plane of the image acquisition unit at each image time stamp based on the corner point coordinates of each image data and the corner point coordinates of a virtual target; and
obtaining a pose of the image acquisition unit relative to the virtual target at each image timestamp by decomposing the homography matrix at that image timestamp, so as to obtain the visual pose sequence, wherein the visual pose sequence includes the pose of the image acquisition unit relative to the virtual target at each image timestamp.
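For illustration only: a hypothetical sketch of the corner detection, homography, and decomposition steps of claim 20, assuming a chessboard-style planar target and using OpenCV/NumPy. The pattern size, variable names, and the Zhang-style homography decomposition used here are assumptions of this sketch, not features disclosed by the patent.

```python
import cv2
import numpy as np

def visual_pose_from_image(gray_image, target_corners_plane, K, pattern_size=(9, 6)):
    """Pose of the camera relative to a planar target at one image timestamp.

    target_corners_plane: (N, 2) metric corner coordinates on the target plane (z = 0).
    K: 3x3 camera intrinsic matrix.
    """
    # 1. Detect the target corner coordinates in the image.
    found, corners = cv2.findChessboardCorners(gray_image, pattern_size)
    if not found:
        return None
    corners = corners.reshape(-1, 2)

    # 2. Homography between the target plane and the image plane.
    H, _ = cv2.findHomography(target_corners_plane, corners)

    # 3. Decompose H = K [r1 r2 t] (Zhang-style decomposition for a planar target).
    A = np.linalg.inv(K) @ H
    scale = 1.0 / np.linalg.norm(A[:, 0])
    r1, r2, t = scale * A[:, 0], scale * A[:, 1], scale * A[:, 2]
    R = np.column_stack((r1, r2, np.cross(r1, r2)))
    U, _, Vt = np.linalg.svd(R)                 # re-orthonormalise the rotation
    R = U @ Vt
    return R, t                                  # pose of the target in the camera frame
```

Applying such a routine to every frame of the image data sequence would yield the visual pose sequence referenced in the claims.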
21. The combined calibration device according to claim 20, wherein the visual pose acquisition unit is further configured to calculate the corner point coordinates of the virtual target based on input target parameters.
22. The combined calibration device according to claim 21, wherein the spline curve fitting unit is further configured to:
substituting the visual pose sequence into a spline curve segment fitting model to obtain a linear equation set;
solving the linear equation set by using a linear solving library to obtain spline parameters of the spline curve segment fitting model; and
after obtaining the spline parameters, solving the spline curve segment fitting model to obtain the visual calculation data sequence, wherein the visual calculation data sequence includes a set of active rotation angular velocities corresponding to the inertial timestamps.
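For illustration only: a sketch of the spline-fitting step of claim 22, assuming cubic B-splines over the per-frame camera orientation. SciPy's make_interp_spline assembles and solves the underlying banded linear system internally, standing in for the linear-solving-library step recited in the claim; the representation of orientation as rotation vectors is an assumption of this sketch.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

def fit_pose_spline(image_timestamps, rotation_vectors):
    """Fit a continuous cubic spline through the per-frame camera orientation.

    image_timestamps: (N,) strictly increasing image timestamps.
    rotation_vectors: (N, 3) axis-angle orientation of the camera w.r.t. the target.
    """
    t = np.asarray(image_timestamps)
    y = np.asarray(rotation_vectors)
    return make_interp_spline(t, y, k=3)   # one cubic spline per rotation component

# Evaluating the spline at the inertial timestamps gives the vision-side samples
# aligned to the IMU clock (the "visual calculation data sequence"), e.g.:
# vision_at_imu = fit_pose_spline(img_ts, rvecs)(imu_ts)
```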
23. The combined calibration device according to claim 22, wherein the spline curve fitting unit is further configured to:
obtaining a passive rotation angular velocity at each inertial timestamp by solving the first derivative of the spline curve segment fitting model, wherein the passive rotation angular velocity is the angular velocity of the image acquisition unit rotating about the body coordinate system of the virtual target; and
converting the passive rotation angular velocity into the active rotation angular velocity, wherein the active rotation angular velocity is the angular velocity at which the image acquisition unit rotates about its own body coordinate system at the inertial timestamp.
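For illustration only: one plausible sketch of the passive-to-active angular-velocity conversion of claim 23, assuming the spline interpolates the camera-to-target rotation vector and using the small-angle relation between its first derivative and the passive angular velocity; the patent's exact kinematic model may differ.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def active_angular_velocity(spline, imu_timestamps):
    """Passive-to-active angular-velocity conversion at the inertial timestamps.

    spline: cubic spline over the camera-to-target rotation vector (see fit_pose_spline).
    The frame change omega_body = R^T omega_world converts the target-frame (passive)
    rate into the rate of the camera about its own body coordinate system (active).
    """
    t = np.asarray(imu_timestamps)
    R_ct = Rotation.from_rotvec(spline(t)).as_matrix()           # (M, 3, 3) camera-to-target
    omega_passive = spline.derivative(1)(t)                      # (M, 3), target frame
    omega_active = np.einsum('mji,mj->mi', R_ct, omega_passive)  # apply R^T per sample
    return omega_active
```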
24. The combined calibration device according to claim 23, wherein the cross-correlation degree analysis unit is further configured to:
obtaining a correlation value between each measured angular velocity of the inertial data sequence and each active rotation angular velocity of the visual calculation data sequence to obtain a cross-correlation coefficient sequence, wherein the cross-correlation coefficient sequence comprises the set of correlation values, each corresponding to one correlation coefficient;
obtaining an optimal correlation coefficient by comparing the magnitudes of all the correlation values in the cross-correlation coefficient sequence, wherein the optimal correlation coefficient corresponds to the largest correlation value; and
solving a time delay value between the image acquisition unit and the inertial measurement unit based on the optimal correlation coefficient, to serve as the initial time delay value.
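For illustration only: a minimal sketch of the cross-correlation analysis of claim 24, correlating the magnitudes of the measured and vision-derived angular velocities resampled on the inertial timestamps; the index of the peak correlation value gives the shift used as the initial time delay value. The normalisation and lag-sign convention are assumptions of this sketch.

```python
import numpy as np

def initial_time_delay(imu_omega, vis_omega, dt):
    """Initial time delay from the peak of the cross-correlation.

    imu_omega, vis_omega: (M, 3) angular velocities sampled on the inertial timestamps
    with uniform spacing dt (seconds).
    """
    a = np.linalg.norm(imu_omega, axis=1)
    b = np.linalg.norm(vis_omega, axis=1)
    a = (a - a.mean()) / (a.std() + 1e-12)           # zero-mean, unit-variance signals
    b = (b - b.mean()) / (b.std() + 1e-12)
    corr = np.correlate(a, b, mode='full')           # correlation value for every shift
    best = int(np.argmax(corr))                      # optimal correlation coefficient
    shift = best - (len(b) - 1)                      # shift in samples
    return shift * dt                                # initial time delay in seconds
```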
25. The combined calibration device according to claim 24, wherein the minimized error analysis unit is further configured to:
converting each active rotation angular velocity of the visual calculation data sequence into an angular velocity rotating about the body coordinate system of the inertial measurement unit to obtain a converted visual calculation data sequence;
differencing each measured angular velocity of the inertial data sequence with the corresponding converted angular velocity of the converted visual calculation data sequence to obtain an error term between each measured angular velocity and the corresponding converted angular velocity; and
solving, by using a linear solving library, a rotation value from the inertial measurement unit to the image acquisition unit that corresponds to the minimized error, to serve as the initial rotation value of the initial pose value between the image acquisition unit and the inertial measurement unit.
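For illustration only: a sketch of the minimized-error step of claim 25 using the closed-form Kabsch/Wahba solution via SVD to align the paired angular-velocity samples. The claim only recites a linear solving library, so this particular solver, and the convention that the result maps camera-frame rates to IMU-frame rates, are assumptions of this sketch.

```python
import numpy as np

def initial_rotation(imu_omega, vis_omega):
    """Least-squares rotation aligning vision-derived angular velocities to IMU measurements.

    imu_omega, vis_omega: (M, 3) rows already paired using the initial time delay value.
    Returns R such that imu_omega[i] ~= R @ vis_omega[i]; its transpose is the inverse mapping.
    """
    H = vis_omega.T @ imu_omega                      # 3x3 correlation of the paired samples
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R
```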
26. The combined calibration device according to claim 18 or 19, further comprising a bundle optimization unit configured to obtain an optimal time delay and an optimal pose between the image acquisition unit and the inertial measurement unit through bundle optimization based on the initial time delay value and the initial pose value.
27. The combined calibration device according to any one of claims 20 to 25, further comprising a bundle optimization unit configured to obtain an optimal time delay and an optimal pose between the image acquisition unit and the inertial measurement unit through bundle optimization based on the initial time delay value and the initial pose value.
28. The combined calibration device according to claim 27, wherein the bundle optimization unit is further configured to:
obtaining the pose from the virtual target to the inertial measurement unit based on the inertial timestamp and the initial time delay value;
transforming the world corner point coordinates of the virtual target through an observation model, based on the pose from the virtual target to the inertial measurement unit and the initial time delay value, to obtain the corner point coordinates of a new target image; and
solving errors between the corner point coordinates of the new target image and the corner point coordinates of the target image in the image data sequence through an error model, and bundle-optimizing all the errors to obtain the optimal time delay and the optimal pose.
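For illustration only: a skeleton of the bundle optimization of claim 28 as a non-linear least-squares problem over the time delay and the IMU-camera rotation. The observation and error models that build the residual vector are application specific and are only referenced here through a user-supplied residual_fn, which is an assumption of this sketch.

```python
import numpy as np
from scipy.optimize import least_squares

def bundle_refine(residual_fn, td0, rotvec0):
    """Jointly refine the time delay and IMU-camera rotation.

    residual_fn(params) must return the stacked differences between the reprojected
    (new) corner coordinates and the detected corner coordinates over all frames.
    td0: initial time delay value; rotvec0: (3,) initial rotation value (axis-angle).
    """
    x0 = np.concatenate(([td0], np.asarray(rotvec0)))    # [time delay, axis-angle rotation]
    result = least_squares(residual_fn, x0, method='lm')  # bundle all errors in one problem
    td_opt, rot_opt = result.x[0], result.x[1:]
    return td_opt, rot_opt
```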
29. The combined calibration device according to claim 28, further comprising a camera parameter calibration unit for obtaining camera parameters of the image acquisition unit by calibrating the image acquisition unit.
30. The combined calibration device according to claim 29, wherein the camera parameter calibration unit is further configured to obtain the intrinsic parameters and the extrinsic parameters among the camera parameters of the image acquisition unit by binocular calibration of the image acquisition unit.
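For illustration only: a sketch of the binocular calibration of claim 30 with OpenCV, first estimating each camera's intrinsic parameters and then the stereo extrinsics. The input point lists, the flag choice, and the assumption of a two-camera image acquisition unit are assumptions of this sketch.

```python
import cv2
import numpy as np

def binocular_calibrate(obj_pts, img_pts_left, img_pts_right, image_size):
    """Binocular calibration: per-camera intrinsics, then rotation/translation between cameras.

    obj_pts: list of (N, 3) float32 target corner coordinates per view.
    img_pts_left / img_pts_right: lists of (N, 1, 2) float32 detected corners per view.
    image_size: (width, height) in pixels.
    """
    # Intrinsic parameters of each camera.
    _, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, img_pts_left, image_size, None, None)
    _, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, img_pts_right, image_size, None, None)

    # Extrinsic parameters: rotation R and translation T from the left to the right camera.
    _, K1, d1, K2, d2, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, img_pts_left, img_pts_right, K1, d1, K2, d2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return K1, d1, K2, d2, R, T
```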
31. An electronic device, comprising:
at least one processor; and
at least one memory having stored therein computer program instructions which, when executed by the processor, cause the processor to carry out the combined calibration method according to any one of claims 1 to 17.
32. A computer readable storage medium having computer program instructions stored thereon which, when executed by a computing device, are operable to perform the combined calibration method according to any one of claims 1 to 17.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811036954.2A CN110880189B (en) | 2018-09-06 | 2018-09-06 | Combined calibration method and combined calibration device thereof and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110880189A CN110880189A (en) | 2020-03-13 |
CN110880189B true CN110880189B (en) | 2022-09-09 |
Family
ID=69727133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811036954.2A Active CN110880189B (en) | 2018-09-06 | 2018-09-06 | Combined calibration method and combined calibration device thereof and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110880189B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111551191B (en) * | 2020-04-28 | 2022-08-09 | 浙江商汤科技开发有限公司 | Sensor external parameter calibration method and device, electronic equipment and storage medium |
CN111537995B (en) * | 2020-05-19 | 2022-08-12 | 北京爱笔科技有限公司 | Time delay obtaining method and device and electronic equipment |
CN111739102B (en) * | 2020-05-27 | 2023-07-21 | 杭州易现先进科技有限公司 | Method and device for calibrating internal and external parameters of electronic equipment and computer equipment |
CN113765611B (en) * | 2020-06-03 | 2023-04-14 | 杭州海康威视数字技术股份有限公司 | Time stamp determination method and related equipment |
CN111986265B (en) * | 2020-08-04 | 2021-10-12 | 禾多科技(北京)有限公司 | Method, apparatus, electronic device and medium for calibrating camera |
CN112362084A (en) * | 2020-11-23 | 2021-02-12 | 北京三快在线科技有限公司 | Data calibration method, device and system |
CN114979456B (en) * | 2021-02-26 | 2023-06-30 | 影石创新科技股份有限公司 | Anti-shake processing method and device for video data, computer equipment and storage medium |
WO2022193318A1 (en) * | 2021-03-19 | 2022-09-22 | 深圳市大疆创新科技有限公司 | Extrinsic parameter calibration method and apparatus, and movable platform and computer-readable storage medium |
CN113298881B (en) * | 2021-05-27 | 2023-09-12 | 中国科学院沈阳自动化研究所 | Spatial joint calibration method for monocular camera-IMU-mechanical arm |
CN113587924B (en) * | 2021-06-16 | 2024-03-29 | 影石创新科技股份有限公司 | Shooting system calibration method, shooting system calibration device, computer equipment and storage medium |
CN116264621A (en) * | 2021-12-13 | 2023-06-16 | 成都拟合未来科技有限公司 | Method and system for aligning IMU data with video data |
CN114025158B (en) * | 2022-01-07 | 2022-04-08 | 浙江大华技术股份有限公司 | Delay time determination method and device, image acquisition equipment and storage medium |
CN114543797B (en) * | 2022-02-18 | 2024-06-07 | 北京市商汤科技开发有限公司 | Pose prediction method and device, equipment and medium |
CN115311353B (en) * | 2022-08-29 | 2023-10-10 | 玩出梦想(上海)科技有限公司 | Multi-sensor multi-handle controller graph optimization tight coupling tracking method and system |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105698765A (en) * | 2016-02-22 | 2016-06-22 | 天津大学 | Method using combination of double IMUs (inertial measurement units) and monocular vision to measure pose of target object under non-inertial system |
CN106548475A (en) * | 2016-11-18 | 2017-03-29 | 西北工业大学 | A kind of Forecasting Methodology of the target trajectory that spins suitable for space non-cooperative |
CN106679648A (en) * | 2016-12-08 | 2017-05-17 | 东南大学 | Vision-inertia integrated SLAM (Simultaneous Localization and Mapping) method based on genetic algorithm |
CN107255476A (en) * | 2017-07-06 | 2017-10-17 | 青岛海通胜行智能科技有限公司 | A kind of indoor orientation method and device based on inertial data and visual signature |
CN107888828A (en) * | 2017-11-22 | 2018-04-06 | 网易(杭州)网络有限公司 | Space-location method and device, electronic equipment and storage medium |
CN108051002A (en) * | 2017-12-04 | 2018-05-18 | 上海文什数据科技有限公司 | Transport vehicle space-location method and system based on inertia measurement auxiliary vision |
CN108253964A (en) * | 2017-12-29 | 2018-07-06 | 齐鲁工业大学 | A kind of vision based on Time-Delay Filter/inertia combined navigation model building method |
Non-Patent Citations (2)
Title |
---|
"Pose Estimation of a Mobile Robot Based on Fusion of IMU Data and Vision Data Using an Extended Kalman Filter";Mary B.Alatise et al.;《Sensors(Basel)》;20171031;第17卷(第10期);全文 * |
"基于单目视觉的室内微型飞行器位姿估计与环境构建";郭力 等;《南京航空航天大学学报》;20120430;第44卷(第2期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN110880189A (en) | 2020-03-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110880189B (en) | Combined calibration method and combined calibration device thereof and electronic equipment | |
US11285613B2 (en) | Robot vision image feature extraction method and apparatus and robot using the same | |
CN111258313B (en) | Multi-sensor fusion SLAM system and robot | |
US20210190497A1 (en) | Simultaneous location and mapping (slam) using dual event cameras | |
CN107747941B (en) | Binocular vision positioning method, device and system | |
CN111415387B (en) | Camera pose determining method and device, electronic equipment and storage medium | |
WO2019170166A1 (en) | Depth camera calibration method and apparatus, electronic device, and storage medium | |
WO2021043213A1 (en) | Calibration method, device, aerial photography device, and storage medium | |
US10109104B2 (en) | Generation of 3D models of an environment | |
CN109544630B (en) | Pose information determination method and device and visual point cloud construction method and device | |
CN112907620B (en) | Camera pose estimation method and device, readable storage medium and electronic equipment | |
CN111295714B (en) | Dual precision sensor system for training low precision sensor data for object localization using high precision sensor data in a virtual environment | |
CN111678521B (en) | Method and system for evaluating positioning accuracy of mobile robot | |
CN111983620B (en) | Target positioning method for underwater robot searching and exploring | |
CN111623773B (en) | Target positioning method and device based on fisheye vision and inertial measurement | |
CN108780577A (en) | Image processing method and equipment | |
WO2019191288A1 (en) | Direct sparse visual-inertial odometry using dynamic marginalization | |
CN113516692A (en) | Multi-sensor fusion SLAM method and device | |
CN110720113A (en) | Parameter processing method and device, camera equipment and aircraft | |
Wang et al. | LF-VIO: A visual-inertial-odometry framework for large field-of-view cameras with negative plane | |
US20240176419A1 (en) | Map-aided inertial odometry with neural network for augmented reality devices | |
CN110415329B (en) | Three-dimensional modeling device and calibration method applied to same | |
CN116952229A (en) | Unmanned aerial vehicle positioning method, device, system and storage medium | |
CN115311353B (en) | Multi-sensor multi-handle controller graph optimization tight coupling tracking method and system | |
CN114119885A (en) | Image feature point matching method, device and system and map construction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |