WO2024093635A1 - Camera pose estimation method and apparatus, and computer-readable storage medium - Google Patents

Camera pose estimation method and apparatus, and computer-readable storage medium

Info

Publication number
WO2024093635A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
slave
pose estimation
transformation relationship
parameter
Prior art date
Application number
PCT/CN2023/124164
Other languages
French (fr)
Chinese (zh)
Inventor
武云钢
Original Assignee
深圳市其域创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市其域创新科技有限公司
Publication of WO2024093635A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/06 Topological mapping of higher dimensional structures onto lower dimensional surfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30 Assessment of water resources

Definitions

  • the embodiments of the present invention relate to the field of unmanned aerial survey technology, and specifically to a camera pose estimation method, device, equipment and computer-readable storage medium.
  • aerial photogrammetry can produce topographic maps and image maps at various scales, and can also build terrain databases that provide basic data for geographic information systems and land information systems, enabling finer land planning and management.
  • the precise geographic information data provided by aerial photogrammetry can also be used to build high-precision maps, which facilitates positioning during travel.
  • Aerial photogrammetry technology often uses aerial triangulation, that is, using continuously captured aerial photos with a certain overlap, based on a small number of field control points, to establish a route model or regional network model corresponding to the field by photogrammetry methods, so as to obtain the plane coordinates and elevation of the encrypted points.
  • UAV aerial survey technology is a powerful supplement to traditional aerial photogrammetry methods.
  • with UAV aerial survey technology, a UAV carrying multiple cameras can effectively capture both orthographic and oblique images.
  • the combined frame of multiple cameras is large, so a high-altitude operation can complete a flight mission in one pass.
  • an embodiment of the present invention provides a camera pose estimation method to solve the problems existing in the prior art.
  • a camera pose estimation method characterized by comprising:
  • Acquire external parameters of each camera in a multi-camera shooting device including at least two cameras, wherein the relative position relationship between each of the cameras is fixed, and the at least two cameras include a master camera and one or more slave cameras;
  • Three-dimensional points are generated according to the multiple images taken by the multi-camera shooting device, and the three-dimensional points are optimized and calculated according to the first geographical location and the posture transformation relationship to obtain the optimized first posture of each camera.
  • determining a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculating a pose transformation relationship of each of the slave cameras relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter further includes:
  • the main camera extrinsic parameter and the slave camera extrinsic parameter are calculated according to a plurality of images taken by the main camera and the slave camera at the same track position;
  • the conversion relationship between the image taken by the master camera and the image taken by the slave camera is calculated according to the formula T01 = Tw0' * Tw1, where T01 is the conversion relationship, Tw0 is the master camera extrinsic parameter, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameter;
  • the conversion relationship of the image is determined as the posture transformation relationship.
  • generating three-dimensional points according to the multiple images captured by the multi-camera shooting device further includes:
  • the three-dimensional point is generated according to the relative transformation relationship.
  • the optimizing calculation of the three-dimensional point according to the first geographical location and the posture transformation relationship further includes:
  • a projection matrix used for optimization is determined according to the formula Pi = k·[Rc|tc]·[Ri|ti], where Pi is the projection matrix, [Rc|tc] is the pose transformation relationship, [Ri|ti] is the first geographical location, k is the intrinsic parameter of either the master camera or a slave camera, i is the sequence number of the first geographical location, and c is the sequence number of the pose transformation relationship;
  • the minimized reprojection error of the three-dimensional points is calculated according to the projection matrix, using the formula min Σo ‖Pi·x − xo‖², where x is the three-dimensional point, xo is the two-dimensional feature point obtained after reprojecting the three-dimensional point, and o is the serial number of the three-dimensional point.
  • the camera pose estimation method further includes:
  • the three-dimensional points are globally optimized.
  • the camera pose estimation method further includes:
  • the method further includes:
  • the pose transformation relationship of each of the slave cameras relative to the master camera, calculated according to the first extrinsic parameter and the second extrinsic parameter during the last operation of the multi-camera shooting device, is obtained.
  • a camera pose estimation device comprising:
  • a first acquisition module used to acquire an external parameter of each camera in a multi-camera shooting device including at least two cameras, wherein a relative position relationship between each of the cameras is fixed, and the at least two cameras include a master camera and one or more slave cameras;
  • a first calculation module is used to determine a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculate a posture transformation relationship of each of the slave cameras relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter;
  • a second acquisition module configured to acquire a first geographic location of the main camera through a sensor
  • the second calculation module is used to generate three-dimensional points according to the multiple images taken by the multi-camera shooting device, and optimize the three-dimensional points according to the first geographical location and the posture transformation relationship to obtain the optimized first posture of each camera.
  • a camera pose estimation device comprising:
  • a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other via the communication bus;
  • the memory is used to store at least one program, and the program enables the processor to perform the operations of the above-mentioned camera pose estimation method.
  • a computer-readable storage medium wherein at least one program is stored in the storage medium, and the program enables a camera pose estimation device to perform operations corresponding to the above method.
  • the pose transformation relationship of the slave camera extrinsic parameters relative to the master camera extrinsic parameters is calculated, and then the first geographic location of the master camera is obtained.
  • the first geographic location and the pose transformation relationship are then used to optimize the 3D points generated from the multiple target images taken by the master camera and the slave cameras. Since the first geographical location of the master camera is its actual location, and the actual positional relationship between the master camera and each slave camera is fixed, obtaining the pose transformation relationship of a slave camera relative to the master camera is equivalent to obtaining the actual position of that slave camera.
  • optimizing the 3D points based on the actual positions of the cameras can solve the problem of poor optimization results caused by poor feature point matching when the ground texture of the captured images is poor. The method has low dependence on feature point matching, can improve the optimization effect when the ground texture is poor, and improves optimization accuracy and applicability, thereby improving the accuracy of 3D reconstruction.
  • FIG. 1 is a schematic flow chart of a camera pose estimation method provided by an embodiment of the present invention;
  • FIG. 2 is a schematic structural diagram of a camera pose estimation apparatus provided by an embodiment of the present invention;
  • FIG. 3 is a schematic structural diagram of a camera pose estimation device provided by an embodiment of the present invention.
  • the inventors of the present application have designed a camera pose estimation method after research.
  • by obtaining the master camera extrinsic parameters and the slave camera extrinsic parameters, calculating the pose transformation relationship between them, obtaining the first geographical location of the master camera, and then optimizing the three-dimensional points generated by three-dimensional reconstruction according to the first geographical location and the pose transformation relationship, the dependence of the optimization process on the feature point matching relationship is reduced, and the accuracy of the optimized three-dimensional points is less easily affected by inaccurate matching over complex terrain, which can improve the accuracy of three-dimensional reconstruction by UAV aerial survey.
  • FIG. 1 shows a flow chart of a camera pose estimation method provided by an embodiment of the present invention. As shown in FIG. 1 , the method includes the following steps:
  • Step 110: Acquire external parameters of each camera in a multi-camera shooting device including at least two cameras, wherein the relative position relationship between the cameras is fixed, and the at least two cameras include a master camera and one or more slave cameras.
  • the main camera refers to the camera used as the orthographic lens in a multi-camera shooting device, that is, the camera that shoots in the direction of the target.
  • the shooting directions of the cameras are often inconsistent.
  • only a multi-camera shooting device with one main camera and one or more slave cameras is taken as an example.
  • a geographic location acquisition device such as various types of location sensors, needs to be installed on the main camera in the multi-camera shooting device.
  • the other cameras serve as slave cameras, and based on the geographic location of the main camera and the fixed relative position relationship between the other cameras and the main camera, the posture transformation relationship of the slave cameras relative to the main camera can be obtained in subsequent steps, or the geographic location of the slave cameras can be further obtained.
  • obtaining the extrinsic parameters of each camera in a multi-camera shooting device including at least two cameras refers to obtaining the extrinsic parameters of the master camera and the slave camera.
  • the extrinsic parameters of the camera refer to the parameters of the camera in the world coordinate system, including the rotation matrix R and the translation matrix T.
  • the camera extrinsic parameters can be obtained in a variety of ways. For example, an aerial triangulation calculation can be performed using multiple images taken by the master camera and the slave camera at the same trajectory point to obtain the master camera extrinsic parameters and the slave camera extrinsic parameters. They can also be obtained by camera self-calibration. There are also many methods for camera self-calibration.
  • the Tsai two-step method can be used. Different methods can be used to obtain camera extrinsics according to actual conditions, such as the Zhang calibration method, the active system controlling the camera to perform specific movements, the layered step-by-step calibration method, or the camera self-calibration based on the Kruppa equation.
  • the embodiments of the present application do not specifically limit this.
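  • As a rough illustration of one of these options, the following is a minimal sketch of Zhang-style planar calibration using OpenCV (which implements that method); the checkerboard size and image paths are assumptions, not details from the patent:

```python
# Minimal sketch of Zhang-style calibration with OpenCV; the board size
# and image folder are assumptions.
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner-corner grid of an assumed 9x6 checkerboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts, size = [], [], None
for path in glob.glob("calib/*.jpg"):  # hypothetical calibration images
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# Returns the intrinsic matrix K plus, per image, the extrinsics (rvec, tvec).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
```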
  • each camera in the multi-camera shooting device is rigidly connected. If one or more cameras in the multi-camera shooting device rotate the shooting angle, the direction of the camera changes, and the posture also changes. Since the relative position relationship between each camera is fixed, the movement of all cameras in the multi-camera shooting device is consistent. For example, all cameras in the multi-camera shooting device rotate a certain angle at the same time, or the multi-camera shooting device moves a certain distance as a whole. At this time, even if the posture of each camera changes, the relative posture between each camera remains unchanged.
  • a data basis is obtained for subsequent camera pose estimation calculations. Calculations based on the camera extrinsic parameters can obtain relatively accurate data calculation results.
  • Step 120 Determine a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculate a posture transformation relationship of each of the slave cameras relative to the master camera based on the first extrinsic parameter and the second extrinsic parameter.
  • the main camera extrinsics and slave camera extrinsics obtained in step 110 are used to calculate the pose transformation relationship between the main camera extrinsics and the slave camera extrinsics.
  • the pose transformation relationship may be a transformation formula, a parameter, or one or more calculation formulas.
  • the purpose is to enable the master camera extrinsics and the slave camera extrinsics to be converted into each other by calculation in combination with the pose transformation relationship.
  • Different forms of pose transformation relationships can be used in the calculation according to actual conditions, as long as the conversion between the main camera extrinsics and the slave camera extrinsics can be conveniently realized.
  • the embodiment of the present application does not make any special limitation on this.
  • the fixed relative position data between the main camera and the slave camera can be added to the optimization process in the subsequent camera pose estimation through the pose transformation relationship, so that the camera pose estimation can use the relative position of the main camera and the slave camera as the optimization basis. It is not only optimized through the matching relationship of image feature points, but also can improve the applicability and stability of the optimization, making the camera pose estimation result more accurate.
  • Step 130 Acquire a first geographic location of the main camera through a sensor.
  • acquiring the first geographical location of the main camera refers to directly obtaining the first geographical location of the main camera through a sensor, and the geographical location refers to the position data of the main camera in the world coordinate system.
  • the sensor can be a gyroscope, a GPS receiver, or another type of sensor.
  • the first geographic location can be obtained through different sensors according to actual conditions; it is only necessary that the position data of the main camera in the world coordinate system can be conveniently obtained, either directly or through certain calculations.
  • the embodiment of the present application does not make any special limitations on this.
  • the first geographic location of the main camera is obtained through the sensor, making data acquisition more convenient and simple, and the obtained geographic location is the accurate actual location of the main camera, providing an accurate data basis for subsequent optimization calculations.
  • Step 140 Generate three-dimensional points based on the multiple images taken by the multi-camera shooting device, optimize and calculate the three-dimensional points according to the first geographical location and the posture transformation relationship, and obtain the optimized first posture of each camera.
  • generating three-dimensional points based on multiple images taken by a multi-camera shooting device means that after the main camera and the slave camera take multiple images, three-dimensional reconstruction is performed based on the multiple images to generate three-dimensional points.
  • aerial triangulation is performed based on the multiple images taken by the main camera and the slave cameras to generate three-dimensional points.
  • the generation of three-dimensional points refers to the generation by extracting feature points from multiple images. Depending on the number of images, the number of generated three-dimensional points will be sparse or dense.
  • the three-dimensional points are optimized and calculated according to the first geographic location and the posture transformation relationship to obtain the optimized first posture of each camera, which means that the data of the first geographic location and the posture transformation relationship are substituted into the optimization process as optimization parameters, and the three-dimensional points are optimized to obtain the optimized first posture.
  • the method for optimizing the calculation of the three-dimensional points according to the first geographic location and the position transformation relationship can be the bundle adjustment method.
  • bundle adjustment refers to extracting the optimal 3D model and camera parameters (intrinsic and extrinsic) from a visual reconstruction: after the camera poses and the positions of the feature points are optimally adjusted, the bundles of light rays reflected from each feature point converge at the optical centers; this process is referred to as BA.
  • the first geographic location and the position transformation relationship can be substituted into the bundle adjustment method to achieve the optimization calculation of the three-dimensional points.
  • the optimization of the three-dimensional points is made to depend not only on the matching relationship of the feature points of the multiple images, so that when complex terrain is encountered and the feature point matching of the target images is inaccurate, more accurate optimization results can still be obtained.
  • the applicability and optimization accuracy of 3D reconstruction in complex terrain are improved.
  • the pose transformation relationship between the main camera extrinsic parameters and the slave camera extrinsic parameters can be calculated by obtaining the main camera extrinsic parameters and the slave camera extrinsic parameters, and then the first geographic location of the main camera is obtained, and the three-dimensional point is optimized by the pose transformation relationship and the first geographic location.
  • data acquisition is more convenient, and it is only necessary to install a sensor for obtaining the first geographic location on the main camera. Since the relative position between each camera does not change, the precise position of each camera can be calculated by the pose transformation relationship and the first geographic location obtained by the sensor on the main camera.
  • the optimization process does not rely entirely on the matching relationship between the feature points of the multiple target images.
  • when the matching relationship of the feature points is inaccurate, the first geographic location and the pose transformation relationship still participate in the optimization calculation, so the accuracy of the three-dimensional points generated by the optimized three-dimensional reconstruction can be improved, thereby improving applicability.
  • step 120 further includes:
  • Step a01: Calculate the master camera extrinsic parameters and the slave camera extrinsic parameters according to a plurality of images taken by the master camera and the slave camera at the same track position;
  • Step a02: Calculate the conversion relationship between the image taken by the master camera and the image taken by the slave camera according to the master camera extrinsic parameter and the slave camera extrinsic parameter, using the formula T01 = Tw0' * Tw1, where T01 is the conversion relationship, Tw0 is the master camera extrinsic parameter, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameter;
  • Step a03: Determine the conversion relationship of the image as the pose transformation relationship.
  • in step a01, the master camera extrinsic parameters and the slave camera extrinsic parameters are calculated based on multiple images taken by the master camera and the slave camera at the same track position. Specifically, aerial triangulation is performed based on multiple images taken by the master camera and the slave camera at the same track position, and the master camera extrinsics and the slave camera extrinsics are obtained from the results of the aerial triangulation.
  • the same trajectory position means that during UAV aerial survey, the flight route of the UAV can form a trajectory.
  • a trajectory contains multiple trajectory positions, which can also be called trajectory points.
  • each trajectory position has fixed coordinates in the world coordinate system; that is, calculating the master camera extrinsic parameters and the slave camera extrinsic parameters from multiple images taken at the same trajectory position can be understood as calculating them from multiple images taken by the master camera and the slave camera at the same position.
  • Tw0 refers to the extrinsic parameter of the main camera in the world coordinate system
  • Tw0' refers to the transformation from the main camera coordinate system to the world coordinate system.
  • Tw0' can be obtained by calculating the inverse matrix of the master camera extrinsic parameter Tw0, and is used to calculate the transformation relationship between the master camera extrinsic parameter and the slave camera extrinsic parameter, that is, Tw0'*Tw1.
  • the positions of the main camera and the slave camera in the same coordinate system can be converted to each other. Since the relative positions of multiple cameras remain basically unchanged, the position data of other cameras can be obtained based on the position data of any camera. The data obtained in this way will not be affected by the shooting conditions or image quality.
  • the extrinsic parameters of any camera can be used as optimization data through the conversion relationship T01, which reduces the dependence on the feature point matching relationship. Accurate optimization results can be obtained in multiple images, and the optimization results will not be affected by inaccurate feature point matching due to poor image quality obtained by shooting, thereby improving applicability.
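  • As a concrete reading of the formula above, the following is a minimal numpy sketch, assuming the extrinsics are expressed as 4x4 homogeneous matrices (the patent does not fix a representation):

```python
# Sketch of the conversion relationship T01 = Tw0' * Tw1 with 4x4
# homogeneous extrinsics; the matrix representation is an assumption.
import numpy as np

def relative_transform(Tw0: np.ndarray, Tw1: np.ndarray) -> np.ndarray:
    """Pose of the slave camera relative to the master camera."""
    return np.linalg.inv(Tw0) @ Tw1  # Tw0' * Tw1

def slave_from_master(Tw0: np.ndarray, T01: np.ndarray) -> np.ndarray:
    """Recover a slave extrinsic from the master extrinsic once T01 is fixed."""
    return Tw0 @ T01
```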
  • generating three-dimensional points according to the multiple images captured by the multi-camera shooting device further includes:
  • Step b01: Extract feature point information from each of the multiple images;
  • Step b02: Generate bag-of-words information according to the feature point information;
  • Step b03: Perform matching calculation on at least two images having the same feature descriptor in the bag-of-words information to obtain a matching relationship between the two matching images;
  • Step b04: Calculate the relative conversion relationship between every two images among all the images according to the matching relationship;
  • Step b05: Generate three-dimensional points based on the relative conversion relationships.
  • the feature point information of each target image can be extracted by FAST feature point extraction: traverse each pixel in each target image, select the 16 surrounding pixels on a circle of radius 3 centered on the current pixel, and compare them in turn; if the grayscale difference is greater than a set threshold, the pixel is marked as a feature point.
  • the set threshold can be set according to the actual situation, and the embodiments of the present application place no special restriction on this. ORB feature point extraction or SURF feature point extraction can also be chosen, as long as the feature point information of each target image can be conveniently extracted; the embodiments of the present application place no special restriction on this.
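  • For illustration, a minimal OpenCV sketch of the FAST extraction described above; the threshold value and image path are assumptions:

```python
# Hedged sketch of FAST corner detection (radius-3 circle of 16 pixels,
# intensity-difference threshold); the threshold of 20 is an assumption.
import cv2

img = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical image
fast = cv2.FastFeatureDetector_create(threshold=20)
keypoints = fast.detect(img, None)
# An ORB or SURF detector could be substituted here, as noted above.
```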
  • the feature point information can be all the pixels in an area centered on the feature point or covering the feature point, or it can be parameter information of a single pixel or multiple pixels.
  • the purpose is to enable subsequent word bag information to be generated based on the feature point information.
  • the feature point information can be in various forms, as long as it can facilitate the generation of subsequent word bag information.
  • the embodiments of the present application do not make special limitations on this.
  • bag-of-words information is generated from the feature point information, specifically by clustering the feature point information into visual words, which together constitute the bag-of-words information.
  • for example, if the feature point information consists of all the pixels in multiple areas, and each area contains lakes and grassland, then corresponding bag-of-words information containing lakes and grassland can be generated.
  • step b03 at least two target images with the same feature descriptor in the word bag information are matched and calculated to obtain the matching relationship between the two matched target images.
  • the corresponding target images in the word bag information are matched according to the feature descriptor through loop detection.
  • the feature descriptor refers to a descriptor (Descriptor), which is a data structure that describes features.
  • the dimension of a descriptor can be multidimensional and is used to describe feature points.
  • the acquisition method is to take the image feature point as the center, take an S*S neighborhood window, randomly select a pair of points in the window, compare the pixel intensities of the two, and perform a binary assignment; this is repeated for N randomly selected pairs of points to form a binary code. This code is the description of the feature point, that is, the feature descriptor.
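  • The following is a rough numpy sketch of such a binary descriptor (BRIEF-style); the window size S, pair count N and fixed random pattern are assumptions:

```python
# Rough sketch of a BRIEF-style binary descriptor as described above.
import numpy as np

def binary_descriptor(img, kp, S=31, N=256):
    """N intensity comparisons in an SxS window around keypoint kp=(y, x)."""
    pairs = np.random.default_rng(0).integers(0, S, size=(N, 4))  # fixed pattern
    y, x = kp
    half = S // 2
    patch = img[y - half:y + half + 1, x - half:x + half + 1]
    bits = patch[pairs[:, 0], pairs[:, 1]] < patch[pairs[:, 2], pairs[:, 3]]
    return np.packbits(bits)  # the binary code that describes the feature
```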
  • in step b03, after the matching relationship is obtained, erroneous matches can further be filtered out by geometric filtering in order to improve the accuracy of the matching relationship.
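  • As an assumed concrete form of this matching-plus-filtering step, an OpenCV sketch using ORB descriptors and RANSAC on the fundamental matrix (one common choice of geometric filter; the patent does not name one):

```python
# Hedged sketch: descriptor matching followed by geometric filtering.
import cv2
import numpy as np

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)  # img1, img2 assumed loaded
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
good = [m for m, ok in zip(matches, mask.ravel()) if ok]  # filtered matches
```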
  • step b04 the relative transformation relationship of the target image is calculated based on the matching relationship.
  • the matching relationship obtained in step b03 is used to calculate the relative transformation relationship of the extrinsic parameters of each pair of matched target images, and then rotation averaging and translation averaging are performed.
  • rotation averaging refers to estimating the absolute rotation of each camera from given relative rotation measurements;
  • translation averaging refers to estimating the absolute position of each camera from given relative translation measurements.
  • both the relative rotation measurements and the relative translation measurements can be obtained from the relative conversion relationships.
  • the L2 norm can be used because, in the iterative optimization process, minimizing the L2 norm amounts to minimizing a sum of squares, which the solver handles efficiently and converges on quickly.
  • alternatively, the L1 norm can be used, because the L1 norm responds more stably to noise.
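  • As one hedged illustration of the L2 case, a least-squares translation-averaging sketch; the measurement layout and the choice to anchor camera 0 are assumptions:

```python
# Sketch of L2 translation averaging: solve t_j - t_i = t_ij in the
# least-squares sense; camera 0 is anchored at the origin to fix the gauge.
import numpy as np

def translation_averaging(n_cams, rel):
    """rel: list of (i, j, t_ij) with t_ij already in a common rotated frame."""
    A = np.zeros((3 * len(rel) + 3, 3 * n_cams))
    b = np.zeros(3 * len(rel) + 3)
    for r, (i, j, t_ij) in enumerate(rel):
        A[3 * r:3 * r + 3, 3 * j:3 * j + 3] = np.eye(3)
        A[3 * r:3 * r + 3, 3 * i:3 * i + 3] = -np.eye(3)
        b[3 * r:3 * r + 3] = t_ij
    A[-3:, 0:3] = np.eye(3)  # anchor camera 0 at the origin
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t.reshape(n_cams, 3)
```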
  • the optimizing calculation of the three-dimensional point according to the first geographical location and the posture transformation relationship further includes:
  • Step b06: Determine the projection matrix used for optimization according to the formula Pi = k·[Rc|tc]·[Ri|ti], where Pi is the projection matrix, [Rc|tc] is the pose transformation relationship, [Ri|ti] is the first geographical location, k is the intrinsic parameter of either the master camera or a slave camera, i is the sequence number of the first geographical location, and c is the sequence number of the camera;
  • Step b07: Calculate the minimized reprojection error of the three-dimensional points according to the projection matrix, using the formula min Σo ‖Pi·x − xo‖², where x is the three-dimensional point, xo is the two-dimensional feature point obtained after reprojecting the three-dimensional point, and o is the sequence number of the three-dimensional point.
  • in step b06, the conversion relationship [Rc|tc] is the pose transformation relationship obtained in the preceding steps.
  • the camera intrinsic parameter k can be the main camera intrinsic parameter or the slave camera intrinsic parameter. It can be optimized for different cameras according to actual conditions. It only needs to be able to obtain an accurate projection matrix in the end.
  • when the camera intrinsic parameter k is the master camera intrinsic parameter, the optimized first pose obtained after the optimization calculation is the first pose of the master camera.
  • the reprojection error is minimized by the least squares method, that is, the distance between each three-dimensional point reprojected onto the two-dimensional image plane and its corresponding two-dimensional feature point is minimized.
  • the Ceres solver can be used to iteratively solve for the optimal solution; different tools can also be used to assist the calculation according to actual conditions, and this application places no special restriction on this.
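  • To make the projection and error terms concrete, a minimal sketch follows; scipy.optimize.least_squares stands in for the Ceres solver mentioned above, and the array layouts are assumptions:

```python
# Sketch of Pi = k·[Rc|tc]·[Ri|ti] and the reprojection residuals that
# are minimized; the data layout and scipy substitute are assumptions.
import numpy as np
from scipy.optimize import least_squares

def projection_matrix(k, T_c, T_i):
    """k: 3x3 intrinsics; T_c, T_i: 4x4 homogeneous [R|t] transforms."""
    return k @ (T_c @ T_i)[:3, :]

def residuals(points_flat, P_list, obs):
    """obs: list of ((o, i), xo) pairs, point o observed as 2D point xo by pose i."""
    pts = points_flat.reshape(-1, 3)
    res = []
    for (o, i), xo in obs:
        xh = P_list[i] @ np.append(pts[o], 1.0)
        res.extend(xh[:2] / xh[2] - xo)  # projected minus observed
    return np.asarray(res)

# Usage sketch: sol = least_squares(residuals, pts0.ravel(), args=(P_list, obs))
```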
  • the method further includes:
  • Step d01: According to the reprojection error obtained after optimizing the three-dimensional points, eliminate points whose pixel error is greater than 4 pixels from the three-dimensional points;
  • Step d02: Eliminate points whose observation angle is less than 2 degrees from the three-dimensional points;
  • Step d03: Globally optimize the three-dimensional points.
  • step d01 the pixel error can be calculated by the reprojection error.
  • the formula for calculating the reprojection error is ‖Pi·x − xo‖, where Pi is the projection matrix, x is the three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the serial number of the three-dimensional point.
  • the pixel error is the difference between the position of the 3D point projected onto the 2D plane and the position of the corresponding 2D feature point.
  • an observation point refers to a 3D point generated from the multiple images captured by the multi-camera shooting device. If a 3D point can be observed by two cameras at the same time, the angle formed at the 3D point by the straight lines to the two cameras is the observation angle; if the largest observation angle among all observation angles of the same point is less than 2 degrees, the point is eliminated.
  • when the observation angle is less than 2 degrees, the angle subtended at the point by the two cameras is particularly small, and in this case the generated 3D point often has a large error.
  • when the reprojection error is greater than 4 pixels, that is, when the difference between the 3D point projected onto the 2D plane and the 2D pixel position is greater than 4 pixels, the error of the 3D point can likewise be considered large. Therefore, by eliminating points with a pixel error greater than 4 pixels and points with an observation angle of less than 2 degrees, the accuracy of the remaining 3D points is higher and the global optimization of the 3D points works better.
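  • A hedged numpy sketch of these two rejection rules (the array layouts are assumptions):

```python
# Keep only 3D points with reprojection error <= 4 px and a maximum
# observation (parallax) angle >= 2 degrees.
import numpy as np

def keep_mask(points, errors_px, cam_centers_per_point):
    """points: (N,3); errors_px: (N,); cam_centers_per_point: list of (Mi,3)."""
    keep = errors_px <= 4.0
    for n, centers in enumerate(cam_centers_per_point):
        rays = centers - points[n]
        rays /= np.linalg.norm(rays, axis=1, keepdims=True)
        cosines = np.clip(rays @ rays.T, -1.0, 1.0)
        max_angle = np.degrees(np.arccos(cosines.min()))  # largest pairwise angle
        keep[n] &= max_angle >= 2.0
    return keep
```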
  • the camera pose estimation method further includes:
  • Step e01: Calculate the second geographical location of each slave camera according to the first geographical location of the master camera and the pose transformation relationship;
  • Step e02: Optimize the three-dimensional points according to the second geographical location and the pose transformation relationship to obtain the optimized second pose of each slave camera.
  • calculating the second geographical location of a slave camera based on the first geographical location of the master camera and the pose transformation relationship means that the second geographical location of the slave camera is obtained by combining the first geographical location with the calculated pose transformation relationship; the second geographical location refers to the position data of the slave camera in the world coordinate system.
  • in step e02, optimizing the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain the optimized second pose of each slave camera means that the second geographic location and the pose transformation relationship are substituted into the optimization process as optimization parameters, the three-dimensional points are optimized, and the optimized second pose of the slave camera is obtained.
  • the optimization of the three-dimensional points not only depends on the matching relationship of the feature points of multiple images, but also can obtain more accurate optimization results when the matching relationship of the feature points is not accurate under complex terrain, thereby improving the applicability and the effect of three-dimensional reconstruction optimization.
  • the method further includes:
  • Step f01: Obtain the historical pose transformation relationship of each slave camera relative to the master camera, calculated according to the first extrinsic parameter and the second extrinsic parameter during the last operation of the multi-camera shooting device; use the historical pose transformation relationship as the pose transformation relationship, and jump to the step of obtaining the first geographic location of the master camera through the sensor.
  • step f01 since each camera in the multi-camera shooting device is rigidly connected, the relative position relationship of each camera will not change, so the historical posture transformation relationship obtained in the previous operation can be read multiple times and reused.
  • the data obtained in this aerial photography can also be saved for subsequent use.
  • the data can be saved as a JSON-format file, or as other data types according to actual conditions; it is only necessary to ensure that the data can be easily read and reused, and the embodiments of the present application place no special limitation on this.
  • each operation after the first operation can directly use the fixed and unchanging data calculated in the previous operation, which simplifies the operation process and improves the calculation efficiency.
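  • For illustration, a minimal sketch of saving and reloading the pose transformation relationships as JSON; the file name and dictionary layout are assumptions:

```python
# Persist the fixed pose transformation relationships for reuse in
# later operations, as described above.
import json
import numpy as np

def save_transforms(path, transforms):
    """transforms: {camera_id: 4x4 ndarray}."""
    with open(path, "w") as f:
        json.dump({k: v.tolist() for k, v in transforms.items()}, f)

def load_transforms(path):
    with open(path) as f:
        return {k: np.array(v) for k, v in json.load(f).items()}

# save_transforms("pose_transforms.json", {"slave_1": T01})  # hypothetical ids
```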
  • Fig. 2 shows a functional block diagram of a camera pose estimation device 200 according to an embodiment of the present invention.
  • the device comprises: a first acquisition module 210 , a first calculation module 220 , a second acquisition module 230 and a second calculation module 240 .
  • a first acquisition module 210 is used to acquire external parameters of each camera in a multi-camera shooting device including at least two cameras, wherein the relative position relationship between each camera is fixed, and the at least two cameras include a master camera and one or more slave cameras;
  • a first calculation module 220 is used to determine a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculate a posture transformation relationship of each of the slave cameras relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter;
  • a second acquisition module 230 configured to acquire a first geographical location of the main camera through a sensor
  • the second calculation module 240 is used to generate three-dimensional points according to the multiple images taken by the multi-camera shooting device, and optimize the three-dimensional points according to the first geographical location and the posture transformation relationship to obtain the optimized first posture of each camera.
  • the first calculation module 220 further includes:
  • a first calculation unit, configured to calculate the master camera extrinsic parameters and the slave camera extrinsic parameters according to a plurality of images taken by the master camera and the slave camera at the same track position;
  • a second calculation unit, configured to calculate the conversion relationship between the image taken by the master camera and the image taken by the slave camera according to the master camera extrinsic parameter and the slave camera extrinsic parameter;
  • a third calculation unit, configured to determine the conversion relationship of the image as the pose transformation relationship.
  • the second calculation module 240 further includes:
  • a fourth computing unit configured to extract feature point information of each of the plurality of images
  • a fifth computing unit configured to generate bag-of-words information according to the feature point information
  • a sixth calculation unit configured to perform a matching calculation on at least two images having the same feature descriptor in the bag-of-words information to obtain a matching relationship between the two matching images
  • a seventh calculation unit configured to calculate a relative transformation relationship between every two images in all images according to the matching relationship
  • An eighth calculation unit is used to generate the three-dimensional point according to the relative transformation relationship.
  • the second calculation module 240 further includes:
  • a tenth calculation unit, configured to calculate the minimized reprojection error of the three-dimensional points according to the projection matrix, using the formula min Σo ‖Pi·x − xo‖², where x is the three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the sequence number of the three-dimensional point.
  • the camera pose estimation apparatus 200 further includes:
  • a first elimination module is used to eliminate points whose pixel errors are greater than 4 pixels from the three-dimensional points according to the reprojection errors obtained after optimizing and calculating the three-dimensional points;
  • a second elimination module is used to eliminate points whose angles between observation points and the three-dimensional points are less than 2 degrees;
  • the optimization module is used to perform global optimization on the three-dimensional points.
  • the camera pose estimation apparatus 200 further includes:
  • a third calculation module, configured to calculate the second geographical location of each slave camera according to the first geographical location of the master camera and the pose transformation relationship;
  • a fourth calculation module, configured to optimize the three-dimensional points according to the second geographical location and the pose transformation relationship to obtain the optimized second pose of each slave camera.
  • the camera pose estimation apparatus 200 further includes:
  • a third acquisition module, configured to obtain the historical pose transformation relationship of each slave camera relative to the master camera, calculated according to the first extrinsic parameter and the second extrinsic parameter during the last operation of the multi-camera shooting device, to use the historical pose transformation relationship as the pose transformation relationship, and to jump to the step of obtaining the first geographic location of the master camera through the sensor.
  • FIG3 shows a schematic structural diagram of a camera pose estimation device according to an embodiment of the present invention.
  • the specific embodiment of the present invention does not limit the specific implementation of the camera pose estimation device.
  • the camera pose estimation device may include: a processor 302 , a memory 306 , a communication interface 304 , and a communication bus 308 .
  • the processor 302, the memory 306 and the communication interface 304 communicate with each other via the communication bus 308.
  • the memory 306 is used to store at least one program 310 , and the program 310 enables the processor 302 to execute the relevant steps in the above-mentioned camera pose estimation method embodiment.
  • An embodiment of the present invention further provides a computer-readable storage medium in which at least one program is stored; when the program runs on a camera pose estimation device, the camera pose estimation device can execute the camera pose estimation method in any of the above method embodiments.
  • modules in the devices in the embodiments may be adaptively changed and arranged in one or more devices different from the embodiments.
  • the modules or units or components in the embodiments may be combined into one module or unit or component, and may furthermore be divided into a plurality of submodules or subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless otherwise expressly stated, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

A camera pose estimation method and apparatus, and a computer-readable storage medium. The camera pose estimation method comprises: acquiring extrinsic parameters of each camera in a multi-camera photographic apparatus that comprises at least two cameras (110); determining, from the extrinsic parameters, a first extrinsic parameter of a master camera and a second extrinsic parameter of each slave camera, and according to the first extrinsic parameter and the second extrinsic parameter, calculating a pose transformation relationship of each slave camera relative to the master camera (120); acquiring a first geographical location of the master camera by means of a sensor (130); and generating a three-dimensional point according to a plurality of images captured by the multi-camera photographic apparatus, and performing optimization calculation on the three-dimensional point according to the first geographical location and the pose transformation relationship, so as to obtain an optimized first pose of each camera (140). By means of the method, the embodiments of the present application can solve the problem of an optimization effect being poor due to a feature point matching effect being relatively poor when a captured image has relatively poor ground texture, and the embodiments of the present application have relatively low dependence on feature point matching, and can improve the optimization effect when the ground texture is relatively poor.

Description

Camera pose estimation method, device and computer-readable storage medium

Technical Field

The embodiments of the present invention relate to the field of unmanned aerial vehicle (UAV) aerial survey technology, and specifically to a camera pose estimation method, device, equipment and computer-readable storage medium.

Background Art

At present, with the continuous advancement of science and technology, aerial photogrammetry can produce topographic maps and image maps at various scales, and can also build terrain databases that provide basic data for geographic information systems and land information systems, enabling finer land planning and management. The precise geographic information data provided by aerial photogrammetry can also be used to build high-precision maps, which facilitates positioning during travel. Aerial photogrammetry often uses aerial triangulation, that is, using continuously captured aerial photographs with a certain overlap and a small number of field control points to establish, by photogrammetric methods, a route model or regional network model corresponding to the field, so as to obtain the plane coordinates and elevations of the densified points. UAV aerial survey technology is a powerful supplement to traditional aerial photogrammetry: it is flexible, efficient, fast, fine and accurate, has low operating costs, and has a wide range of applications. Using UAV aerial survey technology, a UAV carrying multiple cameras for aerial photography can effectively capture both orthographic and oblique images; the combined frame of multiple cameras is large, and a high-altitude operation can complete the flight mission in one pass.

With existing UAV aerial survey technology, when a UAV equipped with multiple cameras performs aerial photography over ground with poor texture characteristics, such as large forests or large areas of water, the matching of feature points is often poor.
发明内容Summary of the invention
鉴于上述问题,本发明实施例提供了一种相机位姿估计方法,用于解决现有技术中存在的问题。In view of the above problems, an embodiment of the present invention provides a camera pose estimation method to solve the problems existing in the prior art.
根据本发明实施例的一个方面,提供了一种相机位姿估计方法,其特征在于,包括: According to one aspect of an embodiment of the present invention, a camera pose estimation method is provided, characterized by comprising:
获取包括至少两个相机的多相机拍摄装置中每个所述相机的外参,其中每个所述相机之间的相对位置关系固定不变,所述至少两个相机包括一个主相机和一个或多个从相机;Acquire external parameters of each camera in a multi-camera shooting device including at least two cameras, wherein the relative position relationship between each of the cameras is fixed, and the at least two cameras include a master camera and one or more slave cameras;
从所述外参中确定所述主相机的第一外参和每个所述从相机的第二外参,根据所述第一外参和所述第二外参计算得到每个所述从相机相对于所述主相机的位姿变换关系;Determine a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculate a posture transformation relationship of each of the slave cameras relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter;
通过传感器获取所述主相机的第一地理位置;Acquire a first geographic location of the main camera through a sensor;
根据所述多相机拍摄装置拍摄的多张图像生成三维点,根据所述第一地理位置和所述位姿变换关系对所述三维点进行优化计算,得到每个所述相机优化后的第一位姿。Three-dimensional points are generated according to the multiple images taken by the multi-camera shooting device, and the three-dimensional points are optimized and calculated according to the first geographical location and the posture transformation relationship to obtain the optimized first posture of each camera.
在一种可选的方式中,所述从所述外参中确定所述主相机的第一外参和每个所述从相机的第二外参,根据所述第一外参和所述第二外参计算得到每个所述从相机相对于所述主相机的位姿变换关系,进一步包括:In an optional manner, determining a first extrinsic parameter of the master camera and a second extrinsic parameter of each of the slave cameras from the extrinsic parameters, and calculating a pose transformation relationship of each of the slave cameras relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter further includes:
根据所述主相机与所述从相机在同一轨迹位置拍摄的多张图像计算得到所述主相机外参和所述从相机外参;The main camera extrinsic parameter and the slave camera extrinsic parameter are calculated according to a plurality of images taken by the main camera and the slave camera at the same track position;
calculating, according to the master camera extrinsic parameter and the slave camera extrinsic parameter, the conversion relationship from the image taken by the master camera to the image taken by the slave camera, with the calculation formula:

T01 = Tw0' * Tw1

wherein T01 is the conversion relationship, Tw0 is the master camera extrinsic parameter, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameter;

determining the conversion relationship of the image as the pose transformation relationship.
In an optional manner, generating three-dimensional points according to the multiple images captured by the multi-camera shooting device further includes:

extracting feature point information of each of the multiple images;

generating bag-of-words information according to the feature point information;

performing matching calculation on at least two images having the same feature descriptor in the bag-of-words information to obtain a matching relationship between the two matching images;

calculating the relative conversion relationship between every two images among all the images according to the matching relationship;

generating the three-dimensional points according to the relative conversion relationships.
In an optional manner, optimizing the three-dimensional points according to the first geographical location and the pose transformation relationship further includes:

determining the projection matrix used for optimization according to the following formula:

Pi = k·[Rc|tc]·[Ri|ti]

wherein Pi is the projection matrix, [Rc|tc] is the pose transformation relationship, [Ri|ti] is the first geographical location, k is the intrinsic parameter of either the master camera or a slave camera, i is the sequence number of the first geographical location, and c is the sequence number of the pose transformation relationship;

calculating the minimized reprojection error of the three-dimensional points according to the projection matrix, with the formula:

min Σo ‖Pi·x − xo‖²

wherein x is the three-dimensional point, xo is the two-dimensional feature point obtained after reprojecting the three-dimensional point, and o is the serial number of the three-dimensional point.
In an optional manner, after optimizing the three-dimensional points according to the first geographical location and the pose transformation relationship, the camera pose estimation method further includes:

eliminating, according to the reprojection error obtained after optimizing the three-dimensional points, points whose pixel error is greater than 4 pixels from the three-dimensional points;

eliminating points whose observation angle is less than 2 degrees from the three-dimensional points;

globally optimizing the three-dimensional points.

In an optional manner, the camera pose estimation method further includes:

calculating the second geographical location of each slave camera according to the first geographical location of the master camera and the pose transformation relationship;

optimizing the three-dimensional points according to the second geographical location and the pose transformation relationship to obtain the optimized second pose of each slave camera.

In an optional manner, after acquiring the external parameters of each camera in the multi-camera shooting device including at least two cameras, the method further includes:

obtaining the pose transformation relationship of each of the slave cameras relative to the master camera calculated according to the first extrinsic parameter and the second extrinsic parameter during the last operation of the multi-camera shooting device.
According to another aspect of the embodiments of the present invention, a camera pose estimation apparatus is provided, comprising:
a first acquisition module, configured to obtain the extrinsic parameters of each camera in a multi-camera capture device comprising at least two cameras, wherein the relative position relationship between the cameras is fixed, and the at least two cameras include one master camera and one or more slave cameras;
a first calculation module, configured to determine, from the extrinsic parameters, first extrinsic parameters of the master camera and second extrinsic parameters of each slave camera, and to calculate the pose transformation relationship of each slave camera relative to the master camera from the first extrinsic parameters and the second extrinsic parameters;
a second acquisition module, configured to obtain a first geographic location of the master camera through a sensor;
a second calculation module, configured to generate three-dimensional points from multiple images captured by the multi-camera capture device, and to optimize the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain an optimized first pose of each camera.
According to another aspect of the embodiments of the present invention, a camera pose estimation device is provided, comprising:
a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another via the communication bus;
the memory is configured to store at least one program, and the program causes the processor to perform the operations of the camera pose estimation method described above.
According to yet another aspect of the embodiments of the present invention, a computer-readable storage medium is provided, the storage medium storing at least one program, and the program causes a camera pose estimation device to perform the operations corresponding to the method described above.
According to the camera pose estimation method, apparatus, device and computer-readable storage medium of the embodiments of the present invention, the extrinsic parameters of the master camera and of the slave cameras are obtained, the pose transformation relationship of the slave camera extrinsic parameters relative to the master camera extrinsic parameters is calculated, the first geographic location of the master camera is obtained, and the three-dimensional points generated from multiple target images captured by the master camera and the slave cameras are optimized according to the first geographic location and the pose transformation relationship. Because the first geographic location of the master camera is its actual location, and the physical relationship between the master camera and each slave camera is fixed, obtaining the pose transformation relationship of a slave camera relative to the master camera is equivalent to obtaining the actual location of that slave camera. Optimizing the three-dimensional points based on the actual camera locations solves the problem that, when the ground texture in the captured images is poor, weak feature point matching degrades the optimization. The method depends less on feature point matching, improves the optimization over poorly textured ground, improves optimization accuracy and applicability, and thereby improves the accuracy of three-dimensional reconstruction.
The above description is only an overview of the technical solutions of the embodiments of the present invention. To enable a clearer understanding of the technical means of the embodiments so that they can be implemented according to the contents of this specification, and to make the above and other objects, features and advantages of the embodiments more readily apparent, specific implementations of the present invention are set forth below.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings are used only to illustrate the embodiments and are not to be construed as limiting the present invention. The same reference symbols denote the same components throughout the drawings. In the drawings:
FIG. 1 is a schematic flowchart of a camera pose estimation method provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a camera pose estimation apparatus provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a camera pose estimation device provided by an embodiment of the present invention.
DETAILED DESCRIPTION
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present invention, it should be understood that the present invention can be implemented in various forms and is not limited to the embodiments set forth herein.
Regarding the now widely used three-dimensional reconstruction by UAV aerial survey, the inventor has noticed the following. In existing UAV aerial surveys that perform three-dimensional reconstruction through aerial triangulation, feature points are extracted from and matched across the images captured by each camera on the UAV, and only the matching information is used to perform bundle adjustment (BA) on the generated three-dimensional points. Because of this strong dependence on matching information, poor ground texture leads to insufficient stability, degraded feature point matching, and anomalies in the solved camera positions. To improve the stability of UAV aerial-survey three-dimensional reconstruction and broaden its applicability, it is particularly important to develop a camera pose estimation method that is more accurate, less dependent on matching relationships, and more widely applicable to different terrain scenes.
To solve the above problems, the inventor of the present application has designed a camera pose estimation method. The extrinsic parameters of the master camera and of the slave cameras are obtained, the pose transformation relationship between them is calculated, the first geographic location of the master camera is obtained, and the three-dimensional points generated by three-dimensional reconstruction are then optimized according to the first geographic location and the pose transformation relationship. This reduces the dependence of the optimization on feature point matching, so that over complex terrain the accuracy of the optimized three-dimensional points is less affected by inaccurate matching, improving the accuracy of three-dimensional reconstruction by UAV aerial survey.
FIG. 1 shows a flowchart of a camera pose estimation method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step 110: obtaining the extrinsic parameters of each camera in a multi-camera capture device comprising at least two cameras, wherein the relative position relationship between the cameras is fixed, and the at least two cameras include one master camera and one or more slave cameras.
In this step, the master camera is the camera in the multi-camera capture device that serves as the orthographic (nadir) lens, i.e., the camera aimed directly at the target. In three-dimensional reconstruction, images of the same location are captured from different angles so that feature points can be extracted to build a three-dimensional model, so the cameras usually point in different directions. There is generally only one master camera, and the remaining cameras are slave cameras. The embodiments of this application are described using a multi-camera capture device with one master camera and one or more slave cameras as an example.
In this embodiment, a geographic location acquisition device, such as any of various types of position sensors, must be mounted on the master camera of the multi-camera capture device. The other cameras serve as slave cameras; based on the geographic location of the master camera and the fixed relative position relationship between each slave camera and the master camera, the pose transformation relationship of each slave camera relative to the master camera can be obtained in subsequent steps, and the geographic location of each slave camera can further be obtained.
In this step, obtaining the extrinsic parameters of each camera in the multi-camera capture device means obtaining the extrinsic parameters of the master camera and of each slave camera. The extrinsic parameters of a camera are its parameters in the world coordinate system, including a rotation matrix R and a translation matrix T. The extrinsic parameters can be obtained in various ways. For example, an aerial triangulation computation over multiple images captured by the master camera and the slave cameras at the same trajectory point yields the master camera extrinsic parameters and the slave camera extrinsic parameters; they can also be obtained by camera self-calibration. There are likewise many self-calibration methods: the Tsai two-step method, Zhang's calibration method, actively controlling the camera to perform specific motions, hierarchical step-by-step calibration, or self-calibration based on the Kruppa equations. Different methods can be adopted according to the actual situation, and the embodiments of this application place no particular limitation on this.
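As an illustrative aside (not part of the claimed method), extrinsic parameters of this form are commonly packed into a single 4×4 homogeneous matrix, which turns the inversion and composition used in later steps into one-line matrix operations. A minimal numpy sketch, where R and t are assumed to be a 3×3 rotation and a 3-vector translation:

```python
import numpy as np

def extrinsics_to_matrix(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Pack a 3x3 rotation R and 3-vector translation t into a 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t.reshape(3)
    return T
```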
In this step, the cameras in the multi-camera capture device are rigidly connected to one another. If one or more cameras in the device were rotated to a different shooting angle, their orientation, and hence their pose, would change. Because the relative position relationship between the cameras is fixed, all cameras in the device move together; for example, all cameras rotate by the same angle at the same time, or the whole device translates by some distance. In such cases, even though the pose of each camera changes, the relative pose between the cameras remains unchanged.
Obtaining the extrinsic parameters of each camera in the multi-camera capture device provides the data basis for the subsequent camera pose estimation; computing from the camera extrinsic parameters yields relatively accurate results.
Step 120: determining, from the extrinsic parameters, the first extrinsic parameters of the master camera and the second extrinsic parameters of each slave camera, and calculating the pose transformation relationship of each slave camera relative to the master camera from the first extrinsic parameters and the second extrinsic parameters.
In this step, the master camera extrinsic parameters and the slave camera extrinsic parameters obtained in step 110 are used to calculate the pose transformation relationship between them. The pose transformation relationship is a transformation formula; it may also be a parameter, or one or more computational formulas. Its purpose is to allow the master camera extrinsic parameters and the slave camera extrinsic parameters to be converted into each other when combined with the pose transformation relationship. Different forms of the pose transformation relationship can be used according to the actual situation, as long as the conversion between the master and slave extrinsic parameters can be carried out conveniently; the embodiments of this application place no particular limitation on this.
By obtaining the pose transformation relationship between the master camera extrinsic parameters and the slave camera extrinsic parameters, the fixed relative position data between the master camera and the slave cameras can be added to the optimization in the subsequent camera pose estimation. The estimation can then use the relative positions of the master and slave cameras as a basis for optimization, rather than relying only on the matching of image feature points, which improves the applicability and stability of the optimization and makes the camera pose estimation more accurate.
Step 130: obtaining a first geographic location of the master camera through a sensor.
In this step, obtaining the first geographic location of the master camera means obtaining it directly through a sensor; the geographic location is the position data of the master camera in the world coordinate system.
In this step, the sensor may be a gyroscope or a GPS receiver, or another sensor. The first geographic location can be obtained through different sensors according to the actual situation, as long as the position data of the master camera in the world coordinate system can be obtained conveniently, either directly or through some computation; the embodiments of this application place no particular limitation on this.
Obtaining the first geographic location of the master camera through a sensor makes data acquisition simple and convenient, and the obtained geographic location is the accurate actual position of the master camera, providing an accurate data basis for the subsequent optimization.
Step 140: generating three-dimensional points from multiple images captured by the multi-camera capture device, and optimizing the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain an optimized first pose of each camera.
In this step, generating three-dimensional points from multiple images captured by the multi-camera capture device means that, after the master camera and the slave cameras capture multiple images, three-dimensional reconstruction is performed on those images to generate three-dimensional points. In the embodiments of this application, aerial triangulation is performed on the multiple images captured by the master camera and the slave cameras to generate the three-dimensional points.
The three-dimensional points are generated by extracting feature points from the multiple images; depending on the number of images, the generated points will accordingly be sparse or dense.
In this step, optimizing the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain an optimized first pose of each camera means substituting the first geographic location and the pose transformation relationship into the optimization as optimization parameters, optimizing the three-dimensional points, and obtaining the optimized first pose.
In this step, the optimization of the three-dimensional points according to the first geographic location and the pose transformation relationship can use bundle adjustment. Bundle adjustment refers to refining, from a visual reconstruction, the optimal 3D model and camera parameters (intrinsic and extrinsic). The bundles of light rays reflected from each feature point converge at the optical center after the camera poses and feature point positions have been optimally adjusted; this process is abbreviated BA. The first geographic location and the pose transformation relationship can be substituted into the bundle adjustment to optimize the three-dimensional points.
By generating three-dimensional points from multiple target images captured by the master camera and the slave cameras, and then optimizing the points according to the first geographic location and the pose transformation relationship, the optimization no longer depends only on the matching of feature points across images. Even over complex terrain, where the feature point matching of the target images is imprecise, relatively accurate optimization results can still be obtained. At the same time, bringing the first geographic location of the master camera and the pose transformation relationship into the optimization improves the applicability and accuracy of three-dimensional reconstruction over complex terrain.
Combining steps 110, 120, 130 and 140 above, the camera pose estimation method provided by this application obtains the master camera extrinsic parameters and the slave camera extrinsic parameters, calculates the pose transformation relationship between them, obtains the first geographic location of the master camera, and optimizes the three-dimensional points using the pose transformation relationship and the first geographic location. In this way, data acquisition becomes more convenient: a sensor for obtaining the first geographic location only needs to be mounted on the master camera. Because the relative positions of the cameras do not change, the precise position of every camera can be calculated from the pose transformation relationship and the first geographic location obtained by the sensor on the master camera, so no additional sensor needs to be installed on every camera, which reduces equipment cost. Furthermore, by bringing the first geographic location of the master camera and the pose transformation relationship, which reflects the relative positions of the master and slave cameras, into the optimization, the optimization does not depend entirely on the matching of feature points across the multiple target images. When performing three-dimensional reconstruction over complex terrain with imprecise feature point matching, including the first geographic location and the pose transformation relationship in the optimization improves the accuracy of the reconstructed three-dimensional points and broadens the applicability of the method.
In one embodiment of the present invention, step 120 further includes:
Step a01: calculating the master camera extrinsic parameters and the slave camera extrinsic parameters from multiple images captured by the master camera and the slave camera at the same trajectory position;
Step a02: calculating, from the master camera extrinsic parameters and the slave camera extrinsic parameters, the transformation relationship from an image captured by the master camera to an image captured by the slave camera, according to the formula:
T01 = Tw0' * Tw1
where T01 is the transformation relationship, Tw0 is the master camera extrinsic parameters, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameters;
Step a03: determining the image transformation relationship as the pose transformation relationship.
In step a01, calculating the master camera extrinsic parameters and the slave camera extrinsic parameters from multiple images captured by the master camera and the slave camera at the same trajectory position specifically means performing aerial triangulation on those images and obtaining the master camera extrinsic parameters and the slave camera extrinsic parameters from the result of the aerial triangulation.
Here, the same trajectory position means the following: during a UAV aerial survey, the flight route of the UAV forms a trajectory, and one trajectory contains multiple trajectory positions, which may also be called trajectory points; each trajectory position has fixed coordinates in the world coordinate system. Calculating the master and slave camera extrinsic parameters from multiple images captured by the master camera and the slave camera at the same trajectory position can thus be understood as calculating them from multiple images captured by the master camera and the slave camera at the same location.
In step a02, Tw0 is the extrinsic parameters of the master camera in the world coordinate system, and Tw0' is the transformation from the master camera coordinate system to the world coordinate system. Tw0' can be obtained by taking the inverse matrix of the master camera extrinsic parameters Tw0, and it is used to compute the transformation relationship between the master camera extrinsic parameters and the slave camera extrinsic parameters, namely Tw0'*Tw1.
By obtaining the transformation relationship T01 between the master camera extrinsic parameters and the slave camera extrinsic parameters, the positions of the master camera and the slave camera in the same coordinate system can be converted into each other. Because the relative positions of the cameras remain essentially unchanged, the position data of all other cameras can be derived from the position data of any one camera. Data obtained in this way is not affected by shooting conditions or image quality; in the subsequent optimization, the extrinsic parameters of any camera can serve as optimization data through the transformation relationship T01. This reduces the dependence on feature point matching, so accurate optimization results can be obtained across many kinds of images: poor image quality no longer degrades the optimization through inaccurate feature point matching, which improves applicability.
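For illustration only, the relationship T01 = Tw0' * Tw1 can be sketched with the 4×4 pose matrices assumed above; the function name and the use of numpy are assumptions for this sketch, not part of the claimed method:

```python
import numpy as np

def relative_transform(Tw0: np.ndarray, Tw1: np.ndarray) -> np.ndarray:
    """T01 = inv(Tw0) @ Tw1: the slave camera's pose expressed in the master frame.

    With a rigid mount, T01 is computed once and stays fixed across the flight.
    """
    return np.linalg.inv(Tw0) @ Tw1
```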
In one embodiment of the present invention, generating three-dimensional points from multiple images captured by the multi-camera capture device further includes:
Step b01: extracting feature point information from each of the multiple images;
Step b02: generating bag-of-words information from the feature point information;
Step b03: performing matching computation on at least two images that share the same feature descriptor in the bag-of-words information, to obtain the matching relationship between the two matched images;
Step b04: calculating, from the matching relationships, the relative transformation relationship between every two images among all the images;
Step b05: generating the three-dimensional points from the relative transformation relationships.
In step b01, the feature point information of each target image can be extracted with the FAST detector: each pixel of the image is traversed; 16 surrounding pixels are selected on a circle of radius 3 centered at the current pixel and compared in turn, and the pixel is marked as a feature point if the gray-level difference exceeds a set threshold. The threshold can be set according to the actual situation, and the embodiments of this application place no particular limitation on it. ORB or SURF feature extraction may also be chosen; any method that conveniently extracts the feature point information of each target image suffices, and the embodiments of this application place no particular limitation on this.
The feature point information may be all pixels in a region centered on or covering a feature point, or the parameter information of a single pixel or several pixels; its purpose is to allow the subsequent bag-of-words information to be generated from it. The feature point information can take various forms depending on the actual situation, as long as it facilitates the generation of the bag-of-words information; the embodiments of this application place no particular limitation on this.
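A minimal sketch of this kind of feature extraction, assuming OpenCV is available; the patent does not prescribe a library, so the detector choice, parameters, and the input file name here are illustrative only:

```python
import cv2

img = cv2.imread("aerial_frame.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input image

# FAST corner detector: compares each pixel against 16 pixels on a radius-3 circle,
# marking it as a keypoint when enough of them differ by more than the threshold.
fast = cv2.FastFeatureDetector_create(threshold=25)
keypoints = fast.detect(img, None)

# Binary descriptors (here ORB's) for the later matching and clustering steps.
orb = cv2.ORB_create()
keypoints, descriptors = orb.compute(img, keypoints)
```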
In step b02, generating bag-of-words information from the feature point information specifically means clustering the feature point information to produce visual words, which constitute the bag-of-words information. For example, if the feature point information consists of all pixels in several regions, each containing lakes and grassland, bag-of-words information containing lakes and grassland can be generated accordingly.
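As a sketch of how such clustering into visual words might look, assuming scikit-learn; the synthetic descriptors and the vocabulary size of 256 are arbitrary stand-ins, not values from the patent:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
all_desc = rng.random((5000, 32), dtype=np.float32)  # stand-in for descriptors stacked from all images

# Cluster descriptors into a visual vocabulary; each cluster center is one "word".
vocab = KMeans(n_clusters=256, n_init=10, random_state=0).fit(all_desc)

def bow_histogram(desc: np.ndarray) -> np.ndarray:
    """An image's bag-of-words vector: histogram of its descriptors' word labels."""
    words = vocab.predict(desc.astype(np.float32))
    return np.bincount(words, minlength=256)
```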
In step b03, performing matching computation on at least two target images that share the same feature descriptor in the bag-of-words information, to obtain the matching relationship between the two matched target images, specifically means matching the corresponding target images in the bag-of-words information by their feature descriptors through loop-closure detection. A feature descriptor is a data structure that characterizes a feature; a descriptor may have many dimensions and is used to describe a feature point. It is obtained as follows: take an S×S neighborhood window centered on the image feature point, randomly select a pair of points within the window, compare the two pixel values and assign a binary value, then continue randomly selecting N pairs of points and repeat the binary assignment, forming a binary code. This code is the description of the feature point, i.e., the feature descriptor.
In step b03, after the matching relationships are obtained, mismatches can further be filtered out by geometric filtering to improve the accuracy of the matching relationships.
In step b04, calculating the relative transformation relationships of the target images from the matching relationships specifically means using the matching relationships obtained in step b03 to calculate the relative transformation relationship between the extrinsic parameters of each pair of matched target images, and then performing rotation averaging and translation averaging. Rotation averaging estimates the absolute rotations of the cameras given relative rotation measurements, and translation averaging estimates the absolute positions of the cameras given relative translation measurements; both kinds of measurement can be obtained from the relative transformation relationships. For rotation averaging, the L2 norm can be used: during the solver's optimization iterations the L2 norm is a sum of squares, which the solver handles well and converges on quickly. For translation averaging, the L1 norm can be used, because the L1 norm responds more stably to noise.
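A tiny numerical illustration of the noise behavior motivating that choice (not from the patent): the L2-optimal estimate of repeated scalar measurements is their mean, while the L1-optimal estimate is their median, which a single gross outlier barely moves.

```python
import numpy as np

measurements = np.array([1.0, 1.1, 0.9, 1.05, 9.0])  # one gross outlier

l2_estimate = measurements.mean()      # minimizes sum of squared residuals -> 2.61
l1_estimate = np.median(measurements)  # minimizes sum of absolute residuals -> 1.05
```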
In one embodiment of the present invention, optimizing the three-dimensional points according to the first geographic location and the pose transformation relationship further includes:
Step b06: determining the projection matrix used for optimization according to the following formula:
Pi = k · [Rc|tc] · [Ri|ti]
where Pi is the projection matrix, [Rc|tc] is the transformation relationship, [Ri|ti] is the first geographic location, k is the intrinsic parameters of either the master camera or a slave camera, i is the index of the first geographic location, and c is the index of the camera;
Step b07: calculating the minimized reprojection error of the three-dimensional points from the projection matrix according to the formula:
min Σo ||Pi·x − xo||²
where x is a three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the index of the three-dimensional point.
In step b06, the transformation relationship [Rc|tc] is multiplied by the first geographic location [Ri|ti], and the camera intrinsic parameters k are included in the computation to obtain the projection matrix Pi, which expresses the mapping from three-dimensional points to two-dimensional points and provides the data basis for the subsequent computation.
The camera intrinsic parameters k may be those of the master camera or of a slave camera; the optimization can target different cameras according to the actual situation, as long as an accurate projection matrix is ultimately obtained. For example, when optimizing for the master camera, k is the master camera intrinsic parameters, and the optimized first pose obtained from the computation is the first pose of the master camera.
Here k is the intrinsic parameters of either the master camera or a slave camera; if k is the master camera intrinsic parameters, then [Rc|tc] is the identity matrix.
In step b07, the reprojection error is computed by least squares, i.e., the distance between the reprojection of each three-dimensional point onto the two-dimensional image plane and the corresponding feature point is minimized. When computing the reprojection error, Ceres tools can be used to iteratively solve for the optimum; other tools may also assist the computation according to the actual situation, and this application places no particular limitation on this.
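The patent names Ceres; as a stand-in, the same structure can be sketched with scipy.optimize.least_squares. Everything below (the function names, the observation layout, and optimizing only the 3D points while holding the projection matrices fixed) is an illustrative assumption, not the patent's implementation:

```python
import numpy as np
from scipy.optimize import least_squares

def projection_matrix(K, Rc_tc, Ri_ti):
    """Pi = K · [Rc|tc] · [Ri|ti], with each [R|t] given as 3x4 and lifted to 4x4."""
    def lift(Rt):
        T = np.eye(4)
        T[:3, :] = Rt
        return T
    return K @ (lift(Rc_tc) @ lift(Ri_ti))[:3, :]

def residuals(points_flat, Pis, observations):
    """Stacked pixel-space reprojection errors over all (point, camera) observations."""
    pts = points_flat.reshape(-1, 3)
    res = []
    for (pt_idx, cam_idx), xo in observations.items():
        x_h = np.append(pts[pt_idx], 1.0)       # homogeneous 3D point
        proj = Pis[cam_idx] @ x_h
        res.extend(proj[:2] / proj[2] - xo)     # projected pixel minus observed pixel
    return np.asarray(res)

# With points0 (initial Nx3 array), Pis and observations prepared elsewhere:
# result = least_squares(residuals, points0.ravel(), args=(Pis, observations))
```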
In one embodiment of the present invention, after optimizing the three-dimensional points according to the first geographic location and the pose transformation relationship, the method further includes:
Step d01: removing, according to the reprojection errors obtained after optimizing the three-dimensional points, those three-dimensional points whose pixel error exceeds 4 pixels;
Step d02: removing those three-dimensional points whose observation angle is less than 2 degrees;
Step d03: performing global optimization on the three-dimensional points.
In step d01, the pixel error can be derived from the reprojection error. The reprojection error is computed as:
min Σo ||Pi·x − xo||²
where x is a three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the index of the three-dimensional point. From this formula, the pixel error of a point is:
|Pi·x − xo|
where Pi is the projection matrix, x is the three-dimensional point, and xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point. The pixel error is the difference between the projection of the three-dimensional point onto the two-dimensional plane and the position of the two-dimensional feature point.
In step d02, an observation point is a three-dimensional point generated from the multiple images captured by the multi-camera capture device. If a three-dimensional point can be observed by two cameras simultaneously, the angle formed by the two lines from the point to the two cameras is the observation angle. If the largest of all the observation angles of the same observation point is less than 2 degrees, that observation point is removed.
When the observation angle is less than 2 degrees, the viewing rays of the two cameras are nearly parallel, and a three-dimensional point generated under these conditions usually carries a large error. Likewise, when the reprojection error exceeds 4 pixels, i.e., the projection of the three-dimensional point onto the two-dimensional plane lies more than 4 pixels from the corresponding two-dimensional feature point, the point can also be considered to have a large error. Therefore, removing points with pixel error greater than 4 and points whose largest observation angle is less than 2 degrees leaves more accurate three-dimensional points, which in turn improves the subsequent global optimization.
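A sketch of the two rejection tests, assuming the per-point camera centers and reprojection errors have already been computed elsewhere; the 4-pixel and 2-degree thresholds come from the text above, everything else is illustrative:

```python
import numpy as np

def max_observation_angle_deg(point: np.ndarray, cam_centers: list) -> float:
    """Largest angle (degrees) subtended at `point` by any pair of observing cameras."""
    rays = [(c - point) / np.linalg.norm(c - point) for c in cam_centers]
    best = 0.0
    for a in range(len(rays)):
        for b in range(a + 1, len(rays)):
            cosang = np.clip(np.dot(rays[a], rays[b]), -1.0, 1.0)
            best = max(best, np.degrees(np.arccos(cosang)))
    return best

def keep_point(pixel_error: float, point: np.ndarray, cam_centers: list) -> bool:
    """Keep a 3D point only if it passes both the 4-pixel and the 2-degree test."""
    return pixel_error <= 4.0 and max_observation_angle_deg(point, cam_centers) >= 2.0
```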
In one embodiment of the present invention, the camera pose estimation method further includes:
Step e01: calculating a second geographic location of each slave camera from the first geographic location of the master camera and the pose transformation relationship;
Step e02: optimizing the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain an optimized second pose of each slave camera.
In step e01, calculating the second geographic location of a slave camera from the first geographic location of the master camera and the pose transformation relationship means combining the obtained first geographic location with the computed pose transformation relationship to derive the second geographic location of the slave camera; the second geographic location is the position data of the slave camera in the world coordinate system.
In step e02, optimizing the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain an optimized second pose of each slave camera means substituting the second geographic location and the pose transformation relationship into the optimization as optimization parameters, optimizing the three-dimensional points, and obtaining the optimized second pose.
By optimizing the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain the second pose of each slave camera, the optimization of the three-dimensional points no longer depends only on the matching of feature points across images. Even over complex terrain, where feature point matching is imprecise, relatively accurate optimization results can still be obtained, improving applicability and the quality of the three-dimensional reconstruction.
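For step e01, deriving each slave camera's position reduces to composing the master pose with the fixed transform computed earlier; a one-function sketch under the 4×4-matrix convention assumed in the earlier snippets:

```python
import numpy as np

def second_geographic_location(Tw0: np.ndarray, T01: np.ndarray) -> np.ndarray:
    """Slave camera position in the world frame: Tw1 = Tw0 @ T01 (rigid mount assumed)."""
    Tw1 = Tw0 @ T01
    return Tw1[:3, 3]  # translation part: the slave camera's world position
```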
In one embodiment of the present invention, after obtaining the extrinsic parameters of each camera in the multi-camera capture device comprising at least two cameras, the method further includes:
Step f01: obtaining the historical pose transformation relationship of each slave camera relative to the master camera that was calculated from the first extrinsic parameters and the second extrinsic parameters during the previous operation of the multi-camera capture device, using the historical pose transformation relationship as the pose transformation relationship, and jumping to the step of obtaining the first geographic location of the master camera through a sensor.
In step f01, because the cameras in the multi-camera capture device are rigidly connected and their relative positions do not change, the historical pose transformation relationship obtained in the previous operation can be read and reused many times. After each aerial survey operation, the data acquired during that operation can likewise be saved for later use. The data can be saved as a JSON file, or in other formats according to the actual situation; it only needs to be convenient to read and reuse repeatedly, and the embodiments of this application place no particular limitation on this.
After the historical pose transformation relationship has been obtained and adopted as the pose transformation relationship, the camera pose transformation relationship does not need to be recomputed, and the method can jump directly to the next step.
By obtaining the historical pose transformation relationship of each slave camera relative to the master camera, calculated from the first extrinsic parameters and the second extrinsic parameters during the previous operation, and using it as the pose transformation relationship, every operation after the first can directly reuse the fixed data already computed in a previous operation, simplifying the workflow and improving computational efficiency.
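A sketch of persisting and reloading such a fixed transform as JSON, as the text suggests; the file path and schema are illustrative assumptions:

```python
import json
import numpy as np

def save_transforms(path: str, T01_per_slave: dict) -> None:
    """Persist each slave's fixed 4x4 transform for reuse in later operations."""
    with open(path, "w") as f:
        json.dump({name: T.tolist() for name, T in T01_per_slave.items()}, f)

def load_transforms(path: str) -> dict:
    """Reload the saved transforms as numpy arrays, keyed by slave camera name."""
    with open(path) as f:
        return {name: np.array(T) for name, T in json.load(f).items()}
```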
FIG. 2 shows a functional block diagram of a camera pose estimation apparatus 200 according to one embodiment of the present invention. As shown in FIG. 2, the apparatus includes: a first acquisition module 210, a first calculation module 220, a second acquisition module 230 and a second calculation module 240.
The first acquisition module 210 is configured to obtain the extrinsic parameters of each camera in a multi-camera capture device comprising at least two cameras, wherein the relative position relationship between the cameras is fixed, and the at least two cameras include one master camera and one or more slave cameras.
The first calculation module 220 is configured to determine, from the extrinsic parameters, the first extrinsic parameters of the master camera and the second extrinsic parameters of each slave camera, and to calculate the pose transformation relationship of each slave camera relative to the master camera from the first extrinsic parameters and the second extrinsic parameters.
The second acquisition module 230 is configured to obtain the first geographic location of the master camera through a sensor.
The second calculation module 240 is configured to generate three-dimensional points from multiple images captured by the multi-camera capture device, and to optimize the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain the optimized first pose of each camera.
In some embodiments, the first calculation module 220 further includes:
a first calculation unit, configured to calculate the master camera extrinsic parameters and the slave camera extrinsic parameters from multiple images captured by the master camera and the slave camera at the same trajectory position;
a second calculation unit, configured to calculate, from the master camera extrinsic parameters and the slave camera extrinsic parameters, the transformation relationship from an image captured by the master camera to an image captured by the slave camera, according to the formula T01 = Tw0'*Tw1, where T01 is the transformation relationship, Tw0 is the master camera extrinsic parameters, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameters;
a third calculation unit, configured to determine the image transformation relationship as the pose transformation relationship.
In some embodiments, the second calculation module 240 further includes:
a fourth calculation unit, configured to extract the feature point information of each of the multiple images;
a fifth calculation unit, configured to generate bag-of-words information from the feature point information;
a sixth calculation unit, configured to perform matching computation on at least two images that share the same feature descriptor in the bag-of-words information, to obtain the matching relationship between the two matched images;
a seventh calculation unit, configured to calculate, from the matching relationships, the relative transformation relationship between every two images among all the images;
an eighth calculation unit, configured to generate the three-dimensional points from the relative transformation relationships.
In some embodiments, the second calculation module 240 further includes:
a ninth calculation unit, configured to determine the projection matrix used for optimization according to the formula Pi = k·[Rc|tc]·[Ri|ti], where Pi is the projection matrix, [Rc|tc] is the pose transformation relationship, [Ri|ti] is the first geographic location, k is the intrinsic parameters of either the master camera or a slave camera, i is the index of the first geographic location, and c is the index of the pose transformation relationship;
a tenth calculation unit, configured to calculate the minimized reprojection error of the three-dimensional points from the projection matrix according to the formula min Σo ||Pi·x − xo||², where x is a three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the index of the three-dimensional point.
In some embodiments, the camera pose estimation apparatus 200 further includes:
a first culling module, configured to remove, according to the reprojection errors obtained after optimizing the three-dimensional points, those three-dimensional points whose pixel error exceeds 4 pixels;
a second culling module, configured to remove those three-dimensional points whose observation angle is less than 2 degrees;
an optimization module, configured to perform global optimization on the three-dimensional points.
In some embodiments, the camera pose estimation apparatus 200 further includes:
a third calculation module, configured to calculate the second geographic location of each slave camera from the first geographic location of the master camera and the pose transformation relationship;
a fourth calculation module, configured to optimize the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain the optimized second pose of each slave camera.
In some embodiments, the camera pose estimation apparatus 200 further includes:
a third acquisition module, configured to obtain the historical pose transformation relationship of each slave camera relative to the master camera calculated from the first extrinsic parameters and the second extrinsic parameters during the previous operation of the multi-camera capture device, to use the historical pose transformation relationship as the pose transformation relationship, and to jump to the step of obtaining the first geographic location of the master camera through a sensor.
FIG. 3 shows a schematic structural diagram of a camera pose estimation device according to an embodiment of the present invention; the specific embodiments of the present invention do not limit the specific implementation of the device.
As shown in FIG. 3, the camera pose estimation device may include: a processor 302, a memory 306, a communication interface 304 and a communication bus 308.
The processor 302, the memory 306 and the communication interface 304 communicate with one another via the communication bus 308.
The memory 306 is configured to store at least one program 310, and the program 310 causes the processor 302 to perform the relevant steps of the camera pose estimation method embodiments described above.
An embodiment of the present invention further provides a computer-readable storage medium storing at least one program; when the program runs on a camera pose estimation device, the device can execute the camera pose estimation method of any of the above method embodiments.
The algorithms or displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the embodiments of the present invention are not directed to any particular programming language. It should be understood that the content of the present invention described herein can be implemented in a variety of programming languages, and the description of a specific language above is intended to disclose the best mode of the present invention.
The specification provided here sets out numerous specific details. It will be understood, however, that embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that in the above description of exemplary embodiments, the features of the embodiments are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of one or more of the inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the devices of the embodiments may be adaptively changed and arranged in one or more devices different from those of the embodiments. The modules, units or components of the embodiments may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. Unless otherwise specified, the steps in the above embodiments should not be construed as limiting the order of execution.

Claims (10)

1. A camera pose estimation method, characterized by comprising:
    acquiring extrinsic parameters of each camera in a multi-camera shooting device comprising at least two cameras, wherein the relative positional relationship between the cameras is fixed, and the at least two cameras comprise one master camera and one or more slave cameras;
    determining a first extrinsic parameter of the master camera and a second extrinsic parameter of each slave camera from the extrinsic parameters, and calculating a pose transformation relationship of each slave camera relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter;
    acquiring a first geographic location of the master camera through a sensor; and
    generating three-dimensional points according to a plurality of images captured by the multi-camera shooting device, and performing optimization calculation on the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain an optimized first pose of each camera.
2. The camera pose estimation method according to claim 1, characterized in that determining the first extrinsic parameter of the master camera and the second extrinsic parameter of each slave camera from the extrinsic parameters, and calculating the pose transformation relationship of each slave camera relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter, further comprises:
    calculating the master camera extrinsic parameter and the slave camera extrinsic parameter according to a plurality of images captured by the master camera and the slave camera at the same trajectory position;
    calculating, according to the master camera extrinsic parameter and the slave camera extrinsic parameter, a conversion relationship from the image captured by the master camera to the image captured by the slave camera, with the formula:
    T01 = Tw0' * Tw1
    wherein T01 is the conversion relationship, Tw0 is the master camera extrinsic parameter, Tw0' is the inverse matrix of Tw0, and Tw1 is the slave camera extrinsic parameter; and
    determining the conversion relationship between the images as the pose transformation relationship.
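Purely as an illustration of the formula in claim 2 (not part of the claims): a minimal numpy sketch, assuming Tw0 and Tw1 are 4×4 homogeneous extrinsic matrices expressed in a common convention; the function name is hypothetical.

```python
import numpy as np

def relative_pose(Tw0: np.ndarray, Tw1: np.ndarray) -> np.ndarray:
    """Pose transformation T01 of a slave camera relative to the master,
    per the formula T01 = Tw0' * Tw1, where Tw0' is the inverse of Tw0."""
    return np.linalg.inv(Tw0) @ Tw1
```

Because the cameras are rigidly mounted, T01 is constant over the whole trajectory, so it can be computed once from calibration images and reused for every frame.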
3. The camera pose estimation method according to claim 1, characterized in that generating three-dimensional points according to the plurality of images captured by the multi-camera shooting device further comprises:
    extracting feature point information of each of the plurality of images;
    generating bag-of-words information according to the feature point information;
    performing matching calculation on at least two images having the same feature descriptor in the bag-of-words information to obtain a matching relationship between the two matched images;
    calculating a relative transformation relationship between every two images among all the images according to the matching relationship; and
    generating the three-dimensional points according to the relative transformation relationship.
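As an illustration of the matching step in claim 3 (not part of the claims): a sketch using OpenCV, with the caveat that it substitutes brute-force ORB descriptor matching for the claimed bag-of-words lookup, purely for brevity.

```python
import cv2

def match_images(img0, img1):
    """Extract feature point information from two images and match their
    descriptors; the bag-of-words index of claim 3 is replaced here by
    brute-force matching for brevity of illustration."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp0, des0 = orb.detectAndCompute(img0, None)
    kp1, des1 = orb.detectAndCompute(img1, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des0, des1)  # the matching relationship
    return kp0, kp1, sorted(matches, key=lambda m: m.distance)
```

From such pairwise matches, a relative transformation between each image pair can be estimated (for example via an essential-matrix decomposition) and the three-dimensional points triangulated.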
4. The camera pose estimation method according to claim 1, characterized in that performing optimization calculation on the three-dimensional points according to the first geographic location and the pose transformation relationship further comprises:
    determining a projection matrix used for the optimization according to the following formula:
    Pi = k · [Rc|tc] · [Ri|ti]
    wherein Pi is the projection matrix, [Rc|tc] is the pose transformation relationship, [Ri|ti] is the first geographic location, k is the intrinsic parameter of any one of the master camera and the slave cameras, i is the index of the first geographic location, and c is the index of the camera; and
    calculating a minimized reprojection error of the three-dimensional points according to the projection matrix, with the formula:
    min Σo ||Pi · x − xo||²
    wherein x is the three-dimensional point, xo is the two-dimensional feature point obtained by reprojecting the three-dimensional point, and o is the index of the three-dimensional point.
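For illustration of claim 4 (not part of the claims): a numpy sketch assuming k is the 3×3 intrinsic matrix and each [R|t] is lifted to a 4×4 homogeneous transform before composition; the helper names are hypothetical.

```python
import numpy as np

def lift(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Lift a 3x3 rotation R and 3-vector translation t to a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def projection_matrix(k, Rc, tc, Ri, ti):
    """Pi = k · [Rc|tc] · [Ri|ti]: the fixed rig transform composed with the
    geo-referenced pose, then projected through the intrinsics (3x4 result)."""
    return k @ (lift(Rc, tc) @ lift(Ri, ti))[:3, :]

def reprojection_error(P, X, x_obs):
    """Pixel distance between the projection of 3D point X and the 2D
    feature x_obs; the optimization minimizes the sum of squared residuals."""
    u = P @ np.append(X, 1.0)
    return np.linalg.norm(u[:2] / u[2] - np.asarray(x_obs))
```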
5. The camera pose estimation method according to claim 1, characterized in that, after performing optimization calculation on the three-dimensional points according to the first geographic location and the pose transformation relationship, the camera pose estimation method further comprises:
    eliminating, according to the reprojection errors obtained from the optimization calculation, those three-dimensional points whose pixel error is greater than 4 pixels;
    eliminating those three-dimensional points whose observation-ray angle is less than 2 degrees; and
    performing global optimization on the three-dimensional points.
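A sketch of the culling in claim 5 (not part of the claims), assuming per-point reprojection errors and observation angles have already been computed; the thresholds follow the claim (4 pixels, 2 degrees) and the names are hypothetical.

```python
import numpy as np

def ray_angle_deg(c0, c1, X):
    """Angle in degrees between the observation rays from two camera
    centers c0 and c1 to the 3D point X."""
    v0 = (X - c0) / np.linalg.norm(X - c0)
    v1 = (X - c1) / np.linalg.norm(X - c1)
    return np.degrees(np.arccos(np.clip(v0 @ v1, -1.0, 1.0)))

def cull(points, errors_px, angles_deg, max_err=4.0, min_angle=2.0):
    """Drop points whose pixel error exceeds 4 px or whose observation
    angle is below 2 degrees; the survivors go to global optimization."""
    return [p for p, e, a in zip(points, errors_px, angles_deg)
            if e <= max_err and a >= min_angle]
```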
6. The camera pose estimation method according to claim 1, characterized in that the camera pose estimation method further comprises:
    calculating a second geographic location of each slave camera according to the first geographic location of the master camera and the pose transformation relationship; and
    performing optimization calculation on the three-dimensional points according to the second geographic location and the pose transformation relationship to obtain an optimized second pose of each slave camera.
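For illustration of claim 6 (not part of the claims): the slave camera's geographic pose follows by composing the master's geo-referenced pose with the fixed rig transform. The sketch assumes camera-to-world 4×4 matrices; under the opposite (world-to-camera) convention the composition order would flip.

```python
import numpy as np

def slave_geolocation(T_w_master: np.ndarray, T01: np.ndarray) -> np.ndarray:
    """Second geographic location of a slave camera: the master's
    camera-to-world pose composed with the master-to-slave transform T01."""
    return T_w_master @ T01
```

The resulting slave poses can then be fed into the same optimization as in claim 1 to obtain the optimized second pose of each slave camera.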
7. The camera pose estimation method according to claim 1, characterized in that, after acquiring the extrinsic parameters of each camera in the multi-camera shooting device comprising at least two cameras, the camera pose estimation method further comprises:
    acquiring the pose transformation relationship of each slave camera relative to the master camera calculated according to the first extrinsic parameter and the second extrinsic parameter during a previous operation of the multi-camera shooting device.
8. A camera pose estimation apparatus, characterized by comprising:
    a first acquisition module, configured to acquire extrinsic parameters of each camera in a multi-camera shooting device comprising at least two cameras, wherein the relative positional relationship between the cameras is fixed, and the at least two cameras comprise one master camera and one or more slave cameras;
    a first calculation module, configured to determine a first extrinsic parameter of the master camera and a second extrinsic parameter of each slave camera from the extrinsic parameters, and to calculate a pose transformation relationship of each slave camera relative to the master camera according to the first extrinsic parameter and the second extrinsic parameter;
    a second acquisition module, configured to acquire a first geographic location of the master camera through a sensor; and
    a second calculation module, configured to generate three-dimensional points according to a plurality of images captured by the multi-camera shooting device, and to perform optimization calculation on the three-dimensional points according to the first geographic location and the pose transformation relationship to obtain an optimized first pose of each camera.
9. A camera pose estimation device, characterized by comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; and
    the memory is configured to store at least one program, the program causing the processor to perform the operations of the camera pose estimation method according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that at least one program is stored in the storage medium, and when the program runs on a camera pose estimation device, the program causes the camera pose estimation device to perform the operations of the camera pose estimation method according to any one of claims 1-7.
PCT/CN2023/124164 2022-11-04 2023-10-12 Camera pose estimation method and apparatus, and computer-readable storage medium WO2024093635A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022113751658 2022-11-04
CN202211375165.8A CN115423863B (en) 2022-11-04 2022-11-04 Camera pose estimation method and device and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2024093635A1 true WO2024093635A1 (en) 2024-05-10

Family

ID=84208028

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/124164 WO2024093635A1 (en) 2022-11-04 2023-10-12 Camera pose estimation method and apparatus, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN115423863B (en)
WO (1) WO2024093635A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423863B (en) * 2022-11-04 2023-03-24 深圳市其域创新科技有限公司 Camera pose estimation method and device and computer readable storage medium
CN116580083B (en) * 2023-07-13 2023-09-22 深圳创维智慧科技有限公司 Pose estimation method and device of image pickup device, electronic device and storage medium


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111566701B (en) * 2020-04-02 2021-10-15 深圳市瑞立视多媒体科技有限公司 Method, device and equipment for calibrating scanning field edge under large-space environment and storage medium
WO2022204855A1 (en) * 2021-03-29 2022-10-06 华为技术有限公司 Image processing method and related terminal device
CN115187658B (en) * 2022-08-29 2023-03-24 合肥埃科光电科技股份有限公司 Multi-camera visual large target positioning method, system and equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3113481A1 (en) * 2015-06-29 2017-01-04 Thomson Licensing Apparatus and method for controlling geo-tagging in a camera
CN111862180A (en) * 2020-07-24 2020-10-30 三一重工股份有限公司 Camera group pose acquisition method and device, storage medium and electronic equipment
CN114219852A (en) * 2020-08-31 2022-03-22 北京魔门塔科技有限公司 Multi-sensor calibration method and device for automatic driving vehicle
CN115205383A (en) * 2022-06-17 2022-10-18 深圳市优必选科技股份有限公司 Camera pose determination method and device, electronic equipment and storage medium
CN115423863A (en) * 2022-11-04 2022-12-02 深圳市其域创新科技有限公司 Camera pose estimation method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN115423863A (en) 2022-12-02
CN115423863B (en) 2023-03-24


Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23884567

Country of ref document: EP

Kind code of ref document: A1