CN115588085A - Axis reconstruction method, axis reconstruction equipment and storage medium

Info

Publication number: CN115588085A
Authority: CN (China)
Prior art keywords: axis, image, spatial, target, images
Legal status: Pending
Application number: CN202110758575.XA
Other languages: Chinese (zh)
Inventors: 谢诗超, 边威, 黄帅
Current and original assignee: Autonavi Software Co Ltd
Application filed by Autonavi Software Co Ltd
Priority: CN202110758575.XA

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 — Geographic models
    • G06T7/00 — Image analysis
    • G06T7/60 — Analysis of geometric attributes
    • G06T7/70 — Determining position or orientation of objects or cameras
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10028 — Range image; Depth image; 3D point clouds


Abstract

Embodiments of the present application provide an axis reconstruction method, axis reconstruction device, and storage medium. In these embodiments, the spatial prediction axis of a target shaft is jointly reconstructed using an image set whose images contain at least one identical target shaft, and the spatial prediction axis is optimized to obtain the spatial axis of the target shaft, realizing precision-progressive shaft axis reconstruction. The intermediate result of this axis reconstruction approach is the spatial prediction axis of the target shaft, whose data volume is much smaller than that of the environment-space point cloud produced as an intermediate result by dense or semi-dense reconstruction, so the storage space consumed by intermediate results can be reduced. On the other hand, compared with dense or semi-dense shaft-axis reconstruction, the axis reconstruction approach provided by the embodiments does not require an environment point cloud to be constructed in advance, which helps reduce the amount of computation of axis reconstruction.

Description

Axis reconstruction method, axis reconstruction equipment and storage medium
Technical Field
The present application relates to the field of electronic map technologies, and in particular, to a method, an apparatus, and a storage medium for axis reconstruction.
Background
High-precision (high-definition) maps play a key role in fields such as autonomous driving and smart cities. At present, the collection of high-precision map production data mainly relies on collection vehicles carrying various sensors (such as lidar, cameras, and inertial navigation). However, collection vehicles are expensive to build and cannot be deployed at scale, which limits the volume and efficiency of data collection and ultimately constrains the production and updating of high-precision maps. To ease this capacity bottleneck in high-precision map production, research into low-cost means of collecting high-precision map production data, such as vision-based solutions, has come into focus.
In the real world, rod-shaped objects (shafts) such as light poles and signboard posts are among the most frequently occurring geographic elements along roads and play an important role in the positioning and driving planning of autonomous driving, so they need to be expressed in high-precision maps. The cross sections of such shafts come in various shapes, such as round, square, and triangular, making direct vectorization of the outer surface difficult. In autonomous driving scenarios, the axis of a shaft usually plays a key role in vehicle self-positioning, so vectorized reconstruction of shaft axes has become an important production link of high-precision maps.
Current vision solutions mostly use dense or semi-dense reconstruction of shaft axes, specifically: first, a point cloud of the environment space is constructed based on the pictures/images acquired by a vision sensor; then the shaft is extracted from the point cloud, and its axis is extracted for vectorization. However, this solution is computationally intensive, its intermediate results consume a large amount of storage space, and it is not suitable for engineering applications. In addition, it is sensitive to illumination and observation angles, so the accuracy of shaft-axis reconstruction is low.
Disclosure of Invention
Aspects of the present disclosure provide an axis reconstruction method, apparatus, and storage medium to improve accuracy of axis reconstruction.
The embodiment of the application provides an axis reconstruction method, which comprises the following steps:
acquiring an image set acquired by a visual sensor, wherein the image set comprises more than two frames of images;
acquiring internal and external parameters of the vision sensor when acquiring the images, and an observation axis of the target shaft contained in each frame of image, wherein the images contain at least one identical target shaft;
selecting two frames of images from the image set each time; constructing a plurality of spatial planes according to the observation axes and the internal and external parameters of the two frames of images;
calculating a spatially predicted axis of the target shaft from the plurality of spatial planes;
optimizing the spatial prediction axis to obtain a spatial axis of the target shaft.
An embodiment of the present application further provides a computer device, including: a memory and a processor; wherein the memory is used for storing a computer program; the processor is coupled to the memory for executing the computer program for performing the steps in the above-described axis reconstruction method.
An embodiment of the present application further provides a collection device, including: a machine body provided with a memory, a processor, and a vision sensor; the memory is used for storing a computer program; the vision sensor is used for acquiring images; and the processor is coupled to the memory for executing the computer program to perform the steps of the axis reconstruction method described above.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions, wherein the computer instructions, when executed by one or more processors, cause the one or more processors to perform the steps of the above-mentioned axis reconstruction method.
An embodiment of the present application further provides a computer program product, including: a computer program; the computer program is executed by a processor to implement the steps in the axis reconstruction method described above.
In the embodiments of the application, the spatial prediction axis of the target shaft is jointly reconstructed using an image set containing at least one identical target shaft, and the spatial prediction axis is optimized to obtain the spatial axis of the target shaft. The intermediate result of this axis reconstruction approach is the spatial prediction axis of the target shaft, whose data volume is small; compared with the environment-space point cloud produced as an intermediate result by dense or semi-dense shaft-axis reconstruction, the consumption of storage space by intermediate results can be reduced. On the other hand, compared with dense or semi-dense shaft-axis reconstruction, the axis reconstruction method provided by the embodiments of the application does not need to construct an environment point cloud first, which helps reduce the amount of computation of axis reconstruction.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1a is a schematic flowchart of an axis reconstruction method provided in an embodiment of the present application;
fig. 1b is a schematic flowchart of a spatial prediction axis construction method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a computer device provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an acquisition device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
In some embodiments of the present application, an image set containing at least one identical target shaft is used to jointly reconstruct the spatial prediction axis of the target shaft, and the spatial prediction axis is optimized to obtain the spatial axis of the target shaft. The intermediate result of this axis reconstruction approach is the spatial prediction axis of the target shaft; compared with the environment-space point cloud produced as an intermediate result by dense or semi-dense shaft-axis reconstruction, its data volume is smaller, so the consumption of storage space by intermediate results can be reduced. On the other hand, compared with dense or semi-dense shaft-axis reconstruction, the axis reconstruction approach provided by the embodiments of the application does not need to construct an environment point cloud first, which helps reduce the amount of computation of axis reconstruction.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
It should be noted that: the same reference numerals are used in the following figures and embodiments to denote the same object, and thus, once an object is defined in one figure or embodiment, it need not be discussed further in subsequent figures and embodiments.
Fig. 1a is a schematic flowchart of an axis reconstruction method provided in an embodiment of the present application. As shown in fig. 1a, the method comprises:
101. acquiring an image set acquired by a visual sensor; the image set contains more than 2 frames of images, i.e. at least 3 frames of images.
102. And acquiring internal and external parameters when the vision sensor acquires the images and an observation axis of the target rod contained in each frame of image. The images in the set of images contain at least one identical target shaft.
103. And selecting two frames of images from the image set each time, and constructing a plurality of spatial planes according to the observation axis and the internal and external parameters corresponding to the two frames of images selected each time.
104. From the plurality of spatial planes, a spatially predicted axis of the target shaft is calculated.
105. The spatial prediction axis is optimized to obtain the spatial axis of the target shaft.
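For orientation, steps 101 to 105 can be sketched as the following pipeline. This is a minimal sketch in Python; the helper names `detect_observation_axis`, `ransac_predict_axis`, and `refine_axis` are placeholders for the operations detailed later in this description, not names from the original disclosure:

```python
def reconstruct_axis(images, K, poses):
    """Sketch of steps 101-105: coarse-to-fine reconstruction of a shaft axis."""
    # Step 102: one observation axis (a 2D line) of the target shaft per frame.
    axes_2d = [detect_observation_axis(img) for img in images]
    # Steps 103-104: spatial planes from image pairs, intersected into a
    # spatial prediction axis (RANSAC over pairs; see the sketches below).
    pred_axis, inliers = ransac_predict_axis(axes_2d, K, poses)
    # Step 105: nonlinear refinement over the inlier observations.
    return refine_axis(pred_axis, [axes_2d[i] for i in inliers],
                       [poses[i] for i in inliers], K)
```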
The axis reconstruction method provided by this embodiment is applicable to any device, apparatus, or module with an image processing function. In some embodiments, the axis reconstruction method may be performed by an autonomous mobile device, such as a robot, an autonomous vehicle, or an unmanned aerial vehicle. For an autonomous mobile device, the vision sensor can be mounted on the device and collect images of its environment.
In other embodiments, the axis reconstruction method may also be performed by a computer device in communication with the vision sensor. The computer device may be a terminal device such as a smart phone, a tablet computer, a desktop computer or a notebook computer, and may also be a service end device. The server device may be a single server device, a cloud server array, or a Virtual Machine (VM) running in the cloud server array. In addition, the server device may also refer to other computing devices with corresponding service capabilities, for example, a terminal device (running a service program) such as a computer.
In the present embodiment, the specific implementation form of the vision sensor is not limited. The visual sensor may be a stand-alone visual sensor, such as a camera, video camera, or a camera deployed in an area. Of course, the vision sensor may be mounted on other devices. For example, a camera mounted on a terminal device, a drive recorder mounted on a motor vehicle, and the like. The camera can be a monocular camera, a binocular camera or a depth camera.
In practical application, three-dimensional environment reconstruction is performed based on an environment image acquired by a visual sensor, and the method is a technical means frequently adopted by map services such as positioning, navigation or map updating. When three-dimensional reconstruction is performed based on visual images, especially when three-dimensional reconstruction is performed based on monocular vision, a shaft is one of the most typical elements in the environment. Shaft refers primarily to an elongated object whose height is much greater than the cross-sectional dimension. For example, the shaft may be a light pole, a sign post, a traffic light post, and the like. The cross section of the rod-shaped object has various shapes such as round, square and triangle, and the direct vectorization of the outer surface is difficult. Therefore, reconstructing the shaft axis is often a key element in vehicle positioning, and thus becomes an important part of map construction.
In this embodiment, to reconstruct the spatial axis of the shaft, in step 101 a set of images acquired by a vision sensor may be acquired; the image set contains more than 2 frames of images, i.e. at least 3 frames, and each frame may contain an image of the target shaft. For an autonomous mobile device, the multi-frame environment images acquired by its camera can be used; for other computer devices, the images sent by the vision sensor may be acquired to form the image set. In either case, each frame of image contains an image of the target shaft.
Further, in step 102, the observation axis of the target shaft contained in each frame of image may be acquired, where two or more of the images contained in the image set contain at least one identical target shaft. Optionally, target detection may be performed on each frame of image to obtain the observation axis of the target shaft in that frame.
In the embodiments of the present application, the specific implementation of acquiring the viewing axis of the target shaft in each image is not limited. Optionally, each frame of image can be subjected to semantic segmentation respectively to determine the pixel coordinates of the target rod-shaped object in each frame of image; and performing linear regression on the pixel coordinates of the target rod in each frame of image to obtain the observation axis of the target rod in the frame of image. Considering that in spatial geometry, two points may define a straight line, the pixel coordinates of the two points may be used to represent the viewing axis of the target shaft.
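A sketch of this step, assuming the segmentation mask of the target shaft is already available; a total-least-squares (PCA) fit is used here because ordinary regression of one pixel coordinate on the other degenerates for near-vertical poles:

```python
import numpy as np

def observation_axis_from_mask(mask):
    """Fit a 2D line to the pixels labeled as the target shaft; return two points on it."""
    v, u = np.nonzero(mask)                        # pixel coordinates of shaft pixels
    pts = np.stack([u, v], axis=1).astype(float)
    center = pts.mean(axis=0)
    # Total least squares via SVD: dominant direction of the pixel cloud.
    _, _, vt = np.linalg.svd(pts - center)
    direction = vt[0]
    # Two points determine the observation axis (spanning the observed extent).
    t = (pts - center) @ direction
    return center + t.min() * direction, center + t.max() * direction
```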
Or, the images in the image set can be respectively input into the target detection model to output the spatial information of the target detection frame in each frame of image; and determining the observation axis of the target shaft in each frame of image according to the spatial information of the target detection frame in each frame of image. Alternatively, the central line of the target detection frame in each frame of image may be calculated according to the spatial information of the target detection frame in each frame of image, and the central line of the target detection frame in the frame of image may be used as the observation axis of the target rod in the frame of image, and the like.
Or, respectively extracting straight line features from the images in the image set, and determining the observation axis of the target rod in each frame of image according to the extracted straight line features of each frame of image. Optionally, edge detection may be performed on each frame of image to obtain a binary edge image of each frame of image; and then, extracting a characteristic curve of the binary edge image by adopting Hough transformation to obtain an observation axis of the frame image.
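A minimal sketch of this variant with OpenCV (the Canny and Hough threshold values are illustrative):

```python
import cv2
import numpy as np

def observation_axis_hough(image_gray):
    """Edge detection + Hough transform; returns the longest detected segment."""
    edges = cv2.Canny(image_gray, 50, 150)              # binary edge image
    segs = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                           threshold=80, minLineLength=100, maxLineGap=5)
    if segs is None:
        return None
    x1, y1, x2, y2 = max(segs[:, 0],
                         key=lambda s: np.hypot(s[2] - s[0], s[3] - s[1]))
    return (x1, y1), (x2, y2)    # two points on the observation axis
```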
In step 102, the internal and external parameters of the vision sensor during image acquisition may also be acquired. The parameters of a vision sensor include intrinsic parameters (internal parameters) and extrinsic parameters (external parameters). The intrinsic parameters are inherent to the vision sensor and relate to its own characteristics, such as the focal length of the vision sensor, the pixel size, the image center, and distortion parameters. The extrinsic parameters are the sensor's parameters in the world coordinate system, such as its pose data, which mainly includes the position and rotational orientation of the vision sensor.
The intrinsic parameters of the vision sensor are inherent and fixed, whereas the extrinsic parameters change with the pose of the vision sensor. Therefore, during image acquisition the intrinsic parameters of the vision sensor remain unchanged, while the extrinsic parameters may differ from frame to frame. This embodiment does not limit the specific way of acquiring the intrinsic parameters of the vision sensor and its extrinsic parameters during acquisition of the images in the image set.
Optionally, the intrinsic parameters of the vision sensor can be calibrated in advance, and its extrinsic parameters calibrated at the time each image is acquired. Alternatively, the internal and external parameters of the vision sensor can be calibrated from multiple frames of images: for example, a geometric model of the vision sensor's imaging may be built from multiple frames and solved to obtain the internal and external parameters of the vision sensor during acquisition of those frames.
Alternatively, the extrinsic parameters of the vision sensor during acquisition of multiple frames of images can be obtained through algorithms such as Visual Odometry (VO), Visual-Inertial Odometry (VIO), or relocalization. The main steps of a relocalization algorithm are: constructing a temporary map from the environmental information gathered during image acquisition, and comparing the constructed temporary map with a stored electronic map to determine the pose of the vision sensor when each frame was acquired.
Optionally, based on a matching algorithm, the candidate poses of the constructed temporary map are traversed over the stored electronic map. For example, with a grid size of 5 cm, a translation step of 5 cm is chosen so that the candidate poses of the temporary map cover the stored electronic map, and an angular step of 5 degrees is used for the orientation component of the poses. A candidate pose scores a point whenever a grid cell representing an obstacle in the temporary map hits a grid cell representing an obstacle in the stored electronic map; the pose with the highest score is determined to be the globally optimal pose. Then the matching rate of the globally optimal pose is calculated, and when it is greater than a preset matching-rate threshold, the globally optimal pose is determined to be the pose of the vision sensor when the frame was acquired.
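A schematic sketch of this grid search. The boolean obstacle-grid representation, the search ranges, and the 0.6 matching-rate threshold are assumptions for illustration; the description above only fixes the 5 cm and 5 degree steps:

```python
import numpy as np

def relocalize(temp_map_pts, stored_grid, x_range, y_range, res=0.05, ang_step=5):
    """Brute-force search over candidate poses, scoring obstacle-cell hits."""
    best_score, best_pose = -1, None
    for theta in np.deg2rad(np.arange(0, 360, ang_step)):   # 5-degree angular step
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        for x in np.arange(*x_range, res):                  # 5 cm translation step
            for y in np.arange(*y_range, res):
                pts = temp_map_pts @ R.T + np.array([x, y])
                idx = np.floor(pts / res).astype(int)
                ok = (idx[:, 0] >= 0) & (idx[:, 0] < stored_grid.shape[0]) \
                   & (idx[:, 1] >= 0) & (idx[:, 1] < stored_grid.shape[1])
                score = stored_grid[idx[ok, 0], idx[ok, 1]].sum()  # +1 per obstacle hit
                if score > best_score:
                    best_score, best_pose = score, (x, y, theta)
    match_rate = best_score / len(temp_map_pts)
    return best_pose if match_rate > 0.6 else None          # 0.6: assumed threshold
```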
Next, considering that the observation axis of the target shaft in an image corresponds to a spatial straight line in the world coordinate system, and that the extrinsic parameters of the vision sensor give its position and orientation in the world coordinate system, based on the principle that a straight line and a point determine a plane, in step 103 two frames of images may be selected from the image set each time, and a plurality of spatial planes constructed according to the observation axes and the internal and external parameters corresponding to the two selected frames. Here, the observation axis corresponding to an image refers to the observation axis of the target shaft contained in that image, and the internal and external parameters corresponding to an image refer to those of the vision sensor during acquisition of that image. The observation axis of the target shaft in each frame of image, together with the internal and external parameters of the vision sensor when acquiring that frame, determines one spatial plane. The number of spatial planes may be less than or equal to the number of images.
Further, since each spatial plane contains the spatial straight line corresponding to an observation axis of the target shaft, each spatial plane should in theory contain the spatial axis of the target shaft, and therefore the spatial planes should in theory intersect in a single spatial straight line. Based on this, in step 104, the spatial prediction axis of the target shaft may be calculated from a plurality of spatial planes; that is, the spatial prediction axis of the target shaft is jointly reconstructed using the multi-frame images. The spatial planes used to calculate the spatial prediction axis of the target shaft may be all of the spatial planes constructed from the multi-frame images, or a subset of them.
However, in practical use, due to observation errors and calculation errors, the spatial planes may not intersect in the same straight line. Therefore, in step 105, the spatial prediction axis can be optimized to obtain the spatial axis of the target shaft, realizing precision-progressive axis reconstruction.
In this embodiment, a spatial prediction axis of the target shaft is jointly reconstructed using an image set including at least one same target shaft, and the spatial prediction axis is optimized to obtain the spatial axis of the target shaft. The intermediate result of the axis reconstruction mode provided by the embodiment is the spatial prediction axis of the target rod, and compared with the intermediate result of the point cloud of the environmental space constructed in the dense reconstruction or semi-dense reconstruction rod axis mode, the intermediate result data volume of the axis reconstruction mode provided by the embodiment is smaller, and the consumption of the intermediate result on the storage space can be reduced.
On the other hand, in the shaft axis mode of dense reconstruction or semi-dense reconstruction, a point cloud of an environment space needs to be constructed based on a picture/image acquired by a visual sensor; and then extracting a rod-shaped object from the point cloud, and extracting an axis for vectorization. No matter in the process of constructing the point cloud of the environment space or in the process of extracting the rod-shaped object from the point cloud, the calculation process is complex, and the calculation amount is large. The axis reconstruction method provided by the embodiment does not need to construct an environment point cloud space, and the calculation amount of the axis reconstruction process can be reduced.
In addition, in this embodiment, the spatial prediction axis of the target shaft is jointly reconstructed using an image set containing at least one identical target shaft, and the spatial prediction axis is optimized to obtain the spatial axis of the target shaft, realizing precision-progressive shaft-axis reconstruction. This axis reconstruction approach is robust to noise and to abnormal observation axes, which helps improve the accuracy of axis reconstruction.
In the embodiments of the present application, the representation of the spatial prediction axis and of the spatial axis of the target shaft is not limited. Optionally, a spatial straight line may be represented as

$\mathcal{L} = (\mathbf{m}^T, \mathbf{l}^T)^T \in \mathbb{R}^6$

the Plücker-coordinate representation of a straight line in three-dimensional Euclidean space; these coordinates uniquely define a straight line in three dimensions. Here $\mathbf{l}$ represents a vector along the direction of the straight line, and $\mathbf{m}$ represents the spatial position of the line: it is the moment vector, perpendicular to the plane spanned by the line and the origin. Plücker coordinates satisfy the constraint

$\mathbf{m} \cdot \mathbf{l} = 0$    (1)

Since Plücker coordinates are homogeneous, together with the constraint of equation (1), a straight line in space actually has 4 degrees of freedom.
Suppose there is an I-frame image set $\{C_i\}_{i=1}^{I}$, where I denotes the number of images in the image set and is an integer greater than or equal to 2. The observation axis of the target shaft in each frame of image is $\mathbf{z}_i$, $i \in \{1, 2, \ldots, I\}$, a straight line on the two-dimensional image plane, represented by the pixel coordinates $\mathbf{p}_{i,1}$, $\mathbf{p}_{i,2}$ of any two points on the observation axis. Assuming the vision sensor follows the pinhole imaging model, the two points $\mathbf{p}_{i,1}$, $\mathbf{p}_{i,2}$ on the straight line can be transferred to the normalized image plane based on the intrinsic matrix K of the vision sensor according to formula (2), obtaining $\tilde{\mathbf{p}}_{i,1}$, $\tilde{\mathbf{p}}_{i,2}$:

$s\,\tilde{\mathbf{p}}_{i,k} = K^{-1}\,\bar{\mathbf{p}}_{i,k}, \quad k = 1, 2$    (2)

where $\bar{\mathbf{p}}_{i,k}$ denotes the homogeneous pixel coordinates of $\mathbf{p}_{i,k}$ and s is a scale factor.
Further, suppose the poses of the vision sensor during acquisition of the images in the image set are known and denoted $\{T_i\} \in SE(3)$, where

$T_i = \begin{bmatrix} R_i & t_i \\ \mathbf{0}^T & 1 \end{bmatrix}$    (3)

In formula (3), $R_i \in SO(3)$ represents the three-dimensional rotation from the camera coordinate system to the world coordinate system, and $t_i \in \mathbb{R}^3$ represents the coordinates of the camera optical center in the world coordinate system. Then $\tilde{\mathbf{p}}_{i,1}$, $\tilde{\mathbf{p}}_{i,2}$ and $t_i$ uniquely determine a spatial plane $\pi_i = (a_i, b_i, c_i, d_i)^T$ in world space. Let $\mathbf{x}^w_{i,1}$, $\mathbf{x}^w_{i,2}$ denote the coordinates corresponding to $\tilde{\mathbf{p}}_{i,1}$, $\tilde{\mathbf{p}}_{i,2}$ in the world coordinate system; $(a_i, b_i, c_i)$ are the components of the plane's normal vector, and $d_i$ represents the distance of the spatial plane from the origin. For $\pi_i = (a_i, b_i, c_i, d_i)^T$ there is:

$(a_i, b_i, c_i)^T = (\mathbf{x}^w_{i,1} - t_i) \times (\mathbf{x}^w_{i,2} - t_i), \qquad d_i = -(a_i, b_i, c_i)\,t_i$    (4)
based on the analysis, for any frame of image in the image set, a spatial plane can be constructed according to the observation axis of the target shaft in the image and the internal and external parameters of the vision sensor in the process of acquiring the image. The following takes the first image of two frames of images selected at a time as an example, and the process of constructing the spatial plane is exemplarily described. Wherein, the first image is any one of two frames of images selected at each time.
Optionally, a first transformation matrix between the pixel coordinate system and the world coordinate system in the process of acquiring the first image may be calculated according to internal and external parameters of the vision sensor in the process of acquiring the first image; converting a first observation axis of the target rod-shaped object in the first image into a first space straight line under a world coordinate system by using a first conversion matrix; and constructing a first space plane by using the first space straight line and external parameters of the vision sensor in the process of acquiring the first image.
For example, the intrinsic parameters of the vision sensor, its extrinsic parameters during acquisition of the first image, and the pixel coordinates of the first observation axis of the target shaft in the first image may be substituted into formula (5) to obtain the first spatial straight line corresponding to the first observation axis in the world coordinate system:

$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} R & t \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}$    (5)

where $f_x = f/d_x$ and $f_y = f/d_y$; f denotes the focal length of the vision sensor, and $d_x \times d_y$ represents the physical size of each pixel in the image C acquired by the vision sensor ($d_x$ the physical size of each pixel along the x-axis, $d_y$ along the y-axis). (u, v) represents the pixel coordinates of a point P in the image, and $(u_0, v_0)$ the pixel coordinates of the image center; $(x_w, y_w, z_w)$ represents the coordinates in the world coordinate system corresponding to P; $z_c$ represents the z-axis coordinate of P in the camera coordinate system; R and t represent the extrinsic parameters of the vision sensor at the time of image acquisition. Among these, $f_x$, $f_y$, $(u_0, v_0)$ are intrinsic parameters of the vision sensor, while R and t are its extrinsic parameters during image acquisition. The matrix

$\begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} R & t \end{bmatrix}$

is the first transformation matrix.
Further, the coordinates of the first spatial straight line and the extrinsic parameters of the vision sensor during acquisition of the first image may be substituted into formula (4) to calculate the components $(a_i, b_i, c_i)$ of the normal vector of the first spatial plane and the distance $d_i$ between the first spatial plane and the origin, thereby obtaining the first spatial plane.
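A sketch of the plane construction combining formulas (2) and (4) (NumPy; `K` is the 3x3 intrinsic matrix, `R` and `t` the camera-to-world rotation and optical-center position; the unit depth used to place points on the viewing rays is arbitrary, since any point on each ray yields the same plane):

```python
import numpy as np

def build_space_plane(p1, p2, K, R, t):
    """Plane through the camera center and the back-projected observation axis."""
    K_inv = np.linalg.inv(K)
    # Formula (2): lift both endpoints to the normalized image plane (homogeneous).
    rays = [K_inv @ np.array([p[0], p[1], 1.0]) for p in (p1, p2)]
    # World-space points on the two viewing rays (depth 1 along each ray).
    x1, x2 = (R @ r + t for r in rays)
    # Formula (4): normal from the cross product, d from the camera center.
    n = np.cross(x1 - t, x2 - t)
    n /= np.linalg.norm(n)
    return np.append(n, -n @ t)      # pi = (a, b, c, d)
```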
In the ideal case, the plurality of spatial planes $\{\pi_i\}$ intersect at one line in world space, and this line is the spatial prediction axis $\tilde{\mathcal{L}}$ of the target shaft to be reconstructed. The following illustrates the computation of the spatial prediction axis and, on that basis, of the spatial axis $\mathcal{L}$ of the target shaft.
Alternatively, for any two frames of images in the image set, denoted image j and image k, the spatial planes $\pi_j$ and $\pi_k$ corresponding to image j and image k respectively can be calculated according to formulas (2) and (4). Since a spatial straight line is determined by the intersection of two spatial planes, the intersection line of the spatial plane $\pi_j$ and the spatial plane $\pi_k$ can be calculated according to formula (6):

$\tilde{\mathbf{l}} = \mathbf{n}_k \times \mathbf{n}_j, \qquad \tilde{\mathbf{m}} = d_k\,\mathbf{n}_j - d_j\,\mathbf{n}_k$    (6)

where $\mathbf{n}_j = (a_j, b_j, c_j)^T$ and $\mathbf{n}_k = (a_k, b_k, c_k)^T$ are the plane normals. The resulting line $\tilde{\mathcal{L}} = (\tilde{\mathbf{m}}^T, \tilde{\mathbf{l}}^T)^T$ is the spatial prediction axis.
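A sketch of formula (6); the sign convention matches the one fixed above:

```python
import numpy as np

def plane_intersection(pi_j, pi_k):
    """Intersect two planes (a, b, c, d); return the Pluecker line (m, l)."""
    n_j, d_j = pi_j[:3], pi_j[3]
    n_k, d_k = pi_k[:3], pi_k[3]
    l = np.cross(n_k, n_j)           # direction of the intersection line
    m = d_k * n_j - d_j * n_k        # moment vector, formula (6)
    return m, l
```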
Further, considering that in actual use the image observations contain errors, the spatial planes $\{\pi_i\}$ will generally not intersect in a single straight line. Based on this, in the embodiments of the present application, a Random Sample Consensus (RANSAC) algorithm may be used to reconstruct the spatial prediction axis of the target shaft. The specific implementation is as follows:
Acquire a second image and a third image from the images of the image set that have not participated in constructing the spatial axis; construct a second spatial plane according to the second observation axis of the target shaft in the second image and the internal and external parameters of the vision sensor during acquisition of the second image; and construct a third spatial plane according to the third observation axis of the target shaft in the third image and the internal and external parameters of the vision sensor during acquisition of the third image. The second image and the third image are any two frames of images that have not yet participated in constructing the spatial axis. For the specific way of constructing the second and third spatial planes, reference may be made to the related description of constructing the first spatial plane above, which is not repeated here.
Further, calculating an intersection line of the second space plane and the third space plane as an initial space prediction axis of the target rod-shaped object; judging whether the initial spatial prediction axis meets the set requirement or not; and the initial spatial prediction axis satisfying the set requirement is taken as the spatial prediction axis of the target shaft.
In this embodiment, the set requirement may be: the current number of iterations reaches a set count threshold, and/or the proportion of inlier axes among the observation axes of the I frames of images is greater than or equal to a set proportion threshold. An observation axis in image a is an inlier axis if its distance from the projection straight line is less than or equal to a set distance threshold, where the projection straight line refers to the projection into image a of the spatial prediction axis calculated using the other frames of images, i.e. the images in the image set other than image a.
Accordingly, the projection straight lines of the initial spatial prediction axis in the images other than the second image and the third image may be calculated. For example, the initial spatial prediction axis $\tilde{\mathcal{L}} = (\tilde{\mathbf{m}}^T, \tilde{\mathbf{l}}^T)^T$ may be substituted into formula (7) to calculate its projection straight line in the other images:

$\mathbf{l}^{proj}_i = |K|\,K^{-T}\,R_i^T\,(\tilde{\mathbf{m}} - [t_i]_\times\,\tilde{\mathbf{l}})$    (7)

In formula (7), $\mathbf{l}^{proj}_i$ represents the projection straight line of the initial spatial prediction axis $\tilde{\mathcal{L}}$ in image i; the symbol $|\cdot|$ represents the determinant of a matrix, and $[\cdot]_\times$ represents the antisymmetric matrix of a vector, which can be calculated from formula (8):

$[\mathbf{a}]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}$    (8)
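A sketch of formulas (7) and (8); the image line is homogeneous, so the determinant factor only fixes the scale:

```python
import numpy as np

def skew(a):
    """Antisymmetric matrix of formula (8)."""
    return np.array([[0, -a[2], a[1]],
                     [a[2], 0, -a[0]],
                     [-a[1], a[0], 0]])

def project_line(m, l, K, R, t):
    """Formula (7): project a world Pluecker line (m, l) into an image."""
    m_cam = R.T @ (m - skew(t) @ l)                  # line moment in the camera frame
    K_line = np.linalg.det(K) * np.linalg.inv(K).T   # line projection intrinsics
    return K_line @ m_cam                            # homogeneous image line (A, B, C)
```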
further, the distance between the projected straight line in the other image and the observation axis in the other image may be calculated. The other images here refer to images other than the above-described second image and third image in the image set. Alternatively, the initial spatial prediction straight line may be calculated from equation (9)
Figure BDA0003148745140000089
The distance between the projected straight line in image i and the observed straight line in image i is:
Figure BDA00031487451400000810
wherein, in the formula (9),
Figure BDA00031487451400000811
k =1,2. The image i is any other image except the second image and the third image.
Further, a target observation axis, in which a distance from a projection straight line in the corresponding image is less than or equal to a set distance threshold, may be selected from observation axes of other images except the second image and the third image, where the target observation axis is the above-mentioned interior point axis. Further, the target observation axis can be calculated in the figureThe proportion P occupied in the observation axis of the image set (I frame image), namely the proportion P of the inner points in the observation axis of the image set (I frame image) is calculated. For example, assuming that the number of target observation axes is M, which is a positive integer, the ratio of the target observation axes in the observation axes of the image set (I-frame image) is:
Figure BDA0003148745140000091
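A sketch of formula (9) and of the inlier proportion P, with the observation axis given by its two endpoints:

```python
import numpy as np

def line_point_distance(img_line, p1, p2):
    """Formula (9): sum of distances of the two endpoints to the projected line."""
    A, B, C = img_line
    norm = np.hypot(A, B)
    return sum(abs(A * p[0] + B * p[1] + C) / norm for p in (p1, p2))

def inlier_ratio(n_inliers, n_images):
    """Proportion P of target observation axes among the I observation axes."""
    return n_inliers / n_images
```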
further, at least one of the following determination operations may be performed: judging whether the proportion P is larger than or equal to a set proportion threshold value; and/or judging whether the current cycle number reaches a set number threshold. If at least one judgment operation has the condition that the judgment result is yes, namely the proportion P is larger than or equal to the set proportion threshold value; and/or determining that the initial spatial prediction axis meets the set requirement if the current cycle number reaches the set number threshold, and taking the initial spatial prediction axis as the spatial prediction axis of the target rod.
Correspondingly, if the judgment result of the at least one judgment operation is negative, the proportion P is smaller than the set proportion threshold value; if the current cycle times are smaller than the set time threshold, returning to execute the operation of acquiring a second image and a third image from the image which does not participate in the construction of the spatial axis until the initial spatial prediction axis meets the set requirement; and the initial spatial prediction axis which meets the set requirement is taken as the spatial prediction axis of the target rod.
To facilitate an understanding of the above-described process of reconstructing a spatially predicted axis of a target shaft based on RANSAC, an exemplary embodiment is described below. As shown in fig. 1b, the process of reconstructing the spatial prediction axis mainly includes:
s1: the number of cycles n is initialized to 0, i.e. n =0.
S2: Judge whether n has reached the set iteration-count threshold N; if yes, execute step S12; if no, execute step S3.
S3: two frames of images are taken from the images which do not participate in the spatial prediction axis reconstruction at present, and the two frames of images are used as a second image and a third image.
S4: and constructing a second space plane according to a second observation axis of the target rod-shaped object in the second image and the internal and external parameters of the vision sensor in the process of acquiring the second image. According to equations (2), (4) and (5), a second spatial plane is constructed.
S5: and constructing a third space plane according to a third observation axis of the target rod-shaped object in the third image and internal and external parameters of the vision sensor in the process of acquiring the third image. That is, according to the equations (2), (4) and (5), the third spatial plane is constructed.
S6: and calculating the intersection line of the second space plane and the third space plane as the initial space prediction axis of the target shaft-shaped object. From equation (6), the initial spatially predicted axis of the target shaft is calculated.
S7: projection straight lines of the initial spatial prediction axis in the other images except the second image and the third image are calculated. That is, according to equation (7), the projection straight line of the initial spatial prediction axis in the other images except the second image and the third image is calculated.
S8: the distance between the projected straight line in the other image and the observation axis in the other image is calculated. That is, the distance between the projection straight line in the other image and the observation axis in the other image is calculated according to equation (9).
S9: and selecting a target observation axis of which the distance from the projection straight line in the corresponding image is less than or equal to a set distance threshold value from the observation axes in other images.
S10: and calculating the proportion P of the target observation axis in the observation axes of the image set.
S11: and judging whether the proportion P is larger than or equal to a set proportion threshold value or not. If yes, executing step S12; if the determination result is no, step S13 is executed.
S12: the current initial predicted spatial axis is taken as the spatial predicted axis of the target shaft.
S13: Increase the current iteration count n by 1, i.e. n = n + 1, and return to step S2.
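Steps S1 to S13 can be condensed into the following loop, reusing the `build_space_plane`, `plane_intersection`, `project_line`, and `line_point_distance` sketches above (the thresholds are placeholders, and the pair enumeration stands in for "images not yet used"):

```python
import itertools
import numpy as np

def ransac_predict_axis(axes_2d, K, poses, max_iter=50, dist_thr=3.0, ratio_thr=0.7):
    """S1-S13: coarse reconstruction of the spatial prediction axis via RANSAC."""
    pairs = itertools.combinations(range(len(axes_2d)), 2)   # S3: candidate image pairs
    best = None
    for n, (j, k) in enumerate(pairs):
        if n >= max_iter:                                    # S2: iteration cap N
            break
        planes = [build_space_plane(*axes_2d[i], K, *poses[i]) for i in (j, k)]  # S4-S5
        m, l = plane_intersection(*planes)                   # S6: initial prediction axis
        inliers = [i for i in range(len(axes_2d)) if i not in (j, k)
                   and line_point_distance(project_line(m, l, K, *poses[i]),
                                           *axes_2d[i]) <= dist_thr]             # S7-S9
        P = len(inliers) / len(axes_2d)                      # S10: inlier ratio
        if best is None or P > best[0]:
            best = (P, (m, l), inliers)
        if P >= ratio_thr:                                   # S11-S12: accept
            break
    if best is None:
        raise ValueError("need at least two images")
    return best[1], best[2]
```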
The pseudocode for constructing the spatial prediction axis of the target shaft is shown in Table 1.

Table 1: RANSAC-based coarse reconstruction of the spatial prediction axis

(The body of Table 1 is reproduced as an image in the original publication.) The final outputs of Table 1 are the spatial prediction straight line of the target shaft, $\tilde{\mathcal{L}}$, and the inlier set $\mathcal{I}$. The target observation axes are those at the time the initial spatial prediction straight line satisfies the set requirement; the initial spatial prediction axis at that time is $\tilde{\mathcal{L}}$, and the inlier set $\mathcal{I}$ refers to the images containing the target observation axes corresponding to $\tilde{\mathcal{L}}$.
Considering that in actual use the image observations contain errors, the spatial planes $\{\pi_i\}$ will not intersect in a single straight line. In practice, the observation axes $\mathbf{z}_i$ of the shaft in the images often carry considerable observation noise. Based on this, a precision-progressive reconstruction method is provided: the reconstructed spatial prediction straight line is optimized to obtain the spatial axis of the target shaft.
Optionally, the target observation axes at the time the initial spatial prediction axis satisfies the set requirement may be acquired as the observation axes for optimization, and the images in which the observation axes for optimization are located, i.e. the inlier set $\mathcal{I}$, acquired as the images for optimization. Further, the extrinsic parameters of the vision sensor during acquisition of the images for optimization may be acquired, and the spatial prediction axis $\tilde{\mathcal{L}}$ of the target shaft optimized using the observation axes for optimization together with these extrinsic parameters, to obtain the spatial axis of the target shaft.
Optionally, the spatial prediction axis $\tilde{\mathcal{L}}$ of the target shaft may be expressed in Plücker coordinates; correspondingly, the spatial axis of the target shaft to be solved may also be expressed in Plücker coordinates. Since a spatial straight line has only 4 degrees of freedom, in the embodiments of the present application the 6-dimensional Plücker coordinates can be parameterized into an orthogonal coordinate system, with the 4-dimensional orthogonal coordinates representing the spatial straight line. The orthogonal coordinate system can be expressed as:

$\mathcal{O} = (\boldsymbol{\psi}, \phi) \in \mathbb{R}^4, \qquad U = \exp([\boldsymbol{\psi}]_\times) \in SO(3), \qquad W = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \in SO(2)$    (10)

where $\boldsymbol{\psi} \in \mathfrak{so}(3)$. The process of parameterizing the Plücker coordinates of a spatial straight line into the orthogonal coordinate system is shown in formula (11):

$U = \begin{bmatrix} \dfrac{\mathbf{m}}{\|\mathbf{m}\|} & \dfrac{\mathbf{l}}{\|\mathbf{l}\|} & \dfrac{\mathbf{m}\times\mathbf{l}}{\|\mathbf{m}\times\mathbf{l}\|} \end{bmatrix}, \qquad \boldsymbol{\psi} = \log(U), \qquad \phi = \arctan\!\left(\|\mathbf{l}\| / \|\mathbf{m}\|\right)$    (11)

Accordingly, the process of converting the coordinates of a spatial straight line in the orthogonal coordinate system back into Plücker coordinates is shown in formula (12):

$\mathcal{L} = \left(\cos\phi\,\mathbf{u}_1^T,\; \sin\phi\,\mathbf{u}_2^T\right)^T$    (12)

where log(·) and exp(·) represent the logarithm and exponential maps of the rotation group, and $\mathbf{u}_1$, $\mathbf{u}_2$ are the first and second column vectors of U.
For this embodiment, the coordinates of the spatial axis of the target shaft in the orthogonal coordinate system (10) are the quantity to be solved. Further, the projection straight line of the quantity to be solved in each image for optimization can be calculated using the extrinsic parameters (pose) of the vision sensor during acquisition of that image. Optionally, the quantity to be solved may be converted to Plücker form $(\mathbf{m}^T, \mathbf{l}^T)^T$ via formula (12) and substituted, together with the extrinsic parameters (pose data) of the vision sensor during acquisition of the image for optimization, into formula (7) to calculate its projection straight line in the image for optimization.
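A sketch of formulas (11) and (12), using SciPy's rotation utilities for the log and exp maps of SO(3); note that the constraint (1) makes the columns of U orthonormal:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pluecker_to_orthogonal(m, l):
    """Formula (11): parameterize a Pluecker line as (psi, phi) in R^4."""
    U = np.column_stack([m / np.linalg.norm(m),
                         l / np.linalg.norm(l),
                         np.cross(m, l) / np.linalg.norm(np.cross(m, l))])
    psi = Rotation.from_matrix(U).as_rotvec()          # log map of SO(3)
    phi = np.arctan2(np.linalg.norm(l), np.linalg.norm(m))
    return np.append(psi, phi)

def orthogonal_to_pluecker(coords):
    """Formula (12): recover Pluecker coordinates (m, l) from (psi, phi)."""
    U = Rotation.from_rotvec(coords[:3]).as_matrix()   # exp map of SO(3)
    phi = coords[3]
    return np.cos(phi) * U[:, 0], np.sin(phi) * U[:, 1]
```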
Further, a residual function reflecting the sum of the distances between the projection straight lines in the images for optimization and the observation axes for optimization in the corresponding images may be constructed, and solved as a maximum a posteriori estimation.
The residual function can be expressed as:

$\mathcal{O}^* = \arg\min_{\mathcal{O}} \sum_{i \in \mathcal{I}} r_i^2, \qquad r_i = d\!\left(\mathbf{z}_i,\; \mathbf{l}^{proj}_i(\mathcal{O})\right)$    (13)

In formula (13), the residual term $r_i$ is the distance, computed as in formula (9), between the observation axis $\mathbf{z}_i$ and the projection straight line in image i of the straight line parameterized by $\mathcal{O}$; $\mathcal{I}$ represents the set of images for optimization, and i denotes image i for optimization.

Furthermore, the coordinates of the spatial prediction axis in the orthogonal coordinate system can be used as the initial solution of the residual function (13), and unconstrained nonlinear optimization performed on the residual function (13) to solve the coordinates of the spatial axis of the target shaft in the orthogonal coordinate system. Optionally, formula (13) may be optimized in an unconstrained nonlinear manner to obtain its optimal solution $\mathcal{O}^*$; that is, the solution that minimizes the residual function (formula (13)) may be taken as the optimal solution $\mathcal{O}^*$, which is the coordinates of the spatial axis of the target shaft in the orthogonal coordinate system.
In this embodiment, the specific way of performing the unconstrained nonlinear optimization of formula (13) is not limited. Optionally, formula (13) may be optimized using the Levenberg-Marquardt algorithm, the Gauss-Newton algorithm, a least squares method, or the like, but is not limited thereto.
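A sketch of this refinement step, assuming the sketches above. SciPy's `least_squares` with `method="lm"` provides a Levenberg-Marquardt implementation (it requires at least as many residuals as parameters, i.e. at least four images for optimization):

```python
from scipy.optimize import least_squares

def refine_axis(pred_axis, axes_2d, poses, K):
    """Minimize formula (13) starting from the RANSAC prediction axis."""
    def residuals(coords):                   # coords = (psi, phi), 4 DOF
        m, l = orthogonal_to_pluecker(coords)
        return [line_point_distance(project_line(m, l, K, R, t), p1, p2)
                for (p1, p2), (R, t) in zip(axes_2d, poses)]
    x0 = pluecker_to_orthogonal(*pred_axis)              # initial solution
    sol = least_squares(residuals, x0, method="lm")      # unconstrained nonlinear LS
    return orthogonal_to_pluecker(sol.x)                 # spatial axis (m, l)
```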
Further, the Plücker coordinates of the spatial axis of the target shaft can be calculated from the solved coordinates of the spatial axis in the orthogonal coordinate system. Optionally, the optimal solution $\mathcal{O}^*$ may be substituted into formulas (10) and (12) to obtain the Plücker coordinates $\mathcal{L}^*$ of the spatial axis of the target shaft; these Plücker coordinates represent the spatial axis. In this way, the spatial prediction axis coarsely reconstructed with RANSAC is optimized, realizing precision-progressive three-dimensional reconstruction of the spatial axis; the reconstruction result has high accuracy, and the progressive reconstruction algorithm ensures robustness to noise and abnormal observations. Compared with dense reconstruction, the axis reconstruction approach provided by this embodiment can effectively reduce the amount of computation and the storage consumed by intermediate results.
It should be noted that the execution subjects of the steps of the methods provided in the above embodiments may be the same device, or different devices may be used as the execution subjects of the methods. For example, the execution subject of steps 101 and 102 may be device a; for another example, the execution subject of step 101 may be device a, and the execution subject of step 102 may be device B; and so on.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations are included in a specific order, but it should be clearly understood that the operations may be executed out of the order presented herein or in parallel, and the sequence numbers of the operations, such as 101, 102, etc., are merely used for distinguishing different operations, and the sequence numbers do not represent any execution order per se. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel.
Accordingly, embodiments of the present application further provide a computer-readable storage medium storing computer instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the above-mentioned axis reconstruction method.
An embodiment of the present application further provides a computer program product, including a computer program which, when executed by a processor, implements the steps of the axis reconstruction method described above. Optionally, the computer program product may be implemented as travel-related application software, such as navigation map application software; it may also be implemented as other application software integrating a navigation function, such as ride-hailing software or food-delivery software with built-in navigation.
It should be noted that the axis reconstruction method provided by the above embodiments is applicable to any device having a data processing function. In some embodiments, the method is performed by a computer device, such as a terminal device or a server device.
Correspondingly, the embodiment of the application further provides computer equipment. Fig. 2 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 2, the computer apparatus includes: a memory 20a and a processor 20b. The memory 20a is used for storing computer programs.
The processor 20b is coupled to the memory 20a for executing a computer program for: acquiring an image set acquired by a visual sensor; the image set includes more than two frames of images; acquiring internal and external parameters of the vision sensor during the collection of the images and an observation axis of a target rod contained in each frame of image; the images in the set of images include at least one identical target shaft. Selecting two frames of images from the image set each time; constructing a plurality of spatial planes according to the observation axes and the internal and external parameters corresponding to the two selected frame images each time; calculating a spatial prediction axis of the target rod according to the plurality of spatial planes; and optimizing the spatial prediction axis to obtain a spatial axis of the target shaft.
In some embodiments, the processor 20b, in acquiring the viewing axis of the target shaft contained in each frame of image, is specifically configured to: performing semantic segmentation on each frame of image respectively to determine the pixel coordinates of the target rod-shaped object in each frame of image; and performing linear regression on the pixel coordinates of the target rod in each frame of image to obtain an observation axis of the target rod in the frame of image.
In some embodiments, the processor 20b, when constructing the plurality of spatial planes, is specifically configured to: and aiming at a first image in the two frames of images, constructing a first space plane according to a first observation axis of the target rod-shaped object in the first image and internal and external parameters of the vision sensor in the process of acquiring the first image.
Further, the processor 20b, when constructing the first spatial plane, is specifically configured to: calculating a first conversion matrix between a pixel coordinate system and a world coordinate system in the process of acquiring a first image according to internal and external parameters of a visual sensor in the process of acquiring the first image; converting the first observation axis into a first spatial straight line under a world coordinate system by using a first conversion matrix; and constructing a first space plane by utilizing the first space straight line and the external parameters of the vision sensor in the process of acquiring the first image.
In other embodiments, the processor 20b, when constructing the plurality of spatial planes, is specifically configured to: acquiring a second image and a third image from images which do not participate in spatial axis construction; constructing a second space plane according to a second observation axis of the target rod-shaped object in the second image and internal and external parameters of the vision sensor in the process of acquiring the second image; and constructing a third space plane according to a third observation axis of the target rod-shaped object in the third image and internal and external parameters of the vision sensor in the process of acquiring the third image.
Accordingly, the processor 20b, when calculating the spatially predicted axis of the target shaft, is specifically configured to: calculating an intersection line of the second space plane and the third space plane as an initial space prediction axis of the target rod-shaped object; judging whether the initial space prediction axis meets the set requirement or not; and the initial spatial prediction axis which meets the set requirement is taken as the spatial prediction axis of the target rod.
Optionally, the processor 20b is further configured to: respectively calculating first projection straight lines of the initial spatial prediction axis in other images by using external parameters of the vision sensor in the process of acquiring other images; the other images are images in the image set except the second image and the third image; calculating the distance between the first projection straight line and the observation axis in the other image; selecting a target observation axis of which the distance from the first projection straight line in the corresponding image is less than or equal to a set distance threshold from observation axes in other images; and calculating the proportion of the target observation axis in the observation axes of the image set.
Accordingly, the processor 20b is specifically configured to perform at least one of the following determination operations in determining whether the initial spatial prediction axis satisfies the set requirement:
judging whether the proportion is greater than or equal to a set proportion threshold value;
judging whether the current cycle number reaches a set number threshold;
and if the judgment result of at least one judgment operation is yes, determining that the initial spatial prediction axis meets the set requirement.
Accordingly, the processor 20b is further configured to: if the judgment result of the at least one judgment operation is negative, returning to execute the operation of acquiring a second image and a third image from the images of the image set which do not participate in the spatial axis construction, until the initial spatial prediction axis meets the set requirement; and taking the initial spatial prediction axis which meets the set requirement as the spatial prediction axis of the target rod.
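Assembled, the sampling loop reads as a RANSAC-style search (a sketch using the hypothetical helpers above; the ratio is computed over all observations for brevity, whereas the embodiment draws the pair from images that have not yet participated, and the threshold values are illustrative stand-ins for the set proportion threshold and the set number threshold):

    import random

    def estimate_initial_axis(planes, projections, segments,
                              dist_thresh=3.0, ratio_thresh=0.6, max_iters=50):
        # planes[i]: back-projected plane of the observation axis in image i.
        best = None
        for _ in range(max_iters):                       # set number threshold
            i, j = random.sample(range(len(planes)), 2)  # a 'second' and a 'third' image
            line = intersect_planes(planes[i], planes[j])
            if line is None:
                continue
            x0, d = line
            ratio, inliers = inlier_ratio(x0, d, projections, segments, dist_thresh)
            if best is None or ratio > best[0]:
                best = (ratio, x0, d, inliers)
            if ratio >= ratio_thresh:                    # set proportion threshold
                break
        return best    # (ratio, point, direction, target observation axes)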
Optionally, the processor 20b, when optimizing the spatial prediction axis to obtain the spatial axis of the target shaft, is specifically configured to: acquiring a target observation axis when the initial spatial prediction axis meets a set requirement, and taking the target observation axis as an observation axis for optimization; acquiring external parameters of the visual sensor in the process of acquiring the images for optimization; the image for optimization is an image where an observation axis for optimization is located; and optimizing the spatial prediction axis by utilizing the observation axis for optimization and the external parameters of the visual sensor in the process of acquiring the image for optimization to obtain the spatial axis of the target rod-shaped object.
Optionally, the spatial prediction axis is represented by Plücker coordinates; accordingly, the processor 20b, when optimizing the spatial prediction axis, is specifically configured to: mapping the Plücker coordinates of the spatial prediction axis to an orthogonal coordinate system to obtain the coordinates of the spatial prediction axis in the orthogonal coordinate system; taking the coordinates of the spatial axis of the target rod-shaped object under the orthogonal coordinate system as the quantity to be solved, and calculating a second projection straight line of the quantity to be solved in the image for optimization by utilizing the external parameters of the vision sensor in the process of acquiring the image for optimization; constructing a residual function reflecting the sum of the distances between the second projection straight line in the image for optimization and the observation axis for optimization in the corresponding image; taking the coordinates of the spatial prediction axis under the orthogonal coordinate system as an initial solution of the residual function, and performing unconstrained nonlinear optimization on the residual function to solve the coordinates of the spatial axis of the target shaft under the orthogonal coordinate system; and calculating the Plücker coordinates of the spatial axis of the target shaft according to the solved coordinates of the spatial axis of the target shaft in the orthogonal coordinate system.
Optionally, the solution of the quantity to be solved when minimizing the residual function is taken as the coordinates of the spatial axis of the target shaft in an orthogonal coordinate system.
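The refinement can be sketched with the orthonormal line representation of Bartoli and Sturm, which is one natural reading of the "orthogonal coordinate system" here (an assumption, not the patent's normative formulation): a Plücker line (n, d) with n ⊥ d is encoded by U ∈ SO(3) and an angle θ, a 4-parameter update (3 for U, 1 for θ) is optimized without constraints, and the residuals are endpoint-to-projected-line distances. A minimal sketch with scipy, with n_init obtainable from an initial axis (x0, d) as n = x0 × d:

    import numpy as np
    from scipy.optimize import least_squares

    def rodrigues(w):
        # Axis-angle 3-vector -> rotation matrix.
        th = np.linalg.norm(w)
        if th < 1e-12:
            return np.eye(3)
        k = w / th
        K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

    def to_orthonormal(n, d):
        # Plücker (moment n = x0 x d, direction d) -> (U in SO(3), theta).
        # Assumes the line does not pass through the origin (|n| > 0).
        u1, u2 = n / np.linalg.norm(n), d / np.linalg.norm(d)
        U = np.stack([u1, u2, np.cross(u1, u2)], axis=1)
        return U, np.arctan2(np.linalg.norm(d), np.linalg.norm(n))

    def to_plucker(U, theta):
        return np.cos(theta) * U[:, 0], np.sin(theta) * U[:, 1]   # (n, d)

    def residuals(delta, U0, th0, projections, segments):
        # delta: 4-parameter unconstrained update (3 for U, 1 for theta).
        U = U0 @ rodrigues(delta[:3])
        n, d = to_plucker(U, th0 + delta[3])
        x0 = np.cross(d, n) / max(d @ d, 1e-12)   # point of the line closest to the origin
        a, b = np.append(x0, 1.0), np.append(x0 + d, 1.0)
        res = []
        for P, (qa, qb) in zip(projections, segments):
            l = np.cross(P @ a, P @ b)            # second projection straight line
            s = np.hypot(l[0], l[1]) + 1e-12
            res += [(l @ np.append(qa, 1.0)) / s, (l @ np.append(qb, 1.0)) / s]
        return np.array(res)

    def refine_axis(n_init, d_init, projections, segments):
        # Unconstrained nonlinear least squares from the initial solution.
        U0, th0 = to_orthonormal(n_init, d_init)
        sol = least_squares(residuals, np.zeros(4), args=(U0, th0, projections, segments))
        return to_plucker(U0 @ rodrigues(sol.x[:3]), th0 + sol.x[3])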
In some optional embodiments, as shown in fig. 2, the computer device may further include: a communication component 20c, a power supply component 20d, and the like. If the computer device is a terminal device such as a computer, it may further include: a display component 20e, an audio component 20f, and the like. Fig. 2 schematically depicts only some of the components, which does not mean that the computer device must include all of the components shown in fig. 2, nor that it can include only those components.
The computer device provided in this embodiment jointly reconstructs a spatial prediction axis of the target shaft by using an image set containing at least one identical target shaft, and optimizes the spatial prediction axis to obtain the spatial axis of the target shaft. The intermediate result of this axis reconstruction mode is the spatial prediction axis of the target shaft; compared with dense or semi-dense reconstruction of the shaft axis, which must first construct a point cloud of the environment space as an intermediate result, the data size is smaller, reducing the storage space consumed by intermediate results. On the other hand, the axis reconstruction mode provided by the embodiment of the application does not need to construct an environment point cloud in advance, which helps reduce the computation required for axis reconstruction.
In addition, the computer device provided in this embodiment may jointly reconstruct the spatial prediction axis of the target shaft by using an image set containing at least one identical target shaft, and optimize the spatial prediction axis to obtain the spatial axis of the target shaft, thereby implementing precision-progressive shaft axis reconstruction. This axis reconstruction mode is robust to noise and abnormal observation axes, which helps improve the accuracy of axis reconstruction.
Fig. 3 is a block diagram of a hardware structure of an acquisition device according to this embodiment. As shown in fig. 3, the acquisition apparatus 30 includes: a machine body 30a. The machine body 30a is provided with a vision sensor 30b, a processor 30c, and a memory 30d.
It should be noted that the processor 30c, the memory 30d and the vision sensor 30b may be disposed inside the machine body 30a, or may be disposed on the surface of the machine body 30a.
The machine body 30a is the actuator of the acquisition device 30 and can perform, in a given environment, operations specified by the one or more processors 30c. The machine body 30a also determines, to some extent, the appearance of the acquisition device 30, which is not limited in this embodiment. Optionally, the acquisition device may be an autonomous mobile device, such as a robot, an unmanned vehicle, or a drone; it may also be a smartphone, a wearable device, or a non-autonomous mobile device such as a vehicle driven by a person, but is not limited thereto. The machine body 30a mainly refers to the body of the acquisition device 30.
It is worth mentioning that the mechanical body 30a is also provided with some basic components of the acquisition device 30, such as a drive component, an odometer, a power supply component, an audio component, etc. Alternatively, the drive assembly may include drive wheels, drive motors, universal wheels, and the like. The basic components and the configurations of the basic components contained in different acquisition devices are different, and the embodiments of the present application are only some examples.
The memory 30d is mainly used for storing computer programs, and these computer programs can be executed by the processor 30c, so that the processor 30c controls the collecting device 30 to realize corresponding functions and complete corresponding actions or tasks. In addition to storing computer instructions, the memory 30d may be configured to store various other data to support operations on the acquisition device 30. Examples of such data include instructions for any application or method operating on the acquisition device 30.
In the present embodiment, the vision sensor 30b can be regarded as the "eye" of the acquisition device 30, and is mainly used for capturing images of the environment around the current position of the acquisition device 30 as it moves. The vision sensor 30b may be implemented by any device having an image capturing function, for example, a camera, a laser sensor, or an infrared sensor. Further, the camera may be a binocular camera, a monocular camera, a depth camera, or the like, but is not limited thereto.
The processor 30c, which may be considered a control system of the acquisition device 30, may be coupled to the memory 30d for executing one or more computer programs stored in the memory 30d to control the acquisition device 30 to perform corresponding functions, perform corresponding actions or tasks. It should be noted that, when the acquisition device 30 is located in different scenes, the functions, actions or tasks that it needs to implement may be different; accordingly, the computer instructions stored in memory 30d may vary, and execution of different computer instructions by processor 30c may control acquisition device 30 to perform different functions, perform different actions or tasks.
In this embodiment, the vision sensor 30b may be used to capture an image, resulting in a collection of images. The image set includes two or more frames of images.
In the present embodiment, the processor 30c is mainly configured to: acquire an image set; acquire internal and external parameters of the vision sensor 30b in the process of acquiring the images in the image set, and an observation axis of a target shaft contained in each frame of image, where the images in the image set contain at least one identical target shaft. Further, two frames of images are selected from the image set each time; a plurality of spatial planes are constructed according to the observation axes and the internal and external parameters corresponding to the two selected frames of images; a spatial prediction axis of the target shaft is calculated according to the plurality of spatial planes; and the spatial prediction axis is optimized to obtain a spatial axis of the target shaft.
In some embodiments, the processor 30c, in acquiring the observation axis of the target shaft contained in each frame of image, is specifically configured to: performing semantic segmentation on each frame of image respectively to determine the pixel coordinates of the target rod-shaped object in each frame of image; and performing linear regression on the pixel coordinates of the target rod in each frame of image to obtain the observation axis of the target rod in that frame of image.
In some embodiments, the processor 30c, when constructing the plurality of spatial planes, is specifically configured to: for a first image of the two frames of images selected each time, construct a first spatial plane according to a first observation axis of the target shaft in the first image and internal and external parameters of the vision sensor 30b during acquisition of the first image.
Further, the processor 30c, when constructing the first spatial plane, is specifically configured to: calculating a first transformation matrix between the pixel coordinate system and the world coordinate system in the process of acquiring the first image according to the internal and external parameters of the vision sensor 30b in the process of acquiring the first image; converting the first observation axis into a first spatial straight line under the world coordinate system by using the first transformation matrix; and constructing a first spatial plane using the first spatial straight line and the extrinsic parameters of the vision sensor 30b during the acquisition of the first image.
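As a concrete reading of this step (a sketch under the pinhole model with world-to-camera extrinsics R, t; the embodiment's exact form of the first transformation matrix is not spelled out), each pixel maps to a viewing ray in the world coordinate system, and the first spatial straight line and first spatial plane are spanned by such rays together with the camera center:

    import numpy as np

    def pixel_to_world_ray(K, R, t, pixel):
        # Pixel (u, v) -> camera center and unit ray direction in world frame.
        center = -R.T @ t                            # camera center in world coords
        ray_cam = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
        ray_world = R.T @ ray_cam
        return center, ray_world / np.linalg.norm(ray_world)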
In other embodiments, the processor 30c is specifically configured to, when two frames of images are selected from the image set: acquire a second image and a third image from the images of the image set which do not participate in the spatial axis construction. The processor 30c, when constructing the plurality of spatial planes in which the observation axes lie, is specifically configured to: constructing a second spatial plane according to a second observation axis of the target rod in the second image and internal and external parameters of the vision sensor 30b in the process of acquiring the second image; and constructing a third spatial plane based on the third observation axis of the target shaft in the third image and the internal and external parameters of the vision sensor 30b during the acquisition of the third image.
Accordingly, the processor 30c, when calculating the spatially predicted axis of the target shaft, is specifically configured to: calculating an intersection line of the second spatial plane and the third spatial plane as an initial spatial prediction axis of the target rod-shaped object; judging whether the initial spatial prediction axis meets the set requirement; and taking the initial spatial prediction axis which meets the set requirement as the spatial prediction axis of the target rod.
Optionally, the processor 30c is further configured to: respectively calculating first projection straight lines of the initial spatial prediction axis in other images by utilizing the external parameters of the vision sensor 30b in the process of acquiring the other images; the other images are images in the image set except the second image and the third image; calculating the distance between the first projection straight line and the observation axis in the other images; selecting a target observation axis whose distance from the first projection straight line in the corresponding image is less than or equal to a set distance threshold from the observation axes in the other images; and calculating the proportion of the target observation axes in the observation axes of the image set.
Accordingly, the processor 30c is specifically configured to perform at least one of the following determination operations in determining whether the initial spatial prediction axis satisfies the set requirement:
judging whether the proportion is greater than or equal to a set proportion threshold value;
judging whether the current cycle number reaches a set number threshold;
and if the judgment result of at least one judgment operation is yes, determining that the initial spatial prediction axis meets the set requirement.
Accordingly, the processor 30c is further configured to: if the judgment result of the at least one judgment operation is negative, returning to execute the operation of acquiring a second image and a third image from the images of the image set which do not participate in the spatial axis construction, until the initial spatial prediction axis meets the set requirement; and taking the initial spatial prediction axis which meets the set requirement as the spatial prediction axis of the target rod.
Optionally, the processor 30c, when optimizing the spatial prediction axis to obtain the spatial axis of the target shaft, is specifically configured to: acquiring a target observation axis when the initial spatial prediction axis meets a set requirement, and taking the target observation axis as an observation axis for optimization; acquiring external parameters of the vision sensor 30b in the process of acquiring the images for optimization; the image for optimization is the image in which an observation axis for optimization is located; and optimizing the spatial prediction axis by using the observation axis for optimization and the extrinsic parameters of the vision sensor 30b during the acquisition of the image for optimization, to obtain the spatial axis of the target shaft.
Optionally, the spatial prediction axis is represented by Plücker coordinates; accordingly, the processor 30c, when optimizing the spatial prediction axis, is specifically configured to: mapping the Plücker coordinates of the spatial prediction axis to an orthogonal coordinate system to obtain the coordinates of the spatial prediction axis in the orthogonal coordinate system; taking the coordinates of the spatial axis of the target rod-shaped object under the orthogonal coordinate system as the quantity to be solved, and calculating a second projection straight line of the quantity to be solved in the image for optimization by utilizing the external parameters of the vision sensor 30b in the process of acquiring the image for optimization; constructing a residual function reflecting the sum of the distances between the second projection straight line in the image for optimization and the observation axis for optimization in the corresponding image; taking the coordinates of the spatial prediction axis under the orthogonal coordinate system as an initial solution of the residual function, and performing unconstrained nonlinear optimization on the residual function to solve the coordinates of the spatial axis of the target rod-shaped object under the orthogonal coordinate system; and calculating the Plücker coordinates of the spatial axis of the target shaft according to the solved coordinates of the spatial axis of the target shaft in the orthogonal coordinate system.
Optionally, the solution of the quantities to be solved when minimizing the residual function is taken as the coordinates of the spatial axis of the target shaft in an orthogonal coordinate system.
The acquisition device provided in this embodiment jointly reconstructs a spatial prediction axis of the target shaft by using an image set containing at least one identical target shaft, and optimizes the spatial prediction axis to obtain the spatial axis of the target shaft. The intermediate result of this axis reconstruction mode is the spatial prediction axis of the target shaft; compared with dense or semi-dense reconstruction of the shaft axis, which must first construct a point cloud of the environment space as an intermediate result, the data size is smaller, reducing the storage space consumed by intermediate results. On the other hand, the axis reconstruction method provided by the embodiment of the application does not need to construct an environment point cloud in advance, which helps reduce the computation required for axis reconstruction.
The acquisition device provided by this embodiment can thus jointly reconstruct the spatial prediction axis of the target rod by using an image set containing at least one identical target rod, and optimize the spatial prediction axis to obtain the spatial axis of the target rod, realizing precision-progressive rod axis reconstruction. This axis reconstruction mode is robust to noise and abnormal observation axes, which helps improve the accuracy of axis reconstruction.
In embodiments of the present application, the memory is used to store computer programs and may be configured to store other various data to support operations on the device on which it is located. Wherein the processor may execute a computer program stored in the memory to implement the corresponding control logic. The memory may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
In the embodiments of the present application, the processor may be any hardware processing device that can execute the above described method logic. Optionally, the processor may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or a Micro Controller Unit (MCU); it may also be a programmable device such as a Field-Programmable Gate Array (FPGA), a Programmable Array Logic device (PAL), a Generic Array Logic device (GAL), or a Complex Programmable Logic Device (CPLD); or an Advanced RISC Machine (ARM) processor, a System on Chip (SoC), or the like, but is not limited thereto.
In embodiments of the present application, the communication component is configured to facilitate wired or wireless communication between the device in which it is located and other devices. The device where the communication component is located can access a wireless network based on a communication standard, such as WiFi, 2G, 3G, 4G, 5G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component may also be implemented based on Near Field Communication (NFC) technology, Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, or other technologies.
In the embodiment of the present application, the display assembly may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display component includes a touch panel, the display component may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
In embodiments of the present application, a power supply component is configured to provide power to various components of the device in which it is located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
In embodiments of the present application, the audio component may be configured to output and/or input audio signals. For example, the audio component includes a Microphone (MIC) configured to receive an external audio signal when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals. For example, for devices with language interaction functionality, voice interaction with a user may be enabled through an audio component, and so forth.
It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor do they limit the types of "first" and "second".
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element described by the phrase "comprising" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art to which the present application pertains. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. An axial reconstruction method, comprising:
acquiring an image set acquired by a visual sensor, wherein the image set comprises two or more frames of images;
acquiring internal and external parameters when the vision sensor acquires the images and an observation axis of a target shaft contained in each frame of image, wherein the images in the image set contain at least one identical target shaft;
selecting two frames of images from the image set each time; constructing a plurality of space planes according to the observation axes and the internal and external parameters corresponding to the two frames of images;
calculating a spatially predicted axis of the target shaft from the plurality of spatial planes;
optimizing the spatial prediction axis to obtain a spatial axis of the target shaft.
2. The method of claim 1, wherein constructing a plurality of spatial planes from the viewing axes and the inside-outside parameters corresponding to the two frames of images comprises:
and aiming at a first image in the two frames of images, constructing a first space plane according to a first observation axis of the target shaft in the first image and internal and external parameters of the vision sensor in the process of acquiring the first image.
3. The method of claim 2, wherein said constructing a first spatial plane from the first viewing axis of the target shaft in the first image and the internal and external parameters of the vision sensor during the acquisition of the first image comprises:
calculating a first conversion matrix between a pixel coordinate system and a world coordinate system in the process of acquiring the first image according to internal and external parameters of the visual sensor in the process of acquiring the first image;
converting the first observation axis into a first spatial straight line in a world coordinate system by using the first conversion matrix;
and constructing a first space plane by utilizing the first space straight line and the external parameters of the vision sensor in the process of acquiring the first image.
4. The method of claim 1, wherein said selecting two frames at a time from said set of images comprises:
selecting a second image and a third image from the images of the image set which do not participate in the spatial axis construction;
said calculating a spatially predicted axis of said target shaft from said plurality of spatial planes, comprising:
calculating an intersection line of a second space plane corresponding to the second image and a third space plane corresponding to the third image to serve as an initial space prediction axis of the target rod;
judging whether the initial spatial prediction axis meets a set requirement or not; and using the initial spatial prediction axis which meets the set requirement as the spatial prediction axis of the target shaft.
5. The method of claim 4, further comprising:
respectively calculating first projection straight lines of the initial spatial prediction axis in other images by utilizing external parameters of the vision sensor in the process of acquiring other images; the other images are images in the image set except the second image and the third image;
calculating a distance between the first projected straight line and an observation axis in the other image;
selecting a target observation axis of which the distance from the first projection straight line in the corresponding image is less than or equal to a set distance threshold value from the observation axes in the other images;
calculating the proportion of the target observation axis in the observation axes of the image set;
the judging whether the initial spatial prediction axis meets the set requirement comprises executing at least one judgment operation of the following operations:
judging whether the proportion is larger than or equal to a set proportion threshold value;
judging whether the current cycle number reaches a set number threshold;
and if the judgment result of the at least one judgment operation is yes, determining that the initial spatial prediction axis meets the set requirement.
6. The method of claim 5, further comprising:
if the judgment result of the at least one judgment operation is negative, returning to execute the operation of acquiring a second image and a third image from the images of the image set which do not participate in the spatial axis construction, until the initial spatial prediction axis meets the set requirement;
and taking the initial spatial prediction axis which meets the set requirement as the spatial prediction axis of the target shaft.
7. The method of claim 5, wherein said optimizing the spatial prediction axis to obtain the spatial axis of the target shaft comprises:
acquiring a target observation axis when the initial spatial prediction axis meets a set requirement, and taking the target observation axis as an observation axis for optimization;
acquiring external parameters of the visual sensor in the process of acquiring an image for optimization; the image for optimization is an image where an observation axis for optimization is located;
and optimizing the spatial prediction axis by using the observation axis for optimization and the external parameters of the visual sensor in the process of acquiring the image for optimization to obtain the spatial axis of the target rod-shaped object.
8. The method of claim 7, wherein the spatial prediction axis is represented by Plücker coordinates; the optimizing the spatial prediction axis by using the observation axis for optimization and the external parameter of the visual sensor corresponding to the observation axis for optimization includes:
mapping the Plücker coordinates of the spatial prediction axis to an orthogonal coordinate system to obtain the coordinates of the spatial prediction axis in the orthogonal coordinate system;
taking the coordinate of the spatial axis of the target rod-shaped object under an orthogonal coordinate system as a quantity to be solved, and calculating a second projection straight line of the quantity to be solved in the image for optimization by utilizing the external parameter of the visual sensor in the process of acquiring the image for optimization;
constructing a residual function reflecting the sum of the distances between the second projection straight line in the image for optimization and the observation axis for optimization in the corresponding image;
performing unconstrained nonlinear optimization on the residual function by taking the coordinate of the spatial prediction axis under an orthogonal coordinate system as an initial solution of the residual function so as to solve the coordinate of the spatial axis of the target rod under the orthogonal coordinate system;
and calculating the Plücker coordinates of the spatial axis of the target shaft according to the solved coordinates of the spatial axis of the target shaft in the orthogonal coordinate system.
9. The method of claim 8, further comprising:
taking the solution of the quantity to be solved when minimizing the residual function as the coordinates of the spatial axis of the target shaft in the orthogonal coordinate system.
10. The method of any one of claims 1-9, wherein the acquiring the viewing axis of the target shaft contained in each frame of image comprises:
performing semantic segmentation on each frame of image respectively to determine the pixel coordinates of the target rod-shaped object in each frame of image;
and performing linear regression on the pixel coordinates of the target rod in each frame of image to obtain the observation axis of the target rod in the frame of image.
11. An acquisition device, comprising: a machine body; the machine body is provided with a memory, a processor and a visual sensor; wherein the memory is to store a computer program;
the vision sensor is used for acquiring images;
the processor is coupled to the memory for executing the computer program for performing the steps of the method of any of claims 1-10.
12. A computer-readable storage medium storing computer instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the method of any one of claims 1-10.
CN202110758575.XA 2021-07-05 2021-07-05 Axis reconstruction method, axis reconstruction equipment and storage medium Pending CN115588085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110758575.XA CN115588085A (en) 2021-07-05 2021-07-05 Axis reconstruction method, axis reconstruction equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110758575.XA CN115588085A (en) 2021-07-05 2021-07-05 Axis reconstruction method, axis reconstruction equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115588085A true CN115588085A (en) 2023-01-10

Family

ID=84771081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110758575.XA Pending CN115588085A (en) 2021-07-05 2021-07-05 Axis reconstruction method, axis reconstruction equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115588085A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117765186A (en) * 2024-02-18 2024-03-26 广东电网有限责任公司广州供电局 Reconstruction method, device, equipment and storage medium of environment space
CN117765186B (en) * 2024-02-18 2024-05-28 广东电网有限责任公司广州供电局 Reconstruction method, device, equipment and storage medium of environment space


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination