CN113436230B - Incremental translational averaging method, system and equipment - Google Patents

Publication number
CN113436230B
Authority
CN
China
Prior art keywords
vertex
absolute position
camera
edges
optimization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110992939.0A
Other languages
Chinese (zh)
Other versions
CN113436230A (en
Inventor
高翔
李梦晗
马孝冬
解则晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202110992939.0A
Publication of CN113436230A
Application granted
Publication of CN113436230B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Studio Devices (AREA)

Abstract

The invention belongs to the technical field of structure from motion in three-dimensional reconstruction, and particularly relates to an incremental translational averaging method, system and device, aiming to solve the high complexity, low accuracy and poor robustness of existing translational averaging methods. The method includes: constructing an epipolar geometry graph; constructing a set of camera quadruples and selecting an initial camera quadruple as the initial seed view based on local optimization; constructing a third vertex set; forming camera triples and determining the order in which vertices are added using a next optimal view selection strategy based on a weighted support set; performing weighted local optimization or weighted global optimization on the currently estimated absolute positions; and, after each global optimization, performing re-translation averaging on all vertices whose absolute positions have been estimated. The invention reduces the complexity of translational averaging and improves the accuracy and robustness of absolute position estimation.

Description

Incremental translational averaging method, system and equipment
Technical Field
The invention belongs to the technical field of structure from motion in three-dimensional reconstruction, and particularly relates to an incremental translational averaging method, system and device.
Background
Structure from motion is a key step in image-based three-dimensional reconstruction of large-scale scenes and has developed rapidly in recent years. Its input is a set of image feature matches, and its output is the absolute camera poses and the scene structure. Depending on how the camera poses are initialized, structure from motion methods can be roughly divided into incremental and global ones. The incremental method initializes the camera poses and the scene structure by iterating camera pose estimation and scene structure expansion; to cope with the inevitable feature matching outliers, it also uses the random sample consensus (RANSAC) algorithm and bundle adjustment during the iteration. The global method instead initializes the camera poses mainly by motion averaging, which generally consists of two steps: rotational averaging and translational averaging. Compared with global structure from motion, the incremental method invokes RANSAC-based model estimation and bundle-adjustment-based parameter optimization more frequently, so its results are more accurate and robust.
Translational averaging estimates the absolute positions of the cameras from relative translation measurements. The relative translations are usually obtained by estimating and decomposing the essential matrix. Translational averaging is harder than rotational averaging for three reasons: 1) the essential matrix encodes only the direction of the relative translation, so the relative translation obtained by decomposing it has an unknown scale; 2) the accuracy of the relative translation recovered from the essential matrix is more sensitive to false feature matches than that of the relative rotation; 3) translational averaging can uniquely estimate only cameras that lie in the same parallel rigid component, whereas rotational averaging only requires that the cameras lie in the same connected component. Although translational averaging has been widely studied, it is far from solved and, compared with rotational averaging, remains an active research topic.
Existing translational averaging methods focus mainly on three aspects: 1) designing a suitable cost function and optimization scheme; 2) studying filtering or refinement strategies for the epipolar geometry; 3) introducing auxiliary information such as feature tracks, camera triples or rank constraints. Although these methods have achieved good results, their reliance on complex objective functions and optimization, elaborate initialization, or additional information makes them complicated and inefficient, and accuracy and robustness remain key challenges for them. Inspired by incremental structure from motion, the invention provides an incremental translational averaging method.
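The scale ambiguity named in reason 1) can be checked numerically. The following is a minimal sketch, assuming the common convention X2 = R X1 + t and E = [t]x R; the rotation, translation and 3D point are made-up values, and all function names are illustrative rather than taken from the patent.

```python
import math

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    x, y, z = t
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def matvec(a, v):
    return [sum(a[r][k] * v[k] for k in range(3)) for r in range(3)]

def essential(R, t):
    """E = [t]x R under the assumed convention X2 = R X1 + t."""
    return matmul(skew(t), R)

def epipolar_residual(E, x1, x2):
    """x2^T E x1, which vanishes for a correct correspondence."""
    return sum(a * b for a, b in zip(x2, matvec(E, x1)))

# Hypothetical two-view setup: small rotation about the z axis plus a translation.
angle = 0.1
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle), math.cos(angle), 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 0.2, 0.0]
X1 = [0.3, -0.2, 4.0]                                # point in camera 1 coordinates
X2 = [r + s for r, s in zip(matvec(R, X1), t)]       # same point in camera 2 coordinates

# The same correspondence satisfies the epipolar constraint for t and for 5 * t,
# so feature matches alone cannot reveal the length of the baseline.
res_unit = epipolar_residual(essential(R, t), X1, X2)
res_scaled = epipolar_residual(essential(R, [5.0 * c for c in t]), X1, X2)
```

Both residuals are zero up to rounding, which is why each edge of the epipolar geometry graph carries a translation direction only.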
Disclosure of Invention
In order to solve the above problems in the prior art, namely the high complexity, low accuracy and poor robustness of conventional translational averaging methods, a first aspect of the present invention provides an incremental translational averaging method applied to solving the absolute camera positions in global structure from motion, the method comprising:
step S100, acquiring a plurality of images, performing feature matching between every pair of images, constructing an epipolar geometry graph G = (V, E) from the epipolar geometric relationships, and then calculating the relative rotation and relative translation between each matched image pair; where V is the vertex set, representing the set of cameras that captured the scene images, and E is the edge set, representing the set of epipolar geometry edges between pairs of cameras that captured different images, each edge carrying the motion information between the two cameras;
step S200, selecting the top [symbol] edges with the largest numbers of feature matches in the epipolar geometry graph, constructing a quadruple set from the camera quadruples formed by these edges, and calculating the absolute position of each camera of each camera quadruple in the quadruple set in a local coordinate system; calculating the selection cost of each camera quadruple in the quadruple set from these absolute positions, and taking the views of the camera quadruple with the maximum selection cost as the initial seed view;
step S300, constructing the set of vertices with estimated absolute positions from the vertices of the initial seed view and taking it as the first vertex set, and taking the set of vertices in the epipolar geometry graph whose absolute positions have not yet been estimated as the second vertex set; selecting, from the second vertex set, the top [symbol] vertices with the largest numbers of edges connecting them to vertices in the first vertex set, and constructing the third vertex set from them;
step S400, forming camera triples from each vertex in the third vertex set and vertices in the first vertex set, and calculating the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; calculating the selection cost of each vertex from the obtained absolute position, and taking the view corresponding to the vertex with the maximum selection cost as the next optimal view;
step S500, fixing the estimated absolute positions of the cameras in the first vertex set, and performing weighted local optimization only on the absolute position of the vertex corresponding to the newly estimated next optimal view; after the weighted local optimization, evaluating the growth ratio of the number of vertices whose absolute positions have been estimated, and if the ratio is greater than a set threshold, performing weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, calculating the relative position error of each edge in the first edge set as a first error, using the absolute positions in the first vertex set and the measured relative position between the two vertices connected by that edge; if the first error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing weighted local optimization of the absolute position of the vertex corresponding to the selected next optimal view over the inlier edges under the [symbol] norm; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: calculating the relative position error of each epipolar geometry edge as a second error, using the absolute positions of all vertices and the measured relative position between each pair of vertices; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing weighted global optimization of the absolute positions of all vertices whose absolute positions have been estimated over the inlier edges under the [symbol] norm;
after the weighted global optimization is completed, re-translation averaging is further performed on all vertices whose absolute positions have been estimated, using the weighted global optimization method;
and step S600, after all absolute positions have been estimated, outputting the absolute positions obtained by weighted global optimization and re-translation averaging of all estimable vertices as the final estimate of the absolute position of each camera.
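The candidate selection in step S300 can be sketched as follows. This is an illustrative reading of the step, not the patented implementation: the function name `select_candidates`, the edge representation as vertex-index pairs, and the parameter `k` (standing in for the count whose symbol is lost in the source) are all assumptions.

```python
from collections import Counter

def select_candidates(edges, estimated, k):
    """Return up to k not-yet-estimated vertices, ranked by how many
    epipolar edges connect them to already estimated vertices."""
    counts = Counter()
    for i, j in edges:
        if i in estimated and j not in estimated:
            counts[j] += 1
        elif j in estimated and i not in estimated:
            counts[i] += 1
    # Descending edge count; ties broken by vertex index for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [v for v, _ in ranked[:k]]

# Hypothetical graph: vertices 0..3 form the seed quadruple (first vertex set).
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 4), (1, 4), (2, 4), (3, 5), (5, 6)]
first_vertex_set = {0, 1, 2, 3}
third_vertex_set = select_candidates(edges, first_vertex_set, k=2)
```

Vertex 6 has no edge into the first vertex set, so it cannot be a candidate until vertex 5 has been estimated; this is how the incremental order follows the connectivity of the graph.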
In some preferred embodiments, the absolute position of each camera in each camera quadruple of the quadruple set in the local coordinate system is calculated by:
[formula image]
where [symbol] respectively denote the optimized absolute positions of the cameras [symbol] in the local coordinate system; [symbol] denotes the squared chordal distance; [symbol] denotes the relative translation transformed into the global coordinate system; [symbol] denotes any edge in the edge set; [symbol] denotes any camera quadruple in the quadruple set; and [symbol] and [symbol] denote the initial absolute positions of the cameras [symbol].
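Because the formula itself is not recoverable here, the following is only a sketch of how a squared chordal distance between a measured translation direction and an estimated baseline could be evaluated. It assumes the chordal distance is the Euclidean distance between the two unit-normalized vectors and that the cost sums this quantity over the edges of a quadruple; both are assumptions, as are the names `squared_chordal` and `quad_cost`.

```python
import math

def unit(v):
    """Normalize a 3-vector; raises ZeroDivisionError for a zero vector."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def squared_chordal(u, v):
    """Squared Euclidean distance between the unit vectors of u and v."""
    a, b = unit(u), unit(v)
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quad_cost(positions, rel_trans):
    """Sum of squared chordal distances between each measured relative
    translation t_ij and the baseline c_j - c_i of the current positions."""
    total = 0.0
    for (i, j), t_ij in rel_trans.items():
        baseline = [cj - ci for ci, cj in zip(positions[i], positions[j])]
        total += squared_chordal(t_ij, baseline)
    return total

# Hypothetical quadruple: four camera positions and six consistent directions
# (the measured vectors have arbitrary lengths, since scale is unknown).
positions = {0: [0.0, 0.0, 0.0], 1: [1.0, 0.0, 0.0],
             2: [1.0, 1.0, 0.0], 3: [0.0, 1.0, 1.0]}
rel_trans = {(0, 1): [2.0, 0.0, 0.0], (1, 2): [0.0, 3.0, 0.0],
             (2, 3): [-1.0, 0.0, 1.0], (0, 3): [0.0, 1.0, 1.0],
             (0, 2): [1.0, 1.0, 0.0], (1, 3): [-1.0, 1.0, 1.0]}

cost_exact = quad_cost(positions, rel_trans)
scaled = {k: [7.0 * x for x in p] for k, p in positions.items()}
cost_scaled = quad_cost(scaled, rel_trans)
perturbed = dict(positions)
perturbed[3] = [0.0, 1.0, -1.0]
cost_perturbed = quad_cost(perturbed, rel_trans)
```

The cost is unchanged when all positions are scaled together, which matches the fact that a quadruple can only be reconstructed up to scale in its local coordinate system.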
In some preferred embodiments, the selection cost of each camera quadruple in the quadruple set is calculated by:
[formula image]
where [symbol] denotes the selection cost of the camera quadruple, and [symbol] denotes the weight of each epipolar geometry edge in the camera quadruple.
In some preferred embodiments, the selection cost of each vertex is calculated from the obtained absolute position by:
[formula image]
where [symbol] denotes the selection cost of each vertex in next optimal view selection; [symbol] is an edge in [symbol], the set of epipolar geometry edges between the vertex [symbol] in the third vertex set and the vertices in the first vertex set; [symbol] denotes the relative position between the two vertices connected by that edge; [symbol] denotes the current absolute position estimate of the vertex [symbol] in the first vertex set; and [symbol] denotes the weight of each epipolar geometry edge.
In some preferred embodiments, the inlier edges in the weighted local optimization are obtained by:
[formula image]
[formula image]
where [symbol] denotes the set of edges between the first vertex set [symbol] and the vertex [symbol] corresponding to the selected next optimal view; [symbol] is any edge in that set; [symbol] denotes the set of inlier edges in that set; [symbol] denotes the relative position between the two cameras connected by the edge; [symbol] denotes the current absolute position estimate of the vertex [symbol]; [symbol] denotes the initialized absolute position of the vertex corresponding to the selected next optimal view; [symbol] denotes the relative position error of each epipolar geometry edge; and [symbol] denotes the set error threshold.
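The inlier test can be illustrated with a short sketch. The error measure used below, the angle between the measured relative position and the baseline implied by the current estimates, is an assumption made for illustration; the source formula is not available, and the names `angle_deg` and `inlier_edges` as well as the 5-degree threshold are hypothetical.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two non-zero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    c = max(-1.0, min(1.0, dot / (nu * nv)))   # clamp against rounding
    return math.degrees(math.acos(c))

def inlier_edges(new_pos, fixed_pos, measured, max_err_deg):
    """Keep the edges whose measured direction (from fixed vertex i to the
    new vertex) agrees with the direction implied by the current positions."""
    inliers = []
    for i in sorted(measured):
        baseline = [n - f for n, f in zip(new_pos, fixed_pos[i])]
        if angle_deg(measured[i], baseline) < max_err_deg:
            inliers.append(i)
    return inliers

# Hypothetical data: three fixed cameras, one newly initialized camera,
# and one grossly wrong measurement on the edge from vertex 2.
fixed_pos = {0: [0.0, 0.0, 0.0], 1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0]}
new_pos = [1.0, 1.0, 0.0]
measured = {0: [1.0, 1.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [-1.0, 0.0, 0.0]}
kept = inlier_edges(new_pos, fixed_pos, measured, max_err_deg=5.0)
```

Only the edges that pass this test would take part in the subsequent weighted optimization; the same filter applied to all edges between estimated vertices gives the inlier set for the global case.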
In some preferred embodiments, the weighted local optimization of the absolute position of the vertex corresponding to the selected next optimal view under the [symbol] norm is performed by:
[formula image]
[formula image]
where [symbol] denotes the result of weighted local optimization of the absolute position [symbol]; [symbol] is any edge in the inlier edge set; and [symbol] denotes the relative position between the two cameras connected by that edge.
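To make the idea of optimizing only the new vertex while the others stay fixed concrete, here is a stand-in that is explicitly not the norm-based objective of the patent: a weighted linear least-squares fit that places the new camera as close as possible to the rays leaving each fixed camera along its measured direction. The function names, the weights and the data are all assumptions.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x

def locate_from_rays(anchors, directions, weights):
    """Minimize sum_i w_i * ||(I - u_i u_i^T)(c - p_i)||^2 over c, where p_i is a
    fixed camera position and u_i the unit measured direction towards the new camera."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for p, d, w in zip(anchors, directions, weights):
        n = math.sqrt(sum(x * x for x in d))
        u = [x / n for x in d]
        for r in range(3):
            for c in range(3):
                proj = (1.0 if r == c else 0.0) - u[r] * u[c]
                A[r][c] += w * proj
                b[r] += w * proj * p[c]
    return solve3(A, b)

# Hypothetical inlier edges: three fixed cameras observing a new camera at (1, 1, 2).
anchors = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
directions = [[1.0, 1.0, 2.0], [0.0, 1.0, 2.0], [1.0, 0.0, 2.0]]
weights = [1.0, 0.5, 2.0]   # e.g. derived from feature match counts (assumption)
estimate = locate_from_rays(anchors, directions, weights)
```

With noise-free directions the estimate coincides with the true position regardless of the weights; with noisy data the weights control how strongly each inlier edge pulls the solution.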
In some preferred embodiments, the inlier edges in the weighted global optimization are obtained by:
[formula image]
[formula image]
where [symbol] denotes the set of inlier edges in the weighted global optimization; [symbol] denotes the set of edges between all vertices whose absolute positions have been estimated; [symbol] denotes any edge in that set; [symbol] denotes the relative position between the two cameras connected by the edge; and [symbol] denotes the current absolute position estimate of the vertex [symbol].
In some preferred embodiments, the weighted global optimization of the absolute positions of all vertices whose absolute positions have been estimated under the [symbol] norm is performed by:
[formula image]
[formula image]
where [symbol] denotes the result of weighted global optimization of the set of absolute positions [symbol]; [symbol] is any edge in [symbol]; and [symbol] denotes the relative position between the two cameras connected by that edge.
In a second aspect of the present invention, an incremental translational averaging system is provided, the system including: an epipolar geometry graph construction module, an initial seed view selection module, a set construction module, a next optimal view selection module, an optimization module and an absolute position estimation output module;
the epipolar geometry graph construction module is configured to acquire a plurality of images, perform feature matching between every pair of images, construct an epipolar geometry graph G = (V, E) from the epipolar geometric relationships, and then calculate the relative rotation and relative translation between each matched image pair; where V is the vertex set, representing the set of cameras that captured the scene images, and E is the edge set, representing the set of epipolar geometry edges between pairs of cameras that captured different images, each edge carrying the motion information between the two cameras;
the initial seed view selection module is configured to select the top [symbol] edges with the largest numbers of feature matches in the epipolar geometry graph, construct a quadruple set from the camera quadruples formed by these edges, and calculate the absolute position of each camera of each camera quadruple in the quadruple set in a local coordinate system; and to calculate the selection cost of each camera quadruple in the quadruple set from these absolute positions, and take the views of the camera quadruple with the maximum selection cost as the initial seed view;
the set construction module is configured to construct the set of vertices with estimated absolute positions from the vertices of the initial seed view and take it as the first vertex set, and to take the set of vertices in the epipolar geometry graph whose absolute positions have not yet been estimated as the second vertex set; and to select, from the second vertex set, the top [symbol] vertices with the largest numbers of edges connecting them to vertices in the first vertex set, and construct the third vertex set from them;
the next optimal view selection module is configured to form camera triples from each vertex in the third vertex set and vertices in the first vertex set, and calculate the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; and to calculate the selection cost of each vertex from the obtained absolute position, and take the view corresponding to the vertex with the maximum selection cost as the next optimal view;
the optimization module is configured to fix the estimated absolute positions of the cameras in the first vertex set, and perform weighted local optimization only on the absolute position of the vertex corresponding to the newly estimated next optimal view; and, after the weighted local optimization, to evaluate the growth ratio of the number of vertices whose absolute positions have been estimated, and if the ratio is greater than a set threshold, perform weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, calculating the relative position error of each edge in the first edge set as a first error, using the absolute positions in the first vertex set and the measured relative position between the two vertices connected by that edge; if the first error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing weighted local optimization of the absolute position of the vertex corresponding to the selected next optimal view over the inlier edges under the [symbol] norm; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: calculating the relative position error of each epipolar geometry edge as a second error, using the absolute positions of all vertices and the measured relative position between each pair of vertices; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing weighted global optimization of the absolute positions of all vertices whose absolute positions have been estimated over the inlier edges under the [symbol] norm;
after the weighted global optimization is completed, re-translation averaging is further performed on all vertices whose absolute positions have been estimated, using the weighted global optimization method;
and the absolute position estimation output module is configured to output, after all absolute positions have been estimated, the absolute positions obtained by weighted global optimization and re-translation averaging of all estimable vertices as the final estimate of the absolute position of each camera.
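The data that the modules above exchange can be pictured as a small graph container. This is a hedged sketch of one possible representation; the class names `EpipolarEdge` and `EpipolarGraph`, the field names, and the method `top_edges` (mirroring the selection of the edges with the most feature matches by the seed view selection module) are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass, field

@dataclass
class EpipolarEdge:
    i: int                  # index of the first camera
    j: int                  # index of the second camera
    rotation: list          # 3x3 relative rotation
    translation: list       # relative translation direction (scale unknown)
    num_matches: int        # number of feature matches supporting the edge

@dataclass
class EpipolarGraph:
    vertices: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)

    def add_edge(self, edge):
        """Register an edge under an order-independent key and record its cameras."""
        key = (min(edge.i, edge.j), max(edge.i, edge.j))
        self.vertices.update(key)
        self.edges[key] = edge

    def top_edges(self, n):
        """Return the n edges with the most feature matches."""
        return sorted(self.edges.values(), key=lambda e: -e.num_matches)[:n]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
graph = EpipolarGraph()
graph.add_edge(EpipolarEdge(0, 1, identity, [1.0, 0.0, 0.0], num_matches=350))
graph.add_edge(EpipolarEdge(2, 1, identity, [0.0, 1.0, 0.0], num_matches=120))
graph.add_edge(EpipolarEdge(0, 2, identity, [1.0, 1.0, 0.0], num_matches=410))
best = [(e.i, e.j) for e in graph.top_edges(2)]
```

Keeping the match count on each edge lets the same structure serve both the seed selection and any match-based edge weighting.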
In a third aspect of the invention, an electronic device is provided, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the incremental translational averaging method as recited in the claims.
The invention has the beneficial effects that:
the invention reduces the complexity of the translational averaging method and improves the accuracy and robustness of absolute position estimation.
1) The method uses an initial quadruple selection strategy based on local optimization to select and construct the seed view; determines the order in which vertices are added using a next optimal view selection strategy based on a weighted support set; and performs weighted local or global optimization after each next optimal view is selected and initialized, followed by one step of re-translation averaging after each weighted global optimization, which makes the absolute position estimates more accurate and robust.
2) Owing to the effectiveness of incremental parameter estimation, the proposed translational averaging method depends less on the robust operations commonly used in other methods, and offers a simpler and more efficient way to implement translational averaging.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of an incremental translational averaging method according to an embodiment of the present invention;
Fig. 2 is a block diagram of an incremental translational averaging system according to an embodiment of the present invention;
Fig. 3 is a detailed flowchart of an incremental translational averaging method according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The invention discloses an incremental translational averaging method, applied to solving the absolute camera positions in global structure from motion, the method comprising the following steps:
step S100, acquiring a plurality of images, performing feature matching between every pair of images, constructing an epipolar geometry graph G = (V, E) from the epipolar geometric relationships, and then calculating the relative rotation and relative translation between each matched image pair; where V is the vertex set, representing the set of cameras that captured the scene images, and E is the edge set, representing the set of epipolar geometry edges between pairs of cameras that captured different images, each edge carrying the motion information between the two cameras;
step S200, selecting the top [symbol] edges with the largest numbers of feature matches in the epipolar geometry graph, constructing a quadruple set from the camera quadruples formed by these edges, and calculating the absolute position of each camera of each camera quadruple in the quadruple set in a local coordinate system; calculating the selection cost of each camera quadruple in the quadruple set from these absolute positions, and taking the views of the camera quadruple with the maximum selection cost as the initial seed view;
step S300, constructing the set of vertices with estimated absolute positions from the vertices of the initial seed view and taking it as the first vertex set, and taking the set of vertices in the epipolar geometry graph whose absolute positions have not yet been estimated as the second vertex set; selecting, from the second vertex set, the top [symbol] vertices with the largest numbers of edges connecting them to vertices in the first vertex set, and constructing the third vertex set from them;
step S400, forming camera triples from each vertex in the third vertex set and vertices in the first vertex set, and calculating the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; calculating the selection cost of each vertex from the obtained absolute position, and taking the view corresponding to the vertex with the maximum selection cost as the next optimal view;
step S500, fixing the estimated absolute positions of the cameras in the first vertex set, and performing weighted local optimization only on the absolute position of the vertex corresponding to the newly estimated next optimal view; after the weighted local optimization, evaluating the growth ratio of the number of vertices whose absolute positions have been estimated, and if the ratio is greater than a set threshold, performing weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, combining the absolute positions of the vertices in the first vertex set with the measured relative position between the two vertices connected by each edge in the first edge set, and calculating the relative position error corresponding to each edge as a first error; if the first error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then, using the inlier edges and the
Figure 851762DEST_PATH_IMAGE006
norm, performing weighted local optimization on the absolute position of the vertex corresponding to the selected next optimal view; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: combining the absolute positions of all vertices with the measured relative position between each pair of connected vertices, and calculating the relative position error corresponding to each epipolar geometry edge as a second error; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then, using the inlier edges and the
Figure 409782DEST_PATH_IMAGE007
norm, performing weighted global optimization on the absolute positions of all vertices whose absolute positions have been estimated;
after the weighted global optimization is completed, further performing re-translation averaging on all vertices with estimated absolute positions, using the weighted global optimization method;
step S600, after all absolute positions have been estimated, outputting the absolute positions obtained by applying weighted global optimization and re-translation averaging to all estimable vertices as the final estimate of the absolute position of each camera.
For a clearer explanation of the incremental translational averaging method of the present invention, the following will discuss the steps in one embodiment of the method of the present invention with reference to fig. 1 and 3.
Step S100, acquiring multiple frames of images, performing feature matching between every pair of images, and constructing an epipolar geometry graph from the epipolar geometric relationships
Figure 798038DEST_PATH_IMAGE001
and then calculating the relative rotation and relative translation between the matched image pairs; wherein
Figure 633270DEST_PATH_IMAGE002
is the vertex set, representing the set of cameras that captured the images, and
Figure 550411DEST_PATH_IMAGE003
is the edge set, representing the set of epipolar geometry edges between pairs of cameras that captured different images, which carries the motion information between the cameras;
In the present embodiment, the epipolar geometry graph used for translation averaging is denoted as
Figure 126886DEST_PATH_IMAGE001
where
Figure 912308DEST_PATH_IMAGE002
and
Figure 992259DEST_PATH_IMAGE003
represent the vertex set and the edge set of the epipolar geometry graph, respectively. The input to the invention is the relative translation between matched image pairs
Figure 486826DEST_PATH_IMAGE061
which is transformed into the global coordinate system by the known absolute rotations and denoted
Figure 019438DEST_PATH_IMAGE062
where
Figure 483918DEST_PATH_IMAGE013
denotes any edge in
Figure 543009DEST_PATH_IMAGE063
and
Figure 067532DEST_PATH_IMAGE012
is the representation of the relative translation
Figure 353020DEST_PATH_IMAGE061
in the global coordinate system, obtained by the formula
Figure 230977DEST_PATH_IMAGE064
where
Figure 019941DEST_PATH_IMAGE065
is the absolute rotation of vertex
Figure 980944DEST_PATH_IMAGE066
in the global coordinate system. The output of the invention is the optimized absolute position of each camera, denoted
Figure 347203DEST_PATH_IMAGE067
where
Figure 153485DEST_PATH_IMAGE068
denotes any vertex in
Figure 062535DEST_PATH_IMAGE002
and
Figure 804227DEST_PATH_IMAGE069
denotes the optimized absolute position of the camera corresponding to
Figure 798727DEST_PATH_IMAGE068
. FIG. 1 is a flow chart of the method of the present invention, which mainly comprises the following parts: 1) selecting and constructing the seed view with an initial quadruple selection strategy based on local optimization; 2) determining the order in which vertices are added with a next optimal view selection strategy based on a weighted support set; 3) performing weighted local or global optimization after the next optimal view has been selected and initialized; 4) performing one step of re-translation averaging after each weighted global optimization, to make the absolute position estimates more accurate and robust. The steps are described in detail below.
Step S200, selecting, in the epipolar geometry graph, the top
Figure 143121DEST_PATH_IMAGE059
edges with the largest number of feature matches, constructing a quadruple set from the camera quadruples formed by these edges, and calculating the absolute position, in a local coordinate system, of each camera in each camera quadruple of the quadruple set; calculating the selection cost of each camera quadruple in the quadruple set from the absolute positions of its cameras, and taking the view corresponding to the camera quadruple with the maximum selection cost as the initial seed view;
The selection of the initial seed view is a key step both in incremental structure from motion and in the method of the present invention. The most intuitive approach is to choose the camera pair or camera triplet with the smallest rotation cycle deviation as the initial seed view, according to:
Figure 293961DEST_PATH_IMAGE070
(1)
Figure 65608DEST_PATH_IMAGE071
(2)
where
Figure 812984DEST_PATH_IMAGE072
is the angular distance on
Figure 836435DEST_PATH_IMAGE073
, and
Figure 454498DEST_PATH_IMAGE074
denotes the set of camera triplets formed by all epipolar geometry edges in
Figure 662625DEST_PATH_IMAGE003
.
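The rotation cycle deviation used by this intuitive baseline can be illustrated with a minimal sketch. Formulas (1) and (2) are not reproduced in this text, so the convention below (relative rotation `R_ij` maps camera i's frame to camera j's frame, and the loop is composed as `R_ki @ R_jk @ R_ij`) and all function names are assumptions made for illustration only.

```python
import numpy as np

def rotation_angle_deg(R):
    """Angular distance, in degrees, between rotation matrix R and the identity."""
    cos_theta = (np.trace(R) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

def triplet_cycle_deviation_deg(R_ij, R_jk, R_ki):
    """Deviation of the composed loop i -> j -> k -> i from the identity.

    Assumed convention: R_ab maps camera a's frame to camera b's frame.
    """
    return rotation_angle_deg(R_ki @ R_jk @ R_ij)

def rot_z(deg):
    """Helper for the example: rotation about the z axis by `deg` degrees."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

# A consistent loop has (near) zero deviation; a 2-degree error shows up directly.
consistent = triplet_cycle_deviation_deg(rot_z(30), rot_z(40), rot_z(-70))
perturbed = triplet_cycle_deviation_deg(rot_z(30), rot_z(40), rot_z(-68))
```

A triplet with the smallest such deviation would be the seed under the baseline; the embodiment replaces this criterion with a position-based one.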
However, on the same edge, a highly accurate
Figure 756352DEST_PATH_IMAGE075
does not guarantee a highly accurate
Figure 442549DEST_PATH_IMAGE061
. Therefore, the choice of the initial seed view in translation averaging should depend on the cycle deviation of the positions themselves rather than on the cycle deviation of the rotations. In addition, because the relative translations carry no magnitude information, a camera triplet is the minimal configuration for computing camera positions, so an additional camera is needed to evaluate how well the initial seed view has been selected and recovered. Finally, for robustness, the present embodiment selects and constructs the initial seed view from a camera quadruple rather than a camera pair or camera triplet; the selection proceeds as follows:
Before selection, note that the epipolar geometry graph usually contains a large number of quadruples, especially when the number of vertices is large. To balance the effectiveness and efficiency of quadruple selection, the invention considers only the camera quadruples formed by the top
Figure 180697DEST_PATH_IMAGE059
(
Figure 435092DEST_PATH_IMAGE059
is a natural number; in the invention
Figure 625902DEST_PATH_IMAGE059
is preferably set to 100) edges with the largest number of feature matches, and denotes the set of quadruples as
Figure 115789DEST_PATH_IMAGE076
where
Figure 098658DEST_PATH_IMAGE077
denotes the selected set of quadruples and
Figure 648588DEST_PATH_IMAGE014
denotes any camera quadruple in
Figure 326694DEST_PATH_IMAGE077
.
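The candidate quadruple construction described above can be sketched as follows. The text does not spell out when four cameras count as a quadruple, so this sketch assumes a quadruple is any four cameras whose six pairwise edges all lie among the retained top edges; the function name and the `match_counts` input layout are hypothetical.

```python
from itertools import combinations

def candidate_quadruples(match_counts, top_n=100):
    """Enumerate camera quadruples formed by the top_n best-matched edges.

    match_counts: dict mapping an edge (i, j) to its number of feature matches.
    Assumption: a quadruple needs all six pairwise edges among the kept edges.
    """
    top_edges = sorted(match_counts, key=match_counts.get, reverse=True)[:top_n]
    edge_set = {tuple(sorted(e)) for e in top_edges}
    vertices = sorted({v for e in edge_set for v in e})
    quads = []
    # Brute-force enumeration; acceptable for a sketch because top_n is small.
    for quad in combinations(vertices, 4):
        if all(pair in edge_set for pair in combinations(quad, 2)):
            quads.append(quad)
    return quads

counts = {(0, 1): 90, (0, 2): 80, (0, 3): 70, (1, 2): 60,
          (1, 3): 50, (2, 3): 40, (3, 4): 10}
quads_top6 = candidate_quadruples(counts, top_n=6)
quads_top5 = candidate_quadruples(counts, top_n=5)
```

Restricting the search to the best-matched edges keeps the number of quadruples manageable, which is the efficiency argument made in the text.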
For each quadruple in
Figure 354693DEST_PATH_IMAGE077
, denoted
Figure 942800DEST_PATH_IMAGE014
, the absolute position of each camera of the quadruple in the local coordinate system can be obtained by:
Figure 663631DEST_PATH_IMAGE008
(3)
where
Figure 219246DEST_PATH_IMAGE009
respectively denote the optimized absolute positions of the cameras
Figure 050936DEST_PATH_IMAGE010
in the local coordinate system,
Figure 352605DEST_PATH_IMAGE011
denotes the squared chordal distance,
Figure 385283DEST_PATH_IMAGE012
is the representation of the relative translation in the global coordinate system,
Figure 037981DEST_PATH_IMAGE013
denotes any edge in
Figure 407782DEST_PATH_IMAGE003
,
Figure 829536DEST_PATH_IMAGE014
denotes any camera quadruple in the set of quadruples, and
Figure 285313DEST_PATH_IMAGE015
Figure 159728DEST_PATH_IMAGE016
denote the initial absolute positions of the cameras
Figure 739745DEST_PATH_IMAGE017
.
When initializing the set of absolute positions
Figure 016006DEST_PATH_IMAGE078
, first
Figure 515120DEST_PATH_IMAGE015
and
Figure 1465DEST_PATH_IMAGE016
are initialized to
Figure 978648DEST_PATH_IMAGE079
and
Figure 374995DEST_PATH_IMAGE012
, respectively; then, using the known absolute rotations, a linear trifocal tensor solver is applied to the triplets
Figure 654797DEST_PATH_IMAGE080
and
Figure 769384DEST_PATH_IMAGE081
to initialize
Figure 284679DEST_PATH_IMAGE027
and
Figure 660165DEST_PATH_IMAGE082
, respectively. After the above optimization, for
Figure 501083DEST_PATH_IMAGE077
, the selection cost of each quadruple
Figure 837386DEST_PATH_IMAGE014
is calculated by:
Figure 31738DEST_PATH_IMAGE018
(4)
where
Figure 402677DEST_PATH_IMAGE019
denotes the selection cost of the camera quadruple, and
Figure 148916DEST_PATH_IMAGE020
denotes the weight corresponding to each epipolar geometry edge in the camera quadruple.
Finally, the selected initial quadruple (or initial seed view) can be obtained by:
Figure 831570DEST_PATH_IMAGE083
(5)
where
Figure 219826DEST_PATH_IMAGE084
denotes the indices of the vertex cameras of the initial quadruple, whose four corresponding absolute positions are
Figure 179691DEST_PATH_IMAGE085
Figure 972198DEST_PATH_IMAGE086
Figure 283094DEST_PATH_IMAGE087
Figure 209461DEST_PATH_IMAGE088
Step S300, constructing a set of vertices with estimated absolute positions from the vertices corresponding to the initial seed view; taking the set of vertices with estimated absolute positions in the epipolar geometry graph as a first vertex set, and the set of vertices with un-estimated absolute positions in the epipolar geometry graph as a second vertex set; selecting, from the second vertex set, the top
Figure 168975DEST_PATH_IMAGE060
vertices having the largest number of edges connecting them to the vertices in the first vertex set, and constructing a third vertex set from them;
Next optimal view selection is another key step in incremental structure from motion and also needs careful consideration. A simple approach is to select the camera with the most edges connecting it to the cameras whose absolute positions have already been estimated, and to use its view as the next optimal view. However, different edges in the set
Figure 788175DEST_PATH_IMAGE003
have different relative translation measurement errors
Figure 320788DEST_PATH_IMAGE089
, so these edges should not be treated equally during selection. Therefore, to improve the robustness of the translation averaging method, the invention designs a next optimal view selection strategy based on a weighted support set, described as follows:
In this embodiment, the sets of vertices whose absolute positions are currently estimated and un-estimated are denoted
Figure 395054DEST_PATH_IMAGE033
and
Figure 595092DEST_PATH_IMAGE090
, respectively:
Figure 119614DEST_PATH_IMAGE033
, the set of vertices with currently estimated absolute positions, is the first vertex set, and
Figure 998577DEST_PATH_IMAGE090
, the set of vertices with currently un-estimated absolute positions, is the second vertex set, i.e.
Figure 266747DEST_PATH_IMAGE091
. The purpose of next optimal view selection is to select from
Figure 321291DEST_PATH_IMAGE090
a vertex
Figure 626501DEST_PATH_IMAGE038
that makes the incremental absolute position computation more robust. To make the selection efficient, only the vertices in
Figure 399285DEST_PATH_IMAGE090
that rank, by number of edges connecting them to the vertices in
Figure 205567DEST_PATH_IMAGE033
, among the top
Figure 973672DEST_PATH_IMAGE060
(
Figure 105576DEST_PATH_IMAGE060
is a natural number; in the invention
Figure 834498DEST_PATH_IMAGE060
is preferably set to 10) are considered. This vertex set is denoted
Figure 319837DEST_PATH_IMAGE092
and serves as the third vertex set, where
Figure 348973DEST_PATH_IMAGE093
denotes the 10 vertices in
Figure 120620DEST_PATH_IMAGE090
with the largest number of edges connecting them to the vertices in
Figure 727050DEST_PATH_IMAGE033
, and
Figure 875135DEST_PATH_IMAGE025
denotes any vertex in
Figure 758777DEST_PATH_IMAGE093
.
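The construction of the third vertex set can be sketched in a few lines. This is an illustrative sketch only: the function name, the edge-list input and the tie-breaking rule (lower vertex index first) are assumptions, not details taken from the patent.

```python
from collections import Counter

def candidate_vertices(edges, estimated, top_k=10):
    """Return up to top_k un-estimated vertices with the most edges to `estimated`.

    edges: iterable of (i, j) epipolar geometry edges.
    estimated: vertices whose absolute positions are already estimated.
    Ties are broken by vertex index, which is an arbitrary choice for this sketch.
    """
    estimated = set(estimated)
    counts = Counter()
    for i, j in edges:
        if i in estimated and j not in estimated:
            counts[j] += 1
        elif j in estimated and i not in estimated:
            counts[i] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [v for v, _ in ranked[:top_k]]

edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3), (2, 4), (4, 5)]
third_set = candidate_vertices(edges, estimated={0, 1, 2}, top_k=10)
best_only = candidate_vertices(edges, estimated={0, 1, 2}, top_k=1)
```

Vertex 5 has no edge to the estimated set and is therefore never a candidate, which matches the requirement that each candidate can form camera triplets with already-estimated cameras.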
Step S400, forming camera triplets from each vertex in the third vertex set and vertices in the first vertex set, and calculating the absolute position of each vertex in the third vertex set with a linear trifocal tensor solver; calculating the selection cost of each vertex from the obtained absolute positions, and taking the view corresponding to the vertex with the maximum selection cost as the next optimal view;
In the present embodiment, in
Figure 576692DEST_PATH_IMAGE093
, each vertex
Figure 545785DEST_PATH_IMAGE025
can form multiple camera triplets with the vertices in
Figure 497560DEST_PATH_IMAGE033
, denoted
Figure 363273DEST_PATH_IMAGE094
. Camera triplets are used here to eliminate the scale ambiguity in absolute position estimation. For every element of
Figure 742301DEST_PATH_IMAGE095
, that is, every
Figure 933111DEST_PATH_IMAGE096
, an absolute camera position can be computed for
Figure 298365DEST_PATH_IMAGE025
, denoted
Figure 156599DEST_PATH_IMAGE097
. When the estimated absolute positions
Figure 706529DEST_PATH_IMAGE015
and
Figure 243690DEST_PATH_IMAGE016
and the measured relative positions
Figure 537268DEST_PATH_IMAGE012
Figure 250009DEST_PATH_IMAGE098
and
Figure 377365DEST_PATH_IMAGE099
are known,
Figure 932980DEST_PATH_IMAGE097
can be computed with a linear trifocal tensor solver. Ideally, in the set
Figure 764670DEST_PATH_IMAGE100
, all absolute positions
Figure 066338DEST_PATH_IMAGE097
would be equal, but in practice they are not, because of absolute position estimation errors and relative position measurement errors. Therefore, to select the next optimal view, the selection cost of each absolute position in the set
Figure 958071DEST_PATH_IMAGE101
is calculated by the following formula:
Figure 486135DEST_PATH_IMAGE021
(6)
where
Figure 121516DEST_PATH_IMAGE022
denotes the selection cost of each vertex in next optimal view selection,
Figure 543270DEST_PATH_IMAGE023
is an edge in
Figure 996117DEST_PATH_IMAGE024
,
Figure 870532DEST_PATH_IMAGE024
is the set of epipolar geometry edges between the vertex
Figure 044024DEST_PATH_IMAGE025
in the third vertex set and the vertices in the first vertex set,
Figure 461230DEST_PATH_IMAGE026
denotes the relative position between the two vertices connected by edge
Figure 694766DEST_PATH_IMAGE023
,
Figure 056477DEST_PATH_IMAGE027
denotes the currently estimated absolute position of vertex
Figure 420943DEST_PATH_IMAGE028
in the first vertex set, and
Figure 817290DEST_PATH_IMAGE029
denotes the weight corresponding to each epipolar geometry edge.
Subsequently, from the set
Figure 362672DEST_PATH_IMAGE101
, the representative absolute position
Figure 946100DEST_PATH_IMAGE102
is selected by the following formula:
Figure 726974DEST_PATH_IMAGE103
(7)
where
Figure 977827DEST_PATH_IMAGE104
denotes the index of the representative absolute position in the set
Figure 943378DEST_PATH_IMAGE101
. Finally, the selected next optimal view can be obtained by:
Figure 14102DEST_PATH_IMAGE105
(8)
where
Figure 474033DEST_PATH_IMAGE106
denotes the index of the vertex corresponding to the selected next optimal view, whose absolute position is initialized to
Figure 579392DEST_PATH_IMAGE039
. Because the next optimal view selection strategy proposed by the invention is based on a support set weighted by the recomputed position deviations, it handles relative translation outliers more robustly.
Step S500, fixing the estimated absolute position of the camera in the first vertex set, and only performing weighted local optimization on the absolute position of the vertex corresponding to the newly estimated next optimal view; after the weighted local optimization is completed, judging the growth ratio of the number of vertexes of the current estimated absolute position, and if the ratio is greater than a set threshold, performing weighted global optimization on all the current estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, combining the absolute positions of the vertices in the first vertex set with the measured relative position between the two vertices connected by each edge in the first edge set, and calculating the relative position error corresponding to each edge as a first error; if the first error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then, using the inlier edges and the
Figure 325631DEST_PATH_IMAGE006
norm, performing weighted local optimization on the absolute position of the vertex corresponding to the selected next optimal view; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: combining the absolute positions of all vertices with the measured relative position between each pair of connected vertices, and calculating the relative position error corresponding to each epipolar geometry edge as a second error; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then, using the inlier edges and the
Figure 539444DEST_PATH_IMAGE007
norm, performing weighted global optimization on the absolute positions of all vertices whose absolute positions have been estimated;
after the weighted global optimization is completed, further performing re-translation averaging on all vertices with estimated absolute positions, using the weighted global optimization method;
In this embodiment, after the next optimal view is selected, the absolute position of vertex
Figure 396542DEST_PATH_IMAGE034
is initialized to
Figure 887566DEST_PATH_IMAGE039
. To further improve the accuracy of absolute position estimation, the invention performs local or global optimization on the currently estimated absolute positions. Local optimization optimizes only the most recently estimated absolute position while keeping the other absolute positions fixed; global optimization simultaneously optimizes all estimated absolute positions in the set
Figure 680072DEST_PATH_IMAGE107
. For efficiency, local and global optimization are performed alternately in the invention, and global optimization is performed only when the number of currently estimated absolute positions has grown by a certain ratio
Figure 990968DEST_PATH_IMAGE108
(in the invention,
Figure 651757DEST_PATH_IMAGE108
is preferably set to 50%). As with initial quadruple selection and next optimal view selection, local and global optimization in the invention are also weighted. In addition, to counter the drift problem of incremental estimation, the invention performs re-translation averaging on the local epipolar geometry graph after each weighted global optimization. The weighted local optimization, weighted global optimization and re-translation averaging proceed as follows:
For the weighted local optimization, once the next optimal view has been selected and initialized, the inlier edge set is first determined by the following formula:
Figure 856342DEST_PATH_IMAGE030
(9)
Figure 944384DEST_PATH_IMAGE031
where
Figure 742575DEST_PATH_IMAGE032
denotes the set of edges connecting the first vertex set
Figure 082421DEST_PATH_IMAGE033
and the vertex
Figure 016879DEST_PATH_IMAGE034
corresponding to the selected next optimal view,
Figure 541401DEST_PATH_IMAGE035
is any edge in
Figure 688873DEST_PATH_IMAGE032
,
Figure 957043DEST_PATH_IMAGE036
denotes the set of inlier edges in
Figure 011587DEST_PATH_IMAGE032
,
Figure 316798DEST_PATH_IMAGE037
denotes the relative position between the two cameras connected by edge
Figure 824002DEST_PATH_IMAGE035
,
Figure 630284DEST_PATH_IMAGE015
denotes the currently estimated absolute position of vertex
Figure 663968DEST_PATH_IMAGE038
,
Figure 530293DEST_PATH_IMAGE039
denotes the initialized absolute position of
Figure 259215DEST_PATH_IMAGE034
,
Figure 10133DEST_PATH_IMAGE040
denotes the relative position error corresponding to each epipolar geometry edge, and
Figure 508110DEST_PATH_IMAGE041
denotes the error threshold (that is, a threshold on the angle between two positions), which in the experiments of the invention is set to
Figure 545337DEST_PATH_IMAGE109
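The inlier test described above compares, for each edge, the measured relative direction with the direction implied by the current absolute positions, and keeps the edge when the angle between them is below the threshold. A minimal sketch follows; formula (9) is not reproduced in this text, so the sign convention (the measurement points from the fixed camera to the new camera), the function names and the threshold value used in the example are all assumptions.

```python
import numpy as np

def angular_error_deg(c_i, c_k, t_ik):
    """Angle in degrees between measured direction t_ik and the direction c_i -> c_k."""
    d = np.asarray(c_k, float) - np.asarray(c_i, float)
    d = d / np.linalg.norm(d)
    t = np.asarray(t_ik, float)
    t = t / np.linalg.norm(t)
    return float(np.degrees(np.arccos(np.clip(d @ t, -1.0, 1.0))))

def inlier_edges(positions, c_new, measurements, threshold_deg):
    """Keep the fixed cameras whose edge to the new camera passes the angle test.

    positions: dict {camera index: estimated absolute position (fixed)}.
    measurements: dict {camera index: measured direction towards the new camera}.
    threshold_deg: hypothetical threshold; the patent's value is not shown here.
    """
    return [i for i, t in measurements.items()
            if angular_error_deg(positions[i], c_new, t) < threshold_deg]

positions = {0: [0.0, 0.0, 0.0], 1: [2.0, 0.0, 0.0]}
measurements = {0: [1.0, 1.0, 0.0],   # agrees with the initial position below
                1: [1.0, 0.0, 0.0]}   # grossly wrong direction, an outlier
kept = inlier_edges(positions, c_new=[1.0, 1.0, 0.0],
                    measurements=measurements, threshold_deg=5.0)
```

Only the edges that survive this test take part in the subsequent weighted optimization.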
Then, for vertex
Figure 151767DEST_PATH_IMAGE034
, the absolute position
Figure 299852DEST_PATH_IMAGE044
is refined by weighted local optimization, with the following formula:
Figure 917915DEST_PATH_IMAGE042
(10)
Figure 266988DEST_PATH_IMAGE043
where
Figure 970502DEST_PATH_IMAGE044
denotes the result of weighted local optimization of the absolute position
Figure 922277DEST_PATH_IMAGE045
,
Figure 785060DEST_PATH_IMAGE046
denotes any edge in
Figure 164089DEST_PATH_IMAGE036
, and
Figure 354899DEST_PATH_IMAGE047
denotes the relative position between the two cameras connected by edge
Figure 844786DEST_PATH_IMAGE046
.
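To make the idea of refining a single position against fixed neighbours concrete, here is a simplified stand-in, not the patent's formula (10): the norm and weights used by the invention are not reproduced in this text, so this sketch substitutes a weighted linear least-squares problem in which the new position is pulled towards each measured direction line. The function name, the weights and the direction convention (from the fixed camera to the new camera) are assumptions.

```python
import numpy as np

def refine_position(fixed_positions, directions, weights):
    """Weighted least-squares position of one camera with all neighbours fixed.

    Minimizes sum_i w_i * ||(I - t_i t_i^T)(c - c_i)||^2, where t_i is the unit
    direction measured from fixed camera i towards the new camera. This L2
    formulation is a substitute chosen for illustration only.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c_i, t, w in zip(fixed_positions, directions, weights):
        t = np.asarray(t, float)
        t = t / np.linalg.norm(t)
        P = np.eye(3) - np.outer(t, t)   # projector orthogonal to the direction
        A += w * P
        b += (w * P) @ np.asarray(c_i, float)
    # Requires at least two non-parallel directions, otherwise A is singular.
    return np.linalg.solve(A, b)

c_true = np.array([1.0, 2.0, 3.0])
fixed = [np.array([0.0, 0.0, 0.0]),
         np.array([4.0, 0.0, 0.0]),
         np.array([0.0, 5.0, 1.0])]
dirs = [c_true - c for c in fixed]       # noise-free direction measurements
c_est = refine_position(fixed, dirs, weights=[1.0, 0.5, 2.0])
```

With noise-free directions the estimate coincides with the true position regardless of the weights; with noisy data the weights control how strongly each inlier edge influences the result.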
For weighted global optimization, as in weighted local optimization, it is first necessary to take, from the set
Figure 578386DEST_PATH_IMAGE110
of edges between all vertices with currently estimated absolute positions, the inlier edge set
Figure 862737DEST_PATH_IMAGE111
, by the following formula:
Figure 540843DEST_PATH_IMAGE112
(11)
Figure 221705DEST_PATH_IMAGE113
where
Figure 668867DEST_PATH_IMAGE114
denotes the inlier edge set used in weighted global optimization,
Figure 389698DEST_PATH_IMAGE110
denotes the set of edges between all vertices with estimated absolute positions,
Figure 430466DEST_PATH_IMAGE013
denotes any edge in
Figure 262156DEST_PATH_IMAGE110
,
Figure 829404DEST_PATH_IMAGE012
denotes the relative position between the two cameras connected by edge
Figure 111349DEST_PATH_IMAGE013
, and
Figure 498468DEST_PATH_IMAGE015
denotes the current absolute position estimate of vertex
Figure 133849DEST_PATH_IMAGE038
.
Then, weighted global optimization is performed on the set
Figure 696548DEST_PATH_IMAGE107
, with the following formula:
Figure 759182DEST_PATH_IMAGE115
(12)
Figure 633597DEST_PATH_IMAGE053
where
Figure 931724DEST_PATH_IMAGE116
denotes the result of weighted global optimization of the set of absolute positions
Figure 473563DEST_PATH_IMAGE055
,
Figure 707099DEST_PATH_IMAGE056
is any edge in
Figure 944176DEST_PATH_IMAGE114
, and
Figure 186939DEST_PATH_IMAGE058
denotes the relative position between the two cameras connected by edge
Figure 317706DEST_PATH_IMAGE056
.
After the weighted global optimization, re-translation averaging takes the set of absolute positions obtained by the weighted global optimization
Figure 112355DEST_PATH_IMAGE054
, determines the inlier edge set again with the inlier formula of the weighted global optimization, and optimizes the currently estimated absolute positions once more, further improving the accuracy and robustness of the method.
Step S600, after all absolute positions have been estimated, outputting the absolute positions obtained by applying weighted global optimization and re-translation averaging to all estimable vertices as the final estimate of the absolute position of each camera.
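The alternation between local optimization, global optimization and re-translation averaging described in this step can be sketched as a simple schedule. The growth ratio is interpreted here as the increase in the number of estimated vertices since the last global optimization, divided by the count at that time; this interpretation, the function name and the logged operation labels are assumptions, and the final global pass of step S600 is left out.

```python
def run_schedule(num_vertices, seed_size=4, growth_threshold=0.5):
    """Return the sequence of optimization operations as (operation, count) pairs.

    Each newly added vertex triggers a local optimization; a global optimization
    followed by re-translation averaging is triggered only when the number of
    estimated vertices has grown by more than growth_threshold since the last one.
    """
    log = []
    estimated = seed_size       # the seed quadruple is already estimated
    last_global = seed_size
    while estimated < num_vertices:
        estimated += 1          # next optimal view selected and initialized
        log.append(("local", estimated))
        if (estimated - last_global) / last_global > growth_threshold:
            log.append(("global", estimated))
            log.append(("re-average", estimated))
            last_global = estimated
    return log

log = run_schedule(num_vertices=11)
global_steps = [n for op, n in log if op == "global"]
local_steps = [n for op, n in log if op == "local"]
```

Because the trigger is relative, global optimizations become rarer as the reconstruction grows, which is the efficiency argument behind the 50% setting.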
And acquiring the three-dimensional coordinates of the point cloud under the global coordinate system according to the optimized absolute position of the camera to obtain a sparse reconstruction result. On the basis, a final three-dimensional model can be generated through the steps of dense reconstruction, point cloud modeling and the like.
In addition, to verify the effect of the invention, we performed test experiments on the 1DSfM dataset; information about the dataset is listed in Table 1. ALM (Alamo), ELS (Ellis Island), MDR (Madrid Metropolis), MND (Montreal Notre Dame), NYC (NYC Library), PDP (Piazza del Popolo), PIC (Piccadilly), ROF (Roman Forum), TOL (Tower of London), USQ (Union Square), VNC (Vienna Cathedral) and YKM (Yorkminster) denote the individual data sets, collectively referred to as the 1DSfM dataset; see: K. Wilson and N. Snavely. Robust global translations with 1DSfM. In European Conference on Computer Vision (ECCV), pages 61-75, 2014. In the experiments, the Bundler calibration results are used as the ground truth for the absolute camera positions, and the median error of the absolute position estimates is used as the evaluation metric.
TABLE 1
Figure 961362DEST_PATH_IMAGE118
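The evaluation metric mentioned above, the median error of the estimated absolute positions, can be computed as in the following sketch. It assumes the estimated positions have already been aligned to the reference coordinate frame (the alignment procedure is not described in this text), and the function name is hypothetical.

```python
import numpy as np

def median_position_error(estimated, reference):
    """Median Euclidean distance between estimated and reference camera positions.

    Both inputs are (n, 3) arrays in the same, already aligned, coordinate frame.
    """
    est = np.asarray(estimated, float)
    ref = np.asarray(reference, float)
    return float(np.median(np.linalg.norm(est - ref, axis=1)))

reference = np.zeros((3, 3))
estimated = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 3.0, 4.0]])
err = median_position_error(estimated, reference)
```

The median is less sensitive than the mean to a few badly estimated cameras, which makes it a common summary for this kind of benchmark.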
To verify the effectiveness of the key techniques proposed by the invention, several ablation experiments were carried out, removing in turn: initial quadruple selection based on local optimization (ablation one), next optimal view selection based on a weighted support set (ablation two), weighting (ablation three), re-translation averaging (ablation four), weighted global optimization (ablation five) and weighted local optimization (ablation six). The six cases are briefly described as follows:
1) without initial quadruple selection based on local optimization, the initial seed view is the camera triplet with the smallest rotation cycle deviation;
2) without next optimal view selection based on the weighted support set, the next optimal view is the camera with the largest number of edges connecting it to the cameras with currently estimated absolute positions;
3) without weighting, all relative translation measurements are treated equally;
4) without re-translation averaging, no re-translation averaging is performed after each weighted global optimization;
5) without weighted global optimization, neither weighted global optimization nor re-translation averaging is performed during the incremental absolute position computation;
6) without weighted local optimization, no optimization is performed during the incremental absolute position computation, and each absolute position keeps the initial value assigned after the next optimal view is selected.
The results of the ablation experiments are shown in Table 2. For most test data, the translation averaging estimation error increases in every ablation experiment, which shows that each of the key techniques proposed in the invention improves the accuracy and robustness of the method.
TABLE 2
Figure 742237DEST_PATH_IMAGE119
In comparative experiments, we compared the method of the present invention with five other methods, corresponding documents of which are:
[1] Z. Cui and P. Tan. Global structure-from-motion by similarity averaging. In IEEE International Conference on Computer Vision (ICCV), pages 864–872, 2015.
[2] C. Sweeney, T. Sattler, T. Höllerer, M. Turk, and M. Pollefeys. Optimizing the viewing graph for structure-from-motion. In IEEE International Conference on Computer Vision (ICCV), pages 801–809, 2015.
[3] T. Goldstein, P. Hand, C. Lee, V. Voroninski, and S. Soatto. ShapeFit and ShapeKick for robust, scalable structure from motion. In European Conference on Computer Vision (ECCV), pages 289–304, 2016.
[4] B. Zhuang, L. Cheong, and G. H. Lee. Baseline desensitizing in translation averaging. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4539–4547, 2018.
[5] Y. Kasten, A. Geifman, M. Galun, and R. Basri. Algebraic characterization of essential matrices and their averaging in multiview settings. In IEEE/CVF International Conference on Computer Vision (ICCV), pages 5894–5902, 2019.
The results of the comparative experiments are shown in Table 3. Among all the compared methods, the translation averaging method proposed by the invention achieves the best overall performance in terms of accuracy and robustness. In terms of accuracy, only the first and fourth compared methods come close to the invention, but these methods require additional information or initialization, such as feature tracks, local bundle adjustment or more accurate initial values, and their solving processes are more complicated.
TABLE 3
Figure 602876DEST_PATH_IMAGE120
An incremental translation averaging system according to a second embodiment of the present invention, as shown in fig. 2, comprises: an epipolar geometry graph construction module 100, an initial seed view selection module 200, a set construction module 300, a next optimal view selection module 400, an optimization module 500 and an absolute position estimation output module 600;
the epipolar geometry graph construction module 100 is configured to acquire multiple frames of images, perform feature matching between every pair of images, and construct an epipolar geometry graph from the epipolar geometric relationships
Figure 443793DEST_PATH_IMAGE121
and then calculate the relative rotation and relative translation between the matched image pairs; wherein
Figure 780097DEST_PATH_IMAGE122
is the vertex set, representing the set of cameras that captured the images, and
Figure 226646DEST_PATH_IMAGE123
is the edge set, representing the set of epipolar geometry edges between pairs of cameras that captured different images, which carries the motion information between the cameras;
the initial seed view selection module 200 is configured to select, in the epipolar geometry graph, the top
Figure 597585DEST_PATH_IMAGE124
edges with the largest number of feature matches, construct a quadruple set from the camera quadruples formed by these edges, and calculate the absolute position, in a local coordinate system, of each camera in each camera quadruple of the quadruple set; and to calculate the selection cost of each camera quadruple in the quadruple set from the absolute positions of its cameras, and take the view corresponding to the camera quadruple with the maximum selection cost as the initial seed view;
the set construction module 300 is configured to construct, based on the vertices corresponding to the initial seed view, the set of vertices whose absolute positions have been estimated as a first vertex set, and to take the set of vertices in the epipolar geometry graph whose absolute positions have not been estimated as a second vertex set; and to select, from the second vertex set, the top
Figure 343824DEST_PATH_IMAGE125
vertices having the largest number of edges connecting them to the vertices in the first vertex set, so as to construct a third vertex set;
the next optimal view selection module 400 is configured to form camera triplets from each vertex in the third vertex set and vertices in the first vertex set, and to calculate the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; and to calculate the selection cost of each vertex from the obtained absolute position and take the view corresponding to the vertex with the maximum selection cost as the next optimal view;
the optimization module 500 is configured to fix the estimated absolute positions of the cameras in the first vertex set and perform weighted local optimization only on the absolute position of the vertex corresponding to the most recently estimated next optimal view; after the weighted local optimization is completed, to determine the growth ratio of the number of vertices whose absolute positions have currently been estimated, and, if the ratio is greater than a set threshold, to perform weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, calculating the relative position error corresponding to each edge in a first edge set as a first error, by combining the absolute positions in the first vertex set with the relative position measured between the two vertices connected by that edge; if the first error is less than a set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 42790DEST_PATH_IMAGE126
norm, weighted local optimization on the absolute position of the vertex corresponding to the selected next optimal view; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: calculating the relative position error corresponding to each epipolar geometry edge as a second error, by combining the absolute positions of all vertices with the relative positions measured between pairs of vertices; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 899887DEST_PATH_IMAGE126
norm, weighted global optimization on the absolute positions of all vertices whose absolute positions have been estimated;
after the weighted global optimization is completed, re-translation averaging is further performed, by the weighted global optimization method, on all vertices whose absolute positions have been estimated;
the absolute position estimation output module 600 is configured to, after the estimation of all absolute positions is completed, output the absolute positions obtained by performing weighted global optimization and re-translation averaging on all estimable vertices as the final estimates of the absolute positions of the cameras.
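As a non-limiting illustration, the candidate selection performed by the set construction module and the growth-ratio check performed by the optimization module can be sketched as follows. The function names, the edge-list representation of the graph, and the definition of the growth ratio relative to the vertex count at the last weighted global optimization are assumptions made for this sketch; the embodiment does not specify them.

```python
from collections import Counter


def select_candidates(edges, estimated, k):
    """Return up to k un-estimated vertices, ranked by how many edges
    connect each of them to the already-estimated (first) vertex set.

    edges     : iterable of (i, j) vertex-id pairs (undirected)
    estimated : set of vertex ids whose absolute positions are estimated
    k         : size of the third (candidate) vertex set; hypothetical name
    """
    counts = Counter()
    for i, j in edges:
        if i in estimated and j not in estimated:
            counts[j] += 1
        elif j in estimated and i not in estimated:
            counts[i] += 1
    # Sort by connection count (descending), then by id for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [v for v, _ in ranked[:k]]


def needs_global_optimization(num_now, num_at_last_global, ratio_threshold):
    """Assumed definition: relative growth of the estimated-vertex count
    since the last weighted global optimization exceeds the threshold."""
    growth = (num_now - num_at_last_global) / num_at_last_global
    return growth > ratio_threshold


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (0, 4), (1, 4), (2, 4), (3, 5)]
    print(select_candidates(edges, {0, 1, 2}, 2))
    print(needs_global_optimization(5, 4, 0.2))
```

In this sketch a vertex that has no edge to the first vertex set is never returned as a candidate, which matches the requirement that a camera triplet can be formed with vertices of the first vertex set.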
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
It should be noted that, the incremental translational averaging system provided in the foregoing embodiment is only illustrated by dividing the functional modules, and in practical applications, the above function allocation may be completed by different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
An electronic device according to a third embodiment of the present invention includes at least one processor; and a memory communicatively coupled to at least one of the processors; wherein the memory stores instructions executable by the processor for execution by the processor to implement the incremental translational averaging method as recited in the claims.
A computer-readable storage medium of a fourth embodiment of the present invention stores computer instructions for execution by the computer to implement the incremental translational averaging method recited in the claims above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the above-described apparatuses and computer-readable storage media may refer to the corresponding processes in the foregoing method examples, and are not described herein again.
Referring now to FIG. 4, there is illustrated a block diagram of a computer system suitable for use as a server in implementing embodiments of the method, system, and apparatus of the present application. The server shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 4, the computer system includes a Central Processing Unit (CPU) 401 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for system operation are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An Input/Output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 410 as necessary, so that a computer program read out therefrom is installed into the storage section 408 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 401. It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (10)

1. An incremental translational averaging method, applied to the global solution of absolute camera positions in structure from motion, characterized in that the method comprises the following steps:
step S100, acquiring a plurality of image frames, performing feature matching between every pair of images, and constructing an epipolar geometry graph according to the epipolar geometric relationships
Figure DEST_PATH_IMAGE001
and then calculating the relative rotation and relative translation between the matched image pairs; wherein
Figure 927637DEST_PATH_IMAGE002
is the set of vertices, representing the set of cameras that capture the scene images, and
Figure DEST_PATH_IMAGE003
is the set of edges, representing the set of epipolar geometry edges between pairs of cameras that capture different images, which contain the motion information between the cameras;
step S200, selecting the camera quadruples formed by the top
Figure 665786DEST_PATH_IMAGE004
edges with the largest number of feature matches in the epipolar geometry graph, and constructing a quadruple set from them; calculating the absolute position, in a local coordinate system, of each camera in each camera quadruple of the quadruple set; calculating the selection cost of each camera quadruple in the quadruple set from these absolute positions, and taking the view corresponding to the camera quadruple with the maximum selection cost as the initial seed view;
step S300, constructing, based on the vertices corresponding to the initial seed view, the set of vertices whose absolute positions have been estimated as a first vertex set, and taking the set of vertices in the epipolar geometry graph whose absolute positions have not been estimated as a second vertex set; selecting, from the second vertex set, the top
Figure DEST_PATH_IMAGE005
vertices having the largest number of edges connecting them to the vertices in the first vertex set, so as to construct a third vertex set;
step S400, forming camera triplets from each vertex in the third vertex set and vertices in the first vertex set, and calculating the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; calculating the selection cost of each vertex from the obtained absolute position, and taking the view corresponding to the vertex with the maximum selection cost as the next optimal view;
step S500, fixing the estimated absolute positions of the cameras in the first vertex set, and performing weighted local optimization only on the absolute position of the vertex corresponding to the most recently estimated next optimal view; after the weighted local optimization is completed, determining the growth ratio of the number of vertices whose absolute positions have currently been estimated, and, if the ratio is greater than a set threshold, performing weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, calculating the relative position error corresponding to each edge in a first edge set as a first error, by combining the absolute positions in the first vertex set with the relative position measured between the two vertices connected by that edge; if the first error is less than a set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 107131DEST_PATH_IMAGE006
norm, weighted local optimization on the absolute position of the vertex corresponding to the selected next optimal view; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: calculating the relative position error corresponding to each epipolar geometry edge as a second error, by combining the absolute positions of all vertices with the relative positions measured between pairs of vertices; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 766783DEST_PATH_IMAGE006
norm, weighted global optimization on the absolute positions of all vertices whose absolute positions have been estimated;
after the weighted global optimization is completed, re-translation averaging is further performed, by the weighted global optimization method, on all vertices whose absolute positions have been estimated;
and step S600, after the estimation of all absolute positions is completed, outputting the absolute positions obtained by performing weighted global optimization and re-translation averaging on all estimable vertices as the final estimates of the absolute positions of the cameras.
2. The incremental translational averaging method according to claim 1, wherein the absolute position, in the local coordinate system, of each camera in each camera quadruple of the quadruple set is calculated by:
Figure 787828DEST_PATH_IMAGE008
wherein
Figure DEST_PATH_IMAGE009
respectively denote, for the cameras
Figure 114905DEST_PATH_IMAGE010
the absolute positions in the local coordinate system after optimization,
Figure DEST_PATH_IMAGE011
denotes the squared chordal distance,
Figure 930414DEST_PATH_IMAGE012
denotes the relative translation transformed into the global coordinate system,
Figure DEST_PATH_IMAGE013
denotes, in the edge set
Figure 673766DEST_PATH_IMAGE003
any one of the edges,
Figure 232924DEST_PATH_IMAGE014
denotes any camera quadruple in the quadruple set,
Figure DEST_PATH_IMAGE015
Figure 945665DEST_PATH_IMAGE016
denotes, for camera
Figure DEST_PATH_IMAGE017
the initial absolute position.
3. The incremental translational averaging method according to claim 2, wherein the selection cost of each camera quadruple in the quadruple set is calculated by:
Figure DEST_PATH_IMAGE019
wherein
Figure 197655DEST_PATH_IMAGE020
denotes the selection cost of the camera quadruple, and
Figure DEST_PATH_IMAGE021
denotes the weight corresponding to each epipolar geometry edge in the camera quadruple.
4. The incremental translational averaging method according to claim 1, wherein the selection cost of each vertex is calculated from the obtained absolute position by:
Figure DEST_PATH_IMAGE023
wherein
Figure 894215DEST_PATH_IMAGE024
denotes the selection cost of each vertex in the next optimal view selection,
Figure DEST_PATH_IMAGE025
is, in the edge set
Figure 788222DEST_PATH_IMAGE026
any one of the edges,
Figure 824311DEST_PATH_IMAGE026
is the set of epipolar geometry edges between the vertex
Figure DEST_PATH_IMAGE027
in the third vertex set and the vertices in the first vertex set,
Figure 512781DEST_PATH_IMAGE028
denotes, for edge
Figure 634321DEST_PATH_IMAGE025
the corresponding relative position between its two vertices,
Figure DEST_PATH_IMAGE029
denotes, for the vertex
Figure 332019DEST_PATH_IMAGE030
in the first vertex set, the current estimate of its absolute position, and
Figure DEST_PATH_IMAGE031
denotes the weight corresponding to each epipolar geometry edge.
5. The incremental translational averaging method according to claim 2, wherein the inlier edges in the weighted local optimization are obtained by:
Figure DEST_PATH_IMAGE033
Figure DEST_PATH_IMAGE035
wherein
Figure 19352DEST_PATH_IMAGE036
denotes, between the first vertex set
Figure DEST_PATH_IMAGE037
and the vertex corresponding to the selected next optimal view
Figure 141373DEST_PATH_IMAGE038
the set of epipolar geometry edges,
Figure DEST_PATH_IMAGE039
is, in the edge set
Figure 015788DEST_PATH_IMAGE036
any one of the edges,
Figure 986018DEST_PATH_IMAGE040
denotes, within
Figure 527858DEST_PATH_IMAGE036
the set of inlier edges,
Figure DEST_PATH_IMAGE041
denotes, for edge
Figure 26972DEST_PATH_IMAGE039
the relative position between the two cameras it connects,
Figure 654263DEST_PATH_IMAGE015
denotes, for vertex
Figure 100288DEST_PATH_IMAGE042
the current estimate of its absolute position,
Figure DEST_PATH_IMAGE043
denotes, for vertex
Figure 293372DEST_PATH_IMAGE038
the initialized absolute position,
Figure 166650DEST_PATH_IMAGE044
denotes the relative position error corresponding to each edge, and
Figure DEST_PATH_IMAGE045
denotes the set error threshold.
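As a non-limiting illustration of the inlier test described above, the edges between the first vertex set and the newly selected vertex can be filtered as sketched below. The error formula itself appears only as an image in this text, so the sketch assumes a chordal distance between the measured unit direction and the direction implied by the current absolute positions; the function names and the dictionary-based data layout are likewise hypothetical.

```python
import numpy as np


def relative_position_error(c_i, c_j, t_ij):
    """Assumed error: chordal distance between the unit direction from
    c_i to c_j and the measured relative direction t_ij."""
    d = np.asarray(c_j, dtype=float) - np.asarray(c_i, dtype=float)
    n = np.linalg.norm(d)
    if n == 0.0:
        return float("inf")  # coincident positions carry no direction
    t = np.asarray(t_ij, dtype=float)
    t = t / np.linalg.norm(t)
    return float(np.linalg.norm(d / n - t))


def inlier_edges(new_vertex, new_position, positions, measurements, threshold):
    """Keep the edges (i, new_vertex) whose error is below the threshold.

    positions    : dict, estimated vertex id -> fixed absolute position
    measurements : dict, estimated vertex id -> measured direction from
                   that vertex to new_vertex
    """
    return [
        (i, new_vertex)
        for i, t in measurements.items()
        if relative_position_error(positions[i], new_position, t) < threshold
    ]


if __name__ == "__main__":
    positions = {0: [0.0, 0.0, 0.0], 1: [1.0, 0.0, 0.0]}
    measurements = {0: [0.0, 1.0, 0.0], 1: [1.0, 0.0, 0.0]}
    print(inlier_edges(2, [0.0, 1.0, 0.0], positions, measurements, 0.1))
```

Only the edges returned by `inlier_edges` would then enter the weighted local optimization of the new vertex, while the positions in `positions` stay fixed.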
6. The incremental translational averaging method according to claim 5, wherein the weighted local optimization, based on the
Figure 077974DEST_PATH_IMAGE006
norm, of the absolute position of the vertex corresponding to the selected next optimal view is performed by:
Figure DEST_PATH_IMAGE047
Figure DEST_PATH_IMAGE049
wherein
Figure 390007DEST_PATH_IMAGE050
denotes, for the absolute position
Figure DEST_PATH_IMAGE051
the result after weighted local optimization,
Figure 640859DEST_PATH_IMAGE052
denotes, in the inlier edge set
Figure 747356DEST_PATH_IMAGE040
any one of the edges, and
Figure DEST_PATH_IMAGE053
denotes, for edge
Figure 083659DEST_PATH_IMAGE052
the relative position between the two cameras it connects.
7. The incremental translational averaging method according to claim 6, wherein the inlier edges in the weighted global optimization are obtained by:
Figure DEST_PATH_IMAGE055
Figure DEST_PATH_IMAGE057
wherein
Figure 202312DEST_PATH_IMAGE058
denotes the set of inlier edges in the weighted global optimization,
Figure DEST_PATH_IMAGE059
denotes the set of edges between all vertices whose absolute positions have been estimated,
Figure 369988DEST_PATH_IMAGE013
denotes, in the edge set
Figure 585069DEST_PATH_IMAGE059
any one of the edges,
Figure 408669DEST_PATH_IMAGE012
denotes, for edge
Figure 62504DEST_PATH_IMAGE013
the relative position between the two cameras it connects, and
Figure 491211DEST_PATH_IMAGE015
denotes, for vertex
Figure 673931DEST_PATH_IMAGE042
the current estimate of its absolute position.
8. The incremental translational averaging method according to claim 7, wherein the weighted global optimization, based on the
Figure 515985DEST_PATH_IMAGE006
norm, of the absolute positions of all vertices whose absolute positions have been estimated is performed by:
Figure DEST_PATH_IMAGE061
Figure DEST_PATH_IMAGE063
wherein
Figure 973511DEST_PATH_IMAGE064
denotes, for the set of absolute positions
Figure DEST_PATH_IMAGE065
the result after weighted global optimization,
Figure 787883DEST_PATH_IMAGE066
is, in the edge set
Figure DEST_PATH_IMAGE067
any one of the edges, and
Figure 407083DEST_PATH_IMAGE068
denotes, for edge
Figure 205275DEST_PATH_IMAGE066
the relative position between the two cameras it connects.
9. An incremental translational averaging system, characterized in that the system comprises: an epipolar geometry graph construction module, an initial seed view selection module, a set construction module, a next optimal view selection module, an optimization module, and an absolute position estimation output module;
the epipolar geometry graph construction module is configured to acquire a plurality of image frames, perform feature matching between every pair of images, and construct an epipolar geometry graph according to the epipolar geometric relationships
Figure 200913DEST_PATH_IMAGE001
and then to calculate the relative rotation and relative translation between the matched image pairs; wherein
Figure 338633DEST_PATH_IMAGE002
is the set of vertices, representing the set of cameras that capture the scene images, and
Figure 128735DEST_PATH_IMAGE003
is the set of edges, representing the set of epipolar geometry edges between pairs of cameras that capture different images, which contain the motion information between the cameras;
the initial seed view selection module is configured to select the camera quadruples formed by the top
Figure 679802DEST_PATH_IMAGE004
edges with the largest number of feature matches in the epipolar geometry graph, and to construct a quadruple set from them; to calculate the absolute position, in a local coordinate system, of each camera in each camera quadruple of the quadruple set; and to calculate the selection cost of each camera quadruple in the quadruple set from these absolute positions and take the view corresponding to the camera quadruple with the maximum selection cost as the initial seed view;
the set construction module is configured to take the set of vertices in the epipolar geometry graph whose absolute positions have been estimated as a first vertex set, and the set of vertices whose absolute positions have not been estimated as a second vertex set; and to select, from the second vertex set, the top
Figure 151234DEST_PATH_IMAGE005
vertices having the largest number of edges connecting them to the vertices in the first vertex set, so as to construct a third vertex set;
the next optimal view selection module is configured to form camera triplets from each vertex in the third vertex set and vertices in the first vertex set, and to calculate the absolute position of each vertex in the third vertex set by a linear trifocal tensor solution; and to calculate the selection cost of each vertex from the obtained absolute position and take the view corresponding to the vertex with the maximum selection cost as the next optimal view;
the optimization module is configured to fix the estimated absolute positions of the cameras in the first vertex set and perform weighted local optimization only on the absolute position of the vertex corresponding to the most recently estimated next optimal view; after the weighted local optimization is completed, to determine the growth ratio of the number of vertices whose absolute positions have currently been estimated, and, if the ratio is greater than a set threshold, to perform weighted global optimization on all currently estimated absolute positions;
the weighted local optimization is: for the absolute position of the vertex corresponding to the selected next optimal view, calculating the relative position error corresponding to each edge in a first edge set as a first error, by combining the absolute positions in the first vertex set with the relative position measured between the two vertices connected by that edge; if the first error is less than a set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 202848DEST_PATH_IMAGE006
norm, weighted local optimization on the absolute position of the vertex corresponding to the selected next optimal view; the first edge set is the set of epipolar geometry edges between the first vertex set and the vertex corresponding to the selected next optimal view;
the weighted global optimization is: calculating the relative position error corresponding to each epipolar geometry edge as a second error, by combining the absolute positions of all vertices with the relative positions measured between pairs of vertices; if the second error is less than the set error threshold, taking the corresponding edge as an inlier edge, and then performing, using the inlier edges and based on the
Figure 429430DEST_PATH_IMAGE006
norm, weighted global optimization on the absolute positions of all vertices whose absolute positions have been estimated;
after the weighted global optimization is completed, re-translation averaging is further performed, by the weighted global optimization method, on all vertices whose absolute positions have been estimated;
and the absolute position estimation output module is configured to, after the estimation of all absolute positions is completed, output the absolute positions obtained by performing weighted global optimization and re-translation averaging on all estimable vertices as the final estimates of the absolute positions of the cameras.
10. An electronic device, comprising:
at least one processor; and a memory communicatively coupled to at least one of the processors; wherein the memory stores instructions executable by the processor for execution by the processor to implement the incremental translational averaging method of any one of claims 1-8.
CN202110992939.0A 2021-08-27 2021-08-27 Incremental translational averaging method, system and equipment Active CN113436230B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110992939.0A CN113436230B (en) 2021-08-27 2021-08-27 Incremental translational averaging method, system and equipment


Publications (2)

Publication Number Publication Date
CN113436230A CN113436230A (en) 2021-09-24
CN113436230B true CN113436230B (en) 2021-11-19

Family

ID=77798214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110992939.0A Active CN113436230B (en) 2021-08-27 2021-08-27 Incremental translational averaging method, system and equipment

Country Status (1)

Country Link
CN (1) CN113436230B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109166171A (en) * 2018-08-09 2019-01-08 西北工业大学 Motion recovery structure three-dimensional reconstruction method based on global and incremental estimation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6307959B1 (en) * 1999-07-14 2001-10-23 Sarnoff Corporation Method and apparatus for estimating scene structure and ego-motion from multiple images of a scene using correlation
CN110580720B (en) * 2019-08-29 2023-05-12 天津大学 Panorama-based camera pose estimation method


Also Published As

Publication number Publication date
CN113436230A (en) 2021-09-24

Similar Documents

Publication Publication Date Title
CN110363858B (en) Three-dimensional face reconstruction method and system
Pollefeys Visual 3D Modeling from Images.
US10810718B2 (en) Method and device for three-dimensional reconstruction
WO2022206020A1 (en) Method and apparatus for estimating depth of field of image, and terminal device and storage medium
US20130272581A1 (en) Method and apparatus for solving position and orientation from correlated point features in images
US10755139B2 (en) Random sample consensus for groups of data
CN107358629A (en) Indoor mapping and localization method based on target recognition
Eichhardt et al. Affine correspondences between central cameras for rapid relative pose estimation
WO2018205164A1 (en) Method and system for three-dimensional model reconstruction
CN111612731B (en) Measuring method, device, system and medium based on binocular microscopic vision
CN114429555A (en) Coarse-to-fine dense image matching method, system, equipment and storage medium
CN115082540A (en) Multi-view depth estimation method and device suitable for unmanned aerial vehicle platform
Wang et al. LRRU: Long-short range recurrent updating networks for depth completion
CN112270748B (en) Three-dimensional reconstruction method and device based on image
JP4850768B2 (en) Apparatus and program for reconstructing 3D human face surface data
CN113436230B (en) Incremental translational averaging method, system and equipment
CN114255279A (en) Binocular vision three-dimensional reconstruction method based on high-precision positioning and deep learning
CN116402867A (en) Three-dimensional reconstruction image alignment method for fusing SIFT and RANSAC
CN114399547B (en) Monocular SLAM robust initialization method based on multiframe
Mahmoud et al. Fast 3D structure from motion with missing points from registration of partial reconstructions
Ding et al. Relative pose from a calibrated and an uncalibrated smartphone image
Yin et al. Initializing and accelerating Stereo-DIC computation using semi-global matching with geometric constraints
JP2008261756A (en) Device and program for presuming three-dimensional head posture in real time from stereo image pair
Pernek et al. Automatic focal length estimation as an eigenvalue problem
Banno et al. Estimation of F-Matrix and image rectification by double quaternion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant