US20130265443A1 - Nonlinear Self-Calibration for Structure From Motion (SFM) Techniques - Google Patents
- Publication number
- US20130265443A1 (application US13/724,973)
- Authority
- US
- United States
- Prior art keywords
- focal length
- reconstruction
- calibration
- initial values
- recited
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/579—Depth or shape recovery from multiple images from motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/002—Diagnosis, testing or measuring for television systems or their details for television cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/04—Indexing scheme for image data processing or generation, in general involving 3D image data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20164—Salient point detection; Corner detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Definitions
- SFM Structure from Motion
- a task or goal is to estimate the camera motion from a set of point correspondences in a set of images or video frames.
- Obtaining Structure from Motion (SFM) algorithms is of importance because a successful SFM algorithm would enable a wide range of applications in different domains including 3D image-based modeling and rendering, video stabilization, panorama stitching, video augmentation, vision based robot navigation, human-computer interaction, etc.
- a task or goal is to estimate the camera motion (which may, but does not necessarily, have both translation and rotation components) from a set of point correspondences in a set of images or video frames.
- intrinsic camera parameters e.g., focal length
- reconstruction Performing the task of estimating camera motion and intrinsic parameters for a frame or a sequence of frames may be referred to as reconstruction.
- a reconstruction algorithm or technique (which may also be referred to as an SFM technique) may be implemented and applied to estimate the camera motion and intrinsic parameters for image sequences.
- Embodiments of a nonlinear self-calibration technique are described that may, for example, be used in various SFM techniques.
- embodiments of the self-calibration technique may use a nonlinear least squares optimization technique to infer the parameters.
- a technique is described for initializing the parameters for the nonlinear optimization.
- Embodiments of the self-calibration technique may be robust (i.e., may generally produce reliable results), and can make full use of prior knowledge if available.
- embodiments of the nonlinear self-calibration technique work for both constant focal length and varying focal length.
- Embodiments of the nonlinear self-calibration technique may use prior knowledge of the camera intrinsic parameters (e.g., focal length). For instance, if the user knows the focal length or if the focal length is known through metadata of the captured images in the sequence, the known focal length may be used in the formulation to provide reliable calibration results (e.g., motion parameters). However, having such prior knowledge would not make much difference in most conventional linear self-calibration methods.
- Embodiments of the nonlinear self-calibration technique may be robust and efficient when compared to conventional self-calibration techniques. In particular, the nonlinear optimization problem that is solved may be sparse and may be implemented efficiently.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive technique that iteratively selects and reconstructs keyframes to fully cover an image sequence; the technique may, for example, be used in an adaptive reconstruction algorithm implemented by a general SFM technique.
- a projective reconstruction technique may at least initially be applied, and the self-calibration technique may then be applied to generate a Euclidean reconstruction.
- Embodiments of the nonlinear self-calibration technique may thus allow a metric (Euclidean) reconstruction to be obtained where otherwise only a projective reconstruction could be obtained.
- a projective reconstruction may be unfit for many practical applications.
- nonlinear self-calibration technique may be used in other SFM applications or techniques, or in any other application or technique that requires a self-calibration operation to be performed on input image(s).
- FIG. 1 is a high-level flowchart of a nonlinear self-calibration technique, according to at least some embodiments.
- FIG. 2 is a high-level flowchart of a general 3D Structure from Motion (SFM) technique, according to at least some embodiments.
- FIG. 3 is a flowchart of an adaptive technique for iteratively selecting and reconstructing additional keyframes to fully cover the image sequence that may be used in a general adaptive reconstruction algorithm, for example as implemented by a general 3D SFM technique, according to at least some embodiments.
- FIG. 4 is a flowchart of a self-calibration technique that may be applied in the adaptive technique for iteratively selecting and reconstructing additional keyframes, according to at least some embodiments.
- FIG. 5 illustrates a module that may implement one or more of the Structure from Motion (SFM) techniques and algorithms as described herein, including but not limited to the nonlinear self-calibration technique, according to at least some embodiments.
- FIG. 6 illustrates an example computer system that may be used in embodiments.
- such quantities may take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared or otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to such signals as bits, data, values, elements, symbols, characters, terms, numbers, numerals or the like. It should be understood, however, that all of these or similar terms are to be associated with appropriate physical quantities and are merely convenient labels. Unless specifically stated otherwise, as apparent from the following discussion, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining” or the like refer to actions or processes of a specific apparatus, such as a special purpose computer or a similar special purpose electronic computing device.
- a special purpose computer or a similar special purpose electronic computing device is capable of manipulating or transforming signals, typically represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the special purpose computer or similar special purpose electronic computing device.
- SFM technique a reconstruction algorithm or technique
- a distinct camera may be assumed for each image or frame in an image sequence.
- each frame or image in a sequence may be referred to as a “camera.”
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive reconstruction algorithm implemented by a general SFM technique.
- Example embodiments of an adaptive reconstruction algorithm that may be implemented in a general SFM technique and that leverages the nonlinear self-calibration techniques are described herein.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in the adaptive reconstruction algorithm to obtain a metric (Euclidean) reconstruction where otherwise only a projective reconstruction could be obtained.
- embodiments of the nonlinear self-calibration technique may be used in other SFM applications or techniques, or in any other application or technique that requires a self-calibration operation to be performed on input image(s).
- Embodiments of the nonlinear self-calibration technique may allow a metric (Euclidean) reconstruction to be obtained where otherwise only a projective reconstruction could be obtained.
- a projective reconstruction may be unfit for many practical applications. For instance, it is difficult if not impossible to insert a virtual object into a moving video using a projective reconstruction.
- FIG. 1 is a high-level flowchart of a nonlinear self-calibration technique, according to at least some embodiments.
- N input images and a projective reconstruction for each image may be obtained.
- at least two sets of initial values may be determined for an equation to be optimized according to a nonlinear optimization technique to generate a metric reconstruction for the set of N images.
- the equation may then be optimized using each set of initial values according to the nonlinear optimization technique.
- the result with a smaller cost may be selected.
- the metric reconstruction is output.
- the output may include, but is not limited to, camera intrinsic parameters (e.g., focal length) and camera motion parameters (e.g., rotation and translation values) for the N images.
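The selection step of FIG. 1 (optimize once per set of initial values, then keep the result with the smaller cost) can be sketched as follows. This is only an illustration under stated assumptions: `toy_cost`, `toy_optimize`, and `best_of_starts` are hypothetical names, the cost is a one-dimensional toy function with two local minima rather than the self-calibration cost, and plain gradient descent stands in for the nonlinear optimization technique.

```python
from typing import Callable, Iterable, Tuple

Result = Tuple[float, float]  # (optimized parameter, final cost)


def toy_cost(x: float) -> float:
    # Illustrative cost with two local minima (near x = -1 and x = +1).
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_optimize(x0: float, rate: float = 0.01, steps: int = 2000) -> Result:
    # Stand-in local optimizer (assumption): plain gradient descent on toy_cost.
    x = x0
    for _ in range(steps):
        grad = 4.0 * x * (x * x - 1.0) + 0.3
        x -= rate * grad
    return x, toy_cost(x)


def best_of_starts(starts: Iterable[float],
                   optimize: Callable[[float], Result]) -> Result:
    # Optimize from every set of initial values and keep the lowest final cost.
    return min((optimize(s) for s in starts), key=lambda r: r[1])


# Two sets of initial values, as in FIG. 1; the lower-cost result is selected.
best_x, best_cost = best_of_starts([1.0, -1.0], toy_optimize)
```

Each start converges to a different local minimum here, which is why comparing the final costs matters.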
- The elements of FIG. 1 are discussed in more detail below.
- Embodiments of the nonlinear self-calibration technique may address a problem in camera motion estimation—determining the intrinsic parameters of the cameras such as focal length.
- One method is conventional calibration, where the camera intrinsic parameters are determined from one or more captured images of a known calibration target or known properties of the scene such as vanishing points of orthogonal directions.
- the other method is generally referred to as self-calibration.
- In self-calibration, the camera intrinsic parameters are determined directly from constraints on the internal and/or external parameters.
- Self-calibration is generally more useful in practice because a calibration target or known properties of the scene are typically not available.
- a goal of self-calibration is to find a 4×4 matrix $H \in \mathbb{R}^{4 \times 4}$ such that $P_i H$ is a metric reconstruction.
- Equation (B1) is equivalent to the following reduced version where T i has been dropped:
- $H_1 \in \mathbb{R}^{4 \times 3}$ is the left 4×3 part of $H$.
- There is a generic ambiguity on $R_i$ in the sense that if $(H_1, R_i)$ satisfies equation (B2), then $(H_1 R, R_i R)$ satisfies the same equation, where $R$ is an arbitrary 3×3 rotation matrix. Without loss of generality, $R_1$ is chosen to be the identity rotation.
- $P_i$ contains a projective ambiguity. In order to at least partially fix the ambiguity, $P_1$ may be chosen to be $[I, 0]$. In the following discussion, it is assumed that $P_1$ has this expression.
- If $K_i$ is allowed to vary arbitrarily, the problem is not well-defined. For instance, for any given $H_1 \in \mathbb{R}^{4 \times 3}$, a decomposition similar to the QR decomposition may be performed to find an upper triangular matrix and a rotation matrix that satisfy the constraint.
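The displayed equations (B1) and (B2) are not reproduced in this text. As a hedged reconstruction from the surrounding definitions only, they may take a form like the one below. The symbol $\simeq$ (equality up to scale), the block $[\,I \mid -T_i\,]$, and the reading of $K_i$, $R_i$, and $T_i$ as intrinsic matrix, rotation, and translation are assumptions, not quotations from the source.

```latex
% Assumed form of (B1): metric upgrade of each projective camera, up to scale
P_i H \simeq K_i R_i \, [\, I \mid -T_i \,], \qquad i = 1, \dots, N

% Assumed form of (B2): keeping only the left 4x3 block H_1 of H drops T_i
P_i H_1 \simeq K_i R_i

% Rotation ambiguity: for any 3x3 rotation matrix R
P_i (H_1 R) = (P_i H_1) R \simeq K_i (R_i R)
```

The last line shows why $(H_1 R, R_i R)$ satisfies the same relation as $(H_1, R_i)$: multiplying on the right by a rotation leaves $K_i$ untouched and turns $R_i$ into another rotation.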
- Embodiments of the self-calibration technique may exploit the assumptions on $K_i$ to arrive at interesting solutions. In embodiments the following assumptions may be made about the camera intrinsic matrix $K_i$:
- embodiments of the self-calibration technique can be generalized to cases where different assumptions are made. Under these assumptions, the effect of principal point, pixel skew, and pixel aspect ratio on both $P_i$ and $K_i$ can be undone, and a simpler formulation may be derived:
- $H_{11}$ is the top 3×3 part of $H_1$.
- Equation (B3) becomes:
- Some prior knowledge on the focal length may be assumed. For instance, if the lens and camera that are used to capture the image are known, an approximate focal length can be computed from the focal length of the lens and parameters of the camera sensor.
- the lens information may, for example, be obtained from image/video metadata.
- the focal length is in the range from 24 mm to 35 mm (35 mm equivalent).
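As a rough illustration of how a prior focal length might be turned into pixel units, the sketch below converts a 35 mm-equivalent focal length using the image width. It assumes the equivalence is taken against the 36 mm width of a full-frame sensor (equivalence is sometimes defined by the sensor diagonal instead), and `focal_length_pixels` is a hypothetical helper, not something named in the source.

```python
FULL_FRAME_WIDTH_MM = 36.0  # width of a 35 mm full-frame sensor


def focal_length_pixels(focal_35mm_equiv: float, image_width_px: int) -> float:
    # Assumption: 35 mm equivalence is measured against the sensor width.
    return focal_35mm_equiv * image_width_px / FULL_FRAME_WIDTH_MM


# Prior range of 24 mm to 35 mm (35 mm equivalent) for a 1920-pixel-wide frame.
f_low = focal_length_pixels(24.0, 1920)
f_high = focal_length_pixels(35.0, 1920)
```

A range such as `(f_low, f_high)` could then serve as the bounded domain for the initial focal length estimate.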
- $f_i$ is assumed to be the same for all the images, and may be denoted by $f$.
- the self-calibration problem may be solved according to an optimization process.
- the following cost function may be optimized:
- Since equation (B9) has the form of a nonlinear least squares problem, in at least some embodiments the Levenberg-Marquardt algorithm may be used to optimize the cost.
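To make the Levenberg-Marquardt step concrete, here is a minimal single-parameter sketch. It does not implement equation (B9), which is not reproduced in this text. It fits a focal length `f` to toy pinhole observations `u = f * X / Z`, optimizing over `log f` so that the problem is nonlinear and `f` stays positive. The function name, the toy data, and the damping schedule (divide or multiply by 10) are illustrative assumptions.

```python
import math
from typing import List, Tuple


def lm_fit_log_focal(ratios: List[float], observed: List[float],
                     f_init: float, iters: int = 50) -> Tuple[float, float]:
    """Minimize sum_i (exp(p) * ratios[i] - observed[i])^2 over p = log f."""
    p = math.log(f_init)
    mu = 1e-3  # damping term

    def residuals(q: float) -> List[float]:
        f = math.exp(q)
        return [f * c - u for c, u in zip(ratios, observed)]

    r = residuals(p)
    cost = sum(x * x for x in r)
    for _ in range(iters):
        f = math.exp(p)
        jac = [f * c for c in ratios]            # d r_i / d p
        jtj = sum(j * j for j in jac)
        jtr = sum(j * x for j, x in zip(jac, r))
        step = -jtr / (jtj + mu)                 # damped Gauss-Newton step
        r_new = residuals(p + step)
        cost_new = sum(x * x for x in r_new)
        if cost_new < cost:                      # accept and trust the model more
            p, r, cost = p + step, r_new, cost_new
            mu /= 10.0
        else:                                    # reject and damp harder
            mu *= 10.0
        if abs(step) < 1e-12:
            break
    return math.exp(p), cost


# Toy data: X/Z ratios of four points observed with a true focal length of 800.
toy_ratios = [0.1, -0.2, 0.3, 0.05]
toy_observed = [800.0 * c for c in toy_ratios]
f_est, final_cost = lm_fit_log_focal(toy_ratios, toy_observed, f_init=1000.0)
```

The initial guess is deliberately 25% off. In this toy setting the damped steps still converge to the true value.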
- a pair of projection matrices is chosen, one of which is that of the first image.
- the choice of the other projection matrix may be important.
- the camera that is farthest away from the first image in time may be chosen. Without loss of generality, assume $(P_1, P_2)$ are chosen. The following is computed:
- $R_i$ and $\lambda_i$ may be computed as follows.
- a QR decomposition may be computed as follows:
- $a_i$ is a 3×3 upper triangular matrix and $\hat{R}_i$ is a 3×3 rotation matrix.
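The decomposition equation itself is not reproduced in this text. As an illustration of the kind of factorization described, the sketch below splits a 3×3 matrix into an upper triangular factor times a rotation (often called an RQ decomposition) using Gram-Schmidt on the rows, from the bottom row up. The factor order (upper triangular on the left) and the names `rq_decompose`, `dot`, and `det3` are assumptions. This is not necessarily the exact procedure used in the source.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def det3(m: Matrix) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def rq_decompose(m: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (a, r) with m = a * r, a upper triangular, r a rotation.

    Assumes m is non-singular."""
    a = [[0.0] * 3 for _ in range(3)]
    r: Matrix = [[0.0] * 3 for _ in range(3)]
    for i in (2, 1, 0):                      # orthonormalize rows bottom-up
        v = list(m[i])
        for j in range(i + 1, 3):
            a[i][j] = dot(m[i], r[j])
            v = [vk - a[i][j] * rk for vk, rk in zip(v, r[j])]
        a[i][i] = math.sqrt(dot(v, v))
        r[i] = [vk / a[i][i] for vk in v]
    if det3(r) < 0:                          # turn a reflection into a rotation
        r[0] = [-x for x in r[0]]
        a[0][0] = -a[0][0]                   # product a * r is unchanged
    return a, r


example = [[800.0, 2.0, 320.0], [0.0, 790.0, 240.0], [0.1, -0.2, 1.0]]
upper, rotation = rq_decompose(example)
```

When the orthogonal factor comes out as a reflection, the sketch flips one row and the matching diagonal entry, so the upper triangular factor may carry a negative diagonal entry in that case.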
- Equation (B9) may be optimized, for example using a Levenberg-Marquardt technique. Since there are two solutions for $H_{21}$, there are two sets of initial values. In at least some embodiments, two optimizations are performed, one using each set of initial values. The result with the smaller cost may be chosen. Note that equation (B9) has a sparse form, and can be optimized efficiently using a sparse solver.
- the focal length changes for each image.
- a generalization of the algorithm in the section titled Constant focal length may be used for the varying focal length case.
- $P_1$ and $P_2$ are chosen to compute $H_{21}$. The following is computed:
- $R_i$ and $\lambda_i$ can be computed using the same algorithm presented in the section titled Constant focal length. However, the optimization may be modified to optimize over $f_i$ as well:
- Embodiments of the nonlinear self-calibration technique as described herein may be robust to error in the initial estimate of the focal length.
- the optimization tends to converge even if the focal length estimate is off by as much as 20%. Since in practice accurate prior knowledge may often not be available or attainable, this robustness is advantageous.
- the robustness of the nonlinear self-calibration technique also suggests a way to handle cases where there is no prior knowledge on the focal length. Note that the focal length has a bounded domain in $\mathbb{R}$.
- a brute-force search may be used. Let $f_{\min}$ and $f_{\max}$ be the minimum and maximum focal length. In the constant focal length case, the range may be divided into $M$ bins as follows:
- Each $f_i$ may be used as the initial value for $f$, and the optimization may be performed. The result with the least cost may be returned.
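The brute-force search can be sketched as below. The binning formula is not reproduced in this text, so uniform bins with the bin centers as initial values are an assumption. `toy_cost` and `toy_optimize` are hypothetical stand-ins (a multi-minimum toy cost and a shrinking-step descent) for the self-calibration cost and its nonlinear optimizer.

```python
import math
from typing import Callable, List, Tuple


def bin_centers(f_min: float, f_max: float, m: int) -> List[float]:
    # Assumption: M uniform bins, each represented by its center.
    width = (f_max - f_min) / m
    return [f_min + (k + 0.5) * width for k in range(m)]


def brute_force_focal(f_min: float, f_max: float, m: int,
                      optimize: Callable[[float], Tuple[float, float]]
                      ) -> Tuple[float, float]:
    # Optimize from every initial focal length and return the least-cost result.
    return min((optimize(f0) for f0 in bin_centers(f_min, f_max, m)),
               key=lambda fc: fc[1])


def toy_cost(f: float) -> float:
    # Illustrative cost with several local minima; not the patent's cost.
    return 0.01 * (f - 31.0) ** 2 + 1.0 - math.cos(f)


def toy_optimize(f0: float, step: float = 0.5,
                 tol: float = 1e-6) -> Tuple[float, float]:
    # Stand-in local optimizer: move while the cost drops, else halve the step.
    f, c = f0, toy_cost(f0)
    while step > tol:
        for cand in (f - step, f + step):
            cc = toy_cost(cand)
            if cc < c:
                f, c = cand, cc
                break
        else:
            step /= 2.0
    return f, c


best_f, best_cost = brute_force_focal(24.0, 35.0, 5, toy_optimize)
```

Because the local optimizer only ever lowers the cost, the returned result is never worse than the best of the initial values.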
- embodiments of the self-calibration technique described herein find two solutions to $H_{21}$ that correspond to the two different signs of $P_2$. Finding only one solution, as is done in conventional self-calibration techniques, may result in the wrong solution being picked for at least the reason that the sign of $P_2$ is inconsistent.
- embodiments of the self-calibration technique described herein employ a nonlinear optimization to further refine the solution. This makes the self-calibration technique robust to errors in the initial guess of the focal length.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive reconstruction algorithm that starts by adaptively determining and reconstructing an initial set of keyframes that covers only a part of an image sequence (e.g., a set of spaced frames somewhere in the middle of the sequence), and that incrementally and adaptively determines and reconstructs additional keyframes to fully cover the image sequence.
- the adaptive reconstruction algorithm then adaptively determines and reconstructs optimization keyframes to provide a better reconstruction. The rest of the frames in the sequence may then be reconstructed based on the determined and reconstructed keyframes.
- At least some embodiments of the adaptive reconstruction algorithm may be configured to handle both cases where the intrinsic camera parameters (e.g., focal length) are known (e.g., via user input or via metadata provided with the input image sequence) and cases where the intrinsic camera parameters are not known.
- the first case may be referred to herein as the calibrated case
- the second case may be referred to herein as the uncalibrated case.
- a Euclidean (or metric) reconstruction technique may be applied in the calibrated case.
- a projective reconstruction technique may at least initially be applied.
- the nonlinear self-calibration technique as described herein may be applied to produce a Euclidean (or metric) reconstruction in the uncalibrated case.
- the adaptive reconstruction algorithm may, for example, be used in embodiments of a robust system for estimating camera motion (rotation and translation) in image sequences, a problem known in computer vision as Structure from Motion (SFM).
- Embodiments of a general 3D reconstruction technique which may also be referred to as a general SFM technique, are generally directed to performing reconstruction for image sequences in which the camera motion includes a non-zero translation component. In other words, the camera has moved when capturing the image sequence.
- the general SFM technique estimates the rotation and translation components of the camera motion, and may also estimate the camera intrinsic parameters (e.g., focal length) if not known.
- the general SFM technique may be generally directed to performing reconstruction for image sequences in which the scene does not contain a dominant plane.
- FIG. 2 is a high-level flowchart of the general SFM technique, according to at least some embodiments.
- an input image sequence may be obtained.
- the image sequence may, for example, be a video taken by a moving video camera or a set of images taken with a still camera.
- a feature tracking technique may be applied to establish point trajectories over time in the input image sequence. Embodiments of a feature tracking technique that may be used in at least some embodiments are described later in this document. Output of the feature tracking technique is a set of point trajectories.
- an initialization technique may be performed to determine and reconstruct a set of initial keyframes covering a portion of the image sequence according to the point trajectories.
- Input to the initialization technique includes at least the set of point trajectories.
- Output of the initialization technique is a set of initial keyframes and the initial reconstruction.
- Elements 106 through 110 are a keyframe reconstruction loop that incrementally and adaptively determines and reconstructs additional keyframes to fully cover the image sequence.
- a new keyframe is determined and reconstructed.
- a Euclidean reconstruction technique can be performed, since the camera intrinsic parameters are known.
- a projective reconstruction technique may be performed.
- a self-calibration technique may be applied to produce a Euclidean (or metric) reconstruction for the frame, if there are enough frames to perform the self-calibration.
- the method returns to 106 to add a next keyframe.
- an opt-keyframe technique may then be performed to determine and reconstruct optimization keyframes to improve the quality of the reconstruction.
- non-keyframes (frames that have not yet been included in the reconstruction) may then be reconstructed.
- final processing may be performed.
- at least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence may be output.
- embodiments of the general SFM technique may first perform feature tracking to establish point trajectories over time.
- a basic idea of feature tracking is to find the locations of the same point in subsequent video frames.
- a point should be tracked as long and as accurately as possible, and as many points as possible should be tracked.
- the general SFM technique may use an implementation of the Lucas-Kanade-Tomasi algorithm to perform feature tracking.
- a translational model may be used to track against the previous video frame (at time $t - 1$), and an affine model may be used to track against the reference video frame at time $t_0$ ($t_0$ may vary according to the point).
- the result of feature tracking is a set of point trajectories. Each point trajectory includes the two-dimensional (2D) locations of the “same” point in a contiguous set of frames. Let $x_{i,j}$ denote the 2D location of the i-th point in the j-th image.
- $x_{i,j}$ is undefined for some combinations of i and j.
- quantities such as $\sum_{i,j} x_{i,j}$ may be used even if $x_{i,j}$ is undefined.
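One way to represent point trajectories with undefined entries is a sparse mapping that stores only the frames in which a point was tracked. The sketch below reads the aggregate quantity above as a sum over all defined (i, j) pairs, which is an assumption. The container layout and the name `sum_defined` are hypothetical.

```python
from typing import Dict, Tuple

Point2D = Tuple[float, float]
# Point index i -> {frame index j -> 2D location x_ij}. Missing keys mean
# that x_ij is undefined (the point was not tracked in that frame).
Trajectories = Dict[int, Dict[int, Point2D]]


def sum_defined(trajectories: Trajectories) -> Point2D:
    # Sum x_ij over every (i, j) that exists; undefined terms are skipped.
    sx = sy = 0.0
    for frames in trajectories.values():
        for x, y in frames.values():
            sx += x
            sy += y
    return sx, sy


tracks: Trajectories = {
    0: {0: (1.0, 2.0), 1: (1.5, 2.5)},  # point 0 seen in frames 0 and 1
    1: {1: (4.0, 0.0)},                 # point 1 seen only in frame 1
}
total = sum_defined(tracks)
```

Storing only the defined entries means terms with an undefined $x_{i,j}$ drop out of any sum without special-casing.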
- the general SFM technique can work with any feature tracking technique that computes point trajectories.
- the point trajectories are input to the rest of the general SFM technique; the input image sequence may not be referenced after feature tracking.
- an initialization technique may be performed in an adaptive reconstruction algorithm to determine and reconstruct a set of initial keyframes covering a portion of the image sequence according to the point trajectories.
- the general SFM technique may implement an incremental approach that adds one or more frames to the reconstruction at a time. To accomplish this, an initial reconstruction may need to be generated.
- a goal of the initialization technique is to compute an initial reconstruction from a subset of frames in the image sequence.
- two-view reconstruction algorithms may be used. Since the general SFM technique is incremental, the quality of the initial reconstruction may be important in generating a quality overall reconstruction. In at least some embodiments, to help achieve a quality initial reconstruction, two initial frames that best satisfy requirements of the initial reconstruction algorithm may be determined.
- input to the initialization technique includes at least the set of point trajectories.
- Two initial keyframes may be selected.
- a reconstruction may be performed from the two initial keyframes. Additional keyframes between the initial keyframes may be determined and reconstructed.
- a global optimization of the reconstruction may be performed.
- One or more outlier points may be determined and removed.
- One or more inlier points may be determined and recovered. Note that outlier and inlier points correspond to particular point trajectories, and that the entire point trajectory is removed (for outlier points) or recovered (for inlier points). If more than a threshold number of inliers were recovered, another global optimization may be performed as indicated at 280 . Otherwise, the initialization technique is done. Output of the initialization technique is a set of initial keyframes and the initial reconstruction.
- a keyframe reconstruction loop may be used to enlarge the initial reconstruction to cover the entire image sequence, as shown in elements 106 - 110 of FIG. 2 .
- the keyframe reconstruction loop may add keyframes in an incremental and adaptive fashion, adding one keyframe at a time until the entire video sequence is covered. Note that this loop does not add all the frames in the input image sequence. Instead, an adaptive algorithm is used to select particular frames to add.
- the additional keyframes may be selected from the set of keyframes that were previously selected.
- the initial reconstruction may cover a portion of the image sequence, and the additional keyframes may be added one at a time at each end of the current reconstruction, working outwards and alternating between ends.
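The outward, alternating order described above can be illustrated with a short sketch. It only shows the alternation between the two ends over an already selected list of candidate keyframes; the adaptive selection itself is not modeled. The function name, the index-based representation, and the choice to start on the right-hand side are assumptions made for illustration.

```python
from typing import List


def alternating_order(num_keyframes: int, first: int, last: int) -> List[int]:
    """Order in which candidates outside [first, last] would be added.

    Candidates are indexed 0..num_keyframes-1 and [first, last] is the span
    already covered by the initial reconstruction.
    """
    order: List[int] = []
    left, right = first - 1, last + 1
    take_right = True  # assumed starting side
    while left >= 0 or right < num_keyframes:
        if (take_right and right < num_keyframes) or left < 0:
            order.append(right)  # extend the reconstruction at its right end
            right += 1
        else:
            order.append(left)   # extend the reconstruction at its left end
            left -= 1
        take_right = not take_right
    return order
```

When one end of the sequence is reached, the sketch simply keeps extending the other end until every candidate is covered.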
- FIG. 3 is a flowchart of an adaptive technique for iteratively selecting and reconstructing additional keyframes to fully cover the image sequence that may be used in a general adaptive reconstruction algorithm, for example as implemented by a general 3D SFM technique, according to at least some embodiments.
- the adaptive technique for iteratively selecting and reconstructing additional keyframes is done. Otherwise, the technique proceeds to element 310 .
- a next keyframe may be determined according to an adaptive selection technique.
- the determined keyframe may be reconstructed and thus added to the current reconstruction.
- a global optimization may be performed on the current reconstruction.
- one or more outlier points may be determined and removed from the reconstruction.
- one or more inlier points may be determined and recovered (added to the reconstruction).
- a global optimization may again be performed on the current reconstruction as indicated at 362 .
- the current reconstruction is already a Euclidean reconstruction, so the technique returns to element 300 to determine if there are more keyframes to be processed. Otherwise, this is the uncalibrated case, and the reconstruction is a projective reconstruction.
- self-calibration may be performed as indicated at 372 to upgrade the projective reconstruction to a Euclidean reconstruction.
- Results of the self-calibration may be analyzed to determine if the results are acceptable.
- the technique returns to element 300 to determine if there are more keyframes to be processed. Otherwise, the technique reverts to the reconstruction prior to the self-calibration attempt as indicated at 382 , and the technique returns to element 300 to determine if there are more keyframes to be processed.
- a self-calibration technique may be applied to upgrade a reconstruction from projective to Euclidean (metric). Note that self-calibration may not be applied to the calibrated case because the reconstruction is already metric. Once the reconstruction is Euclidean, self-calibration does not need to be performed. In at least some embodiments, self-calibration is only performed when the number of cameras in the current reconstruction reaches a certain threshold.
- the section titled Nonlinear Self-Calibration Technique describes a self-calibration technique that may be used in at least some embodiments. This section describes a few extra steps that may be taken in some embodiments to ensure that the results of the self-calibration technique are good and thus accepted.
- FIG. 4 is a flowchart of a self-calibration technique that may be implemented in the adaptive technique for iteratively selecting and reconstructing additional keyframes, according to at least some embodiments.
- a total reprojection error is computed, as indicated at 500 .
- Self-calibration is then performed, as indicated at 510 .
- a self-calibration technique as described in the section titled Nonlinear Self-Calibration Technique may be used.
- a global optimization of the reconstruction may be performed, as indicated at 520 .
- a multi-view bundle adjustment technique as described in the section titled Optimization using multi-view bundle adjustment may be used.
- inlier points may be determined and recovered, for example as described in the section titled Inlier recovery.
- the method may iterate between adding inliers and global optimization (e.g., multi-view bundle adjustment) until either no new inlier is added or the iteration count reaches a pre-defined threshold.
- a new total reprojection error may be computed and compared to the total reprojection error that was previously computed at 500 , as indicated at 550 .
- the results of the comparison may be used to determine if the self-calibration was successful.
- the self-calibration result is accepted as indicated at 570 . Otherwise, the self-calibration step has failed, and the reconstruction is reverted back to the state before self-calibration, as indicated at 580 .
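The accept-or-revert logic of FIG. 4 can be sketched as below. This is a sketch under stated assumptions rather than the method of any embodiment: every callable is a hypothetical placeholder, and because the text does not specify how the two reprojection errors are compared, the acceptance rule is passed in as `is_acceptable`. The numeric comments refer to the elements of FIG. 4 named in the text.

```python
import copy
from typing import Callable, Tuple, TypeVar

R = TypeVar("R")  # opaque reconstruction state (hypothetical)


def try_self_calibration(
    recon: R,
    total_reprojection_error: Callable[[R], float],
    self_calibrate: Callable[[R], R],
    global_optimize: Callable[[R], R],
    recover_inliers: Callable[[R], Tuple[R, int]],
    is_acceptable: Callable[[float, float], bool],
    max_iters: int = 5,
) -> Tuple[R, bool]:
    """Return (reconstruction, accepted)."""
    backup = copy.deepcopy(recon)                    # state to revert to
    error_before = total_reprojection_error(recon)   # 500
    recon = self_calibrate(recon)                    # 510
    recon = global_optimize(recon)                   # 520
    # Alternate inlier recovery and global optimization until no new inlier
    # is added or the iteration cap is reached.
    for _ in range(max_iters):
        recon, n_added = recover_inliers(recon)
        if n_added == 0:
            break
        recon = global_optimize(recon)
    error_after = total_reprojection_error(recon)    # 550
    if is_acceptable(error_before, error_after):
        return recon, True                           # 570: accept
    return backup, False                             # 580: revert
```

Taking a deep copy up front is the simplest way to guarantee a clean revert; an implementation with large reconstructions might prefer to snapshot only the parameters that self-calibration changes.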
- an opt-keyframe technique may be applied to a reconstruction for an image sequence to determine and reconstruct optimization keyframes to improve the quality of the reconstruction.
- additional frames referred to herein as “opt-keyframes”
- the reconstruction is again globally optimized.
- opt-keyframes may be determined and added to the reconstruction so that the total number of frames in the reconstruction satisfies a threshold.
- One or more bad (outlier) points may be determined according to one or more criteria and removed from the reconstruction.
- One or more good (inlier) points may be determined and recovered.
- Bad (outlier) points may again be determined according to one or more criteria and removed from the reconstruction.
- the reconstruction may then be globally optimized.
- a set of opt-keyframes may be computed that are uniformly spread in the entire sequence so that the total number of frames reaches a pre-defined threshold.
- the camera parameters for the newly selected opt-keyframes may be computed.
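One way to spread opt-keyframes uniformly until a frame-count threshold is met is sketched below. The function name, the frame-index representation, and the mid-bin sampling rule are assumptions made for illustration; the text only requires that the selected frames be uniformly spread and that the total reach a pre-defined threshold.

```python
from typing import List, Set


def select_opt_keyframes(
    num_frames: int, keyframes: Set[int], target_total: int
) -> List[int]:
    """Pick evenly spaced frames that are not yet in the reconstruction."""
    needed = target_total - len(keyframes)
    candidates = [f for f in range(num_frames) if f not in keyframes]
    if needed <= 0 or not candidates:
        return []  # threshold already met, or nothing left to add
    if needed >= len(candidates):
        return candidates  # every remaining frame is required
    # Split the candidates into `needed` equal bins and take each bin's middle.
    step = len(candidates) / needed
    return [candidates[int((i + 0.5) * step)] for i in range(needed)]
```

For example, with 10 frames, keyframes {0, 9} and a threshold of 4, the sketch selects frames 3 and 7.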
- non-keyframes (frames that have not yet been included in the reconstruction) may be reconstructed.
- all of the frames in the input sequence that are not included in the current reconstruction may be reconstructed. These frames may be referred to as non-keyframes.
- all the frames in the reconstruction that include both keyframes and opt-keyframes are first reconstructed.
- the non-keyframe reconstruction technique may work on adjacent pairs of keyframes until all the pairs of keyframes have been processed.
- all of the 3D points that are visible in both frames are collected. These points may then be used to compute the parameters for a camera between the two frames, for example as described below.
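Collecting the 3D points shared by each adjacent pair of keyframes can be sketched as follows. The `visibility` mapping (frame index to the set of visible 3D point ids) and the function name are hypothetical; how the collected points are then used to compute the in-between camera parameters is not shown.

```python
from typing import Dict, Iterator, List, Set, Tuple


def shared_points_between_keyframes(
    keyframes: List[int],
    visibility: Dict[int, Set[int]],
) -> Iterator[Tuple[int, int, Set[int]]]:
    """Yield (frame_a, frame_b, ids of 3D points visible in both frames)."""
    ordered = sorted(keyframes)
    for a, b in zip(ordered, ordered[1:]):
        # Only points seen in both bounding keyframes are kept for this pair.
        yield a, b, visibility.get(a, set()) & visibility.get(b, set())
```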
- final processing may be performed.
- the largest contiguous subset of frames in the reconstruction may be found. All the frames that are not in this subset, along with all the points that are not visible in any of the frames in the subset, may be removed from the reconstruction.
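The cropping step above can be sketched as a search for the longest run of consecutive frame indices, followed by a filter on point visibility. The data layout (a set of reconstructed frame indices and a `visibility` mapping from frame index to visible point ids) and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def crop_to_largest_contiguous(
    frames: Set[int], visibility: Dict[int, Set[int]]
) -> Tuple[List[int], Set[int]]:
    """Return the longest run of consecutive frames and the points they see."""
    ordered = sorted(frames)
    best_start, best_len, start = 0, 0, 0
    for i in range(len(ordered)):
        if i > 0 and ordered[i] != ordered[i - 1] + 1:
            start = i  # a gap in the frame indices starts a new run
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    kept_frames = ordered[best_start:best_start + best_len]
    # Keep only points visible in at least one of the kept frames.
    kept_points: Set[int] = set()
    for f in kept_frames:
        kept_points |= visibility.get(f, set())
    return kept_frames, kept_points
```

If two runs have the same length, this sketch keeps the earlier one; the text does not say how ties should be broken.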
- all of the frames and points in the reconstruction may be optimized (global optimization). In at least some embodiments, this optimization may be performed according to a refinement process that optimizes all the points and cameras together.
- At least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence may be output.
- the reconstruction may have been cropped to the largest contiguous set of frames, as described in the section titled Final Processing.
- the output (at least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence) of the general SFM technique described above may be used in a wide range of applications in different domains including but not limited to 3D image-based modeling and rendering, video stabilization, panorama stitching, video augmentation, vision based robot navigation, human-computer interaction, etc.
- the camera intrinsic parameters and the Euclidean motion parameters determined from the video sequence using an embodiment of the general SFM technique as described herein may be used to insert a 3D object into a video sequence.
- the inserted 3D object moves with the motion of the camera to maintain a natural and believable positioning in the frames.
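To see why the recovered parameters are sufficient for object insertion, the sketch below projects one 3D point of a virtual object into a frame using that frame's intrinsic and Euclidean motion parameters. It assumes a simple pinhole model with square pixels and zero skew, and a world-to-camera convention for the rotation and translation; none of these conventions, nor the function name, are specified in the text.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def project_point(
    point: Vec3,
    rotation: Sequence[Sequence[float]],  # 3x3 world-to-camera rotation (assumed)
    translation: Vec3,                    # world-to-camera translation (assumed)
    focal_length: float,                  # in pixels
    principal_point: Tuple[float, float] = (0.0, 0.0),
) -> Tuple[float, float]:
    """Project a 3D world point to pixel coordinates for one frame."""
    # Euclidean motion: move the point into the camera's coordinate frame.
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is not in front of the camera")
    # Intrinsics: perspective division, focal scaling, principal point offset.
    u = focal_length * xc / zc + principal_point[0]
    v = focal_length * yc / zc + principal_point[1]
    return u, v
```

Repeating this projection with each frame's own parameters is what makes an inserted object appear to stay fixed in the scene as the camera moves.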
- Some embodiments may include a means for performing one or more of the various techniques described herein, including but not limited to the nonlinear self-calibration technique.
- an SFM module may receive input specifying a set of point trajectories and generate as output structure and motion for a set of images or frames as described herein.
- the SFM module may, for example, apply the nonlinear self-calibration technique to convert a projective reconstruction to a metric (Euclidean) reconstruction.
- the SFM techniques described herein, including but not limited to the nonlinear self-calibration technique, and/or the SFM module may in some embodiments be implemented by a non-transitory, computer-readable storage medium and one or more processors (e.g., CPUs and/or GPUs) of a computing apparatus.
- the computer-readable storage medium may store program instructions executable by the one or more processors to cause the computing apparatus to perform one or more of the techniques as described herein, for example the nonlinear self-calibration technique.
- Other embodiments of the module(s) may be at least partially implemented by hardware circuitry and/or firmware stored, for example, in a non-volatile memory.
- Embodiments of an SFM module may, for example, be implemented as a stand-alone application, as a module of an application, as a plug-in or plug-ins for applications including image or video processing applications, and/or as a library function or functions that may be called by other applications such as image processing or video processing applications.
- Embodiments of the module(s) may be implemented in any image or video processing application, or more generally in any application in which video or image sequences may be processed.
- Example applications in which embodiments may be implemented may include, but are not limited to, Adobe® Premiere® and Adobe® After Effects®.
- An example module that may implement one or more of the SFM techniques as described herein is illustrated in FIG. 5.
- An example computer system on which the module may be implemented is illustrated in FIG. 6 . Note that the module may, for example, be implemented in still cameras and/or video cameras.
- FIG. 5 illustrates an example module that may implement one or more of the SFM techniques, including but not limited to the nonlinear self-calibration technique, as illustrated in the accompanying Figures and described herein, according to at least some embodiments.
- Module 1700 may, for example, receive an input image sequence, or alternatively a set of point trajectories for the images in a sequence. Module 1700 then applies one or more of the techniques as described herein to generate structure, camera parameters, and motion. In at least some embodiments, module 1700 may obtain point trajectories for the sequence, as indicated at 1710 . Module 1700 may then perform initialization to determine and reconstruct initial keyframes, as indicated at 1720 .
- Module 1700 may then determine and reconstruct additional keyframes to cover the video sequence, as indicated at 1730 .
- module 1700 may apply an embodiment of the nonlinear self-calibration technique as described herein, for example in converting a projective reconstruction to a metric (Euclidean) reconstruction at element 1730.
- Module 1700 may then determine and reconstruct optimization keyframes, as indicated at 1740 .
- Module 1700 may then reconstruct non-keyframes, as indicated at 1750 .
- Module 1700 may then perform final processing, as indicated at 1760 .
- module 1700 may generate as output estimates of camera parameters and camera motion for the image sequence.
- Example applications of the SFM techniques as described herein may include one or more of, but are not limited to, video stabilization, video augmentation (augmenting an original video sequence with graphic objects), video classification, and robot navigation.
- embodiments of one or more of the SFM techniques may be used to provide structure and motion to any application that requires or desires such output to perform some video- or image-processing task.
- Embodiments of the various techniques as described herein including but not limited to the nonlinear self-calibration technique may be executed on one or more computer systems, which may interact with various other devices.
- One such computer system is illustrated by FIG. 6 .
- computer system 2000 may be any of various types of devices, including, but not limited to, a personal computer system, desktop computer, laptop, notebook, or netbook computer, mainframe computer system, handheld computer, workstation, network computer, a camera, a video camera, a tablet or pad device, a smart phone, a set top box, a mobile device, a consumer device, video game console, handheld video game device, application server, storage device, a peripheral device such as a switch, modem, router, or in general any type of computing or electronic device.
- computer system 2000 includes one or more processors 2010 coupled to a system memory 2020 via an input/output (I/O) interface 2030 .
- Computer system 2000 further includes a network interface 2040 coupled to I/O interface 2030 , and one or more input/output devices 2050 , such as cursor control device 2060 , keyboard 2070 , display(s) 2080 , and touch- or multitouch-enabled device(s) 2090 .
- it is contemplated that embodiments may be implemented using a single instance of computer system 2000 , while in other embodiments multiple such systems, or multiple nodes making up computer system 2000 , may be configured to host different portions or instances of embodiments.
- some elements may be implemented via one or more nodes of computer system 2000 that are distinct from those nodes implementing other elements.
- computer system 2000 may be a uniprocessor system including one processor 2010 , or a multiprocessor system including several processors 2010 (e.g., two, four, eight, or another suitable number).
- processors 2010 may be any suitable processor capable of executing instructions.
- processors 2010 may be general-purpose or embedded processors implementing any of a variety of instruction set architectures (ISAs), such as the x86, PowerPC, SPARC, or MIPS ISAs, or any other suitable ISA.
- each of processors 2010 may commonly, but not necessarily, implement the same ISA.
- At least one processor 2010 may be a graphics processing unit.
- a graphics processing unit or GPU may be considered a dedicated graphics-rendering device for a personal computer, workstation, game console or other computing or electronic device.
- Modern GPUs may be very efficient at manipulating and displaying computer graphics, and their highly parallel structure may make them more effective than typical CPUs for a range of complex graphical algorithms.
- a graphics processor may implement a number of graphics primitive operations in a way that makes executing them much faster than drawing directly to the screen with a host central processing unit (CPU).
- the techniques disclosed herein may, at least in part, be implemented by program instructions configured for execution on one of, or parallel execution on two or more of, such GPUs.
- the GPU(s) may implement one or more application programmer interfaces (APIs) that permit programmers to invoke the functionality of the GPU(s). Suitable GPUs may be commercially available from vendors such as NVIDIA Corporation, ATI Technologies (AMD), and others.
- System memory 2020 may be configured to store program instructions and/or data accessible by processor 2010 .
- system memory 2020 may be implemented using any suitable memory technology, such as static random access memory (SRAM), synchronous dynamic RAM (SDRAM), nonvolatile/Flash-type memory, or any other type of memory.
- program instructions and data implementing desired functions, such as those described above for embodiments of the various techniques as described herein are shown stored within system memory 2020 as program instructions 2025 and data storage 2035 , respectively.
- program instructions and/or data may be received, sent or stored upon different types of computer-accessible media or on similar media separate from system memory 2020 or computer system 2000 .
- a computer-accessible medium may include storage media or memory media such as magnetic or optical media, e.g., disk or CD/DVD-ROM coupled to computer system 2000 via I/O interface 2030 .
- Program instructions and data stored via a computer-accessible medium may be transmitted by transmission media or signals such as electrical, electromagnetic, or digital signals, which may be conveyed via a communication medium such as a network and/or a wireless link, such as may be implemented via network interface 2040 .
- I/O interface 2030 may be configured to coordinate I/O traffic between processor 2010 , system memory 2020 , and any peripheral devices in the device, including network interface 2040 or other peripheral interfaces, such as input/output devices 2050 .
- I/O interface 2030 may perform any necessary protocol, timing or other data transformations to convert data signals from one component (e.g., system memory 2020 ) into a format suitable for use by another component (e.g., processor 2010 ).
- I/O interface 2030 may include support for devices attached through various types of peripheral buses, such as a variant of the Peripheral Component Interconnect (PCI) bus standard or the Universal Serial Bus (USB) standard, for example.
- I/O interface 2030 may be split into two or more separate components, such as a north bridge and a south bridge, for example.
- some or all of the functionality of I/O interface 2030 such as an interface to system memory 2020 , may be incorporated directly into processor 2010 .
- Network interface 2040 may be configured to allow data to be exchanged between computer system 2000 and other devices attached to a network, such as other computer systems, or between nodes of computer system 2000 .
- network interface 2040 may support communication via wired or wireless general data networks, such as any suitable type of Ethernet network, for example; via telecommunications/telephony networks such as analog voice networks or digital fiber communications networks; via storage area networks such as Fibre Channel SANs, or via any other suitable type of network and/or protocol.
- Input/output devices 2050 may, in some embodiments, include one or more display terminals, keyboards, keypads, touchpads, scanning devices, voice or optical recognition devices, or any other devices suitable for entering or retrieving data by one or more computer systems 2000.
- Multiple input/output devices 2050 may be present in computer system 2000 or may be distributed on various nodes of computer system 2000 .
- similar input/output devices may be separate from computer system 2000 and may interact with one or more nodes of computer system 2000 through a wired or wireless connection, such as over network interface 2040 .
- memory 2020 may include program instructions 2025 , configured to implement embodiments of the various techniques as described herein, and data storage 2035 , comprising various data accessible by program instructions 2025 .
- program instructions 2025 may include software elements of embodiments of the various techniques as illustrated in the above Figures.
- Data storage 2035 may include data that may be used in embodiments. In other embodiments, other or different software elements and data may be included.
- computer system 2000 is merely illustrative and is not intended to limit the scope of the various techniques as described herein.
- the computer system and devices may include any combination of hardware or software that can perform the indicated functions, including a computer, personal computer system, desktop computer, laptop, notebook, or netbook computer, mainframe computer system, handheld computer, workstation, network computer, a camera, a video camera, a set top box, a mobile device, network device, internet appliance, PDA, wireless phones, pagers, a consumer device, video game console, handheld video game device, application server, storage device, a peripheral device such as a switch, modem, router, or in general any type of computing or electronic device.
- Computer system 2000 may also be connected to other devices that are not illustrated, or instead may operate as a stand-alone system.
- the functionality provided by the illustrated components may in some embodiments be combined in fewer components or distributed in additional components.
- the functionality of some of the illustrated components may not be provided and/or other additional functionality may be available.
- instructions stored on a computer-accessible medium separate from computer system 2000 may be transmitted to computer system 2000 via transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link.
- Various embodiments may further include receiving, sending or storing instructions and/or data implemented in accordance with the foregoing description upon a computer-accessible medium. Accordingly, the present invention may be practiced with other computer system configurations.
- a computer-accessible medium may include storage media or memory media such as magnetic or optical media, e.g., disk or DVD/CD-ROM, volatile or non-volatile media such as RAM (e.g. SDRAM, DDR, RDRAM, SRAM, etc.), ROM, etc., as well as transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link.
Description
- This application claims benefit of priority of U.S. Provisional Application Ser. No. 61/621,365 entitled “Structure from Motion Methods and Apparatus” filed Apr. 6, 2012, the content of which is incorporated by reference herein in its entirety.
- In computer vision, inferring rigid-body motions of a camera from a video or set of images is a problem known as Structure from Motion (SFM). In SFM, a task or goal is to estimate the camera motion from a set of point correspondences in a set of images or video frames. Obtaining Structure from Motion (SFM) algorithms is of importance because a successful SFM algorithm would enable a wide range of applications in different domains including 3D image-based modeling and rendering, video stabilization, panorama stitching, video augmentation, vision based robot navigation, human-computer interaction, etc.
- Various embodiments of Structure from Motion (SFM) techniques and algorithms are described that may be applied, for example, to find the three-dimensional (3D) structures of a scene, for example from a video taken by a moving video camera or from a set of images taken with a still camera, as well as systems that implement these algorithms and techniques. In SFM, a task or goal is to estimate the camera motion (which may, but does not necessarily, have both translation and rotation components) from a set of point correspondences in a set of images or video frames. In addition, in at least some cases, intrinsic camera parameters (e.g., focal length) may also be estimated if not known. Performing the task of estimating camera motion and intrinsic parameters for a frame or a sequence of frames may be referred to as reconstruction. Thus, a reconstruction algorithm or technique (which may also be referred to as an SFM technique) may be implemented and applied to estimate the camera motion and intrinsic parameters for image sequences.
- Embodiments of a nonlinear self-calibration technique are described that may, for example, be used in various SFM techniques. In contrast to conventional self-calibration methods that use linear or semi-linear algorithms, embodiments of the self-calibration technique may use a nonlinear least squares optimization technique to infer the parameters. In addition, a technique is described for initializing the parameters for the nonlinear optimization. Embodiments of the self-calibration technique may be robust (i.e., may generally produce reliable results), and can make full use of prior knowledge if available. In addition, embodiments of the nonlinear self-calibration technique work for both constant focal length and varying focal length. Embodiments of the nonlinear self-calibration technique may use prior knowledge of the camera intrinsic parameters (e.g., focal length). For instance, if the user knows the focal length or if the focal length is known through metadata of the captured images in the sequence, the known focal length may be used in the formulation to provide reliable calibration results (e.g., motion parameters). However, having such prior knowledge would not make much difference in most conventional linear self-calibration methods. Embodiments of the nonlinear self-calibration technique may be robust and efficient when compared to conventional self-calibration techniques. In particular, the nonlinear optimization problem that is solved may be sparse and may be implemented efficiently.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive technique that iteratively selects and reconstructs keyframes to fully cover an image sequence; the technique may, for example, be used in an adaptive reconstruction algorithm implemented by a general SFM technique. In this adaptive technique, in the uncalibrated case, a projective reconstruction technique may at least initially be applied, and the self-calibration technique may then be applied to generate a Euclidean reconstruction. Embodiments of the nonlinear self-calibration technique may thus allow a metric (Euclidean) reconstruction to be obtained where otherwise only a projective reconstruction could be obtained. A projective reconstruction may be unfit for many practical applications. For instance, it is difficult if not impossible to insert a virtual object into a moving video using a projective reconstruction. However, embodiments of the nonlinear self-calibration technique may be used in other SFM applications or techniques, or in any other application or technique that requires a self-calibration operation to be performed on input image(s).
- FIG. 1 is a high-level flowchart of a nonlinear self-calibration technique, according to at least some embodiments.
- FIG. 2 is a high-level flowchart of a general 3D Structure from Motion (SFM) technique, according to at least some embodiments.
- FIG. 3 is a flowchart of an adaptive technique for iteratively selecting and reconstructing additional keyframes to fully cover the image sequence that may be used in a general adaptive reconstruction algorithm, for example as implemented by a general 3D SFM technique, according to at least some embodiments.
- FIG. 4 is a flowchart of a self-calibration technique that may be applied in the adaptive technique for iteratively selecting and reconstructing additional keyframes, according to at least some embodiments.
- FIG. 5 illustrates a module that may implement one or more of the Structure from Motion (SFM) techniques and algorithms as described herein, including but not limited to the nonlinear self-calibration technique, according to at least some embodiments.
- FIG. 6 illustrates an example computer system that may be used in embodiments.
- While the invention is described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that the invention is not limited to the embodiments or drawings described. It should be understood, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including, but not limited to.
- In the following detailed description, numerous specific details are set forth to provide a thorough understanding of claimed subject matter. However, it will be understood by those skilled in the art that claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.
- Some portions of the detailed description which follow are presented in terms of algorithms or symbolic representations of operations on binary digital signals stored within a memory of a specific apparatus or special purpose computing device or platform. In the context of this particular specification, the term specific apparatus or the like includes a general purpose computer once it is programmed to perform particular functions pursuant to instructions from program software. Algorithmic descriptions or symbolic representations are examples of techniques used by those of ordinary skill in the signal processing or related arts to convey the substance of their work to others skilled in the art. An algorithm is here, and is generally, considered to be a self-consistent sequence of operations or similar signal processing leading to a desired result. In this context, operations or processing involve physical manipulation of physical quantities. Typically, although not necessarily, such quantities may take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared or otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to such signals as bits, data, values, elements, symbols, characters, terms, numbers, numerals or the like. It should be understood, however, that all of these or similar terms are to be associated with appropriate physical quantities and are merely convenient labels. Unless specifically stated otherwise, as apparent from the following discussion, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining” or the like refer to actions or processes of a specific apparatus, such as a special purpose computer or a similar special purpose electronic computing device. 
In the context of this specification, therefore, a special purpose computer or a similar special purpose electronic computing device is capable of manipulating or transforming signals, typically represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the special purpose computer or similar special purpose electronic computing device.
- Various embodiments of Structure from Motion (SFM) techniques and algorithms are described that may be applied, for example, to find the three-dimensional (3D) structures of a scene, for example from a video taken by a moving video camera or from a set of images taken with a still camera. Systems that may implement these algorithms and techniques are also described. In SFM, a task or goal is to estimate the camera motion (which may, but does not necessarily, have both translation and rotation components) from a set of point correspondences in a set of images or video frames. In addition, in at least some cases, intrinsic camera parameters (e.g., focal length) may also be estimated if not known. Performing the task of estimating camera motion and intrinsic parameters for a frame or a sequence of frames may be referred to as reconstruction. Thus, a reconstruction algorithm or technique (which may also be referred to as an SFM technique) may be implemented and applied to estimate the camera motion and intrinsic parameters for image sequences. Note that a distinct camera may be assumed for each image or frame in an image sequence. Thus, each frame or image in a sequence may be referred to as a “camera.”
- Embodiments of a nonlinear self-calibration technique are described that may, for example, be used in various SFM techniques. In contrast to conventional self-calibration methods that use linear or semi-linear algorithms, embodiments of the self-calibration technique may use a nonlinear least squares optimization technique to infer the parameters. In addition, a technique is described for initializing the parameters for the nonlinear optimization. Embodiments of the self-calibration technique may be robust (i.e., may generally produce reliable results), and can make full use of prior knowledge if available. In addition, embodiments of the nonlinear self-calibration technique work for both constant focal length and varying focal length. Embodiments of the nonlinear self-calibration technique may use prior knowledge of the camera intrinsic parameters (e.g., focal length). For instance, if the user knows the focal length or if the focal length is known through metadata of the captured images in the sequence, the known focal length may be used in the formulation to provide reliable calibration results (e.g., motion parameters). However, having such prior knowledge would not make much difference in most conventional linear self-calibration methods. Embodiments of the nonlinear self-calibration technique may be robust and efficient when compared to conventional self-calibration techniques. In particular, the nonlinear optimization problem that is solved may be sparse and may be implemented efficiently.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive reconstruction algorithm implemented by a general SFM technique. Example embodiments of an adaptive reconstruction algorithm that may be implemented in a general SFM technique and that leverages the nonlinear self-calibration techniques are described herein. Embodiments of the nonlinear self-calibration technique may, for example, be used in the adaptive reconstruction algorithm to obtain a metric (Euclidian) reconstruction where otherwise only a projective reconstruction could be obtained. However, embodiments of the nonlinear self-calibration technique may be used in other SFM applications or techniques, or in any other application or technique that requires a self-calibration operation to be performed on input image(s).
- Embodiments of the nonlinear self-calibration technique may allow a metric (Euclidian) reconstruction to be obtained where otherwise only a projective reconstruction could be obtained. A projective reconstruction may be unfit for many practical applications. For instance, it is difficult if not impossible to insert a virtual object into a moving video using a projective reconstruction.
- FIG. 1 is a high-level flowchart of a nonlinear self-calibration technique, according to at least some embodiments. As indicated at 10, N input images and a projective reconstruction for each image may be obtained. As indicated at 20, at least two sets of initial values may be determined for an equation to be optimized according to a nonlinear optimization technique to generate a metric reconstruction for the set of N images. As indicated at 30, the equation may then be optimized using each set of initial values according to the nonlinear optimization technique. As indicated at 40, the result with the smaller cost may be selected. As indicated at 50, the metric reconstruction is output. In at least some embodiments, the output may include, but is not limited to, camera intrinsic parameters (e.g., focal length) and camera motion parameters (e.g., rotation and translation values) for the N images.
- The elements of FIG. 1 are discussed in more detail below.
- Embodiments of the nonlinear self-calibration technique may address a problem in camera motion estimation: determining the intrinsic parameters of the cameras, such as focal length. There are two general methods for obtaining intrinsic camera parameters from images. One method is conventional calibration, where the camera intrinsic parameters are determined from one or more captured images of a known calibration target or known properties of the scene such as vanishing points of orthogonal directions. The other method is generally referred to as self-calibration. In a self-calibration method, the camera intrinsic parameters are determined directly from constraints on the internal and/or external parameters. Self-calibration is generally more useful in practice because a calibration target or known properties of the scene are typically not available.
- Assume N input images and that for each image a 3×4 projection matrix has been obtained:
$$P_i \in \mathbb{R}^{3\times 4}, \quad i = 1, 2, \ldots, N.$$
- A goal of self-calibration is to find a 4×4 matrix $H \in \mathbb{R}^{4\times 4}$ such that $P_i H$ is a metric reconstruction. Mathematically, this means that there exists a set of upper triangular matrices $K_i \in \mathbb{R}^{3\times 3}$ with $K_i(2,1) = K_i(3,1) = K_i(3,2) = 0$, rotation matrices $R_i \in SO(3)$, and translation vectors $T_i \in \mathbb{R}^{3}$ such that:
$$P_i H \sim K_i [R_i, T_i], \quad i = 1, 2, \ldots, N \qquad \text{(B1)}$$
- where $\sim$ indicates equality up to a scale. Note that solving $T_i$ jointly with $K_i$ and $R_i$ does not add any additional constraint compared to solving $K_i$ and $R_i$ alone. In other words, equation (B1) is equivalent to the following reduced version where $T_i$ has been dropped:
$$P_i H_1 \sim K_i R_i, \quad i = 1, 2, \ldots, N \qquad \text{(B2)}$$
- where $H_1 \in \mathbb{R}^{4\times 3}$ is the left 4×3 part of $H$. Further note that there is a generic ambiguity on $R_i$ in the sense that if $(H_1, R_i)$ satisfies equation (B2), then $(H_1 R, R_i R)$ satisfies the same equation where $R$ is an arbitrary 3×3 rotation matrix. Without loss of generality, $R_1$ is chosen to be the identity rotation. Also note that $P_i$ contains a projective ambiguity. In order to at least partially fix the ambiguity, $P_1$ may be chosen to be $[I, 0]$. In the following discussion, it is assumed that $P_1$ has this expression.
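The claim that dropping the translation loses no constraint can be seen by splitting H into columns. The following is an illustrative derivation only; the names $h_4$ (the last column of H) and $\mu_i$ (the unknown scale) are introduced here and do not appear in the original text, and $K_i$ is assumed to be invertible.

```latex
H = [\,H_1,\; h_4\,], \qquad
P_i H = [\,P_i H_1,\; P_i h_4\,] \sim K_i [\,R_i,\; T_i\,]
\;\Longleftrightarrow\;
\begin{cases}
P_i H_1 = \mu_i K_i R_i \\
P_i h_4 = \mu_i K_i T_i
\end{cases}
\qquad \mu_i \neq 0
```

Once the first condition holds, choosing $T_i = \mu_i^{-1} K_i^{-1} P_i h_4$ satisfies the second for any $h_4$, so the last column places no restriction on $K_i$ or $R_i$.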
- If $K_i$ is allowed to vary arbitrarily, the problem is not well-defined. For instance, for any given $H_1 \in \mathbb{R}^{4\times 3}$, a decomposition similar to the QR decomposition may be applied to $P_i H_1$ to find an upper triangular matrix and a rotation matrix that satisfy the constraint. Embodiments of the self-calibration technique may exploit the assumptions on $K_i$ to arrive at interesting solutions. In embodiments the following assumptions may be made about the camera intrinsic matrix $K_i$:
-
- The principal point is known, which is typically but not necessarily at the center of the image. The principal point may be different for different images.
- The pixel skew is 0 (the pixel grid is perpendicular).
- The pixel aspect ratio is known.
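Under the three assumptions above, one common way to write the intrinsic matrix separates the known quantities from the unknown focal length. This parameterization is an illustrative assumption rather than something stated in the original text; the symbols $c_{x,i}$, $c_{y,i}$ (principal point), $a$ (pixel aspect ratio), and $C_i$ are names introduced here.

```latex
K_i =
\begin{bmatrix} f_i & 0 & c_{x,i} \\ 0 & a f_i & c_{y,i} \\ 0 & 0 & 1 \end{bmatrix}
=
\underbrace{\begin{bmatrix} 1 & 0 & c_{x,i} \\ 0 & a & c_{y,i} \\ 0 & 0 & 1 \end{bmatrix}}_{C_i \ \text{(known)}}
\operatorname{diag}\{f_i, f_i, 1\}
```

With this reading, left-multiplying a projection matrix by $C_i^{-1}$ removes the known part and leaves the focal length as the only unknown intrinsic parameter.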
- Note that embodiments of the self-calibration technique can be generalized to cases where different assumptions are made. Under these assumptions, the effect of principal point, pixel skew, and pixel aspect ratio on both $P_i$ and $K_i$ can be undone, and a simpler formulation may be derived:
$$\bar{P}_i H_1 \sim \operatorname{diag}\{f_i, f_i, 1\}\, R_i, \quad i = 1, 2, \ldots, N \qquad \text{(B3)}$$
- where $f_i$ is the focal length of the i-th camera, $\bar{P}_i \in \mathbb{R}^{3\times 4}$ is $P_i$ modulo the principal point, pixel skew, and pixel aspect ratio, and $\operatorname{diag}\{a, b, c\}$ is a 3×3 diagonal matrix:
$$\operatorname{diag}\{a, b, c\} = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} \qquad \text{(B4)}$$
- Equation (B3) may be examined for the case of i=1. Since $P_1 = [I, 0]$ and $R_1 = I$, the following may be obtained:
$$H_{11} \sim \operatorname{diag}\{f_1, f_1, 1\}, \qquad \text{(B5)}$$
- where $H_{11}$ is the top 3×3 part of $H_1$. Without loss of generality, the following may be chosen:
$$H_{11} = \operatorname{diag}\{f_1, f_1, 1\} \qquad \text{(B6)}$$
- Note that in general $P_i$ is noisy, so an exact solution to equation (B3) may not exist. By choosing $H_{11}$ with this particular form, a bias towards the first image is created since the equation is always satisfied for i=1.
- Equation (B3) becomes:
$$\bar{P}_{i1} \operatorname{diag}\{f_1, f_1, 1\} + \bar{P}_{i2} H_{21} \sim \operatorname{diag}\{f_i, f_i, 1\}\, R_i, \quad i = 1, 2, \ldots, N \qquad \text{(B7)}$$
- where $\bar{P}_{i1}$ and $\bar{P}_{i2}$ are the left 3×3 part and the right 3×1 part of $\bar{P}_i$, respectively, and $H_{21}$ is the bottom 1×3 part of $H_1$. An auxiliary variable $\lambda_i$ may be introduced to convert the equality up to a scale in equation (B7) into an exact equality as follows:
$$\bar{P}_{i1} \operatorname{diag}\{f_1, f_1, 1\} + \bar{P}_{i2} H_{21} = \lambda_i \operatorname{diag}\{f_i, f_i, 1\}\, R_i, \quad i = 1, 2, \ldots, N \qquad \text{(B8)}$$
- The self-calibration problem becomes solving for $H_{21}$ and $\lambda_i$, $f_i$, $R_i$ for i=1, 2, . . . , N in equation (B8).
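The step from equation (B3) to equation (B7) is a block matrix product. Writing it out, with $\bar{P}_i$ denoting $P_i$ modulo the principal point, pixel skew, and pixel aspect ratio, and using the choice of $H_{11}$ from equation (B6):

```latex
\bar{P}_i H_1
= \begin{bmatrix} \bar{P}_{i1} & \bar{P}_{i2} \end{bmatrix}
  \begin{bmatrix} H_{11} \\ H_{21} \end{bmatrix}
= \bar{P}_{i1} H_{11} + \bar{P}_{i2} H_{21}
= \bar{P}_{i1} \operatorname{diag}\{f_1, f_1, 1\} + \bar{P}_{i2} H_{21}
```

Only the 1×3 row $H_{21}$ of $H_1$ remains unknown after this substitution, which is why the later steps concentrate on computing $H_{21}$.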
- Some prior knowledge on the focal length may be assumed. For instance, if the lens and camera that are used to capture the image are known, an approximate focal length can be computed from the focal length of the lens and parameters of the camera sensor. The lens information may, for example, be obtained from image/video metadata. In at least some embodiments, if the lens and/or the camera are not known, since many if not most scenes where people need camera tracking are captured using relatively wide-angle lenses, it may be assumed that the focal length is in the range from 24 mm to 35 mm (35 mm equivalent). Extending the self-calibration technique to the case where there is no prior knowledge of the focal length is discussed in the section titled No prior knowledge on focal length.
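As a rough illustration of turning such a prior into the pixel units that a projection matrix works in, the sketch below converts a 35 mm equivalent focal length to pixels. It assumes the equivalence is measured against the 36 mm width of a 35 mm film frame (some conventions use the frame diagonal instead); the function names and the 1920-pixel image width are hypothetical.

```python
FULL_FRAME_WIDTH_MM = 36.0  # width of a 35 mm film frame (assumed reference)


def focal_35mm_to_pixels(focal_35mm_equiv, image_width_px):
    """Approximate focal length in pixels from a 35 mm equivalent focal length."""
    return focal_35mm_equiv * image_width_px / FULL_FRAME_WIDTH_MM


def default_focal_range_pixels(image_width_px, lo_mm=24.0, hi_mm=35.0):
    """Pixel-unit version of the 24 mm to 35 mm default range mentioned in the text."""
    return (focal_35mm_to_pixels(lo_mm, image_width_px),
            focal_35mm_to_pixels(hi_mm, image_width_px))


# Hypothetical 1920-pixel-wide frame.
print(default_focal_range_pixels(1920))
```

The resulting value would serve only as the approximate focal length from which the optimization starts.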
- Two cases are presented below: constant focal length for the entire sequence, and varying focal length.
- In the case of constant focal length, fi is assumed to be the same for all the images, and may be denoted by f. The self-calibration problem may be solved according to an optimization process. In at least some embodiments, the following cost function may be optimized:
-
- A reason for using this type of cost function is that the components of Ri are all at the same scale (between −1 and 1), so summing the residuals over i is meaningful. Since equation (B9) has the form of a nonlinear least squares problem, in at least some embodiments the Levenberg-Marquardt algorithm may be used to optimize the cost.
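The cost function itself is not reproduced in this text, so the sketch below uses an assumed residual that is merely consistent with equation (B8) and with the remark that the residual is compared against the entries of Ri: for each image, D⁻¹(P̄i1 D + P̄i2 H21)/λi − Ri with D = diag{f, f, 1}. The function names and all numeric values are hypothetical, and this is not claimed to be the exact form of equation (B9).

```python
import math


def scale_cols(M, d):
    """M @ diag(d) for a 3x3 matrix given as a list of rows."""
    return [[M[r][c] * d[c] for c in range(3)] for r in range(3)]


def scale_rows(d, M):
    """diag(d) @ M for a 3x3 matrix given as a list of rows."""
    return [[d[r] * M[r][c] for c in range(3)] for r in range(3)]


def residuals(f, H21, lambdas, rotations, Pbars):
    """Assumed residual: D^-1 (Pbar_i1 D + Pbar_i2 H21) / lambda_i - R_i, D = diag{f, f, 1}."""
    d, d_inv = (f, f, 1.0), (1.0 / f, 1.0 / f, 1.0)
    out = []
    for lam, R, P in zip(lambdas, rotations, Pbars):
        P1 = [row[:3] for row in P]   # left 3x3 part
        P2 = [row[3] for row in P]    # right 3x1 part
        M = scale_cols(P1, d)
        M = [[M[r][c] + P2[r] * H21[c] for c in range(3)] for r in range(3)]
        M = scale_rows(d_inv, M)
        out.extend(M[r][c] / lam - R[r][c] for r in range(3) for c in range(3))
    return out


def cost(f, H21, lambdas, rotations, Pbars):
    """Sum of squared residuals over all images."""
    return sum(r * r for r in residuals(f, H21, lambdas, rotations, Pbars))


# Synthetic camera built to satisfy (B8) exactly (all numbers are made up).
f_true, H21_true, lam_true = 800.0, [0.001, -0.002, 0.5], 2.0
a = 0.3
R_true = [[math.cos(a), 0.0, math.sin(a)], [0.0, 1.0, 0.0], [-math.sin(a), 0.0, math.cos(a)]]
P2 = [0.1, -0.2, 0.05]
rhs = scale_rows((f_true, f_true, 1.0), [[lam_true * v for v in row] for row in R_true])
P1 = scale_cols([[rhs[r][c] - P2[r] * H21_true[c] for c in range(3)] for r in range(3)],
                (1.0 / f_true, 1.0 / f_true, 1.0))
Pbar = [P1[r] + [P2[r]] for r in range(3)]

print(cost(f_true, H21_true, [lam_true], [R_true], [Pbar]))        # essentially zero
print(cost(1.2 * f_true, H21_true, [lam_true], [R_true], [Pbar]))  # clearly positive
```

A residual function of this shape could be handed to a Levenberg-Marquardt solver once the rotations are given a minimal parameterization (for example, rotation vectors); that wiring is omitted here.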
- In order to use the Levenberg-Marquardt algorithm, initial values for all the unknowns are needed. Prior knowledge on the focal length may be used here. Let f be the approximate focal length. Good initial values for H21, Ri and λi are also needed. A conventional algorithm for computing H21 exists. However, the conventional algorithm only gives a partial solution. More precisely, there are two solutions for H21, and the conventional algorithm only computes one of the two solutions. This makes the conventional algorithm unsuitable for the nonlinear optimization problem presented herein because the conventional algorithm may pick the wrong solution for H21 from the two solutions, and a nonlinear optimization starting from the wrong solution may not converge to the correct solution for the nonlinear optimization problem.
- The following describes an algorithm for computing the two solutions for H21 that may be used in at least some embodiments. A pair of projection matrices is chosen, one of which is that of the first image. The choice of the other projection matrix may be important. In at least some embodiments, the camera that is farthest away from the first image in time may be chosen. Without loss of generality, assume (P1, P2) are chosen. The following is computed:
-
- There exists a rotation matrix $R_s$ such that:
$$R_s t_2 = [\,\|t_2\|, 0, 0\,]^T \qquad \text{(B11)}$$
- The following is computed:
-
- The two solutions for H21 are given by:
-
- where W1, W2, and W3 are the rows of W: WT=[W1, W2, W3]T. It can be verified that the two solutions are both valid. The two solutions correspond to the choice of the sign of P2. Since P2 is defined only up to a scale, which can be either positive or negative, two solutions for H21 are obtained.
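Equation (B11) only asserts that a suitable rotation exists. One possible construction, shown here as a sketch and not as the method of the original text, builds the rows of the rotation from the normalized vector and two vectors orthogonal to it. The function name is hypothetical, the input vector must be nonzero, and the result is not unique (any further rotation about the x-axis also works).

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    """Scale a nonzero 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def rotation_aligning_to_x(t):
    """Return a 3x3 rotation R (list of rows) with R t = [||t||, 0, 0]^T."""
    r1 = normalize(t)
    # Use the coordinate axis least aligned with t to avoid a degenerate cross product.
    k = min(range(3), key=lambda i: abs(r1[i]))
    axis = [1.0 if i == k else 0.0 for i in range(3)]
    r2 = normalize(cross(r1, axis))   # orthogonal to t
    r3 = cross(r1, r2)                # completes a right-handed orthonormal frame
    return [r1, r2, r3]


t2 = [1.0, 2.0, 2.0]  # made-up vector with norm 3
R_s = rotation_aligning_to_x(t2)
print([sum(R_s[r][c] * t2[c] for c in range(3)) for r in range(3)])
```

Because the rows form a right-handed orthonormal frame, the determinant is +1, so the result is a proper rotation rather than a reflection.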
- In at least some embodiments, Ri and λi may be computed as follows. For a given H21, a QR decomposition may be computed as follows:
-
- where $A_i$ is a 3×3 upper triangular matrix and $\hat{R}_i$ is a 3×3 rotation matrix. In at least some embodiments, the technique sets $\lambda_i = A_i(3, 3)$ and uses $\hat{R}_i$ as the initial value for $R_i$.
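A decomposition of a 3×3 matrix into an upper triangular factor times a rotation can be sketched with Gram-Schmidt orthogonalization applied from the last row upward. The matrix that the original text decomposes is given by an equation not reproduced here, so the sketch simply takes an arbitrary full-rank 3×3 input M; the function names are hypothetical, and the sign handling (flipping M when its determinant is negative, which is permitted when equality is only up to scale) is an assumption.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rq3(M):
    """Factor full-rank 3x3 M (list of rows) as A @ R: A upper triangular with
    positive diagonal, R with orthonormal rows (Gram-Schmidt from the last row up)."""
    m1, m2, m3 = M
    a33 = math.sqrt(dot(m3, m3))
    r3 = [x / a33 for x in m3]
    a23 = dot(m2, r3)
    u = [x - a23 * y for x, y in zip(m2, r3)]
    a22 = math.sqrt(dot(u, u))
    r2 = [x / a22 for x in u]
    a13, a12 = dot(m1, r3), dot(m1, r2)
    v = [x - a13 * y - a12 * z for x, y, z in zip(m1, r3, r2)]
    a11 = math.sqrt(dot(v, v))
    r1 = [x / a11 for x in v]
    return [[a11, a12, a13], [0.0, a22, a23], [0.0, 0.0, a33]], [r1, r2, r3]


def initial_rotation_and_scale(M):
    """Return (lambda_i, R_hat_i) with lambda_i = A_i(3, 3) and det(R_hat_i) = +1."""
    sign = 1.0 if det3(M) > 0 else -1.0   # flip M so that R is a proper rotation
    A, R_hat = rq3([[sign * x for x in row] for row in M])
    return sign * A[2][2], R_hat
```

If M happens to be an exact multiple of a rotation, the triangular factor reduces to that multiple times the identity, and its (3, 3) entry is the scale.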
- The above provides initial values for H21, Ri, and λi. Equation (B9) may be optimized, for example using a Levenberg-Marquardt technique. Since there are two solutions for H21, there are two sets of initial values. In at least some embodiments, two optimizations are performed, one using each set of initial values. The result with the smaller cost may be chosen. Note that equation (B9) has a sparse form, and can be optimized efficiently using a sparse solver.
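The run-twice-and-keep-the-better strategy can be sketched independently of the actual cost function. In the sketch below, `best_of_initializations` is a hypothetical helper, and a toy one-dimensional cost with two basins and plain gradient descent stand in for equation (B9) and Levenberg-Marquardt; the toy is chosen only to show that a single starting point can end in the wrong basin.

```python
def best_of_initializations(optimize, initial_sets):
    """Run `optimize` from every initial set; return the (params, cost) with least cost."""
    results = [optimize(init) for init in initial_sets]
    return min(results, key=lambda result: result[1])


# Toy stand-in for the real problem: a 1-D cost with two basins and a
# plain gradient descent in place of Levenberg-Marquardt.
def toy_cost(x):
    return (x * x - 1.0) ** 2 + 0.3 * (x + 1.0) ** 2


def toy_optimize(x0, step=0.05, iterations=500):
    x = x0
    for _ in range(iterations):
        gradient = 4.0 * x * (x * x - 1.0) + 0.6 * (x + 1.0)
        x -= step * gradient
    return x, toy_cost(x)


params, cost = best_of_initializations(toy_optimize, [1.0, -1.2])
print(params, cost)
```

Starting only from the first initial value leaves the toy optimizer in the shallower basin, which mirrors the risk of computing just one of the two solutions for H21.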
- In the varying focal length case, the focal length may change from image to image. In at least some embodiments, a generalization of the algorithm in the section titled Constant focal length may be used for the varying focal length case. Again, without loss of generality, P1 and P2 are chosen to compute H21. The following is computed:
-
- and the rotation matrix Rs is found such that:
$$R_s t_2 = [\,\|t_2\|, 0, 0\,]^T \qquad \text{(B16)}$$
- The following is computed:
-
- The two solutions for H21 are given by:
-
- In at least some embodiments, once H21 is computed, Ri and λi can be computed using the same algorithm presented in the section titled Constant focal length. However, the optimization may be modified to optimize over fi as well:
-
- Embodiments of the nonlinear self-calibration technique as described herein may be robust to error in the initial estimate of the focal length. The optimization tends to converge even if the focal length estimate is off by as much as 20%. Since in practice accurate prior knowledge may often not be available or attainable, this robustness is advantageous. The robustness of the nonlinear self-calibration technique also suggests a way to handle cases where there is no prior knowledge on the focal length. Note that the focal length has a bounded domain in $\mathbb{R}$. In at least some embodiments, a brute-force search may be used. Let $f_{\min}$ and $f_{\max}$ be the minimum and maximum focal length. In the constant focal length case, the range may be divided into M bins as follows:
-
- Each fi may be used as the initial value for f, and the optimization may be performed. The result with the least cost may be returned.
- For the varying focal length case, the same range may be divided into M bins, and the optimization may be performed for all possible pairs (fi, fj), where i=1, 2, . . . , M and j=1, 2, . . . , M, used as the initial values for (f1, f2). The result with the least cost may be returned.
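The brute-force search over initial focal lengths can be sketched as follows. The binning rule is not reproduced in this text, so equal-width bins represented by their centers are assumed here; the function names are hypothetical, and `optimize` stands for any routine that runs the nonlinear optimization from the given initial focal length(s) and returns a (params, cost) pair.

```python
import itertools


def focal_candidates(f_min, f_max, m):
    """Centers of m equal-width bins over [f_min, f_max] (binning rule assumed)."""
    width = (f_max - f_min) / m
    return [f_min + (k + 0.5) * width for k in range(m)]


def search_constant_focal(optimize, f_min, f_max, m):
    """Try every candidate as the initial f; keep the least-cost (params, cost)."""
    runs = (optimize(f0) for f0 in focal_candidates(f_min, f_max, m))
    return min(runs, key=lambda result: result[1])


def search_varying_focal(optimize, f_min, f_max, m):
    """Try all m*m pairs as initial (f1, f2); keep the least-cost (params, cost)."""
    candidates = focal_candidates(f_min, f_max, m)
    runs = (optimize(f1, f2) for f1, f2 in itertools.product(candidates, repeat=2))
    return min(runs, key=lambda result: result[1])
```

The varying case costs M² optimizations instead of M, so M would typically be kept small; the tolerance to initial error noted above is what makes a coarse grid plausible.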
- In contrast to conventional self-calibration techniques, embodiments of the self-calibration technique described herein find two solutions to H21 that correspond to the two different signs of P2. Finding only one solution, as is done in conventional self-calibration techniques, may result in the wrong solution being picked for at least the reason that the sign of P2 is inconsistent. In addition, embodiments of the self-calibration technique described herein employ a nonlinear optimization to further refine the solution. This makes the self-calibration technique robust to errors in the initial guess of the focal length.
- Embodiments of the nonlinear self-calibration technique may, for example, be used in an adaptive reconstruction algorithm that starts by adaptively determining and reconstructing an initial set of keyframes that covers only a part of an image sequence (e.g., a set of spaced frames somewhere in the middle of the sequence), and that incrementally and adaptively determines and reconstructs additional keyframes to fully cover the image sequence. In at least some embodiments, the adaptive reconstruction algorithm then adaptively determines and reconstructs optimization keyframes to provide a better reconstruction. The rest of the frames in the sequence may then be reconstructed based on the determined and reconstructed keyframes. At least some embodiments of the adaptive reconstruction algorithm may be configured to handle both cases where the intrinsic camera parameters (e.g., focal length) are known (e.g., via user input or via metadata provided with the input image sequence) and cases where the intrinsic camera parameters are not known. The first case may be referred to herein as the calibrated case, and the second case may be referred to herein as the uncalibrated case. In at least some embodiments, in the calibrated case, a Euclidian (or metric) reconstruction technique may be applied. In at least some embodiments, in the uncalibrated case, a projective reconstruction technique may at least initially be applied. The nonlinear self-calibration technique as described herein may be applied to produce a Euclidian (or metric) reconstruction in the uncalibrated case.
- The adaptive reconstruction algorithm may, for example, be used in embodiments of a robust system for estimating camera motion (rotation and translation) in image sequences, a problem known in computer vision as Structure from Motion (SFM). Embodiments of a general 3D reconstruction technique, which may also be referred to as a general SFM technique, are generally directed to performing reconstruction for image sequences in which the camera motion includes a non-zero translation component. In other words, the camera has moved when capturing the image sequence. The general SFM technique estimates the rotation and translation components of the camera motion, and may also estimate the camera intrinsic parameters (e.g., focal length) if not known. In addition, the general SFM technique may be generally directed to performing reconstruction for image sequences in which the scene does not contain a dominant plane.
- FIG. 2 is a high-level flowchart of the general SFM technique, according to at least some embodiments. As indicated at 100, an input image sequence may be obtained. The image sequence may, for example, be a video taken by a moving video camera or a set of images taken with a still camera. As indicated at 102, a feature tracking technique may be applied to establish point trajectories over time in the input image sequence. Embodiments of a feature tracking technique that may be used in at least some embodiments are described later in this document. Output of the feature tracking technique is a set of point trajectories. As indicated at 104, an initialization technique may be performed to determine and reconstruct a set of initial keyframes covering a portion of the image sequence according to the point trajectories. Input to the initialization technique includes at least the set of point trajectories. Output of the initialization technique is a set of initial keyframes and the initial reconstruction.
- Elements 106 through 110 are a keyframe reconstruction loop that incrementally and adaptively determines and reconstructs additional keyframes to fully cover the image sequence. As indicated at 106, a new keyframe is determined and reconstructed. In the calibrated case, a Euclidean reconstruction technique can be performed, since the camera intrinsic parameters are known. In the uncalibrated case, a projective reconstruction technique may be performed. As indicated at 108, in the uncalibrated case, a self-calibration technique may be applied to produce a Euclidean (or metric) reconstruction for the frame, if there are enough frames to perform the self-calibration. At 110, if there are more keyframes to be reconstructed, then the method returns to 106 to add a next keyframe. Otherwise, the method goes to element 112. As indicated at 112, an opt-keyframe technique may then be performed to determine and reconstruct optimization keyframes to improve the quality of the reconstruction. As indicated at 114, non-keyframes (frames that have not yet been included in the reconstruction) may be reconstructed. As indicated at 116, final processing may be performed. As indicated at 118, at least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence may be output.
- Elements of the general SFM technique shown in FIG. 2 are discussed in more detail below.
- As indicated at 102 of FIG. 2, given an input image sequence, embodiments of the general SFM technique may first perform feature tracking to establish point trajectories over time. A basic idea of feature tracking is to find the locations of the same point in subsequent video frames. In general, a point should be tracked as long and as accurately as possible, and as many points as possible should be tracked.
- In at least some embodiments, the general SFM technique may use an implementation of the Lucas-Kanade-Tomasi algorithm to perform feature tracking. In these embodiments, for every point at time t, a translational model may be used to track against the previous video frame (at time t−1), and an affine model may be used to track against the reference video frame at time t0 (t0 may vary according to the point). The result of feature tracking is a set of point trajectories. Each point trajectory includes the two-dimensional (2D) locations of the “same” point in a contiguous set of frames. Let xi,j denote the 2D location of the i-th point in the j-th image. Since not all of the points are present in all of the images, xi,j is undefined for some combinations of i and j. To simplify the notation, a binary characteristic function ψi,j may be used: ψi,j=1 if the i-th point is present on the j-th image; otherwise, ψi,j=0. Through ψi,j, quantities such as ψi,jxi,j may be used even if xi,j is undefined.
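The role of the characteristic function can be illustrated with a small container for point trajectories. This is a sketch only: the class and function names are hypothetical, and treating ψi,j xi,j as (0, 0) where xi,j is undefined is the convention assumed here so that sums over all points remain well defined.

```python
class PointTrajectories:
    """Sparse store of 2D observations x[i, j] of point i in image j."""

    def __init__(self):
        self._observations = {}

    def add(self, point, image, xy):
        self._observations[(point, image)] = (float(xy[0]), float(xy[1]))

    def psi(self, point, image):
        """Characteristic function: 1 if the point is present in the image, else 0."""
        return 1 if (point, image) in self._observations else 0

    def masked(self, point, image):
        """psi * x, taken to be (0, 0) where x is undefined."""
        return self._observations.get((point, image), (0.0, 0.0))


def centroid(trajectories, points, image):
    """Mean location of the points visible in `image`, written with psi-weighted sums."""
    count = sum(trajectories.psi(p, image) for p in points)
    if count == 0:
        return None
    sx = sum(trajectories.masked(p, image)[0] for p in points)
    sy = sum(trajectories.masked(p, image)[1] for p in points)
    return (sx / count, sy / count)
```

Writing sums this way lets later formulas range over all (i, j) pairs without special-casing missing observations.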
- Note that various feature tracking algorithms and/or various matching paradigms, such as detecting and matching robust image features, may be used in various embodiments. The general SFM technique can work with any feature tracking technique that computes point trajectories.
- In at least some embodiments, the point trajectories are input to the rest of the general SFM technique; the input image sequence may not be referenced after feature tracking.
- As indicated at 104 of FIG. 2, an initialization technique may be performed in an adaptive reconstruction algorithm to determine and reconstruct a set of initial keyframes covering a portion of the image sequence according to the point trajectories. As previously noted, at least some embodiments of the general SFM technique may implement an incremental approach that adds one or more frames to the reconstruction at a time. To accomplish this, an initial reconstruction may need to be generated. A goal of the initialization technique is to compute an initial reconstruction from a subset of frames in the image sequence. In at least some embodiments, two-view reconstruction algorithms may be used. Since the general SFM technique is incremental, the quality of the initial reconstruction may be important in generating a quality overall reconstruction. In at least some embodiments, to help achieve a quality initial reconstruction, two initial frames that best satisfy requirements of the initial reconstruction algorithm may be determined.
- In at least some embodiments of an initialization technique, input to the initialization technique includes at least the set of point trajectories. Two initial keyframes may be selected. A reconstruction may be performed from the two initial keyframes. Additional keyframes between the initial keyframes may be determined and reconstructed. A global optimization of the reconstruction may be performed. One or more outlier points may be determined and removed. One or more inlier points may be determined and recovered. Note that outlier and inlier points correspond to particular point trajectories, and that the entire point trajectory is removed (for outlier points) or recovered (for inlier points). If more than a threshold number of inliers were recovered, another global optimization may be performed as indicated at 280. Otherwise, the initialization technique is done. Output of the initialization technique is a set of initial keyframes and the initial reconstruction.
- After initialization, additional keyframes may be determined and reconstructed to cover the image sequence. In at least some embodiments of the general SFM technique, a keyframe reconstruction loop may be used to enlarge the initial reconstruction to cover the entire image sequence, as shown in elements 106-110 of FIG. 2. The keyframe reconstruction loop may add keyframes in an incremental and adaptive fashion, adding one keyframe at a time until the entire video sequence is covered. Note that this loop does not add all the frames in the input image sequence. Instead, an adaptive algorithm is used to select particular frames to add. In at least some embodiments, the additional keyframes may be selected from the set of keyframes that were previously selected. In at least some embodiments, the initial reconstruction may cover a portion of the image sequence, and the additional keyframes may be added one at a time at each end of the current reconstruction, working outwards and alternating between ends.
- FIG. 3 is a flowchart of an adaptive technique for iteratively selecting and reconstructing additional keyframes to fully cover the image sequence that may be used in a general adaptive reconstruction algorithm, for example as implemented by a general 3D SFM technique, according to at least some embodiments. At 300, if all keyframes have been processed, then the adaptive technique for iteratively selecting and reconstructing additional keyframes is done. Otherwise, the technique proceeds to element 310. As indicated at 310, a next keyframe may be determined according to an adaptive selection technique. As indicated at 320, the determined keyframe may be reconstructed and thus added to the current reconstruction. As indicated at 330, a global optimization may be performed on the current reconstruction. As indicated at 340, one or more outlier points may be determined and removed from the reconstruction. As indicated at 350, one or more inlier points may be determined and recovered (added to the reconstruction). At 360, if the number of inlier points that were added exceeds a threshold, then a global optimization may again be performed on the current reconstruction as indicated at 362. At 370, in the calibrated case, the current reconstruction is already a Euclidean reconstruction, so the technique returns to element 300 to determine if there are more keyframes to be processed. Otherwise, this is the uncalibrated case, and the reconstruction is a projective reconstruction. If there are enough frames to perform self-calibration at this point, then self-calibration may be performed as indicated at 372 to upgrade the projective reconstruction to a Euclidean reconstruction. Results of the self-calibration may be analyzed to determine if the results are acceptable. At 380, if the results of the self-calibration are accepted, the technique returns to element 300 to determine if there are more keyframes to be processed. Otherwise, the technique reverts to the reconstruction prior to the self-calibration attempt as indicated at 382, and the technique returns to element 300 to determine if there are more keyframes to be processed.
- In at least some embodiments, a self-calibration technique may be applied to upgrade a reconstruction from projective to Euclidean (metric). Note that self-calibration need not be applied in the calibrated case because the reconstruction is already metric. Once the reconstruction is Euclidean, self-calibration does not need to be performed. In at least some embodiments, self-calibration is only performed when the number of cameras in the current reconstruction reaches a certain threshold. The section titled Nonlinear Self-Calibration Technique describes a self-calibration technique that may be used in at least some embodiments. This section describes a few extra steps that may be taken in some embodiments to ensure that the results of the self-calibration technique are good and thus accepted.
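The accept-or-revert behavior at 372 through 382 of FIG. 3 can be sketched as a small wrapper. The acceptance test used here, a new total reprojection error no larger than a fixed factor times the old one, follows the criterion the text gives for FIG. 4; the function name, the callables `self_calibrate` and `total_reprojection_error`, and the default factor of 2.0 are hypothetical. Any global optimization and inlier recovery performed after self-calibration are assumed to happen inside `self_calibrate`.

```python
import copy


def try_self_calibration(reconstruction, self_calibrate, total_reprojection_error,
                         max_factor=2.0):
    """Attempt a projective-to-metric upgrade; keep it only if the error stays bounded.

    Returns (reconstruction_to_use, accepted).
    """
    error_before = total_reprojection_error(reconstruction)
    # Work on a copy so the original state can be returned on failure.
    candidate = self_calibrate(copy.deepcopy(reconstruction))
    error_after = total_reprojection_error(candidate)
    if error_after <= max_factor * error_before:
        return candidate, True
    return reconstruction, False
```

Working on a deep copy keeps the revert step trivial, at the cost of duplicating the reconstruction for each attempt.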
- FIG. 4 is a flowchart of a self-calibration technique that may be implemented in the adaptive technique for iteratively selecting and reconstructing additional keyframes, according to at least some embodiments. In at least some embodiments, before self-calibration, a total reprojection error is computed, as indicated at 500. Self-calibration is then performed, as indicated at 510. In at least some embodiments, a self-calibration technique as described in the section titled Nonlinear Self-Calibration Technique may be used. After self-calibration, a global optimization of the reconstruction may be performed, as indicated at 520. In at least some embodiments, a multi-view bundle adjustment technique as described in the section titled Optimization using multi-view bundle adjustment may be used. As indicated at 530, inlier points may be determined and recovered, for example as described in the section titled Inlier recovery. As indicated by 540, in at least some embodiments, the method may iterate between adding inliers and global optimization (e.g., multi-view bundle adjustment) until either no new inlier is added or the iteration count reaches a pre-defined threshold. At 540, when done, a new total reprojection error may be computed and compared to the total reprojection error that was previously computed at 500, as indicated at 550. At 560, the results of the comparison may be used to determine if the self-calibration was successful. In at least some embodiments, if the new total reprojection error is no more than a pre-defined factor of the total reprojection error computed before self-calibration, the self-calibration result is accepted as indicated at 570. Otherwise, the self-calibration step has failed, and the reconstruction is reverted back to the state before self-calibration, as indicated at 580.
- As indicated at 112 of FIG. 2, an opt-keyframe technique may be applied to a reconstruction for an image sequence to determine and reconstruct optimization keyframes to improve the quality of the reconstruction. In the opt-keyframe technique, additional frames, referred to herein as “opt-keyframes”, are determined and added to the reconstruction, and the reconstruction is again globally optimized. By adding more optimized frames and more optimized points, the quality of the reconstruction may be improved.
- In at least some embodiments of an opt-keyframe reconstruction technique, opt-keyframes may be determined and added to the reconstruction so that the total number of frames in the reconstruction satisfies a threshold. One or more bad (outlier) points may be determined according to one or more criteria and removed from the reconstruction. One or more good (inlier) points may be determined and recovered. Bad (outlier) points may again be determined according to one or more criteria and removed from the reconstruction. The reconstruction may then be globally optimized.
- In at least some embodiments, given the current reconstruction, a set of opt-keyframes may be computed that are uniformly spread in the entire sequence so that the total number of frames reaches a pre-defined threshold. The camera parameters for the newly selected opt-keyframes may be computed.
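The selection of uniformly spread opt-keyframes could, for example, be sketched as below. This is only one possible reading of "uniformly spread": it picks evenly spaced entries from the list of frames not yet in the reconstruction until the target total is met. The function name `select_opt_keyframes` and its parameters are hypothetical, and the disclosure does not specify this particular spacing rule.

```python
from typing import Iterable, List


def select_opt_keyframes(
    num_frames: int,
    reconstructed: Iterable[int],
    target_total: int,
) -> List[int]:
    """Pick extra frame indices so that the reconstruction reaches target_total.

    num_frames    -- length of the input sequence (frames are 0..num_frames-1)
    reconstructed -- indices of frames already in the reconstruction
    target_total  -- assumed pre-defined threshold on the total frame count
    """
    used = set(reconstructed)
    needed = target_total - len(used)
    if needed <= 0:
        return []  # threshold already satisfied

    candidates = [f for f in range(num_frames) if f not in used]
    if needed >= len(candidates):
        return candidates  # not enough frames left; take them all

    # Evenly spaced picks over the candidate list. Because step > 1 here,
    # consecutive indices are guaranteed to be distinct.
    step = len(candidates) / needed
    return [candidates[int((i + 0.5) * step)] for i in range(needed)]


if __name__ == "__main__":
    # Five keyframes already reconstructed in a 100-frame sequence; ask for ten.
    print(select_opt_keyframes(100, [0, 25, 50, 75, 99], 10))
```

The camera parameters for the returned frames would still have to be computed and optimized as described above; the sketch covers only the selection step.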
- As indicated at 114 of FIG. 2, non-keyframes (frames that have not yet been included in the reconstruction) may be reconstructed. In at least some embodiments of a non-keyframe reconstruction technique, all of the frames in the input sequence that are not included in the current reconstruction may be reconstructed. These frames may be referred to as non-keyframes. In at least some embodiments, all the frames in the reconstruction that include both keyframes and opt-keyframes are first reconstructed. In at least some embodiments, the non-keyframe reconstruction technique may work on adjacent pairs of keyframes until all the pairs of keyframes have been processed. In at least some embodiments, for each pair, all of the 3D points that are visible in both frames are collected. These points may then be used to compute the parameters for a camera between the two frames, for example as described below.
- As indicated at 116 of FIG. 2, final processing may be performed. In at least some embodiments, there may be two steps in the final processing. In at least some embodiments, the largest contiguous subset of frames in the reconstruction may be found. All the frames that are not in this subset, along with all the points that are not visible in any of the frames in the subset, may be removed from the reconstruction. In at least some embodiments, optionally, all of the frames and points in the reconstruction may be optimized (global optimization). In at least some embodiments, this optimization may be performed according to a refinement process that optimizes all the points and cameras together.
- As indicated at 118 of FIG. 2, at least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence may be output. Note that the reconstruction may have been cropped to the largest contiguous set of frames, as described in the section titled Final Processing. The output (at least the camera intrinsic parameters and the Euclidean motion parameters for the images in the input image sequence) of the general SFM technique described above may be used in a wide range of applications in different domains including but not limited to 3D image-based modeling and rendering, video stabilization, panorama stitching, video augmentation, vision based robot navigation, human-computer interaction, etc. For example, the camera intrinsic parameters and the Euclidean motion parameters determined from the video sequence using an embodiment of the general SFM technique as described herein may be used to insert a 3D object into a video sequence. The inserted 3D object moves with the motion of the camera to maintain a natural and believable positioning in the frames.
- Some embodiments may include a means for performing one or more of the various techniques described herein, including but not limited to the nonlinear self-calibration technique. For example, an SFM module may receive input specifying a set of point trajectories and generate as output structure and motion for a set of images or frames as described herein. The SFM module may, for example, apply the nonlinear self-calibration technique to convert a projective reconstruction to a metric (Euclidean) reconstruction. The SFM techniques described herein, including but not limited to the nonlinear self-calibration technique, and/or the SFM module may in some embodiments be implemented by a non-transitory, computer-readable storage medium and one or more processors (e.g., CPUs and/or GPUs) of a computing apparatus. The computer-readable storage medium may store program instructions executable by the one or more processors to cause the computing apparatus to perform one or more of the techniques as described herein, for example the nonlinear self-calibration technique. Other embodiments of the module(s) may be at least partially implemented by hardware circuitry and/or firmware stored, for example, in a non-volatile memory.
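The cropping to the largest contiguous set of frames noted above for final processing can be illustrated with a short sketch. It assumes that "contiguous" means consecutive frame indices and that point visibility is available as a mapping from a point identifier to the set of frames in which the point is seen. The function name `crop_to_largest_contiguous` and both data layouts are hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def crop_to_largest_contiguous(
    frames: Iterable[int],
    visibility: Dict[Hashable, Set[int]],
) -> Tuple[List[int], Dict[Hashable, Set[int]]]:
    """Keep the longest run of consecutive frame indices and the points seen in it.

    frames     -- indices of frames currently in the reconstruction
    visibility -- point id -> set of frame indices where the point is visible
    """
    ordered = sorted(set(frames))
    if not ordered:
        return [], {}

    best_start, best_len = 0, 1
    run_start = 0
    # Walk one past the end so the final run is also closed and measured.
    for i in range(1, len(ordered) + 1):
        run_ends = i == len(ordered) or ordered[i] != ordered[i - 1] + 1
        if run_ends:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = i

    kept = ordered[best_start:best_start + best_len]
    kept_set = set(kept)
    # Drop points that are not visible in any retained frame, and trim the
    # visibility sets of the remaining points to the retained frames.
    kept_points = {
        pid: seen & kept_set
        for pid, seen in visibility.items()
        if seen & kept_set
    }
    return kept, kept_points
```

When two runs have the same length this sketch keeps the earlier one; the disclosure does not say how such ties are resolved.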
- Embodiments of an SFM module, or of one or more modules that implement one or more of the techniques described herein including but not limited to the nonlinear self-calibration technique, may, for example, be implemented as a stand-alone application, as a module of an application, as a plug-in or plug-ins for applications including image or video processing applications, and/or as a library function or functions that may be called by other applications such as image processing or video processing applications. Embodiments of the module(s) may be implemented in any image or video processing application, or more generally in any application in which video or image sequences may be processed. Example applications in which embodiments may be implemented may include, but are not limited to, Adobe® Premiere® and Adobe® After Effects®. “Adobe,” “Adobe Premiere,” and “Adobe After Effects” are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other countries. An example module that may implement one or more of the SFM techniques as described herein is illustrated in FIG. 5. An example computer system on which the module may be implemented is illustrated in FIG. 6. Note that the module may, for example, be implemented in still cameras and/or video cameras.
- FIG. 5 illustrates an example module that may implement one or more of the SFM techniques, including but not limited to the nonlinear self-calibration technique, as illustrated in the accompanying Figures and described herein, according to at least some embodiments. Module 1700 may, for example, receive an input image sequence, or alternatively a set of point trajectories for the images in a sequence. Module 1700 then applies one or more of the techniques as described herein to generate structure, camera parameters, and motion. In at least some embodiments, module 1700 may obtain point trajectories for the sequence, as indicated at 1710. Module 1700 may then perform initialization to determine and reconstruct initial keyframes, as indicated at 1720. Module 1700 may then determine and reconstruct additional keyframes to cover the video sequence, as indicated at 1730. In at least some embodiments, module 1700 may apply an embodiment of the nonlinear self-calibration technique as described herein, for example in converting a projective reconstruction to a metric (Euclidean) reconstruction at element 1730. Module 1700 may then determine and reconstruct optimization keyframes, as indicated at 1740. Module 1700 may then reconstruct non-keyframes, as indicated at 1750. Module 1700 may then perform final processing, as indicated at 1760. In at least some embodiments, module 1700 may generate as output estimates of camera parameters and camera motion for the image sequence.
- Example applications of the SFM techniques as described herein may include one or more of, but are not limited to, video stabilization, video augmentation (augmenting an original video sequence with graphic objects), video classification, and robot navigation. In general, embodiments of one or more of the SFM techniques may be used to provide structure and motion to any application that requires or desires such output to perform some video- or image-processing task.
- Embodiments of the various techniques as described herein including but not limited to the nonlinear self-calibration technique may be executed on one or more computer systems, which may interact with various other devices. One such computer system is illustrated by FIG. 6. In different embodiments, computer system 2000 may be any of various types of devices, including, but not limited to, a personal computer system, desktop computer, laptop, notebook, or netbook computer, mainframe computer system, handheld computer, workstation, network computer, a camera, a video camera, a tablet or pad device, a smart phone, a set top box, a mobile device, a consumer device, video game console, handheld video game device, application server, storage device, a peripheral device such as a switch, modem, router, or in general any type of computing or electronic device.
- In the illustrated embodiment, computer system 2000 includes one or more processors 2010 coupled to a system memory 2020 via an input/output (I/O) interface 2030. Computer system 2000 further includes a network interface 2040 coupled to I/O interface 2030, and one or more input/output devices 2050, such as cursor control device 2060, keyboard 2070, display(s) 2080, and touch- or multitouch-enabled device(s) 2090. In some embodiments, it is contemplated that embodiments may be implemented using a single instance of computer system 2000, while in other embodiments multiple such systems, or multiple nodes making up computer system 2000, may be configured to host different portions or instances of embodiments. For example, in one embodiment some elements may be implemented via one or more nodes of computer system 2000 that are distinct from those nodes implementing other elements.
- In various embodiments, computer system 2000 may be a uniprocessor system including one processor 2010, or a multiprocessor system including several processors 2010 (e.g., two, four, eight, or another suitable number). Processors 2010 may be any suitable processor capable of executing instructions. For example, in various embodiments, processors 2010 may be general-purpose or embedded processors implementing any of a variety of instruction set architectures (ISAs), such as the x86, PowerPC, SPARC, or MIPS ISAs, or any other suitable ISA. In multiprocessor systems, each of processors 2010 may commonly, but not necessarily, implement the same ISA.
- In some embodiments, at least one processor 2010 may be a graphics processing unit. A graphics processing unit or GPU may be considered a dedicated graphics-rendering device for a personal computer, workstation, game console or other computing or electronic device. Modern GPUs may be very efficient at manipulating and displaying computer graphics, and their highly parallel structure may make them more effective than typical CPUs for a range of complex graphical algorithms. For example, a graphics processor may implement a number of graphics primitive operations in a way that makes executing them much faster than drawing directly to the screen with a host central processing unit (CPU). In various embodiments, the techniques disclosed herein may, at least in part, be implemented by program instructions configured for execution on one of, or parallel execution on two or more of, such GPUs. The GPU(s) may implement one or more application programmer interfaces (APIs) that permit programmers to invoke the functionality of the GPU(s). Suitable GPUs may be commercially available from vendors such as NVIDIA Corporation, ATI Technologies (AMD), and others.
- System memory 2020 may be configured to store program instructions and/or data accessible by processor 2010. In various embodiments, system memory 2020 may be implemented using any suitable memory technology, such as static random access memory (SRAM), synchronous dynamic RAM (SDRAM), nonvolatile/Flash-type memory, or any other type of memory. In the illustrated embodiment, program instructions and data implementing desired functions, such as those described above for embodiments of the various techniques as described herein, are shown stored within system memory 2020 as program instructions 2025 and data storage 2035, respectively. In other embodiments, program instructions and/or data may be received, sent or stored upon different types of computer-accessible media or on similar media separate from system memory 2020 or computer system 2000. Generally speaking, a computer-accessible medium may include storage media or memory media such as magnetic or optical media, e.g., disk or CD/DVD-ROM coupled to computer system 2000 via I/O interface 2030. Program instructions and data stored via a computer-accessible medium may be transmitted by transmission media or signals such as electrical, electromagnetic, or digital signals, which may be conveyed via a communication medium such as a network and/or a wireless link, such as may be implemented via network interface 2040.
- In one embodiment, I/O interface 2030 may be configured to coordinate I/O traffic between processor 2010, system memory 2020, and any peripheral devices in the device, including network interface 2040 or other peripheral interfaces, such as input/output devices 2050. In some embodiments, I/O interface 2030 may perform any necessary protocol, timing or other data transformations to convert data signals from one component (e.g., system memory 2020) into a format suitable for use by another component (e.g., processor 2010). In some embodiments, I/O interface 2030 may include support for devices attached through various types of peripheral buses, such as a variant of the Peripheral Component Interconnect (PCI) bus standard or the Universal Serial Bus (USB) standard, for example. In some embodiments, the function of I/O interface 2030 may be split into two or more separate components, such as a north bridge and a south bridge, for example. In addition, in some embodiments some or all of the functionality of I/O interface 2030, such as an interface to system memory 2020, may be incorporated directly into processor 2010.
- Network interface 2040 may be configured to allow data to be exchanged between computer system 2000 and other devices attached to a network, such as other computer systems, or between nodes of computer system 2000. In various embodiments, network interface 2040 may support communication via wired or wireless general data networks, such as any suitable type of Ethernet network, for example; via telecommunications/telephony networks such as analog voice networks or digital fiber communications networks; via storage area networks such as Fibre Channel SANs, or via any other suitable type of network and/or protocol.
- Input/output devices 2050 may, in some embodiments, include one or more display terminals, keyboards, keypads, touchpads, scanning devices, voice or optical recognition devices, or any other devices suitable for entering or retrieving data by one or more computer systems 2000. Multiple input/output devices 2050 may be present in computer system 2000 or may be distributed on various nodes of computer system 2000. In some embodiments, similar input/output devices may be separate from computer system 2000 and may interact with one or more nodes of computer system 2000 through a wired or wireless connection, such as over network interface 2040.
- As shown in FIG. 6, memory 2020 may include program instructions 2025, configured to implement embodiments of the various techniques as described herein, and data storage 2035, comprising various data accessible by program instructions 2025. In one embodiment, program instructions 2025 may include software elements of embodiments of the various techniques as illustrated in the above Figures. Data storage 2035 may include data that may be used in embodiments. In other embodiments, other or different software elements and data may be included.
- Those skilled in the art will appreciate that computer system 2000 is merely illustrative and is not intended to limit the scope of the various techniques as described herein. In particular, the computer system and devices may include any combination of hardware or software that can perform the indicated functions, including a computer, personal computer system, desktop computer, laptop, notebook, or netbook computer, mainframe computer system, handheld computer, workstation, network computer, a camera, a video camera, a set top box, a mobile device, network device, internet appliance, PDA, wireless phones, pagers, a consumer device, video game console, handheld video game device, application server, storage device, a peripheral device such as a switch, modem, router, or in general any type of computing or electronic device. Computer system 2000 may also be connected to other devices that are not illustrated, or instead may operate as a stand-alone system. In addition, the functionality provided by the illustrated components may in some embodiments be combined in fewer components or distributed in additional components. Similarly, in some embodiments, the functionality of some of the illustrated components may not be provided and/or other additional functionality may be available.
- Those skilled in the art will also appreciate that, while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them may be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software components may execute in memory on another device and communicate with the illustrated computer system via inter-computer communication.
Some or all of the system components or data structures may also be stored (e.g., as instructions or structured data) on a computer-accessible medium or a portable article to be read by an appropriate drive, various examples of which are described above. In some embodiments, instructions stored on a computer-accessible medium separate from computer system 2000 may be transmitted to computer system 2000 via transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link. Various embodiments may further include receiving, sending or storing instructions and/or data implemented in accordance with the foregoing description upon a computer-accessible medium. Accordingly, the present invention may be practiced with other computer system configurations.
- Various embodiments may further include receiving, sending or storing instructions and/or data implemented in accordance with the foregoing description upon a computer-accessible medium. Generally speaking, a computer-accessible medium may include storage media or memory media such as magnetic or optical media, e.g., disk or DVD/CD-ROM, volatile or non-volatile media such as RAM (e.g. SDRAM, DDR, RDRAM, SRAM, etc.), ROM, etc., as well as transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link.
- The various methods as illustrated in the Figures and described herein represent example embodiments of methods. The methods may be implemented in software, hardware, or a combination thereof. The order of the methods may be changed, and various elements may be added, reordered, combined, omitted, modified, etc.
- Various modifications and changes may be made as would be obvious to a person skilled in the art having the benefit of this disclosure. It is intended that the invention embrace all such modifications and changes and, accordingly, the above description is to be regarded in an illustrative rather than a restrictive sense.
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/724,973 US8942422B2 (en) | 2012-04-06 | 2012-12-21 | Nonlinear self-calibration for structure from motion (SFM) techniques |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261621365P | 2012-04-06 | 2012-04-06 | |
US13/724,973 US8942422B2 (en) | 2012-04-06 | 2012-12-21 | Nonlinear self-calibration for structure from motion (SFM) techniques |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130265443A1 true US20130265443A1 (en) | 2013-10-10 |
US8942422B2 US8942422B2 (en) | 2015-01-27 |
Family
ID=49291973
Family Applications (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/724,973 Active 2033-07-23 US8942422B2 (en) | 2012-04-06 | 2012-12-21 | Nonlinear self-calibration for structure from motion (SFM) techniques |
US13/725,019 Active 2033-04-14 US8873846B2 (en) | 2012-04-06 | 2012-12-21 | Detecting and tracking point features with primary colors |
US13/725,006 Active 2033-05-13 US8923638B2 (en) | 2012-04-06 | 2012-12-21 | Algorithm selection for structure from motion |
US13/725,041 Active 2036-01-11 US10778949B2 (en) | 2012-04-06 | 2012-12-21 | Robust video-based camera rotation estimation |
US13/724,906 Active 2033-08-06 US9083945B2 (en) | 2012-04-06 | 2012-12-21 | Keyframe selection for robust video-based structure from motion |
US13/724,871 Active 2033-07-11 US8934677B2 (en) | 2012-04-06 | 2012-12-21 | Initialization for robust video-based structure from motion |
US13/724,945 Active 2034-01-17 US9131208B2 (en) | 2012-04-06 | 2012-12-21 | Opt-keyframe reconstruction for robust video-based structure from motion |
US14/713,914 Active US9390515B2 (en) | 2012-04-06 | 2015-05-15 | Keyframe selection for robust video-based structure from motion |
US14/801,432 Active US9292937B2 (en) | 2012-04-06 | 2015-07-16 | Opt-keyframe reconstruction for robust video-based structure from motion |
Country Status (1)
Country | Link |
---|---|
US (9) | US8942422B2 (en) |
2012
- 2012-12-21: US application US13/724,973, patent US8942422B2, status Active
- 2012-12-21: US application US13/725,019, patent US8873846B2, status Active
- 2012-12-21: US application US13/725,006, patent US8923638B2, status Active
- 2012-12-21: US application US13/725,041, patent US10778949B2, status Active
- 2012-12-21: US application US13/724,906, patent US9083945B2, status Active
- 2012-12-21: US application US13/724,871, patent US8934677B2, status Active
- 2012-12-21: US application US13/724,945, patent US9131208B2, status Active
2015
- 2015-05-15: US application US14/713,914, patent US9390515B2, status Active
- 2015-07-16: US application US14/801,432, patent US9292937B2, status Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030103682A1 (en) * | 2001-12-05 | 2003-06-05 | Microsoft Corporation | Methods and system for providing image object boundary definition by particle filtering |
US7177740B1 (en) * | 2005-11-10 | 2007-02-13 | Beijing University Of Aeronautics And Astronautics | Method and apparatus for dynamic measuring three-dimensional parameters of tire with laser vision |
US20100079598A1 (en) * | 2008-09-03 | 2010-04-01 | University Of South Carolina | Robust Stereo Calibration System and Method for Accurate Digital Image Correlation Measurements |
US8248476B2 (en) * | 2008-09-03 | 2012-08-21 | University Of South Carolina | Robust stereo calibration system and method for accurate digital image correlation measurements |
US20100142846A1 (en) * | 2008-12-05 | 2010-06-10 | Tandent Vision Science, Inc. | Solver for image segregation |
US20100245593A1 (en) * | 2009-03-27 | 2010-09-30 | Electronics And Telecommunications Research Institute | Apparatus and method for calibrating images between cameras |
US20110025853A1 (en) * | 2009-07-31 | 2011-02-03 | Naturalpoint, Inc. | Automated collective camera calibration for motion capture |
US20110064308A1 (en) * | 2009-09-15 | 2011-03-17 | Tandent Vision Science, Inc. | Method and system for learning a same-material constraint in an image |
US20130058581A1 (en) * | 2010-06-23 | 2013-03-07 | Beihang University | Microscopic Vision Measurement Method Based On Adaptive Positioning Of Camera Coordinate Frame |
US20130230214A1 (en) * | 2012-03-02 | 2013-09-05 | Qualcomm Incorporated | Scene structure-based self-pose estimation |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8873846B2 (en) | 2012-04-06 | 2014-10-28 | Adobe Systems Incorporated | Detecting and tracking point features with primary colors |
US8923638B2 (en) | 2012-04-06 | 2014-12-30 | Adobe Systems Incorporated | Algorithm selection for structure from motion |
US8934677B2 (en) | 2012-04-06 | 2015-01-13 | Adobe Systems Incorporated | Initialization for robust video-based structure from motion |
US9083945B2 (en) | 2012-04-06 | 2015-07-14 | Adobe Systems Incorporated | Keyframe selection for robust video-based structure from motion |
US9131208B2 (en) | 2012-04-06 | 2015-09-08 | Adobe Systems Incorporated | Opt-keyframe reconstruction for robust video-based structure from motion |
US9292937B2 (en) | 2012-04-06 | 2016-03-22 | Adobe Systems Incorporated | Opt-keyframe reconstruction for robust video-based structure from motion |
US9317928B2 (en) | 2012-04-06 | 2016-04-19 | Adobe Systems Incorporated | Detecting and tracking point features with primary colors |
US9390515B2 (en) | 2012-04-06 | 2016-07-12 | Adobe Systems Incorporated | Keyframe selection for robust video-based structure from motion |
US10778949B2 (en) | 2012-04-06 | 2020-09-15 | Adobe Inc. | Robust video-based camera rotation estimation |
US10504244B2 (en) * | 2017-09-28 | 2019-12-10 | Baidu Usa Llc | Systems and methods to improve camera intrinsic parameter calibration |
Also Published As
Publication number | Publication date |
---|---|
US9083945B2 (en) | 2015-07-14 |
US20130265439A1 (en) | 2013-10-10 |
US10778949B2 (en) | 2020-09-15 |
US20130266179A1 (en) | 2013-10-10 |
US20130265387A1 (en) | 2013-10-10 |
US9292937B2 (en) | 2016-03-22 |
US9131208B2 (en) | 2015-09-08 |
US8934677B2 (en) | 2015-01-13 |
US20130266180A1 (en) | 2013-10-10 |
US8873846B2 (en) | 2014-10-28 |
US8923638B2 (en) | 2014-12-30 |
US20150249811A1 (en) | 2015-09-03 |
US9390515B2 (en) | 2016-07-12 |
US20130266238A1 (en) | 2013-10-10 |
US8942422B2 (en) | 2015-01-27 |
US20130266218A1 (en) | 2013-10-10 |
US20150317802A1 (en) | 2015-11-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8942422B2 (en) | 2015-01-27 | Nonlinear self-calibration for structure from motion (SFM) techniques |
US9747699B2 (en) | | Plane detection and tracking for structure from motion |
EP3698323B1 (en) | | Depth from motion for augmented reality for handheld user devices |
Newson et al. | | Video inpainting of complex scenes |
US10304244B2 (en) | | Motion capture and character synthesis |
Herling et al. | | High-quality real-time video inpainting with PixMix |
US9414048B2 (en) | | Automatic 2D-to-stereoscopic video conversion |
US8355592B1 (en) | | Generating a modified image with semantic constraint |
US9041819B2 (en) | | Method for stabilizing a digital video |
US8811749B1 (en) | | Determining correspondence between image regions |
US9317928B2 (en) | 2016-04-19 | Detecting and tracking point features with primary colors |
US8320620B1 (en) | | Methods and apparatus for robust rigid and non-rigid motion tracking |
CN111868786B (en) | | Cross-device monitoring computer vision system |
Abraham et al. | | A survey on video inpainting |
Bugeau et al. | | Coherent background video inpainting through Kalman smoothing along trajectories |
Lourakis et al. | | Efficient 3D camera matchmoving using markerless, segmentation-free plane tracking |
Youssef et al. | | NeRF-Supervised Feature Point Detection and Description |
Alluri | | Robust Video Stabilization and Quality Evaluation for Amateur Videos |
Mitchell et al. | | A robust structure and motion replacement for bundle adjustment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: ADOBE SYSTEMS INCORPORATED, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: JIN, HAILIN; REEL/FRAME: 029520/0994; Effective date: 20121218 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); Year of fee payment: 4 |
 | AS | Assignment | Owner name: ADOBE INC., CALIFORNIA; Free format text: CHANGE OF NAME; ASSIGNOR: ADOBE SYSTEMS INCORPORATED; REEL/FRAME: 048867/0882; Effective date: 20181008 |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 8 |