EP3072108A1 - Procede d'estimation de la vitesse de deplacement d'une camera - Google Patents
Procede d'estimation de la vitesse de deplacement d'une camera (Method for estimating the speed of movement of a camera)
- Publication number
- EP3072108A1 (application EP14799446.1A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- point
- camera
- current image
- image
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Definitions
- the invention relates to a method and a system for estimating the speed of movement of a camera at the moment when this camera captures a current image of a three-dimensional scene.
- the invention also relates to a method of constructing the trajectory of a camera with progressive shutter of pixels and a method of processing an image implementing the method of estimating the speed of displacement.
- the invention also relates to an information recording medium for the implementation of these methods.
- in known methods, points of interest are extracted and paired between successive images, and the speed of movement of the first camera is estimated from the speed of movement of these points of interest from one image to the next.
- the subject of the invention is a first method for estimating the speed of movement of a first camera at the moment when this first camera captures a current image of a three-dimensional scene, in accordance with claim 1.
- the invention also relates to a second method for estimating the speed of movement of a first camera at the moment when this first camera captures a current image of a three-dimensional scene, according to claim 4.
- the above methods implement neither a step of extracting points of interest in the images, nor a step of pairing these points of interest between successive images. Instead, the above methods directly minimize a difference between physical quantities measured in the reference image and in the current image for a large number of pixels of these images. This simplifies the process.
- the difference between the physical quantities is calculated for a very large number of points of the images, that is to say for a number greater than 10% of the pixels of the current image or of the reference image.
- the number of deviations to be minimized is much larger than the number of unknowns to estimate.
- the number of deviations taken into account to estimate speed is much greater than in the case of point of interest-based methods.
- the first or the second camera may be a simple monocular camera unable to measure the depth that separates it from the photographed scene.
- the above methods can accurately estimate the speed even in the presence of a distortion in the current image caused by the progressive shutter of the pixels.
- the coordinates of one of the first and second points are determined by taking into account the movement of the first camera during the duration t_Δ.
- the first and second values of the physical quantity both correspond more precisely to the same photographed point of the scene, which improves the estimation of the speed.
- Embodiments of these methods may include one or more of the features of the dependent claims.
- carrying out steps d) and e) makes it possible to simultaneously estimate the pose and the speed of the first camera and thus to reconstruct its trajectory in the photographed scene without using an additional sensor such as an inertial sensor;
- the invention also relates to a method of constructing the trajectory of a first camera with progressive shutter of pixels in a three-dimensional scene, according to claim 11.
- the invention also relates to a method for processing a current image of a three-dimensional scene, according to claim 12.
- the invention also relates to an information recording medium, comprising instructions for the implementation of one of the above methods, when these instructions are executed by an electronic computer.
- the invention also relates to a first system for estimating the movement speed of a first camera at the moment when this first camera captures a current image of a three-dimensional scene, according to claim 14.
- the invention also relates to a second system for estimating the speed of movement of a first camera at the moment when this first camera captures a current image of a three-dimensional scene, according to claim 15.
- FIG. 1 is a schematic illustration of a system making it possible to estimate the speed of movement of a camera at the moment when the latter captures an image as well as to process and correct the images thus captured;
- FIGS. 2A and 2B are timing diagrams illustrating the instants of acquisition of different rows of pixels, first, in the case of a camera with progressive shutter of the pixels and, second, in the case of a camera with simultaneous shutter of all its pixels;
- FIG. 3 is a schematic illustration of a step for determining corresponding points between a reference image and a current image;
- FIG. 4 is a flowchart of a method for estimating the speed of a camera as well as for processing the images captured by this camera;
- FIG. 5 is an illustration of another embodiment of a camera that can be used in the system of FIG. 1;
- FIG. 6 is a flowchart of a method for estimating the speed of the camera of FIG. 5;
- FIG. 7 is a partial illustration of another method for estimating the speed of the camera of the system of FIG. 1;
- FIG. 8 is a schematic illustration of a step of determining corresponding points in the current image and in the reference image
- FIG. 9 is a timing diagram illustrating the evolution over time of the error between the estimated speed and the actual speed in four different cases.
- FIG. 1 represents an image processing system 2 for estimating the pose x_pR and the speed x_vR of a camera at the moment when the latter acquires a current image.
- This system is also able to use the pose x_pR and the estimated speed x_vR to construct the trajectory of the camera and/or to process the current images in order to correct them.
- This system 2 comprises a camera 4 which captures a temporally ordered sequence of images of a three-dimensional scene 6.
- the camera 4 is mobile, that is to say that it can move inside the scene 6 along a trajectory that is not known in advance.
- the camera 4 is transported and moved by hand by a user or fixed on a robot or a remotely operated vehicle that moves inside the scene 6.
- the camera 4 is freely movable in the scene 6, so that its pose x_pR, that is to say its position and its orientation, is a vector with six unknowns.
- Scene 6 is a three-dimensional space. It may be an area inside a building such as an office, kitchen or hallways. It can also be an outdoor space such as a road, a city or a field.
- the camera 4 records the ordered sequence of images captured in an electronic memory 10 of a unit 12 of image processing.
- Each image is made of pixels organized in rows parallel to one another. Here, these pixels are organized into lines and columns.
- Each pixel corresponds to an elementary sensor that measures a physical quantity.
- the measured physical quantity is chosen from the intensity of a radiation emitted by a point of the scene 6 and the distance separating this pixel from this point of the scene 6. This distance is called "depth".
- the pixels of the camera 4 measure only the intensity of the light emitted by the point of the scene photographed.
- each pixel measures in particular the color of the point of the scene photographed by this pixel. This color is for example coded using the RGB (Red-Green-Blue) model.
- the camera 4 is a camera with progressive shutter of the pixels, better known by the English term "rolling shutter camera".
- the lines of pixels are captured one after the other, unlike what happens for a camera with simultaneous shutter of the pixels.
- Cameras with simultaneous shutter of pixels are known under the term of "global shutter camera”.
- Figures 2A and 2B illustrate more precisely the characteristics of the camera 4 compared to those of a camera with simultaneous shutter of the pixels.
- the horizontal axis represents the time while the vertical axis represents the number of the pixel line.
- each line of pixels, and therefore each image, is captured with a periodicity t_p.
- the time required to capture the light intensities measured by each pixel of the same line is represented by a hatched block 30.
- Each block 30 is preceded by a time t_e of exposure of the pixels to the light rays to be measured. This time t_e is represented by rectangles 32.
- Each exposure time t_e is itself preceded by a duration of reinitialization of the pixels, represented by blocks 34.
- in Figure 2A, the pixels are captured by the camera 4 and, in Figure 2B, the pixels are captured by a camera with simultaneous shutter of the pixels.
- the blocks 30 are offset temporally relative to each other since the different lines of the same image are captured one after the other and not simultaneously as in the case of FIG. 2B.
- the duration t_Δ which elapses between the instants of capture of two successive lines of pixels is non-zero in the case of FIG. 2A.
- the duration t_Δ is the same whatever the pair of successive lines selected in the image captured by the camera 4.
- the duration t_Δ is constant in time. Due to the existence of this duration t_Δ, the capture of a complete image by the camera 4 can only be performed in a time t_r equal to the sum of the durations t_Δ which separate the capture times of the different lines of the complete image. As indicated above, it is well known that if the camera 4 moves between the capture instants of one line and the next, this introduces a distortion in the captured image. Subsequently, this deformation is called the "RS deformation".
- it is also well known that if the camera 4 moves during the exposure time t_e, this causes the appearance of a motion blur in the image. Subsequently, this deformation is called the "MB deformation".
- each camera is modeled by a model for determining, from the coordinates of a point of the scene, the coordinates of the point, in the plane of the image, that has photographed this point.
- the plane of an image is the plane of the space between a projection center C and the photographed scene, on which a central projection, of center C, of the scene makes it possible to obtain an image identical to that photographed by the camera.
- the model used is the pinhole model or "pin hole" in English. More information on this model can be found in the following articles:
- the position of each pixel is identified by the coordinates of a point p in the plane of the image. Subsequently, to simplify the description, it is considered that this point p is at the intersection of an optical axis OA passing through the point PS of the scene (Figure 1) photographed by this pixel and a projection center C.
- the projection center C is at the intersection of all the optical axes of all the pixels of the image.
- the position of the center C with respect to the plane of the image, in a three-dimensional frame F linked without any degree of freedom to the camera 4, is an intrinsic characteristic of the camera 4. This position depends, for example, on the focal length of the camera.
- the set of intrinsic parameters of the camera 4 which make it possible to find the point p corresponding to the projection of the point PS on the plane PL along the axis OA are grouped together in a matrix known under the term of matrix of the intrinsic parameters of the camera, or "intrinsic matrix" in English.
- This matrix is denoted K. It is typically written in the following form:

      | f   s   u_0 |
  K = | 0  r·f  v_0 |
      | 0   0    1  |

  where:
- f is the focal length of the camera expressed in pixels,
- r is the dimension ratio of a pixel,
- s is the shear factor, and
- the pair (u_0, v_0) corresponds to the position of the principal point, that is to say typically the center of the image.
- here, the shear factor s is zero and the aspect ratio r is close to 1.
- This matrix K is used in particular to determine the coordinates of the point p corresponding to the projection of the point PS on the plane PL of the camera.
- the matrix K can be obtained during a calibration phase. Such a calibration phase is for example described in the following articles:
- it is also possible to obtain this matrix K from an image of an object or of a calibration pattern whose dimensions are known, such as a chessboard or circles.
- this matrix is constant over time. To facilitate the description which follows, it will be assumed that this matrix K is constant and known.
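By way of illustration only, a minimal Python sketch of such an intrinsic matrix K and of the central projection it parameterizes; the numerical values of f, s, r, u_0 and v_0 below are assumptions for the example, not values from the patent:

```python
import numpy as np

# Hypothetical intrinsic parameters: f = 500 px, shear s = 0, ratio r = 1,
# principal point (u_0, v_0) = (320, 240).
f, s, r, u0, v0 = 500.0, 0.0, 1.0, 320.0, 240.0
K = np.array([[f,     s, u0],
              [0, r * f, v0],
              [0,     0,  1]])

def project(K, v):
    """Central projection of a scene point v = (X, Y, Z), expressed in the
    camera frame F, onto the image plane PL; returns the point p = (x, y)."""
    q = K @ np.asarray(v, dtype=float)
    return q[:2] / q[2]

print(project(K, [0.1, -0.2, 2.0]))  # point p for a scene point 2 m in front
```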
- subsequently, x_pR denotes the pose of the camera 4, that is to say its position and its orientation in a frame R linked without any degree of freedom to the scene 6.
- the frame R has two horizontal axes X and Y orthogonal to each other and a vertical axis Z.
- the pose x_pR is a vector with six coordinates, three of which represent the position of the camera 4 in the frame R and three others its inclination with respect to the X, Y and Z axes.
- the position of the camera 4 is identified in the frame R by the coordinates of its projection center.
- the axis used to locate the inclination of the camera 4 with respect to the axes of the frame R is the optical axis of the camera 4. Subsequently, it is assumed that the camera 4 is able to move with six degrees of freedom, so that the six coordinates of the pose of the camera 4 are unknowns that must be estimated.
- x_vR denotes the speed of the camera 4, that is to say its speed in translation and in rotation, expressed in the frame R.
- the speed x_vR is a vector with six coordinates, of which three correspond to the speed of the camera 4 in translation along the X, Y and Z axes, and three others correspond to the angular velocities of the camera 4 around the X, Y and Z axes.
- the function I is a function which associates with each point of the plane PL of the image the intensity measured or interpolated at this point.
- the processing unit 12 is a unit capable of processing the images captured by the camera 4 to estimate the pose x_pR and the speed x_vR of this camera at the moment when it captures an image.
- the unit 12 is also capable of:
- the unit 12 comprises a programmable electronic calculator 14 capable of executing instructions stored in the memory 10.
- the memory 10 comprises in particular the instructions necessary to execute any of the methods of FIGS. 4, 6 and 7.
- the system 2 also comprises a device 20 used to construct a three-dimensional model 16 of the scene 6.
- the model 16 makes it possible to construct augmented reference images.
- "augmented image" denotes an image comprising, for each pixel, in addition to the intensity measured by this pixel, a measurement of the depth that separates this pixel from the point of the scene that it photographs. The measurement of the depth makes it possible to obtain the coordinates of the point of the scene photographed by this pixel. These coordinates are expressed in the three-dimensional frame linked without any degree of freedom to the camera having captured this augmented image. They are typically in the form of a triplet (x, y, D(p)), where:
- x and y are the coordinates of the pixel in the plane PL of the image, and
- D(p) is the measured depth that separates this pixel from the point PS of the scene that it has photographed.
- the function D associates with each point p of the augmented image the depth D(p), measured or interpolated.
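A minimal sketch of how a triplet (x, y, D(p)) of an augmented image can be converted back into the coordinates of the photographed scene point, assuming the depth is measured along the optical axis (Z) and reusing a pinhole matrix K as above:

```python
import numpy as np

def back_project(K, p, depth):
    """Vertex of the scene point photographed by the pixel p = (x, y) whose
    measured depth is `depth`, expressed in the frame linked to the camera."""
    x, y = p
    ray = np.linalg.inv(K) @ np.array([x, y, 1.0])  # ray[2] == 1 for a pinhole K
    return depth * ray  # scaled so that the Z coordinate equals the depth
```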
- the device 20 comprises an RGB-D camera 22 and a processing unit 24 capable of estimating the pose of the camera 22 in the frame R.
- the camera 22 is a camera that measures both the light intensity I*(p*) of each point of the scene and the depth D*(p*) which separates this pixel from the photographed point of the scene.
- the camera 22 is a camera with simultaneous shutter of the pixels.
- Such cameras are for example marketed by Microsoft®, such as the Kinect® camera, or by ASUS®.
- the unit 24 is, for example, equipped with a programmable electronic calculator and a memory including the instructions necessary to execute a simultaneous localization and mapping process, better known by the acronym SLAM (Simultaneous Localization And Mapping).
- M. Meilland and A.I. Comport, "On unifying key-frame and voxel-based dense visual SLAM at large scales", IEEE International Conference on Intelligent Robots and Systems, 2013, 3-8 November, Tokyo.
- the unit 24 is embedded in the camera 22.
- the model 16 is built by the device 20 and stored in the memory 10.
- the model 16 is a database in which the different reference images I* are recorded.
- the pose x_pR* of the camera 22 at the moment when the latter captured the image I* is associated with each of these images I*.
- Such a three-dimensional model of the scene 6 is known under the term of model based on reference images, or "key-frame model" in English. More information on such a model can be found in article A1 previously cited.
- the method starts with a learning phase 50 in which the model 16 is built and stored in the memory 10.
- the camera 22 is moved inside the scene 6 to capture many reference images I* from many different poses.
- the camera 22 is moved slowly so that the motion blur in the reference images is negligible.
- such a slow displacement also eliminates the distortion caused by the progressive shutter of the pixels.
- during step 54, the unit 24 estimates the successive poses of the camera 22 for each captured reference image I*. Note that this step 54 can also be performed after step 52, i.e. once all the reference images have been captured.
- the model 16 is built and stored in the memory 10. For this, several reference images and the pose of the camera 22 at the moment when these reference images were captured are recorded in a database.
- the camera 4 is moved inside the scene 6 along an unknown trajectory. At the same time as the camera 4 is moved, it captures a temporal succession of images from different unknown poses. Each captured image is stored in the memory 10.
- the camera 4 is moved at a high speed, i.e. sufficient for the RS and MB deformations to be perceptible in the captured images.
- the unit 12 processes in real time each image acquired by the camera 4 to estimate the pose x_pR and the speed x_vR of the camera 4 at the instant when this image was captured.
- the estimation of the pose x_pR and the speed x_vR of the camera 4 is performed as soon as an image is captured by the camera 4 and ends before the next image is captured by this same camera 4. Subsequently, the image captured by the camera 4 used to determine the pose of this camera at the time of capture of this image is called the "current image".
- the unit 12 selects or builds a reference image I* which has photographed a large number of points of the scene 6 in common with those photographed in the current image. For example, for this, a rough estimate of the pose x_pR of the camera 4 is made, then the reference image whose pose is closest to this rough estimate of the pose x_pR is selected in the model 16, as sketched below.
- the rough estimate is made by interpolation from the last poses and speeds estimated for the camera 4. For example, in a simplified case, the rough estimate of the pose x_pR is taken equal to the last pose estimated for the camera 4.
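A minimal sketch of such a key-frame database and of the selection of the closest reference image; the data layout and the translation-only distance are simplifying assumptions:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class KeyFrame:
    """One entry of the model 16: a reference image I*, its depths D* and the
    pose of the camera 22 at the moment of capture (hypothetical layout)."""
    intensity: np.ndarray  # I*(p*) for each pixel
    depth: np.ndarray      # D*(p*) for each pixel
    pose: np.ndarray       # 4x4 homogeneous pose matrix in the frame R

def select_reference(model, rough_pose):
    """Select in the model the reference image whose pose is closest to the
    rough estimate of x_pR (translation distance only, for simplicity)."""
    return min(model,
               key=lambda kf: np.linalg.norm(kf.pose[:3, 3] - rough_pose[:3, 3]))
```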
- the acquisition frequency is greater than or equal to 30 Hz.
- the pose x_pR and the speed x_vR are estimated. More precisely, here, the variations x_p and x_v of, respectively, the pose and the speed of the camera 4 since the last estimated pose and speed are estimated, that is to say the variations of the pose and the speed since the last captured current image.
- the pose x_p and the speed x_v are sought which minimize the difference E1 between the terms I_w(x, p*) and I*_b(x, p*), where x is a vector that groups the unknowns to be estimated.
- the vector x gathers the coordinates of the pose x_p and the speed x_v.
- the variable x therefore has 12 coordinates to estimate.
- Operations 1) and 2) are repeated in a loop.
- the chosen value is modified at each iteration so as to find, each time, a new value of the speed x_v which decreases the difference E1 more than the previously tested values.
- the iterations are stopped when a stopping criterion is satisfied.
- for example, the iterations are stopped when a value of the speed x_v makes it possible to obtain a value of the difference E1 less than a predetermined threshold S1.
- Another possible stopping criterion consists in systematically stopping the iterations of operations 1) and 2) when the number of iterations performed is greater than a predetermined threshold S2.
- an initial value must be assigned to the speed x_v. For example, this initial value is taken equal to zero, that is to say that it is assumed as a first approximation that the speed x_vR has not varied since its last estimate.
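Purely by way of illustration, a sketch of the loop formed by operations 1) and 2): a naive finite-difference descent on a generic residual function, not the methods of the references cited below:

```python
import numpy as np

def estimate_x(residual, x0, step=1e-3, s1=1e-6, s2=100):
    """Iteratively look for a value of x minimizing E1 = sum(residual(x)**2),
    stopping when E1 < S1 or after S2 iterations (the two stopping criteria)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(s2):
        e = np.sum(residual(x) ** 2)
        if e < s1:  # first stopping criterion: E1 below the threshold S1
            break
        grad = np.zeros_like(x)
        for i in range(x.size):  # finite-difference gradient of E1
            dx = np.zeros_like(x)
            dx[i] = step
            grad[i] = (np.sum(residual(x + dx) ** 2) - e) / step
        x -= step * grad / (np.linalg.norm(grad) + 1e-12)  # new value to test
    return x
```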
- the automatic choice of a new value of the speed x_v likely to minimize the difference E1 is a well-known minimization operation. For example, methods for choosing this new value of the speed x_v are described in the following bibliographic references:
- I_w(x, p*) corresponds to the value of the light intensity at the point p* in the reference image, constructed from the light intensities measured in the current image, taking into account the RS deformation.
- the construction of the value of this term from a given value of the speed x_v is illustrated schematically in FIG. 3. To simplify FIG. 3, only a square of 3 by 3 pixels is represented for each image I and I*.
- the vertex v* corresponds to the coordinates, expressed in the frame F*, of the point PS photographed by the pixel centered on the point p* of the plane PL* of the image.
- the unit 12 searches for the point p_w1 (FIG. 3) of the current image corresponding to the point p*, by first assuming that the duration t_Δ is zero.
- the points p_w1 and p* correspond if they both photograph the same point PS of the scene 6.
- when the duration t_Δ is assumed to be zero, many known algorithms make it possible to find the coordinates of the point p_w1 in the image I corresponding to the point p* in the image I*. Therefore, only general information about a possible method for doing this is given here.
- the unit 12 selects in the reference image the coordinates v* of the point PS associated with the point p*. Then, the unit 12 performs a change of frame to obtain the coordinates v of the same point PS expressed in the frame F of the camera 4. For this, a pose matrix T1 is used.
- the pose matrices are well known. The reader will be able to consult chapter 2 of book L1 for more information.
- the pose matrices have the following form when homogeneous coordinates are used:

      | R  t |
  T = | 0  1 |

  where:
- R is a rotation matrix, and
- t is a translation vector.
- the matrix R and the vector t are a function of the poses x_pR* and x_pR respectively associated with the images I* and I.
- the pose x_pR* is known from the model 16.
- the pose x_pR is equal to x_pR-1 + x_p, where x_pR-1 is the previously estimated pose.
- the point p_w1 is the point which corresponds to the intersection of the plane PL and the optical axis OA which passes through the center C and the point PS of the scene 6.
- the function w1 that returns the coordinates of the point p_w1 corresponding to the point p* is known as a "warping" function. It is typically a central projection of center C. It is parameterized by the pose matrix T1. Thus, p_w1 = w1(T1, v*).
- the point p w1 does not necessarily fall in the center of a pixel of the image I.
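A minimal sketch of the function w1, assuming homogeneous coordinates and a pinhole projection as above:

```python
import numpy as np

def project(K, v):
    """Central projection of center C onto the plane PL (as sketched earlier)."""
    q = K @ v
    return q[:2] / q[2]

def warp_w1(T1, K, v_star):
    """p_w1 = w1(T1, v*): change of frame F* -> F by the pose matrix T1,
    then central projection onto the plane PL of the current image (t_Δ = 0)."""
    v_h = np.append(v_star, 1.0)  # homogeneous coordinates of PS in the frame F*
    v = (T1 @ v_h)[:3]            # coordinates v of PS in the frame F
    return project(K, v)
```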
- the line of pixels to which the point p_w1 of the image belongs has not been captured at the same time as the first line of the image, but at a time interval τ after this first line has been captured.
- the first captured line of the image I is the line at the bottom of the image as shown in FIG. 2A.
- the pose x_pR that is estimated is the pose of the camera 4 when it captures the bottom line of the current image.
- the interval τ can be calculated as being equal to (n + 1)·t_Δ, where n is the number of lines of pixels that separate the line to which the point p_w1 belongs from the first captured line.
- the number n is determined from the ordinate of the point p_w1.
- for example, a function e1 is used which returns the number n + 1 as a function of the ordinate of the point p_w1 in the plane of the image.
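A minimal sketch of this computation, assuming the ordinate of p_w1 is counted in lines from the first captured line:

```python
def tau(p_w1_ordinate, t_delta):
    """Interval τ = (n + 1) · t_Δ, where n is the number of pixel lines that
    separate the line of p_w1 from the first captured line (the function e1
    returns n + 1 from the ordinate of p_w1)."""
    n = int(p_w1_ordinate)
    return (n + 1) * t_delta
```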
- T2(-τ·x_vR) is a function that returns the coordinates of a point p^T2(-τ·x_vR) of the three-dimensional space, corresponding to the position of the point of the scene photographed at p_w1 after it has been moved in the direction opposite to the displacement of the camera 4 during the interval τ,
- w2 is a "warping" function which returns the coordinates of the point p_w2 corresponding to the projection of the point p^T2(-τ·x_vR) onto the plane PL.
- the point p_w2 is the intersection of the plane of the current image and an optical axis passing through the center C and the point p^T2(-τ·x_vR).
- T2 integrates the speed -x_vR over the interval τ to obtain a displacement equal to the displacement of the camera 4 during the interval τ but in the opposite direction.
- the speed x_vR is considered constant during the interval τ.
- the distance traveled by the camera 4 during the interval τ at the speed x_vR is calculated by integrating this speed over the time interval τ.
- the vector ν corresponds to the three coordinates of the speed in translation of the camera 4, and the symbol [ω]× is the antisymmetric matrix ("skew symmetric matrix" in English) of the angular velocity of the camera 4, that is to say the following matrix:

         |  0    -ω_z    ω_y |
  [ω]× = |  ω_z    0    -ω_x |
         | -ω_y   ω_x     0  |

- ω_x, ω_y and ω_z are the angular velocities of the camera 4 around, respectively, the X, Y and Z axes of the frame R.
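A minimal sketch of the integration of a constant speed (ν, ω) over the interval τ, assuming the classical exponential map of a constant twist (scipy is used for the matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

def skew(w):
    """Antisymmetric matrix [ω]x of the angular velocity ω = (ω_x, ω_y, ω_z)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def integrate_twist(nu, omega, tau):
    """4x4 homogeneous displacement obtained by integrating the constant twist
    (ν, ω) over the duration tau; called with -x_vR to build T2(-τ·x_vR)."""
    xi = np.zeros((4, 4))
    xi[:3, :3] = skew(omega)
    xi[:3, 3] = nu
    return expm(tau * xi)
```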
- the point p_w2, like the point p_w1, does not necessarily fall in the center of a pixel. It is then necessary to estimate the light intensity at the point p_w2 from the intensities stored for the neighboring pixels in the current image. This is the role of the function I(·), which returns the light intensity at a point p, interpolated from the intensities stored for the pixels neighboring this point p.
- Many interpolation functions are known. For example, the easiest way is to return the stored intensity of the pixel in which the point p_w2 is located.
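Another classical choice is bilinear interpolation; a minimal sketch, assuming the point p lies strictly inside the image:

```python
import numpy as np

def bilinear(img, p):
    """Intensity I(p) at a non-integer point p = (x, y), interpolated from the
    four pixels neighboring this point in the image img."""
    x, y = p
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0] + dx * (1 - dy) * img[y0, x0 + 1]
            + (1 - dx) * dy * img[y0 + 1, x0] + dx * dy * img[y0 + 1, x0 + 1])
```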
- in the absence of motion blur, the intensity I(p_w2) must be the same as the intensity I*(p*) recorded in the reference image I* once the RS deformation has been taken into account.
- however, the motion blur is not negligible in the current image.
- the light intensities measured by the pixels of the current image are therefore affected by the motion blur, while the light intensities stored for the pixels of the reference image are not affected by this motion blur.
- the estimated intensity I(p_w2) is therefore affected by the motion blur, since it is constructed from the light intensities of the pixels of the current image, in which the MB deformation has not been corrected. Therefore, if the exposure time t_e is not negligible, the intensity I(p_w2) does not correspond exactly to the intensity I*(p*) even if the RS deformation has been eliminated or at least decreased.
- I*_b(x, p*) is the value of the light intensity which would be measured at the point p* if the exposure time of the pixels of the camera 22 were equal to that of the camera 4, and if the camera 22 were moving at the speed x_vR during the capture of the reference image I*.
- the image I*_b corresponds to the image I* after a motion blur identical to that which affects the current image I has been added to the reference image.
- the coordinates of the neighboring points are obtained using the composition of functions w3(T1⁻¹·T2(-t·x_vR)·T1, p*).
- the composition of functions T1⁻¹·T2(-t·x_vR)·T1 performs the following operations:
- the pose matrix T1 transforms the coordinates v* of the point PS of the scene 6, expressed in the frame F*, into coordinates v of this same point expressed in the frame F, where v* are the coordinates of the point PS photographed at the point p*,
- T2(-t·x_vR) moves the point PS by a distance that is a function of the duration t and of the speed x_vR, to obtain the coordinates of a new point p^T2(-t·x_vR) expressed in the frame F, where t is a duration between zero and the exposure time t_e of the pixel, and
- the pose matrix T1⁻¹ transforms the coordinates of the point p^T2(-t·x_vR), expressed in the frame F, into coordinates expressed in the frame F*.
- moving the camera by a distance t·x_vR with respect to a fixed scene is equivalent to moving the fixed scene by -t·x_vR with respect to a fixed camera.
- the functions T1 and T2 are the same as those previously described.
- the function T1⁻¹ is the inverse of the pose matrix T1.
- the function w3 is a "warping" function which projects a point of the scene onto the plane PL* to obtain the coordinates of the point which photographs this point of the scene. It is typically a central projection of center C*.
- the point p_w3 is here the point situated at the intersection of the plane PL* and the axis passing through the center C* and the point p^T2(-t·x_vR).
- the coordinates of a point close to the point p* are thus obtained for each value of the duration t.
- for example, at least five, ten or twenty values of the duration t, regularly distributed in the interval [0; t_e], are used.
- the intensity at a point of the reference image is obtained using a function I*(p_w3).
- the function I* is the function that returns the intensity at the point p_w3 in the reference image I*.
- the point p_w3 is not necessarily at the center of a pixel.
- in this case, the function I* returns an intensity at the point p_w3 constructed by interpolation from the intensities stored for the pixels neighboring the point p_w3.
- the intensity I*_b(p*) is taken equal to the average of the intensities I*(p_w3) calculated for the different times t. For example, this is the arithmetic mean, using the same weighting coefficient for each term, as sketched below.
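A minimal sketch of this averaging; `p_w3_at(t)` and `i_star(p)` are hypothetical callables standing for the composition w3(...) and the interpolated function I*:

```python
import numpy as np

def blurred_intensity(i_star, p_w3_at, t_e, n_samples=10):
    """I*_b(p*): arithmetic mean of the intensities I*(p_w3) obtained for
    n_samples durations t regularly distributed in the interval [0, t_e]."""
    ts = np.linspace(0.0, t_e, n_samples)
    return np.mean([i_star(p_w3_at(t)) for t in ts])
```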
- the pose matrix T1 is updated with the new estimate of the pose x_pR obtained from the new estimate of the speed x_vR.
- the pose x_pR and the speed x_vR can be used for various additional processing operations. Typically, these additional operations are performed in real time if they are not too long to execute. Otherwise, they are executed offline, that is to say after all the poses x_pR and speeds x_vR of the camera 4 have been calculated.
- a step 70 of constructing the trajectory of the camera 4 is carried out in real time.
- the unit 12 records, in the form of a temporally ordered sequence, the succession of the estimated poses x_pR. This temporally ordered sequence then constitutes the trajectory constructed for the camera 4.
- the unit 12 also performs different offline processing. For example, during a step 72, the unit 12 processes the current image to limit the RS deformation using the estimate of the speed x_vR. Typically, for this, the pixels of the current image are displaced as a function of the speed x_vR and the duration τ. For example, this displacement is estimated by the function w2(T2(-τ·x_vR), p) for each pixel p of the current image.
- Such image processing methods are known and are therefore not described here in more detail. For example, such methods are described in the following article: S. Baker, E.P. Bennett, S.B. Kang, and R. Szeliski, "Removing rolling shutter wobble", IEEE Conference on Computer Vision and Pattern Recognition, 2010.
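By way of illustration, a deliberately simplified sketch of a step-72-style correction, assuming a pure horizontal image translation at constant speed (the patent's w2(T2(-τ·x_vR), p) warp is a full 3D displacement, not this 2D shift):

```python
import numpy as np

def correct_rs_rows(img, pixels_per_line_delay):
    """Shift each pixel line of a grayscale image backwards by the displacement
    accumulated during its capture delay n · t_Δ (horizontal shift only)."""
    out = np.empty_like(img)
    for n in range(img.shape[0]):
        dx = int(round(n * pixels_per_line_delay))
        out[n] = np.roll(img[n], -dx)
    return out
```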
- the unit 12 also processes the current image to limit the deformations caused by the motion blur.
- image processing methods based on the estimate of the speed x_vR of the camera at the moment when the latter captured the image are known. For example, such methods are described in the following articles:
- FIG. 5 shows a system identical to that of Figure 1, except that the camera 4 is replaced by a camera 80.
- This camera 80 is a camera identical to the camera 22, or even the same camera as the camera 22.
- the acquisition of the depth is done with a progressive shutter of the pixels, as described with reference to Figure 2A.
- the lines of pixels capture the depth one after the other.
- the duration between the instants of capture of the depth by two successive lines of pixels is noted t_Δd.
- this duration t_Δd may be equal to or different from the duration t_Δ for capturing the intensities.
- the exposure time of the pixels to capture the depth is noted t_ed.
- the time t_ed may be equal to or different from the exposure time t_e.
- the camera 80 acquires for each pixel the same information as the camera 4 and, in addition, a vertex v coding, in the frame F linked without any degree of freedom to the camera 80, the depth of the point of the scene photographed by this pixel.
- D_w(x, p*) is the composition of the following functions:
- the function D returns the value of the depth at the point p_w2.
- the depth at the point p_w2 is estimated by interpolation from the depths measured by the pixels neighboring the point p_w2 in the current image.
- the duration τ_d is calculated like the duration τ, but replacing t_Δ by t_Δd.
- D_w(x, p*) is therefore an approximation of the depth at the point p* in the reference image, constructed from the depths measured by the camera 80.
- D*_b(x, p*) corresponds to the depth that would be measured at the point p* if the camera 22 were moving at the speed x_vR and if the exposure time of the pixels of the camera 22 to measure the depth were equal to the exposure time t_ed.
- D*_b(x, p*) is constructed similarly to what has been described for the term I*_b(x, p*).
- D*_b(x, p*) is constructed:
- the selection of the neighboring points is identical to that previously described for the term I*_b(x, p*), except that the time t_e is replaced by the time t_ed.
- the depth measured at the neighboring points is obtained by means of a function D*.
- the function D*(p_w3) is the function that returns the depth at the point p_w3 from the depths measured for the pixels near the point p_w3.
- the variable x comprises, in addition to the six coordinates of the speed x_v, four coordinates intended to code the values of the times t_Δ, t_Δd, t_e and t_ed.
- the steps of simultaneous minimization of the differences E1 and E2 then lead to estimating, in addition to the speed x_v, also the values of the times t_Δ, t_Δd, t_e and t_ed.
- FIG. 7 represents another method for estimating the pose x_pR and the speed x_vR of the camera 4 by means of the system 2. This method is identical to that of FIG. 4, except that the operation 68 is replaced by an operation 90. To simplify FIG. 7, only the part of the method comprising the operation 90 has been represented. The other parts of the method are identical to those previously described.
- during step 90, the pose x_p and the speed x_v are estimated by minimizing a difference E3 between the following terms: I*_w(x, p) and I(p).
- I(p) is the intensity measured at the point p by the camera 4.
- I*_w(x, p) corresponds to the estimate of the intensity at the point p of the current image, constructed from the intensities stored in the reference image I*, taking into account the RS and MB deformations of the camera 4.
- I*_w(x, p) corresponds to the following composition of functions:
- the vertex v contains the coordinates, in the frame F, of the point PS photographed at the point p. This vertex v is estimated from the vertices v* of the reference image. For example, the points p_w1 closest to the point p are first sought, then the vertex v is estimated by interpolation from the coordinates v* of the vertices associated with these nearest points p_w1.
- T1⁻¹ is the inverse of the pose matrix T1. It thus transforms the coordinates of the vertex v, expressed in the frame F, into coordinates v* expressed in the frame F*.
- the function w4 is a "warping" function that projects the vertex v* onto the plane PL* of the reference image to obtain the coordinates of a point p_w4 (Figure 8).
- the function w4 is identical to the function w3.
- the function I*_b is the same as that previously defined, that is to say that it makes it possible to estimate the value of the intensity at the point p_w5 which would be measured by the pixels of the camera 22 if its exposure time were equal to t_e, and if the camera 22 were moving at the speed x_vR.
- the function I*_b thus introduces the same motion blur on the reference image as that observed on the current image.
- FIG. 9 represents the evolution over time of the difference between the angular velocity estimated using the method of FIG. 4 and the actual angular velocity of the camera 4.
- the curves represented were obtained experimentally. Each curve shown was obtained using the same sequence of current images and the same reference images.
- the abscissa represents the number of processed current images, while the ordinate represents the error between the actual angular velocity and the estimated angular velocity. This error is expressed here as a root mean square error (RMSE).
- the curve 91 represents the evolution of the error without correcting the RS and MB deformations.
- the estimates of the speed x_v are for example obtained with the method of FIG. 4, but taking the time t_e and the duration t_Δ equal to zero.
- the curve 92 corresponds to the case where only the RS deformation is corrected. This curve is obtained by executing the method of FIG. 4, taking a non-zero value of the duration t_Δ and setting the time t_e to zero.
- the curve 94 corresponds to the case where only the MB deformation is corrected. This curve is obtained by executing the method of FIG. 4, taking a non-zero value for the time t_e and setting the duration t_Δ to zero.
- the curve 96 corresponds to the case where the RS and MB deformations are simultaneously corrected. This curve is obtained by executing the method of FIG. 4, taking non-zero values for the time t_e and the duration t_Δ.
- the unit 24 can be embedded inside the camera 22.
- the camera 22 can also comprise sensors that directly measure its pose inside the scene 6 without having to carry out image processing for this purpose.
- one such sensor is an inertial sensor that measures the acceleration of the camera 22 along three orthogonal axes.
- the cameras 22 and 4 may be the same or different.
- the cameras 22 and 4 can measure the intensity of the radiation emitted by a point of the scene at wavelengths other than those visible to a human being.
- cameras 22 and 4 can work in the infrared.
- the unit 12 can be embedded in the camera 4 to perform the real-time processing. However, the unit 12 can also be mechanically distinct from the camera 4. In the latter case, the images captured by the camera 4 are, in a second step, downloaded to the memory 10, then processed by the unit 12 later.
- the three-dimensional model 16 of the scene 6 may be different from a model based on reference images.
- the model 16 can be replaced by a volumetric computer model of the scene 6 in three dimensions obtained for example by computer-aided design. Then, each reference image is constructed from this mathematical model. More details on these other types of three-dimensional models can be found in article A1.
- the reference image may be an image selected in the model 16 or an image constructed from the images contained in the model 16.
- the reference image may be obtained by combining several images contained in the model 16, as described in article A1, so as to obtain a reference image whose pose is closer to the estimated pose of the current image.
- the model 16 is not necessarily built beforehand during a learning phase. On the contrary, it can be constructed as the camera 80 is moved in the scene 6.
- the simultaneous construction of the trajectory of the camera 80 and the mapping of the scene 6 is known by the acronym SLAM (Simultaneous Localization And Mapping).
- the images of the camera 80 are added to the model 16 as they move in the scene 6.
- each image added to the model 16 is first processed to limit the RS and/or MB deformations, as described in steps 72 and 74.
- phase 50 and device 20 are omitted.
- the estimate of the pose x_pR of the camera 4 can be obtained in a different way.
- for example, the camera 4 is equipped with a sensor measuring its pose in the scene, such as an inertial sensor, and the estimate of the pose x_pR is obtained from the measurements of this sensor embedded in the camera 4.
- the differences E1 or E3 are calculated, not for a single reference image, but for several reference images. Thus, in the method of FIG. 4, the difference E1 is then replaced by the differences E1.1 and E1.2, where the difference E1.1 is calculated from a first reference image, and the difference E1.2 is calculated from a second reference image distinct from the first.
- the pinhole model is completed by a model of radial distortions to correct the aberrations or distortions caused by the lenses of the camera.
- models of distortions can be found in the following article: C.C. Slama (1980), Manual of Photogrammetry, American Society of Photogrammetry, 4th edn.
- the coordinates of the pose x_pR can be considered as being independent of the speed x_vR.
- in this case, the same method as previously described is implemented, except that the variable x contains both the six coordinates of the pose x_p and the six coordinates of the speed x_v.
- the number of degrees of freedom of the camera 4 or 80 is less than six. This is for example the case if the camera can move only in a horizontal plane or can not turn on itself. This limitation of the number of degrees of freedom in displacement is then taken into account by reducing the number of unknown coordinates necessary to determine the pose and the speed of the camera.
- the acceleration x_a corresponds to the linear acceleration along the X, Y and Z axes as well as the angular acceleration around these same axes.
- the speed x_v can be determined using only the difference E2 between the depths. In this case, it is not necessary for the cameras 4 and 22 to measure and record intensities for each pixel.
- the method of Figure 7 can be adapted to the case where the physical quantity measured by the camera 4 is the depth and not the light intensity. If the camera 80 is used, it is not necessary for the reference image to have a depth associated with each pixel. Indeed, the vertex v is then known and, for example, the method of Figure 7 can be implemented without having to use the vertex v*.
- the function I_w(p*) is taken equal to the function I(w1(T1, v*)). This therefore simply amounts to taking the value of the duration t_Δ and/or the duration t_Δd equal to zero in the previous embodiments.
- the times t_Δ, t_Δd, t_e and t_ed can be measured during the learning phase or estimated during the first iterations of the phase 60 of use.
- the estimate of the speed x_vR can be used for other image processing than those previously described.
- the estimation of the speed x_vR and the pose x_pR is not necessarily carried out in real time. For example, it can be done once the capture of the images by the camera 4 or 80 is complete.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Length Measuring Devices By Optical Means (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1361305A FR3013487B1 (fr) | 2013-11-18 | 2013-11-18 | Procede d'estimation de la vitesse de deplacement d'une camera |
PCT/EP2014/074762 WO2015071457A1 (fr) | 2013-11-18 | 2014-11-17 | Procede d'estimation de la vitesse de deplacement d'une camera |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3072108A1 true EP3072108A1 (fr) | 2016-09-28 |
Family
ID=50424398
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP14799446.1A Withdrawn EP3072108A1 (fr) | 2013-11-18 | 2014-11-17 | Procede d'estimation de la vitesse de deplacement d'une camera |
Country Status (4)
Country | Link |
---|---|
US (2) | US9940725B2 (fr) |
EP (1) | EP3072108A1 (fr) |
FR (1) | FR3013487B1 (fr) |
WO (1) | WO2015071457A1 (fr) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR3013488B1 (fr) | 2013-11-18 | 2017-04-21 | Univ De Nice (Uns) | Procede d'estimation de la vitesse de deplacement d'une camera |
US9727967B2 (en) * | 2014-06-23 | 2017-08-08 | Samsung Electronics Co., Ltd. | Methods for determining estimated depth in an image and systems thereof |
EP3451288A1 (fr) * | 2017-09-04 | 2019-03-06 | Universität Zürich | Odométrie visuelle-inertielle avec caméra pour événements |
US11783707B2 (en) | 2018-10-09 | 2023-10-10 | Ford Global Technologies, Llc | Vehicle path planning |
KR102559686B1 (ko) * | 2018-12-17 | 2023-07-27 | 현대자동차주식회사 | 차량 및 차량 영상 제어방법 |
US20200226787A1 (en) * | 2019-01-14 | 2020-07-16 | Sony Corporation | Information processing apparatus, information processing method, and program |
US10399227B1 (en) | 2019-03-29 | 2019-09-03 | Mujin, Inc. | Method and control system for verifying and updating camera calibration for robot control |
US10906184B2 (en) * | 2019-03-29 | 2021-02-02 | Mujin, Inc. | Method and control system for verifying and updating camera calibration for robot control |
US11460851B2 (en) | 2019-05-24 | 2022-10-04 | Ford Global Technologies, Llc | Eccentricity image fusion |
US11521494B2 (en) | 2019-06-11 | 2022-12-06 | Ford Global Technologies, Llc | Vehicle eccentricity mapping |
US11662741B2 (en) | 2019-06-28 | 2023-05-30 | Ford Global Technologies, Llc | Vehicle visual odometry |
CN110782492B (zh) * | 2019-10-08 | 2023-03-28 | 三星(中国)半导体有限公司 | 位姿跟踪方法及装置 |
US11610330B2 (en) | 2019-10-08 | 2023-03-21 | Samsung Electronics Co., Ltd. | Method and apparatus with pose tracking |
US11599974B2 (en) * | 2019-11-22 | 2023-03-07 | Nec Corporation | Joint rolling shutter correction and image deblurring |
CN111583305B (zh) * | 2020-05-11 | 2022-06-21 | 北京市商汤科技开发有限公司 | 神经网络训练及运动轨迹确定方法、装置、设备和介质 |
CN114034288B (zh) * | 2021-09-14 | 2022-11-08 | 中国海洋大学 | 海底微地形激光线扫描三维探测方法及系统 |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4844305B2 (ja) | 2005-09-12 | 2011-12-28 | 日本ビクター株式会社 | 撮像装置 |
JP4961800B2 (ja) | 2006-03-31 | 2012-06-27 | ソニー株式会社 | 画像処理装置、および画像処理方法、並びにコンピュータ・プログラム |
US8068140B2 (en) | 2006-08-07 | 2011-11-29 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Still image stabilization suitable for compact camera environments |
US9565419B2 (en) | 2007-04-13 | 2017-02-07 | Ari M. Presler | Digital camera system for recording, editing and visualizing images |
US8698908B2 (en) | 2008-02-11 | 2014-04-15 | Nvidia Corporation | Efficient method for reducing noise and blur in a composite still image from a rolling shutter camera |
JP4666012B2 (ja) | 2008-06-20 | 2011-04-06 | ソニー株式会社 | 画像処理装置、画像処理方法、プログラム |
WO2011053678A1 (fr) | 2009-10-28 | 2011-05-05 | The Trustees Of Columbia University In The City Of New York | Procédés et systèmes pour obturateur roulant codé |
JP2011103631A (ja) | 2009-11-12 | 2011-05-26 | Nikon Corp | デジタルカメラ |
US8401242B2 (en) | 2011-01-31 | 2013-03-19 | Microsoft Corporation | Real-time camera tracking using depth maps |
JP5734082B2 (ja) | 2011-05-11 | 2015-06-10 | キヤノン株式会社 | 撮像装置及びその制御方法、並びにプログラム |
US9542611B1 (en) * | 2011-08-11 | 2017-01-10 | Harmonic, Inc. | Logo detection for macroblock-based video processing |
US8970709B2 (en) | 2013-03-13 | 2015-03-03 | Electronic Scripting Products, Inc. | Reduced homography for recovery of pose parameters of an optical apparatus producing image data with structural uncertainty |
-
2013
- 2013-11-18 FR FR1361305A patent/FR3013487B1/fr active Active
-
2014
- 2014-11-17 EP EP14799446.1A patent/EP3072108A1/fr not_active Withdrawn
- 2014-11-17 WO PCT/EP2014/074762 patent/WO2015071457A1/fr active Application Filing
- 2014-11-17 US US15/037,597 patent/US9940725B2/en not_active Expired - Fee Related
-
2018
- 2018-04-09 US US15/949,015 patent/US10636151B2/en active Active
Non-Patent Citations (2)
Title |
---|
MICHAEL SCHOBERL ET AL: "Modeling of image shutters and motion blur in analog and digital camera systems", IMAGE PROCESSING (ICIP), 2009 16TH IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 7 November 2009 (2009-11-07), pages 3457 - 3460, XP031628506, ISBN: 978-1-4244-5653-6 * |
See also references of WO2015071457A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2015071457A1 (fr) | 2015-05-21 |
US9940725B2 (en) | 2018-04-10 |
US20180308240A1 (en) | 2018-10-25 |
US10636151B2 (en) | 2020-04-28 |
FR3013487A1 (fr) | 2015-05-22 |
US20160292882A1 (en) | 2016-10-06 |
FR3013487B1 (fr) | 2017-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015071457A1 (fr) | Procede d'estimation de la vitesse de deplacement d'une camera | |
EP3072109A1 (fr) | Procede d'estimation de la vitesse de deplacement d'une camera | |
EP0863413B1 (fr) | Procédé et dispositif de localisation d'un objet dans l'espace | |
EP2715662B1 (fr) | Procede de localisation d'une camera et de reconstruction 3d dans un environnement partiellement connu | |
EP2428934B1 (fr) | Procédé d'estimation du mouvement d'un porteur par rapport à un environnement et dispositif de calcul pour système de navigation | |
TW201118791A (en) | System and method for obtaining camera parameters from a plurality of images, and computer program products thereof | |
US20120257814A1 (en) | Image completion using scene geometry | |
EP3144881A1 (fr) | Procede de mosaiquage 3d panoramique d'une scene | |
CN113689578B (zh) | 一种人体数据集生成方法及装置 | |
EP2766872A1 (fr) | Procede d'etalonnage d'un systeme de vision par ordinateur embarque sur un mobile | |
EP3435332A1 (fr) | Dispositif électronique et procédé de génération, à partir d'au moins une paire d'images successives d'une scène, d'une carte de profondeur de la scène, drone et programme d'ordinateur associés | |
CN112648994B (zh) | 基于深度视觉里程计和imu的相机位姿估计方法及装置 | |
EP3200153A1 (fr) | Procédé de détection de cibles au sol et en mouvement dans un flux vidéo acquis par une caméra aéroportée | |
US10679376B2 (en) | Determining a pose of a handheld object | |
EP3355277A1 (fr) | Procédé d'affichage sur un écran d'au moins une représentation d'un objet, programme d'ordinateur, dispositif et appareil électroniques d'affichage associés | |
WO2005010820A2 (fr) | Procede et dispositif automatise de perception avec determination et caracterisation de bords et de frontieres d'objets d'un espace, construction de contours et applications | |
CN112766135A (zh) | 目标检测方法、装置、电子设备和存储介质 | |
CN112116068A (zh) | 一种环视图像拼接方法、设备及介质 | |
WO2013178540A1 (fr) | Procede de mesures tridimensionnelles par stereo-correlation utilisant une representation parametrique de l'objet mesure | |
EP2994813B1 (fr) | Procede de commande d'une interface graphique pour afficher des images d'un objet tridimensionnel | |
EP2943935B1 (fr) | Estimation de mouvement d'une image | |
FR3065097B1 (fr) | Procede automatise de reconnaissance d'un objet | |
WO2021009431A1 (fr) | Procede de determination de parametres d'etalonnage extrinseques d'un systeme de mesure | |
WO2018229358A1 (fr) | Procédé et dispositif de construction d'une image tridimensionnelle | |
FR3052287B1 (fr) | Construction d'une image tridimensionnelle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20160516 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAX | Request for extension of the european patent (deleted) | ||
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: PIXMAP S.A.S. |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06T 7/70 20160928ALI20181024BHEP Ipc: G06T 7/246 20160928AFI20181024BHEP |
|
INTG | Intention to grant announced |
Effective date: 20181126 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06T 7/70 20170101ALI20181024BHEP Ipc: G06T 7/246 20170101AFI20181024BHEP |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20190409 |