CA2650557A1 - System and method for three-dimensional object reconstruction from two-dimensional images - Google Patents


Info

Publication number
CA2650557A1
Authority
CA
Canada
Prior art keywords
image
dimensional
feature points
applying
images
Legal status
Granted
Application number
CA002650557A
Other languages
French (fr)
Other versions
CA2650557C (en)
Inventor
Yousef Wasef Nijim
Izzat Hekmat Izzat
Current Assignee
Thomson Licensing SAS
Original Assignee
Thomson Licensing
Yousef Wasef Nijim
Izzat Hekmat Izzat
Application filed by Thomson Licensing, Yousef Wasef Nijim and Izzat Hekmat Izzat
Publication of CA2650557A1
Application granted
Publication of CA2650557C


Classifications

    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/579Depth or shape recovery from multiple images from motion

Abstract

A system and method for three-dimensional (3D) acquisition and modeling of a scene using two-dimensional (2D) images are provided. The system and method provide for acquiring first and second images of a scene, applying a smoothing function to the first image (202) to make feature points of objects, e.g., corners and edges of the objects, in the scene more visible, applying at least two feature detection functions to the first image to detect feature points of objects in the first image (204, 208), combining outputs of the at least two feature detection functions to select object feature points to be tracked (210), applying a smoothing function to the second image (206), applying a tracking function on the second image to track the selected object feature points (214), and reconstructing a three-dimensional model of the scene from an output of the tracking function (218).

Description

SYSTEM AND METHOD FOR THREE-DIMENSIONAL OBJECT
RECONSTRUCTION FROM TWO-DIMENSIONAL IMAGES

This application claims the benefit under 35 U.S.C. 119 of provisional application No. 60/798,087, filed in the United States on May 5, 2006.

TECHNICAL FIELD OF THE INVENTION
The present invention generally relates to three-dimensional object modeling, and more particularly, to a system and method for three-dimensional (3D) information acquisition from two-dimensional (2D) images using hybrid feature detection and tracking including smoothing functions.

BACKGROUND OF THE INVENTION
When a scene is filmed, the resulting video sequence contains implicit information on the three-dimensional (3D) geometry of the scene. While this implicit information suffices for adequate human perception, many applications require the exact geometry of the 3D scene. One category of such applications involves sophisticated data processing techniques, for instance the generation of new views of the scene, or the reconstruction of the 3D geometry for industrial inspection applications.

Recovering 3D information has been an active research area for some time. There are a large number of techniques in the literature that either capture 3D information directly, for example using a laser range finder, or recover it from one or multiple two-dimensional (2D) images, such as stereo or structure-from-motion techniques. 3D acquisition techniques can in general be classified as active versus passive approaches, single-view versus multi-view approaches, and geometric versus photometric methods.

Passive approaches acquire 3D geometry from images or videos taken under regular lighting conditions; the geometry is computed from geometric or photometric features extracted from the images and videos. Active approaches use special light sources, such as laser, structured light or infrared light, and compute the geometry based on the response of the objects and scenes to the special light projected onto their surfaces.
Single-view approaches recover 3D geometry using multiple images taken from a single camera viewpoint. Examples include structure from motion and depth from defocus.

Multi-view approaches recover 3D geometry from multiple images taken from multiple camera viewpoints, resulting from object motion or from different light source positions. Stereo matching is an example of multi-view 3D recovery, in which the pixels in the left and right images of a stereo pair are matched to obtain the depth information of the pixels.

Geometric methods recover 3D geometry by detecting geometric features such as corners, edges, lines or contours in single or multiple images. The spatial relationship among the extracted corners, edges, lines or contours can be used to infer the 3D coordinates of the pixels in the images. Structure From Motion (SFM) is a technique that attempts to reconstruct the 3D structure of a scene from a sequence of images taken by a camera moving within the scene, or by a static camera viewing a moving object. Although many agree that SFM is fundamentally a nonlinear problem, several attempts at representing it linearly have been made that provide mathematical elegance as well as direct solution methods. Nonlinear techniques, on the other hand, require iterative optimization and must contend with local minima, but they promise good numerical accuracy and flexibility.
The advantage of SFM over stereo matching is that only one camera is needed. Feature-based approaches can be made more effective by tracking techniques, which exploit the past history of the features' motion to predict disparities in the next frame. Moreover, due to the small spatial and temporal differences between two consecutive frames, the correspondence problem can also be cast as the problem of estimating the apparent motion of the image brightness pattern, called the optical flow.
There are several algorithms that use SFM; most of them are based on the reconstruction of 3D geometry from 2D images. Some assume known correspondence values, and others use statistical approaches to reconstruct without correspondence.

The above-described methods have been extensively studied for decades. However, no single technique performs well in all situations, and most of the past methods focus on 3D reconstruction under laboratory conditions, which are relatively easy. For real-world scenes, subjects may be moving, lighting may be complicated, and the depth range may be large. It is difficult for the above-identified techniques to handle these real-world conditions.
SUMMARY
The present disclosure provides a system and method for three-dimensional (3D) acquisition and modeling of a scene using two-dimensional (2D) images.
The system and method of the present disclosure include acquiring at least two images of a scene and applying a smoothing function to make the features more visible, followed by a hybrid scheme of feature selection and tracking for the recovery of 3D information. Initially, the smoothing function is applied to the images, followed by a feature point selection that finds the features in the image. At least two feature point detection functions are employed to cover a wider range of good feature points in the first image; then the smoothing function is applied to the second image, followed by a tracking function to track the detected feature points in the second image. The results of the feature detection/selection and tracking are combined to obtain a complete 3D model. One target application of this work is 3D reconstruction of film sets. The resulting 3D models can be used for visualization during film shooting or for postproduction. Other applications, including but not limited to gaming and 3D TV, will benefit from this approach.
According to one aspect of the present disclosure, a three-dimensional acquisition process is provided including acquiring first and second images of a scene, applying at least two feature detection functions to the first image to detect feature points of objects in the image, combining outputs of the at least two feature detection functions to select object feature points to be tracked, applying a tracking function on the second image to track the selected object feature points, and reconstructing a three-dimensional model of the scene from the output of the tracking function. The process further includes applying a smoothing function to the first image, before the applying of the at least two feature detection functions, to make the feature points of objects in the first image more visible, wherein the feature points are corners, edges or lines of objects in the image.

In another aspect of the present disclosure, a system for three-dimensional (3D) information acquisition from two-dimensional (2D) images is provided. The system includes a post-processing device configured for reconstructing a three-dimensional model of a scene from at least two images, the post-processing device including a feature point detector configured to detect feature points in an image, the feature point detector including at least two feature detection functions, wherein at least two feature detection functions are applied to a first image of the at least two images, a feature point tracker configured for tracking selected feature points between at least two images, and a depth map generator configured to generate a depth map between the at least two images from the tracked feature points, wherein the post-processing device creates the 3D model from the depth map. The post-processing device further includes a smoothing function filter configured for making feature points of objects in the first image more visible.

In a further aspect of the present disclosure, a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for modeling a three-dimensional (3D) scene from two-dimensional (2D) images is provided, the method including acquiring first and second images of a scene, applying a smoothing function to the first image, applying at least two feature detection functions to the smoothed first image to detect feature points of objects in the image, combining outputs of the at least two feature detection functions to select object feature points to be tracked, applying the smoothing function on the second image, applying a tracking function on the second image to track the selected object feature points, and reconstructing a three-dimensional model of the scene from an output of the tracking function.
BRIEF DESCRIPTION OF THE DRAWINGS

These, and other aspects, features and advantages of the present invention will be described or become apparent from the following detailed description of the preferred embodiments, which is to be read in connection with the accompanying drawings.

In the drawings, wherein like reference numerals denote similar elements throughout the views:

FIG. 1 is an exemplary illustration of a system for three-dimensional (3D) information acquisition according to an aspect of the present invention;
FIG. 2 is a flow diagram of an exemplary method for reconstructing three-dimensional (3D) objects from two-dimensional (2D) images according to an aspect of the present invention;

FIG. 3A is an illustration of a scene processed with one feature point detection function; and

FIG. 3B is an illustration of the scene shown in FIG. 3A processed with a hybrid detection function.
It should be understood that the drawings are for purposes of illustrating the concepts of the invention and are not necessarily the only possible configuration for illustrating the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

It should be understood that the elements shown in the Figures may be implemented in various forms of hardware, software or combinations thereof.
Preferably, these elements are implemented in a combination of hardware and software on one or more appropriately programmed general-purpose devices, which may include a processor, memory and input/output interfaces.

The present description illustrates the principles of the present invention.
It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope.

All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.

Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read only memory ("ROM") for storing software, random access memory ("RAM"), and nonvolatile storage.

Other hardware, conventional and/or custom, may also be included.
Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
The techniques disclosed in the present invention deal with the problem of recovering the 3D geometries of objects and scenes. Recovering the geometry of a real-world scene is a challenging problem due to the movement of subjects, large depth discontinuities between foreground and background, and complicated lighting and brightness conditions. The current methods used in feature point selection and tracking to estimate a depth map of an image or to reconstruct the 3D representation do not perform very well by themselves. The reconstruction of 3D models from images is used, but the results are limited and the depth map is not very accurate. Some techniques for accurate 3D acquisition, such as laser scanning, are unacceptable in many situations due to, for example, the presence of human subjects.

A system and method are provided for recovering the three-dimensional (3D) geometries of objects and scenes. The system and method of the present invention provide an enhancement approach for Structure From Motion (SFM) using a hybrid approach to recover 3D features. This technique is motivated by the lack of a single method capable of reliably locating features in large environments. The techniques of the present invention start by first applying a smoothing function, such as a Poisson or Laplacian transform, to the images before feature point detection/selection and tracking. This type of smoothing filter helps make the features in the images more visible and easier to detect than the commonly used Gaussian function. Then, multiple feature detectors are applied to one image to obtain good features. After the use of two feature detectors, good features are obtained, which are then tracked easily throughout several images using a tracking method.

Referring now to the Figures, exemplary system components according to an embodiment of the present disclosure are shown in FIG. 1. A scanning device 103 may be provided for scanning film prints 104, e.g., camera-original film negatives, into a digital format, e.g., Cineon format or Society of Motion Picture and Television Engineers (SMPTE) Digital Picture Exchange (DPX) files.

The scanning device 103 may comprise, e.g., a telecine or any device that will generate a video output from film such as, e.g., an Arri LocPro™ with video output. Alternatively, files from the post-production process or digital cinema 106 (e.g., files already in computer-readable form) can be used directly. Potential sources of computer-readable files are AVID™ editors, DPX files, D5 tapes, etc.

Scanned film prints are input to the post-processing device 102, e.g., a computer. The computer is implemented on any of the various known computer platforms having hardware such as one or more central processing units (CPU), memory 110 such as random access memory (RAM) and/or read only memory (ROM), and input/output (I/O) user interface(s) 112 such as a keyboard, cursor control device (e.g., a mouse or joystick) and display device. The computer platform also includes an operating system and micro instruction code. The various processes and functions described herein may either be part of the micro instruction code or part of a software application program (or a combination thereof) which is executed via the operating system. In one embodiment, the software application program is tangibly embodied on a program storage device, which may be uploaded to and executed by any suitable machine such as post-processing device 102. In addition, various other peripheral devices may be connected to the computer platform by various interfaces and bus structures, such as a parallel port, serial port or universal serial bus (USB). Other peripheral devices may include additional storage devices 124 and a printer 128. The printer 128 may be employed for printing a revised version of the film 126, wherein scenes may have been altered or replaced using 3D modeled objects as a result of the techniques described below.

Alternatively, files/film prints already in computer-readable form 106 (e.g., digital cinema, which for example, may be stored on external hard drive 124) may be directly input into the computer 102. Note that the term "film" used herein may refer to either film prints or digital cinema.
A software program includes a three-dimensional (3D) reconstruction module 114 stored in the memory 110. The 3D reconstruction module 114 includes a smoothing function filter 116 for making features of objects in images more visible to detect. The 3D reconstruction module 114 also includes a feature point detector 118 for detecting feature points in an image. The feature point detector 118 will include at least two different feature point detection functions, e.g., algorithms, for detecting or selecting feature points. A feature point tracker 120 is provided for tracking selected feature points throughout a plurality of consecutive images via a tracking function or algorithm. A depth map generator 122 is also provided for generating a depth map from the tracked feature points.

FIG. 2 is a flow diagram of an exemplary method for reconstructing three-dimensional (3D) objects from two-dimensional (2D) images according to an aspect of the present invention.

Referring to FIG. 2, initially, the post-processing device 102 obtains the digital master video file in a computer-readable format. The digital video file may be acquired by capturing a temporal sequence of video images with a digital video camera. Alternatively, the video sequence may be captured by a conventional film-type camera. In this scenario, the film is scanned via scanning device 103 and the process proceeds to step 202. The camera will acquire 2D images while moving either the object in a scene or the camera. The camera will acquire multiple viewpoints of the scene.

It is to be appreciated that whether the film is scanned or already in digital format, the digital file of the film will include indications or information on the locations of the frames (e.g., timecode, frame number, time from start of the film, etc.).
Each frame of the digital video file will include one image, e.g., I1, I2, ..., In.
In step 202, a smoothing function filter 116 is applied to image I1.
Preferably, the smoothing function filter 116 is a Poisson or Laplacian transform, which helps make features of objects in the image more visible and easier to detect than the Gaussian function commonly used in the art. It is to be appreciated that other smoothing function filters may be employed.
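By way of a non-limiting illustration only, the following sketch shows one way such a pre-filtering step could be realized with the OpenCV and NumPy libraries. The function name prefilter_image and the Laplacian-based edge boost (an unsharp-masking style enhancement) are assumptions of this sketch and are not prescribed by the present disclosure, which names Poisson or Laplacian transforms generally.

```python
import cv2
import numpy as np

def prefilter_image(image_gray):
    """Pre-filter a grayscale frame before feature detection (cf. step 202).

    Illustrative stand-in for the Poisson/Laplacian smoothing named in the
    disclosure: a Laplacian-based edge boost rather than a Gaussian blur.
    """
    img = image_gray.astype(np.float32)
    lap = cv2.Laplacian(img, ddepth=cv2.CV_32F, ksize=3)   # second-derivative response
    enhanced = img - 0.5 * lap                             # emphasize corners and edges
    enhanced = cv2.normalize(enhanced, None, 0, 255, cv2.NORM_MINMAX)
    return enhanced.astype(np.uint8)
```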

Image I1 is then processed by a first feature point detector in step 204.
Feature points are the salient features of an image, such as corners, edges, lines or the like, where there is a high amount of image intensity contrast. The feature points are selected because they are easily identifiable and may be tracked robustly.
The feature point detector 118 may use a Kitchen-Rosenfeld corner detection operator C, as is well known in the art. This operator is used to evaluate the degree of "cornerness" of the image at a given pixel location. "Corners" are generally image features characterized by the intersection of two directions of image intensity gradient maxima, for example at a 90 degree angle. To extract feature points, the Kitchen-Rosenfeld operator is applied at each valid pixel position of image I1.
The higher the value of the operator C at a particular pixel, the higher its degree of "cornerness", and the pixel position (x,y) in image I1 is a feature point if C at (x,y) is greater than at other pixel positions in a neighborhood around (x,y). The neighborhood may be a 5x5 matrix centered on the pixel position (x,y). To assure robustness, the selected feature points may have a degree of cornerness greater than a threshold, such as Tc = 10. The output from the feature point detector 118 is a set of feature points {Fi} in image I1, where each Fi corresponds to a "feature" pixel position in image I1. Many other feature point detectors can be employed, including, but not limited to, the Scale Invariant Feature Transform (SIFT), Smallest Univalue Segment Assimilating Nucleus (SUSAN), Hough transform, Sobel edge operator and Canny edge detector.
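As a non-limiting illustration of the selection rule described above (local maximum of the cornerness measure C within a 5x5 neighborhood, above a threshold), the following sketch uses the Harris response available in OpenCV as a stand-in cornerness operator, since the Kitchen-Rosenfeld operator has no built-in OpenCV implementation. The function name, the relative threshold (in place of the absolute Tc = 10 of the Kitchen-Rosenfeld case) and the Harris parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def detect_corners(image_gray, rel_threshold=0.01, nms_size=5):
    """Select feature points as local maxima of a cornerness map (cf. step 204).

    Harris response substitutes here for the Kitchen-Rosenfeld operator C.
    A pixel (x, y) is kept if C(x, y) exceeds the threshold and is the
    maximum of C within an nms_size x nms_size neighbourhood (the 5x5 rule).
    """
    C = cv2.cornerHarris(np.float32(image_gray), blockSize=2, ksize=3, k=0.04)
    local_max = cv2.dilate(C, np.ones((nms_size, nms_size), np.uint8))  # neighbourhood maxima
    keep = (C >= local_max) & (C > rel_threshold * C.max())
    ys, xs = np.nonzero(keep)
    return np.stack([xs, ys], axis=1).astype(np.float32)   # feature points {Fi} as (x, y)
```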
In step 206, image I1 is input to the smoothing function filter 116, and a second, different feature point detector is applied to the image (step 208). The feature points that are detected in steps 204 and 208 are then combined and the duplicate selected feature points are eliminated (step 210). It is to be appreciated that the smoothing function filter applied at step 206 is the same filter applied at step 202; however, in other embodiments, different smoothing function filters may be used in each of steps 202 and 206.
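A minimal, non-limiting sketch of the combination of step 210 follows, assuming feature points are represented as (x, y) arrays and that a "duplicate" means two detections falling within a small pixel radius of one another; that radius, and the function name, are assumptions of this sketch rather than values given by the disclosure.

```python
import numpy as np

def combine_feature_points(points_a, points_b, min_dist=3.0):
    """Merge feature points from two detectors, dropping near-duplicates (cf. step 210).

    points_a and points_b are (N, 2) arrays of (x, y) positions produced by
    the two detection functions (steps 204 and 208).  A point from the second
    set is discarded if it lies within min_dist pixels of an accepted point.
    """
    merged = list(points_a)
    for p in points_b:
        if all(np.hypot(*(p - q)) > min_dist for q in merged):
            merged.append(p)
    return np.asarray(merged, dtype=np.float32)
```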
It is to be appreciated that, by employing a hybrid approach to feature point detection, a large number of feature points will be detected. FIG. 3A illustrates a scene with detected feature points represented by small squares. The scene in FIG. 3A was processed with one feature point detector. In contrast, the scene in FIG. 3B was processed with a hybrid point detector approach in accordance with the present invention and has a significantly higher number of detected feature points.

After the detected feature points are chosen, a second image I2 is smoothed using the same smoothing function filter that was used on the first image I1 (step 212). The good feature points that were selected on the first image I1 are then tracked on the second image I2 (step 214). Given the set of feature points {Fi} in image I1, the feature point tracker 120 tracks the feature points into the next image I2 of the scene shot by finding their closest match.
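As one non-limiting example of a tracking function for step 214, the following sketch uses pyramidal Lucas-Kanade optical flow from OpenCV. The disclosure requires only that each selected feature point be matched to its closest correspondence in image I2; this particular tracker, its window and pyramid parameters, and the function name are illustrative assumptions.

```python
import cv2
import numpy as np

def track_features(img1_gray, img2_gray, points1):
    """Track the selected feature points from image I1 into image I2 (cf. step 214).

    Pyramidal Lucas-Kanade optical flow is used here as one common choice of
    tracking function; only successfully tracked points are returned.
    """
    p1 = points1.reshape(-1, 1, 2).astype(np.float32)
    p2, status, _err = cv2.calcOpticalFlowPyrLK(
        img1_gray, img2_gray, p1, None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return p1[ok].reshape(-1, 2), p2[ok].reshape(-1, 2)
```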

As described above, in other embodiments, the smoothing function filter applied in step 212 may be different than the filters applied in steps 202 and 206.
Furthermore, it is to be appreciated that although steps 202 through 212 were described sequentially, in certain embodiments, the smoothing function filters may be applied simultaneously via parallel processing or hardware.
Once the feature points are tracked, the disparity information is calculated for each tracked feature. Disparity is calculated as the difference between the pixel locations in I1 and I2 in the horizontal direction.
Disparity is inversely related to depth, with a scaling factor related to the camera calibration parameters. At step 216, camera calibration parameters are obtained and employed by the depth map generator 122 to generate a depth map for the object or scene between the two images. The camera parameters include, but are not limited to, the focal length of the camera and the distance between the two camera shots. The camera parameters may be manually entered into the system 100 via the user interface 112 or estimated by camera calibration algorithms.
Using the camera parameters, the depth is estimated at the feature points. The resulting depth map is sparse, with depth values only at the detected feature points. A depth map is a two-dimensional array of values for mathematically representing a surface in space, where the rows and columns of the array correspond to the x and y location information of the surface, and the array elements are depth or distance readings to the surface from a given point or camera location. A depth map can be viewed as a grey scale image of an object, with the depth information replacing the intensity information, or pixels, at each point on the surface of the object.
Accordingly, surface points are also referred to as pixels within the technology of 3D graphical construction, and the two terms will be used interchangeably within this disclosure.
Since disparity information is inversely proportional to depth (up to a scaling factor), it can be used directly for building the 3D scene model for most applications.
This simplifies the computation, since it makes the computation of camera parameters unnecessary.
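A minimal, non-limiting sketch of the disparity and sparse depth computation of step 216 follows, assuming a rectified, purely horizontal displacement between the two camera positions so that depth Z = f·B/d, with f the focal length in pixels, B the distance between the two shots and d the horizontal disparity. The function name and the zero-disparity guard are assumptions of this sketch.

```python
import numpy as np

def sparse_depth_from_disparity(points1, points2, focal_length_px, baseline):
    """Compute horizontal disparity and sparse depth at tracked feature points (cf. step 216).

    points1 and points2 are matched (N, 2) arrays of (x, y) positions in I1 and
    I2; focal_length_px and baseline are the calibration inputs described above.
    Points with (near-)zero disparity are returned with depth NaN.
    """
    disparity = points1[:, 0] - points2[:, 0]                     # horizontal pixel difference
    safe = np.where(np.abs(disparity) < 1e-6, np.nan, disparity)  # guard against division by zero
    depth = focal_length_px * baseline / np.abs(safe)
    return disparity, depth
```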

From the sets of feature points present in the image pair I1 and I2 and an estimate of the depth at each feature point, and assuming that the feature points are chosen so that they lie relatively close to each other and span the whole image, the depth map generator 122 creates a 3D mesh structure by interconnecting the feature points such that the feature points lie at the vertices of the formed polygons. The closer the feature points are to each other, the denser the resulting 3D mesh structure.

Since the depth at each vertex of the 3D structure is known, the depths of the points within each polygon may be estimated. In this way the depth at all image pixel positions may be estimated. This may be done by planar interpolation.

A robust and fast method of generating the 3D mesh structure is Delaunay triangulation. The feature points are connected to form a set of triangles whose vertices lie at the feature point positions. Using the depth associated with each feature point and its corresponding vertex, a "depth plane" may be fitted to each individual triangle, from which the depths of every point within the triangle may be determined.
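The triangulation and per-triangle depth-plane fitting described above may be sketched, for illustration only, with SciPy's Delaunay triangulation and piecewise-linear interpolation; the function name and the convention of leaving pixels outside the convex hull of the feature points as NaN are assumptions of this sketch.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

def dense_depth_from_mesh(points, depths, image_shape):
    """Triangulate feature points and interpolate depth inside each triangle.

    points is an (N, 2) array of (x, y) feature-point positions and depths the
    depth estimated at each of them.  Delaunay triangulation connects the points
    into triangles whose vertices are the feature points, and piecewise-linear
    interpolation fits a depth plane to each triangle.
    """
    tri = Delaunay(points)                            # mesh vertices at feature points
    interpolate = LinearNDInterpolator(tri, depths)   # one depth plane per triangle
    h, w = image_shape[:2]
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    dense_depth = interpolate(xs, ys)                 # NaN outside the mesh's convex hull
    return tri, dense_depth
```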
A complete 3D model of the object can be reconstructed by combining the triangulation mesh resulting from the Delaunay algorithm with the texture information from image I1 (step 218). The texture information is the 2D intensity image. The complete 3D model will include depth and intensity values at the image pixels. The resulting combined image can be visualized using conventional visualization tools such as the ScanAlyze software developed at Stanford University, Stanford, CA.

The reconstructed 3D model of a particular object or scene may then be rendered for viewing on a display device or saved in a digital file 130 separate from the file containing the images. The digital file of 3D reconstruction 130 may be stored in storage device 124 for later retrieval, e.g., during an editing stage of the film where a modeled object may be inserted into a scene where the object was not previously present.

The system and method of the present invention utilize multiple feature point detectors and combine their results to improve the number and quality of the detected feature points. In contrast to a single feature detector, combining different feature point detectors improves the results of finding good feature points to track. After obtaining the "better" results from the multiple feature point detectors (i.e., using more than one feature point detector), the feature points in the second image are easier to track and produce better depth map results than those obtained using one feature detector alone.
Although the embodiment which incorporates the teachings of the present invention has been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.
Having described preferred embodiments for a system and method for three-dimensional (3D) acquisition and modeling of a scene (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described the invention with the details and particularity required by the patent laws, what is claimed and desired to be protected by Letters Patent is set forth in the appended claims.

Claims (21)

1. A three-dimensional acquisition process comprising:
acquiring first and second images of a scene;
applying at least two feature detection functions to the first image to detect feature points of objects in the first image (204,208);
combining outputs of the at least two feature detection functions to select object feature points to be tracked (210);
applying a tracking function on the second image to track the selected object feature points (214); and reconstructing a three-dimensional model of the scene from an output of the tracking function (218).
2. The three-dimensional acquisition process as in claim 1, further comprising applying a smoothing function on the first image (202) before the applying at least two feature detection functions step to make the feature points of objects in the first image more visible.
3. The three-dimensional acquisition process as in claim 2, wherein the feature points are corners, edges or lines of objects in the image.
4. The three-dimensional acquisition process as in claim 2, further comprising applying the same smoothing function on the second image (206) before the applying a tracking function step.
5. The three-dimensional acquisition process as in claim 1, further comprising applying a first smoothing function to the first image (202) before applying a first of the at least two feature detection functions and applying a second smoothing function to the first image (206) before applying a second of the at least two feature detection functions, the first and second smoothing functions make the feature points of objects in the first image more visible.
6. The three-dimensional acquisition process as in claim 1, wherein the combining step further comprises eliminating duplicate feature points detected by the at least two feature detection functions.
7. The three-dimensional acquisition process as in claim 1, wherein the reconstructing step further comprises generating a depth map of the selected object feature points between the first and second images (216).
8. The three-dimensional acquisition process as in claim 7, wherein the reconstructing step further comprises generating a three-dimensional mesh structure from the selected object feature points and the depth map.
9. The three-dimensional acquisition process as in claim 8, wherein the generating a three-dimensional mesh structure step is performed by a triangulation function.
10. The three-dimensional acquisition process as in claim 8, wherein the reconstructing step further comprises combining the mesh structure with texture information from the first image to complete the three-dimensional model.
11. A system (100) for three-dimensional (3D) information acquisition from two-dimensional (2D) images, the system comprising:
a post-processing device (102) configured for reconstructing a three-dimensional model of a scene from at least two images; the post-processing device including a feature point detector (118) configured to detect feature points in an image, the feature point detector (118) including at least two feature detection functions, wherein at least two feature detection functions are applied to a first image of the at least two images;
a feature point tracker (120) configured for tracking selected feature points between at least two images; and a depth map generator (122) configured to generate a depth map between the at least two images from the tracked feature points;
wherein the post-processing device creates the 3D model from the depth map.
12. The system (100) as in claim 11, wherein the post-processing device (102) further includes a smoothing function filter (116) configured for making feature points of objects in the first image more visible.
13. The system (100) as in claim 12, wherein the smoothing function filter (116) is a Poisson transform or Laplacian transform.
14. The system (100) as in claim 12, wherein the feature point detector (118) is configured to combine the detected feature points from the at least two feature detection functions and eliminate duplicate detected feature points.
15. The system (100) as in claim 12, wherein the post-processing device (102) is further configured to generate a three-dimensional mesh structure from the selected feature points and the depth map.
16. The system (100) as in claim 15, wherein the post-processing device (102) is further configured for combining the mesh structure with texture information from the first image to complete the 3D model.
17. The system (100) as in claim 16, further comprising a display device (112) for rendering the 3D model.
18. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for modeling a three-dimensional (3D) scene from two-dimensional (2D) images, the method comprising:

acquiring first and second images of a scene;

applying a smoothing function to the first image (202);
applying at least two feature detection functions to the smoothed first image to detect feature points of objects in the first image (204, 208);
combining outputs of the at least two feature detection functions to select object feature points to be tracked (210);
applying the smoothing function on the second image (206);
applying a tracking function on the second image to track the selected object feature points (214); and reconstructing a three-dimensional model of the scene from an output of the tracking function (218).
19. The program storage device as in claim 18, wherein the reconstructing step further comprises generating a depth map of the selected object feature points between the first and second images.
20. The program storage device as in claim 19, wherein the reconstructing step further comprises generating a three-dimensional mesh structure from the selected object feature points and the depth map.
21. The program storage device as in claim 20, wherein the reconstructing step further comprises combining the mesh structure with texture information from the first image to complete the three-dimensional model.
CA2650557A 2006-05-05 2006-10-25 System and method for three-dimensional object reconstruction from two-dimensional images Expired - Fee Related CA2650557C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US79808706P 2006-05-05 2006-05-05
US60/798,087 2006-05-05
PCT/US2006/041647 WO2007130122A2 (en) 2006-05-05 2006-10-25 System and method for three-dimensional object reconstruction from two-dimensional images

Publications (2)

Publication Number Publication Date
CA2650557A1 true CA2650557A1 (en) 2007-11-15
CA2650557C CA2650557C (en) 2014-09-30

Family

ID=38577526

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2650557A Expired - Fee Related CA2650557C (en) 2006-05-05 2006-10-25 System and method for three-dimensional object reconstruction from two-dimensional images

Country Status (5)

Country Link
EP (1) EP2016559A2 (en)
JP (1) JP2009536499A (en)
CN (1) CN101432776B (en)
CA (1) CA2650557C (en)
WO (1) WO2007130122A2 (en)

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7542034B2 (en) 2004-09-23 2009-06-02 Conversion Works, Inc. System and method for processing video images
KR100894874B1 (en) * 2007-01-10 2009-04-24 주식회사 리얼이미지 Apparatus and Method for Generating a Stereoscopic Image from a Two-Dimensional Image using the Mesh Map
US8089635B2 (en) 2007-01-22 2012-01-03 California Institute Of Technology Method and system for fast three-dimensional imaging using defocusing and feature recognition
EP2106531A2 (en) 2007-01-22 2009-10-07 California Institute Of Technology Method for quantitative 3-d imaging
US8655052B2 (en) 2007-01-26 2014-02-18 Intellectual Discovery Co., Ltd. Methodology for 3D scene reconstruction from 2D image sequences
US8274530B2 (en) 2007-03-12 2012-09-25 Conversion Works, Inc. Systems and methods for filling occluded information for 2-D to 3-D conversion
AU2008244494B2 (en) 2007-04-23 2010-10-21 California Institute Of Technology Single-lens 3-D imaging device using a polarization-coded aperture mask combined with a polarization-sensitive sensor
WO2009067223A2 (en) * 2007-11-19 2009-05-28 California Institute Of Technology Method and system for fast three-dimensional imaging using defocusing and feature recognition
US8514268B2 (en) 2008-01-22 2013-08-20 California Institute Of Technology Method and device for high-resolution three-dimensional imaging which obtains camera pose using defocusing
CN101383046B (en) * 2008-10-17 2011-03-16 北京大学 Three-dimensional reconstruction method on basis of image
US8773507B2 (en) 2009-08-11 2014-07-08 California Institute Of Technology Defocusing feature matching system to measure camera pose with interchangeable lens cameras
US8773514B2 (en) 2009-08-27 2014-07-08 California Institute Of Technology Accurate 3D object reconstruction using a handheld device with a projected light pattern
CN102271262B (en) 2010-06-04 2015-05-13 三星电子株式会社 Multithread-based video processing method for 3D (Three-Dimensional) display
US8675926B2 (en) * 2010-06-08 2014-03-18 Microsoft Corporation Distinguishing live faces from flat surfaces
DK3091508T3 (en) 2010-09-03 2019-04-08 California Inst Of Techn Three-dimensional imaging system
US8855406B2 (en) 2010-09-10 2014-10-07 Honda Motor Co., Ltd. Egomotion using assorted features
US9224245B2 (en) * 2011-01-10 2015-12-29 Hangzhou Conformal & Digital Technology Limited Corporation Mesh animation
US10607350B2 (en) * 2011-08-31 2020-03-31 Apple Inc. Method of detecting and describing features from an intensity image
WO2013029673A1 (en) * 2011-08-31 2013-03-07 Metaio Gmbh Method of detecting and describing features from an intensity image
JP5966837B2 (en) * 2012-10-05 2016-08-10 大日本印刷株式会社 Depth production support apparatus, depth production support method, and program
EP3047391B1 (en) * 2013-09-18 2023-06-28 Siemens Medical Solutions USA, Inc. Method and system for statistical modeling of data using a quadratic likelihood functional
CN104517316B (en) * 2014-12-31 2018-10-16 中科创达软件股份有限公司 A kind of object modelling method and terminal device
US9613452B2 (en) * 2015-03-09 2017-04-04 Siemens Healthcare Gmbh Method and system for volume rendering based 3D image filtering and real-time cinematic rendering
CN108140243B (en) * 2015-03-18 2022-01-11 北京市商汤科技开发有限公司 Method, device and system for constructing 3D hand model
WO2017132165A1 (en) 2016-01-25 2017-08-03 California Institute Of Technology Non-invasive measurement of intraocular pressure
CN106023307B (en) * 2016-07-12 2018-08-14 深圳市海达唯赢科技有限公司 Quick reconstruction model method based on site environment and system
CN106846469B (en) * 2016-12-14 2019-12-03 北京信息科技大学 Based on tracing characteristic points by the method and apparatus of focusing storehouse reconstruct three-dimensional scenic
US10586379B2 (en) 2017-03-08 2020-03-10 Ebay Inc. Integration of 3D models
US11727656B2 (en) 2018-06-12 2023-08-15 Ebay Inc. Reconstruction of 3D model with immersive experience
CN109117496B (en) * 2018-06-25 2023-10-27 国网经济技术研究院有限公司 Three-dimensional simulation design method and system for temporary construction arrangement of transformer substation engineering
CN110942479B (en) * 2018-09-25 2023-06-02 Oppo广东移动通信有限公司 Virtual object control method, storage medium and electronic device
CN110533777B (en) * 2019-08-01 2020-09-15 北京达佳互联信息技术有限公司 Three-dimensional face image correction method and device, electronic equipment and storage medium
CN111083373B (en) * 2019-12-27 2021-11-16 恒信东方文化股份有限公司 Large screen and intelligent photographing method thereof
CN111601246B (en) * 2020-05-08 2021-04-20 中国矿业大学(北京) Intelligent position sensing system based on space three-dimensional model image matching
CN111724481A (en) * 2020-06-24 2020-09-29 嘉应学院 Method, device, equipment and storage medium for three-dimensional reconstruction of two-dimensional image

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3548652B2 (en) * 1996-07-24 2004-07-28 株式会社東芝 Apparatus and method for restoring object shape
JP3512992B2 (en) * 1997-01-07 2004-03-31 株式会社東芝 Image processing apparatus and image processing method
JP2003242162A (en) * 2002-02-18 2003-08-29 Nec Soft Ltd Feature point extracting method, image searching method, feature point extracting apparatus, image searching system, and program
FR2837597A1 (en) * 2002-03-25 2003-09-26 Thomson Licensing Sa Three-dimensional scene modeling process, involves calculating point of reference image on basis of set of images, of defined minimum and maximum depth values of point depth corresponding to maximum distortion
CN100416613C (en) * 2002-09-29 2008-09-03 西安交通大学 Intelligent scene drawing system and drawing & processing method in computer network environment
CN1312633C (en) * 2004-04-13 2007-04-25 清华大学 Automatic registration method for large scale three dimension scene multiple view point laser scanning data

Also Published As

Publication number Publication date
EP2016559A2 (en) 2009-01-21
CA2650557C (en) 2014-09-30
WO2007130122A2 (en) 2007-11-15
CN101432776B (en) 2013-04-24
WO2007130122A3 (en) 2008-04-17
CN101432776A (en) 2009-05-13
JP2009536499A (en) 2009-10-08


Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed

Effective date: 20151026