WO2001029767A2 - System and method for three-dimensional modeling

System and method for three-dimensional modeling

Info

Publication number
WO2001029767A2
WO2001029767A2 (PCT/EP2000/009879); also published as WO2001029767A3
Authority
WO
WIPO (PCT)
Prior art keywords
feature
matching
information
model
image
Prior art date
Application number
PCT/EP2000/009879
Other languages
French (fr)
Other versions
WO2001029767A3 (en)
Inventor
Yan Yong
Kiran Challapali
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Priority to KR1020017007811A priority Critical patent/KR20010089664A/en
Priority to EP00969460A priority patent/EP1190385A2/en
Priority to JP2001532487A priority patent/JP2003512802A/en
Publication of WO2001029767A2 publication Critical patent/WO2001029767A2/en
Publication of WO2001029767A3 publication Critical patent/WO2001029767A3/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/10 Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/194 Transmission of image signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/15 Processing image signals for colour aspects of image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/189 Recording image signals; Reproducing recorded image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074 Stereoscopic image analysis
    • H04N2013/0081 Depth or disparity estimation from stereoscopic image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074 Stereoscopic image analysis
    • H04N2013/0085 Motion estimation from stereoscopic image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074 Stereoscopic image analysis
    • H04N2013/0092 Image segmentation from stereoscopic image signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

A method and image processing system are disclosed that provide a three-dimensional model using information from an input pair of image frames. A three-dimensional surface is obtained by first identifying features in the input frames. Feature correspondence matching is then performed using disparity information derived from the frames. The identified features may also be correlated using temporal information.

Description

System and method for three-dimensional modeling.
FIELD OF THE INVENTION
The present invention pertains generally to the field of three-dimensional modeling, and in particular, the invention relates to a system and method for three-dimensional modeling of an object contained in a digital image using disparity-based information.
BACKGROUND OF THE INVENTION
Video/image communication applications over the Internet or the Public Switched Telephone Network (PSTN) are growing in popularity and use. In conventional video/image communication technology, a picture (in a JPEG or GIF format) is captured and then transmitted over a transmission network. This approach, however, requires a large bandwidth because of the size (i.e., the amount of data) of the picture.
When coding moving images at data rates between 64 Kbit/sec and 2 Mbit/sec, a block-based hybrid coder is typically used. The coder subdivides each image of a sequence into independently moving blocks. Each block is then coded by 2D motion prediction and transform coding. Depending on the transmission rate, the resulting received image may not play smoothly and cannot be played in real time.
Methods have been used to improve video/image communication and/or to reduce the amount of information required to be transmitted. One such method has been used in videophone applications. An image is encoded by three sets of parameters which define its motion, shape and surface color. Since the subject of the visual communication is typically a human, primary focus can be directed to the subject's head or face.
In conventional videophone communication systems, a single camera is typically used to acquire an image of the person making a video call. Since only one camera is used, it is difficult to acquire a true three-dimensional (3D) face surface (i.e., the shape parameters). Typically, to generate a 3D surface, multiple (usually six) two-dimensional views of an object are required. Using these views, a distance transformation is applied. For example, an ellipse can be used as a generating function to obtain the Z coordinate of shapes in the object as a function of their distance to the boundary of the object. The resulting contour lines can only approximate the 3D shape.
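By way of illustration only, the following is a minimal Python sketch of the prior-art distance-transform approach described above: the Z coordinate is approximated from each pixel's distance to the object boundary through an elliptical generating function. The silhouette, the depth scale and the exact function form are illustrative assumptions, not values taken from this description.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def depth_from_silhouette(mask, max_depth=1.0):
    """Approximate per-pixel Z for a binary silhouette (nonzero = object)."""
    dist = distance_transform_edt(mask)   # distance of each object pixel to the boundary
    if dist.max() == 0:
        return np.zeros_like(dist)
    d = dist / dist.max()                 # normalized distance to the boundary, in [0, 1]
    # Elliptical generating function: z = max_depth * sqrt(1 - (1 - d)^2)
    return max_depth * np.sqrt(1.0 - (1.0 - d) ** 2)

# Illustrative use with a crude circular silhouette
yy, xx = np.mgrid[0:64, 0:64]
mask = ((yy - 32) ** 2 + (xx - 32) ** 2) < 20 ** 2
z = depth_from_silhouette(mask.astype(float), max_depth=10.0)
```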
Another method is called model-based coding. Low bit-rate communication can be achieved by encoding and transmitting only representative facial parameters of the subject's head. At the remote site, a face image is synthesized using the transmitted parameters. In general, model-based coding requires at least four tasks: segmentation of the face, extraction of facial features, tracking of the features and estimation of motion.
One known method for face segmentation is to create a dataset describing a parameterized face. This dataset defines a three-dimensional description of a face object. The parameterized face is given as an anatomically-based structure by modeling muscle and skin actuators and force-based deformations.
As shown in Fig. 1, a set of polygons defines a human face model 100. Each vertex of the polygons is defined by X, Y and Z coordinates and is identified by an index number. A particular polygon is defined by the set of indices surrounding the polygon. A code may also be added to the set of indices to define a color for the particular polygon.
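For illustration, a minimal sketch of such an indexed-polygon representation is given below; the vertex coordinates, index numbers and color code are hypothetical values, not data from the face model 100 of Fig. 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class PolygonFaceModel:
    # Each vertex is an (X, Y, Z) triple; its position in the list is its index number.
    vertices: List[Tuple[float, float, float]] = field(default_factory=list)
    # Each polygon is the set of vertex indices surrounding it, plus an optional color code.
    polygons: List[Tuple[Tuple[int, ...], Optional[int]]] = field(default_factory=list)

    def add_vertex(self, x, y, z):
        self.vertices.append((x, y, z))
        return len(self.vertices) - 1

    def add_polygon(self, indices, color=None):
        self.polygons.append((tuple(indices), color))

model = PolygonFaceModel()
i0 = model.add_vertex(0.0, 0.0, 0.0)
i1 = model.add_vertex(1.0, 0.0, 0.1)
i2 = model.add_vertex(0.5, 1.0, 0.2)
model.add_polygon((i0, i1, i2), color=7)   # a triangle defined by its surrounding indices and a color code
```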
Systems and methods are also known that analyze digital images, recognize a human face and extract facial features. Conventional facial feature detection systems use methods such as facial color tone detection, template matching or edge detection approaches. One of the most difficult problems in model-based coding is providing facial feature correspondence quickly, easily and robustly. In sequential frames, the same facial features must be matched correctly. Conventionally, a block-matching process is used to compare pixels in a current frame and a next frame to determine feature correspondence. If the entire frame is searched for feature correspondence, the process is slow and may yield incorrect results due to mismatching of regions having the same gradient values. If only a subset of the frame is searched, the processing time may be improved. However, in this situation, the process may fail to determine any feature correspondence.
There thus exists in the art a need for improved systems and methods for three-dimensional modeling of objects contained in a digital image for reduced data rate transmission.
BRIEF SUMMARY OF THE INVENTION
It is an object of the present invention to address the limitations of the conventional video/image communication systems and model-based coding discussed above. It is another object of the invention to provide an object-oriented, cross-platform method of delivering real-time compressed video information.
It is yet another object of the invention to enable coding of specific objects within an image frame. It is a further object of the invention to integrate synthetic and natural visual objects interactively or in real-time.
In one aspect of the present invention, an image processing device includes at least one feature extraction determinator configured to extract feature position information from a pair of input image signals and a matching unit that matches corresponding features in the input image signals in accordance with the feature position information and disparity information.
One embodiment of the invention relates to a method for determining parameters related to a 3D model. The method includes the steps of extracting feature position information related to a pair of input images, and matching corresponding features in the pair of input images in accordance with the extracted feature position information and disparity information. The method also includes the step of determining the parameters for the 3D model in accordance with the feature correspondence matching.
These and other embodiments and aspects of the present invention are exemplified in the following detailed disclosure.
BRIEF DESCRIPTION OF DRAWINGS
The features and advantages of the present invention can be understood by reference to the detailed description of the preferred embodiments set forth below taken with the drawings, in which:
Fig. 1 is a schematic front view of a human face model used for three-dimensional model-based coding.
Fig. 2 is a block diagram of a 3D modeling system in accordance with one aspect of the present invention.
Fig. 3 is a block diagram of an exemplary computer system capable of supporting the system of Fig. 2.
Fig. 4 is a block diagram showing the architecture of the computer system of Fig. 3.
Fig. 5 is a block diagram showing an exemplary arrangement in accordance with a preferred embodiment of the invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring now to Fig. 2, a 3D modeling system 10 is shown. Generally, the system 10 includes at least one feature extraction determinator 11, at least one set of temporal information 12 and a feature correspondence matching unit 13. A left frame 14 and a right frame 15 are input into the system 10. The left and right frames are comprised of image data which may be digital or analog. If the image data is analog, then an analog-to-digital circuit can be used to convert the data to a digital format.
The feature extraction determinator 11 determines the position/location of features in a digital image, such as the facial feature positions of the nose, eyes and mouth. While two feature extraction determinators 11 are shown in Fig. 2, one determinator may be used to extract the position information from both the left and right frames 14 and 15. The temporal information 12 includes data such as previous and/or future frames that are used to provide constraints for accurate feature correspondences. As should be understood, the current frame to be processed is not necessarily the first frame input to the system 10. Test frames may be used to establish some hysteresis.
In a preferred embodiment, the system 10 is implemented by computer readable code executed by a data processing apparatus. The code may be stored in a memory within the data processing apparatus or read/downloaded from a memory medium such as a CD-ROM or floppy disk. In other embodiments, hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention. For example, the invention may be implemented on a digital television platform using a Trimedia processor for processing and a television monitor for display. The invention can also be implemented on a computer 30 shown in Fig. 3.
As shown in Figure 3, the computer 30 includes a network connection 31 for interfacing to a data network, such as a variable-bandwidth network or the Internet, and a fax/modem connection 32 for interfacing with other remote sources such as a video or a digital camera (not shown). The computer 30 also includes a display 33 for displaying information (including video data) to a user, a keyboard 34 for inputting text and user commands, a mouse 35 for positioning a cursor on the display 33 and for inputting user commands, a disk drive 36 for reading from and writing to floppy disks installed therein, and a CD-ROM drive 37 for accessing information stored on CD-ROM. The computer 30 may also have one or more peripheral devices attached thereto, such as a pair of video conference cameras for inputting images, or the like, and a printer 38 for outputting images, text, or the like.
Figure 4 shows the internal structure of the computer 30, which includes a memory 40 that may include a Random Access Memory (RAM), Read-Only Memory (ROM) and a computer-readable medium such as a hard disk. The items stored in the memory 40 include an operating system 41, data 42 and applications 43. The data stored in the memory 40 may also comprise the temporal information 12. In preferred embodiments of the invention, the operating system 41 is a windowing operating system, such as UNIX, although the invention may be used with other operating systems as well, such as Microsoft Windows 95. Among the applications stored in memory 40 are a video coder 44, a video decoder 45 and a frame grabber 46. The video coder 44 encodes video data in a conventional manner, and the video decoder 45 decodes video data which has been coded in the conventional manner. The frame grabber 46 allows single frames from a video signal stream to be captured and processed.
Also included in the computer 30 are a central processing unit (CPU) 50, a communication interface 51, a memory interface 52, a CD-ROM drive interface 53, a video interface 54 and a bus 55. The CPU 50 comprises a microprocessor or the like for executing computer readable code, i.e., applications, such as those noted above, out of the memory 40. Such applications may be stored in the memory 40 (as noted above) or, alternatively, on a floppy disk in disk drive 36 or a CD-ROM in CD-ROM drive 37. The CPU 50 accesses the applications (or other data) stored on a floppy disk via the memory interface 52 and accesses the applications (or other data) stored on a CD-ROM via the CD-ROM drive interface 53.
Application execution and other tasks of the computer 30 may be initiated using the keyboard 34 or the mouse 35. Output results from applications running on the computer 30 may be displayed to a user on the display 33 or, alternatively, output via the network connection 31. For example, input video data may be received through the video interface 54 or the network connection 31. The input video data may be decoded by the video decoder 45. Output video data may be coded by the video coder 44 for transmission through the video interface 54 or the network connection 31. The display 33 preferably comprises a display processor for forming video images based on decoded video data provided by the CPU 50 over the bus 55. Output results from the various applications may be provided to the printer 38.
Returning to Fig. 2, the left frame 14 and the right frame 15 preferably comprise a pair of stereo digital images. For example, the digital images may be received from two (still or video) cameras 60 and 61 (shown in Fig. 5) and stored in the memory 40 for subsequent processing. Other frames, or pairs of frames, taken at different angles or views may also be used. The cameras 60 and 61 may be part of another system such as a video conferencing system or an animation system.
The cameras 60 and 61 are closely located to each other and a subject 64 is located a short distance away from the cameras 60 and 61. As shown in Fig. 5, the cameras 60 and 61 are at a distance b apart (center-to-center) from each other. The object 62 is at a distance f from each of the cameras 60 and 61. Preferably, b is equal to approximately 5 to 6 inches and f is equal to approximately 3 feet. It should be understood, however, that the invention is not limited to these distances and that these distances are merely exemplary.
Preferably, the camera 60 takes a front view and the camera 61 takes an offset or side view of the object 62. This allows for a comparison to be made of the left frame 14 and the right frame 15 to determine a disparity map. In a preferred embodiment of the invention, the left frame 14 (image A) is compared to a right frame 15 (image B). The reverse comparison, however, may also be performed.
The digital frames or images can be conceptualized as comprising a plurality of horizontal scan lines and a plurality of vertical columns that form an array of pixels. The number of scan lines and columns determines the resolution of the digital image. To determine the disparity map, scan lines are lined up, e.g., scan line 10 of image A matches scan line 10 of image B. A pixel on scan line 10 of image A is then matched to its corresponding pixel in scan line 10 of image B. So, for example, if the 15th pixel of scan line 10 of image A matches the 10th pixel of scan line 10 of image B, the disparity is calculated as follows: 15 - 10 = 5. It is noted that when the left and right cameras 60 and 61 are closely located, the pixels of foreground information, e.g., a human face, of an image will have a larger disparity than pixels of background information.
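A minimal sketch of this scan-line disparity computation is given below, assuming grayscale NumPy images with already-aligned scan lines and a simple sum-of-absolute-differences (SAD) block match; the window size and search range are illustrative choices rather than values from this description.

```python
import numpy as np

def scanline_disparity(img_a, img_b, max_disp=32, half_win=3):
    """Per-pixel disparity of image A relative to image B (same size, aligned scan lines)."""
    h, w = img_a.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half_win, h - half_win):
        for x in range(half_win + max_disp, w - half_win):
            patch_a = img_a[y - half_win:y + half_win + 1,
                            x - half_win:x + half_win + 1].astype(np.int32)
            best_sad, best_d = None, 0
            for d in range(max_disp + 1):   # candidate pixel x - d on the same scan line of B
                patch_b = img_b[y - half_win:y + half_win + 1,
                                x - d - half_win:x - d + half_win + 1].astype(np.int32)
                sad = np.abs(patch_a - patch_b).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            # e.g. the 15th pixel of a scan line in A matching the 10th pixel in B gives 15 - 10 = 5
            disp[y, x] = best_d
    return disp
```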
A disparity map based upon the disparity calculations may be stored in the memory 40. Each scan line (or column) of the image would have a profile consisting of a disparity for each pixel in that scan line (or column). In this embodiment, the grayscale level of each pixel indicates the magnitude of the calculated disparity for that pixel. The darker the grayscale level, the lower the disparity.
A disparity threshold may be chosen, e.g., 10, and any disparity above the disparity threshold indicates the pixel is foreground information (i.e., the subject 64) while any disparity below 10 indicates the pixel is background information. The selection of the disparity threshold is based in part on the camera distances discussed above. For example, a lower disparity threshold may be used if the object 62 is positioned at a greater distance from the cameras 60 and 61, or a higher disparity threshold may be used if the cameras 60 and 61 are further apart from each other.
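The threshold step can be sketched as follows; the grayscale rendering follows the convention above (darker meaning lower disparity), and the threshold of 10 is simply the example value from the text.

```python
import numpy as np

def segment_foreground(disparity_map, threshold=10):
    """Boolean mask: True where the disparity marks foreground (the subject)."""
    return disparity_map > threshold

def disparity_to_grayscale(disparity_map):
    """8-bit rendering of the disparity profile; darker pixels have lower disparity."""
    peak = max(float(disparity_map.max()), 1.0)
    return np.uint8(255.0 * disparity_map / peak)
```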
The disparity map is used to extract facial feature positions or coordinates from the left and right frames 14 and 15. Preferably, the systems and methods described in U.S. Patent Application 08/385,280, filed on August 30, 1999, comprise the feature extraction determinator 11. Preferably, the facial feature positions include positions for the eyes, nose and mouth as well as the outline positions of the head. As related to Fig. 1, these positions correlate to the various vertices of the face model 100. For example, in regard to the nose, the facial feature extraction determinator preferably provides information directly related to vertices 4, 5, 23 and 58 as shown in Fig. 1.
The feature extraction determinator 11, however, only provides the X and Y coordinates of the facial features. The feature correspondence matching unit 13 provides the Z coordinate. Preferably, the feature extraction determinator 11 uses a triangulation procedure based upon inferring the position of a 3D point given its perspective projections on the left and right stereo image frames 14 and 15. For example, given the X and Y coordinates of a feature point (FL and FR) in the left and right frames 14 and 15, the 3D surface (i.e., the Z or depth information) can be determined by the following equation:
Z = f * b / (|FL - FR|), where
the distance f (shown in Fig. 5) is the focus length of the cameras 60 and 61; the distance b (shown in Fig. 5) is the base line distance between the cameras 60 and 61; and
|FL - FR| represents the disparity, which is calculated as discussed above.
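As a worked instance of the equation above (with f, b and the feature coordinates chosen only for illustration, roughly matching the exemplary distances mentioned earlier):

```python
def depth_from_feature_pair(f, b, x_left, x_right):
    """Z = f * b / |FL - FR|; f and b must be expressed in consistent units."""
    disparity = abs(x_left - x_right)
    if disparity == 0:
        raise ValueError("zero disparity: the feature point is effectively at infinity")
    return f * b / disparity

# Illustrative values: f = 500 (focus length in pixels), b = 0.14 m (about 5.5 inches),
# a feature seen at x = 250 in the left frame and x = 172 in the right frame.
z = depth_from_feature_pair(500.0, 0.14, 250.0, 172.0)   # ~0.9 m, roughly the 3 feet mentioned above
```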
In this embodiment, the above equation gives a relationship between the disparity and the surface Z under several geometric conditions. In particular, the image plane in front of each camera is at the focus length f, and both cameras are oriented identically, with the X-axis of the camera reference frame oriented along the line defined by the positions of the cameras 60 and 61. The focus lengths of the cameras 60 and 61 are assumed to be the same. It is also assumed that any geometric distortion of the lenses of the cameras 60 and 61 has been compensated for. Other geometric arrangements may be used; however, the relationship between the disparity and the surface Z becomes more complicated.
Other vertices of the face model 100 shown in Fig. 1 can be interpolated or extrapolated based on the positions (i.e., facial feature vertex information) from the feature extraction determinator 11 and the determination from the feature correspondence matching unit 13. The interpolation may be based on a linear, non-linear or scalable model or function. For example, a vertex between two other known vertices may be determined using a predetermined parabolic function which all three vertices satisfy. Other face models having additional vertices may also be used to provide enhanced or improved modeling results.
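The parabolic interpolation mentioned above can be sketched as follows; the vertex coordinates are hypothetical, and a simple least-squares parabola through three matched feature vertices stands in for the predetermined function.

```python
import numpy as np

# X coordinates and triangulated Z depths of three matched feature vertices (illustrative values)
known_x = np.array([10.0, 30.0, 50.0])
known_z = np.array([2.0, 5.0, 3.0])

# Parabola z = a*x^2 + b*x + c that all three vertices satisfy
a, b, c = np.polyfit(known_x, known_z, deg=2)

# Depth of an in-between model vertex that has no direct feature match
x_new = 40.0
z_new = np.polyval([a, b, c], x_new)
```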
The face model 100 shown in Fig. 1 is a generic face with a neutral expression. The control of the face model 100 is scalable. The face model 100 template may be stored or loaded at remote sites before any communication is initiated. Using the extracted facial features, the polygon vertices can be adjusted to more closely match a particular human face. In particular, based on the information and processing performed by the feature correspondence matching unit 13 and the feature extraction determinator 11, the face model 100 template is adapted and animated to enable movement, expressions and synchronized audio (i.e., speech). Essentially, the generic face model 100 is dynamically transformed in real-time into a particular face. The real-time or non-real-time transmission of the model face parameters/data provides for low bit-rate animation of a synthetic facial model. Preferably, the data rate is 64 Kbit/sec or less; however, for moving images a data rate between 64 Kbit/sec and 4 Mbit/sec is also acceptable.
In another embodiment, the temporal information 12 is used to verify feature matching results from the feature correspondence matching unit 13 and/or perform an alternative feature matching process. In this embodiment, for example, matching is only performed by the feature correspondence matching unit 13 on selected frames, preferably "key" frames (in the MPEG format). Once a key frame is feature matched, correspondence matching of features (i.e., depths) in other non-key frames (or other key frames) can be determined by tracking corresponding feature points in a temporal fashion. The 3D motion can be determined up to a scale in one translation direction from two views (i.e., the temporal information 12 may consist of two left or two right sequential or consecutive frames) if an initial feature correspondence is given. Preferably, the feature correspondence matching unit 13 is used to periodically feature match other key frames to eliminate any built-up error from the temporal feature matching. The feature correspondence matching unit 13 may be configured to perform both the feature correspondence matching and the temporal feature matching as needed. The temporal feature matching can be performed faster than the feature correspondence matching, which is advantageous for real-time processing.
The invention has numerous applications in fields such as video conferencing and animation/simulation of real objects, or in any application in which object modeling is required. For example, typical applications include video games, multimedia creation and improved navigation over the Internet. In addition, the invention is not limited to 3D face models. The invention may be used with models of other physical objects and scenes, such as 3D models of automobiles and rooms. In this embodiment, the feature extraction determinator 11 gathers position information related to the particular object or scene in question, e.g., the position of wheels or the location of furniture. Further processing is then based on this information.
While the present invention has been described above in terms of specific embodiments, it is to be understood that the invention is not intended to be confined or limited to the embodiments disclosed herein. For example, the invention is not limited to any specific type of filtering or mathematical transformation or to any particular input image scale or orientation.
On the contrary, the present invention is intended to cover various structures and modifications thereof included within the spirit and scope of the appended claims.

Claims

CLAIMS:
1. An image processing device (10) comprising: at least one feature extraction determinator (11) configured to extract feature position information from a pair of input image signals (14,15); and a matching unit (13) coupled to said feature extraction determinator (11) arranged to match corresponding features in the input image signals (14,15) in accordance with the feature position information and disparity information.
2. The image processing device (10) according to Claim 1 wherein said matching unit outputs three-dimensional (3D) information related to the input images (14,15).
3. The image processing device according to Claim 2 wherein the 3D surface information is based upon a predetermined model (100).
4. The image processing device according to Claim 3 wherein the predetermined model is a human face model (100).
5. The image processing device according to Claim 1, wherein said matching unit performs matching on at least one frame (14) of the input image signals and temporal feature matching (12) on at least one other frame.
6. The image processing device according to Claim 5, wherein the temporal feature matching is performed using sequential input image frames.
7. The image processing device according to Claim 6, wherein the one frame is a key frame.
8. A method for determining parameters related to a 3D model (100) comprising the steps of: extracting feature position information related to a pair of input images (14,15); matching correspondence features in the pair of input images in accordance with the extracted feature position information and disparity information; and determining the parameters for the 3D model in accordance with the results of the feature correspondence matching.
9. The method according to Claim 8, wherein the 3D model is a human face model (100).
10. A method for coding an object in a digital image for transmission, said method comprising the steps of: extracting feature position information related to at least a pair of digital images (14,15); matching correspondence features in the pair of digital images in accordance with the extracted feature position information and disparity information; and coding information for transmission in accordance with results of the feature correspondence matching.
11. The method according to Claim 10, wherein the information for transmission comprises parameters related to a 3D model (100).
PCT/EP2000/009879 1999-10-21 2000-10-06 System and method for three-dimensional modeling WO2001029767A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
KR1020017007811A KR20010089664A (en) 1999-10-21 2000-10-06 System and method for three-dimensional modeling
EP00969460A EP1190385A2 (en) 1999-10-21 2000-10-06 System and method for three-dimensional modeling
JP2001532487A JP2003512802A (en) 1999-10-21 2000-10-06 System and method for three-dimensional modeling

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US42273599A 1999-10-21 1999-10-21
US09/422,735 1999-10-21

Publications (2)

Publication Number Publication Date
WO2001029767A2 true WO2001029767A2 (en) 2001-04-26
WO2001029767A3 WO2001029767A3 (en) 2001-12-20

Family

ID=23676138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2000/009879 WO2001029767A2 (en) 1999-10-21 2000-10-06 System and method for three-dimensional modeling

Country Status (4)

Country Link
EP (1) EP1190385A2 (en)
JP (1) JP2003512802A (en)
KR (1) KR20010089664A (en)
WO (1) WO2001029767A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003017680A1 (en) * 2001-08-15 2003-02-27 Koninklijke Philips Electronics N.V. 3d video conferencing system
CN101536534B (en) * 2006-10-30 2011-07-06 皇家飞利浦电子股份有限公司 Video depth map alignment
TWI393070B (en) * 2009-12-14 2013-04-11 Nat Applied Res Laboratories Human face model construction method
CN113781539A (en) * 2021-09-06 2021-12-10 京东鲲鹏(江苏)科技有限公司 Depth information acquisition method and device, electronic equipment and computer readable medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101016071B1 (en) * 2010-06-22 2011-02-17 (주)지오투정보기술 An urban spatial image processing apparatus for obtaining a corresponding area based on vector transform from stereo images according to routes of mobile mapping equipment
CN103530900B (en) * 2012-07-05 2019-03-19 北京三星通信技术研究有限公司 Modeling method, face tracking method and the equipment of three-dimensional face model
KR101373294B1 (en) * 2012-07-09 2014-03-11 인텔렉추얼디스커버리 주식회사 Display apparatus and method displaying three-dimensional image using depth map

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0901105A1 (en) * 1997-08-05 1999-03-10 Canon Kabushiki Kaisha Image processing apparatus

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0901105A1 (en) * 1997-08-05 1999-03-10 Canon Kabushiki Kaisha Image processing apparatus

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ANTOSZCZYSZYN P M ET AL: "Tracking of the motion of important facial features in model-based coding" SIGNAL PROCESSING. EUROPEAN JOURNAL DEVOTED TO THE METHODS AND APPLICATIONS OF SIGNAL PROCESSING, ELSEVIER SCIENCE PUBLISHERS B.V. AMSTERDAM, NL, vol. 66, no. 2, 30 April 1998 (1998-04-30), pages 249-260, XP004129644 ISSN: 0165-1684 *
GALICIA G ET AL: "DEPTH BASED RECOVERY OF HUMAN FACIAL FEATURES FROM VIDEO SEQUENCES" PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING. (ICIP). WASHINGTON, OCT. 23 - 26, 1995, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, vol. 2, 23 October 1995 (1995-10-23), pages 603-606, XP000624041 ISBN: 0-7803-3122-2 *
IZQUIERDO E M ET AL: "IMAGE ANALYSIS FOR 3D MODELING, RENDERING, AND VIRTUAL VIEW GENERATION" COMPUTER VISION AND IMAGE UNDERSTANDING, ACADEMIC PRESS, US, vol. 71, no. 2, 1 August 1998 (1998-08-01), pages 231-253, XP000766985 ISSN: 1077-3142 *
MALASSIOTIS S ET AL: "OBJECT-BASED CODING OF STEREO IMAGE SEQUENCES USING THREE-DIMENSIONAL MODELS" IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE INC. NEW YORK, US, vol. 7, no. 6, 1 December 1997 (1997-12-01), pages 892-905, XP000729345 ISSN: 1051-8215 *
PAPADIMITRIOU D V ET AL: "THREE-DIMENSIONAL PARAMETER ESTIMATION FROM STEREO IMAGE SEQUENCES FOR MODEL-BASED IMAGE CODING" SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 7, no. 4 - 6, 1 November 1995 (1995-11-01), pages 471-487, XP000538024 ISSN: 0923-5965 *
STEINBACH E ET AL: "Motion-based analysis and segmentation of image sequences using 3-D scene models" SIGNAL PROCESSING. EUROPEAN JOURNAL DEVOTED TO THE METHODS AND APPLICATIONS OF SIGNAL PROCESSING, ELSEVIER SCIENCE PUBLISHERS B.V. AMSTERDAM, NL, vol. 66, no. 2, 30 April 1998 (1998-04-30), pages 233-247, XP004129643 ISSN: 0165-1684 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003017680A1 (en) * 2001-08-15 2003-02-27 Koninklijke Philips Electronics N.V. 3d video conferencing system
US7825948B2 (en) 2001-08-15 2010-11-02 Koninklijke Philips Electronics N.V. 3D video conferencing
CN101536534B (en) * 2006-10-30 2011-07-06 皇家飞利浦电子股份有限公司 Video depth map alignment
TWI393070B (en) * 2009-12-14 2013-04-11 Nat Applied Res Laboratories Human face model construction method
CN113781539A (en) * 2021-09-06 2021-12-10 京东鲲鹏(江苏)科技有限公司 Depth information acquisition method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
KR20010089664A (en) 2001-10-08
WO2001029767A3 (en) 2001-12-20
EP1190385A2 (en) 2002-03-27
JP2003512802A (en) 2003-04-02

Similar Documents

Publication Publication Date Title
Pearson Developments in model-based video coding
US6792144B1 (en) System and method for locating an object in an image using models
JP4436126B2 (en) Video communication systems using model-based coding and prioritization techniques.
Kompatsiaris et al. Spatiotemporal segmentation and tracking of objects for visualization of videoconference image sequences
WO2008156437A1 (en) Do-it-yourself photo realistic talking head creation system and method
US20020164068A1 (en) Model switching in a communication system
Welsh Model-based coding of videophone images
WO2001029767A2 (en) System and method for three-dimensional modeling
Malassiotis et al. Object-based coding of stereo image sequences using three-dimensional models
JP2001231037A (en) Image processing system, image processing unit, and storage medium
Girod Image sequence coding using 3D scene models
KR100281965B1 (en) Face Texture Mapping Method of Model-based Coding System
Chang et al. Video realistic avatar for virtual face-to-face conferencing
Valente et al. A multi-site teleconferencing system using VR paradigms
Fedorov et al. Talking head: synthetic video facial animation in MPEG-4
Yu et al. 2D/3D model-based facial video coding/decoding at ultra-low bit-rate
Stéphane Valente Model-Based Coding and Virtual Teleconferencing
JP2910256B2 (en) Moving image transmission apparatus and moving image transmission method
JPH05128261A (en) Moving image movement estimating equipment and transmitter
KR20000033063A (en) Facial image matching method in a model-based encoding system
Sarris et al. Constructing a videophone for the hearing impaired using MPEG-4 tools
Bozdağı Three-Dimensional Facial Motion and Structure Estimation in Video Coding
Aizawa et al. Human Facial Motion
CN117218246A (en) Training method and device for image generation model, electronic equipment and storage medium
Huang et al. Advances in very low bit rate video coding in north america

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

WWE Wipo information: entry into national phase

Ref document number: 2000969460

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 532487

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1020017007811

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 1020017007811

Country of ref document: KR

AK Designated states

Kind code of ref document: A3

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

WWP Wipo information: published in national office

Ref document number: 2000969460

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 2000969460

Country of ref document: EP

WWR Wipo information: refused in national office

Ref document number: 1020017007811

Country of ref document: KR