WO2011152842A1

WO2011152842A1 - Face morphing based on learning

Info

Publication number: WO2011152842A1
Application number: PCT/US2010/043224
Authority: WO
Inventors: Hui Chao; Suk Hwan Lim; Feng Tang
Original assignee: Hewlett-Packard Development Company, L.P.
Priority date: 2010-06-01
Filing date: 2010-07-26
Publication date: 2011-12-08

Abstract

To generate a composite image, a source head image can be compared to candidate host head images from a library, and the candidate host head images are ranked based on their degree of similarity to the source head image. A selected candidate host head image can be adjusted so that it conforms to the source head image, and the source head image is composited into the adjusted host head image to provide a composite image. The source head image also can be compared to candidate intermediary head images in a collection, and the candidate intermediary head images are ranked based on their degree of similarity to the source head image. The source head image is modified to be similar to the selected intermediary head image, and the modified source head image is composited into a host head image predetermined to correspond to the selected intermediary head image.

Description

FACE MORPHING BASED ON LEARNING

BACKGROUND

[0001] There are many applications in which a user may want to morph or deform a face and map it into other content, such as a virtual character.

Systems and methods are provided herein to facilitate morphing an input face onto a host face.

DESCRIPTION OF DRAWINGS

[0002] FIG. 1 is a block diagram of an example of an image manipulation system for creating a resultant image.

[0003] FIG. 2A shows a flow chart of an example process for generating a composite image.

[0004] FIG. 2B shows a flow chart of another example process for generating a composite image.

[0005] FIG. 3 is a diagrammatic view of an example head region on which are demarcated the locations of a set of facial features in accordance with an example.

[0006] FIG. 4 is a flow diagram showing an example process for face morphing.

[0007] FIG. 5A shows an example image on which are demarcated the locations of a set of facial features that can be ^'used to compute physical feature measures.

[0008] FIG. 5B shows example host head images in a library.

[0009] FIG. 5C depicts a collection of intermediary head images.

[0010] FIG. 6 is a block diagram of an example of a computer that incorporates an example of the image manipulation system of FIG. 1.

DETAILED DESCRIPTION

[0001] In the following description, like reference numbers are used to identify like elements. Furthermore, the drawings are intended to illustrate major features of exemplary embodiments in a diagrammatic manner. The drawings are not intended to depict every feature of actual embodiments nor relative dimensions of the depicted elements, and are not drawn to scale.

[0002] An "image" broadly refers to any type of visually perceptible content that may be rendered on a physical medium (e.g., a display monitor or a print medium). Images may be complete or partial versions of any type of digital or electronic image, including: an image that was captured by an image sensor (e.g., a video camera, a still image camera, or an optical scanner) or a processed (e.g., filtered, reformatted, enhanced or otherwise modified) version of such an image; a computer-generated bitmap or vector graphic image; a textual image (e.g., a bitmap image containing text); and an iconographic image.

[0003] The term "image forming element" refers to an addressable region of an image. In some examples, the image forming elements correspond to pixels, which are the smallest addressable units of an image. Each image forming element has at least one respective "image value" that is represented by one or more bits. For example, an image forming element in the RGB color space includes a respective image value for each of the colors (such as but not limited to red, green, and blue), where each of the image values may be represented by one or more bits.

[0004] "Image data" herein includes data representative of image forming elements of the image and image values.

[0005] A "computer" is any machine, device, or apparatus that processes data according to computer-readable instructions that are stored on a computer-readable medium either temporarily or permanently. A "software application" (also referred to as software, an application, computer software, a computer application, a program, and a computer program) is a set of instructions that a computer can interpret and execute to perform one or more specific tasks. A "data file" is a block of information that durably stores data for use by a software application.

[0006] The term "computer-readable medium" refers to any medium capable storing information that is readable by a machine (e.g., a computer). Storage devices suitable for tangibly embodying these instructions and data include, but are not limited to, all forms of non-volatile computer-readable memory, including, for example, semiconductor memory devices, such as EPROM, EEPROM, and Flash memory devices, magnetic disks such as internal hard disks and removable hard disks, magneto-optical disks, DVD-ROM/RAM, and CD-ROM/RAM.

[0007] As used herein, the term "includes" means includes but not limited to, the term "including" means including but not limited to. The term "based on" means based at least in part on.

[0008] The examples that are described herein provide systems and methods for face morphing. The systems and methods can take as input information concerning a person or object of interest from an image. Examples of such input information include face geometric feature points, ethnicity and color information (such as but not limited to those of the hair and eyes) of the face image. The input for the person or object of interest are used to map the person or object of interest into a different object (a host head image), such as but not limited to a face of a virtual character. Herein, a virtual character includes a cartoon character. A result of a system and method herein is a resultant image with a more natural-looking morphing of the face of the source head image into a host head image. In an example, a result of a system and method herein is a resultant image with a more natural-looking virtual character with a replaced face.

[0009] A system and method provided herein for a more natural-looking morphing of a face of a person or object of interest (from a source head image) into a different object (a host head image) by taking into account the degree of similarity between physical feature measures of the source head image and a number of candidate host head images (the different objects). The candidate host head images can be in a library of candidate face templates, such as but not limited to a library or database of virtual characters. The libraries referred to herein can be stored in a database. In one example, candidate host head images from the library are ranked based on the degree of similarity of the source head image to the candidate host head images (based on the comparing), from the most similar candidate (most likely match) to the least similar candidate (least likely match). An example system and method disclosed herein uses- physical feature measures based on a number of feature points positioned around an entire face, including points on the facial features (such as one or more of eyes, nose, mouth, eyebrows, etc) and on the outline of the face (such as one or more of cheekbones, jawline, chinline, etc), pose of the face, facial expression, ethnicity, gender, skin, hair and eye colors. Therefore, the comparison to identify and rank candidate host head images takes into account greater variations in physical features of the source head image, to facilitate a selection of candidate host head images that are more closely matched with the source head, providing a more natural-looking resultant final image.

[0010] The face of the person or object of interest can be morphed into a face of a target virtual character based on the target virtual character. A small modification of the input face image (person or object of interest) can yield a beautified resultant face image. See, e.g., T Leyvand et al., 2006, "Digital face beautification", ACM SIGGRAPH. One human face image also can be morphed into^' another face. G. Wolberg, 1998, "Image morphing: a survey," The Visual Computer, Springer. It can be challenging to perform face morphing into a virtual character, such as but not limited to cartoon characters. Cartoon characters have very different faces from human faces, and the morphing can be

dramatically different than that for human faces.

[001 1] A system and a method provided herein can create a more natural virtual character with the replaced face based on face analysis and matching of the input image, the exemplary face, and the face of the virtual character (such as the face of a cartoon character).

[0012] FIG. 1 shows an example of an image compositing system 10 that generates a composite image 12 using input information concerning a person or object of interest from an image 14. In particular, the image

compositing system 10 can be used to determine input information of a person or an object of interest from an image 14 and performing face morphing to provide a composite image 12. In one example, the image compositing system 10 can segment a person or an object from an image 14 and manipulate and create a composite image 12 with the host head image, such as but not limited to a virtual character, having a replaced, morphed face. [0013] In general, image 14 can be any type of image, including amateur and professional photographs and commercially produced images. In one example use scenario, a user provides the image 14 in the form of a personal photograph that shows a person or an object against a background, and a content provider provides the virtual character that is to undergo face morphing based on the person or object of interest from image 14 to provide composite image 12. The image compositing system 10 can be used to process the image 14 and outputs the composite image 12, according to any of the methods disclosed herein.

[0014] FIG. 2A shows a flow chart of an example process for generating a composite image. In block 20, data representative of physical feature measures of a source head image (images 14) is compared to

corresponding physical feature values of candidate host head images from a library of host head images. The physical feature measures of the source head image can be computed based on image data representing the source head image. In block 22, based on the comparison, at least two candidate host head images from the library are ranked based on their degree of similarity to the source head image. In block 24, the selected candidate host head image is adjusted so that it conforms in shape and/or color to the source head image. In block 26, the source head image is composited into a version of the adjusted host head image at a location corresponding to the adjusted host head image to generate a composite image. In an example, the composite image (composite image 12) can be generated by performing a weighted blending of the source head image (image 14) and the host head image in accordance with an alpha matte. The host head image can be adjusted as described herein to provide an adjusted host head image, prior to generation of the composite image.

[0015] In an example, the selected candidate host head image used in block 24 is the highest ranked intermediary head image.

[0016] FIG. 2B shows a flow chart of another example process for generating a composite image. In block 30, data representative of physical feature measures of a source head image (images 14) is compared to

corresponding physical feature values of intermediary head images in a collection of intermediary head images. The collection of intermediary head images can be stored in a database. The physical feature measures of the source head image can be computed based on image data representing the source head image. In block 32, based on the comparison, at least two candidate intermediary head images from the collection are ranked based on their degree of similarity to the source head image. In block 34, the source head image is modified to provide a modified source head image having physical feature measures similar to the physical feature measures of the selected intermediary head image. In block 36, the modified source head image is composited into a host head image predetermined to correspond to the selected intermediary head image. The compositing is performed according to predetermined transformation parameters between the selected intermediary head image and the host head image. The composite image (composite image 12) can be generated by performing a weighted blending of the modified source head image (image 14) and the host head image in accordance with an alpha matte.

[0017] In an example, the selected intermediary head image used in . block 34 is the highest ranked intermediary head image.

[0018] In general, image 14 can be any type of image, including amateur and professional photographs and commercially produced images. An example source of images 14 in the collection of intermediary head images is images captured by an image sensor of, e.g., entertainment or sports celebrities, or reality television individuals. Images 14 can be of one or more members of a family near an attraction at an amusement park, images taken of performers during a performance (e.g., a concert), or images taken at a retail store or kiosk. In an example use scenario, a system and method disclosed herein is applied to images in a database of images, such as but not limited to images of an area captured from imaging devices (such as but not limited to surveillance devices, or film footage) located at an airport, a stadium, a restaurant, a mall, outside building, etc., as described herein. In another example use scenario, a system and method disclosed herein is applied to images in a database of images, such as but not limited to images captured using imaging devices (such as but not limited to surveillance devices, or film footage) of an area located at an airport, a stadium, a restaurant, a mall, outside an office building or residence, etc. In another example use scenario, a system and method disclosed herein is applied to images of an advertisement, including advertisement on a screen or a web page. In yet another example use scenario, a system and method disclosed herein is applied to frames of video, including a film, television program, or music video. An example implementation of a method disclosed herein is applying image compositing system 10 to images captured by an image capture device installed in a monitored location for morphing the face of a person or object and compositing it into a different object, such as an image of an individual, and object, or a virtual character, to provide a composite image. The process of either Fig. 2A or Fig. 2B can be used in this implementation. It will be appreciated that other uses are possible.

[0019] Using any of the systems and methods disclosed herein, image compositing system 10 can provide a fully automated process for morphing the face of a person or object and compositing it into a different object, such as an image of an individual, and object, or a virtual character, to provide a composite image. In some examples, the image compositing system 10 outputs the composite image 12 (a composite image) by storing it in a data storage device (e.g., in a database), rendering it on a display (e.g., in a user interface generated by a software application program), or rendering it on a print medium (e.g., paper).

[0020] In one example use scenario, a user provides the image 14 in the form of a personal photograph, and a service provider provides the library of candidate host head images into which the face of image 14 can be morphed, and an apparatus that includes a memory storing computer-readable instructions and a processor coupled to the memory, to execute the instructions, and based at least in part on the execution of the instructions, to perform operations, e.g., according to the process described in Fig. 2A. Instead of the apparatus, the service provider may provide a computer-readable medium storing computer- readable program code adapted to be executed by a computer to implement the process described in Fig. 2A. The service provider can implement image compositing system 10 to process the image 14 and outputs the modified image 12 (a composite image), according to any of the methods disclosed herein, including to the process described in Fig. 2A.

[0021] In another example use scenario, a user provides the image 14 in the form of a personal photograph, and a content provider provides a collection of candidate intermediary host head images, host head images that correspond to the. candidate intermediary host head images, and transformation parameters that have been predetermined between each candidate intermediary host head image and its corresponding host head images. The service provider can provide a computer-readable medium storing computer-readable program code adapted to be executed by a computer to implement the process described in Fig. 2B, so that the face of image 14 can be morphed into a host head image according to ^•the process of Fig. 2B. Alternatively, the service provider can provide an apparatus that includes a memory storing computer-readable instructions and a processor coupled to the memory, to execute the instructions, and based at least in part on the execution of the instructions, to perform operations, e.g., according to the process described in Fig. 2B. The service provider can implement an image compositing system 10 to process the image 14 and outputs the modified image 12 (a composite image), according to any of the methods disclosed herein, including to the process described in Fig. 2B.

[0022] Following is an example system and method for face morphing in connection with the example process of Fig. 2A. In this scenario, a user can have a specific selected category of virtual characters for the face morphing. For example, the user can choose from a recommended set of candidate virtual characters for the face morphing. The image compositing system 10 can make a recommendation of a virtual characters to use for the face morphing and then perform the morphing based on the user's selection.

[0023] The candidate virtual characters can belong to a certain specified category based on classification of geometric features of the virtual characters. For example, each virtual character can have labeled face pose, face alignment points, ethnicity, hair, skin color, gender and facial expression. These labeled features in connection with a virtual character can be stored, e.g., in a database, along with other details concerning that virtual character. [0024] For the input face image (of a person or object of interest from image 14), face analysis is performed to obtain multiple features. Examples of such features include the pose of the face (e.g., the off-plane rotation), the face alignment points, ethnicity estimation, skin color, and hair color estimation.

Example face analysis includes facial feature detection, demographics estimation, and head segmentation (e.g., to provide for skin color detection, etc). The physical feature measures of the source head image are determined based on measures of the multiple features determined from the face analysis.

[0025] Based on feature similarities between the input face image (source head image 14) and the candidate host head images (such as but not limited to virtual characters), the candidate host head images (such as but not limited to virtual characters) are ranked as described herein. A range of host head images recommendations, ranked based on similarity, can be output to a user. A user selects a virtual character to use based on the recommendations to the user. A user also indicates a choice of face morphing, for example but not limited to, shape matching and/or color matching.

[0026] Once a candidate host head image (such as but not limited to virtual character) is selected for use, the face morphing can be performed as follows. If shape matching is the requested form of face morphing, the face of the host head image (e.g., virtual character) is aligned and reshaped based on the input face image. An example method is disclosed in, e.g., S. Schaefer et al., 2006, "Image Deformation Using Moving Least Squares", ACM Transactions on Graphics. If color matching is the requested form of face morphing, the face and hair color of the virtual character are adjusted based on the input face image.

[0027] The face image (of a person or object of interest from image 14) is morphed into the face of the host head image, such as but not limited to a virtual character, including a cartoon character's face, to provide the composite image 12. The composite image can be generated by performing a weighted blending of the source head image (face image) and the host head image in accordance with an alpha matte. In one example, image compositing system 10 can further provide a tool for a user to control the amount or degree of similarity between the resulting face of the composite image 12 and the face of the host - io - head image, such as but not limited to a virtual character, including a cartoon character's face. That is, the amount or degree of morphing can be tunable from a minimal degree of morphing to a significant degree of morphing. In an example, the amount or degree of morphing can be controlled using the weighting factors in the blending using the alpha matte.

[0028] Following is another example system and method for face morphing in connection with the example process of Fig. 2B. The face morphing is performed based on a set of candidate intermediary head images for which mapping to corresponding host head images was previously performed. That is, the transformation parameters between a candidate intermediary head image and its corresponding host head images are predetermined. In an example use scenario, the face morphing is performed is based on a set of candidate template faces (candidate intermediary head images) for which mapping to host head images, such as but not limited to candidate virtual characters, was previously performed. That is, the face morphing, such as but not limited to, cartooning transformation, can be performed based on a database or collection of candidate intermediary head images (candidate template faces) and host head images, such as but not limited to candidate virtual characters. In one example, the database or collection is generated based on learning from a manual transformation from candidate intermediary head images (candidate template faces) to determine appropriate corresponding host head images for a given candidate intermediary head image, and to determine transformation parameters that morph the given candidate intermediary head image and a corresponding host head image.

[0029] A system and a method in the example process of Fig. 2B provides for face morphing in a use scenario where face alignment of the person or object of interest (source head image) and a host head image (such as but not limited to a virtual character) is challenging. The system and method can be used for deriving the transformation parameters used for morphing a natural face image to the virtual character. The system and method can be used for performing a face morphing during an automatic process for cartooning a face image. [0030] Following is an example system and method for face morphing in the example process of Fig. 2B. The face morphing can performed using a data from a database (or collection) generated by performing manual morphing for a small but diversified set of known template faces (candidate intermediary head image) with various ethnicity, hair and skin color to a set of candidate host head image (such as but not limited to a virtual characters). For a new source head image, i.e., a face of a person or object of interest that is not in the database (or collection), a set of candidate intermediary head image, i.e., template faces, in the database (or collection) is determined based on the stored features describing the template face. A measure of similarity is determined between the new face and one or more of the candidate intermediary head image (template faces) in the database (or collection). Candidate intermediary head images from the collection are ranked based on their degree of similarity to the source head image. The best matching candidate intermediary head image (template face) can be determined as the most similar to the new source head image face (the highest ranked candidate intermediary head image), determined based on the computed measures of similarity).

[0031] The source head image is modified to provide a modified source headlmage that has physical feature measures similar to the physical feature , measures of the selected intermediary head image. That is, the new face is warped (or morphed) so that the measures of its physical features are approximately the same as the physical feature measures of the selected intermediary head image to provide a modified source head image. In an example, the new face is warped (or morphed) into the best matching template face to provide a hybrid face (a modified source head image). To provide a composite image, the same transformation that was predetermined between the selected intermediary head image and its corresponding host head image is applied to the modified source head image to produce the composite image 12. In one example, the same transformation that was applied to the best matching intermediary head image template face to obtain the virtual character face is applied to the modified source head image (hybrid face) and the corresponding host head image to provide the composite image 12. The composite image (composite image 12) can be generated by performing a weighted blending of the modified source head image and the host head image in accordance with an alpha matte.

[0032] In one example, image compositing system 10 can further provide a refinement tool for a user to control the amount or degree of morphing between the hybrid face and the face of the virtual character (such as but not limited to a cartoon character's face) to provide the composite image 12. That is, the amount or degree of morphing can be tunable from a minimal degree of morphing to a significant degree of morphing. In an example, the amount or degree of morphing can be controlled using the weighting factors in the blending using the alpha matte.

[0033] In one example, the image compositing system 10 can determine a number of feature point locations, including point locations in the eyes, eye-brows, nose and chin of an image. FIG. 3 shows an example head region 40 on which are demarcated (by "X" marks) the locations of a set of facial features that are determined in accordance with this example. In the example of FIG. 3, eighty-eight feature point locations are marked. However, more than eighty-eight feature point locations, or fewer than eighty-eight feature point locations can be used to provide measures of the head region 40. These point locations are used to guide the processes used in some stages of the image manipulation process. A variety of different methods may be used to determine the facial feature locations. An example facial feature location process that may be used to determine the facial feature locations is described in L. Zhang et al., "Robust Face Alignment Based on Local Texture Classifiers," The IEEE

International Conference on Image Processing (ICIP-05), Genoa, Italy September 11-14, 2005. Measures of head region 40, determined based on feature point locations, are used to provide physical feature measures of a source head image (received as an image 14) which are used for comparing the source head image to a library of host head images (a set of candidate template faces to which the source head image is compared).

[0034] An example system and method that can be used for segmentation of a person or object from an image is described as follows. The image compositing system 10 can be used to segment the person or object from the image. The image compositing system 10 can segment the person or object from the image 14 based on a model of the image as a mixture of at least two image layers, where one or more of the image layers are components of a foreground corresponding to the person or object to be replaced and one or more other ones of the image layers are components of a background corresponding to parts of the image outside the person or object to be replaced. In some examples, the source image (I,) is modeled as a convex combination of K image layers F F^K in accordance with equation (1 ):

k

where the K vectors u-j are the matting components of the source image that specify the fractional contribution of each layer to the final color of each pixel of the source image. The alpha matte is determined from the matting components based on a specification of the particular ones of the matting components that are part of the foreground. For example, if a^k1 a^kn are designated as foreground components, then the alpha matte is obtained simply by adding these

components together (i.e., a=a^k1 +...+ a^kn).

[0035] In some of these examples, the source image (I,) is modeled as a mixture of two images (i.e., a foreground image F and a background image B) in accordance with equation (2):

l(x) = a(x)F(x) + (1 - a(x))B(x) (2) where x is a pixel location and a e [0, 1] is an alpha matte that quantifies the mixture. In a typical initialization map, a is either 0 or 1 rather than taking intermediate values. Such an initialization map performs "hard" classification of pixels either fully belonging to the foreground or the background.

[0036] The image compositing system 10 can initially determine an initialization map that identifies regions of the image 14 that correspond to the foreground and that identifies regions of the image 14 that correspond to the background. The initialization map is designed to provide rough designations of both foreground and background regions, where regions of the cropped source image that are highly likely to be parts of a face are marked as the foreground (e.g., "white") are regions that are highly likely to be non-facial areas are marked as the background (e.g., "black"). The remaining unmarked regions of the cropped source image are left as currently unknown; these regions will be labeled as foreground or background in the subsequent alpha matte generation process. The image compositing system 10 typically determines the initialization map by identifying regions of facial image content and regions non-facial image content (e.g., hair image content) in the cropped source image based on locations of respective ones of the facial features.

[0037] In some examples, the identified foreground and background regions in the initialization map are used as initial seed points for a k-means clustering algorithm which outputs an enhanced initialization map.

[0038] The image compositing system 10 derives the alpha matte from the enhanced initialization map. As explained above, the alpha matte specifies respective contributions of the image layers to the foreground and background. The image compositing system 10 refines the enhanced initialization map by applying the enhanced initialization map as a tri-map in an image matting process that generates the alpha-map, which conveys the desired segmentation of the source head image. The image matting process classifies the unknown regions of the enhanced initialization map as foreground or background based on color statistics in the known foreground and background regions. In general, a variety of different supervised image matting processes may be used to generate the alpha matte from the enhanced initialization map, including Poisson matting processes (see, e.g., J. Sun et al., "Poisson Matting," ACM SIGGRAPH, 2004) and spectral matting processes (see, e.g., A. Levin et al., "Spectral Matting," IEEE Transactions PAMI, Oct 2008). Image matting processes of these types are able to produce high quality segmentation maps of fine details of head image, such as regions of hair.

[0039] In any of the systems and method disclosed herein, the color tone or skin color of the

[0040] An example system and method that can be used for color tone adjustment of the person or object to be morphed, or the virtual character, is described as follows. The color tone adjustment can be performed based on a skin tone detected for the person or object to be morphed. The image compositing system 10 can generate a source skin map that segments skin areas from other areas in the image 14. In some examples the source skin map includes for each pixel of the input image a respective skin probability value indicating a degree to which the pixel corresponds to human skin. A

characteristic feature of the source skin map is that all pixels of the image 14 having similar values are mapped to similar respective skin probability values in the skin map. As used herein with respect to pixel values, the term "similar" means that the pixel values are the same or nearly the same and appear visually indistinguishable from one another. This feature of the skin map is important in, for example, pixels of certain human-skin image patches that have colors outside of the standard human-skin tone range. This may happen, for example, in shaded face-patches or alternatively in face highlights, where skin segments may -sometimes have a false boundary between skin and non-skin regions. The skin map values vary continuously without artificial boundaries even in skin patches trailing far away from the standard human-skin tone range.

[0041] The image compositing system 10 may ascertain the skin probability values indicating the degrees to which the input image pixels correspond to human skin in a wide variety of different ways.

[0042] In some examples, the image compositing system 10 computes the pixel intensity distributions of skin areas using the facial feature points.

Samples from areas such as cheek or forehead are selected as those points that are guaranteed to be skin areas. From those samples, the image compositing system 10 estimates conditional densities p(l |skin) where I is the pixel intensity. The image compositing system 10 then obtains the posterior probability

p(I I skin)p(skin) p(I | skin)

p(skin 1 1) = (3) pO) * p(I)

where p(l) is obtained from the histogram of the pixel intensities for the given image. This posterior probability is used as a multiplier to the skin color compensation such that only the pixels that are likely to be from the skin pixels are modified while non-skin pixels are not changed. In some of these examples, the image compositing system 10 determine the skin map by thresholding the posterior probabilities

p(skin 1 1) with an empirically determined threshold value.

[0043] In other examples, the image compositing system 10 ascertains the per-pixel human-skin probability values from human-skin tone probability distributions in respective channels of a color space (e.g., RGB, YCC, and LCH). ^< For example, in some examples, the image compositing system 10 ascertains the per-pixel human-skin tone probability values from human-skin tone probability distributions in the CIE LCH color space (i.e., P(skin|L), P(skin|C), and P(skin|H)). These human-skin tone probability distributions are approximated by Gaussian normal distributions (i.e., G(p, ,&)) that are obtained from mean (μ) and standard deviation (&) values for each of the p=L, C, and H color channels. In some examples, the mean (μ) and standard deviation (&) values for each of the p=L, C, and H color channels are obtained from O. Martinez Bailac, "Semantic retrieval of memory color content", PhD Thesis, Universitat Autonoma de Barcelona, 2004. The image compositing system 10 ascertains a respective skin probability value for each pixel of the cropped source image 62 by converting the cropped source image 62 into the CIE LCH color space (if necessary), determining the respective skin-tone probability value for each of the L, C, and H color channels based on the corresponding human-skin tone probability distributions, and computing the product of the color channel probabilities, as shown in equation (4):

P(skin I L,C,H) = G(L, _Ml, &_L)*G(C, p_c, 8r_c)xG(H, μ_Η, &_H) (4)

[0044] In some of these other examples, the skin map values are computed by applying to the probability function P(skin | L,C,H) a range ^• adaptation function that provides a clearer distinction between skin and non-skin pixels. In some of these examples, the range adaptation function is a power function of the type defined in equation (5):

Msm&y) = P{skin I

(5) where y > 0 and M_SKIN {x, y) axe the skin map values at location ( x,y ). In one example, y = 32 . The skin map function defined in equation (5) attaches high probabilities to a large spectrum of skin tones, while non-skin features typically attain lower probabilities.

[0045] Based on the skin maps, the image compositing system 10 can color-adjusts ^'the person or object of interest segmented from image 14 prior to it being morphed into the face of the virtual character.

[0046] FIG. 4 is a flow diagram showing an example system and method for face morphing of a face image 42 from a source head image into a face of a virtual character. The method can take as input a source head image (a face image of a person or object of interest, image 14) for morphing into a virtual character, and a number of candidate virtual characters. As illustrated in block 44, face analysis, such as but not limited to facial feature detection,

demographics estimation and head segmentation can be performed on the face from the source head image to provide physical feature measures of the source head image. The head segmentation can also provide for skin color detection. FIG. 5A shows an example source head image 100 on which are demarcated the locations of a set of facial features that can be used to compute physical feature measures of the source head image. Other physical feature measures of the example source head image, including the pose of the face, facial expression, ethnicity, gender, skin, hair and eye colors, can be determined for the example source head image. As illustrated in block 46, physical feature values representative of candidate virtual characters are also received as input. For example, as depicted in Fig. 4, each virtual character can be represented by physical feature measures of feature points on the face of the virtual character, hair color and skin color of the virtual character. Image compositing system 10 can be used to receive input from blocks 44 and 46, and perform the operations depicted in block 48 involving ranking the candidate virtual characters based on similarity to physical feature measures (such as but not limited to the

demographics and color information) of the face of the source head image.

Based on the ranking, a candidate virtual character can be selected, for example as the highest ranked virtual character, or a user can select a virtual character based on the rankings. The selected candidate virtual character can be adjusted to conforms in shape and/or color to the features of the source head image. In the example depicted in block 50, the selected candidate virtual character is adjusted to match the hair and/or the skin of the source head image. As depicted in block 52, the face from the source head image is morphed with the face of the virtual character. A composite image results from the operation of block 52.

[0047] The example of Fig. 4 is discussed in relation to a library of host head images that includes candidate virtual characters. The example of Fig. 4 can be practiced using a library 120 of candidate host head images (120-/, where /^' is an integer value) that includes not only candidate virtual characters, but also includes candidate host head images of other types of cartoon or graphic characters, of humans, or other types of host head images. FIG. 5A shows an example library 120 of candidate host head images (120-1 to 120-9), including candidate host head images of other types of cartoon characters, graphic characters and humans, that can be used according to any system or method disclosed herein. The example library 120 can be maintained or stored in one or more databases. Image compositing system 10 can be configured such that a user can limit the type of candidate host head images that are evaluated for ranking or selection. For example, a user can instruct image compositing system 10 to limit the candidate host head images that are evaluated for ranking or selection to a specific type of cartoon or graphic characters (including virtual characters), or humans, or other type of host head image.

[0048] FIG. 5C depicts a collection 130 of intermediary head images (130-7) that can be used in an example system and method herein. Where the collection 130 of candidate intermediary head images (130-y) are used to generate a composite image 12, the collection 130 can be maintained or stored in a database. In the example system and method, source head image is first compared to a collection 130 of intermediary head images (130-y, where is an integer value) to rank (and in some examples, select), candidate intermediary head images (130-/) from the collection 130 as described herein. The source head image is modified to provide a modified source head image that has physical feature measures similar to the physical feature measures of the selected intermediary head image (130-y). To generate a composite image, the modified source head image is composited into a host head image that is predetermined to correspond to the selected intermediary head image. The generating is performed in accordance with predetermined transformation parameters between the selected intermediary head image and the host head image.

[0049] In general, the image compositing system 10 typically includes one or more discrete data processing components, each of which may be in the form of any one of various commercially available data processing chips. In some implementations, the image compositing system 10 is embedded in the hardware of any one of a wide variety of digital and analog computer devices, including desktop, workstation, and server computers. In some examples, the image compositing system 10 executes process instructions (e.g., machine- readable code, such as computer software) in the process of implementing the methods that are described herein. These process instructions, as well as the data generated in the course of their execution, are stored in one or more computer-readable media. Storage devices suitable for tangibly embodying these instructions and data include all forms of non-volatile computer-readable memory, including, for example, semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices, magnetic disks such as internal hard disks and removable hard disks, magneto-optical disks, DVD-ROM/RAM, and CD-ROM/RAM.

[0050] FIG. 6 shows an example of a computer system 140 that can implement any of the examples of the image compositing system 10 that are described herein. The computer system 140 includes a processing unit 142 (CPU), a system memory 144, and a system bus 146 that couples processing unit 142 to the various components of the computer system 140. The processing unit 142 typically includes one or more processors, each of which may be in the form of any one of various commercially available processors. The system memory 144 typically includes a read only memory (ROM) that stores a basic input/output system (BIOS) that contains start-up routines for the computer system 140 and a random access memory (RAM). The system bus 146 may be a memory bus, a peripheral bus or a local bus, and may be compatible with any of a variety of bus protocols, including PCI, VESA, MicroChannel, ISA, and EISA. The computer system 140 also includes a persistent storage memory 148 (e.g., a hard drive, a floppy drive, a CD ROM drive, magnetic tape drives, flash memory devices, and digital video disks) that is connected to the system bus 146 and contains one or more computer-readable media disks that provide non-volatile or persistent storage for data, data structures and computer-executable instructions.

[0051] A user may interact (e.g., enter commands or data) with the computer system 140 using one or more input devices 150 (e.g., a keyboard, a computer mouse, a microphone, joystick, and touch pad). Information may be presented through a user interface that is displayed to a user on the display 151 (implemented by, e.g., a display monitor), which is controlled by a display controller 154 (implemented by, e.g., a video graphics card). The computer system 140 also typically includes peripheral .output devices, such as speakers and a printer. One or more remote computers may be connected to the computer ^■ system 140 through a network interface card (NIC) 156.

[0052] As shown in FIG. 6, the system memory 144 also stores the image compositing system 10, a graphics driver 158, and processing information 160 that includes input data, processing data, and output data. In some

examples, the image compositing system 10 interfaces with the graphics driver 158 to present a user interface on the display 151 for managing and controlling the operation of the image compositing system 10.

[0053] Many modifications and variations of this invention can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. The specific examples described herein are offered by way of example only, and the invention is to be limited only by the terms of the appended claims, . along with the full scope of equivalents to which such claims are entitled.

[0054] As an illustration of the wide scope of the systems and methods described herein, the systems and methods described herein may be

implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.

[0055] It should be understood that as used in the description herein and throughout the claims that follow, the meaning of "a," "an," and "the" includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of "and" and "or" include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase "exclusive or" may be used to indicate situation where only the disjunctive meaning may apply.

[0056] All references cited herein are incorporated herein by reference in their entirety and for all purposes to the same extent as if each individual publication or patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety herein for all purposes. Discussion or citation of a reference herein will not be construed as an admission that such reference is prior art to the present invention.

Claims

WHAT IS CLAIMED IS:

1. A method performed by a physical computing system (140) comprising at least one processor (142) for generating a composite image (12), said method comprising:

comparing (20) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature values of candidate host head images (120-/) from a library of host head images (120), wherein the physical feature measures of the source head image (14,100) are computed based on image data representing the source head image (14,100); ranking (22) at least two candidate host head images (120-/) from the library (120) based on a degree of similarity of the candidate hosthead images (120-/) to the source head image (14,100);

adjusting (24) a selected candidate host head image to provide an adjusted host head image that conforms in shape and/or color to the source head image (14,100); and

generating a composite image (12), wherein the generating (26) comprises compositing the source head image (14,100) into a version of the adjusted host head image at a location corresponding to the adjusted host head image.

2. The method of claim 1 , further comprising determining the selected candidate host head image as the or each highest ranked candidate host head image based on the ranking.

3. The method of claim 2, wherein the or each highest ranked candidate host head image is determined as the or each candidate host head image having a degree of similarity to the source head image (14,100) above a threshold similarity value.

4. The method of claim 1 , wherein the adjusted host head image conforms in shape to the source head image (14,100), and wherein adjusting the selected candidate host head image comprises approximately aligning the selected candidate host head image with the source head image (14,100) and modifying the shape of the selected candidate host head image to conform to the shape of the source head image (14,100) to provide the adjusted host head image.

5. The method of claim 1 , wherein the adjusted host head image conforms in color to the source head image (14,100), and wherein adjusting the selected candidate host head image comprises adjusting a color of the facial region and/or hair region of the selected candidate host head image to conform to the source head image (14,100) to provide the adjusted host head image.

6. The method of claim 1 , wherein the physical feature measures of the source head image (14,100) is one or more of the pose of the face, face alignment points, ethnicity estimation, skin color, gender, facial expression, or hair color estimation.

7. The method of claim 1 , wherein the physical feature values corresponding to a candidate host head image in the library (120) is one or more of the face pose, face alignment points, ethnicity, hair color, gender, facial expression or skin color of the candidate host head image.

8. The method of claim 1 , wherein the generating the composite image (12) comprises performing a weighted blending of the source head image (14,100) and the version of the adjusted host head image in accordance with an alpha matte.

9. A method performed by a physical computing system (140) comprising at least one processor (142) for generating a composite image (12), said method comprising:

comparing (30) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature measures of candidate intermediary head images (130-y) in a collection (130), wherein the physical feature measures of the source head image (14,100) are computed based on image data representing the source head image (14,100), and wherein the physical feature measures of a candidate intermediary head image is computed based on image data representing the candidate intermediary head image;

ranking (32) at least two candidate intermediary head images (130-/) from the collection (130) based on a degree of similarity of the candidate intermediary head images (130-/) to the source head image (14,100);

modifying (34) the source head image (14,100) to provide a modified source head image having physical feature measures similar to the physical feature measures of a selected intermediary head image; and

generating a composite image (12), wherein the generating (36) comprises compositing the modified source head image into a host head image

predetermined to correspond to the selected intermediary head image, and wherein the generating is performed in accordance with predetermined transformation parameters between the selected intermediary head image and the host head image.

10. The method of claim 9, further comprising determining the selected candidate host head image as the or each highest ranked candidate host head image based on the ranking.

11. The method of claim 9, wherein the generating the composite image (12) comprises performing a weighted blending of the modified source head image and the version of the host head image in accordance with an alpha matte.

12. A computerized apparatus, comprising:

a memory storing computer-readable instructions; and

a processor (142) coupled to the memory, to execute the instructions, and based at least in part on the execution of the instructions, to perform operations comprising:

comparing (20) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature values of candidate host head images (120-/) from a library of host head images (120), wherein the physical feature measures of the source head image (14,100) are computed based- on image data representing the source head image (14,100); ranking (22) at least two candidate host head images (120-/) from the library (120) based on a degree of similarity of the candidate host head images (120-/) to the source head image (14,100);

13. A computerized apparatus, comprising:

a memory storing computer-readable instructions; and

comparing (30) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature measures of candidate intermediary head images (130-/) in a collection (130), wherein the physical feature measures of the source head image (14,100) are computed based on image data representing the source head image (14,100), and wherein the physical feature measures of a candidate intermediary head image is computed based on image data representing the candidate intermediary head image;

ranking (32) at least two candidate intermediary head images (130-;) from the collection (130) based on a degree of similarity of the candidate intermediary head images (130- ) to the source head image (14,100);

generating a composite image (12), wherein the generating (36) comprises compositing the modified source head image into a host head image predetermined to correspond to the selected intermediary head image, and wherein the generating is performed in accordance with predetermined transformation parameters between the selected intermediary head image and the host head image.

14. At least one computer-readable medium storing computer- readable program code adapted to be executed by a computer to implement a method comprising:

comparing (20) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature values of candidate host head images (120-/) from a library of host head images (120), wherein the physical feature measures of the source head image (14,100) are computed based on image data representing the source head image (14,100); ranking (22) at least two candidate host head images (120-/) from the library (120) based on a degree of similarity of the candidate host head images (120-/) to the source head image (14,100);

15. At least one computer-readable medium storing computer- readable program code adapted to be executed by a computer to implement a method comprising:

comparing (30) data representative of physical feature measures of a source head image (14,100) to corresponding physical feature measures of candidate intermediary head images (130-;) in a collection (130), wherein the physical feature measures of the source head image (14,100) are computed based on image data representing the source head image (14,100), and wherein the physical feature measures of a candidate intermediary head image is computed based on image data representing the candidate intermediary head image;

ranking (32) at least two candidate intermediary head images (130-;) from the collection (130) based on a degree of similarity of the candidate intermediary head images (130-;) to the source head image (14,100);