WO2010057906A1

WO2010057906A1 - A method and an apparatus for generating an image of an element

Info

Publication number: WO2010057906A1
Application number: PCT/EP2009/065368
Authority: WO
Inventors: Esben Rosenlund Hansen; Rune Domsten; Pernille Ingildsen
Original assignee: 3Dvisionlab Aps
Priority date: 2008-11-19
Filing date: 2009-11-18
Publication date: 2010-05-27

Abstract

A method of generating an image of an element from two or more initial images of the element, each initial image relating to a first angle toward the element, each first angle being different from the other angle(s). The method comprises identifying, in each of the initial images, a number of points/positions and/or areas, each of two or more of the points/positions/areas of one of the initial images corresponding to a point/position/area of another of the initial images, and then generating the image of the element on the basis of the initial images by determining, for each of the two or more of the corresponding points/positions/areas, a weighting coefficient, and generating corresponding points/positions/areas of the image on the basis of the corresponding points/positions/areas of the initial images taking into account the pertaining weighting coefficient(s).

Description

A METHOD AND AN APPARATUS FOR GENERATING AN IMAGE OF AN ELEMENT

The present invention relates to methods and systems for generating an image using images from multiple cameras. In particular, the invention relates to the generation of an image from another angle than that from which the cameras view the element.

In e.g. video conferencing, the participant viewing his monitor has his camera(s) positioned at or on the monitor, whereby the other participants view this user from a side and not from the front. The present invention relates to a solution to that problem.

A first aspect of the invention relates to a method of generating an image of an element, the method comprising :

obtaining two or more initial images of the element, each initial image relating to a first angle toward the element, each first angle being different from the other first angle(s),

- identifying, in each of the initial images, a number of points/positions and/or areas, each of two or more of the points/positions/areas of one of the initial images corresponding to a point/position/area of another of the initial images, and

generating the image of the element on the basis of the initial images by:

determining, for each of the two or more of the corresponding points/positions/areas, a weighting coefficient, and

generating corresponding points/positions/areas of the image on the basis of the corresponding points/positions/areas of the initial images taking into account the pertaining weighting coefficient(s). In the present context, an image of an element may be any type of data, such as 2 dimensional data or 3 dimensional data. Naturally, this is the situation for both the final image as well as the initial images. It is noted that the final and initial images need not be of the same type. Thus, the initial images may be 2 dimensional image data, and the final image may be 3 dimensional image data, for example. The data may be formatted or compressed, if desired, in any manner.

The element normally will be a part of a person, or multiple persons, but may in principle be anything of which an image is desired.

The obtaining of the images may comprise actually generating the images using cameras, sensors, or the like. Alternatively, the image data may be received from one or more sources, such as cameras, sensors or the like or computers, servers, processors, networks or the like.

Preferably, the initial images are generated simultaneously, such as is the case for stereo images, in order for the subsequent process to generate the most realistic image. This simultaneousness requirement may, naturally, be altered to require or desire that the initial images are generated within 1000, such as within 500, preferably within 300, such as within 200, preferably within 100 ms of each other.

Naturally, the present method may comprise providing a sequence of initial images and generating a sequence of images of the element, such as using and generating video of the element.

Any number, larger than one, of initial images may be generated or obtained, for generation of an image of the element, all initial images preferably generated within the above time limit. The subsequent procedure may then determine which of such images to use in the generation of the final image.

Each of the initial images is provided from a different angle toward the element, in order to have non-identical initial images. Usually, such an angle will be that between the element (or part thereof imaged) and a camera/sensor or the like generating the image data. Different angles will mean different positions of such cameras/sensors and/or that different parts of the element are imaged.

In the present context, two points, positions and/or areas, one from one of the initial images and the other from another of the initial images, will correspond, if they e.g. relate to the same part of the element. Parts/areas/positions/points of the element may be determined in a number of manners by usual image processing. Thus, predetermined shapes (such as circles for identifying pupils) may be identified, as may high contrast points or areas (the corner of an eye or the mouth). Thus, sets of such corresponding points/positions/areas may be determined in known manners. One manner being a pre-assumed relative positioning of the points/positions/areas (such as the pre-assumed relative positions of the eyes, pupils and mouth of a person).

Naturally, some positions of an element may not be visible by all cameras/sensors and thereby be present in all initial images. Thus, for each point/position/area, a different number of initial images may have corresponding data.

According to the invention, a weighting coefficient is determined for each of the two or more of the corresponding points/positions/areas. This weighting coefficient describes or relates to the weight, percentage and/or relevance of the point/position/area of one of the initial images in relation to the corresponding points/positions/areas of other initial images. This relevance/weight/percentage may be determined in a number of manners, as will be described further below. Naturally, this coefficient may be a single value (such as the percentage out of 100 derived from one of two initial images) and may be a more complex coefficient describing the relation between the relevance of more than two initial images.

The generation step takes into account the weighting coefficient(s) when deriving or generating the image so that, for each of the points/positions/areas, a differing use of the information from the initial images is seen. In this manner, a better final image may be obtained. This mixing of information is a standard image transformation.

In one embodiment, the step of identifying the points/positions/areas comprises identifying, in the pertaining initial image, one or more recognizable parts, and identifying corresponding points/positions/areas of multiple initial images by identifying one or more sets of points/positions/areas, each set comprising one of the recognizable parts of one initial image and a corresponding part of each of one or more others of the multiple images.

In this respect, a recognizable part primarily is a part identifiable in the image, such as a part having a predetermined shape, such as a circle (pupil), an oval (eye), a square, a triangle, an angle (corner of the eye), or a sufficiently large contrast to the surroundings (corner of the eye/mouth), a predetermined shape (saddle point of the base of the nose), or a predetermined position in relation to other recognizable parts.

In another embodiment, the step of identifying the corresponding points/positions/areas comprises identifying corresponding areas in two or more initial images, and wherein the step of determining the weighting coefficient relating to corresponding areas comprises determining a weighting coefficient relating to the sizes of the areas. It may, for example, be assumed that the larger the area is, the more perpendicularly is a corresponding area of the element viewed by the pertaining camera/sensor or the like, and the more information is present in that area.

Other factors such as lighting/radiation intensity determined from the area/position/point may be used for determining the amount or information or the relevance of the information.

In another embodiment, the step of generating the image comprises identifying a number of corresponding areas in the initial images and morphing the initial images on the basis of the weighting coefficients relating to the individual areas. The standard morphing procedure, thus, is used for determining the points/areas/positions and determining or calculating the subsequent adaptation/deformation/alteration of the areas or positions/points in order to have those at determined, corresponding positions before mixing the contents of the images. This mixing then is not performed with the same weight over all of the images/points/positions/areas but with differing weighting coefficients for at least one or some of the points/areas/positions.

In one embodiment, the generation of the image of the element comprises the determination of a second angle relating to the element, the corresponding points/positions/areas of the image being generated on the basis of the second angle. This angle may be determined once and for all and never be changed, or may be changed at intervals or continuously. This angle may be pre-set or set by the user either as a value, a position from which the element is to be viewed/imaged, or an angle in relation to the element.

Preferably, the second angle is not identical to any of the first images.

In that embodiment, the step of determining the weighting coefficient may further comprise basing the weighting coefficient also on the determined second angle and an angle relating to the pertaining point/position/area of the pertaining initial image. Thus, the weighting coefficients may be based on how closely the direction from which a particular initial image is provided matches the direction from which the element is to be viewed. The closer the match between the angles, the higher could the credibility be assumed of the information in that initial image.

In one embodiment, the generating step comprises:

obtaining a reference image of the element, and

- determining one or more areas/points/positions in the reference image, and wherein the generating step comprises generating the corresponding points/positions/areas of the image on the basis of the determined areas/points/positions of the reference image.

In one situation, the generating step may generate the corresponding points/positions/areas of the image so as to be similar to or identical to those of the reference image. Thus, the same positions/points may be determined, and the same areas (position, shape, size, area) may be selected. This may require the alteration/manipulation/deformation of the initial images or the corresponding points/positions/areas of the initial images.

In this manner, the generating of corresponding points/positions/areas may, in fact, facilitate or change a direction of the element, when viewed in the image, corresponding to a direction of the element in the reference image. Thus, the above second direction may indirectly be input by the selection of the direction of the element (such as a face) in relation to the means (such as a camera) when providing the reference image.

When changing the initial images or the corresponding points/positions/areas of the image, the parts of the element usually imaged at such positions/areas/points (from the reference image) will be positioned at such positions/points/areas, whereby an effective rotation of the element may be performed.

Thus, in the above embodiment, the determining step may further comprise determining or altering the points/positions/areas of the image to have relative positions corresponding to relative positions of the corresponding points/positions/areas of the reference image. In this manner, making e.g. the areas have the same areas, weighting coefficients determined by the reference image may be used, if the above area-based determination is used.

In another embodiment, the step of determining the weighting coefficients comprises adapting the weighting coefficients at or in the vicinity of boundaries between different points/positions/areas. Using the same weighting coefficient within all of an area having a common boundary with another area inside which another weighting coefficient is used may generate visible edge effects. Thus, adapting the effective weighting coefficient at the boundaries may smooth such effects. One manner of obtaining this is to provide not only a number of separate weighting coefficients, one for each corresponding point/position/area, but provide a 2D or 3D graph/curve representing, for each point/position/area in the image, the determined weighting coefficient, the method further comprising the step of smoothing the weighting coefficient graph/curve.

In this manner, the weighting coefficient varies not only from corresponding point/position/area to corresponding point/position/area, but also inside e.g. areas and in the vicinity of points/positions.

This graph/curve may be generated by firstly mapping the points/positions/areas and the determined weighting coefficients in relation thereto and subsequently smoothing this curve/graph in order to obtain a smooth curve and thereby smooth transitions between points/positions/areas.

In a particular embodiment, the identifying step comprises predicting, in one or more of the first initial images and at a first point in time, points/positions/areas at a subsequent point in time.

One reason for this is to perform a preprocessing or prediction in order to facilitate a swifter analysis and conversion of the image, when the initial images forming the basis of the image are available.

In this situation, the generating step may comprise generating, at the subsequent point in time, the image on the basis of corresponding, predicted points/positions/areas as well as first initial images generated or provided at the subsequent point in time. Then, the pre-processing or prediction relates to the positions/points/areas, and the actual initial images are used for generating the image. In this situation, the weighting coefficients may also be predicted and based on the predicted points/positions/areas, or may be determined from the actual, initial images generated at the subsequent point in time. In general, this prediction may be made by, in one or more of the initial images, tracking individual points/positions/areas, such as the velocity, acceleration, direction etc. thereof, to thereby be able to predict the future movement, and thereby positions, thereof.

As mentioned, the weighting coefficients may be predicted as well, as the generating step may comprise determining the weighting coefficients on the basis of the predicted points/positions/areas. In this situation, the generating step preferably also comprises generating the image on the basis of the initial images provided or generated at the first point in time.

Thus, the full image may be based on predictions. At the first point in time, initial images are provided from which the positions/points/areas are predicted, and the contents of which are predicted at the predicted positions/points/areas. Then, the image, at the subsequent point in time, may be predicted and not based at all on the initial images provided at the subsequent point in time.

The initial images provided at the subsequent point in time may be used for predicting an image further into the future.

One manner of predicting the contents of an initial image, such as at the predicted points/positions/areas, may be one in which the contents in the initial image at the actual points/positions/areas are used or converted (such as by deforming/transforming an area and thereby its contents) to the predicted points/positions/areas. The contents of an area may be transformed by determining the desired transformation of the area and subsequently stretching/compressing/deforming the contents accordingly.

More precisely, the prediction of an initial image from a previous initial image may be performed by firstly predicting, from the movement of the points/positions/areas, predicted points/positions/areas, and subsequently manipulating the actually determined points/positions/areas of the initial image (e.g. using the wrapping part of the standard morphing technology) to obtain a predicted initial image. Naturally, if such prediction is desired, it may be desired to increase the frame rate of image generation from the cameras or the like, as this may give a better predication of future points/positions/areas. The frame rate of images actually used for the generation of the image need not be that high.

Another aspect of the invention relates to an apparatus for generating an image of an element, the apparatus comprising :

two or more cameras positioned at different positions for obtaining two or more initial images of the element, each initial image relating to a first angle toward the element, each first angle being different from the other first angle(s),

first processing means adapted to identify, in each of the initial images, a number of points, positions and/or areas, each of two or more of the points/positions/areas of one of the initial images corresponding to a point/position/area of another of the initial images, and

- generating means for generating the image of the element on the basis of the initial images by:

determining, for each of the two or more corresponding points/positions/areas, a weighting coefficient, and

generating corresponding points/positions/areas of the image on the basis of the corresponding points/positions/areas of the initial images taking into account the pertaining weighting coefficient(s).

As mentioned above, an image of an element may be information on any form, compressed or not, and may represent all of the element or part thereof, and may be in any number of dimensions desired.

In the present context, a camera may be any type of element adapted to provide 2D, 3D, or even ID image data (such as by scanning), and this camera may be remotely positioned and forward the image data via any type of communication (wireless or not), such as via networks, the WWW, servers or the like.

Normally, the cameras are positioned at different positions in relation to the element in order to provide non-identical images of the element. These positions are determined by the first angles, where an angle is defined by a line, e.g. between the camera (such as a predetermined point or part of a camera) and the element (such as a predetermined point or part of the element).

As mentioned above, it is preferred that the initial images are provided or generated simultaneously or within a predetermined time frame. Thus, the cameras may be adapted to exchange or receive timing data or signals in order to achieve this timing. Alternatively, the first processing means, if receiving the initial images, may select images provided or generated (the images may comprise timing information) within the desired time frame.

Naturally, the first processing means may be a central element adapted to receive the images and determine the points/positions/areas, or this processing means may be distributed into or near the cameras in order for the cameras to be able to pre-process the images before transmitting data.

Normally, the determination of points/positions/areas in an image is performed by identifying, in the image, parts of the element. Such parts normally are easily determinable parts, such as parts having a predetermined (2D or 3D) shape, colour, temperature, position, contrast or the like. This is standard image analysis.

In addition, predetermined or known relations between such positions/points/areas in the element may be used for identifying positions/points/areas in the initial images. The same relation may be used for correlating a point/position/area in one image with that in the other. Generating the image will mean generating data representing the image. This data may have any shape or form, as long as the image may be derived there from.

A weighting coefficient is determined for at least two of the corresponding points/positions/areas in order to obtain more credible image data. This weighting coefficient may, as is made clear further below, be determined in a number of manners but in general relates to the credibility/quality of the image data of the initial images having corresponding points/positions/areas at the pertaining point/position/area. As mentioned, this coefficient may simply be a number, or may describe the relation between more than two images.

The generation of the image is usually performed on the basis of the initial images, where information from the initial images at the points/positions/areas is utilized in relation to the pertaining weighting coefficient.

In one embodiment, the first processing means is adapted to identify, in the pertaining initial image, one or more recognizable parts, and identifying corresponding points/positions/areas of multiple initial images by identifying one or more sets of points/positions/areas, each set comprising one of the recognizable parts of the initial image and a corresponding part of each of one or more others of the multiple images.

As mentioned above, such recognizable parts normally are parts with a particular shape, colour, position, temperature or contrast.

In one embodiment, the first processing means is adapted to identify the corresponding areas in two or more initial images, and wherein the generating means is adapted to determine the weighting coefficient relating to the corresponding areas by determining a weighting coefficient relating to the sizes of the areas.

As mentioned above, the first processing means may, as may the generating means, be a centralized element receiving the initial images and performing the image analysis, or may be decentralized, such as being a part of the cameras or being parts adapted to receive the initial images and forward the data to other elements.

In a simple embodiment, the generating means is adapted to generate the image by identifying a number of corresponding areas in the initial images and morphing the initial images on the basis of the weighting coefficients relating to the individual areas. Here, the standard morphing is amended so that the individual areas are transformed with their own weighting coefficient instead of a single, global coefficient.

In one embodiment, the generating means is adapted to determine a second angle relating to the element, the generating means being adapted to generate the corresponding points/positions/areas of the image on the basis of the second angle. In one situation, the corresponding points/positions/areas are generated as if identified in an image taken along the second angle toward the element. Alternatively, the determined, corresponding points/positions/areas may be altered to be as if the image was taken along the second angle toward the element. In these manners, the image may be made to resemble the element viewed from the second angle, and if this angle is changeable, a user or, in a particular embodiment where the element is a person taking part in a video conference, the image of the person being transmitted to other participants in the conference, the other participants may alter this angle and thereby the angle form which the person is viewed.

In this embodiment, the generating means is preferably adapted to determine the weighting coefficient also on the basis of the determined second angle and a third angle relating to the pertaining point/position/area of the pertaining initial image. Thus, a simple manner of determining a weighting coefficient may relate to the difference in the angles. In this respect, the third angle may vary from point/position/area to point/position/area, as different e.g. areas are viewed along different third angles from a stationary camera. Thus, as the difference between a third angle and the second angle may differ from area/point/position to others, different weighting coefficients will be derived. In another embodiment, the generating means are adapted to:

obtain a reference image of the element, and

determine one or more areas/points/positions in the reference image, and

wherein the generating means are adapted to generate the corresponding points/positions/areas of the image on the basis of the determined areas/points/positions of the reference image.

In this manner, the reference image may be used for, as mentioned above, determine or define the second angle, and/or for defining the points/positions/areas of the image or initial images. In this situation, the generating means may be adapted to determine or alter the points/positions/areas of the image or initial images to have relative positions corresponding to relative positions of the corresponding points/positions/areas of the reference image. In this manner, the correspondence may be the same positions or any areas having the same positions, shape and/or area.

In one embodiment, the generating means are adapted adapt the weighting coefficients at or in the vicinity of boundaries between different points/positions/areas. The possibility of experiencing non-optimal boundary effects are described above.

In this situation, the generating means may comprise means for providing a 2D or 3D graph/curve representing, for each point/position/area in the image, the determined weighting coefficient, the providing means being adapted to smooth the weighting coefficient graph/curve. This smoothing may be the adaptation to e.g. an n-dimension polynomial or any other desirable smoothing technique.

In a particularly interesting embodiment, the first processing means are adapted to predict, in relation to one or more of the first initial images and at a first point in time, points/positions/areas at a subsequent point in time. This prediction may be based on a tracking of positions, movement, acceleration or the like of the points/positions/areas. Alternatively, pixel tracking may be performed.

In one situation, the generating means is adapted to generate, at the subsequent point in time, the image on the basis of the corresponding, predicted points/positions/areas as well as first initial images generated or provided at the subsequent point in time.

In another situation, the generating means is adapted to determine the weighting coefficients on the basis of the predicted points/positions/areas. In this situation, the generating means may also be adapted to generate the image on the basis of the initial images provided or generated at the first point in time.

In general, it may be desired that the cameras are adapted to provide image data at a frame rate higher than that of the desired images in order to facilitate a better prediction of future points/positions/areas.

In the following, preferred embodiments of the invention will be described with reference to the drawing, wherein :

Figure 1 illustrates a set-up for providing e.g. a stereo image being a pair of images taken simultaneously from different positions,

Figure 2 illustrates feature extraction from a face

- Figure 3 illustrates a stereo image pair of a face, and

Figure 4 illustrates a morphed image of the face from a new angle.

The presently preferred embodiment will now be described in a relatively simple set-up relating to the generation of an image by an altered morphing algorithm from an initial stereo image. In this respect, a stereo image, or a stereo image pair is two images taken of the element at the same time (or within a given maximum time difference) and from different positions/angles.

A face will be used, but it is clear that any type of element may be imaged.

Also, the input into this simple set-up is two standard 2D images, and the output image is a standard 2D image, but any type of image data and output data, such as a 3D image or data, such as for use in 3D holographic equipment may be generated.

In the set-up seen in figure 1, a face 12 is photographed or imaged by two cameras 14 and 16 from two different angles. It is now desired to provide an image of the face 12 from a third angle illustrated by the hatched arrow.

A type of system in which the present embodiment may be used is videoconferencing where the user will normally view his monitor 18 and not the camera(s), whereby all viewers view each other from non-central angles, such as from the side, from above or from below. This gives a non-optimal contact and would normally be corrected by having the user look into a camera 14, 16, instead of the monitor 18, which is difficult for most people.

Using the present set-up, the viewer 12 may view the monitor 18, and the system (receiving the signals from the cameras 14, 16 as well as information relating to the angle) will then generate an image of the user as if viewed from the desired angle.

In this respect, the angle is the angle from which the face 12 is to be viewed. This angle may be pre-selected and not changeable, or it may be changed by the user or another participant to the video conference.

One manner of generating an image from two initial images is a method in which, in each initial image, a number of corresponding points or positions are determined, which points/positions are used for dividing the image into particular areas, one area from one stereo image corresponding to an area in the other stereo image. It is noted that the areas (position, shape, area) will differ from each other (the initial images being taken from different positions or angles).

An example of the identification of the points/positions (often called feature generation) from a face may be seen in figure 2. A number of more or less easily determinable positions of the face are determined : the pupils, the corners of the eyes, the corners of the mouth, the nose, the ears, the eyebrows, a position right between the eyes, the chin, or the like. A number of such features are known, and a wide variety of manners exist of determining these positions in a robust way. The same is the situation if the image data are 3D data, where both large contrast areas (such as corners of the eye/mouth) as well as points with a a well-defined shape (tip or base of the nose) are easily determined.

Having identified these positions in both stereo images, the positions are used for dividing each stereo image into a number of areas bounded or defined by the positions. This is illustrated in figure 3 in which a stereo image pair of a face is illustrated as is an area bounded by the right pupil, the right corner of the mouth and the tip of the nose. It is seen that as the two images view the face from different angles, the two areas are not identical.

Naturally, an area may be defined around a point or with a boundary defined by the point. Also, a number of areas may be defined between a plurality of points, such that areas may be defined by points (such as an area around a point halfway between two determined points) not inside or on the boundary thereof. Thus, as many feature points and/or areas as desired may be defined or determined, and the areas may be selected or determined as large or as small as desired, even to the size of a single pixel.

Subsequently, the individual areas are changed/deformed/manipulated to be pair wise (one in each image) identical in shape. This is a stretching/compressing/rotation or other manipulation required to have the shapes of the areas identical. Naturally, ideally, the areas should be deformed/manipulated to have the shape as when seen from the angle (hatched arrow). Further below, methods of obtaining this are described.

Then the contents of the, now, at least substantially identical areas are "mixed" to have the desired contents. This is performed on the basis of the weighting coefficient described further below. On the basis of this weighting coefficient, it is determined for all parts or pixels in the area how much of the information in the area in one initial image and how much of the information in the area of the other initial image is to be used in this particular area of the final image. A simple manner is the calculation of each pixel from the values of the two pixels (one from each of the initial images) and the weighting coefficient.

As mentioned, for each area, a weighting coefficient is determined, which weighting coefficient relates to the weight, relevance or credibility of the information from the area in one of the initial images in relation to the information of the area of the other initial image.

An example is seen when looking at a face from one side. The side of the nose facing the viewer has more information about that side of the nose than if viewed from the other side of the face, whereby it is defined that the first (and best) viewing angle of that side of the nose is given the most emphasis.

Thus, for each area, the initial image is determined which has the most information, and this image is given the highest weight for that area.

Naturally, this weighting coefficient may depend on a number of factors, such as the difference in the viewing angle of a camera/image and the desired angle, as well as whether features of the element (here the face) are visible or not from the position of a camera.

A simple manner of determining a weighting coefficient is the actual area of an area defined by particular points or positions of the element 12. The larger the area in one initial image, compared to that of the other initial image, the larger the weighting coefficient may be selected, as the area in most circumstances relates to the same element, and the larger the area, the more perpendicular is the viewing angle to the surface of the area, and the more information is present in that initial image.

Naturally, other methods may be devised for determining the weighting coefficient, such as the generation of a 3D model of the face 12 and determination of which of the initial images actually has the most information by identifying the camera being the closest to viewing the part of the 3D model from a perpendicular angle.

In yet another manner, the weighting coefficient may be based on the difference in angle between the desired angle (the hatched arrow) and those of the cameras.

Consequently, the weighting coefficients of the areas will normally vary with the desired viewing angle.

Actually, a very simple manner exists in which the weighting coefficient is simply defined in the position of the final image. Thus, if two cameras are provided, one to the right of the person and one to the left, the weighting coefficient may vary from the left to the right, independently of the rotation of the face 12 as well as on other factors. The weighting factor may be identical in vertical lines or areas of the image, and may event be identical in e.g. the right half and the left half of the image.

Naturally, a weighting coefficient may determine that all information in an area is to be derived from the corresponding area of a single initial image. However, mostly, a percentage from two or more initial images is used, and this mixing of information is a standard procedure used e.g. in the below-mentioned Morphing method.

One manner of implementing the above method is to alter the so-called Morphing method which is normally usually used for having one image (e.g. a face) morph or transform into another image (another face or e.g. an animals head). In this method, the final angle is fixed, and the degree of emphasis between the two initial images is altered over time (gives the transformation) but identically over the whole, final image.

Thus, using this method by inputting (which the Inventors have not seen before) a stereo image pair of the face 12 and for providing the feature extraction and identifying the individual points/positions/areas and deforming/manipulating the areas to be pair-wise identical, it is altered to receive information relating to the weighting coefficient of each individual point/position/area and then generate the final image as "usual" but now with different weighting coefficients for each area/point/position.

Depending on the quality of the initial images and the element imaged, it may be possible to not merely determine robustly identifiable parts/positions of the element but to correlate pairs of individual pixels of the two initial images to thereby determine the final image on this basis. This provides a much more fine- grained image in that, otherwise, the same weighting factor is used within all of each area, which may provide recognizable or visible edge phenomena between neighboring areas.

One manner of avoiding this is to determine a weighting coefficient for each area but to nevertheless vary the weighting coefficient inside the area to provide a more even variation over the area boundaries. In one situation, if a first of two neighboring areas has weighting coefficient A and the other weighting coefficient B, the weighting coefficients inside the first and second areas may be determined as A and B, respectively, at their centers but may change between the centers slowly or more evenly from A to B and vice versa. Thus, the closer to the boundary between the first and second areas, the farther the weighting coefficient is from A and B.

In fact, a map or table of the weighting coefficient may be used which defines or describes the weighting coefficients of individual areas or positions/pixels of the final image; or even at each point or pixel therein. The above weighting coefficient variation may be obtained by providing, at positions inside the areas, the determined weighting coefficients, and providing a smoothing algorithm which smoothes the coefficients at least at the boundaries. This smoothing algorithm may be the fitting of an n-dimensional polynomial. Other methods also may be used. Naturally, this method is not limited merely to 2D images and 2D information.

Naturally, the weighting coefficients, seen as a set of values or as the above map or table may be fixed once and for all and simply then used, or they/it may be determined at regular intervals or continuously. This may be a separate process performed separately from the image generation which will merely use the latest weighting coefficients/map, until these/this is updated.

Thus, this separate process may derive or receive a set of initial images which are then used for determining the weighting coefficients and for providing such values/map for use with a number of subsequent image generations.

A particularly interesting embodiment deals with the problem that the image conversion/analysis/generation may, depending on the complexity and the processing capacity available, take a considerable number of microseconds. Thus, a delay may be seen between the speech (which is swiftly processed) and the image of the speaking person.

The above separate generation or reuse of the weighting coefficients/map will reduce this delay.

Another manner or in addition to the above manner, the movements of the element 12 may be predicted from the movements of the feature points determined. From a number of images, the actual movements (velocity, acceleration) of feature points may be determined, and future movements/positions may be predicted. From such predictions, the shapes of the individual areas may be predicted, as may the conversion or deformation required to bring these to the desired shape. As the weighting factors may also be predicted from the predicted areas, or the shapes or areas thereof, or simply be re-used, a large part of the pre-processing may be performed before receiving the actual images, part of the contents of which has been predicted. Then, the deformation of the areas and the generation of the final image may be performed in a swifter manner, but still on the actual images.

Also, actually, the full contents of the initial images may be predicted. In the ultimate situation, pixel tracking may be performed between subsequent initial images from the same camera to predict the value of each pixel and thereby predict the next initial image from that camera. This may require an increased frame rate from the cameras, but cameras are available today with a frame rate more than 10 times that of 25 images per second which provides a sufficient imaging quality for low quality products.

Otherwise, the contents of an earlier, obtained initial image may be used for providing the prediction of the desired image from a camera. In this manner, a determined area of the earlier image may be deformed as determined into the predicted shape and position of the predicted image, whereby the final image, for use at a particular point in time, may be generated with no information from the images taken at the point in time. These images taken at that point in time will, however, be used for the prediction of a subsequent image.

Thus, on the basis of a prediction of the movements and/or future positions of feature points and/or pixels, the areas, deformations/manipulations may be predicted, as may both the weighting coefficients and even the contents of the areas.

Naturally, the prediction may extend as far into the future as desired. However, the farther into the future the prediction, the more uncertain it is. Presently, it is desired to predict up to 500 ms into the future. A vast increase of processing power available is foreseen, but so is also the resolution of the images. Also, especially for transmission over wireless links, transmission of data also in the future will generate delays. Consequently, it is assumed that a delay will occur also in the future, whereby the above prediction is of great value.

In that or another embodiment, a reference image is provided of the user 12 by positioning the face 12 in the desired manner in front of one of the cameras (or both for calibration purposes). Thus, this camera now views the user as from the desired angle (the hatched arrow) during normal use. This reference image now defines the correct angle and the correct sizes and shapes of the individual areas defined by the feature points determined in the desired manner.

When, subsequently, obtaining a set of images from the cameras, the areas of the individual images will be deformed or manipulated to have the same shapes and positions as those of the reference image, and the weighting coefficients may be determined from the difference in shape/size of the areas of the individual areas and those of the reference image.

It is noted that the reference image now defines the desired angle, when it is used for defining the required manipulation of the initial images.

This reference image may also be used for providing the correct weighting coefficient map or the correct variation of weighting coefficients at the area boundaries, as the reference image will identify how the boundary is to look, and the final image and/or weighting coefficients/map may be altered to provide the desired variation across the boundary. Again, this weighting coefficient variation may be determined for each set of images, only once, or at regular or determined intervals.

When increasing the number of cameras, the set-up may be more complicated. However, it is still desired to determine, for each area, the relevance of the information in the area and thereby generate a final image, the individual areas of which has non-identical weighting coefficients and thereby use different amounts of information from the images from the individual cameras. As described above, the weighting coefficients may simply be determined on the basis of the sizes of the individual areas, or other manners may be used. Also, depending on the position and angle of the element 12, two or more cameras may be selected to provide the initial images, and images from other cameras may not be used in that respect.

Naturally, more than a single element (typically a face) may be present in the images of the cameras, and the system may be adapted to identify such multiple elements and either eliminate all but one element or even provide different images for each such element, as the viewing angle may differ for each element/face.

Also, the feature position extraction of the individual image may be performed by a centralized unit receiving the images from the cameras 14/16 or may be performed by the individual camera, as may the determination of the relevance of an area, such as from the actual area thereof. In one situation, a value reflecting the actual area may be generated and forwarded with the image data. Alternatively, a centralized element, such as a processor, graphics card and/or a computer) may receive the images and perform the above determinations and generate the final image. This has the advantage that if large parts of the image are to have a zero weighting coefficient, these need not be forwarded, whereby a reduction is seen in bandwidth and subsequent processing requirements. In addition, the prediction of feature point positions/velocity/acceleration may be performed in the individual camera if desired.

Also, the feature point determination may be performed using any desired means also in addition to simple image analysis, such as the use of alternative means. One means sometimes used for feature point extraction is IR imaging which makes it possible to determine feature positions not visible to the naked eye.

Claims

1. A method of generating an image of an element, the method comprising :

obtaining two or more initial images of the element, each initial image relating to a first angle toward the element, each first angle being different from the other angle(s),

identifying, in each of the initial images, a number of points/positions and/or areas, each of two or more of the points/positions/areas of one of the initial images corresponding to a point/position/area of another of the initial images, and

- generating the image of the element on the basis of the initial images by:

2. A method according to claim 1, wherein the step of identifying the points/positions/areas comprises identifying, in the pertaining initial image, one or more recognizable parts, and identifying corresponding points/positions/areas of multiple initial images by identifying one or more sets of points/positions/areas, each set comprising one of the recognizable parts of one initial image and a corresponding part of each of one or more others of the multiple images.

3. A method according to claim 1 or 2, wherein the step of identifying the corresponding points/positions/areas comprises identifying corresponding areas in two or more initial images, and wherein the step of determining the weighting coefficient relating to corresponding areas comprises determining a weighting coefficient relating to sizes of the areas.

4. A method according to any of the preceding claims, wherein the step of generating the image comprises identifying a number of corresponding areas in the initial images and morphing the initial images on the basis of the weighting coefficients relating to the individual areas.

5. A method according to any of the preceding claims, wherein the generating step comprises determining a second angle relating to the element, the corresponding points/positions/areas of the initial images or the image being generated on the basis of the second angle.

6. A method according to claim 5, wherein the step of determining the weighting coefficient comprises basing the weighting coefficient also on the determined second angle and a third angle relating to the pertaining point/position/area of the pertaining initial image.

7. A method according to any of the preceding claims, wherein the generating step comprises:

obtaining a reference image of the element, and

determining one or more areas/points/positions in the reference image, and

wherein the generating step comprises generating the corresponding points/positions/areas of the image or the initial images on the basis of the determined areas/points/positions of the reference image.

8. A method according to claim 7, wherein the determining step comprises determining the points/positions/areas of the image or the initial images to have relative positions corresponding to relative positions of the corresponding points/positions/areas of the reference image.

9. A method according to any of the preceding claims, the step of determining the weighting coefficients comprises adapting the weighting coefficients at or in the vicinity of boundaries between different points/positions/areas.

10. A method according to claim 9, further comprising providing a 2D or 3D graph/curve representing, for each point/position/area in the image, the determined weighting coefficient, the method further comprising the step of smoothing the weighting coefficient graph/curve.

11. A method according to any of the preceding claims, wherein the identifying step comprises predicting, in one or more of the first initial images and at a first point in time, points/positions/areas at a subsequent point in time.

12. A method according to claim 11, wherein the generating step comprises generating, at the subsequent point in time, the image on the basis of corresponding, predicted points/positions/areas as well as first initial images generated or provided at the subsequent point in time.

13. A method according to claim 11, wherein the generating step comprises determining the weighting coefficients on the basis of the predicted points/positions/areas.

14. A method according to claim 13, wherein the generating step also comprises generating the image on the basis of the initial images provided or generated at the first point in time.

15. An apparatus for generating an image of an element, the apparatus comprising :

two or more cameras positioned at different positions for obtaining two or more initial images of the element, each initial image relating to a first angle toward the element, each first angle being different from the other first angle(s), first processing means adapted to identify, in each of the initial images, a number of points/positions and/or areas, each of two or more of the points/positions/areas of one of the initial images corresponding to a point/position/area of another of the initial images, and

16. An apparatus according to claim 15, wherein the first processing means is adapted to identify, in the pertaining initial image, one or more recognizable parts, and identifying corresponding points/positions/areas of multiple initial images by identifying one or more sets of points/positions/areas, each set comprising one of the recognizable parts of the initial image and a corresponding part of each of one or more others of the multiple images.

17. An apparatus according to claim 15 or 16, wherein the first processing means is adapted to identify the corresponding areas in two or more initial images, and wherein the generating means is adapted to determine the weighting coefficient relating to corresponding areas by determining a weighting coefficient relating to the sizes of the areas.

18. An apparatus according to any of claims 15-17, wherein the generating means is adapted to generate the image by identifying a number of corresponding areas in the initial images and morphing the initial images on the basis of the weighting coefficients relating to the individual areas.

19. An apparatus according to any of claims 15-18, wherein the generating means is adapted to determine a second angle relating to the element, the generating means being adapted to generate the corresponding points/positions/areas of the image and/or the initial images on the basis of the second angle.

20. An apparatus according to claim 19, wherein the generating means is adapted to determine the weighting coefficient also on the basis of the determined second angle and a third angle relating to the pertaining point/position/area of the pertaining initial image.

21. An apparatus according to any of claims 15-20, wherein the generating means are adapted to:

obtain a reference image of the element, and

determine one or more areas/points/positions in the reference image, and

wherein the generating means are adapted to generate the corresponding points/positions/areas of the image and/or the initial images on the basis of the determined areas/points/positions of the reference image.

22. An apparatus according to claim 21, wherein the generating means are adapted to determine the points/positions/areas of the image and/or the initial images to have relative positions corresponding to relative positions of the corresponding points/positions/areas of the reference image.

23. An apparatus according to any of claims 15-22, the generating means are adapted adapt the weighting coefficients at or in the vicinity of boundaries between different points/positions/areas.

24. An apparatus according to claim 23, wherein the generating means comprises means for providing a 2D or 3D graph/curve representing, for each point/position/area in the image, the determined weighting coefficient, the providing means being adapted to smooth the weighting coefficient graph/curve.

25. An apparatus according to any of claims 15-24, wherein the first processing means are adapted to predict, on the basis of one or more of the first initial images and at a first point in time, points/positions/areas at a subsequent point in time.

26. An apparatus according to claim 25, wherein the generating means is adapted to generate, at the subsequent point in time, the image on the basis of corresponding, predicted points/positions/areas as well as first initial images generated or provided at the subsequent point in time.

27. An apparatus according to claim 25, wherein the generating means is adapted to determine the weighting coefficients on the basis of the predicted points/positions/areas.

28. An apparatus according to claim 27, wherein the generating means is adapted to generate the image on the basis of the initial images provided or generated at the first point in time.