Method of imaging an object and mobile imaging device
The invention relates to a method of imaging an object by means of a mobile imaging device, wherein at least a first and a second image of the object are taken for different views. Further, the invention relates to a mobile imaging device for imaging an object, wherein at least a first and a second image of the object are taken for different views.
Contemporary digital cameras provide a variety of improvements such as an improved quality of the optical or image sensor system or other improvements such as auto-focusing and metering systems. However, only a few products are known to provide a user with added-value functions. One such attractive function is the support of panoramic image capturing. When a user intends to capture the outward appearance of a panorama, conventional imaging devices may provide several additional controls or pieces of information, which allow the user to capture and post-process the panoramic view more easily and efficiently. Conventionally, a user has to take multiple images of the panorama from several particular viewpoints instead of just one. Such a panoramic mode of conventional devices also has to be used to capture three-dimensional objects such as a car, a sculpture or a vase. In this case, multiple images of the 3D-object have to be taken from different viewpoints. As this kind of 3D-imaging differs from panoramic imaging, conventional devices are rarely suitable for completing an attractive three-dimensional image of an object. It may be desirable to improve these images so that they are available for use after imaging as the source of a panoramic view, a 3D-object browser or a 3D-model creator to obtain an enhanced panoramic or 3D-experience. To obtain a better post-viewing result, the user usually has to walk around the object to take all the necessary pictures. However, with conventional devices this is still difficult, as no indication is given as to whether one is taking unnecessary pictures or whether sufficient images have been taken to cover all parts of the object to be imaged. For instance, pictures taken from almost the same viewpoint may be unnecessary. Also, taking more than the required number of images results in wasted time and storage capacity.
Conventional devices such as the one known from JP 2001119625 can be used to generate a panoramic image, but are not well suited for taking images of three-dimensional objects to be used as the source of a 3D-object view. Similarly, only devices serving as imaging devices for taking panoramic images are known in the prior art. Such panoramic imaging devices are usually handled as follows: If the user selects a panoramic capturing mode, the white balance and exposure value will be fixed and about 40 % of the previous picture will be left on the viewfinder to allow the user to identify the position of the next picture to be taken. The user then moves and roughly aligns the left partial image with the view currently on the viewfinder. The 40 % overlap usually provides sufficient indications to accomplish this manual registration process. The sequence of captured images is stitched together in a subsequent offline procedure to compose the total panoramic image. For this purpose, offline panoramic software may be used. However, this approach is not very well suited to capturing static three-dimensional objects such as cars, statues and human beings. The reasons for this inadequacy of the panoramic approach are as follows: 1. In the case of panoramic capturing, users usually do not move when taking shots of the surroundings. Instead, in the case of object capturing, they focus on the object they are interested in, walk around it and take pictures of it from different viewpoints. These pictures should cover as many different portions of the object as possible. Unlike in a panoramic mode, the trajectory of the camera will not always be linear. To image a 3D-object the camera will probably have to cover at least a two-dimensional space in order to obtain not only horizontal views but also views of the top and bottom. 2. The panoramic approach available nowadays is a fully manual process.
One has to appropriately position the camera in order to ensure that the object to be imaged is matched with the remaining 40 % from a previous image. However, the necessary approach is much more complicated for object-mode capturing. It is usually unreasonable to ask the user both to identify suitable viewing positions for a subsequent shot and to remember what image information he has already gathered.
In summary, panoramic capturing devices of prior art do not usually provide any control or assist in indicating unnecessary images or insufficient imaging of a panorama or an object. Further, the help functions of an imaging device in a panoramic mode are restricted and focused on a linear camera trajectory and are generally a fully manual process to secure the matching of subsequent images taken.
This is where the invention comes in, the object of which is to provide a method and an apparatus for conveniently imaging all kinds of objects, such as a panorama or a three-dimensional object, and in which the process of capturing an image sequence of such an object is simplified for the user. As regards the method this object is achieved by a method of imaging an object by means of a mobile imaging device, wherein at least a first and a second image of the object are taken for different views of the object, the method comprising the steps of:
- selecting the object to be imaged,
- taking the first image for a first object view and specifying first image data,
- capturing at least one further image for a different object view and specifying further image data,
wherein according to the invention the method further comprises the steps of:
- processing the first and further image data automatically by the device to generate an output,
- indicating on the basis of the output whether the further image is relevant to be taken as the second image.
As regards the apparatus the object is achieved by a mobile imaging device for imaging an object, wherein at least a first and a second image of the object are taken to obtain different views of the object, the mobile imaging device comprising:
- a means for selecting the object for imaging,
- a means for taking the first image from a first view and capturing at least one further image from a different view,
- a sensoring means for specifying first image data and further image data,
- a means for automatically processing first and further image data and generating an output,
- a means for indicating a relevance of the further image on the basis of the output.
A certain position may be common to some object views. Further object views may also result from differing positions or viewpoints, whereby from each position the object is envisaged in a different direction.
"Capturing" an image comprises any kind of catching or reproduction of an image of the object during the process of viewing the object by means of the mobile imaging device. "Capturing" comprises e.g. transmitting an image of the object in a viewfinder of a camera, the image of which may be examined by a user on a display.
"Taking" an image comprises any kind of capturing of an image and additionally any kind of registering, storing or recording of an image. "Taking" comprises e.g. recording of an image on a storage device contained by the imaging device for a later
development, display or processing of the image, or any other use of the image after the process of viewing the object is accomplished.
In the apparatus a display or a viewfinder may be used to select the object. A lens system or any other optical system may be used in combination with an optical sensor to view and capture images. Any kind of sensor such as an imaging sensor and/or position and/or motion indicating sensor may be used to gather image data. Such a sensoring means may also be part of a shutter or a viewfinder, which will also be described in the detailed description.
The proposed invention has arisen from the desire to add an intelligent indication system to a digital camera to provide the user with direct help in taking a sequence of images of an object. The main concept of the proposed invention is to adapt the intelligent indication system in such a way that it is well suited to capturing and taking images of all kinds of objects, in particular by panoramic image sequences and also by three-dimensional image sequences. Whereas in the case of a panoramic image sequence the imaging device is moved along a more or less linear trajectory or at least within a substantially flat two-dimensional surface, in the case of a three-dimensional image sequence the imaging device is moved within three dimensions in an object-centered way to capture object-centered scenes. It was realized that the above-mentioned intelligent indication system could be provided by tracking the motion between a current view and the last image captured by the user. "Motion" in particular is an effect of the changing viewpoints of a user.
Image data are preferably specified automatically. Processor means are suitable for further processing of data and indicating relevance. As proposed, an intelligent indication system is realized in particular by automatic processing of at least the first and further image data by the device to generate an output and by using the output to indicate a relevance of the further image. Image data in particular comprise data of the view or perspective from which the image is taken. The term "view" comprises all kinds of information such as device and object position, or parameters deduced from those positions, such as, for instance, the distance between the device and the object. Further, the direction from which the object is imaged or captured may form part of the data, such as the viewing angle for instance. Parameters of the object itself, such as the size of the object, may also be comprised by the image data. Further data may concern the circumstances of capturing or the imaging environment, such as brightness or luminance values, contrast values or color parameters.
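The image data enumerated above can be illustrated, purely as a non-limiting sketch, by a simple record type. All field names are illustrative assumptions and not part of the claimed subject-matter:

```python
from dataclasses import dataclass

@dataclass
class ImageData:
    """Per-view parameters as enumerated above (names are illustrative)."""
    device_position: tuple   # (x, y, z) position of the imaging device
    view_angle: float        # direction from which the object is captured, in degrees
    object_distance: float   # distance between the device and the object
    object_size: float       # approximate object size, e.g. from the box-shaped frame
    brightness: float        # environment parameter, e.g. a mean luminance value
```

Such a record would be filled in automatically by the sensoring means for each captured view and passed on for processing.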
The term "processing" comprises all kinds of calculating or storing image data as indicated above. In particular processing comprises the processing of absolute object and device parameters for each individual view taken alone and also relative parameters such as parameters indicating a change between different views. In particular, the latter relative parameters are also referred to by the term "motion".
Developed configurations of the invention are further outlined in the dependent claims.
In a further method step the further image may be taken as the second image if the further image is considered relevant, and the further image data may be assigned as the second image data. Most preferably, in yet another step, image data are updated to be used as first image data for a subsequent step. In such a subsequent step still further images may be captured and added to the sequence of images by taking a further image as a second image. By such means the proposed concept may be repeated as often as necessary.
The method advantageously comprises the step of indicating, on the basis of the updated image data, a further different view from which a further image can and should be taken in a subsequent step. Such an indication of a new viewpoint may be given to the user e.g. on a viewfinder display.
A user may choose predetermined imaging requirements from a variety of available imaging modes, e.g. a panoramic mode or a 3D-mode. Depending on such choice of a user it can be indicated whether the further image is relevant to be taken as the second image on the basis of the output and with respect to the predetermined image requirements.
A particular developed configuration of the method may comprise the following four steps:
1. The first image data can be specified at least by means of a frame on a viewfinder. In particular, an object size could be indicated by e.g. a 3D-box. The first image is taken.
2. A further image is captured, either from the same position, e.g. at another viewing angle, or from a different point of view.
3. The first and further image data are subsequently processed according to a change between the first image data of the first image and the further image data of the further image. Such a step accounts for the motion when changing from a first view to a second view, e.g. by changing an angle of view or a viewpoint or both.
4. An image may be taken, preferably in an additional step, if the change exceeds a predetermined threshold. Such a threshold may be set depending on the above-mentioned requirements and the image data. The processing of image data may also comprise the
supply of a coverage rate of a first and a further image. An image can be taken automatically or the user may receive an indication that an image should be taken from a particular viewpoint.
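The four steps above may, for illustration only, be sketched as a capture loop. The `camera` interface and all names are hypothetical and merely model steps 1 to 4 under the simplifying assumption of a scalar motion measure:

```python
def capture_sequence(camera, threshold, max_images=20):
    """Illustrative loop for the four steps above: take a first image, keep
    capturing candidate views, and take a further image whenever the change
    (motion) relative to the last taken image exceeds a threshold.
    `camera` is a hypothetical interface with current_view(), done() and
    motion_between()."""
    taken = [camera.current_view()]              # step 1: take the first image
    while len(taken) < max_images and not camera.done():
        view = camera.current_view()             # step 2: capture a further image
        change = camera.motion_between(taken[-1], view)  # step 3: process image data
        if change > threshold:                   # step 4: take if threshold exceeded
            taken.append(view)                   # further image becomes the new reference
    return taken
```

When the change exceeds the threshold, a real device could equally well notify the user instead of taking the shot automatically, as described above.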
As outlined above the proposed concept allows a better camera design. Such a camera calculates and conveys moving requirements and quantities to the user and therefore indicates all the necessary steps for taking a sequence of images so that the user just has to follow the comments given by the camera to complete the sequence of images necessary to image the whole object, either in a panoramic or a three-dimensional view.
The invention will now be described in detail with reference to the accompanying drawing. The detailed description will illustrate and describe what is considered a preferred embodiment of the invention. It should, of course, be understood that various modifications and changes in form or detail could readily be made without departing from the spirit of the invention. It is therefore intended that the invention not be limited to the exact form and detail shown and described herein, nor to anything less than the whole of the invention disclosed herein and as claimed hereinafter. Further, the features described in the description, the drawing and the claims disclosing the invention may be essential for the invention, considered alone or in combination. The figures of the drawing illustrate in: Figure 1 a flow diagram of processes and interactions among modules of a preferred embodiment of the proposed method.
A preferred embodiment 1 of the proposed method is outlined in the following with regard to Figure 1. A computer program product may also be comprised by the preferred embodiment of the proposed concept, e.g. a camera system or image processing system may comprise such a computer program product.
Such a system includes in particular three kinds of modules: some auxiliary utility instruments 1A, a motion estimator 1B and a coverage calculator 1C. The flow of processing and the interactions among the modules in Figure 1 are indicated by arrows. In step 1A the user 10 specifies the object size with the aid of an auxiliary utility instrument and makes the first shot. The auxiliary utility instrument may be a box-shaped frame 4, which helps the user 10 to specify the approximate object size, and which is utilized to compute the coverage rate of the object 3.
In step 1B the motion estimator starts its operation. The motion estimator retrieves the approximate moving distance and direction between consecutive images. Such motion offsets are supplied to the coverage calculator 1C. The motion estimator computes and accumulates the motion parameters, including directions and quantities, according to the changes of scene and decides whether the present view is relevant or not. If the motion between a current view and the last image captured exceeds a specified threshold, the current view is relevant and the camera will take a shot directly, or the user 10 will be notified that he needs to make a decision.
As a schematic example, box 2 of Figure 1 shows a sequence of images 2.1, 2.2, 2.3, 2.4 and 2.5 taken from respective views and covering the object O.
Afterwards, in step 1C of the preferred embodiment, the coverage calculator 1C is informed by the motion estimator 1B that it needs to update the status of the captured images 2. Subsequently the coverage calculator 1C may also be arranged to display the subsequent direction on a viewfinder 1A or an LCD display to the user. Further useful information, such as the coverage rate or the view-point positions (2.1 - 2.5), could also be shown on the viewfinder 1A for reference.
In addition to the motion estimator 1B and the coverage calculator 1C, a user 10 and a shutter 1E are indicated schematically in Figure 1. Each of these four modules provides specific commands or information to the other modules and updates its own status or performs specific operations while receiving commands or information from the others. Details of the functionality and responsibility are as follows:
1. The user module 10 represents the user himself. Advantageously, according to the preferred embodiment, the only action the user has to take is to decide whether or not to take the scene shown on the viewfinder 1A as an image when he is informed by the camera. This action is indicated by an arrow pointing from the user 10 to the motion estimator 1B.
2. Further, a shutter module 1E is provided. The term "shutter" is used to represent the whole image-capturing mechanism within a digital camera. This includes the lens, the CCD array, the shutter and the electronics controlling the behavior of the camera. The shutter module receives commands from the user 10 and the motion estimator module 1B. When the taking of a shot is requested, the shutter is released and the current scene is taken as an image. A command is then sent to the coverage calculator 1C to update the coverage information. In addition to the techniques known from conventional devices, the shutter of the preferred embodiment also provides some internal communication channels 5 specifically adapted to communicate between the modules 1A, 1B, 1C, 1E and 10, as indicated by the arrows 5 in Figure 1.
3. The major functionality of the coverage calculator 1C is to compute the coverage rate of the object 3 according to the object size specified at the beginning with the aid of 1A and 4 and according to the motion offsets computed by the motion estimator 1B. When the shutter finishes capturing a picture, it sends an update command to the coverage calculator 1C. Subsequently the coverage calculator 1C requests the motion offsets from the motion estimator 1B. After the coverage rate has been updated, the direction which should preferably be followed, i.e. the one which has been determined by the analysis of modules 1B and 1C, is advantageously sent to the shutter and displayed on the viewfinder to advise the user 10.
4. The motion estimator 1B performs a frame-by-frame comparison to retrieve the motion in between. While the camera is moving, this module runs a real-time motion estimation for every frame. For positions between each of the positions of the images 2.1 - 2.5 of the image sequence of object O, the motion of the camera is accumulated, and if the motion exceeds a preset threshold, the critical frame is recorded directly or a message is displayed to notify the user, to indicate that a further image is relevant to be supplied to the whole picture. The motion estimator 1B may also deal with queries from the coverage calculator 1C and provide the requested motion information, i.e. indicating only those relevant frames which have to be kept. This functionality is also referred to as key-frame selection.
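The accumulate-and-threshold behavior of the motion estimator, referred to above as key-frame selection, may be sketched as follows. The one-dimensional camera positions and the function name are simplifying assumptions for illustration only:

```python
def select_key_frames(frame_positions, threshold):
    """Illustrative key-frame selection: accumulate the frame-by-frame motion
    and keep only those frames at which the accumulated motion since the last
    kept frame exceeds the threshold (camera positions reduced to 1-D)."""
    if not frame_positions:
        return []
    key_frames = [0]                  # the first frame is always kept
    accumulated = 0.0
    for i in range(1, len(frame_positions)):
        # motion between consecutive frames, accumulated since the last key frame
        accumulated += abs(frame_positions[i] - frame_positions[i - 1])
        if accumulated > threshold:
            key_frames.append(i)      # this frame is relevant: record it
            accumulated = 0.0
    return key_frames
```

For example, positions `[0, 1, 2, 5, 6, 10]` with a threshold of 3 would keep the frames at indices 0, 3 and 5, discarding the nearly redundant intermediate views.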
Further preferred embodiments of the proposed concept may comprise modules to perform data-collection work followed by off-line processing 6, such as modules for 3D-model generation or object viewing and image stitching. In summary, to capture the outward appearance of an object and represent multiple images of the object at a later stage, the images should be taken from different viewpoints. These images can be used as the source of a 3D-browser or a 3D-model creator to obtain an enhanced 3D-experience. The user has to walk around the object and take all the pictures necessary for a sufficient 3D-view. The difficulty in making use of conventional imaging devices is that a user has to manually identify any unnecessary pictures and has to decide on his own whether he has taken enough pictures. The proposed concept comprises an automatic means of motion estimation to determine the motion distance and direction between images and an automatic means of coverage calculation to specify a rough object size and to calculate whether the object is sufficiently covered to guarantee a sufficient view of the object.
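The coverage calculation summarized above may be illustrated by the following non-limiting sketch, which reduces the object surface to a one-dimensional extent. All names and the geometry are assumptions for illustration only:

```python
def coverage_rate(object_size, captured_offsets, view_width):
    """Illustrative coverage computation: given the approximate object extent
    (from the box-shaped frame) and the motion offsets of the taken images,
    estimate what fraction of the object is covered. Each image is assumed to
    cover a window of `view_width` centred on its offset."""
    covered = set()
    for offset in captured_offsets:
        start = int(offset - view_width / 2)
        covered.update(range(start, start + int(view_width)))
    # count only the covered cells that lie on the object itself
    visible = [p for p in covered if 0 <= p < object_size]
    return len(visible) / object_size
```

A device could compare this rate against a target (e.g. full coverage) to decide whether the user still needs to be directed to further viewpoints.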