WO2007135659A2 - Clustering-based image registration - Google Patents

Clustering-based image registration

Info

Publication number
WO2007135659A2
Authority
WO
WIPO (PCT)
Application number
PCT/IL2007/000497
Other languages
French (fr)
Other versions
WO2007135659A3 (en)
Inventor
Elya Shechtman
Chen Brestel
Yair Shimoni
Original Assignee
Elbit Systems Electro-Optics Elop Ltd.
Application filed by Elbit Systems Electro-Optics Elop Ltd.
Publication of WO2007135659A2
Publication of WO2007135659A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30181 Earth observation

Definitions

  • Referring to figure 3f, the figure illustrates such an initial matching of feature points into pairs: pairs 361 and 362 of matching feature points from the two images 316 and 318 are established based on a correlation measured inside local windows 363, 364, 365 and 366.
  • Referring to figure 5, the figure is a flow chart illustrating schematically step 135, the initial matching of feature points into pairs.
  • The illustrated process begins with inputting reference image feature points 502 and input image feature points 504, both retrieved from database 418 (see figure 4).
  • The next step, 506, is defining a correlation window size as well as the search window size.
  • Step 508 is searching for the best match in the reference image for each feature point in the input image. The search is conducted within the window defined at step 506.
  • Step 510 is disregarding (pruning) feature point matches where the correlation measured for the best match is too close, in terms of a pre-determined threshold, to the correlation measured for the second-best match.
  • Step 512 handles the refining of the precise location of a pair of feature points at higher resolution scales, and abandons matching pairs where the refinement process fails to confirm the match.
  • The last step, 514, is storing the feature point pairs.
  • Referring back to figure 2, the next step is 140 - clustering feature point pairs.
  • In step 140, matched feature point pairs (from the input and reference images) are gathered into clusters of similar position and translation characteristics.
  • The basic idea is to avoid relying on any single matching pair, and instead look for clusters that represent regions of matching pairs.
  • As noted, feature point detection is performed in step 130 at various image (or filter) resolution scales. While the initial matching of feature points (based on correlations) is done for feature points of the same scale, the clustering process is done for feature point pairs of all scales together.
  • Referring to figure 6, step 602 is inputting the feature point pairs, each pair represented by its location in the reference image (x1, y1) and its location in the input image (x2, y2).
  • Step 604 is computing, for every pair, the translation of the feature point between the images, represented as (x2 - x1, y2 - y1). The computation of the translation is repeated for all matching pairs.
  • The next step is 606. Based on the translation computed for each feature point pair, a clustering algorithm is applied. The clustering algorithm identifies and groups feature point pairs such that the properties of pairs that belong to the same cluster resemble one another in position and translation but differ from those of other clusters (a code sketch of steps 602 to 610 appears after this section).
  • The next step, 608, is analyzing cluster sizes (e.g. the number of pairs). At this step, pairs that were wrongly matched during step 135 are filtered out.
  • This filtering is achieved by differentiating single isolated pairs, or minor clusters, from clusters of acceptable matching pairs, i.e. clusters whose quantity of feature point pairs exceeds a certain threshold. Such filtering is based on the assumption that matching pairs that belong to the same planar area will undergo resembling translations.
  • The next step, 610, is choosing a representative set of acceptable pairs: feature point pairs from the remaining clusters, isotropically spread throughout the images, are chosen as a basis for the registration.
  • Referring to figure 3g, the figure illustrates the clustering process in a sequence of illustrations, where feature points in reference image 316 are marked as white dots and feature points in input image 318 are marked as black dots.
  • Clusters 376 and 378 are formed. The feature point pairs within clusters 376 and 378 share similar translation characteristics and are proximate in location within the images, thereby enforcing regional consistency.
  • Feature point pair 380 is not included within cluster 376, as its translation characteristics differ from those of the other pairs comprising cluster 376; this pair is included in another small cluster. Each of the pairs 382 and 384 likewise belongs to a different small cluster.
  • Next, the additional analysis of step 608 (see figure 6) is performed: small clusters, for example those that contain pairs 380, 382 and 384, are filtered out.
  • The next step is 145 - outlier rejection (for example, using the RANSAC algorithm or a Hough transform), performed in order to ensure global compatibility while filtering out erroneous feature point pair matches (e.g. rejecting points that are far from a parameterized model such as a planar model).
  • Such a parameterized, smooth transformation can be computed using a minimal number m of feature point pairs.
  • a set of m pairs is chosen randomly and a transformation (e.g. affine) is calculated.
  • the compatibility of the transformation with the other pairs is then checked and stored as a score.
  • A pair is considered compatible if the difference (error) between its position and that predicted by the transformation is less than a given threshold Dmax.
  • The random choice is repeated n times and the transformation that has the highest score is chosen (a sketch of this voting procedure appears after this section).
  • The next step, 155, is aligning the two images. Based on the feature point pairs that remained after the outlier rejection step 145, a precise transformation is calculated and applied to align the two images.
  • The image alignment step results in generating a new image from the input image, and can be accomplished by using the transformation or transformations solved (by the registration) between the input and reference images.
  • The generation of a new image uses an interpolation that can be implemented, for example, by triangulation, weighted sum, k-closest points, spline interpolation, etc.
  • the image registration is completed.
  • Optionally, steps 160 and 165 can be implemented: given point coordinates in the input image, one can compute the matching point coordinates in the reference image. Computation of matching coordinates can be performed, for example, by first registering the input and reference images to find a set of matching points in the two images; the matching coordinates in the reference image of the input points of this set are then known precisely. Next, the matching coordinates of additional points are computed by interpolation from those input points (sketched after this section).
  • Method 100 provides for computing a reliable matching between two images while rejecting erroneous pairs of matches.
  • The clustering-based image registration method can be implemented at a few image scales (resolutions), such that a registration is first computed at a coarse scale and its precision is then improved at a finer scale.
  • In method 100, the distortion is treated in a simple way, by demanding proximity and similar translation of neighboring points.
  • The method's local approach is suited to treating, easily and robustly, cases where close regions have very different translations between the images (due to abrupt depth changes); such regions will simply be clustered into different clusters.
  • Method 100 is characterized by the combination of two steps: the clustering step 140, which provides a reliable decision based on the regional position of a matched pair and its translation vector with respect to other pairs in its region, hence enforcing regional consistency, combined with the outlier rejection step 145, which provides global consistency by using an approximate global transformation to reject outliers through voting techniques (e.g. RANSAC).
  • The novel approach of method 100 enables coping with different challenging tasks, such as: geo-referencing - matching an aerial slanted image (10 degrees below the horizon) to an orthophoto image; change detection - matching pairs of images, taken at different times and viewing positions (from ground, air or space), for change detection; sensor fusion - matching images of different sensor types, such as visible and IR; and mosaicing - generating a wide field of view from a sequence of narrow-FOV images, using partial overlapping for matching the images.
  • Any professional would understand that the present invention was described above only by way of example, serving our descriptive needs, and that changes or variants in the method of clustering-based image registration (the subject matter of the present invention) would not be excluded from the framework of the invention.
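As an illustration of steps 602 to 610 above, the following Python sketch clusters the matched pairs by position and translation. The application does not commit to any particular clustering algorithm; the choice of DBSCAN, the shared metric for position and translation, and the numeric thresholds below are assumptions made for this example only.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def cluster_and_filter_pairs(pairs, eps=25.0, min_cluster=8):
        # Step 602: each pair is ((x1, y1) in the reference image,
        # (x2, y2) in the input image).
        ref = np.array([p[0] for p in pairs], dtype=float)
        inp = np.array([p[1] for p in pairs], dtype=float)
        # Step 604: the translation of every pair, (x2 - x1, y2 - y1).
        trans = inp - ref
        # Step 606: group pairs that resemble one another in both position
        # and translation.  A production implementation would weight
        # position against translation; here they share one metric.
        labels = DBSCAN(eps=eps, min_samples=3).fit_predict(
            np.hstack([ref, trans]))
        # Steps 608 and 610: analyze cluster sizes and keep only pairs from
        # clusters large enough to be trusted; isolated pairs and minor
        # clusters are treated as wrong matches and filtered out.
        keep = [i for i, lbl in enumerate(labels)
                if lbl != -1 and np.sum(labels == lbl) >= min_cluster]
        return [pairs[i] for i in keep]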
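The outlier rejection of step 145 is specified above as n random draws of m pairs, an affine fit, and a compatibility count against the threshold Dmax. A minimal sketch of that voting loop, with m = 3 (the minimum for an affine transformation), might read:

    import numpy as np

    def reject_outliers(pairs, n=500, d_max=3.0, seed=0):
        # RANSAC-style voting: fit an affine transform to a random minimal
        # set of m = 3 pairs, score it by how many other pairs it predicts
        # to within Dmax pixels, repeat n times, and keep the pairs that
        # are compatible with the highest-scoring transformation.
        src = np.array([p[0] for p in pairs], dtype=float)   # reference
        dst = np.array([p[1] for p in pairs], dtype=float)   # input image
        src_h = np.hstack([src, np.ones((len(pairs), 1))])   # homogeneous
        rng = np.random.default_rng(seed)
        best = np.array([], dtype=int)
        for _ in range(n):
            idx = rng.choice(len(pairs), size=3, replace=False)
            try:
                # Solve the six affine parameters from the three samples.
                A = np.linalg.solve(src_h[idx], dst[idx])
            except np.linalg.LinAlgError:
                continue   # the three points were collinear; draw again
            err = np.linalg.norm(src_h @ A - dst, axis=1)
            inliers = np.flatnonzero(err < d_max)
            if len(inliers) > len(best):
                best = inliers
        return [pairs[i] for i in best]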
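Finally, steps 160 and 165 amount to interpolating the reference-image coordinates of arbitrary input-image points from the final matched pairs. The sketch below uses triangulation-based linear interpolation, one of the options the text lists; scipy's griddata is an assumption of this example:

    import numpy as np
    from scipy.interpolate import griddata

    def match_coordinates(pairs, query_points):
        # The final matched pairs give exact correspondences; coordinates
        # of additional input-image points are interpolated from them.
        ref = np.array([p[0] for p in pairs], dtype=float)
        inp = np.array([p[1] for p in pairs], dtype=float)
        q = np.asarray(query_points, dtype=float)  # points in input image
        # Interpolate each reference coordinate separately over a Delaunay
        # triangulation of the input-image points.
        x = griddata(inp, ref[:, 0], q, method="linear")
        y = griddata(inp, ref[:, 1], q, method="linear")
        return np.stack([x, y], axis=-1)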

Abstract

The invention relates to an observation system that comprises at least one platform means and a video or image sensor installed on said platform means in order to produce several images of an area of interest under varying conditions and a computer system in order to perform registration between said images and wherein said system is characterized by a clustering-based image registration method implemented in said computer system, which includes steps of inputting images, detecting feature points, initial matching of feature points into pairs, clustering feature point pairs, outlier rejection and defining final correspondence of pairs of points.

Description

CLUSTERING-BASED IMAGE REGISTRATION
FIELD OF THE INVENTION
The present invention relates to systems and methods for registration of a plurality of images of the same scene. The scene's images to be registered can be taken at different times and angles and other types of varied conditions.
BACKGROUND OF THE INVENTION
It is well known in the art that employing techniques of image registration enables us to register (match) a plurality of images that describe the same scene, in order to extract additional information that is not contained in any single image of the scene alone.
An image registration method or technique has many applications in various fields. For example: three-dimensional (3D) scene reconstruction, wherein the images are taken from different viewing positions; object tracking; navigation; detection of changes in the scene that occur over time (wherein the images are taken at different times); providing image mosaics (wherein neighboring images were taken with some overlap); geo- referencing of an image while utilizing another previously geo-referenced image (e.g. - orthophoto) as well as sensor fusion wherein the scene is viewed by sensors having different wavelengths (multi-spectral imaging).
Image registration methods or techniques reveal the geometrical relationships between images of a scene taken at different conditions, such as viewing position, time, type of sensor, etc. The registration is achieved by identifying entities that are common to both images. Hence, when the images are very different, the registration turns out to be rather difficult.
There are various reasons for these differences. Different viewing positions make the scene look distorted in one image with respect to another image. This distortion can be perspective for planar scenes, or more complex and non-linear in the case of 3D scenes (hills, valleys, trees, buildings, etc.), or related to the optics of the cameras (lens, sensor, etc.). While the planar distortions can be modeled by a parametric transformation, the other distortions are often non-parametric and contain partial occlusions and parallax effects. Other differences include reflectance differences due to the different viewing angles of non-Lambertian surfaces in the scene (e.g. specular objects). In addition, perspective distortion influences the 2D structure of objects that are seen in the different images. Moreover, images taken at different times may show different illumination by the sun, due to the different hour or even different season, coupled with differences in the shading and hence in the shadows cast at that time. It is obvious that differences of time, month or year may cause various changes in the scene itself, such as new or changed roads, buildings or flora.
In order to register two images, one should identify entities that are common to both images. Basically, there are two known approaches for accomplishing this, namely, the "direct" or pixel based methods and the "feature based" methods.
For "direct" or pixel based methods see for example the paper "About Direct Methods", M. Irani and P. Anandan, P. H. S. Torr and A. Zisserman, in Vision Algorithms' 99, LNCS 1883, pp. 267-277, Corfu, 2000, edited by B. Triggs, A. Zisserman, R. Seliski.
For "feature based" methods see for example - "Feature Based Methods for Structure and Motion Estimation", P. H. S. Torr and A. Zisserman, B. ibid, (Vision Algorithms' 99, LNCS 1883, pp. 278-294, Corfu, 2000).
The "direct" methods exploit the pixels themselves to compute a pixel-wise dense correspondence between the images, assuming a global transformation. The "feature based" methods use the pixel level to find features in each image, for example lines, corners and other localized features or segments. Then, features from the two images are matched and a transformation is estimated.
The "feature based" methods have the advantages of simplicity, low complexity and applicability for various image types. These methods can treat complex non-planar transformations while achieving reasonable accuracy and also can handle a large variance between the images.
In the "feature based" methods, once corresponding feature point pairs are established between the two images, a global transformation (e.g., planar, global polynomials or thin splines) for the whole image, or a set of simple local transformations for different parts of the image are estimated. Use of local transformations provides the advantage of a more accurate registration; however one has to define, for each transformation, the exact image region it belongs to. Having to determine the appropriate image region is prone to be a non-trivial problem: dividing the image into too small regions would lead to unstable calculations of transformations, whereas using too large regions or overlapping regions would result in erroneous transformations due to a mix of transformations that belong to different regions.
Previous attempts to find an efficient and accurate method for defining image regions to be used later as a basis for a local transformation included a method employing recursive subdividing of the image into smaller regions as long as the transformation computed for a region has too large errors (see for example "A Survey of Image Registration Techniques", Lisa Gottesfeld Brown, ACM Computing Surveys, vol. 24, 1992). Another approach relied on dividing a single image into coherent regions for object recognition by searching for features that are consistent in 2D location, scale and orientation, using a Hough transform (see for example David G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60(1), 2004).
In a paper by J. Zhen, A. Balasuriya and S. Challa, "Vision data fusion for target tracking" (Proceedings of the Sixth International Conference on Information Fusion, pp. 269-275, 2003), an approach for target tracking was introduced. The paper describes the detection of target segments by applying a clustering process to all image pixels. In accordance with the paper, for a pair of consecutive frames of a video, optical flow is first estimated. Then, the 2D displacement of each pixel, together with its color and position in the image, is used in the clustering process. It is noted that the approach described in the paper is limited to consecutive frames, wherein the viewing position, illumination and the scene itself are almost identical due to the short time interval between the frames.
Registration of MR or CT images is described in patent application WO 200447025 by M. J. Quist and P. Roesch: "Method and device for image registration". First, pairs of control points in both images are found. Then a clustering algorithm is applied to the various transformations defined by the pairs. Control points that belong to the same tissue and follow the same transformation are clustered together. Moreover, a pair could be assigned to more than one cluster if its transformation is a superposition of a few transformations. It is noted that the cited patent does not describe the crucial step of finding the control points. This step happens to be non-trivial when the various differences between the images are considered. Moreover, no treatment of erroneous pairs is proposed, although any professional in the art would appreciate that such pairs are certain to be found in the images we are dealing with.
A paper by Alexander C. Berg, Tamara L. Berg and Jitendra Malik, "Shape matching and object recognition using low distortion correspondences" (Proceedings of Computer Vision and Pattern Recognition, 2005), describes a method for performing a local deformable matching between feature points without explicitly segmenting the image into regions. Instead, an optimization problem is solved wherein constraints are defined both for matching a single pair of features and for couples of such pairs. For the couples, the configuration of the two points in one image is compared to the configuration of the matching points in the other image, to give a measure of the distortion; this distortion measure, together with the match quality based on comparing geometric blur descriptors, is used in a minimization problem in order to solve for the correspondences. Any professional in the art will understand that such a method is rather complex and, due to the global optimization concerned, is slow for large images and will tend to prefer smooth transitions and low distortions between close regions. Such a method is less suited to treat cases where close regions have very different translations between the images (due to abrupt depth changes).
It is therefore claimed that before the present invention, existing registration methods lacked a reliable "region forming process" based on the regional position of a matched feature pair and its translation vector with respect to other pairs in its region. Those methods lack the ability to enforce regional consistency in the preparation of image transformation by screening spurious feature point matches, which would result in transformation errors. This ability is especially important in images of complex scenes with large appearance changes, abrupt depth changes, occlusions and parallax effects. None of the above methods tackled images with such complex scenes.
Although it was known that high performance image registration can be achieved by guided reduction of search data and search space, due to the intensive computational requirements of the registration process and the ever-increasing amount of image data to be registered, innovative and more efficient registration algorithms were certainly necessities.
SUMMARY OF THE INVENTION
The object of the present invention is to provide a method for image registration that is accurate, simple to compute and can handle largely differing images and complex non-planar image transformations.
In one aspect of the present invention, the present invention constitutes an observation system that comprises a platform means (at least one), for example: a satellite, an aircraft, an UAV (Unmanned Aerial Vehicle), a vehicle, an observation (reconnaissance) outpost upon which an image or a video sensor is mounted, designed to generate single images or several video images of the area of interest (being observed) under varying conditions (e.g. - different instances of time and/or location).
The system comprises a computer system for effecting the registration of the images either on-line or off-line (e.g. on the ground).
The system is characterized by that its computer system implements a clustering based image registration method.
The method includes the following steps: images inputting; detecting feature points; initial matching of feature points into pairs; clustering feature point pairs; outlier rejection; defining final correspondence of pairs of points, and alignment of the images.
In another aspect of the present invention, the subject matter of the current application, the invention constitutes a method for computerized registration (matching) of images taken under varying conditions (such as time and/or location), that includes the steps of: inputting images; detecting feature points; initial matching of feature points into pairs; clustering feature point pairs; outlier rejection; defining final correspondence of pairs of points, and alignment of the images. In a preferred configuration of the method, it includes in addition, steps of: inputting telemetry or geographic data and applying coarse alignment based on said telemetry or geographic data, executed after the "inputting images" stage.
In yet another preferred configuration of the method, it includes in addition, steps of: inputting coordinates in an input image and computing matching coordinates in a reference image by interpolation or extrapolation, following or replacing the stage of "images alignment". In another aspect of the present invention, the invention is embodied through images that are registered (matched) one to the other, executed by implementing the method that was presented briefly above. Hereinafter - it would be elaborated upon in a more detailed discussion.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
The present invention will be described hereinafter in conjunction with the accompanying figures. Identical components, whether presented in the same figure or appearing in several figures, will carry an identical number.
Figure No. 1 presents an illustration of various examples of systems capable of implementing and employing the clustering-based image registration method that is the subject matter of the present invention. Figure No. 2 is a flow chart illustrating schematically an example of the method for clustering-based image registration that is the subject matter of the present invention. The method can be implemented on a computer system either on-line or off-line.
Figures No. 3a to No. 3g present schematic illustrations of the different steps of an image registration process preformed in accordance with the method described hereinafter in detail with reference to figure No. 2.
Figure No. 4 is a flow chart, illustrating schematically the step of detecting feature points as defined, constituting a part of the method described hereinafter with reference to Fig. No. 2.
Figure No. 5 is a flow chart illustrating schematically the step of matching pairs of feature points between images as defined, constituting a part of the method described hereinafter with reference to figure No. 2.
Figure No. 6 is a flow chart schematically illustrating the clustering feature point matched pairs step as defined, constituting a part of the method described hereinafter with reference to figure No. 2.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
Let's refer to figure No. 1. This figure presents illustrations of various examples of systems capable of implementing and employing the clustering-based image registration method that is the subject matter of the present invention.
As will be clarified later on, the observation / reconnaissance systems that are the subject matter of this invention, namely systems in which it is possible to implement the innovative method, comprise a platform means (at least one) upon which a video or image sensor (for example, a camera) is installed, producing several images of the area of interest (being observed) at different instances of time and location.
The illustration depicts various examples of the image source generation processes applicable to the innovative method. The scene has a non-planar topography with many 3D objects (e.g. buildings) causing different occlusions in the images. The images are captured by different cameras, from very different viewing angles and at different times (different seasons, sun illumination directions, shadows etc.). In accordance with the present invention, a video or image sensor is installed on platform means in order to produce several images of an area of interest under varying conditions, wherein those varying conditions may include changes between the images such as: view point, illumination (direction and intensity), sun illumination (direction and intensity) and seasonal changes (flora, etc.).
For example -
A system 10 includes two mobile platform means, a satellite 12 and an aircraft 14.
Upon each one of those, a video sensor is mounted (that is not illustrated), which produces several video images of the area of interest 16. The video images are processed in a ground station (that is not illustrated). This ground station is equipped with a computer system, in which the method (the subject matter of this application) is implemented.
The video sensor installed in satellite 12 produces a video image of the area of interest 16 under existing conditions, such as an instance (spot) in time, elevation, illuminating angle and the like, constituting variables that differ in their parameters' values from the time instant and other conditions under which the video image of the same area of interest was taken by the video sensor of the aircraft 14. The computer system that implements the method, i.e., the subject matter of the current application, handles the registration between the images that were produced by the different sensors.
Any professional experienced in the art would understand that system 10 might also utilize one or more static platform means (that is, platform means that are not mobile), in addition to the aerial platform means or as an alternative to one (or both) of those cited above.
For example, a platform means constituting an observation tower 18 is suggested, wherein a video sensor (that is not illustrated) is installed at its apex. The sensor produces a video (image) of the area of interest 16. In case airborne platform means are absent, the system might include several observation towers that watch over the same area of interest but from different angles, and at times also at different time instances ("time spots"). In this case, as well, the computer system that implements the method (that is the subject matter of the current application) is responsible for providing the registration (matching) of the images that were produced by the sensors. Another example -
A system 20 that comprises an airborne platform means, an aircraft 22 that carries a reconnaissance pod 24 for photographing and tracking the area of interest 16 from a faraway range while matching photos that are produced incessantly (all the time). The computer system, which might (even) be installed on board in the reconnaissance pod itself, implements the method that is the subject matter of the application and takes care of the registration between the images being produced.
An additional example follows -
A system 30 that comprises a mobile platform means, a remotely operated vehicle 32, equipped with a video sensor (that is not illustrated) for photographing from the ground within the area of interest 16. The video photos of the same area of interest 16, as received from another (different) platform means (for example, satellite 12, aircraft 14, observation tower 18, aircraft 22), aid in navigating vehicle 32 within a sector of the area of interest 16 (and can also be utilized in order to detect changes in the scene, to produce a "mosaic" of images or to combine them through sensor "fusion"). A computer system that implements the method that is the subject matter of the application, and that might be located in a control center far away from the target (that is not illustrated), takes care of the registration between the video images received from vehicle 32 and the video of the sector in which it is moving, as received (as said, from a different platform means and, for example, also at another time).
A yet another additional example -
A system 40 that comprises a mobile platform means - an UAV (Unmanned Aerial Vehicle) 42, equipped with a video sensor (that is not illustrated) for photographing a continuum of images 44, with a certain overlap between the images in the area of interest
16. A computer system that implements the method that is the subject matter of the application (and that might be located in a distant control center that is not illustrated) handles the registration between the video images in order to generate a mosaic having uniform orientation and sense.
Any professional in the art would also understand that systems in which the method might be implemented could include two platform means, one far (at a distance) from the other, that would be photographing, under differing time and location conditions, one area of interest that is in itself also moving. For example, a system for identifying an identikit of a person that has been photographed in another location and at a different time (for example, a suspect in a terrorism act) and is photographed again on a different occasion, enabling him to be identified in due time (in the example just presented, e.g. preventing him from boarding a departing flight).
Reference is being made to Figure No. 2. The figure depicts a flow chart illustrating schematically an example of the method for clustering-based image registration that is the subject matter of the present invention.
The method, generally numbered 100, comprises the steps of: inputting images - 115, (optionally) inputting telemetry or geographic data - 120, (optionally) applying coarse alignment based on telemetry or geographic data - 125, detecting feature points - 130, initial matching of feature points into pairs - 135, clustering feature point pairs - 140, outlier rejection - 145, defining final correspondence of pairs of points - 150, images alignment - 155, (optionally) point coordinates from input image - 160 and (optionally) computing and establishing matching coordinates in reference image - 165.
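Before elaborating on the individual steps, the following hypothetical Python skeleton shows how they could chain together in software. Every helper name below is an assumption of this sketch, not a term taken from the application; several of the helpers are sketched further on in this description.

    # Hypothetical driver for method 100.  Each helper below stands in for
    # one of the numbered steps and is an assumption of this sketch.
    def register_images(input_img, reference_img, telemetry=None):
        if telemetry is not None:                        # steps 120 and 125
            input_img = coarse_align_from_telemetry(input_img,
                                                    reference_img, telemetry)
        ref_feats = detect_feature_points(reference_img)  # step 130
        in_feats = detect_feature_points(input_img)       # step 130
        pairs = match_feature_points(input_img, reference_img,
                                     in_feats)            # step 135
        pairs = cluster_and_filter_pairs(pairs)           # step 140
        pairs = reject_outliers(pairs)                    # steps 145 and 150
        return align_images(input_img, pairs)             # step 155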
Hereinafter we will elaborate on the different steps, while, for the sake of clarity, we will also refer to Figures No. 3a to No. 3g that are schematic illustrations of different steps of image registration processes in accordance with method 100, and at the same time, we will refer also to Figures No. 4 to No. 6, that schematically illustrate specific sub- processes or subroutines occurring within certain steps of the method 100.
Step 115 is the step of inputting two different images of approximately the same scene. If we refer to figure 3a, one can see an illustration of two images 316 and 318. In accordance with method 100, one of the images 316 is a reference image. Image 316 will be utilized as the basis for the image comparison with input image 318. Either one of images 316 or 318 can be obtained from a real time source such as a visual real time sensor (e.g., a camera) or retrieved from a pre-stored location (e.g., a data base). Images
316, 318 could have been taken at different times and/or on different dates, from different capture viewpoints, at different wavelengths and with different input devices. Therefore, at this step it is not feasible to try to reliably align the two images one to the other based only on the information shown in the images.
Step 120 is an optional one. In step 120, telemetry or geographic data is inputted. In certain systems, additional telemetry or geographic data (in the case of an orthophoto) is available that is related to the images. Given one image with, for example, its inertial navigation system (INS) and sensor data, and a second image with its INS and sensor data or geographic data (in the case of an orthophoto), one can calculate and apply a coarse transformation in order to align the two images.
Therefore, optionally, in step 125 a coarse alignment is performed - based on telemetry or geographic data related to the two images.
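One plausible realization of this coarse alignment, assuming each image carries an approximate image-to-ground homography derived from its telemetry (the 3x3 matrices G_in and G_ref below are assumptions of this sketch, not data named in the application), is:

    import numpy as np
    import cv2

    def coarse_align(input_img, reference_img, G_in, G_ref):
        # G_in and G_ref are assumed image-to-ground homographies computed
        # from each image's INS and sensor data (or geographic data, for an
        # orthophoto).  Composing them maps input-image pixels onto
        # reference-image pixels.
        H = np.linalg.inv(G_ref) @ G_in
        h, w = reference_img.shape[:2]
        # Warp the input image into the reference frame.  Telemetry errors
        # can reach hundreds of meters, so this is only a coarse alignment.
        return cv2.warpPerspective(input_img, H, (w, h))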
Referring to figure 3b, the figure illustrates such coarse alignment of the two different images 316, 318. One can see that the coarse alignment results in minor rotation of image 318 in relation to image 316. Although for the sake of clarity, the same number -
318 is used, after the alignment step the image is changed (as it was subjected to transformation).
Any professional in the art will understand that the telemetry or geographic data has a limited precision, with errors that could reach hundreds of meters. Therefore, in order to enable a precise registration the following steps of the method are performed -
In step 130 - the step of detecting feature points - image content is transformed into local feature coordinates which are invariant to scale, translation, rotation and other imaging parameters. Any professional in the art will understand that such feature points can be detected by applying feature detection algorithms (e.g. Harris corners, Difference of Gaussians (DoG), determinant of the local Hessian; an additional list of feature examples can be found in the paper "Scale & affine invariant interest point detectors", Krystian Mikolajczyk and Cordelia Schmid, Int. J. Comput. Vision, Vol. 60(1), pp. 63-86, 2004), at multiple resolution scales (using, for example, a Gaussian pyramid) in each image independently, and stored for further use.
Referring to figure 3c, the figure illustrates such detection of feature points at multiple resolution scales. Feature points 331, 332 and 333 are independently detected in image 316 at different resolution scales (three scales are demonstrated for illustration purposes only). Feature points 334, 335 and 336 are also independently detected in image 318 at multiple resolution scales (here too, three scales are demonstrated solely for illustration purposes).
Referring to figure No. 4, the figure is a schematic illustration of a flow chart depicting step 130 (the step of detecting feature points as described hereinabove with reference to figures No. 2 and No. 3c). Any professional in the art will understand that such feature point detection can be performed at multiple resolution scales (for example, by applying a hierarchy of low-pass filtered versions of the original image).
Step 130 begins with inputting reference image 316 and input image 318. Feature detection algorithms 406 and 412 are independently applied over the reference and input images. At the next steps, 408 and 414, the previously detected feature points, represented by descriptors and their positions in the image, are stored. At the next steps, 410 and 416, the image resolution is reduced. Steps 406, 408 and 410 for reference image 316, and steps 412, 414 and 416 for input image 318, are repeated n times (e.g. n = 5 for a Gaussian pyramid with 5 hierarchy levels).
The result of the process is database 418. Database 418 contains the set of feature points that have been detected in each image, for each resolution scale separately. In addition, database 418 contains the pyramids themselves. The feature points are represented by descriptors and positions within the images. Any professional in the art will understand that, optionally, the feature detection across multiple resolution scales can be performed in parallel, by distributing the computing and image processing tasks to several computers or to several processors in a single computer.
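By way of a non-limiting example, a multi-scale detection loop of the kind depicted in figure 4 could be sketched in Python as follows. The use of OpenCV's Harris-based corner detector and the parameter values are illustrative assumptions; any of the detectors mentioned above (DoG, determinant of the Hessian, etc.) could take its place.

```python
import cv2  # OpenCV, assumed available

def detect_features_pyramid(image, n_levels=5, max_corners=500):
    """Sketch of step 130: detect feature points independently at each level
    of a Gaussian pyramid and store them, with their scale, in a database."""
    database = []
    level_img = image  # a single-channel (grayscale) image is assumed
    for level in range(n_levels):
        corners = cv2.goodFeaturesToTrack(level_img, maxCorners=max_corners,
                                          qualityLevel=0.01, minDistance=5,
                                          useHarrisDetector=True)
        if corners is not None:
            for (x, y) in corners.reshape(-1, 2):
                database.append({'scale': level, 'pos': (float(x), float(y))})
        level_img = cv2.pyrDown(level_img)  # halve the resolution per level
    return database
```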
Referring back to figure 2, the next step is 135. Step 135 is defined as the initial matching of feature points into pairs. At this step, pairs of matching feature points from the two images are established based on a correlation measured inside local windows. This step is implemented while applying well-known techniques (e.g. cross correlation, normalized cross correlation, SIFT, geometric blur).
At this step, the precise location of a feature found at a coarse scale is refined using a finer scale. Feature points are declared non-matching if the fine-tuning of the location determination process results in a location outside a search window (of a pre-determined size). Feature points are also declared non-matching when a feature point in the input image has more than a single possible match with a high correlation score in the reference image.
Optionally, at this step, the feature points are rotated by a few small angles to account for small image rotations.
Let us refer to figure 3d. Figure 3d describes a case of pruning, in other words rejecting a match, because of a mismatch between the feature points of the two images being compared, namely 316 and 318. This match (or lack of match) is examined, as said, while performing stage 135 of method 100. As per the example described in the figure, the match between feature point 342 in image 316 and feature points 344 and 346 in image 318 would be invalidated. This stage, the stage of pruning or rejection resulting from a lack of match, exists here because there are two feasible (possible) matches (344, 346), both of which have nearly the same correlation score (their difference falls below a pre-set threshold). This stage is further described hereinafter, within the framework of stage 510 in figure 5.
Any professional experienced in this field knows that when, for a feature point in one image, there exists more than one matching feature point in the other image (i.e., feature points having nearly similar correlation scores), it is necessary to invalidate the candidate pairs. Otherwise, as the process advances, the matching of the images might be based on feature points that do not represent the same items in the different images, resulting in an error that might lead to a mistake in the matching process.
Let us refer to figure 3e. Figure 3e describes several stages in the process of examining the match between feature point 350 in image 316 and feature point 352 in image 318. This match too is examined, as said, within the framework of stage 135 of method 100.
In accordance with the example described in the figure, the search in image 318 for a feature point that matches feature point 350 of image 316 is executed within a "search window" 354. As explained below, the size of search window 354 is defined in advance. Note that, as described in stage 506 shown in figure 5, the size of the search window is defined as a part of stage 135. The search for the feature point having the highest correlation score is carried out within this search window, within the framework of stage 508 shown in figure 5.
Reverting to the described example, the examination of the match between feature points 350 and 352 is also accomplished at higher resolution levels of images 316 and 318. The coordinates of the feature points found at a higher resolution level are translated to the images at the original resolution (scale 0). Within the framework of this stage, if the feature points detected at the higher resolution (in the illustrated example, scale 2) do not match the feature points detected at the original resolution (scale 0), the match between the pair of feature points is rejected. As explained hereinafter for stage 512 of figure 5, examining the level of match of the feature points at the various resolution levels and comparing it with the original image constitutes an additional means of verifying that a pair of feature points defined as a pair indeed describes the same element in the different images.
Referring to figure 3f, the figure illustrates such an initial matching, namely the initial matching of feature points into pairs. Pairs 361 and 362 of matching feature points from the two images 316 and 318 are established based on a correlation measured inside local windows 363, 364, 365 and 366.
Referring to figure 5, the figure is a flow chart schematically illustrating step 135, the step of initial feature point matching based on a correlation measurement inside local windows, as described hereinabove with reference to figures 2 and 3f. The illustrated process begins with inputting reference image feature points 502 and input image feature points 504, both retrieved from database 418 (see figure 4).
The next step, 506, is defining the correlation window size as well as the search window size. Next is step 508 of searching, for each feature point in the input image, for the best match in the reference image. The search is conducted within the window defined at step 506.
The next step, 510, is disregarding or pruning feature point matches where the correlation measured for the best match is too close, in terms of a pre-determined threshold, to the correlation measured for the second-best match. Next is step 512, which refines the precise location of a pair of feature points using higher resolution scales and abandons matching pairs where the refinement process fails to confirm the match.
The last step - 514, is storing the feature point pairs.
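A minimal Python sketch of steps 506-514 is given below, using normalized cross correlation inside local windows. The window sizes and the score-gap threshold are illustrative assumptions, the search is assumed to be bounded by the coarse alignment of step 125, and the second-best test is a simplified stand-in for stage 510 (a full implementation would suppress the neighborhood of the best peak before taking the second-best score).

```python
import cv2  # OpenCV, assumed available
import numpy as np

def match_features(ref_img, in_img, in_points, win=7, search=32, gap=0.05):
    """Sketch of step 135: for each input-image feature point, search a local
    window of the reference image for the best normalized-cross-correlation
    match, pruning ambiguous matches whose best and second-best scores are
    too close."""
    pairs = []
    for (x, y) in in_points:
        x, y = int(x), int(y)
        patch = in_img[y - win:y + win + 1, x - win:x + win + 1]
        region = ref_img[y - search:y + search + 1, x - search:x + search + 1]
        if (patch.shape != (2 * win + 1, 2 * win + 1) or
                region.shape != (2 * search + 1, 2 * search + 1)):
            continue  # feature too close to an image border
        scores = cv2.matchTemplate(region, patch, cv2.TM_CCOEFF_NORMED)
        flat = np.sort(scores.ravel())
        if flat[-1] - flat[-2] < gap:
            continue  # ambiguous: the second-best score is nearly as good
        dy, dx = np.unravel_index(np.argmax(scores), scores.shape)
        rx = x - search + dx + win  # back to reference-image coordinates
        ry = y - search + dy + win
        pairs.append(((rx, ry), (x, y)))  # (reference point, input point)
    return pairs
```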
Any professional in the art will understand that erroneous pairs could also be the result of using a correlation (measured in local windows) for establishing feature point match pairs, as described in reference to step 135.
Referring back to figure 2, one will understand that such erroneous pairs can be discarded by applying the next step of method 100, namely step 140, clustering feature-point pairs. In clustering step 140, matched feature-point pairs (from the input and reference images) are gathered into clusters of similar position and translation characteristics. The basic idea is to avoid relying on any single matching pair, and instead to look for clusters that represent regions of matching pairs.
Since scene feature points within the images are of various size scales, detection is performed in step 130 at various image (or filter) resolution scales. While the initial matching of feature points (based on correlations) is done for feature points of the same size scale, the clustering process is done for feature point pairs of all scales together.
Therefore, a preliminary conversion of all feature point coordinates to the scale of the original images is performed. Any professional in the art will understand that such a preliminary conversion can be performed at any stage after step 130. Clustering of feature point pairs can be done in at least a 4D space, such as 2D location and 2D translation. The clustering of feature point pairs is done, for example, by K-means clustering, mean shift clustering, agglomerative clustering, hierarchical clustering, spectral clustering and/or multi-scale clustering. Referring to figure 6, which is a flow chart schematically illustrating step 140, step 602 is inputting and representing feature point pairs by location in the reference image (x1, y1) and location in the input image (x2, y2).
The next step is 604. Step 604 is computing, for every pair, the translation of the feature point between the images and representing it by the term (x2-x1, y2-y1). In other words, the computation of the translation is repeated for all matching pairs.
The next step is 606. Based on the translation computed for each feature point pair, a clustering algorithm is applied. The clustering algorithm identifies and groups feature point pairs such that the properties of pairs that belong to the same cluster resemble one another in position and translation but differ from those of other clusters. The next step, 608, is analyzing cluster sizes (e.g. the number of pairs). During step 608, pairs that were wrongly matched during step 135 (using the local correlation measure) are filtered out. This filtering is achieved by differentiating single isolated pairs, or minor clusters, from clusters of acceptable matching pairs. These clusters of acceptable matching pairs include a quantity of feature point pairs that exceeds a certain threshold. Such filtering is based on the assumption that matching pairs that belong to the same planar area will have similar translations.
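A minimal Python sketch of steps 602-608 is given below, assuming the mean shift variant from the list above; each pair is embedded in the 4D space of position and translation described hereinabove. The bandwidth and the minimum cluster size are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import MeanShift  # one of the algorithms listed above

def cluster_pairs(pairs, bandwidth=30.0, min_cluster_size=8):
    """Sketch of step 140: embed each matched pair in the 4D space of
    position (x1, y1) and translation (x2-x1, y2-y1), cluster, and keep
    only clusters with enough members (the filtering of step 608)."""
    feats = np.array([[x1, y1, x2 - x1, y2 - y1]
                      for (x1, y1), (x2, y2) in pairs], dtype=float)
    labels = MeanShift(bandwidth=bandwidth).fit(feats).labels_
    kept = []
    for lbl in np.unique(labels):
        members = [p for p, l in zip(pairs, labels) if l == lbl]
        if len(members) >= min_cluster_size:  # drop isolated pairs / minor clusters
            kept.extend(members)
    return kept
```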
The next step, 610, is choosing a representative set of acceptable pairs. At this step, a representative set of feature point pairs from the remaining clusters, spread isotropically throughout the images, is chosen as a basis for the registration. Referring to figure 3g, the figure illustrates the clustering process in a sequence of illustrations, where feature points in reference image 316 are marked as white dots and feature points in input image 318 are marked as black dots.
All represented feature points have been previously matched into pairs in step 135 (see figure 2), illustrated in the example as pairs 361, 362 in figure 3f. Here, in figure 3g, feature point 370 in reference image 316 is represented by (x1, y1) and feature point 372 in input image 318 is represented by (x2, y2). Feature points 370 and 372 are defined together as feature point pair 362 (see figure 3f). In accordance with step 140 (see figure 2) and step 604 (see figure 6), translation 374 in terms of (x2-x1, y2-y1) is computed and attributed to pair 362.
As mentioned hereinabove, such a translation 374 is computed for every feature point pair as a preliminary step to applying the clustering algorithm. As a result of clustering step 140 (see figure 2), clusters of feature point pairs 376 and 378 are formed. The feature point pairs within clusters 376 and 378 share similar translation characteristics and are proximate in location within the images, thereby enforcing regional consistency.
At this stage of the clustering process, feature point pair 380 is not included within cluster 376, as its translation characteristics differ from those of the other pairs comprising cluster 376. This pair is included in another, small cluster. Each of the pairs 382 and 384 also belongs to a different small cluster.
Following next, additional analysis 608 (see figure 6) is performed, and small clusters, for example those that contain pairs 380, 382 and 384, are filtered out. Referring back to figure 2, the next step, 145, outlier rejection (for example, using the RANSAC algorithm or the Hough transform), is performed in order to ensure global compatibility while filtering out erroneous feature point pair matches (e.g. rejecting points that are far from a parameterized model such as a planar model).
It is assumed that the global transformation between the two images can be approximated by a smooth transformation. This smooth transformation can be computed using a minimal number m of feature point pairs.
In order to choose the acceptable pairs, a set of m pairs is chosen randomly and a transformation (e.g. affine) is calculated. The compatibility of the transformation with the other pairs is then checked and stored as a score. A pair is considered compatible if the difference (error) between its position and the position predicted by the transformation is less than a given threshold Dmax.
The random choice is repeated n times and the transformation that has the highest score is chosen.
It is noted that only pairs that are very far from the transformation (with an error exceeding Dmax) are filtered out, while small errors are allowed, since the smooth transformation is only an approximation of the real non-planar and non-rigid transformation. Based on the chosen transformation, the erroneous pairs are filtered out and disregarded in the registration.
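The following Python sketch illustrates this voting scheme with an affine model, for which m = 3 pairs suffice to solve the six parameters. The number of iterations and the Dmax value are illustrative assumptions.

```python
import numpy as np

def ransac_affine(pairs, n_iter=1000, d_max=10.0, m=3):
    """Sketch of step 145: repeatedly fit an affine transformation to m
    randomly chosen pairs, score it by the number of compatible pairs
    (error below d_max), and keep the inliers of the best model."""
    src = np.array([p[1] for p in pairs], dtype=float)  # input-image points
    dst = np.array([p[0] for p in pairs], dtype=float)  # reference points
    A = np.hstack([src, np.ones((len(pairs), 1))])      # homogeneous coords
    best, best_T = np.array([], dtype=int), None
    rng = np.random.default_rng(0)
    for _ in range(n_iter):
        idx = rng.choice(len(pairs), size=m, replace=False)
        # Solve dst = [src | 1] @ T for the 3x2 affine matrix T.
        T, *_ = np.linalg.lstsq(A[idx], dst[idx], rcond=None)
        err = np.linalg.norm(A @ T - dst, axis=1)
        inliers = np.flatnonzero(err < d_max)           # compatible pairs
        if len(inliers) > len(best):
            best, best_T = inliers, T
    return best_T, [pairs[i] for i in best]
```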
The next step, 155, is aligning the two images. Based on the feature point pairs that remained after outlier rejection step 145, a precise transformation is calculated and applied to align the two images. The image alignment step results in generating a new image from the input image, and can be accomplished by using the transformation or transformations solved (by the registration) between the input and reference images. The generation of the new image uses an interpolation that can be implemented, for example, by triangulation, weighted sum, k-closest points, spline interpolation, etc. At this stage the image registration is completed.
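As a minimal illustration of the alignment itself, and assuming the 3x2 affine matrix returned by the RANSAC sketch above (mapping input coordinates to reference coordinates), the new image could be generated with bilinear interpolation as follows; the triangulation, weighted-sum or spline options mentioned above could be substituted.

```python
import cv2  # OpenCV, assumed available
import numpy as np

def align_images(input_img, T, ref_shape):
    """Sketch of step 155: warp the input image onto the reference frame.
    T is the 3x2 matrix solved above (dst = [x y 1] @ T); OpenCV expects
    its 2x3 transpose."""
    M = np.ascontiguousarray(T.T)  # 2x3 affine matrix in OpenCV layout
    h, w = ref_shape[:2]
    return cv2.warpAffine(input_img, M, (w, h), flags=cv2.INTER_LINEAR)
```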
Optionally and in addition, steps 160 and 165 can be implemented. Given point coordinates in the input image, one can compute the matching point coordinates in the reference image. The computation of matching coordinates can be performed, for example, by first registering the input and reference images to find a set of matching points in the two images. The matching coordinates in the reference image of the input points of this set are then known precisely. Next, the matching coordinates of additional points are computed by interpolation from those input points.
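A possible sketch of steps 160 and 165, using a triangulation-based linear interpolation over the final set of matched pairs, is given below. SciPy's LinearNDInterpolator is an assumption of the example, standing in for any of the interpolation schemes mentioned hereinabove.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

def transfer_coordinates(pairs, query_points):
    """Sketch of steps 160-165: given the final matched pairs, interpolate
    the reference-image coordinates of arbitrary input-image points.
    Queries outside the convex hull of the matches come back as NaN."""
    in_pts = np.array([p[1] for p in pairs], dtype=float)
    ref_pts = np.array([p[0] for p in pairs], dtype=float)
    interp = LinearNDInterpolator(in_pts, ref_pts)  # Delaunay triangulation
    return interp(np.asarray(query_points, dtype=float))
```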
Any professional in the art will appreciate that the unique approach of method 100 provides for computing a reliable matching between two images while rejecting erroneous pairs of matches.
The clustering-based image registration method can be implemented at a few image scales (resolutions), such that a registration is first computed at a coarse scale and its precision is then improved at a finer scale.
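Such a coarse-to-fine scheme could be sketched as follows; register_at_scale is a hypothetical stand-in for steps 135-145 run at a single scale, and the assumption that each pyramid level doubles the resolution of the next coarser one follows the Gaussian pyramid used above.

```python
def coarse_to_fine(ref_pyr, in_pyr, register_at_scale):
    """Sketch of a multi-scale registration: the pyramids are ordered from
    fine (index 0) to coarse; the transform found at each coarse level seeds
    the registration at the next finer level."""
    T = None
    for level in range(len(ref_pyr) - 1, -1, -1):  # coarsest level first
        T = register_at_scale(ref_pyr[level], in_pyr[level], init=T)
        if level > 0 and T is not None:
            T = T.copy()
            T[2, :] *= 2.0  # the translation row doubles at 2x resolution
    return T
```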
In method 100, the distortion is treated in a simple way, by demanding proximity and similar translation of neighboring points. The method's local approach is suited to treating, easily and robustly, cases where close regions have very different translations between the images (due to abrupt depth changes); such regions will simply be gathered into different clusters.
Method 100 is characterized by the combination of two steps: the clustering step 140, which provides a reliable decision based on the regional position of a matched pair and its translation vector with respect to the other pairs in its region, and hence enforces a regional consistency; and the subsequent outlier rejection step 145, which provides a global consistency by using an approximate global transformation to reject outliers, using voting techniques (e.g. RANSAC).
The novel approach of method 100 enables coping with different challenging tasks, such as: geo-referencing, e.g. matching an aerial slanted image (10 degrees below the horizon) to an orthophoto image; change detection, i.e. matching pairs of images taken at different times and viewing positions, whether viewed (taken) from ground, air or space; sensor fusion, i.e. matching images of different sensor types, such as visible and IR; and mosaicing, i.e. generating a wide field of view from a sequence of narrow-FOV images, using their partial overlap for matching the images. Any professional will understand that the present invention was described above only by way of examples serving our descriptive needs, and that changes or variants in the method of clustering-based image registration, the subject matter of the present invention, would not exclude them from the framework of the invention.
In other words, it is feasible to implement the invention as it was described above with reference to the accompanying figures, while also introducing changes and additions that do not depart from the characteristics of the method of clustering-based image registration, which is implementable in the various systems that are claimed hereinafter.

Claims

What is claimed is:
1. An observation system that comprises: at least one platform means; a video or image sensor installed on said platform means in order to produce a plurality of images of an area of interest under varying conditions; a computer system in order to perform registration between said images; wherein said system is characterized in that in said computer system a clustering-based image registration method is implemented, which includes the steps of: inputting images, detecting feature points, initial matching of feature points into pairs, clustering feature point pairs, outlier rejection and defining final correspondence of pairs of points.
2. An observation system in accordance with claim 1, wherein said method of clustering-based image registration further comprises, after performing said step of inputting images, the steps of: inputting telemetry and/or geographic data; and applying coarse alignment, based on said telemetry and/or geographic data.
3. An observation system in accordance with claim 2, wherein said method of clustering-based image registration further comprises, after performing the step of defining final correspondence of pairs of points, the steps of: inputting coordinates in an input image and computing matching coordinates in a reference image.
4. An observation system in accordance with claim 1, wherein said method of clustering-based image registration further comprises, after performing the step of defining final correspondence of pairs of points, the step of: image alignment.
5. An observation system in accordance with claim 1, wherein the system comprises: two platform means whereas both said platform means are mobile and equipped with a video or image sensor; and wherein said system further comprises: a ground station in which said computer system is located.
6. An observation system in accordance with claim 1, wherein the platform means are static means.
7. An observation system in accordance with claim 1, wherein the platform means are mobile.
8. An observation system in accordance with claim 1, wherein said system comprises a plurality of said platform means, whereas at least one of said means is static.
9. An observation system in accordance with claim 1, wherein said system comprises a plurality of said platform means, located at a distance one from the others; and wherein said area of interest is mobile and exposed at any one time to only one of said plurality of platform means.
10. A method for effecting computerized registration between images of an area of interest that were taken under varying conditions of time and/or location, and which comprises the steps of: inputting images, detecting feature points, initial matching of feature points into pairs, clustering feature point pairs, outlier rejection, and defining final correspondence of pairs of points.
11. A method for effecting computerized registration between images of an area of interest in accordance with claim 10, further comprising, following said step of defining final correspondence of pairs of points, the step of: alignment of the images.
12. A method for effecting computerized registration between images of an area of interest in accordance with claim 11, further comprising, following said step of inputting images, the steps of: inputting telemetry or geographic data; and applying coarse alignment based on said telemetry or geographic data.
13. A method for effecting computerized registration between images of an area of interest in accordance with claim 11, further comprising, following said step of alignment of the images, the step of: inputting coordinates from a reference image and computing matching coordinates in an input image.
14. A plurality of images, matched one to the other, of an area of interest that were generated by an observation system in accordance with any of claims 1 to 8.
15. A plurality of images, matched one to the other, of an area of interest that were generated by implementing the method in accordance with any of claims 9 to 13.
16. An observation system in accordance with any of claims 1 to 8, as it has been substantially exemplified hereinabove with reference to the accompanying figures.
17. A method for effecting computerized registration between images of an area of interest in accordance with any of claims No. 9 to No. 13, as it has been substantially exemplified hereinabove with reference to the accompanying figures.
18. A plurality of images, matched one to the other, of an area of interest in accordance with any of claims 14 and 15, as it has been substantially exemplified hereinabove with reference to the accompanying figures.
PCT/IL2007/000497 2006-05-23 2007-04-19 Clustering - based image registration WO2007135659A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IL175877A IL175877A (en) 2006-05-23 2006-05-23 Cluster-based image registration
IL175877 2006-05-23

Publications (2)

Publication Number Publication Date
WO2007135659A2 true WO2007135659A2 (en) 2007-11-29
WO2007135659A3 WO2007135659A3 (en) 2008-07-31

Family

ID=38610758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2007/000497 WO2007135659A2 (en) 2006-05-23 2007-04-19 Clustering - based image registration

Country Status (2)

Country Link
IL (1) IL175877A (en)
WO (1) WO2007135659A2 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647574A (en) * 2018-04-10 2018-10-12 江河瑞通(北京)技术有限公司 Floating material image detection model generating method, recognition methods and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DUFOURNAUD Y ET AL: "Image matching with scale adjustment" COMPUTER VISION AND IMAGE UNDERSTANDING, ACADEMIC PRESS, US, vol. 93, no. 2, 1 February 2004 (2004-02-01), pages 175-194, XP004483597 ISSN: 1077-3142 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016181400A1 (en) * 2015-05-12 2016-11-17 Trendiguru Llc System and method for automated object recognition
CN104992140B (en) * 2015-05-27 2019-07-26 上海海事大学 A kind of sea exception floating object detecting method based on remote sensing images
CN104992140A (en) * 2015-05-27 2015-10-21 上海海事大学 Sea surface abnormal floating object detecting method based on remote sensing image
CN108139757A (en) * 2015-09-11 2018-06-08 深圳市大疆创新科技有限公司 For the system and method for detect and track loose impediment
CN105447459B (en) * 2015-11-18 2019-03-22 上海海事大学 A kind of unmanned plane detects target and tracking automatically
CN105447459A (en) * 2015-11-18 2016-03-30 上海海事大学 Unmanned plane automation detection target and tracking method
CN105513061B (en) * 2015-12-02 2018-06-19 上海海事大学 It is a kind of to carry out the automatic rescue method of maritime peril personnel using unmanned plane
CN105513061A (en) * 2015-12-02 2016-04-20 上海海事大学 Method for automatically searching and rescuing person in distress on sea through employing unmanned plane
GB2559753A (en) * 2017-02-16 2018-08-22 Continental Automotive Gmbh Fusion of images from drone and vehicle
US11563932B2 (en) 2019-02-19 2023-01-24 Edgy Bees Ltd. Estimating real-time delay of a video data stream
US11290708B2 (en) 2019-02-19 2022-03-29 Edgy Bees Ltd. Estimating real-time delay of a video data stream
EP3729402A4 (en) * 2019-03-08 2020-11-25 SZ DJI Technology Co., Ltd. Techniques for sharing mapping data between an unmanned aerial vehicle and a ground vehicle
US11721225B2 (en) 2019-03-08 2023-08-08 SZ DJI Technology Co., Ltd. Techniques for sharing mapping data between an unmanned aerial vehicle and a ground vehicle
US11709073B2 (en) 2019-03-08 2023-07-25 SZ DJI Technology Co., Ltd. Techniques for collaborative map construction between an unmanned aerial vehicle and a ground vehicle
WO2021092797A1 (en) * 2019-11-13 2021-05-20 Oppo广东移动通信有限公司 Image registration method, terminal, and computer storage medium
CN114998551B (en) * 2022-08-03 2022-11-18 江西博微新技术有限公司 Grid reconstruction quality optimization method, system, computer and readable storage medium
CN114998551A (en) * 2022-08-03 2022-09-02 江西博微新技术有限公司 Grid reconstruction quality optimization method, system, computer and readable storage medium

Also Published As

Publication number Publication date
IL175877A0 (en) 2007-08-19
WO2007135659A3 (en) 2008-07-31
IL175877A (en) 2013-07-31

Similar Documents

Publication Publication Date Title
WO2007135659A2 (en) Clustering - based image registration
US7239718B2 (en) Apparatus and method for high-speed marker-free motion capture
Ibrahim et al. Moving objects detection and tracking framework for UAV-based surveillance
US10366501B2 (en) Method and apparatus for performing background image registration
Lee A coarse-to-fine approach for remote-sensing image registration based on a local method
Elibol et al. A new global alignment approach for underwater optical mapping
Taiana et al. Tracking objects with generic calibrated sensors: An algorithm based on color and 3D shape features
Ghannam et al. Cross correlation versus mutual information for image mosaicing
Shi et al. Extrinsic calibration and odometry for camera-LiDAR systems
Misra et al. Feature based remote sensing image registration techniques: A comprehensive and comparative review
Heather et al. Multimodal image registration with applications to image fusion
Bisht et al. Image registration concept and techniques: a review
Gan et al. A photogrammetry-based image registration method for multi-camera systems–with applications in images of a tree crop
Meng et al. A robust registration method for UAV thermal infrared and visible images taken by dual-cameras
Nagaraja et al. Parallax effect free mosaicing of underwater video sequence based on texture features
McCartney et al. Image registration for sequence of visual images captured by UAV
Sheikh et al. Feature-based georegistration of aerial images
Tahoun et al. Satellite image matching and registration: A comparative study using invariant local features
Zhang et al. Guided feature matching for multi-epoch historical image blocks pose estimation
Al-Ruzouq Semi-Automatic Registration of Multi-Source Satellite Imagery with Varying Geometric Resolutions
Li-Chee-Ming et al. Fusion of optical and terrestrial laser scanner data
Johansson et al. Evaluation of monocular visual SLAM methods on UAV imagery to reconstruct 3D terrain
Shahbazi 2. 5D Feature Tracking and 3D Motion Modeling
de Campos Real Time Stereo Cameras System Calibration Tool and Attitude and Pose Computation with Low Cost Cameras
Park et al. Real-time estimation of trajectories and heights of pedestrians

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 07736237; Country of ref document: EP; Kind code of ref document: A2)
NENP: Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 07736237; Country of ref document: EP; Kind code of ref document: A2)