WO2011100964A2

WO2011100964A2 - Method for processing multi-channel image recordings in order to detect hidden objects in the optoelectronic inspection of persons

Info

Publication number: WO2011100964A2
Application number: PCT/DE2011/050005
Authority: WO
Inventors: Mario RÖSSLER; Thomas Fiksel; Ulf Krause
Original assignee: Esw Gmbh
Priority date: 2010-02-18
Filing date: 2011-02-18
Publication date: 2011-08-25
Also published as: WO2011100964A3; DE102010008630A1

Abstract

The invention relates to a method for processing multi-channel image recordings in order to detect hidden objects in a scene, wherein image data from at least two spectrally different camera systems having overlapping fields of vision are subjected to an image data combination. The aim of finding a new possibility for processing multi-channel image recordings for the optoelectronic inspection of persons that achieves an improvement in the detection of hidden objects without the so-called full-body-scanner effect of the millimeter wave scan appearing is achieved according to the invention in that images are recorded as two-dimensional image data in image channels having different frequency ranges, a mask image is produced from the image data of at least one image channel, wherein the image data are differentiated into at least two classes of pixels, the image data of each image channel are transformed into a Gaussian and Laplacian pyramid and the image data of the mask image are transformed into a mask image pyramid adequate therefor, and the image data of the Laplacian pyramids of the individual image channels are combined with the mask image pyramid at pyramid levels that correspond to each other, and then the Laplacian pyramids from corresponding image points of the same pyramid levels are merged into the resulting Laplacian pyramid on the basis of a comparison criterion.

Description

Method for processing multi-channel image recordings for the detection of hidden objects in opto-electronic person control

The invention relates to a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, in which image data from at least two camera systems of different spectral sensitivity, which have largely overlapping fields of view, are subjected to an image data combination.

In person control for the detection of hidden objects, in particular in passenger screening for air traffic, millimeter-wave image acquisition systems have proven to be a reliable optoelectronic means of detecting security-relevant objects hidden under the clothing of persons (passengers) without tactile body searches.

For example, US 2005/0116947 A1 discloses a passive millimeter-wave imaging system in which dense millimeter-wave radiation is recorded from a two-dimensional field of view by at least one millimeter-wave frequency scanning system and a plurality of beam formers. In this case, the recorded radiation is amplified and divided into two separate blocks, which are assigned to the horizontal and the vertical direction. By simultaneous detection of the signal strengths within each beam, two-dimensional images of a target object are generated at a frame rate of 30 Hz (standard video frequency), the frequency band being selected between 75.5 GHz and 93.5 GHz to achieve a good balance between clothing penetration, spatial resolution and a compact system design. By measuring only the natural heat emission of living beings (persons) in contrast to natural environmental sources (such as the cold sky), a very good contrast of reflective objects hidden under the clothing can be achieved. For person checks inside closed buildings, however, the contrast obtained with this detection method is too low.

US 2005/0231421 A1 and US 2005/0232459 A1 describe imaging scanning systems in which millimeter-wave radiation and radiation of a different wavelength are each used for image generation. In this case, the image generated from the other wavelength has a higher spatial resolution than the image produced with millimeter-wave radiation, in order to improve the detection of objects hidden under clothing by means of the higher image resolution and to protect the privacy of the monitored persons by means of the lower-resolution millimeter-wave image. From the acquired data of the various recording systems, features are then formed by correlation of the data, these features are classified by logical combination, and the results are displayed. Conclusions can then be drawn about anomalies found (hidden objects) and/or an alarm can be triggered. The disadvantages here are the necessary restriction to "suspicious areas" in order to avoid the objectionable nude scanner effect of the millimeter-wave image, and the relatively low resolution in the identification of a detected hidden object.

Further, US 2006/0006322 describes weighted noise compensation for the contrast enhancement of millimeter-wave imaging techniques, which compensates in particular for random, unpredictable effects of noise in the output signals of a plurality of radiometer channels. In this case, each channel is individually weighted as a function of the reciprocal of the standard deviation of the fluctuations of the output signal and then the intensities of each pixel are linked by addition of the successively weighted signals associated with that pixel for the image composition of the millimeter wave image.
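
Purely by way of illustration, the weighting principle described for US 2006/0006322 can be sketched as follows. The function name `combine_channels`, the sample values and the normalization of the weights to a sum of one are assumptions made for this sketch and are not taken from the cited document.

```python
import statistics

def combine_channels(channel_samples, pixel_signals):
    """Weight each radiometer channel by the reciprocal of the standard
    deviation of its output fluctuations and add the weighted signals
    that belong to one pixel."""
    # One weight per channel: 1 / standard deviation of its fluctuations.
    weights = [1.0 / statistics.stdev(samples) for samples in channel_samples]
    total = sum(weights)
    # Assumption: normalize so the result stays on the input intensity scale.
    return sum(w * s for w, s in zip(weights, pixel_signals)) / total

quiet = [10.0, 10.1, 9.9, 10.0]   # hypothetical low-noise channel
noisy = [10.0, 12.0, 8.0, 10.0]   # hypothetical high-noise channel
pixel = combine_channels([quiet, noisy], [10.0, 14.0])
```

In this sketch the combined pixel value stays close to the reading of the quieter channel, because the noisier channel receives a much smaller weight.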

Furthermore, US 2008/0043102 A1 discloses a monitoring system which combines a millimeter-wave image from a first sensor system with a second supplementary system that supplies further object information. In this case, the supplementary system can be a second sensor system, but also a non-imaging source of object information used to classify objects or their features. However, nothing specific is disclosed about how the possible operations (combination, comparison, or other suitable manipulation of data) are applied in linking the different image or information sources in order to extract and identify suspicious objects associated with an imaged person. What remains is the selection of suspect areas to be treated with higher resolution or image enhancement algorithms, the selection being triggered by the appearance of a "wavy texture" in the vicinity of the suspect object.

Furthermore, US 2009/0041293 A1 describes an imaging system for detecting hidden objects on the basis of several millimeter-wave cameras. In addition, for each millimeter-wave camera, a conventional video camera with the same field of view is arranged whose signals are synchronized with one another in real time in order to display hidden objects on a video monitor from differences of the different millimeter-wave images.

In the three last-mentioned solutions with improved image generation and processing of millimeter-wave signals for the detection of carried objects in person control, no measures for suppressing the nude scanner effect of millimeter-wave imaging are described.

The object of the invention is to find a new way of processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, which improves the detection of hidden objects without the so-called nude scanner effect of millimeter-wave scanning occurring.

According to the invention, in a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, in which image data from at least two camera systems of different spectral sensitivity, which have largely overlapping fields of view, are subjected to an image data combination, the object is achieved by the following steps:
(1) taking images in image channels of different frequency ranges as two-dimensional image data, obtaining an image channel from the range of the millimeter waves and synchronizing the output images of the different channels to each other if the scanning is performed at different refresh rates;
(2) image enhancement by filter operation, in which at least one noise suppression is performed in a low-resolution image channel,
(3) generating a mask image from the image data of at least one image channel, wherein the image data is discriminated into at least two classes of pixels,
(4) transforming the image data of each image channel into a Gaussian and Laplacian pyramid and of the mask image into a mask pyramid suitable for this purpose, wherein quadratic regions with matching fields of view are respectively selected from the read-out image data,
(5) fusing the transformed image data of the Laplace pyramids of the individual image channels with the mask image pyramid in mutually corresponding pyramid levels and subsequently fusing the Laplace pyramids of the individual image channels into a resulting Laplace pyramid, wherein a selection among corresponding pixels within the same pyramid levels of the individual Laplace pyramids is made on the basis of a relative comparison criterion and the resulting pixels are joined together level by level to form the resulting Laplace pyramid, and
(6) back-transforming the resulting Laplace pyramid into a fused result image.

Advantageously, the mask image is generated by means of a fixed threshold value or by means of an adaptable threshold value, the adjustable threshold value being calculated from a histogram distribution of a read-out image. The threshold value can be calculated appropriately by the median value of the histogram.

In a further advantageous variant, the threshold value is determined from a histogram analysis in which a global minimum is sought.

Furthermore, by means of the histogram analysis, a plurality of threshold values can also be determined if a plurality of local minima are present, and thus more than two classes of pixels within the mask image are distinguished.

The mask image can also be generated by means of several threshold values, for which information from at least two image channels is expediently used. In this case, a first mask image is preferably generated from a first image channel and, using information from a further image channel, is subdivided by at least one further class of pixels, the information from the further image channel being used to calculate at least one further threshold value.

Advantageously, a first two-class mask image is generated from the image channel in the millimeter-wave range, and a second multi-class mask image is generated from the first mask image and an image from a longer wavelength image channel. In this case, the threshold values for the multi-class mask image can advantageously be obtained by histogram evaluation of the image data from an IR channel.

The generated mask image with two or more classes is preferably converted into a mask image pyramid for the image fusion by stepwise reduction of the resolution of the mask image, so that a data structure is obtained that matches the Laplace pyramids of the individual image channels produced by the image transformation.

The image fusion is preferably carried out as a fusion of the Laplace pyramids of two image channels, wherein, from corresponding pixels of the individual Laplace pyramid levels weighted with the mask pyramid, those pixels are selected that have the largest absolute value.

In a simplified variant, the image fusion may suitably be based on a fusion of the Laplace pyramids of at least three image channels, wherein the combination with the mask pyramid is replaced by a combination with the Laplace pyramid of an additional image channel, and from corresponding pixels of the individual Laplace pyramid levels those pixels are selected that have the largest absolute value.

Further image enhancement of the image of a low-resolution image channel can advantageously be achieved by using data from at least one higher-resolution image channel with a higher refresh rate, a priori information on movements within that higher-resolution image channel being supplied to the image of the low-resolution image channel.

The invention is based on the basic idea that the data of different image channels recorded in addition to a millimeter-wave channel can likewise be processed so as to improve the detection of hidden objects in such a way that a naked representation of the examined person does not take place. The core of the solution according to the invention is the combination of pixel-based (Gauss-Laplace transformation) and object-based generation of mask images and a real-time fusion of Gauss-Laplace-transformed image data from at least two image channels of different wavelengths and / or different resolution.

With the invention, it is possible to realize a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, which improves the detection and the resolution of hidden objects without the so-called nude scanner effect of the millimeter-wave scan appearing on the control monitor.

The invention will be explained below with reference to exemplary embodiments. The drawings show:

FIG. 1 shows the basic sequence of the method according to the invention,

FIG. 2 shows a sequence of the method with extension to three image channels with THz, IR and VIS cameras,

FIG. 3 shows an embodiment of the invention according to FIG. 2 with additional image enhancement in the low-resolution THz image channel by a priori information from the other image channels,

FIG. 4 shows an embodiment of image fusion from three image channels using a Gauss-Laplace transformation and weighting of the Laplace levels with values from equivalently transformed mask images from two video image channels (IR, VIS),

FIG. 5 shows a modified embodiment of the invention in which the mask image pyramid is replaced by a Laplace pyramid of at least one additionally required image channel.

The method of image generation for detecting hidden objects by combining signals from several separate image channels of different spectral sensitivity consists in its basic algorithm - as shown in block diagram in FIG. 1 - of six essential steps:
1. image recording with at least two spectrally different image recording systems,
2. image preprocessing for image enhancement by at least one noise suppression,
3. Object mask generation from images of at least one image channel,
4. transformation (Gauss-Laplace pyramids) of images from at least one other image channel and adequate transformation of the mask image,
5. Image fusion by fusion of the Laplace pyramids with additional local adaptive weighting
6. Back transformation of the weighted Laplace pyramids into a fused result image.

In an image acquisition unit which has a camera system with at least two spectrally different image channels, cameras of different wavelength or frequency ranges generate two-dimensional image data of one and the same object. The cameras may have different resolutions and frame rates, but must always be aligned with one another. All cameras 11, 12, ..., 1n with different frequency ranges that are to be combined in a result image have the greatest possible common intersection of their fields of view (FOV).

For person control for the detection of hidden objects (in particular weapons and other security-relevant objects), a millimeter-wave receiving system is used as one of the image channels, without restricting the generality of the subsequent method for processing multi-channel image recordings, while at least one further image channel uses image data from the infrared (IR) to visible (VIS) spectral range.

In the example according to FIG. 1, two spectrally different channels are initially assumed. For this purpose, images with the same field of view (FOV) are taken by two cameras 11 and 12 and provided in synchronized form for further processing.

Without limiting generality, it is assumed that, because the intended purpose is person control for the detection of hidden objects, one spectral channel is sensitive in a relatively narrow band within the millimeter-wave range (between 100 GHz and 10 THz, or 3 mm to 0.3 mm) in order to be able to penetrate clothing, and that for the (at least one) further channel at least one relatively wide spectral band from the infrared range (10 THz to 300 THz, or 0.3 mm to 780 nm) and/or the visible frequency range (300 THz to 1 PHz, or 780 nm to 300 nm) is selected.

The method of image formation for the detection of hidden objects has in detail the following sequence according to FIG. 1.

1. The first step of the method consists in the synchronized provision of recorded image data by means of an image acquisition unit 1, which provides the image data from a first camera 11 and a second camera 12 (optionally, further cameras up to a camera 1n may be present) as two-dimensional camera images.
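
Purely by way of illustration, the synchronized provision of frames from channels with different frame rates can be sketched as a nearest-timestamp pairing. The availability of a timestamp per frame, the nearest-neighbor strategy and the function name `pair_frames` are assumptions of this sketch; the frame rates of 10 Hz and 50 Hz are example values.

```python
from bisect import bisect_left

def pair_frames(slow_times, fast_times):
    """For each frame time of the slow channel, return the index of the
    fast-channel frame with the closest timestamp (fast_times sorted)."""
    pairs = []
    for t in slow_times:
        i = bisect_left(fast_times, t)
        # Candidates are the neighbors on either side of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(fast_times)]
        pairs.append(min(candidates, key=lambda j: abs(fast_times[j] - t)))
    return pairs

thz_times = [k / 10 for k in range(3)]    # 10 Hz channel: 0.0, 0.1, 0.2 s
ir_times = [k / 50 for k in range(15)]    # 50 Hz channel: 0.00 ... 0.28 s
pairs = pair_frames(thz_times, ir_times)  # IR frame index per THz frame
```

With these example rates, every fifth frame of the faster channel is paired with one frame of the slower channel.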

2. In a second step, the image data of at least one of the image channels, in particular of the channel with the lower resolution and/or lower frame rate, are processed by subjecting the read-out image to an edge-preserving noise suppression.
Usable methods for this first image improvement can be taken from the prior art in many forms (see, e.g., Bernd Jähne: Digital Image Processing, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, Chapter 11.5, p. 342 ff.).

3. In a third step, a mask image is created for at least one selected preprocessed camera image, which distinguishes two classes, namely background and object pixels, so that a separation of object and image background is possible. With several image channels, the channel is selected with which the separation of background and object (person) can best be realized. The background separation is realized in this example by binarization with the help of a threshold, which is either adjustable or calculated from a histogram distribution.
The prior art discloses various other methods for separating object and image background, which may alternatively be used to generate a so-called object mask in a mask image 31.
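
Purely by way of illustration, the binarization of step 3 can be sketched as follows, using either a fixed threshold or the median of the gray values. The function name `binary_mask` and the synthetic frame are assumptions of this sketch. In the synthetic frame, background and person cover equal areas, so the median falls between the two groups of gray values.

```python
import numpy as np

def binary_mask(image, threshold=None):
    """Split pixels into background (0) and object (1).
    Without a fixed threshold, the median gray value is used,
    i.e. the median of the histogram distribution."""
    if threshold is None:
        threshold = np.median(image)
    return (image > threshold).astype(np.uint8)

# Synthetic 6x6 frame (assumption): left half background, right half person.
background = 20 + np.arange(18).reshape(6, 3)   # gray values 20..37
person = 80 + np.arange(18).reshape(6, 3)       # gray values 80..97
frame = np.hstack([background, person]).astype(float)

mask_median = binary_mask(frame)                  # threshold from the median
mask_fixed = binary_mask(frame, threshold=50.0)   # fixed threshold
```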

4. In the fourth step, the image data of the various video channels are decomposed into a Gaussian and a Laplace pyramid. This transformation into Gaussian and Laplace pyramids is a well-known method for data reduction and data manipulation, and its main features can be found in the monograph (Bernd Jähne: Digital Image Processing, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, Chapter 5.2, p. 149 ff.).
To build a Gauss-Laplace pyramid, a Gaussian pyramid is constructed first. It should be noted that the original image must have a side length of 2^n pixels. For this purpose, either a square area is selected from a rectangular original image or, if necessary, the image is divided into several square image blocks. The original image represents the lowest pyramid level G0. The next Gaussian pyramid level G1 is obtained by low-pass filtering (f_G = F / 2) and halving the number of sample points of G0. This process continues from level to level until the "image" has a size of only a single pixel. The low-pass filtering is realized by a mathematical convolution with a Gaussian bell curve; in practice, the image is convolved with a binomial filter.
After this process, the images are present in the form of a Gaussian pyramid, each level representing a certain frequency component. Each successor of an image level has only 1/4 of the pixels of the previous level.
A Laplace pyramid is then developed from the Gaussian pyramid. A Laplace pyramid level is obtained by forming the difference of two adjacent Gaussian pyramid levels. It should be noted that these two levels must be of the same size, which is achieved by expanding the successor level. The gray value of the newly added pixels is calculated by interpolation between the two existing neighboring pixels. The individual Laplace pyramid levels represent the sharpness components of an image. The Laplace pyramid level L0 contains the highest frequency components, and the subsequent levels contain the remaining lower frequency components.
After the Gauss-Laplace pyramid has been formed and, where applicable, the individual levels have been processed, the image is reconstructed by summing the desired Laplace pyramid levels and the highest Gaussian pyramid level. This image data decomposition is a prerequisite for the following combination step based on a multi-resolution method.
In a parallel step 41 for adapting the mask image data to the size and structure of the Laplace pyramids, mask image pyramids are generated from the mask images generated in step 3. This is done by a stepwise reduction of the resolution of the mask image (halving the side length each time) in order to obtain a data structure that matches the Laplace pyramid.
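
Purely by way of illustration, the decomposition of step 4, the reconstruction and the mask image pyramid of step 41 can be sketched as follows. The 5-tap binomial kernel, the edge replication at the image border, the expansion by pixel repetition followed by smoothing, the subsampling of the mask and all function names are assumptions of this sketch and are not prescribed by the method.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # binomial low-pass

def smooth(img):
    """Separable binomial filtering with edge replication."""
    padded = np.pad(img, 2, mode="edge")
    h, w = img.shape
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def reduce_level(img):
    """Low-pass filter, then keep every second sample in each direction."""
    return smooth(img)[::2, ::2]

def expand_level(img):
    """Double the side length; new pixels are interpolated by smoothing."""
    return smooth(np.repeat(np.repeat(img, 2, axis=0), 2, axis=1))

def crop_square(img):
    """Select a centered square region with side length 2**n."""
    side = 1 << (min(img.shape).bit_length() - 1)
    r0 = (img.shape[0] - side) // 2
    c0 = (img.shape[1] - side) // 2
    return img[r0:r0 + side, c0:c0 + side]

def gaussian_pyramid(img):
    levels = [img.astype(float)]          # G0 is the original image
    while levels[-1].shape[0] > 1:        # continue down to a single pixel
        levels.append(reduce_level(levels[-1]))
    return levels

def laplacian_pyramid(gauss):
    # L_k = G_k - expand(G_{k+1}); the top Gaussian level is kept separately.
    return [g - expand_level(nxt) for g, nxt in zip(gauss[:-1], gauss[1:])]

def reconstruct(laplace, top):
    """Sum the Laplace levels onto the expanded highest Gaussian level."""
    img = top
    for lap in reversed(laplace):
        img = expand_level(img) + lap
    return img

def mask_pyramid(mask, n_levels):
    """Halve the side length of the mask from level to level."""
    levels = [mask.astype(float)]
    for _ in range(n_levels - 1):
        levels.append(levels[-1][::2, ::2])
    return levels

image = np.arange(20 * 16, dtype=float).reshape(20, 16)   # synthetic data
square = crop_square(image)                  # 16 x 16 region
gauss = gaussian_pyramid(square)             # side lengths 16, 8, 4, 2, 1
laplace = laplacian_pyramid(gauss)           # side lengths 16, 8, 4, 2
restored = reconstruct(laplace, gauss[-1])
masks = mask_pyramid(square > np.median(square), len(laplace))
```

Because each Laplace level stores exactly the difference that the expansion loses, summing all levels in `reconstruct` returns the original square region up to rounding.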

5. In the fifth step (see FIG. 4), the image data are fused by combining the calculated Laplace pyramids (multi-resolution method) of the individual video data channels (image channels). The corresponding Laplace pyramids are weighted with the mask images created in step 3. For this weighting, the mask image pyramids formed in step 41 from the mask images, which match the Laplace pyramids, are used.
For the final fusion step of all weighted Laplace pyramids, those pixels are selected from the corresponding pixels of the individual Laplace pyramid levels that have the largest absolute value, since this corresponds to the maximum information. The pixels selected in this way form the resulting (fused) Laplace pyramid.
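
Purely by way of illustration, the level-by-level fusion of step 5 can be sketched for two channels as follows. It is an assumption of this sketch that only the first channel is weighted with the mask pyramid; the function name `fuse_levels` and the small example levels are likewise hypothetical.

```python
import numpy as np

def fuse_levels(laplace_a, laplace_b, mask_levels):
    """Fuse two Laplace pyramids level by level.
    Channel A is weighted with the mask pyramid (assumption); at each
    pixel the coefficient with the larger absolute value is kept."""
    fused = []
    for la, lb, m in zip(laplace_a, laplace_b, mask_levels):
        weighted_a = la * m
        take_a = np.abs(weighted_a) >= np.abs(lb)
        fused.append(np.where(take_a, weighted_a, lb))
    return fused

# Two hypothetical pyramid levels (2x2 and 1x1) per channel.
lap_a = [np.array([[4.0, -1.0], [0.5, 3.0]]), np.array([[-5.0]])]
lap_b = [np.array([[1.0, 2.0], [-2.0, 1.0]]), np.array([[2.0]])]
masks = [np.array([[1.0, 1.0], [0.0, 0.0]]), np.array([[1.0]])]
fused = fuse_levels(lap_a, lap_b, masks)
```

In the lower row of the first level the mask is zero, so the coefficients of channel A are suppressed there and channel B is selected.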

6. In the sixth step, the fusion result present in the form of the fused Laplace pyramid is converted into a result image by the inverse transformation of the Laplace pyramid. If the Gaussian and Laplace pyramid transformation according to Jähne (loc. cit.) has been used, the inverse transformation rule given there can be used.

In the embodiment according to FIG. 2, the two-channel method described above has been extended to three channels.

In this example, additional image data in the infrared (IR) region and in the visible (VIS) spectral region are provided in the first step of image acquisition in addition to the millimeter-wave channel chosen for the application (recorded by the THz camera 11). Without limiting generality, it is assumed in this embodiment that the IR camera 12 operates in the range 7.5 ... 14 µm, while the VIS camera 13 is sensitive in the range 300 ... 780 nm. For the THz camera 11, a frequency band in the range 0.3 THz ... 0.9 THz (0.3 ... 1 mm) is selected.

The second step, image enhancement 2, again takes place in the millimeter-wave channel as in the previous example, but can also be applied to the image data from the IR channel. In this example, noise suppression (edge-preserving noise filtering) is performed by a THz image filter 21, in which the read-out image is preprocessed with a noise-suppressing and edge-preserving algorithm, e.g. a median filter.
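
Purely by way of illustration, an edge-preserving noise suppression with a median filter can be sketched as follows. The 3x3 window, the edge replication at the border and the function name `median_filter3` are assumptions of this sketch.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter with edge replication at the image border."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted views and take the median across them.
    windows = np.stack([padded[r:r + h, c:c + w]
                        for r in range(3) for c in range(3)])
    return np.median(windows, axis=0)

# Synthetic step edge with a single impulse-noise pixel.
noisy = np.zeros((5, 6))
noisy[:, 3:] = 100.0
noisy[2, 1] = 255.0
clean = median_filter3(noisy)
```

In this example the isolated bright pixel is removed while the vertical edge between columns 2 and 3 remains sharp.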

The third step of the mask image generation 3 is modified for the present three channels as follows.

From the image data of the THz camera 11, a first mask image 31 (THz mask), which distinguishes two classes of pixel data, background and person, is created with the aid of a histogram-supported threshold value. The median of the histogram is used to determine the threshold value.

Another way of determining the threshold value is based on a histogram analysis in which a local minimum is sought. If there are several local minima, several thresholds can be used and thus more than two classes can be distinguished. If only one class subdivision is needed, the global minimum from the histogram analysis is used to calculate the single threshold.
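
Purely by way of illustration, the determination of thresholds from minima of the histogram can be sketched as follows. The number of bins, the gray value range, the restriction to interior bins, the function name `histogram_thresholds` and the synthetic pixel values are assumptions of this sketch; a real histogram would typically need smoothing before strict local minima can be evaluated.

```python
import numpy as np

def histogram_thresholds(image, bins=8, value_range=(0, 256), single=False):
    """Return gray values (bin centers) at minima of the histogram.
    single=True returns only the global minimum of the interior bins."""
    counts, edges = np.histogram(image, bins=bins, range=value_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    inner_counts, inner_centers = counts[1:-1], centers[1:-1]
    if single:
        return inner_centers[[np.argmin(inner_counts)]]
    # Interior bins that are strictly lower than both neighbors.
    is_min = (inner_counts < counts[:-2]) & (inner_counts < counts[2:])
    return inner_centers[is_min]

# Synthetic pixel values with three modes and two valleys.
pixels = np.repeat([16, 48, 80, 112, 144, 176, 208, 240],
                   [10, 6, 2, 8, 9, 1, 7, 3])
thresholds = histogram_thresholds(pixels)            # two local minima
classes = np.digitize(pixels, thresholds)            # class labels 0, 1, 2
one_threshold = histogram_thresholds(pixels, single=True)
```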

From the image data of the IR camera 12 and the previously extracted THz mask (first mask image 31), a second mask image 32 is created which discriminates three classes of pixel data, namely background, person and hidden object. Here, the following advantageous properties of the IR camera 12 are exploited:
- the temperature measurement of surfaces,
- the higher radiometric and geometric resolution, and
- the higher refresh rate.

Due to the higher resolution of the IR image, edges of objects or persons can be detected more accurately.

With this additional information, a second mask image for distinguishing between the person and the hidden object can be generated from the first mask image, the THz mask.
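
Purely by way of illustration, the generation of a three-class mask from the THz mask and an IR image can be sketched as follows. It is an assumption of this sketch that both images have already been resampled to the same pixel grid and that a hidden object appears cooler in the IR image than the surrounding body surface; the function name `three_class_mask` and the example values are hypothetical.

```python
import numpy as np

def three_class_mask(thz_mask, ir_image, ir_threshold):
    """0 = background, 1 = person, 2 = hidden object.
    Inside the THz person mask, pixels whose IR value lies below the
    threshold are labeled as hidden object (assumption: cooler surface)."""
    mask = thz_mask.astype(np.uint8).copy()
    mask[(thz_mask == 1) & (ir_image < ir_threshold)] = 2
    return mask

thz_mask = np.zeros((4, 4), dtype=int)
thz_mask[:, 1:3] = 1                 # person occupies the two middle columns

ir_image = np.full((4, 4), 22.0)     # background
ir_image[:, 1:3] = 36.0              # body surface
ir_image[1:3, 1] = 30.0              # cooler patch on the person

mask32 = three_class_mask(thz_mask, ir_image, ir_threshold=33.0)
```

The background pixels also lie below the IR threshold, but they remain in class 0 because the subdivision is applied only inside the first mask image.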

In the fourth step, transformation 4, as already explained in the basic example for two channels, an adequate mask pyramid is calculated by data reduction in a mask image transformation 41 from the image data already used in the mask generation 3, and Gaussian and Laplace pyramids are calculated in each case both from those image data and from the image data provided by the VIS camera 13 as a third image channel. As a result of the nearly overlapping fields of view (FOV) of the cameras 11 to 13, the same square image areas are transformed into the corresponding Laplace pyramids.

The subsequent step, image fusion 5, is likewise extended by one channel compared to the example according to FIG. 1, and its calculation structure, the level-by-level combination of the Laplace pyramids, is illustrated in more detail in FIG. 4.

This illustration shows that, in the simple combination of the individual Laplace pyramid levels, a weighting with the mask data (mask image pyramid) originating from step 3 is carried out within each level, wherein the second mask image 32, which contains the subdivision of the image information from the two channels, THz camera 11 and IR camera 12, into the three pixel classes (background, person and hidden object), is likewise applied as a pyramid transformation, level by level, to the combined Laplace pyramid level.

What is essential in this procedure is that, in each pyramid level of the data reduction, new pixel data are generated for the image data provided per channel by averaging (interpolation) of adjacent pixel data, and these are then fused with the similarly generated mask image pyramid level at the subsequent level. In this case, those pixels are selected from the corresponding pixels of the individual weighted Laplace pyramid levels that have the largest absolute value.

The pixels selected in this way form the resulting (fused) Laplace pyramid level, which then, in the sixth step, yields an object-extracted result image by inverse transformation of the fused Laplace pyramid. This fused result image is distinguished by the fact that, despite the clothing-penetrating THz image data, an image is displayed which does not show the inspected person nude and nevertheless provides a much clearer display of objects hidden beneath the clothing, as well as a better spatial assignment (resolution) of hidden objects on the person.

In the embodiment according to FIG. 3, a further modification of the locally adaptively weighted fusion algorithm compared to the example according to FIG. 2 is described. With the same choice of the fundamental spectral channels (THz, IR, VIS), the frequency bands are assumed here as follows. The THz camera 11 is sensitive in the range of 0.85 ... 0.9 mm (0.34 THz ... 0.35 THz), the IR camera 12 in the range of 7.5 ... 14 µm and the VIS camera 13 in the range of 300 ... 780 nm.

The essential extension of the image data processing over that of FIG. 2 is achieved by using data from image channels with higher spatial resolution (image data from IR camera 12 and VIS camera 13) for image enhancement of low-resolution channels (in this case: the millimeter-wave data from the THz camera 11).

For this purpose, the second process step of image enhancement, before the image data is transformed into Gaussian and Laplace pyramids, is supplemented by the following measures (steps):

2.2 Due to the higher image resolution of the VIS and IR video data (at least 5:1) and the higher frame rate (50 Hz vs. 10-25 Hz of the THz camera 11), a priori information can be derived in order to perform predictive calculations (forecast evaluation) 22, which are used, for example, for the precalculation of person or object movements.

2.3 The a priori information on person or object movements obtained from the prediction calculations 22 is then used for image enhancement of the THz image data, following the edge-preserving noise suppression already applied in the basic version. In this case, the a priori information is processed by means of a Kalman filter. Further improvement of the THz image can be achieved by subpixel interpolation, which incorporates the movement information (direction, speed and rotation).
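
Purely by way of illustration, the processing of movement information with a Kalman filter can be sketched for a single image coordinate as follows. The constant-velocity motion model, the noise parameters, the initial uncertainty, the function name `kalman_track` and the synthetic positions are assumptions of this sketch; the example rates of 50 Hz and 10 Hz lie within the frame rates mentioned above.

```python
import numpy as np

def kalman_track(measurements, dt, meas_var=1.0, accel_var=1e-3):
    """Constant-velocity Kalman filter for one coordinate.
    Returns the filtered state [position, velocity] after the last update."""
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # only position is measured
    Q = accel_var * np.array([[dt**4 / 4, dt**3 / 2],
                              [dt**3 / 2, dt**2]])   # process noise
    R = np.array([[meas_var]])                       # measurement noise
    x = np.array([[measurements[0]], [0.0]])         # start at first position
    P = np.eye(2) * 1e3                              # large initial uncertainty
    for z in measurements[1:]:
        x = F @ x                                    # predict state
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        x = x + K @ (np.array([[z]]) - H @ x)        # update with measurement
        P = (np.eye(2) - K @ H) @ P
    return x.ravel()

dt_fast = 1 / 50                                     # 50 Hz VIS/IR channel
positions = [100.0 + 50.0 * dt_fast * k for k in range(50)]  # 50 px/s motion
pos, vel = kalman_track(positions, dt_fast)
predicted = pos + vel * (1 / 10)   # extrapolate to the next 10 Hz THz frame
```

The extrapolated position could then serve as a priori information on where the person or object is expected in the next frame of the slower channel.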

With these additional measures, better object tracking (movement of the inspected persons) and a more precise delimitation of hidden objects are achieved.

A further modified fusion process is shown in FIG. 5. In this embodiment, it is assumed that at least three image channels must be present in order to accomplish the task of improved image resolution (for hidden objects) and suppression of the nude scanner effect.

In the case of the image channels residing in the THz, IR and VIS spectral ranges, in this case the spectral sensitivity ranges should be set as follows:
THz camera 11: 0.85 mm,
IR camera 12: 7.5 ... 14 μm,
VIS camera 13: 300 ... 780 nm.

This example builds on the embodiment of FIG. 3, but aims at a shortened or simplified image data processing in that the mask image generation (and the production of the matching pyramid) is omitted in favor of a third image channel, which is then mandatory (here: at least one IR channel in addition to the THz and VIS channels) and which in the embodiments of FIGS. 2 and 3 needs to be present only optionally for further (minor) improvement of the result image.

It was found that with each additional high-resolution image channel, i.e. each channel beyond two, a considerable improvement of the image resolution for hidden objects and suppression of the nude scanner effect can be achieved even with fixed (i.e., not locally adaptive) weighting of the Laplace pyramids of the individual image channels in the fusion to the resulting Laplace pyramid.
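
Purely by way of illustration, the fusion of the Laplace pyramids of three image channels with fixed weights and without a mask pyramid can be sketched as follows. The function name `fuse_max_abs`, the equal weights and the small example levels are assumptions of this sketch.

```python
import numpy as np

def fuse_max_abs(pyramids, weights):
    """Fuse any number of Laplace pyramids with fixed per-channel weights.
    At each pixel of each level, the weighted coefficient with the
    largest absolute value is selected."""
    fused = []
    for levels in zip(*pyramids):
        stack = np.stack([w * lvl for w, lvl in zip(weights, levels)])
        idx = np.argmax(np.abs(stack), axis=0)       # winning channel per pixel
        fused.append(np.take_along_axis(stack, idx[None], axis=0)[0])
    return fused

# Two hypothetical pyramid levels (1x3 and 1x1) per channel.
thz = [np.array([[1.0, -6.0, 2.0]]), np.array([[0.5]])]
ir = [np.array([[3.0, 2.0, -1.0]]), np.array([[-2.0]])]
vis = [np.array([[-2.0, 1.0, 0.5]]), np.array([[1.0]])]
fused = fuse_max_abs([thz, ir, vis], weights=(1.0, 1.0, 1.0))
```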

However, the preprocessing of the low-resolution millimeter-wave channel of the THz camera 11 in the second processing step by additional a priori information from the high-resolution image channels of the IR camera 12 and the VIS camera 13 is particularly expedient. A reduction to a pure edge-preserving noise suppression by the THz image filter 21 (e.g. Kalman filter) nevertheless still permits a degraded but usable variant.

All other processing steps, namely image acquisition, image transformation into Gaussian and Laplace pyramids, their level-by-level fusion into the resulting Laplace pyramid and its inverse transformation into the fused result image, run in the same manner as described with reference to FIGS. 1, 3 and 4, except that Laplace pyramids from at least three image channels must be available for the fusion step 5.

With the aid of the above-described variants of the fusion algorithm, based on the fusion of Laplace pyramid data from partially preprocessed (prepared and improved) data of given image channels or on the addition of locally adaptive mask images or additional image channels, hidden objects can be recognized better, since these objects are not covered by image information of the other channels but can be enhanced with it. Overall, the spatial resolution in the area of the detected hidden objects is increased, without the result image reproducing the full imaging of the nude scanner property of the millimeter waves (THz camera 11).

LIST OF REFERENCE NUMBERS

1 image acquisition

11 first camera (THz camera)

12 second camera (IR camera)

13 third camera (VIS camera)

1n nth camera (of a multi-channel system)

2 image enhancement

21 image filter (noise reduction)

22 Prediction calculation (motion prediction)

23 Image enhancement with a priori information (from other channels)

3 mask image generation

31 first mask image (THz mask)

32 second mask image (THz IR mask)

4 Gauss-Laplace transform

41 mask image transformation

5 Fusion of the Laplace pyramids with weighting by the mask image

51 THz Laplace plane

52 IR-Laplace plane

53 mask Laplace plane

54 data reduction operation

55 weighting operation

6 Back transformation of the fused weighted Laplace pyramid

Claims

Method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in the optoelectronic inspection of persons, in which image data from at least two camera systems with different spectral sensitivities and largely overlapping fields of view are subjected to an image data combination, comprising the following steps:

(1) recording images as two-dimensional image data in image channels of different frequency ranges, wherein one image channel is taken from the millimeter-wave range and the output images of the different channels are synchronized with one another if they are acquired at different frame rates,

(2) image enhancement by a filter operation, in which at least noise suppression is performed in a low-resolution image channel,

(3) generating a mask image from the image data of at least one image channel, wherein the image data are differentiated into at least two classes of pixels,

(4) transforming the image data of each image channel into a Gaussian and Laplacian pyramid and the mask image into a mask pyramid suitable for this purpose, wherein square regions with matching fields of view are selected in each case from the image data read out,

(5) fusing the transformed image data of the Laplace pyramids of the individual image channels with the mask image pyramid in mutually corresponding pyramid planes and subsequently merging the Laplace pyramids of the individual image channels into a resulting Laplacian pyramid, wherein corresponding pixels within the same pyramid planes of the individual Laplace pyramids are selected on the basis of a relative comparison criterion and the resulting pixels are assembled plane by plane into the resulting Laplace pyramid, and

(6) back transformation of the resulting Laplace pyramid into a fused result image.
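The synchronization mentioned in step (1) is not specified further. One simple possibility, shown here purely as an assumed strategy, is to pair every frame of the slower channel with the frame of the faster channel that is closest in time. The function name `nearest_frame_indices` and the timestamps are hypothetical.

```python
import numpy as np

def nearest_frame_indices(ref_times, other_times):
    """For each reference timestamp, return the index of the nearest
    timestamp in other_times (sorted ascending, at least two entries)."""
    ref = np.asarray(ref_times, dtype=float)
    other = np.asarray(other_times, dtype=float)
    right = np.clip(np.searchsorted(other, ref), 1, len(other) - 1)
    left = right - 1
    # Choose the closer neighbour; ties go to the earlier frame
    return np.where(ref - other[left] <= other[right] - ref, left, right)

# Hypothetical timestamps in milliseconds: slow THz channel, faster IR channel
thz_times = [0, 100, 200]
ir_times = list(range(0, 250, 30))
pairs = nearest_frame_indices(thz_times, ir_times)
```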
Method according to claim 1, characterized in that

the mask image is generated by means of a fixed, preset threshold value.
Method according to claim 1, characterized in that

the mask image is generated using an adjustable threshold value, the threshold value being calculated from a histogram distribution of a read-out image.
A method according to claim 3, characterized in that

the threshold value for the mask image is calculated as the median value of the histogram.
A method according to claim 3, characterized in that

the threshold is determined from a histogram analysis that seeks a global minimum.
Method according to claim 5, characterized in that

multiple thresholds are determined if there are multiple local minima, and thus more than two classes of pixels within the mask image are distinguished.
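The threshold variants named in the preceding claims (median of the histogram, minimum of the histogram, several minima for more than two classes) can be illustrated as follows. This is a rough sketch under assumptions: the bin count, the moving-average smoothing and the function names `median_threshold`, `valley_thresholds` and `make_mask` are hypothetical, and real histograms would need more robust minimum detection.

```python
import numpy as np

def median_threshold(img):
    """Threshold taken as the median of the intensity distribution."""
    return float(np.median(img))

def valley_thresholds(img, bins=32, smooth=3):
    """Thresholds at local minima of a lightly smoothed histogram.

    One minimum yields a two-class mask, several minima yield
    a mask with more than two classes.
    """
    hist, edges = np.histogram(img, bins=bins)
    h = np.convolve(hist, np.ones(smooth) / smooth, mode="same")
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Strict on the left, non-strict on the right: a flat valley counts once
    minima = [i for i in range(1, bins - 1)
              if h[i] < h[i - 1] and h[i] <= h[i + 1]]
    return [float(centers[i]) for i in minima]

def make_mask(img, thresholds):
    """Class index per pixel: 0 below the first threshold, 1 above it, ..."""
    return np.digitize(img, sorted(thresholds))

# Tiny bimodal example image (hypothetical intensities)
sample = np.array([[0.2, 0.2, 0.8], [0.8, 0.2, 0.8]])
thresholds = valley_thresholds(sample, bins=10, smooth=3)
mask = make_mask(sample, thresholds)
```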
Method according to claim 1, characterized in that

the mask image is generated using a plurality of threshold values, with information from at least two image channels (11, 12) being used for this purpose.
A method according to claim 7, characterized in that

a first mask image (31) is generated from a first image channel (11) and is subdivided into at least one further class of pixels using information from a further image channel (12), wherein the information from the further image channel (12) is used to calculate at least one further threshold value.
A method according to claim 8, characterized in that

a first two-class mask image (31) is generated from the image channel in the millimeter-wave range (11), and a second multi-class mask image (32) is generated from the first mask image (31) and an image from a longer-wavelength image channel (12).
A method according to claim 9, characterized in that

the threshold values for the multi-class mask image (32) are obtained by histogram evaluation in an IR channel (12).
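The two-stage mask generation can be sketched as follows. Which class of the first mask is subdivided, and how the class labels are numbered, is not stated here; the sketch assumes that the foreground class of the THz mask is split by an IR threshold. The function name `two_stage_mask` and both threshold values are hypothetical.

```python
import numpy as np

def two_stage_mask(thz, ir, thz_threshold, ir_threshold):
    """Return the two-class THz mask and a three-class THz-IR mask.

    Assumption: pixels above the THz threshold form the class that is
    subdivided further with a threshold from the IR channel.
    """
    first = thz > thz_threshold                   # first mask image, 2 classes
    second = first.astype(int)                    # classes 0 and 1
    second[first & (ir > ir_threshold)] = 2       # additional class from IR
    return first, second

thz = np.array([[0.1, 0.9], [0.8, 0.2]])
ir = np.array([[0.9, 0.9], [0.1, 0.1]])
first_mask, second_mask = two_stage_mask(thz, ir, 0.5, 0.5)
```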
Method according to claim 1, characterized in that

the generated mask image (31, 32) for the image fusion (5) is converted into a mask image pyramid such that a stepwise reduction of the resolution of the mask image yields a data structure adequate for the Laplace pyramids of the individual image channels (11, 12, 13, ... 1n) generated by the image transformation (4).
A method according to claim 11, characterized in that

the image fusion (5) is carried out on the basis of a fusion of the Laplace pyramids of at least two image channels, wherein, from corresponding pixels of the individual Laplace pyramid planes weighted with the mask pyramid, those pixels are selected which have the largest value in terms of magnitude.
Method according to claim 1, characterized in that

the image fusion (5) is based on a fusion of the Laplace pyramids of at least three image channels, the combination with the mask image pyramid being replaced by a combination with the Laplacian pyramid of an additional image channel, and, from corresponding pixels of the individual Laplacian pyramid planes, those pixels being selected which have the largest value in terms of magnitude.
Method according to claim 1, characterized in that

an image enhancement of the image of a low-resolution image channel (11) is performed using data from at least one higher-resolution image channel (12, 13) with a higher frame rate, in that a-priori information about movements within the high-resolution image channel (12, 13) is applied to the image of the low-resolution image channel (11) at a higher frame rate.