DE102010008630A1

DE102010008630A1 - Method for processing multi-channel image recordings for the detection of hidden objects in opto-electronic person control

Info

Publication number: DE102010008630A1
Application number: DE102010008630A
Authority: DE
Inventors: Mario Rößler; Dr. Thomas Fiksel; Ulf Krause
Original assignee: ESW GmbH
Current assignee: Vincorion Advanced Systems GmbH
Priority date: 2010-02-18
Filing date: 2010-02-18
Publication date: 2011-08-18
Also published as: WO2011100964A3; WO2011100964A2

Abstract

The invention relates to a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in which image data from at least two spectrally different camera systems with overlapping fields of view are subjected to an image data combination. The object is to find a new way of processing multi-channel image recordings for optoelectronic person control that improves the detection of hidden objects without the so-called nude scanner effect of millimeter-wave scanning appearing. According to the invention, this object is achieved in that - images are recorded as two-dimensional image data in image channels with different frequency ranges, - a mask image is generated from the image data of at least one image channel, the image data being divided into at least two classes of pixels, - the image data of each image channel are transformed into a Gaussian and Laplacian pyramid and the mask image into a matching mask image pyramid, and - the image data of the Laplacian pyramids of the individual image channels are combined with the mask image pyramid in mutually corresponding pyramid levels, and the Laplacian pyramids are then merged, from corresponding pixels of the same pyramid levels and on the basis of a comparison criterion, into the resulting Laplacian pyramid.

Description

The invention relates to a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, in which image data from at least two camera systems of different spectral sensitivity, which have largely overlapping fields of view, are subjected to an image data combination.

In person control for detecting hidden objects, in particular in security checks in passenger air traffic, millimeter-wave imaging systems have proven to be a reliable optoelectronic means of detecting security-relevant objects hidden under the clothing of persons (passengers) without manual body searches.

For example, US 2005/0116947 A1 discloses a passive millimeter-wave imaging system in which dense millimeter-wave radiation from a two-dimensional field of view is received by at least one millimeter-wave frequency-scanning system and several beam formers. The received radiation is amplified and split into two separate blocks assigned to the horizontal and vertical directions. By simultaneous detection of the signal strengths within each beam, two-dimensional images of a target object are generated at a frame rate of 30 Hz (standard video frequency); the frequency band of 75.5-93.5 GHz was chosen to obtain a good balance between clothing penetration, spatial resolution and compact system design. By measuring exclusively the natural thermal emission of living beings (persons) against natural ambient sources (such as the cold sky), very good contrast can be achieved for reflective objects hidden under clothing. For person checks inside closed buildings, however, the contrast of this detection method is too low.

US 2005/0231421 A1 and US 2005/0232459 A1 describe imaging scanning systems in which millimeter-wave radiation and radiation of another wavelength are each used for image generation. The image generated from the other wavelength has a higher spatial resolution than the millimeter-wave image, so that the higher image resolution improves the detection of objects hidden under clothing while the lower-resolution millimeter-wave image protects the privacy of the persons monitored. Features are then formed from the acquired data of the different recording systems by correlating the data; these features are classified by logical combination and the results are displayed. Conclusions about detected anomalies (hidden objects) can then be drawn and/or an alarm triggered. Disadvantages are the necessary restriction to "suspicious areas" in order to avoid the frowned-upon nude scanner effect of the millimeter-wave image, and the relatively low resolution when identifying a detected hidden object.

Furthermore, US 2006/0006322 describes weighted noise compensation for contrast enhancement in millimeter-wave imaging, which compensates in particular for random, unpredictable noise effects in the output signals of a plurality of radiometer channels. Each channel is weighted individually as a function of the reciprocal of the standard deviation of the fluctuations of its output signal, and the intensity of each pixel is then formed for composing the millimeter-wave image by adding the successively weighted signals assigned to that pixel. US 2008/0043102 A1 discloses a surveillance system that links a millimeter-wave image from a first sensor system with a second, supplementary system that supplies further object information. The supplementary system can be a second sensor system or a non-imaging source of object information used to classify objects or their features. However, nothing more specific is disclosed about how the possible operations (combination, comparison or suitable manipulation of data) are to be applied to the different image or information sources in order to extract and identify suspicious objects associated with an imaged person. The disclosure stops at selecting suspicious areas to be treated with higher resolution or image enhancement algorithms, the selection being based on the occurrence of a "wavy texture" near the suspicious object. US 2009/0041293 A1 describes an imaging system for detecting hidden objects based on several millimeter-wave cameras. In addition, a conventional video camera with the same field of view is assigned to each millimeter-wave camera; their signals are synchronized and superimposed in real time in order to display, on a video monitor, hidden objects derived from differences between the different millimeter-wave images. None of the three last-mentioned solutions, which offer improved generation and processing of millimeter-wave images for detecting carried objects during person control, describes measures for suppressing the nude scanner effect of millimeter-wave imaging.

The object of the invention is to find a new way of processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, that improves the detection of hidden objects without the so-called nude scanner effect of millimeter-wave scanning appearing.

According to the invention, in a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, in which image data from at least two camera systems of different spectral sensitivity, which have largely overlapping fields of view, are subjected to an image data combination, the object is achieved by the following steps:

(1) recording images in image channels of different frequency ranges as two-dimensional image data, one image channel being obtained from the millimeter-wave range and the output images of the different channels being synchronized with one another if they are sampled at different frame rates,
(2) image enhancement by a filter operation in which at least noise suppression is performed in a low-resolution image channel,
(3) generating a mask image from the image data of at least one image channel, the image data being divided into at least two classes of pixels,
(4) transforming the image data of each image channel into a Gaussian and Laplacian pyramid and the mask image into a matching mask image pyramid, square regions with matching fields of view being selected from the image data read out in each case,
(5) fusing the transformed image data of the Laplacian pyramids of the individual image channels with the mask image pyramid in mutually corresponding pyramid levels and subsequently merging the Laplacian pyramids of the individual image channels into a resulting Laplacian pyramid, the resulting pixels being determined from corresponding pixels within the same pyramid levels of the individual Laplacian pyramids on the basis of a relative comparison criterion and being assembled level by level into the resulting Laplacian pyramid, and
(6) transforming the resulting Laplacian pyramid back into a fused result image.

Advantageously, the mask image is generated using a fixed threshold or an adaptable threshold, the adaptable threshold being calculated from the histogram distribution of an image read out. The threshold can expediently be calculated as the median of the histogram. In a further advantageous variant, the threshold is determined from a histogram analysis in which a global minimum is sought. Histogram analysis can also yield several thresholds if several local minima are present, so that more than two classes of pixels are distinguished within the mask image.
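The threshold variants described here can be illustrated with a minimal sketch. It is not taken from the patent: the function names `median_threshold`, `histogram_minima` and `classify` are hypothetical, images are plain lists of rows, and it is an assumption that pixels brighter than a threshold belong to the higher class.

```python
from bisect import bisect_left
from statistics import median


def median_threshold(image):
    """Adaptive threshold: median gray value over all pixels of a 2-D image."""
    return median(p for row in image for p in row)


def histogram_minima(hist):
    """Gray levels that are strict local minima of a histogram (list of counts)."""
    return [i for i in range(1, len(hist) - 1)
            if hist[i] < hist[i - 1] and hist[i] < hist[i + 1]]


def classify(image, thresholds):
    """Class index per pixel = number of thresholds strictly below its value.

    Assumption: brighter pixels fall into higher classes (0 = background).
    """
    ts = sorted(thresholds)
    return [[bisect_left(ts, p) for p in row] for row in image]


image = [[10, 12, 200],
         [11, 210, 205],
         [9, 10, 220]]

# Two-class mask from a single median threshold.
mask = classify(image, [median_threshold(image)])
```

With one threshold `classify` yields a binary mask; passing the levels returned by `histogram_minima` yields one class per histogram mode.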

The mask image can also be generated using several thresholds, expediently with information from at least two image channels. Preferably, a first mask image is generated from a first image channel, and at least one further class of pixels is separated out using information from a further image channel, that information being used to calculate at least one further threshold. Advantageously, a first two-class mask image is generated from the image channel in the millimeter-wave range, and a second, multi-class mask image is generated from the first mask image and an image from a longer-wavelength image channel. The thresholds for the multi-class mask image can expediently be obtained by histogram evaluation of the image data from an IR channel.
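One possible reading of the two-channel mask generation is sketched below. The text does not fix how the additional classes are assigned, so the rule used here is an assumption: background pixels of the first mask stay in class 0, and object pixels are subdivided by comparing the second channel against its thresholds. The name `refine_mask` and the sample values are hypothetical.

```python
from bisect import bisect_left


def refine_mask(first_mask, ir_image, ir_thresholds):
    """Keep background (0); split object pixels into classes 1, 2, ... by IR value.

    first_mask    : two-class mask (0 = background, 1 = object)
    ir_image      : second-channel image of the same size
    ir_thresholds : thresholds derived from the second channel (assumed given)
    """
    ts = sorted(ir_thresholds)
    return [[0 if m == 0 else 1 + bisect_left(ts, v)
             for m, v in zip(mask_row, ir_row)]
            for mask_row, ir_row in zip(first_mask, ir_image)]


first_mask = [[0, 1, 1],
              [0, 1, 1]]
ir_image = [[50, 40, 90],
            [60, 95, 30]]

multi_mask = refine_mask(first_mask, ir_image, [70])
```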

For image fusion, the two-class or multi-class mask image is preferably converted into a mask image pyramid in such a way that stepwise reduction of the mask image resolution yields a data structure matching the Laplacian pyramids of the individual image channels produced by the image transformation. Image fusion is preferably based on merging the Laplacian pyramids of two image channels: from corresponding pixels of the individual Laplacian pyramid levels weighted with the mask image pyramid, those pixels with the largest absolute value are selected.
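A minimal sketch of the mask pyramid and the level-wise selection follows. Two points are assumptions not stated in the text: the resolution is reduced by averaging 2x2 blocks, and the same mask level multiplies both channels pixel by pixel before the comparison. The names `reduce_mask`, `mask_pyramid` and `fuse_level` are hypothetical.

```python
def reduce_mask(mask):
    """Halve the resolution by averaging 2x2 blocks (assumed reduction rule)."""
    n = len(mask) // 2
    return [[(mask[2 * r][2 * c] + mask[2 * r][2 * c + 1]
              + mask[2 * r + 1][2 * c] + mask[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(n)]
            for r in range(n)]


def mask_pyramid(mask):
    """Mask levels from full resolution down to a single pixel (square 2^n input)."""
    levels = [mask]
    while len(levels[-1]) > 1:
        levels.append(reduce_mask(levels[-1]))
    return levels


def fuse_level(lap_a, lap_b, weights):
    """Per pixel, keep the weighted Laplacian value with the larger magnitude."""
    fused = []
    for row_a, row_b, row_w in zip(lap_a, lap_b, weights):
        fused_row = []
        for a, b, w in zip(row_a, row_b, row_w):
            wa, wb = w * a, w * b
            fused_row.append(wa if abs(wa) >= abs(wb) else wb)
        fused.append(fused_row)
    return fused


mask = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 1, 1, 1]]
levels = mask_pyramid(mask)
```

Under these assumptions a mask weight of 0 suppresses both channels at that pixel, so only regions classified as object contribute detail to the fused pyramid.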

In a reduced, simplified variant, image fusion can expediently be based on merging the Laplacian pyramids of at least three image channels, the combination with the mask image pyramid being replaced by a combination with the Laplacian pyramid of an additional image channel; from corresponding pixels of the individual Laplacian pyramid levels, those pixels with the largest absolute value are selected.
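For this mask-free variant, the selection rule reduces to a per-pixel maximum of the absolute value across all channels. The sketch below shows it for one pyramid level; the name `fuse_channels` is hypothetical, and breaking ties in favor of the first channel is an assumption.

```python
def fuse_channels(levels):
    """Fuse one Laplacian level from several channels.

    levels: list of equally sized 2-D lists, one per image channel.
    Returns, per pixel, the value with the largest magnitude
    (on ties the first channel wins, an assumed convention).
    """
    return [[max(values, key=abs) for values in zip(*rows)]
            for rows in zip(*levels)]


thz = [[1, -5], [2, 0]]
ir = [[-3, 4], [1, 7]]
vis = [[2, 6], [-4, 1]]

fused = fuse_channels([thz, ir, vis])
```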

The image of a low-resolution image channel can advantageously be enhanced further using data from at least one higher-resolution image channel with a higher frame rate, by applying a priori information about movements within that higher-resolution channel to the image of the low-resolution image channel.

The invention is based on the basic consideration that the data of various image channels recorded in addition to a millimeter-wave channel can be processed so as to improve the detection of hidden objects and, at the same time, avoid a nude representation of the person examined. The core of the solution according to the invention is the combination of a pixel-based (Gauss-Laplace transformation) and an object-based generation of mask images, together with real-time fusion of Gauss-Laplace-transformed image data from at least two image channels of different wavelength and/or different resolution.

The invention makes it possible to implement a method for processing multi-channel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, that improves the detection and resolution of hidden objects without the so-called nude scanner effect of millimeter-wave scanning appearing on the control monitor.

The invention is explained in more detail below with reference to exemplary embodiments. The drawings show:

Fig. 1: the basic sequence of the method according to the invention,

Fig. 2: a sequence of the invention extended to three image channels with THz, IR and VIS cameras,

Fig. 3: an embodiment of the invention according to Fig. 3 with additional image enhancement in the low-resolution THz image channel using a priori information from the other image channels,

Fig. 4: an embodiment of the image fusion from three image channels using a Gauss-Laplace transformation and weighting of the Laplacian levels with values from equivalently transformed mask images from two video image channels (IR, VIS);

Fig. 5: a modified embodiment of the invention in which the mask image pyramid is replaced by a Laplacian pyramid of at least one additionally required image channel.

The image generation method for detecting hidden objects by combining signals from several separate image channels of different spectral sensitivity consists in its basic algorithm, shown as a block diagram in Fig. 1, of six essential steps:

1. image acquisition with at least two spectrally different image acquisition systems,
2. image preprocessing for image enhancement by at least noise suppression,
3. object mask generation from images of at least one image channel,
4. transformation (Gauss-Laplace pyramids) of images from at least one other image channel and matching transformation of the mask image,
5. image fusion by merging the Laplacian pyramids with additional locally adaptive weighting,
6. back-transformation of the weighted Laplacian pyramids into a fused result image.
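Steps 4 and 6 of the list above rely on Gaussian and Laplacian pyramids. The following is a minimal sketch under stated assumptions rather than the patented implementation: images are square lists of rows with a side length that is a power of two, the low-pass is a separable 1-2-1 binomial kernel with replicated borders, and expansion uses linear interpolation between neighboring pixels. All function names are hypothetical.

```python
def _smooth(v):
    """1-2-1 binomial low-pass along one line, borders replicated."""
    m = len(v)
    return [(v[max(i - 1, 0)] + 2 * v[i] + v[min(i + 1, m - 1)]) / 4.0
            for i in range(m)]


def _upsample(v):
    """Double the length; each new sample is the mean of its two neighbours."""
    out = []
    for i, x in enumerate(v):
        out += [x, (x + v[min(i + 1, len(v) - 1)]) / 2.0]
    return out


def _separable(img, op):
    """Apply a 1-D operation to all rows, then to all columns."""
    rows = [op(list(r)) for r in img]
    cols = [op(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def reduce_level(img):
    """Low-pass filter, then keep every second sample in both directions."""
    blurred = _separable(img, _smooth)
    return [row[::2] for row in blurred[::2]]


def expand_level(img):
    """Bring a level back to the size of its predecessor."""
    return _separable(img, _upsample)


def gaussian_pyramid(img):
    """Levels G0 (original) down to a single pixel."""
    levels = [img]
    while len(levels[-1]) > 1:
        levels.append(reduce_level(levels[-1]))
    return levels


def laplacian_pyramid(gauss):
    """Each level is G_k minus the expanded G_(k+1); the top Gaussian level is kept."""
    lap = [[[a - b for a, b in zip(row_g, row_e)]
            for row_g, row_e in zip(g, expand_level(g_next))]
           for g, g_next in zip(gauss, gauss[1:])]
    return lap + [gauss[-1]]


def reconstruct(lap):
    """Back-transformation: expand from the top and add each Laplacian level."""
    img = lap[-1]
    for level in reversed(lap[:-1]):
        img = [[a + b for a, b in zip(row_l, row_e)]
               for row_l, row_e in zip(level, expand_level(img))]
    return img


image = [[0, 4, 8, 12],
         [16, 20, 24, 28],
         [32, 36, 40, 44],
         [48, 52, 56, 60]]
gauss = gaussian_pyramid(image)
restored = reconstruct(laplacian_pyramid(gauss))
```

Because every Laplacian level stores exactly what expansion loses, reconstructing an unmodified pyramid returns the input image; fusion changes the Laplacian levels before this back-transformation.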

In an image acquisition unit comprising a camera system with at least two spectrally different image channels, cameras for different wavelength or frequency ranges generate two-dimensional image data of one and the same object. The cameras may have different resolutions and frame rates, but must always be aligned with one another so that all cameras 11, 12, ..., 1n with different frequency ranges that are to be combined in one result image have the largest possible common intersection of their fields of view (FOV). For person control to detect concealed carried objects (in particular weapons and other security-relevant items), a millimeter-wave receiving system is used as one of the image channels, without limiting the generality of the method for processing multi-channel image recordings described below, while at least one further image channel records image data from the infrared (IR) to visible (VIS) spectral range. In the example according to Fig. 1, disregarding the options, two spectrally different channels are assumed at first. For this purpose, two cameras 11 and 12 record images with the same field of view (FOV) and provide them synchronized for further processing. Without loss of generality, and because of the intended person control for detecting hidden objects, it is assumed that one spectral channel is sensitive in a relatively narrow band within the millimeter-wave range (between 100 GHz and 10 THz, or 3 mm to 0.3 mm) so that it can penetrate clothing, and that for the (at least one) further channel at least one relatively broad spectral band is chosen from the infrared range (10 THz to 300 THz, or 0.3 mm to 780 nm) and/or the visible frequency range (300 THz to 1 PHz, or 780 nm to 300 nm).

The image generation method for the detection of hidden objects proceeds in detail as follows, according to Fig. 1.

1. The first step of the method is the synchronized provision of captured image data by an image acquisition unit 1, which supplies the image data from a first camera 11 and a second camera 12 (optionally, further cameras up to a camera 1n may be present) as two-dimensional camera images.
2. In a second step, the image data of at least one of the image channels, in particular the one with lower resolution and/or lower frame rate, are processed by subjecting the read-out image to edge-preserving noise suppression. Methods usable for this first image enhancement can be taken from the prior art in many variants (see, e.g., Bernd Jähne: Digitale Bildverarbeitung, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, chapter 11.5, p. 342 ff.).
3. In a third step, a mask image that distinguishes two classes, namely background and object pixels, is created for at least one selected preprocessed camera image, so that object and image background can be separated. With several image channels, the channel that best separates background and object (person) is chosen. In this example the background separation (binarization) is realized with a threshold that is either adjustable or computed from a histogram distribution. Any other methods for separating object and image background known from the prior art can be used alternatively to produce a so-called object mask in a mask image 31.
4. In the fourth step, the image data of the various video channels are decomposed into a Gaussian and a Laplace pyramid. This transformation is a well-known method of data reduction and manipulation; its fundamentals can be found in the monograph (Bernd Jähne: Digitale Bildverarbeitung, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, chapter 5.2, p. 149 ff.). To build a Gauss-Laplace pyramid, a Gaussian pyramid is constructed first. Note that the original image must have a side length of 2ⁿ pixels. To this end, either a square region is selected in a rectangular original image or, if necessary, the image is divided into several square image blocks. The original image forms the lowest pyramid level G0. The next Gaussian pyramid level G1 is computed from G0 by low-pass filtering (f_g = f/2) and halving the number of sample points. This process is continued from level to level until the "image" has shrunk to a single pixel. The low-pass filtering is realized as a convolution with a Gaussian bell curve; in practice the image is convolved with a binomial filter. After this process the images form a Gaussian pyramid, each level representing a certain frequency content. Each successor level has only 1/4 of the pixels of its predecessor. A Laplace pyramid is then derived from the Gaussian pyramid. A Laplace pyramid level is obtained as the difference of two adjacent Gaussian pyramid levels. These two levels must have the same size, which is achieved by expanding the successor level. The gray value of each newly added pixel is computed by interpolating the two existing neighboring pixels.
The individual Laplace pyramid levels represent the sharpness components of an image. Level L0 contains the highest frequency components, and the remaining levels contain the lower frequency components. After the Gauss-Laplace pyramid has been formed and its individual levels processed where required, the image is reconstructed by summing the desired Laplace pyramid levels and the highest Gaussian pyramid level. This decomposition is the prerequisite for the subsequent combination step, which is based on a multiresolution method. In a parallel step 41, which adapts the mask image data to the size and structure of the Laplace pyramids, mask image pyramids are generated from the mask images produced in step 3. This is done by reducing the resolution of the mask image step by step (halving the side length each time) to obtain a data structure matching the Laplace pyramids.
5. In the fifth step (see Fig. 4), the image data are fused by combining the computed Laplace pyramids (multiresolution method) of the individual video data channels (image channels). The corresponding Laplace pyramids are weighted with the mask images created in step 3. For this weighting, the mask image pyramids formed from the mask images in step 41, which match the Laplace pyramids, are used. For the final fusion step, from the corresponding pixels of the individual levels of all weighted Laplace pyramids, the pixels with the largest absolute value are selected, since this corresponds to the maximum information. The pixels selected in this way form the resulting (fused) Laplace pyramid.
6. In the sixth step, the result image is produced from the direct fusion result, the fused Laplace pyramid, by inverse transformation. If the Gaussian and Laplace pyramid transformation according to Jähne (loc. cit.) was used, the inverse transformation rule given there can be applied.
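
As an illustration only, the decomposition of step 4 and the inverse transformation of step 6 can be sketched in plain Python as follows. The sketch assumes a square image of side length 2ⁿ stored as a list of rows, a separable binomial kernel [1, 4, 6, 4, 1]/16 with clamped borders, and an expansion in which each new sample is the mean of its two neighbors. All function names are hypothetical, and the kernel size and border handling are assumptions that the description does not fix.

```python
def blur1d(row):
    """Binomial low-pass [1, 4, 6, 4, 1] / 16 with clamped borders (assumed kernel)."""
    n = len(row)
    kernel = (1, 4, 6, 4, 1)
    return [sum(w * row[min(max(i + d, 0), n - 1)]
                for d, w in zip(range(-2, 3), kernel)) / 16.0
            for i in range(n)]

def transpose(img):
    return [list(col) for col in zip(*img)]

def apply_rows_cols(img, f):
    """Apply a 1-D operation along all rows, then along all columns."""
    rows = [f(r) for r in img]
    return transpose([f(c) for c in transpose(rows)])

def reduce_level(img):
    """Low-pass filter, then keep every second sample in each direction."""
    blurred = apply_rows_cols(img, blur1d)
    return [r[::2] for r in blurred[::2]]

def expand1d(row):
    """Double the length; each new sample is the mean of its two neighbors
    (the last sample is repeated at the border)."""
    out = []
    for i, v in enumerate(row):
        nxt = row[min(i + 1, len(row) - 1)]
        out += [v, 0.5 * (v + nxt)]
    return out

def expand_level(img):
    return apply_rows_cols(img, expand1d)

def build_pyramids(img):
    """Return (gaussian, laplace). The last Laplace entry is the 1x1 Gaussian top."""
    gauss = [img]
    while len(gauss[-1]) > 1:
        gauss.append(reduce_level(gauss[-1]))
    lap = [[[a - b for a, b in zip(ra, rb)]
            for ra, rb in zip(g, expand_level(g_next))]
           for g, g_next in zip(gauss, gauss[1:])]
    lap.append(gauss[-1])
    return gauss, lap

def reconstruct(lap):
    """Inverse transformation: expand from the top and add each Laplace level."""
    img = lap[-1]
    for level in reversed(lap[:-1]):
        up = expand_level(img)
        img = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(level, up)]
    return img
```

Because each Laplace level stores exactly the difference to the expanded successor level, summing the levels back up returns the original image up to floating-point rounding, whichever low-pass filter is chosen.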

In the embodiment according to Fig. 2, the two-channel method described above is extended to three channels. In this example, in the first step of image acquisition, further image data in the infrared (IR) range and in the visible (VIS) spectral range are provided in addition to the millimeter-wave channel chosen for the application (recorded by the THz camera 11). Without loss of generality, it is assumed in this embodiment that the IR camera 12 operates in the range 7.5 ... 14 μm, while the VIS camera 13 is sensitive in the range 300 ... 780 nm. For the THz camera 11, a frequency band in the range 0.3 THz ... 0.9 THz (0.3 ... 1 mm) is selected.

The second step, image enhancement 2, is again carried out in the millimeter-wave channel as in the previous example, but can also be applied to the image data from the IR channel. In this example, the noise suppression (edge-preserving noise filtering) is performed by a THz image filter 21, in which the read-out image is preprocessed with a noise-suppressing and edge-preserving algorithm, e.g. a median filter.
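
A median filter of the kind named as an example for the image filter 21 might look like the following minimal sketch. The 3x3 window, the clamped border handling and the function name are assumptions chosen for illustration; the description does not specify them.

```python
from statistics import median

def median_filter3(img):
    """3x3 median with clamped borders; img is a list of rows of numbers."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Repeat the nearest border pixel outside the image.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[median(px(y + dy, x + dx)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)]
            for y in range(h)]
```

An isolated outlier pixel is replaced by the median of its neighborhood, while a straight step edge between two flat regions stays in place, which is why this filter counts as edge-preserving.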

The third step, mask image generation 3, is modified for the three channels as follows. From the image data of the THz camera 11, a first mask image 31 (THz mask) that distinguishes two classes of pixel data, background and person, is created with a threshold computed from the histogram. The median of the histogram is used to determine the threshold. Another way of determining the threshold is a histogram analysis that searches for a local minimum. If several local minima exist, several thresholds can be used, so that more than two classes can be distinguished. If only one class division is needed, the global minimum from the histogram analysis is used to compute the single threshold. From the image data of the IR camera 12 and the previously extracted THz mask (first mask image 31), a second mask image 32 is created that discriminates three classes of pixel data, namely background, person and hidden object. This exploits the following advantageous properties of the IR camera 12:

- temperature measurement of surfaces,
- higher radiometric and geometric resolution, and
- higher frame rate.

Owing to the higher resolution of the IR image, edges of objects or persons can be detected more accurately. With this additional information, a second mask image for distinguishing between person and hidden object can be generated from the first mask image, the THz mask.
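
The two histogram-based threshold rules for the first mask image 31 can be sketched as follows. This is a simplified illustration that assumes integer gray values in the range 0 ... 255 and an object that is brighter than the background; the function names are hypothetical, and restricting the minimum search to the occupied gray-value range is an assumption made here so that empty bins at the ends of the histogram are not picked.

```python
def median_threshold(pixels):
    """Threshold at the median gray value, i.e. the 50 % point of the histogram."""
    ordered = sorted(pixels)
    return ordered[len(ordered) // 2]

def valley_threshold(pixels, levels=256):
    """Gray value of the least populated histogram bin inside the occupied range."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    occupied = [g for g, count in enumerate(hist) if count > 0]
    lo, hi = occupied[0], occupied[-1]
    # Global minimum of the histogram between the darkest and brightest gray value.
    return min(range(lo, hi + 1), key=lambda g: hist[g])

def binarize(img, threshold):
    """Mask image: 1 for object pixels (above the threshold), 0 for background."""
    return [[1 if p > threshold else 0 for p in row] for row in img]
```

Several thresholds for more than two classes would follow the same pattern, with one local minimum per class boundary.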

In the fourth step, transformation 4, as already explained in the basic example for two channels, both a matching mask image pyramid, obtained by data reduction in a mask image transformation 41 from the image data already used in mask generation 3, and, in each case, Gaussian and Laplace pyramids for the image data supplied by the VIS camera 13 as the third image channel are computed. Because the fields of view (FOV) of cameras 11 to 13 almost coincide, the same square image regions are transformed into the corresponding Laplace pyramids.
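
The mask image transformation 41 could be sketched as below. The description only states that the side length is halved from level to level; averaging each 2x2 block, which turns a binary mask into fractional weights at coarser levels, is an assumption made for this illustration, as are the function names.

```python
def halve(mask):
    """Average each 2x2 block into one value (assumed reduction rule)."""
    n = len(mask) // 2
    return [[(mask[2 * y][2 * x] + mask[2 * y][2 * x + 1]
              + mask[2 * y + 1][2 * x] + mask[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)]
            for y in range(n)]

def mask_pyramid(mask):
    """List of mask levels, from full resolution down to a single pixel."""
    levels = [mask]
    while len(levels[-1]) > 1:
        levels.append(halve(levels[-1]))
    return levels
```

Each mask level then has the same size as the Laplace pyramid level it is meant to weight.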

The subsequent step, image fusion 5, is likewise extended by one channel compared with the example of Fig. 1; its computational structure, the level-by-level combination of the Laplace pyramids, is shown in more detail in Fig. 4. This figure shows that, in the simple combination of the individual Laplace pyramid levels, a weighting with the mask data from step 3 (mask image pyramid) takes place within each level. The second mask image 32, which contains the division of the image information from the two channels, THz camera 11 and IR camera 12, into the three pixel classes (background, person and hidden object), is likewise applied as a pyramid transform, level by level, to the combined Laplace pyramid level.

What is essential in this procedure is that, in each pyramid level, the image data provided per channel undergo a data reduction in which new pixel data are generated by averaging neighboring pixel data; in the next level these are then fused with the mask image pyramid level generated in the same way. From the corresponding pixels of the individual weighted Laplace pyramid levels, the pixels with the largest absolute value are selected. The pixels selected in this way form the resulting (fused) Laplace pyramid level, which in the sixth step leads to an object-extracted result image by inverse transformation of the fused Laplace pyramid. This fused result image is distinguished by the fact that, despite the clothing-penetrating THz image data, the displayed image contains no nude depiction of the screened person, yet shows objects hidden under the clothing far more clearly and allows a better spatial localization (resolution) of the hidden objects on the person.
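
The level-by-level fusion with mask weighting can be sketched as follows. The description does not spell out how the mask classes translate into per-channel weights, so this sketch assumes that the caller supplies one weight map per channel and level (for example the THz channel weighted by the object mask); all names are hypothetical.

```python
def fuse_level(levels, weights):
    """Fuse one pyramid level.

    levels:  one same-sized Laplace level per channel
    weights: one same-sized mask level per channel (assumed weighting scheme)
    """
    h, w = len(levels[0]), len(levels[0][0])
    fused = []
    for y in range(h):
        row = []
        for x in range(w):
            candidates = [lv[y][x] * wt[y][x] for lv, wt in zip(levels, weights)]
            # Keep the weighted coefficient with the largest absolute value.
            row.append(max(candidates, key=abs))
        fused.append(row)
    return fused

def fuse_pyramids(pyramids, mask_pyramids):
    """pyramids[c][k] is level k of channel c; the result is one fused pyramid."""
    return [fuse_level(lvls, wts)
            for lvls, wts in zip(zip(*pyramids), zip(*mask_pyramids))]
```

The sign of the selected coefficient is kept, so the fused pyramid can be transformed back with the same inverse rule as a single-channel pyramid.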

The embodiment according to Fig. 3 describes a further modification of the locally adaptive weighted fusion algorithm compared with the example of Fig. 2. With the same basic choice of spectral channels (THz, IR, VIS), the frequency bands are assumed here as follows: the THz camera 11 is sensitive in the range 0.85 ... 0.9 mm (0.34 ... 0.35 THz), the IR camera 12 in the range 7.5 ... 14 μm and the VIS camera 13 in the range 300 ... 780 nm.

The main extension of the image data processing compared with Fig. 2 is the use of data from image channels with higher spatial resolution (image data from IR camera 12 and VIS camera 13) to enhance low-resolution channels (in this case the millimeter-wave data from the THz camera 11). To this end, the second method step, image enhancement, is supplemented by the following measures (steps) before the image data are transformed into Gaussian and Laplace pyramids:

2.2 Owing to the higher image resolution of the VIS and IR video data (at least 5:1) and their higher frame rate (50 Hz versus 10–25 Hz for the THz camera 11), a priori information can be derived in order to carry out prediction calculations (forecast evaluation) 22, which are used, e.g., to predict movements of persons or objects.
2.3 The a priori information on person or object movements obtained from the prediction calculations 22 is then used to enhance the THz image data after the edge-preserving noise reduction that is applied in the basic version anyway. The a priori information is processed by means of a Kalman filter. The THz image can be improved further by subpixel interpolation, which takes the motion information (direction, speed and rotation) into account.

These additional measures achieve better object tracking (movement of the persons being screened) and more precise delineation of hidden objects.
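
How a Kalman filter might turn fast IR/VIS position measurements into a motion prediction for the slower THz frames is sketched below. This is a minimal one-dimensional constant-velocity model and not the filter design of the description, which does not specify the state vector; the class name, the noise parameters `q` and `r`, and the simplified diagonal process noise are all assumptions.

```python
class ConstantVelocityKalman:
    """1-D Kalman filter with state [position, velocity] (illustrative model)."""

    def __init__(self, pos=0.0, vel=0.0, var=1e3, q=1e-3, r=1.0):
        self.x = [pos, vel]
        self.P = [[var, 0.0], [0.0, var]]   # state covariance
        self.q, self.r = q, r               # process / measurement noise (assumed)

    def predict(self, dt):
        """Advance the state by dt seconds and return the predicted position."""
        p, v = self.x
        (a, b), (c, d) = self.P
        self.x = [p + v * dt, v]
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = diag(q, q)
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        """Correct the state with a position measurement z."""
        (a, b), (c, d) = self.P
        s = a + self.r                      # innovation variance
        k0, k1 = a / s, c / s               # Kalman gain
        y = z - self.x[0]                   # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
```

In use, `predict(0.02)` and `update(z)` would be called for every 50 Hz measurement, and a further `predict` with the remaining time offset would give the expected position at the next THz frame.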

A further modified fusion sequence is shown in Fig. 5. This embodiment presupposes that at least three image channels are present in order to achieve improved image resolution (for hidden objects) and suppression of the nude scanner effect. For the image channels, again located in the THz, IR and VIS spectral ranges, the spectral sensitivity ranges are set as follows in this case:
THz camera 11: 0.85 mm,
IR camera 12: 7.5 ... 14 μm,
VIS camera 13: 300 ... 780 nm.

This example builds on the variant of Fig. 3 but aims at shortened and unified image data processing: mask image generation (and the associated pyramid generation) is dispensed with, which makes a third image channel mandatory (here: at least one IR channel in addition to the THz and VIS channels). In the embodiments according to Figs. 2 and 3, that channel was merely optional and served a further (minor) improvement of the result image. It was found that with every additional high-resolution image channel, i.e. every channel beyond two, a considerable improvement in image resolution for hidden objects and in suppression of the nude scanner effect is achievable, even with fixed (i.e. not locally adaptive) weighting of the Laplace pyramids of the individual image channels during fusion into the resulting Laplace pyramid.
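
A fusion with fixed rather than locally adaptive weighting reduces to one scalar weight per channel. The following sketch shows this for a single pyramid level; the function name and the weight values in the usage note are purely illustrative assumptions.

```python
def fuse_level_fixed(levels, channel_weights):
    """Fuse one pyramid level using one fixed scalar weight per channel."""
    h, w = len(levels[0]), len(levels[0][0])
    return [[max((wt * lv[y][x] for lv, wt in zip(levels, channel_weights)),
                 key=abs)
             for x in range(w)]
            for y in range(h)]
```

For example, `fuse_level_fixed([thz, ir, vis], (1.0, 1.0, 0.5))` would halve the influence of the VIS channel on every level before the coefficient with the largest absolute value is selected.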

It is particularly expedient, however, to preprocess the low-resolution millimeter-wave channel of the THz camera 11 in the second processing step using additional a priori information from the high-resolution image channels of the IR camera 12 and the VIS camera 13. Reducing this to pure edge-preserving noise suppression with the THz image filter 21 (e.g. a Kalman filter) still yields a degraded but usable variant.

All other processing steps, namely image acquisition, transformation of the images into Gaussian and Laplace pyramids, their level-by-level fusion into the resulting Laplace pyramid, and its inverse transformation into the fused result image, proceed in the same manner as described for Figs. 1, 3 and 4, except that Laplace pyramids from at least three image channels must be available for the fusion step 5.

With the variants of the fusion algorithm described above, which are based on merging Laplace pyramid data from partly preprocessed (conditioned and enhanced) data of given image channels, or on adding locally adaptive mask images or additional image channels, hidden objects can be recognized better, because these objects are not masked by image information from the other channels but enhanced by it. Overall, the spatial resolution in the region of the detected hidden objects is increased without reproducing the full nude-scanner imaging of the millimeter waves (THz camera 11) in the result image.

LIST OF REFERENCE NUMERALS

1: image acquisition
11: first camera (THz camera)
12: second camera (IR camera)
13: third camera (VIS camera)
1n: n-th camera (of a multichannel system)
2: image enhancement
21: image filter (noise reduction)
22: prediction calculation (motion prediction)
23: image enhancement with a priori information (from other channels)
3: mask image generation
31: first mask image (THz mask)
32: second mask image (THz-IR mask)
4: Gauss-Laplace transformation
41: mask image transformation
5: fusion of the Laplace pyramids with weighting by mask image
51: THz Laplace level
52: IR Laplace level
53: mask Laplace level
54: data reduction operation
55: weighting operation
6: inverse transformation of the fused weighted Laplace pyramid

Citations contained in the description

This list of the documents cited by the applicant was generated automatically and is included solely for the better information of the reader. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Cited patent literature

US 2005/0116947 A1 [0003]
US 2005/0231421 A1 [0004]
US 2005/0232459 A1 [0004]
US 2006/0006322 [0005]
US 2008/0043102 A1 [0005]
US 2009/0041293 A1 [0005]

Cited non-patent literature

Bernd Jähne: Digital Image Processing, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, Chapter 11.5, p. 342 ff. [0023]
Bernd Jähne: Digital Image Processing, 4th edition, Springer Verlag Berlin Heidelberg New York, 1997, Chapter 5.2, p. 149 ff. [0023]

Claims

Method for processing multichannel image recordings for the detection of hidden objects in a scene, in particular in optoelectronic person control, in which image data from at least two camera systems of different spectral sensitivity, which have largely overlapping fields of view, are subjected to an image data combination, comprising the following steps:
(1) capturing images in image channels of different frequency ranges as two-dimensional image data, wherein one image channel is acquired in the millimeter-wave range and the output images of the different channels are synchronized with one another if they are captured at different frame rates,
(2) image enhancement by a filter operation, in which at least a noise suppression is performed in a low-resolution image channel,
(3) generating a mask image from the image data of at least one image channel, wherein the image data are divided into at least two classes of pixels,
(4) transforming the image data of each image channel into a Gaussian and a Laplace pyramid, and the mask image into a corresponding mask image pyramid, wherein square regions with matching fields of view are selected from the image data read out in each case,
(5) combining the transformed image data of the Laplace pyramids of the individual image channels with the mask image pyramid at mutually corresponding pyramid levels and subsequently fusing the Laplace pyramids of the individual image channels into a resulting Laplace pyramid, wherein corresponding pixels within the same pyramid levels of the individual Laplace pyramids are selected on the basis of a relative comparison criterion and the selected pixels are assembled level by level into the resulting Laplace pyramid, and
(6) back transformation of the resulting Laplace pyramid into a fused result image.
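Steps (4) and (6) of this claim can be illustrated with a minimal sketch of a pyramid decomposition and its back transformation. The function names are hypothetical, and two simplifications are assumed that the claim does not prescribe: resolution is halved by averaging 2x2 blocks instead of Gaussian filtering, and it is doubled again by pixel replication. The input is assumed to be a square image whose side length is a power of two.

```python
def reduce_level(image):
    """Halve the resolution by averaging each 2x2 block (assumed stand-in for a Gaussian filter)."""
    half = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(half)
        ]
        for y in range(half)
    ]


def expand_level(image):
    """Double the resolution by replicating every pixel into a 2x2 block."""
    out = []
    for row in image:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def combine(a, b, sign):
    """Return a + sign * b, element by element."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def laplace_pyramid(image, levels):
    """Build the Gaussian pyramid, then store the detail lost between neighbouring levels."""
    gaussian = [[[float(v) for v in row] for row in image]]
    for _ in range(levels - 1):
        gaussian.append(reduce_level(gaussian[-1]))
    laplace = [
        combine(gaussian[k], expand_level(gaussian[k + 1]), -1)
        for k in range(levels - 1)
    ]
    laplace.append(gaussian[-1])  # the coarsest level keeps the low-pass residual
    return laplace


def back_transform(laplace):
    """Rebuild the image from the coarsest level upwards."""
    image = laplace[-1]
    for level in reversed(laplace[:-1]):
        image = combine(level, expand_level(image), +1)
    return image


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pyramid = laplace_pyramid(image, levels=3)
restored = back_transform(pyramid)
```

Without any fusion in between, the back transformation returns the original image, which is a convenient check before per-level fusion is inserted between the two functions.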

A method according to claim 1, characterized in that the mask image is generated by means of a fixed threshold.

A method according to claim 1, characterized in that the mask image is generated by means of an adjustable threshold value, wherein the threshold value is calculated from a histogram distribution of a read-out image.

A method according to claim 3, characterized in that the threshold value for the mask image is calculated as the median value of the histogram.
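The threshold variants in the preceding claims can be sketched as follows. This is an illustration under assumptions, not the applicant's implementation: grey values are assumed to be integers in 0..255, pixels above the threshold are assumed to form class 1, and all function names are hypothetical.

```python
def histogram(image, levels=256):
    """Count how often each grey value 0..levels-1 occurs in a 2-D image."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def median_threshold(counts):
    """Return the grey value at which the cumulative histogram reaches half of all pixels."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    cumulative = 0
    for value, count in enumerate(counts):
        cumulative += count
        if 2 * cumulative >= total:
            return value


def two_class_mask(image, threshold):
    """Label each pixel 1 if it lies above the threshold, otherwise 0 (assumed convention)."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


image = [
    [10, 12, 11, 200],
    [13, 11, 210, 205],
    [12, 10, 12, 11],
]
adaptive = median_threshold(histogram(image))   # threshold derived from the image
mask_adaptive = two_class_mask(image, adaptive)
mask_fixed = two_class_mask(image, 128)         # fixed threshold, chosen arbitrarily here
```

The fixed threshold needs no image statistics, whereas the median follows the brightness distribution of each read-out image.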

A method according to claim 3, characterized in that the threshold value is determined from a histogram analysis in which a global minimum is sought.

A method according to claim 5, characterized in that a plurality of threshold values are determined if a plurality of local minima are present, and thus more than two classes of pixels within the mask image are distinguished.
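A histogram analysis of this kind might look like the sketch below. It assumes an already smoothed histogram with few bins, interprets the sought global minimum as the deepest interior valley between populated bins, and uses hypothetical function names; none of these details are fixed by the claims.

```python
def local_minima(counts):
    """Return indices of interior bins that are strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(counts) - 1)
        if counts[i] < counts[i - 1] and counts[i] < counts[i + 1]
    ]


def global_minimum(counts):
    """Return the deepest interior valley, or None if there is no valley."""
    valleys = local_minima(counts)
    return min(valleys, key=lambda i: counts[i]) if valleys else None


def classify(value, thresholds):
    """Return the class index of a value: the number of thresholds it reaches or exceeds."""
    return sum(1 for t in thresholds if value >= t)


# Hypothetical smoothed 8-bin histogram with three modes and two valleys.
counts = [5, 40, 12, 3, 30, 9, 25, 4]
single_threshold = global_minimum(counts)
thresholds = local_minima(counts)
labels = [classify(bin_index, thresholds) for bin_index in range(len(counts))]
```

With two valleys the bins fall into three classes, which corresponds to distinguishing more than two classes of pixels in the mask image.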

Method according to claim 1, characterized in that the mask image is generated with the aid of a plurality of threshold values, information from at least two image channels (11, 12) being used.

Method according to claim 7, characterized in that a first mask image (31) is generated from a first image channel (11) and at least one further class of pixels is subdivided with information from a further image channel (12), the information from the further image channel (12) being used to calculate at least one further threshold value.

Method according to claim 8, characterized in that a first two-class mask image (31) is generated from the image channel in the millimeter-wave range (11) and a second multi-class mask image (32) is generated from the first mask image (31) and an image from a longer wavelength image channel (12).

Method according to claim 9, characterized in that the threshold values for the multi-class mask image (32) are obtained by histogram evaluation in an IR channel (12).
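The two-stage mask generation of the preceding claims can be sketched as follows. The sketch assumes that the THz and IR images are already registered pixel by pixel, that only pixels of class 1 in the first mask are subdivided further, and that the IR thresholds have been obtained beforehand from an IR histogram; the function names and sample values are hypothetical.

```python
def first_mask(thz, thz_threshold):
    """Two-class mask (31): 1 where the THz value exceeds the threshold, otherwise 0."""
    return [[1 if v > thz_threshold else 0 for v in row] for row in thz]


def second_mask(mask31, ir, ir_thresholds):
    """Multi-class mask (32): subdivide class-1 pixels of mask 31 using IR thresholds."""
    result = []
    for mask_row, ir_row in zip(mask31, ir):
        row = []
        for m, v in zip(mask_row, ir_row):
            if m == 0:
                row.append(0)  # pixels outside the first mask stay in class 0
            else:
                row.append(1 + sum(1 for t in ir_thresholds if v >= t))
        result.append(row)
    return result


thz = [[5, 50, 60], [4, 55, 70]]
ir = [[100, 80, 140], [90, 130, 120]]

mask31 = first_mask(thz, thz_threshold=20)
mask32 = second_mask(mask31, ir, ir_thresholds=[125])
```

Each additional IR threshold adds one more class inside the region selected by the first mask.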

Method according to claim 1, characterized in that the generated mask image (31, 32) is converted for the image fusion (5) into a mask image pyramid such that a stepwise reduction of the resolution of the mask image yields a data structure that matches the Laplace pyramids of the individual image channels (11, 12, 13, ... 1n) generated by the image transformation (4).
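A mask image pyramid of this kind could be built as in the sketch below. It assumes a square mask whose side length is a power of two and reduces the resolution by averaging 2x2 blocks, so that coarser levels hold fractional weights between 0 and 1; the claim itself does not prescribe the reduction filter, and the function names are hypothetical.

```python
def reduce_by_two(image):
    """Halve width and height by averaging each 2x2 block."""
    half = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(half)
        ]
        for y in range(half)
    ]


def mask_pyramid(mask, levels):
    """Return the mask at full resolution followed by levels - 1 reduced copies."""
    pyramid = [[[float(v) for v in row] for row in mask]]
    for _ in range(levels - 1):
        pyramid.append(reduce_by_two(pyramid[-1]))
    return pyramid


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
levels = mask_pyramid(mask, levels=3)
```

Each level has the same size as the corresponding Laplace pyramid level of an image channel of equal resolution, so the two can be combined pixel by pixel.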

Method according to claim 11, characterized in that the image fusion (5) is carried out on the basis of a fusion of the Laplace pyramids of at least two image channels, wherein, from corresponding pixels of the individual Laplace pyramid levels weighted with the mask image pyramid, those pixels of the individual Laplace pyramid levels are selected which have the largest absolute value.
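The selection rule for one pyramid level can be sketched as follows. Two details are assumptions made for illustration only: the mask weight is applied to the THz level alone, and a tie is resolved in favour of the THz coefficient. The function name and sample values are hypothetical.

```python
def fuse_level(thz_level, ir_level, mask_level):
    """Weight the THz level with the mask, then keep the coefficient of larger magnitude."""
    fused = []
    for thz_row, ir_row, mask_row in zip(thz_level, ir_level, mask_level):
        fused_row = []
        for thz, ir, weight in zip(thz_row, ir_row, mask_row):
            weighted = thz * weight  # assumed: only the THz channel is mask-weighted
            fused_row.append(weighted if abs(weighted) >= abs(ir) else ir)
        fused.append(fused_row)
    return fused


thz_level = [[4.0, -6.0], [1.0, 2.0]]
ir_level = [[-3.0, 5.0], [0.5, -2.5]]
mask_level = [[1.0, 1.0], [0.0, 0.5]]

fused = fuse_level(thz_level, ir_level, mask_level)
```

Where the mask weight is zero, the THz coefficient cannot win, so the IR detail is kept outside the masked region.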

Method according to claim 1, characterized in that the image fusion (5) is performed on the basis of a fusion of the Laplace pyramids of at least three image channels, wherein the combination with the mask image pyramid is replaced by a combination with the Laplace pyramid of an additional image channel, and from corresponding pixels of the individual Laplace pyramid levels those pixels are selected which have the largest absolute value.

A method according to claim 1, characterized in that an image enhancement of the image of a low-resolution image channel (11) is performed by using data from at least one higher-resolution image channel (12, 13) with a higher frame rate, a priori information about movements within the high-resolution image channel (12, 13) with the higher frame rate being applied to the image of the low-resolution image channel (11).