DE202022104365U1 - A robust color image hashing system for image authentication

Info

Publication number: DE202022104365U1
Application number: DE202022104365.5U
Authority: DE
Original assignee: National Institute Of Tech Silchar; National Institute of Technology India
Current assignee: National Institute Of Tech Silchar; National Institute of Technology India
Priority date: 2022-08-01
Filing date: 2022-08-01
Publication date: 2022-08-22
Anticipated expiration: 2032-08-02

Abstract

A neuro-evolutionary detection system (100) for early detection of lung cancer, the system (100) comprising:
a plurality of receivers (102) for receiving an input image and a hash value, wherein the received image is encapsulated with a blind rotational scaling transform (RST);
a geometric transformation correction module (104) connected to the plurality of receivers (102) for eliminating features of the blind RST transformation by using inherent properties of the geometric transformation and evaluating the rotation condition of the received image, wherein the features of the blind RST transformation are eliminated by computing the coordinates of boundary pixels with non-zero values to obtain an interpolated image;
an image authentication module (106) connected to the geometric transformation module (104) for generating a hash code by an L2 regularization constraint (L2RC) applied to the activation of a final block in an encoder (106a), wherein an estimation module (108), embodied in the image authentication module (106), estimates a correlation coefficient between the received hash and the generated hash, wherein, if the correlation coefficient is less than a first threshold, the interpolated image is either a tampered image or a different image; and
a tamper detection and localization module (110) connected to the image segmentation module (106) for classifying the image as a tampered image or a different image and identifying the tampered region of tampered images, wherein the tampered image is mapped to the generated hash code to generate an image map and locate the tampered image.

Description

FIELD OF THE INVENTION

The present invention relates to the field of image authentication systems. More particularly, the present invention relates to robust color image hashing using convolutional stacked denoising auto-encoders for image authentication.

BACKGROUND OF THE INVENTION

With the development of sophisticated image editing programs, manipulating or forging image content has become quite easy. Tampering detection is therefore essential for establishing the validity of an image. This problem is best addressed by perceptual image hashing techniques, which extract salient features from an image to create a hash.

Conventional hash methods such as MD5 (Message Digest 5) or SHA-256 generate hash codes that are highly sensitive to the input data: any small change results in a different hash code. This behavior is a drawback for digital images (original images or images transmitted in image communication), which routinely undergo content-preserving operations (CPOs) such as compression, filtering, and scaling that keep the visual information close to the original. The goal of perceptual hashing is to generate hash codes that respond only to changes in the visual content. The hash codes of the original image and of its CPO-processed counterparts should be highly correlated (robustness), while the correlation between the hash codes of the original image and of an image with altered content should be lower (discrimination). The performance of perceptual hashing techniques is measured by their robustness to CPOs and their sensitivity to malicious operations such as removing or adding content.
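The sensitivity of conventional hashes described above can be illustrated with a minimal Python sketch. It assumes the image is available as raw bytes; the byte string used here is a synthetic stand-in rather than real image data, and the helper names are illustrative only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Synthetic stand-in for raw pixel bytes (an assumption, not real image data).
original = bytes(range(256)) * 4

# Flip the least-significant bit of a single byte, a visually negligible change.
modified = bytearray(original)
modified[0] ^= 0x01

digest_original = sha256_hex(original)
digest_modified = sha256_hex(bytes(modified))

print(digest_original == digest_modified)  # the digests no longer match
print(differing_bits(digest_original, digest_modified), "of 256 bits differ")
```

A perceptual hash is expected to behave in the opposite way for such a change: the two hash codes should remain highly correlated.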

In general, hashing methods fall into two categories: locality-sensitive hashing (LSH) and learning-based hashing (LBH). LSH uses functions that are independent of the given data to generate hash codes, while LBH techniques learn hash functions from the input data. LSH does not guarantee that semantic similarity is preserved, which is a basic requirement for use in image processing systems. To overcome this problem, learning-based hashing techniques are recommended.
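To illustrate what a data-independent hash function looks like, the following sketch uses random-hyperplane hashing, one common LSH family. This family is chosen here purely for illustration and is not named in this document; the function name, bit length, and seed are assumptions.

```python
import numpy as np


def make_random_projection_hash(dim, n_bits, seed=0):
    """Build a hash function from random hyperplanes.

    The hyperplanes are drawn without looking at any data, which is what
    makes this kind of hashing data-independent.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, dim))

    def hash_fn(x):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return (planes @ np.asarray(x, dtype=float) >= 0).astype(np.uint8)

    return hash_fn


hash_fn = make_random_projection_hash(dim=8, n_bits=16, seed=1)
feature_vector = np.arange(1.0, 9.0)  # hypothetical image feature vector
print(hash_fn(feature_vector))
```

Because the hyperplanes never adapt to the images, nothing in this construction ties the bits to semantic content, which is the limitation noted above.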

LBH techniques use the input data to map similar images to hash codes with a high degree of correlation. These methods are scalable and preserve the semantic relationship between the input images. Therefore, this work proposes an LBH technique based on convolutional stacked denoising auto-encoders (CSDAEs) for image authentication.

Auto-encoders (AEs) are able to learn the salient features of unlabeled data (or images) and provide an efficient representation (a short array) of the input data. This representation ensures minimal reconstruction loss during decoding. Most AEs are fully connected feedforward networks with a regularization constraint (RC) on the middle layer, which produces the hash code. The RC enables the model to learn the data distribution instead of copying the input to the output. Convolutional neural networks (CNNs) are robust to overfitting owing to their weight-sharing property, and they work well for data with local spatial coherence, such as images.

The first known image hashing method was presented by Schneider and Chang, followed by the work of R. Venkatesan et al. Since then, researchers have paid much attention to image hashing for content authentication.

Tang et al. introduced ring partition and NMF (non-negative matrix factorization) to generate hash codes that are robust to rotation and have good discrimination ability, but are sensitive to some geometric operations.

Davarzani et al. introduced singular value decomposition (SVD) and center-symmetric local binary patterns (CSLBP) to create a perceptual image hash (PIH). This method can detect tampering if the tampered area is at least 10% of the reference image.

Ding et al. proposed DCT-based image authentication. This method cannot detect manipulations in the harmony.

Xue et al. developed a hashing method using SIFT and LBP. This method is limited in the case of geometric operations.

Pun et al. proposed an image hashing technique based on feature point selection for content authentication and tamper localization. This method is limited when tampering and composite RST occur simultaneously.

Ouyang et al. used a PIH technique that combines quaternion Zernike moments (QZMs) for global features with local SIFT features, but it is affected by brightness and geometric distortions. Other prior art used ring partitions and factorization of the non-negative gradient matrix, which made the system insensitive to rotation. Local features are then derived from the salient regions to locate the forged regions in an image.

Karsh et al. proposed a hashing technique using DWT-SVD. These methods failed on color images but offered hash robustness against large rotations. Such hashing methods are used for image authentication, but when composite RST and malicious tampering occur simultaneously, the tampering cannot be detected.

Robustness to geometric changes is a major issue with the above image authentication methods. The effects of geometric distortions on image authentication have been mitigated using image alignment techniques.

Battia et al. presented a voting procedure for restoring the original image. Image restoration performance is better, but the method requires a longer hash, i.e., 1000 digits.

Yan et al. presented an approach that uses a quaternion Fourier-Mellin transform for image restoration, but this method is limited in the case of tampering.

Karsh et al. introduced a geometric correction approach based on the inherent properties of the geometric transformation. This method requires a very short hash but is limited for negative rotation angles. Auto-encoders (AEs) have been part of the landscape for decades. Traditionally, AEs are mainly used for feature extraction and feature learning.

Makhzani et al. used auto-encoders with aggressive sparsity constraints. In this approach, only nb activations of each feature map are retained, while the rest are set to zero.
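The sparsity rule described above can be sketched as follows. The sketch assumes that the retained activations are the nb largest values in each feature map, which this document does not state explicitly; the function name and the (channels, height, width) array layout are likewise assumptions.

```python
import numpy as np


def keep_top_activations(feature_maps, nb):
    """Keep the nb largest activations in each feature map and zero the rest.

    feature_maps has shape (channels, height, width). Choosing the largest
    values is an assumption made for this sketch.
    """
    fm = np.asarray(feature_maps, dtype=float)
    channels = fm.shape[0]
    flat = fm.reshape(channels, -1)
    if not 0 < nb <= flat.shape[1]:
        raise ValueError("nb must be between 1 and the feature map size")

    # Column indices of the nb largest values in each row (order not sorted).
    top_idx = np.argpartition(flat, -nb, axis=1)[:, -nb:]

    sparse = np.zeros_like(flat)
    rows = np.arange(channels)[:, None]
    sparse[rows, top_idx] = flat[rows, top_idx]
    return sparse.reshape(fm.shape)


maps = np.arange(18.0).reshape(2, 3, 3)  # two toy 3x3 feature maps
print(keep_top_activations(maps, nb=2))
```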

The prior work mentioned above focused either on making the hash codes robust or on locating tampered regions.

Therefore, there is a need to develop an apparatus for training a single convolutional stacked denoising auto-encoder to generate hash codes that are robust against various geometric attacks while also locating the tampered regions.

The technical advance disclosed by the present invention overcomes the limitations and disadvantages of existing and conventional systems and methods.

SUMMARY OF THE INVENTION

The present invention relates generally to a neuro-evolutionary detection system for the early detection of lung cancer.

An object of the present invention is to propose an image hashing method based on CSDAEs during training.

Another object of the present invention is to propose a geometric correction approach, in particular for composite RST, to detect tampered areas.

Another object of the present invention is to detect image tampering.

In one embodiment, a neuro-evolutionary detection device for the early detection of lung cancer is disclosed, the device comprising:

a plurality of receivers for receiving an input image and a hash value, wherein the received image is encapsulated with a blind rotational scaling transform (RST);
a geometric transformation correction module connected to the plurality of receivers for eliminating features of the blind RST transformation by using inherent properties of the geometric transformation and evaluating the rotation condition of the received image, wherein the features of the blind RST transformation are eliminated by computing the coordinates of boundary pixels with non-zero values to obtain an interpolated image;
an image authentication module connected to the geometric transformation module for generating a hash code by an L2 regularization constraint (L2RC) applied to the activation of a final block in an encoder, wherein an estimation module included in the image authentication module estimates a correlation coefficient between the received hash and the generated hash, wherein, if the correlation coefficient is less than a first threshold, the interpolated image is either a tampered image or a different image; and
a tamper detection and localization module connected to the image segmentation module for classifying the image as a tampered image or a different image and identifying the tampered region of tampered images, wherein the tampered image is mapped to the generated hash code to generate an image map and locate the tampered image.

In one embodiment, the boundary pixels comprise the topmost pixel, the rightmost pixel, the leftmost pixel, and the bottommost pixel.
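Locating these four boundary pixels can be sketched in a few lines. The sketch assumes the image is a NumPy array in which the background introduced by the geometric transformation is exactly zero; the function name and the returned dictionary keys are illustrative.

```python
import numpy as np


def boundary_pixels(image):
    """Return (row, col) of the topmost, bottommost, leftmost and rightmost
    non-zero pixels of a 2-D grayscale or 3-D color image array."""
    img = np.asarray(image)
    # A color pixel counts as non-zero if any of its channels is non-zero.
    mask = img != 0 if img.ndim == 2 else np.any(img != 0, axis=-1)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("image contains no non-zero pixels")

    def point(i):
        return int(rows[i]), int(cols[i])

    return {
        "top": point(np.argmin(rows)),
        "bottom": point(np.argmax(rows)),
        "left": point(np.argmin(cols)),
        "right": point(np.argmax(cols)),
    }


toy = np.zeros((5, 5))
toy[0, 3] = toy[1, 4] = toy[2, 0] = toy[4, 1] = 1
print(boundary_pixels(toy))
```

When several pixels share the extreme row or column, this sketch returns the first one in row-major order; a real implementation may need a different tie-breaking rule.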

In one embodiment, the geometric transformation correction module evaluates the direction of rotation, i.e., either clockwise or counterclockwise, and adjusts the rotated image, wherein, if the received image is rotated counterclockwise by up to θ degrees, the topmost non-zero coordinate point lies further to the right than the bottommost point, regardless of the degree of translation, and vice versa for an image rotated clockwise by up to -θ degrees, and wherein, if the value of θ is substantially equal to θ' degrees, the received image is considered a rotated image that is counter-rotated by θ' degrees.
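The direction test described above can be sketched as a comparison of column positions. The sketch assumes a zero-valued background and, to reduce sensitivity to ties, compares the mean column of the non-zero pixels in the topmost and bottommost rows; that averaging step and the function name are assumptions, and estimating the angle itself is not covered.

```python
import numpy as np


def rotation_direction(image):
    """Guess the rotation direction from the topmost and bottommost
    non-zero pixels: top further right means counterclockwise."""
    img = np.asarray(image)
    mask = img != 0 if img.ndim == 2 else np.any(img != 0, axis=-1)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("image contains no non-zero pixels")

    # Mean column of the non-zero pixels in the extreme rows.
    top_col = cols[rows == rows.min()].mean()
    bottom_col = cols[rows == rows.max()].mean()

    if top_col > bottom_col:
        return "counterclockwise"
    if top_col < bottom_col:
        return "clockwise"
    return "none"


# Toy content whose top point lies to the right of its bottom point.
ccw = np.zeros((5, 5))
ccw[0, 3] = ccw[2, 2] = ccw[4, 1] = 1
print(rotation_direction(ccw))           # counterclockwise
print(rotation_direction(ccw[:, ::-1]))  # clockwise (mirrored content)
```

Because only the relative column positions are compared, shifting the content inside the frame does not change the result, which matches the translation independence stated above.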

In one embodiment, a binary interpolation module computes a region of interest by processing the image to the dimensions of the received image once the rotation angle has been adjusted.

In one embodiment, the encoder is a convolutional stacked denoising auto-encoder (CSDAE) having a plurality of blocks for generating the hash image, wherein the CSDAE is a feedforward neural network and each of the plurality of blocks consists of a convolution layer followed by batch normalization and a max-pooling layer.
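The block structure described above (convolution, then batch normalization, then max pooling) can be sketched in plain NumPy. This is a forward pass only, under several assumptions not taken from this document: zero padding with stride 1, a 2x2 pooling window, normalization over the spatial positions of a single image without learned scale or shift, and no activation function. All function names are illustrative.

```python
import numpy as np


def conv2d_same(x, kernels):
    """Stride-1 convolution with zero padding, as used in deep-learning
    frameworks. x: (H, W, C_in); kernels: (kh, kw, C_in, C_out)."""
    kh, kw, _, c_out = kernels.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, c_out))
    for i in range(kh):
        for j in range(kw):
            # (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)
            out += padded[i:i + h, j:j + w, :] @ kernels[i, j]
    return out


def batch_norm(x, eps=1e-5):
    """Normalize each channel to zero mean and unit variance."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    h, w, c = x.shape
    x = x[:h - h % 2, :w - w % 2, :]
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))


def encoder_block(x, kernels):
    """One encoder block: convolution -> batch normalization -> max pooling."""
    return max_pool_2x2(batch_norm(conv2d_same(x, kernels)))


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))       # toy 8x8 color image
filters = rng.standard_normal((3, 3, 3, 4))  # four 3x3 filters (untrained)
print(encoder_block(image, filters).shape)   # (4, 4, 4)
```

Stacking several such blocks halves the spatial size at each stage, which is how the encoder arrives at a compact code.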

In one embodiment, the CSDAE hierarchically maps received interpolated images into a latent space, which is mapped back to the dimensions of the received image, wherein the L2RC is configured to: remove irrelevant components from the hash image by choosing the smallest combination that solves the learning problem; and suppress the effect of static noise on the targets by preventing overfitting.
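One plausible way to express an L2 regularization constraint on the code activations is as a penalty added to the reconstruction error. This document does not give the training objective, so the form below, the weight lam, and the function name are all assumptions made for illustration.

```python
import numpy as np


def l2rc_loss(x, x_reconstructed, code, lam=1e-3):
    """Reconstruction error plus an L2 penalty on the code activations.

    x, x_reconstructed: input image and decoder output (same shape).
    code: activations of the final encoder block, i.e. the hash code.
    lam: penalty weight (hypothetical value).
    """
    x = np.asarray(x, dtype=float)
    x_reconstructed = np.asarray(x_reconstructed, dtype=float)
    code = np.asarray(code, dtype=float)
    reconstruction = np.mean((x - x_reconstructed) ** 2)
    penalty = lam * np.sum(code ** 2)
    return float(reconstruction + penalty)


# Unit reconstruction error (1.0) plus penalty 0.1 * (3^2 + 4^2) = 2.5.
print(l2rc_loss(np.zeros(2), np.ones(2), [3.0, 4.0], lam=0.1))
```

Penalizing large code activations pushes the encoder toward small, non-redundant codes, which is consistent with the stated aim of removing irrelevant components and limiting overfitting.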

In one embodiment, the tampered area is localized by multiplying the tampered area by the tampered image, wherein, if the tampered area is less than 30%, the received image is judged to be a "tampered image", and otherwise a "different image".
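The localization and the 30% rule can be sketched as follows. The sketch assumes that a binary tamper mask of the same height and width as the image has already been obtained and that "tampered area" means the fraction of masked pixels; how the mask is produced is outside this sketch, and the function name is illustrative.

```python
import numpy as np


def localize_and_classify(image, tamper_mask, area_limit=0.30):
    """Mask the image with the tampered region and apply the 30% rule.

    image: (H, W) or (H, W, C) array; tamper_mask: (H, W) array whose
    non-zero entries mark tampered pixels (assumed to be given).
    """
    img = np.asarray(image, dtype=float)
    mask = np.asarray(tamper_mask).astype(bool)
    if mask.shape != img.shape[:2]:
        raise ValueError("mask and image must have the same height and width")

    # Fraction of the image area flagged as tampered.
    fraction = float(mask.mean())
    # Multiplying by the mask keeps only the tampered pixels.
    localized = img * (mask[..., None] if img.ndim == 3 else mask)
    label = "tampered image" if fraction < area_limit else "different image"
    return localized, fraction, label


picture = np.full((10, 10, 3), 200.0)  # toy uniform color image
small_mask = np.zeros((10, 10))
small_mask[:2, :5] = 1                 # 10% of the pixels
_, area, verdict = localize_and_classify(picture, small_mask)
print(area, verdict)                   # 0.1 tampered image
```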

In one embodiment, the first threshold is 0.98, and a second threshold, used for tamper localization, is 0.5.
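The first threshold can be applied to a pair of hash codes as sketched below. The sketch assumes that the correlation coefficient is the Pearson coefficient and that an image at or above the threshold is treated as authentic; this document only states the outcome below the threshold. The function name and labels are illustrative.

```python
import numpy as np


def authenticate(received_hash, generated_hash, threshold=0.98):
    """Compare two hash codes by their Pearson correlation coefficient."""
    a = np.asarray(received_hash, dtype=float).ravel()
    b = np.asarray(generated_hash, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("hash codes must have the same length")
    r = float(np.corrcoef(a, b)[0, 1])
    # Treating r >= threshold as authentic is an assumption of this sketch.
    label = "authentic" if r >= threshold else "tampered or different image"
    return r, label


reference = np.linspace(0.0, 1.0, 64)  # toy hash code
print(authenticate(reference, reference))
print(authenticate(reference, reference[::-1]))
```

An image that falls below the first threshold would then go on to the localization step, where the second threshold applies.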

In one embodiment, each convolution layer except the last is initialized with a plurality of filters, each having a kernel size of (3,3), and the CSDAE is trained layer by layer and then fine-tuned as a whole.

In one embodiment, a decoder block comprises a 2D convolution layer and an up-sampling layer, followed by a batch normalization layer.

To further clarify the advantages and features of the present invention, a more detailed description of the invention is given by reference to specific embodiments thereof, which are illustrated in the accompanying figures. It is understood that these figures show only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with reference to the accompanying figures.

List of figures

These and other features, aspects, and advantages of the present invention will be better understood when the following detailed description is read with reference to the accompanying figures, in which like characters represent like parts throughout the figures, wherein:

Figure 1 shows a block diagram of a neuro-evolutionary detection device for the early detection of lung cancer, and
Figure 2 shows a graphical representation of the true positive rate (TPR) for different degrees of rotation.

Those skilled in the art will understand that the elements in the figures are illustrated for simplicity and are not necessarily drawn to scale. For example, the flow charts illustrate the method in terms of its most important steps in order to improve understanding of aspects of the present disclosure. Furthermore, one or more components of the device may be represented in the figures by conventional symbols, and the figures show only those specific details that are relevant to understanding the embodiments of the present disclosure, so as not to overload the figures with details that will be readily apparent to those skilled in the art having the benefit of the present description.

DETAILLIERTE BESCHREIBUNGDETAILED DESCRIPTION

Um das Verständnis der Erfindung zu fördern, wird nun auf die in den Figuren dargestellte Ausführungsform Bezug genommen und diese mit bestimmten Worten beschrieben. Es versteht sich jedoch von selbst, dass damit keine Einschränkung des Umfangs der Erfindung beabsichtigt ist, wobei solche Änderungen und weitere Modifikationen des dargestellten Systems und solche weiteren Anwendungen der darin dargestellten Grundsätze der Erfindung in Betracht gezogen werden, wie sie einem Fachmann auf dem Gebiet der Erfindung normalerweise einfallen würden.For the purposes of promoting an understanding of the invention, reference will now be made to the embodiment illustrated in the figures and specific language will be used to describe the same. It should be understood, however, that no limitation on the scope of the invention is intended, and such alterations and further modifications to the illustrated system and such further applications of the principles of the invention set forth therein are contemplated as would occur to those skilled in the art invention would normally come to mind.

It will be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory of the invention and are not intended to limit it.

Reference in this specification to "an aspect", "another aspect" or similar language means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the phrases "in one embodiment", "in another embodiment" and similar language throughout this specification may, but need not, all refer to the same embodiment.

The terms "comprises", "comprising" or any other variations thereof are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps includes not only those steps but may also include other steps not expressly listed or inherent to such a process or method. Likewise, one or more devices, subsystems, elements, structures or components preceded by "comprises ... a" do not, without further constraints, preclude the existence of other or additional devices, subsystems, elements, structures or components.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one skilled in the art to which this invention belongs. The system, methods and examples provided herein are illustrative only and are not intended to be limiting.

Embodiments of the present invention are described in detail below with reference to the accompanying figures.

FIG. 1 shows a block diagram of a neuro-evolutionary detection system (100) for early detection of lung cancer, the system (100) comprising: a plurality of receivers (102), a geometric transformation correction module (104), an image authentication module (106), an encoder (106a), a decoder (106b), an estimation module (108) and a tampering detection and localization module (110).

The plurality of receivers (102) receive an input image and a hash value, the received image being encapsulated with a blind rotation-scaling transformation (RST).

The geometric transformation correction module (104) is connected to the plurality of receivers (102) and eliminates the features of the blind RST transformation by using inherent properties of the geometric transformation and evaluating the rotation condition of the received image. The features of the blind RST transformation are eliminated by computing the coordinates of the boundary pixels with non-zero values in order to obtain an interpolated image. The boundary pixels comprise the topmost, rightmost, leftmost and bottommost pixels.

The geometric transformation correction module evaluates the direction of rotation, i.e. clockwise or counterclockwise, and adjusts the rotated image. If the received image is rotated counterclockwise by up to θ degrees, the topmost non-zero coordinate point lies further to the right than the bottommost point, regardless of the degree of translation, and vice versa for an image rotated clockwise by up to -θ degrees. If the value of θ is substantially equal to θ' degrees, the received image is regarded as a rotated image and is counter-rotated by θ' degrees.

If an image is rotated counterclockwise by up to 45 degrees, its topmost non-zero coordinate point lies further to the right than its bottommost point, regardless of the degree of translation, i.e. $X_t > X_b$, and vice versa for an image rotated clockwise by up to -45 degrees. If the image is classified as rotated counterclockwise, the rotation angle is calculated as follows:

$\theta = \arctan\left(\frac{\Delta y}{\Delta x}\right), \quad \theta' = \arctan\left(\frac{\Delta y'}{\Delta x'}\right)$

Otherwise, the rotation angle is calculated as follows:

$\theta = -\left(90 - \arctan\left(\frac{\Delta y}{\Delta x}\right)\right)$

$\theta' = -\left(90 - \arctan\left(\frac{\Delta y'}{\Delta x'}\right)\right)$

where $\Delta y = Y_r - Y_b$, $\Delta x = X_r - X_b$, $\Delta y' = Y_t - Y_l$ and $\Delta x' = X_t - X_l$, and where $(X_r, Y_r)$, $(X_t, Y_t)$, $(X_b, Y_b)$ and $(X_l, Y_l)$ are the coordinates of the rightmost, topmost, bottommost and leftmost non-zero pixels, respectively. To account for the scaling and translation factors, the coordinates of the topmost, bottommost, rightmost and leftmost non-zero pixels are calculated. Using this information, the region of interest is then cropped and resized to the dimensions of the query image by bilinear interpolation.
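The angle estimation described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of module (104): the function names, the use of plain (x, y) tuples and the convention that y is measured upward are assumptions made for the example. `math.atan2` is used in place of the arctangent of the ratio; the two agree here because both differences are non-negative by construction, and `atan2` avoids a division by zero.

```python
import math

def extreme_points(points):
    """Return the topmost, bottommost, leftmost and rightmost points.

    `points` holds (x, y) coordinates of non-zero pixels, with y measured
    upward (an assumption; flip the row index when working on arrays).
    """
    pts = list(points)
    top = max(pts, key=lambda p: p[1])
    bottom = min(pts, key=lambda p: p[1])
    left = min(pts, key=lambda p: p[0])
    right = max(pts, key=lambda p: p[0])
    return top, bottom, left, right

def rotation_angles(points):
    """Estimate (theta, theta_prime) in degrees from the extreme points."""
    (xt, yt), (xb, yb), (xl, yl), (xr, yr) = extreme_points(points)
    dy, dx = yr - yb, xr - xb        # delta y, delta x
    dy_p, dx_p = yt - yl, xt - xl    # delta y', delta x'
    a = math.degrees(math.atan2(dy, dx))
    a_p = math.degrees(math.atan2(dy_p, dx_p))
    if xt > xb:                      # counterclockwise case
        return a, a_p
    return -(90 - a), -(90 - a_p)    # clockwise case

def rotated_rectangle(width, height, angle_deg):
    """Corners of a centred rectangle rotated by angle_deg (test helper)."""
    a = math.radians(angle_deg)
    corners = [(-width / 2, -height / 2), (width / 2, -height / 2),
               (width / 2, height / 2), (-width / 2, height / 2)]
    return [(x * math.cos(a) - y * math.sin(a),
             x * math.sin(a) + y * math.cos(a)) for x, y in corners]

theta, theta_p = rotation_angles(rotated_rectangle(4, 3, 20))
print(round(theta, 3), round(theta_p, 3))
```

For a rectangle rotated within the stated range, θ and θ' agree, which corresponds to the condition under which the received image is treated as rotated and is counter-rotated.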

The binary interpolation module (104) computes a region of interest by fitting the image to the dimensions of the received image once the rotation angle has been adjusted.
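The cropping and resizing step can be illustrated with a small pure-Python sketch. It assumes a single-channel image stored as a list of rows and corner-aligned sampling; both choices, and the function names, are assumptions for the example and are not specified in the description.

```python
def crop_to_content(img):
    """Crop a 2-D grid (list of rows) to the bounding box of its non-zero pixels."""
    rows = [r for r, row in enumerate(img) if any(v != 0 for v in row)]
    cols = [c for c in range(len(img[0])) if any(row[c] != 0 for row in img)]
    if not rows:
        raise ValueError("image has no non-zero pixels")
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]

def resize_bilinear(img, out_h, out_w):
    """Resize a 2-D grid to (out_h, out_w) with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map output row i onto the input grid (corner-aligned sampling).
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            upper = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            lower = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(upper * (1 - fy) + lower * fy)
        out.append(row)
    return out

padded = [[0, 0, 0, 0],
          [0, 1, 3, 0],
          [0, 5, 7, 0],
          [0, 0, 0, 0]]
roi = crop_to_content(padded)
print(resize_bilinear(roi, 3, 3))
```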

The image authentication module (106) is connected to the geometric transformation correction module (104) and generates a hash code by means of an L2 regularization constraint (L2RC) applied to the activation of the final block of an encoder (106a). An estimation module (108) embodied in the image authentication module (106) estimates a correlation coefficient between the received hash and the generated hash; if the correlation coefficient is less than a first threshold, the interpolated image is either a tampered image or a different image.
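The comparison performed by the estimation module (108) can be sketched as a Pearson correlation between two hash vectors followed by a threshold test. The function names and the representation of hashes as flat lists of numbers are assumptions for the example; the default threshold of 0.98 is the first threshold stated in this description.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("hashes must have the same length (at least 2)")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant hash")
    return cov / math.sqrt(var_a * var_b)

def passes_first_threshold(received_hash, generated_hash, delta1=0.98):
    """True if the correlation reaches the first threshold."""
    return pearson(received_hash, generated_hash) >= delta1

print(passes_first_threshold([1, 2, 3, 4], [2, 4, 6, 8]))
```

An image that fails this test is passed on for the tampered-versus-different decision.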

The encoder (106a) is a convolutional stacked denoising auto-encoder (CSDAE) with a plurality of blocks for generating the hash image. The CSDAE is a feed-forward neural network, and each of the plurality of blocks consists of a convolution layer followed by batch normalization and a max-pooling layer. The CSDAE hierarchically maps received interpolated images into a latent space, which is then mapped back to the dimensions of the received image. The L2RC is configured to: remove irrelevant components from the hash image by selecting the smallest combination that solves the learning problem; and suppress the effect of static noise on the targets by preventing overfitting.
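One common way to realize an L2 constraint on an activation is to add a weighted sum of squared activations to the reconstruction loss. The sketch below shows that form under stated assumptions: the mean squared error as reconstruction loss, the function names and the weight value are all hypothetical and are not given in the description.

```python
def mse(prediction, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(prediction)

def l2_activity_penalty(code, weight):
    """Weighted sum of squared activations of the encoder's final block."""
    return weight * sum(h * h for h in code)

def training_loss(reconstruction, reference, code, weight=1e-4):
    """Reconstruction error plus L2 penalty on the latent code.

    `weight` is a hypothetical hyperparameter chosen for illustration.
    """
    return mse(reconstruction, reference) + l2_activity_penalty(code, weight)

print(training_loss([0.5, 0.5], [0.0, 1.0], [1.0, 2.0], weight=0.1))
```

A larger weight pushes the activations of the final encoder block, and therefore the hash, toward smaller magnitudes.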

Each convolution layer except the last is initialized with a plurality of filters, each with a kernel size of (3,3); the CSDAE is trained layer by layer and then fine-tuned as a whole.

Each block of the decoder (106b) consists of a 2D convolution layer and an up-sampling layer, followed by a batch normalization layer.

The tampering detection and localization module (110) is connected to the image authentication module (106) and classifies the image as a tampered image or a different image and identifies the tampered region of tampered images; the tampered image is mapped onto the generated hash code to produce an image map and localize the tampering.

The tampered region is localized by multiplying the tampered region by the tampered image. If the tampered region is less than 30 % of the image, the received image is judged to be a "tampered image"; otherwise it is judged to be a "different image".

The first threshold is 0.98, and a second threshold, used for tamper localization, is 0.5.
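The two-stage decision can be summarized in a short sketch. The function name, the argument names and the label "similar image" for the accepted case are assumptions for the example; the 0.98 threshold and the 30 % limit are the values stated in this description.

```python
def classify(correlation, tampered_fraction, delta1=0.98, max_tampered=0.30):
    """Three-way decision on a received image.

    correlation:       correlation between received and generated hash
    tampered_fraction: share of the image flagged as tampered (0.0 to 1.0)
    """
    if correlation >= delta1:
        return "similar image"
    if tampered_fraction < max_tampered:
        return "tampered image"
    return "different image"

print(classify(0.99, 0.00))
print(classify(0.90, 0.10))
print(classify(0.40, 0.80))
```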

The tampered regions (T) are localized as follows:

$T(x,y) = \{\tilde{I}(x,y) - \bar{I}(x,y)\} \geq \delta_2$

where $\delta_2$ is a threshold for tamper localization and x, y are pixel coordinates. Finally, T is multiplied by the tampered image to show the probably tampered regions of the received image.

Layers | Encoder block layer | Decoder block layer
kernel_size | 3*3 | 3*3
number_of_maps | 16 | 16
pool_size | 2*2 | 2*2
pool_type | max | max
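The localization rule for T can be sketched on small grids as follows. The two input maps are named `i_tilde` and `i_bar` after the symbols in the formula; how they are obtained is not shown here, and the signed difference is used exactly as written in the formula (whether an absolute difference is intended is not stated). All function names are assumptions for the example.

```python
def tamper_mask(i_tilde, i_bar, delta2=0.5):
    """Binary map T: 1 where the difference reaches delta2, else 0."""
    return [[1 if (a - b) >= delta2 else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(i_tilde, i_bar)]

def apply_mask(image, mask):
    """Multiply the image by T so that only flagged pixels remain."""
    return [[p * m for p, m in zip(row_p, row_m)]
            for row_p, row_m in zip(image, mask)]

def tampered_fraction(mask):
    """Share of pixels flagged as tampered."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

i_tilde = [[0.9, 0.2], [0.1, 0.8]]
i_bar = [[0.1, 0.1], [0.1, 0.2]]
mask = tamper_mask(i_tilde, i_bar)
print(mask, tampered_fraction(mask))
print(apply_mask(i_tilde, mask))
```

The fraction returned by `tampered_fraction` is the quantity that would be compared with the 30 % limit when deciding between a tampered image and a different image.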

The optimal parameter values for the proposed system are chosen as follows: P × Q × R = 128 × 128 × 3, the first threshold δ1 is 0.98 and δ2 is 0.5. The TensorFlow deep learning framework was used to implement the CSDAE. All training and test images are scaled to [0,1] before being fed into the CSDAE. During the training phase, the CSDAE receives images that have undergone CPOs as input and the corresponding reference images as output. Each convolution layer except the last is initialized with 64 filters, each with a kernel size of (3,3). The CSDAE is trained layer by layer and then fine-tuned as a whole. Training is performed on a total of 43,328 images. The source images are taken from the USC-SIPI database, and training runs for 100 epochs with a batch size of 400, of which 320 samples form the training batch and 80 samples form the validation or test batch. The USC-SIPI database consists of several volumes, such as textures and aerials, grouped by the character of the images. The size of the images in each volume varies from 256 x 256 pixels to 1024 x 1024 pixels. The proposed system is evaluated and tested with respect to blind RST correction, robustness and discrimination performance, and tampering detection and localization.

The system is trained for 100 epochs on input images of dimensions (512 x 512 x 3) and (128 x 128 x 3). The accuracy and loss curves oscillate for the model trained on 512 x 512 x 3 images, whereas they are relatively smooth for the model trained on 128 x 128 x 3 images. The model for the 512 input does not converge, i.e. "TA 512 input".

It is concluded that accuracy is better for a 128 x 128 x 3 input image than for a 512 x 512 x 3 input image.

FIG. 2 shows a plot of the true positive rate (TPR) for different degrees of rotation. It can be observed that the CSDAE is robust to rotations from -5° to +5°. This gives the proposed RST correction an error margin of (-5,+5) degrees. Combined with the proposed RST correction, the CSDAE produces hash codes that are robust over the entire range of (-45,+45) degrees.

According to one embodiment, the robustness and discrimination capability of the proposed image hashing are evaluated. The experiment is performed on 10,094 similar image pairs generated from 103 reference images by applying content-preserving operations (CPOs). The system is also evaluated on 19,900 dissimilar image pairs formed from 200 images, i.e. 200 × (200 - 1)/2. These images come from different sources, namely 100 images from the reference database, 50 images taken with a Nikon D3200 and 50 images from the internet. The experiment is also performed on 763 tampered image pairs, of which 365 "large tampering" pairs come from CASIA V2.0 and 398 "small tampering" pairs come from the NITS database. The tampered NITS dataset was created by varying the tampered region from 1 to 30 %. In addition, the images have varied foregrounds and backgrounds, which makes the dataset challenging. The Pearson product-moment correlation coefficient (PMCC) is calculated between the hash code of the reference image and those of the corresponding semantically similar images, dissimilar image pairs and tampered image pairs.
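The pair counts quoted for the evaluation can be checked with a few lines of Python. The sketch only reproduces the arithmetic; the integer stand-ins for images and the function name are assumptions for the example.

```python
from itertools import combinations
from math import comb

def dissimilar_pairs(images):
    """All unordered pairs of distinct images."""
    return list(combinations(images, 2))

images = range(200)                 # stand-ins for the 200 distinct images
pairs = dissimilar_pairs(images)
print(len(pairs), comb(200, 2))     # both equal 200 * 199 / 2
print(365 + 398)                    # large plus small tampering pairs
```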

It was experimentally determined that, with the threshold δ1 = 0.98, only 1.10 % of the semantically similar images are classified as different and only 1.57 % of the tampered images are falsely recognized as similar, while the FPR for different pairs is zero. The chosen threshold therefore offers a better trade-off between FPR and TPR, as shown in the following table:

Hash correlation | TPR | FPR
0.97 | 0.9979 | 0.0157
0.98 | 0.9879 | 0.0130
0.99 | 0.9626 | 0.0130
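How TPR and FPR follow from a chosen threshold can be sketched as below. The correlation scores are hypothetical values made up for the example and are not taken from the experiments; the function name is likewise an assumption.

```python
def rates_at_threshold(similar_scores, tampered_scores, threshold):
    """Return (TPR, FPR) when a pair is accepted if its score >= threshold."""
    tpr = sum(s >= threshold for s in similar_scores) / len(similar_scores)
    fpr = sum(s >= threshold for s in tampered_scores) / len(tampered_scores)
    return tpr, fpr

# Hypothetical hash correlations for similar and tampered pairs.
similar = [0.995, 0.990, 0.985, 0.975]
tampered = [0.985, 0.960, 0.900, 0.700]

for delta1 in (0.97, 0.98, 0.99):
    print(delta1, rates_at_threshold(similar, tampered, delta1))
```

Raising the threshold lowers the FPR at the cost of a lower TPR, which is the trade-off the table expresses.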

However, the false acceptance rate can be improved further by varying δ1, depending on the application. Here, the ability of the proposed system to detect and localize malicious additions or deletions of image content is evaluated. Classification depends on the percentage of tampered content in the image. Images with more than 29 % tampered content are classified as "heavily tampered" and regarded as a "different image". Images with a tampered region between 1 and 29 % are classified as "small". Of all tampered images, 1.57 % are falsely classified as semantically similar.

According to one embodiment, the receiver operating characteristic (ROC) curve is a suitable means of measuring the performance of a classifier at different thresholds. If two methods have the same FPR, the method with the higher TPR is superior. Performance is compared from four points of view: the trade-off between robustness and discrimination, robustness against CPOs, robustness against malicious operations, and the overall performance of the module. The proposed system has a higher AUC than the other systems, which may be due to the geometric correction. The proposed system therefore offers a better trade-off between robustness and discrimination.

The approaches proposed by Karsh et al. and Yan et al. also show good trade-offs, following the proposed system. Both methods are also robust to some geometric operations, so their performance is comparable to that of the proposed system.

The table below compares the false positive rates (FPR) under tampering when the optimal threshold is set for each of the compared systems.

Operation | Lv et al. | Ouyang et al. | Tang et al. | Yan et al. | Karsh et al. | Proposed system
Large tampering | 0.4042 | 0.2383 | 0.1800 | 0.3425 | 0.0425 | 0.0082
Small tampering | 0.6123 | 0.4692 | 0.4518 | 0.6200 | 0.1207 | 0.0226

The hash codes are found to be robust to most operations, with the exception of gamma correction. The TPR of the proposed system is higher than that of comparable systems. The proposed system is thus robust to brightness and contrast adjustment, salt-and-pepper noise, JPEG compression, watermark embedding and so on, but is somewhat limited under gamma correction.

The robustness of Yan et al. and Karsh et al. is greater than that of the other methods, followed by the proposed method. These systems have a geometric correction on the receiver side. The technique of Ouyang et al. offers very good robustness to rotation and scaling but is sensitive to other factors. In contrast, the method of Lv et al. is robust to watermark embedding because of its SIFT features but is sensitive to other factors. The technique of Tang et al. is robust to rotation because of its ring partition approach but is sensitive to gamma correction, translation and composite RST.

The proposed system has the lowest false acceptance rate among the compared systems. Only 3 of 365 "large tampering" images are falsely classified as similar, while 9 of 398 "small tampering" images are falsely accepted, which is still fewer than for the compared systems. This may be due to the stacked convolution layers, which represent the image with deep features, so that small tampering is adequately detected. The FPR for small-area tampering is higher than for large-area tampering, because small content may not be well reflected in the hash. The FPR is therefore 0.0226 for small tampering and 0.0082 for large tampering. However, when two or more ROC curves of compared systems are intertwined, it can be difficult to tell which system is superior. In that case, the systems can be compared by the area under the ROC curve (AUC), i.e. the area bounded by the ROC curve and the coordinate axes. The AUC of the proposed system is 0.9973, the highest among the compared systems. The proposed system, which uses only global features, performs better than the prior art (Tang et al.), which combines global and local features. The proposed system detects tampering in small areas, which is a major limitation of existing systems; this may be due to the deep learning paradigm.
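The FPR figures and the AUC comparison can be illustrated with a short sketch. The first two lines reproduce the quoted rates from the quoted counts; the ROC construction and the trapezoidal AUC use hypothetical scores, and the function names are assumptions for the example.

```python
def roc_points(positive_scores, negative_scores):
    """(FPR, TPR) points obtained by sweeping the threshold over all scores."""
    thresholds = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= t for s in negative_scores) / len(negative_scores)
        points.append((fpr, tpr))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Rates implied by the counts quoted in the text.
print(round(3 / 365, 4), round(9 / 398, 4))

# Hypothetical correlations: similar pairs (positives) and tampered pairs (negatives).
positives = [0.99, 0.98, 0.97, 0.90]
negatives = [0.95, 0.60, 0.50, 0.40]
print(auc(roc_points(positives, negatives)))
```

Because the AUC condenses the whole curve into one number, it remains usable when the ROC curves of two systems cross.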

When tampering and rotation occur simultaneously, the proposed system identifies the tampered region. In addition, the proposed system restores the rotated image on the receiver side without requiring any information from the sender side, whereas previous methods do require information from the sender side. Although the hash length of the proposed system is larger than that of the compared systems, it offers a higher AUC and a lower false acceptance rate. Furthermore, the proposed system can localize the tampered regions even when tampering and composite RST occur simultaneously, and it is also more robust to CPOs.

According to an alternative embodiment, the weight matrices of the CSDAE can be pre-trained with a restricted Boltzmann machine (RBM) to achieve faster convergence and a lower MSE loss.

According to another alternative embodiment, a binary relaxation term can be added to the loss function to make the hash codes sparse.
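
The description does not specify the form of the binary relaxation term. As a hedged illustration only, the sketch below assumes hash activations in the range [0, 1] and uses the penalty mean(h * (1 - h)), which is zero exactly when every entry is 0 or 1. The names `binary_relaxation_penalty` and `total_loss` and the weight `relaxation_weight` are hypothetical and do not come from the source.

```python
def mse(target, reconstruction):
    """Mean squared reconstruction error of the auto-encoder output."""
    return sum((t - r) ** 2 for t, r in zip(target, reconstruction)) / len(target)


def binary_relaxation_penalty(code):
    """Assumed penalty: pushes each activation in [0, 1] towards 0 or 1."""
    return sum(h * (1.0 - h) for h in code) / len(code)


def total_loss(target, reconstruction, code, relaxation_weight=0.1):
    """Reconstruction loss plus the weighted relaxation term (assumed form)."""
    return mse(target, reconstruction) + relaxation_weight * binary_relaxation_penalty(code)


# A fully binary code incurs no penalty; a code stuck at 0.5 incurs the maximum.
print(binary_relaxation_penalty([0.0, 1.0, 1.0, 0.0]))
print(binary_relaxation_penalty([0.5, 0.5]))
```

Other relaxations, for example penalising the distance of |h| from 1 for codes in [-1, 1], would serve the same purpose; the choice here is only for illustration.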

According to a further alternative embodiment, residual connections can be added to the CSDAE to avoid the vanishing gradient problems encountered during model training.

The figures and the preceding description give examples of embodiments. Those skilled in the art will understand that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, the order of the processes described herein may be changed and is not limited to the manner described herein. Moreover, the actions of a flowchart need not be performed in the order shown, nor do all of the actions necessarily have to be performed. Actions that do not depend on other actions may also be performed in parallel with those other actions. The scope of the embodiments is in no way limited by these specific examples. Numerous variations are possible, whether or not explicitly stated in the description, such as differences in structure, dimensions, and use of materials. The scope of the embodiments is at least as broad as given by the following claims.

Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any components that may cause a benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or component of any or all of the claims.

List of reference signs

100: A neuro-evolutionary detection system for early detection of lung cancer
102: Receiver
104: Geometric transformation correction module
106a: Encoder
106b: Decoder
106: Image authentication module
108: Estimation module
110: Tamper detection and localization module

Claims

A neuro-evolutionary detection system (100) for early detection of lung cancer, the system (100) comprising: a plurality of receivers (102) for receiving an input image and a hash value, wherein the received image is encapsulated with a blind rotation-scaling transformation (RST); a geometric transformation correction module (104) connected to the plurality of receivers (102) for eliminating features of the blind RST transformation by using inherent properties of the geometric transformation and evaluating the rotation condition of the received image, wherein the features of the blind RST transformation are eliminated by calculating the coordinates of boundary pixels having non-zero values to obtain an interpolated image; an image authentication module (106) connected to the geometric transformation module (104) for generating a hash code by an L2 regularization constraint (L2RC) applied to the activation of a final block in an encoder (106a), wherein an estimation module (108), embodied in the image authentication module (106), estimates a correlation coefficient between the received hash and the generated hash, wherein if the correlation coefficient is less than a first threshold, the interpolated image is either a tampered image or a different image; and a tamper detection and localization module (110) connected to the image segmentation module (106) for classifying the image as a tampered image or a different image and identifying the tampered region of the tampered images, wherein the tampered image is mapped to the generated hash code to generate an image map and localize the tampered image.

The system of claim 1, wherein the boundary pixels include the topmost pixel, the rightmost pixel, the leftmost pixel, and the bottommost pixel.

The system of claim 1, wherein the geometric transformation correction module evaluates the direction of rotation, i.e., either clockwise or counterclockwise, and adjusts the rotated image, wherein if the received image is rotated counterclockwise by up to θ degrees, the topmost non-zero coordinate point lies further to the right than the bottommost point, regardless of the degree of translation, and vice versa for an image rotated clockwise by up to -θ degrees, and wherein if the value of θ is substantially equal to θ' degrees, the received image is regarded as a rotated image and is counter-rotated by θ' degrees.
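
The boundary-pixel rule in the two preceding claims can be sketched as follows. This is an illustrative reading only, not the claimed implementation: the image is assumed to be a list of rows of grey values with a zero background, ties within a row or column are broken arbitrarily, and the function names `boundary_pixels` and `rotation_direction` are hypothetical.

```python
def boundary_pixels(image):
    """Return (row, col) of the topmost, bottommost, leftmost and rightmost
    non-zero pixels of a 2D image given as a list of rows."""
    coords = [(r, c) for r, row in enumerate(image)
              for c, value in enumerate(row) if value != 0]
    if not coords:
        raise ValueError("image contains no non-zero pixels")
    return {
        "top": min(coords),                                   # smallest row, then smallest col
        "bottom": max(coords, key=lambda rc: (rc[0], -rc[1])),  # largest row, then smallest col
        "left": min(coords, key=lambda rc: (rc[1], rc[0])),     # smallest col, then smallest row
        "right": max(coords, key=lambda rc: (rc[1], -rc[0])),   # largest col, then smallest row
    }


def rotation_direction(image):
    """Apply the rule: topmost point to the right of the bottommost point
    means counterclockwise rotation, to the left means clockwise."""
    b = boundary_pixels(image)
    top_col, bottom_col = b["top"][1], b["bottom"][1]
    if top_col > bottom_col:
        return "counterclockwise"
    if top_col < bottom_col:
        return "clockwise"
    return "none"


# Toy content rotated counterclockwise inside a zero-padded frame.
ccw = [
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
print(boundary_pixels(ccw))
print(rotation_direction(ccw))
```

Because only the column order of the top and bottom points is compared, the result does not change when the content is shifted within the frame, which matches the statement that the rule holds regardless of translation.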

The system of claim 1, wherein a binary interpolation module (104) calculates a region of interest by adjusting the image to the dimension of the received image while correcting the angle of rotation.

The system of claim 1, wherein the encoder (106a) is a convolutional stacked denoising auto-encoder (CSDAE) having a plurality of blocks for generating the hash image, the CSDAE being a feed-forward neural network, each of the plurality of blocks consisting of a convolutional layer followed by a batch normalization layer and a max pooling layer.

The system of claim 5, wherein the CSDAE hierarchically maps received interpolated images into a latent space reduced to the dimension of the received image, wherein the L2RC is configured to: remove irrelevant components from the hash image by choosing a smallest combination to solve learning problems; and suppress the effects of static noise on targets by preventing overfitting.

The system of claim 1, wherein the tampered region is localized by multiplying the tampered region by the tampered image, wherein if the tampered area is less than 30%, the received image is judged to be a "tampered image" and otherwise a "different image".

The system of claim 1, wherein the first threshold is 0.98 and a second threshold for tamper localization is 0.5.
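
The decision logic implied by the thresholds in the preceding claims can be sketched as follows. This is a sketch under stated assumptions, not the claimed implementation: the correlation coefficient is assumed to be Pearson's, the label "similar image" for the accepted case is an assumption, and the names `pearson`, `classify`, and `tampered_fraction` are hypothetical. The second threshold of 0.5 for tamper localization is not modelled because the claims do not say how it is applied.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length hash vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)


def classify(received_hash, generated_hash, tampered_fraction,
             first_threshold=0.98, area_limit=0.30):
    """Return 'similar image', 'tampered image' or 'different image'.

    tampered_fraction is the share of the image area flagged as tampered
    (0.0 to 1.0); it only matters when the hashes do not correlate."""
    rho = pearson(received_hash, generated_hash)
    if rho >= first_threshold:
        return "similar image"
    if tampered_fraction < area_limit:
        return "tampered image"
    return "different image"


print(classify([1, 2, 3, 4], [1, 2, 3, 4], tampered_fraction=0.0))
print(classify([1, 2, 3, 4], [4, 3, 2, 1], tampered_fraction=0.1))
print(classify([1, 2, 3, 4], [4, 3, 2, 1], tampered_fraction=0.6))
```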

The system of claim 1, wherein each convolutional layer, except for a last layer, is initialized with a plurality of filters each having a kernel size of (3,3), and wherein the CSDAE is trained layer by layer and then fine-tuned as a whole.

The system of claim 1, wherein a block of the decoder (106b) comprises a 2D convolutional layer and an upsampling layer followed by a batch normalization layer.