DE102020211214A1

DE102020211214A1 - System and method for detecting anomalies in images

Info

Publication number: DE102020211214A1
Application number: DE102020211214.2A
Authority: DE
Inventors: Mehmet Gulsun; Vivek Singh; Alexandru TURCEA
Original assignee: Siemens Healthcare GmbH
Current assignee: Siemens Healthineers Ag De
Priority date: 2020-09-07
Filing date: 2020-09-07
Publication date: 2022-03-10

Abstract

The invention describes a method for generating a system (6) for detecting anomalies in images, comprising the following steps:
- providing a generative network and/or an autoencoder (10) ("G/A network"), a Siamese network (8), a first training data set (T1) comprising normal images, and a second training data set (T2) comprising abnormal images,
- training the G/A network (10) to generate latent data (LD) from input images (IP) and output images (OP) from the latent data (LD), the training being carried out with images of the first training data set, a loss function being used for the training at least at the beginning of the training, the loss function increasing the similarity of the input images (IP) and the respective output images (OP),
- training the Siamese network (8) to generate similarity measures (S) between input images (IP) and respective output images (OP), the training being carried out with images of the first training data set (T1) and of the second training data set (T2) in that images of both training data sets (T1, T2) are used as input images (IP) for the G/A network (10) and output images (OP) of the G/A network (10) are compared with their respective input images (IP) by the Siamese network (8).

The invention further describes a related system and method for detecting anomalies in images and a related imaging system.

Description

The invention describes a system and a method for detecting anomalies in images, a method for generating such a system, and a medical imaging system. The anomalies in images are preferably detected using confidence scores generated by the system.

In the technical field of machine learning, systems are sometimes used, in particular systems comprising a deep neural network framework, that are capable of learning from a set of training images and generating new images with the same characteristics as the training images. There are several embodiments of such systems, which differ in their internal arrangement but have in common that they comprise a first neural network ("input network") that converts an input image into a set of systematic data, for example features or an encoding (hereinafter "latent data"), and a second neural network ("output network") that generates images from the latent data. The output images generated from the latent data of input images should look similar to the input images.

Such systems are, for example, "generative networks", specific embodiments being generative adversarial networks (GAN) or generative query networks (GQN). Other examples are autoencoders, in particular variational autoencoders (VAE). Apart from being used for compressing images, they (especially their output network) can also be used to generate photorealistic objects, for example faces, that are entirely fictitious.

An example of a network in the technical field of the invention is an autoencoder, i.e. an artificial neural network architecture that is able to learn a representation (encoding) for an image (typically for dimensionality and/or noise reduction) based on unsupervised learning of data encodings. Along with the encoding, a reconstruction side (decoding) is trained, with the autoencoder generating output images from the reduced encoding (latent data); the output images should be representations that come as close as possible to the original input images. Accordingly, an autoencoder comprises an input layer, an encoding network (as the input network), a latent space holding the encoding (latent data), a decoding network (as the output network), and an output layer. Special autoencoders are variational autoencoders (VAE), which are generative models.
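The encoder, latent space and decoder arrangement described above can be sketched as follows. This is only a structural illustration: `make_autoencoder`, `toy_encode` and `toy_decode` are hypothetical names, and the hand-written pair-averaging encoder stands in for trained neural networks, which the text does not specify.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def make_autoencoder(
    encode: Callable[[Vector], Vector],
    decode: Callable[[Vector], Vector],
) -> Callable[[Vector], Tuple[Vector, Vector]]:
    """Chain an input (encoding) network and an output (decoding) network."""
    def forward(image: Vector) -> Tuple[Vector, Vector]:
        latent = encode(image)   # latent data (reduced encoding)
        output = decode(latent)  # output image rebuilt from the latent data
        return latent, output
    return forward


def toy_encode(image: Vector) -> Vector:
    """Hypothetical encoder: average each pair of neighbouring pixels."""
    return [(image[i] + image[i + 1]) / 2.0 for i in range(0, len(image) - 1, 2)]


def toy_decode(latent: Vector) -> Vector:
    """Hypothetical decoder: repeat every latent value twice."""
    return [value for value in latent for _ in range(2)]


autoencoder = make_autoencoder(toy_encode, toy_decode)
latent, output = autoencoder([0.0, 0.2, 1.0, 0.8])
print(len(latent), len(output))  # the latent data is half the size of the image
```

The output has the same size as the input but is only an approximation of it, because the latent data no longer contains the individual pixel values.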

The networks are often used for generating images or for recognizing and classifying images.

A disadvantage of generative networks is that the training is very complicated and can lead to systematic errors.

A serious problem of recognition and classification networks is their lack of reliability, especially when deep learning neural networks are used, because the credibility of the output depends heavily on the training. For example, training on noisy or incorrect images could easily result in mispredictions being made with a high level of confidence. In particular, in the classification of medical images, for example in anomaly detection tasks, the problem of imbalanced data arises, which makes the deep learning training process difficult.

Accordingly, a major disadvantage of recognition and classification networks is that there is no measure of the reliability of the output. With regard to medical images in particular, it should be noted that precise annotation of medical images requires a great deal of effort and expense and can be subject to uncertainties in the images. In addition, medical images could be noisy, for example as a result of dose reduction. Because training a deep learning model on noisy or uncertain ground truth data could lead to incorrect results with a high level of confidence, this is, as mentioned above, particularly problematic for clinical decision making.

The object of the present invention is to improve the known systems, devices and methods in order to enable an improvement in image processing and in particular in the training of a recognition and classification network. A further preferred object is to generate a (confidence) score as a measure of reliability for the detection and classification of anomalies in images.

This object is achieved by a method according to claim 1, a system according to claim 10, a method for detecting anomalies according to claim 12 and an imaging system according to claim 13.

This invention relates to neural networks, which are algorithms or models that need to be trained. The basic principles of machine learning are well known to those skilled in the art. Apart from a suitable loss function for solving a specific problem (also well known), the nature of the training data and the ground truth (often labels applied to the data) are crucial. Accordingly, if the loss function is clear, a (trained) neural network or a group of (trained) neural networks can be defined by the specific training procedure.

The system according to the invention for detecting anomalies in images is trained using a special method.

A method according to the invention for generating a system for detecting anomalies in images, in particular in medical images, comprises the following steps:

- Providing a generative network and/or an autoencoder.

Typically either a generative network or an autoencoder (network) is provided; however, a combination of a generative network and an autoencoder could also be provided. The generative network and/or the autoencoder (network) are hereinafter also referred to as the "G/A network" to improve readability. As mentioned above, the basics of G/A networks are well known to those skilled in the art. The respective networks comprise a first neural network ("input network") that converts an input image into a set of systematic data, for example features or an encoding (hereinafter "latent data" or "latent features"), and a second neural network ("output network") that generates images from the latent data. The output images generated from the latent data of input images should look similar to the input images. Loss functions for such G/A networks are well known to those skilled in the art. In particular, the G/A network has not yet been trained. However, it could also be an already trained G/A network that is now trained further according to the method.

- Providing a Siamese network.

A Siamese network (sometimes also referred to as a "Siamese neural network" or "twin neural network") is an artificial neural network that uses the same weights while acting in parallel on two different input vectors in order to compute comparable output vectors. With regard to the field of the invention, the input vectors here are the input images and output images of a generative network, or its latent data (compared with latent data generated from the output images). The output vector of this Siamese network is a similarity measure. Accordingly, images generated by the G/A network are compared with their respective original images. The basic principles of training Siamese networks are well known. For example, the training can be accomplished with a triplet loss or a contrastive loss. In particular, the Siamese network has not yet been trained. However, it could also be an already trained network that is now trained further according to the method.

- Providing a first training data set comprising normal images.

The term "normal images" means images of objects in their normal state, where "normal" denotes a predefined state that the objects should be in, or a correct state. With regard to medical images, "normal" means the healthy state. With regard to product quality, "normal" means the defect-free or desired state. The nature of the images determines what kind of problem the system will later solve. It is preferred that the images are medical images (for an application in medicine) or images of products (for quality assurance). For example, the images of the first data set are images of a healthy organ or body region.

The images of the first training data set are preferably labeled as normal images in order to provide a ground truth. However, the mere knowledge that images of the first data set are being used for the training could also serve as ground truth.

- Providing a second training data set comprising abnormal images.

The term "abnormal images" means images of objects in a state different from the normal state, i.e. a state that the objects should not be in, or an incorrect state. With regard to medical images, "abnormal" means a pathological state. With regard to product quality, "abnormal" means a faulty or defective state. It is clear that the abnormal images show the same kind of objects as the normal images, with the difference that the objects are now not normal. Following the previous example, the images of the second data set are images of an unhealthy organ or body region.

The images of the second training data set are preferably labeled as abnormal images in order to provide a ground truth. However, the mere knowledge that images of the second data set are being used for the training could also serve as ground truth.

- Training the G/A network to generate latent data from input images and output images from the latent data, the training being carried out with images of the first data set, a loss function being used for the training at least at the beginning of the training (later, the Siamese network could possibly replace the loss function), the loss function increasing the similarity of the input images and the respective output images.

The G/A network uses its input network to generate latent data, in particular an encoding or feature set, from the normal images of the first training data set. So that the input and output of the G/A network can be compared later, the output network generates images from the latent data.

Suitable loss functions that increase the similarity of the input images and the respective output images are well known to those skilled in the art, for example reconstruction loss functions, which relate to the similarity of the images, or perceptual loss functions, which relate to the latent data of the images. For a perceptual loss, the output images must be processed again by the first network (or by a network identical to the first network) in order to generate latent data, in particular an encoding or feature set, of the output images. Accordingly, the G/A network is trained to learn how to reconstruct examples from normal images, for example from a majority class, which for a medical anomaly detection task often comprises the normal examples. The G/A network could, for example, have an architecture such as a variational autoencoder or a generative adversarial network.
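The difference between the two kinds of loss function can be illustrated with a minimal sketch. It assumes a mean squared error as the distance in both cases, which the text does not prescribe; `toy_encode` is a hypothetical stand-in for the trained input network.

```python
from typing import Callable, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def reconstruction_loss(image_in: Sequence[float], image_out: Sequence[float]) -> float:
    """Compares input and output directly in image space."""
    return mse(image_in, image_out)


def perceptual_loss(
    encode: Callable[[Sequence[float]], Sequence[float]],
    image_in: Sequence[float],
    image_out: Sequence[float],
) -> float:
    """Compares the latent data of the input with the latent data obtained
    by passing the output image through the input network again."""
    return mse(encode(image_in), encode(image_out))


def toy_encode(image: Sequence[float]) -> list:
    """Hypothetical input network: average each pair of neighbouring pixels."""
    return [(image[i] + image[i + 1]) / 2.0 for i in range(0, len(image) - 1, 2)]


image_in = [0.0, 0.2, 1.0, 0.8]
image_out = [0.1, 0.1, 0.9, 0.9]  # a reconstruction that differs pixel by pixel

pixel_loss = reconstruction_loss(image_in, image_out)
latent_loss = perceptual_loss(toy_encode, image_in, image_out)
print(pixel_loss, latent_loss)
```

In this toy case the reconstruction differs from the input in every pixel, so the reconstruction loss is non-zero, while both images map to practically the same latent data, so the perceptual loss is close to zero.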

It should be noted that a G/A network does not aim to produce an identical output of images, i.e. an output in which the pixels of the input images are processed pixel by pixel. The latent data always represent features or an encoding of the image that does not include the direct values and/or coordinates of pixels. Accordingly, because it is trained exclusively on normal images (for example images of a healthy organ), the G/A network does not "understand" abnormal states of the object. Accordingly, output images of abnormal images (for example images of a pathological organ) that are processed by the G/A network later (not during training) have a lower similarity to their respective input images than when normal images are processed. The G/A network is preferably trained using a large training data set with report statements, with no explicit annotation being required.

- Training the Siamese network to generate similarity measures (also referred to as "similarity metrics") between input images and respective output images, the training being carried out with images of the first training data set and of the second training data set in that images of both data sets are used as input images for the G/A network and output images of the G/A network are compared with their respective input images by the Siamese network.

Of course, the Siamese network needs to know whether a normal or an abnormal image is being input in order to learn differences in similarity. As mentioned above, direct labels of the images can be used, or the training data set used is itself the label. The basic principles of generating a similarity measure are well known; the similarity measure is typically a value that is higher the better the similarity and lower the poorer the similarity. In short, the task of the Siamese network is to learn whether two input images are similar or different, so that a similarity metric is learned based on the reconstructed output of a G/A network. It is preferably trained to maximize the similarity measure when a normal sample and its reconstruction are provided, and to minimize the similarity measure when an abnormal sample and its reconstruction are provided.
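The training objective for the Siamese network can be sketched with a common form of the contrastive loss mentioned earlier. The linear embedding, the mapping `exp(-distance)` from distance to similarity, and the `margin` parameter are assumptions made for this illustration and are not taken from the text.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def embed(image: Sequence[float], weights: Matrix) -> List[float]:
    """Shared branch: the same weights are applied to whichever image is passed in."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]


def embedding_distance(image_in, image_out, weights: Matrix) -> float:
    """Euclidean distance between the two embeddings."""
    e_in, e_out = embed(image_in, weights), embed(image_out, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e_in, e_out)))


def similarity(image_in, image_out, weights: Matrix) -> float:
    """Similarity in (0, 1]: high for close embeddings, low for distant ones."""
    return math.exp(-embedding_distance(image_in, image_out, weights))


def contrastive_loss(image_in, image_out, is_normal: bool,
                     weights: Matrix, margin: float = 1.0) -> float:
    """Normal pairs are pulled together (similarity rises); abnormal pairs
    are pushed at least `margin` apart (similarity falls)."""
    dist = embedding_distance(image_in, image_out, weights)
    if is_normal:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical, untrained weights
sample = [0.2, 0.4]
print(similarity(sample, sample, identity))
print(contrastive_loss(sample, sample, False, identity))
```

Minimizing this loss over pairs of input images and their G/A reconstructions, with `is_normal` taken from the labels or from the data set the image came from, is one way to obtain the behaviour described above.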

The similarity measures generated by the Siamese network can preferably be used in an active learning setup to guide an annotation process with recommendations, in particular for further training of the G/A network.

A system according to the invention for detecting anomalies in images comprises the following components:

- a generative network and/or an autoencoder that has been trained by the method according to the invention,
- a Siamese network that has been trained by the method according to the invention. This Siamese network is connected to the G/A network so that it is able to compare input images with their output images generated by the G/A network.

While there could be alternative arrangements for the training, it is preferred that this arrangement (or another arrangement relating to the system described below) is also used for training the networks.

A method according to the invention for detecting anomalies in images comprises the following steps:

- providing an image as input for a system according to the invention,
- receiving a similarity measure for this image (through the output of the Siamese network),
- if the similarity measure is beyond a predefined similarity threshold, classifying an abnormality in the image, in particular depending on the area of the abnormality, which can preferably be achieved with a classification network described below.

Optionally, input images may be processed multiple times to produce multiple output images and respective similarity measures, with this multiple processing preferably being performed on each input image. Accordingly, there are two or more similarity measures for each input image. It is thus possible and advantageous to generate a probability distribution (or a non-normalized similarity distribution) over the similarity measures.
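One way to summarize the similarity measures obtained from repeated processing of one input image is sketched below. It assumes similarity values in [0, 1]. The histogram with fixed-width bins, the bin count, and the function names are illustrative assumptions, not part of the described method.

```python
import statistics


def similarity_statistics(similarities):
    """Mean and sample standard deviation of repeated similarity measures."""
    if len(similarities) < 2:
        raise ValueError("at least two similarity measures are required")
    return statistics.mean(similarities), statistics.stdev(similarities)


def similarity_histogram(similarities, bins=5):
    """Return (counts, probabilities) over equal-width bins on [0, 1].

    counts is the non-normalized distribution; probabilities sums to 1.
    """
    if any(not 0.0 <= s <= 1.0 for s in similarities):
        raise ValueError("similarities must lie in [0, 1]")
    counts = [0] * bins
    for s in similarities:
        # A similarity of exactly 1.0 is placed in the last bin.
        index = min(int(s * bins), bins - 1)
        counts[index] += 1
    total = len(similarities)
    return counts, [c / total for c in counts]
```

A wide spread of the measures for a single image can be read as an indication that the system is uncertain about that image.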

It is clear that "beyond" a threshold means "in a range of abnormal events". For a typical similarity measure, where good similarity results in a high value and poor similarity results in a low value, "beyond" means below the threshold.
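The detection step with this threshold convention can be sketched as follows. The callables `reconstruct` and `compare` are hypothetical stand-ins for the trained G/A network and the trained Siamese network; their names and signatures are assumptions made for illustration.

```python
def detect_anomaly(image, reconstruct, compare, threshold):
    """Return (similarity, is_abnormal) for one input image.

    reconstruct: stand-in for the G/A network (input image -> output image).
    compare:     stand-in for the Siamese network (two images -> similarity).
    An image counts as abnormal when its similarity lies below the threshold.
    """
    output = reconstruct(image)
    similarity = compare(image, output)
    return similarity, similarity < threshold


# Toy stand-ins: the "network" cannot reproduce pixel values above 0.5,
# and similarity is one minus the mean absolute pixel difference.
def toy_reconstruct(image):
    return [min(pixel, 0.5) for pixel in image]


def toy_compare(a, b):
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

With these stand-ins, an image the toy network reproduces exactly yields a similarity of 1 and is not flagged, whereas an image with bright pixels it cannot reproduce falls below a threshold of, for example, 0.8.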

A control device according to the invention for controlling an imaging system comprises a system according to the invention. Alternatively or additionally, it is designed to carry out the method according to the invention. The control device can comprise additional units or devices for controlling components of an imaging system, for example a sequence control unit for measurement sequence control, a memory, a transmission device that generates, amplifies and emits radiation, a magnet system interface, a radiation receiving device for detecting signals and/or a reconstruction unit for reconstructing image data.

An imaging system according to the invention comprises a control device according to the invention. Accordingly, an imaging system according to the invention comprises a system according to the invention and/or is designed to carry out a method according to the invention. Preferred imaging systems are medical imaging systems, such as computed tomography (CT) systems or magnetic resonance imaging systems.

Some units or modules of the above-mentioned system or control device may be implemented in whole or in part as software modules running on a processor of a system or control device. An implementation largely in the form of software modules can have the advantage that applications already installed on an existing system can be updated with relatively little effort in order to install and run these units of the present application. The object of the invention is also achieved by a computer program product with a computer program that can be loaded directly into the memory of a device of a system or of a control device of an imaging system and that comprises program units for executing the steps of the method according to the invention when the program is executed by the control device or the system. In addition to the computer program, such a computer program product can also include other parts such as documentation and/or additional components, including hardware components such as a hardware key (dongles, etc.) to enable access to the software.

A computer-readable medium such as a memory stick, a hard disk or another transportable or permanently installed carrier can be used to transport and/or store the executable parts of the computer program product so that they can be read by a processor unit of a control device or system. A processor unit may include one or more microprocessors or their equivalents.

Particularly advantageous embodiments and features of the invention are indicated by the dependent claims, as set out in the following description. Features from different claim categories can be combined as appropriate to create further embodiments not described here.

According to a preferred method, the G/A network is trained by comparing an input image with its output image generated by the G/A network. The images can be compared directly, or indirectly by means of data generated from the images by identical processes. It is preferred that the comparison is performed using a reconstruction loss function. It is preferred to compare the images directly (comparison of the image data). Alternatively or additionally, data generated by a first network of the G/A network, in particular latent data, is compared with data that has additionally been processed by the first network, in particular using a perceptual loss function. The first network of the G/A network is preferably an encoder network, so that the data is an encoding or a feature set of the image. It is preferred that the training is supported by an already trained (second) Siamese network in order to replace a reconstruction loss function and/or a perceptual loss function.
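The indirect comparison can be illustrated with a minimal sketch of a perceptual loss: the input image and the output image are both passed through the same encoder, and the resulting feature sets are compared. The mean squared difference and the toy encoder used here are assumptions for illustration; in the described method the encoder would be the first network of the G/A network.

```python
def perceptual_loss(input_image, output_image, encode):
    """Mean squared difference between encoder features of two images.

    encode: stand-in for the encoder network (image -> list of features).
    """
    features_in = encode(input_image)
    features_out = encode(output_image)
    if len(features_in) != len(features_out):
        raise ValueError("feature sets must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(features_in, features_out)]
    return sum(squared) / len(squared)


def toy_encode(image):
    """Hypothetical encoder: mean and maximum intensity as two features."""
    return [sum(image) / len(image), max(image)]
```

With this toy encoder, `perceptual_loss([0.0, 1.0], [1.0, 0.0], toy_encode)` is zero although the pixel values differ, which shows how a comparison in feature space can tolerate differences that a direct pixel-wise comparison would penalize.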

According to a preferred method, the Siamese network is trained to generate similarity measures between the input images and respective output images directly by comparing the images (comparison of the image data). Alternatively or additionally, the Siamese network is trained to generate similarity measures between the input images and respective output images indirectly, using data generated by a first network of the G/A network, in particular latent data, which is compared with data of output images that have additionally been processed by the first network. The first network of the G/A network is preferably an encoder network, so that the data is an encoding or a feature set of the image.

According to a preferred method, the G/A network is a generative adversarial network (GAN) or an autoencoder, in particular a variational autoencoder (VAE). A generative adversarial network (GAN) is a class of machine learning frameworks in which two neural networks compete with each other.

According to a preferred method, the Siamese network is trained to generate a similarity threshold on the similarity measures, the similarity threshold indicating that an abnormal image has been processed by the G/A network. It preferably learns the threshold on the similarity measure using both normal and abnormal examples from the training data sets. This is advantageous for a final binary decision ("normal" or "abnormal"), but is not required when samples are ranked by their similarity. It is clear that the threshold is chosen so that it separates normal images from abnormal images.
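A simple way to choose such a threshold from the similarity measures of normal and abnormal examples is sketched below. The exhaustive search over observed values and the accuracy criterion are assumptions for illustration; the description leaves open how the threshold is learned.

```python
def learn_threshold(normal_scores, abnormal_scores):
    """Pick the threshold that classifies the most training examples correctly.

    Convention: an image is abnormal when its similarity is below the
    threshold, and normal when it is at or above the threshold.
    """
    candidates = sorted(set(normal_scores) | set(abnormal_scores))
    best_threshold, best_correct = None, -1
    for threshold in candidates:
        correct = sum(s >= threshold for s in normal_scores)
        correct += sum(s < threshold for s in abnormal_scores)
        # Strict comparison keeps the lowest threshold among ties.
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold
```

For well-separated score sets, the returned value is the lowest normal score, so every abnormal example falls below it.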

According to a preferred method, the Siamese network is designed in such a way that it normalizes similarity measures, the normalization preferably being based on a validation data set. A validation data set is a separate training data set. The validation data of the validation data set are mainly used to tune hyperparameters of a network and to monitor the behavior of the loss during training. Normalized similarity measures can be defined as a "confidence value". Using this confidence value (or another comparison of the similarity measure with predefined values), automated annotations can be made on the basis of predictions with a high confidence value. Using the similarity measure as a confidence value has the advantage that clinicians can be provided with a measure of the reliability of the system output.
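The normalization can be sketched, for example, as a min-max scaling whose bounds are taken from the similarity measures observed on the validation data set. Min-max scaling and the clipping to [0, 1] are assumptions chosen for illustration; other normalizations are equally conceivable.

```python
def fit_normalizer(validation_scores):
    """Return a function mapping raw similarity measures to [0, 1].

    The bounds are the smallest and largest similarity measure observed
    on the validation data set.
    """
    low, high = min(validation_scores), max(validation_scores)
    if high == low:
        raise ValueError("validation scores must not all be equal")

    def normalize(score):
        value = (score - low) / (high - low)
        # Scores outside the validation range are clipped.
        return min(1.0, max(0.0, value))

    return normalize
```

The returned value could then be reported as the confidence value alongside the binary decision.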

Preferably, the system (and/or the training arrangement) comprises a classification network connected to the output of the Siamese network, preferably such that it receives a similarity measure if that measure is beyond a predefined similarity threshold, in particular only in that case, so that normal images (with a high similarity measure) are not classified. The classification network receives output images or latent data of the G/A network, or input images, as input and classifies its input data. For example, normal images of a healthy coronary artery are not classified because they have a high similarity measure (the G/A network is trained on healthy coronary arteries), while abnormal images of diseased coronary arteries have a low similarity measure. Accordingly, the image of a diseased coronary artery is preferably further classified into the categories "calcified", "non-calcified" or "mixed".
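The gating of the classification network by the similarity measure can be sketched as follows. The callable `classify` is a hypothetical stand-in for the trained classification network, and the `features` argument stands for whichever input it receives (output image, latent data or input image).

```python
def triage(similarity, threshold, classify, features):
    """Classify only images whose similarity lies below the threshold.

    Images at or above the threshold are reported as normal and never
    reach the classification network.
    """
    if similarity >= threshold:
        return "normal"
    return classify(features)
```

For example, `triage(0.95, 0.8, classify, features)` returns "normal" without calling `classify`, whereas a similarity of 0.4 passes the features on, and the result is whatever category the classifier assigns, such as "calcified".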

According to a preferred method, a classification network is additionally trained so that it is able to classify images using similarity measures generated by the Siamese network and/or data generated by the G/A network. The classification network is preferably trained using images of the second training data set and/or the latent data of the G/A network and/or the output images of the G/A network generated from the second training data set as the input feature set. Preferably, the classification network is trained using the latent data and/or the output images of the G/A network and/or the similarity measure of the Siamese network. The similarity measure is preferably used to filter out "normal examples". Accordingly, by training a classifier on abnormal examples, the preferred method can be extended to support multi-class prediction.

According to a preferred method, the Siamese network is trained to generate a spatially resolved similarity measure of images. This means that it is monitored where in the image the similarity is high and where it is low. In particular, this can be achieved by segmenting an image into partial images and/or segmenting an image stack (e.g. a 3D image) into image slices and/or using the coordinates of pixels of the images. It is preferred that a classification depends on an area in an image and on the respective similarity measure of this area. For example, if a low similarity is found in an area where the heart is normally located, it is assumed that a disease of the heart is present.
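A spatially resolved similarity measure obtained by segmenting an image into partial images can be sketched as follows. The sketch assumes 2D images given as nested lists with pixel values in [0, 1] and substitutes one minus the mean absolute pixel difference for the learned similarity measure of the Siamese network; both are assumptions for illustration.

```python
def patch_similarity_map(image, reconstruction, patch):
    """Return a 2D list with one similarity value per patch x patch tile.

    Similarity per tile is 1 minus the mean absolute pixel difference,
    a simple stand-in for the learned Siamese similarity measure.
    """
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(0, rows, patch):
        row_scores = []
        for c in range(0, cols, patch):
            diffs = [
                abs(image[i][j] - reconstruction[i][j])
                for i in range(r, min(r + patch, rows))
                for j in range(c, min(c + patch, cols))
            ]
            row_scores.append(1.0 - sum(diffs) / len(diffs))
        result.append(row_scores)
    return result
```

Tiles with a low value mark the regions in which the output image deviates from the input image, and a region-dependent classification could start from these tiles.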

According to a preferred method, the training is end-to-end training. Alternatively or additionally, results generated by the Siamese network are used to further train the G/A network, and/or results generated by the G/A network are used to further train the Siamese network.

In a preferred system according to the invention, components of the system are part of a data network, the data network and an (in particular medical) imaging system preferably being in data communication with one another. The data network preferably comprises parts of the Internet and/or a cloud-based computing system, and the system according to the invention or a number of components of this system are preferably implemented in this cloud-based computing system. For example, the components of the system are part of a data network, with the data network and a medical imaging system that provides the image data preferably being in communication with one another. Such a networked solution could be implemented through an Internet platform and/or in a cloud-based computing system.

The method can also include elements of "cloud computing". In the technical field of "cloud computing", an IT infrastructure, for example storage space or processing power and/or application software, is provided via a data network. Communication between the user and the "cloud" is achieved through data interfaces and/or data transfer protocols.

In the context of "cloud computing", according to a preferred embodiment of the method according to the invention, data is provided to a "cloud" via a data channel (for example a data network). This "cloud" includes a (remote) computing system, for example a computer cluster, which typically does not include the user's local machine. In particular, this cloud can be made available by the medical facility that also provides the (medical) imaging systems. In particular, the image acquisition data is sent to a (remote) computer system (the "cloud") via a RIS (radiology information system) or a PACS (picture archiving and communication system).

Within the scope of a preferred embodiment of the system according to the invention, the components mentioned above are located on the "cloud" side. A preferred system also includes a local computing unit, which is connected to the system via a data channel (for example a data network, in particular one designed as a RIS or PACS). The local computing unit has at least one data receiving interface for receiving data. Furthermore, it is preferred if the local computer also has a transmission interface for sending data to the system.

With the invention, annotation processes can be made more efficient and less costly through active learning, with the Siamese network flagging high-uncertainty cases from an unlabeled pool for annotation, or high-uncertainty outliers from a labeled pool for further inspection.
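The selection of cases for annotation can be sketched as follows. The description does not define how uncertainty is computed; this sketch assumes that a case is more uncertain the closer its similarity measure lies to the decision threshold. The function name and the dictionary layout are likewise assumptions.

```python
def select_for_annotation(scores, threshold, k):
    """Return the identifiers of the k most uncertain cases.

    scores: dict mapping a case identifier to its similarity measure.
    Uncertainty is assumed to grow as the measure approaches the threshold.
    """
    ranked = sorted(scores, key=lambda case_id: abs(scores[case_id] - threshold))
    return ranked[:k]
```

For instance, `select_for_annotation({"a": 0.95, "b": 0.52, "c": 0.1, "d": 0.45}, 0.5, 2)` returns `["b", "d"]`, the two cases closest to the threshold of 0.5.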

Other objects and features of the present invention will become apparent from the following detailed descriptions considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for the purposes of illustration and not as a definition of the limits of the invention.

The figures show:

FIG. 1 a simplified CT system according to an embodiment of the invention,
FIG. 2 an embodiment for training a G/A network with an estimation of a reconstruction loss,
FIG. 3 an embodiment for training a G/A network with an estimation of a perceptual loss,
FIG. 4 a schematic embodiment of a system according to the invention with a Siamese network,
FIG. 5 a schematic embodiment of a system according to the invention with a Siamese network and a classification network,
FIG. 6 a schematic embodiment of a system according to the invention with a Siamese network and
FIG. 7 a block diagram of the process flow of a preferred training method according to the invention.

In the diagrams, like numbers refer to like objects throughout. Objects in the diagrams are not necessarily drawn to scale.

FIG. 1 shows a simplified computed tomography system 1 with a control device 5, which comprises a system 6 for carrying out the method according to the invention. As usual, the computed tomography system 1 has a scanner 2 with a gantry, in which an X-ray source 3 with a detector 4 rotates around a patient and records raw data RD, which are later reconstructed into images by the control device 5.

It should be noted that the exemplary embodiment according to this figure is only one example of an imaging system and that the invention can in principle also be used in any imaging system employed in a medical or non-medical environment. Likewise, only those components that are essential for explaining the invention are shown. In principle, such imaging systems and associated control devices are known to those skilled in the art and therefore do not need to be explained in detail.

The imaging system (here the CT system 1) records images that are used for training the system 6 according to the invention, and after training, images of the imaging system are processed by the system 6 according to the invention.

To generate a training data set (first training data set T1 and second training data set T2), a user can examine CT images and label them as normal or abnormal images (e.g. images showing disease-induced changes). The examination can be carried out at a terminal 7 that can communicate with the control device 5. This terminal can also be used to examine results of the system 6 according to the invention.

FIG. 2 shows an embodiment for training a G/A network 10, here an autoencoder 10, with an estimate of a reconstruction loss. In this example the G/A network is preferably a variational autoencoder (VAE) and comprises an input layer 11, an encoding network 12 as the first network 12, a latent space 13 (which can also be referred to as the "feature space") for storing the latent data LD, a decoding network 14 as the second network 14, and an output layer 15.

Input images IP are provided to the input layer 11 and encoded by the encoding network 12, which forms latent data LD (for example, a feature set) in the latent space 13. The latent data LD are then decoded again by the decoding network 14, and the output layer 15 provides output images OP that should resemble the input images (but are not identical to them).

By comparing the input images IP with their respective output images OP, it can be determined how well the G/A network 10 is tuned. Training can be achieved by applying a loss function that maximizes this similarity.

In this example, input images IP are compared directly with their output images OP by a reconstruction loss function.
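As a minimal sketch of such a direct comparison, the following Python snippet computes a mean squared error between an input image and its reconstruction. The description does not specify the form of the reconstruction loss; mean squared error, the function name `reconstruction_loss`, and the nested-list image representation are assumptions chosen purely for illustration.

```python
from typing import List

Image = List[List[float]]  # hypothetical representation: rows of pixel intensities


def reconstruction_loss(input_image: Image, output_image: Image) -> float:
    """Mean squared pixel difference between an input image IP and its output image OP.

    Assumption: MSE stands in for the unspecified reconstruction loss.
    """
    if len(input_image) != len(output_image):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for row_in, row_out in zip(input_image, output_image):
        if len(row_in) != len(row_out):
            raise ValueError("rows must have the same length")
        for p_in, p_out in zip(row_in, row_out):
            total += (p_in - p_out) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


ip = [[0.0, 1.0], [1.0, 0.0]]
op = [[0.0, 0.5], [1.0, 0.0]]
loss = reconstruction_loss(ip, op)  # one of four pixels differs by 0.5
```

A lower value indicates that the output image is closer to the input image, so minimizing this quantity during training increases their similarity.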

FIG. 3 shows an embodiment for training a G/A network 10 with an estimate of a perceptual loss. The network is similar to the G/A network 10 shown in FIG. 2, with the difference that the output images OP are encoded again by an encoding network 12. This encoding network 12 should have the same settings as the encoding network 12 that encodes the input images and may be identical to it.

In this example, encoded input images IP are compared with encoded output images OP by a perceptual loss function.
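The comparison in encoded space can be sketched as follows. This is an illustration under stated assumptions, not the method of the description: `toy_encoder` is a hypothetical stand-in for the encoding network 12 (it simply returns one mean value per image row), and the mean squared difference of the encodings is an assumed choice of perceptual loss.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # hypothetical representation: rows of pixel intensities
Encoder = Callable[[Image], Sequence[float]]


def toy_encoder(image: Image) -> List[float]:
    """Hypothetical stand-in for the encoding network 12: one mean value per row."""
    return [sum(row) / len(row) for row in image]


def perceptual_loss(input_image: Image, output_image: Image, encoder: Encoder) -> float:
    """Mean squared difference between the encodings of IP and OP.

    The same encoder is applied to both images, mirroring the shared encoding network.
    """
    feat_in = encoder(input_image)
    feat_out = encoder(output_image)
    if len(feat_in) != len(feat_out) or len(feat_in) == 0:
        raise ValueError("encodings must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(feat_in, feat_out)) / len(feat_in)


ip = [[0.0, 1.0], [1.0, 0.0]]
op = [[1.0, 0.0], [0.0, 1.0]]        # differs pixel by pixel, but has the same row means
shifted = [[1.0, 1.0], [1.0, 0.0]]   # first row mean differs from that of ip

same_features = perceptual_loss(ip, op, toy_encoder)
different_features = perceptual_loss(ip, shifted, toy_encoder)
```

The example illustrates the design point: two images that differ pixel by pixel can still have a perceptual loss of zero when their encodings agree, whereas a pixel-wise reconstruction loss would penalize them.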

FIG. 4 shows a schematic embodiment of a system according to the invention with a Siamese network 8. A G/A network 10 (as shown in FIG. 2) generates output images OP from input images IP. In contrast to FIG. 2, a Siamese network 8 compares the input images IP with their respective output images OP. In the training phase, the Siamese network 8 is trained with normal and abnormal images so that it is able to determine a similarity measure S (also referred to as a "similarity metric"). An example of a training procedure is shown in FIG. 7. The output of the Siamese network 8 is the similarity measure S.

FIG. 5 shows a schematic embodiment of a system 6 according to the invention with a Siamese network 8 and a classification network 9. The arrangement shown in FIG. 4 is extended by a classification network 9 that processes the results of the Siamese network 8. Here, normal results with a high similarity are not processed by the classification network 9. However, when the similarity falls below a certain predefined threshold, the classification network 9 processes the results of the Siamese network 8 and derives possible classifications for the input images IP (which are then abnormal and show pathologies, for example).
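The threshold gating described above can be sketched in a few lines of Python. The function name `detect_anomaly`, the placeholder classifier, and the threshold value 0.8 are hypothetical and serve only to illustrate the control flow; the classification network 9 itself is not modelled.

```python
from typing import Callable, List, Optional

calls: List[float] = []  # records which similarity values reached the classifier


def placeholder_classifier(similarity: float) -> str:
    """Hypothetical stand-in for the classification network 9."""
    calls.append(similarity)
    return "abnormal"


def detect_anomaly(similarity: float, threshold: float,
                   classify: Callable[[float], str]) -> Optional[str]:
    """Return None for a normal image, otherwise the label produced by the classifier.

    The classifier is only invoked when the similarity falls below the threshold.
    """
    if similarity >= threshold:
        return None  # high similarity: treated as normal, classifier is skipped
    return classify(similarity)


normal_result = detect_anomaly(0.95, 0.8, placeholder_classifier)
abnormal_result = detect_anomaly(0.40, 0.8, placeholder_classifier)
```

In this sketch only the second call reaches the classifier, which reflects the statement that normal results with a high similarity are not processed further.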

FIG. 6 shows a schematic embodiment of a system 6 according to the invention with a Siamese network 8. This example is similar to the example shown in FIG. 4, with the difference that encoded input images IP are compared with encoded output images OP by the Siamese network 8.

FIG. 7 shows a block diagram of the process flow of a preferred training method according to the invention.

In step I, an (untrained) G/A network 10 is provided.

In step II, an (untrained) Siamese network 8 is provided.

In step III, a first training data set T1, which includes normal images, is provided.

In step IV, a second training data set T2, which includes abnormal images, is provided.

In step V, the G/A network 10 is trained to generate latent data LD from input images IP and output images OP from the latent data LD (see, for example, FIG. 2). The training is carried out with images of the first training data set T1, and a loss function that increases the similarity of the input images IP and the respective output images OP is used at least at the beginning of the training.

In step VI, the Siamese network 8 is trained to generate similarity measures S between input images IP and respective output images OP. The training is carried out with images of the first training data set T1 and the second training data set T2 in that images of both training data sets T1, T2 are used as input images for the G/A network 10 and the output images OP of the G/A network 10 are compared by the Siamese network 8 with their respective input images IP.
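The data flow of step VI can be sketched as follows: images from both training data sets are passed through the already trained G/A network, and each input image is paired with its output image for the Siamese network. The label convention (1 for pairs from T1, 0 for pairs from T2), the flattened image representation, and the `toy_reconstruct` function are assumptions made for this sketch; the description does not prescribe them.

```python
from typing import Callable, List, Tuple

Image = List[float]              # hypothetical flattened image
Pair = Tuple[Image, Image, int]  # (input image IP, output image OP, label)


def toy_reconstruct(image: Image) -> Image:
    """Hypothetical stand-in for a G/A network trained only on normal images.

    Assumption: it cannot reproduce intensities above 1.0 and clips them instead.
    """
    return [min(value, 1.0) for value in image]


def build_siamese_pairs(t1: List[Image], t2: List[Image],
                        reconstruct: Callable[[Image], Image]) -> List[Pair]:
    """Pair every image from T1 and T2 with its reconstruction.

    Label 1 marks pairs from the normal set T1, label 0 pairs from the abnormal set T2.
    """
    pairs: List[Pair] = []
    for image in t1:
        pairs.append((image, reconstruct(image), 1))
    for image in t2:
        pairs.append((image, reconstruct(image), 0))
    return pairs


t1 = [[0.2, 0.4]]  # toy "normal" image
t2 = [[0.2, 3.0]]  # toy "abnormal" image with an out-of-range intensity
pairs = build_siamese_pairs(t1, t2, toy_reconstruct)
```

In this toy setting the normal image is reproduced exactly while the abnormal image is not, which is the kind of difference the Siamese network is trained to turn into a similarity measure S.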

While the present invention has been disclosed in the form of preferred embodiments and variations thereof, it should be understood that numerous additional modifications and variations could be made to it without departing from the scope of the invention. In the interest of clarity, it is to be understood that the use of "a" or "an" in this application does not exclude a plurality and that "comprising" does not exclude other steps or elements. The mention of a "unit" or a "device" does not preclude the use of more than one unit or device.

Claims

Method for creating a system (6) for detecting anomalies in images, comprising the following steps: - providing a generative network and/or an autoencoder (10), - providing a Siamese network (8), - providing a first training data set (T1) that includes normal images, - providing a second training data set (T2) that includes abnormal images, - training the generative network and/or the autoencoder (10) to generate latent data (LD) from input images (IP) and output images (OP) from the latent data (LD), the training being carried out with images of the first training data set, wherein a loss function is used for the training at least at the beginning of the training, the loss function increasing the similarity of the input images (IP) and the respective output images (OP), - training the Siamese network (8) to generate similarity measures (S) between input images (IP) and respective output images (OP), the training being carried out with images of the first training data set (T1) and the second training data set (T2) in that images of both training data sets (T1, T2) are used as input images (IP) for the generative network and/or the autoencoder (10) and output images (OP) of the generative network and/or the autoencoder (10) are compared by the Siamese network (8) with their respective input images (IP).

Method according to claim 1, wherein the generative network and/or the autoencoder (10) is trained by comparing an input image (IP) with its output image (OP) generated by the generative network and/or the autoencoder (10), in particular using a reconstruction loss function, and/or by comparing data generated by a first network (12) of the generative network and/or the autoencoder (10) with data from output images (OP) that were additionally processed by the first network (12), in particular using a perceptual loss function, wherein the training is preferably supported by a second Siamese network (8) that has already been trained to replace a reconstruction loss function and/or a perceptual loss function.

Method according to any one of the preceding claims, wherein the Siamese network (8) is trained to generate similarity measures (S) between the input images (IP) and respective output images (OP) directly by comparing the images and/or indirectly by comparing data generated by means of a first network (12) of the generative network and/or the autoencoder (10), which is in particular an encoder network (12), with data generated by additional processing of the respective output images (OP) with the first network (12).

Method according to one of the preceding claims, in which the generative network and/or the autoencoder (10) is a generative adversarial network (GAN) or a variational autoencoder (VAE).

Method according to any one of the preceding claims, wherein the Siamese network (8) is trained to generate a similarity threshold on the similarity measures (S), the similarity threshold indicating the processing of an abnormal image by the generative network and/or the autoencoder (10).

Method according to one of the preceding claims, in which the Siamese network (8) is designed in such a way that it normalizes similarity measures (S), the normalization preferably being based on a validation data set.

Method according to one of the preceding claims, in which a classification network (9) is additionally trained so that it is able to classify images using similarity measures (S) generated by the Siamese network (8), the classification network (9) preferably being trained using images of the second training data set (T2) and/or the latent data (LD) and/or the output images (OP) generated by the generative network and/or the autoencoder (10) from the second training data set (T2), and preferably also using the similarity measure (S) of the Siamese network (8).

Method according to one of the preceding claims, wherein the Siamese network (8) is trained to generate a spatially resolved similarity measure (S) of images, with a classification preferably depending on an area in an image and the respective similarity measure (S) of this area.

Method according to any one of the preceding claims, wherein the training is end-to-end training, and/or wherein results generated by the Siamese network (8) are used to further train the generative network and/or the autoencoder (10), and/or wherein results generated by the generative network and/or the autoencoder (10) are used to further train the Siamese network (8).

System for detecting anomalies in images, comprising: - a generative network and/or an autoencoder (10) that has been trained by the method according to any one of the preceding claims, - a Siamese network (8) that has been trained by the method according to any one of the preceding claims and is connected to the generative network and/or the autoencoder (10) in order to compare input images (IP) with their output images (OP) generated by the generative network and/or the autoencoder (10).

System according to claim 10, comprising a classification network connected to the output of the Siamese network (8) such that it preferably receives a similarity measure (S) if that measure is beyond a predefined similarity threshold.

Method for detecting anomalies in images with a system (6) according to claim 10 or 11, comprising the following steps: - providing an image as an input image (IP) for the system (6), - receiving a similarity measure (S) for this image from the system (6), - if the similarity measure (S) lies beyond a predefined similarity threshold, classifying an abnormality in the image, in particular depending on the area of the abnormality, preferably with a classification network trained according to claim 7, - wherein input images (IP) are optionally processed multiple times in order to generate multiple output images (OP) and respective similarity measures (S) from one input image (IP) and in particular to generate a probability distribution over the similarity measures (S).

Imaging system (1) which comprises a system (6) according to claim 10 or 11 and/or is designed to carry out a method according to claim 12.

Computer program product comprising a computer program that can be loaded directly into a system (6) or a control device (5) for an imaging system (1) and that comprises program elements for carrying out steps of the method according to any one of claims 1 to 9 or 12 when the computer program is executed by the system (6) or the control device (5).

Computer-readable medium storing program elements that can be read and executed by a computer unit in order to carry out steps of the method according to any one of claims 1 to 9 or 12 when the program elements are executed by the computer unit.