DE102020214715A1

DE102020214715A1 - NORMALIZED AUTOENCODER

Info

Publication number: DE102020214715A1
Application number: DE102020214715.9A
Authority: DE
Inventors: Sebastian Ziesche
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2020-11-24
Filing date: 2020-11-24
Publication date: 2022-05-25

Abstract

A computer-implemented method (10) for training a generative machine learning model, comprising: obtaining (12) a plurality of input samples in a data space; mapping (14) the input samples into a latent space as a plurality of latent space samples; reconstructing (16) a plurality of synthetic samples in the data space based on the latent space samples; and iteratively optimizing (18) the generative machine learning model based on an optimization function that includes a reconstruction loss and a test statistic, in order to provide a plurality of model parameters.

Description

Technical field

The present invention relates to a computer-implemented method for training a generative machine learning model, a computer-implemented method for generating synthetic data samples using a generative machine learning model, and associated apparatuses. The present invention further relates to an associated computer program element and an associated computer-readable medium.

Background

The development and application of data-driven algorithms in technical systems is becoming increasingly important for digitalization and, in particular, for the automation of technical systems. In technical systems it can be advantageous to generate new data points, and in particular a large number of new data points. This makes it possible, for example, to simulate various future scenarios and to evaluate them statistically.

This subject area includes, for example, the generation of vehicle trajectory prediction data based on previously obtained vehicle trajectory data. An adversarial autoencoder (AAE) is a commonly used generative model. Such approaches can, however, be improved further.

Summary of the invention

According to a first aspect, a computer-implemented method for training a generative machine learning model is provided. The computer-implemented method comprises the following:

- obtaining a plurality of input samples in a data space;
- mapping the input samples into a latent space as a plurality of samples in the latent space;
- reconstructing a plurality of synthetic samples in the data space based on one or more samples of the latent space; and
- iteratively optimizing the generative machine learning model based on an optimization function that includes a reconstruction loss and a test statistic, in order to provide a plurality of model parameters.
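
The four steps above can be sketched end to end in a few lines. This is a minimal illustration under loud assumptions, not the method as claimed: the encoder and decoder are hypothetical one-dimensional affine maps (parameters `a`, `b`, `c`, `d`), the test statistic is an assumed moment-matching distance to a standard normal distribution, the weighting factor `lam` is invented for the example, and gradients are approximated by central differences instead of backpropagation.

```python
import random

def loss(params, xs, lam=1.0):
    """Reconstruction loss plus a weighted test statistic on the latent samples."""
    a, b, c, d = params
    zs = [a * x + b for x in xs]        # map input samples into the latent space
    x_hats = [c * z + d for z in zs]    # reconstruct synthetic samples in the data space
    recon = sum((x - xh) ** 2 for x, xh in zip(xs, x_hats)) / len(xs)
    mean_z = sum(zs) / len(zs)
    var_z = sum((z - mean_z) ** 2 for z in zs) / len(zs)
    # Assumed statistic: squared distance of the first two moments from N(0, 1).
    stat = mean_z ** 2 + (var_z - 1.0) ** 2
    return recon + lam * stat

def train(xs, steps=1000, lr=0.005, eps=1e-5):
    """Iteratively optimize the four parameters by plain gradient descent."""
    params = [1.0, 0.0, 1.0, 0.0]
    for _ in range(steps):
        grads = []
        for i in range(len(params)):
            up = params[:]
            up[i] += eps
            down = params[:]
            down[i] -= eps
            # Central-difference approximation of the partial derivative.
            grads.append((loss(up, xs) - loss(down, xs)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

rng = random.Random(0)
xs = [rng.gauss(2.0, 1.5) for _ in range(200)]   # obtain input samples (toy data)
initial = [1.0, 0.0, 1.0, 0.0]
trained = train(xs)
print(loss(initial, xs), loss(trained, xs))
```

The returned list plays the role of the "plurality of model parameters"; in a realistic setting the encoder and decoder would be neural networks trained by backpropagation.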

According to a second aspect, an apparatus for training a generative machine learning model is provided. The apparatus comprises an input interface configured to receive a plurality of input samples, a processor configured to obtain the plurality of input samples, and an output interface configured to output a trained model and/or a plurality of synthetic data samples generated by the trained model.

The processor is configured, in a training phase, to map the plurality of input samples, via an encoder executed by the processor, into a latent space as a plurality of latent space samples; to reconstruct a plurality of synthetic data samples, via a decoder executed by the processor, based on one or more samples of the latent space; and to iteratively optimize the generative machine learning model, using the processor, based on an optimization function that includes a reconstruction loss and a test statistic.

According to a third aspect, a computer-implemented method for generating synthetic data samples using a generative machine learning model is provided. The method comprises the following:

- configuring a decoder network according to the plurality of model parameters generated according to the method of the first aspect or its embodiments;
- obtaining at least one sample of a generator distribution;
- decoding the at least one sample of the generator distribution using the decoder network, thereby obtaining a further plurality of synthetic data samples in the data space; and
- outputting the further plurality of synthetic data samples.

According to a fourth aspect, an apparatus is provided that is configured to generate synthetic data samples using a generative machine learning model, wherein the generative machine learning model is optionally generated according to the method of the first aspect. The apparatus comprises an input interface configured to receive a plurality of model parameters, optionally generated according to the method of the first aspect. The apparatus comprises a processor configured to obtain, based on a test statistic, one or more samples of the latent space, and configured, according to the plurality of model parameters, to decode the one or more samples of the latent space in order to obtain a further plurality of synthetic data samples in the data space, and to output the further plurality of synthetic data samples.

According to a fifth aspect, a computer program element is provided, comprising at least: (i) computer-executable instructions for training, using machine learning, a generative machine learning model according to the method of the first aspect, and/or (ii) computer-executable instructions for generating synthetic data samples according to the method of the third aspect, and/or (iii) computer-executable instructions comprising model parameters for providing a generative machine learning model trained according to the method of the first aspect.

According to a sixth aspect, a computer-readable medium is provided that comprises one or more of the computer program elements of the fifth aspect.

According to a seventh aspect, a computer-implemented method for training a further machine learning model is provided, comprising:

- obtaining a plurality of synthetic data samples as generated according to the computer-implemented method of the first aspect or its embodiments;
- inputting the plurality of synthetic data samples into a further machine learning model;
- iteratively optimizing the further machine learning model based on an optimization function that includes a reconstruction loss, thereby updating a further plurality of model parameters; and
- outputting the further plurality of model parameters of the further machine learning model.

Accordingly, the present specification describes a new architecture, referred to here as a "normalized autoencoder" (NAE), which can be trained on input data, for example, in order to generate output data whose statistical distribution is similar to the distribution D_i of the plurality of input data.

An adversarial autoencoder (AAE) typically comprises an autoencoder with an additional generative adversarial network (GAN). The GAN forces the distribution D_l of the plurality of data in the latent space of the adversarial autoencoder to be similar to a chosen generator distribution D_g. Typically, in order to obtain new data, an AAE is trained without supervision on the available input data. A sample is then drawn from the distribution D_g and passed through the trained decoder of the AAE. This yields a new output data point that is similar in appearance to the input data on which the AAE was trained.

According to one technique of the present specification, the architecture of a normalized autoencoder (NAE) is obtained by replacing the GAN of an AAE with a test statistic S. The test statistic evaluates the distribution of the plurality of samples in the latent space in order to identify a similarity to a distribution D or to a distribution class C. In the latter case, it also returns information about the distribution in C that is closest to D_l. The test statistic requires no training, yet provides useful information about an expected property of D_l.

GANs require a neural network and therefore entail considerable training complexity; in practice, this means that a large number of vector and matrix multiplication and addition operations are performed in succession. By contrast, many test statistics for the distribution D can be computed with a considerably shorter runtime, for example using a smaller number of vector multiplications. Accordingly, one effect of replacing the GAN of an AAE with a test statistic, as in the NAE presented here, is a reduction in computational complexity, and thus in runtime and/or energy consumption, when generating new data.

In addition, the resulting model produced by the NAE disclosed in aspects herein has a higher fidelity than that of an AAE. In the case of an AAE trained using a GAN, the distribution D is typically fixed. By contrast, the test statistic S, as used in the NAE discussed in aspects herein, can be configured to test only whether the plurality of data in the latent space lies in a given distribution class C. For example, the test of whether the plurality of data in the latent space lies in a given distribution class could be independent of the mean and/or the covariance of the plurality of data in the latent space. In this case, the mean and the covariance of the plurality of data in the latent space would remain flexible, which would make the NAE easier to optimize.
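
A statistic that tests for membership in a class rather than for one fixed distribution can be illustrated with the Jarque-Bera statistic, which depends only on skewness and kurtosis and is therefore unchanged when the samples are shifted or rescaled. The choice of Jarque-Bera is an assumption made for this sketch (the source does not name a particular statistic), and the example is univariate, so "covariance" reduces to the variance.

```python
import math
import random

def jarque_bera(zs):
    """Jarque-Bera statistic: small values indicate a Gaussian-like shape."""
    n = len(zs)
    mean = sum(zs) / n
    m2 = sum((z - mean) ** 2 for z in zs) / n
    m3 = sum((z - mean) ** 3 for z in zs) / n
    m4 = sum((z - mean) ** 4 for z in zs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

rng = random.Random(1)
zs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
# Same shape, but a different mean and variance.
shifted_scaled = [3.0 * z + 7.0 for z in zs]

jb_original = jarque_bera(zs)
jb_transformed = jarque_bera(shifted_scaled)
print(jb_original, jb_transformed)
```

Because both values agree up to rounding, a loss term built on such a statistic leaves the encoder free to choose the mean and scale of the latent samples.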

In addition, GANs, such as those contained in AAEs, are difficult to train and often exhibit unstable training behavior. Using the test statistic S in place of the GAN in the optimization term, as proposed according to the aspects discussed here, simplifies the training of the AAE architecture, since a test statistic S generally has no trainable parameters. The training process is thus reduced to the training that has to be carried out for the autoencoder itself, without the additional stability and training problems introduced by the GAN of an AAE.

The approach of the present specification relates to a method for training a neural decoder network of an NAE on a plurality of input data samples having the distribution D_i, so as to provide a decoder that is able to produce synthetic data samples that closely follow the distribution D_i of the plurality of input data. In other words, a plurality of model parameters of at least a decoder network are provided that are able to generate the plurality of synthetic data samples.

The training of at least the decoder network of the NAE is governed by a loss function with two components. The first component of the loss function is the reconstruction loss (L), which is obtained by comparing input data samples encoded into the latent space with the corresponding reconstructed output samples decoded from the latent space. The second component of the loss function is obtained by applying the test statistic S to a plurality of samples (Z) in the latent space. The test statistic evaluates how similar or close the distribution of its input samples (in this application, the latent space samples) is to the distribution D or to a distribution lying in the distribution class C.

In other words, the test statistic of the NAE evaluates whether the samples at its input form a distribution whose shape is similar to the shape it tests for. The test statistic can, for example, evaluate the kurtosis of an input sample set. The kurtosis of a univariate normal distribution, for example, is 3. In one example, the test statistic replaces the discriminator of a generative adversarial network.
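
The kurtosis example can be made concrete with a short sketch that computes the sample kurtosis (fourth central moment divided by the squared variance) for Gaussian and for uniform samples. The squared deviation from 3 used as a penalty is an assumption for illustration; the source only states that kurtosis can be evaluated and that it equals 3 for a univariate normal distribution.

```python
import random

def kurtosis(samples):
    """Sample kurtosis: fourth central moment over squared variance."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / m2 ** 2

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

k_gauss = kurtosis(gaussian)      # close to 3 for Gaussian samples
k_uniform = kurtosis(uniform)     # close to 1.8 for uniform samples

# Hypothetical penalty term: grows as the shape departs from a Gaussian.
penalty_gauss = (k_gauss - 3.0) ** 2
penalty_uniform = (k_uniform - 3.0) ** 2
print(k_gauss, k_uniform, penalty_gauss, penalty_uniform)
```

Computing this statistic needs only a few passes over the latent samples and involves no trainable parameters, in contrast to a GAN discriminator.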

The approach of the present specification also relates to a method for generating a plurality of synthetic data samples using the trained neural decoder network, according to the plurality of model parameters of at least a decoder network computed in the training phase. Typically, when generating synthetic samples, the decoder draws samples from the chosen generator distribution D_g, decodes them, and outputs synthetic samples whose distribution is similar to the distribution D_i of the plurality of input data used to train the NAE.

Another way of looking at the technique discussed here is to consider the different types of distribution involved. An input data distribution D_i characterizes the distribution of a plurality of input samples X = {x₁, x₂, x₃, ..., x_n}. A distribution D_l in the latent space characterizes the distribution of the latent space samples Z = {z₁, z₂, z₃, ..., z_n} as produced by a trained encoder model. Decoding by a trained decoder model yields a plurality of synthetic samples X̂ = {x̂₁, x̂₂, x̂₃, ..., x̂_n}. Their distribution approximates the distribution D_i of the plurality of input data X = {x₁, x₂, x₃, ..., x_n}. Comparing X with X̂ yields a loss function that can be used in the optimization (backpropagation) of the (neural network of the) encoder model and/or the decoder model.

The present technique also proposes an addition to the loss function. This addition is obtained by evaluating the distribution D_l of the latent space samples Z = {z₁, z₂, z₃, ..., z_n} using a test statistic S. The test statistic can, for example, test the extent to which D_l resembles a Gaussian distribution. If, for example, applying the test statistic S to D_l shows that the distribution of the plurality of latent space samples Z = {z₁, z₂, z₃, ..., z_n} is comparatively non-Gaussian, this is reflected in the loss function. The encoder model and/or the decoder model are adapted over a number of training iterations so that they drive the distribution of the plurality of latent space samples Z = {z₁, z₂, z₃, ..., z_n} toward a more Gaussian shape (in one example).
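
In the notation above, the combined objective can be written out as follows. This is a sketch under stated assumptions: the weighting factor \lambda and the parameter symbols \theta (encoder) and \phi (decoder) do not appear in the source, which only states that the optimization function includes both terms, and S is assumed to be oriented so that smaller values indicate a closer match to D or to the class C.

```latex
\mathcal{L}_{\text{total}}(\theta, \phi)
  = \frac{1}{n} \sum_{i=1}^{n} L\bigl(x_i, \hat{x}_i\bigr) + \lambda \, S(Z),
\qquad
z_i = \mathrm{ENC}_{\theta}(x_i), \quad
\hat{x}_i = \mathrm{DEC}_{\phi}(z_i), \quad
Z = \{z_1, \dots, z_n\}
```

With \lambda = 0 the expression reduces to the plain autoencoder objective, the average reconstruction loss over the input samples.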

In a data generation mode of the NAE, a decoder trained under the influence of the test statistic S is supplied with random samples either from the distribution D (if S tested against D) or from the distribution in the distribution class C that was closest to D_l (if S tested against the class C). Accordingly, the distribution of a further plurality of synthetic samples X̂_s = {x̂_s1, x̂_s2, x̂_s3, ..., x̂_sm} resembles that of the plurality of input data X = {x₁, x₂, x₃, ..., x_n} applied to the encoder and decoder models in the training phase.
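
The class-based variant of the generation mode can be sketched as follows, assuming that the class C is the family of univariate Gaussians and that "closest" is approximated by matching the mean and standard deviation of the latent samples. The decoder and the latent samples below are hypothetical stand-ins, not outputs of a real training run.

```python
import random
import statistics

def closest_gaussian(latent_samples):
    """Pick the member of class C (univariate Gaussians) by moment fitting."""
    mu = statistics.fmean(latent_samples)
    sigma = statistics.pstdev(latent_samples)
    return mu, sigma

def generate(decoder, mu, sigma, m, rng):
    """Draw m random latent samples from N(mu, sigma^2) and decode them."""
    return [decoder(rng.gauss(mu, sigma)) for _ in range(m)]

def decoder(z):
    """Hypothetical trained decoder: an affine map back into the data space."""
    return 1.5 * z + 2.0

rng = random.Random(0)
# Hypothetical latent samples Z recorded at the end of training.
latent_samples = [rng.gauss(0.5, 2.0) for _ in range(500)]

mu, sigma = closest_gaussian(latent_samples)
synthetic = generate(decoder, mu, sigma, m=10, rng=rng)
print(mu, sigma, synthetic[:3])
```

If S tested against one fixed distribution D instead, the fitting step is skipped and the random samples are drawn from D directly.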

List of figures

FIG. 1A schematically illustrates a computer-implemented method for training a generative machine learning model according to a first aspect.
FIG. 1B schematically illustrates a computer-implemented method for generating synthetic data samples using a generative machine learning model according to a third aspect.
FIG. 2 schematically shows the training of a generative machine learning model.
FIG. 3 schematically depicts an apparatus for generating synthetic data using a machine learning model.
FIG. 4 schematically shows an apparatus according to the second aspect.

Detailed description

An improvement to adversarial autoencoders (AAEs) is now presented. In particular, a normalized autoencoder (NAE) is presented, which can be used at least in data generation scenarios in which generative models are currently used. One important application of generative models is, for example, the generation of training data or test data from existing training or test data. This applies, for example, to anomaly detection, where only a small amount of training data contains training samples with an anomaly, compared with training samples without an anomaly.

An anomaly can be, for example, a malfunctioning sensor or other hardware component of a machine. Autonomous driving applications are another example in which the data of interest is not readily available in the required quantities. In practice, obtaining realistic data for unsafe car trajectories is more difficult than obtaining realistic data for safe car trajectories. Accordingly, the present specification discusses at least a method for generating training data, a method for generating test data in order to check whether a trained model can be operated safely, and a generative model for generating the training or test data.

An autoencoder (AE) is a neural network architecture that consists of an encoder neural network and a decoder neural network and is designed to produce a reconstruction loss. The encoder neural network is designed to map from a data space into a latent space. In one example, the latent space has a lower dimension than the data space. The decoder neural network is designed to map from the latent space back into the data space. The reconstruction loss is a measure of how closely a given element of the data space matches its reconstruction from the latent space. The weights of the encoder neural network and the decoder neural network are trained by minimizing the loss between the initial data and the data after they have passed through the autoencoder. In one example, the function Avg(L(data, DEC(ENC(data)))) is minimized.
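The quantity Avg(L(data, DEC(ENC(data)))) can be illustrated with a minimal NumPy sketch. The linear encoder and decoder, the choice of squared error for L, the dimensions, and all names (`enc`, `dec`, `W_enc`, `W_dec`) are assumptions made purely for illustration; the description does not prescribe them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4-dimensional data space, 2-dimensional latent space.
data = rng.normal(size=(100, 4))
W_enc = rng.normal(scale=0.1, size=(4, 2))  # encoder weights (assumed linear)
W_dec = rng.normal(scale=0.1, size=(2, 4))  # decoder weights (assumed linear)

def enc(x):
    """Map samples from the data space into the latent space."""
    return x @ W_enc

def dec(z):
    """Map latent samples back into the data space."""
    return z @ W_dec

def reconstruction_loss(x):
    """Avg(L(x, DEC(ENC(x)))) with L assumed to be the squared error."""
    x_hat = dec(enc(x))
    return float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))

loss = reconstruction_loss(data)
```

Training would adjust `W_enc` and `W_dec` so that `loss` becomes small; with untrained weights, as here, the value is simply a starting point.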

A further development of the autoencoder is the adversarial autoencoder, which comprises an additional generative adversarial network. The additional generative adversarial network introduces an additional loss term. In one example, the additional loss term is proportional to the deviation of the distribution of the encoded input data (the data in the latent space) from a predefined distribution.

In other words, the additional loss term of an AAE approximates the so-called Jensen-Shannon divergence between the two distributions. By minimizing the additional loss term, the encoder of the AAE is forced to encode the input data such that the data in the latent space are distributed approximately according to the predefined distribution.

The output of the training process (the autoencoder) can be provided as a generative model for the data distribution, for example by feeding random samples drawn from a generator distribution (D_g) into the decoder after the AAE has been trained. To generate synthetic data from the generative model, a random sample is drawn from the latent distribution. The random sample is decoded using the decoder trained in the training phase. The decoded sample then has approximately the same distribution as the original input data.

This property of adversarial autoencoders follows from the fact that the data in the latent space are distributed approximately according to the predefined distribution, and that sampling from the predefined distribution is relatively easy to set up. Adversarial autoencoders can, however, be improved further.

According to a first aspect, there is provided a computer-implemented method 10 for training a generative machine learning model, comprising:

- obtaining 12 a plurality of input samples in a data space;
- mapping 14 the plurality of input samples into a latent space as a plurality of samples in the latent space Z;
- reconstructing 16 a plurality of synthetic samples in the data space based on the plurality of samples in the latent space Z; and
- iteratively optimizing 18 the generative machine learning model based on an optimization function that comprises a reconstruction loss and a test statistic, to yield a plurality of model parameters.

FIG. 1A schematically shows a computer-implemented method for training a generative machine learning model according to the first aspect.

The general approach set out here is to replace the generative adversarial network used in an AAE with a test statistic S. Therefore, according to a first aspect, a model for generating test data is provided by using a neural network, for example an autoencoder, to minimize the test statistic instead of minimizing the Jensen-Shannon divergence as a GAN would.

FIG. 2 schematically shows an example of training a generative machine learning model according to an example device, in particular a Normalized Autoencoder (NAE). An NAE comprises an encoder network 13a, a decoder network 15a, a test statistic engine 28 and a reconstruction loss engine 30. The encoder network 13a is configured to map input data in the data space, received from an input interface 12a, into a latent space store 14a.

In one example, the latent space store 14a stores latent space samples generated by the encoder 13a.

In one example, the encoder network 13a and/or the decoder network 15a provide one, two, three or more hidden layers. In one example, the latent space store 14a stores vectors of lower dimensionality than the input data received from the input interface 12a.

The decoder network 15a is configured to map samples generated by and/or stored in the latent space store 14a back into the data space as reconstructed data, which are provided to an output interface 18a of the generative machine learning model. In one example, the encoder network 13a and/or the decoder network 15a are configured as deep neural networks, although this is not strictly necessary.

The reconstruction loss engine 30 is configured to compare, in the data space, an input data item provided by the input interface 12a with a corresponding output data item provided to the output interface 18a. The reconstruction loss engine 30 produces the reconstruction loss L, a measure of how similar these two elements of the data space are. For example, in a successful training operation, the reconstruction loss should decrease with each iteration of backpropagation performed, for example, as part of an iterative gradient descent optimization process.

The test statistic engine 28 receives its input from the latent space vector z held in the latent space store 14a. The test statistic engine 28 outputs a value S(z).

In one example, the test statistic S is evaluated over all samples in the latent space Z. In another example, the test statistic S is evaluated over a subset of the samples in the latent space Z.

In one example, the test statistic engine 28 outputs a relatively small value if, according to the chosen test statistic S, a plurality of input samples z from the latent space store 14a appear to have been drawn from the latent distribution D (or from one of the distributions in C). In one example, the test statistic engine 28 outputs a relatively large value if, according to the chosen test statistic S, the plurality of input samples z do not appear to be a sample of the latent distribution D (or of any distribution in C).

In one example, the test statistic engine 28 outputs a value close to zero if, according to the chosen test statistic S, a plurality of input samples z appear to be a sample of the latent distribution D (or a sample from a distribution in C), and a value close to one if, according to the chosen test statistic S, the plurality of input samples z do not appear to be a sample of the latent distribution D (or of any distribution in C).

In this example, the test statistic engine 28 is configured to produce a test statistic S as a measure of how much the distribution of the data in the latent space differs from a distribution D, or from a distribution class C. In other words, the test statistic engine 28 is configured to produce a test statistic S that provides a mapping from a set of points in the latent space to the real numbers. To train the Normalized Autoencoder (NAE), input samples X = {x_1, x_2, x_3, ..., x_n} are obtained, in a generation mode, from a data source that is to be emulated by the NAE. In one example, the encoder 13a, the decoder 15a and/or the hidden layer are initialized to zero, to a set of random numbers, or to an arbitrary initialization vector. In one example, the initialization condition of the networks is specified as a hyperparameter.

For a first iteration, a reconstruction loss L and a test statistic S are computed. The reconstruction loss is computed as the loss between the input samples X and the input samples X after encoding and subsequent decoding. The test statistic S is computed by encoding the input samples X and evaluating the test statistic on the resulting latent samples.

The total loss is computed in block 32 by combining the reconstruction loss L and the test statistic S.

In subsequent iterations, the NAE is trained by minimizing L(X, DEC(ENC(X))) + S(ENC(X)).

In one example, the reconstruction loss L at each iteration is a function of the plurality of input samples X = {x_1, x_2, x_3, ..., x_n} and of the plurality of synthetic samples X̂ = {x̂_s1, x̂_s2, x̂_s3, ..., x̂_sn} obtained during that iteration. The test statistic S(z) is a function of at least one sample of the latent space Z = {z_1, z_2, z_3, ..., z_n}.
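The combined objective L(X, DEC(ENC(X))) + S(ENC(X)) can be sketched as follows. This is only an illustration under stated assumptions: the encoder and decoder are taken to be linear maps, L is taken to be the mean squared error, and `moment_statistic` is a simple hypothetical stand-in for S that penalizes deviation of the latent samples from zero mean and identity covariance. None of these choices is prescribed by the description.

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    """L: mean squared error between inputs and reconstructions (assumed)."""
    return float(np.mean((x - x_hat) ** 2))

def moment_statistic(z):
    """S: hypothetical stand-in, small when z has zero mean and identity covariance."""
    mean = z.mean(axis=0)
    cov = np.atleast_2d(np.cov(z, rowvar=False))
    return float(np.sum(mean ** 2) + np.sum((cov - np.eye(z.shape[1])) ** 2))

def nae_objective(x, W_enc, W_dec):
    """L(X, DEC(ENC(X))) + S(ENC(X)) for linear ENC and DEC (assumed)."""
    z = x @ W_enc        # ENC(X): samples in the latent space
    x_hat = z @ W_dec    # DEC(ENC(X)): reconstructed samples
    return reconstruction_loss(x, x_hat) + moment_statistic(z)

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
W_enc = rng.normal(scale=0.5, size=(3, 2))
W_dec = rng.normal(scale=0.5, size=(2, 3))
total = nae_objective(x, W_enc, W_dec)
```

The two terms are simply added here, matching the expression in the text; a weighting factor between them would be a further design choice that the passage above does not mention.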

The minimization is carried out, for example, by adjusting the weights in the encoder and decoder networks. Gradient descent can be used, for example. After a number of iterations, or optionally once a convergence criterion is met, the weights of the decoder 15a form a decoder model that can be used as a generative model for producing further arbitrary sequences of synthetic output samples that resemble the input samples.
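A toy version of this iterative weight adjustment is sketched below. It assumes a linear autoencoder mapping a 2-dimensional data space to a 1-dimensional latent space, a stand-in statistic based on the latent mean and variance, and a finite-difference gradient with a step-halving safeguard. All of these are illustrative assumptions; a practical implementation would typically rely on backpropagation in a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * np.array([2.0, 0.5])  # toy input samples

def total_loss(params, x):
    """L(X, DEC(ENC(X))) + S(ENC(X)) for a linear 2 -> 1 -> 2 autoencoder."""
    w_enc = params[:2].reshape(2, 1)
    w_dec = params[2:].reshape(1, 2)
    z = x @ w_enc
    x_hat = z @ w_dec
    recon = np.mean((x - x_hat) ** 2)
    # Stand-in statistic: penalize deviation from zero mean and unit variance.
    stat = z.mean() ** 2 + (z.var() - 1.0) ** 2
    return float(recon + stat)

def numeric_grad(f, params, x, eps=1e-5):
    """Central-difference gradient; real training would use backpropagation."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        grad[i] = (f(params + step, x) - f(params - step, x)) / (2 * eps)
    return grad

params = rng.normal(scale=0.1, size=4)  # random initialization of all weights
initial_loss = total_loss(params, X)
lr = 0.05
for _ in range(300):
    candidate = params - lr * numeric_grad(total_loss, params, X)
    if total_loss(candidate, X) < total_loss(params, X):
        params = candidate  # accept the descent step
    else:
        lr *= 0.5           # step too large: halve the learning rate
final_loss = total_loss(params, X)
decoder_weights = params[2:].reshape(1, 2)  # would serve as the decoder model
```

The fixed iteration count plays the role of the "number of iterations" in the text; a convergence criterion could replace it by stopping once the loss change falls below a threshold.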

In one example, S(z) takes small values if the latent space samples Z appear to be samples of the distribution against which the test statistic tests, and takes large values otherwise.

In one example, the test statistic S(z) is a metric that defines a similarity or dissimilarity between an empirical distribution of the plurality of input samples Z = {z_1, z_2, z_3, ..., z_n} and a predefined distribution or a predefined distribution class.

In one example, the test statistic S(z) is minimized when a sample of the latent space Z at least approximates a sample of the predefined distribution D, or of a distribution in the predefined distribution class C, for which the test statistic is optimized.

In one example, the output value of the test statistic S(z) is based on at least one sample of the latent space Z.

In one example, the test statistic S(z) tests whether one or more latent space samples are drawn from a normal distribution, a multivariate normal distribution, a multivariate exponential distribution, or a discrete random variable distribution in the latent space.

In one example, the test statistic S(z) characterizes the extent to which the latent space samples approximate the shape of the predefined distribution or of the class of predefined distributions, while being independent of the mean or the variance of the predefined distribution or class of predefined distributions.

In one example, the test statistic S(z) is the test statistic for multivariate normality W_{n,β} according to the article "A New Approach to the BHEP Tests for Multivariate Normality" by Henze and Wagner, Journal of Multivariate Analysis 62, pages 1-23 (1997), published by Academic Press.
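As a rough illustration of a BHEP-type statistic, the sketch below computes the weighted squared distance between the empirical characteristic function of the standardized latent samples and the characteristic function of a standard normal distribution, using a Gaussian weight with parameter β for which the integral has a closed form. The function name `bhep_statistic`, the default β = 1, and the exact normalization are assumptions; the cited article should be consulted for the authoritative definition of W_{n,β}.

```python
import numpy as np

def bhep_statistic(z, beta=1.0):
    """BHEP-type normality statistic for latent samples z of shape (n, d)."""
    n, d = z.shape
    centered = z - z.mean(axis=0)
    cov = centered.T @ centered / n
    # Whiten with the inverse symmetric square root of the sample covariance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    y = centered @ inv_sqrt
    sq_norms = np.sum(y ** 2, axis=1)
    # Pairwise squared distances ||y_j - y_k||^2.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * y @ y.T
    b2 = beta ** 2
    term1 = np.exp(-0.5 * b2 * sq_dists).sum() / n ** 2
    term2 = 2.0 * (1.0 + b2) ** (-d / 2) * np.exp(
        -0.5 * b2 / (1.0 + b2) * sq_norms
    ).mean()
    term3 = (1.0 + 2.0 * b2) ** (-d / 2)
    return float(term1 - term2 + term3)

rng = np.random.default_rng(0)
normal_codes = rng.normal(size=(500, 2))       # latent samples that look normal
skewed_codes = rng.exponential(size=(500, 2))  # clearly non-normal latent samples
s_normal = bhep_statistic(normal_codes)
s_skewed = bhep_statistic(skewed_codes)
```

Because the samples are centered and whitened first, the value does not change under shifting or rescaling of the latent samples, which is in line with a statistic that tests for the shape of a distribution class rather than for one fixed mean and variance.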

In one example, the mapping of the plurality of input samples into the latent space as a plurality of samples in the latent space is performed by an encoder 13a, and the reconstruction of a plurality of synthetic samples in a data space from the latent space samples is performed by a decoder 15a.

In one example, iteratively optimizing the generative machine learning model based on an optimization function comprising the reconstruction loss L and the test statistic S comprises iteratively adjusting a plurality of weights in the encoder 13a and/or the decoder 15a based on a combination of the reconstruction loss L and the test statistic S.

In one example, the generative machine learning model is a Normalized Autoencoder.

In one example, the test statistic replaces the generative adversarial network of an adversarial autoencoder. Accordingly, the training process of the Normalized Autoencoder benefits from a higher degree of stability. In one example, the test statistic identifies a kurtosis of a statistical distribution.
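One conceivable kurtosis-based statistic is sketched below for one-dimensional latent samples. Using the squared excess kurtosis as the value of S is an assumption made for illustration, as is the name `kurtosis_statistic`; the description only states that the statistic may identify a kurtosis.

```python
import numpy as np

def kurtosis_statistic(z):
    """Squared excess kurtosis of a 1-D sample; near zero for a normal shape."""
    standardized = (z - z.mean()) / z.std()
    excess_kurtosis = np.mean(standardized ** 4) - 3.0
    return float(excess_kurtosis ** 2)

rng = np.random.default_rng(0)
s_normal = kurtosis_statistic(rng.normal(size=5000))
s_uniform = kurtosis_statistic(rng.uniform(-1.0, 1.0, size=5000))
```

A uniform distribution has a flatter shape than a normal distribution, so `s_uniform` comes out clearly larger than `s_normal` in this sketch.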

In one example, the generative machine learning model is output to a memory and/or stored therein.

According to a second aspect, a device 34 for training a generative machine learning model is provided. The device comprises an input interface 36 configured to receive a plurality of input samples, a processor 38 configured to obtain the plurality of input samples, and an output interface 42 configured to output a trained model and/or a plurality of synthetic data samples generated by the trained model.

The processor 38 is configured, in a training phase, to map the plurality of input samples, via an encoder 13a executed by the processor, into a latent space as a plurality of samples in the latent space, to reconstruct a plurality of synthetic data samples, via a decoder 15a executed by the processor 38, based on the latent space samples, and to iteratively optimize, using the processor, the generative machine learning model based on an optimization function that comprises a reconstruction loss and a test statistic.

FIG. 4 schematically shows a device according to the second and/or the fourth aspect.

In one example, the device 34 is a personal computer, a server, a cloud-based computer or an embedded computer. The memory 40 of the device 34 stores a computer program which, when executed by the processor 38, causes the processor 38 to carry out the functionality described by the computer-implemented methods according to the first or the third aspect.

According to one example, the input interface 36 and/or the output interface is a USB interface, an Ethernet interface, a WLAN interface, or other suitable hardware that enables data samples to be input to and output from the device 34.

In one example, the device 34 further comprises a volatile and/or non-volatile storage system 40 configured to receive input observations as input data from the input interface 36, and, for example, to store a test statistic S or a plurality of latent space samples.

According to a third aspect, there is provided a computer-implemented method 11 for generating synthetic data samples using a generative machine learning model, comprising:

- configuring 20 a generative machine learning model according to a plurality of model parameters;
- obtaining 22, based on a test statistic, a latent space sample from latent variables configured according to the plurality of model parameters;
- decoding 24 the latent space sample to obtain a further plurality of synthetic data samples in the data space; and
- outputting 26 the further plurality of synthetic data samples.

FIG. 3 schematically shows a device for generating synthetic data using a machine learning model.

After the training process described above in connection with the first aspect, synthetic data samples can be obtained whose distribution D approximates or resembles that of the data used to train the NAE. In one example, a decoder network is configured according to the decoder model parameters computed according to the first aspect or its embodiments. A generator distribution engine 29 is sampled by a sampler 31 to provide input samples to the trained decoder 15a'. The prime symbol indicates that the decoder 15a has been configured using model parameters computed in a previous training iteration.

The decoder 15a' receives the samples from the generator distribution engine 29 and decodes them, yielding a plurality of synthetic data samples X̂_s = {x̂_s1, x̂_s2, x̂_s3, ..., x̂_sn}. The plurality of synthetic data samples has a statistical distribution that approximates the distribution of the data X used to train the NAE.

In other words, the generator distribution engine 29 can be understood as replacing the samples of the latent space Z from the training phase at the input of the trained decoder 15a'.

By sampling using the generator distribution engine 29 to provide substitute latent space samples, and computing DEC(z) using the trained decoder network 15a', a set of synthetic output data X̂_s = {x̂_s1, x̂_s2, x̂_s3, ..., x̂_sn} is obtained that closely resembles the data used to train the NAE.
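The sampling step described above can be sketched as follows. This is a minimal illustration and not the implementation of the disclosure: the generator distribution is assumed to be a standard normal distribution, the trained decoder DEC(z) is replaced by a hypothetical affine map, and the names `sample_generator`, `decode`, `weights` and `bias` are placeholders introduced only for this example.

```python
import random


def sample_generator(n, latent_dim, rng):
    """Draw n latent samples from the generator distribution D.

    Assumption: D is a standard normal distribution in each latent dimension.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(n)]


def decode(z, weights, bias):
    """Stand-in for the trained decoder DEC(z).

    Assumption: an affine map from latent space to data space; a real
    decoder would be a trained neural network.
    """
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, bias)]


rng = random.Random(0)
# Hypothetical "trained" parameters: 2-D latent space, 3-D data space.
weights = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
bias = [0.5, -0.5, 0.0]

latents = sample_generator(5, 2, rng)                    # replaces Z from training
synthetic = [decode(z, weights, bias) for z in latents]  # synthetic samples
```

Each entry of `synthetic` plays the role of one synthetic data sample x̂_si; how closely such samples follow the training data depends on how well the latent samples seen in training matched the generator distribution.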

In one example, when the test statistic S tests against a distribution class C, the parameters of the best-matching element D(n) of C are estimated before the latent space Z is sampled. The best-matching element D(n) is then used to sample Z.
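As an illustration of estimating the best-matching element of a distribution class before sampling, the sketch below assumes that the class C consists of Gaussians with diagonal covariance, so that the best-matching element is given by the per-dimension mean and standard deviation of the latent samples observed during training. The class choice and the helper names `fit_diagonal_gaussian` and `sample_fitted` are assumptions made for this example and are not taken from the disclosure.

```python
import math
import random


def fit_diagonal_gaussian(latents):
    """Estimate per-dimension mean and standard deviation of latent samples.

    Assumption: the distribution class C is the family of Gaussians with
    diagonal covariance, so these estimates identify its best-matching element.
    """
    n = len(latents)
    dims = len(latents[0])
    means = [sum(z[d] for z in latents) / n for d in range(dims)]
    stds = [math.sqrt(sum((z[d] - means[d]) ** 2 for z in latents) / n)
            for d in range(dims)]
    return means, stds


def sample_fitted(means, stds, n, rng):
    """Draw n latent samples from the fitted element of the class."""
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]


# Latent samples as they might be produced by the encoder during training.
training_latents = [[0.0, 1.0], [2.0, 3.0]]
means, stds = fit_diagonal_gaussian(training_latents)
new_latents = sample_fitted(means, stds, 4, random.Random(0))
```

The samples in `new_latents` would then be passed to the trained decoder in place of samples from a fixed, predefined distribution.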

According to a fourth aspect, a device 34 for training a generative machine learning model is provided. The device comprises the following:

- an input interface 36 configured to receive a plurality of input samples;
- a processor 38 configured to obtain the plurality of input samples; and
- an output interface 42 configured to output a trained model and/or a plurality of synthetic data samples generated by the trained model; wherein the processor is configured, in a training phase, to map the plurality of input samples, via an encoder executed by the processor, into a latent space as one or more samples of the latent space Z, and to reconstruct a plurality of synthetic data samples, via a decoder 15a executed by the processor, on the basis of the latent space samples; and
- to iteratively optimize, using the processor, the generative machine learning model on the basis of an optimization function that comprises a reconstruction loss and a test statistic, in order to yield a plurality of model parameters.
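The combination of a reconstruction loss and a test statistic in one optimization function can be sketched as below. The concrete choices are assumptions made for illustration only: the reconstruction loss is taken to be a mean squared error, the test statistic is replaced by a simple moment-based measure of how far the latent samples are from zero mean and unit variance, and the weighting factor `lam` is a hypothetical parameter. The disclosure may use a different loss, statistic and weighting.

```python
def reconstruction_loss(inputs, reconstructions):
    """Mean squared error between input samples and their reconstructions."""
    total = 0.0
    count = 0
    for x, x_hat in zip(inputs, reconstructions):
        for a, b in zip(x, x_hat):
            total += (a - b) ** 2
            count += 1
    return total / count


def moment_statistic(latents):
    """Illustrative stand-in for the test statistic S.

    Assumption: S penalizes deviation of each latent dimension from zero
    mean and unit variance; it is zero when both moments match exactly.
    """
    n = len(latents)
    dims = len(latents[0])
    s = 0.0
    for d in range(dims):
        mean = sum(z[d] for z in latents) / n
        var = sum((z[d] - mean) ** 2 for z in latents) / n
        s += mean ** 2 + (var - 1.0) ** 2
    return s


def objective(inputs, reconstructions, latents, lam=1.0):
    """Optimization function: reconstruction loss plus weighted test statistic."""
    return reconstruction_loss(inputs, reconstructions) + lam * moment_statistic(latents)


value = objective([[1.0, 2.0]], [[1.0, 4.0]], [[-1.0], [1.0]])
```

In a training loop, the encoder and decoder weights would be adjusted iteratively to reduce this value; the gradient computation is omitted here.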

According to a fifth aspect, a computer program element is provided, comprising at least: (i) computer-executable instructions for training, using machine learning, a generative machine learning model according to the method of the first aspect or its examples, and/or (ii) computer-executable instructions for generating synthetic data samples according to the third aspect, and/or (iii) computer-executable instructions comprising model parameters for providing a generative machine learning model trained according to the first aspect.

The computer program elements of the fifth aspect may, for example, comprise machine-readable instructions that can be stored in a computer memory.

In one example, the plurality of input samples X comprise one or more examples of system anomalies, and the plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model comprise a plurality of synthetic system anomalies. An anomaly is a pattern that deviates from an expected result. Accordingly, a generative machine learning model trained according to the first aspect is configured to generate a series of synthetic system anomalies. These synthetic system anomalies can in turn be used to train a further machine learning model (e.g., an anomaly detector configured to detect system anomalies in an input stream of system data). For example, the input samples may be obtained from a sensor (e.g., a thermal monitoring device) of a machine (e.g., an engine), and a system anomaly may represent a predetermined state of the machine (e.g., a fault state, such as an overheating condition of the engine based on a previous engine speed). The synthetic system anomalies can be used to train a further machine learning model configured to detect a state of the machine on the basis of an input stream of system data.

In other examples, the input samples X comprise one or more examples of data representing internal states of a technical system (e.g., a machine). The data representing internal states can be determined on the basis of sensor data and/or simulated for the respective technical system. In these examples, the plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model may comprise a plurality of synthetic examples of data representing internal states of a technical system. Accordingly, a generative machine learning model trained according to the first aspect is configured to generate synthetic examples of data representing internal states of a technical system. The synthetic examples of data representing internal states of the technical system can be used to train a further machine learning model (e.g., a machine learning model for monitoring and/or controlling the technical system).

Additionally or alternatively, the input samples X comprise one or more examples of data representing the environment of a technical system (e.g., a machine). The data representing the environment can be determined on the basis of sensor data and/or simulated for the respective technical system. In these examples, the plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model may comprise a plurality of synthetic examples of data representing the environment of a technical system. Accordingly, a generative machine learning model trained according to the first aspect is configured to generate synthetic examples of sensor data representing the environment of a technical system. The synthetic examples of data representing the environment of the technical system can be used to train a further machine learning model (e.g., a machine learning model for monitoring and/or controlling the technical system).

Additionally or alternatively, the input samples X comprise one or more examples of sensor data, and the plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model comprise a plurality of synthetic examples of sensor data. The sensor data can be obtained, for example, for monitoring and/or controlling a technical system (e.g., a machine). Accordingly, a generative machine learning model trained according to the first aspect is configured to generate synthetic examples of sensor data. The synthetic examples of sensor data can be used to train a further machine learning model (e.g., a machine learning model for monitoring and/or controlling the technical system).

In still further examples, the input samples comprise examples of image or video data (e.g., a vector or an array of semantically segmented image data, e.g., pixel data). The plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model may comprise a plurality of synthetic examples of image or video data. The image or video data may be obtained by one or more sensors (e.g., a camera-based sensor, an ultrasonic sensor, a RADAR sensor, a LIDAR sensor, or a thermal sensor) and/or may be synthetic image data. Accordingly, a generative machine learning model trained according to the first aspect is configured to generate synthetic examples of image or video data. The synthetic examples of image or video data can be used to train a further machine learning model (e.g., a machine learning model for image or video data classification).

The input samples can be obtained, for example, from an industrial production machine or production line. The input samples may comprise a vector of sensor inputs from all or a subset of the sensors contained in the industrial production machine or production line. The sensors are configured to detect at least one of the following: the angular position of one or more servomotors, the power consumption of the one or more servomotors, the visual appearance of a region of the industrial production machine or production line, feedback signals from a plurality of hydraulic or pneumatic actuators, and the like. In this case, a system anomaly may represent an object that is processed incorrectly or that becomes jammed in the industrial production machine or production line. The sensor signals will then deviate in many respects from those expected in the normal production process. In addition, a subset of the sensor signals may provide advance warning of the anomaly. The plurality of synthetic data samples Xs therefore comprise a plurality of synthetic sensor states of industrial production machines or production lines.

In one example, the plurality of input samples X comprise one or more vehicle or robot trajectories in two- or three-dimensional space (for example, in Euclidean space), and the plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model comprise a plurality of synthetic vehicle trajectories. In the case of a vehicle, an anomaly may represent a vehicle that strays into a wrong lane or that follows a trajectory that does not comply with traffic regulations. In the case of a robot, an anomaly may be a robot following a trajectory that leads to a collision with an object in the robot's environment. Accordingly, a further machine learning model configured with the model generated according to the first aspect is configured to generate a synthetic example of a plurality of synthetic vehicle trajectories.

In one example, the plurality of input samples X comprise one or more robot arm trajectories in three-dimensional space (for example, in Euclidean space or in polar space). In this case, an anomaly may be that the robot arm collides with a stationary object within its range of motion, or that the robot arm attempts to move through a singularity (in the case of a robot arm with multiple joints). The plurality of synthetic data samples Xs generated by the device for generating synthetic data using a machine learning model comprise a plurality of synthetic robot arm trajectories.

In one example, the plurality of input samples X comprise a plurality of samples representing the state of a wireless communication network at a base station, a UE (User Equipment, i.e., a handset), or another element or network control element such as a RAN (Radio Access Node), an eNB (Evolved Node B), or a gNB. For example, the plurality of input samples X could comprise a figure of merit that summarizes the performance of the wireless network. In a 5G or 4G system, for example, the plurality of input samples X could comprise one or more of the following: ACP (Adjacent Channel Power), EVM (Error Vector Magnitude) of received symbols, SNIR (Signal to Noise and Interference Ratio), SNR (Signal to Noise Ratio), SS-RSRP (Synchronization Signal Reference Signal Received Power), and the like. An advantage in this respect is that a generative model characterizing rare wireless network events can be provided in order to train a further wireless network. Accordingly, a further machine learning model configured with the model generated according to the first aspect is configured to generate synthetic examples of a plurality of wireless network states or states of individual components of a wireless network.

It is not essential that the plurality of input samples be scheduled in time according to their acquisition time. The input samples can, for example, be obtained on the basis of the position of an actuator in three-dimensional space, the state of a phase plane, or the like.

In the cases discussed above, providing training data using the generative machine learning model according to the first aspect and its embodiments means that a further machine learning model can be trained more efficiently and more accurately. The further machine learning model (which is not restricted to being an autoencoder) can be supplied with training input data containing realistic anomaly information that occurs far more frequently than the anomalies observed in the normal operation of a given industrial system, autonomous vehicle, robot arm, or the like. The further machine learning model therefore converges faster and thus consumes less energy during training, because candidate anomalies occur more frequently in training data provided according to the first aspect or its embodiments.
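One way to picture the higher anomaly frequency in the training data is the following sketch, which mixes normal samples with synthetic anomalies so that anomalies make up a chosen fraction of the resulting set. The function name `build_training_set`, the labeling scheme (0 for normal, 1 for anomaly) and the parameter `anomaly_fraction` are assumptions introduced for this example; the disclosure does not prescribe a particular mixing procedure.

```python
import random


def build_training_set(normal, synthetic_anomalies, anomaly_fraction, rng):
    """Return shuffled (sample, label) pairs with the requested anomaly share.

    Assumption: label 0 marks a normal sample and label 1 a synthetic anomaly.
    Anomalies are drawn with replacement from the synthetic pool.
    """
    if not 0.0 <= anomaly_fraction < 1.0:
        raise ValueError("anomaly_fraction must be in [0, 1)")
    # Number of anomalies needed so that they form the requested share.
    n_anom = round(len(normal) * anomaly_fraction / (1.0 - anomaly_fraction))
    chosen = [rng.choice(synthetic_anomalies) for _ in range(n_anom)]
    data = [(x, 0) for x in normal] + [(x, 1) for x in chosen]
    rng.shuffle(data)
    return data


normal_samples = [[0.1 * i, 1.0] for i in range(8)]   # placeholder sensor vectors
anomaly_pool = [[5.0, 9.0], [6.0, 8.5]]               # placeholder synthetic anomalies
training_set = build_training_set(normal_samples, anomaly_pool, 0.2, random.Random(1))
```

With eight normal samples and a requested share of 0.2, the resulting set contains two anomalies out of ten samples, a far higher rate than would typically be observed in normal operation.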

The examples provided in the drawings and described in the written description above are intended to provide an understanding of the principles of the disclosure. This is not intended to limit the scope of the disclosure. The present disclosure describes changes and modifications to the illustrated examples. Only the preferred examples have been presented, and any changes, modifications, and further applications thereof within the scope of the disclosure are intended to be protected.

Claims

A computer-implemented method (10) for training a generative machine learning model, comprising: - obtaining (12) a plurality of input samples (X) in a data space; - mapping (14) the input samples into a latent space as a plurality of samples in the latent space; - reconstructing (16) a plurality of synthetic samples in the data space based on one or more samples of the latent space (Z); and - iteratively optimizing (18) the generative machine learning model based on an optimization function comprising a reconstruction loss (L) and a test statistic (S) to yield multiple model parameters.

Computer-implemented method (10) for training a generative machine learning model according to claim 1, wherein the generative machine learning model is an autoencoder or a normalized autoencoder.

Computer-implemented method (10) according to claim 1 or 2, wherein the reconstruction loss in at least one iteration is a function of the plurality of input samples and the plurality of synthetic samples, and the test statistic is a function of at least one sample in the latent space (Z).

Computer-implemented method (10) according to any one of claims 1 to 3, wherein the test statistic is a metric that defines a similarity or dissimilarity of an empirical distribution of the plurality of input samples and a predefined distribution or a predefined distribution class.

Computer-implemented method (10) according to claim 4, further comprising: - minimizing the test statistic when a sample of the latent space at least approximates a sample of the predefined distribution or the predefined distribution class.

Computer-implemented method (10) according to any one of claims 1 to 5, wherein the value of the test statistic is based on at least one sample of the latent space (Z).

A computer-implemented method (10) as claimed in any preceding claim, further comprising: - testing, using the test statistic, the predefined distribution or the class of predefined distributions, to perform an evaluation for the existence of a normal distribution and/or a multivariate exponential distribution and/or a discrete random variable distribution.

Computer-implemented method (10) according to claim 7, further comprising: - iteratively optimizing the generative machine learning model based on an optimization function comprising the reconstruction loss and the test statistic by iteratively adjusting multiple weights in the encoder and/or the decoder based on a combination of the reconstruction loss and the test statistic.

Computer-implemented method (10) according to one of the preceding claims, wherein the generative machine learning model is a normalized autoencoder and/or wherein the test statistic replaces a generative adversarial network of an adversarial autoencoder.

A computer-implemented method (10) as claimed in any preceding claim, further comprising: - obtaining, as the multiple input samples, one or more examples of system anomalies, wherein the multiple synthetic data samples include multiple synthetic system anomalies, or obtaining, as the multiple input samples, one or more vehicle trajectories, wherein the multiple synthetic data samples include multiple synthetic vehicle trajectories.

A computer-implemented method (11) for generating synthetic data samples using a generative machine learning model, comprising: - configuring (20) a decoder network (15a') according to the plurality of model parameters calculated according to the method of any one of claims 1 to 10; - obtaining (22) at least one sample of a generator distribution (D); - decoding (24) the at least one sample of the generator distribution (D) using the decoder network, thereby obtaining a further plurality of synthetic data samples in the data space; and - outputting (26) the further plurality of synthetic data samples.

Apparatus (34) for training a generative machine learning model, comprising: - an input interface (36) arranged to receive a plurality of input samples; - a processor (38) arranged to obtain the plurality of input samples; and - an output interface (42) arranged to output a trained model and/or a plurality of synthetic data samples generated by the trained model; wherein the processor (38) is adapted, in a training phase: - to map the plurality of input samples, via an encoder executed by the processor, into a latent space as one or more samples of the latent space (Z); - to reconstruct, via a decoder executed by the processor, a plurality of synthetic data samples based on one or more samples of the latent space (Z); and - to iteratively optimize, using the processor, the generative machine learning model based on an optimization function that includes a reconstruction loss and a test statistic to yield multiple model parameters.

One or more computer program elements comprising at least: (i) computer-executable instructions for training, using machine learning, a generative machine learning model according to the method of any one of claims 1 to 10, and/or (ii) computer-executable instructions for generating synthetic data samples according to claim 11, and/or (iii) computer-executable instructions comprising model parameters for providing a generative machine learning model trained according to any one of claims 1 to 10.

Computer-readable medium comprising one or more of the computer program elements according to claim 13.

A computer-implemented method (11) for training a further machine learning model, comprising: - obtaining a plurality of synthetic data samples generated according to the computer-implemented method of claim 11; - inputting the plurality of synthetic data samples into a further machine learning model; - iteratively optimizing the further machine learning model based on an optimization function comprising a reconstruction loss (L) to thereby update a further plurality of model parameters; and - outputting the further plurality of model parameters of the further machine learning model.