EP4081950A1

EP4081950A1 - System and method for the quality assurance of data-based models

Info

Publication number: EP4081950A1
Application number: EP20830273.7A
Authority: EP
Inventors: Sebastian NIEHAUS; Michael Diebold; Janis REINELT; Daniel LICHTERFELD
Original assignee: Aicura Medical GmbH
Current assignee: Aicura Medical GmbH
Priority date: 2019-12-23
Filing date: 2020-12-23
Publication date: 2022-11-02
Also published as: CN114930346A; WO2021130340A1; EP3843011A1; US20230023058A1

Abstract

The invention relates to a system that has a classifier that is formed by a discriminative neural network and realises a binary class model or a multi-class model. The system also has a model-based sample generator that is formed by a generative neural network. Both the classifier and the model-based sample generator are – for a corresponding class – trained with the same training data records and therefore embody models that correspond to one another for this class. The invention also relates to a method for determining a quality criterion for input data records for a classifier with a discriminative neural network. The classifier is trained with training data records and embodies a classification model for a class. According to the method, first a model-based sample generator with a generative neural network is provided and trained with the same training data records with which the classifier was trained. Then, by means of the trained model-based sample generator and an input data record based on random values, an artificial data record is generated that is representative of the classification model embodied by the classifier. The artificial data record generated by the trained generator, or at least one parameter derived therefrom, is used to check the input data records with regard to their suitability for a classification or regression.

Description

System and method for the quality assurance of data-based models

The invention relates to a system and a method for quality assurance of data-based models.

The invention relates in particular to a system and a method for quality assurance of classifiers or regressors formed by discriminative neural networks, the classifiers serving to determine whether objects, states or events represented by a respective input data record belong to a class or to one of several classes. Regressors, on the other hand, output a numerical value, e.g. an age for a person if the regressor processes a picture of that person. For example, binary classifier units are known which, for an object, event or state represented by a respective data record, indicate the affiliation or non-affiliation of this object, this state or this event with a class for which the binary classifier unit is trained.

An input data set is typically a vector or a matrix. If an object is represented by a vector, for example, the vector contains values that describe certain properties of the object, for example a value for the weight property, a value for the size property and a value for the gender property. A corresponding vector could look like this for a 1.90 m tall, 100 kg, male person, for example: (2,190,100). In this vector, for example, 2 stands for the gender (1 = female, 2 = male), 190 for the body height of 190 cm and 100 for the body weight, namely 100 kg. A neural network is typically structured in such a way that there is an input node for each value, i.e. three input nodes in the example mentioned, of which one input node is provided for gender, a second input node for height and a third input node for weight. The latter, however, does not apply to so-called LSTM neural networks (LSTM: long short-term memory). The values contained in an input data record are assigned to the (input) nodes of the input layer. The input nodes supply their output values as input values to typically several (or all) nodes of the next layer of the artificial neural network. An (output) node in an output layer of the artificial neural network finally supplies the membership value, which indicates with what probability (or whether) an object, event or state represented by the input data set belongs to a certain class. Typically, several intermediate layers (hidden layers) are provided between the input layer and the output layer, which, together with the input layer and the output layer, define the topology of the neural network.

A binary classifier can have two nodes in the output layer, namely one that supplies the membership value for class A and one that supplies the membership value for class non-A as the output value. A multi-class classifier can have several nodes in the output layer, namely one in each case that supplies a membership value for one of the classes for which the multi-class classifier was trained, and another node that indicates the likelihood that the object or state represented by the input data record cannot be assigned to any of the classes for which the multi-class classifier was trained. A multi-class classifier can be formed from several binary sub-classification models in such a way that the multi-class classifier is composed of several parallel binary sub-classification models (which form binary sub-paths), each with their own intermediate layers (hidden layers), the several parallel binary classification models having a common input layer and a common output layer.

In typical artificial neural networks, an input layer with its input nodes is followed by several other layers (hidden layers) with nodes. In this case, each node of a subsequent layer is typically linked to all nodes of the previous layer and in this way can receive the respective output values of the nodes of the previous layer. The values received in this way are typically weighted and summed up in a respective node in order to then form an output value of the respective node from the weighted sum, for example via a sigmoid function or another activation function, which is then output to all nodes of the next layer. The number of layers and nodes determines the topology of a neural network. The function with which a respective node weights the typically different input values from the nodes of the previous layer and processes them into an output value defines the parameterization of the artificial neural network and defines a model, for example a classification model. The parameterization with regard to the weights with which the input values of the individual nodes are weighted takes place during the training of the artificial neural network with training data sets.
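The weighted summation and sigmoid activation described above can be illustrated with a minimal Python sketch. It reuses the example vector (2,190,100) from the text; the layer sizes, the weight values and the helper name `dense_layer` are purely hypothetical and untrained, so the resulting membership values carry no meaning beyond showing the data flow.

```python
import math


def sigmoid(x):
    """Sigmoid activation, mapping any weighted sum into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """Fully connected layer: each node weights all inputs, sums them and
    applies the activation function to the weighted sum."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Example vector from the text: gender code 2, height 190 cm, weight 100 kg.
person = [2.0, 190.0, 100.0]

# Hypothetical, untrained weights (assumption for illustration only):
# three input nodes, two hidden nodes, two output nodes (class A / non-A).
hidden_weights = [[0.10, 0.002, -0.004], [-0.20, 0.001, 0.003]]
hidden_biases = [0.0, 0.0]
output_weights = [[0.5, -0.5], [-0.5, 0.5]]
output_biases = [0.0, 0.0]

hidden = dense_layer(person, hidden_weights, hidden_biases)
membership = dense_layer(hidden, output_weights, output_biases)
print("membership values (A, non-A):", membership)
```

In this sketch the two output nodes receive mirrored weights, so the two membership values add up to 1; a trained network would not generally be constrained in this way unless, for example, a softmax output were used.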

As is generally known, the individual weights are created in the course of a training phase for a respective artificial neural network. In such a training phase, training data sets are made available to the artificial neural network as input data sets and typically also the class (called label or target) belonging to a respective training data set. During the training phase, the deviation of the output value of the nodes at an output layer of the neural network from the expected value is regularly determined.

For example, an artificial neural network that represents a binary classification model (a binary classifier) has exactly two nodes in the output layer, of which one node supplies a membership value which indicates the probability that the input data record belongs to the class, while the other node supplies a membership value which indicates the probability that the input data record does not belong to this class. The membership values do not necessarily have to be unambiguous, but rather indicate a probability that, for example, the object represented by the input data record belongs to class A or not to class A. A possible membership value can therefore be, e.g., 0.78 and mean that the object belongs to class A with 78% probability and not to class A with 22% probability.

In the training or learning phase for an artificial neural network, the difference between the expected output values of the artificial neural network and the actual output values of the artificial neural network is determined and, based on the error given by this difference, the weights of all nodes and node inputs are iteratively changed until the values obtained at the output layer of the artificial neural network approximately correspond to the expected values. The weightings are gradually adapted backwards, so to speak, starting from the output layer through all the preceding layers up to the input layer. In the learning phase, the weights are optimized step by step in an iterative process in such a way that the deviation between a given target value (i.e. a given class, also called label or target) and the output value of the classifier is as small as possible. The deviation between the specified target value and the output value of the classifier can be evaluated using a quality criterion, and the weights can be optimized using a gradient algorithm in which a typically quadratic quality criterion is optimized, i.e. minima of the quality criterion are searched for. A minimum is approached with the aid of a known gradient algorithm in which the gradients are determined with which the weights change from iteration step to iteration step. Larger gradients correspond to a larger change per iteration step and small gradients to a smaller change per iteration step. In the vicinity of a sought (local) minimum of the quality criterion, the changes in the weights from iteration step to iteration step, and thus the corresponding gradient, are typically relatively small. With the help of the gradients, changed weights can be determined for the next iteration step. The iterative optimization is continued until a specified termination criterion is met, for example the quality criterion has reached a specified level or a specified number of iteration steps has been reached.
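The iterative optimization of a quadratic quality criterion with a termination criterion can be sketched as follows. This is a deliberately reduced illustration with a single weight and a linear node rather than a full backpropagation through several layers; the data, the learning rate, the tolerance and the function names are assumptions chosen for the example.

```python
def quality(w, xs, ys):
    """Quadratic quality criterion: mean squared deviation between the
    node output w * x and the target value y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w, xs, ys):
    """Derivative of the quality criterion with respect to the weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, w=0.0, learning_rate=0.05, tolerance=1e-6, max_steps=1000):
    """Gradient descent that stops when the quality criterion reaches the
    specified level or the specified number of iteration steps is reached."""
    for step in range(max_steps):
        if quality(w, xs, ys) <= tolerance:
            return w, step
        # A large gradient causes a large change, a small gradient a small one.
        w -= learning_rate * gradient(w, xs, ys)
    return w, max_steps


# Hypothetical training data in which the target is always twice the input.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

weight, steps = train(xs, ys)
print(f"weight after {steps} steps: {weight:.4f}")
```

Near the minimum the gradient shrinks, so the weight changes per step become smaller, which matches the behaviour described in the text.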

Since the values in the input data sets can differ for different states or objects of the same class, a classifier is trained with many, more or less different input data sets as training data sets for a respective class, and the model parameter values are determined in the course of the optimization process in such a way that, despite differing input data sets, the classifier supplies a membership value for a respective class that is as reliable as possible. If, for example, a given class for an object is "rose" and the values of the input data set represent the pixels of a photo, namely the color and brightness of a respective pixel of the photo, the color of the rose petals is obviously less important than, for example, their shape for assigning the object shown in the photo to the class "rose". The training of a corresponding classifier with many different photos of roses will foreseeably lead to the result that values of the input data records that depend on the color of the petals are less heavily weighted than values of the input data records that depend on the shape of the petals, which leads to correspondingly adapted model parameter values, in particular weights for the various input values of the nodes.

The reliability with which such an artificial neural network as a classifier can assign objects, events or states represented by input data records to one or more classes thus depends crucially on the input data records that were used as training data records in the training phase of the artificial neural network.

The same also applies when the input data records are not vectors but, for example, matrices that can represent recorded images. Such matrices, for example images, are typically processed with the help of convolutional neural networks (CNN), in which the dimensions of the input matrix are gradually reduced with the help of convolution layers by convolving a respective input matrix (at the input level as well as at the following levels) with smaller convolution matrices serving as filters (for example 3x3 matrices, which are also referred to as filter kernels). The filter kernel is shifted line by line over the respective input matrix. The input values of a respective node of a (convolution) layer following the convolution layer are thus determined by means of a discrete convolution. The input values of a node in the convolution layer are calculated as the inner product of the convolution matrix (filter kernel) with the values of the input matrix currently assigned in a respective step. The comparatively small convolution matrix is moved step by step over the relatively larger input value matrix and the inner product is formed in each case. This is shown very clearly in https://de.wikipedia.org/wiki/Convolutional_Neural_Network. After a corresponding input matrix has been reduced enough, its values can be further processed by a fully connected artificial neural network (similar to a perceptron) in the subsequent levels, in order, for example, to classify the images represented by the input matrices.
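The step-by-step shifting of a small filter kernel over a larger input matrix, with an inner product formed at each position, can be sketched in a few lines of Python. The 4x4 input matrix and the all-ones 3x3 kernel are hypothetical values chosen for illustration; as is usual in convolutional neural networks, the kernel is applied without flipping.

```python
def convolve_valid(matrix, kernel):
    """Slide the kernel over the matrix and form the inner product of the
    kernel with the currently covered values at every position. Without
    padding, a 3x3 kernel reduces each dimension of the input by 2."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(matrix) - k_rows + 1
    out_cols = len(matrix[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            row.append(
                sum(
                    kernel[i][j] * matrix[r + i][c + j]
                    for i in range(k_rows)
                    for j in range(k_cols)
                )
            )
        output.append(row)
    return output


# Hypothetical 4x4 input matrix (e.g. pixel intensities) and 3x3 filter kernel.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

feature_map = convolve_valid(image, kernel)
print(feature_map)  # [[54, 63], [90, 99]]
```

The 4x4 input is reduced to a 2x2 output, which illustrates how the dimensions shrink from layer to layer.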

In the case of a regressor based on a neural network, the regressor can be trained to output, for example for an input data record representing an image, a number that represents, for example, the age of the person from whom the picture was taken.

In this case too, the quality of the classification or the regression depends on the input data sets (matrices, for example images) with which the corresponding convolutional neural network was trained.

Known problems such as overfitting can mean that a discriminative neural network used as a classifier cannot reliably classify certain input data records if the input data records differ too greatly from the training data records. For example, if the input data sets used for training are too similar, or if too few different variants of input data sets representing the same object or state are available for training, the known overfitting can occur. If a classifier unit for the object "rose" were trained only with photos of red roses, for example, it is quite possible that such a classifier unit determines only a low membership value for photos of white roses, although white roses are roses just like red roses.

The aim of the invention is to create a possibility of determining the reliability of an artificial neural network with regard to various input data sets occurring in practice, in order to be able to specify the conditions under which, for example, a classifier can be expected to deliver reliable classifications and the conditions under which a classification by the classifier is possibly incorrect.

Input data records can represent, for example, images, tomographies or three-dimensional models that have been obtained using imaging processes in medical technology. In this case, the input data sets can be very different, depending on how the respective images were taken or the models were created. The differences can result, for example, from the values of the technical parameters that were used in the creation of the image or the modeling. Such technical parameters, the values of which influence the properties of the input data records, are for example the contrast range, the image distance, the reconstructed slice thickness, the reconstructed volume or the like in imaging or tomographic methods.

According to the invention, a system is proposed for this purpose which, on the one hand, has a classifier which is formed by a discriminative neural network and which realizes a binary class model or a multi-class model. The system also has a model-based sample generator that is formed by a generative neural network. Both the classifier and the model-based sample generator are trained - for a corresponding class - with the same training data sets and therefore embody corresponding models for this class.

The classifier and the model-based sample generator can be spatially separated from one another. In particular, the classifier can be operated in a confidential environment, while the model-based sample generator does not have to, since no confidential data need to be supplied to the model-based sample generator.

Instead of a classifier, a regressor can also be provided. In this case, the model-based sample generator is also trained with the same training data sets as the regressor. A generative neural network (a generator) uses a random input data set, for example an input data set that represents noise, to generate an artificial data set that represents an artificial object, an artificial state or an artificial event and that corresponds, for example, to an input data set for a classifier. For example, a generative neural network (a generator) can generate a data set that represents an image of an object from a matrix that represents noise. This is the case when the generative neural network has been trained with corresponding images of the object as training data sets. This applies in particular to deconvolutional generative networks, which have corresponding layers that process a small, random input matrix step-by-step into a larger output matrix as the output data set. This then represents, for example, an artificially created image of an object.

A generative neural network can, however, also be constructed in the manner of a perceptron, which is formed from fully connected layers and has a comparatively large input layer and an output layer of the same size (i.e. having the same number of nodes) and a plurality of hidden layers that are initially smaller and then larger again. Such a generative network can be used with a random vector at the input layer and then supplies a vector as an output value which represents a certain object, a certain state or a certain event.
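The topology described here, with equally sized input and output layers and hidden layers that first shrink and then grow again, can be sketched as follows. The layer sizes, the tanh activation and the random weights are assumptions for illustration; the network is untrained, so the output vector only demonstrates the data flow from a random input vector to an output vector of the same size and does not represent any object.

```python
import math
import random


def make_layers(sizes, rng):
    """Create one weight matrix per pair of consecutive layer sizes.
    The weights are random, i.e. this stands in for an untrained network."""
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(vector, layers):
    """Pass a vector through all fully connected layers in turn."""
    for weights in layers:
        vector = [
            math.tanh(sum(w * x for w, x in zip(row, vector)))
            for row in weights
        ]
    return vector


rng = random.Random(0)

# Hypothetical layer sizes: large input, narrowing hidden layers, then
# widening again to an output layer with as many nodes as the input layer.
sizes = [8, 4, 2, 4, 8]
layers = make_layers(sizes, rng)

# Random vector (noise) at the input layer.
noise = [rng.gauss(0.0, 1.0) for _ in range(sizes[0])]
artificial_record = forward(noise, layers)
print(len(noise), "->", len(artificial_record))
```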

An autoencoder is a preferred variant of a generative neural network for the application described here, in particular for determining a quality criterion.

In order to train the generative neural network - that is, the model-based sample generator - an instance is provided which can determine the deviation of a generative model represented by the model-based sample generator from the training data sets so that the deviations can be minimized during training. This instance can be a loss function that determines, for example, a similarity loss. The entity determining the deviation - the loss - can, however, also be a discriminator which, similar to the classifiers described above, is formed by a discriminative neural network. If a generative neural network is used as a model-based sample generator in conjunction with a discriminator, i.e. with a discriminative neural network, the output data set generated by the model-based sample generator can be fed to the discriminator as an input data set. The discriminator is typically trained with training data sets that represent the object for which the model-based sample generator was also trained. The discriminator can thus determine, in the sense of a binary classifier, for a respective output data set generated by the model-based sample generator whether this output data set actually represents the corresponding object or not. Such a combination of a generative neural network and a discriminative neural network is also known as a GAN (Generative Adversarial Network) and is described, for example, in Andreas Wiegand, "An Introduction to Generative Adverserial Network (GAN)", seminar KI: yesterday, today, tomorrow, Applied Computer Science, University of Bamberg.

The model-based sample generator can also be an autoencoder that has been trained with the aid of a similarity loss function or a root mean square error (RMSE) function as the entity determining the deviation.

According to the idea on which the invention is based, the model-based sample generator of the system according to the invention is used to determine for which input data sets the classifier can supply meaningful output values. If the classifier is not a binary classifier but a multi-class classifier whose discriminative artificial neural network realizes several classification models representing different classes, each of which was generated with different training data sets, then a model-based sample generator is provided for each classification model, which was trained with the training data sets for the respective classification model and which can also only provide statements for the corresponding classification model.

With the aid of the trained generator, an artificial data set representative of the classification model can be generated for a class, which data set is also referred to as an artificial prototype in the context of this description. This is done by supplying the model-based sample generator in a known manner with an input data record that represents noise - that is, it is formed from random values.

Various input data sets representing noise are preferably fed to the model-based sample generator and different prototypes are generated in this way.

Technical properties can then be derived from the prototype or the different prototypes, which should be at least approximately fulfilled by the input data sets to be fed to the classifier, so that the membership value generated by the classifier for the respective input data set is reliable.

This is possible because both the classifier and the model-based sample generator were trained with the same training data sets.

If the input data records to be classified represent, for example, magnetic resonance tomographies or computer tomographies, these technical properties are, for example, the contrast range in Hounsfield units, the image distance or the reconstructed slice thickness, the imaged volume, etc. Only if the input data sets to be classified are similar to the training data sets with regard to these technical properties can a reliable classification result, i.e. a reliable membership value, be expected.

If the training data sets for the classifier are not available for direct analysis for technical reasons or for reasons of confidentiality, the technical boundary conditions that the input data sets to be classified must meet in order for a reliable classification to be possible can be determined with the aid of the model-based sample generator, because these technical properties can be read from the artificially generated data set that is representative of the classification model (the prototype).

In the event that different prototypes were generated by the model-based sample generator with the aid of different input data sets representing noise, these different prototypes can be used to define a range of values that the parameter values of the technical properties of the input data sets to be classified must meet so that the classifier can form reliable membership values. Parameters whose (parameter) values are relevant are, for example, the contrast range, the image distance, the reconstructed slice thickness, the reconstructed volume or the like on which a respective tomography or image represented by an input data set is based. The artificially generated prototypes, which are based on different parameter values, can be checked with the help of a loss function and/or a similarity function and/or a similar metric to determine whether the respective artificially generated prototype (and thus the parameter values on which it is based) allows reliable classification results to be expected. The respective loss function provides a measure of the reliability with which a classifier correctly classifies an input data set. By classifying the prototypes artificially generated by means of the generative neural network and evaluating the associated loss function, suitable artificial prototypes can be determined which allow a reliable classification result to be expected. The parameter values underlying the artificial prototypes providing reliable classification results define a range of values for the parameter values of the relevant parameters (e.g. image resolution) within which reliable classification results can be expected.

A model-based sample generator that has been trained with training data sets that provide reliable classification results can in particular also be used to check input data sets based on real recordings, images or tomographies to determine whether the input data sets allow a reliable classification result to be expected from the corresponding classifier. For this purpose, an artificial prototype generated by the model-based sample generator can be compared with the respective input data set with the aid of a suitable metric, e.g. with the help of a loss function. A small loss then indicates that the input data set allows a reliable classification by the classifier to be expected.
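A comparison between an input data set and an artificial prototype could look roughly like the following sketch. It assumes that both are available as equally long vectors of normalised values and uses the root mean square error (RMSE) as the metric; the example values, the threshold `MAX_LOSS` and the function names are hypothetical and would have to be chosen for the concrete application.

```python
import math


def rmse(record, prototype):
    """Root mean square error between an input data set and a prototype."""
    if len(record) != len(prototype):
        raise ValueError("record and prototype must have the same length")
    squared = sum((x - y) ** 2 for x, y in zip(record, prototype))
    return math.sqrt(squared / len(record))


def is_probably_reliable(record, prototype, max_loss):
    """A small loss relative to the prototype suggests that a reliable
    classification can be expected; max_loss is an assumed threshold."""
    return rmse(record, prototype) <= max_loss


# Hypothetical prototype and two hypothetical input data sets.
prototype = [0.2, 0.4, 0.6, 0.8]
similar_record = [0.25, 0.35, 0.6, 0.8]
different_record = [0.9, 0.9, 0.1, 0.1]

MAX_LOSS = 0.1  # assumed maximum value for the loss

for name, record in [("similar", similar_record), ("different", different_record)]:
    loss = rmse(record, prototype)
    print(name, round(loss, 3), is_probably_reliable(record, prototype, MAX_LOSS))
```

A similarity function with a minimum value could be used in the same place instead of a loss with a maximum value.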

The comparison of input data sets based on real data with artificial prototypes generated by the model-based sample generator can be used to define (parameter) value spaces that contain parameter values and combinations of parameter values that lead to input data sets that can probably be reliably classified. For this purpose, different input data sets based on real data, each of which is based on different parameter values, must be compared with the artificially generated prototypes so that the parameter values and combinations of parameter values can be determined which lead to a low loss. A parameter space determined in this way can be made available to the operator of the classifier so that the operator can check the quality of the input data before a classification is carried out by the classifier. Only those input data sets whose data have been obtained with parameter values and value combinations that lie within the parameter space meet the quality criterion. By "obtaining data with parameter values and value combinations" is meant here that, when the data is generated by a data-generating entity, for example a tomograph, parameter values such as resolution, slice thickness, etc. prevail that affect the generation of the data and thus the generated data.

Instead of operating the model-based sample generator as outlined above independently of the input data records to be specifically classified so that a parameter value space can be formed, the model-based sample generator can also be connected upstream of the classifier and thus possibly be part of a confidential environment. In this case, an input data set to be specifically classified by the classifier can first be compared with a corresponding artificial prototype generated by the model-based sample generator in order to obtain an estimate of the reliability of the classification before the classification. In this solution, the model-based sample generator is part of the confidential environment.

In the variant described above, in which a parameter value space that is expected to allow reliable classification results is formed, the model-based sample generator, on the other hand, can also be operated outside of a confidential environment. In particular, several model-based sample generators can be operated in separate containers on one or more servers. Since the model-based sample generators are operated in (software) containers, i.e. in a logically closed area of a server, the model-based sample generators can also be part of a quasi-confidential area in which the respective classifier is also operated. For example, a model-based sample generator operated in a container can be connected to the respective classifier and/or the confidential environment in which the respective classifier is operated via a VPN connection (VPN: Virtual Private Network).

A maximum value is specified for the loss function used in each case, which an artificially generated prototype must not exceed in order to be considered reliable.

Conversely (and correspondingly), the check can also be carried out by means of a similarity function, which provides a measure of similarity for a respective artificial prototype. A minimum value can be specified for the measure of similarity, below which it must not fall if the associated artificial prototype is to be considered reliable.

The (parameter) value range determined in accordance with the variant first outlined above is a quality criterion which can be used as the basis for checking input data records to be classified. Input data records to be classified which meet the quality criterion, because the values of technical parameters represented by them lie within the value space or value range according to the quality criterion, lead to a reliable classification result.

A method is also proposed for determining a quality criterion for input data records for a classifier with a discriminative neural network. The input data records depend on values of technical parameters that are represented in the input data records, and the quality criterion relates to at least one value of one of these technical parameters. The classifier is trained with training data sets and embodies a classification model for a class.

According to the method, a model-based sample generator with a generative neural network is first provided and trained with the same training data sets with which the classifier was trained. Then, by means of the trained model-based sample generator and an input data set based on random values, an artificial data set is generated which is representative of the classification model embodied by the classifier and represents an artificial prototype.

From the artificial data set - i.e. the artificial prototype - values for technical parameters represented by this artificial prototype are then determined.

A quality criterion is formed from the determined values of the technical parameters in that a value range or value space is determined which depends on the determined values of the technical parameters and a specified tolerance range. For input data records that represent values of technical parameters lying within the value range, and that thus meet the quality criterion, the classifier delivers a reliable classification result.
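One possible reading of this step is sketched below. It assumes that the values of the technical parameters have already been determined from several artificial prototypes and are available as simple dictionaries; how these values are extracted from a prototype is not shown. The parameter names, the numeric values and the absolute tolerances are hypothetical.

```python
def value_ranges(prototype_parameters, tolerance):
    """Form a value range per parameter from the values determined from the
    prototypes, widened on both sides by the specified tolerance."""
    ranges = {}
    for name, margin in tolerance.items():
        values = [params[name] for params in prototype_parameters]
        ranges[name] = (min(values) - margin, max(values) + margin)
    return ranges


def meets_quality_criterion(record_parameters, ranges):
    """An input data record meets the criterion if every parameter value
    lies within the corresponding value range."""
    return all(
        low <= record_parameters[name] <= high
        for name, (low, high) in ranges.items()
    )


# Hypothetical parameter values determined from three artificial prototypes.
prototype_parameters = [
    {"slice_thickness_mm": 1.0, "contrast_range_hu": 350.0},
    {"slice_thickness_mm": 1.5, "contrast_range_hu": 400.0},
    {"slice_thickness_mm": 2.0, "contrast_range_hu": 380.0},
]
# Assumed absolute tolerance per parameter.
tolerance = {"slice_thickness_mm": 0.25, "contrast_range_hu": 50.0}

ranges = value_ranges(prototype_parameters, tolerance)
print(ranges)

good_record = {"slice_thickness_mm": 1.25, "contrast_range_hu": 380.0}
bad_record = {"slice_thickness_mm": 5.0, "contrast_range_hu": 380.0}
print(meets_quality_criterion(good_record, ranges))
print(meets_quality_criterion(bad_record, ranges))
```

Independent ranges per parameter are a simplification; the text also mentions combinations of parameter values, which would require a joint value space rather than one interval per parameter.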

Preferably, an associated loss function is formed for input data sets based on different values of the parameters and an artificially generated prototype generated by the model-based sample generator, and an output value supplied by the loss function is compared with a predetermined reference value. In the event that the output value of a respective loss function is smaller than the specified reference value, the parameter values on which the respective input data set is based are classified as those which provide a sufficiently reliable classification result. By determining the output values of the loss function for different input data sets based on different values of the parameters, and comparing the respective output value of the loss function with the specified reference value, a value space can be formed for the values of the parameters, which forms a quality criterion for the parameter values, namely in such a way that parameter values within the value space meet the quality criterion.
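The formation of a value space from loss values and a reference value can be sketched as follows. The sketch assumes that each input data set has already been compared with the prototype and that the resulting loss is stored together with the parameter values on which the input data set is based; the parameter names, loss values and the reference value are hypothetical, and the value space is simplified to an interval per parameter.

```python
def reliable_value_space(evaluations, reference_value):
    """Keep the parameter values whose loss is below the reference value and
    return, per parameter, the interval spanned by the accepted values."""
    accepted = [params for params, loss in evaluations if loss < reference_value]
    if not accepted:
        return {}
    return {
        name: (
            min(params[name] for params in accepted),
            max(params[name] for params in accepted),
        )
        for name in accepted[0]
    }


# Hypothetical evaluations: (parameter values of the input data set, loss
# relative to the artificial prototype).
evaluations = [
    ({"slice_thickness_mm": 1.0, "contrast_range_hu": 400.0}, 0.04),
    ({"slice_thickness_mm": 2.0, "contrast_range_hu": 350.0}, 0.08),
    ({"slice_thickness_mm": 5.0, "contrast_range_hu": 150.0}, 0.35),
]
REFERENCE_VALUE = 0.10  # assumed reference value for the loss

value_space = reliable_value_space(evaluations, REFERENCE_VALUE)
print(value_space)
```

In this example the third input data set exceeds the reference value, so its parameter values do not contribute to the value space.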

Alternatively, each input data set to be processed specifically by the classifier or the regressor can be compared with the artificial data set generated by the model-based sample generator in order to determine the loss relative to, or the similarity to, the artificial data set generated by the model-based sample generator. In this way, before or in parallel with the classification of the input data set, an estimate of the reliability of the classification can be obtained.

For this purpose, an entity determining the deviations between the prototype and the respective input data set (e.g. a discriminator or a similarity loss function) can be connected upstream of or in parallel with the classifier and, for an input data set to be specifically classified, the loss relative to a prototype generated by the model-based sample generator can be determined in order to obtain an estimate of the reliability of the classification before or in parallel with the classification of the input data set.

In the simplest case, it is sufficient if the classifier is preceded or paralleled by an entity that determines a deviation, such as a discriminator or a similarity function, in connection with a prototype generated by the model-based sample generator, in order to determine a loss relative to, or a similarity to, the prototype for each input data set to be specifically classified.

The input data sets to be classified preferably represent tomographic images and the technical parameters, the values of which are determined from the artificially generated data set, are preferably the contrast range, the image distance, the reconstructed slice thickness, the reconstructed volume or a combination thereof.

It is also preferred if, by means of the trained model-based sample generator and several different input data sets based on random values, several artificial data sets are generated which are representative of the classification model embodied by the classifier, and values for technical parameters represented by these artificial data sets are determined from the artificial data sets.

The invention will now be explained in more detail using an exemplary embodiment with reference to the figures. The figures show:

FIG. 1: a system according to the invention with a classifier and a model-based sample generator independent of the classifier for generating a prototype;

FIG. 2: a sketch for explaining the training phase;

FIG. 3: a sketch of a possible implementation of a quality check in a confidential environment;

FIG. 4: a sketch of an alternative implementation of a quality check in a confidential environment;

FIG. 5: a sketch of a similar implementation as sketched in FIG. 4, a regressor being provided instead of a classifier;

FIG. 6: an illustration of a system with a model-based sample generator for generating a prototype, which is connected to two different discriminators for training; and

FIG. 7: an illustration of an embodiment variant in which the model-based sample generator generates a pair from a prototypical input data set and the associated class, which can be used to test the classifier.

FIG. 1 is a sketch of a system 10 which on the one hand comprises a classifier 12 and on the other hand a model-based sample generator 14. Instead of the classifier 12, a regressor 28 can also be provided.

In the case of a classifier, it can be trained, for example with the help of training data sets representing healthy anatomical structures, to recognize healthy anatomical structures and to assign them a high membership value (i.e. a low loss), and in this way to distinguish them from data sets representing pathological anatomical structures, because such data sets have a lower similarity with the training data sets and thus with the model embodied by the classifier.

The classifier 12 is formed by a trained discriminative artificial neural network that embodies a binary class model or a multi-class model. Accordingly, the classifier 12 is either a binary classifier or a multi-class classifier. The classifier 12 is trained with the aid of corresponding training data sets for a respective class. As described at the beginning, how the classifier behaves for the respective class depends on the training data sets.

If input data sets 18 representing objects, states or events that are to be classified are fed to the classifier 12 during operation, the classification result depends on the training data sets 24 with which the classifier 12 was trained (see FIG. 2).

As already explained at the beginning, each value from an input data set 18 is fed to the nodes of an input layer of the discriminative neural network of the classifier 12 during operation. The output values of the nodes of the input layer are then passed on to the nodes of the following hidden layers until finally the nodes of the output layer generate a signal that represents a membership value. This membership value indicates how strongly the input data set, and thus the object, state or event represented by the input data set, is to be assigned to one of the classes for which the classifier 12 was trained.
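The forward pass described above can be illustrated with a minimal sketch. The layer sizes, the weights and the sigmoid activation are assumptions chosen purely for illustration; the description does not specify them, and a real classifier 12 would obtain its weights from training.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Squash a weighted sum into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """One fully connected layer: every node forms a weighted sum of all inputs."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def membership_value(record: Sequence[float]) -> float:
    """Pass an input data set through one hidden layer and a single output node."""
    hidden = layer(record, HIDDEN_W, HIDDEN_B)
    return layer(hidden, [OUT_W], [OUT_B])[0]


# Hypothetical weights (assumption): 3 input values, 2 hidden nodes, 1 output node.
HIDDEN_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
HIDDEN_B = [0.0, 0.1]
OUT_W = [1.2, -0.7]
OUT_B = 0.05

value = membership_value([0.2, 0.4, 0.6])
```

With a sigmoid at the output node, the membership value always lies strictly between 0 and 1, which matches the idea of a graded assignment to a class.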

The discriminative artificial neural network of the classifier 12 can for example be a perceptron or else a convolutional neural network (CNN) with one or more convolution layers at the input. In the case of the perceptron, the input data set is typically a vector of the type described at the beginning. In the case of a convolutional neural network (CNN), the input data set is typically a matrix which in most cases represents an image.

The problem is that a user of a classifier such as the classifier 12 typically cannot easily see whether the respective classification result, that is, the membership values supplied by the classifier, can be trusted. In particular, the user does not know for which input data sets the classifier 12 can be expected to deliver reliable results and for which it cannot. This is because the classification result depends not only on the content represented by a respective input data record (for example an image of an anatomical structure), but also on the technical parameters of the data record, such as resolution and contrast. For example, a classifier 12 trained with training data sets representing healthy anatomical structures can falsely classify an input data set that also represents healthy anatomical structures as representing pathological structures if, for technical reasons (e.g. insufficient triggering), that input data set deviates more strongly from the training data sets. In addition, the classification result may also depend on the completeness or correctness of the respective input data set. An input data record can consist, for example, of a matrix representing an image and additional parameter values (e.g. modality, age of the patient, etc.). If the input data set is incomplete or incorrect, i.e. if, for example, additional parameter values are missing or plainly incorrect (e.g. a negative age), the input data set is unsuitable for reliable classification.
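The completeness and correctness check mentioned at the end of the preceding paragraph could look roughly like the following sketch. The field names ("image", "modality", "age") and the set of required fields are hypothetical and only mirror the examples given in the text.

```python
from typing import Any, List, Mapping

# Hypothetical record layout (assumption): an image matrix plus additional parameter values.
REQUIRED_FIELDS = ("image", "modality", "age")


def completeness_problems(record: Mapping[str, Any]) -> List[str]:
    """Return a list of reasons why a record is unsuitable; empty if none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing {field}")
    age = record.get("age")
    if isinstance(age, (int, float)) and age < 0:
        # A negative age is the example of an incorrect value given in the description.
        problems.append("implausible age")
    return problems


ok_record = {"image": [[0.1, 0.2]], "modality": "CT", "age": 54}
bad_record = {"image": [[0.1, 0.2]], "modality": "CT", "age": -3}
incomplete_record = {"image": [[0.1, 0.2]], "age": 40}
```

Such a check only covers formal completeness; whether the technical parameters of the image itself fit the classification model is the subject of the quality criterion discussed below.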

The model-based sample generator 14 is provided in order to determine the limits within which the classifier 12 can deliver reliable results. The model-based sample generator 14 is formed by a generative artificial neural network that is trained, for the class or one of the classes, with the same training data sets with which the classifier 12 was trained for the corresponding class. However, the training data sets themselves are not available to the user of the model-based sample generator, that is to say the training data sets with which the classifier 12 and also the model-based sample generator 14 were trained can remain anonymous to outsiders. Accordingly, it is not possible to infer directly from the training data sets under which conditions or prerequisites the classifier 12 is likely to deliver reliable results.

However, since the model-based sample generator 14 for a class has been trained with the same training data sets as the classifier 12 for this class, it is possible to generate an artificial data set with the model-based sample generator 14 from a random input data set that typically represents noise. The artificial data record generated in this way represents a type of artificial prototype for an object, a state or an event, which defines the corresponding class for which the classifier 12 is also trained. By looking at the artificial prototype it can now be determined what the object, state or event looks like for which the classifier is trained for the corresponding class. For objects, states or events that differ greatly from the artificial prototype, the classifier 12 will typically not provide a high membership value for the corresponding class, even if these differing objects, events or states were to be assigned to the corresponding class.

It should be noted here that differences between data records can already result from how the (input) data record that represents a corresponding object, state or event was generated. This means that the differences in the input data records depend not only on the represented object, state or event, but also on how (i.e. with what means, with what settings or under what circumstances) the corresponding data record for such an object, event or state was generated. For example, the data records can simply differ in the resolution with which a corresponding object, event or state is represented by the data record. Different resolutions can lead to different classification results. This can be estimated with the aid of the prototype artificially generated by the model-based sample generator 14.

It is advantageous that the classifier 12 or the regressor 28 can be part of a confidential environment, for example in a hospital with confidential patient data, while the model-based sample generator does not need to be. The generator can be trained with anonymized training data sets (namely the same ones with which the classifier or the regressor was trained) and can deliver a quality criterion as a result, which can then be used in the confidential environment in the vicinity of the classifier or regressor to check real input data records.

FIG. 2 illustrates that the classifier 12 (or the regressor 28) and the model-based sample generator 14 were trained with the same training data sets. The training data sets themselves are invisible to the operator of the model-based sample generator 14, so that confidentiality can also be maintained in this respect.

In the example shown in FIG. 2, the model-based sample generator 14 is part of a Generative Adversarial Network 34 (GAN), which is formed by the model-based sample generator 14 and an associated discriminator 16. To train the model-based sample generator 14 with the training data sets that are fed to the discriminator 16, the model-based sample generator 14 generates an artificial (generated) data set from an input data set 20 representing noise. This data set 22, which represents an artificial prototype, is supplied to the discriminator 16 as an input data record. The discriminator 16 can determine the deviation between the artificial prototype and the model defined by the training data sets (the loss) and form an output signal representing this deviation. During training, the output signal of the discriminator 16 representing the loss is fed back to the model-based sample generator 14, where the weights of the nodes of the layers of the model-based sample generator 14 are adjusted until the deviation between the artificial prototype and the model represented by the training data sets 24 is sufficiently small. As soon as this is the case, the model-based sample generator 14 is trained for the corresponding class. The discriminator 16 is no longer required for the analysis described in connection with FIG.

Known functions for determining the loss (loss functions) are the cross entropy function, the root mean square (RMS) function or the structural similarity index (SSIM) function. So that the model-based sample generator 14 is suitable for generating an artificial prototype that represents not only healthy anatomical structures but also pathological anatomical structures, the training data sets contain, for example, both data sets that represent healthy anatomical structures and data sets that represent pathological anatomical structures. In this case, the properties of the prototype are shaped by the more general common properties of the training data sets, i.e. in particular also by their technical properties. A similarity of the input data sets to be processed by the classifier 12 or the regressor 28 to an artificial prototype generated as described therefore indicates a technical suitability of the input data sets for a reliable classification or regression. A strong deviation of an input data set to be classified by the classifier 12 from the artificial prototype 22 is an indication of a lack of suitability for a reliable classification.

After the training, the model-based sample generator 14 can then be used as follows to determine a quality criterion for the input data records to be classified by the classifier:

First, a model-based sample generator 14 with a generative neural network is provided and trained with the same training data sets with which the classifier 12 was trained.

Then, by means of the trained model-based sample generator 14 and an input data set based on random values, an artificial data set 22 is generated which is representative of the classification model embodied by the classifier and which is also referred to as an artificial prototype 22 in the context of this description.

According to a first approach, values for technical parameters represented by this artificial data set are determined from the artificial data record 22.

A quality criterion is formed from the determined values of the technical parameters by deriving a value range that depends on those determined values and on a specified tolerance range. For input data records whose technical parameter values lie within this value range, and which thus meet the quality criterion, the classifier delivers a reliable classification result.
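The first approach can be sketched as follows. The parameter names, the prototype values and the use of a symmetric absolute tolerance around each prototype value are assumptions made for illustration; the description only states that the value range depends on the determined values and a specified tolerance range.

```python
from typing import Dict, Mapping, Tuple

Range = Tuple[float, float]


def value_ranges(prototype_params: Mapping[str, float],
                 tolerance: Mapping[str, float]) -> Dict[str, Range]:
    """Build one admissible interval per technical parameter around the prototype value."""
    return {name: (value - tolerance[name], value + tolerance[name])
            for name, value in prototype_params.items()}


def meets_quality_criterion(record_params: Mapping[str, float],
                            ranges: Mapping[str, Range]) -> bool:
    """A record passes only if every checked parameter is present and inside its interval."""
    return all(name in record_params and low <= record_params[name] <= high
               for name, (low, high) in ranges.items())


# Hypothetical values determined from an artificial prototype (assumption).
PROTOTYPE_PARAMS = {"slice_thickness_mm": 1.0, "contrast_range": 0.75}
TOLERANCE = {"slice_thickness_mm": 0.5, "contrast_range": 0.125}
RANGES = value_ranges(PROTOTYPE_PARAMS, TOLERANCE)

good = meets_quality_criterion({"slice_thickness_mm": 1.25, "contrast_range": 0.75}, RANGES)
bad = meets_quality_criterion({"slice_thickness_mm": 3.0, "contrast_range": 0.75}, RANGES)
```

Only the resulting intervals need to be transferred into the confidential environment; the check itself requires neither the generator nor the training data.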

Alternatively, the value range or value space serving as a quality criterion can also be determined by comparing a data set artificially generated by the model-based sample generator 14 (i.e. an artificial prototype) with various input data sets based on real data. For this purpose, input data records are used which are based on different parameter values for the relevant parameters, such as resolution, layer thickness or the like. For each input data set, the loss relative to the artificial prototype 22 or the similarity to the artificial prototype is determined by means of a loss function known per se or a similarity function known per se. If the comparison shows that the loss is low enough or the similarity is high enough, the parameter values on which the respective input data set is based are assigned to the parameter space that represents sufficient quality. In this alternative way, too, a parameter space can be formed with the aid of the model-based sample generator 14, which can serve as a quality criterion for input data sets to be classified by the classifier 12, in the sense that input data sets obtained with parameter values from the parameter space serving as a quality criterion meet the quality criterion and allow a reliable classification to be expected.
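This alternative can be sketched as a simple sweep over real input data sets acquired with different parameter values. The mean squared error used here is a stand-in for whichever loss function is actually chosen, and the data, the parameter values and the reference value are hypothetical.

```python
from typing import List, Sequence, Tuple


def mse(record: Sequence[float], prototype: Sequence[float]) -> float:
    """Mean squared error between an input data set and the artificial prototype."""
    return sum((r - p) ** 2 for r, p in zip(record, prototype)) / len(prototype)


def admissible_parameter_values(prototype: Sequence[float],
                                samples: Sequence[Tuple[float, Sequence[float]]],
                                reference_value: float) -> List[float]:
    """Collect the parameter values whose input data sets stay below the reference loss."""
    return [param for param, record in samples
            if mse(record, prototype) < reference_value]


# Hypothetical prototype and input data sets (assumption); the first tuple element
# is the parameter value, e.g. a slice thickness in mm, under which the set was acquired.
PROTOTYPE = [0.5, 0.5, 0.5, 0.5]
SAMPLES = [
    (1.0, [0.5, 0.5, 0.5, 0.5]),
    (2.0, [0.5, 0.75, 0.5, 0.25]),
    (5.0, [1.0, 0.0, 1.0, 0.0]),
]

accepted = admissible_parameter_values(PROTOTYPE, SAMPLES, reference_value=0.1)
```

In this toy data the sets acquired with the values 1.0 and 2.0 stay below the reference value, so these two values would span the parameter space that serves as the quality criterion.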

According to a third variant, a unit 32 for determining similarity is assigned to the classifier. This unit checks input data records to be processed by the classifier 12 or the regressor 28 for their similarity to the artificial prototype. The unit 32 for determining similarity can be connected upstream of or in parallel with the classifier 12 or the regressor 28, and is thus part of a possibly confidential environment. The unit 32 for determining similarity can, for example, be a discriminator which is configured to determine a loss between a respective input data set to be checked by the classifier 12 or regressor 28 and the artificial prototype 22. The loss can then be determined for an input data set to be specifically classified in relation to an output data set (prototype) generated by the model-based sample generator in order to obtain an estimate of the reliability of the classification before the input data set is classified. However, the unit 32 for determining similarity can also perform a simple comparison of a respective input data set to be checked by the classifier 12 or regressor 28 with the artificial prototype 22 by means of a similarity function, for example by determining the root mean square error (RMSE), the cross entropy or the Structural Similarity Index (SSIM) measure. Accordingly, in the simplest case, it is sufficient if the classifier (or the regressor) is preceded or paralleled by an entity that determines a deviation, such as a discriminator or a similarity function, in conjunction with an output data set that is generated by the model-based sample generator and represents an artificial prototype, in order to determine a loss relative to, or a similarity to, the prototype for each input data set to be specifically classified. If a loss is determined, it should be as small as possible (e.g. close to zero on a scale from 0 to 1). If a similarity is determined, it should be as close as possible to 1 on a scale from 0 to 1.

Suitable functions aim to map the distance between two data sets, i.e. the input data set and the artificial prototype. In the simplest case, such a function could determine an average difference between the individual elements of the input data set and the corresponding elements of the artificial prototype. However, this is disadvantageous since, for example, the direction of the difference is not taken into account and outliers are not corrected. A loss function is typically used to optimize a model using an optimizer. If necessary, the scaling of a loss function is changed so that it meets the mathematical requirements of the optimization algorithm.
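One way to read the drawback of a plain average difference is that positive and negative element-wise differences can cancel each other out. The following sketch contrasts such an average with the RMSE and uses the RMSE for a "suitable" / "not suitable" decision of the kind a unit 32 might make. The data and the threshold are hypothetical.

```python
import math
from typing import Sequence


def mean_signed_difference(record: Sequence[float], prototype: Sequence[float]) -> float:
    """Plain average of element-wise differences; opposite signs cancel."""
    return sum(r - p for r, p in zip(record, prototype)) / len(prototype)


def rmse(record: Sequence[float], prototype: Sequence[float]) -> float:
    """Root mean square error; squaring removes the sign before averaging."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(record, prototype)) / len(prototype))


def suitability(record: Sequence[float], prototype: Sequence[float], max_loss: float) -> str:
    """Gate an input data set before (or in parallel with) classification."""
    return "suitable" if rmse(record, prototype) <= max_loss else "not suitable"


# Hypothetical data scaled to the range 0..1 (assumption).
PROTOTYPE = [0.5, 0.5, 0.5, 0.5]
DISTORTED = [1.0, 0.0, 1.0, 0.0]
CLOSE = [0.5, 0.5, 0.5, 0.75]

signed = mean_signed_difference(DISTORTED, PROTOTYPE)
loss = rmse(DISTORTED, PROTOTYPE)
```

For the distorted data set the average signed difference is zero although every element deviates by 0.5, whereas the RMSE reports the deviation and the gate rejects the data set for a threshold of 0.25.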

The input data sets to be classified preferably represent tomographic images and the technical parameters, the values of which are determined from the artificially generated data set, are preferably the contrast range, the image distance, the reconstructed slice thickness, the reconstructed volume or a combination thereof. Further parameters of an input data set can be data on the acquisition modality or also on a patient. The latter would be, for example, gender, age, height, etc. It is also preferred if, by means of the trained model-based sample generator and several different input data sets based on random values, a plurality of artificial data sets are generated which are representative of the classification model embodied by the classifier, and values for technical parameters represented by these artificial data sets are determined from the artificial data sets.

With the aid of the method and the system, the problem is solved that an operator of a classifier can recognize only with difficulty whether the classification results obtained from the classifier, that is, the membership values generated by the classifier, can be relied upon. This is only the case if the input data records to be classified by the classifier meet technical criteria that match the classification model embodied by the classifier. Such technical criteria are the value ranges of the technical parameters that are represented by the respective input data record. The classification by the classifier is only reliable if these match the classification model.

As already mentioned above, a quality check of input data sets 18 to be classified can also take place in that the classifier 12 is operated in the confidential environment in connection with a (second) discriminator 26. FIG. 3 illustrates a corresponding arrangement. On the left in FIG. 3 is the confidential environment with the classifier 12 and the (second) discriminator 26. The second discriminator 26 is used to determine a loss function, that is to say to determine a measure of an (averaged) deviation between an artificial prototype 22' generated by the model-based sample generator 14 and an input data set 18 to be classified in each case. The artificial prototype 22' can be generated in the non-confidential environment by the model-based sample generator 14 and then made available for the quality check in the confidential environment. In fact, the second discriminator 26 can also be viewed as a classifier trained with the artificial prototype 22', which assigns each input data set 18 to be classified either to a class "suitable for a reliable classification (OK)" or to a class "unsuitable for a reliable classification (NOK)". The second discriminator 26 can thus be designed as a binary classifier.

The actual classifier 12, on the other hand, is typically a multi-class classifier and classifies the input data records 18 by assigning each input data record 18 to one of several classes for which the classifier 12 was trained.

In the exemplary embodiment shown in FIG. 4, the discriminator 26 is an entity which checks the discrepancy between the artificially generated prototype 22 'and an input data set 18 to be classified in each case. The discriminator 26 can thus also embody a simple similarity loss function. As also mentioned above, instead of a classifier 12, a regressor 28 can also be provided in the confidential environment. This is illustrated in FIG. A regressor typically does not provide an assignment to one of several classes as an output value (as the classifier 12 does), but rather provides a numerical value for a respective input data set. For example, the numerical value can represent the age of a person if the input data set checked by the regressor 28 represents an image of this person. Other numerical values supplied by a regressor can, for example, indicate the probable length of stay of a patient in the hospital. The regressor 28 also embodies a discriminative neural network that has been trained with corresponding training data sets which, for example, represent images of people of different ages.

FIG. 6 illustrates an exemplary embodiment in which the GAN 34 has two discriminators 16 and 30 for training the model-based sample generator 14. Of these two discriminators 16 and 30, the first discriminator 16 of the GAN is configured as usual in such a way that its feedback to the model-based sample generator 14 causes the loss between the training data sets and the iteratively changing output data 22 generated by the model-based sample generator 14 during the training to be minimized, in that the weights of the generative neural network embodied by the model-based sample generator 14 are gradually adapted during the training. The first discriminator 16 of the GAN 34 is thus configured to determine the loss between the artificially generated data set 22 and the training data sets in a manner known per se and to minimize it in the course of the training.

The second discriminator 30 of the GAN 34 is configured in such a way that for each prototype 22 generated by the model-based sample generator 14 during the training it determines its similarity to each of the training data sets and, in conjunction with the model-based sample generator 14, works to ensure that a predetermined minimum number of training data records has a predetermined minimum similarity with the respectively generated prototype 22, in order to prevent the prototype from having a particularly great similarity with only one or very few training data records. In particular, the second discriminator 30 of the GAN 34 is configured in such a way that it always generates a high value, to be added to the loss determined by the first discriminator 16, if fewer than a predetermined number of training data sets result in a degree of similarity that has less than a predetermined maximum deviation from the best occurring degree of similarity. Only when a sufficient number of training data sets are about as similar to the artificially generated data set 22 as the most similar training data set is the value determined by the second discriminator 30 for the similarity cluster measure small.
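The rule applied by the second discriminator 30 can be sketched as a penalty term. The concrete penalty value, the use of 0.0 as the "small" value and the example similarities are assumptions; the description only requires a high value when the cluster of best similarities is too small.

```python
from typing import Sequence


def similarity_cluster_penalty(similarities: Sequence[float],
                               min_cluster_size: int,
                               max_deviation: float,
                               high_value: float = 10.0) -> float:
    """Return a high value if too few training data sets are nearly as similar as the best one."""
    best = max(similarities)
    # The most similar training data set itself always belongs to the cluster.
    cluster_size = sum(1 for s in similarities if best - s < max_deviation)
    return 0.0 if cluster_size >= min_cluster_size else high_value


def total_loss(gan_loss: float, penalty: float) -> float:
    """Quantity minimized in training: loss of the first discriminator plus the penalty."""
    return gan_loss + penalty


# Hypothetical similarities of one prototype to five training data sets (assumption).
MEMORIZED = [0.9, 0.4, 0.3, 0.2, 0.1]    # prototype resembles a single training data set
GENERAL = [0.9, 0.875, 0.85, 0.3, 0.2]   # three training data sets are about equally similar

penalty_memorized = similarity_cluster_penalty(MEMORIZED, min_cluster_size=3, max_deviation=0.125)
penalty_general = similarity_cluster_penalty(GENERAL, min_cluster_size=3, max_deviation=0.125)
```

A prototype that merely reproduces one training data set is penalized, which is what discourages the generator from leaking individual, possibly confidential, training records.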

In the training, the sum of the measure for the loss (the measure of loss that the first discriminator 16 generates) and the value determined by the second discriminator 30 for the similarity cluster measure is minimized. Since the similarity cluster measure severely “penalizes” a deviation from the specified minimum number of very similar training data records, it is ensured that this specification is typically met. For example, in the case of twenty training data sets, twenty similarity values (values of the degree of similarity) are also generated. If, for example, 3 is selected as the limit value (minimum size of the cluster with the best similarity values within a given range), the cluster with the highest degree of similarity (a higher value expresses a greater similarity between the elements) must contain at least three elements. If this cluster size is not reached, negative feedback is returned to the model-based sample generator 14. In the exemplary embodiment shown in FIG. 6, the GAN 34 is configured in such a way that at least approximately an optimum with regard to the loss determined by the discriminator 16 and the similarity criterion determined by the discriminator 30 results during training.

With regard to all of the illustrated exemplary embodiments, a variant can also be provided in which the output data set 22 generated by the model-based sample generator 14 is a pair of prototypical input data (for example an artificially generated tomography) and the associated label. The label designates one of the classes for which the classifier 12 is trained and for which the artificially generated prototype is intended to be a prototype.

Such a pair of prototypical input data set and associated label (i.e. associated class) generated by the model-based sample generator 14 can be used for testing the classifier 12, in that the artificially generated, prototypical input data set 22' is supplied to the classifier 12 and the class determined by the classifier for this data set is compared with the label. The class determined by the classifier 12 for the artificially generated, prototypical input data set 22' must be identical to the label if the system is to be classified as reliable. FIG. 7 illustrates this using an example analogous to the exemplary embodiment shown in FIG. However, the concept can also be transferred to all other exemplary embodiments. In FIG. 7, a double line indicates that the output data set 22 generated by the model-based sample generator 14 is a pair of a prototypical input data set and an associated label (i.e., an associated class).

A check of the classifier 12 with the help of a pair of prototypical input data set and associated label generated by the model-based sample generator 14 is particularly helpful when the classifier is retrained during operation (e.g. via online training). In this case, the pairs of prototypical input data sets and associated labels generated by the model-based sample generator 14 can be used to check whether the retraining was successful. Should the retrained classifier 12 misclassify a prototypical input data set generated by the generator 14, the classifier 12 can be reconfigured back to the classification model prevailing before the retraining. For this purpose, it is advantageous if log data records (log files) are created during the retraining, in which the changes made to the classification model during the retraining are recorded in order to be able to reverse these changes if necessary.
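The check after retraining can be sketched as follows. For simplicity, this sketch keeps the previous model object and returns it on failure instead of reversing logged changes as the description suggests; the toy classifiers, labels and data are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[Sequence[float]], str]
LabelledPrototype = Tuple[Sequence[float], str]


def passes_prototype_test(model: Model, pairs: Sequence[LabelledPrototype]) -> bool:
    """The class determined for each prototypical input data set must equal its label."""
    return all(model(record) == label for record, label in pairs)


def accept_or_roll_back(previous: Model, retrained: Model,
                        pairs: Sequence[LabelledPrototype]) -> Model:
    """Keep the retrained model only if it still classifies all prototypes correctly."""
    return retrained if passes_prototype_test(retrained, pairs) else previous


# Toy stand-ins for the classifier before and after retraining (assumption).
def previous_model(record: Sequence[float]) -> str:
    return "healthy" if sum(record) / len(record) < 0.5 else "pathological"


def degraded_model(record: Sequence[float]) -> str:
    return "pathological"


# Hypothetical (prototype, label) pairs as delivered by the sample generator.
PAIRS = [([0.1, 0.2], "healthy"), ([0.8, 0.9], "pathological")]

active = accept_or_roll_back(previous_model, degraded_model, PAIRS)
```

Here the degraded model mislabels the "healthy" prototype, so the previous model remains active.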

The preferred embodiment variants offer the advantage that they also allow federated training of the classifier 12 or regressor 28 and also of the GAN 34. In federated learning, different trained generative neural networks (i.e. generative models) to be embodied by the model-based sample generator 14 are generated by different GANs, which can also be located at different locations. The generative neural networks generated in this decentralized, and thus federated, manner (more precisely, the models represented by the generative neural networks, which are characterized above all by the weights in the nodes of the network) can be merged into a single model (and thus a single generative neural network), which is then implemented by the model-based sample generator (14). Instead of federated learning, the model-based sample generator (14) can also be trained with training data sets from different sources (via data pooling) in order to avoid overfitting to a single source.
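The description does not state how the decentrally trained models are merged. One common option, shown here purely as an assumption, is a weighted average of the network weights in the style of federated averaging, where each site contributes in proportion to its number of training data sets. The flat weight lists and sample counts are hypothetical.

```python
from typing import List, Sequence


def average_weights(site_weights: Sequence[Sequence[float]],
                    sample_counts: Sequence[int]) -> List[float]:
    """Merge per-site weight vectors into one vector, weighted by local sample count."""
    total = sum(sample_counts)
    n_weights = len(site_weights[0])
    return [sum(weights[i] * count for weights, count in zip(site_weights, sample_counts)) / total
            for i in range(n_weights)]


# Hypothetical flattened generator weights from two sites (assumption).
SITE_A = [0.0, 1.0]   # trained on 1 training data set
SITE_B = [1.0, 3.0]   # trained on 3 training data sets

merged = average_weights([SITE_A, SITE_B], [1, 3])
```

Only the weights leave each site in this scheme, so the training data sets themselves stay where they were collected.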

Reference numerals

12 Classifier

14 Model-based sample generator

16 Discriminator connected to generator 14

18 Input data records for objects to be classified

20 Input data record

22, 22' Artificial prototype generated by generator 14

24 Training data sets

26 Discriminator in the confidential environment

28 Regressor

30 Second discriminator, which can be connected to the generator 14

32 Unit for determining similarity

34 GAN (Generative Adversarial Network)

Claims

1. System for checking input data records for their suitability for automatic evaluation, in particular classification or regression, by means of a classifier (12) or regressor (28) approximated or trained for one class or several classes, which is formed by a first discriminative neural network for the classification or the regression at a first location, the system having a model-based sample generator (14) with a generative neural network at a second location remote from the first location, wherein the generative neural network of the model-based sample generator (14) is trained with the same training data sets (24) as the discriminative neural network of the classifier (12) or regressor (28) and is designed to generate, from an input data set (20) representing random values, an artificial output data set (22) which represents a prototype for the class for which the classifier (12) or the regressor (28) is trained and thus represents the embodied model, and wherein the first discriminative neural network is implemented independently of the model-based sample generator (14) and can neither receive data sets generated by the model-based sample generator (14) nor deliver to the model-based sample generator (14) data sets that the first discriminative neural network of the classifier (12) or regressor (28) generates during operation, so that the first discriminative neural network of the classifier (12) or regressor (28) and the model-based sample generator (14) represent entities that are independent of one another.

2. System according to claim 1, in which the classifier (12) or the regressor (28) is part of a confidential environment and the model-based sample generator (14) is implemented in a software container and is connected to the classifier (12) or the regressor (28) via a VPN connection.

3. System according to claim 1 or 2, characterized in that a unit (32) for determining similarity is provided at the first location which is configured to check input data records to be supplied to the classifier (12) or regressor (28) for the purpose of classification or regression, by means of the artificial output data set (22) generated by the model-based sample generator (14), with regard to their suitability for a classification or regression, in that a respective input data set is compared with the artificial output data set (22) generated by the model-based sample generator (14) and is classified as "suitable" or "not suitable".

4. System according to at least one of claims 1 to 3, characterized in that the model-based sample generator (14) is assigned a discriminative neural network that forms a discriminator (16) which, together with the model-based sample generator (14), forms a Generative Adversarial Network (GAN).

5. System according to claim 4, characterized in that the discriminator (16) of the GAN (34) is configured to determine a loss of the artificial output data sets generated by the model-based sample generator (14) compared to the training input data sets.

6. System according to claim 4 or 5, characterized in that the model-based sample generator (14) is assigned a second discriminative neural network that forms a second discriminator (30) of the GAN (34) and forms a second measure, which is dependent on the similarity between the training data sets (18) and the artificial output data set (22) generated by the model-based sample generator (14) and which is also optimized or fulfilled during training of the model-based sample generator (14).

7. System according to claim 6, characterized in that the second discriminator (30) assigned to the model-based sample generator (14) is configured to always generate a high value, to be added to the loss determined by the first discriminator (16), if fewer than a predetermined number of training data sets result in a similarity measure that deviates by less than a predetermined maximum deviation from the best occurring similarity measure.

8. System according to at least one of claims 1 to 7, characterized in that the model-based sample generator (14) is configured to generate artificial output data sets (22) that represent at least one prototypical input data set for the classifier (12) or the regressor (28) and, in addition to a respective prototypical input data set, an associated label.

9. A method for determining a quality criterion for input data records (18) for a classifier (12) or a regressor (28) with a discriminative neural network, wherein the input data records (18) depend on values of technical parameters represented in the input data records (18) and the quality criterion relates to at least one value of one of these technical parameters, and wherein the classifier (12) or the regressor (28) is trained with training data sets (24) and embodies a classification model for a class or a regression model, characterized in that a model-based sample generator (14) with a generative neural network is trained with the same training data sets (24) with which the classifier (12) or the regressor (28) was trained, that by means of the trained model-based sample generator (14) and an input data set (20) based on random values, an artificial data set (22) is then generated which is representative of the classification model embodied by the classifier (12) or the regression model embodied by the regressor (28), that values for technical parameters represented by this artificial data set are determined from the artificial data set (22), and that a quality criterion is formed from the determined values of the technical parameters in such a way that a value range is determined which depends on the determined values of the technical parameters and a predetermined tolerance range, wherein the classifier (12) or the regressor (28) delivers a reliable classification result or regression result for those input data records (18) which represent values of technical parameters which lie within the value range and thus meet the quality criterion.

10. A method for determining a quality criterion for input data sets (18) for a classifier (12) or a regressor (28) with a discriminative neural network, wherein the input data sets (18) depend on values of technical parameters represented in the input data sets (18) and the quality criterion relates to at least one value of one of these technical parameters, and wherein the classifier (12) or the regressor (28) is trained with training data sets (24) and embodies a classification model for a class or a regression model, characterized in that a model-based sample generator (14) with a generative neural network is trained with the same training data sets (24) with which the classifier (12) or the regressor (28) was trained, that by means of the trained model-based sample generator (14) and an input data set (20) based on random values, an artificial data set (22) is generated which is representative of the classification model embodied by the classifier (12) or the regression model embodied by the regressor (28), and that the artificial data set (22) is compared with various input data sets (18) that are based on real data and on different parameter values, in that for each input data set (18) the loss compared to the artificial data set (22) or the similarity to the artificial data set (22) is determined and, if the comparison shows that the loss is low enough or the similarity is large enough, the parameter values on which the respective input data set (18) is based are assigned to the parameter space.

11. A method for checking input data sets for a classifier (12) or a regressor (28) with a discriminative neural network, wherein the input data sets (18) depend on values of technical parameters which are represented in the input data sets (18) and the quality criterion relates to at least one value of one of these technical parameters, and wherein the classifier (12) or the regressor (28) is trained with training data sets (24) and embodies a classification model for a class or a regression model, characterized in that a model-based sample generator (14) with a generative neural network is trained with the same training data sets (24) with which the classifier (12) or the regressor (28) was trained, that by means of the trained model-based sample generator (14) and an input data set (20) based on random values, an artificial data set (22) is then generated which is representative of the classification model embodied by the classifier or the regression model embodied by the regressor, and that an input data set (18) that is actually to be processed by the classifier (12) or the regressor (28) is compared with the artificial data set (22) generated by the model-based sample generator (14), in order to determine the loss compared to, or the similarity to, the artificial data set (22) generated by the model-based sample generator (14) and in this way to obtain, before the classification of the input data set (18), an estimate of the reliability of the classification.

12. The method according to claim 9, 10 or 11, characterized in that the input data sets (18) to be classified represent tomographic images and the technical parameters, the values of which are determined from the artificially generated data set, are the contrast range, the image distance, the reconstructed slice thickness, the reconstructed volume, or a combination thereof.

13. The method according to claim 9, 10, 11 or 12, characterized in that by means of the trained model-based sample generator (14) and several different input data records based on random values, several artificial data records (22) are generated which are representative of the classification model embodied by the classifier (12) or the regression model embodied by the regressor (28), and that values for technical parameters represented by these artificial data records (22) are determined from the artificial data records (22).

14. The method according to at least one of claims 9 to 13, characterized in that the model-based sample generator (14) is trained by federated learning.
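
The formation of the value range in claim 9 and the "suitable" / "not suitable" decision in claim 3 can be illustrated by the following minimal sketch. It is not part of the claims and is not the applicant's implementation. It assumes a single scalar technical parameter (for example, the reconstructed slice thickness), and it assumes that the value range is the interval spanned by the values determined from the artificial data sets (22), widened on both sides by the predetermined tolerance. The names ValueRange, quality_range and is_suitable are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ValueRange:
    """Closed interval of parameter values that meet the quality criterion."""
    lower: float
    upper: float

    def contains(self, value: float) -> bool:
        return self.lower <= value <= self.upper


def quality_range(artificial_values: Sequence[float], tolerance: float) -> ValueRange:
    """Form a value range from parameter values determined from artificial data sets.

    Assumption: the range is the min/max of the determined values,
    widened by the predetermined tolerance on both sides.
    """
    if not artificial_values:
        raise ValueError("at least one artificial parameter value is required")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return ValueRange(min(artificial_values) - tolerance,
                      max(artificial_values) + tolerance)


def is_suitable(input_value: float, value_range: ValueRange) -> str:
    """Label an input data set by whether its parameter value lies in the range."""
    return "suitable" if value_range.contains(input_value) else "not suitable"


# Hypothetical slice thicknesses (mm) determined from three artificial data sets.
slice_range = quality_range([1.0, 1.5, 2.0], tolerance=0.5)
print(slice_range)                    # ValueRange(lower=0.5, upper=2.5)
print(is_suitable(2.4, slice_range))  # suitable
print(is_suitable(3.0, slice_range))  # not suitable
```

With several technical parameters, one such range per parameter could be formed, and an input data set would meet the quality criterion only if all of its parameter values lie within their respective ranges. This extension is likewise an assumption.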