EP3785169A1 - Method and device for converting an input image of a first domain into an output image of a second domain - Google Patents

Method and device for converting an input image of a first domain into an output image of a second domain

Info

Publication number
EP3785169A1
Authority
EP
European Patent Office
Prior art keywords
network
images
training
image
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP19721223.6A
Other languages
German (de)
English (en)
Inventor
Andrej Junginger
Markus Hanselmann
Thilo Strauss
Holger Ulmer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Robert Bosch GmbH
Original Assignee
Robert Bosch GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Robert Bosch GmbH
Publication of EP3785169A1
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the invention relates to methods for training a neural network for converting an input image of a first domain, i.e. in a first display style, into an output image of a second domain, i.e. in a second display style.
  • Motor vehicles are often equipped with camera systems that capture image information about the vehicle environment, in particular an image of the environment ahead in the direction of travel. This image information is used for driver assistance functions that support the driver and for autonomous driving functions. Examples of such driver assistance functions include a recognition system for traffic signs or a brake assist, which recognizes, for example, that a pedestrian is in a collision area in front of the motor vehicle or is moving into it.
  • image data usually contain no meta information, e.g. image segmentation information, i.e. information indicating which pixel regions of the image data belong to a pedestrian, to a surrounding area, to a road area, to a building area and the like. Often, such image information must be created manually, which is a costly and, above all, time-consuming process.
  • a disadvantage of the methods described above is that a so-called cycle consistency must be calculated during training, whereby the input image data must be explicitly converted into the output image data and vice versa, which makes the training very computationally intensive and thus time-consuming.
  • a method for training a neural network for converting an input image of a first domain into an output image of a second domain according to claim 1 and a corresponding device according to the independent claim are provided. Further embodiments are specified in the dependent claims.
  • a method of training a first neural network to convert an input image of a first domain into an output image of a second domain is provided, wherein the training is performed on input images of the first domain provided for the training and on training images of the second domain, with the following steps:
  • Using a GAN network with a generator network, which comprises the first neural network, and with a discriminator network, which comprises a second neural network;
  • Training the discriminator network based on a discriminator error value and on one or more training images and/or one or more output images generated by the generator network by processing one or more of the input images, the discriminator error value being determined as a function of a respective quality of the one or more training images and/or of the one or more output images;
  • Training the generator network based on an input image provided for training and on a generator error value that depends on a quality of the output image provided by the generator network in response to the input image and on a similarity quantity between the input image and the output image that indicates a measure of structural similarity.
  • the aim of the above method is to train a neural network so that a given input image is converted into an output image.
  • the input and output images should have different styles, i.e. the input image data should be available in a first domain and the output image data in a second domain.
  • the styles correspond to display styles, such as a segmentation representation in which, for example, different color areas are assigned to different objects or image areas, a photorealistic image, a comic image, a line drawing, a watercolor sketch, and the like.
  • These images are intended to replace camera images and to be as indistinguishable as possible from them.
  • These images may also optionally be provided with meta information, for example segmentation information that associates image areas of the photorealistic image with particular objects or backgrounds.
  • an input image indicating only image areas for particular objects and/or backgrounds, such as image areas representing a person, a cyclist, a road area, a built-up area, a vegetation area and the like, may be processed by the trained neural network such that the corresponding image areas are provided with realistic structures of the corresponding objects.
  • the above method envisages using a GAN network (GAN: Generative Adversarial Network) in which a generator network corresponding to a first neural network is to be trained by means of a discriminator network which corresponds to a second neural network.
  • the generator network then generates output image data in a second domain from provided input image data in a first domain.
  • for the training of the generator network, the discriminator network provides, as relevant information, a rating label for the output image generated by the generator network.
  • the discriminator network is trained to evaluate whether an image provided at its input is an image in a second domain.
  • the discriminator network is trained at the same time as or in alternation with the generator network, based on generator-generated output images and training images of the second domain, wherein the training images are assigned a rating label indicating a high degree of allocation to the second domain (i.e., indicating that the images in question belong to the second domain).
  • the discriminator network is supplied with the output images generated by the generator network, together with a rating label indicating a low allocation level to the second domain (i.e., indicating that the respective second-domain images were artificially generated by the generator network).
  • Generator network and discriminator network can be trained alternately, thereby iteratively improving both neural networks, so that the generator network finally learns to convert a provided input image of the first domain into an output image of the second domain.
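
As a concrete illustration of this alternating scheme, one training step could look as follows in PyTorch. This is a minimal sketch, not the patent's reference implementation: `generator`, `discriminator` (assumed to end in a sigmoid so its output can be read as the quality C), `ssim` (a differentiable structural-similarity function, see below) and the two optimizers are assumptions defined elsewhere.

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, ssim, opt_g, opt_d,
               input_img, train_img):
    """One alternation: a discriminator step followed by a generator step."""
    # Discriminator step: rating label 1 for real second-domain training
    # images, rating label 0 for images generated by the generator network.
    fake = generator(input_img).detach()   # block gradients into the generator
    c_real = discriminator(train_img)
    c_fake = discriminator(fake)
    d_loss = (F.binary_cross_entropy(c_real, torch.ones_like(c_real))
              + F.binary_cross_entropy(c_fake, torch.zeros_like(c_fake)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: be rated as second-domain AND preserve the image content.
    out = generator(input_img)
    c_out = discriminator(out)
    g_loss = (F.binary_cross_entropy(c_out, torch.ones_like(c_out))
              + (1.0 - ssim(input_img, out)))   # structural-similarity term
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```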
  • loss functions or cost functions are used.
  • a generator error function that includes two parts is used as the cost function.
  • a first part forces the generated output image to be assigned to the second domain.
  • the output image generated by the generator network is supplied to the discriminator network and the distance to the desired evaluation label (evaluation label for a training image of the second domain) is minimized.
  • the second part ensures that the image content of the output image generated by the generator network corresponds to that of the original image by minimizing a structural distance between the output image and the input image, i.e. the output image differs from the input image by the style of presentation (domain) but only slightly by the image content or the scene shown.
  • the structural distance can be determined, for example, by a similarity value, which is a measure of the structural similarity of two images in different domains.
  • an SSIM index (SSIM: Structural Similarity Index), which indicates the structural similarity between the input image and the output image in a known manner, is suitable for this purpose.
  • in this way, the generator network learns to transform an input image of the first domain, i.e. a first rendering style, into an output image of a second domain, i.e. a second style of presentation.
  • only the input images of the first presentation style and the training images of the second presentation style must be specified, wherein a similarity or identity of the scenes represented in the input images and the training images is not necessary, i.e. it is not necessary to provide input images that differ from the training images only by the style of presentation.
  • a neural network (generator network) can thus be trained by the above method which, automatically and without supervision, generates photorealistic output images of a corresponding traffic situation from synthetic input images that show the traffic situation, for example, schematically or in stylized form.
  • the output images can then be used to develop and / or test driver assistance functions or autonomous driving functions.
  • situations can be created that cannot be tested in reality, such as a person running onto the roadway, in order to test a brake assist system or the evasive behavior of an autonomous driving function.
  • the training method described above can achieve a significantly improved conversion of an input image of a first presentation style into a corresponding output image of a second presentation style; it can be implemented in a simple manner and has high reliability and robustness. The above training method also yields better results, i.e. a more precise conversion of the input image of the first presentation style into the output image of the second presentation style, than corresponding conventional methods. Furthermore, the training of the discriminator network and of the generator network can be performed simultaneously or alternately and repeatedly, in particular using a backpropagation method, until an abort condition is met.
  • the termination condition is fulfilled if a number of passes has been completed or a predetermined quality of the output images generated by the generator network has been reached.
  • the quality of the one or more training images and / or the one or more output images may each be determined by the discriminator network and may correspond to a rating of the extent to which the image in question is an image of the second domain.
  • the discriminator error value may depend on a deviation measure for the deviation between the respective quality of the one or more training images and a rating label indicating a training image as a real image of the second domain, and on a deviation measure for the deviation between the respective quality of the one or more output images and a rating label indicating an output image generated by the generator network as a fake image of the second domain, the deviation measure corresponding in particular to a mean squared error or a binary cross entropy.
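
A minimal sketch of such a discriminator error value, with the mean squared error and binary cross entropy variants named above; the hard labels 1 and 0 are assumptions here (see the note on soft labels further below).

```python
import torch
import torch.nn.functional as F

def discriminator_error(c_train, c_out, measure="mse"):
    """Deviation of the qualities C from the rating labels: 1 marks a real
    image of the second domain, 0 an output image of the generator network."""
    b_t = torch.ones_like(c_train)    # rating label for training images
    b_a = torch.zeros_like(c_out)     # rating label for generated images
    if measure == "mse":
        return F.mse_loss(c_train, b_t) + F.mse_loss(c_out, b_a)
    return (F.binary_cross_entropy(c_train, b_t)
            + F.binary_cross_entropy(c_out, b_a))
```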
  • the similarity quantity depends on or corresponds to an SSIM index for a structural similarity between one of the input images and an output image generated by the generator network from the relevant input image.
  • the first and/or the second neural network can be configured as a convolutional neural network, wherein in particular the first and/or the second neural network comprises a series connection of some convolutional layer blocks (convolution blocks), some ResNet blocks and some deconvolutional blocks, each of which may contain a ReLU, leaky-ReLU, tanh or sigmoid function as an activation function; a sketch of such a generator follows.
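
A generator of this kind could be sketched as follows in PyTorch; all concrete values (channel counts, kernel sizes, number of ResNet blocks, the instance normalization) are illustrative assumptions, not values taken from the patent.

```python
import torch.nn as nn

class ResNetBlock(nn.Module):
    """Residual block: two 3x3 convolutions with a skip connection."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

    def forward(self, x):
        return x + self.body(x)

def make_generator(in_ch=3, out_ch=3, ch=64, n_res=6):
    """Series connection of convolution blocks, ResNet blocks and a
    deconvolution block, with ReLU activations and a tanh output layer."""
    return nn.Sequential(
        nn.Conv2d(in_ch, ch, 7, padding=3), nn.ReLU(inplace=True),
        nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        *[ResNetBlock(2 * ch) for _ in range(n_res)],
        nn.ConvTranspose2d(2 * ch, ch, 3, stride=2, padding=1,
                           output_padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(ch, out_ch, 7, padding=3), nn.Tanh())
```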
  • the generator error value may depend on a deviation measure for the deviation between the respective quality of the output image provided by the generator network as a function of the input image and a rating label from the discriminator network indicating an image of the second domain, wherein the deviation measure corresponds in particular to a mean squared error or a binary cross entropy.
  • the training of the discriminator network and / or the generator network can only be performed if a condition dependent on the current discriminator error value and / or on the generator error value is satisfied.
  • a method for providing a controller for a technical system, in particular for a robot, a vehicle, a tool or a factory machine, is also provided, wherein the above method for training a first neural network is carried out and the trained first neural network is used to generate training images, i.e. output images of the second domain, with which the controller, which in particular contains a neural network, is trained.
  • the technical system can be operated using the controller.
  • a use of a first neural network trained in accordance with the above method is provided for generating photorealistic output images in a second domain as a function of predetermined input images in a first domain, which are created in particular via a script-based description.
  • a GAN network for training a first neural network to convert an input image of a first domain into an output image of a second domain is also provided, wherein the training is performed on input images of the first domain provided for training and on training images of the second domain.
  • the GAN network comprises a generator network comprising the first neural network and a discriminator network comprising a second neural network, the GAN network being adapted to:
  • train the discriminator network based on a discriminator error value and on one or more training images and/or one or more output images generated by the generator network by processing one or more of the input images, the discriminator error value being determined as a function of a respective quality of the one or more training images and/or of the one or more output images; and
  • train the generator network based on an input image provided for training and on a generator error value as described above.
  • Figures 1a and 1b show exemplary representations of an image of a first presentation style and of a corresponding image of a second presentation style;
  • Figure 2 shows a block diagram illustrating a system for training a GAN network to convert an input image of a first presentation style into an output image of a second presentation style; and Figure 3 shows a flow chart illustrating a method for training a neural network for converting an input image into an output image of a different presentation style.
  • a neural network is to be trained which is able to convert an input image into an output image.
  • the goal is that an input image in a first domain, i.e. in a first display style, is converted into an output image corresponding to the input image in a second domain, i.e. in a second style of presentation different from the first.
  • Presentation style herein refers to a representation of information contained in the corresponding image.
  • for example, a segmentation image indicating a segmentation of object and background areas, or another artificially generated (synthetic) image such as a sketch, may serve as an input image and represent a template from which a photorealistic image is generated as an output image, so that the input image and the output image correspond to different presentation styles.
  • Figures 1a and 1b show exemplary representations of a synthetic image and of a photorealistic image corresponding to the synthetic image, in sketch form and as a realistic representation, respectively.
  • a possible application of such a trained neural network could be to convert a given input image in the form of a segmentation image, in which only segmentation ranges are given, into an artificially generated photorealistic output image.
  • according to Figure 1, for example, a segmentation image (Figure 1a), in which only areas are marked, for example display areas for a carriageway area, a built-up area, a vegetation area, other vehicles, pedestrians, cyclists or other objects, is converted into a corresponding photorealistic image (Figure 1b).
  • Such a photorealistic image may then be used in a test or development environment for testing and / or creating driver assistance functions or autonomous driving functions.
  • FIG. 2 essentially shows a basic structure of a GAN network 1 with a generator network 2 comprising a first neural network and a discriminator network 3 comprising a second neural network.
  • the first and / or second neural network may in particular be designed as convolutional neural networks or other types of neural networks.
  • various architectures known per se are conceivable for the first neural network of the generator network 2.
  • for example, a series connection of a few convolutional layer blocks (convolution blocks), some ResNet blocks and a few deconvolutional blocks can be selected.
  • Each of these blocks may optionally include a batch normalization or another type of normalization.
  • Each of the blocks may further contain no, one or more activation functions, such as a ReLU, leaky-ReLU, tanh or sigmoid function.
  • the generator network 2 is designed to generate an output image A of a second presentation style based on an input image E of a first presentation style.
  • the input image E can be an image with one or more color channels, in particular three color channels, and the output image A can be a tensor of the same or a different format. Alternatively, a random tensor may be added to the input image E to give the output image A higher variability, as sketched below.
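
A common way to "add" such a random tensor is to concatenate it to the input image along the channel dimension; the following sketch assumes this choice, and the noise shape is likewise an assumption.

```python
import torch

def with_noise(input_img, noise_ch=1):
    """Concatenate a random tensor to the input image E (channel-wise) so
    that the generator can produce output images A with higher variability."""
    n, _, h, w = input_img.shape
    z = torch.randn(n, noise_ch, h, w, device=input_img.device)
    return torch.cat([input_img, z], dim=1)
```

The generator's first convolution then has to accept the correspondingly larger number of input channels.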
  • the generator network 2 is trained based on a provided generator error value GF, in particular using a backpropagation method.
  • the generator error value GF is generated in an evaluation block 4 and indicates, on the one hand, the structural similarity S or dissimilarity between the input image E and the output image A generated by the generator network 2 from the predetermined input image E (image similarity, i.e. similarity of the image content or the scene, regardless of the domain or presentation style) and, on the other hand, the quality C of the output image A.
  • the quality C of the output image A indicates the proximity of the presentation style of the output image A to the style of presentation of predetermined training images T.
  • the quality C of the output image A is determined by means of the discriminator network 3, to which the output image A produced is provided as input.
  • by taking into account the quality C during training of the generator network 2, it is achieved that the generated output image A assumes the second style of presentation.
  • by taking into account the structural similarity S between the input image E and the output image A, it is achieved that both images have the same image content.
  • the discriminator network 3 can be supplied with training images T, which are images of the second representation style and which are each provided with a rating label BT, which confirms the second presentation style of the training images.
  • the training images T may be provided with a rating label BT of 1, indicating that the training images T correspond to the second style of presentation.
  • the discriminator network 3 can also be supplied with the output images A generated by the generator network 2, which are provided with a rating label BA of 0, indicating that the presentation style of these images differs significantly from the second style.
  • the discriminator network 3 can, for example, be trained using the backpropagation method or another training method to determine the quality C of the output images A provided by the generator network 2.
  • when training the discriminator network 3, a discriminator error function DFK, such as a mean squared error, a binary cross entropy or another appropriate cost function, can be used. As a result, by influencing the generator error value, the discriminator network 3 contributes to the generator network 2 generating not only output images A corresponding to the second display style, but output images A that at the same time have the same image content as the input image E of the first presentation style supplied to the generator network 2.
  • the generator network 2 is trained with the generator error value GF by means of a backpropagation method or another training method, the generator error value GF being determined by the structural similarity S between the input image E and the output image A generated by the generator network 2, and by the quality C of this output image A as determined by the discriminator network 3.
  • a tensor Bx is provided; this can be multidimensional or correspond to a real number.
  • the tensor Bx corresponds to the evaluation label and can indicate 1 for the training images and 0 for the images generated by the generator network.
  • the rating labels thus correspond to B1 for a training image T and B0 for an output image A generated by the generator network.
  • the dimension of the evaluation label B is essentially freely selectable and depends on the selected network architecture.
  • the evaluation label B can also be provided with a different normalization; in particular, so-called soft evaluation labels B can be used, i.e. instead of the values 1 and 0, correspondingly slightly noisy values can be assumed, whereby the stability of the training can be improved depending on the application (see the sketch below).
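
Such soft labels could be produced, for example, as follows; the noise amplitude of 0.05 is an arbitrary assumption of this sketch.

```python
import torch

def soft_label(shape, real=True, noise=0.05):
    """Slightly noisy rating labels instead of the hard values 1 and 0."""
    if real:
        return 1.0 - noise * torch.rand(shape)   # e.g. values in [0.95, 1.0]
    return noise * torch.rand(shape)             # e.g. values in [0.0, 0.05]
```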
  • the mapping performed by the discriminator network 3 corresponds to DθD, where θD are the discriminator parameters (weights) of the neural network of the discriminator network to be optimized.
  • the mapping performed by the generator network 2 corresponds to GθG, where θG are the generator parameters (weights) of the neural network of the generator network 2 to be optimized.
  • the discriminator error function for training the discriminator network 3 serves to determine the discriminator error value DF used in the parameter-optimizing training of the discriminator parameters θD.
  • the loss function comprises several addends lD.
  • the discriminator error function DFK used for this training of the discriminator network 3 must realize a deviation measure lD that indicates how far C(T) and C(A) differ from the corresponding evaluation labels BT and BA. Suitable choices for the deviation measure lD are, for example, the MSE (mean squared error) or the BCE (binary cross entropy).
  • a generator error function is used to determine a generator error value consisting of two parts. A first part corresponds to a deviation measure lG between the quality C of the output image AT generated from an input image ET provided for training and a rating label B indicating complete achievement of the second display style, in particular the rating label BT given to training images T for the training of the discriminator network 3, preferably a rating label of 1. Here, too, the MSE (mean squared error) or the BCE (binary cross entropy) are suitable deviation measures.
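
Putting both parts together, a generator error value could be sketched as follows; that the similarity quantity S enters as (1 - S) and that the two parts are summed without weighting are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def generator_error(c_out, ssim_value, measure="mse"):
    """Two-part generator error: deviation measure l_G between the quality
    C(AT) and the rating label 1 (full achievement of the second display
    style), plus a part derived from the similarity quantity S."""
    b = torch.ones_like(c_out)               # target rating label
    l_g = (F.mse_loss(c_out, b) if measure == "mse"
           else F.binary_cross_entropy(c_out, b))
    return l_g + (1.0 - ssim_value)          # S = 1 means identical structure
```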
  • the second part of the generator error function corresponds to a similarity quantity S, which is determined in a similarity block 6 by means of a similarity evaluation function.
  • the similarity evaluation function calculates a measure of the structural similarity of the two images based on the input image ET of the first presentation style and the output image AT of the second presentation style generated from it by the generator network 2.
  • a function may be provided as a similarity evaluation function which assumes a value close to 1 for high structural similarity and a value near -1 for no structural similarity.
  • Suitable as a similarity evaluation function is, for example, the so-called SSIM function, which indicates an index of structural similarity, or the MSSIM based thereon; see Zhou Wang et al., "Image Quality Assessment: From Error Visibility to Structural Similarity", IEEE Transactions on Image Processing, vol. 13, no. 4, April 2004, pp. 600-612.
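
For reference, the SSIM index of Wang et al. compares two image windows x and y via their local statistics:

```latex
\mathrm{SSIM}(x,y) =
  \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

Here μx, μy are the local means, σx², σy² the local variances, σxy the covariance of the two windows, and C1, C2 are small constants that stabilize the division; the MSSIM is obtained by averaging the SSIM over all windows of the image pair.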
  • in the following, the training method for the first neural network of the generator network 2 is described; after this training, the trained generator network 2 can be used to change the display style of an input image E.
  • first, an initial parameterization of the first neural network with the generator parameters θG and of the second neural network with the discriminator parameters θD is performed. In step S1, the generator network 2 then generates an output image AT from an input image ET provided for training.
  • with the aid of the discriminator network 3, a quality C(AT) of the output image AT generated in step S1 is determined in step S2.
  • in step S3, the similarity quantity S between the input image ET provided for training and the corresponding output image AT is calculated.
  • a generator error value GF for the generated output image AT is determined using the generator error function GFF in step S4.
  • a learning step for the first neural network of the generator network 2 is performed in step S5, in particular based on a backpropagation method.
  • the generator parameters θG are updated based on the partial derivatives ∂GF/∂θG.
  • steps S1 to S5 of the training of the generator network 2 can be repeated with the same or with another input image ET provided for training.
  • in step S8, a quality C(A1..m) corresponding to one or more last-generated output images A1..m is determined, and from it, in step S9, the deviation measures lD(A1..m) are determined, for example as lD(A) = MSE(DθD(A), BA) or lD(A) = BCE(DθD(A), BA).
  • from these deviation measures, a discriminator error value DF is determined in step S10, for example as the sum of the individual deviation measures.
  • a learning step for the second neural network of the discriminator network 3 may be performed in step S11.
  • in this way, the discriminator parameters θD are updated in a backpropagation method using the corresponding partial derivatives ∂DF/∂θD.
  • the backpropagation method can also be carried out only based on a training image T and / or an output image A.
  • for training the discriminator network 3, not only images generated in the second display style but also further training images in the first presentation style, with a rating label of 0 (or near 0), may be used. This makes it easier for the discriminator, if necessary, to better learn the differences between the two domains.
  • in step S12, an abort condition is checked. If the abort condition is not fulfilled (alternative: no), the method is continued with step S1; otherwise (alternative: yes), the method is continued with step S13.
  • An abort condition can be, for example, the completion of a number of passes, the achievement of a predetermined discriminator error value DF and/or generator error value GF, or the achievement of a predetermined quality C(A) of the output images A generated by the generator network 2 (see the sketch below).
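
Expressed as code, such a check of the abort condition in step S12 might look as follows; all thresholds are assumed placeholders.

```python
def should_stop(passes, df, gf, quality, max_passes=100_000,
                df_target=None, gf_target=None, quality_target=None):
    """Abort condition: a number of passes completed, or a predetermined
    error value DF / GF or output quality C(A) reached."""
    if passes >= max_passes:
        return True
    if df_target is not None and df <= df_target:
        return True
    if gf_target is not None and gf <= gf_target:
        return True
    return quality_target is not None and quality >= quality_target
```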
  • the generator network 2 available in step S13 now represents a system for converting an input image E of a first presentation style, or a first domain, into an output image A of a second presentation style, or a second domain.
  • the discriminator parameters θD and generator parameters θG may also be updated only under certain conditions, e.g. depending on the current discriminator error value DF for training the discriminator network 3 and on the generator error value GF for training the generator network 2.
  • the size of the batches for the training of the discriminator network 3 or the generator network 2 can be varied.
  • in addition, an input image deviation measure, which adds a deviation of the quality C of the input image from a rating label BA for a fake image, i.e. an output image A generated by the generator network, can be added additively. This can increase the stability of the training.
  • the trained generator network 2 can then be used to generate output images from input images E created via a script-based description, e.g. input images E of a first presentation style that show traffic situations. If the generator network 2 has been trained based on images of the first representation style and photorealistic images of traffic situations, the artificially generated input images E can be assigned photorealistic images that represent a corresponding traffic situation. As a result, the generator network 2 can be used to create any number of photorealistic images that represent desired traffic situations.
  • the generator network 2 can also be trained in the reverse direction to convert photorealistic images into synthetic images, for example to remove reflections or the like from the photorealistic images, for example when a classifier can classify synthetic images better than photorealistic images.
  • the above system may also be trained to create segmented images from photorealistic images, in which case the photorealistic images correspond to the first style of presentation and the segmented images to the images of the second style of presentation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a method for training a first neural network for converting an input image (E) of a first domain into an output image (A) of a second domain, the training being performed on input images (E) of the first domain provided for the training and on training images (T) of the second domain. The method comprises the following steps: - using a GAN network with a generator network (2), which comprises the first neural network, and with a discriminator network (3), which comprises a second neural network; - training the discriminator network (3) based on a discriminator error value (DF) and on one or more training images (T) and/or one or more output images (A), which are generated by the generator network (2) by processing one or more input images, the discriminator error value (DF) being determined as a function of a respective quality factor (C) of one or more training images (T) and/or of one or more output images; - training the generator network (2) based on an input image (E) provided for the training and on a generator error value (GF) that depends on a quality factor (C) of the output image (A) provided by the generator network (2) as a function of the input image (E), and on a similarity quantity (S) between the input image (E) and the output image (A), which indicates a measure of structural similarity.
EP19721223.6A 2018-04-23 2019-04-18 Method and device for converting an input image of a first domain into an output image of a second domain Pending EP3785169A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DE102018206199 2018-04-23
DE102018206806.2A DE102018206806A1 (de) 2018-04-23 2018-05-03 Method and device for converting an input image of a first domain into an output image of a second domain
PCT/EP2019/060047 WO2019206792A1 (fr) 2018-04-23 2019-04-18 Method and device for converting an input image of a first domain into an output image of a second domain

Publications (1)

Publication Number Publication Date
EP3785169A1 true EP3785169A1 (fr) 2021-03-03

Family

ID=68105256

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19721223.6A Pending EP3785169A1 (fr) Method and device for converting an input image of a first domain into an output image of a second domain

Country Status (3)

Country Link
EP (1) EP3785169A1 (fr)
DE (1) DE102018206806A1 (fr)
WO (1) WO2019206792A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7046786B2 (ja) * 2018-12-11 2022-04-04 Hitachi, Ltd. Machine learning system, domain conversion device, and machine learning method
US11745749B2 (en) * 2019-12-30 2023-09-05 Magna Electronics Inc. Vehicular system for testing performance of object detection algorithms
US11690579B2 (en) * 2020-06-16 2023-07-04 Shanghai United Imaging Intelligence Co., Ltd. Attention-driven image domain translation
DE102021200374A1 (de) 2021-01-15 2022-07-21 Volkswagen Aktiengesellschaft Digital representation of a material
CN114610677B (zh) * 2022-03-10 2024-07-23 Tencent Technology (Shenzhen) Company Limited Method for determining a conversion model and related apparatus

Also Published As

Publication number Publication date
DE102018206806A1 (de) 2019-10-24
WO2019206792A1 (fr) 2019-10-31

Similar Documents

Publication Publication Date Title
WO2019206792A1 (fr) Method and device for converting an input image of a first domain into an output image of a second domain
DE102019202090A1 (de) Method for generating a training data set for training an artificial intelligence module for a control device of a robot
EP3393875B1 (fr) Method for improved detection of objects by a driver assistance system
DE102019209644A1 (de) Method for training a neural network
WO2020048669A1 (fr) Method for determining lane change information of a vehicle, computer-readable storage medium, and vehicle
EP3748453B1 (fr) Method and device for automatically executing a control function of a vehicle
DE102019216206A1 (de) Device and method for determining a U-turn strategy of an autonomous vehicle
WO2020051618A1 (fr) Analysis of dynamic spatial scenarios
DE102019208733A1 (de) Method and generator for generating disturbed input data for a neural network
WO2019110177A1 (fr) Training and operation of a machine learning system
DE102018130004B3 (de) Support-vector-machine-based intelligent driving method for passing intersections and intelligent driving system therefor
DE102019208735A1 (de) Method for operating a driving assistance system of a vehicle and driver assistance system for a vehicle
DE102019105850A1 (de) Method for generating a reduced neural network for a control unit of a vehicle using eigenvectors
EP3748454B1 (fr) Method and device for automatically executing a control function of a vehicle
DE102020109364A1 (de) Method and device for determining and classifying at least one object in a detection range of a sensor
DE102020105070A1 (de) Method for recognizing a drivable area in the surroundings of a vehicle using a binary artificial neural network, computing device, and driver assistance system
DE102018129871A1 (de) Training a deep convolutional neural network for processing sensor data for use in a driving support system
DE102021133977A1 (de) Method and system for classifying scenarios of a virtual test, and training method
WO2022043203A1 (fr) Training a generator for generating realistic images using a semantic segmentation discriminator
DE102019217951A1 (de) Method and device for determining a domain distance between at least two data domains
EP3772017A1 (fr) Railway signal detection for autonomous railway vehicles
DE102019217952A1 (de) Method and device for providing a training data set for training an AI function on an unknown data domain
DE102020211596A1 (de) Method for generating a trained convolutional neural network with an invariant integration layer for classifying objects
DE102019114049A1 (de) Method for validating a driver assistance system using further generated test input data sets
DE102021208472B3 (de) Computer-implemented method for training a machine learning model for a vehicle or a robot

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20201123

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20221215