EP4078941A2 - Conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data - Google Patents

Conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data

Info

Publication number
EP4078941A2
Authority
EP
European Patent Office
Prior art keywords
image data
output
neural network
brightness
cnn11
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP20841885.5A
Other languages
German (de)
English (en)
Inventor
Christian Scharfenberger
Michelle Karg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Continental Autonomous Mobility Germany GmbH
Original Assignee
Continental Autonomous Mobility Germany GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Continental Autonomous Mobility Germany GmbH filed Critical Continental Autonomous Mobility Germany GmbH
Publication of EP4078941A2 publication Critical patent/EP4078941A2/fr
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2624Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects for obtaining an image which is composed of whole input images, e.g. splitscreen
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/698Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/90Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Definitions

  • the invention relates to a machine learning method, a method and a device for converting input image data from a plurality of vehicle cameras of an all-round vision system into optimized output image data.
  • Today's vehicles are increasingly equipped with surround view and / or assistance systems that monitor the areas in front of, next to or behind the vehicle. This is used either to recognize objects to avoid collisions, to recognize road boundaries, to keep the vehicle within the lane or simply to display the surroundings to assist with a parking process.
  • DE 102014210323 A1 shows a device and a method for adaptive image correction of at least one image parameter of a camera image with: several cameras for generating camera images, the camera images from adjacent cameras each having overlapping image areas; and with an image processing unit which combines the camera images generated by the cameras to form a composite overall image; wherein the image processing unit has an image correction component that calculates a plurality of average image parameter levels of the image parameter in the overlapping image areas of the camera image for each received camera image and sets the respective image parameter as a function of the calculated average image parameter levels.
  • a prominent example is driving on an unlit country road at night.
  • the vehicle is equipped with a surround view system, which is supposed to offer both assistance and display functions while driving. While the vehicle illuminates the front and rear area through the front and rear headlights, the area next to the vehicle is almost unlit.
  • Another example is parking a vehicle in a dark corner of a parking garage.
  • the case occurs especially in parking positions next to walls or other vehicles in which too little or no light is available for the side cameras.
  • a system would therefore be desirable which algorithmically enables good upgrading of the unlit areas without additional lighting.
  • a method for machine learning of the conversion of input image data from several vehicle cameras of a surround view system into optimized output image data by means of an artificial neural network provides that the learning is carried out with a large number of training image pairs in such a way that a first image of a first brightness or color distribution is provided at the input of the artificial neural network and a second image of the same scene with a different, second brightness or color distribution is provided as the target output image.
  • the artificial neural network can be, for example, a convolutional neural network (CNN).
  • the vehicle cameras are preferably arranged and configured in such a way that, taken together, they capture and image the area of the vehicle environment surrounding the vehicle.
  • the training image pairs are generated by recording a first image with a first and a second image with a second brightness at the same time or immediately following one another with different exposure times.
  • a first, shorter exposure time leads to a darker training image and a second, longer exposure time to a lighter training image.
  • the respective vehicle camera is stationary (immobile) in relation to the surroundings to be recorded during the generation of the training data.
  • the training data can be recorded with at least one vehicle camera of a stationary vehicle, for example.
  • the scene captured by the vehicle camera can, for example, contain a static environment, that is to say without moving objects.
  • one artificial neural network is trained jointly or simultaneously for all vehicle cameras.
  • a sequence of successive images can be used for each individual camera for joint training.
  • the time correlation of images can be profitably taken into account during training and / or when using the trained network.
  • At least one factor d is determined as a measure for the difference between the second and the first brightness or color distribution of a training image pair and is provided to the artificial neural network as part of the training.
  • the factor d can be determined, for example, as the ratio of the second brightness or color distribution to the first brightness or color distribution.
  • the brightness can in particular be determined as the mean brightness of an image or on the basis of a luminance histogram of an image.
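  • As an illustration only (not part of the patent disclosure), the following sketch shows how such a factor d could be derived from the mean brightness of a training image pair; the function name and the use of NumPy are assumptions of this example:

```python
import numpy as np

def brightness_factor(dark_img: np.ndarray, bright_img: np.ndarray) -> float:
    """Estimate factor d as the ratio of mean brightness (target / input).

    Both images are expected as arrays with the same value range
    (e.g. [0, 255]); a small epsilon avoids division by zero for
    completely black inputs.
    """
    eps = 1e-6
    return float(bright_img.mean() + eps) / float(dark_img.mean() + eps)

# Synthetic example: a dark image and a version four times brighter.
dark = np.full((480, 640, 3), 20.0)
bright = np.clip(dark * 4.0, 0, 255)
d = brightness_factor(dark, bright)   # approximately 4.0
```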
  • the artificial neural network has a common input interface feeding two separate output interfaces.
  • the common input interface has shared feature representation layers.
  • Converted image data are output at the first output interface.
  • ADAS-relevant detections of at least one ADAS detection function are output at the second output interface.
  • ADAS stands for Advanced Driver Assistance Systems, i.e. systems for assisted or automated driving (AD).
  • ADAS-relevant detections are, for example, objects and road users that represent important input variables for ADAS/AD systems.
  • the artificial neural network includes ADAS detection functions, e.g. lane recognition, object recognition, depth recognition (3D estimation of the image components), semantic recognition, or the like. As part of the training, the outputs of both output interfaces are optimized.
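  • As a minimal sketch of such a topology (layer sizes, the use of PyTorch and a single per-pixel classification head standing in for the ADAS detection functions are assumptions of this example, not the architecture disclosed in the patent):

```python
import torch
import torch.nn as nn

class SharedEncoderTwoHeads(nn.Module):
    """Shared feature representation layers with two output interfaces."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Common input interface: shared feature representation layers.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # First output interface: converted (vision-improved) image.
        self.image_head = nn.Conv2d(64, 3, 3, padding=1)
        # Second output interface: ADAS-relevant detections, here a
        # per-pixel classification head as a simple stand-in.
        self.detect_head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        features = self.encoder(x)          # shared representation
        return self.image_head(features), self.detect_head(features)

model = SharedEncoderTwoHeads()
converted, detections = model(torch.randn(1, 3, 128, 256))
```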
  • a method for converting input image data from a plurality of vehicle cameras of a surround view system into optimized output image data comprises the steps: a) input image data of a current brightness or color distribution recorded by the vehicle cameras are provided to a trained artificial neural network, b) the trained artificial neural network is configured to convert the input image data with the current brightness or color distribution into output image data with a different output brightness or color distribution, and c) the trained artificial neural network is configured to output the output image data.
  • the output image data optimized in terms of their brightness or color distribution advantageously enable the images from the individual vehicle cameras to be better combined to form a combined image which can be displayed to the driver.
  • a factor d is additionally provided to the trained artificial neural network and in step b) the (strength or degree of) conversion is controlled as a function of factor d.
  • the conversion in step b) takes place in such a way that an improvement in vision with regard to overexposure is achieved. For example, the network learned during training to reduce the brightness of overexposed images or to adjust their color distribution.
  • in step b), the input image data with the current brightness are converted into output image data with a longer (virtual) exposure time. This offers the advantage of avoiding motion blur.
  • the factor d is estimated and in the estimation the brightness or color distribution of the current captured image data (e.g. illuminance histogram or average brightness) or the previously captured image data or the history of the factor d is taken into account.
  • too high a brightness indicates overexposure; too low a brightness indicates underexposure. Both can be determined by means of corresponding threshold values and corrected by a corresponding conversion.
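  • A minimal sketch of such a threshold-based runtime estimate of the factor d (the thresholds and the mapping to d are illustrative assumptions):

```python
import numpy as np

def estimate_factor_d(image: np.ndarray,
                      low_thr: float = 60.0,
                      high_thr: float = 200.0) -> float:
    """Estimate factor d from the mean brightness of an 8-bit image.

    A mean below low_thr indicates underexposure (d > 1, brighten),
    a mean above high_thr indicates overexposure (d < 1, darken),
    otherwise the image is left essentially unchanged (d = 1).
    """
    mean = float(image.mean())
    if mean < low_thr:
        return min(low_thr / max(mean, 1.0), 8.0)   # cap the brightening
    if mean > high_thr:
        return high_thr / mean
    return 1.0
```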
  • a separate factor d is estimated or determined for each of the vehicle cameras. This enables the individual conversion for image data from the individual vehicle cameras, in particular as a function of the current brightness or color distribution of the image from the respective vehicle camera.
  • a different factor d is estimated or determined for each of the image regions. If there are image regions with different illumination intensities, the factor d can thus vary within an image and image regions with different factors d are determined via brightness estimates. The brightness improvement can thus be adapted to individual image regions.
  • a development of factor d over time can be taken into account when determining or estimating factor d.
  • the development of factor d over time and a sequence of input images are included in the estimate.
  • Information about the temporal development of the brightness can also be used for image regions with different factors d.
  • information about the current surroundings of the vehicle is taken into account when determining the factor d.
  • the estimate of the factor d can take into account further scene information, such as information about the surroundings (country road, city, motorway, tunnel, underpass), which is obtained via image processing from the sensor data or data from a navigation system (e.g. GPS receiver with digital map).
  • the factor d can be estimated based on information about the surroundings and from the chronological sequence of images as well as from the history of the factor d.
  • the estimation of the factor d when using a trained artificial neural network can thus take place dynamically.
  • the converted image data is output to at least one ADAS detection function which determines and outputs ADAS-relevant detections.
  • ADAS detection functions can include known edge or pattern recognition methods as well as recognition methods which can recognize and optionally classify relevant image objects by means of an artificial neural network.
  • the approach can be expanded and the artificial neural network for converting the image data can be combined with a neural network for ADAS detection functions, for example lane detection, object detection, depth detection, semantic detection.
  • the invention further relates to a device with at least one data processing unit configured to convert input image data from a plurality of vehicle cameras of a panoramic vision system into optimized output image data.
  • the device comprises: an input interface, a trained artificial neural network and a (first) output interface.
  • the input interface is configured to receive input image data of a current brightness or color distribution from the vehicle cameras.
  • the trained artificial neural network is configured to convert the input image data, which have a first brightness or color distribution, into output image data with a different output brightness or color distribution.
  • the (first) output interface is configured to output the converted image data.
  • the device or the data processing unit can in particular be a microcontroller or processor, a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) and the like, as well as software for carrying out the corresponding method steps.
  • the data processing unit is implemented in a hardware-based image preprocessing stage (Image Signal Processor, ISP).
  • the trained artificial neural network for converting input image data into output image data with optimized brightness or color distribution is part of an ADAS detection neural network in the vehicle, e.g. for semantic segmentation, lane detection or object detection, with a shared input interface (input or feature representation layers) and two separate output interfaces (output layers).
  • the first output interface is configured to output the converted output image data and the second output interface is configured to output the ADAS detections (image recognition data).
  • the invention further relates to a computer program element which, when a data processing unit is programmed with it, instructs the data processing unit to carry out a method for converting input image data from the vehicle cameras into optimized output image data.
  • the invention also relates to a computer-readable storage medium on which such a program element is stored.
  • the invention also relates to the use of a method for machine learning of the conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data for training an artificial neural network of a device with at least one data processing unit.
  • the present invention can thus be implemented in digital electronic circuitry, computer hardware, firmware, or software.
  • FIG. 1 shows a first schematic representation of a device according to the invention in one embodiment
  • FIG. 5 shows a neural network for improving the view of an input image, which shares feature representation layers with a second network for the detection functions and has two outputs; and
  • FIG. 6 shows a modified approach based on FIG. 5.
  • a device 1 according to the invention for converting input image data from several vehicle cameras of a surround view system into optimized output image data can have several units or circuit components.
  • the device for adaptive image correction has a plurality of vehicle cameras 2-i which each generate camera images or video data.
  • the device 1 has four vehicle cameras 2-i for generating camera images.
  • the number of vehicle cameras 2-i can vary for different applications.
  • the device 1 according to the invention has at least two vehicle cameras for generating camera images.
  • the camera images from neighboring vehicle cameras 2-i typically have overlapping image areas.
  • the device 1 contains a data processing unit 3 which combines the camera images generated by the vehicle cameras 2-i to form a composite overall image.
  • the data processing unit 3 has a system for image conversion 4.
  • the image conversion system 4 uses the input image data (Ini) of the vehicle cameras (2-i) to generate output or output image data (Opti) which have an optimized brightness or color distribution.
  • the optimized output image data from the individual vehicle cameras are combined to form a composite overall image (so-called stitching).
  • the overall image composed of the optimized image data (Opti) by the image processing unit 3 is then displayed to a user by a display unit 5.
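  • For illustration, a much simplified sketch of the composition step (the geometric warping of the fisheye images into canvas coordinates is omitted; only the blending of already rectified, brightness-optimized views is shown, and the function names are assumptions of this example):

```python
import numpy as np

def blend_views(views, masks, canvas_shape):
    """Blend pre-warped camera views (H, W, 3) into one top-down canvas.

    masks are float arrays (H, W) with 1 where a view has valid pixels;
    overlapping areas are averaged, which keeps brightness seams small
    once the per-camera images have already been optimized.
    """
    acc = np.zeros(canvas_shape + (3,), dtype=np.float64)
    weight = np.zeros(canvas_shape, dtype=np.float64)
    for view, mask in zip(views, masks):
        acc += view * mask[..., None]
        weight += mask
    return acc / np.maximum(weight, 1e-6)[..., None]
```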
  • the image conversion system 4 is formed by an independent hardware circuit which converts the brightness or the color distribution.
  • the system executes program instructions when performing an image conversion method.
  • the data processing unit 3 can have one or more image processing processors, converting the camera images or video data received from the various vehicle cameras 2-i and then putting them together to form a composite overall image (stitching).
  • the image conversion system 4 is formed by a processor provided for this purpose, which converts the brightness or the color distribution in parallel with the other processor or processors of the data processing unit 3.
  • the parallel data processing reduces the time required to process the image data.
  • Figure 2 shows a further schematic representation of a device 1 according to the invention in one embodiment.
  • the device 1 shown in FIG. 2 is used in a surround view system of a vehicle 10, in particular a passenger car or a truck.
  • the four different vehicle cameras 2-1, 2-2, 2-3, 2-4 can be located on different sides of the vehicle 10 and have corresponding viewing areas (dashed lines) in front of (V), behind (H), to the left (L) and to the right (R) of the vehicle 10.
  • the first vehicle camera 2-1 is located on a front side of the vehicle 10
  • the second vehicle camera 2-2 is located on a rear side of the vehicle 10
  • the third vehicle camera 2-3 is located on the left side of the vehicle 10
  • the fourth vehicle camera 2-4 is on the right side of the vehicle 10.
  • the camera images from two adjacent vehicle cameras 2-i have overlapping image areas VL, VR, HL, HR.
  • the vehicle cameras 2-i are so-called fisheye cameras which have a viewing angle of at least 185 °.
  • the vehicle cameras 2 - i can transmit the camera images or camera image frames or video data in one possible embodiment to the data processing unit 3 via an Ethernet connection.
  • the data processing unit 3 calculates a composite surround view camera image from the camera images of the vehicle cameras 2-i, which is displayed on the display 5 of the vehicle 10 to the driver and / or a passenger.
  • the light conditions in the surroundings deviate from each other in the front vehicle camera 2-1 and in the rear vehicle camera 2-2 while driving, for example when entering a vehicle tunnel or when driving into a vehicle garage.
  • the activated front headlights illuminate the front area V in front of vehicle 10 with white light and relatively high intensity
  • the rear headlights illuminate the rear area H behind the vehicle with red light and medium intensity.
  • the areas to the left L and right R next to the vehicle 10 are virtually unlit.
  • in a first application, ground truth data are preferably used which have a brightness and balance that are the same for all target cameras 2-1, 2-2, 2-3, 2-4.
  • the ground truth data for all target cameras 2-1, 2-2, 2-3, 2-4 are balanced in such a way that, for example, in a surround view application, no differences in brightness are discernible in the ground truth data.
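  • One conceivable way (an assumption of this example, not a formulation from the patent) to encourage such balanced behaviour during training is an auxiliary loss that penalizes brightness differences between the outputs of adjacent cameras in their overlapping fields of view (VL, VR, HL, HR in Figure 2):

```python
import torch

def overlap_consistency_loss(out_a, out_b, mask_a, mask_b):
    """Compare mean brightness of two adjacent camera outputs in their overlap.

    out_a, out_b: network outputs of shape (B, 3, H, W).
    mask_a, mask_b: float masks (B, 1, H, W), 1 inside the overlap region.
    """
    channels = out_a.shape[1]
    mean_a = (out_a * mask_a).sum() / (mask_a.sum().clamp(min=1) * channels)
    mean_b = (out_b * mask_b).sum() / (mask_b.sum().clamp(min=1) * channels)
    return (mean_a - mean_b).abs()
```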
  • a neural network CNN1, CNN10, CNN11, CNN12 is trained with regard to an optimal parameter set for the network.
  • This data record can consist, for example, of images with white and red headlights for the front cameras 2-1 and rear cameras 2-2, and dark images for the side cameras 2-3, 2-4.
  • Data with differently illuminated side areas L, R are also conceivable, for example when the vehicle 10 is located next to a street lamp or the vehicle 10 has an additional light source on one side.
  • the neural network for the common cameras 2-i can be trained in such a way that, even in the case of missing training data and ground truth data for one camera, for example a side camera 2-3 or 2-4, the network parameters for this camera 2-3 or 2-4 are trained and optimized on the basis of the training data of the other cameras 2-1, 2-2 and 2-4 or 2-3.
  • This can be achieved, for example, as a restriction (or constraint) in the training of the network, for example as an assumption that the correction and training must always be the same due to similar lighting conditions of the side cameras 2-3 and 2-4.
  • the neural network uses training and ground truth data that differ over time and that are correlated with the cameras 2-i and that were recorded by the various cameras 2-i at different points in time.
  • information from features or objects and their ground truth data can be used which, for example, were recorded at a point in time t by the front camera 2-1 and at a point in time t + n by the side cameras 2-3, 2-4.
  • These features or objects and their ground truth data can replace missing information in the training and ground truth data of the respective other cameras if they appear in the images of those other cameras 2-i and are then used by the network as training data.
  • the network can thus optimize the parameters for all side cameras 2-3, 2-4 and, if necessary, compensate for missing information in the training data.
  • An essential component is an artificial neural network CNN1, which learns in a training phase to assign a set of corresponding vision-improved images Out (Out1, Out2, Out3, ...) to a set of training images In (In1, In2, In3, ...).
  • Assigning means here that the neural network CNN1 learns to generate an image with improved vision.
  • a training image (In1, In2, In3, ...) can contain, for example, a street scene at dusk, on which the human eye can only see another vehicle immediately in front of the vehicle and the sky.
  • On the corresponding image with improved visibility (Out1, Out2, Out3, ...) the contours of the other vehicle, a sidewalk as a lane boundary and adjacent buildings can also be seen.
  • a factor d is preferably used as an additional input variable for the neural network CNN1.
  • the factor d is a measure of the degree of visibility improvement.
  • the factor d for an image pair consisting of the training image and the image with improved vision can be determined in advance and made available to the neural network CNN1.
  • a factor d can be used to control how strongly the neural network CNN1 "brightens" or "darkens" an image; one can also think of factor d as an external regression parameter (not just bright or dark, but with any gradation). Since the factor d can be subject to fluctuations in the range of +/- 10%, this is taken into account during training.
  • the factor d can be noised during training by approx. +/- 10% (e.g. across the different epochs of the training of the neural network) in order to be robust against incorrect estimates of the factor d in the range of approx. +/- 10% during inference in the vehicle.
  • the necessary accuracy of factor d is thus in the range of +/- 10%; the neural network CNN1 is therefore robust against deviations in the estimate of this parameter.
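  • A minimal sketch of how such a perturbation of factor d could be applied during training (the helper name is an assumption of this example):

```python
import random

def jitter_factor_d(d: float, rel_noise: float = 0.10) -> float:
    """Randomly perturb factor d by up to +/- 10 percent.

    Drawing a fresh perturbation for every training sample or epoch makes
    the network robust against comparable estimation errors of d at
    inference time in the vehicle.
    """
    return d * (1.0 + random.uniform(-rel_noise, rel_noise))
```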
  • One possibility for generating the training data (training images (In1, In2, In3, ...) and associated vision-enhanced images (Out1, Out2, Out3, ...)) is to record image data of a scene with a short exposure time and, simultaneously or in immediate succession, with a long exposure time.
  • image pairs (In1, Out1; In2, Out2; In3, Out3; ...) with different factors d can be recorded for a scene in order to learn a continuous spectrum for the improvement of vision depending on the parameter or factor d.
  • the vehicle camera 2-i is preferably stationary (immobile) with respect to the surroundings to be recorded during the generation of the training data.
  • the training data can be recorded by means of a vehicle camera 2 - i of a stationary vehicle 10.
  • the scene captured by the vehicle camera 2-i can in particular contain a static environment, that is to say without moving objects.
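  • A sketch of how (input, target, d) training triples for one static scene could be organized; note that the patent text describes recording a real short- and long-exposure image pair, whereas this illustration simply simulates the darker inputs by scaling down a single bright reference image:

```python
import numpy as np

def make_training_pairs(bright_ref: np.ndarray, factors=(2.0, 4.0, 8.0)):
    """Build (input image, target image, factor d) triples for one scene.

    bright_ref is the long-exposure (vision-improved) reference; darker
    inputs with several factors d approximate shorter exposure times.
    """
    pairs = []
    for d in factors:
        dark_input = np.clip(bright_ref / d, 0, 255).astype(bright_ref.dtype)
        pairs.append((dark_input, bright_ref, d))
    return pairs

scene = np.random.randint(0, 256, size=(480, 640, 3)).astype(np.float32)
training_pairs = make_training_pairs(scene)
```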
  • the trained neural network CNN1 receives original input image data (Ini) from the multiple cameras 2-i as input.
  • a factor d can be specified or determined by the neural network CNN1 itself on the basis of the input image data (Ini).
  • the factor d specifies (controls) how strongly the input image data should be converted.
  • the neural network calculates image data (Opti) with improved visibility from the multiple cameras 2-i.
  • the optimized image data (Opti) of the multiple cameras 2-i are output.
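  • A sketch of this inference step across all cameras with a per-camera factor d; the trivial brightness-scaling stub merely stands in for the trained network CNN1 and is an assumption of this example:

```python
import numpy as np

def convert_stub(frame: np.ndarray, d: float) -> np.ndarray:
    """Stand-in for the trained network CNN1: scales brightness by d."""
    return np.clip(frame * d, 0, 255)

def optimize_all_cameras(frames: dict, factors: dict, convert=convert_stub):
    """Apply the conversion per camera with an individual factor d."""
    return {cam: convert(img, factors[cam]) for cam, img in frames.items()}

frames = {cam: np.random.rand(480, 640, 3) * 40
          for cam in ("front", "rear", "left", "right")}
factors = {"front": 1.5, "rear": 1.8, "left": 4.0, "right": 4.0}
optimized = optimize_all_cameras(frames, factors)
```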
  • FIGS. 5 and 6 show exemplary embodiments for possible combinations of a first network for improving visibility with one or more networks of the functions for driver assistance functions and automated driving.
  • FIG. 5 shows a neural network CNN10 for improving the view of an input image (Ini), possibly controlled by a factor d, which shares feature representation layers (as input or lower layers) with a network for the detection functions (fn1, fn2, fn3, fn4).
  • the detection functions (fn1, fn2, fn3, fn4) are image processing functions that detect objects, structures, properties (generally: features) in the image data that are relevant for ADAS or AD functions.
  • Many such detection functions (fn1, fn2, fn3, fn4), which are based on machine learning, have already been developed or are the subject of current developments (e.g. traffic sign classification, object classification, semantic segmentation, depth estimation, lane marking recognition and localization).
  • Detection functions (fn1, fn2, fn3, fn4) of the second neural network CNN2 deliver better results on visually enhanced images (Opti) than on the original input image data (Ini) in poor visibility conditions.
  • the neural network CNN10 with shared input layers and two separate outputs has a first output CNN11 for outputting the visually enhanced output image (Opti) and a second output CNN12 for outputting the detections: objects, depth, lane, semantics, etc.
  • since the feature representation layers are optimized during training with regard to both the vision improvement and the detection functions (fn1, fn2, fn3, fn4), an optimization of the vision improvement also leads to an improvement of the detection functions (fn1, fn2, fn3, fn4).
  • FIG. 6 shows an approach based on the system of FIG. 5 for neural network-based vision improvement by optimizing the features.
  • the features for the detection functions (fn1, fn2, fn3, fn4) are optimized during training with regard to improved visibility and with regard to the detection functions (fn1, fn2, fn3, fn4).
  • the detection functions (fn1, fn2, fn3, fn4), as already explained, are improved by the joint training of vision improvement and detection functions compared to a system with only one neural network (CNN2) for detection functions (fn1, fn2, fn3, fn4), in which only the detection functions (fn1, fn2, fn3, fn4) have been optimized during training.
  • an additional output interface (CNN11) outputs the improved brightness image (Opti) and compares it with the ground truth (the corresponding training image with improved visibility).
  • This output (CNN11) can continue to be used in the test phase or during runtime, or it can be cut off to save computing time.
  • the weights for the detection functions (fn1, fn2, fn3, fn4) are modified in this training with the additional output (CNN11) so that they take into account the brightness improvements for the detection functions (fn1, fn2, fn3, fn4).
  • the weights of the detection functions (fn1, fn2, fn3, fn4) thus implicitly learn the information about the brightness improvement.
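  • A minimal sketch of such a joint training objective for the two output interfaces (loss types, weights and the PyTorch formulation are assumptions of this example). At runtime, the image output (CNN11) can simply not be evaluated, which matches the remark above that it can be cut off to save computing time:

```python
import torch.nn.functional as F

def joint_loss(converted, target_img, detections, target_labels,
               w_img: float = 1.0, w_det: float = 1.0):
    """Joint objective for both output interfaces.

    converted / target_img: converted image and ground-truth
    vision-improved image (CNN11 branch), shape (B, 3, H, W).
    detections / target_labels: per-pixel class logits (B, C, H, W) and
    labels (B, H, W), standing in for the ADAS detection functions (CNN12).
    """
    loss_img = F.l1_loss(converted, target_img)
    loss_det = F.cross_entropy(detections, target_labels)
    return w_img * loss_img + w_det * loss_det
```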
  • an assistance system based on surround view cameras, which algorithmically converts the image data, despite darkness and lack of color information, into a representation that corresponds to a recording with illumination or daylight, is set out below.
  • the converted image can either be used for display purposes only or as input for feature-based recognition algorithms.
  • the calculation in such a system is based, for example, on a neural network which, upstream of a detection or display unit, converts a very dark input image with little contrast and color information into, for example, a daylight representation.
  • the neural network was trained with a data set consisting of “dark input images” and the associated “daylight images”.
  • This training can be carried out individually for each vehicle camera, so that a conversion takes place individually for each individual vehicle camera.
  • Individual training for each vehicle camera offers the advantage of redundancy in the overall system.
  • the neural network can reproduce processes such as white balancing, gamma correction and histogram equalization in a near-ideal way and can use additional information stored in the network structure to automatically add missing color or contrast information. In this way, very dark images can be converted into a representation which is advantageous for feature-based recognition and viewing.
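  • For comparison, a sketch of two of the conventional operations mentioned above, gamma correction and histogram equalization, which such a network can learn to reproduce (the concrete parameter values are illustrative):

```python
import numpy as np

def gamma_correct(img: np.ndarray, gamma: float = 0.5) -> np.ndarray:
    """Classic gamma correction on an image scaled to [0, 1]; gamma < 1 brightens."""
    return np.clip(img, 0.0, 1.0) ** gamma

def histogram_equalize(channel: np.ndarray) -> np.ndarray:
    """Histogram equalization of a single 8-bit channel via its CDF."""
    hist, _ = np.histogram(channel.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1.0) * 255.0
    return cdf[channel].astype(np.uint8)
```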
  • ground truth data enable the network to learn to convert the different images from the individual vehicle cameras into images with optimized brightness and/or color distribution. For example, it can be considered optimal that the brightness or color distribution in the overlapping fields of view of adjacent vehicle cameras is the same or almost the same.
  • with ground truth data for all vehicle cameras which all correspond to the optimized image of the front camera in terms of brightness or color distribution, the converted images from all vehicle cameras are later, when the trained neural network is used, always ideally suited for creating a composite image for the display unit by means of stitching.
  • the training can provide ground truth data which are the result of an image simulation or an image synthesis.
  • the target output images from the side vehicle cameras can be simulated or synthesized taking real front and rear images into account.
  • this method can be integrated in a hardware-based image preprocessing stage, the ISP (Image Signal Processor).
  • this ISP is supplemented by a small, trained neural network, which carries out the corresponding conversion and provides the processed information with the original data for possible detection or display methods.
  • This method can be used in a further application in such a way that, as with a surround view system, it calculates images with different lighting profiles to form an overall image with balanced lighting.
  • An example is the display of the vehicle surroundings on a display on an unlit country road, where the areas of the front and rear vehicle cameras are illuminated by headlights, but the lateral areas are not illuminated by headlights.
  • the system with the neural network can be trained to use information from the better-lit areas in order to further improve the conversion for the unlit areas.
  • the network is then trained not so much with individual images for each vehicle camera, but rather as an overall system consisting of several camera systems.
  • the computing efficiency is optimized. No separate neural network is required for the reconstruction of the night images with improved visibility; instead, one of the operational networks in the vehicle is used for this purpose, e.g. a network for semantic segmentation, lane recognition, object recognition or a multi-task network.
  • one or more further output layers are added to this network, which are responsible for the reconstruction of the vision-enhanced night images.
  • the training data for the night vision improvement are used for the calculation of these output layers.
  • the output layers of the functions and the vision enhancements contain separate neurons for the function or the visual enhancement.
  • the training of the network includes data for the function and the visibility improvement.
  • This setup makes it possible for the feature representation of the commonly used layers to contain information about the visibility improvement and for this information to be made available to the functions. At runtime, there is thus the possibility of using the network only for the calculation of the functions, which for this purpose work on vision-enhanced feature representations. This is a computation-time-efficient implementation that is particularly suitable for operation on embedded systems.
  • the additional computational effort at runtime in this embodiment is either only the calculation of the output layer(s), if the vision-enhanced night images are made available to other functions in the vehicle, or there is no additional computing effort if the vision enhancement is integrated into the algorithms of the functions and only the output of these functions, for example lane recognition, object recognition, semantic segmentation and/or depth estimation, is used further.
  • the visibility improvement can be extended with regard to overexposure.
  • a common network for improving visibility can be learned, which enhances both the quality of overexposed and underexposed images.
  • a fusion of these two applications in a network enables a computationally efficient implementation in the vehicle.
  • the computational efficiency can be increased if this network fusion is also expanded to include functions such as object recognition, lane recognition, depth estimation, semantic segmentation.
  • the network calculates an image with improved visibility, which corresponds to a picture with a longer exposure time.
  • the exposure time is limited, especially for night photos, so that the quality of the recording is not impaired by the driving speed or in curves (including motion blur).
  • images with longer exposure times are calculated without being affected by e.g. the driving speed.
  • the estimate of the ratio (or factor d) can be obtained from the current image, the previous image or a sequence of images.
  • An example of this is the change from an illuminated city center to the country road.
  • a control loop can be set up in which the properties of the luminance histogram are tracked. If there is a deviation from the mean expected distribution, the ratio can be increased or decreased.
  • This adjustment of the ratio is relevant during runtime in the vehicle.
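  • A minimal sketch of one step of such a control loop (the expected mean and the gain are illustrative values, not taken from the patent):

```python
def adjust_ratio(d: float, mean_luminance: float,
                 expected_mean: float = 128.0,
                 gain: float = 0.01) -> float:
    """Adjust factor d when the luminance histogram drifts from its target.

    If the mean luminance of the converted image falls below the expected
    distribution mean, d is increased; if it lies above, d is decreased.
    """
    error = expected_mean - mean_luminance
    return max(d * (1.0 + gain * error / expected_mean), 0.1)
```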
  • the relationships between the exposure times are known, and the mean expected distribution for improved visibility can be calculated from the ground truth images of the training data for various scenes.
  • the scene types can be obtained from the functions during runtime in the vehicle, e.g. semantic segmentation.
  • the ratio to the improvement in visibility during runtime a) can be a constant, b) depending on the luminance histogram, c) depending on the street scene, or d) depending on the luminance histogram and the street scene.
  • deviations from a fixed visibility improvement factor d can occur in dynamic scenes such as road traffic scenes.
  • with this factor, it is possible to improve the visibility for a large number of different traffic scenes during runtime in the vehicle.
  • the main advantages are: - Very efficient method for improving the image quality when it is insufficient

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a machine learning method, a method and a device for converting input image data (Ini) from a plurality of vehicle cameras (2-i) of a surround view system into optimized output image data (Opti). The method for converting input image data from a plurality of vehicle cameras (2-i) of a surround view system into optimized output image data comprises the following steps: a) input image data (Ini) of a current brightness or color distribution captured by the vehicle cameras (2-i) are provided to a trained artificial neural network (CNN1, CNN10, CNN11, CNN12), b) the trained artificial neural network (CNN1, CNN10, CNN11, CNN12) is configured to convert the input image data (Ini) with the current brightness or color distribution into optimized output image data (Opti) with a different output brightness or color distribution, and c) the trained artificial neural network (CNN1, CNN10, CNN11, CNN12) is configured to output the output image data (Opti).
EP20841885.5A 2019-12-19 2020-12-09 Conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data Pending EP4078941A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102019220171.7A DE102019220171A1 (de) 2019-12-19 2019-12-19 Umwandlung von Eingangs-Bilddaten einer Mehrzahl von Fahrzeugkameras eines Rundumsichtsystems in optimierte Ausgangs-Bilddaten
PCT/DE2020/200112 WO2021121491A2 (fr) 2019-12-19 2020-12-09 Conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data

Publications (1)

Publication Number Publication Date
EP4078941A2 true EP4078941A2 (fr) 2022-10-26

Family

ID=74184333

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20841885.5A Pending EP4078941A2 (fr) 2019-12-19 2020-12-09 Conversion de données image d'entrée d'une pluralité de caméras de véhicule d'un système à visibilité périphérique en données image de sortie optimisées

Country Status (4)

Country Link
US (1) US20230342894A1 (fr)
EP (1) EP4078941A2 (fr)
DE (2) DE102019220171A1 (fr)
WO (1) WO2021121491A2 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4049235A4 (fr) * 2020-01-23 2023-01-11 Samsung Electronics Co., Ltd. Dispositif électronique et procédé de commande de dispositif électronique
DE102021119951A1 (de) 2021-08-02 2023-02-02 Dr. Ing. H.C. F. Porsche Aktiengesellschaft Verfahren, System und Computerprogrammprodukt zur Erkennung der Umgebung eines Kraftfahrzeugs
DE102021208736A1 (de) 2021-08-11 2023-02-16 Zf Friedrichshafen Ag Computerimplementiertes Verfahren zur Korrektur von Belichtungsfehlern und Hardwaremodul für automatisiertes Fahren

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100054542A1 (en) * 2008-09-03 2010-03-04 Texas Instruments Incorporated Processing video frames with the same content but with luminance variations across frames
DE102014210323A1 (de) 2014-06-02 2015-12-03 Conti Temic Microelectronic Gmbh Vorrichtung und Verfahren zur adaptiven Bildkorrektur von mindestens einem Bildparameter
JP6468356B2 (ja) * 2015-06-25 2019-02-13 富士通株式会社 プログラム生成装置、プログラム生成方法および生成プログラム
TWI555379B (zh) * 2015-11-06 2016-10-21 輿圖行動股份有限公司 一種全景魚眼相機影像校正、合成與景深重建方法與其系統
US20170297488A1 (en) * 2016-04-19 2017-10-19 GM Global Technology Operations LLC Surround view camera system for object detection and tracking
KR101832967B1 (ko) * 2016-05-16 2018-02-28 엘지전자 주식회사 차량에 구비된 제어장치 및 이의 제어방법
US10430913B2 (en) * 2017-06-30 2019-10-01 Intel Corporation Approximating image processing functions using convolutional neural networks
US10713816B2 (en) * 2017-07-14 2020-07-14 Microsoft Technology Licensing, Llc Fully convolutional color constancy with confidence weighted pooling
US20190095877A1 (en) * 2017-09-26 2019-03-28 Panton, Inc. Image recognition system for rental vehicle damage detection and management
US10475169B2 (en) * 2017-11-28 2019-11-12 Adobe Inc. High dynamic range illumination estimation
US11042156B2 (en) * 2018-05-14 2021-06-22 Honda Motor Co., Ltd. System and method for learning and executing naturalistic driving behavior
CN112703509A (zh) * 2018-08-07 2021-04-23 布赖凯科技股份有限公司 用于图像增强的人工智能技术
US20200294194A1 (en) * 2019-03-11 2020-09-17 Nvidia Corporation View synthesis using neural networks
DE102019205962A1 (de) * 2019-04-25 2020-10-29 Robert Bosch Gmbh Verfahren zur Generierung von digitalen Bildpaaren als Trainingsdaten für Neuronale Netze

Also Published As

Publication number Publication date
DE112020006216A5 (de) 2022-12-22
DE102019220171A1 (de) 2021-06-24
WO2021121491A3 (fr) 2021-08-12
US20230342894A1 (en) 2023-10-26
WO2021121491A2 (fr) 2021-06-24

Similar Documents

Publication Publication Date Title
EP4078941A2 (fr) Conversion of input image data from a plurality of vehicle cameras of a surround view system into optimized output image data
DE102018130821A1 (de) Verfahren zum Beurteilen einer Umgebung eines Kraftfahrzeugs durch ein künstliches neuronales Netz mit einer Aggregationseinheit; Steuereinheit, Fahrerassistenzsystem sowie Computerprogrammprodukt
DE102007034657B4 (de) Bildverarbeitungsvorrichtung
DE102009036844B4 (de) Belichtungsbestimmungsvorrichtung und Bildverarbeitungsvorrichtung
DE102010030044A1 (de) Wiederherstellvorrichtung für durch Wettereinflüsse verschlechterte Bilder und Fahrerunterstützungssystem hiermit
WO2022128014A1 (fr) Correction d'images d'un système de caméra panoramique en conditions de pluie, sous une lumière incidente et en cas de salissure
DE112018007715B4 (de) Fahrzeug-bordanzeigesteuervorrichtung
DE102019220168A1 (de) Helligkeits-Umwandlung von Bildern einer Kamera
DE102016216795A1 (de) Verfahren zur Ermittlung von Ergebnisbilddaten
DE102016121755A1 (de) Verfahren zum Bestimmen eines zusammengesetzten Bilds eines Umgebungsbereichs eines Kraftfahrzeugs mit Anpassung von Helligkeit und/oder Farbe, Kamerasystem sowie Krafzfahrzeug
DE112012004352T5 (de) Stereobildgebungsvorrichtung
EP3809691A1 (fr) Procédé de reconnaissance et de correction d'erreurs d'image ainsi qu'unité de traitement des images et programme informatique associé
DE102012214637A1 (de) Verfahren und Vorrichtung zur Einstellung einer Leuchtcharakteristik eines Scheinwerfers und Scheinwerfersystem
DE102010014733B4 (de) Chromakeyverfahren und Chromakeyvorrichtung zur Aufnahme und Bildbearbeitung von Kamerabildern
DE102014209863A1 (de) Verfahren und Vorrichtung zum Betreiben einer Stereokamera für ein Fahrzeug sowie Stereokamera für ein Fahrzeug
EP2539851A1 (fr) Procédé et dispositif pour analyser une image prise par un dispositif de prise de vue destiné à un véhicule
DE102014114328A1 (de) Kraftfahrzeug-Kameravorrichtung mit Histogramm-Spreizung
DE102018201909A1 (de) Verfahren und Vorrichtung zur Objekterkennung
DE102013020952A1 (de) Verfahren zum Einstellen eines für die Helligkeit und/oder für den Weißabgleich einer Bilddarstellung relevanten Parameters in einem Kamerasystem eines Kraftfahrzeugs, Kamerasystem und Kraftfahrzeug
DE102013011844A1 (de) Verfahren zum Anpassen einer Gammakurve eines Kamerasystems eines Kraftfahrzeugs, Kamerasystem und Kraftfahrzeug
DE102017100529A1 (de) Verfahren zum Betreiben einer Kamera abhängig von einem aktuellen Zustand eines Umgebungsbereichs der Kamera, Kamera und Kraftfahrzeug
DE10318499B4 (de) Verfahren und Vorrichtung zur Einstellung eines Bildsensors
EP1797710A1 (fr) Procede de representation d'une image prise par une camera video
WO2022083833A1 (fr) Système pour éviter les accidents provoqués par des animaux sauvages qui traversent au crépuscule et la nuit
DE102020213267A1 (de) Helligkeits-Umwandlung von Stereobildern

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20220719

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20240826