WO2022171590A1 - Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system - Google Patents
Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system
- Publication number
- WO2022171590A1 (PCT/EP2022/052939)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- module
- assistance system
- pixels
- computing device
- electronic computing
- Prior art date
Classifications
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06F18/23213—Non-hierarchical clustering techniques using statistics or function optimisation with fixed number of clusters, e.g. K-means clustering
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
Definitions
- An aspect of the invention relates to a method for determining a degradation degree of an image captured by a camera of an assistance system of a motor vehicle by the assistance system. The image is captured by the camera, and a deep feature extraction of a plurality of pixels of the image is performed by an encoding module of an electronic computing device of the assistance system. The plurality of pixels is clustered by a feature point cluster module of the electronic computing device. The clustered pixels are regressed by a regression module of the electronic computing device. The degradation degree is then determined depending on an evaluation by applying a sigmoid function after the regression by a sigmoid function module of the electronic computing device, as an output of the sigmoid function module.
- the degradation degree can for example be between 0 and 1, wherein 0 can indicate that a degradation is not present, thus the lens is for example clear and clean, and 1 can signify that the lens is contaminated.
- This is in particular independent of a classification of the contamination.
- the degradation is only determined in general terms; the image, together with its degradation degree, can then in turn be further used, for example to decide whether an evaluation of this image can serve a driving function.
- the invention in particular solves the problem that independently of a classification and thus also independently of a plurality of datasets for a contamination, a corresponding degradation degree can nevertheless be determined.
- the assistance system can be simply trained, requiring in particular only a small number of datasets.
- the degradation is in particular estimated between 0 and 1 by the sigmoid function, wherein 0 means clean and 1 opaque.
- the two above mentioned cases are in particular extremes, for which corresponding annotations are known.
- the degradation is determined continuously in the range between 0 and 1 due to the presence of the regression module.
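As a minimal numpy sketch of the final step described above, a scalar regression output is mapped by the sigmoid function to a degradation degree in the open interval (0, 1); the function names and example values are illustrative, not taken from the patent:

```python
import numpy as np

def sigmoid(x):
    # Logistic function: maps any real value into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def degradation_degree(regression_output):
    # A strongly negative regression output yields a degree near 0 (clean
    # lens); a strongly positive output yields a degree near 1 (opaque lens).
    return sigmoid(regression_output)

# Illustrative values only; real outputs depend on the trained network.
print(degradation_degree(-6.0))  # close to 0: clean
print(degradation_degree(0.0))   # 0.5: undecided
print(degradation_degree(6.0))   # close to 1: opaque
```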
- the encoding module is provided as a convolutional neural network.
- the encoding module can also be referred to as encoder.
- a simple convolutional neural network (CNN) is used to extract deep features from the corresponding input images.
- a neural network with a higher capacity can also be used.
- a feature extraction can in particular be reliably performed.
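As an illustration of the basic operation such a CNN encoder repeats over many learned kernels, the following hedged numpy sketch implements a single "valid" convolution (strictly, a cross-correlation, as is common in CNNs); the toy image and kernel are invented for the example:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # 'Valid' 2D cross-correlation: slide the kernel over the image and
    # sum the elementwise products at each position.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# A horizontal edge kernel applied to a toy 4x4 "image".
image = np.array([[0., 0., 0., 0.],
                  [0., 0., 0., 0.],
                  [1., 1., 1., 1.],
                  [1., 1., 1., 1.]])
edge_kernel = np.array([[-1., -1., -1.],
                        [ 1.,  1.,  1.]])
features = conv2d_valid(image, edge_kernel)
print(features)  # responds strongly on the row where the edge lies
```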
- the plurality of pixels is clustered by a K-means algorithm of the feature point cluster module.
- the K-means clustering method with a number of n clusters, wherein n can for example be 5, is in particular applied. It is advantageous to first separate the deep feature points into n clusters; each of these clusters is later passed through the regression module.
- the K-means algorithm is in particular a method for vector quantization, which is in particular also used for cluster analysis.
- a previously known number of k groups is formed from a set of similar objects. It is further advantageous if the clustered pixels are regressed by a regression module formed as a long short-term memory module.
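The clustering step above can be sketched with a minimal Lloyd's-algorithm K-means over toy "deep feature points"; everything below (dimensions, initialisation indices, data) is an assumption made for the example, not the patent's configuration:

```python
import numpy as np

def kmeans(points, n_clusters, init_idx, n_iters=20):
    # Minimal K-means (Lloyd's algorithm): assign every feature point to its
    # nearest centre, then move each centre to the mean of its points.
    centres = points[init_idx].copy()
    for _ in range(n_iters):
        # pairwise distances, shape (num_points, n_clusters)
        d = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centres[k] = points[labels == k].mean(axis=0)
    return labels, centres

# Toy feature points scattered tightly around 5 well-separated centres,
# mirroring the n = 5 clusters mentioned above.
rng = np.random.default_rng(1)
truth = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], dtype=float)
points = np.concatenate([c + 0.1 * rng.standard_normal((20, 2)) for c in truth])
labels, centres = kmeans(points, n_clusters=5, init_idx=[0, 20, 40, 60, 80])
print(centres.round(1))  # close to the 5 true centres
```

Each recovered cluster would then be passed through the regression module, as described above.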
- the regression module is in particular a so-called long short-term memory (LSTM) module.
- the presented problem is in particular not a classification problem, but a regression problem.
- the output of the in particular K-means clustering is nothing other than a series of deep feature points separated into clusters, wherefore the LSTM module is in particular advantageous for the regression.
- the core concept of such an LSTM module is the cell state and its different gates. Cell states act like a memory of the network and can carry relevant information during the processing of the sequence.
- the gates are different neural networks, which decide which information is allowed onto the cell state. During the training, the gates can learn which information is relevant, in order to retain or to forget it. Each gate contains sigmoidal activations. This is helpful to update or to forget data, since each number multiplied by 0 is 0, whereby values disappear or are "forgotten". Each number multiplied by 1 keeps the same value, therefore this value remains the same or is "retained". In such an LSTM module, the so-called forget gate is in particular formed first. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function. Therein, values between 0 and 1 result. The closer to 0, the more is forgotten, and the closer to 1, the more is retained.
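The forget-gate computation described above can be written out as a short numpy sketch; the weight shapes and the standard gate formula f = sigmoid(W x + U h + b) are the textbook LSTM convention, assumed here rather than taken from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forget_gate(h_prev, x_t, W_f, U_f, b_f):
    # Combine the previous hidden state h_prev with the current input x_t and
    # squash through a sigmoid: one value in (0, 1) per cell-state entry,
    # where near 0 means "forget" and near 1 means "retain".
    return sigmoid(W_f @ x_t + U_f @ h_prev + b_f)

# Toy dimensions and random weights (illustrative only).
rng = np.random.default_rng(0)
hidden, inputs = 4, 3
W_f = rng.standard_normal((hidden, inputs))
U_f = rng.standard_normal((hidden, hidden))
b_f = np.zeros(hidden)
f_t = forget_gate(rng.standard_normal(hidden), rng.standard_normal(inputs),
                  W_f, U_f, b_f)
print(f_t.shape)  # (4,), every entry strictly between 0 and 1
```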
- the clustered pixels are unidirectionally transferred from the feature point cluster module to the regression module.
- the connection from the encoder to the LSTM module via the K-means algorithm is unidirectional and is only used for the forward pass, since unsupervised K-means clustering is used.
- the regressed pixels are backpropagated to the encoding module.
- the backpropagation is effected via a separate connection from the LSTM module to the encoder. Using supervised K-means and making the connection bidirectional is advised against, since the K-means stage has only few trainable parameters, which are not sufficient for the backpropagation; the gradient flow is instead effected through the great number of trainable parameters of the LSTM module.
- a sigmoid loss is trained in a first training phase for the assistance system for applying the sigmoid function.
- sigmoid is in particular an advantageous function to obtain the values in the range between 0 and 1.
- the reason for this is that a continuous annotation is not present, but only annotations for 0 and 1.
- the sigmoidal loss function is in particular first trained, which estimates the quality of the perception deterioration, thus the degradation, from an input image.
- this flow cannot output the quality of the degradation per pixel; therefore, a functionality is added, which generates the output in the form of the input image.
- the self-attention module is in particular a so-called attention map.
- a so-called "self-attention" can in particular be performed by the attention map. This is in particular an advantageous step to discriminate the output of the LSTM.
- the self-attention module is provided in the form of a global averaging.
- the global averaging is in particular a so-called global average pooling (GAP).
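Global average pooling reduces each feature map to its spatial mean, one scalar per channel. A minimal sketch with an invented toy tensor:

```python
import numpy as np

def global_average_pooling(feature_maps):
    # feature_maps: (channels, height, width) -> (channels,)
    # Each channel collapses to the mean of its spatial values.
    return feature_maps.mean(axis=(1, 2))

# Two 3x3 toy feature maps: values 0..8 and 9..17.
fmaps = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
gap = global_average_pooling(fmaps)
print(gap)  # [ 4. 13.]
```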
- the output of the sigmoid function module is transferred to a decoding device of the electronic computing device for decoding the output.
- a decoder is in particular added, which takes over the output of the sigmoid function module in one-dimensional format.
- this one-dimensional vector is reshaped into a two-dimensional representation and then in turn supplied to the decoder for reconstruction.
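The 1D-to-2D conversion preceding the decoder is a plain reshape; the sizes below (a 64-element vector becoming an 8x8 map) are illustrative assumptions, as the patent does not fix concrete dimensions:

```python
import numpy as np

# The sigmoid-module output arrives as a one-dimensional vector; before the
# decoder can reconstruct an image-shaped output, it is laid out as a grid.
one_d = np.linspace(0.0, 1.0, 64)   # toy per-element degradation estimates
two_d = one_d.reshape(8, 8)         # 2D layout handed to the decoder
print(two_d.shape)  # (8, 8)
```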
- This reconstruction can then in turn be transferred to a superordinate assistance system, which can then decide whether or not the image of the camera can be used for evaluation based on the reconstruction. For example, if the image cannot be used for evaluation, an alarm signal can be generated for a user of the motor vehicle.
- the credibility of the image is for example downgraded for an at least partially automated operation of the motor vehicle if a corresponding value of the degradation is present.
- a safe operation of the motor vehicle can be realized.
- the decoding device is provided in the form of a fully convolutional neural network.
- the fully convolutional neural network is in particular a so-called fully convolutional network (FCN).
- the decoding device is trained in a second training phase for the assistance system, wherein exclusively the decoding device is trained in the second training phase, which temporally follows the first training phase.
- the pre-trained model from the first phase is in particular used.
- an estimation of the degradation degree is effected in the first phase, and a reconstruction of the output is effected in the second phase. Therefore, only the decoder is trained in the second phase; the remaining components of the assistance system or of the so-called "pipeline" remain untrainable. Otherwise, the gradient of the decoder would destroy most of the trained weights, including those of the encoder, in the first epochs. Therefore, the components used in the first training phase are only used as feature extractors in the second phase.
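The freezing described above can be sketched abstractly: in phase 2 a gradient step touches only parameters marked trainable. The scalar "modules" and gradients below are purely illustrative stand-ins for the real network weights:

```python
# Two-phase training sketch: in phase 2 only the decoder is updated; the
# phase-1 components (represented here by a single 'encoder' scalar) are
# frozen and act merely as feature extractors.
params = {"encoder": 0.5, "decoder": 0.1}
trainable_phase2 = {"encoder": False, "decoder": True}

def sgd_step(params, grads, trainable, lr=0.1):
    # Apply a gradient-descent step only to parameters marked trainable.
    return {k: (v - lr * grads[k] if trainable[k] else v)
            for k, v in params.items()}

grads = {"encoder": 1.0, "decoder": 1.0}   # illustrative gradients
params = sgd_step(params, grads, trainable_phase2)
print(params)  # encoder unchanged, decoder moved by lr * grad
```

In a deep-learning framework the same effect is typically achieved by disabling gradient tracking on the frozen modules before phase 2.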
- the presented method is in particular a computer-implemented method.
- the method is in particular performed on an electronic computing device, wherein the electronic computing device can in particular comprise circuits, for example integrated circuits, processors and further electronic components to perform the corresponding method steps.
- a further aspect of the invention relates to a computer program product with program code means, which are stored in a computer-readable storage medium, to perform the method for determining a degradation degree according to the preceding aspect, when the computer program product is executed on a processor of an electronic computing device.
- a still further aspect of the invention relates to a computer-readable storage medium with a computer program product, in particular an electronic computing device with a computer program product, according to the preceding aspect.
- a still further aspect of the invention relates to an assistance system for determining a degradation degree of an image captured by a camera of a motor vehicle, with at least one camera and with an electronic computing device, wherein the assistance system is formed for performing a method according to the preceding aspect.
- the method is performed by the assistance system.
- a still further aspect of the invention relates to a motor vehicle with an assistance system according to the preceding aspect.
- the motor vehicle is formed as an at least partially autonomous, in particular as a fully autonomous, motor vehicle.
- Advantageous forms of configuration of the method are to be regarded as advantageous forms of configuration of the computer program product, of the computer-readable storage medium, of the assistance system as well as of the motor vehicle.
- the assistance system as well as the motor vehicle comprise concrete features, which allow performing the method or an advantageous form of configuration thereof.
- Fig. 1 a schematic top view of a motor vehicle with an embodiment of an assistance system
- Fig. 2 a schematic block diagram of an embodiment of an assistance system
- Fig. 3 a schematic view of an embodiment of a regression module of an embodiment of an electronic computing device of an embodiment of the assistance system.
- Fig. 1 shows a schematic top view of an embodiment of a motor vehicle 1 with an embodiment of an assistance system 2.
- the assistance system 2 is formed for determining a degradation degree 3 (Fig. 2) of an image 5 captured by a camera 4 of the motor vehicle 1.
- the assistance system 2 in particular comprises the camera 4 as well as an electronic computing device 6.
- an environment 7 of the motor vehicle 1 can be captured by the camera 4.
- the motor vehicle 1 is in particular an at least partially autonomous, in particular a fully autonomous, motor vehicle 1.
- the assistance system 2 can be formed only for determining the degradation degree 3.
- the assistance system 2 can also be formed for at least partially autonomous operation or for fully autonomous operation of the motor vehicle 1.
- the assistance system 2 can for example perform interventions in a steering and acceleration device of the motor vehicle 1. Based on the captured environment 7, the assistance system 2 can then in particular generate corresponding control signals for the steering and braking intervention, respectively.
- Fig. 2 shows a schematic block diagram of an embodiment of the assistance system 2, in particular of the electronic computing device 6 of the assistance system 2.
- the image 5 is captured by the camera 4.
- Performing a deep feature extraction of a plurality of pixels 8 of the image 5 by an encoding module 9 of the electronic computing device 6 is effected.
- the encoding module 9 can also be referred to as encoder.
- Clustering of the plurality of pixels 8 by a feature point cluster module of the electronic computing device 6 is effected.
- the clustered pixels 8 are regressed by a regression module 11 of the electronic computing device 6.
- determining the degradation degree 3 depending on an evaluation by applying a sigmoid function 20 (Fig. 3) after the regression by a sigmoid function module 12 of the electronic computing device 6 as the output of the sigmoid function module 12 is then in turn effected.
- Fig. 2 further shows that a sigmoid loss 13 is trained in a first training phase for the assistance system 2 for applying the sigmoid function 20.
- the assistance system 2 or the electronic computing device 6 comprises a self-attention module 14, by which the regressed pixels 8 are discriminated and the discriminated pixels 8 are transferred to the sigmoid function module 12.
- the self-attention module 14 is in particular provided in the form of a global averaging (global average pooling).
- Fig. 2 further shows that the encoding module 9 is provided as a convolutional neural network.
- the plurality of pixels 8 is clustered by a K-means algorithm 15 of the feature point cluster module 10.
- the clustered pixels 8 are in turn regressed by a regression module 11 formed as a long short term memory module 16.
- the clustered pixels 8 are unidirectionally transferred, which is represented by the arrows 17, from the feature point cluster module 10 to the regression module 11.
- the pixels 8 are in turn bidirectionally transferred from the regression module 11 to the self-attention module 14, from the self-attention module 14 to the sigmoid function module 12 and bidirectionally from the sigmoid function module 12 to the sigmoid loss 13. This is in particular represented by the arrows 18.
- the passage of the degradation degree 3 is also bidirectionally effected to a decoding device 19 of the assistance system 2.
- the output of the sigmoid function module 12 is in particular effected to the decoding device 19 for decoding the output.
- the decoding device 19 can in particular be provided in the form of a fully convolutional neural network.
- the decoding device 19 is in particular trained in a second training phase for the assistance system 2, wherein exclusively the decoding device 19 is trained in the second training phase, which temporally follows the first training phase.
- Fig. 3 shows a schematic block diagram of an embodiment of the regression module 11 according to Fig. 2.
- a long short-term memory module 16, which can also be referred to as LSTM module.
- the long short-term memory module 16 presently comprises at least three sigmoid functions 20 as well as at least two tanh functions 21. Further, the long short-term memory module 16 comprises at least two pointwise multiplications 22 as well as two pointwise additions 23.
- the results of the regression are in turn backpropagated to the encoding module 9, which is in particular represented by the arrows 24.
- such a long short-term memory module 16 is suitable since the output of the K-means algorithm 15 is a series of deep feature points, which are separated into clusters, such that it is a regression problem.
- the core concept of the long short-term memory module 16 is the cell state and its different gates. Cell states act like a memory of the network and can carry relevant information during the processing of the sequence.
- the gates are different neural networks, which decide which information is allowed onto the cell state. During the training, the gates can learn which information is relevant to retain or to forget.
- Each gate contains a sigmoidal activation. This is helpful to update or forget data, since each number multiplied by 0 is 0, whereby values disappear or are "forgotten".
- the forget gate is provided first. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function 20. Therein, values between 0 and 1 result. The closer to 0, the more is forgotten, and the closer to 1, the more is retained.
- the input gate follows, in which the previous hidden state and the current input are first passed to a sigmoid function 20, which decides which values are updated, in that it transforms the values such that they are between 0 and 1.
- the hidden state and the current input are also passed to the tanh function 21 to "squeeze" the values between -1 and 1 in order to regulate the network. Later, the outputs of both are multiplied.
- the sigmoid output decides which information from the tanh output is important to retain.
- the cell state is first pointwise multiplied by the forget vector. Then, the output of the input gate is used and a pointwise addition 23 is performed, which updates the cell state to new values, which the neural network regards as relevant.
- An output gate decides what is to be the next hidden state.
- the hidden state contains information about the previous inputs. This state is also used for predictions.
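The gate interplay of Fig. 3 (three sigmoid functions 20, two tanh functions 21, pointwise multiplications 22 and additions 23) corresponds to one standard LSTM cell step, which can be sketched as follows; the stacked-weight layout and dimensions are common conventions assumed for the example, not details from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    # One LSTM step. W, U, b hold the stacked parameters of the forget (f),
    # input (i), output (o) and candidate (g) transformations.
    z = W @ x_t + U @ h_prev + b           # stacked pre-activations
    H = h_prev.size
    f = sigmoid(z[0:H])                    # forget gate
    i = sigmoid(z[H:2 * H])                # input gate
    o = sigmoid(z[2 * H:3 * H])            # output gate
    g = np.tanh(z[3 * H:4 * H])            # candidate cell values
    c_t = f * c_prev + i * g               # pointwise multiply/add update
    h_t = o * np.tanh(c_t)                 # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
H, D = 4, 3                                # toy hidden/input sizes
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_cell_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
print(h.shape, c.shape)  # (4,) (4,)
```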
- the connection from the encoder to the LSTM module via K-means is unidirectional and is only used for the forward pass, since unsupervised K-means clustering is used.
- the backpropagation 24 is effected via a separate connection from the regression module 11 to the encoding module 9. Using supervised K-means and making the connection bidirectional is advised against, since the K-means stage only has few trainable parameters, which are not sufficient for the backpropagation 24; the gradient flow is instead effected through the great number of trainable parameters of the LSTM.
- temporal consistency represents an important aspect in algorithms based on image processing to keep the system performance constant.
- the solution proposed in the figures can be extended by a further encoding device such that both encoders receive successive image frames of the video sequence. This is advantageous in order for the assistance system 2 to output predictions without a flicker effect.
- the assistance system 2 is flexible such that it does not require a hard annotation, which is cumbersome and very cost-intensive.
- the assistance system 2 can in particular be used for at least partially autonomous driving of the motor vehicle 1 .
Abstract
The invention relates to a method for determining a degradation degree (3) of an image (5) captured by a camera (4) of an assistance system (2) of a motor vehicle (1) by the assistance system (2), comprising the steps of: - capturing the image (5) by the camera (4); - performing a deep feature extraction of a plurality of pixels (8) of the image (5) by an encoding module (9) of an electronic computing device (6) of the assistance system (2); - clustering the plurality of pixels (8) by a feature point cluster module (10) of the electronic computing device (6); - regressing the clustered pixels (8) by a regression module (11) of the electronic computing device (6); and - determining the degradation degree (3) depending on an evaluation by applying a sigmoid function (20) after the regression by a sigmoid function module (12) of the electronic computing device (6) as an output of the sigmoid function module (12). Further, the invention relates to a computer program product, to a computer-readable storage medium as well as to an assistance system (2).
Description
Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system
The invention relates to a method for determining a degradation degree of an image captured by a camera of an assistance system of a motor vehicle by the assistance system. Further, the invention relates to a computer program product, to a computer- readable storage medium as well as to an assistance system.
From motor vehicle construction, cameras are in particular known, which can capture an environmental image of an environment of the motor vehicle. In particular, so-called surround view cameras are known, which can in particular be impaired by harsh environmental conditions like snow, rain, mud and the like. In order to capture such an impairment, it is already known that special contamination classes like soil and water drops can be determined and recognized depending on the contamination. Herein, great challenges in the annotation of only these few classes arise. In particular, the transition areas are to be subjectively annotated and it is difficult to obtain realistic contamination captures with corresponding variety. In particular, lens contaminations based on mud, sand, water drops and frost as well as further adverse weather conditions like snow, rain or fog are not yet possible to determine; herein they in particular generate a deterioration caused by illumination like low light, blinding, shadow or motion blur, which is also to be taken into account in the future. However, it is herein impractical to create further datasets for each of these individual classes and with all combinations of degradation and to annotate them.
US 2018/315167 A1 discloses an object recognition method, in which an object such as for example a vehicle can be recognized from an image captured by an on-board camera even if a lens of the on-board camera is contaminated. In order to achieve the aim, in this object recognition method for recognizing an object contained in a captured image, an original image containing the object to be recognized is created from the captured image, a processed image is generated from the created original image by applying predetermined processing to the original image, and a learning process with respect to the restoration of an image of the object to be recognized is performed using the original image and the processed image.
EP 3657379 A1 discloses an image processing device with a neural network, which obtains a first image and a respective additional image, wherein a first image capturing device has another field of view than each additional image capturing device. Each image is processed by corresponding instances of a common feature extraction processing network to generate a corresponding first map and at least one additional map. The feature classification includes processing the first map to generate a first classification, which indicates a contamination of the first image. The or each additional map is processed to generate at least one additional classification, which indicates a corresponding contamination of the or each additional image. The first and each additional classification are combined to generate an enhanced classification, which indicates a contamination of the first image. If the enhanced classification does not indicate a contamination of the first image capturing device, further processing can be performed.
It is the object of the present invention to provide a method, a computer program product, a computer-readable storage medium as well as an assistance system, by which an improved determination of a degradation degree of a camera can be performed.
This object is solved by a method, a computer program product, a computer-readable storage medium as well as an assistance system according to the independent claims. Advantageous forms of configuration are specified in the dependent claims.
An aspect of the invention relates to a method for determining a degradation degree of an image captured by a camera of an assistance system of a motor vehicle by the assistance system. Capturing the image by the camera and performing a deep feature extraction of a plurality of pixels of the image by an encoding module of an electronic computing device of the assistance system are effected. The plurality of pixels is clustered by a feature point cluster module of the electronic computing device. The clustered pixels are regressed by a regression module of the electronic computing device. Determining the degradation degree depending on an evaluation by applying a sigmoid function after the regression by a sigmoid function module of the electronic computing device as an output of the sigmoid function module is effected.
Thus, an improved determination of the degradation degree can in particular be performed. In particular, the degradation degree can for example be between 0 and 1, wherein 0 can indicate that a degradation is not present, thus the lens is for example clear and clean, respectively, and 1 can signify that the lens is contaminated. Thus, the closer the degradation degree is to 0, the cleaner the lens is. Thus, this is in particular independent of a classification of the contamination. The degradation is only generally determined, and the image with the corresponding degradation degree can then in turn be further used, for example for an evaluation of this image for a drive function.
Thus, the invention in particular solves the problem that independently of a classification and thus also independently of a plurality of datasets for a contamination, a corresponding degradation degree can nevertheless be determined. Thus, the assistance system can be simply trained, wherein a low number of datasets is in particular required hereto.
Thus, the degree of degradation is in particular estimated between 0 and 1 by the sigmoid function, wherein 0 means clean and 1 opaque. The two above-mentioned cases are in particular extremes, for which corresponding annotations are known. The quality of the degradation is determined in the range between 0 and 1 due to the presence of the regression module.
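The mapping of an unbounded regression output onto a degradation degree between 0 and 1 can be illustrated by a short sketch (illustrative Python using NumPy; the function name and the example scores are assumptions for illustration and are not taken from the application itself):

```python
import numpy as np

def sigmoid(x):
    # Squash any real-valued regression output into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative regression outputs: strongly negative -> near 0 (clean lens),
# strongly positive -> near 1 (opaque lens).
scores = np.array([-4.0, 0.0, 4.0])
degradation = sigmoid(scores)
print(degradation.round(3))  # approximately [0.018, 0.5, 0.982]
```

Only the two extremes 0 (clean) and 1 (opaque) carry annotations; intermediate values arise purely from the regression output.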
According to an advantageous form of configuration, the encoding module is provided as a convolutional neural network. The encoding module can also be referred to as encoder. In particular, a simple convolutional neural network (CNN) is used to extract depth features from the corresponding input images. In particular, if sufficient storage and computational resources should for example be present, a neural network with a higher capacity can also be used. Thus, a feature extraction can in particular be reliably performed.
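As a minimal illustration of one such encoder layer, the following sketch applies a single 3×3 convolution with a ReLU non-linearity (illustrative Python/NumPy only; a real encoder would stack many learned filters, and the edge kernel here is an assumption for illustration):

```python
import numpy as np

def conv2d(image, kernel):
    # Valid-mode 2D convolution (no padding) followed by a ReLU,
    # as in one layer of a minimal CNN encoder.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]], dtype=float)  # horizontal-gradient filter
features = conv2d(image, edge_kernel)
print(features.shape)  # (3, 3)
```

Stacking such layers (with learned kernels) yields the depth feature points that are clustered in the next step.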
It has further proven advantageous if the plurality of pixels is clustered by a K-means algorithm of the feature point cluster module. In particular, the annotations are each only available for two extreme cases, namely clean and opaque; however, the input image will most likely have a varying deterioration in the range between 0 and 1 during the inference time. In order to learn this variation, the K-means clustering method with a number of for example n clusters, wherein n can for example be 5, is in particular applied. It is advantageous to first differentiate the depth feature points into n clusters, and later each of these clusters is passed through a regression module. The K-means algorithm is in particular a method for vector quantization, which is in particular also used for cluster analysis. Therein, a previously known number of k groups is formed from a set of similar objects.
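The K-means step can be sketched as follows (illustrative Python/NumPy implementation of Lloyd's algorithm with a deterministic initialization; the toy 2-D "feature points" and the cluster count are assumptions for illustration):

```python
import numpy as np

def kmeans(points, n_clusters, n_iter=10):
    # Plain Lloyd's algorithm: assign each feature point to its nearest
    # centroid, then recompute the centroids, repeated for n_iter rounds.
    idx = np.linspace(0, len(points) - 1, n_clusters).astype(int)
    centroids = points[idx].astype(float)  # deterministic initialization
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = points[labels == k].mean(axis=0)
    return labels, centroids

# Toy "depth feature points": two tight groups in 2-D, clustered with n = 2.
points = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
labels, centroids = kmeans(points, n_clusters=2)
print(labels[0] != labels[10])  # True: the two groups land in different clusters
```

In the method itself, the clustering is unsupervised and each resulting cluster is passed on to the regression module.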
It is further advantageous if the clustered pixels are regressed by a regression module formed as a long short term memory module. The long short term memory module is in particular a so-called LSTM module. The presented problem is in particular not a classification problem, but a regression problem. The output from the in particular K-means clustering is nothing else than a series of depth feature points, which are separated in clusters, wherefore the LSTM module is in particular advantageous for regressing. The core concept of such an LSTM module is the cell state and its different gates. Cell states act like a memory of the network and can carry relevant information during the processing of the sequence. The gates are different neural networks, which decide which information is allowed onto the cell state. During the training, the gates can learn which information is relevant, to retain or to forget it. Each gateway or gate contains sigmoidal activations. This is helpful to update or to forget data, since each number, which is multiplied by 0, is 0, whereby values disappear or are "forgotten". Each number, which is multiplied by 1, is the same value, therefore this value remains the same or is "retained". In such an LSTM module, the so-called forget gate is in particular first formed. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function. Therein, values between 0 and 1 result. The closer to 0, the more is forgotten, and the closer to 1, the more is retained.
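A single time step of such an LSTM module, with its forget, input and output gates acting on the cell state, can be sketched as follows (illustrative Python/NumPy; the weight shapes and random inputs are assumptions for illustration, not the trained parameters of the assistance system):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # One LSTM time step. W maps [h_prev; x] onto the stacked pre-activations
    # of the forget, input, candidate and output gates; b is the stacked bias.
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[:n])            # forget gate: 0 = discard, 1 = retain
    i = sigmoid(z[n:2 * n])       # input gate: which values to update
    g = np.tanh(z[2 * n:3 * n])   # candidate values squeezed to (-1, 1)
    o = sigmoid(z[3 * n:])        # output gate: forms the next hidden state
    c = f * c_prev + i * g        # pointwise multiplications and addition
    h = o * np.tanh(c)
    return h, c

hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * hidden, hidden + inputs)) * 0.1
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.standard_normal((5, inputs)):  # a short sequence of feature points
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # (4,)
```

The pointwise multiplications by the sigmoid outputs implement exactly the forgetting (multiply by 0) and retaining (multiply by 1) described above.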
In a further advantageous form of configuration, the clustered pixels are unidirectionally transferred from the feature point cluster module to the regression module. In particular the connection from the encoder to the LSTM module via the K-means algorithm is unidirectional and is only used for the forward passage since unsupervised K-means clustering is used.
Further, it has proven advantageous if the regressed pixels are backpropagated to the encoding device. In particular, in order to generate an unconnected graph, the backpropagation is effected via a separate connection from the LSTM module to the encoder. It is also advised against using supervised K-means and making the connection bidirectional, since the K-means stage has only few trainable parameters, which are not sufficient in the backpropagation, since the gradient flow is effected through a great number of trainable parameters of the LSTM module.
In a further advantageous form of configuration, a sigmoid loss is trained in a first training phase for the assistance system for applying the sigmoid function. Here, sigmoid is in particular an advantageous function in order that the values are obtained in the range between 0 and 1. The reason for this is that a continuous annotation is not present, but only annotations for 0 and 1. Thus, in the first phase, the sigmoid loss function is in particular first trained, which estimates the quality of the perception degradation from an input image. However, this flow cannot output the quality of the degradation per pixel; therefore, a functionality is added, which generates the output in the form of the input image.
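The sigmoid loss of the first training phase can be sketched as a binary cross-entropy on the sigmoid output, supervised only at the two annotated extremes (illustrative Python/NumPy; the function name and the example logits are assumptions for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_loss(logits, labels):
    # Binary cross-entropy on the sigmoid output; annotations exist only
    # for the two extremes, label 0 (clean) and label 1 (opaque).
    p = sigmoid(logits)
    eps = 1e-12  # numerical guard against log(0)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

logits = np.array([-3.0, 3.0])   # network outputs for a clean and an opaque sample
labels = np.array([0.0, 1.0])
print(sigmoid_loss(logits, labels))   # small: predictions match the extremes
print(sigmoid_loss(-logits, labels))  # large: predictions inverted
```

Intermediate degradation values emerge at inference time even though the loss only ever sees the two extremes.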
It has further proven advantageous if the regressed pixels are discriminated by a self-attention module of the electronic computing device and the discriminated pixels are transferred to the sigmoid function module. The self-attention module is in particular a so-called attention map. A so-called "self-attention" can in particular be performed by the attention map. This is in particular an advantageous step to discriminate the output of the LSTM.
It is also advantageous if the self-attention module is provided in the form of a global averaging. The global averaging is in particular a so-called global average pooling (GAP).
It is applied to the output signal of the LSTM and multiplied by a one-dimensional vector.
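The global average pooling over the LSTM output and the subsequent multiplication by the resulting one-dimensional vector can be sketched as follows (illustrative Python/NumPy; the tensor shapes are assumptions for illustration):

```python
import numpy as np

# LSTM output: one hidden-state vector per time step (T x H).
lstm_out = np.random.default_rng(1).standard_normal((6, 4))

# Global average pooling collapses the time axis into a single 1-D descriptor.
gap = lstm_out.mean(axis=0)   # shape (4,)

# The descriptor re-weights (discriminates) each time step's output pointwise.
attended = lstm_out * gap     # broadcast multiply by the one-dimensional vector
print(attended.shape)  # (6, 4)
```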
In a further advantageous form of configuration, the output of the sigmoid function module is transferred to a decoding device of the electronic computing device for decoding the output. Thus, a decoder is in particular added, which adopts the output of the sigmoid function module in the one-dimensional format. In particular, this one-dimensional vector is converted into a two-dimensional vector by conversion and then in turn supplied to the decoder for reconstruction. This reconstruction can then in turn be transferred to a superordinated assistance system, wherein the superordinated assistance system can then in turn decide based on the reconstruction whether or not the image of the camera can be used for evaluation. For example, if the image cannot be used for evaluation, an alarm signal can be generated for a user of the motor vehicle.
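The conversion of the one-dimensional output into a two-dimensional map for the decoder can be sketched as follows (illustrative Python/NumPy; nearest-neighbour upsampling stands in here for the decoder's learned reconstruction, and all shapes are assumptions for illustration):

```python
import numpy as np

# The sigmoid module's output arrives as a one-dimensional vector of
# per-cluster degradation values (toy ramp from clean to opaque).
flat = np.linspace(0.0, 1.0, 64)

# Conversion into a two-dimensional map, then upsampling toward the
# input resolution, yielding a per-pixel degradation estimate.
grid = flat.reshape(8, 8)
upsampled = grid.repeat(4, axis=0).repeat(4, axis=1)
print(upsampled.shape)  # (32, 32)
```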
Further, it can also be provided that the credibility of the image is for example downgraded for an at least partially automated operation of the motor vehicle if a corresponding value of the degradation should be present. Thus, a safe operation of the motor vehicle can be realized.
In a further advantageous form of configuration, the decoding device is provided in the form of a fully convolutional neural network, in particular a so-called fully convolutional network (FCN). In particular, it has turned out that this is very advantageous to be able to further process the output of the sigmoid function module.
In a further advantageous form of configuration, the decoding device is trained in a second training phase for the assistance system, wherein exclusively the decoding device is trained in the second training phase, which is after the first training phase in time. Thus, the pre-trained model from the first phase is in particular used. Herein, there are two simultaneous problems: an estimation of the degradation degree is effected in the first phase, and a reconstruction of the output is effected in the second phase. Therefore, only the decoder is trained in the second phase; the remaining components of the assistance system or of the so-called "pipeline" remain untrainable. Otherwise, the gradient of the decoder would destroy most of the trained weights, including the encoder, in the first epochs. Therefore, the components used in the first training phase are only used as feature extractors in the second phase.
The presented method is in particular a computer-implemented method. Therein, the method is in particular performed on an electronic computing device, wherein the electronic computing device can in particular comprise circuits, for example integrated circuits, processors and further electronic components to perform the corresponding method steps.
Therefore, a further aspect of the invention relates to a computer program product with program code means, which are stored in a computer-readable storage medium, to perform the method for determining a degradation degree according to the preceding aspect, when the computer program product is executed on a processor of an electronic computing device.
A still further aspect of the invention relates to a computer-readable storage medium with a computer program product, in particular an electronic computing device with a computer program product, according to the preceding aspect.
A still further aspect of the invention relates to an assistance system for determining a degradation degree of an image captured by a camera of a motor vehicle, with at least one camera and with an electronic computing device, wherein the assistance system is formed for performing a method according to the preceding aspect. In particular, the method is performed by the assistance system.
A still further aspect of the invention relates to a motor vehicle with an assistance system according to the preceding aspect. In particular, the motor vehicle is formed as an at least partially autonomous, in particular as a fully autonomous, motor vehicle.
Advantageous forms of configuration of the method are to be regarded as advantageous forms of configuration of the computer program product, of the computer-readable storage medium, of the assistance system as well as of the motor vehicle. Thereto, the assistance system as well as the motor vehicle comprise concrete features, which allow performing the method or an advantageous form of configuration thereof.
Further features are apparent from the claims, the figures and the description of figures. The features and feature combinations mentioned above in the description as well as the features and feature combinations mentioned below in the description of figures and/or shown in the figures alone are usable not only in the respectively specified combination, but also in other combinations without departing from the scope of the invention. Thus, implementations are also to be considered as encompassed and disclosed by the invention, which are not explicitly shown in the figures and explained, but arise from and can be generated by separated feature combinations from the explained implementations. Implementations and feature combinations are also to be considered as disclosed, which thus do not comprise all of the features of an originally formulated independent claim. Moreover, implementations and feature combinations are to be considered as disclosed, in particular by the implementations set out above, which extend beyond or deviate from the feature combinations set out in the relations of the claims.
Now, the invention is explained in more detail based on preferred embodiments as well as with reference to the attached drawings.
There show:
Fig. 1 a schematic top view of a motor vehicle with an embodiment of an assistance system;
Fig. 2 a schematic block diagram of an embodiment of an assistance system; and
Fig. 3 a schematic view of an embodiment of a regression module of an embodiment of an electronic computing device of an embodiment of the assistance system.
In the figures, identical or functionally identical elements are provided with the same reference characters.
Fig. 1 shows a schematic top view of an embodiment of a motor vehicle 1 with an embodiment of an assistance system 2. The assistance system 2 is formed for determining a degradation degree 3 (Fig. 2) of an image 5 captured by a camera 4 of the motor vehicle 1. Hereto, the assistance system 2 in particular comprises the camera 4 as well as an electronic computing device 6. In particular, an environment 7 of the motor vehicle 1 can be captured by the camera 4. Presently, the motor vehicle 1 is in particular an at least partially autonomous, in particular a fully autonomous, motor vehicle 1. Presently, the assistance system 2 can be formed only for determining the degradation degree 3. In addition, the assistance system 2 can also be formed for at least partially autonomous operation or for fully autonomous operation of the motor vehicle 1. Hereto, the assistance system 2 can for example perform interventions in a steering and acceleration device of the motor vehicle 1. Based on the captured environment 7, the assistance system 2 can then in turn in particular generate corresponding control signals for the steering and braking intervention, respectively.
Fig. 2 shows a schematic block diagram of an embodiment of the assistance system 2, in particular of the electronic computing device 6 of the assistance system 2.
In the method for determining the degradation degree 3, the image 5 is captured by the camera 4. Performing a deep feature extraction of a plurality of pixels 8 of the image 5 by an encoding module 9 of the electronic computing device 6 is effected. The encoding module 9 can also be referred to as encoder. Clustering of the plurality of pixels 8 by a feature point cluster module 10 of the electronic computing device 6 is effected. The clustered pixels 8 are regressed by a regression module 11 of the electronic computing device 6. Then, determining the degradation degree 3 depending on an evaluation by applying a sigmoid function 20 (Fig. 3) after the regression by a sigmoid function module 12 of the electronic computing device 6 as the output of the sigmoid function module 12 is in turn effected.
Fig. 2 further shows that a sigmoid loss 13 is trained in a first training phase for the assistance system 2 for applying the sigmoid function 20. Further, the assistance system 2 or the electronic computing device 6 comprises a self-attention module 14, by which the regressed pixels 8 are discriminated and the discriminated pixels 8 are transferred to the sigmoid function module 12. The self-attention module 14 is in particular provided in the form of a global averaging (global average pooling).
Fig. 2 further shows that the encoding module 9 is provided as a convolutional neural network. Further, the plurality of pixels 8 is clustered by a K-means algorithm 15 of the feature point cluster module 10. The clustered pixels 8 are in turn regressed by a regression module 11 formed as a long short term memory module 16. Further, the clustered pixels 8 are unidirectionally, which is represented by the arrows 17, transferred from the feature point cluster module 10 to the regression module 11. The pixels 8 are in turn bidirectionally transferred from the regression module 11 to the self-attention module 14, from the self-attention module 14 to the sigmoid function module 12 and bidirectionally from the sigmoid function module 12 to the sigmoid loss 13. This is in particular represented by the arrows 18. Further, the passage of the degradation degree 3 to a decoding device 19 of the assistance system 2 is also bidirectionally effected. Thus, the output of the sigmoid function module 12 is in particular passed to the decoding device 19 for decoding the output. Therein, the decoding device 19 can in particular be provided in the form of a fully convolutional neural network. The decoding device 19 is in particular trained in a second training phase for the assistance system 2, wherein exclusively the decoding device 19 is trained in the second training phase, which is after the first training phase in time.
Fig. 3 shows a schematic block diagram of an embodiment of the regression module 11 according to Fig. 2. Presently, it is in particular a long short term memory module 16 (LSTM). The long short term memory module 16 presently comprises at least three sigmoid functions 20 as well as at least two tanh functions 21. Further, the long short term memory module 16 comprises at least two pointwise multiplications 22 as well as two pointwise additions 23. The results of the regression are in turn backpropagated to the encoding device 9, which is in particular represented by the arrows 24.
In particular, such a long short term memory module 16 is suitable since the K-means algorithm 15 is a series of depth feature points, which are separated in clusters, such that it is a regression problem.
The core concept of the long short term memory module 16 is the cell state and its different gates. Cell states act like a memory of the network and can carry relevant information during the processing of the sequence. The gates are different neural networks, which decide which information is allowed onto the cell state. The gates can learn during the training which information is relevant, to retain or to forget it. Each gateway or gate contains a sigmoidal activation. This is helpful to update or forget data since each number, which is multiplied by 0, is 0, whereby values disappear or are "forgotten". Each number, which is multiplied by 1, is the same value, therefore this value remains the same or is "retained". In the long short term memory module 16, the forget gate is first provided. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function 20. Therein, values between 0 and 1 result. The closer to 0, the more is forgotten, and the closer to 1, the more is retained.
Next, the input gate follows, in which the previous hidden state and the current input are first passed to a sigmoid function 20, which decides which values are updated, in that it transforms the values such that they are between 0 and 1. The hidden state and the current input are also passed to the tanh function 21 to "squeeze" values between -1 and 1 to regulate the network. Later, the outputs of both are multiplied. The sigmoid output decides which information is important to retain from the tanh output.
For calculating the cell state, the cell state is first pointwise multiplied by the forget vector. Then, the output of the input gate is used and a pointwise addition 23 is performed, which updates the cell state to new values, which the neural network regards as relevant.
An output gate then decides what is to be the next hidden state. The hidden state contains information about the previous inputs. This state is also used for predictions. It is to be noted that the connection from the encoder to the LSTM module via K-means is unidirectional and is only used for the forward passage, since unsupervised K-means clustering is used. In order to generate an unconnected graph, the backpropagation 24 is effected via a separate connection from the regression module 11 to the encoding device 9. It is also advised against using supervised K-means and making the connection bidirectional, since the K-means stage only has few trainable parameters, which are not sufficient in the backpropagation 24, since the gradient flow is effected through a great number of trainable parameters of the LSTM.
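The gate computations described for the long short term memory module 16 can be summarized in standard LSTM notation (a textbook formulation, not reproduced from the application itself):

```latex
\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{(input gate)}\\
\tilde{c}_t &= \tanh(W_c\,[h_{t-1}, x_t] + b_c) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(pointwise multiplication and addition)}\\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(next hidden state)}
\end{aligned}
```

Here $\sigma$ denotes the sigmoid function 20, $\tanh$ the tanh function 21, and $\odot$ the pointwise multiplication 22.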
Further, the temporal consistency represents an important aspect in algorithms based on image processing to keep the system performance constant. The solution proposed in the figures can be extended by a further encoding device such that both encoders have successive image frames or frames of the video sequence. This is advantageous for the assistance system 2 to output predictions without flicker effect.
Overall, a semi-supervised algorithm is thus proposed, which estimates the degradation degree 3 on pixel basis from the input image. The assistance system 2 is flexible such that it does not require a hard annotation, which is cumbersome and very cost-intensive. The assistance system 2 can in particular be used for at least partially autonomous driving of the motor vehicle 1.
Claims
1. A method for determining a degradation degree (3) of an image (5) captured by a camera (4) of an assistance system (2) of a motor vehicle (1) by the assistance system (2), comprising the steps of: - capturing the image (5) by the camera (4);
- performing a deep feature extraction of a plurality of pixels (8) of the image (5) by an encoding module (9) of an electronic computing device (6) of the assistance system (2);
- clustering the plurality of pixels (8) by a feature point cluster module (10) of the electronic computing device (6);
- regressing the clustered pixels (8) by a regression module (11) of the electronic computing device (6); and
- determining the degradation degree (3) depending on an evaluation by applying a sigmoid function (20) after the regression by a sigmoid function module (12) of the electronic computing device (6) as an output of the sigmoid function module (12).
2. The method according to claim 1, characterized in that the encoding module (9) is provided as a convolutional neural network.
3. The method according to claim 1 or 2, characterized in that the plurality of pixels (8) is clustered by a K-means algorithm (15) of the feature point cluster module (10).
4. The method according to any one of the preceding claims, characterized in that the clustered pixels (8) are regressed by a regression module (11) formed as a long short term memory module (16).
5. The method according to any one of the preceding claims, characterized in that the clustered pixels (8) are unidirectionally transferred at least from the feature point cluster module (10) to the regression module (11).
6. The method according to any one of the preceding claims, characterized in that the regressed pixels (8) are backpropagated to the encoding device (9).
7. The method according to any one of the preceding claims, characterized in that a sigmoid loss (13) is trained in a first training phase for the assistance system (2) for applying the sigmoid function (20).
8. The method according to any one of the preceding claims, characterized in that the regressed pixels (8) are discriminated by a self-attention module (14) of the electronic computing device (6) and the discriminated pixels (8) are transferred to the sigmoid function module (12).
9. The method according to claim 8, characterized in that the self-attention module (14) is provided in the form of a global averaging.
10. The method according to any one of the preceding claims, characterized in that the output of the sigmoid function module (12) is transferred to a decoding device (19) of the electronic computing device (6) for decoding the output.
11. The method according to claim 10, characterized in that the decoding device (19) is provided in the form of a fully convolutional neural network.
12. The method according to claim 10 or 11, characterized in that
the decoding device (19) is trained in a second training phase for the assistance system (2), wherein exclusively the decoding device (19) is trained in the second training phase, which is after the first training phase in time.
13. A computer program product with program code means, which, when the program code means are executed on an electronic computing device (6), cause it to perform a method according to any one of claims 1 to 12.
14. A computer-readable storage medium with at least one computer program product according to claim 13.
15. An assistance system (2) for determining a degradation degree (3) of an image (5) captured by a camera (4) of a motor vehicle (1), with at least one camera (4) and with an electronic computing device (6), wherein the assistance system (2) is formed for performing a method according to any one of claims 1 to 12.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102021103200.8 | 2021-02-11 | ||
DE102021103200.8A DE102021103200B3 (en) | 2021-02-11 | 2021-02-11 | Method for determining a degree of degradation of a recorded image, computer program product, computer-readable storage medium and assistance system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022171590A1 true WO2022171590A1 (en) | 2022-08-18 |
Family
ID=80786370
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2022/052939 WO2022171590A1 (en) | 2021-02-11 | 2022-02-08 | Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system |
Country Status (2)
Country | Link |
---|---|
DE (1) | DE102021103200B3 (en) |
WO (1) | WO2022171590A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180315167A1 (en) | 2015-11-06 | 2018-11-01 | Clarion Co., Ltd. | Object Detection Method and Object Detection System |
EP3657379A1 (en) | 2018-11-26 | 2020-05-27 | Connaught Electronics Ltd. | A neural network image processing apparatus for detecting soiling of an image capturing device |
Also Published As
Publication number | Publication date |
---|---|
DE102021103200B3 (en) | 2022-06-23 |
CN117561540A (en) | System and method for performing computer vision tasks using a sequence of frames | |
US11816181B2 (en) | Blur classification and blur map estimation | |
CN114463810A (en) | Training method and device for face recognition model | |
EP4002270A1 (en) | Image recognition evaluation program, image recognition evaluation method, evaluation device, and evaluation system | |
JP2021064343A (en) | Behavior recognition device, behavior recognition method, and information generation device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22708055; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 22708055; Country of ref document: EP; Kind code of ref document: A1 |