WO2022171590A1 - Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system - Google Patents

Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system

Info

Publication number
WO2022171590A1
WO2022171590A1 (application PCT/EP2022/052939)
Authority
WO
WIPO (PCT)
Prior art keywords
module
assistance system
pixels
computing device
electronic computing
Prior art date
Application number
PCT/EP2022/052939
Other languages
French (fr)
Inventor
Senthil Kumar Yogamani
Arindam Das
Original Assignee
Connaught Electronics Ltd.
Priority date
Filing date
Publication date
Application filed by Connaught Electronics Ltd.
Publication of WO2022171590A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations

Abstract

The invention relates to a method for determining a degradation degree (3) of an image (5) captured by a camera (4) of an assistance system (2) of a motor vehicle (1) by the assistance system (2), comprising the steps of: - capturing the image (5) by the camera (4); - performing a deep feature extraction of a plurality of pixels (8) of the image (5) by an encoding module (9) of an electronic computing device (6) of the assistance system (2); - clustering the plurality of pixels (8) by a feature point cluster module (10) of the electronic computing device (6); - regressing the clustered pixels (8) by a regression module (11) of the electronic computing device (6); and - determining the degradation degree (3) depending on an evaluation by applying a sigmoid function (20) after the regression by a sigmoid function module (12) of the electronic computing device (6) as an output of the sigmoid function module (12). Further, the invention relates to a computer program product, to a computer-readable storage medium as well as to an assistance system (2).

Description

Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system
The invention relates to a method for determining a degradation degree of an image captured by a camera of an assistance system of a motor vehicle by the assistance system. Further, the invention relates to a computer program product, to a computer- readable storage medium as well as to an assistance system.
From motor vehicle construction, cameras are known which can capture an environmental image of an environment of the motor vehicle. In particular, so-called surround view cameras are known, which can be impaired by harsh environmental conditions like snow, rain, mud and the like. In order to capture such an impairment, it is already known that special contamination classes like soil and water drops can be determined and recognized depending on the contamination. Herein, great challenges arise in the annotation of even these few classes. In particular, the transition areas have to be annotated subjectively, and it is difficult to obtain realistic contamination captures with corresponding variety. Lens contaminations based on mud, sand, water drops and frost, as well as further adverse weather conditions like snow, rain or fog, cannot yet be determined; moreover, deteriorations caused by illumination, like low light, blinding, shadow or motion blur, are also to be taken into account in the future. However, it is impractical to create and annotate further datasets for each of these individual classes and for all combinations of degradation.
US 2018/315167 A1 discloses an object recognition method, in which an object such as for example a vehicle can be recognized from an image captured by an on-board camera even if a lens of the on-board camera is contaminated. In order to achieve the aim, in this object recognition method for recognizing an object contained in a captured image, an original image containing the object to be recognized is created from the captured image, a processed image is generated from the created original image by applying predetermined processing to the original image, and a learning process with respect to the restoration of an image of the object to be recognized is performed using the original image and the processed image. EP 3657379 A1 discloses an image processing device with a neural network, which obtains a first image and a respective additional image, wherein a first image capturing device has another field of view than each additional image capturing device. Each image is processed by corresponding instances of a common feature extraction processing network to generate a corresponding first map and at least one additional map. The feature classification includes processing the first map to generate a first classification, which indicates a contamination of the first image. The or each additional map is processed to generate at least one additional classification, which indicates a corresponding contamination of the or each additional image. The first and each additional classification are combined to generate an enhanced classification, which indicates a contamination of the first image. If the enhanced classification does not indicate a contamination of the first image capturing device, further processing can be performed.
It is the object of the present invention to provide a method, a computer program product, a computer-readable storage medium as well as an assistance system, by which an improved determination of a degradation degree of a camera can be performed.
This object is solved by a method, a computer program product, a computer-readable storage medium as well as an assistance system according to the independent claims. Advantageous forms of configuration are specified in the dependent claims.
An aspect of the invention relates to a method for determining a degradation degree of an image captured by a camera of an assistance system of a motor vehicle by the assistance system. Capturing the image by the camera and performing a deep feature extraction of a plurality of pixels of the image by an encoding module of an electronic computing device of the assistance system are effected. The plurality of pixels is clustered by a feature point cluster module of the electronic computing device. The clustered pixels are regressed by a regression module of the electronic computing device. Determining the degradation degree depending on an evaluation by applying a sigmoid function after the regression by a sigmoid function module of the electronic computing device as an output of the sigmoid function module is effected.
Thus, an improved determination of the degradation degree can in particular be performed. In particular, the degradation degree can for example be between 0 and 1, wherein 0 can indicate that a degradation is not present, thus that the lens is clear and clean, respectively, and 1 can signify that the lens is contaminated. Thus, the closer the degradation degree is to 0, the cleaner the lens is. This is in particular independent of a classification of the contamination. The degradation is only determined in general terms, and the image together with the corresponding degradation degree can then in turn be used further, for example to decide whether an evaluation of this image may be used for a driving function.
Thus, the invention in particular solves the problem that a corresponding degradation degree can nevertheless be determined independently of a classification and thus also independently of a plurality of datasets for a contamination. Thus, the assistance system can be trained simply, wherein only a low number of datasets is required for this.
Thus, the degree of degradation is in particular estimated between 0 and 1 by the sigmoid function, wherein 0 means clean and 1 opaque. The two above-mentioned cases are in particular extremes, for which corresponding annotations are known. Due to the presence of the regression module, the quality of the degradation is determined in the continuous range between 0 and 1.
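Expressed in standard notation (the textbook definition of the sigmoid, not a formula reproduced from the patent), the regression output x is squashed into the open interval (0, 1):

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\lim_{x \to -\infty} \sigma(x) = 0 \;(\text{clean}), \qquad
\lim_{x \to +\infty} \sigma(x) = 1 \;(\text{opaque})
```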
According to an advantageous form of configuration, the encoding module is provided as a convolutional neural network. The encoding module can also be referred to as encoder. In particular, a simple convolutional neural network (CNN) is used to extract deep features from the corresponding input images. If sufficient memory and computational resources are available, a neural network with a higher capacity can also be used. Thus, a feature extraction can in particular be reliably performed.
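As a rough illustration, such a minimal encoder could look as follows in PyTorch; the layer count, channel widths and input size are assumptions for the sketch, not values taken from the patent:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Simple CNN that turns an image into a grid of deep feature points."""
    def __init__(self, in_channels: int = 3, feat_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        fmap = self.backbone(image)             # (B, C, H/8, W/8)
        # Flatten spatial positions into a sequence of deep feature points.
        return fmap.flatten(2).transpose(1, 2)  # (B, H/8 * W/8, C)

feats = Encoder()(torch.randn(1, 3, 128, 128))  # -> (1, 256, 64)
```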
It has further proven advantageous if the plurality of pixels is clustered by a K-means algorithm of the feature point cluster module. Since annotations are only available for the two extreme cases, namely clean and opaque, the input image will most likely have a varying deterioration in the range between 0 and 1 at inference time. In order to learn this variation, the K-means clustering method is applied with a number of n clusters, wherein n can for example be 5. It is advantageous to first separate the deep feature points into n clusters; later, each of these clusters is passed through a regression module. The K-means algorithm is in particular a method for vector quantization, which is also used for cluster analysis. Therein, a previously known number of k groups is formed from a set of similar objects. It is further advantageous if the clustered pixels are regressed by a regression module formed as a long short term memory module (LSTM). The presented problem is in particular not a classification problem, but a regression problem. The output of the K-means clustering is nothing else than a series of deep feature points separated into clusters, wherefore the LSTM module is particularly advantageous for the regression. The core concept of such an LSTM module is the cell state and its different gates. The cell state acts like a memory of the network and can carry relevant information throughout the processing of the sequence. The gates are different neural networks, which decide which information is allowed onto the cell state. During training, the gates can learn which information is relevant, in order to retain or forget it. Each gate contains sigmoidal activations. This is helpful for updating or forgetting data, since each number multiplied by 0 is 0, whereby values disappear or are "forgotten", and each number multiplied by 1 keeps the same value, whereby it is "retained". In such an LSTM module, the so-called forget gate in particular comes first. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function, which yields values between 0 and 1. The closer to 0, the more is forgotten; the closer to 1, the more is retained.
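A minimal sketch of the clustering step, assuming scikit-learn's KMeans and the exemplary n = 5 from the text; the feature points are assumed to come from an encoder like the one sketched above:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_feature_points(feats: np.ndarray, n_clusters: int = 5):
    """Group deep feature points into n clusters (unsupervised).

    feats: (num_points, feat_dim) array of deep feature points.
    Returns one array per cluster; each cluster would later be passed
    through the regression module.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    return [feats[labels == k] for k in range(n_clusters)]

points = np.random.rand(256, 64).astype(np.float32)
clusters = cluster_feature_points(points)
print([c.shape for c in clusters])
```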
In a further advantageous form of configuration, the clustered pixels are unidirectionally transferred from the feature point cluster module to the regression module. In particular, the connection from the encoder to the LSTM module via the K-means algorithm is unidirectional and is only used for the forward pass, since unsupervised K-means clustering is used.
Further, it has proven advantageous if the regressed pixels are backpropagated to the encoding device. In particular, in order to generate an unconnected graph, the backpropagation is effected via a separate connection from the LSTM module to the encoder. It is also advised against using a supervised K-means and making the connection bidirectional, since the clustering has only few trainable parameters, which are not sufficient for the backpropagation, because the gradient flow passes through a great number of trainable parameters of the LSTM module.
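In an autograd framework, this forward-only clustering with a separate gradient path back to the encoder could be realized as follows; this is one plausible reading of the described behaviour, not code from the patent:

```python
import torch
from sklearn.cluster import KMeans

def forward_with_detached_kmeans(feats: torch.Tensor, n_clusters: int = 5):
    """feats: (num_points, feat_dim), requires_grad=True from the encoder.

    K-means only decides the ordering of the points and contributes no
    gradient. Gradients reach the encoder through the gathered features
    themselves, i.e. via a separate connection around the clustering.
    """
    with torch.no_grad():  # forward-only K-means, outside the graph
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
            feats.detach().cpu().numpy())
    labels = torch.as_tensor(labels, device=feats.device)
    order = torch.argsort(labels)   # group points by cluster index
    return feats[order]             # still differentiable w.r.t. feats

feats = torch.randn(256, 64, requires_grad=True)
out = forward_with_detached_kmeans(feats)
out.sum().backward()                # gradient flows back to the features
print(feats.grad.shape)             # torch.Size([256, 64])
```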
In a further advantageous form of configuration, a sigmoid loss is trained in a first training phase of the assistance system for applying the sigmoid function. Here, the sigmoid is an advantageous function for obtaining values in the range between 0 and 1. The reason for this is that a continuous annotation is not present, but only annotations for 0 and 1. Thus, in the first phase, the sigmoidal loss function is trained first, which estimates the quality of the perception deterioration, thus the degradation, from an input image. However, this flow cannot output the quality of the degradation per pixel; therefore, a functionality is added, which generates an output in the shape of the input image.
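A hedged sketch of such a first training phase, assuming a binary cross-entropy loss on the sigmoid output and a `pipeline` module standing in for the whole encoder-clustering-LSTM-attention-sigmoid chain (both are assumptions, not details from the patent):

```python
import torch
import torch.nn as nn

# Phase 1 (sketch): train the whole pipeline end-to-end on the two annotated
# extremes only, label 0.0 (clean) and 1.0 (opaque).
bce = nn.BCELoss()

def phase1_step(pipeline: nn.Module, images: torch.Tensor,
                labels: torch.Tensor, opt: torch.optim.Optimizer) -> float:
    opt.zero_grad()
    degree = pipeline(images)   # sigmoid output in (0, 1), same shape as labels
    loss = bce(degree, labels)  # "sigmoid loss" on the 0/1 annotations
    loss.backward()
    opt.step()
    return loss.item()
```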
It has further proven advantageous if the regressed pixels are discriminated by a self-attention module of the electronic computing device and the discriminated pixels are transferred to the sigmoid function module. The self-attention module is in particular a so-called attention map. A so-called "self-attention" can in particular be performed by the attention map. This is in particular an advantageous step to discriminate the output of the LSTM.
It is also advantageous if the self-attention module is provided in the form of a global averaging. The global averaging is in particular a so-called global average pooling (GAP).
It is applied to the output signal of the LSTM and multiplied by a one-dimensional vector.
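One plausible implementation of this GAP-based self-attention; the patent names only the pooling and a multiplication by a one-dimensional vector, so the softmax weighting below is an assumption:

```python
import torch

def gap_self_attention(lstm_out: torch.Tensor) -> torch.Tensor:
    """lstm_out: (batch, seq_len, hidden). Global-average-pool over the
    sequence, then reweight each time step against the pooled vector,
    a cheap 'self-attention' that discriminates the LSTM output."""
    pooled = lstm_out.mean(dim=1)                         # GAP -> (B, hidden)
    scores = torch.softmax(
        (lstm_out * pooled.unsqueeze(1)).sum(-1), dim=1)  # (B, seq_len)
    return (lstm_out * scores.unsqueeze(-1)).sum(dim=1)   # (B, hidden)

out = gap_self_attention(torch.randn(2, 256, 64))
print(out.shape)                                          # torch.Size([2, 64])
```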
In a further advantageous form of configuration, the output of the sigmoid function module is transferred to a decoding device of the electronic computing device for decoding the output. Thus, a decoder is in particular added, which adopts the output of the sigmoid function module in the one-dimensional format. In particular, this one-dimensional vector is converted into a two-dimensional representation and then in turn supplied to the decoder for reconstruction. This reconstruction can then in turn be transferred to a superordinate assistance system, which can then in turn decide, based on the reconstruction, whether or not the image of the camera can be used for evaluation. If, for example, the image cannot be used for evaluation, an alarm signal can be generated for a user of the motor vehicle.
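A sketch of that conversion and reconstruction, with an assumed vector length of 256 reshaped to a 16 x 16 map; neither dimension is fixed by the patent:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Reshape the 1-D output vector into a 2-D map and reconstruct an
    image-shaped, per-pixel degradation map from it."""
    def __init__(self, vec_len: int = 256, out_channels: int = 1):
        super().__init__()
        self.side = int(vec_len ** 0.5)  # 256 -> 16 x 16
        self.up = nn.Sequential(
            nn.ConvTranspose2d(1, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, out_channels, 4, stride=2, padding=1),
            nn.Sigmoid(),                # per-pixel degree in (0, 1)
        )

    def forward(self, vec: torch.Tensor) -> torch.Tensor:
        grid = vec.view(-1, 1, self.side, self.side)  # 1-D -> 2-D
        return self.up(grid)                          # (B, 1, 64, 64)

print(Decoder()(torch.randn(2, 256)).shape)  # torch.Size([2, 1, 64, 64])
```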
Further, it can also be provided that the credibility of the image is downgraded for an at least partially automated operation of the motor vehicle if a corresponding value of the degradation is present. Thus, a safe operation of the motor vehicle can be realized.
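As an illustration, such a credibility downgrade could be gated on the estimated degree; both thresholds below are purely hypothetical values, not taken from the patent:

```python
def gate_driving_function(degradation_degree: float,
                          warn_threshold: float = 0.5,
                          block_threshold: float = 0.8) -> str:
    """Downgrade the credibility of a camera image based on its
    degradation degree; the thresholds are hypothetical."""
    if degradation_degree >= block_threshold:
        return "discard image, raise alarm for the driver"
    if degradation_degree >= warn_threshold:
        return "downgrade credibility for automated driving"
    return "use image for evaluation"

print(gate_driving_function(0.73))
```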
In a further advantageous form of configuration, the decoding device is provided in the form of a fully convolutional neural network, in particular a so-called fully convolutional network (FCN). It has turned out that this is very advantageous for further processing the output of the sigmoid function module.
In a further advantageous form of configuration, the decoding device is trained in a second training phase of the assistance system, wherein exclusively the decoding device is trained in the second training phase, which is after the first training phase in time. Thus, the pre-trained model from the first phase is in particular reused. Herein, two problems are solved: an estimation of the degradation degree is effected in the first phase, and a reconstruction of the output is effected in the second phase. Therefore, only the decoder is trained in the second phase; the remaining components of the assistance system, or of the so-called "pipeline", remain frozen. Otherwise, the gradient of the decoder would destroy most of the trained weights, including those of the encoder, within the first epochs. Therefore, the components used in the first training phase serve only as feature extractors in the second phase.
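A sketch of the second phase, assuming PyTorch modules; freezing via `requires_grad_` is one common way to keep the phase-1 components as pure feature extractors:

```python
import torch
import torch.nn as nn

def prepare_phase2(pipeline: nn.Module,
                   decoder: nn.Module) -> torch.optim.Optimizer:
    """Phase 2 (sketch): freeze everything trained in phase 1 so the
    encoder/clustering/LSTM path only acts as a feature extractor,
    then train the decoder alone."""
    for p in pipeline.parameters():
        p.requires_grad_(False)  # phase-1 weights stay fixed
    pipeline.eval()
    return torch.optim.Adam(decoder.parameters(), lr=1e-4)  # decoder only
```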
The presented method is in particular a computer-implemented method. Therein, the method is in particular performed on an electronic computing device, wherein the electronic computing device can in particular comprise circuits, for example integrated circuits, processors and further electronic components to perform the corresponding method steps.
Therefore, a further aspect of the invention relates to a computer program product with program code means, which are stored in a computer-readable storage medium, to perform the method for determining a degradation degree according to the preceding aspect, when the computer program product is executed on a processor of an electronic computing device.
A still further aspect of the invention relates to a computer-readable storage medium with a computer program product, in particular an electronic computing device with a computer program product, according to the preceding aspect.
A still further aspect of the invention relates to an assistance system for determining a degradation degree of an image captured by a camera of a motor vehicle, with at least one camera and with an electronic computing device, wherein the assistance system is formed for performing a method according to the preceding aspect. In particular, the method is performed by the assistance system. A still further aspect of the invention relates to a motor vehicle with an assistance system according to the preceding aspect. In particular, the motor vehicle is formed as an at least partially autonomous, in particular as a fully autonomous, motor vehicle.
Advantageous forms of configuration of the method are to be regarded as advantageous forms of configuration of the computer program product, of the computer-readable storage medium, of the assistance system as well as of the motor vehicle. Thereto, the assistance system as well as the motor vehicle comprise concrete features, which allow performing the method or an advantageous form of configuration thereof.
Further features are apparent from the claims, the figures and the description of figures. The features and feature combinations mentioned above in the description as well as the features and feature combinations mentioned below in the description of figures and/or shown in the figures alone are usable not only in the respectively specified combination, but also in other combinations without departing from the scope of the invention. Thus, implementations are also to be considered as encompassed and disclosed by the invention, which are not explicitly shown in the figures and explained, but arise from and can be generated by separated feature combinations from the explained implementations. Implementations and feature combinations are also to be considered as disclosed, which thus do not comprise all of the features of an originally formulated independent claim. Moreover, implementations and feature combinations are to be considered as disclosed, in particular by the implementations set out above, which extend beyond or deviate from the feature combinations set out in the relations of the claims.
Now, the invention is explained in more detail based on preferred embodiments as well as with reference to the attached drawings.
The figures show:
Fig. 1 a schematic top view of a motor vehicle with an embodiment of an assistance system;

Fig. 2 a schematic block diagram of an embodiment of an assistance system; and

Fig. 3 a schematic view of an embodiment of a regression module of an embodiment of an electronic computing device of an embodiment of the assistance system.
In the figures, identical or functionally identical elements are provided with the same reference characters.
Fig. 1 shows a schematic top view of an embodiment of a motor vehicle 1 with an embodiment of an assistance system 2. The assistance system 2 is formed for determining a degradation degree 3 (Fig. 2) of an image 5 captured by a camera 4 of the motor vehicle 1. Hereto, the assistance system 2 in particular comprises the camera 4 as well as an electronic computing device 6. In particular, an environment 7 of the motor vehicle 1 can be captured by the camera 4. Presently, the motor vehicle 1 is in particular an at least partially autonomous, in particular a fully autonomous, motor vehicle 1. Presently, the assistance system 2 can be formed only for determining the degradation degree 3. In addition, the assistance system 2 can also be formed for at least partially autonomous or fully autonomous operation of the motor vehicle 1. Hereto, the assistance system 2 can for example perform interventions in a steering and acceleration device of the motor vehicle 1. Based on the captured environment 7, the assistance system 2 can then in particular generate corresponding control signals for the steering and braking interventions, respectively.
Fig. 2 shows a schematic block diagram of an embodiment of the assistance system 2, in particular of the electronic computing device 6 of the assistance system 2.
In the method for determining the degradation degree 3, the image 5 is captured by the camera 4. A deep feature extraction of a plurality of pixels 8 of the image 5 is performed by an encoding module 9 of the electronic computing device 6. The encoding module 9 can also be referred to as encoder. The plurality of pixels 8 is clustered by a feature point cluster module 10 of the electronic computing device 6. The clustered pixels 8 are regressed by a regression module 11 of the electronic computing device 6. Then, the degradation degree 3 is determined depending on an evaluation by applying a sigmoid function 20 (Fig. 3) after the regression by a sigmoid function module 12 of the electronic computing device 6, as the output of the sigmoid function module 12. Fig. 2 further shows that a sigmoid loss 13 is trained in a first training phase of the assistance system 2 for applying the sigmoid function 20. Further, the assistance system 2 or the electronic computing device 6 comprises a self-attention module 14, by which the regressed pixels 8 are discriminated and the discriminated pixels 8 are transferred to the sigmoid function module 12. The self-attention module 14 is in particular provided in the form of a global averaging (global average pooling).
Fig. 2 further shows that the encoding module 9 is provided as a convolutional neural network. Further, the plurality of pixels 8 is clustered by a K-means algorithm 15 of the feature point cluster module 10. The clustered pixels 8 are in turn regressed by the regression module 11, which is formed as a long short term memory module 16. Further, the clustered pixels 8 are transferred unidirectionally, which is represented by the arrows 17, from the feature point cluster module 10 to the regression module 11. The pixels 8 are in turn transferred bidirectionally from the regression module 11 to the self-attention module 14, from the self-attention module 14 to the sigmoid function module 12, and bidirectionally from the sigmoid function module 12 to the sigmoid loss 13. This is in particular represented by the arrows 18. Further, the passage of the degradation degree 3 to a decoding device 19 of the assistance system 2 is also effected bidirectionally. Thus, the output of the sigmoid function module 12 is in particular passed to the decoding device 19 for decoding the output. Therein, the decoding device 19 can in particular be provided in the form of a fully convolutional neural network. The decoding device 19 is in particular trained in a second training phase of the assistance system 2, wherein exclusively the decoding device 19 is trained in the second training phase, which is after the first training phase in time.
Fig. 3 shows a schematic block diagram of an embodiment of the regression module 11 according to Fig. 2. Presently, it is in particular a long short term memory module 16 (LSTM). The long short term memory module 16 presently comprises at least three sigmoid functions 20 as well as at least two tanh functions 21. Further, the long short term memory module 16 comprises at least two pointwise multiplications 22 as well as two pointwise additions 23. The results of the regression are in turn backpropagated to the encoding device 9, which is in particular represented by the arrows 24.
In particular, such a long short term memory module 16 is suitable since the output of the K-means algorithm 15 is a series of deep feature points separated into clusters, such that the task is a regression problem. The core concept of the long short term memory module 16 is the cell state and its different gates. The cell state acts like a memory of the network and can carry relevant information throughout the processing of the sequence. The gates are different neural networks, which decide which information is allowed onto the cell state. The gates can learn during the training which information is relevant, in order to retain or forget it. Each gate contains a sigmoidal activation. This is helpful for updating or forgetting data, since each number multiplied by 0 is 0, whereby values disappear or are "forgotten", and each number multiplied by 1 keeps the same value, whereby it is "retained". In the long short term memory module 16, the forget gate comes first. This gate decides which information is to be discarded or retained. Information from the previous hidden state and information from the current input are passed through the sigmoid function 20, which yields values between 0 and 1. The closer to 0, the more is forgotten; the closer to 1, the more is retained.
Next, the input gate follows, in which the previous hidden state and the current input are first passed to a sigmoid function 20, which decides which values are updated by transforming the values such that they lie between 0 and 1. The hidden state and the current input are also passed to the tanh function 21 in order to "squeeze" the values to between -1 and 1 and thereby regulate the network. Afterwards, the outputs of both are multiplied; the sigmoid output decides which information from the tanh output is important to retain.
For calculating the new cell state, the cell state is first pointwise multiplied by the forget vector. Then, the output of the input gate is used and a pointwise addition 23 is performed, which updates the cell state to new values, which the neural network regards as relevant.
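The gate computations described here, together with the output gate described next, follow the standard LSTM equations (textbook form, not quoted from the patent); σ denotes the sigmoid function 20, tanh the function 21, and ⊙ the pointwise multiplication 22:

```latex
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f)          && \text{forget gate}\\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i)          && \text{input gate}\\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c)   && \text{candidate cell state}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update}\\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o)          && \text{output gate}\\
h_t &= o_t \odot \tanh(c_t)                      && \text{next hidden state}
\end{aligned}
```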
An output gate then decides what is to be the next hidden state. The hidden state contains information about the previous inputs; this state is also used for predictions. It is to be noted that the connection from the encoder to the LSTM module via K-means is unidirectional and is only used for the forward pass, since unsupervised K-means clustering is used. In order to generate an unconnected graph, the backpropagation 24 is effected via a separate connection from the regression module 11 to the encoding device 9. It is also advised against using a supervised K-means and making the connection bidirectional, since the clustering only has few trainable parameters, which are not sufficient for the backpropagation 24, because the gradient flow passes through a great number of trainable parameters of the LSTM. Further, temporal consistency represents an important aspect in algorithms based on image processing, in order to keep the system performance constant. The solution proposed in the figures can be extended by a further encoding device such that both encoders receive successive image frames of the video sequence. This is advantageous for the assistance system 2 to output predictions without flicker effect.
Overall, a semi-supervised algorithm is thus proposed, which estimates the degradation degree 3 on a per-pixel basis from the input image. The assistance system 2 is flexible in that it does not require a hard annotation, which is cumbersome and very cost-intensive. The assistance system 2 can in particular be used for at least partially autonomous driving of the motor vehicle 1.

Claims

Claims
1. A method for determining a degradation degree (3) of an image (5) captured by a camera (4) of an assistance system (2) of a motor vehicle (1) by the assistance system (2), comprising the steps of: - capturing the image (5) by the camera (4);
- performing a deep feature extraction of a plurality of pixels (8) of the image (5) by an encoding module (9) of an electronic computing device (6) of the assistance system (2);
- clustering the plurality of pixels (8) by a feature point cluster module (10) of the electronic computing device (6);
- regressing the clustered pixels (8) by a regression module (11) of the electronic computing device (6); and
- determining the degradation degree (3) depending on an evaluation by applying a sigmoid function (20) after the regression by a sigmoid function module (12) of the electronic computing device (6) as an output of the sigmoid function module (12).
2. The method according to claim 1, characterized in that the encoding module (9) is provided as a convolutional neural network.
3. The method according to claim 1 or 2, characterized in that the plurality of pixels (8) is clustered by a K-means algorithm (15) of the feature point cluster module (10).
4. The method according to any one of the preceding claims, characterized in that the clustered pixels (8) are regressed by a regression module (11) formed as a long short term memory module (16).
5. The method according to any one of the preceding claims, characterized in that the clustered pixels (8) are unidirectionally transferred at least from the feature point cluster module (10) to the regression module (11).
6. The method according to any one of the preceding claims, characterized in that the regressed pixels (8) are backpropagated to the encoding module (9).
7. The method according to any one of the preceding claims, characterized in that a sigmoid loss (13) is trained in a first training phase for the assistance system (2) for applying the sigmoid function (20).
8. The method according to any one of the preceding claims, characterized in that the regressed pixels (8) are discriminated by a self-attention module (14) of the electronic computing device (6) and the discriminated pixels (8) are transferred to the sigmoid function module (12).
9. The method according to claim 8, characterized in that the self-attention module (14) is provided in the form of a global averaging.
10. The method according to any one of the preceding claims, characterized in that the output of the sigmoid function module (12) is transferred to a decoding device (19) of the electronic computing device (6) for decoding the output.
11. The method according to claim 10, characterized in that the decoding device (19) is provided in the form of a fully convolutional neural network.
12. The method according to claim 10 or 11, characterized in that the decoding device (19) is trained in a second training phase for the assistance system (2), wherein exclusively the decoding device (19) is trained in the second training phase, which temporally follows the first training phase.
13. A computer program product with program code means, which, when the program code means are executed on an electronic computing device (6), cause it to perform a method according to any one of claims 1 to 12.
14. A computer-readable storage medium with at least one computer program product according to claim 13.
15. An assistance system (2) for determining a degradation degree (3) of an image (5) captured by a camera (4) of a motor vehicle (1), with at least one camera (4) and with an electronic computing device (6), wherein the assistance system (2) is formed for performing a method according to any one of claims 1 to 12.
PCT/EP2022/052939 2021-02-11 2022-02-08 Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system WO2022171590A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102021103200.8 2021-02-11
DE102021103200.8A DE102021103200B3 (en) 2021-02-11 2021-02-11 Method for determining a degree of degradation of a recorded image, computer program product, computer-readable storage medium and assistance system

Publications (1)

Publication Number Publication Date
WO2022171590A1 (en)

Family

ID=80786370

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/052939 WO2022171590A1 (en) 2021-02-11 2022-02-08 Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system

Country Status (2)

Country Link
DE (1) DE102021103200B3 (en)
WO (1) WO2022171590A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102022207416B3 (en) 2022-07-20 2023-10-05 Zf Friedrichshafen Ag Computer-implemented method for detecting occlusions of an imaging sensor
DE102022121781A1 (en) 2022-08-29 2024-02-29 Connaught Electronics Ltd. Computer vision based on thermal imaging in a vehicle

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9445057B2 (en) 2013-02-20 2016-09-13 Magna Electronics Inc. Vehicle vision system with dirt detection

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180315167A1 (en) 2015-11-06 2018-11-01 Clarion Co., Ltd. Object Detection Method and Object Detection System
EP3657379A1 (en) 2018-11-26 2020-05-27 Connaught Electronics Ltd. A neural network image processing apparatus for detecting soiling of an image capturing device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MATHILDE CARON ET AL: "Deep Clustering for Unsupervised Learning of Visual Features", ARXIV.ORG, 18 March 2019 (2019-03-18), XP081122377, Retrieved from the Internet <URL:https://arxiv.org/pdf/1807.05520.pdf> *
RUI QIAN ET AL: "Attentive Generative Adversarial Network for Raindrop Removal from a Single Image", ARXIV.ORG, 6 May 2018 (2018-05-06), XP081316577, Retrieved from the Internet <URL:https://arxiv.org/pdf/1711.10098.pdf> *
SUN JIAHAO ET AL: "An Introductory Survey on Attention Mechanisms in Computer Vision Problems", 2020 6TH INTERNATIONAL CONFERENCE ON BIG DATA AND INFORMATION ANALYTICS (BIGDIA), IEEE, 4 December 2020 (2020-12-04), pages 295 - 300, XP033893246 *
URICAR MICHAL ET AL: "SoilingNet: Soiling Detection on Automotive Surround-View Cameras", 2019 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), IEEE, 27 October 2019 (2019-10-27), pages 67 - 72, XP033668555 *

Also Published As

Publication number Publication date
DE102021103200B3 (en) 2022-06-23

Similar Documents

Publication Publication Date Title
WO2022000426A1 (en) Method and system for segmenting moving target on basis of twin deep neural network
CN108062562B (en) Object re-recognition method and device
WO2022171590A1 (en) Method for determining a degradation degree of a captured image, computer program product, computer-readable storage medium as well as assistance system
CN107977638B (en) Video monitoring alarm method, device, computer equipment and storage medium
CN112507990A (en) Video time-space feature learning and extracting method, device, equipment and storage medium
Akilan et al. sEnDec: an improved image to image CNN for foreground localization
US11574500B2 (en) Real-time facial landmark detection
Hsu et al. Learning to tell brake and turn signals in videos using cnn-lstm structure
US11804026B2 (en) Device and a method for processing data sequences using a convolutional neural network
Rekabdar et al. Dilated convolutional neural network for predicting driver's activity
WO2020000382A1 (en) Motion-based object detection method, object detection apparatus and electronic device
CN113065645A (en) Twin attention network, image processing method and device
Gesnouin et al. TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction
CN115601403A (en) Event camera optical flow estimation method and device based on self-attention mechanism
CN113158905A (en) Pedestrian re-identification method based on attention mechanism
CN110705564A (en) Image recognition method and device
CN113657200A (en) Video behavior action identification method and system based on mask R-CNN
Anees et al. Deep learning framework for density estimation of crowd videos
CN110121055B (en) Method and apparatus for object recognition
CN113011395B (en) Single-stage dynamic pose recognition method and device and terminal equipment
CN117561540A (en) System and method for performing computer vision tasks using a sequence of frames
US11816181B2 (en) Blur classification and blur map estimation
CN114463810A (en) Training method and device for face recognition model
EP4002270A1 (en) Image recognition evaluation program, image recognition evaluation method, evaluation device, and evaluation system
JP2021064343A (en) Behavior recognition device, behavior recognition method, and information generation device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22708055

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22708055

Country of ref document: EP

Kind code of ref document: A1