WO2020220926A1 - Method and device for identifying multimedia data - Google Patents

Method and device for identifying multimedia data

Info

Publication number
WO2020220926A1
Authority
WO
WIPO (PCT)
Prior art keywords
ALIF
time step
multimedia data
data
network layer
Prior art date
Application number
PCT/CN2020/082961
Other languages
English (en)
Chinese (zh)
Inventor
高岱恒
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司 filed Critical 北京灵汐科技有限公司
Publication of WO2020220926A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • The present invention relates to the technical field of deep learning, and in particular to a method and device for identifying multimedia data.
  • Deep learning refers to a family of algorithms that use multi-layer neural networks, together with various machine learning techniques, to solve problems involving images, text, and other data. Broadly, deep learning can be classified under neural networks, but its specific implementations vary widely.
  • The core of deep learning is feature learning, which aims to obtain hierarchical feature information through a layered network and thus removes the need to design features by hand, as was done in the past.
  • Deep learning is a framework that includes many important algorithms, such as Convolutional Neural Networks (CNN), AutoEncoders, Sparse Coding, Restricted Boltzmann Machines (RBM), Deep Belief Networks (DBN), and Recurrent Neural Networks (RNN).
  • CNN Convolutional Neural Networks
  • RBM Restricted Boltzmann Machine
  • RNN Recurrent Neural Network
  • SNN Spiking Neural Network
  • HH Hodgkin-Huxley
  • Image recognition is a classic problem in the field of computer vision. With the rapid development of AI technology represented by deep learning, image recognition has attracted the attention of many researchers. However, in the field of blurred-image recognition, because the data distribution of blur and noise is difficult to evaluate, simulate, and model, existing ANN-based algorithms struggle to achieve recognition ability comparable to that of humans.
  • The classic blurred-image recognition process can be divided into two steps: 1) remove the noise and blur from the image; 2) perform image recognition on the denoised image.
  • Noise in images usually arises from rapid changes of the spatial scene or from shooting techniques or devices (such as capture equipment with too low a resolution).
  • Denoising images/videos usually requires a large amount of prior knowledge (such as knowledge distillation over the various possible kinds of noise).
  • Freeman et al. proposed the heavy-tailed gradient prior in 2008, which can effectively remove the blur caused by the photographer's hand shaking from a single image.
  • However, motion-blur kernel estimation and sequence modeling are highly sensitive to irregular, dense noise.
  • In that case, the recognition success rate of the timing model also drops significantly.
  • In view of the above problems, the present invention provides a method and device for recognizing multimedia data that overcome the above problems or at least partially solve them.
  • A method for identifying multimedia data is provided, including:
  • inputting the multimedia data to be identified into a pre-built neural network structure, wherein the neural network structure includes an adaptive leaky integrate-and-fire (ALIF) timing model, and the ALIF timing model includes multiple ALIF network layers;
  • the multimedia data to be identified includes image data and/or video data;
  • the multiple ALIF network layers in the neural network structure perform recognition calculation on the multimedia data to be recognized and output the calculation result.
  • The neuron output is calculated by a formula of the form y_t = f(v_t + N), where: t denotes the t-th time step; y_t denotes the output of the neuron in the ALIF network layer at the t-th time step; f denotes the activation function whose firing threshold f_thres is adaptively adjusted; N denotes a tensor set to simulate the random noise of the brain; and v_t denotes the membrane potential at the t-th time step.
  • The membrane potential v_t at the t-th time step is calculated by a formula of the form v_t = W_x · x_t + A ⊙ v_{t-1}, where: v_t denotes the membrane potential at the t-th time step; W_x denotes the two-dimensional weight matrix that transforms the input in the ALIF timing model; x_t denotes the input of the ALIF network layer; v_{t-1} denotes the membrane potential at the (t-1)-th time step; and A denotes a preset matrix applied element-wise.
  • The shape of W_x is: (data dimension of the input at each time step) × (number of units of the ALIF network layer). A minimal sketch of these two updates is given below.
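  • The sketch below (in NumPy) illustrates the two updates above; the function and variable names, the Gaussian form of the noise, and the thresholding rule are illustrative assumptions reconstructed from the definitions, not the patent's exact disclosure. For example, with an input dimension of 784 and 128 ALIF units, W_x would have shape (784, 128), matching the shape statement above.

```python
import numpy as np

def alif_update(x_t, v_prev, Wx, A, f_thres, noise_std=0.01, rng=np.random.default_rng(0)):
    """One illustrative ALIF update: membrane potential v_t, then thresholded output y_t.

    x_t     : input at time step t, shape (input_dim,)
    v_prev  : membrane potential at step t-1, shape (units,)
    Wx      : weight matrix, shape (input_dim, units)
    A       : preset matrix applied element-wise to v_prev, shape (units,)
    f_thres : firing threshold (scalar in this sketch)
    """
    v_t = x_t @ Wx + A * v_prev                    # membrane potential update
    noise = rng.normal(0.0, noise_std, v_t.shape)  # tensor simulating random brain noise
    y_t = (v_t + noise > f_thres).astype(float)    # output from the thresholded activation
    return y_t, v_t
```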
  • the image data and/or video data are blurred image data and/or video data.
  • a multimedia data recognition device including:
  • The data input module is configured to input multimedia data to be recognized into a pre-built neural network structure, wherein the neural network structure includes an adaptive leaky integrate-and-fire (ALIF) timing model, and the ALIF timing model includes multiple ALIF network layers; the multimedia data to be identified includes image data and/or video data;
  • The data calculation module is configured to perform recognition calculation on the multimedia data to be recognized through the multiple ALIF network layers in the neural network structure, and output the calculation result.
  • The data calculation module is further configured to: for any ALIF network layer, calculate the output of the neuron by a formula of the form y_t = f(v_t + N), where: t denotes the t-th time step; y_t denotes the output of the neuron in the ALIF network layer at the t-th time step; f denotes the activation function whose firing threshold f_thres is adaptively adjusted; N denotes a tensor set to simulate the random noise of the brain; and v_t denotes the membrane potential at the t-th time step.
  • The data calculation module is further configured to calculate the membrane potential v_t at the t-th time step by a formula of the form v_t = W_x · x_t + A ⊙ v_{t-1}, where: v_t denotes the membrane potential at the t-th time step; W_x denotes the two-dimensional weight matrix that transforms the input in the ALIF timing model; x_t denotes the input of the ALIF network layer; v_{t-1} denotes the membrane potential at the (t-1)-th time step; and A denotes a preset matrix applied element-wise.
  • The shape of W_x is: (data dimension of the input at each time step) × (number of units of the ALIF network layer).
  • A storage device is provided in which a computer program is stored; when the computer program runs in an electronic device, it is loaded and executed by the processor of the electronic device to perform the multimedia data recognition method described above.
  • an electronic device including:
  • a processor for running a computer program; and
  • a storage device for storing a computer program which, when run in the electronic device, is loaded by the processor and executes the multimedia data identification method described in any one of the above.
  • The present invention proposes ALIF, a new algorithm that combines SNN and ANN. With far fewer weights than the commonly used time series models RNN, LSTM, and GRU, it shows better recognition ability and noise resistance in blurred-image recognition tasks; the model is more robust and can effectively identify whether the scene contains target objects.
  • Figure 1 shows a schematic diagram of a blurred image;
  • Figure 2 shows a schematic diagram of a neural network structure according to an embodiment of the present invention;
  • Figure 3 shows a schematic flowchart of a method for identifying multimedia data according to an embodiment of the present invention;
  • Figure 4 shows a schematic diagram of a neuron;
  • Figure 5 shows a schematic diagram of ALIF network layer calculation according to an embodiment of the present invention;
  • Figure 6 shows a schematic diagram of a training picture input to the neural network according to an embodiment of the present invention;
  • Figure 7 shows a schematic diagram of a test picture input to the neural network according to an embodiment of the present invention;
  • Figure 8 shows a comparison diagram of experimental results of neural networks with different implementation models according to an embodiment of the present invention;
  • Figure 9 shows a schematic structural diagram of a multimedia data recognition device according to an embodiment of the present invention.
  • The embodiment of the present invention proposes an adaptive leaky integrate-and-fire (ALIF) timing algorithm model that integrates SNN and ANN, and uses it to perform blurred-image recognition.
  • Figure 2 shows a schematic diagram of a neural network structure according to an embodiment of the present invention. As shown in Figure 2, because ALIF is a time series model, the input of the neural network is a sequence of consecutive blurred images.
  • The time step of the input image sequence (GoI, Group of Images) is set to one of five values: 2, 5, 10, 15, or 20.
  • RNN, LSTM, and other timing models were added for comparison.
  • FC in Figure 2 denotes a fully connected layer. A sketch of how such a GoI input can be packed is given below.
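  • For orientation only, the sketch below shows how a group of images might be packed into the (batch, time_steps, input_dim) tensor that a time-series layer such as ALIF consumes; the 28 × 28 frame size and T = 10 are assumed values, not taken from the patent.

```python
import numpy as np

# Pack a group of images (GoI) into the tensor shape expected by a time-series layer.
frames = np.random.rand(20, 28, 28)   # a short blurred image sequence (20 frames, illustrative)
T = 10                                # one of the time-step settings (2, 5, 10, 15 or 20)
goi = frames[:T].reshape(1, T, -1)    # shape (1, 10, 784): one GoI sample of flattened frames
print(goi.shape)
```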
  • the neural network structure provided by the embodiment of the present invention mainly lies in ALIF, which is a layer in the network, which is consistent with the conceptual level of popular deep learning time series models such as RNN, LSTM, and GRU, but more in the implementation
  • ALIF is a layer in the network, which is consistent with the conceptual level of popular deep learning time series models such as RNN, LSTM, and GRU, but more in the implementation
  • the reference to the working mechanism of the pulse neural network is more reasonable at the biological level.
  • FIG. 3 shows a schematic flow chart of a method for identifying multimedia data according to an embodiment of the present invention.
  • The multimedia data may include blurred image data or blurred video data.
  • the method for identifying multimedia data provided by an embodiment of the present invention may include:
  • Step S301: input the multimedia data to be recognized into a pre-built neural network structure, wherein the neural network structure includes an adaptive leaky integrate-and-fire (ALIF) timing model, and the ALIF timing model includes multiple ALIF network layers;
  • the multimedia data to be identified includes image data and/or video data; here it is blurred image data or blurred video data, that is, multimedia data with low resolution and/or heavy noise;
  • Step S302: perform recognition calculation on the multimedia data to be recognized through the multiple ALIF network layers in the neural network structure, and output the calculation result.
  • Fig. 5 shows a schematic diagram of ALIF network layer calculation according to an embodiment of the present invention.
  • The calculation logic of the ALIF network layer is similar to that of an RNN, but this embodiment distinguishes it by adding random noise, an adaptive firing module, and other elements.
  • v_{t,1} denotes the membrane potential at the t-th time step when the current layer index is 1;
  • x_t denotes the input of the ALIF network layer;
  • y_t denotes the output of the neuron of the ALIF network layer at the t-th time step.
  • The calculation logic can be expressed as follows (taking the current time step as an example).
  • The neuron output at each time step of any ALIF network layer can be calculated by a formula of the form y_t = f(v_t + N), where: t denotes the t-th time step; y_t denotes the output of the neuron in the ALIF network layer at the t-th time step; f denotes the activation function whose firing threshold f_thres is adaptively adjusted; N denotes a tensor set to simulate the random noise of the brain; and v_t denotes the membrane potential at the t-th time step.
  • The membrane potential v_t at the t-th time step is calculated by a formula of the form v_t = W_x · x_t + A ⊙ v_{t-1}, where W_x denotes the two-dimensional weight matrix that transforms the input in the ALIF time series model, x_t denotes the input of the ALIF network layer, and v_{t-1} denotes the membrane potential at the (t-1)-th time step.
  • W_x is a two-dimensional weight matrix that transforms the input in the time series model. Its matrix shape is input_dim (the fixed data dimension of the input at each time step) × the number of units of the ALIF network layer, consistent with the corresponding concept in time series models such as RNN.
  • A denotes a preset matrix of shape 1 × (number of units of the ALIF network layer); it replaces the matrix W_h (where h refers to the hidden state) that would otherwise be matrix-multiplied with the membrane potential.
  • For the positions that fire, the preset recovery parameter is subtracted from the corresponding membrane potential, which is equivalent to the recovery potential in an SNN; that is, the membrane potential returns to its original level. Note that all of the above parameters are updated using the back-propagation mechanism.
  • The firing threshold f_thres is adaptively adjusted according to the distribution of activation levels at the current time step; the specific adjustment rule is not limited in the present invention.
  • The specific calculation logic of the ALIF timing model may be as follows, taking the forward propagation of the first layer at time step t as an example, where, for simplicity, f_thres is a scalar.
  • step 1): calculate the hidden state h_{t,l};
  • step 2): update the membrane potential v_{t,l};
  • step 3): obtain the activated output y_{t,l};
  • step 4): update f_thres through the adaptive learning method;
  • step 5): regularize v_{t,l} according to y_{t,l} and f_{thres,l};
  • step 6): limit the range of y_{t,l}.
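  • Purely as an illustration of the six steps above, a single-layer, single-time-step forward pass might look like the sketch below; the names and shapes, the noise model, and the adaptive-threshold rule are assumptions, since the exact formulas are not reproduced in this text.

```python
import numpy as np

def alif_forward(x_t, v_prev, Wx, A, f_thres, theta=1.0, noise_std=0.01, thres_lr=0.1,
                 rng=np.random.default_rng(0)):
    """One forward step of a hypothetical ALIF layer, following steps 1)-6) above."""
    # step 1): hidden state from the current input
    h_t = x_t @ Wx
    # step 2): update the membrane potential, with the preset matrix A applied to v_{t-1}
    v_t = h_t + A * v_prev
    # step 3): activated output: spike where the noisy potential crosses the threshold
    y_t = (v_t + rng.normal(0.0, noise_std, v_t.shape) > f_thres).astype(float)
    # step 4): adaptive threshold update driven by the current activation level (assumed rule)
    f_thres = f_thres + thres_lr * (y_t.mean() - 0.5)
    # step 5): regularize v_t: neurons that fired are pulled back by the recovery amount theta
    v_t = v_t - theta * y_t
    # step 6): limit the output range
    y_t = np.clip(y_t, 0.0, 1.0)
    return y_t, v_t, f_thres
```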
  • The embodiment of the present invention evaluates the neural network structure with different time series models on the training pictures and test pictures shown in FIG. 6 and FIG. 7.
  • When the input data is video data, the video data can be understood as a discrete picture sequence.
  • For training, the original image is randomly rotated by plus or minus 15 degrees.
  • The test images are corrupted with Gaussian blur and salt noise, as sketched below; the test picture shown takes a salt-noise proportion of 30% as an example, and the leftmost picture represents the original picture.
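  • A sketch of how such training and test images could be produced is given below (using SciPy); apart from the 30% salt-noise proportion and the plus or minus 15 degree rotation stated above, the blur strength and the exact noise model are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def make_train_image(img, rng=np.random.default_rng(0)):
    """Training augmentation described above: random rotation of +/- 15 degrees."""
    return rotate(img.astype(float), rng.uniform(-15, 15), reshape=False)

def make_test_image(img, salt_ratio=0.30, sigma=1.5, rng=np.random.default_rng(0)):
    """Test corruption described above: Gaussian blur plus salt noise on a fraction of pixels.
    The 30% ratio comes from the text; sigma is an assumed blur strength."""
    blurred = gaussian_filter(img.astype(float), sigma=sigma)
    mask = rng.random(img.shape) < salt_ratio   # pixels hit by salt noise
    blurred[mask] = blurred.max()               # "salt": set the hit pixels to the maximum value
    return blurred
```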
  • Figure 8 shows a comparison diagram of the experimental results of neural networks for different implementation models.
  • For ALIF, CNN, MLP, and ConvSNN, the recognition accuracy is tested while changing the proportion of salt noise in the whole image.
  • The experimental results shown in Figure 8 show that the neural network structure based on the ALIF time series model outperforms CNN, MLP, and ConvSNN in identifying blurred pictures/videos with different noise ratios.
  • the embodiment of the present invention also provides a multimedia data recognition device.
  • the multimedia data recognition device may include:
  • The data input module 910 is configured to input multimedia data to be recognized into a pre-built neural network structure, wherein the neural network structure includes an adaptive leaky integrate-and-fire (ALIF) timing model, and the ALIF timing model includes multiple ALIF network layers; the multimedia data to be identified includes image data and/or video data;
  • The data calculation module 920 is configured to perform recognition calculation on the multimedia data to be recognized through the multiple ALIF network layers in the neural network structure, and output the calculation result.
  • The data calculation module 920 is further configured to: for any ALIF network layer, calculate the output of the neuron by a formula of the form y_t = f(v_t + N), where: t denotes the t-th time step; y_t denotes the output of the neuron in the ALIF network layer at the t-th time step; f denotes the activation function whose firing threshold f_thres is adaptively adjusted; N denotes a tensor set to simulate the random noise of the brain; and v_t denotes the membrane potential at the t-th time step.
  • The data calculation module 920 is further configured to calculate the membrane potential v_t at the t-th time step by a formula of the form v_t = W_x · x_t + A ⊙ v_{t-1}, where: v_t denotes the membrane potential at the t-th time step; W_x denotes the two-dimensional weight matrix that transforms the input in the ALIF timing model; x_t denotes the input of the ALIF network layer; v_{t-1} denotes the membrane potential at the (t-1)-th time step; and A denotes a preset matrix applied element-wise.
  • The shape of W_x is: (data dimension of the input at each time step) × (number of units of the ALIF network layer).
  • An embodiment of the present invention also provides a storage device in which a computer program is stored; when the computer program runs in an electronic device, it is loaded by the processor of the electronic device and executes the multimedia data recognition method described in any of the above embodiments.
  • an embodiment of the present invention also provides an electronic device, including:
  • a processor for running a computer program; and
  • a storage device for storing a computer program which, when run in the electronic device, is loaded by the processor and executes the multimedia data identification method described in any of the above embodiments.
  • The present invention proposes a blurred image/video recognition method that considers both temporal information and spatial information (based on the fusion of SNN and ANN methods).
  • The basic idea of the proposed technical scheme is to integrate the advantages of the SNN and ANN methods into one and design a new type of timing model, thereby effectively extracting the region of interest in a video sequence and effectively improving image recognition when the noise sources are complex and have a large impact on the pictures/videos.
  • The model provided in this embodiment can also incorporate convolution, i.e., take a form similar to ConvLSTM2D, yielding the ConvALIF2D variant of this embodiment. Compared with conventional time series methods, it combines SNN and ANN in a new algorithm.
  • With far fewer weights than the commonly used time series models RNN, LSTM, and GRU, ALIF shows better recognition ability and noise resistance in blurred-image recognition tasks; the model is more robust and can effectively identify whether there are target objects in the scene.
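  • As a rough sketch of the ConvALIF2D idea above (assumed, not specified by the patent), a 2D convolution over the frame could replace the dense input transform W_x · x_t, analogous to how ConvLSTM2D replaces the dense transforms of an LSTM, while keeping the threshold and reset logic:

```python
import numpy as np
from scipy.signal import convolve2d

def conv_alif_step(frame, v_prev, kernel, A, f_thres, theta=1.0):
    """One ConvALIF2D-style step (illustrative): convolution replaces the dense input transform."""
    h_t = convolve2d(frame, kernel, mode="same")  # spatial feature extraction on the frame
    v_t = h_t + A * v_prev                        # membrane potential update
    y_t = (v_t > f_thres).astype(float)           # spiking output
    v_t = v_t - theta * y_t                       # reset the positions that fired
    return y_t, v_t
```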
  • The modules, units, or components in the embodiments can be combined into one module, unit, or component, and can additionally be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device disclosed in this manner may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a method and device for identifying multimedia data, the method consisting of: inputting multimedia data to be identified into a pre-built neural network structure, the neural network structure including an adaptive leaky integrate-and-fire (ALIF) timing model, the ALIF timing model including multiple ALIF network layers, and the multimedia data to be identified including image data and/or video data (S301); performing recognition calculation on the multimedia data to be identified through the multiple ALIF network layers in the neural network structure, and then outputting a calculation result (S302). A new ALIF algorithm integrating an SNN with an ANN is proposed; it can exhibit better identification ability and noise resistance in blurred-image identification tasks, the model is more robust, and it can effectively identify whether a target object is present in the scene.
PCT/CN2020/082961 2019-04-28 2020-04-02 Method and device for identifying multimedia data WO2020220926A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910348456.XA CN111860053B (zh) 2019-04-28 A multimedia data recognition method and device
CN201910348456.X 2019-04-28

Publications (1)

Publication Number Publication Date
WO2020220926A1 true WO2020220926A1 (fr) 2020-11-05

Family

ID=72966205

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082961 WO2020220926A1 (fr) 2019-04-28 2020-04-02 Method and device for identifying multimedia data

Country Status (2)

Country Link
CN (1) CN111860053B (fr)
WO (1) WO2020220926A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610707A (zh) * 2021-07-23 2021-11-05 广东工业大学 A video super-resolution method based on temporal attention and a recurrent feedback network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110141258A1 (en) * 2007-02-16 2011-06-16 Industrial Technology Research Institute Emotion recognition method and system thereof
EP3023911A1 (fr) * 2014-11-24 2016-05-25 Samsung Electronics Co., Ltd. Procédé et appareil de reconnaissance d'objet et procédé et appareil de reconnaissance d'apprentissage
CN106250829A (zh) * 2016-07-22 2016-12-21 中国科学院自动化研究所 Digit recognition method based on lip texture structure
CN108021927A (zh) * 2017-11-07 2018-05-11 天津大学 A video fingerprint extraction method based on slowly varying visual features
CN108960059A (zh) * 2018-06-01 2018-12-07 众安信息技术服务有限公司 A video action recognition method and device
CN109635791A (zh) * 2019-01-28 2019-04-16 深圳大学 A video forensics method based on deep learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110141258A1 (en) * 2007-02-16 2011-06-16 Industrial Technology Research Institute Emotion recognition method and system thereof
EP3023911A1 (fr) * 2014-11-24 2016-05-25 Samsung Electronics Co., Ltd. Procédé et appareil de reconnaissance d'objet et procédé et appareil de reconnaissance d'apprentissage
CN106250829A (zh) * 2016-07-22 2016-12-21 中国科学院自动化研究所 Digit recognition method based on lip texture structure
CN108021927A (zh) * 2017-11-07 2018-05-11 天津大学 A video fingerprint extraction method based on slowly varying visual features
CN108960059A (zh) * 2018-06-01 2018-12-07 众安信息技术服务有限公司 A video action recognition method and device
CN109635791A (zh) * 2019-01-28 2019-04-16 深圳大学 A video forensics method based on deep learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610707A (zh) * 2021-07-23 2021-11-05 广东工业大学 A video super-resolution method based on temporal attention and a recurrent feedback network
CN113610707B (zh) * 2021-07-23 2024-02-09 广东工业大学 A video super-resolution method based on temporal attention and a recurrent feedback network

Also Published As

Publication number Publication date
CN111860053B (zh) 2023-11-24
CN111860053A (zh) 2020-10-30

Similar Documents

Publication Publication Date Title
CN108133188B (zh) A behavior recognition method based on motion history images and convolutional neural networks
CN107506712B (zh) A human behavior recognition method based on a 3D deep convolutional network
KR102224253B1 (ko) Teacher-student framework for lightweighting an ensemble classifier combining a deep network and a random forest, and classification method based thereon
Liu et al. Predicting eye fixations using convolutional neural networks
Salama et al. Sheep identification using a hybrid deep learning and bayesian optimization approach
US11551076B2 (en) Event-driven temporal convolution for asynchronous pulse-modulated sampled signals
US11443514B2 (en) Recognizing minutes-long activities in videos
CN111507182B (zh) Littering behavior detection method based on skeleton-point fusion and recurrent dilated convolution
CN112699956A (zh) A neuromorphic visual object classification method based on an improved spiking neural network
CN112528830A (zh) A lightweight CNN masked-face pose classification method combined with transfer learning
Wang et al. Fire detection in infrared video surveillance based on convolutional neural network and SVM
Gao et al. An end-to-end broad learning system for event-based object classification
US20220132050A1 (en) Video processing using a spectral decomposition layer
CN112288080A (zh) Adaptive model conversion method and system for spiking neural networks
Yang et al. RGBT tracking via cross-modality message passing
KR20210018600A (ko) Facial expression recognition system
CN115471831B (zh) An image saliency detection method based on text-enhanced learning
WO2020220926A1 (fr) Method and device for identifying multimedia data
Shi et al. Knowledge-guided semantic computing network
US20230076290A1 (en) Rounding mechanisms for post-training quantization
Wang et al. A fast interpretable adaptive meta-learning enhanced deep learning framework for diagnosis of diabetic retinopathy
Guan et al. Deep learning approaches for image classification techniques
KR102178469B1 (ko) Method and system for estimating pedestrian pose orientation using a soft-target learning method based on a teacher-student framework
Zuo et al. NALA: A Nesterov Accelerated Look-Ahead optimizer for deep neural networks
Chen et al. Deep global-connected net with the generalized multi-piecewise ReLU activation in deep learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20799239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20799239

Country of ref document: EP

Kind code of ref document: A1