CN110532959B - Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network - Google Patents

Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network

Info

Publication number
CN110532959B
Authority
CN
China
Prior art keywords
video
channel
module
processing module
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910817372.6A
Other languages
Chinese (zh)
Other versions
CN110532959A (en)
Inventor
沈小艳
阴文佳
毕胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University filed Critical Dalian Maritime University
Priority to CN201910817372.6A
Publication of CN110532959A
Application granted
Publication of CN110532959B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a real-time violent behavior detection system based on a two-channel three-dimensional convolutional neural network, comprising: a video acquisition module that captures video frames in real time and sends them to the video processing module and the playing module respectively; a video processing module that extracts features from the received video frames using the convolutional neural network, fuses the extracted features, and classifies the image data according to the fused features; and a playing module that marks the image classification results obtained by the video processing module into the video frames sent by the video acquisition module and plays them to the user. The video acquisition module, the video processing module and the playing module operate in parallel. The invention improves recognition accuracy by introducing the two-channel idea, and achieves accurate localization of the time at which violent behavior occurs by introducing deconvolution layers.

Description

Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network
Technical Field
The invention relates to the technical field of video surveillance, and in particular to a real-time violent behavior detection system based on a fast-slow two-channel three-dimensional convolutional neural network.
Background
Video human behavior recognition and detection is one of the most challenging tasks in computer vision and can be widely applied in numerous fields such as video surveillance, motion retrieval, human-computer interaction, smart homes and medical care. The behavior recognition field currently has two major branches: traditional approaches, represented by the improved dense trajectories (IDT) algorithm, and deep-learning approaches, represented by two-dimensional convolution, three-dimensional convolution and RNN/LSTM. Judging by the development trend, deep learning has surpassed the traditional approaches in performance.
Improved dense trajectories (IDT): the main difference between traditional methods and deep-learning methods is the source of the features used for classification. In traditional methods, one or more features that classify well are manually selected from experience and mixed for classification. In deep-learning methods, labeled samples are given to a computer to learn a model; the learned model extracts some combination of features for classification, and exactly which features it extracts cannot be known. Because traditional methods draw on a limited set of feature types and a limited selection range, manually extracted features are less accurate than those extracted by a model. This is an advantage of deep learning.
Two-stream convolutional neural network (Two-Stream CNN): a representative algorithm for solving the behavior recognition problem with two-dimensional convolution. Its main content is as follows: two streams simultaneously process the RGB frame sequence and the optical-flow frame sequence, with no information exchange between the streams during feature extraction; after feature extraction, the features are fused in some manner for classification to obtain the final result. Because the network can process only one image at a time, every frame in the sequence must be processed, and since adjacent video frames contain a large amount of repeated information, the algorithm performs much repeated computation; this greatly restricts recognition and detection speed and cannot meet the real-time requirement.
Long short-term memory network (LSTM): owing to its unique design, LSTM is suited to processing and predicting significant events in time series separated by very long intervals and delays. LSTM therefore performs well in behavior recognition and detection, and is one of the current mainstream directions.
Two-dimensional convolution has matured in image recognition and detection, but compared with images, video adds information in one time dimension, and the traditional two-dimensional convolution kernel cannot extract three-dimensional features. Three-dimensional convolution has the advantage of operation speed and captures inter-frame information well, and is currently the mainstream research direction. However, existing methods suffer from low recognition accuracy and low recognition speed, which greatly limits the development and application of human behavior recognition and detection technology.
Disclosure of Invention
In view of the technical problems of low recognition accuracy and low recognition speed, a real-time violent behavior detection system based on a fast-slow two-channel three-dimensional convolutional neural network is provided. Recognition accuracy is improved by introducing the two-channel idea, while accurate localization of the time at which violent behavior occurs is achieved by introducing deconvolution layers.
The technical means adopted by the invention are as follows:
a real-time violent behavior detection system based on a two-channel three-dimensional convolutional neural network comprises:
the video acquisition module captures video frames in real time and respectively sends the video frames to the video processing module and the playing module;
the video processing module is used for extracting the characteristics of the received video frames by using the convolutional neural network, combining the extracted characteristics and classifying the image data according to the combined characteristics;
the playing module is used for marking the image classification result obtained by the video processing module into the video frame sent by the video acquisition module and playing the video frame to a user;
the video acquisition module, the video processing module and the playing module work in parallel.
Further, before performing feature extraction on the received video frames, the video processing module preprocesses them: the RGB images are sent into a slow channel and a fast channel respectively for processing, and the resulting slow-channel and fast-channel preprocessing results serve as the input of the video processing module.
Further, the slow channel samples the RGB images at equal intervals into video segments, which are input to the trained slow-channel network model to predict the slow-channel preprocessed data.
Further, the fast channel processes the RGB images into grayscale image data, extracts optical-flow image data, and inputs these to the trained fast-channel network model to predict the fast-channel preprocessed data.
Further, the video processing module performs lateral fusion processing based on convolutional feature fusion on the slow-channel and fast-channel feature extraction results.
Further, the system also comprises a storage module for storing data information during operation of the system.
Compared with the prior art, the invention has the following advantages:
the method utilizes the multilayer convolutional neural network to extract the time correlation characteristics of the video frames, achieves parameter sharing to a certain extent, can capture interframe information with good performance, improves the operation speed and has strong real-time performance. Meanwhile, the fast channel and the slow channel are combined, the characteristics of the fast channel and the slow channel to be fused have similar shapes under the condition that data are not lost through a convolution mode, and after the characteristic fusion structure is added, the slow output of the convolution layer in the slow channel and the output of the fast channel which is subjected to convolution deformation at the same layer are superposed to be used as the input of the next convolution layer, so that the identification accuracy is improved.
In addition, the invention does not use pooling in the time domain, which ensures that temporal information is retained to the maximum extent, so the time of violent behavior can be located more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a functional block diagram of the structure of the detection system of the present invention.
FIG. 2 is a flow chart of the operation of the detection system of the present invention.
FIG. 3 is a flow chart of the operation of the video processing module of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 to 3, the present invention provides a real-time violent behavior detection system based on a two-channel three-dimensional convolutional neural network. It comprises three modules, namely a video acquisition module, a video processing module and a delayed-playing module, and the three corresponding threads must run simultaneously to meet the real-time requirement.
The video acquisition module captures video frames in real time and sends them to the video processing module and the delayed-playing module respectively. Specifically, video frames are captured in real time using the OpenCV library and a network camera. Each captured frame image is duplicated into two paths: one is stored in the second queue, providing input for the thread-three delayed-playing module; the other is stored in the first queue, providing material for the image preprocessing step of the video processing module.
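By way of illustration, the capture thread and the two queues might be sketched in Python as follows; the queue names, camera source and loop structure are assumptions for illustration and not part of the disclosure:

```python
import cv2
import queue
import threading

play_queue = queue.Queue()     # second queue: feeds the delayed-playing module
process_queue = queue.Queue()  # first queue: feeds the video processing module

def capture_worker(source=0):
    """Thread one: grab frames in real time and fan them out to both queues."""
    cap = cv2.VideoCapture(source)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        play_queue.put(frame.copy())  # one copy per consumer thread
        process_queue.put(frame)
    cap.release()

threading.Thread(target=capture_worker, daemon=True).start()
```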
The video processing module extracts features from the received video frames using the convolutional neural network, fuses the extracted features, and classifies the image data according to the fused features. It implements image preprocessing, video feature extraction, image feature classification and related functions. To improve recognition accuracy, the invention introduces the Slow_Fast idea, and the data are sent into a fast channel and a slow channel respectively for processing.
As a preferred embodiment of the invention, the following technical scheme is adopted when image preprocessing is carried out:
Slow channel: RGB-frame equal-interval sampling: taking 64 frames as one video unit and sampling one frame every 16 frames yields a video segment of shape 4 × h × w × 3.
Adjusting: in the prediction stage, the width and height of each frame of RGB image is scaled to 224 × 224, and the shape of the output result is 4 × 224 × 3. In the training phase, data was first scaled to 4 × 256 × 3, then randomly cropped to 4 × 224 × 3, with random flipping to complete data augmentation. Thereby increasing the generalization ability of the network model and preventing the model from being over-fitted.
Fast channel: the first part converts the RGB image into a grayscale image; RGB comprises three color channels, while a grayscale image has only one. The gray values of the three RGB channels are multiplied by their corresponding weights and summed to give the gray value of the corresponding point of the grayscale map, using the following formula:
Gray=R*0.299+G*0.587+B*0.114
The output data of this step has shape 64 × w × h × 1.
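The weighted sum may be written out as follows; the (h, w, 3) RGB array layout is an assumption, and in practice cv2.cvtColor applies the same weights:

```python
import numpy as np

def to_gray(rgb):
    """Gray = R*0.299 + G*0.587 + B*0.114, applied per pixel."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb[..., :3] @ weights).astype(np.uint8)  # (h, w) grayscale map
```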
The second part converts the grayscale images into optical-flow data; optical flow reflects the motion information of objects between adjacent frames.
This embodiment preferably employs the Farneback optical-flow algorithm to extract dense optical flow. Optical flow is computed for every two frames, and one video unit has 64 frames, so the output data of this step has shape 32 × w × h × 2 (an optical-flow image has 2 channels: x-direction flow and y-direction flow).
Adjusting: in the prediction stage, each frame is scaled to a width and height of 224 × 224, and the output has shape 32 × 224 × 224 × 2. In the training stage, data are first scaled to 32 × 256 × 256 × 2, then randomly cropped to 32 × 224 × 224 × 2, and augmented with random flips, increasing the generalization ability of the network model and preventing overfitting. The random cropping and random flipping here must be consistent with the slow channel.
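An illustrative sketch of the flow-extraction step with OpenCV's Farneback implementation; the parameter values passed to calcOpticalFlowFarneback are common defaults rather than values taken from this disclosure:

```python
import cv2
import numpy as np

def fast_channel_flow(gray_frames):
    """One dense flow field per pair of frames: 64 gray frames -> 32 flows."""
    flows = []
    for i in range(0, len(gray_frames) - 1, 2):
        flow = cv2.calcOpticalFlowFarneback(
            gray_frames[i], gray_frames[i + 1], None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        flows.append(flow)        # (h, w, 2): x- and y-direction flow
    return np.stack(flows)        # (32, h, w, 2)
```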
The preprocessing results of the two channels are used together as the input of the feature extraction step.
As a preferred embodiment of the present invention, the following technical scheme is adopted when extracting video features:
and inputting the processed data into corresponding network channels respectively, inputting the trained network model, and extracting the features layer by layer. The output shape of the feature after passing through each layer is labeled in the form of T × W × H × C, and example 32 × 112 × 8 refers to the shape of the feature after passing through the last convolution module, where the output feature is 32 frames wide and 112 high, and the number of convolution kernel channels is 8.
Fast channel: the input is 32 optical-flow frames, each 224 wide and 224 high, with 2 channels (x-direction flow and y-direction flow). The fast channel comprises five convolution modules of identical structure, each consisting of a three-dimensional convolution layer, a BN layer, a ReLU activation layer and a three-dimensional pooling layer; the module names, convolution-kernel sizes and pooling-kernel sizes are labeled in the figure. For example, Conv1_3_1_2 indicates that the layer is named Conv1, its convolution kernel size is 3 × 3 × 3 and its pooling kernel size is 1 × 2 × 2. After the 5 convolution modules, an Average Pooling 3D layer with a pooling kernel of size 1 × 7 × 7 reduces the feature shape from 32 × 7 × 7 × 128 to 32 × 1 × 1 × 128, reducing the computational cost.
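By way of illustration, one such convolution module could be sketched in PyTorch as follows; the channel counts, padding and pooling type are assumptions not specified here:

```python
import torch.nn as nn

def conv_module(in_ch, out_ch):
    """3D conv (3x3x3) + BN + ReLU + spatial-only pooling (1x2x2)."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool3d(kernel_size=(1, 2, 2)),  # no pooling in the time dimension
    )

# e.g. a first fast-channel module mapping the 2-channel optical-flow input:
# conv1_f = conv_module(2, 8)
```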
Slow passage: input 4 frames of RGB color images, each frame 224 wide and 224 high, in three RGB color channels. 5 convolution modules are also included in the slow channel, with the names Convx _ S and x being 1-5, respectively. The first and last module hierarchies are the same as the fast path convolution module hierarchy, and comprise a three-dimensional convolution layer, a BN layer, a relu excitation layer and a three-dimensional pooling layer, wherein the convolution kernel size is 3 x 3, and the pooling kernel size is 1 x 2. In the middle three layers of the slow channel, the up-sampling in the time domain and the up-down sampling in the space domain are simultaneously completed by using the convolution and deconvolution operations. The convolution kernel and pooling kernel sizes are shown. Similarly, after 5 convolutional layers, an Average Possing 3D layer was added, pooling the nuclei with a size of 1 × 7, reducing the feature shapes from 32 × 7 × 128 to 32 × 1 × 128, reducing the computational cost.
Lateral fusion: so that each channel can make full use of what the other channel has learned, lateral feature fusion is used in many schemes, with various fusion modes. Here, convolution reshapes the two features to be fused into similar shapes without loss of data, and the output shape of each convolution layer is labeled in the figure. After the feature-fusion structure is added, in the slow channel the slow-channel output of a convolution layer is superposed with the convolution-deformed fast-channel output of the same layer to serve as the input of the next convolution layer.
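An illustrative sketch of one such lateral connection; the kernel size, the time stride of 8 (matching the 32-versus-4 frame counts at the early layers) and elementwise addition as the superposition are all assumptions for illustration:

```python
import torch.nn as nn

class LateralFusion(nn.Module):
    """Reshape a fast-channel feature with a strided 3D conv so it matches the
    same-layer slow-channel feature, then superpose the two by addition."""
    def __init__(self, fast_ch, slow_ch, time_stride=8):
        super().__init__()
        self.transform = nn.Conv3d(
            fast_ch, slow_ch, kernel_size=(5, 1, 1),
            stride=(time_stride, 1, 1), padding=(2, 0, 0))

    def forward(self, slow_feat, fast_feat):
        return slow_feat + self.transform(fast_feat)  # next slow-layer input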
As a preferred embodiment of the present invention, the arrangement of the fully connected layer and the classifier is as follows:
After the feature information of the two channels has been fused and compressed, many local features have been extracted; a fully connected layer reassembles these local features into a complete global feature, which serves as the input for classifier classification. In this embodiment, the number of nodes in the fully connected layer is preferably 1024.
The Sigmoid function is chosen as the classifier because the action only needs to be classified as violent or not; the number of output nodes is 2.
The classifier output has shape 32 × 2; that is, for one 64-frame video unit, two probability values (violent / non-violent) are obtained for every two frames of images. The class with the higher probability is taken as the prediction, so the final result is a sequence of length 32, corresponding to the 32 sampled frames, thereby enabling frame-level prediction.
The Sigmoid function is:
g(x) = 1 / (1 + e^(-x))
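Putting the head together, a hedged sketch follows; the 256-dimensional per-step input (128 slow plus 128 fast channels concatenated) and the layer composition are assumptions for illustration:

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.Linear(256, 1024),   # 1024-node fully connected layer from the text
    nn.ReLU(inplace=True),
    nn.Linear(1024, 2),     # two output nodes: violent / non-violent
    nn.Sigmoid(),           # g(x) = 1 / (1 + e^(-x))
)

fused = torch.randn(32, 256)   # 32 time steps from one 64-frame video unit
scores = head(fused)           # shape (32, 2): per-step class probabilities
labels = scores.argmax(dim=1)  # length-32 sequence -> frame-level prediction
```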
it should be noted that the pooling operation is not used in the time domain in the whole network, so as to ensure that the time domain information is retained to the maximum extent. In addition, since RGB mainly focuses on detail information, the change speed over time is low, the repeated information is more, calculation expense is saved in order to avoid repeated calculation, a slow channel is carried out at a low frame rate, 1 frame is sampled every 16 frames, and 4 frames are sampled in a time unit. Since the light flow graph mainly focuses on motion information, the time change speed is high, the time change is carried out at a high frame rate, 1 frame is sampled every 2 frames, and 32 frames are sampled in a time unit. Finally, although the number of input frames is small, the slow channel needs to pay attention to more detailed information, and it is known that the more the number of convolution kernels, the more detailed information that can be paid attention to, the more the number of convolution kernels of the fast channel is, the more the number of convolution kernels of the slow channel is, the more the number of convolution kernels of the fast channel is set to be 8 times that of the slow channel in the whole process.
The delayed-playing module marks the image classification results obtained by the video processing module into the video frames sent by the video acquisition module and plays the marked frames to the user with a delay.
In addition, the system in this embodiment further includes a storage module for storing data information during the operation of the system.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.

Claims (2)

1. A real-time violent behavior detection system based on a two-channel three-dimensional convolution neural network is characterized by comprising:
the video acquisition module captures video frames in real time and respectively sends the video frames to the video processing module and the playing module;
a video processing module for preprocessing the video frame, which comprises respectively sending RGB images into a slow channel and a fast channel for processing, using the obtained preprocessing result of the slow channel and the preprocessing result of the fast channel as the input of the video processing module, the slow channel is used for sampling the RGB images into video segments at equal intervals, inputting the trained network model of the slow channel for predicting to obtain the preprocessing data of the slow channel, the fast channel is used for processing the RGB images into gray image data and extracting the optical flow image data, inputting the trained network model of the fast channel for predicting to obtain the preprocessing data of the fast channel,
performing feature extraction on the received video frames by using a convolutional neural network, and combining the extracted features, wherein the transverse fusion processing based on convolutional feature fusion is performed on the slow channel feature extraction result and the fast channel feature extraction result, and then the image data is classified according to the combined features;
the playing module is used for marking the image classification result obtained by the video processing module into the video frame sent by the video acquisition module and playing the video frame to a user;
the video acquisition module, the video processing module and the playing module work in parallel.
2. The real-time violent behavior detection system of claim 1 further comprising a storage module for storing data information during operation of the system.
CN201910817372.6A 2019-08-30 2019-08-30 Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network Active CN110532959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817372.6A CN110532959B (en) 2019-08-30 2019-08-30 Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910817372.6A CN110532959B (en) 2019-08-30 2019-08-30 Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network

Publications (2)

Publication Number Publication Date
CN110532959A (en) 2019-12-03
CN110532959B 2022-10-14

Family

ID=68665934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817372.6A Active CN110532959B (en) 2019-08-30 2019-08-30 Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network

Country Status (1)

Country Link
CN (1) CN110532959B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191528B (en) * 2019-12-16 2024-02-23 江苏理工学院 Campus violence behavior detection system and method based on deep learning
JP2021179728A (en) * 2020-05-12 2021-11-18 株式会社日立製作所 Video processing device and method thereof
CN111860395A (en) * 2020-07-28 2020-10-30 公安部第三研究所 Method for realizing prison violent behavior detection based on vision and acceleration information
CN112990013B (en) * 2021-03-15 2024-01-12 西安邮电大学 Time sequence behavior detection method based on dense boundary space-time network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017177661A1 (en) * 2016-04-15 2017-10-19 乐视控股(北京)有限公司 Convolutional neural network-based video retrieval method and system
CN110175596A (en) * 2019-06-04 2019-08-27 重庆邮电大学 The micro- Expression Recognition of collaborative virtual learning environment and exchange method based on double-current convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved human behavior recognition algorithm based on a two-stream convolutional neural network; Zhang Yijia et al.; Computer Measurement & Control; 2018-08-25 (No. 08); full text *

Also Published As

Publication number Publication date
CN110532959A (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN111639692B (en) Shadow detection method based on attention mechanism
CN111401177B (en) End-to-end behavior recognition method and system based on adaptive space-time attention mechanism
CN110532959B (en) Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network
CN106683048B (en) Image super-resolution method and device
CN108334848B (en) Tiny face recognition method based on generation countermeasure network
CN114202672A (en) Small target detection method based on attention mechanism
CN111291809B (en) Processing device, method and storage medium
CN110717851A (en) Image processing method and device, neural network training method and storage medium
CN113591795B (en) Lightweight face detection method and system based on mixed attention characteristic pyramid structure
CN112464807A (en) Video motion recognition method and device, electronic equipment and storage medium
CN110059728B (en) RGB-D image visual saliency detection method based on attention model
CN113642634A (en) Shadow detection method based on mixed attention
CN108875482B (en) Object detection method and device and neural network training method and device
CN108830185B (en) Behavior identification and positioning method based on multi-task joint learning
CN110222718B (en) Image processing method and device
CN111079507B (en) Behavior recognition method and device, computer device and readable storage medium
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN110837786B (en) Density map generation method and device based on spatial channel, electronic terminal and medium
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN113297956B (en) Gesture recognition method and system based on vision
CN110929685A (en) Pedestrian detection network structure based on mixed feature pyramid and mixed expansion convolution
CN114241422A (en) Student classroom behavior detection method based on ESRGAN and improved YOLOv5s
CN111797841A (en) Visual saliency detection method based on depth residual error network
CN112183649A (en) Algorithm for predicting pyramid feature map
CN115484410A (en) Event camera video reconstruction method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant