CN106911930A - A method for compressed sensing video reconstruction based on a recurrent convolutional neural network - Google Patents


Info

Publication number
CN106911930A
CN106911930A (application CN201710124135.2A)
Authority
CN
China
Prior art keywords
cnn
network
lstm
video
frame
Prior art date
Legal status
Withdrawn
Application number
CN201710124135.2A
Other languages
Chinese (zh)
Inventor
夏春秋
Current Assignee
Shenzhen Vision Technology Co Ltd
Original Assignee
Shenzhen Vision Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Vision Technology Co Ltd
Priority to CN201710124135.2A
Publication of CN106911930A
Legal status: Withdrawn

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00: Image coding
    • G06T 9/002: Image coding using neural networks

Abstract

The present invention proposes a method for compressed sensing video reconstruction based on a recurrent convolutional neural network. Its main components are: the compressed sensing network (CSNet); the CSNet algorithm structure; convolutional neural networks (CNN); the long short-term memory (LSTM) network; CSNet network training; and compressed sensing video reconstruction. The process is as follows: an RNN extracts motion features and a CNN extracts visual features, and the two kinds of extracted information are fused; an LSTM network aggregates all extracted features and combines them with the motion inferred from its hidden state to form the reconstruction. The invention overcomes the difficulty existing methods have in guaranteeing video reconstruction quality at high compression ratios. It provides an end-to-end trainable, non-iterative model that raises the compression ratio (CR) of CS cameras, improves video reconstruction quality, and reduces the bandwidth of data transmission, so that high-frame-rate video applications can be supported.

Description

A method for compressed sensing video reconstruction based on a recurrent convolutional neural network
Technical field
The present invention relates to the field of video compression and reconstruction, and more particularly to a method for compressed sensing video reconstruction based on a recurrent convolutional neural network.
Background technology
Video compression and reconstruction are widely used in physics and bioscience research, video surveillance, remote sensing, social networks, and other fields. In physics and bioscience research, high-speed cameras record event features at rates beyond what conventional cameras can capture; they can record high-resolution still images of high-speed events, for example tracking exploding bubbles "with negligible motion blur and image distortion artifacts". In video surveillance, regions of interest in a surveillance video can be reconstructed, and images of particular persons or license plates can be enhanced to improve recognition. However, a camera shooting 1080p HD video at 10 kfps produces roughly 500 GB of data per second, which poses an enormous challenge to existing transmission and storage technology; how to transmit and store such high-volume video efficiently is a current research focus.
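The bandwidth challenge above can be checked with back-of-envelope arithmetic. The sketch below assumes 1920x1080 RGB at 8 bits per channel (an assumption, not stated in the patent; the patent's ~500 GB/s figure presumably assumes a larger sensor format or bit depth).

```python
# Raw data rate of an uncompressed high-speed camera stream.
# Assumptions (not from the patent): 1920x1080 RGB, 8 bits per channel.
width, height = 1920, 1080
bytes_per_pixel = 3          # 8-bit R, G, B
fps = 10_000                 # 10 kfps high-speed capture

bytes_per_frame = width * height * bytes_per_pixel
bytes_per_second = bytes_per_frame * fps

print(f"{bytes_per_frame / 1e6:.1f} MB/frame")
print(f"{bytes_per_second / 1e9:.1f} GB/s")   # ~62 GB/s under these assumptions
```

Even under these conservative assumptions, the raw stream is tens of gigabytes per second, far beyond practical transmission and storage rates, which motivates compressing at acquisition time.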
The present invention proposes a method for compressed sensing video reconstruction based on a recurrent convolutional neural network. It uses a convolutional neural network (CNN) and a recurrent neural network (RNN) to extract spatiotemporal features, including background, object detail, and motion information, achieving better reconstruction quality. Specifically, random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements. For each compressed measurement, a dedicated CNN extracts spatial features; a long short-term memory (LSTM) network aggregates all features extracted by the CNNs and, together with the motion inferred from its hidden state, forms the reconstruction. The invention breaks through the limitation of conventional approaches that treat video as a sequence of independent images: the RNN applies temporal information to the reconstruction process, producing a more accurate model. In addition, while preserving the original video's visual details, the method raises the compression ratio, reduces the bandwidth of data transmission, improves video reconstruction quality, and supports high-frame-rate video applications.
The content of the invention
Given that existing methods struggle to guarantee video reconstruction quality at high compression ratios, the object of the present invention is to provide a method for compressed sensing video reconstruction based on a recurrent convolutional neural network that surpasses the limitations of conventional methods, raises the compression ratio (CR) of CS cameras, improves video reconstruction quality, and reduces the bandwidth of data transmission, so that high-frame-rate video applications can be supported.
To solve the above problems, the present invention provides a method for compressed sensing video reconstruction based on a recurrent convolutional neural network, whose main components include:
(1) compressed sensing network (CSNet);
(2) CSNet algorithm structures;
(3) convolutional neural networks (CNN);
(4) long short-term memory (LSTM) network;
(5) CSNet network trainings;
(6) compressed sensing video reconstruction.
The compressed sensing network (CSNet) is a deep neural network that can learn visual representations from random measurements. It is an end-to-end trainable, non-iterative model for compressed sensing video reconstruction that combines a convolutional neural network (CNN) and a recurrent neural network (RNN) so as to perform video reconstruction using spatiotemporal features. The network structure can accept random measurements at multiple compression ratios (CRs) and provides background information and object detail separately, achieving better reconstruction quality.
The CSNet algorithm structure comprises three modules: a random encoder for measurement, a CNN cluster for visual feature extraction, and an LSTM for temporal reconstruction. The random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements, and can accept measurements at multiple compression ratios (CRs). Under this algorithm, key frames and non-key frames (the remaining frames, which mainly contribute motion information) are compressed separately; the recurrent neural network (RNN) infers motion information and combines it with the visual features extracted by the convolutional neural network (CNN) to synthesize high-quality frames. Efficient information fusion achieves an optimal balance between fidelity and compression ratio (CR) for compressed sensing (CS) video applications.
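The measurement stage of this three-module structure can be sketched at the shape level as follows. All sizes, compression ratios, and the use of Gaussian matrices as the parallel random encoders are illustrative stand-ins, not values taken from the patent:

```python
import numpy as np

# Shape-level sketch of the CSNet measurement stage: each video patch has
# T frames of N pixels; the first frame is a key frame encoded with more
# measurements (low CR), the rest are non-key frames encoded with fewer
# measurements (high CR). Sizes and CRs are illustrative.
rng = np.random.default_rng(0)
T, N = 10, 32 * 32              # frames per patch, pixels per frame
m_key, m_nonkey = 256, 64       # CR = 4:1 for key, 16:1 for non-key

phi_key = rng.standard_normal((m_key, N))       # random encoder, key frame
phi_nonkey = rng.standard_normal((m_nonkey, N)) # random encoder, non-key

patch = rng.standard_normal((T, N))             # stand-in video patch
y_key = phi_key @ patch[0]                      # first frame: key frame
y_nonkey = [phi_nonkey @ f for f in patch[1:]]  # remaining frames

print(y_key.shape, y_nonkey[0].shape, len(y_nonkey))  # (256,) (64,) 9
```

The key-frame measurement carries most of the background and detail information, while the shorter non-key measurements mainly contribute motion, matching the division of labour described above.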
The convolutional neural network (CNN) performs compressed measurement and reconstruction of images, with temporal compression and spatial compression combined for maximum compression ratio. A larger CNN is designed to process key frames, because key frames carry high entropy information; meanwhile, a smaller CNN is designed to process non-key frames. To reduce system latency and simplify the network structure, image blocks are used as input; all feature maps generated by the CNN then have the same size as the image block, and the number of feature maps decreases monotonically. The network's input is the m-dimensional vector formed by the compressed measurement; a fully connected layer before the CNN uses these measurements to generate a two-dimensional feature map.
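The fully connected layer that lifts an m-dimensional measurement to a block-sized two-dimensional feature map can be sketched as below. Block size, measurement count, and the random weights are illustrative assumptions:

```python
import numpy as np

# Sketch of the first (fully connected) stage of the reconstruction CNN:
# it maps an m-dimensional compressed measurement to a 2-D feature map
# the same size as the image block. Sizes and weights are illustrative.
rng = np.random.default_rng(1)
block = 32                     # image-block side length
m = 256                        # measurements per block (CR = 4:1)

W_fc = rng.standard_normal((block * block, m)) * 0.01
b_fc = np.zeros(block * block)

y = rng.standard_normal(m)                      # one compressed measurement
feat = (W_fc @ y + b_fc).reshape(block, block)  # 2-D feature map

print(feat.shape)  # (32, 32)
```

Subsequent convolutional layers would keep this spatial size while the number of feature maps decreases monotonically toward the single-channel output block, as the description states.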
Further, for the temporal compression: to obtain a higher compression ratio (CR), each video patch containing T frames is divided into K key frames and (T-K) non-key frames. Key frames are compressed at a low compression ratio (CR) and non-key frames at a high compression ratio (CR), so that the measurement information of the key frames can be reused to reconstruct the non-key frames; this can be regarded as temporal compression.
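The effective compression ratio of this key/non-key split follows from counting measurements against pixels. The numbers below are illustrative, not from the patent:

```python
# Effective compression ratio of the key/non-key split: a T-frame patch of
# N-pixel frames is measured with m_key values for each of K key frames
# and m_nonkey values for each of the T-K non-key frames.
def effective_cr(T, K, N, m_key, m_nonkey):
    pixels = T * N
    measurements = K * m_key + (T - K) * m_nonkey
    return pixels / measurements

# 10 frames of 32x32 blocks, 1 key frame at CR 4:1, 9 non-key at CR 16:1
cr = effective_cr(T=10, K=1, N=1024, m_key=256, m_nonkey=64)
print(round(cr, 2))  # 12.31
```

The overall CR lands between the per-frame CRs, weighted heavily toward the non-key ratio, which is why spending extra measurements on a few key frames is affordable.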
The long short-term memory (LSTM) network is used for temporal reconstruction. To obtain an end-to-end trained and computationally efficient model, no preprocessing is applied to the raw input; an LSTM network extracts the motion features essential to reconstruction, thereby estimating the video's optical flow. The synthesis LSTM network performs motion extrapolation and the aggregation of spatial visual features and motion, to achieve video reconstruction.
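The aggregation mechanism can be illustrated with a minimal LSTM cell that consumes one per-frame feature vector per step while its hidden state carries information forward. Weights, sizes, and inputs are random stand-ins; this sketches the mechanism, not the trained synthesis network:

```python
import numpy as np

# Minimal LSTM cell in NumPy: one per-frame visual feature vector enters
# per time step; the cell state carries long-range (motion) information.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, Wx, Wh, b):
    z = Wx @ x + Wh @ h + b
    i, f, o, g = np.split(z, 4)                 # input/forget/output/candidate
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_new = f * c + i * g                       # cell state: long-term memory
    h_new = o * np.tanh(c_new)                  # hidden state: per-step output
    return h_new, c_new

rng = np.random.default_rng(2)
d_feat, d_hid, T = 16, 8, 10                    # feature dim, hidden dim, frames
Wx = rng.standard_normal((4 * d_hid, d_feat)) * 0.1
Wh = rng.standard_normal((4 * d_hid, d_hid)) * 0.1
b = np.zeros(4 * d_hid)

h = np.zeros(d_hid)
c = np.zeros(d_hid)
for t in range(T):                              # one CNN feature vector per frame
    x_t = rng.standard_normal(d_feat)
    h, c = lstm_step(x_t, h, c, Wx, Wh, b)

print(h.shape)  # (8,)
```

After T steps the hidden state summarizes the whole patch; in CSNet this summary is what gets combined with the spatial features to form the reconstructed frames.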
Further, in the LSTM network's training process, the first M inputs of the LSTM are the data extracted by the CNN that processes key frames, and the remaining (T-M) inputs are the outputs of the CNN that processes non-key frames. Each LSTM unit receives the visual features of the key frames; these visual features are used to reconstruct the background, recover the current frame of the object, and estimate the last several frames.
The CSNet network training is divided into two stages. In the first stage, the background CNN is pretrained, and visual features are extracted from the K key frames. In the second stage, to give the model more basic blocks extracted from the source for building objects, the (T-M) smaller CNNs are trained from scratch; these object CNNs and the pretrained background CNN are combined by a synthesis LSTM, and the three networks are trained together. To reduce the number of parameters required for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than a measurement. The average Euclidean loss is used as the loss function, i.e.
L(W, b) = (1/(2N)) Σ_{i=1}^{T} ||f(y_i, W, b) - x_i||₂²
Here, W and b are the network weights and biases, and x_i and y_i are each image block and its CS measurement; a random Gaussian matrix is used for CS encoding.
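The Gaussian CS encoding and the average Euclidean loss can be sketched together on toy data. The linear "network" f, the sizes, and the data are illustrative stand-ins for the actual CNN:

```python
import numpy as np

# Average Euclidean loss for a toy reconstruction f(y) = A @ y (a linear
# stand-in for the network). y_i is the CS measurement of block x_i under
# a random Gaussian matrix, as in the description. Sizes are illustrative.
rng = np.random.default_rng(3)
N_blocks, n_pix, m = 100, 64, 16

phi = rng.standard_normal((m, n_pix))         # random Gaussian CS encoder
X = rng.standard_normal((N_blocks, n_pix))    # image blocks x_i
Y = X @ phi.T                                 # measurements y_i = phi @ x_i

A = rng.standard_normal((n_pix, m)) * 0.01    # stand-in "network" weights
X_hat = Y @ A.T                               # reconstructions f(y_i)

loss = np.sum((X_hat - X) ** 2) / (2 * N_blocks)
print(f"average Euclidean loss: {loss:.4f}")
```

Training would adjust the network parameters (here A) by gradient descent to drive this loss down; the formula above is exactly the per-batch quantity being minimized.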
For the compressed sensing video reconstruction, the information-based current frame is built: a recurrent neural network (RNN) extracts motion features, a convolutional neural network (CNN) extracts visual features, and the two kinds of extracted information are fused; an LSTM network aggregates all extracted features and combines them with the motion inferred from its hidden state to form the reconstruction.
Brief description of the drawings
Fig. 1 is a system flow chart of the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 2 shows the overall framework structure of the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 3 is a schematic diagram of CSNet network training for the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 4 is a flow chart of compressed sensing video reconstruction for the method based on a recurrent convolutional neural network according to the present invention.
Specific embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another. The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a system flow chart of the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. It mainly comprises the compressed sensing network (CSNet), the CSNet algorithm structure, convolutional neural networks (CNN), the long short-term memory (LSTM) network, CSNet network training, and compressed sensing video reconstruction.
Fig. 2 shows the overall framework structure of the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. Compressed video frames are obtained by compressed sensing. Reconstruction is performed by CSNet, which consists of a background CNN, object CNNs, and a synthesis LSTM. In every T frames, the first M frames and the remaining (T-M) frames are compressed at low CR and high CR respectively. The background CNN is pretrained first; then the remaining background CNN layers and the rest of the model are trained together.
Fig. 3 is a schematic diagram of CSNet network training for the method for compressed sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. The training process is divided into two stages: figure (a) shows the pretraining of the background CNN, and figure (b) the joint training of the CNNs and the synthesis LSTM. In the first stage, the background CNN is pretrained and visual features are extracted from the K key frames, as shown in figure (a). In the second stage, to give the model more basic blocks extracted from the source for building objects, the (T-M) small CNNs are trained from scratch; these object CNNs and the pretrained background CNN are combined by a synthesis LSTM, and the three networks are trained together, as shown in figure (b). To reduce the number of parameters required for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than a measurement.
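The two-stage schedule can be mimicked on a toy problem. Here the "networks" are plain linear decoders standing in for the background and object CNNs; the encoders, sizes, learning rate, and data are all illustrative assumptions:

```python
import numpy as np

# Toy sketch of two-stage training: stage 1 pretrains a "background"
# decoder on key-frame (low-CR) measurements; stage 2 trains an "object"
# decoder from scratch on non-key (high-CR) measurements while the
# pretrained decoder continues to be fine-tuned jointly.
rng = np.random.default_rng(4)
n_pix, m_key, m_non = 64, 32, 8                 # block pixels, measurements
X = rng.standard_normal((256, n_pix))           # training image blocks
phi_key = rng.standard_normal((m_key, n_pix))   # low-CR Gaussian encoder
phi_non = rng.standard_normal((m_non, n_pix))   # high-CR Gaussian encoder
Y_key, Y_non = X @ phi_key.T, X @ phi_non.T
lr, steps = 2e-3, 200

def grad_and_loss(W, Y, X):
    # residual, a scaled gradient of the squared error for f(y) = W @ y,
    # and the mean squared reconstruction error
    R = Y @ W.T - X
    return R.T @ Y / len(X), np.mean(R ** 2)

# Stage 1: pretrain the background decoder on key-frame measurements.
W_bg = np.zeros((n_pix, m_key))
for _ in range(steps):
    g, l_key = grad_and_loss(W_bg, Y_key, X)
    W_bg -= lr * g

# Stage 2: train the object decoder from scratch, fine-tuning W_bg jointly.
W_obj = np.zeros((n_pix, m_non))
for _ in range(steps):
    g_bg, l_key = grad_and_loss(W_bg, Y_key, X)
    g_obj, l_non = grad_and_loss(W_obj, Y_non, X)
    W_bg -= lr * g_bg
    W_obj -= lr * g_obj

print(round(l_key, 3), round(l_non, 3))
```

As expected, the key-frame decoder ends with a lower residual than the non-key decoder, since low-CR measurements retain more of each block; this mirrors why CSNet leans on key frames for background and detail.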
Fig. 4 is a flow chart of compressed sensing video reconstruction for the method based on a recurrent convolutional neural network according to the present invention. The information-based current frame is built: a recurrent neural network (RNN) extracts motion features, a convolutional neural network (CNN) extracts visual features, the two kinds of extracted information are fused, an LSTM network aggregates all extracted features, and these are combined with the motion inferred from the hidden state to form the reconstruction.
For those skilled in the art, the present invention is not limited to the details of the above exemplary embodiments, and the present invention can be realized in other specific forms without departing from its spirit or scope. Moreover, those skilled in the art may make various changes and modifications to the present invention without departing from its spirit and scope, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.

Claims (10)

1. A method for compressed sensing video reconstruction based on a recurrent convolutional neural network, characterised in that it mainly comprises: a compressed sensing network (CSNet) (1); a CSNet algorithm structure (2); convolutional neural networks (CNN) (3); a long short-term memory (LSTM) network (4); CSNet network training (5); and compressed sensing video reconstruction (6).
2. The compressed sensing network (CSNet) (1) according to claim 1, characterised in that the compressed sensing network (CSNet) is a deep neural network that can learn visual representations from random measurements; for compressed sensing video reconstruction it is an end-to-end trainable, non-iterative model that combines a convolutional neural network (CNN) and a recurrent neural network (RNN) so as to perform video reconstruction using spatiotemporal features; the network structure can accept random measurements at multiple compression ratios (CRs) and provides background information and object detail separately, achieving better reconstruction quality.
3. The recurrent neural network (RNN) according to claim 2, characterised in that, for video reconstruction applications, modelling the temporal process is very important; by building the information-based current frame, this information contains the extrapolated time-dependent relationship between the current frame and the patch; the recurrent neural network (RNN) applies temporal information to the reconstruction process and can be used to generate a more accurate model.
4. The CSNet algorithm structure (2) according to claim 1, characterised in that the structure comprises three modules: a random encoder for measurement, a CNN cluster for visual feature extraction, and an LSTM for temporal reconstruction; the random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements, and can accept measurements at multiple compression ratios (CRs); under this algorithm, key frames and non-key frames (the remaining frames, which mainly contribute motion information) are compressed separately; the recurrent neural network (RNN) infers motion information and combines it with the visual features extracted by the convolutional neural network (CNN) to synthesize high-quality frames; efficient information fusion achieves an optimal balance between fidelity and compression ratio (CR) for compressed sensing (CS) video applications.
5. The convolutional neural network (CNN) (3) according to claim 1, characterised in that the network performs compressed measurement and reconstruction of images, with temporal compression and spatial compression combined for maximum compression ratio; a larger CNN is designed to process key frames, because key frames carry high entropy information, while a smaller CNN is designed to process non-key frames; to reduce system latency and simplify the network structure, image blocks are used as input, so that all feature maps generated by the CNN have the same size as the image block and the number of feature maps decreases monotonically; the network's input is the m-dimensional vector formed by the compressed measurement, and a fully connected layer before the CNN uses these measurements to generate a two-dimensional feature map.
6. The temporal compression according to claim 5, characterised in that, to obtain a higher compression ratio (CR), each video patch containing T frames is divided into K key frames and (T-K) non-key frames; key frames are compressed at a low compression ratio (CR) and non-key frames at a high compression ratio (CR), so that the measurement information of the key frames can be reused to reconstruct the non-key frames; this can be regarded as temporal compression.
7. The long short-term memory (LSTM) network (4) according to claim 1, characterised in that it is used for temporal reconstruction; to obtain an end-to-end trained and computationally efficient model, no preprocessing is applied to the raw input, and an LSTM network extracts the motion features essential to reconstruction, thereby estimating the video's optical flow; the synthesis LSTM network performs motion extrapolation and the aggregation of spatial visual features and motion, to achieve video reconstruction.
8. The LSTM network training process according to claim 7, characterised in that, in the training process of the LSTM network, the first M inputs of the LSTM are the data extracted by the CNN that processes key frames, and the remaining (T-M) inputs are the outputs of the CNN that processes non-key frames; each LSTM unit receives the visual features of the key frames, and these visual features are used to reconstruct the background, recover the current frame of the object, and estimate the last several frames.
9. The CSNet network training (5) according to claim 1, characterised in that it is divided into two stages; in the first stage, the background CNN is pretrained and visual features are extracted from the K key frames; in the second stage, to give the model more basic blocks extracted from the source for building objects, the (T-M) smaller CNNs are trained from scratch; these object CNNs and the pretrained background CNN are combined by a synthesis LSTM, and the three networks are trained together; to reduce the number of parameters required for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than a measurement; the average Euclidean loss is used as the loss function, i.e.
L(W, b) = (1/(2N)) Σ_{i=1}^{T} ||f(y_i, W, b) - x_i||₂²
Here, W and b are the network weights and biases, and x_i and y_i are each image block and its CS measurement; a random Gaussian matrix is used for CS encoding.
10. The compressed sensing video reconstruction (6) according to claim 1, characterised in that the information-based current frame is built; a recurrent neural network (RNN) extracts motion features, a convolutional neural network (CNN) extracts visual features, and the two kinds of extracted information are fused; an LSTM network aggregates all extracted features and combines them with the motion inferred from the hidden state to form the reconstruction.
CN201710124135.2A 2017-03-03 2017-03-03 A method for compressed sensing video reconstruction based on a recurrent convolutional neural network Withdrawn CN106911930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710124135.2A CN106911930A (en) 2017-03-03 2017-03-03 A method for compressed sensing video reconstruction based on a recurrent convolutional neural network


Publications (1)

Publication Number Publication Date
CN106911930A (en) 2017-06-30

Family

ID=59186285

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710124135.2A Withdrawn CN106911930A (en) 2017-03-03 2017-03-03 A method for compressed sensing video reconstruction based on a recurrent convolutional neural network

Country Status (1)

Country Link
CN (1) CN106911930A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107154150A (en) * 2017-07-25 2017-09-12 北京航空航天大学 A kind of traffic flow forecasting method clustered based on road with double-layer double-direction LSTM
CN107392189A (en) * 2017-09-05 2017-11-24 百度在线网络技术(北京)有限公司 For the method and apparatus for the driving behavior for determining unmanned vehicle
CN107808122A (en) * 2017-09-30 2018-03-16 中国科学院长春光学精密机械与物理研究所 Method for tracking target and device
CN108198202A (en) * 2018-01-23 2018-06-22 北京易智能科技有限公司 A kind of video content detection method based on light stream and neural network
CN108197566A (en) * 2017-12-29 2018-06-22 成都三零凯天通信实业有限公司 Monitoring video behavior detection method based on multi-path neural network
CN108322685A (en) * 2018-01-12 2018-07-24 广州华多网络科技有限公司 Video frame interpolation method, storage medium and terminal
CN108810651A (en) * 2018-05-09 2018-11-13 太原科技大学 Wireless video method of multicasting based on depth-compression sensing network
CN108923984A (en) * 2018-07-16 2018-11-30 西安电子科技大学 Space-time video compress cognitive method based on convolutional network
CN109003614A (en) * 2018-07-31 2018-12-14 上海爱优威软件开发有限公司 A kind of voice transmission method, voice-transmission system and terminal
CN109360436A (en) * 2018-11-02 2019-02-19 Oppo广东移动通信有限公司 A kind of video generation method, terminal and storage medium
CN109376856A (en) * 2017-08-09 2019-02-22 上海寒武纪信息科技有限公司 Data processing method and processing unit
CN109614896A (en) * 2018-10-29 2019-04-12 山东大学 A method of the video content semantic understanding based on recursive convolution neural network
CN109743571A (en) * 2018-12-26 2019-05-10 西安交通大学 A kind of image encoding method based on parallelly compressed perception multilayer residual error coefficient
CN109819256A (en) * 2019-03-06 2019-05-28 西安电子科技大学 Video compress cognitive method based on characteristic perception
CN110007366A (en) * 2019-03-04 2019-07-12 中国科学院深圳先进技术研究院 A kind of life searching method and system based on Multi-sensor Fusion
CN110046537A (en) * 2017-12-08 2019-07-23 辉达公司 The system and method for carrying out dynamic face analysis using recurrent neural network
CN110087092A (en) * 2019-03-11 2019-08-02 西安电子科技大学 Low bit-rate video decoding method based on image reconstruction convolutional neural networks
CN110516736A (en) * 2019-06-04 2019-11-29 沈阳瑞初科技有限公司 The visual multi-source heterogeneous data multilayer DRNN depth integration method of multidimensional
CN110738108A (en) * 2019-09-09 2020-01-31 北京地平线信息技术有限公司 Target object detection method, target object detection device, storage medium and electronic equipment
CN110784228A (en) * 2019-10-23 2020-02-11 武汉理工大学 Compression method of subway structure vibration signal based on LSTM model
CN110933429A (en) * 2019-11-13 2020-03-27 南京邮电大学 Video compression sensing and reconstruction method and device based on deep neural network
WO2020107877A1 (en) * 2018-11-29 2020-06-04 北京市商汤科技开发有限公司 Video compression processing method and apparatus, electronic device, and storage medium
CN111383245A (en) * 2018-12-29 2020-07-07 北京地平线机器人技术研发有限公司 Video detection method, video detection device and electronic equipment
CN111428751A (en) * 2020-02-24 2020-07-17 清华大学 Object detection method based on compressed sensing and convolutional network
CN111479982A (en) * 2017-11-15 2020-07-31 吉奥奎斯特系统公司 In-situ operating system with filter
CN112866697A (en) * 2020-12-31 2021-05-28 杭州海康威视数字技术股份有限公司 Video image coding and decoding method and device, electronic equipment and storage medium
CN113766313A (en) * 2019-02-26 2021-12-07 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN114339221A (en) * 2020-09-30 2022-04-12 脸萌有限公司 Convolutional neural network based filter for video coding and decoding
WO2022213992A1 (en) * 2021-04-09 2022-10-13 华为技术有限公司 Data processing method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104822063A (en) * 2015-04-16 2015-08-05 长沙理工大学 Compressed sensing video reconstruction method based on dictionary learning residual reconstruction
CN105469065A (en) * 2015-12-07 2016-04-06 中国科学院自动化研究所 Recurrent neural network-based discrete emotion recognition method
CN106157319A (en) * 2016-07-28 2016-11-23 哈尔滨工业大学 Saliency detection method based on region-level and pixel-level fusion with convolutional neural networks
CN106331433A (en) * 2016-08-25 2017-01-11 上海交通大学 Video denoising method based on deep recursive neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KAI XU et al.: "CSVideoNet: A Recurrent Convolutional Neural Network for Compressive Sensing Video Reconstruction", published online at: https://arxiv.org/abs/1612.05203v1 *

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107154150A (en) * 2017-07-25 2017-09-12 北京航空航天大学 A traffic flow forecasting method based on road clustering and two-layer bidirectional LSTM
CN107154150B (en) * 2017-07-25 2019-07-02 北京航空航天大学 A traffic flow forecasting method based on road clustering and two-layer bidirectional LSTM
CN109376856A (en) * 2017-08-09 2019-02-22 上海寒武纪信息科技有限公司 Data processing method and processing device
CN107392189A (en) * 2017-09-05 2017-11-24 百度在线网络技术(北京)有限公司 Method and apparatus for determining driving behavior of an unmanned vehicle
CN107808122A (en) * 2017-09-30 2018-03-16 中国科学院长春光学精密机械与物理研究所 Target tracking method and device
CN111479982A (en) * 2017-11-15 2020-07-31 吉奥奎斯特系统公司 In-situ operating system with filter
CN110046537A (en) * 2017-12-08 2019-07-23 辉达公司 System and method for dynamic face analysis using a recurrent neural network
CN110046537B (en) * 2017-12-08 2023-12-29 辉达公司 System and method for dynamic facial analysis using recurrent neural networks
CN108197566B (en) * 2017-12-29 2022-03-25 成都三零凯天通信实业有限公司 Monitoring video behavior detection method based on multi-path neural network
CN108197566A (en) * 2017-12-29 2018-06-22 成都三零凯天通信实业有限公司 Monitoring video behavior detection method based on multi-path neural network
CN108322685A (en) * 2018-01-12 2018-07-24 广州华多网络科技有限公司 Video frame interpolation method, storage medium and terminal
CN108322685B (en) * 2018-01-12 2020-09-25 广州华多网络科技有限公司 Video frame insertion method, storage medium and terminal
CN108198202A (en) * 2018-01-23 2018-06-22 北京易智能科技有限公司 A video content detection method based on optical flow and neural network
CN108810651B (en) * 2018-05-09 2020-11-03 太原科技大学 Wireless video multicast method based on deep compression sensing network
CN108810651A (en) * 2018-05-09 2018-11-13 太原科技大学 Wireless video multicast method based on deep compressed sensing network
CN108923984A (en) * 2018-07-16 2018-11-30 西安电子科技大学 Spatiotemporal video compressed sensing method based on convolutional network
CN108923984B (en) * 2018-07-16 2021-01-12 西安电子科技大学 Space-time video compressed sensing method based on convolutional network
CN109003614A (en) * 2018-07-31 2018-12-14 上海爱优威软件开发有限公司 A voice transmission method, voice transmission system and terminal
CN109614896A (en) * 2018-10-29 2019-04-12 山东大学 A video content semantic understanding method based on recurrent convolutional neural network
CN109360436A (en) * 2018-11-02 2019-02-19 Oppo广东移动通信有限公司 A video generation method, terminal and storage medium
US11290723B2 (en) * 2018-11-29 2022-03-29 Beijing Sensetime Technology Development Co., Ltd. Method for video compression processing, electronic device and storage medium
WO2020107877A1 (en) * 2018-11-29 2020-06-04 北京市商汤科技开发有限公司 Video compression processing method and apparatus, electronic device, and storage medium
CN109743571B (en) * 2018-12-26 2020-04-28 西安交通大学 Image coding method based on parallel compressed sensing multilayer residual error coefficients
CN109743571A (en) * 2018-12-26 2019-05-10 西安交通大学 An image coding method based on parallel compressed sensing multilayer residual coefficients
CN111383245A (en) * 2018-12-29 2020-07-07 北京地平线机器人技术研发有限公司 Video detection method, video detection device and electronic equipment
CN111383245B (en) * 2018-12-29 2023-09-22 北京地平线机器人技术研发有限公司 Video detection method, video detection device and electronic equipment
CN113766313A (en) * 2019-02-26 2021-12-07 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN113766313B (en) * 2019-02-26 2024-03-05 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN110007366A (en) * 2019-03-04 2019-07-12 中国科学院深圳先进技术研究院 A life search method and system based on multi-sensor fusion
CN109819256B (en) * 2019-03-06 2022-07-26 西安电子科技大学 Video compression sensing method based on feature sensing
CN109819256A (en) * 2019-03-06 2019-05-28 西安电子科技大学 Video compressed sensing method based on feature perception
CN110087092A (en) * 2019-03-11 2019-08-02 西安电子科技大学 Low bit-rate video decoding method based on image reconstruction convolutional neural networks
CN110516736A (en) * 2019-06-04 2019-11-29 沈阳瑞初科技有限公司 Multi-dimensional visual multi-source heterogeneous data multi-layer DRNN deep fusion method
CN110516736B (en) * 2019-06-04 2022-02-22 沈阳瑞初科技有限公司 Multi-dimensional visual multi-source heterogeneous data multi-layer DRNN depth fusion method
CN110738108A (en) * 2019-09-09 2020-01-31 北京地平线信息技术有限公司 Target object detection method, target object detection device, storage medium and electronic equipment
CN110784228B (en) * 2019-10-23 2023-07-25 武汉理工大学 Compression method of subway structure vibration signal based on LSTM model
CN110784228A (en) * 2019-10-23 2020-02-11 武汉理工大学 Compression method of subway structure vibration signal based on LSTM model
CN110933429B (en) * 2019-11-13 2021-11-12 南京邮电大学 Video compression sensing and reconstruction method and device based on deep neural network
CN110933429A (en) * 2019-11-13 2020-03-27 南京邮电大学 Video compression sensing and reconstruction method and device based on deep neural network
CN111428751B (en) * 2020-02-24 2022-12-23 清华大学 Object detection method based on compressed sensing and convolutional network
CN111428751A (en) * 2020-02-24 2020-07-17 清华大学 Object detection method based on compressed sensing and convolutional network
CN114339221A (en) * 2020-09-30 2022-04-12 脸萌有限公司 Convolutional neural network based filter for video coding and decoding
CN112866697A (en) * 2020-12-31 2021-05-28 杭州海康威视数字技术股份有限公司 Video image coding and decoding method and device, electronic equipment and storage medium
CN112866697B (en) * 2020-12-31 2022-04-05 杭州海康威视数字技术股份有限公司 Video image coding and decoding method and device, electronic equipment and storage medium
WO2022213992A1 (en) * 2021-04-09 2022-10-13 华为技术有限公司 Data processing method and apparatus

Similar Documents

Publication Publication Date Title
CN106911930A (en) A method for compressive sensing video reconstruction based on a recurrent convolutional neural network
CN107483920B (en) A panoramic video quality assessment method and system based on multi-layer quality factors
CN105069825B (en) Image super-resolution reconstruction method based on deep belief network
CN110634105B (en) Video high-space-time resolution signal processing method combining optical flow method and depth network
CN107197260A (en) Video coding post-filter method based on convolutional neural networks
CN110751597B (en) Video super-resolution method based on coding damage repair
CN111260560B (en) Multi-frame video super-resolution method fused with attention mechanism
CN112653899A (en) Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene
CN112040222B (en) Visual saliency prediction method and equipment
CN112381866B (en) Attention mechanism-based video bit enhancement method
CN109214985A (en) Recursive dense residual network for image super-resolution reconstruction
CN109948721A (en) A video scene classification method based on video description
CN110490804A (en) A method for generating super-resolution images based on generative adversarial networks
CN113066022B (en) Video bit enhancement method based on efficient space-time information fusion
CN108111860A (en) Lost-frame prediction and restoration method for video sequences based on deep residual networks
CN111462208A (en) Unsupervised depth prediction method based on binocular disparity and epipolar constraints
CN109949217A (en) Video super-resolution reconstruction method based on residual learning and implicit motion compensation
Löhdefink et al. GAN-vs. JPEG2000 image compression for distributed automotive perception: Higher peak SNR does not mean better semantic segmentation
Löhdefink et al. On low-bitrate image compression for distributed automotive perception: Higher peak snr does not mean better semantic segmentation
CN113132727B (en) Scalable machine vision coding method and training method of motion-guided image generation network
CN109583334A (en) An action recognition method and system based on spatiotemporal correlation neural networks
CN104408697A (en) Image super-resolution reconstruction method based on genetic algorithm and regularized prior model
CN113689382A (en) Postoperative tumor survival prediction method and system based on medical images and pathological images
CN115239857B (en) Image generation method and electronic device
Kumawat et al. Action recognition from a single coded image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20170630