CN106911930A - Method for compressed-sensing video reconstruction based on a recurrent convolutional neural network - Google Patents
Method for compressed-sensing video reconstruction based on a recurrent convolutional neural network (Download PDF / Info)
- Publication number
- CN106911930A (application CN201710124135.2A)
- Authority
- CN
- China
- Prior art keywords
- cnn
- network
- lstm
- video
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
Abstract
The present invention proposes a method for compressed-sensing video reconstruction based on a recurrent convolutional neural network. Its main contents include: the compressed sensing network (CSNet); the CSNet algorithm structure; convolutional neural networks (CNN); the long short-term memory (LSTM) network; CSNet network training; and compressed-sensing video reconstruction. In this process, an RNN extracts motion features, a CNN extracts visual features, the two sets of extracted information are fused, all extracted features are aggregated by the LSTM network, and these are combined with the motion inferred from the hidden state to form the reconstruction. The invention overcomes the difficulty existing methods have in guaranteeing video reconstruction quality at high compression ratios, designs an end-to-end trainable, non-iterative model, increases the compression ratio (CR) of CS cameras, improves video reconstruction quality, and reduces the data-transmission bandwidth, so that high-frame-rate video applications can be supported.
Description
Technical field
The present invention relates to the field of video compression and reconstruction, and more particularly to a method for compressed-sensing video reconstruction based on a recurrent convolutional neural network.
Background technology
Video compression and reconstruction are widely used in physics and bioscience research, video surveillance, remote sensing, social networks, and other fields. In physics and bioscience research, high-speed cameras record event features too fast for conventional cameras to capture; they can record high-resolution still images of high-speed events, for example tracking exploding bubbles without noticeable motion blur or image distortion artifacts. In video surveillance, regions of interest in a surveillance video can be reconstructed, and images of particular persons or license plates can be enhanced to improve identification. However, if a camera with a frame rate of 10 kfps shoots HD video at 1080p resolution, it produces about 500 GB of data per second, which poses an enormous challenge to existing transmission and storage technologies. How to transmit and store such high-volume video efficiently is a current research focus.
The present invention proposes a method for compressed-sensing video reconstruction based on a recurrent convolutional neural network, using a convolutional neural network (CNN) and a recurrent neural network (RNN) to extract spatiotemporal features, including background, object detail, and motion information, thereby achieving better reconstruction quality. Specifically, random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements. For each compressed measurement, a dedicated CNN extracts spatial features from it; a long short-term memory (LSTM) network aggregates all the features extracted by the CNNs and combines them with the motion inferred from the hidden state to form the reconstruction. The invention overcomes the limitations of conventional approaches that treat video as a series of independent images: by applying temporal information to the reconstruction process through the RNN, it generates a more accurate model. In addition, while preserving the visual details of the original video, the method increases the compression ratio, reduces the data-transmission bandwidth, improves video reconstruction quality, and supports high-frame-rate video applications.
Summary of the invention
In view of the problem that existing methods can hardly guarantee video reconstruction quality at high compression ratios, the object of the present invention is to provide a method for compressed-sensing video reconstruction based on a recurrent convolutional neural network, which overcomes the limitations of conventional methods, increases the compression ratio (CR) of CS cameras, improves video reconstruction quality, and reduces the data-transmission bandwidth, so that high-frame-rate video applications can be supported.
To solve the above problems, the present invention provides a method for compressed-sensing video reconstruction based on a recurrent convolutional neural network, whose main contents include:
(1) the compressed sensing network (CSNet);
(2) the CSNet algorithm structure;
(3) convolutional neural networks (CNN);
(4) the long short-term memory (LSTM) network;
(5) CSNet network training;
(6) compressed-sensing video reconstruction.
The compressed sensing network (CSNet) is a deep neural network that can learn visual representations from random measurements. It is an end-to-end trainable, non-iterative model for compressed-sensing video reconstruction that combines a convolutional neural network (CNN) and a recurrent neural network (RNN), so that video is reconstructed using spatiotemporal features. This network structure can accept random measurements at multiple compression ratios (CR), provides background information and object detail separately, and achieves better reconstruction quality.
The CSNet algorithm structure includes three modules: a random encoder for measurement, a CNN cluster for visual feature extraction, and an LSTM for temporal reconstruction. The random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements, and can accept measurements at multiple compression ratios (CR). With this algorithm, key frames and non-key frames (the remaining frames, which mainly contribute motion information) are compressed separately; the recurrent neural network (RNN) infers the motion information and combines it with the visual features extracted by the convolutional neural network (CNN) to synthesize high-quality frames. Efficient information fusion achieves an optimal balance between the fidelity of compressed-sensing (CS) video applications and the compression ratio (CR).
The convolutional neural network (CNN) performs compressed measurement and reconstruction of images. Temporal compression and spatial compression are combined to maximize the compression ratio. A larger CNN is designed to process key frames, because key frames contain high entropy information, while a smaller CNN is designed to process non-key frames. To reduce system latency and simplify the network structure, image blocks are used as input; the feature maps generated by the CNN then all have the same size as the image block, and the number of feature maps decreases monotonically. The input of this network is an m-dimensional vector composed of the compressed measurements, and a fully-connected layer before the CNN uses these measurements to generate a two-dimensional feature map.
Further, in the described temporal compression, to obtain a higher compression ratio (CR), each video patch containing T frames is divided into K key frames and (T-K) non-key frames. The key frames are compressed at a low compression ratio (CR) and the non-key frames at a high compression ratio (CR), so that the measurement information of the key frames can be reused to reconstruct the non-key frames; this can be regarded as temporal compression.
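As a concrete illustration of this bookkeeping, the overall compression ratio of such a key/non-key split can be computed as below. All numeric values (block size, measurement counts, T and K) are hypothetical examples, not values fixed by the invention.

```python
# Sketch of the temporal-compression bookkeeping described above.
# All concrete numbers (block size, measurement counts, T, K) are
# illustrative assumptions, not values fixed by the patent.

def overall_cr(T, K, n, m_key, m_nonkey):
    """Overall compression ratio for a video patch of T frames,
    where each frame block has n pixels, the K key frames get
    m_key measurements each (low CR), and the (T-K) non-key
    frames get m_nonkey measurements each (high CR)."""
    raw = T * n
    measured = K * m_key + (T - K) * m_nonkey
    return raw / measured

# Example: T=10 frames, K=1 key frame, 32x32 blocks (n=1024),
# key frame at CR 5 (m=204), non-key frames at CR ~51 (m=20).
cr = overall_cr(T=10, K=1, n=32 * 32, m_key=204, m_nonkey=20)
print(round(cr, 1))  # -> 26.7
```

Reusing the key-frame measurements for the non-key frames is what lets the overall CR sit well above the key-frame CR while quality is preserved.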
The long short-term memory (LSTM) network is used for temporal reconstruction. To obtain an end-to-end trainable and computationally efficient model, the raw input is not pre-processed; instead, an LSTM network extracts the motion features essential for reconstruction, thereby implicitly estimating the optical flow of the video. The synthesizing LSTM network is used for motion extrapolation and for aggregating the spatial visual features with the motion, so as to achieve video reconstruction.
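The aggregation performed by the LSTM can be illustrated with a minimal NumPy sketch of a single LSTM cell consuming a sequence of per-frame feature vectors. The dimensions, weights, and inputs are hypothetical stand-ins; the invention's synthesizing LSTM is a trained network, not this toy.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step: x is the CNN feature vector of the current
    frame, (h, c) the previous hidden and cell state. W stacks
    the input/forget/candidate/output gate weights; b the biases."""
    z = W @ np.concatenate([x, h]) + b
    d = h.size
    i = sigmoid(z[0:d])          # input gate
    f = sigmoid(z[d:2 * d])      # forget gate
    g = np.tanh(z[2 * d:3 * d])  # candidate cell state
    o = sigmoid(z[3 * d:4 * d])  # output gate
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
feat_dim, hidden = 16, 8         # hypothetical sizes
W = rng.standard_normal((4 * hidden, feat_dim + hidden)) * 0.1
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for t in range(10):              # aggregate T=10 per-frame features
    x = rng.standard_normal(feat_dim)
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)                   # hidden state summarising all frames
```

The hidden state carried across steps is what lets motion be extrapolated from frame to frame without explicitly computing optical flow.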
Further, in the described LSTM network training process, during training of the LSTM network the first M inputs of the LSTM are the CNN outputs extracted from the key frames, and the remaining (T-M) inputs are the CNN outputs extracted from the non-key frames. Each LSTM unit receives the visual features of the key frames; these visual features are used to reconstruct the background, recover the current frame of the object, and estimate the last several frames.
The CSNet network training is divided into two stages. In the first stage, the background CNN is pre-trained and visual features are extracted from the K key frames. In the second stage, the model is given more of the building blocks, extracted from the original input, that are needed to construct objects: the (T-M) smaller CNNs are trained from scratch, these object CNNs and the pre-trained background CNN are combined by a synthesizing LSTM, and the three networks are trained together. To reduce the number of parameters needed for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than measurements. The mean Euclidean loss is used as the loss function, i.e.,

L(W, b) = (1/2N) Σ_{i=1}^{N} || f(y_i; W, b) − x_i ||²₂

where W and b are the network weights and biases, x_i and y_i are each image block and its CS measurement, f is the reconstruction network, and a random Gaussian matrix is used for CS encoding.
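The CS encoding with a random Gaussian matrix and the mean Euclidean loss can be sketched as follows. The block size, measurement count, toy data, and the linear back-projection standing in for the reconstruction network f are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 1024, 204, 32          # 32x32 blocks, m measurements, N blocks
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random Gaussian matrix

X = rng.standard_normal((N, n))  # image blocks x_i (toy data)
Y = X @ Phi.T                    # CS measurements y_i = Phi @ x_i

def mean_euclidean_loss(X_hat, X):
    """L = (1/2N) * sum_i ||x_hat_i - x_i||_2^2, as in the text."""
    return 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))

# A toy linear back-projection f(y) = Phi^T y stands in for the
# trained CSNet reconstruction network here.
X_hat = Y @ Phi
loss = mean_euclidean_loss(X_hat, X)
print(float(loss) >= 0.0)  # -> True
```

In the actual training, the gradient of this loss with respect to W and b drives the joint optimisation of the CNNs and the synthesizing LSTM.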
In the described compressed-sensing video reconstruction, the current frame is built from the available information: motion features are extracted with the recurrent neural network (RNN), visual features are extracted with the convolutional neural network (CNN), the two sets of extracted information are fused, all extracted features are aggregated with the LSTM network, and they are combined with the motion inferred from the hidden state to form the reconstruction.
Brief description of the drawings
Fig. 1 is the system flow chart of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 2 is the overall architecture diagram of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 3 is the CSNet network training schematic diagram of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Fig. 4 is the compressed-sensing video reconstruction flow chart of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention.
Specific embodiment
It should be noted that, provided no conflict arises, the embodiments of the present application and the features of the embodiments may be combined with one another. The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is the system flow chart of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. It mainly includes: the compressed sensing network (CSNet), the CSNet algorithm structure, convolutional neural networks (CNN), the long short-term memory (LSTM) network, CSNet network training, and compressed-sensing video reconstruction.
The compressed sensing network (CSNet) is a deep neural network that can learn visual representations from random measurements. It is an end-to-end trainable, non-iterative model for compressed-sensing video reconstruction that combines a convolutional neural network (CNN) and a recurrent neural network (RNN), so that video is reconstructed using spatiotemporal features. This network structure can accept random measurements at multiple compression ratios (CR), provides background information and object detail separately, and achieves better reconstruction quality.
The CSNet algorithm structure includes three modules: a random encoder for measurement, a CNN cluster for visual feature extraction, and an LSTM for temporal reconstruction. The random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements, and can accept measurements at multiple compression ratios (CR). With this algorithm, key frames and non-key frames (the remaining frames, which mainly contribute motion information) are compressed separately; the recurrent neural network (RNN) infers the motion information and combines it with the visual features extracted by the convolutional neural network (CNN) to synthesize high-quality frames. Efficient information fusion achieves an optimal balance between the fidelity of compressed-sensing (CS) video applications and the compression ratio (CR).
The convolutional neural network (CNN) performs compressed measurement and reconstruction of images. Temporal compression and spatial compression are combined to maximize the compression ratio. A larger CNN is designed to process key frames, because key frames contain high entropy information, while a smaller CNN is designed to process non-key frames. To reduce system latency and simplify the network structure, image blocks are used as input; the feature maps generated by the CNN then all have the same size as the image block, and the number of feature maps decreases monotonically. The input of this network is an m-dimensional vector composed of the compressed measurements, and a fully-connected layer before the CNN uses these measurements to generate a two-dimensional feature map. To obtain a higher compression ratio (CR), each video patch containing T frames is divided into K key frames and (T-K) non-key frames. The key frames are compressed at a low compression ratio (CR) and the non-key frames at a high compression ratio (CR), so that the measurement information of the key frames can be reused to reconstruct the non-key frames; this can be regarded as temporal compression.
The long short-term memory (LSTM) network is used for temporal reconstruction. To obtain an end-to-end trainable and computationally efficient model, the raw input is not pre-processed; instead, an LSTM network extracts the motion features essential for reconstruction, thereby implicitly estimating the optical flow of the video. The synthesizing LSTM network is used for motion extrapolation and for aggregating the spatial visual features with the motion, so as to achieve video reconstruction. During training of the LSTM network, the first M inputs of the LSTM are the CNN outputs extracted from the key frames, and the remaining (T-M) inputs are the CNN outputs extracted from the non-key frames. Each LSTM unit receives the visual features of the key frames; these visual features are used to reconstruct the background, recover the current frame of the object, and estimate the last several frames.
The CSNet network training is divided into two stages. In the first stage, the background CNN is pre-trained and visual features are extracted from the K key frames. In the second stage, the model is given more of the building blocks, extracted from the original input, that are needed to construct objects: the (T-M) smaller CNNs are trained from scratch, these object CNNs and the pre-trained background CNN are combined by a synthesizing LSTM, and the three networks are trained together. To reduce the number of parameters needed for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than measurements. The mean Euclidean loss is used as the loss function, i.e.,

L(W, b) = (1/2N) Σ_{i=1}^{N} || f(y_i; W, b) − x_i ||²₂

where W and b are the network weights and biases, x_i and y_i are each image block and its CS measurement, f is the reconstruction network, and a random Gaussian matrix is used for CS encoding.
In the described compressed-sensing video reconstruction, the current frame is built from the available information: motion features are extracted with the recurrent neural network (RNN), visual features are extracted with the convolutional neural network (CNN), the two sets of extracted information are fused, all extracted features are aggregated with the LSTM network, and they are combined with the motion inferred from the hidden state to form the reconstruction.
Fig. 2 is the overall architecture diagram of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. Compressed video frames are obtained by compressed sensing. Reconstruction is performed by CSNet, which consists of a background CNN, object CNNs, and a synthesizing LSTM. In every T frames, the first M frames and the remaining (T-M) frames are compressed at a low CR and a high CR, respectively. The background CNN is pre-trained first; then the remaining background CNN layers and the rest of the model are trained together.
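At the level of tensor shapes, the data flow of this architecture can be sketched as follows, with random linear maps standing in for the measurement matrices and the two CNNs. All dimensions are hypothetical; the real model uses trained networks.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1024                  # pixels per 32x32 block
T, M = 10, 1              # frames per group; first M are key frames
m_key, m_non = 204, 20    # low-CR vs high-CR measurement counts
feat = 16                 # feature size produced by each CNN

Phi_key = rng.standard_normal((m_key, n))   # key-frame encoder
Phi_non = rng.standard_normal((m_non, n))   # non-key-frame encoder
# Random linear stand-ins for the background (key-frame) CNN
# and the smaller object (non-key-frame) CNNs:
W_key = rng.standard_normal((feat, m_key)) * 0.01
W_non = rng.standard_normal((feat, m_non)) * 0.01

frames = rng.standard_normal((T, n))
features = []
for t in range(T):
    if t < M:                        # key frame: low CR, larger CNN
        y = Phi_key @ frames[t]
        features.append(W_key @ y)
    else:                            # non-key frame: high CR, smaller CNN
        y = Phi_non @ frames[t]
        features.append(W_non @ y)

# The synthesizing LSTM would consume this T x feat sequence and
# emit T reconstructed blocks; here we only verify the shapes.
features = np.stack(features)
print(features.shape)  # -> (10, 16)
```

Note how both branches emit features of the same size, which is what lets a single LSTM aggregate key and non-key frames in one sequence.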
Fig. 3 is the CSNet network training schematic diagram of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. The network training process is divided into two stages: figure (a) shows the pre-training of the background CNN, and figure (b) shows the joint training of the CNNs and the synthesizing LSTM. In the first stage, the background CNN is pre-trained and visual features are extracted from the K key frames, as shown in figure (a). In the second stage, the model is given more of the building blocks, extracted from the original input, that are needed to construct objects: the (T-M) small CNNs are trained from scratch, these object CNNs and the pre-trained background CNN are combined by a synthesizing LSTM, and the three networks are trained together, as shown in figure (b). To reduce the number of parameters needed for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than measurements.
Fig. 4 is the compressed-sensing video reconstruction flow chart of the method for compressed-sensing video reconstruction based on a recurrent convolutional neural network according to the present invention. The current frame is built from the available information: motion features are extracted with the recurrent neural network (RNN), visual features are extracted with the convolutional neural network (CNN), the two sets of extracted information are fused, all extracted features are aggregated with the LSTM network, and they are combined with the motion inferred from the hidden state to form the reconstruction.
Those skilled in the art will appreciate that the present invention is not limited to the details of the above embodiments, and that it can be realized in other specific forms without departing from its spirit and scope. Furthermore, those skilled in the art may make various changes and modifications to the invention without departing from its spirit and scope, and these improvements and modifications should also be regarded as falling within the scope of protection of the invention. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.
Claims (10)
1. A method for compressed-sensing video reconstruction based on a recurrent convolutional neural network, characterised by mainly comprising: a compressed sensing network (CSNet) (1); a CSNet algorithm structure (2); convolutional neural networks (CNN) (3); a long short-term memory (LSTM) network (4); CSNet network training (5); and compressed-sensing video reconstruction (6).
2. The compressed sensing network (CSNet) (1) according to claim 1, characterised in that the compressed sensing network (CSNet) is a deep neural network that can learn visual representations from random measurements; for compressed-sensing video reconstruction it is an end-to-end trainable, non-iterative model that combines a convolutional neural network (CNN) and a recurrent neural network (RNN) so as to reconstruct video using spatiotemporal features; this network structure can accept random measurements at multiple compression ratios (CR), provides background information and object detail separately, and achieves better reconstruction quality.
3. The recurrent neural network (RNN) according to claim 2, characterised in that, for video reconstruction applications, modelling the temporal process is extremely important; by building the current frame from the available information, which contains the extrapolated temporal dependencies between the current frame and the patches, the recurrent neural network (RNN) applies temporal information to the reconstruction process and can be used to generate a more accurate model.
4. The CSNet algorithm structure (2) according to claim 1, characterised in that the structure comprises three modules: a random encoder for measurement, a CNN cluster for visual feature extraction, and an LSTM for temporal reconstruction; the random encoders run in parallel, encoding the first frame of the video with more measurements while encoding the residual frames with fewer measurements, and can accept measurements at multiple compression ratios (CR); with this algorithm, key frames and non-key frames (the remaining frames, which mainly contribute motion information) are compressed separately; the recurrent neural network (RNN) infers the motion information and combines it with the visual features extracted by the convolutional neural network (CNN) to synthesize high-quality frames; efficient information fusion achieves an optimal balance between the fidelity of compressed-sensing (CS) video applications and the compression ratio (CR).
5. The convolutional neural network (CNN) (3) according to claim 1, characterised in that the network performs compressed measurement and reconstruction of images; temporal compression and spatial compression are combined to maximize the compression ratio; a larger CNN is designed to process key frames, because key frames contain high entropy information, while a smaller CNN is designed to process non-key frames; to reduce system latency and simplify the network structure, image blocks are used as input, so that all feature maps generated by the CNN have the same size as the image block and the number of feature maps decreases monotonically; the input of the network is an m-dimensional vector composed of the compressed measurements, and a fully-connected layer before the CNN uses these measurements to generate a two-dimensional feature map.
6. The temporal compression according to claim 5, characterised in that, to obtain a higher compression ratio (CR), each video patch containing T frames is divided into K key frames and (T-K) non-key frames; the key frames are compressed at a low compression ratio (CR) and the non-key frames at a high compression ratio (CR), so that the measurement information of the key frames can be reused to reconstruct the non-key frames, which can be regarded as temporal compression.
7. The long short-term memory (LSTM) network (4) according to claim 1, characterised in that it is used for temporal reconstruction; to obtain an end-to-end trainable and computationally efficient model, the raw input is not pre-processed, and an LSTM network extracts the motion features essential for reconstruction, thereby implicitly estimating the optical flow of the video; the synthesizing LSTM network is used for motion extrapolation and for aggregating the spatial visual features with the motion, so as to achieve video reconstruction.
8. The LSTM network training process according to claim 7, characterised in that, during training of the LSTM network, the first M inputs of the LSTM are the CNN outputs extracted from the key frames and the remaining (T-M) inputs are the CNN outputs extracted from the non-key frames; each LSTM unit receives the visual features of the key frames, and these visual features are used to reconstruct the background, recover the current frame of the object, and estimate the last several frames.
9. The CSNet network training (5) according to claim 1, characterised in that it is divided into two stages: in the first stage, the background CNN is pre-trained and visual features are extracted from the K key frames; in the second stage, the model is given more of the building blocks, extracted from the original input, that are needed to construct objects, and the (T-M) smaller CNNs are trained from scratch; these object CNNs and the pre-trained background CNN are combined by a synthesizing LSTM, and the three networks are trained together; to reduce the number of parameters needed for training, only the last several layers of the key-frame CNN are combined, so the input of these layers is a feature map rather than measurements; the mean Euclidean loss is used as the loss function, i.e.,

L(W, b) = (1/2N) Σ_{i=1}^{N} || f(y_i; W, b) − x_i ||²₂

where W and b are the network weights and biases, x_i and y_i are each image block and its CS measurement, f is the reconstruction network, and a random Gaussian matrix is used for CS encoding.
10. The compressed-sensing video reconstruction (6) according to claim 1, characterised in that the current frame is built from the available information: motion features are extracted with the recurrent neural network (RNN), visual features are extracted with the convolutional neural network (CNN), the two sets of extracted information are fused, all extracted features are aggregated with the LSTM network, and they are combined with the motion inferred from the hidden state to form the reconstruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710124135.2A CN106911930A (en) | 2017-03-03 | 2017-03-03 | Method for compressed-sensing video reconstruction based on a recurrent convolutional neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106911930A true CN106911930A (en) | 2017-06-30 |
Family
ID=59186285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710124135.2A Withdrawn CN106911930A (en) | 2017-03-03 | 2017-03-03 | Method for compressed-sensing video reconstruction based on a recurrent convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106911930A (en) |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107154150A (en) * | 2017-07-25 | 2017-09-12 | 北京航空航天大学 | A kind of traffic flow forecasting method clustered based on road with double-layer double-direction LSTM |
CN107392189A (en) * | 2017-09-05 | 2017-11-24 | 百度在线网络技术(北京)有限公司 | For the method and apparatus for the driving behavior for determining unmanned vehicle |
CN107808122A (en) * | 2017-09-30 | 2018-03-16 | 中国科学院长春光学精密机械与物理研究所 | Method for tracking target and device |
CN108198202A (en) * | 2018-01-23 | 2018-06-22 | 北京易智能科技有限公司 | A kind of video content detection method based on light stream and neural network |
CN108197566A (en) * | 2017-12-29 | 2018-06-22 | 成都三零凯天通信实业有限公司 | Monitoring video behavior detection method based on multi-path neural network |
CN108322685A (en) * | 2018-01-12 | 2018-07-24 | 广州华多网络科技有限公司 | Video frame interpolation method, storage medium and terminal |
CN108810651A (en) * | 2018-05-09 | 2018-11-13 | 太原科技大学 | Wireless video method of multicasting based on depth-compression sensing network |
CN108923984A (en) * | 2018-07-16 | 2018-11-30 | 西安电子科技大学 | Space-time video compress cognitive method based on convolutional network |
CN109003614A (en) * | 2018-07-31 | 2018-12-14 | 上海爱优威软件开发有限公司 | A kind of voice transmission method, voice-transmission system and terminal |
CN109360436A (en) * | 2018-11-02 | 2019-02-19 | Oppo广东移动通信有限公司 | A kind of video generation method, terminal and storage medium |
CN109376856A (en) * | 2017-08-09 | 2019-02-22 | 上海寒武纪信息科技有限公司 | Data processing method and processing unit |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104822063A (en) * | 2015-04-16 | 2015-08-05 | 长沙理工大学 | Compressed sensing video reconstruction method based on dictionary learning and residual reconstruction |
CN105469065A (en) * | 2015-12-07 | 2016-04-06 | 中国科学院自动化研究所 | Recurrent neural network-based discrete emotion recognition method |
CN106157319A (en) * | 2016-07-28 | 2016-11-23 | 哈尔滨工业大学 | Saliency detection method based on region-level and pixel-level fusion with convolutional neural networks |
CN106331433A (en) * | 2016-08-25 | 2017-01-11 | 上海交通大学 | Video denoising method based on a deep recurrent neural network |
2017-03-03: Application CN201710124135.2A filed in China; published as CN106911930A; status: Withdrawn
Non-Patent Citations (1)
Title |
---|
KAI XU et al.: "CSVideoNet: A Recurrent Convolutional Neural Network for Compressive Sensing Video Reconstruction", published online at https://arxiv.org/abs/1612.05203v1 * |
Cited By (45)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107154150A (en) * | 2017-07-25 | 2017-09-12 | 北京航空航天大学 | Traffic flow forecasting method based on road clustering and a two-layer bidirectional LSTM |
CN107154150B (en) * | 2017-07-25 | 2019-07-02 | 北京航空航天大学 | Traffic flow forecasting method based on road clustering and a two-layer bidirectional LSTM |
CN109376856A (en) * | 2017-08-09 | 2019-02-22 | 上海寒武纪信息科技有限公司 | Data processing method and processing unit |
CN107392189A (en) * | 2017-09-05 | 2017-11-24 | 百度在线网络技术(北京)有限公司 | Method and apparatus for determining driving behavior of an unmanned vehicle |
CN107808122A (en) * | 2017-09-30 | 2018-03-16 | 中国科学院长春光学精密机械与物理研究所 | Target tracking method and device |
CN111479982A (en) * | 2017-11-15 | 2020-07-31 | 吉奥奎斯特系统公司 | In-situ operating system with filter |
CN110046537A (en) * | 2017-12-08 | 2019-07-23 | 辉达公司 | System and method for dynamic facial analysis using a recurrent neural network |
CN110046537B (en) * | 2017-12-08 | 2023-12-29 | 辉达公司 | System and method for dynamic facial analysis using recurrent neural networks |
CN108197566B (en) * | 2017-12-29 | 2022-03-25 | 成都三零凯天通信实业有限公司 | Monitoring video behavior detection method based on multi-path neural network |
CN108197566A (en) * | 2017-12-29 | 2018-06-22 | 成都三零凯天通信实业有限公司 | Monitoring video behavior detection method based on multi-path neural network |
CN108322685A (en) * | 2018-01-12 | 2018-07-24 | 广州华多网络科技有限公司 | Video frame interpolation method, storage medium and terminal |
CN108322685B (en) * | 2018-01-12 | 2020-09-25 | 广州华多网络科技有限公司 | Video frame insertion method, storage medium and terminal |
CN108198202A (en) * | 2018-01-23 | 2018-06-22 | 北京易智能科技有限公司 | Video content detection method based on optical flow and neural networks |
CN108810651B (en) * | 2018-05-09 | 2020-11-03 | 太原科技大学 | Wireless video multicast method based on deep compression sensing network |
CN108810651A (en) * | 2018-05-09 | 2018-11-13 | 太原科技大学 | Wireless video multicast method based on a deep compressed sensing network |
CN108923984A (en) * | 2018-07-16 | 2018-11-30 | 西安电子科技大学 | Spatiotemporal video compressed sensing method based on convolutional networks |
CN108923984B (en) * | 2018-07-16 | 2021-01-12 | 西安电子科技大学 | Space-time video compressed sensing method based on convolutional network |
CN109003614A (en) * | 2018-07-31 | 2018-12-14 | 上海爱优威软件开发有限公司 | Voice transmission method, voice transmission system and terminal |
CN109614896A (en) * | 2018-10-29 | 2019-04-12 | 山东大学 | Method for semantic understanding of video content based on a recurrent convolutional neural network |
CN109360436A (en) * | 2018-11-02 | 2019-02-19 | Oppo广东移动通信有限公司 | Video generation method, terminal and storage medium |
US11290723B2 (en) * | 2018-11-29 | 2022-03-29 | Beijing Sensetime Technology Development Co., Ltd. | Method for video compression processing, electronic device and storage medium |
WO2020107877A1 (en) * | 2018-11-29 | 2020-06-04 | 北京市商汤科技开发有限公司 | Video compression processing method and apparatus, electronic device, and storage medium |
CN109743571B (en) * | 2018-12-26 | 2020-04-28 | 西安交通大学 | Image coding method based on parallel compressed sensing multilayer residual error coefficients |
CN109743571A (en) * | 2018-12-26 | 2019-05-10 | 西安交通大学 | Image encoding method based on parallel compressed sensing multilayer residual coefficients |
CN111383245A (en) * | 2018-12-29 | 2020-07-07 | 北京地平线机器人技术研发有限公司 | Video detection method, video detection device and electronic equipment |
CN111383245B (en) * | 2018-12-29 | 2023-09-22 | 北京地平线机器人技术研发有限公司 | Video detection method, video detection device and electronic equipment |
CN113766313A (en) * | 2019-02-26 | 2021-12-07 | 深圳市商汤科技有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN113766313B (en) * | 2019-02-26 | 2024-03-05 | 深圳市商汤科技有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN110007366A (en) * | 2019-03-04 | 2019-07-12 | 中国科学院深圳先进技术研究院 | Life search method and system based on multi-sensor fusion |
CN109819256B (en) * | 2019-03-06 | 2022-07-26 | 西安电子科技大学 | Video compression sensing method based on feature sensing |
CN109819256A (en) * | 2019-03-06 | 2019-05-28 | 西安电子科技大学 | Video compressed sensing method based on feature perception |
CN110087092A (en) * | 2019-03-11 | 2019-08-02 | 西安电子科技大学 | Low bit-rate video decoding method based on image reconstruction convolutional neural networks |
CN110516736A (en) * | 2019-06-04 | 2019-11-29 | 沈阳瑞初科技有限公司 | Multi-dimensional visual multi-source heterogeneous data multi-layer DRNN deep fusion method |
CN110516736B (en) * | 2019-06-04 | 2022-02-22 | 沈阳瑞初科技有限公司 | Multi-dimensional visual multi-source heterogeneous data multi-layer DRNN depth fusion method |
CN110738108A (en) * | 2019-09-09 | 2020-01-31 | 北京地平线信息技术有限公司 | Target object detection method, target object detection device, storage medium and electronic equipment |
CN110784228B (en) * | 2019-10-23 | 2023-07-25 | 武汉理工大学 | Compression method of subway structure vibration signal based on LSTM model |
CN110784228A (en) * | 2019-10-23 | 2020-02-11 | 武汉理工大学 | Compression method of subway structure vibration signal based on LSTM model |
CN110933429B (en) * | 2019-11-13 | 2021-11-12 | 南京邮电大学 | Video compression sensing and reconstruction method and device based on deep neural network |
CN110933429A (en) * | 2019-11-13 | 2020-03-27 | 南京邮电大学 | Video compression sensing and reconstruction method and device based on deep neural network |
CN111428751B (en) * | 2020-02-24 | 2022-12-23 | 清华大学 | Object detection method based on compressed sensing and convolutional network |
CN111428751A (en) * | 2020-02-24 | 2020-07-17 | 清华大学 | Object detection method based on compressed sensing and convolutional network |
CN114339221A (en) * | 2020-09-30 | 2022-04-12 | 脸萌有限公司 | Convolutional neural network based filter for video coding and decoding |
CN112866697A (en) * | 2020-12-31 | 2021-05-28 | 杭州海康威视数字技术股份有限公司 | Video image coding and decoding method and device, electronic equipment and storage medium |
CN112866697B (en) * | 2020-12-31 | 2022-04-05 | 杭州海康威视数字技术股份有限公司 | Video image coding and decoding method and device, electronic equipment and storage medium |
WO2022213992A1 (en) * | 2021-04-09 | 2022-10-13 | 华为技术有限公司 | Data processing method and apparatus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106911930A (en) | Method for compressed sensing video reconstruction based on a recurrent convolutional neural network | |
CN107483920B (en) | Panoramic video quality assessment method and system based on multi-layer quality factors | |
CN105069825B (en) | Image super-resolution reconstruction method based on a deep belief network | |
CN110634105B (en) | High spatiotemporal resolution video signal processing method combining optical flow and deep networks | |
CN107197260A (en) | Video coding post-filter method based on convolutional neural networks | |
CN110751597B (en) | Video super-resolution method based on coding damage repair | |
CN111260560B (en) | Multi-frame video super-resolution method fused with attention mechanism | |
CN112653899A (en) | Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene | |
CN112040222B (en) | Visual saliency prediction method and equipment | |
CN112381866B (en) | Attention mechanism-based video bit enhancement method | |
CN109214985A (en) | Recursive dense residual network for image super-resolution reconstruction | |
CN109948721A (en) | Video scene classification method based on video description | |
CN110490804A (en) | Super-resolution image generation method based on a generative adversarial network | |
CN113066022B (en) | Video bit enhancement method based on efficient space-time information fusion | |
CN108111860A (en) | Prediction-based restoration method for lost frames in video sequences using a deep residual network | |
CN111462208A (en) | Unsupervised depth prediction method based on binocular disparity and epipolar constraints | |
CN109949217A (en) | Video super-resolution reconstruction method based on residual learning and implicit motion compensation | |
Löhdefink et al. | GAN-vs. JPEG2000 image compression for distributed automotive perception: Higher peak SNR does not mean better semantic segmentation | |
Löhdefink et al. | On low-bitrate image compression for distributed automotive perception: Higher peak snr does not mean better semantic segmentation | |
CN113132727B (en) | Scalable machine vision coding method and training method of motion-guided image generation network | |
CN109583334A (en) | Action recognition method and system based on a spatiotemporal correlation neural network | |
CN104408697A (en) | Image super-resolution reconstruction method based on a genetic algorithm and a regularized prior model | |
CN113689382A (en) | Postoperative survival prediction method and system for tumors based on medical and pathological images | |
CN115239857B (en) | Image generation method and electronic device | |
Kumawat et al. | Action recognition from a single coded image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 2017-06-30 |