CN111626308A - Real-time optical flow estimation method based on lightweight convolutional neural network - Google Patents

Real-time optical flow estimation method based on lightweight convolutional neural network

Info

Publication number
CN111626308A
CN111626308A
Authority
CN
China
Prior art keywords
optical flow
cost
frame
pyramid
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010322368.5A
Other languages
Chinese (zh)
Other versions
CN111626308B (en)
Inventor
孔令通
杨杰
黄晓霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202010322368.5A priority Critical patent/CN111626308B/en
Publication of CN111626308A publication Critical patent/CN111626308A/en
Application granted granted Critical
Publication of CN111626308B publication Critical patent/CN111626308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a real-time optical flow estimation method based on a lightweight convolutional neural network, which comprises the following steps: given two adjacent frames of images, constructing multi-scale feature pyramids with shared parameters; on the basis of the constructed feature pyramids, constructing a U-shaped network structure for the first frame image by using deconvolution operations to perform multi-scale information fusion; initializing the lowest-resolution optical flow field to zero, and performing a deformation operation based on bilinear sampling on the second frame matching features after upsampling the optical flow field estimated at the second-lowest resolution; performing inner-product-based local similarity calculation on the first frame features and the deformed second frame features to construct the matching cost, and performing cost aggregation; taking the multi-scale features, the upsampled optical flow field and the cost-aggregated matching cost features as the input of an optical flow regression network, and estimating the optical flow field at the current resolution; and repeating until the optical flow field at the highest resolution is estimated. With the method, optical flow estimation is more accurate, and the model is lightweight, efficient, real-time and fast.

Description

Real-time optical flow estimation method based on lightweight convolutional neural network
Technical Field
The invention relates to the technical field of computer vision, in particular to a real-time optical flow estimation method based on a lightweight convolutional neural network.
Background
Optical flow estimation is a fundamental research task in computer vision and serves as a bridge between images and videos. Its core idea is, given two consecutive frames, to estimate the correspondence of every pixel; this can also be understood approximately as the projection of the 3D motion field of objects onto the 2D image plane. Optical flow plays an important role in behavior understanding, video processing, motion prediction, multi-view 3D reconstruction, automatic driving, and simultaneous localization and mapping (SLAM). Therefore, estimating optical flow (especially dense optical flow) accurately and quickly is important in the field of computer vision.
Traditional optical flow estimation methods are based on the brightness-constancy assumption, introduce prior knowledge such as local smoothness, and solve the problem by constructing an energy function with regularization constraints and applying a variational optimization strategy. Their disadvantages are slow running speed and poor estimation of large displacements.
Block-matching-based methods can obtain sparse optical flow in the non-occluded regions of an image and then fill the missing parts through an interpolation algorithm to construct dense optical flow. Their disadvantage is that the block matching algorithm involves random initialization and random search, so the result depends on the random initial values and is not very stable; moreover, the large number of search and matching operations increases the time overhead.
Existing deep-learning-based methods construct only an image pyramid or a single feature pyramid; in contrast, the present method fuses multi-scale features by constructing a U-shaped network structure, so that the matching features carry global context and the robustness of the algorithm is improved. In addition, existing deep learning methods use the matching cost directly as the input of the optical flow regression network, but its dynamic range is inconsistent with that of the upsampled optical flow field features from the previous stage, which degrades performance.
Chinese patent application No. 201710731234.7, entitled "Dense optical flow estimation method and device", discloses a dense optical flow estimation method and apparatus. However, it still relies on traditional methods and cannot perform fast, real-time inference.
Therefore, it is urgently needed to provide a lightweight, efficient, real-time and fast convolutional neural network for full-scene dense optical flow estimation.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a real-time optical flow estimation method based on a lightweight convolutional neural network, which performs cost aggregation after computing the matching cost, refining the matching cost while adjusting the dynamic range of the output, thereby improving network performance and outperforming existing deep learning methods in parameter count, inference speed and model accuracy.
In order to solve the technical problems, the invention is realized by the following technical scheme:
the invention provides a real-time optical flow estimation method based on a lightweight convolutional neural network, which comprises the following steps of:
s11: given two adjacent frames of images, extracting hierarchical image features by using a parameter-shared convolutional neural network to construct a first frame feature pyramid and a second frame feature pyramid;
s12: on the basis of the feature pyramid constructed in S11, constructing a U-shaped network structure for the first frame image by using deconvolution operations to perform multi-scale information fusion, obtaining multi-scale features;
s13: initializing the lowest-resolution optical flow field to zero, and performing a bilinear-sampling-based deformation operation on the second frame matching features after upsampling the optical flow field estimated at the second-lowest resolution;
s14: performing inner-product-based local similarity calculation between the features of the first frame feature pyramid and the deformed second frame features obtained in S13, constructing the matching cost, and performing cost aggregation;
s15: taking the multi-scale features constructed in S12, the upsampled optical flow field from S13 and the cost-aggregated matching cost features from S14 as the input of an optical flow regression network, and estimating the optical flow field at the current resolution;
s16: repeating S13 to S15 until the optical flow field at the highest resolution is estimated.
Preferably, the S11 specifically includes:
given two adjacent input images $I_1, I_2$, multi-scale image features are extracted by a pyramid network, and the first frame feature pyramid and the second frame feature pyramid are constructed:
$\{F_1^k\}_{k=1}^{6}, \quad \{F_2^k\}_{k=1}^{6}$
where $F_1^k$ is the first frame image feature at the k-th level, $F_2^k$ is the second frame image feature at the k-th level, and k denotes the scale level, k = 1, 2, …, 6; level 1 represents 1/2 of the original resolution and level 6 represents 1/64 of the original resolution.
Preferably, the S12 specifically includes:
for the first frame feature pyramid, the pyramid feature of the (k+1)-th level, $F_1^{k+1}$, is upsampled to the k-th level spatial resolution by a deconvolution operation and is concatenated and convolved with the original k-th level pyramid feature $F_1^k$, obtaining the k-th level semantic feature $\tilde{F}_1^k$ that fuses multi-scale information.
Preferably, the S13 specifically includes:
the optical flow field $\mathrm{flow}_{k+1}$ estimated at the (k+1)-th level is spatially upsampled by a factor of 2 to obtain the initial optical flow $\mathrm{Up}_2(\mathrm{flow}_{k+1})$ of the k-th level; using $\mathrm{Up}_2(\mathrm{flow}_{k+1})$, the k-th level pyramid feature of the second frame image, $F_2^k$, is subjected to a deformation operation based on bilinear sampling, obtaining the deformed target feature $\tilde{F}_2^k$.
Preferably, the S14 specifically includes:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
Compared with the prior art, the invention has the following advantages:
(1) according to the real-time optical flow estimation method based on the lightweight convolutional neural network, multi-scale features are obtained by the multi-scale information fusion in S12; compared with the traditional approach of constructing only an image pyramid or a single feature pyramid, the fused feature pyramid has stronger expressive power: it takes both low-level texture information and multi-scale semantic information into account, so the full-scene dense optical flow field can be estimated accurately;
(2) according to the real-time optical flow estimation method based on the lightweight convolutional neural network, the coarse-to-fine deformation operation based on bilinear sampling applied to the second frame matching features in S13 shortens the spatial distance of large displacements, alleviates the challenge brought by large motion, and facilitates residual estimation;
(3) according to the real-time optical flow estimation method based on the lightweight convolutional neural network, through the cost aggregation in S14, the original inner-product-based matching cost gains a certain adaptability compared with the traditional strategy, thereby improving network performance;
(4) according to the real-time optical flow estimation method based on the lightweight convolutional neural network, the pieces of information cascaded in S15 are used as the input of the optical flow regression network; different from the prior art, the semantic information is provided by the fused multi-scale features rather than the original pyramid features, which enlarges the global receptive field of the network and reduces mismatches. In addition, the cost-aggregated features replace the original matching cost as input, which accelerates network convergence and improves model accuracy;
(5) according to the real-time optical flow estimation method based on the lightweight convolutional neural network, a coarse-to-fine pyramid estimation scheme is adopted for the optical flow estimation in S16; specifically, the relative displacement of large motion is small at low-resolution pyramid levels, which reduces the search radius R required during matching, so compared with conventional methods the method has a large estimation dynamic range and a fast inference speed;
(6) The real-time optical flow estimation method based on the lightweight convolutional neural network provided by the invention is light in weight, efficient, real-time and rapid, can be deployed in mobile computing equipment and has strong practicability.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of a method for estimating a real-time optical flow based on a lightweight convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a network structure diagram of a method for estimating real-time optical flow based on a lightweight convolutional neural network according to an embodiment of the present invention;
FIG. 3a is a first frame image according to an embodiment of the present invention;
FIG. 3b is a second frame image according to an embodiment of the present invention;
FIG. 3c is the dense optical flow estimation result obtained by performing real-time optical flow estimation on FIGS. 3a and 3b using a method according to an embodiment of the present invention;
FIG. 4a is a first frame image according to another embodiment of the present invention;
FIG. 4b is a second frame image according to another embodiment of the present invention;
FIG. 4c is the dense optical flow estimation result obtained by performing real-time optical flow estimation on FIGS. 4a and 4b using a method according to an embodiment of the present invention;
FIG. 5a is a first frame image according to another embodiment of the present invention;
FIG. 5b is a second frame image according to another embodiment of the present invention;
FIG. 5c is the dense optical flow estimation result obtained by performing real-time optical flow estimation on FIGS. 5a and 5b using a method according to an embodiment of the present invention;
FIG. 6a is a first frame image according to another embodiment of the present invention;
FIG. 6b is a second frame image according to another embodiment of the present invention;
FIG. 6c is the dense optical flow estimation result obtained by performing real-time optical flow estimation on FIGS. 6a and 6b using a method according to an embodiment of the present invention;
FIG. 7a is a first frame image according to another embodiment of the present invention;
FIG. 7b is a second frame image according to another embodiment of the present invention;
FIG. 7c is the dense optical flow estimation result obtained by performing real-time optical flow estimation on FIGS. 7a and 7b using a method according to an embodiment of the present invention;
FIG. 8 is a comparison of the method of the present invention with existing deep models in terms of parameter count and inference speed.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments. The following embodiments will assist those skilled in the art in further understanding the invention, but do not limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
Fig. 1 is a flowchart of a method for estimating a real-time optical flow based on a lightweight convolutional neural network according to an embodiment of the present invention.
Referring to fig. 1, the method for estimating a real-time optical flow based on a lightweight convolutional neural network of the present embodiment includes the following steps:
s11: given two adjacent frames of images, extracting hierarchical image features by using a parameter-shared convolutional neural network to construct a first frame feature pyramid and a second frame feature pyramid;
s12: on the basis of the feature pyramid constructed in S11, constructing a U-shaped network structure for the first frame image by using deconvolution operations to perform multi-scale information fusion, obtaining multi-scale features;
s13: initializing the lowest-resolution optical flow field to zero, and performing a bilinear-sampling-based deformation operation on the second frame matching features after upsampling the optical flow field estimated at the second-lowest resolution;
s14: performing inner-product-based local similarity calculation between the features of the first frame feature pyramid and the deformed second frame features obtained in S13, constructing the matching cost, and performing cost aggregation;
s15: taking the multi-scale features constructed in S12, the upsampled optical flow field from S13 and the cost-aggregated matching cost features from S14 as the input of an optical flow regression network, and estimating the optical flow field at the current resolution;
s16: repeating S13 to S15 until the optical flow field at the highest resolution is estimated.
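Read together, steps S13 to S16 describe a coarse-to-fine refinement loop. The following PyTorch sketch outlines one way such a loop could be organised; here warp, matching_cost, aggregate and regressors stand for the operations detailed in the following paragraphs, and the level indexing and the doubling of flow values during upsampling are illustrative assumptions rather than details fixed by this description.

```python
import torch
import torch.nn.functional as F

def coarse_to_fine_flow(fused_feats1, feats2, regressors, warp, matching_cost, aggregate):
    """Sketch of the S13-S16 loop.  fused_feats1[k]: fused first-frame feature at level k;
    feats2[k]: second-frame pyramid feature at level k; regressors[k]: per-level optical
    flow regression network.  Levels are processed from coarsest to finest."""
    flow = None
    for k in sorted(fused_feats1.keys(), reverse=True):          # e.g. 6, 5, 4, 3, 2
        f1, f2 = fused_feats1[k], feats2[k]
        if flow is None:
            # S13: the lowest-resolution optical flow field is initialised to zero
            up_flow = f1.new_zeros(f1.shape[0], 2, f1.shape[2], f1.shape[3])
        else:
            # upsample the flow estimated at the next-lower resolution (x2 spatially;
            # scaling the flow values by 2 is a common convention assumed here)
            up_flow = 2.0 * F.interpolate(flow, scale_factor=2,
                                          mode="bilinear", align_corners=True)
        f2_warped = warp(f2, up_flow)                            # S13: bilinear-sampling warp
        cost = aggregate(matching_cost(f1, f2_warped))           # S14: matching cost + aggregation
        net_in = torch.cat([f1, up_flow, cost], dim=1)           # S15: regression-network input
        flow = regressors[k](net_in)                             # optical flow field at level k
    return flow                                                  # S16: finest estimated flow
```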
In an embodiment, S11 specifically includes:
given two adjacent input images $I_1, I_2$, multi-scale image features are extracted by a pyramid network, and the first frame feature pyramid and the second frame feature pyramid are constructed:
$\{F_1^k\}_{k=1}^{6}, \quad \{F_2^k\}_{k=1}^{6}$
where $F_1^k$ is the first frame image feature at the k-th level, $F_2^k$ is the second frame image feature at the k-th level, and k denotes the scale level, k = 1, 2, …, 6; level 1 represents 1/2 of the original resolution and level 6 represents 1/64 of the original resolution.
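A minimal PyTorch sketch of such a parameter-shared feature pyramid is given below; the six levels follow the description, while the channel widths and the exact layer composition are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class FeaturePyramid(nn.Module):
    """Parameter-shared encoder: the same weights process both frames.  Each level
    halves the spatial resolution, yielding features at 1/2, 1/4, ..., 1/64."""
    def __init__(self, channels=(16, 32, 64, 96, 128, 192)):   # widths are illustrative
        super().__init__()
        self.levels = nn.ModuleList()
        in_ch = 3
        for out_ch in channels:
            self.levels.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
            ))
            in_ch = out_ch

    def forward(self, img):
        feats, x = [], img
        for level in self.levels:
            x = level(x)
            feats.append(x)            # feats[k-1] has 1/2**k of the input resolution
        return feats

# usage: the SAME module (shared parameters) is applied to both frames
pyramid = FeaturePyramid()
I1, I2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
F1_pyr, F2_pyr = pyramid(I1), pyramid(I2)    # two 6-level feature pyramids
```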
S12 specifically includes:
for the first frame feature pyramid, the pyramid feature of the (k+1)-th level, $F_1^{k+1}$, is upsampled to the k-th level spatial resolution by a deconvolution operation and is concatenated and convolved with the original k-th level pyramid feature $F_1^k$, obtaining the k-th level semantic feature $\tilde{F}_1^k$ that fuses multi-scale information.
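One possible PyTorch sketch of this deconvolution-based fusion step is shown below; the kernel sizes and channel counts are illustrative assumptions, not values taken from the description.

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Upsamples the level-(k+1) feature with a deconvolution, concatenates it with the
    original level-k pyramid feature, and convolves to obtain the fused level-k feature."""
    def __init__(self, coarse_ch, fine_ch, out_ch):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(coarse_ch, fine_ch, kernel_size=4, stride=2, padding=1)
        self.conv = nn.Conv2d(fine_ch * 2, out_ch, kernel_size=3, padding=1)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, coarse_feat, fine_feat):
        up = self.act(self.deconv(coarse_feat))                  # to level-k spatial resolution
        return self.act(self.conv(torch.cat([up, fine_feat], dim=1)))

# usage on two adjacent pyramid levels (channel counts are assumptions)
block = FusionBlock(coarse_ch=192, fine_ch=128, out_ch=128)
f_coarse = torch.randn(1, 192, 8, 8)      # feature at level k+1
f_fine = torch.randn(1, 128, 16, 16)      # original pyramid feature at level k
fused_k = block(f_coarse, f_fine)         # fused multi-scale feature at level k
```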
S13 specifically includes:
the optical flow field $\mathrm{flow}_{k+1}$ estimated at the (k+1)-th level is spatially upsampled by a factor of 2 to obtain the initial optical flow $\mathrm{Up}_2(\mathrm{flow}_{k+1})$ of the k-th level; using $\mathrm{Up}_2(\mathrm{flow}_{k+1})$, the k-th level pyramid feature of the second frame image, $F_2^k$, is subjected to a deformation operation based on bilinear sampling, obtaining the deformed target feature $\tilde{F}_2^k$.
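A compact PyTorch sketch of the bilinear-sampling deformation (warping) and the 2x flow upsampling it operates on could look as follows; the use of grid_sample and the doubling of the flow magnitudes when the resolution doubles are common implementation conventions assumed here rather than details stated in the text.

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Bilinear-sampling warp: samples `feat` (second-frame feature) at positions
    displaced by `flow`, using torch.nn.functional.grid_sample."""
    _, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=feat.device),
                            torch.arange(w, device=feat.device), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)      # (1, 2, H, W) pixel grid
    pos = base + flow                                             # displaced sampling positions
    # normalise to [-1, 1], the coordinate convention expected by grid_sample
    gx = 2.0 * pos[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * pos[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=3)                           # (B, H, W, 2)
    return F.grid_sample(feat, grid, mode="bilinear", align_corners=True)

# usage at level k: the level-(k+1) flow is upsampled 2x; scaling the flow values by 2
# when the resolution doubles is a common convention assumed here
flow_k1 = torch.zeros(1, 2, 8, 8)
up_flow = 2.0 * F.interpolate(flow_k1, scale_factor=2, mode="bilinear", align_corners=True)
F2_k = torch.randn(1, 128, 16, 16)
F2_warped = warp(F2_k, up_flow)
```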
S14 specifically includes:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
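The matching-cost construction and the 3 × 3 cost aggregation of S14 could be sketched in PyTorch as follows; the concrete search radius, the normalisation of the inner product by the channel count, and the output channel count of the aggregation convolution are assumptions made only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def matching_cost(f1, f2_warped, radius=4):
    """Inner-product local cost volume: for each pixel x and each offset d with
    |d|_inf <= radius, compute <F1(x), F2_warped(x + d)>.
    Returns a (B, (2R+1)^2, H, W) cost tensor; dividing by the channel count
    is a normalisation convention assumed here."""
    _, c, h, w = f1.shape
    padded = F.pad(f2_warped, [radius, radius, radius, radius])
    costs = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[:, :, dy:dy + h, dx:dx + w]
            costs.append((f1 * shifted).sum(dim=1, keepdim=True) / c)
    return torch.cat(costs, dim=1)

# cost aggregation: a 3x3 convolution applied to the raw matching cost
radius = 4                                                     # illustrative search radius
aggregate = nn.Conv2d((2 * radius + 1) ** 2, (2 * radius + 1) ** 2, kernel_size=3, padding=1)

f1 = torch.randn(1, 128, 16, 16)
f2_warped = torch.randn(1, 128, 16, 16)
raw_cost = matching_cost(f1, f2_warped, radius)                # (1, 81, 16, 16)
aggregated_cost = aggregate(raw_cost)                          # cost features after aggregation
```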
In one embodiment, the optical flow network structure shown in fig. 2 is first constructed using any deep learning framework; for example, the proposed network architecture can be implemented using the PyTorch framework.
Then, a forward propagation procedure as shown in fig. 1 is constructed. The network outputs optical flow fields at 5 resolution levels (1/4, 1/8, 1/16, 1/32 and 1/64 of the input resolution) and is trained end to end with the following multi-scale loss function:
$L = \sum_{l=2}^{6} \alpha_l \sum_{\mathbf{x}} \left\| \mathrm{flow}_l(\mathbf{x}) - \mathrm{flow}_l^{GT}(\mathbf{x}) \right\|_2$
where $\alpha_6 = 0.32$, $\alpha_5 = 0.08$, $\alpha_4 = 0.02$, $\alpha_3 = 0.01$ and $\alpha_2 = 0.005$ are the weighting coefficients of the per-level losses, $\mathrm{flow}_l(\mathbf{x})$ denotes the optical flow field estimated by the l-th level network, $\mathrm{flow}_l^{GT}(\mathbf{x})$ denotes the supervision signal obtained by downsampling the ground-truth optical flow to the corresponding level resolution, and $\|\cdot\|_2$ denotes the 2-norm.
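A PyTorch sketch of this multi-scale loss is given below, under the assumption that the ground-truth flow is downsampled with bilinear interpolation and that its values are rescaled to each level's resolution; the weighting coefficients are those listed above.

```python
import torch
import torch.nn.functional as F

def multiscale_loss(flow_preds, flow_gt, weights=None):
    """Weighted sum of per-level 2-norm flow errors.  `flow_preds` maps level l (2..6)
    to the predicted flow at that level; `flow_gt` is the full-resolution ground truth.
    Rescaling the ground-truth flow values when downsampling is an assumed convention."""
    if weights is None:
        weights = {6: 0.32, 5: 0.08, 4: 0.02, 3: 0.01, 2: 0.005}
    total = flow_gt.new_zeros(())
    for lvl, pred in flow_preds.items():
        scale = pred.shape[-1] / flow_gt.shape[-1]               # e.g. 1/64 ... 1/4
        gt = F.interpolate(flow_gt, size=pred.shape[-2:],
                           mode="bilinear", align_corners=False) * scale
        total = total + weights[lvl] * torch.norm(pred - gt, p=2, dim=1).sum()
    return total
```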
Next, the proposed model is trained with supervision using the multi-scale loss function of the previous step on the FlyingChairs and FlyingThings3D synthetic datasets. In the FlyingChairs training phase, the initial learning rate is set to lr = 1e-4 for 600k iterations, and the learning rate is halved at 300k, 400k and 500k iterations. The model is then fine-tuned on the FlyingThings3D dataset with the initial learning rate set to lr = 1e-5 for a total of 500k iterations, halving the learning rate at 200k, 300k and 400k iterations. After these two training stages, the proposed model can be fine-tuned on other synthetic or real-scene datasets and finally deployed for use. During training, several data augmentation strategies are used, including random mirroring, random rotation, random scaling, random color jittering and random cropping.
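The two-stage learning-rate schedule described above could be configured in PyTorch roughly as follows; the choice of the Adam optimiser is an assumption, and only the learning rates and the milestone iteration counts come from the description (scheduler.step() would be called once per training iteration).

```python
import torch
import torch.nn as nn

# stand-in for the full optical-flow network, used here only to make the example runnable
model = nn.Conv2d(3, 2, 3, padding=1)

# FlyingChairs stage: lr = 1e-4 for 600k iterations, halved at 300k, 400k and 500k
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[300_000, 400_000, 500_000], gamma=0.5)

# FlyingThings3D fine-tuning stage: lr = 1e-5 for 500k iterations, halved at 200k, 300k and 400k
ft_optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
ft_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    ft_optimizer, milestones=[200_000, 300_000, 400_000], gamma=0.5)
```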
Finally, once the model is trained and used in practice, the highest-resolution (1/4-resolution) optical flow field among the 5 levels is upsampled to the resolution of the original input image as the final estimation result of the network.
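A short sketch of this final upsampling step; rescaling the flow values by the upsampling factor, so that displacements are expressed in full-resolution pixels, is an assumed convention rather than something stated explicitly above.

```python
import torch
import torch.nn.functional as F

def final_flow(flow_quarter, full_size):
    """Upsample the finest (1/4-resolution) flow field to the original input resolution.
    Multiplying the flow values by 4 expresses displacements in full-resolution pixels
    (a common convention, assumed here)."""
    return 4.0 * F.interpolate(flow_quarter, size=full_size,
                               mode="bilinear", align_corners=False)

# usage for a 448x1024 input whose finest estimated flow is at 112x256
flow_q = torch.randn(1, 2, 112, 256)
flow_full = final_flow(flow_q, (448, 1024))          # (1, 2, 448, 1024)
```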
The effects of the examples of the present invention will be further described below by experiments.
1. Conditions of the experiment
The MPI Sintel and KITTI standard test video image sequence is adopted as experimental data in the experiment. The experimental facility had one Intel Core i7-6700 CPU and a single NVIDIA GTX1080Ti GPU, and the experimental environment was PyTorch-0.4.0.
2. Content of the experiment
The proposed dense optical flow estimation method is validated from both qualitative and quantitative perspectives.
2.1 qualitative test results
The invention selects 5 representative pairs of adjacent test frames (FIGS. 3a-3b, 4a-4b, 5a-5b, 6a-6b and 7a-7b) from the computer-synthesized dataset MPI Sintel and the real autonomous-driving dataset KITTI; the test scenes include non-rigid object motion, large rigid motion, and so on. The optical flow estimation results of the method are shown in FIGS. 3c, 4c, 5c, 6c and 7c.
2.2 quantitative test results
The MPI Sintel and KITTI test datasets are used to quantitatively analyze the accuracy of the estimated dense optical flow, and the estimated results are submitted to the respective test servers for evaluation. The compared methods include the currently leading FlowNetC, FlowNet2, LiteFlowNet and PWC-Net. The evaluation indices are the Average End Point Error (AEPE) and the percentage of erroneously estimated pixels (Fl), where a correctly estimated pixel is defined as one whose estimate differs from the ground-truth label by less than 3 pixels, or by a distance less than 5% of the ground-truth magnitude. The performance of the compared methods on the test datasets is shown in Table 1, where Fl-Noc denotes the Fl index on non-occluded regions.
TABLE 1 comparison of different depth learning methods in Sintel, KITTI test datasets
The best results are shown in bold in Table 1, from which it can be seen that the average accuracy of the optical flow estimation exceeds most current advanced methods on the different test benchmarks of multiple datasets. The proposed method therefore improves optical flow estimation accuracy across a variety of test environments and generalizes well across scenes. As shown in FIG. 8, the method also achieves the fastest inference speed under the same test environment, reaching 63 fps on 448x1024-resolution video sequences, which fully demonstrates its efficiency, real-time performance and broad application prospects.
The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and not to limit the invention. Any modifications and variations within the scope of the description, which may occur to those skilled in the art, are intended to be within the scope of the invention.

Claims (9)

1. A real-time optical flow estimation method based on a lightweight convolutional neural network is characterized by comprising the following steps:
s11: given two adjacent frames of images, extracting hierarchical image features by using a parameter-shared convolutional neural network to construct a first frame feature pyramid and a second frame feature pyramid;
s12: on the basis of the feature pyramid constructed in S11, constructing a U-shaped network structure for the first frame image by using deconvolution operations to perform multi-scale information fusion, obtaining multi-scale features;
s13: initializing the lowest-resolution optical flow field to zero, and performing a bilinear-sampling-based deformation operation on the second frame matching features after upsampling the optical flow field estimated at the second-lowest resolution;
s14: performing inner-product-based local similarity calculation between the features of the first frame feature pyramid and the deformed second frame features obtained in S13, constructing the matching cost, and performing cost aggregation;
s15: taking the multi-scale features constructed in S12, the upsampled optical flow field from S13 and the cost-aggregated matching cost features from S14 as the input of an optical flow regression network, and estimating the optical flow field at the current resolution;
s16: repeating S13 to S15 until the optical flow field at the highest resolution is estimated.
2. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 1, wherein the S11 specifically comprises:
given two adjacent input images $I_1, I_2$, multi-scale image features are extracted by a pyramid network, and the first frame feature pyramid and the second frame feature pyramid are constructed:
$\{F_1^k\}_{k=1}^{6}, \quad \{F_2^k\}_{k=1}^{6}$
where $F_1^k$ is the first frame image feature at the k-th level, $F_2^k$ is the second frame image feature at the k-th level, and k denotes the scale level, k = 1, 2, …, 6; level 1 represents 1/2 of the original resolution and level 6 represents 1/64 of the original resolution.
3. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 1 or 2, wherein the S12 specifically comprises:
for the first frame feature pyramid, the pyramid feature of the (k+1)-th level, $F_1^{k+1}$, is upsampled to the k-th level spatial resolution by a deconvolution operation and is concatenated and convolved with the original k-th level pyramid feature $F_1^k$, obtaining the k-th level semantic feature $\tilde{F}_1^k$ that fuses multi-scale information.
4. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 1 or 2, wherein the S13 specifically comprises:
the optical flow field $\mathrm{flow}_{k+1}$ estimated at the (k+1)-th level is spatially upsampled by a factor of 2 to obtain the initial optical flow $\mathrm{Up}_2(\mathrm{flow}_{k+1})$ of the k-th level; using $\mathrm{Up}_2(\mathrm{flow}_{k+1})$, the k-th level pyramid feature of the second frame image, $F_2^k$, is subjected to a deformation operation based on bilinear sampling, obtaining the deformed target feature $\tilde{F}_2^k$.
5. The method according to claim 3, wherein the S13 specifically comprises:
the optical flow field $\mathrm{flow}_{k+1}$ estimated at the (k+1)-th level is spatially upsampled by a factor of 2 to obtain the initial optical flow $\mathrm{Up}_2(\mathrm{flow}_{k+1})$ of the k-th level; using $\mathrm{Up}_2(\mathrm{flow}_{k+1})$, the k-th level pyramid feature of the second frame image, $F_2^k$, is subjected to a deformation operation based on bilinear sampling, obtaining the deformed target feature $\tilde{F}_2^k$.
6. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 1 or 2, wherein the S14 specifically comprises:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
7. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 3, wherein the step S14 specifically comprises:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
8. The method for estimating real-time optical flow based on a lightweight convolutional neural network as claimed in claim 4, wherein the step S14 specifically comprises:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
9. The method for estimating a real-time optical flow based on a lightweight convolutional neural network as claimed in claim 5, wherein the step S14 specifically comprises:
s141: calculating a matching cost:
$c^k(\mathbf{x}, \mathbf{d}) = \left\langle F_1^k(\mathbf{x}),\ \tilde{F}_2^k(\mathbf{x} + \mathbf{d}) \right\rangle$
where $\langle \cdot , \cdot \rangle$ denotes the inner product, $\mathbf{x}$ denotes the two-dimensional spatial position coordinate in the first frame feature, $\mathbf{d}$ denotes the two-dimensional search offset at $\mathbf{x}$, and the search radius is R, so that $\mathbf{d}$ lies in the square region $\{-R, …, R\} \times \{-R, …, R\}$ of size $(2R+1) \times (2R+1)$;
s142: a 3 × 3 convolution is applied to the matching cost $c^k(\mathbf{x}, \mathbf{d})$ to obtain the cost-aggregated matching cost feature $\tilde{c}^k(\mathbf{x}, \mathbf{d})$.
CN202010322368.5A 2020-04-22 2020-04-22 Real-time optical flow estimation method based on lightweight convolutional neural network Active CN111626308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010322368.5A CN111626308B (en) 2020-04-22 2020-04-22 Real-time optical flow estimation method based on lightweight convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010322368.5A CN111626308B (en) 2020-04-22 2020-04-22 Real-time optical flow estimation method based on lightweight convolutional neural network

Publications (2)

Publication Number Publication Date
CN111626308A true CN111626308A (en) 2020-09-04
CN111626308B CN111626308B (en) 2023-04-18

Family

ID=72260062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010322368.5A Active CN111626308B (en) 2020-04-22 2020-04-22 Real-time optical flow estimation method based on lightweight convolutional neural network

Country Status (1)

Country Link
CN (1) CN111626308B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465872A (en) * 2020-12-10 2021-03-09 南昌航空大学 Image sequence optical flow estimation method based on learnable occlusion mask and secondary deformation optimization
CN113538527A (en) * 2021-07-08 2021-10-22 上海工程技术大学 Efficient lightweight optical flow estimation method
CN115619740A (en) * 2022-10-19 2023-01-17 广西交科集团有限公司 High-precision video speed measuring method and system, electronic equipment and storage medium
CN116486107A (en) * 2023-06-21 2023-07-25 南昌航空大学 Optical flow calculation method, system, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180293737A1 (en) * 2017-04-07 2018-10-11 Nvidia Corporation System and method for optical flow estimation
CN108881899A (en) * 2018-07-09 2018-11-23 深圳地平线机器人科技有限公司 Based on the pyramidal image prediction method and apparatus of optical flow field and electronic equipment
CN109756690A (en) * 2018-12-21 2019-05-14 西北工业大学 Lightweight view interpolation method based on feature rank light stream
CN110111366A (en) * 2019-05-06 2019-08-09 北京理工大学 A kind of end-to-end light stream estimation method based on multistage loss amount
CN110176023A (en) * 2019-04-29 2019-08-27 同济大学 A kind of light stream estimation method based on pyramid structure

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180293737A1 (en) * 2017-04-07 2018-10-11 Nvidia Corporation System and method for optical flow estimation
CN108881899A (en) * 2018-07-09 2018-11-23 深圳地平线机器人科技有限公司 Based on the pyramidal image prediction method and apparatus of optical flow field and electronic equipment
CN109756690A (en) * 2018-12-21 2019-05-14 西北工业大学 Lightweight view interpolation method based on feature rank light stream
CN110176023A (en) * 2019-04-29 2019-08-27 同济大学 A kind of light stream estimation method based on pyramid structure
CN110111366A (en) * 2019-05-06 2019-08-09 北京理工大学 A kind of end-to-end light stream estimation method based on multistage loss amount

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TAK-WAI HUI; XIAOOU TANG et al.: "LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation" *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465872A (en) * 2020-12-10 2021-03-09 南昌航空大学 Image sequence optical flow estimation method based on learnable occlusion mask and secondary deformation optimization
CN112465872B (en) * 2020-12-10 2022-08-26 南昌航空大学 Image sequence optical flow estimation method based on learnable occlusion mask and secondary deformation optimization
CN113538527A (en) * 2021-07-08 2021-10-22 上海工程技术大学 Efficient lightweight optical flow estimation method
CN113538527B (en) * 2021-07-08 2023-09-26 上海工程技术大学 Efficient lightweight optical flow estimation method, storage medium and device
CN115619740A (en) * 2022-10-19 2023-01-17 广西交科集团有限公司 High-precision video speed measuring method and system, electronic equipment and storage medium
CN115619740B (en) * 2022-10-19 2023-08-08 广西交科集团有限公司 High-precision video speed measuring method, system, electronic equipment and storage medium
CN116486107A (en) * 2023-06-21 2023-07-25 南昌航空大学 Optical flow calculation method, system, equipment and medium
CN116486107B (en) * 2023-06-21 2023-09-05 南昌航空大学 Optical flow calculation method, system, equipment and medium

Also Published As

Publication number Publication date
CN111626308B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
US11210803B2 (en) Method for 3D scene dense reconstruction based on monocular visual slam
CN111626308B (en) Real-time optical flow estimation method based on lightweight convolutional neural network
Liu et al. Video super-resolution based on deep learning: a comprehensive survey
CN111325794B (en) Visual simultaneous localization and map construction method based on depth convolution self-encoder
Chen et al. Monocular neural image based rendering with continuous view control
CN111047516B (en) Image processing method, image processing device, computer equipment and storage medium
CN110782490B (en) Video depth map estimation method and device with space-time consistency
Li et al. From beginner to master: A survey for deep learning-based single-image super-resolution
Vitoria et al. Semantic image inpainting through improved wasserstein generative adversarial networks
CN114339409A (en) Video processing method, video processing device, computer equipment and storage medium
CN116205962B (en) Monocular depth estimation method and system based on complete context information
CN114429555A (en) Image density matching method, system, equipment and storage medium from coarse to fine
Wang et al. 4k-nerf: High fidelity neural radiance fields at ultra high resolutions
Klenk et al. E-nerf: Neural radiance fields from a moving event camera
CN113610912B (en) System and method for estimating monocular depth of low-resolution image in three-dimensional scene reconstruction
CN117115786B (en) Depth estimation model training method for joint segmentation tracking and application method
CN111242999A (en) Parallax estimation optimization method based on up-sampling and accurate re-matching
CN111767679B (en) Method and device for processing time-varying vector field data
Hara et al. Enhancement of novel view synthesis using omnidirectional image completion
WO2024032331A9 (en) Image processing method and apparatus, electronic device, and storage medium
Ye Learning of dense optical flow, motion and depth, from sparse event cameras
Park et al. Relativistic Approach for Training Self-Supervised Adversarial Depth Prediction Model Using Symmetric Consistency
CN117474956B (en) Light field reconstruction model training method based on motion estimation attention and related equipment
CN117593702B (en) Remote monitoring method, device, equipment and storage medium
CN117241065B (en) Video plug-in frame image generation method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant