CN113658230A - Optical flow estimation method, terminal and storage medium - Google Patents

Optical flow estimation method, terminal and storage medium

Info

Publication number
CN113658230A
CN113658230A (application CN202010396695.5A)
Authority
CN
China
Prior art keywords
layer
feature
characteristic
optical flow
mth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010396695.5A
Other languages
Chinese (zh)
Other versions
CN113658230B (en)
Inventor
徐璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202010396695.5A priority Critical patent/CN113658230B/en
Publication of CN113658230A publication Critical patent/CN113658230A/en
Application granted granted Critical
Publication of CN113658230B publication Critical patent/CN113658230B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an optical flow estimation method, a terminal and a storage medium. Before an image of an optical flow to be estimated is input into an optical flow estimation network for optical flow estimation, each feature layer that will be input into the network is processed using the other feature layers, so that the features of each layer input into the optical flow estimation network are fused with the features of other layers. The optical flow estimation network can therefore make effective use of richer features for optical flow estimation, which improves the accuracy of the estimated optical flow.

Description

Optical flow estimation method, terminal and storage medium
Technical Field
The present invention relates to the field of computer vision technologies, and in particular, to an optical flow estimation method, a terminal, and a storage medium.
Background
Optical flow is widely used in the field of computer vision, for example in video compression, video behavior recognition and video frame interpolation, and deep learning models for estimating optical flow already exist. In an optical flow estimation network model based on a feature pyramid network structure, a feature pyramid is first built from the original image, and the optical flow is then estimated layer by layer, from the coarsest features to the finest.
Thus, there is a need for improvements and enhancements in the art.
Disclosure of Invention
In view of the above-mentioned drawbacks of the prior art, the present invention provides an optical flow estimation method, a terminal and a storage medium, which are intended to solve the problem in the prior art that optical flow estimation network models based on a feature pyramid network structure produce large optical flow estimation errors for complex motion.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
an optical flow estimation method, wherein the optical flow estimation method comprises:
acquiring a target image pair of an optical flow to be estimated, performing feature extraction on each target image in the target image pair, and acquiring first features corresponding to each target image, wherein the first features comprise N layers of feature layers, and N is a positive integer greater than or equal to 2;
acquiring M feature layers other than the first feature layer among the N feature layers, processing each of the M feature layers according to the other feature layers among the M feature layers, and acquiring a second feature corresponding to the target image, wherein the second feature comprises M feature layers, M is N-1, and M is a positive integer;
and inputting second characteristics corresponding to each target image in the target image pair into a preset optical flow estimation network, and acquiring the optical flow output by the preset optical flow estimation network.
The optical flow estimation method, wherein the extracting features of each target image in the target image pair, and acquiring first features corresponding to each target image respectively includes:
extracting a first layer of feature layer in the N layers of feature layers according to the target image;
and performing convolution pooling on the i-th feature layer according to a preset resolution scaling to obtain the (i+1)-th feature layer, where i takes the values 1, 2, …, N-1 in sequence, so as to obtain the N feature layers of the target image.
The optical flow estimation method, wherein the resolution of each of the M feature layers is sequentially reduced from the 1 st layer to the M th layer; the processing each of the M feature layers according to the other feature layers of the M feature layers, and obtaining the second feature corresponding to the target image includes:
and for the m-th feature layer of the M feature layers, processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers to obtain the second feature corresponding to the target image, wherein m is a positive integer, and m is greater than or equal to 1 and less than or equal to M.
The optical flow estimation method, wherein the processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers includes:
performing feature fusion processing on the m-th feature layer according to a first intermediate feature layer and/or a second intermediate feature layer to obtain the m-th feature layer in the second feature;
the first intermediate feature layer is a feature layer obtained by up-sampling the (m+1)-th feature layer to the resolution of the m-th feature layer, and the second intermediate feature layer is a feature layer obtained by down-sampling the 1st feature layer of the M feature layers to the resolution of the m-th feature layer.
The optical flow estimation method, wherein the performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and/or the second intermediate feature layer, and acquiring the m-th feature layer in the second feature includes:
when m is greater than 1 and less than M, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the m-th feature layer in the second feature;
when m is equal to 1, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer to obtain the 1st feature layer in the second feature;
and when m is equal to M, performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature.
The optical flow estimation method, wherein the performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and the second intermediate feature layer, and acquiring the m-th feature layer in the second feature includes:
convolving the first intermediate feature layer and the second intermediate feature layer respectively to generate a first convolution feature layer and a second convolution feature layer;
adding the first convolution feature layer and the second convolution feature layer to the m-th feature layer to generate a third intermediate feature layer;
and convolving the third intermediate feature layer to generate the m-th feature layer in the second feature.
The optical flow estimation method, wherein the performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer, and acquiring the 1st feature layer in the second feature includes:
performing convolution on the first intermediate feature layer to generate a first convolution feature layer;
adding the first convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the 1st feature layer in the second feature.
The optical flow estimation method, wherein the performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer, and acquiring the M-th feature layer in the second feature includes:
convolving the second intermediate feature layer to generate a second convolution feature layer;
adding the second convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the M-th feature layer in the second feature.
A terminal, wherein the terminal comprises: a processor, a storage medium communicatively connected to the processor, the storage medium adapted to store a plurality of instructions, the processor adapted to invoke the instructions in the storage medium to perform the steps of implementing the optical flow estimation method of any of the above.
A storage medium, wherein the storage medium stores one or more programs, which are executable by one or more processors to implement the steps of the optical flow estimation method of any of the above.
Beneficial effects: compared with the prior art, the present invention provides an optical flow estimation method, a terminal and a storage medium. Before the image of the optical flow to be estimated is input into the optical flow estimation network for optical flow estimation, each feature layer to be input into the network is processed using the other feature layers, so that the features of each layer input into the optical flow estimation network are fused with the features of other layers. The optical flow estimation network can thus effectively use richer features for optical flow estimation, which improves the accuracy of optical flow estimation.
Drawings
FIG. 1 is a flow chart of an embodiment of an optical flow estimation method provided by the present invention;
FIG. 2 is a schematic diagram of the operation of a PWC-Net optical flow estimation network;
FIG. 3 is a schematic diagram of generating a second feature according to a first feature in the optical flow estimation method provided by the present invention;
FIG. 4 is a comparison graph of the optical flow EPE value estimated by the optical flow estimation method of the present invention and the existing method;
FIG. 5 is a comparison graph of a visualization of an optical flow estimated by the optical flow estimation method of the present invention and an optical flow estimated by a conventional method;
fig. 6 is a schematic structural diagram of an embodiment of a terminal provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
The optical flow estimation method provided by the invention can be applied to terminals, and the terminals can be but are not limited to various personal computers, notebook computers, mobile phones, tablet computers, vehicle-mounted computers and portable wearable equipment. When the terminal acquires the image pairs of the optical flow to be estimated, the optical flow corresponding to the image pairs can be estimated by the optical flow estimation method provided by the invention.
Example one
The inventor has found that in existing optical flow estimation networks based on a feature pyramid network structure (such as PWC-Net, FlowNet2.0, LiteFlowNet and the like), a feature pyramid is first built from the original image, and the optical flow is then estimated from this feature pyramid. As shown in fig. 2, fig. 2 is a schematic view of the optical flow estimation process of the PWC-Net optical flow estimation network: PWC-Net extracts 6 layers of pyramid features from the original image and predicts the optical flow using the last 5 layers with smaller resolution. Starting from the top-layer (coarsest) features, the optical flow is estimated layer by layer, becoming progressively finer and of higher resolution; at each layer, the optical flow up-sampled from the previous layer is used, and a series of further operations yields the finally estimated optical flow. In the feature pyramid built from the original image, the resolution of the feature layers decreases gradually from the bottom layer to the top layer: the bottom-layer features are rich in detail, while the higher-layer features are more abstract and rich in semantic information. Because each layer of an existing pyramid-based optical flow estimation network relies only on the optical flow up-sampled from the previous layer, many details are lost during optical flow estimation, which leads to larger errors in the final optical flow estimation result when complex motion is present in the image.
Based on the above problem, the present invention provides an optical flow estimation method, which further processes the pyramid feature layers obtained from the original image, fusing each feature layer to be input into the optical flow estimation network with the other feature layers.
Referring to fig. 1, fig. 1 is a flowchart illustrating a first embodiment of an optical flow estimation method according to the present invention.
The optical flow estimation method comprises the following steps:
s100, obtaining a target image pair of the optical flow to be estimated, extracting features of each target image in the target image pair, and obtaining first features corresponding to each target image.
Specifically, optical flow estimation determines the optical flow between two images from the difference between them. In this embodiment, a target image pair of the optical flow to be estimated is first acquired, the target image pair including two target images. Feature extraction is performed on each target image to obtain the first feature corresponding to each target image, where the first feature comprises N feature layers and N is a positive integer greater than or equal to 2. The number of feature layers included in the first feature may be determined according to the number of feature layers expected by the optical flow estimation network used for estimating the optical flow. Specifically, in the optical flow estimation method provided in this embodiment, after the target image is processed, the feature corresponding to the target image is input into a preset optical flow estimation network, and N may be the number of input feature layers set in the preset optical flow estimation network plus one; for example, if the number of input feature layers of the PWC-Net network is set to 5, N may be 6.
The extracting the features of each target image in the target image pair, and the obtaining the first features respectively corresponding to each target image comprises:
s110, extracting a first layer of feature layer in the N layers of feature layers according to the target image;
After the target image is obtained, convolution pooling is performed on the target image to obtain a feature layer, and this feature layer is taken as the first feature layer of the first feature. Convolution pooling of an image is a common processing step in the field and is not described in detail here.
And S120, performing convolution pooling on the i-th feature layer according to a preset resolution scaling to obtain the (i+1)-th feature layer, where i takes the values 1, 2, …, N-1 in sequence, so as to obtain the N feature layers of the target image.
After the first feature layer of the N feature layers is obtained, convolution pooling is performed successively: at each step, a new feature layer with lower resolution is obtained from the current one according to the preset resolution scaling. That is, convolution pooling is performed on the first feature layer to obtain the second feature layer, convolution pooling is performed on the second feature layer to obtain the third feature layer, and so on, until the N feature layers including the first feature layer are obtained. Since the resolution ratio between each new feature layer and the previous one is constant, N feature layers of successively decreasing resolution are obtained; that is, the first feature layer has the largest resolution and the N-th feature layer the smallest. For example, the preset resolution scaling may be 2:1, that is, the resolution of each feature layer is half the resolution of the previous feature layer.
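The following is a minimal illustrative sketch, not the patented method itself, of how such an N-layer first feature could be extracted in PyTorch, assuming that the convolution pooling is realized with strided convolutions and that the preset resolution scaling is 2:1; the channel widths, kernel sizes and activation are illustrative assumptions only.

```python
# Assumed PyTorch sketch of first-feature extraction (N feature layers,
# resolution halved at each layer). Widths are illustrative, N = len(widths).
import torch.nn as nn

class FeaturePyramid(nn.Module):
    def __init__(self, in_channels=3, widths=(16, 32, 64, 96, 128, 196)):
        super().__init__()
        blocks, prev = [], in_channels
        for w in widths:
            blocks.append(nn.Sequential(
                nn.Conv2d(prev, w, kernel_size=3, stride=2, padding=1),  # halves H and W
                nn.LeakyReLU(0.1),
                nn.Conv2d(w, w, kernel_size=3, stride=1, padding=1),
                nn.LeakyReLU(0.1)))
            prev = w
        self.blocks = nn.ModuleList(blocks)

    def forward(self, image):
        features = []                  # the "first feature": N layers
        x = image
        for block in self.blocks:
            x = block(x)
            features.append(x)
        return features                # features[0] is the 1st (highest-resolution) layer
```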
In the process of acquiring the first feature from the target image, the resolution decreases each time a new feature layer is obtained: the bottom feature layers have high resolution and contain rich detail features, while the top feature layers have more concentrated features and richer semantic information but have lost part of the detail. If the first feature were input directly into the preset optical flow estimation network for optical flow estimation, a large estimation error would result when objects in the image pair undergo complex motion. Therefore, in this embodiment, the first feature is further processed after it is obtained.
S200, obtaining M layers of feature layers except for the first layer of feature layer in the N layers of feature layers, processing each feature layer in the M layers of feature layers according to other feature layers in the M layers of feature layers, and obtaining second features corresponding to the target images respectively.
Among the N feature layers, the first feature layer has the largest resolution, and using it for optical flow estimation would greatly increase the number of parameters and the amount of computation. Therefore, in this embodiment, the M feature layers other than the first feature layer among the N feature layers are processed to obtain the second feature used for optical flow estimation, where M is N-1 and M is a positive integer.
Specifically, the resolution of the M feature layers decreases sequentially from the 1st layer to the M-th layer. Processing each of the M feature layers according to the other feature layers among the M feature layers and obtaining the second feature corresponding to the target image includes:
for the m-th feature layer of the M feature layers, processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers to obtain the second feature corresponding to the target image, where m is a positive integer and m is greater than or equal to 1 and less than or equal to M.
Since a feature layer with high resolution has richer detail, while a feature layer with low resolution is more abstract and richer in semantic information, in this embodiment, as shown in fig. 3, the m-th feature layer of the M feature layers is processed according to the feature layer with the highest resolution (i.e., the 1st feature layer) and/or the (m+1)-th feature layer of the M feature layers to obtain the second feature corresponding to the target image. Specifically: feature fusion processing is performed on the m-th feature layer according to a first intermediate feature layer and/or a second intermediate feature layer to obtain the m-th feature layer in the second feature. The first intermediate feature layer is obtained by up-sampling the (m+1)-th feature layer to the same resolution as the m-th feature layer, and the second intermediate feature layer is obtained by down-sampling the 1st feature layer of the M feature layers to the same resolution as the m-th feature layer.
When the m-th feature layer is fused with the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers, the 1st feature layer and/or the (m+1)-th feature layer are first processed to produce feature layers whose resolution matches that of the m-th feature layer. Specifically, the adjacent (m+1)-th feature layer is up-sampled to increase its resolution, generating the first intermediate feature layer, and the feature layer with the highest resolution is down-sampled to decrease its resolution, generating the second intermediate feature layer.
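A minimal sketch of how the first and second intermediate feature layers could be generated is given below; the use of PyTorch, bilinear interpolation for both the up-sampling and the down-sampling, and the function name are illustrative assumptions, not prescribed by the method.

```python
# Assumed sketch: build the intermediate feature layers for the m-th layer.
# f is a list of the M feature layers (f[0] is the 1st, highest-resolution layer).
import torch.nn.functional as F

def intermediate_layers(f, m):
    """Return (first_intermediate, second_intermediate) for 1-based index m."""
    target_size = f[m - 1].shape[-2:]          # spatial size of the m-th layer
    first = second = None
    if m < len(f):                             # up-sample the (m+1)-th layer
        first = F.interpolate(f[m], size=target_size,
                              mode='bilinear', align_corners=False)
    if m > 1:                                  # down-sample the 1st layer
        second = F.interpolate(f[0], size=target_size,
                               mode='bilinear', align_corners=False)
    return first, second
```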
Feature fusion performs operations on the features of several feature layers to generate a new feature layer that fuses them. Performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and/or the second intermediate feature layer to obtain the m-th feature layer in the second feature includes:
s210, when M is larger than 1 and smaller than M, performing feature fusion processing on the mth layer of feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the mth layer of feature layer in the second features;
s220, when m is equal to 1, performing feature fusion processing on the mth layer feature layer according to the first intermediate feature layer to obtain a 1 st layer feature layer in the second features;
and S230, when M is equal to M, performing feature fusion processing on the mth layer feature layer according to the second intermediate feature layer to obtain the mth layer feature layer in the second features. Specifically, the performing feature fusion processing on the mth layer feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the mth layer feature layer in the second feature includes:
s211, respectively convolving the first intermediate characteristic layer and the second intermediate characteristic layer to generate a first convolution characteristic layer and a second convolution characteristic layer;
s212, adding the first convolution characteristic layer and the second convolution characteristic layer to the mth layer characteristic layer to generate a third intermediate characteristic layer;
and S213, performing convolution on the third intermediate characteristic layer to generate an m-th layer characteristic layer in the second characteristic.
When m is greater than 1 and less than M, once the first intermediate feature layer and the second intermediate feature layer with the same resolution as the m-th feature layer have been obtained, the three layers can be added directly because their resolutions match; that is, the first intermediate feature layer, the second intermediate feature layer and the m-th feature layer can be added directly for feature fusion, generating a third intermediate feature layer. In a possible implementation, before the addition, the first intermediate feature layer and the second intermediate feature layer are first convolved to increase or decrease their dimensionality for convenient computation. Specifically, the first convolution feature layer and the second convolution feature layer may be obtained by convolving the first intermediate feature layer and the second intermediate feature layer respectively with a convolution kernel of size 1 × 1, and the first convolution feature layer and the second convolution feature layer are then added to the m-th feature layer to generate the third intermediate feature layer.
After the third intermediate feature layer is obtained, it may be used directly as the m-th feature layer in the second feature; in a possible implementation, the third intermediate feature layer may be further convolved to extract further features, for example with a convolution kernel of size 3 × 3, to generate the m-th feature layer in the second feature.
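Under the assumptions above (1 × 1 convolutions to produce the first and second convolution feature layers, a 3 × 3 convolution on the third intermediate feature layer), the fusion for the case 1 < m < M described in S211-S213 could be sketched as follows; the module and parameter names are illustrative only.

```python
# Assumed sketch of S211-S213 for 1 < m < M.
import torch.nn as nn

class MiddleLayerFusion(nn.Module):
    def __init__(self, c_first, c_second, c_m):
        super().__init__()
        self.conv_first = nn.Conv2d(c_first, c_m, kernel_size=1)       # S211
        self.conv_second = nn.Conv2d(c_second, c_m, kernel_size=1)     # S211
        self.conv_out = nn.Conv2d(c_m, c_m, kernel_size=3, padding=1)  # S213

    def forward(self, first_intermediate, second_intermediate, f_m):
        a = self.conv_first(first_intermediate)     # first convolution feature layer
        b = self.conv_second(second_intermediate)   # second convolution feature layer
        third = a + b + f_m                         # S212: element-wise addition
        return self.conv_out(third)                 # m-th layer of the second feature
```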
The performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer to obtain the 1st feature layer in the second feature includes:
S221, convolving the first intermediate feature layer to generate a first convolution feature layer;
S222, adding the first convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and S223, convolving the third intermediate feature layer to generate the 1st feature layer in the second feature.
When m is 1, once the first intermediate feature layer with the same resolution as the m-th feature layer has been obtained, it can be added directly to the m-th feature layer; that is, the first intermediate feature layer and the m-th feature layer can be added directly for feature fusion, generating a third intermediate feature layer. In a possible implementation, before this addition, the first intermediate feature layer is first convolved to increase or decrease its dimensionality for convenient computation. Specifically, the first intermediate feature layer may be convolved with a convolution kernel of size 1 × 1 to obtain a first convolution feature layer, and the first convolution feature layer is then added to the m-th feature layer to generate the third intermediate feature layer.
After the third intermediate feature layer is obtained, it may be used directly as the 1st feature layer in the second feature; in a possible implementation, the third intermediate feature layer may be further convolved to extract further features, for example with a convolution kernel of size 3 × 3, to generate the 1st feature layer in the second feature.
The performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature includes:
S231, convolving the second intermediate feature layer to generate a second convolution feature layer;
S232, adding the second convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and S233, convolving the third intermediate feature layer to generate the M-th feature layer in the second feature.
When m is equal to M, once the second intermediate feature layer with the same resolution as the m-th feature layer has been obtained, it can be added directly to the m-th feature layer; that is, the second intermediate feature layer and the m-th feature layer can be added directly for feature fusion, generating a third intermediate feature layer. In a possible implementation, before this addition, the second intermediate feature layer is first convolved to increase or decrease its dimensionality for convenient computation. Specifically, the second intermediate feature layer is convolved with a convolution kernel of size 1 × 1 to obtain a second convolution feature layer, and the second convolution feature layer is then added to the m-th feature layer to generate the third intermediate feature layer.
After the third intermediate feature layer is obtained, it may be used directly as the M-th feature layer in the second feature; in a possible implementation, the third intermediate feature layer may be further convolved to extract further features, for example with a convolution kernel of size 3 × 3, to generate the M-th feature layer in the second feature.
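Tying the three cases together (m equal to 1, 1 < m < M, and m equal to M), a sketch of how the full second feature could be assembled from the M feature layers is given below; it reuses the intermediate_layers function and the per-layer convolution modules from the earlier sketches, and fuse_modules is an assumed list of such modules, one per layer.

```python
# Assumed sketch: build the second feature from the M feature layers.
def build_second_feature(f, fuse_modules):
    """f: list of M feature layers (f[0] = 1st layer); fuse_modules[m-1] holds
    the assumed 1x1 convolutions and the 3x3 output convolution for layer m."""
    M = len(f)
    second = []
    for m in range(1, M + 1):
        first_i, second_i = intermediate_layers(f, m)   # from the earlier sketch
        mods = fuse_modules[m - 1]
        third = f[m - 1]
        if first_i is not None:                 # m < M: up-sampled (m+1)-th layer
            third = third + mods.conv_first(first_i)
        if second_i is not None:                # m > 1: down-sampled 1st layer
            third = third + mods.conv_second(second_i)
        second.append(mods.conv_out(third))     # m-th layer of the second feature
    return second
```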
After the first features respectively corresponding to each target image in the target image pair are all processed to obtain the second features respectively corresponding to each target image, the optical flow estimation method further includes:
s300, inputting second characteristics corresponding to each target image in the target image pair to a preset optical flow estimation network, and acquiring an optical flow output by the preset optical flow estimation network.
According to the foregoing description, the preset optical flow estimation network may be an existing optical flow estimation network based on a feature pyramid network structure, such as PWC-Net, FlowNet2.0, LiteFlowNet and the like. Since M is equal to the number of input feature layers set in the preset optical flow estimation network, the second feature may be input directly into the preset optical flow estimation network, and the optical flow of the target image pair is output by the preset optical flow estimation network.
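For illustration only, the overall flow of S100-S300 under the sketches above could look as follows; pyramid, fuse_modules and preset_flow_net are assumed objects (the actual interface of a concrete network such as PWC-Net will differ), and image1 and image2 are the two target images as tensors.

```python
# Illustrative end-to-end usage under the assumptions of the earlier sketches.
feats1 = pyramid(image1)[1:]   # S100/S200: keep the M = N-1 layers after the 1st
feats2 = pyramid(image2)[1:]
second1 = build_second_feature(feats1, fuse_modules)   # second feature, image 1
second2 = build_second_feature(feats2, fuse_modules)   # second feature, image 2
flow = preset_flow_net(second1, second2)   # S300: hypothetical callable for the preset network
```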
In order to verify the effect of the optical flow estimation method provided by the invention, an experimental verification was performed: optical flow estimation was carried out on the same images with the optical flow estimation method provided by the invention (using PWC-Net as the preset optical flow estimation network) and with the existing PWC-Net optical flow estimation network. The experimental samples were 300 images selected from the Flying trains test set, and the results are shown in fig. 4-5. Fig. 4 is a comparison of the optical flow EPE values obtained with the optical flow estimation method provided by the invention and with the existing optical flow estimation network; EPE is an evaluation index of optical flow, and a smaller EPE indicates a more accurate optical flow. Fig. 5 is a comparison of optical flow visualizations for part of the experimental samples: the first row in fig. 5 is the visualization of the real optical flow, the second row is the visualization of the optical flow estimated by the existing optical flow estimation network, and the third row is the visualization of the optical flow estimated by the optical flow estimation method provided by the invention. As can be seen from fig. 4, the EPE value of the optical flow estimated by the optical flow estimation method provided by the invention is significantly lower than that of the existing method, i.e., more accurate; and as can be seen from fig. 5, the colors of the visualization of the optical flow estimated by the optical flow estimation method provided by the invention are closer to those of the visualization of the real optical flow, i.e., closer to the true values.
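For reference, the EPE (end-point error) index used in the comparison of fig. 4 is the average Euclidean distance between the estimated and the real optical flow; a minimal sketch of its computation is given below (the tensor shapes are an assumption).

```python
# Assumed sketch of the EPE metric for flow tensors of shape (B, 2, H, W).
import torch

def epe(flow_pred, flow_gt):
    # lower EPE means a more accurate estimated optical flow
    return torch.norm(flow_pred - flow_gt, p=2, dim=1).mean()
```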
In summary, this embodiment provides an optical flow estimation method in which, before the image of the optical flow to be estimated is input into the optical flow estimation network for optical flow estimation, each feature layer to be input into the network is processed using the other feature layers, so that the features of each layer input into the optical flow estimation network are fused with the features of other layers; the optical flow estimation network can thus make effective use of richer features for optical flow estimation, which improves the accuracy of optical flow estimation.
It should be understood that, although the steps in the flowcharts shown in the figures of the present specification are displayed in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, these steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in the flowchart may include multiple sub-steps or multiple stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
Example two
Based on the above embodiments, the present invention further provides a terminal, and a schematic block diagram thereof may be as shown in fig. 6. The terminal comprises a processor, a memory, a network interface, a display screen and a temperature sensor which are connected through a system bus. Wherein the processor of the terminal is configured to provide computing and control capabilities. The memory of the terminal comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the terminal is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement an optical flow estimation method. The display screen of the terminal can be a liquid crystal display screen or an electronic ink display screen, and the temperature sensor of the terminal is arranged in the terminal in advance and used for detecting the current operating temperature of internal equipment.
It will be appreciated by those skilled in the art that the block diagram of fig. 6 is only a block diagram of a portion of the structure associated with the inventive arrangements and does not constitute a limitation of the terminal to which the inventive arrangements are applied, and that a particular terminal may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a terminal is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements at least the following steps:
acquiring a target image pair of an optical flow to be estimated, performing feature extraction on each target image in the target image pair, and acquiring first features corresponding to each target image, wherein the first features comprise N layers of feature layers, and N is a positive integer greater than or equal to 2;
acquiring M feature layers other than the first feature layer among the N feature layers, processing each of the M feature layers according to the other feature layers among the M feature layers, and acquiring a second feature corresponding to the target image, wherein the second feature comprises M feature layers, M is N-1, and M is a positive integer;
and inputting second characteristics corresponding to each target image in the target image pair into a preset optical flow estimation network, and acquiring the optical flow output by the preset optical flow estimation network.
The feature extraction of each target image in the target image pair, and the obtaining of the first feature corresponding to each target image respectively, includes:
extracting a first layer of feature layer in the N layers of feature layers according to the target image;
and performing convolution pooling on the i-th feature layer according to a preset resolution scaling to obtain the (i+1)-th feature layer, where i takes the values 1, 2, …, N-1 in sequence, so as to obtain the N feature layers of the target image.
The resolution of each of the M feature layers is sequentially reduced from the 1st layer to the M-th layer; processing each of the M feature layers according to the other feature layers among the M feature layers and obtaining the second feature corresponding to the target image includes:
for the m-th feature layer of the M feature layers, processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers to obtain the second feature corresponding to the target image, wherein m is a positive integer, and m is greater than or equal to 1 and less than or equal to M.
Wherein the processing of the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer includes:
performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and/or the second intermediate feature layer to obtain the m-th feature layer in the second feature;
the first intermediate feature layer is a feature layer obtained by up-sampling the (m+1)-th feature layer to the resolution of the m-th feature layer, and the second intermediate feature layer is a feature layer obtained by down-sampling the 1st feature layer of the M feature layers to the resolution of the m-th feature layer.
Performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and/or the second intermediate feature layer, and acquiring the m-th feature layer in the second feature includes:
when m is greater than 1 and less than M, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the m-th feature layer in the second feature;
when m is equal to 1, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer to obtain the 1st feature layer in the second feature;
and when m is equal to M, performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature.
Performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and the second intermediate feature layer, and acquiring the m-th feature layer in the second feature includes:
convolving the first intermediate feature layer and the second intermediate feature layer respectively to generate a first convolution feature layer and a second convolution feature layer;
adding the first convolution feature layer and the second convolution feature layer to the m-th feature layer to generate a third intermediate feature layer;
and convolving the third intermediate feature layer to generate the m-th feature layer in the second feature.
Wherein the performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer to obtain the 1st feature layer in the second feature includes:
performing convolution on the first intermediate feature layer to generate a first convolution feature layer;
adding the first convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the 1st feature layer in the second feature.
Wherein the performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature includes:
convolving the second intermediate feature layer to generate a second convolution feature layer;
adding the second convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the M-th feature layer in the second feature.
EXAMPLE III
The present invention also provides a storage medium storing one or more programs executable by one or more processors to implement the steps of the optical flow estimation method described in the above embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An optical flow estimation method, comprising:
acquiring a target image pair of an optical flow to be estimated, performing feature extraction on each target image in the target image pair, and acquiring first features corresponding to each target image, wherein the first features comprise N layers of feature layers, and N is a positive integer greater than or equal to 2;
acquiring M feature layers other than the first feature layer among the N feature layers, processing each of the M feature layers according to the other feature layers among the M feature layers, and acquiring a second feature corresponding to the target image, wherein the second feature comprises M feature layers, M is N-1, and M is a positive integer;
and inputting second characteristics corresponding to each target image in the target image pair into a preset optical flow estimation network, and acquiring the optical flow output by the preset optical flow estimation network.
2. The optical flow estimation method according to claim 1, wherein the extracting features of each target image in the target image pair, and acquiring first features corresponding to each target image comprises:
extracting a first layer of feature layer in the N layers of feature layers according to the target image;
and performing convolution pooling on the i-th feature layer according to a preset resolution scaling to obtain the (i+1)-th feature layer, where i takes the values 1, 2, …, N-1 in sequence, so as to obtain the N feature layers of the target image.
3. The optical flow estimation method according to claim 1, wherein the resolution of each of the M feature layers is sequentially reduced from layer 1 to layer M; the processing each of the M feature layers according to the other feature layers of the M feature layers, and obtaining the second feature corresponding to the target image includes:
and for the m-th feature layer of the M feature layers, processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers to obtain the second feature corresponding to the target image, wherein m is a positive integer, and m is greater than or equal to 1 and less than or equal to M.
4. The optical flow estimation method according to claim 3, wherein said processing the m-th feature layer according to the 1st feature layer and/or the (m+1)-th feature layer of the M feature layers comprises:
performing feature fusion processing on the m-th feature layer according to a first intermediate feature layer and/or a second intermediate feature layer to obtain the m-th feature layer in the second feature;
the first intermediate feature layer is a feature layer obtained by up-sampling the (m+1)-th feature layer to the resolution of the m-th feature layer, and the second intermediate feature layer is a feature layer obtained by down-sampling the 1st feature layer of the M feature layers to the resolution of the m-th feature layer.
5. The optical flow estimation method according to claim 4, wherein the performing feature fusion processing on the mth feature layer according to a first intermediate feature layer and/or a second intermediate feature layer to obtain the mth feature layer in the second features comprises:
when m is greater than 1 and less than M, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the m-th feature layer in the second feature;
when m is equal to 1, performing feature fusion processing on the m-th feature layer according to the first intermediate feature layer to obtain the 1st feature layer in the second feature;
and when m is equal to M, performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature.
6. The optical flow estimation method according to claim 5, wherein the performing feature fusion processing on the mth layer feature layer according to the first intermediate feature layer and the second intermediate feature layer to obtain the mth layer feature layer in the second features comprises:
convolving the first intermediate feature layer and the second intermediate feature layer respectively to generate a first convolution feature layer and a second convolution feature layer;
adding the first convolution feature layer and the second convolution feature layer to the m-th feature layer to generate a third intermediate feature layer;
and convolving the third intermediate feature layer to generate the m-th feature layer in the second feature.
7. The optical flow estimation method according to claim 5, wherein the performing feature fusion processing on the mth feature layer according to the first intermediate feature layer to obtain the 1 st feature layer in the second features comprises:
performing convolution on the first intermediate feature layer to generate a first convolution feature layer;
adding the first convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the 1st feature layer in the second feature.
8. The optical flow estimation method according to claim 5, wherein the performing feature fusion processing on the m-th feature layer according to the second intermediate feature layer to obtain the M-th feature layer in the second feature comprises:
convolving the second intermediate feature layer to generate a second convolution feature layer;
adding the second convolution feature layer and the m-th feature layer to generate a third intermediate feature layer;
and performing convolution on the third intermediate feature layer to generate the M-th feature layer in the second feature.
9. A terminal, characterized in that the terminal comprises: a processor, a storage medium communicatively connected to the processor, the storage medium adapted to store a plurality of instructions, the processor adapted to invoke the instructions in the storage medium to perform the steps of implementing the optical flow estimation method of any of the above claims 1-8.
10. A storage medium characterized in that the storage medium stores one or more programs executable by one or more processors to implement the steps of the optical flow estimation method according to any one of claims 1 to 8.
CN202010396695.5A 2020-05-12 2020-05-12 Optical flow estimation method, terminal and storage medium Active CN113658230B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010396695.5A CN113658230B (en) 2020-05-12 2020-05-12 Optical flow estimation method, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010396695.5A CN113658230B (en) 2020-05-12 2020-05-12 Optical flow estimation method, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN113658230A true CN113658230A (en) 2021-11-16
CN113658230B CN113658230B (en) 2024-05-28

Family

ID=78488670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010396695.5A Active CN113658230B (en) 2020-05-12 2020-05-12 Optical flow estimation method, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113658230B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018543A1 (en) * 2014-10-10 2018-01-18 The Penn State Research Foundation Identifying visual storm signatures from satellite images
CN107155110A (en) * 2017-06-14 2017-09-12 福建帝视信息科技有限公司 A kind of picture compression method based on super-resolution technique
US20200118245A1 (en) * 2017-06-30 2020-04-16 SZ DJI Technology Co., Ltd. Optical flow tracking device and method
CN107679462A (en) * 2017-09-13 2018-02-09 哈尔滨工业大学深圳研究生院 A kind of depth multiple features fusion sorting technique based on small echo
CN108764063A (en) * 2018-05-07 2018-11-06 华中科技大学 A kind of pyramidal remote sensing image time critical target identifying system of feature based and method
CN109816671A (en) * 2019-01-31 2019-05-28 深兰科技(上海)有限公司 A kind of object detection method, device and storage medium
CN109903315A (en) * 2019-03-08 2019-06-18 腾讯科技(深圳)有限公司 Method, apparatus, equipment and readable storage medium storing program for executing for light stream prediction
CN110176023A (en) * 2019-04-29 2019-08-27 同济大学 A kind of light stream estimation method based on pyramid structure
CN110175613A (en) * 2019-06-03 2019-08-27 常熟理工学院 Street view image semantic segmentation method based on Analysis On Multi-scale Features and codec models
CN110324664A (en) * 2019-07-11 2019-10-11 南开大学 A kind of video neural network based mends the training method of frame method and its model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
包晓安 et al.: "Face anti-spoofing algorithm using color texture features", Computer Science (计算机科学), vol. 46, no. 10
罗会兰; 张云: "Semantic segmentation combining contextual features with multi-layer CNN feature fusion", Journal of Image and Graphics (中国图象图形学报), no. 12
顾婷婷; 赵海涛; 孙韶媛: "Depth estimation of infrared images based on pyramid residual neural networks", Infrared Technology (红外技术), no. 05

Also Published As

Publication number Publication date
CN113658230B (en) 2024-05-28

Similar Documents

Publication Publication Date Title
JP6902611B2 (en) Object detection methods, neural network training methods, equipment and electronics
CN112200722A (en) Generation method and reconstruction method of image super-resolution reconstruction model and electronic equipment
CN111080628A (en) Image tampering detection method and device, computer equipment and storage medium
CN113159143B (en) Infrared and visible light image fusion method and device based on jump connection convolution layer
CN110544214A (en) Image restoration method and device and electronic equipment
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
CN112001399B (en) Image scene classification method and device based on local feature saliency
CN112597918A (en) Text detection method and device, electronic equipment and storage medium
CN113421276A (en) Image processing method, device and storage medium
CN113674191A (en) Weak light image enhancement method and device based on conditional countermeasure network
CN111914654A (en) Text layout analysis method, device, equipment and medium
CN112232397A (en) Knowledge distillation method and device of image classification model and computer equipment
CN115239642A (en) Detection method, detection device and equipment for hardware defects in power transmission line
CN111709415A (en) Target detection method, target detection device, computer equipment and storage medium
CN111461211A (en) Feature extraction method for lightweight target detection and corresponding detection method
CN113658230B (en) Optical flow estimation method, terminal and storage medium
CN115620017A (en) Image feature extraction method, device, equipment and storage medium
CN114064972A (en) Video type determination method and related device
CN113256662A (en) Pathological section image segmentation method and device, computer equipment and storage medium
CN111914779A (en) Table text detection method and device, computer equipment and storage medium
CN116659520B (en) Matching positioning method, device and equipment based on bionic polarization vision enhancement
CN113239878B (en) Image classification method, device, equipment and medium
CN111815631B (en) Model generation method, device, equipment and readable storage medium
CN116630631B (en) Image segmentation method and device, electronic equipment and storage medium
CN113283453B (en) Target detection method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant