CN111402302B - Optical flow generating device and method - Google Patents

Optical flow generating device and method

Info

Publication number
CN111402302B
Authority
CN
China
Prior art keywords
optical flow
generator
discriminator
neural network
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010352547.3A
Other languages
Chinese (zh)
Other versions
CN111402302A (en)
Inventor
康燕斌
张志齐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yitu Technology Co ltd
Original Assignee
Shanghai Yitu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yitu Technology Co ltd filed Critical Shanghai Yitu Technology Co ltd
Priority to CN202010352547.3A priority Critical patent/CN111402302B/en
Publication of CN111402302A publication Critical patent/CN111402302A/en
Application granted granted Critical
Publication of CN111402302B publication Critical patent/CN111402302B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an optical flow generating device comprising a generative adversarial network. The generative adversarial network comprises a first generator and a discriminator. The first generator comprises a first neural network; it takes two frames of images as input and outputs a corresponding first predicted optical flow. The output end of the first generator is connected with the input end of the discriminator. The optical flow input to the discriminator is either a sample optical flow corresponding to the training set or the first predicted optical flow. The weight parameters of the first neural network of the first generator are obtained through training; when the discriminator cannot distinguish the first predicted optical flow from the sample optical flow, the training of the first generator is complete. The invention also discloses an optical flow generation method. The invention achieves better optical flow quality metrics and faster training.

Description

Optical flow generating device and method
Technical Field
The invention relates to the technical field of computer vision, in particular to an optical flow generating device. The invention also relates to an optical flow generation method.
Background
When a moving object is filmed, or the camera itself moves relative to the external environment, a sequence of continuous frame images is formed. The actual position of a moving object changes over time, so its position in the time-ordered frame images also changes continuously; optical flow refers to this change in the image position of the moving object across different frames. Equivalently, in computer vision, optical flow describes the motion of objects in an image, which may be caused by camera motion or object motion. Specifically, it is the two-dimensional vector giving the displacement of a pixel representing the same object from one video frame to the next. An optical flow generating device can predict the future position of a moving object from the optical flow computed over the frame images. Optical flow computation enables motion capture, motion prediction, object tracking, action recognition, and the like.
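As an illustrative sketch (not part of the patent), the per-pixel two-dimensional displacement described above can be stored as an H x W x 2 array, and a future position extrapolated by adding the flow vector:

```python
import numpy as np

# An optical flow field stored as an H x W x 2 array holding, per pixel, the
# (dx, dy) displacement from frame k to frame k+1. A future position is
# extrapolated by adding the flow vector to the current position.
flow = np.zeros((3, 3, 2))
flow[1, 1] = (2.0, -1.0)    # the object at pixel (x=1, y=1) moved by (dx=2, dy=-1)

x, y = 1, 1
next_x = x + flow[y, x, 0]  # predicted x coordinate in frame k+1
next_y = y + flow[y, x, 1]  # predicted y coordinate in frame k+1
```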
Existing optical flow generation methods include direct computation of the optical flow, chiefly with the Lucas-Kanade (L-K) method developed by Bruce D. Lucas and Takeo Kanade. It assumes the optical flow is constant within a pixel's neighborhood and then solves the basic optical flow equation for all pixels in that neighborhood by least squares. Direct computation of optical flow is computationally expensive and slow.
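A minimal numerical sketch of the Lucas-Kanade idea described above, using plain NumPy; the window size and the synthetic test images are hypothetical choices for illustration:

```python
import numpy as np

def lucas_kanade_point(I1, I2, x, y, win=7):
    """Estimate the optical flow (u, v) at pixel (x, y) with the classic
    Lucas-Kanade method: assume the flow is constant in a win x win
    neighborhood and solve the brightness-constancy equation
    Ix*u + Iy*v = -It over all neighborhood pixels by least squares."""
    half = win // 2
    Iy, Ix = np.gradient(I1.astype(np.float64))          # spatial gradients
    It = I2.astype(np.float64) - I1.astype(np.float64)   # temporal difference
    sl = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic check: a Gaussian blob translated by exactly (+1, 0) pixels.
ys, xs = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 50.0)
I1, I2 = blob(32.0, 32.0), blob(33.0, 32.0)
u, v = lucas_kanade_point(I1, I2, x=36, y=32)  # a point on the blob's flank
```

The recovered flow is approximately (1, 0), up to the linearization error of the one-pixel shift.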
With the development of deep learning and its wide application in artificial intelligence fields such as computer vision, convolutional neural networks (CNNs) have also been adopted for optical flow estimation. The initial CNN architecture for computing optical flow was FlowNet; its improved version is FlowNet2.0, which is commonly used in the prior art to calculate optical flow.
FlowNet generally comprises a dimension-reducing encoding (encoder) module and a dimension-restoring decoding (decoder) module. The encoder comes in two variants, FlowNetS and FlowNetC (FlowNetCorr); FlowNetC performs better because it adds an explicit correlation operation. The decoder module is the same for both variants.
The network structure of FlowNet2.0 is a stack of FlowNetC + FlowNetS + FlowNetS. FlowNet2.0 computes quickly and meets real-time requirements.
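The stacking idea can be sketched conceptually as follows; the stage functions are toy stand-ins, not the real FlowNet2.0 sub-networks:

```python
# Conceptual sketch of FlowNet2.0-style stacking: each stage receives the
# frame pair plus the previous stage's flow estimate and outputs a refined
# flow. (The real FlowNet2.0 stages also warp the second frame by the
# current flow estimate before refining; that step is omitted here.)
def stacked_flow(stages, frames):
    flow = None
    for stage in stages:
        flow = stage(frames, flow)
    return flow

coarse = lambda frames, flow: 1.0          # stands in for FlowNetC
refine = lambda frames, flow: flow + 0.25  # stands in for a FlowNetS stage
estimate = stacked_flow([coarse, refine, refine], frames=None)
```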
Disclosure of Invention
The technical problem the invention aims to solve is to provide an optical flow generating device with better optical flow quality metrics and faster training speed. To this end, the invention also provides an optical flow generation method.
In order to solve the above technical problems, the optical flow generating device provided by the present invention includes a generative adversarial network.
The generative adversarial network includes a first generator and a discriminator.
The first generator comprises a first neural network; it takes two frames of images as input and outputs a corresponding first predicted optical flow.
The output end of the first generator is connected with the input end of the discriminator.
The optical flow input to the discriminator is either a sample optical flow corresponding to the training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator judges the authenticity of the first predicted optical flow. When the input is the first predicted optical flow and the discriminator's output judges it to be the sample optical flow, the discriminator considers the first predicted optical flow equivalent to the sample optical flow. That is, when an optical flow is input, the discriminator does not know at first whether it is the sample optical flow or the first predicted optical flow; only after the discrimination process does it decide which it is. Typically, when the input is the sample optical flow, the discriminator judges it to be the sample optical flow; this is achieved by training the discriminator separately on sample optical flows. When the input is the first predicted optical flow, if it deviates substantially from the corresponding sample (real) optical flow, the discriminator can single it out and judge the two to be different; however, when the first predicted optical flow and the sample optical flow are very close and their difference falls below the discriminator's recognition limit, the discriminator treats the first predicted optical flow as the sample optical flow.
The weight parameters of the first neural network of the first generator are obtained through training; when the discriminator considers the first predicted optical flow equivalent to the sample optical flow, the training of the first generator is complete. That is, when the discriminator considers them equivalent, the difference between the first predicted optical flow and the sample optical flow is small, and the first generator can be put to practical use; in other words, training is complete.
In the training stage, the two frames of images input to the first generator and the two frames of images input to the discriminator are the same and correspond to samples in the training set; they are two consecutive frames in temporal order.
The optical flow is the displacement, between the two frames, of an object captured at the two different times.
In a further improvement, the generative adversarial network further comprises a first loss function module for training the first generator. When the first generator is trained, the first loss function module takes the first predicted optical flow and the corresponding sample optical flow as input, and its output end outputs a first loss function formed from them; the first generator trains the weight parameters of the first neural network according to the first loss function.
A further improvement is that, when training the first generator, the input of the discriminator inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow.
The output of the discriminator outputs a determination of the authenticity of the input optical flow.
A further improvement is that the generative adversarial network comprises a second generator; the second generator comprises a third neural network, takes two frames of images as input, and outputs a corresponding second predicted optical flow.
The third neural network is a student network with a smaller scale formed by performing knowledge distillation on the first neural network.
A further improvement is that the third neural network is a neural network formed by scaling down each layer of channels of the first neural network.
A further improvement is that the generative adversarial network further comprises a third loss function module for training the second generator. When the second generator is trained, the weight parameters of the first neural network and of the second neural network are fixed; the third loss function module takes the first predicted optical flow and the second predicted optical flow as input, and its output end outputs a third loss function formed from them; the second generator trains the weight parameters of the third neural network according to the third loss function.
A further improvement is that, when training the second generator, the input of the discriminator receives the two frames of images, the first predicted optical flow, and the second predicted optical flow.
The output of the discriminator outputs a determination of the authenticity of the input optical flow.
A further improvement is that the weight parameters of the second neural network of the discriminator are obtained by training, and the generative adversarial network further comprises a second loss function module for training the discriminator.
In a further improvement, the network structures of the first neural network and the third neural network are FlowNet2.0.
A further improvement is that the network structure of the second neural network includes ResNet-50, FlowNet2.0, MobileNet, DenseNet, and Inception.
In order to solve the above technical problems, the optical flow generation method provided by the present invention uses a generative adversarial network to generate optical flow.
The generative adversarial network includes a first generator and a discriminator.
The first generator comprises a first neural network; it takes two frames of images as input and outputs a corresponding first predicted optical flow.
The output end of the first generator is connected with the input end of the discriminator.
The optical flow input to the discriminator is either a sample optical flow corresponding to the training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator judges the authenticity of the first predicted optical flow. When the input is the first predicted optical flow and the discriminator's output judges it to be the sample optical flow, the discriminator considers the first predicted optical flow equivalent to the sample optical flow.
The weight parameters of the first neural network of the first generator are obtained through training; when the discriminator considers the first predicted optical flow equivalent to the sample optical flow, the training of the first generator is complete.
In the training stage, the two frames of images input to the first generator and the two frames of images input to the discriminator are the same and correspond to samples in the training set; they are two consecutive frames in temporal order.
The optical flow is the displacement, between the two frames, of an object captured at the two different times.
A further improvement is that: the generating countermeasure network further comprises a first loss function module for training the first generator, when the first generator is trained, the first loss function module inputs the first predicted optical flow and the corresponding sample optical flow, the first loss function module output end outputs a first loss function formed by the first predicted optical flow and the corresponding sample optical flow, and the first generator trains weight parameters of the first neural network according to the first loss function.
A further improvement is that, when training the first generator, the input of the discriminator inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow.
The output of the discriminator outputs a determination of the authenticity of the input optical flow.
A further improvement is that the generative adversarial network comprises a second generator; the second generator comprises a third neural network, takes two frames of images as input, and outputs a corresponding second predicted optical flow.
The third neural network is a student network with a smaller scale formed by performing knowledge distillation on the first neural network.
A further improvement is that the third neural network is a neural network formed by scaling down each layer of channels of the first neural network.
A further improvement is that the generative adversarial network further comprises a third loss function module for training the second generator. When the second generator is trained, the weight parameters of the first neural network and of the second neural network are fixed; the third loss function module takes the first predicted optical flow and the second predicted optical flow as input, and its output end outputs a third loss function formed from them; the second generator trains the weight parameters of the third neural network according to the third loss function.
A further improvement is that, when training the second generator, the input of the discriminator receives the two frames of images, the first predicted optical flow, and the second predicted optical flow.
The output of the discriminator outputs a determination of the authenticity of the input optical flow.
A further improvement is that the weight parameters of the second neural network of the discriminator are obtained by training, and the generative adversarial network further comprises a second loss function module for training the discriminator.
In a further improvement, the network structures of the first neural network and the third neural network are FlowNet2.0.
A further improvement is that the network structure of the second neural network includes ResNet-50, FlowNet2.0, MobileNet, DenseNet, and Inception.
The invention adopts a device that generates optical flow through a generative adversarial network; because the network includes a discriminator that can discriminate the first predicted optical flow generated by the first generator, the quality of the generated optical flow is improved.
The first neural network of the first generator of the invention can adopt FlowNet2.0, thereby optimizing the network structure of the first generator.
The invention can also carry out knowledge distillation on the basis of the first generator to form the second generator, and the network scale of the second generator can be halved, thereby improving the training speed.
Drawings
The invention is described in further detail below with reference to the attached drawings and detailed description:
FIG. 1 is a schematic diagram of an optical flow generating device according to a first embodiment of the present invention;
fig. 2 is a schematic structural view of an optical flow generating device according to a second embodiment of the present invention.
Detailed Description
Optical flow generating device according to first embodiment of the present invention:
Referring to FIG. 1, a schematic diagram of the optical flow generating device according to the first embodiment of the present invention, FIG. 1 mainly shows the structure of the device when training the first generator 1. The optical flow generating device according to the first embodiment of the present invention includes a generative adversarial network.
The generative adversarial network comprises a first generator 1 and a discriminator 2.
The first generator 1 includes a first neural network, and the first generator 1 inputs two frames of images and outputs a corresponding first predicted optical flow.
The output end of the first generator 1 is connected with the input end of the discriminator 2.
The discriminator 2 comprises a second neural network; the discriminator 2 takes two frames of images and an optical flow as input, and the optical flow input to the discriminator 2 is either a sample optical flow corresponding to the training set or the first predicted optical flow. The sample optical flow is a real optical flow, and the discriminator 2 judges the authenticity of the first predicted optical flow. When the input is the first predicted optical flow and the output of the discriminator 2 judges it to be the sample optical flow, the discriminator 2 considers the first predicted optical flow equivalent to the sample optical flow. That is, when an optical flow is input, the discriminator 2 does not know at first whether it is the sample optical flow or the first predicted optical flow; only after the discrimination process does it decide which it is. Typically, when the input is the sample optical flow, the discriminator 2 judges it to be the sample optical flow; this is achieved by training the discriminator separately on sample optical flows.
When the input is the first predicted optical flow, if it deviates substantially from the corresponding sample (real) optical flow, the discriminator 2 can single it out and judge the two to be different; however, when the first predicted optical flow and the sample optical flow are very close and their difference falls below the recognition limit of the discriminator 2, the discriminator 2 treats the first predicted optical flow as the sample optical flow.
The weight parameters of the first neural network of the first generator 1 are obtained through training; when the discriminator 2 considers the first predicted optical flow equivalent to the sample optical flow, the training of the first generator 1 is complete. That is, when the discriminator 2 considers them equivalent, the difference between the first predicted optical flow and the sample optical flow is small, and the first generator 1 can be put to practical use; in other words, training is complete.
In the training stage, the two frames of images input to the first generator 1, namely the k-th frame and the (k+1)-th frame in FIG. 1, and the two frames input to the discriminator 2 are identical and correspond to samples in the training set; they are two consecutive frames in temporal order.
The optical flow is the displacement, between the two frames, of an object captured at the two different times.
The generative adversarial network further comprises a first loss function module 3 for training the first generator 1. When the first generator 1 is trained, the first loss function module 3 takes the first predicted optical flow and the corresponding sample optical flow as input, and its output end outputs a first loss function formed from them; the first generator 1 trains the weight parameters of the first neural network according to the first loss function.
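The patent does not specify the form of the first loss function; a common supervised choice for flow networks is the average endpoint error (EPE) between the first predicted optical flow and the sample optical flow, sketched here under that assumption:

```python
import numpy as np

def first_loss(pred_flow, sample_flow):
    """Average endpoint error (EPE) between the first predicted optical flow
    and the corresponding sample optical flow. The patent does not state the
    loss's exact form; EPE is the standard supervised loss for flow networks
    and is used here as an assumption. Flows are H x W x 2 arrays."""
    return np.linalg.norm(pred_flow - sample_flow, axis=-1).mean()

pred = np.zeros((4, 4, 2))      # generator predicts no motion
truth = np.ones((4, 4, 2))      # sample flow: every pixel moved by (1, 1)
loss = first_loss(pred, truth)  # per-pixel error is sqrt(2)
```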
In training the first generator 1, the input of the discriminator 2 inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow. As shown in FIG. 1, the output of the discriminator 2 is used to determine whether an input optical flow was produced by a generator; when it cannot tell the difference between the generator's optical flow and the sample optical flow, the corresponding first predicted optical flow of the first generator 1 is of good quality.
The weight parameters of the second neural network of the discriminator 2 are obtained through training, and the generative adversarial network further comprises a second loss function module for training the discriminator.
In the first embodiment of the present invention, the first generator 1 and the discriminator 2 both need training, and the training is organized along two dimensions: data and loss function. The data fall into two classes: positive example data, where the real optical flow, that is, the sample optical flow, is the optical flow input to the discriminator 2; and negative example data, where the predicted optical flow, that is, the first predicted optical flow, is the optical flow input to the discriminator 2. Because the positive example data are unrelated to prediction, they can only train the discriminator 2; that is, the sample data combined with the second loss function module train the discriminator 2 alone. The negative example data can train the generator, that is, the first generator 1, and the discriminator 2 simultaneously. As for the loss functions, for the discriminator 2 the loss function corresponding to the second loss function module is cross entropy (cross_entropy).
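A numeric sketch of the cross-entropy loss named above; the discriminator scores and the labeling convention are hypothetical:

```python
import numpy as np

def cross_entropy(pred, label):
    """Binary cross-entropy, the loss the text names for the second loss
    function module. `pred` is the discriminator's probability that the
    input optical flow is a sample (real) flow."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    return -(label * np.log(pred) + (1 - label) * np.log(1 - pred)).mean()

# Hypothetical scores: sample flows are labeled 1 (positive examples) and
# first predicted flows are labeled 0 (negative examples).
d_real = cross_entropy(np.array([0.9, 0.8]), np.array([1.0, 1.0]))
d_fake = cross_entropy(np.array([0.2, 0.1]), np.array([0.0, 0.0]))
d_loss = d_real + d_fake  # total discriminator loss over both classes
```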
Preferably, the network structures of the first neural network are all flownet2.0. The network structure of the second neural network is Resnet-50. In other embodiments can also be: the network structure of the second neural network is flownet2.0, mobilet, densenet, and acceptance.
The first embodiment of the present invention adopts a generative adversarial network as the optical flow generating device; since the network includes the discriminator 2, the first predicted optical flow generated by the first generator 1 can be discriminated, and the quality of the generated optical flow can be improved.
The first neural network of the first generator 1 of the first embodiment of the present invention adopts FlowNet2.0, so the network structure of the first generator 1 itself can be optimized.
Optical flow generating device according to second embodiment of the present invention:
FIG. 2 is a schematic structural view of the optical flow generating device according to the second embodiment of the present invention, mainly showing its structure when training the second generator 4. The optical flow generating device according to the second embodiment of the present invention includes a generative adversarial network.
The generative adversarial network comprises a first generator 1 and a discriminator 2.
The first generator 1 includes a first neural network, and the first generator 1 inputs two frames of images and outputs a corresponding first predicted optical flow.
The output end of the first generator 1 is connected with the input end of the discriminator 2.
The discriminator 2 comprises a second neural network; the discriminator 2 takes two frames of images and an optical flow as input, and the optical flow input to the discriminator 2 is either a sample optical flow corresponding to the training set or the first predicted optical flow. The sample optical flow is a real optical flow, and the discriminator 2 judges the authenticity of the first predicted optical flow. When the input is the first predicted optical flow and the output of the discriminator 2 judges it to be the sample optical flow, the discriminator 2 considers the first predicted optical flow equivalent to the sample optical flow.
The weight parameters of the first neural network of the first generator 1 are obtained through training, and when the arbiter 2 considers that the first predicted optical flow is equivalent to the sample optical flow, the training of the first generator 1 is completed.
In the training stage, the two frames of images input to the first generator 1, namely the k-th frame and the (k+1)-th frame in FIG. 2, are identical to the two frames input to the discriminator 2 and correspond to samples in the training set; they are two consecutive frames in temporal order.
The optical flow is the displacement, between the two frames, of an object captured at the two different times.
The generative adversarial network further comprises a first loss function module 3 for training the first generator 1. When the first generator 1 is trained, the first loss function module 3 takes the first predicted optical flow and the corresponding sample optical flow as input, and its output end outputs a first loss function formed from them; the first generator 1 trains the weight parameters of the first neural network according to the first loss function.
In training the first generator 1, the input of the discriminator 2 inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow. As shown in FIG. 2, the output of the discriminator 2 is used to determine whether an input optical flow was produced by a generator; when it cannot tell the difference between the generator's optical flow and the sample optical flow, the corresponding first predicted optical flow of the first generator 1 is of good quality.
Preferably, the network structure of the first neural network is FlowNet2.0 and the network structure of the second neural network is ResNet-50. In other embodiments, the network structure of the second neural network can also be FlowNet2.0, MobileNet, DenseNet, or Inception.
The generative adversarial network comprises a second generator 4; the second generator 4 comprises a third neural network, takes two frames of images as input, and outputs a corresponding second predicted optical flow.
The third neural network is a smaller-scale student network formed by performing knowledge distillation on the first neural network. In the second embodiment of the present invention, the third neural network is formed by scaling down the number of channels in each layer of the first neural network. Preferably, each layer's channels are cut in half; halving the channels roughly doubles the model's computation speed (that is, halves its latency). Different reduction ratios yield proportionally different speedups, so the ratio can be set flexibly.
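A sketch of the channel-scaling step; the per-layer channel widths are hypothetical, not taken from the actual FlowNet2.0 architecture:

```python
def scale_channels(channels, ratio=0.5):
    """Form the student (third) network's per-layer channel counts by scaling
    down the teacher (first) network's counts. The patent's preferred ratio
    is 0.5 (channels cut in half); other ratios trade speed for accuracy."""
    return [max(1, int(c * ratio)) for c in channels]

teacher = [64, 128, 256, 512]      # hypothetical teacher channel widths
student = scale_channels(teacher)  # half-width student network
```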
The generative adversarial network further comprises a third loss function module 5 for training the second generator 4. When training the second generator 4, the weight parameters of the first neural network and the weight parameters of the second neural network are fixed; the third loss function module 5 receives the first predicted optical flow and the second predicted optical flow, the output end of the third loss function module 5 outputs a third loss function formed from the two, and the second generator 4 trains the weight parameters of the third neural network according to the third loss function.
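The patent does not give the third loss function's formula. A minimal sketch, assuming a mean squared error between the teacher's (first generator's) and student's (second generator's) predicted flows:

```python
import numpy as np

def third_loss(teacher_flow, student_flow):
    # Distillation loss between the first generator's (teacher's) predicted
    # flow and the second generator's (student's) predicted flow. The teacher's
    # weights stay fixed; only the student is updated against this loss.
    # Mean squared error is assumed; the patent does not specify the formula.
    return float(np.mean((teacher_flow - student_flow) ** 2))

f = np.ones((3, 3, 2))
print(third_loss(f, f))  # -> 0.0 (student matches teacher exactly)
```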
When training the second generator 4, the input end of the discriminator 2 receives the two frames of images, the first predicted optical flow, and the second predicted optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow.
The weight parameters of the second neural network of the discriminator 2 are obtained through training, and the generative adversarial network further comprises a second loss function module for training the discriminator 2.
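The second loss function is likewise unspecified. A minimal sketch, assuming the standard GAN binary cross-entropy for the discriminator, where `d_real` and `d_fake` are the discriminator's scores for the sample flow and a predicted flow:

```python
import math

def second_loss(d_real, d_fake, eps=1e-7):
    # Discriminator training loss (second loss function). d_real and d_fake
    # are the discriminator's scores in (0, 1) for the sample (real) optical
    # flow and for a generator-predicted (fake) optical flow. Standard GAN
    # binary cross-entropy is assumed; the patent does not give a formula.
    d_real = min(max(d_real, eps), 1 - eps)  # clip away from 0 and 1
    d_fake = min(max(d_fake, eps), 1 - eps)
    return -(math.log(d_real) + math.log(1.0 - d_fake))

# An undecided discriminator (both scores 0.5) has loss 2*ln(2):
print(round(second_loss(0.5, 0.5), 4))  # -> 1.3863
```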
The network structure of the third neural network is FlowNet2.0.
The optical flow generating device according to the second embodiment of the present invention can further perform knowledge distillation based on the first generator 1 to form the second generator 4, and the network size of the second generator 4 can be halved, thereby improving the training speed.
The optical flow generating method of the first embodiment of the invention comprises the following steps:
The optical flow generation method according to the first embodiment of the present invention generates an optical flow by using a generative adversarial network; that is, the optical flow generating device according to the first embodiment of the present invention shown in fig. 1 is used to generate the optical flow.
As shown in fig. 1, the generative adversarial network includes a first generator 1 and a discriminator 2.
The first generator 1 includes a first neural network, and the first generator 1 inputs two frames of images and outputs a corresponding first predicted optical flow.
The output end of the first generator 1 is connected with the input end of the discriminator 2.
The discriminator 2 comprises a second neural network. The discriminator 2 inputs two frames of images and an optical flow, where the optical flow input to the discriminator 2 is either a sample optical flow corresponding to the training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator 2 is used to judge the authenticity of the first predicted optical flow. When the optical flow input to the discriminator 2 is the first predicted optical flow and the output of the discriminator 2 judges it to be a sample optical flow, the discriminator 2 considers the first predicted optical flow to be equivalent to the sample optical flow.
The weight parameters of the first neural network of the first generator 1 are obtained through training; when the discriminator 2 considers the first predicted optical flow to be equivalent to the sample optical flow, the training of the first generator 1 is complete.
In the training stage, the two frames of images input to the first generator 1, that is, the kth frame image and the (k+1)th frame image in fig. 1, are identical to the two frames of images input to the discriminator 2 and correspond to samples in the training set; the two frames are consecutive frames with a temporal order.
The optical flow is the displacement, within the two frames of images, of an object that appears in both frames at different times.
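This displacement definition can be illustrated numerically; the (dx, dy) storage convention below is an assumption, not stated in the patent:

```python
import numpy as np

def displaced_position(y, x, flow):
    # Optical flow assigns each pixel of frame k a (dx, dy) displacement
    # telling where that pixel's content appears in frame k+1.
    # flow: (H, W, 2), with flow[y, x] = (dx, dy) -- an assumed convention.
    dx, dy = flow[y, x]
    return float(y + dy), float(x + dx)

flow = np.zeros((4, 4, 2))
flow[1, 1] = (2, 1)  # the object at (row 1, col 1) moves 2 right, 1 down
print(displaced_position(1, 1, flow))  # -> (2.0, 3.0)
```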
The generative adversarial network further comprises a first loss function module 3 for training the first generator 1. When the first generator 1 is trained, the first loss function module 3 receives the first predicted optical flow and the corresponding sample optical flow, the output end of the first loss function module 3 outputs a first loss function formed from the first predicted optical flow and the corresponding sample optical flow, and the first generator 1 trains the weight parameters of the first neural network according to the first loss function.
When training the first generator 1, the input end of the discriminator 2 receives the two frames of images, the first predicted optical flow, and the corresponding sample optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow. As fig. 1 shows, the output of the discriminator 2 is used to judge which optical flow was generated by a generator; when the discriminator cannot distinguish the optical flow generated by the generator from the sample optical flow, the first predicted optical flow of the first generator 1 is of good quality.
Preferably, the network structure of the first neural network is FlowNet2.0 and the network structure of the second neural network is ResNet-50. In other embodiments, the network structure of the second neural network can also be FlowNet2.0, MobileNet, DenseNet, or Inception.
The optical flow generating method of the second embodiment of the invention comprises the following steps:
The optical flow generation method according to the second embodiment of the present invention generates an optical flow by using a generative adversarial network; that is, the optical flow generating device according to the second embodiment of the present invention shown in fig. 2 is used to generate the optical flow.
As shown in fig. 2, the generative adversarial network includes a first generator 1 and a discriminator 2.
The first generator 1 includes a first neural network, and the first generator 1 inputs two frames of images and outputs a corresponding first predicted optical flow.
The output end of the first generator 1 is connected with the input end of the discriminator 2.
The discriminator 2 comprises a second neural network. The discriminator 2 inputs two frames of images and an optical flow, where the optical flow input to the discriminator 2 is either a sample optical flow corresponding to the training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator 2 is used to judge the authenticity of the first predicted optical flow. When the optical flow input to the discriminator 2 is the first predicted optical flow and the output of the discriminator 2 judges it to be a sample optical flow, the discriminator 2 considers the first predicted optical flow to be equivalent to the sample optical flow.
The weight parameters of the first neural network of the first generator 1 are obtained through training; when the discriminator 2 considers the first predicted optical flow to be equivalent to the sample optical flow, the training of the first generator 1 is complete.
In the training stage, the two frames of images input to the first generator 1, that is, the kth frame image and the (k+1)th frame image in fig. 2, are identical to the two frames of images input to the discriminator 2 and correspond to samples in the training set; the two frames are consecutive frames with a temporal order.
The optical flow is the displacement, within the two frames of images, of an object that appears in both frames at different times.
The generative adversarial network further comprises a first loss function module 3 for training the first generator 1. When the first generator 1 is trained, the first loss function module 3 receives the first predicted optical flow and the corresponding sample optical flow, the output end of the first loss function module 3 outputs a first loss function formed from the first predicted optical flow and the corresponding sample optical flow, and the first generator 1 trains the weight parameters of the first neural network according to the first loss function.
When training the first generator 1, the input end of the discriminator 2 receives the two frames of images, the first predicted optical flow, and the corresponding sample optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow. As fig. 2 shows, the output of the discriminator 2 is used to judge which optical flow was generated by a generator; when the discriminator cannot distinguish the optical flow generated by the generator from the sample optical flow, the first predicted optical flow of the first generator 1 is of good quality.
Preferably, the network structure of the first neural network is FlowNet2.0 and the network structure of the second neural network is ResNet-50. In other embodiments, the network structure of the second neural network can also be FlowNet2.0, MobileNet, DenseNet, or Inception.
The generative adversarial network further comprises a second generator 4; the second generator 4 comprises a third neural network, and the second generator 4 inputs two frames of images and outputs a corresponding second predicted optical flow.
The third neural network is a smaller-scale student network formed by performing knowledge distillation on the first neural network. In the method of the second embodiment of the present invention, the third neural network is formed by scaling down the number of channels in each layer of the first neural network. Preferably, the third neural network is formed by cutting each layer's channels of the first neural network in half; with the channels halved, the corresponding model's computation speed doubles (its latency is halved), and different reduction ratios increase the computation speed by correspondingly different factors, so the ratio can be set flexibly.
The generative adversarial network further comprises a third loss function module 5 for training the second generator 4. When training the second generator 4, the weight parameters of the first neural network and the weight parameters of the second neural network are fixed; the third loss function module 5 receives the first predicted optical flow and the second predicted optical flow, the output end of the third loss function module 5 outputs a third loss function formed from the two, and the second generator 4 trains the weight parameters of the third neural network according to the third loss function.
When training the second generator 4, the input end of the discriminator 2 receives the two frames of images, the first predicted optical flow, and the second predicted optical flow.
The output end of the discriminator 2 outputs a judgment of the authenticity of the input optical flow.
The weight parameters of the second neural network of the discriminator 2 are obtained through training, and the generative adversarial network further comprises a second loss function module for training the discriminator 2.
The network structure of the third neural network is FlowNet2.0.
The present invention has been described in detail by way of specific embodiments, but these should not be construed as limiting the invention. Many variations and modifications may be made by those skilled in the art without departing from the principles of the invention; these are also considered to fall within the scope of the invention.

Claims (20)

1. An optical flow generating device, comprising: a generative adversarial network;
the generative adversarial network comprises a first generator, a discriminator and a first loss function module;
the first generator comprises a first neural network, two frames of images are input by the first generator, and a corresponding first predicted optical flow is output;
the output end of the first generator is connected with the input end of the discriminator;
the optical flow input by the discriminator comprises a sample optical flow corresponding to a training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator is used for judging the authenticity of the first predicted optical flow; when the optical flow input by the discriminator is the first predicted optical flow, and the output of the discriminator judges that the input optical flow is the sample optical flow, the discriminator considers the first predicted optical flow to be equivalent to the sample optical flow;
the weight parameters of the first neural network of the first generator are obtained through training, the first loss function module is used for training the first generator, when the first generator is trained, the first loss function module inputs the first predicted optical flow and the corresponding sample optical flow, the output end of the first loss function module outputs a first loss function formed by the first predicted optical flow and the corresponding sample optical flow, and the first generator trains the weight parameters of the first neural network according to the first loss function; training of the first generator is completed when the discriminator considers the first predicted optical flow to be equivalent to the sample optical flow.
2. The optical flow generating device according to claim 1, wherein: in the training stage, two frames of images input by the first generator are identical to two frames of images input by the discriminator and correspond to samples in a training set, and the two frames of images are continuous two frames of images with a front-back sequence in time;
the optical flow is a displacement formed in the two frame images by an object in the two frame images at different times.
3. The optical flow generating device according to claim 1, wherein: when training the first generator, the input end of the discriminator inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow;
the output of the discriminator outputs a determination of the authenticity of the input optical flow.
4. The optical flow generating device according to claim 3, wherein: the generative adversarial network comprises a second generator, the second generator comprises a third neural network, and the second generator inputs two frames of images and outputs a corresponding second predicted optical flow;
the third neural network is a smaller-scale student network formed by performing knowledge distillation on the first neural network.
5. The optical flow generating device according to claim 4, wherein: the third neural network is formed by scaling down each layer of channels of the first neural network.
6. The optical flow generating device according to claim 4, wherein: the generative adversarial network further includes a third loss function module for training the second generator; when the second generator is trained, the weight parameters of the first neural network and the weight parameters of the second neural network are fixed, the third loss function module inputs the first predicted optical flow and the second predicted optical flow, the output end of the third loss function module outputs a third loss function formed by the first predicted optical flow and the second predicted optical flow, and the second generator trains the weight parameters of the third neural network according to the third loss function.
7. The optical flow generating device according to claim 6, wherein: when training the second generator, the input end of the discriminator inputs two frames of images, the first predicted optical flow and the second predicted optical flow;
the output of the discriminator outputs a determination of the authenticity of the input optical flow.
8. The optical flow generating device according to claim 1, wherein: the weight parameters of the second neural network of the discriminator are obtained through training, and the generative adversarial network further comprises a second loss function module for training the discriminator.
9. The optical flow generating device according to claim 4, wherein: the network structures of the first neural network and the third neural network are FlowNet2.0.
10. The optical flow generating device according to claim 1, wherein: the network structure of the second neural network comprises ResNet-50, FlowNet2.0, MobileNet, DenseNet, or Inception.
11. An optical flow generating method, characterized in that: an optical flow is generated by using a generative adversarial network;
the generative adversarial network comprises a first generator, a discriminator and a first loss function module;
the first generator comprises a first neural network, two frames of images are input by the first generator, and a corresponding first predicted optical flow is output;
the output end of the first generator is connected with the input end of the discriminator;
the optical flow input by the discriminator comprises a sample optical flow corresponding to a training set or the first predicted optical flow; the sample optical flow is a real optical flow, and the discriminator is used for judging the authenticity of the first predicted optical flow; when the optical flow input by the discriminator is the first predicted optical flow, and the output of the discriminator judges that the input optical flow is the sample optical flow, the discriminator considers the first predicted optical flow to be equivalent to the sample optical flow;
the discriminator judges the authenticity of the input optical flow, wherein the optical flow input by the discriminator comprises the sample optical flow or the first predicted optical flow corresponding to the training set;
the weight parameters of the first neural network of the first generator are obtained through training, the first loss function module is used for training the first generator, when the first generator is trained, the first loss function module inputs the first predicted optical flow and the corresponding sample optical flow, the output end of the first loss function module outputs a first loss function formed by the first predicted optical flow and the corresponding sample optical flow, and the first generator trains the weight parameters of the first neural network according to the first loss function; training of the first generator is completed when the discriminator considers the first predicted optical flow to be equivalent to the sample optical flow.
12. The optical flow generation method according to claim 11, wherein: in the training stage, two frames of images input by the first generator are identical to two frames of images input by the discriminator and correspond to samples in a training set, and the two frames of images are continuous two frames of images with a front-back sequence in time;
the optical flow is a displacement formed in the two frame images by an object in the two frame images at different times.
13. The optical flow generation method according to claim 11, wherein: when training the first generator, the input end of the discriminator inputs two frames of images, the first predicted optical flow and the corresponding sample optical flow;
the output of the discriminator outputs a determination of the authenticity of the input optical flow.
14. The optical flow generation method according to claim 13, wherein: the generative adversarial network comprises a second generator, the second generator comprises a third neural network, and the second generator inputs two frames of images and outputs a corresponding second predicted optical flow;
the third neural network is a smaller-scale student network formed by performing knowledge distillation on the first neural network.
15. The optical flow generation method according to claim 14, wherein: the third neural network is formed by scaling down each layer of channels of the first neural network.
16. The optical flow generation method according to claim 14, wherein: the generative adversarial network further includes a third loss function module for training the second generator; when the second generator is trained, the weight parameters of the first neural network and the weight parameters of the second neural network are fixed, the third loss function module inputs the first predicted optical flow and the second predicted optical flow, the output end of the third loss function module outputs a third loss function formed by the first predicted optical flow and the second predicted optical flow, and the second generator trains the weight parameters of the third neural network according to the third loss function.
17. The optical flow generation method of claim 16, wherein: when training the second generator, the input end of the discriminator inputs two frames of images, the first predicted optical flow and the second predicted optical flow;
the output of the discriminator outputs a determination of the authenticity of the input optical flow.
18. The optical flow generation method according to claim 11, wherein: the weight parameters of the second neural network of the discriminator are obtained through training, and the generative adversarial network further comprises a second loss function module for training the discriminator.
19. The optical flow generation method of claim 16, wherein: the network structures of the first neural network and the third neural network are FlowNet2.0.
20. The optical flow generation method according to claim 11, wherein: the network structure of the second neural network comprises ResNet-50, FlowNet2.0, MobileNet, DenseNet, or Inception.
CN202010352547.3A 2020-04-28 2020-04-28 Optical flow generating device and method Active CN111402302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010352547.3A CN111402302B (en) 2020-04-28 2020-04-28 Optical flow generating device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010352547.3A CN111402302B (en) 2020-04-28 2020-04-28 Optical flow generating device and method

Publications (2)

Publication Number Publication Date
CN111402302A CN111402302A (en) 2020-07-10
CN111402302B true CN111402302B (en) 2023-06-06

Family

ID=71413727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010352547.3A Active CN111402302B (en) 2020-04-28 2020-04-28 Optical flow generating device and method

Country Status (1)

Country Link
CN (1) CN111402302B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201902459D0 (en) * 2019-02-22 2019-04-10 Facesoft Ltd Facial shape representation and generation system and method
CN110210429A (en) * 2019-06-06 2019-09-06 山东大学 A method of network is generated based on light stream, image, movement confrontation and improves anxiety, depression, angry facial expression recognition correct rate
CN110599421A (en) * 2019-09-12 2019-12-20 腾讯科技(深圳)有限公司 Model training method, video fuzzy frame conversion method, device and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201902459D0 (en) * 2019-02-22 2019-04-10 Facesoft Ltd Facial shape representation and generation system and method
CN110210429A (en) * 2019-06-06 2019-09-06 山东大学 A method of network is generated based on light stream, image, movement confrontation and improves anxiety, depression, angry facial expression recognition correct rate
CN110599421A (en) * 2019-09-12 2019-12-20 腾讯科技(深圳)有限公司 Model training method, video fuzzy frame conversion method, device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhou Yong; Tao Zhaosheng; Ruan Mengli; Wang Lihua. A target optical flow detection method based on the FlowNet2.0 network. Journal of Longyan University. 2020, (02), full text. *

Also Published As

Publication number Publication date
CN111402302A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
CN110363716B (en) High-quality reconstruction method for generating confrontation network composite degraded image based on conditions
CN109064507B (en) Multi-motion-stream deep convolution network model method for video prediction
JP7147078B2 (en) Video frame information labeling method, apparatus, apparatus and computer program
Wu et al. Multi-teacher knowledge distillation for compressed video action recognition on deep neural networks
CN110795990B (en) Gesture recognition method for underwater equipment
CN109886225A (en) A kind of image gesture motion on-line checking and recognition methods based on deep learning
CN113158862B (en) Multitasking-based lightweight real-time face detection method
CN111062395B (en) Real-time video semantic segmentation method
CN110705412A (en) Video target detection method based on motion history image
CN114565655B (en) Depth estimation method and device based on pyramid segmentation attention
CN110443784B (en) Effective significance prediction model method
CN110688905A (en) Three-dimensional object detection and tracking method based on key frame
CN112906631B (en) Dangerous driving behavior detection method and detection system based on video
CN113569882A (en) Knowledge distillation-based rapid pedestrian detection method
CN110852199A (en) Foreground extraction method based on double-frame coding and decoding model
Vemprala et al. Representation learning for event-based visuomotor policies
CN116524593A (en) Dynamic gesture recognition method, system, equipment and medium
CN112288772A (en) Channel attention target tracking method based on online multi-feature selection
CN116229452A (en) Point cloud three-dimensional target detection method based on improved multi-scale feature fusion
CN111402302B (en) Optical flow generating device and method
CN117456330A (en) MSFAF-Net-based low-illumination target detection method
CN113033283A (en) Improved video classification system
CN117576149A (en) Single-target tracking method based on attention mechanism
Guo et al. Research on human-vehicle gesture interaction technology based on computer vision
CN115578436A (en) Monocular depth prediction method based on multi-level feature parallel interaction fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant