WO2023061187A1 - Optical flow estimation method and device - Google Patents

Optical flow estimation method and device

Info

Publication number
WO2023061187A1
Authority
WO
WIPO (PCT)
Prior art keywords
optical flow
image frame
frame
event
image
Application number
PCT/CN2022/121050
Other languages
English (en)
French (fr)
Inventor
王耀园
张子阳
杨晨
王瀛
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2023061187A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/269: Analysis of motion using gradient-based methods

Definitions

  • the present application relates to the field of computer vision, in particular to an optical flow estimation method and device.
  • Optical flow refers to a method that uses the temporal changes of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, so as to calculate the motion information of objects between adjacent frames.
  • the traditional optical flow estimation method can only estimate the optical flow between the first image frame and the second image frame (including the optical flow from the first image frame to the second image frame and the optical flow from the second image frame to the first image frame), where the first image frame and the second image frame are two adjacent image frames; the optical flow between the two image frames and any moment between them (including the optical flow from the first image frame to that moment and the optical flow from the second image frame to that moment) can only be distributed proportionally, using the length of time as the weight, under an assumption of linear motion.
  • the present application provides an optical flow estimation method and device, which can improve the accuracy of estimating the optical flow between two adjacent image frames and any moment between those two image frames.
  • the present application provides an optical flow estimation method.
  • the method may include: obtaining a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence obtained by shooting a target scene; obtaining a first event frame, which describes the brightness changes of the target scene in the time period between the first image frame and the second image frame; and determining a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  • the method can be used in an optical flow estimation system, and the optical flow estimation system can include a pixel sensor, an event sensor and an optical flow estimation device, where the pixel sensor and the event sensor are each connected to the optical flow estimation device.
  • This method can be executed, for example, by the above-mentioned optical flow estimating device.
  • the first image frame and the second image frame contain pixel information of the target object in the target scene. Since the event stream data is obtained by shooting the target scene with an event camera, the first event frame can capture the real high-speed motion information (including linear and nonlinear motion) of the target object in the target scene within the time period between the first image frame and the second image frame.
  • the optical flow estimation device first estimates the optical flow between the first image frame and the second image frame (i.e., the first optical flow) based on the first image frame, the second image frame and the first event frame. Then, based on the second event frame, it determines the weight of the optical flow between the first image frame and the target moment relative to the optical flow between the first image frame and the second image frame (i.e., the first optical flow allocation mask). Since no motion assumption is imposed on the target object, the obtained first optical flow allocation mask can allocate the real motion optical flow accurately; therefore, the target optical flow obtained by weighting the first optical flow with the first optical flow allocation mask is more accurate.
  • the optical flow estimating apparatus may obtain the first image frame and the second image frame in various ways, which are not limited in this application.
  • the apparatus for estimating optical flow may receive the first image frame and the second image frame sent by the pixel camera.
  • the optical flow estimating apparatus may obtain the first image frame and the second image frame from other devices, or via an input interface.
  • the target scene may include at least one target object, and part or all of the at least one target object is in a moving state.
  • the optical flow estimating apparatus may obtain the first event frame in various ways, which is not limited in this application.
  • the optical flow estimating device may receive event stream data sent by an event camera, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and polarity; the optical flow estimation device can then obtain the first event frame based on the event stream data. That is to say, the event camera can collect the event stream data and send it to the optical flow estimating device.
  • the optical flow estimating apparatus may receive the first event frame sent by the event camera. That is to say, the event camera can collect the event stream data, generate the first event frame based on the event stream data, and send it to the optical flow estimating device.
  • the resolutions of the first image frame, the second image frame and the first event frame are the same.
  • the first event frame may include multiple channels, and the multiple channels can include a first channel, a second channel, a third channel and a fourth channel, where the first image frame includes H×W pixels.
  • the first channel includes H×W first values, the H×W first values correspond one-to-one to the positions of the H×W pixels, and each first value represents the number of times the brightness of the pixel at the corresponding position in the first image frame increases during the time period between the first image frame and the second image frame;
  • the second channel includes H×W second values, the H×W second values correspond one-to-one to the positions of the H×W pixels, and each second value represents the number of times the brightness of the pixel at the corresponding position in the first image frame decreases during the time period between the first image frame and the second image frame;
  • the third channel includes H×W third values, the H×W third values correspond one-to-one to the positions of the H×W pixels, and each third value represents the time at which the brightness of the pixel at the corresponding position in the first image frame last increases during the time period between the first image frame and the second image frame;
  • the fourth channel includes H×W fourth values, the H×W fourth values correspond one-to-one to the positions of the H×W pixels, and each fourth value represents the time at which the brightness of the pixel at the corresponding position in the first image frame last decreases during the time period between the first image frame and the second image frame.
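  • As an illustration only (not the patent's implementation), the four channels described above can be accumulated from the event stream in a single pass. The sketch below assumes each event is a (timestamp, x, y, polarity) tuple, with polarity 1 for an increase and 0 for a decrease, and that events are ordered by timestamp:

        import numpy as np

        # Hypothetical helper, not the patent's code: accumulate the four channels.
        def build_event_frame(events, H, W):
            """events: iterable of (timestamp, x, y, polarity), sorted by timestamp."""
            frame = np.zeros((4, H, W), dtype=np.float32)
            for t, x, y, p in events:
                if p == 1:                   # brightness increase
                    frame[0, y, x] += 1      # channel 1: number of increases
                    frame[2, y, x] = t       # channel 3: time of the last increase
                else:                        # brightness decrease
                    frame[1, y, x] += 1      # channel 2: number of decreases
                    frame[3, y, x] = t       # channel 4: time of the last decrease
            return frame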
  • before the optical flow estimating device determines the target optical flow based on the first image frame, the second image frame and the first event frame, it can obtain a second event frame, where the second event frame describes the brightness changes of the target scene in the time period between the first image frame and the target moment; correspondingly, determining the target optical flow based on the first image frame, the second image frame and the first event frame includes: the optical flow estimating device determines the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  • the optical flow estimating device can determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame; determine, based on the second event frame, a first optical flow allocation mask, which indicates the weight of the target optical flow relative to the first optical flow; and determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
  • the first optical flow may be a sparse optical flow or a dense optical flow, which is not limited in this application.
  • the first optical flow represents different motion directions through different pixel colors, and different motion speeds through different pixel brightnesses.
  • the optical flow estimating device may determine the first optical flow based on the first image frame, the second image frame, and the first event frame in various ways, which is not limited in the present application.
  • the optical flow estimating device may input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  • in this application, the optical flow estimation model is obtained by training a lightweight first convolutional neural network.
  • the first convolutional neural network can include processing layers such as dimensionality reduction, convolution, residual, deconvolution and dimensionality increase. Cyclically iterating the first convolutional neural network improves the accuracy of optical flow estimation, and the network is easy to deploy on the device side.
  • the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model and perform loop iterations to obtain the first optical flow. That is to say, the optical flow estimation device can input the first image frame, the second image frame and the first event frame into the optical flow estimation model to obtain a second optical flow; then input the first image frame, the second image frame, the first event frame and the second optical flow into the optical flow estimation model to obtain a third optical flow; and so on, iterating until the loss function preset for the optical flow estimation model is satisfied, at which point the first optical flow is output.
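  • The loop iteration just described can be pictured with a short sketch. Everything named here is a placeholder, not the patent's code: flow_model stands for the preset optical flow estimation model, and converged stands for the check against its preset loss function:

        # All names are placeholders: `flow_model` is the preset optical flow
        # estimation model and `converged` checks its preset loss-function criterion.
        def estimate_first_flow(flow_model, frame1, frame2, event_frame,
                                converged, max_iters=10):
            flow = None                      # no previous estimate on the first pass
            for _ in range(max_iters):
                # each pass refines the previous estimate (second flow, third flow, ...)
                flow = flow_model(frame1, frame2, event_frame, prev_flow=flow)
                if converged(flow):          # preset loss function satisfied
                    break
            return flow                      # the first optical flow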
  • the above-mentioned first optical flow allocation mask has the same resolution as the first image frame, the second image frame and the first event frame.
  • the optical flow estimating apparatus may determine the first optical flow allocation mask based on the second event frame in various ways, which is not limited in this application.
  • the optical flow estimating apparatus may input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  • in this application, the optical flow allocation model is obtained by training a lightweight second CNN, and the second CNN can include processing layers such as fusion and convolution. Cyclically iterating the second CNN improves the accuracy of optical flow estimation, and the network is easy to deploy on the device side.
  • the optical flow estimating device may input the second event frame into the optical flow allocation model, and perform loop iterations to obtain the first optical flow allocation mask.
  • the present application also provides an optical flow estimation device, which includes an obtaining module and an optical flow estimation module; the obtaining module is used to obtain a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence obtained by shooting a target scene, and to obtain a first event frame, where the first event frame describes the brightness changes of the target scene in the time period between the first image frame and the second image frame;
  • the optical flow estimation module is used to determine the target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  • the obtaining module is further configured to obtain a second event frame before the target optical flow is determined based on the first image frame, the second image frame and the first event frame, where the second event frame describes the brightness changes of the target scene in the time period between the first image frame and the target moment;
  • the optical flow estimation module is specifically used to determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  • the optical flow estimation module includes an inter-frame optical flow estimation submodule, an optical flow allocation submodule, and an inter-frame arbitrary-moment optical flow estimation submodule;
  • the inter-frame optical flow estimation submodule is used to determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame;
  • the optical flow allocation submodule is used to determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask indicates the weight of the target optical flow relative to the first optical flow; the inter-frame arbitrary-moment optical flow estimation submodule is used to determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
  • the inter-frame optical flow estimation submodule is specifically configured to input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  • the inter-frame optical flow estimation submodule is specifically configured to input the first image frame, the second image frame and the first event frame into the optical flow estimation model and perform loop iterations to obtain the first optical flow.
  • the optical flow allocation submodule is specifically configured to input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  • the optical flow allocation submodule is specifically configured to input the second event frame into the optical flow allocation model, and perform loop iterations to obtain the first optical flow allocation mask.
  • the first image frame includes H×W pixels, where both H and W are integers greater than 1; the first event frame includes multiple channels, and the multiple channels include the first channel, the second channel, the third channel and the fourth channel;
  • the first channel includes H×W first values, the H×W first values correspond one-to-one to the positions of the H×W pixels, and each first value represents the number of times the brightness of the pixel at the corresponding position in the first image frame increases during the time period between the first image frame and the second image frame;
  • the second channel includes H×W second values, the H×W second values correspond one-to-one to the positions of the H×W pixels, and each second value represents the number of times the brightness of the pixel at the corresponding position in the first image frame decreases during the time period between the first image frame and the second image frame;
  • the third channel includes H×W third values, the H×W third values correspond one-to-one to the positions of the H×W pixels, and each third value represents the timestamp of the last increase in the brightness of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame;
  • the fourth channel includes H×W fourth values, the H×W fourth values correspond one-to-one to the positions of the H×W pixels, and each fourth value represents the timestamp of the last decrease in the brightness of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame.
  • the obtaining module is specifically configured to: obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and polarity; and obtain the first event frame based on the event stream data.
  • the present application further provides an optical flow estimation device, which may include at least one processor and at least one communication interface; the at least one processor is coupled to the at least one communication interface; the at least one communication interface is used to provide information and/or data to the at least one processor; and the at least one processor is used to run computer program instructions to perform the optical flow estimation method described in the above first aspect and any possible implementation thereof.
  • the device may be a chip or an integrated circuit.
  • the present application also provides a terminal, which may include the optical flow estimation device as described in the above second aspect and any possible implementation thereof, or the optical flow estimation device as described in the above third aspect.
  • the present application also provides a computer-readable storage medium for storing a computer program; when the computer program is run by a processor, it implements the optical flow estimation method described in the above first aspect and any possible implementation thereof.
  • the present application also provides a computer program product; when the computer program product runs on a processor, it implements the optical flow estimation method described in the first aspect and any possible implementation thereof.
  • the optical flow estimation device, computer storage medium, computer program product, chip and terminal provided in this application are all used to implement the optical flow estimation method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the optical flow estimation method provided above, which are not repeated here.
  • FIG. 1 is a schematic diagram of event stream data provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of sparse optical flow and dense optical flow provided by an embodiment of the present application;
  • FIG. 3 is a schematic block diagram of an optical flow estimation system 100 provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of an optical flow estimation method 200 provided by an embodiment of the present application;
  • FIG. 5 is another schematic diagram of event stream data provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of a first event frame provided by an embodiment of the present application;
  • FIG. 7 is a schematic block diagram of an optical flow estimation apparatus 300 provided by an embodiment of the present application;
  • FIG. 8 is a schematic flowchart of an optical flow estimation method provided by an embodiment of the present application;
  • FIG. 9 is a schematic block diagram of an optical flow estimation apparatus 400 provided by an embodiment of the present application.
  • the pixel camera, that is, the traditional camera, collects the brightness values of the scene at a fixed rate (i.e., the frame rate) and outputs them as image data at that fixed rate.
  • An event camera is a new type of sensor that captures dynamic changes in pixel brightness in a scene based on an event-driven approach.
  • while traditional cameras, to some extent, capture a static/stationary space, the purpose of an event camera is to sensitively capture moving objects.
  • event cameras only observe “motion” in the scene, specifically “changes in brightness” in the scene.
  • the event camera only outputs the brightness change (1 or 0) of the corresponding pixel when a brightness change occurs, which gives it the advantages of fast response and wide dynamic range.
  • the event camera will output only when the light intensity changes. For example, if the brightness increases and exceeds a threshold, the corresponding pixel will output an event of brightness increase.
  • Event cameras have no concept of frames; when the scene changes, they produce a series of pixel-level outputs. Their theoretical time resolution is as fine as 1 μs, so the resulting latency is very low, lower than the rate of most motion in common scenes, and there is therefore no motion blur problem.
  • each pixel of the event camera works independently and asynchronously, so the dynamic range is very large. Event cameras also have the advantage of low power consumption.
  • a traditional camera shoots a scene in full frame at a fixed frame rate, and all pixels work synchronously.
  • the event camera works independently and asynchronously for each pixel, with a sampling rate of up to one million hertz (Hz), and only outputs changes in brightness (events).
  • An event is described by a quadruple of event data, and the event data output by all pixels of the camera are aggregated into an event list composed of the individual events, which serves as the event stream data output by the camera.
  • Commonly used event cameras can include dynamic vision sensors (DVS) or dynamic active-pixel vision sensors (DAVIS).
  • FIG. 1 shows a schematic diagram of event stream data.
  • Figure (a) in FIG. 1 represents the event stream data through an event data list; the event data list includes a plurality of event data entries, each entry describes one event, and each event is represented by a quadruple: timestamp, x-coordinate, y-coordinate and polarity.
  • Figure (b) in FIG. 1 represents the event stream data through a visual graph, where the three-dimensional axes are frame width, frame height and time, and the color of a coordinate point represents the number of times the brightness of the pixel at the corresponding position changes: the brighter the color, the more times the brightness changes.
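  • For illustration only, the quadruple of figure (a) can be carried as a small typed record; the field names, types and example values below are assumptions, with the polarity encoding (1 for increase, 0 for decrease) taken from the description above:

        from typing import NamedTuple

        class Event(NamedTuple):
            timestamp: float   # when the brightness change occurred
            x: int             # pixel x-coordinate
            y: int             # pixel y-coordinate
            polarity: int      # 1 = brightness increase, 0 = brightness decrease

        # e.g. an event describing the pixel at (1, 1) brightening (timestamp assumed)
        e = Event(timestamp=1.5, x=1, y=1, polarity=1)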
  • Optical flow, that is, the flow of light, is a method that uses the temporal changes of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, thereby calculating the motion information of objects between adjacent frames.
  • optical flow can be divided into sparse optical flow and dense optical flow.
  • FIG. 2 shows a schematic diagram of sparse optical flow and dense optical flow.
  • (a) in FIG. 2 is a schematic diagram of sparse optical flow, which describes the optical flow of the pixels of some prominent feature points (such as corner points) in the image moving to the next frame; (b) in FIG. 2 is a schematic diagram of dense optical flow, which describes the optical flow of all pixels in the image moving to the next frame.
  • Different colors in the dense optical flow diagram indicate different motion directions, and different brightnesses indicate different motion rates.
  • an optical flow network (FlowNet) is a model for estimating optical flow obtained through convolutional neural network training.
  • Optical flow estimation refers to estimating the pixel-level optical flow between two image frames based on those two frames.
  • the first image frame is the previous frame of the second image frame
  • the target object may be moving at high speed.
  • the optical flow from the above two image frames to any moment between them, that is, the optical flow from the first image frame to that moment and the optical flow from the second image frame to that moment, can only be distributed proportionally, using the length of time as the weight, under an assumption of linear motion.
  • the target object may undergo nonlinear motion; therefore, using the existing linear-motion-assumption method to estimate the optical flow between two adjacent image frames and any moment between them is less accurate.
  • Fig. 3 shows a schematic block diagram of an optical flow estimation system 100 provided by an embodiment of the present application.
  • the optical flow estimation system 100 may include a pixel sensor 110 , an event sensor 120 and an optical flow estimation device 130 , and the pixel sensor 110 and the event sensor 120 are respectively connected to the optical flow estimation device 130 .
  • the pixel sensor 110 is used to shoot the target scene to obtain an image sequence, and to send the first image frame and the second image frame to the optical flow estimation device 130, where the first image frame and the second image frame are any two adjacent image frames in the image sequence.
  • the pixel sensor may be a pixel camera. This application does not limit the model and type of the pixel camera.
  • the event sensor 120 is used to photograph the target scene to obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and polarity; to obtain, based on the event stream data, a first event frame, where the first event frame describes the brightness changes of the target scene during the time period between the first image frame and the second image frame, and the resolutions of the first image frame, the second image frame and the first event frame are the same; and to send the first event frame to the optical flow estimation device 130;
  • the event sensor may be an event camera. This application does not limit the model and type of the event camera.
  • the optical flow estimating device 130 is used to determine the target optical flow based on the first image frame, the second image frame and the first event frame (for a specific method, please refer to the optical flow estimation method provided in this application described below), the The target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  • the above only takes the estimation of the optical flow from the first image frame to the target moment (that is, the target optical flow) as an example, but the application is not limited thereto.
  • the method for estimating the optical flow from the second image frame to the target moment is similar to the method for estimating the target optical flow; reference can be made to the method for estimating the target optical flow provided in this application, which is not repeated here.
  • the event sensor 120 may directly send the event stream data to the optical flow estimating device 130; correspondingly, the optical flow estimating device 130 obtains the first event frame based on the event stream data.
  • the above devices may communicate with each other in a wired or wireless manner, which is not limited in this application.
  • the above-mentioned wired manner may implement communication through a data line connection or an internal bus connection.
  • the above-mentioned wireless manner may implement communication through a communication network.
  • the communication network may be a local area network, a wide area network switched through a relay device, or a network including a local area network and a wide area network.
  • the communication network can be a wireless fidelity (Wi-Fi) hotspot network, a Wi-Fi peer-to-peer (P2P) network, a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or a possible future general short-distance communication network.
  • the communication network may be a third-generation mobile communication technology (3G) network, a fourth-generation mobile communication technology (4G) network, a fifth-generation mobile communication technology (5G) network, a public land mobile network (PLMN) or the Internet, etc., which is not limited in this application.
  • the first event frame can capture the low-latency motion information of the target in the target scene between the first image frame and the second image frame, and the first image frame and the second image frame can capture the pixel information of the target scene; therefore, combining the first event frame, the first image frame and the second image frame to determine the target optical flow can improve the accuracy of the target optical flow.
  • The optical flow estimation system provided by the present application is introduced above; the optical flow estimation method provided by the present application and applied to this system is further introduced below.
  • FIG. 4 shows an optical flow estimation method 200 provided by an embodiment of the present application, and the optical flow estimation method 200 may be used in the above optical flow estimation system 100 .
  • the optical flow estimation method 200 may include the following steps. It should be noted that the steps listed below may be executed in various orders and/or simultaneously, and are not limited to the execution order shown in FIG. 4 .
  • the method 200 may be executed by an optical flow estimation device.
  • the optical flow estimating device here may be the optical flow estimating device 130 in the optical flow estimating system 100 described above.
  • the optical flow estimating apparatus may obtain the first image frame and the second image frame in various ways, which are not limited in this application.
  • the apparatus for estimating optical flow may receive the first image frame and the second image frame sent by the pixel camera.
  • the pixel camera here may be the pixel camera 110 in the above optical flow estimation system 100 .
  • the optical flow estimating apparatus may obtain the first image frame and the second image frame from other devices, or via an input interface.
  • the target scene may include at least one target object, and part or all of the at least one target object is in a moving state.
  • the optical flow estimating apparatus may obtain the first event frame in various ways, which is not limited in this application.
  • the optical flow estimating device may receive event stream data sent by an event camera, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and polarity; the optical flow estimation device can then obtain the first event frame based on the event stream data. That is to say, the event camera can collect the event stream data and send it to the optical flow estimating device.
  • the optical flow estimating apparatus may receive the first event frame sent by the event camera. That is to say, the event camera can collect the event stream data, generate the first event frame based on the event stream data, and send it to the optical flow estimating device.
  • the event camera here may be the event camera 120 in the above-mentioned optical flow estimation system 100 .
  • the resolutions of the first image frame, the second image frame and the first event frame are the same.
  • FIG. 5 shows a schematic diagram of event stream data provided by the embodiment of the present application.
  • the event stream data includes 20 event data entries, which respectively describe 20 brightness changes (that is, 20 events occurring in the target scene) of pixels in the target scene within the time period between the first image frame (i.e., time t1) and the second image frame (i.e., time t10), where t1 to t10 increase sequentially.
  • the event data of event 1 includes the following quadruple: the timestamp is t1+Δt1, the x coordinate is 1, the y coordinate is 1, and the polarity is 1 (that is, the brightness increases); event 1 thus describes the brightness of the pixel with coordinates (1,1) increasing at time t1+Δt1.
  • the other 19 events can be understood similarly and are not repeated here.
  • the speed and direction of motion of different pixels at different moments can be estimated from the above event stream data.
  • the first event frame may include multiple channels, and the multiple channels can include a first channel, a second channel, a third channel and a fourth channel.
  • the first channel includes H×W first values, the H×W first values correspond one-to-one to the positions of the H×W pixels, and each first value represents the number of times the brightness of the pixel at the corresponding position in the first image frame increases during the time period between the first image frame and the second image frame;
  • the second channel includes H×W second values, the H×W second values correspond one-to-one to the positions of the H×W pixels, and each second value represents the number of times the brightness of the pixel at the corresponding position in the first image frame decreases during the time period between the first image frame and the second image frame;
  • the third channel includes H×W third values, the H×W third values correspond one-to-one to the positions of the H×W pixels, and each third value represents the time at which the brightness of the pixel at the corresponding position in the first image frame last increases during the time period between the first image frame and the second image frame;
  • the fourth channel includes H×W fourth values, the H×W fourth values correspond one-to-one to the positions of the H×W pixels, and each fourth value represents the time at which the brightness of the pixel at the corresponding position in the first image frame last decreases during the time period between the first image frame and the second image frame.
  • FIG. 6 shows a schematic diagram of a first event frame provided by an embodiment of the present application.
  • the first event frame may include a first channel, a second channel, a third channel, and a fourth channel.
  • the first event frame is obtained based on the event stream data shown in FIG. 5.
  • for the pixel with coordinates (1,1), the polarity is 1 (that is, the brightness increases) at time t2 and time t9; that is, between t1 and t10 the brightness of this pixel increases 2 times and decreases 0 times, the time of the last increase is t9, and there is no last-decrease time.
  • therefore, the first value at the pixel with coordinates (1,1) in the first channel (shown as the black-background pixel of the first channel in FIG. 6) is 2, the second value at the pixel with coordinates (1,1) in the second channel (shown as the black-background pixel of the second channel in FIG. 6) is 0, the third value at the pixel with coordinates (1,1) in the third channel (shown as the black-background pixel of the third channel in FIG. 6) is t9, and the fourth value at the pixel with coordinates (1,1) in the fourth channel (shown as the black-background pixel of the fourth channel in FIG. 6) is recorded as 0.
  • for the pixel with coordinates (2,2), the polarity is 0 (that is, the brightness decreases) at times t2, t3, t4, t5 and t6; that is, between t1 and t10 the brightness of this pixel increases 0 times and decreases 5 times, there is no last-increase time, and the time of the last decrease is t6.
  • therefore, the first value at the pixel with coordinates (2,2) in the first channel is 0, the second value at the pixel with coordinates (2,2) in the second channel (shown as the bold-bordered pixel of the second channel in FIG. 6) is 5, the third value at the pixel with coordinates (2,2) in the third channel (shown as the bold-bordered pixel of the third channel in FIG. 6) is recorded as 0, and the fourth value at the pixel with coordinates (2,2) in the fourth channel is t6.
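  • The channel values in this example can be checked mechanically. The sketch below is an illustration under stated assumptions, not the patent's code: only the events at pixels (1,1) and (2,2) named in the text are included, and the symbolic times t1 to t10 are stood in by the numbers 1 to 10:

        import numpy as np

        # Symbolic times t1..t10 stood in by 1.0..10.0 (an assumption for illustration).
        t = {f"t{i}": float(i) for i in range(1, 11)}
        events = ([(t["t2"], 1, 1, 1), (t["t9"], 1, 1, 1)] +                  # (1,1) brightens twice
                  [(t[k], 2, 2, 0) for k in ("t2", "t3", "t4", "t5", "t6")])  # (2,2) darkens five times

        frame = np.zeros((4, 3, 3), dtype=np.float32)                         # 4 channels over a 3x3 crop
        for ts, x, y, p in events:
            count_ch, time_ch = (0, 2) if p == 1 else (1, 3)
            frame[count_ch, y, x] += 1
            frame[time_ch, y, x] = ts

        print(frame[:, 1, 1])   # [2. 0. 9. 0.]: 2 increases, 0 decreases, last increase at t9
        print(frame[:, 2, 2])   # [0. 5. 0. 6.]: 0 increases, 5 decreases, last decrease at t6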
  • the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  • the first image frame and the second image frame contain pixel information of the target object in the target scene. Since the event stream data is obtained by shooting the target scene with an event camera, the first event frame can capture the real high-speed motion information (including linear and nonlinear motion) of the target object in the target scene within the time period between the first image frame and the second image frame.
  • estimating the optical flow at the target moment in combination with the first image frame, the second image frame and the first event frame can improve the accuracy of the target optical flow.
  • the optical flow estimating device may obtain a second event frame, where the second event frame describes the brightness changes of the target scene within the time period between the first image frame and the target moment; correspondingly, in S203, the optical flow estimating device may determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  • the method for obtaining the second event frame may refer to the above method for obtaining the first event frame, which will not be repeated here.
  • the optical flow estimating device can determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame; determine, based on the second event frame, a first optical flow allocation mask, which indicates the weight of the target optical flow relative to the first optical flow; and determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
  • the first optical flow may be a sparse optical flow or a dense optical flow, which is not limited in this application.
  • the first optical flow represents different motion directions through different pixel colors, and different motion speeds through different pixel brightnesses.
  • the optical flow estimating device may determine the first optical flow based on the first image frame, the second image frame, and the first event frame in various ways, which is not limited in the present application.
  • the optical flow estimating device may input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  • in this application, the optical flow estimation model is obtained by training a lightweight first convolutional neural network (CNN).
  • the first CNN can include processing layers such as dimensionality reduction, convolution, residual, deconvolution and dimensionality increase.
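  • A lightweight network with the layer types just listed might be sketched as follows; the channel widths, depths and input layout are assumptions for illustration, not the patent's architecture:

        import torch
        import torch.nn as nn

        class ResBlock(nn.Module):
            def __init__(self, c):
                super().__init__()
                self.conv1 = nn.Conv2d(c, c, 3, padding=1)
                self.conv2 = nn.Conv2d(c, c, 3, padding=1)

            def forward(self, x):
                return x + self.conv2(torch.relu(self.conv1(x)))      # residual connection

        class FlowNetLite(nn.Module):
            # assumed input: two RGB frames (6) + 4-channel event frame + 2-channel previous flow
            def __init__(self, in_ch=6 + 4 + 2):
                super().__init__()
                self.down = nn.Conv2d(in_ch, 32, 3, stride=2, padding=1)      # dimensionality reduction
                self.body = nn.Sequential(ResBlock(32), ResBlock(32))         # convolution + residual
                self.up = nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1)  # deconvolution
                self.head = nn.Conv2d(32, 2, 3, padding=1)                    # (u, v) flow output

            def forward(self, x):
                return self.head(self.up(self.body(self.down(x))))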
  • cyclically iterating the first CNN improves the accuracy of optical flow estimation, and the network is easy to deploy on the device side.
  • the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model and perform loop iterations to obtain the first optical flow. That is to say, the optical flow estimation device can input the first image frame, the second image frame and the first event frame into the optical flow estimation model to obtain a second optical flow; then input the first image frame, the second image frame, the first event frame and the second optical flow into the optical flow estimation model to obtain a third optical flow; and so on, iterating until the loss function preset for the optical flow estimation model is satisfied, at which point the first optical flow is output.
  • the above-mentioned first optical flow allocation mask has the same resolution as the first image frame, the second image frame and the first event frame.
  • the optical flow estimating apparatus may determine the first optical flow allocation mask based on the second event frame in various ways, which is not limited in this application.
  • the optical flow estimating apparatus may input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  • in this application, the optical flow allocation model is obtained by training a lightweight second CNN, and the second CNN can include processing layers such as fusion and convolution. Cyclically iterating the second CNN improves the accuracy of optical flow estimation, and the network is easy to deploy on the device side.
  • the fusion processing layer is used to fuse the multi-channel second event frame into a single-channel image, and the convolution processing layer is used to convolve that single-channel image with a convolution kernel in the X direction and a convolution kernel in the Y direction respectively, so as to output a single-channel first optical flow allocation mask whose resolution is the same as that of the second event frame.
  • the convolution kernel in the X direction and the convolution kernel in the Y direction can be preset convolution kernels.
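  • A hedged sketch of the allocation model's fusion and X/Y convolution steps follows. The mean fusion, the simple [-1, 0, 1] gradient kernels and the sigmoid squashing are all assumptions, since the concrete kernel values are given in formulas not reproduced in this text:

        import torch
        import torch.nn.functional as F

        def allocation_mask(event_frame):                    # event_frame: (1, 4, H, W)
            fused = event_frame.mean(dim=1, keepdim=True)    # fusion: 4 channels -> 1 channel
            kx = torch.tensor([[[[-1., 0., 1.]]]])           # assumed X-direction kernel
            ky = kx.transpose(2, 3)                          # assumed Y-direction kernel
            gx = F.conv2d(fused, kx, padding=(0, 1))         # X-direction convolution
            gy = F.conv2d(fused, ky, padding=(1, 0))         # Y-direction convolution
            return torch.sigmoid(gx + gy)                    # one channel, same resolution as input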
  • the optical flow estimating device may input the second event frame into the optical flow allocation model and perform loop iterations to obtain the first optical flow allocation mask. That is to say, the optical flow estimation device can input the second event frame into the optical flow allocation model to obtain a second optical flow allocation mask; then input the second event frame and the second optical flow allocation mask into the optical flow allocation model to obtain a third optical flow allocation mask; and so on, iterating until the loss function preset for the optical flow allocation model is satisfied, at which point the first optical flow allocation mask is output.
  • the optical flow estimating device may use the first optical flow allocation mask to weight the optical flow at each corresponding position in the first optical flow to obtain the target optical flow.
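  • The weighting step itself is elementwise; below is a minimal sketch, assuming a one-channel mask with values in [0, 1] and a two-channel (u, v) flow field:

        import numpy as np

        def weight_flow(first_flow, mask):
            # first_flow: (2, H, W) array of (u, v) components; mask: (H, W) weights in [0, 1]
            return first_flow * mask[np.newaxis, :, :]

        # e.g. a position with flow (4.0, -2.0) and mask weight 0.25 yields (1.0, -0.5)
        # as its optical flow from the first image frame to the target moment.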
  • the optical flow estimation device first estimates the optical flow between the first image frame and the second image frame (i.e., the first optical flow) based on the first image frame, the second image frame and the first event frame. Then, based on the second event frame, it determines the weight of the optical flow between the first image frame and the target moment relative to the optical flow between the first image frame and the second image frame (i.e., the first optical flow allocation mask). Since no motion assumption is imposed on the target object, the obtained first optical flow allocation mask can allocate the real motion optical flow accurately; therefore, the target optical flow obtained by weighting the first optical flow with the first optical flow allocation mask is more accurate.
  • The optical flow estimation method provided by the embodiment of the present application is described above with reference to FIG. 4 to FIG. 6; the optical flow estimation apparatus provided by the embodiment of the present application is further described below.
  • FIG. 7 shows a schematic block diagram of an optical flow estimation apparatus 300 provided by an embodiment of the present application.
  • the optical flow estimation apparatus 300 may include an obtaining module 301 and an optical flow estimation module 302 .
  • the obtaining module 301 is used to obtain the first image frame and the second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence obtained by shooting the target scene, and to obtain a first event frame, where the first event frame describes the brightness changes of the target scene between the first image frame and the second image frame.
  • the optical flow estimation module 302 is used to determine the target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow includes at least one of the optical flow of corresponding pixels between the first image frame and the target moment and the optical flow of corresponding pixels between the target moment and the second image frame, and the target moment is any moment between the first image frame and the second image frame.
  • the obtaining module 301 is further configured to obtain a second event frame before the target optical flow is determined based on the first image frame, the second image frame and the first event frame, where the second event frame describes the brightness changes of the target scene between the first image frame and the target moment; the optical flow estimation module 302 is specifically configured to determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  • the optical flow estimation module 302 may include an inter-frame optical flow estimation sub-module 3021, an optical flow allocation sub-module 3022 and an inter-frame arbitrary-moment optical flow estimation sub-module 3023.
  • the inter-frame optical flow estimation submodule 3021 is configured to determine a first optical flow based on the first image frame, the second image frame, and the first event frame, and the first optical flow is the optical flow of corresponding pixels between the first image frame and the second image frame;
  • the optical flow allocation sub-module 3022 is used to determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask indicates the weight of the target optical flow relative to the first optical flow;
  • the inter-frame arbitrary-moment optical flow estimation sub-module 3023 is used to determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
  • the inter-frame optical flow estimation sub-module 3021 is specifically configured to input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  • the inter-frame optical flow estimation sub-module 3021 is specifically configured to input the first image frame, the second image frame and the first event frame into the optical flow estimation model and perform loop iterations to obtain the first optical flow.
  • the optical flow allocation sub-module 3022 is specifically configured to input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  • the optical flow allocation sub-module 3022 is specifically configured to input the second event frame into the optical flow allocation model, and perform loop iterations to obtain the first optical flow allocation mask.
  • the first image frame includes H×W pixels, where both H and W are integers greater than 1; the first event frame includes multiple channels, and the multiple channels include the first channel, the second channel, the third channel and the fourth channel;
  • the first channel includes H×W first values, the H×W first values correspond one-to-one to the positions of the H×W pixels, and each first value represents the number of times the brightness of the pixel at the corresponding position in the first image frame increases between the first image frame and the second image frame;
  • the second channel includes H×W second values, the H×W second values correspond one-to-one to the positions of the H×W pixels, and each second value represents the number of times the brightness of the pixel at the corresponding position in the first image frame decreases between the first image frame and the second image frame;
  • the third channel includes H×W third values, the H×W third values correspond one-to-one to the positions of the H×W pixels, and each third value represents the time when the brightness of the pixel at the corresponding position in the first image frame last increases between the first image frame and the second image frame; and the fourth channel includes H×W fourth values, the H×W fourth values correspond one-to-one to the positions of the H×W pixels, and each fourth value represents the time when the brightness of the pixel at the corresponding position in the first image frame last decreases between the first image frame and the second image frame.
  • the obtaining module is specifically configured to obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and polarity; and to obtain the first event frame based on the event stream data.
  • the optical flow estimating device 300 may specifically be the optical flow estimating device in the above embodiment of the optical flow estimation method 200, and may be used to perform the processes and/or steps corresponding to the optical flow estimating device in that embodiment; to avoid repetition, they are not repeated here.
  • One or more of the various modules in the embodiment shown in FIG. 7 may be implemented by software, hardware, firmware or a combination thereof.
  • the software or firmware includes but is not limited to computer program instructions or codes, and can be executed by a hardware processor.
  • the hardware includes but is not limited to various integrated circuits, such as a central processing unit (CPU), a digital signal processor (DSP), a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 8 shows a schematic flowchart of an optical flow estimation method provided by an embodiment of the present application.
  • the steps in this process may be executed by the optical flow estimation apparatus 300 described in FIG. 7 . It should be noted that the steps listed below may be executed in various orders and/or concurrently, and are not limited to the execution order shown in FIG. 8 .
  • the process includes the following steps:
  • the obtaining module 301 obtains the first image frame, the second image frame, the first event frame and the second event frame. For details, reference may be made to relevant introductions in step 201 and step 202 of the above method.
  • the obtaining module 301 sends the first image frame, the second image frame and the first event frame to the inter-frame optical flow estimation sub-module 3021 .
  • Inter-frame optical flow estimation sub-module 3021 inputs the first image frame, the second image frame and the first event frame into the optical flow estimation model, and performs loop iterations to obtain the first optical flow.
  • the inter-frame optical flow estimation sub-module 3021 sends the first optical flow to the inter-frame arbitrary-moment optical flow estimation sub-module 3023.
  • the obtaining module 301 sends the second event frame to the optical flow allocation sub-module 3022.
  • the optical flow allocation sub-module 3022 inputs the second event frame into the optical flow allocation model, and performs loop iterations to obtain the first optical flow allocation mask. For details, reference may be made to relevant introductions in step 203 of the above-mentioned method.
  • the optical flow allocation sub-module 3022 sends the first optical flow allocation mask to the inter-frame arbitrary-moment optical flow estimation sub-module 3023.
  • the inter-frame arbitrary-moment optical flow estimation sub-module 3023 weights the first optical flow through the first optical flow allocation mask to obtain the target optical flow.
  • FIG. 9 shows a schematic block diagram of an optical flow estimation device 400 provided by an embodiment of the present application.
  • the optical flow estimation device 400 may include a processor 401 and a communication interface 402, where the processor 401 and the communication interface 402 are coupled.
  • the communication interface 402 is used to input image data to the processor 401 and/or output image data from the processor 401; the processor 401 runs computer programs or instructions, so that the optical flow estimation device 400 implements the optical flow estimation method described in the embodiment of the method 200.
  • the processor 401 in the embodiment of the present application includes but is not limited to a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), discrete gate or transistor logic devices, or discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, a microcontroller, or any conventional processor or the like.
  • The processor 401 is configured to obtain a first image frame and a second image frame through the communication interface 402, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; obtain a first event frame through the communication interface 402, where the first event frame is used to describe the brightness changes of the target scene between the first image frame and the second image frame; and determine a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow includes at least one of the optical flow of corresponding pixels between the first image frame and a target moment and the optical flow of corresponding pixels between the target moment and the second image frame, and the target moment is any moment between the first image frame and the second image frame.
  • The optical flow estimation device 400 may specifically be the optical flow estimation device in the above embodiment of the optical flow estimation method 200, and may be used to perform the procedures and/or steps corresponding to the optical flow estimation device in the embodiment of the method 200; to avoid repetition, details are not repeated here.
  • the optical flow estimating apparatus 400 may further include a memory 403 .
  • Memory 403 may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory.
  • The non-volatile memory can be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DR RAM).
  • the memory 403 is used to store program codes and instructions of the optical flow estimation device.
  • the memory 403 is also used to store data obtained by the processor 401 during execution of the above-mentioned embodiment of the optical flow estimation method 200, such as the first optical flow, the first optical flow allocation mask, the target optical flow, and the like.
  • the memory 403 may be an independent device or integrated in the processor 401 .
  • FIG. 9 only shows a simplified design of the optical flow estimation device 400 .
  • In practical applications, the optical flow estimation device 400 may also include other necessary components, including but not limited to any number of communication interfaces, processors, controllers, memories, etc., and all optical flow estimation devices 400 that can realize the present application fall within the scope of protection of this application.
  • the optical flow estimating device 400 may be a chip.
  • the chip may also include one or more memories for storing computer-executable instructions.
  • the processor may execute the computer-executable instructions stored in the memory, so that the chip executes the above optical flow estimation method.
  • The chip device can be a field programmable gate array, an application-specific integrated chip, a system chip, a central processing unit, a network processor, a digital signal processing circuit or a microcontroller that realizes the relevant functions, and may also use a programmable controller or another integrated chip.
  • the embodiment of the present application also provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when the computer instructions are run on a computer, the optical flow estimation method described in the foregoing method embodiments is implemented.
  • An embodiment of the present application further provides a computer program product, and when the computer program product is run on a processor, the method for estimating optical flow described in the foregoing method embodiments is implemented.
  • An embodiment of the present application further provides a terminal, where the terminal includes the foregoing optical flow estimation system.
  • the terminal may further include a display screen for displaying the target optical flow output by the above-mentioned optical flow estimation system.
  • The optical flow estimation device, computer-readable storage medium, computer program product, chip and terminal provided in the embodiments of the present application are all used to execute the corresponding optical flow estimation method provided above. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding optical flow estimation method provided above; details are not repeated here.
  • The sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present application provides an optical flow estimation method and device, which can improve the accuracy of estimating the optical flow from two adjacent image frames to any moment between those two frames. The method may include: obtaining a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; obtaining a first event frame, where the first event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the second image frame; and determining a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.

Description

Optical flow estimation method and device
This application claims priority to Chinese Patent Application No. 202111199513.6, filed on October 14, 2021 and entitled "Optical flow estimation method and device", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of computer vision, and in particular to an optical flow estimation method and device.
Background
Optical flow is a method that uses the temporal changes of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby computes the motion information of objects between adjacent frames.
Traditional optical flow estimation methods can only estimate the optical flow between a first image frame and a second image frame (including the optical flow from the first image frame to the second image frame and the optical flow from the second image frame to the first image frame), where the first image frame and the second image frame are two adjacent image frames. The optical flow from either of these two image frames to any moment between them (including the optical flow from the first image frame to that moment and the optical flow from the second image frame to that moment) can only be distributed proportionally, using time length as the weight, under a linear motion assumption.
Since linear motion is an assumption that differs considerably from the actual situation, estimating the optical flow from two adjacent image frames to any moment between them with the traditional optical flow computation method gives low accuracy.
Summary
The present application provides an optical flow estimation method and device, which can improve the accuracy of estimating the optical flow from two adjacent image frames to any moment between those two frames.
In a first aspect, the present application provides an optical flow estimation method. The method may include: obtaining a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; obtaining a first event frame, where the first event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the second image frame; and determining a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
In a possible implementation, the method may be used in an optical flow estimation system. The optical flow estimation system may include a pixel sensor, an event sensor and an optical flow estimation device, with the pixel sensor and the event sensor each connected to the optical flow estimation device. The method may be executed, for example, by the above optical flow estimation device.
It should also be noted that, since the image sequence is obtained by photographing the target scene with a pixel camera, the first image frame and the second image frame contain pixel information of the target objects in the target scene. Since the event stream data is obtained by photographing the target scene with an event camera, the first event stream data can capture the real high-speed motion information (including both linear and nonlinear motion) of the target objects in the target scene during the time period between the first image frame and the second image frame.
With the optical flow estimation method provided in the present application, the optical flow estimation device first estimates the first optical flow between the first image frame and the second image frame based on the first image frame, the second image frame and the first event frame; then, based on the second event frame, it determines the weight of the optical flow from the first image frame to the target moment relative to the optical flow from the first image frame to the second image frame (i.e., the first optical flow allocation mask). Since no motion assumption is imposed on the target objects, the resulting first optical flow allocation mask can allocate the real motion optical flow accurately, and therefore the target optical flow obtained by weighting the first optical flow with the first optical flow allocation mask is more accurate.
Optionally, the optical flow estimation device may obtain the first image frame and the second image frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may receive the first image frame and the second image frame sent by a pixel camera.
In another possible implementation, the optical flow estimation device may obtain, through an input interface, the first image frame and the second image frame from another device or provided as input.
Optionally, the target scene may include at least one target object, some or all of which are in motion.
Optionally, the optical flow estimation device may obtain the first event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may receive event stream data sent by an event camera, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and a polarity; the optical flow estimation device may obtain the first event frame based on the event stream data. In other words, the event camera may collect the event stream data and send it to the optical flow estimation device.
In another possible implementation, the optical flow estimation device may receive the first event frame sent by the event camera. In other words, the event camera may collect the event stream data, generate the first event frame based on the event stream data, and send it to the optical flow estimation device.
It should be noted that the first image frame, the second image frame and the first event frame have the same resolution.
In a possible implementation, taking the first image frame as including H×W pixels, where H and W are both integers greater than 1, the first event frame may include multiple channels, and the multiple channels may include a first channel, a second channel, a third channel and a fourth channel. The first channel includes H×W first values corresponding one-to-one to the positions of the H×W pixels, each first value indicating the number of times the brightness of the pixel at the corresponding position in the first image frame increased during the time period between the first image frame and the second image frame; the second channel includes H×W second values corresponding one-to-one to the positions of the H×W pixels, each second value indicating the number of times the brightness of the pixel at the corresponding position decreased during that time period; the third channel includes H×W third values corresponding one-to-one to the positions of the H×W pixels, each third value indicating the time of the last brightness increase of the pixel at the corresponding position during that time period; the fourth channel includes H×W fourth values corresponding one-to-one to the positions of the H×W pixels, each fourth value indicating the time of the last brightness decrease of the pixel at the corresponding position during that time period.
In a possible implementation, before the optical flow estimation device determines the target optical flow based on the first image frame, the second image frame and the first event frame, the optical flow estimation device may obtain a second event frame, where the second event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the target moment. Accordingly, the optical flow estimation device determining the target optical flow based on the first image frame, the second image frame and the first event frame includes: the optical flow estimation device determining the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
Specifically, the optical flow estimation device may determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame; determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask is used to indicate the weight of the target optical flow relative to the first optical flow; and determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
Optionally, the first optical flow may be a sparse optical flow or a dense optical flow, which is not limited in this application.
In a possible implementation, taking the first optical flow as a dense optical flow as an example, the first optical flow represents different motion directions by different pixel colors and different motion speeds by different pixel brightness levels.
Optionally, the optical flow estimation device may determine the first optical flow based on the first image frame, the second image frame and the first event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
To meet device-side deployment and real-time requirements, the network structure of the above optical flow estimation model cannot be too complex; at the same time, since optical flow estimation is a complex task, a network with a simple structure can hardly achieve high-accuracy optical flow estimation. Therefore, this application trains a lightweight first convolutional neural network, which may include dimension-reduction, convolution, residual, deconvolution and dimension-raising processing layers; by iterating the first convolutional neural network in a loop, the accuracy of optical flow estimation is improved while remaining easy to deploy on the device side.
Optionally, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model and iterate in a loop to obtain the first optical flow. That is, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model to obtain a second optical flow; input the first image frame, the second image frame, the first event frame and the second optical flow into the optical flow estimation model to obtain a third optical flow; and so on, iterating until the output optical flow satisfies the loss function preset for the optical flow estimation model, at which point the first optical flow is output.
It should be noted that the above first optical flow allocation mask has the same resolution as the first image frame, the second image frame and the first event frame.
Optionally, the optical flow estimation device may determine the first optical flow allocation mask based on the second event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
To meet device-side deployment and real-time requirements, the network structure of the above optical flow allocation model cannot be too complex either. Therefore, this application trains a lightweight second CNN, which may include processing layers such as fusion and convolution; by iterating the second CNN in a loop, the accuracy of optical flow estimation is improved while remaining easy to deploy on the device side.
Optionally, the optical flow estimation device may input the second event frame into the optical flow allocation model and iterate in a loop to obtain the first optical flow allocation mask.
In a second aspect, the present application further provides an optical flow estimation device, including an obtaining module and an optical flow estimation module. The obtaining module is configured to obtain a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; and to obtain a first event frame, where the first event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the second image frame. The optical flow estimation module is configured to determine a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
In a possible implementation, the obtaining module is further configured to obtain a second event frame before the target optical flow is determined based on the first image frame, the second image frame and the first event frame, where the second event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the target moment; the optical flow estimation module is specifically configured to determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
In a possible implementation, the optical flow estimation module includes an inter-frame optical flow estimation sub-module, an optical flow allocation sub-module and an inter-frame arbitrary-moment optical flow estimation sub-module. The inter-frame optical flow estimation sub-module is configured to determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame; the optical flow allocation sub-module is configured to determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask is used to indicate the weight of the target optical flow relative to the first optical flow; the inter-frame arbitrary-moment optical flow estimation sub-module is configured to determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
In a possible implementation, the inter-frame optical flow estimation sub-module is specifically configured to input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
In a possible implementation, the inter-frame optical flow estimation sub-module is specifically configured to input the first image frame, the second image frame and the first event frame into the optical flow estimation model and iterate in a loop to obtain the first optical flow.
In a possible implementation, the optical flow allocation sub-module is specifically configured to input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
In a possible implementation, the optical flow allocation sub-module is specifically configured to input the second event frame into the optical flow allocation model and iterate in a loop to obtain the first optical flow allocation mask.
In a possible implementation, the first image frame includes H×W pixels, where H and W are both integers greater than 1, and the first event frame includes multiple channels, the multiple channels including a first channel, a second channel, a third channel and a fourth channel. The first channel includes H×W first values corresponding one-to-one to the positions of the H×W pixels, each first value indicating the number of times the brightness of the pixel at the corresponding position in the first image frame increased during the time period between the first image frame and the second image frame; the second channel includes H×W second values corresponding one-to-one to the positions of the H×W pixels, each second value indicating the number of times the brightness of the pixel at the corresponding position decreased during that time period; the third channel includes H×W third values corresponding one-to-one to the positions of the H×W pixels, each third value indicating the timestamp of the last brightness increase of the pixel at the corresponding position during that time period; the fourth channel includes H×W fourth values corresponding one-to-one to the positions of the H×W pixels, each fourth value indicating the timestamp of the last brightness decrease of the pixel at the corresponding position during that time period.
In a possible implementation, the obtaining module is specifically configured to: obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and a polarity; and obtain the first event frame based on the event stream data.
In a third aspect, the present application further provides an optical flow estimation device. The optical flow estimation device may include at least one processor and at least one communication interface, the at least one processor being coupled to the at least one communication interface, the at least one communication interface being configured to provide information and/or data for the at least one processor, and the at least one processor being configured to run computer program instructions to perform the optical flow estimation method described in the first aspect and any of its possible implementations.
Optionally, the device may be a chip or an integrated circuit.
In a fourth aspect, the present application further provides a terminal, which may include the optical flow estimation device described in the second aspect and any of its possible implementations, or the optical flow estimation device described in the third aspect.
In a fifth aspect, the present application further provides a computer-readable storage medium for storing a computer program which, when run by a processor, implements the optical flow estimation method described in the first aspect and any of its possible implementations.
In a sixth aspect, the present application further provides a computer program product which, when run on a processor, implements the optical flow estimation method described in the first aspect and any of its possible implementations.
The optical flow estimation device, computer storage medium, computer program product, chip and terminal provided in this application are all used to perform the optical flow estimation method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the optical flow estimation method provided above, and details are not repeated here.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of event stream data provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of sparse optical flow and dense optical flow provided by an embodiment of the present application;
FIG. 3 is a schematic block diagram of an optical flow estimation system 100 provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of an optical flow estimation method 200 provided by an embodiment of the present application;
FIG. 5 is another schematic diagram of event stream data provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a first event frame provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of an optical flow estimation device 300 provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of an optical flow estimation method provided by an embodiment of the present application;
FIG. 9 is a schematic block diagram of an optical flow estimation device 400 provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
1. Pixel camera
A pixel camera, i.e., a traditional camera, samples the brightness values of a scene at a fixed rate (i.e., the frame rate) and outputs image data at that fixed rate.
2. Event-based camera
An event camera is a new type of sensor that captures dynamic changes in pixel brightness in a scene in an event-driven manner.
To some extent, a traditional camera captures a static/stationary space, whereas the purpose of an event camera is to sensitively capture moving objects.
Unlike traditional cameras, an event camera only observes the "motion" in a scene, or more precisely, the "brightness changes" in the scene. An event camera outputs the brightness change of a pixel (1 or 0) only when a brightness change occurs, and has advantages such as fast response and wide dynamic range.
For a single pixel, an event camera produces output only when the light intensity at that pixel changes. For example, if the brightness increases beyond a threshold, the corresponding pixel outputs a brightness-increase event. An event camera has no notion of frames; as the scene changes, it produces a series of pixel-level outputs. Its theoretical temporal resolution is as fine as 1 us, so its latency is very low, below the speed of the vast majority of motion in common scenes, and motion blur therefore does not occur. In addition, each pixel of an event camera works independently and asynchronously, so the dynamic range is large. Event cameras also have the advantage of low power consumption.
In summary, a traditional camera shoots full frames of the scene at a fixed frame rate, with all pixels working synchronously. In an event camera, each pixel works independently and asynchronously, the sampling rate is as high as one million hertz (Hz), and only brightness changes (i.e., events) are output. An event is described by a four-tuple of event data, and the event data output by all pixels are aggregated into a list of events, which is output by the camera as event stream data.
For example, the event data of an event can be expressed as (x, y, t, p), where (x, y) are the pixel coordinates at which the event occurred, t is the moment at which the event occurred, and p is the polarity of the event (e.g., p = 0 indicates that the brightness of the pixel decreased compared with the previous sample, and p = 1 indicates that it increased).
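As a toy illustration of this four-tuple (the field names below are chosen here for readability and are not from the patent text), an event could be represented in Python as:

```python
from typing import NamedTuple

class Event(NamedTuple):
    x: int      # pixel column where the brightness change occurred
    y: int      # pixel row
    t: float    # timestamp of the event
    p: int      # polarity: 1 = brightness increased, 0 = decreased

# a brightness-increase event at pixel (1, 1):
ev = Event(x=1, y=1, t=0.000123, p=1)
```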
Commonly used event cameras include the dynamic vision sensor (DVS) and the dynamic and active-pixel vision sensor (DAVIS).
As an example, FIG. 1 shows a schematic diagram of event stream data. Part (a) of FIG. 1 represents the event stream data as an event data list; the list includes multiple pieces of event data, each describing one event as a four-tuple of timestamp, x coordinate, y coordinate and polarity. Part (b) of FIG. 1 represents the event stream data as a visualization in which the three axes are frame width, frame height and time, and the color of a point indicates the number of brightness changes of the pixel at the corresponding position; the brighter the color, the more brightness changes.
3. Optical flow
Optical flow, i.e., the flow of light, is a method that uses the temporal changes of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby computes the motion information of objects between adjacent frames.
Depending on whether sparse points in the image are selected for optical flow estimation, optical flow can be divided into sparse optical flow and dense optical flow.
As an example, FIG. 2 shows a schematic diagram of sparse and dense optical flow. Part (a) of FIG. 2 is a schematic diagram of sparse optical flow, depicting the optical flow of pixels at some salient feature points (such as corners) of the image moving toward the next frame; part (b) of FIG. 2 is a schematic diagram of dense optical flow, depicting the optical flow of all pixels in the image moving toward the next frame. In the dense optical flow map, different colors indicate different motion directions and different brightness levels indicate different motion speeds.
In the prior art, the optical flow network (FlowNet) is a model for estimating optical flow trained with a convolutional neural network. Optical flow estimation refers to estimating the pixel-level optical flow between any two adjacent image frames in an image sequence.
As an example, taking two adjacent image frames as a first image frame and a second image frame, with the first image frame being the frame preceding the second image frame, the existing optical flow network can estimate the bidirectional optical flow between the first image frame and the second image frame, i.e., the optical flow from the first image frame to the second image frame and the optical flow from the second image frame to the first image frame.
Since a traditional camera captures images at a constant rate (i.e., the frame rate), even a frame rate of 1 kHz implies a delay of 1 ms, during which the target object may be moving at high speed.
The optical flow from either of the two image frames to any moment between them, i.e., the optical flow from the first image frame to that moment and the optical flow from the second image frame to that moment, can only be distributed proportionally, using time length as the weight, under a linear motion assumption.
However, in the above scenario the target object may be moving nonlinearly; therefore, estimating the optical flow from two adjacent image frames to any moment between them with the existing linear motion assumption gives low accuracy.
First, the optical flow estimation system to which the optical flow estimation method and device provided in this application are applied is introduced.
FIG. 3 shows a schematic block diagram of an optical flow estimation system 100 provided by an embodiment of the present application. As shown in FIG. 3, the optical flow estimation system 100 may include a pixel sensor 110, an event sensor 120 and an optical flow estimation device 130, with the pixel sensor 110 and the event sensor 120 each connected to the optical flow estimation device 130.
The pixel sensor 110 is configured to photograph a target scene to obtain an image sequence, and to send a first image frame and a second image frame to the optical flow estimation device 130, where the first image frame and the second image frame are any two adjacent image frames in the image sequence.
For example, the pixel sensor may be a pixel camera. This application does not limit the model or type of the pixel camera.
The event sensor 120 is configured to photograph the target scene to obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and a polarity; to obtain a first event frame based on the event stream data, where the first event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the second image frame, and the first image frame, the second image frame and the first event frame have the same resolution; and to send the first event frame to the optical flow estimation device 130.
For example, the event sensor may be an event camera. This application does not limit the model or type of the event camera.
The optical flow estimation device 130 is configured to determine a target optical flow based on the first image frame, the second image frame and the first event frame (for the specific method, reference may be made to the optical flow estimation method provided in this application and introduced below), where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
It should be noted that only the method of estimating the optical flow from the first image frame to the target moment (i.e., the target optical flow) is introduced above as an example, but this application is not limited thereto; the method of estimating the optical flow from the second image frame to the target moment is similar, and reference may be made to the estimation method of the target optical flow provided in this application, so details are not repeated here.
Optionally, the above event sensor 120 may directly send the event stream data to the optical flow estimation device 130; accordingly, the optical flow estimation device 130 obtains the first event frame based on the event stream data.
Optionally, the above devices may communicate with each other in a wired or wireless manner, which is not limited in this application.
For example, the wired manner may be a connection through a data cable or through an internal bus.
For example, the wireless manner may be communication through a communication network. The communication network may be a local area network, a wide area network switched through relay devices, or a combination of a local area network and a wide area network. When the communication network is a local area network, it may for example be a wireless fidelity (Wi-Fi) hotspot network, a Wi-Fi peer-to-peer (P2P) network, a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or a possible future general-purpose short-range communication network. When the communication network is a wide area network, it may for example be a third-generation mobile communication technology (3G) network, a fourth-generation mobile communication technology (4G) network, a fifth-generation mobile communication technology (5G) network, a public land mobile network (PLMN) or the Internet, which is not limited in this application.
With the optical flow estimation system provided in this application, since the first event frame can capture low-latency motion information of the target in the target scene between the first image frame and the second image frame, and the first image frame and the second image frame can capture pixel information of the target scene, determining the target optical flow by combining the first event frame, the first image frame and the second image frame can improve the accuracy of the target optical flow.
Having introduced the optical flow estimation system provided in this application, the optical flow estimation method provided in this application and applied to the above optical flow estimation system will be further introduced below.
Referring to FIG. 4, FIG. 4 shows an optical flow estimation method 200 provided by an embodiment of the present application, and the optical flow estimation method 200 may be used in the above optical flow estimation system 100. As shown in FIG. 4, the optical flow estimation method 200 may include the following steps. It should be noted that the steps listed below may be executed in various orders and/or concurrently, and are not limited to the execution order shown in FIG. 4.
S201. Obtain a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene.
Optionally, the method 200 may be executed by an optical flow estimation device.
For example, the optical flow estimation device here may be the optical flow estimation device 130 in the above optical flow estimation system 100.
Optionally, the optical flow estimation device may obtain the first image frame and the second image frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may receive the first image frame and the second image frame sent by a pixel camera.
For example, the pixel camera here may be the pixel sensor 110 in the above optical flow estimation system 100.
In another possible implementation, the optical flow estimation device may obtain, through an input interface, the first image frame and the second image frame from another device or provided as input.
Optionally, the target scene may include at least one target object, some or all of which are in motion.
S202. Obtain a first event frame, where the first event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the second image frame.
Optionally, the optical flow estimation device may obtain the first event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may receive event stream data sent by an event camera, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and a polarity; the optical flow estimation device may obtain the first event frame based on the event stream data. In other words, the event camera may collect the event stream data and send it to the optical flow estimation device.
In another possible implementation, the optical flow estimation device may receive the first event frame sent by the event camera. In other words, the event camera may collect the event stream data, generate the first event frame based on the event stream data, and send it to the optical flow estimation device.
For example, the event camera here may be the event sensor 120 in the above optical flow estimation system 100.
It should be noted that the first image frame, the second image frame and the first event frame have the same resolution.
As an example, taking the resolutions of the first image frame, the second image frame and the first event frame as all being 4×4, FIG. 5 shows a schematic diagram of event stream data provided by an embodiment of the present application. As shown in FIG. 5, the event stream data includes 20 pieces of event data, which respectively describe 20 brightness changes of pixels in the target scene (i.e., 20 events occurring in the target scene) during the time period between the first image frame (i.e., time t1) and the second image frame (i.e., time t10), where t1 to t10 increase in order. As shown in FIG. 5, taking event 1 as an example, the event data of event 1 includes the following four-tuple: timestamp t1+Δt1, x coordinate 1, y coordinate 1, and polarity 1 (i.e., brightness increase); the event described is that the brightness of the pixel at coordinates (1, 1) increased at time t1+Δt1. The other 19 events can be understood similarly and are not repeated here. From the above event stream data, the motion speed and motion direction of different pixels at different moments can be estimated.
In a possible implementation, taking the first image frame as including H×W pixels, where H and W are both integers greater than 1, the first event frame may include multiple channels, and the multiple channels may include a first channel, a second channel, a third channel and a fourth channel. The first channel includes H×W first values corresponding one-to-one to the positions of the H×W pixels, each first value indicating the number of times the brightness of the pixel at the corresponding position in the first image frame increased during the time period between the first image frame and the second image frame; the second channel includes H×W second values corresponding one-to-one to the positions of the H×W pixels, each second value indicating the number of times the brightness of the pixel at the corresponding position decreased during that time period; the third channel includes H×W third values corresponding one-to-one to the positions of the H×W pixels, each third value indicating the time of the last brightness increase of the pixel at the corresponding position during that time period; the fourth channel includes H×W fourth values corresponding one-to-one to the positions of the H×W pixels, each fourth value indicating the time of the last brightness decrease of the pixel at the corresponding position during that time period.
As an example, FIG. 6 shows a schematic diagram of a first event frame provided by an embodiment of the present application; the first event frame may include a first channel, a second channel, a third channel and a fourth channel, and is obtained based on the event stream data shown in FIG. 5. As shown in FIG. 6, taking the pixel at coordinates (1, 1) as an example, in the event stream data in FIG. 5 this pixel has polarity 1 (i.e., brightness increase) at times t2 and t9; that is, between t1 and t10 its brightness increased 2 times and decreased 0 times, the last increase occurred at t9, and there was no last decrease. Therefore, the first value at the pixel with coordinates (1, 1) in the first channel (shown as the pixel with a black background in the first channel in FIG. 6) is 2, the second value at that pixel in the second channel (shown as the pixel with a black background in the second channel in FIG. 6) is 0, the third value at that pixel in the third channel (shown as the pixel with a black background in the third channel in FIG. 6) is t9, and the fourth value at that pixel in the fourth channel (shown as the pixel with a black background in the fourth channel in FIG. 6) is recorded as 0.
Similarly, taking the pixel at coordinates (2, 2) as an example, in the event stream data in FIG. 5 this pixel has polarity 0 (i.e., brightness decrease) at times t2, t3, t4, t5 and t6; that is, between t1 and t10 its brightness increased 0 times and decreased 5 times, there was no last increase, and the last decrease occurred at t6. Therefore, the first value at the pixel with coordinates (2, 2) in the first channel (shown as the pixel with a bold border in the first channel in FIG. 6) is 0, the second value at that pixel in the second channel (shown as the pixel with a bold border in the second channel in FIG. 6) is 5, the third value at that pixel in the third channel (shown as the pixel with a bold border in the third channel in FIG. 6) is recorded as 0, and the fourth value at that pixel in the fourth channel (shown as the pixel with a bold border in the fourth channel in FIG. 6) is t6.
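To make the channel construction concrete, the following is a minimal Python sketch of how the four channels described above could be accumulated from event stream data. It assumes the events are (x, y, t, p) tuples in timestamp order and uses numpy purely for illustration; it is not the patent's implementation.

```python
import numpy as np

def build_event_frame(events, H, W):
    """Accumulate a 4-channel event frame of resolution H x W."""
    frame = np.zeros((4, H, W), dtype=np.float32)
    for x, y, t, p in events:          # events assumed timestamp-ordered
        if p == 1:                     # brightness increased
            frame[0, y, x] += 1        # channel 1: number of increases
            frame[2, y, x] = t         # channel 3: time of last increase
        else:                          # brightness decreased
            frame[1, y, x] += 1        # channel 2: number of decreases
            frame[3, y, x] = t         # channel 4: time of last decrease
    return frame

# e.g. the pixel (1, 1) example above: two increases, at t2 and t9
events = [(1, 1, 0.2, 1), (1, 1, 0.9, 1)]
frame = build_event_frame(events, H=4, W=4)   # frame[0, 1, 1] == 2
```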
S203. Determine a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow is the optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
It should also be noted that, since the image sequence is obtained by photographing the target scene with a pixel camera, the first image frame and the second image frame contain pixel information of the target objects in the target scene. Since the event stream data is obtained by photographing the target scene with an event camera, the first event stream data can capture the real high-speed motion information (including both linear and nonlinear motion) of the target objects in the target scene during the time period between the first image frame and the second image frame.
In summary, estimating the optical flow at the target moment by combining the first image frame, the second image frame and the first event frame can improve the accuracy of the target optical flow.
In a possible implementation, before S203, the optical flow estimation device may obtain a second event frame, where the second event frame is used to describe the brightness changes of the target scene during the time period between the first image frame and the target moment; accordingly, in S203, the optical flow estimation device may determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
It should be noted that the second event frame may be obtained in the same manner as the first event frame described above, and details are not repeated here.
Specifically, the optical flow estimation device may determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow from the first image frame to the second image frame; determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask is used to indicate the weight of the target optical flow relative to the first optical flow; and determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
Optionally, the first optical flow may be a sparse optical flow or a dense optical flow, which is not limited in this application.
In a possible implementation, taking the first optical flow as a dense optical flow as an example, the first optical flow represents different motion directions by different pixel colors and different motion speeds by different pixel brightness levels.
Optionally, the optical flow estimation device may determine the first optical flow based on the first image frame, the second image frame and the first event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
To meet device-side deployment and real-time requirements, the network structure of the above optical flow estimation model cannot be too complex; at the same time, since optical flow estimation is a complex task, a network with a simple structure can hardly achieve high-accuracy optical flow estimation. Therefore, this application trains a lightweight first convolutional neural network (CNN), which may include dimension-reduction, convolution, residual, deconvolution and dimension-raising processing layers; by iterating the first CNN in a loop, the accuracy of optical flow estimation is improved while remaining easy to deploy on the device side.
Optionally, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model and iterate in a loop to obtain the first optical flow. That is, the optical flow estimation device may input the first image frame, the second image frame and the first event frame into the optical flow estimation model to obtain a second optical flow; input the first image frame, the second image frame, the first event frame and the second optical flow into the optical flow estimation model to obtain a third optical flow; and so on, iterating until the output optical flow satisfies the loss function preset for the optical flow estimation model, at which point the first optical flow is output.
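The loop iteration described above can be sketched as follows. Here flow_model and loss_fn are assumed placeholders for the preset optical flow estimation model and its preset loss function, and the stopping rule shown is an illustrative reading of the convergence criterion, not a definitive implementation.

```python
def iterate_first_flow(frame1, frame2, event_frame, flow_model, loss_fn,
                       tol=1e-3, max_iters=10):
    # First pass yields the "second optical flow" of the text.
    flow = flow_model(frame1, frame2, event_frame, None)
    for _ in range(max_iters):
        # Feed the previous estimate back in ("third optical flow", ...).
        new_flow = flow_model(frame1, frame2, event_frame, flow)
        if loss_fn(new_flow) < tol:    # preset loss function satisfied
            return new_flow            # output as the first optical flow
        flow = new_flow
    return flow
```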
It should be noted that the above first optical flow allocation mask has the same resolution as the first image frame, the second image frame and the first event frame.
Optionally, the optical flow estimation device may determine the first optical flow allocation mask based on the second event frame in various ways, which is not limited in this application.
In a possible implementation, the optical flow estimation device may input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
To meet device-side deployment and real-time requirements, the network structure of the above optical flow allocation model cannot be too complex either. Therefore, this application trains a lightweight second CNN, which may include processing layers such as fusion and convolution; by iterating the second CNN in a loop, the accuracy of optical flow estimation is improved while remaining easy to deploy on the device side.
As an example, the fusion processing layer is used to fuse the multi-channel second event frame into a one-channel image, and the convolution processing layer is used to convolve the one-channel image with a convolution kernel in the X direction and a convolution kernel in the Y direction respectively, outputting the one-channel first optical flow allocation mask, whose resolution is the same as that of the second event frame.
As an example, the convolution kernel in the X direction may be as shown in Figure PCTCN2022121050-appb-000001, and the convolution kernel in the Y direction may be as shown in Figure PCTCN2022121050-appb-000002.
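A sketch of the fusion and convolution layers is given below. The actual X-direction and Y-direction kernels are given only in the figures referenced above, so they are taken as parameters here rather than guessed, and combining the two directional responses by their magnitude is likewise an illustrative assumption rather than the patent's specified computation.

```python
import numpy as np
from scipy.signal import convolve2d

def allocation_mask(event_frame2, kx, ky):
    fused = event_frame2.sum(axis=0)           # fuse channels into one image
    gx = convolve2d(fused, kx, mode="same")    # X-direction convolution
    gy = convolve2d(fused, ky, mode="same")    # Y-direction convolution
    return np.hypot(gx, gy)                    # one-channel mask, same resolution
```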
Optionally, the optical flow estimation device may input the second event frame into the optical flow allocation model and iterate in a loop to obtain the first optical flow allocation mask. That is, the optical flow estimation device may input the second event frame into the optical flow allocation model to obtain a second optical flow allocation mask; input the second event frame and the second optical flow allocation mask into the optical flow allocation model to obtain a third optical flow allocation mask; and so on, iterating until the output mask satisfies the loss function preset for the optical flow allocation model, at which point the first optical flow allocation mask is output.
In a possible implementation, the optical flow estimation device may weight the optical flow at corresponding positions in the first optical flow with the first optical flow allocation mask to obtain the target optical flow.
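A minimal sketch of this weighting step, assuming the first optical flow has shape (2, H, W) (per-pixel x and y displacement) and the mask has shape (H, W) with per-pixel weights:

```python
import numpy as np

def weight_flow(first_flow, mask):
    # Per-pixel weighting of the frame-to-frame flow by the allocation
    # mask yields the flow from the first image frame to the target moment.
    return first_flow * mask[None, :, :]
```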
With the optical flow estimation method provided in the present application, the optical flow estimation device first estimates the first optical flow between the first image frame and the second image frame based on the first image frame, the second image frame and the first event frame; then, based on the second event frame, it determines the weight of the optical flow from the first image frame to the target moment relative to the optical flow from the first image frame to the second image frame (i.e., the first optical flow allocation mask). Since no motion assumption is imposed on the target objects, the resulting first optical flow allocation mask can allocate the real motion optical flow accurately, and therefore the target optical flow obtained by weighting the first optical flow with the first optical flow allocation mask is more accurate.
The optical flow estimation method provided by the embodiments of this application has been introduced above with reference to FIG. 4 to FIG. 6; the optical flow estimation device provided by the embodiments of this application will be further introduced below.
Referring to FIG. 7, FIG. 7 shows a schematic block diagram of an optical flow estimation device 300 provided by an embodiment of the present application. The optical flow estimation device 300 may include an obtaining module 301 and an optical flow estimation module 302.
The obtaining module 301 is configured to obtain a first image frame and a second image frame, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; and to obtain a first event frame, where the first event frame is used to describe the brightness changes of the target scene between the first image frame and the second image frame.
The optical flow estimation module 302 is configured to determine a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow includes at least one of the optical flow of corresponding pixels between the first image frame and a target moment and the optical flow of corresponding pixels between the target moment and the second image frame, and the target moment is any moment between the first image frame and the second image frame.
In a possible implementation, the obtaining module 301 is further configured to obtain a second event frame before the target optical flow is determined based on the first image frame, the second image frame and the first event frame, where the second event frame is used to describe the brightness changes of the target scene between the first image frame and the target moment; the optical flow estimation module 302 is specifically configured to determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
Optionally, the optical flow estimation module 302 may include an inter-frame optical flow estimation sub-module 3021, an optical flow allocation sub-module 3022 and an inter-frame arbitrary-moment optical flow estimation sub-module 3023.
In a possible implementation, the inter-frame optical flow estimation sub-module 3021 is configured to determine a first optical flow based on the first image frame, the second image frame and the first event frame, where the first optical flow is the optical flow of corresponding pixels between the first image frame and the second image frame; the optical flow allocation sub-module 3022 is configured to determine a first optical flow allocation mask based on the second event frame, where the first optical flow allocation mask is used to indicate the weight of the target optical flow relative to the first optical flow; the inter-frame arbitrary-moment optical flow estimation sub-module 3023 is configured to determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
In a possible implementation, the inter-frame optical flow estimation sub-module 3021 is specifically configured to input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
In a possible implementation, the inter-frame optical flow estimation sub-module 3021 is specifically configured to input the first image frame, the second image frame and the first event frame into the optical flow estimation model and iterate in a loop to obtain the first optical flow.
In a possible implementation, the optical flow allocation sub-module 3022 is specifically configured to input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
In a possible implementation, the optical flow allocation sub-module 3022 is specifically configured to input the second event frame into the optical flow allocation model and iterate in a loop to obtain the first optical flow allocation mask.
In a possible implementation, the first image frame includes H×W pixels, where H and W are both integers greater than 1, and the first event frame includes multiple channels, the multiple channels including a first channel, a second channel, a third channel and a fourth channel. The first channel includes H×W first values corresponding one-to-one to the positions of the H×W pixels, each first value indicating the number of times the brightness of the pixel at the corresponding position in the first image frame increased between the first image frame and the second image frame; the second channel includes H×W second values corresponding one-to-one to the positions of the H×W pixels, each second value indicating the number of times the brightness of the pixel at the corresponding position decreased between the first image frame and the second image frame; the third channel includes H×W third values corresponding one-to-one to the positions of the H×W pixels, each third value indicating the time of the last brightness increase of the pixel at the corresponding position between the first image frame and the second image frame; the fourth channel includes H×W fourth values corresponding one-to-one to the positions of the H×W pixels, each fourth value indicating the time of the last brightness decrease of the pixel at the corresponding position between the first image frame and the second image frame.
In a possible implementation, the obtaining module is specifically configured to obtain event stream data, where the event stream data includes event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event includes a timestamp, pixel coordinates and a polarity; and to obtain the first event frame based on the event stream data.
It should be noted that, since the information exchange between the above modules, the execution processes and the like are based on the same conception as the method embodiments of this application, reference may be made to the method embodiment section for their specific functions and technical effects, and details are not repeated here. In an optional example, the optical flow estimation device 300 may specifically be the optical flow estimation device in the above embodiment of the optical flow estimation method 200, and the optical flow estimation device 300 may be used to perform the procedures and/or steps corresponding to the optical flow estimation device in the above embodiment of the optical flow estimation method 200; to avoid repetition, details are not repeated here.
One or more of the modules in the embodiment shown in FIG. 7 may be implemented by software, hardware, firmware or a combination thereof. The software or firmware includes but is not limited to computer program instructions or code, and may be executed by a hardware processor. The hardware includes but is not limited to various integrated circuits, such as a central processing unit (CPU), a digital signal processor (DSP), a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
As an example, FIG. 8 shows a schematic flowchart of the optical flow estimation method provided by an embodiment of the present application. Optionally, the steps in this flow may be executed by the optical flow estimation device 300 described in FIG. 7. It should be noted that the steps listed below may be executed in various orders and/or concurrently, and are not limited to the execution order shown in FIG. 8. The flow includes the following steps:
(1) The obtaining module 301 obtains the first image frame, the second image frame, the first event frame and the second event frame. For details, reference may be made to the relevant introductions in step 201 and step 202 of the above method.
(2) The obtaining module 301 sends the first image frame, the second image frame and the first event frame to the inter-frame optical flow estimation sub-module 3021.
(3) The inter-frame optical flow estimation sub-module 3021 inputs the first image frame, the second image frame and the first event frame into the optical flow estimation model and iterates in a loop to obtain the first optical flow. For details, reference may be made to the relevant introduction in step 203 of the above method.
(4) The inter-frame optical flow estimation sub-module 3021 sends the first optical flow to the inter-frame arbitrary-moment optical flow estimation sub-module 3023.
(5) The obtaining module 301 sends the second event frame to the optical flow allocation sub-module 3022.
(6) The optical flow allocation sub-module 3022 inputs the second event frame into the optical flow allocation model and iterates in a loop to obtain the first optical flow allocation mask. For details, reference may be made to the relevant introduction in step 203 of the above method.
(7) The optical flow allocation sub-module 3022 sends the first optical flow allocation mask to the inter-frame arbitrary-moment optical flow estimation sub-module 3023.
(8) The inter-frame arbitrary-moment optical flow estimation sub-module 3023 weights the first optical flow with the first optical flow allocation mask to obtain the target optical flow.
Referring to FIG. 9, FIG. 9 shows a schematic block diagram of an optical flow estimation device 400 provided by an embodiment of the present application. The optical flow estimation device 400 may include a processor 401 and a communication interface 402, where the processor 401 and the communication interface 402 are coupled.
The communication interface 402 is used to input image data to the processor 401 and/or output image data from the processor 401; the processor 401 runs computer programs or instructions so that the optical flow estimation device 400 implements the optical flow estimation method described in the embodiment of the above method 200.
The processor 401 in the embodiments of this application includes but is not limited to a central processing unit (Central Processing Unit, CPU), a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA), a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, a microcontroller or any conventional processor.
For example, the processor 401 is configured to obtain a first image frame and a second image frame through the communication interface 402, where the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; obtain a first event frame through the communication interface 402, where the first event frame is used to describe the brightness changes of the target scene between the first image frame and the second image frame; and determine a target optical flow based on the first image frame, the second image frame and the first event frame, where the target optical flow includes at least one of the optical flow of corresponding pixels between the first image frame and a target moment and the optical flow of corresponding pixels between the target moment and the second image frame, and the target moment is any moment between the first image frame and the second image frame.
In an optional example, those skilled in the art can understand that the optical flow estimation device 400 may specifically be the optical flow estimation device in the above embodiment of the optical flow estimation method 200, and the optical flow estimation device 400 may be used to perform the procedures and/or steps corresponding to the optical flow estimation device in the above embodiment of the optical flow estimation method 200; to avoid repetition, details are not repeated here.
Optionally, the optical flow estimation device 400 may further include a memory 403.
The memory 403 may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically EPROM, EEPROM) or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (Static RAM, SRAM), dynamic RAM (Dynamic RAM, DRAM), synchronous DRAM (Synchronous DRAM, SDRAM), double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM), enhanced SDRAM (Enhanced SDRAM, ESDRAM), synchlink DRAM (Synchlink DRAM, SLDRAM) and direct Rambus RAM (Direct Rambus RAM, DR RAM).
Specifically, the memory 403 is used to store the program code and instructions of the optical flow estimation device. Optionally, the memory 403 is also used to store data obtained by the processor 401 during execution of the above embodiment of the optical flow estimation method 200, such as the first optical flow, the first optical flow allocation mask and the target optical flow.
Optionally, the memory 403 may be a separate device or may be integrated in the processor 401.
It should be noted that FIG. 9 only shows a simplified design of the optical flow estimation device 400. In practical applications, the optical flow estimation device 400 may further include other necessary elements, including but not limited to any number of communication interfaces, processors, controllers, memories, etc., and all optical flow estimation devices 400 that can implement this application fall within the protection scope of this application.
In a possible design, the optical flow estimation device 400 may be a chip. Optionally, the chip may further include one or more memories for storing computer-executable instructions; when the chip device runs, the processor may execute the computer-executable instructions stored in the memory so that the chip performs the above optical flow estimation method.
Optionally, the chip device may be a field programmable gate array, an application-specific integrated chip, a system chip, a central processing unit, a network processor, a digital signal processing circuit or a microcontroller that implements the relevant functions, and may also use a programmable controller or another integrated chip.
An embodiment of this application further provides a computer-readable storage medium storing computer instructions which, when run on a computer, implement the optical flow estimation method described in the above method embodiments.
An embodiment of this application further provides a computer program product which, when run on a processor, implements the optical flow estimation method described in the above method embodiments.
An embodiment of this application further provides a terminal including the above optical flow estimation system. Optionally, the terminal may further include a display screen for displaying the target optical flow output by the above optical flow estimation system.
The optical flow estimation device, computer-readable storage medium, computer program product, chip or terminal provided by the embodiments of this application are all used to perform the corresponding optical flow estimation method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding optical flow estimation method provided above, and details are not repeated here.
It should be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (21)

  1. An optical flow estimation method, comprising:
    obtaining a first image frame and a second image frame, wherein the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene;
    obtaining a first event frame, wherein the first event frame is used to describe brightness changes of the target scene during a time period between the first image frame and the second image frame;
    determining a target optical flow based on the first image frame, the second image frame and the first event frame, wherein the target optical flow is an optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  2. The method according to claim 1, wherein before the determining a target optical flow based on the first image frame, the second image frame and the first event frame, the method further comprises:
    obtaining a second event frame, wherein the second event frame is used to describe brightness changes of the target scene during a time period between the first image frame and the target moment;
    and the determining a target optical flow based on the first image frame, the second image frame and the first event frame comprises:
    determining the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  3. The method according to claim 2, wherein the determining the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame comprises:
    determining a first optical flow based on the first image frame, the second image frame and the first event frame, wherein the first optical flow is an optical flow from the first image frame to the second image frame;
    determining a first optical flow allocation mask based on the second event frame, wherein the first optical flow allocation mask is used to indicate a weight of the target optical flow relative to the first optical flow;
    determining the target optical flow based on the first optical flow and the first optical flow allocation mask.
  4. The method according to claim 3, wherein the determining a first optical flow based on the first image frame, the second image frame and the first event frame comprises:
    inputting the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  5. The method according to claim 4, wherein the inputting the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow comprises:
    inputting the first image frame, the second image frame and the first event frame into the optical flow estimation model and performing loop iterations to obtain the first optical flow.
  6. The method according to any one of claims 3 to 5, wherein the determining a first optical flow allocation mask based on the second event frame comprises:
    inputting the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  7. The method according to claim 6, wherein the inputting the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask comprises:
    inputting the second event frame into the optical flow allocation model and performing loop iterations to obtain the first optical flow allocation mask.
  8. The method according to any one of claims 1 to 7, wherein the first image frame comprises H×W pixels, H and W are both integers greater than 1, the first event frame comprises multiple channels, and the multiple channels comprise a first channel, a second channel, a third channel and a fourth channel;
    the first channel comprises H×W first values, the H×W first values correspond one-to-one to positions of the H×W pixels, and each first value is used to indicate the number of times the brightness of the pixel at the corresponding position in the first image frame increased during the time period between the first image frame and the second image frame;
    the second channel comprises H×W second values, the H×W second values correspond one-to-one to positions of the H×W pixels, and each second value is used to indicate the number of times the brightness of the pixel at the corresponding position in the first image frame decreased during the time period between the first image frame and the second image frame;
    the third channel comprises H×W third values, the H×W third values correspond one-to-one to positions of the H×W pixels, and each third value is used to indicate the timestamp of the last brightness increase of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame;
    the fourth channel comprises H×W fourth values, the H×W fourth values correspond one-to-one to positions of the H×W pixels, and each fourth value is used to indicate the timestamp of the last brightness decrease of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame.
  9. The method according to any one of claims 1 to 8, wherein the obtaining a first event frame comprises:
    obtaining event stream data, wherein the event stream data comprises event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event comprises a timestamp, pixel coordinates and a polarity;
    obtaining the first event frame based on the event stream data.
  10. An optical flow estimation device, comprising an obtaining module and an optical flow estimation module;
    the obtaining module is configured to obtain a first image frame and a second image frame, wherein the first image frame and the second image frame are any two adjacent image frames in an image sequence, and the image sequence is obtained by photographing a target scene; and obtain a first event frame, wherein the first event frame is used to describe brightness changes of the target scene during a time period between the first image frame and the second image frame;
    the optical flow estimation module is configured to determine a target optical flow based on the first image frame, the second image frame and the first event frame, wherein the target optical flow is an optical flow from the first image frame to a target moment, and the target moment is any moment between the first image frame and the second image frame.
  11. The device according to claim 10, wherein
    the obtaining module is further configured to obtain a second event frame before the target optical flow is determined based on the first image frame, the second image frame and the first event frame, wherein the second event frame is used to describe brightness changes of the target scene during a time period between the first image frame and the target moment;
    the optical flow estimation module is specifically configured to determine the target optical flow based on the first image frame, the second image frame, the first event frame and the second event frame.
  12. The device according to claim 11, wherein the optical flow estimation module comprises an inter-frame optical flow estimation sub-module, an optical flow allocation sub-module and an inter-frame arbitrary-moment optical flow estimation sub-module;
    the inter-frame optical flow estimation sub-module is configured to determine a first optical flow based on the first image frame, the second image frame and the first event frame, wherein the first optical flow is an optical flow from the first image frame to the second image frame;
    the optical flow allocation sub-module is configured to determine a first optical flow allocation mask based on the second event frame, wherein the first optical flow allocation mask is used to indicate a weight of the target optical flow relative to the first optical flow;
    the inter-frame arbitrary-moment optical flow estimation sub-module is configured to determine the target optical flow based on the first optical flow and the first optical flow allocation mask.
  13. The device according to claim 12, wherein the inter-frame optical flow estimation sub-module is specifically configured to input the first image frame, the second image frame and the first event frame into a preset optical flow estimation model to obtain the first optical flow.
  14. The device according to claim 13, wherein the inter-frame optical flow estimation sub-module is specifically configured to input the first image frame, the second image frame and the first event frame into the optical flow estimation model and perform loop iterations to obtain the first optical flow.
  15. The device according to any one of claims 12 to 14, wherein the optical flow allocation sub-module is specifically configured to input the second event frame into a preset optical flow allocation model to obtain the first optical flow allocation mask.
  16. The device according to claim 15, wherein the optical flow allocation sub-module is specifically configured to input the second event frame into the optical flow allocation model and perform loop iterations to obtain the first optical flow allocation mask.
  17. The device according to any one of claims 10 to 16, wherein the first image frame comprises H×W pixels, H and W are both integers greater than 1, the first event frame comprises multiple channels, and the multiple channels comprise a first channel, a second channel, a third channel and a fourth channel;
    the first channel comprises H×W first values, the H×W first values correspond one-to-one to positions of the H×W pixels, and each first value is used to indicate the number of times the brightness of the pixel at the corresponding position in the first image frame increased during the time period between the first image frame and the second image frame;
    the second channel comprises H×W second values, the H×W second values correspond one-to-one to positions of the H×W pixels, and each second value is used to indicate the number of times the brightness of the pixel at the corresponding position in the first image frame decreased during the time period between the first image frame and the second image frame;
    the third channel comprises H×W third values, the H×W third values correspond one-to-one to positions of the H×W pixels, and each third value is used to indicate the timestamp of the last brightness increase of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame;
    the fourth channel comprises H×W fourth values, the H×W fourth values correspond one-to-one to positions of the H×W pixels, and each fourth value is used to indicate the timestamp of the last brightness decrease of the pixel at the corresponding position in the first image frame during the time period between the first image frame and the second image frame.
  18. The device according to any one of claims 10 to 17, wherein the obtaining module is specifically configured to:
    obtain event stream data, wherein the event stream data comprises event data for each of at least one event, the at least one event corresponds one-to-one to at least one brightness change of the target scene occurring between the first image frame and the second image frame, and the data of each event comprises a timestamp, pixel coordinates and a polarity;
    obtain the first event frame based on the event stream data.
  19. An optical flow estimation device, comprising a processor and a communication interface, wherein the processor is coupled to the communication interface, the communication interface is configured to provide data for the processor, and the processor is configured to run computer program instructions to perform the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium, configured to store a computer program, wherein when the computer program is run by a processor, the method according to any one of claims 1 to 9 is implemented.
  21. A computer program product, wherein when the computer program product is run on a processor, the method according to any one of claims 1 to 9 is implemented.
PCT/CN2022/121050 2021-10-14 2022-09-23 Optical flow estimation method and device WO2023061187A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111199513.6 2021-10-14
CN202111199513.6A CN115984336A (zh) Optical flow estimation method and device

Publications (1)

Publication Number Publication Date
WO2023061187A1 true WO2023061187A1 (zh) 2023-04-20

Family

ID=85966795

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/121050 WO2023061187A1 (zh) 2021-10-14 2022-09-23 一种光流估计方法和装置

Country Status (2)

Country Link
CN (1) CN115984336A (zh)
WO (1) WO2023061187A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150262380A1 (en) * 2014-03-17 2015-09-17 Qualcomm Incorporated Adaptive resolution in optical flow computations for an image processing system
CN110390685A (zh) * 2019-07-24 2019-10-29 中国人民解放军国防科技大学 Feature point tracking method based on an event camera
CN111402292A (zh) * 2020-03-10 2020-07-10 南昌航空大学 Optical flow computation method for image sequences based on occlusion detection with feature deformation error
CN111696035A (zh) * 2020-05-21 2020-09-22 电子科技大学 Multi-frame image super-resolution reconstruction method based on an optical flow motion estimation algorithm

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116456202A (zh) * 2023-04-25 2023-07-18 北京大学 Spike camera and color imaging method and device thereof
CN116456202B (zh) * 2023-04-25 2023-12-15 北京大学 Spike camera and color imaging method and device thereof

Also Published As

Publication number Publication date
CN115984336A (zh) 2023-04-18

Similar Documents

Publication Publication Date Title
US11756223B2 (en) Depth-aware photo editing
US10893251B2 (en) Three-dimensional model generating device and three-dimensional model generating method
EP3816929B1 (en) Method and apparatus for restoring image
Mueggler et al. Lifetime estimation of events from dynamic vision sensors
US20210366133A1 (en) Image frame prediction method, image frame prediction apparatus and head display apparatus
CN113286194A (zh) Video processing method and device, electronic device and readable storage medium
US11620754B2 (en) System, method, and computer program for adjusting image contrast using parameterized cumulative distribution functions
WO2020146911A2 (en) Multi-stage multi-reference bootstrapping for video super-resolution
US9384579B2 (en) Stop-motion video creation from full-motion video
US20240203045A1 (en) Illumination rendering method and apparatus, storage medium, and electronic device
JP2019502275A (ja) Video stabilization
CN109661815B (zh) 存在相机阵列的显著强度变化的情况下的鲁棒视差估计
CN111327887B (zh) 电子装置及其操作方法,以及处理电子装置的图像的方法
CN103460242A (zh) 信息处理装置、信息处理方法、以及位置信息的数据结构
WO2021232963A1 (zh) Video denoising method and device, mobile terminal and storage medium
US20210334992A1 (en) Sensor-based depth estimation
WO2023061187A1 (zh) Optical flow estimation method and device
WO2021093534A1 (zh) Subject detection method and device, electronic device, and computer-readable storage medium
WO2021163928A1 (zh) Optical flow acquisition method and device
CN105516579A (zh) Image processing method and device, and electronic device
US11862053B2 (en) Display method based on pulse signals, apparatus, electronic device and medium
WO2023160426A1 (zh) Video frame interpolation method, training method, device and electronic device
JP6544978B2 (ja) Image output device, control method therefor, imaging device, and program
JP7218721B2 (ja) Image processing device and method
Cuevas et al. Statistical moving object detection for mobile devices with camera

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22880127

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022880127

Country of ref document: EP

Effective date: 20240425

NENP Non-entry into the national phase

Ref country code: DE