CN115022613A - Video reconstruction method and device, electronic equipment and storage medium

Info

Publication number: CN115022613A (application CN202210556761.XA; granted publication: CN115022613B)
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: video, adjacent video frame, view angle acquisition equipment
Inventors: 焦少慧, 陈誉中, 吴泽寰
Applicant and current assignee: Beijing ByteDance Network Technology Co., Ltd.
Legal status: Granted; Active


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 Image reproducers
    • H04N 13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N 13/366 Image reproducers using viewer tracking
    • H04N 13/398 Synchronisation thereof; Control thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present disclosure disclose a video reconstruction method and apparatus, an electronic device, and a storage medium. The method includes: acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view angle acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view angle acquisition devices; for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame from those adjacent video frames, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position; and performing light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain a light field video. With this technical scheme, a light field video that satisfies the 6DoF viewing requirement can be obtained through reconstruction.

Description

Video reconstruction method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a video reconstruction method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of information technology, video plays an important role in many aspects of daily life. Currently, people are no longer satisfied with viewing two-dimensional (2D) video and instead want to view more immersive three-dimensional (3D) video.
Among 3D videos, light field video is one that viewers often want to watch with multi-view information in 6 degrees of freedom (6DoF). However, currently reconstructed light field videos cannot meet this viewing requirement, i.e., they do not allow viewing in 6DoF.
Disclosure of Invention
Embodiments of the present disclosure provide a video reconstruction method and apparatus, an electronic device, and a storage medium, so as to reconstruct a light field video that satisfies the 6DoF viewing requirement.
In a first aspect, an embodiment of the present disclosure provides a video reconstruction method, which may include:
acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view angle acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view angle acquisition devices;
for adjacent video frames among the synchronous video frames in the original video, generating virtual video frames from the adjacent video frames, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position;
and performing light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain a light field video.
In a second aspect, an embodiment of the present disclosure further provides a video reconstruction apparatus, which may include:
an original video acquisition module configured to acquire an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view angle acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view angle acquisition devices;
a virtual video frame generation module configured to generate, for adjacent video frames among the synchronous video frames in the original video, virtual video frames from the adjacent video frames, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position;
and a light field video reconstruction module configured to perform light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain a light field video.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, which may include:
one or more processors;
a memory for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the video reconstruction method provided by any embodiment of the present disclosure.
In a fourth aspect, the embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the video reconstruction method provided in any embodiment of the present disclosure.
According to the technical scheme of the embodiments of the present disclosure, an original video captured by a light field acquisition device is acquired, where the light field acquisition device includes at least two free view angle acquisition device groups placed at different heights and each group includes a plurality of free view angle acquisition devices, so that the captured original video carries multi-view information, i.e., light field information, in both the vertical and horizontal directions. Further, in order to reconstruct dense light field information on the basis of sparsely deployed free view angle acquisition devices, virtual video frames can be generated from adjacent video frames among the synchronous video frames captured at the same frame instant in the original video, where the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position. Light field reconstruction is then performed according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain a light field video. In this technical scheme, free view angle capture in the vertical direction is added to free view angle capture in the horizontal direction during acquisition, and virtual video frames generated from the adjacent video frames are used to reconstruct dense light field information, so that a light field video satisfying the 6DoF viewing requirement is obtained; a user is thus allowed to watch the light field video in 6DoF based on Augmented Reality (AR), a head-mounted display device, or the like.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a flow chart of a video reconstruction method in an embodiment of the present disclosure;
fig. 2 is a schematic layout diagram of a light field acquisition device in a video reconstruction method according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of adjacent video frames and a virtual video frame in a video reconstruction method in an embodiment of the present disclosure;
fig. 4 is a flow chart of yet another video reconstruction method in an embodiment of the present disclosure;
fig. 5 is a schematic diagram illustrating generation of a virtual video frame in yet another video reconstruction method in an embodiment of the present disclosure;
fig. 6 is a flow chart of another video reconstruction method in an embodiment of the present disclosure;
fig. 7 is a schematic diagram illustrating generation of a virtual video frame in another video reconstruction method in the embodiment of the present disclosure;
fig. 8 is a block diagram of a video reconstruction apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a flowchart of a video reconstruction method provided in an embodiment of the present disclosure. This embodiment is applicable to reconstructing a light field video, and is particularly applicable to reconstructing a light field video based on a free view angle acquisition scheme. The method can be performed by the video reconstruction apparatus provided by the embodiments of the present disclosure; the apparatus can be implemented in software and/or hardware and can be integrated on an electronic device, which can be any of various terminal devices or a server.
Referring to fig. 1, the method of the embodiment of the present disclosure specifically includes the following steps:
s110, acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition equipment groups, each free view angle acquisition equipment group is arranged on different heights, and each free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment.
The light field acquisition device can be pre-deployed equipment for capturing the original video. It can include at least two free view angle acquisition device groups placed at different heights, so that multi-view information in the vertical direction (the direction of height, i.e., perpendicular to the ground plane) can be captured. On this basis, each free view angle acquisition device group can include a plurality of free view angle acquisition devices, so that multi-view information in the horizontal direction (the direction parallel to the ground plane) can be captured. In practical applications, optionally, the free view angle acquisition devices in each group can be arranged in a ring, i.e., in a 360-degree outside-in surround arrangement, so that light field information at more horizontal viewing angles can be captured. For example, in the layout schematic of the light field acquisition device shown in fig. 2 (where the horizontal lines represent the ground plane and the vertical lines represent height), the light field acquisition device includes a first free view angle acquisition device group 20 and a second free view angle acquisition device group 21 placed at different heights. Taking the first group 20 as an example, it includes a plurality of free view angle acquisition devices 201, all located at the same height and arranged in a circle in the horizontal direction (only some of the devices 201 are illustrated in the figure).
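As a concrete illustration of the geometry, the following sketch parameterizes such an outside-in rig in Python: rings of cameras at different heights, all aimed at the central vertical axis. The ring count, radius, and heights are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def ring_rig_positions(num_rings=2, cams_per_ring=12, radius=3.0,
                       heights=(1.0, 2.0)):
    """Camera centers of an outside-in rig: each free view angle
    acquisition device group is a 360-degree ring of cameras at one
    height, all aimed at the vertical axis through the origin."""
    centers = []
    for h in heights[:num_rings]:
        for k in range(cams_per_ring):
            theta = 2.0 * np.pi * k / cams_per_ring
            centers.append([radius * np.cos(theta),
                            radius * np.sin(theta), h])
    return np.asarray(centers)

# Two rings of twelve cameras give 24 sparse physical viewpoints.
print(ring_rig_positions().shape)  # (24, 3)
```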
The original video captured by the light field acquisition device can be regarded as a video obtained by simultaneously sampling the light field of the target space from different viewpoints, i.e., view angles, by the plurality of free view angle acquisition devices. As above, since the light field acquisition device captures light field information in the horizontal and vertical directions at the same time, the original video carries relatively complete light field information.
And S120, for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame from the adjacent video frames, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position.
In order to reconstruct a light field video with complete and dense light field information, one alternative is to deploy very dense free view angle acquisition devices, but that scheme has the following problems: 1) the hardware cost is high; 2) the free view angle acquisition devices need to agree on device parameters such as white balance and brightness as well as on time synchronization, and this consistency becomes harder to achieve as the number of devices grows; 3) each free view angle acquisition device needs to be calibrated, and a larger number of devices makes the calibration complexity rise significantly, so the calibration takes too long.
On this basis, in order to obtain complete light field information without the above problems, the embodiments of the present disclosure provide a technical scheme that generates virtual video frames on top of sparsely deployed free view angle acquisition devices. Specifically, the synchronous video frames may be the video frames captured at the same frame instant in the original video at different view angles, i.e., the frames captured synchronously by the devices in each free view angle acquisition device group. The adjacent video frames are those synchronous video frames whose corresponding free view angle acquisition devices are adjacent in placement position, for example adjacent in the horizontal, vertical, or diagonal direction. For example, referring to fig. 3, to simplify the description, the video frames in the figure are denoted view(row, column), with the coordinates giving the position of the corresponding device. view(0,0), view(0,1), view(1,0) and view(1,1) each represent an adjacent video frame: the free view angle acquisition devices corresponding to view(0,0) and view(0,1) are adjacent in the horizontal direction, and similarly for view(1,0) and view(1,1); the devices corresponding to view(0,0) and view(1,0) are adjacent in the vertical direction, and similarly for view(0,1) and view(1,1); and the devices corresponding to view(0,0) and view(1,1) are adjacent in the diagonal direction. A sketch enumerating such pairs follows.
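A minimal sketch of how these adjacency relations could be enumerated for a grid of views (rows = device groups, columns = positions on a ring); the wrap-around at the end of each row is an assumption that follows from the ring arrangement:

```python
def adjacent_view_pairs(num_rings, cams_per_ring):
    """Pairs of views whose capture devices are adjacent in the
    horizontal, vertical, or diagonal direction. A view is indexed as
    (ring, column); columns wrap around because each ring is a circle."""
    pairs = []
    for i in range(num_rings):
        for j in range(cams_per_ring):
            nxt = (j + 1) % cams_per_ring
            pairs.append(((i, j), (i, nxt), "horizontal"))
            if i + 1 < num_rings:
                pairs.append(((i, j), (i + 1, j), "vertical"))
                pairs.append(((i, j), (i + 1, nxt), "diagonal"))
    return pairs

# With 2 rings of 12 cameras: view(0,0)-view(0,1) is horizontal,
# view(0,0)-view(1,0) vertical, view(0,0)-view(1,1) diagonal.
```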
Further, a virtual video frame is generated from the adjacent video frames; it can be understood as the video frame at a corresponding virtual view angle, i.e., the frame that would be captured by a virtual view angle acquisition device at that virtual view angle. In practical applications, optionally, the virtual video frame may be generated in several ways, such as by frame interpolation or by three-dimensional reconstruction, which is not specifically limited here. Optionally, the virtual view angle corresponding to the virtual video frame lies in a target range, where the target range may be the range formed by the physical view angles corresponding to the adjacent video frames, for example the range formed by the physical view angles corresponding to view(0,0), view(0,1), view(1,0) and view(1,1) in fig. 3, in which case the virtual video frame may be view(0.6,0.3) in fig. 3; or it may lie outside the range formed by those physical view angles, in which case the virtual video frame may be view(0.6,1.2) in fig. 3; and so on, without specific limitation here.
And S130, performing light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain the light field video.
If the light field acquisition device includes M free view angle acquisition device groups and each group includes N free view angle acquisition devices, with M and N integers greater than 2, then the original video includes M × N synchronous video frames captured at each frame instant. For the synchronous video frames at each frame instant, the corresponding virtual video frames are obtained through the above steps. Dense light field information can therefore be reconstructed from the synchronous video frames captured at each frame instant together with their corresponding virtual video frames, yielding a light field video with dense light field information.
According to the technical scheme of the embodiments of the present disclosure, an original video captured by a light field acquisition device is acquired, where the light field acquisition device includes at least two free view angle acquisition device groups placed at different heights and each group includes a plurality of free view angle acquisition devices, so that the captured original video carries multi-view information, i.e., light field information, in both the vertical and horizontal directions. Further, in order to reconstruct dense light field information on the basis of sparsely deployed free view angle acquisition devices, virtual video frames can be generated from adjacent video frames among the synchronous video frames captured at the same frame instant in the original video, where the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position. Light field reconstruction is then performed according to the synchronous video frames captured at each frame instant and the virtual video frames corresponding to them, to obtain a light field video. In this technical scheme, free view angle capture in the vertical direction is added to free view angle capture in the horizontal direction during acquisition, and virtual video frames generated from the adjacent video frames are used to reconstruct dense light field information, so that a light field video satisfying the 6DoF viewing requirement is obtained, allowing a user to watch it in 6DoF based on an AR (augmented reality) device, a head-mounted display device, or the like.
Fig. 4 is a flowchart of still another video reconstruction method provided in an embodiment of the present disclosure. This embodiment is optimized on the basis of the alternatives in the above embodiment. In this embodiment, optionally, generating a virtual video frame from the adjacent video frames may include: taking the free view angle acquisition devices corresponding to the adjacent video frames as adjacent view angle acquisition devices, and performing feature matching on the adjacent video frames to obtain the matched feature points in them; for each adjacent view angle acquisition device, acquiring its physical calibration result, and projecting the feature points in the adjacent video frame captured by that device according to the physical calibration result to obtain spatial points; determining a target point from the spatial points corresponding to the matched feature points in the adjacent video frames, acquiring the virtual calibration result of the virtual view angle acquisition device corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point; and generating the virtual video frame from the projection points corresponding to the feature points in any one of the adjacent video frames. Terms identical or corresponding to those in the above embodiments are not explained in detail here.
Correspondingly, as shown in fig. 4, the method of this embodiment may specifically include the following steps:
s210, acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition equipment groups, each free view angle acquisition equipment group is placed at different heights, and each free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment.
S220, for adjacent video frames among the synchronous video frames in the original video, taking the free view angle acquisition devices corresponding to the adjacent video frames as adjacent view angle acquisition devices, and performing feature matching on the adjacent video frames to obtain the matched feature points in them, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position.
S220-S250 are performed for each set of adjacent video frames. Specifically, the free view angle acquisition device that captured an adjacent video frame is taken as an adjacent view angle acquisition device. Feature matching is performed on the adjacent video frames, a process that can be implemented with algorithms such as stereo matching, to obtain the matched feature points in the adjacent video frames, e.g., feature point A1 in adjacent video frame 1, feature point A2 in adjacent video frame 2, feature point A3 in adjacent video frame 3, and feature point A4 in adjacent video frame 4. A sketch follows.
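The patent leaves the matching algorithm open (stereo matching is named only as an example). A sketch using OpenCV's ORB detector with brute-force matching, one common off-the-shelf choice; the detector, its parameters, and the function name are assumptions:

```python
import cv2

def match_adjacent_frames(frame1, frame2, max_matches=500):
    """Feature-match two adjacent video frames; returns two
    index-aligned lists of (x, y) pixel coordinates."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(frame1, None)
    kp2, des2 = orb.detectAndCompute(frame2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = [kp1[m.queryIdx].pt for m in matches[:max_matches]]
    pts2 = [kp2[m.trainIdx].pt for m in matches[:max_matches]]
    return pts1, pts2  # e.g. A1 in frame 1 paired with A2 in frame 2
```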
And S230, for each adjacent view angle acquisition device, acquiring the physical calibration result of that device, and projecting the feature points in the adjacent video frame captured by it according to the physical calibration result to obtain spatial points.
For each adjacent view angle acquisition device, the physical calibration result may be the result obtained by calibrating that device, and may be represented by a pose (R, t) and intrinsic parameters (K), where R is a rotation matrix and t is a translation vector. Note that because an adjacent view angle acquisition device physically exists, its calibration result is referred to here as a physical calibration result. Then, each feature point in the adjacent video frame captured by the device is back-projected into space according to the device's physical calibration result, yielding a spatial point.
S240, determining a target point according to the spatial points corresponding to the matched feature points in the adjacent video frames, acquiring the virtual calibration result of the virtual view angle acquisition device corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point.
The target point is determined from the spatial points corresponding to the matched feature points in the adjacent video frames, i.e., from the spatial points corresponding to A1, A2, A3 and A4 in the above example. In practical applications, optionally, as many target points can be determined as there are sets of matched feature points across the adjacent video frames. Optionally, a target point may be determined as follows: for each adjacent video frame, connect the feature point in that frame with its corresponding spatial point to obtain the straight line on which the feature point lies; after the straight lines of all the matched feature points in the adjacent video frames are obtained, take the intersection point of these straight lines as the target point.
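Because two back-projected lines in 3D rarely intersect exactly, a common realization (an assumption beyond the patent's idealized intersection) takes the midpoint of the shortest segment between the two lines as the target point:

```python
import numpy as np

def target_point(origin1, dir1, origin2, dir2):
    """Closest-approach midpoint of two 3D lines, each given by a point
    on it (e.g. a feature point's camera center) and a direction toward
    the spatial point obtained from that feature point."""
    o1, o2 = np.asarray(origin1, float), np.asarray(origin2, float)
    d1 = np.asarray(dir1, float); d1 /= np.linalg.norm(d1)
    d2 = np.asarray(dir2, float); d2 /= np.linalg.norm(d2)
    w = o1 - o2
    b = d1 @ d2
    denom = 1.0 - b * b               # 0 when the lines are parallel
    if abs(denom) < 1e-9:
        t1, t2 = 0.0, d2 @ w
    else:
        t1 = (b * (d2 @ w) - d1 @ w) / denom
        t2 = (d2 @ w - b * (d1 @ w)) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))   # target point M
```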
The virtual view angle acquisition device may be the notional device that would capture the virtual video frame to be generated, and the virtual calibration result may be its calibration result. By analogy with the physical calibration result, since the virtual view angle acquisition device is not a physically real device, its calibration result is referred to as a virtual calibration result. The target point is projected according to the virtual calibration result to obtain a projection point (i.e., a pixel) on the virtual video frame to be generated. In practical applications, the step from target point to projection point can be understood as rendering the reconstruction result (the spatial points), i.e., drawing the target point onto the virtual video frame to be generated, as in the following sketch.
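Drawing the target point onto the virtual video frame is a standard pinhole projection under the virtual calibration result; a sketch, with K_v, R_v, t_v as assumed names for the virtual intrinsics and pose:

```python
import numpy as np

def project_to_virtual_view(M, K_v, R_v, t_v):
    """Project target point M into the virtual view angle acquisition
    device defined by the virtual calibration result (K_v, R_v, t_v);
    returns the projection point T as (u, v) pixel coordinates."""
    cam = R_v @ M + t_v     # world -> virtual camera coordinates
    uvw = K_v @ cam         # camera -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]
```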
And S250, generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame.
Since the matched feature points in the adjacent video frames correspond to the same projection point, the virtual video frame can be generated from the projection points corresponding to the feature points in any one of the adjacent video frames. That is, once all the projection points on the virtual video frame to be generated have been obtained, the virtual video frame can be generated from them.
To give an overall picture of how the virtual video frame is generated, an example follows. As shown in fig. 5, take adjacent video frame 51 captured by adjacent view angle acquisition device 50 and adjacent video frame 53 captured by adjacent view angle acquisition device 52. A1 on frame 51 and A2 on frame 53 are matched feature points. A1 is back-projected to obtain its corresponding spatial point K1; then A1 is connected to K1, giving the straight line L1 on which A1 lies. The spatial point K2 corresponding to A2 and the straight line L2 on which A2 lies are obtained in the same way and are not described again here. The intersection point of L1 and L2 is taken as the target point M, and M is then projected according to the virtual calibration result to obtain the projection point T. The projection points corresponding to the other matched feature points on frames 51 and 53 are obtained similarly. After the projection points corresponding to the feature points on frame 51 (or frame 53) have been obtained, the virtual video frame 55 that would be captured by the virtual view angle acquisition device 54 can be generated from them.
And S260, performing light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain the light field video.
According to the technical scheme of this embodiment, the free view angle acquisition devices corresponding to the adjacent video frames are taken as adjacent view angle acquisition devices, and feature matching is performed on the adjacent video frames to obtain the matched feature points in them. Further, for each adjacent view angle acquisition device, its physical calibration result is acquired, and the feature points in the adjacent video frame captured by that device are projected according to the physical calibration result to obtain spatial points. A target point is then determined from the spatial points corresponding to the matched feature points, and the virtual calibration result of the virtual view angle acquisition device corresponding to the virtual video frame to be generated is acquired, so that the target point can be projected according to the virtual calibration result to obtain a projection point. After the projection points corresponding to the feature points in any one adjacent video frame are obtained, the virtual video frame is generated from them, achieving the effect of generating a high-precision virtual video frame by means of three-dimensional reconstruction.
In an optional technical scheme, on the basis of the above embodiment, the physical calibration result includes intrinsic parameters and a pose, and projecting the feature points in the adjacent video frame captured by the adjacent view angle acquisition device according to the physical calibration result to obtain spatial points may include: taking the captured adjacent video frame as the captured video frame; obtaining depth information of the physical view angle under the adjacent view angle acquisition device according to the intrinsic parameters and the pose, and acquiring the first pixel value of each feature point in the captured video frame; performing spatial back projection according to the depth information and the first pixel values to obtain the back projection matrix of the intrinsic parameters and the back projection matrix of the pose; and projecting the feature points in the captured video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain the spatial points.
When a given adjacent view angle acquisition device is being discussed, to simplify the exposition it is emphasized that the adjacent video frame captured by that device, rather than the frames captured by the other adjacent view angle acquisition devices, is being processed; that frame is taken as the captured video frame. Then, the depth information of the physical view angle under the adjacent view angle acquisition device is obtained according to the intrinsic parameters and the pose, and the first pixel value of each feature point in the captured video frame is acquired, where the first pixel value can be represented by RGB information. Further, spatial back projection is performed according to the depth information and the first pixel values, yielding the back projection matrix of the intrinsic parameters and the back projection matrix of the pose. In this way, a feature point in the captured video frame can be projected according to the two back projection matrices to obtain a spatial point; continuing the above example,

P = [R | t]^{-1} K^{-1} p_t

where p_t is a feature point in the captured video frame, K^{-1} is the back projection matrix of the intrinsic parameters, [R | t]^{-1} is the back projection matrix of the pose, and P is the spatial point. With this technical scheme, spatial points can be obtained accurately.
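A sketch of that back projection. Since [R | t] itself is a 3 × 4 matrix, its inverse is realized here by inverting the rigid transform explicitly, with the depth fixing the scale along the ray; this concrete realization is an assumption consistent with the formula above:

```python
import numpy as np

def back_project(p_t, depth, K, R, t):
    """P = [R|t]^{-1} K^{-1} p_t: back-project feature point p_t = (u, v)
    of the captured video frame, at the given depth, to a spatial point
    P in world coordinates."""
    p_h = np.array([p_t[0], p_t[1], 1.0])   # homogeneous pixel
    ray = np.linalg.inv(K) @ p_h            # apply K^{-1}
    cam = depth * ray                       # point in camera coordinates
    return R.T @ (cam - t)                  # apply the pose inverse
```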
In another optional technical scheme, on the basis of the above embodiment, the video reconstruction method may further include: respectively acquiring the second pixel values of the matched feature points in the adjacent video frames, and respectively acquiring the distance between each adjacent view angle acquisition device and the virtual view angle acquisition device; determining, from each distance, the weight of the adjacent view angle acquisition device corresponding to that distance; and determining the projection pixel value of the projection point according to the second pixel values and the weights. Generating the virtual video frame from the projection points corresponding to the feature points in any adjacent video frame may then include: generating the virtual video frame from the projection points corresponding to the feature points in any one adjacent video frame and the projection pixel values of those projection points.
The second pixel values of the matched feature points in the adjacent video frames are acquired respectively; a second pixel value can be represented by RGB information. Although the second pixel values belong to matched feature points, the poses of the adjacent view angle acquisition devices differ, so the second pixel values may differ as well, e.g., a black A1 and a gray A2. On this basis, in order to determine accurately the projection pixel value of the projection point corresponding to the matched feature points, the distance between each adjacent view angle acquisition device and the virtual view angle acquisition device may be acquired; the distance may be a Euclidean distance, a Mahalanobis distance, a cosine distance, a Hamming distance, or a Manhattan distance, which is not specifically limited here. In general, the smaller the distance, the greater the influence of the second pixel value of the adjacent view angle acquisition device at that distance on the projection pixel value, and the converse is also true; hence the weight of each adjacent view angle acquisition device can be determined from its distance, and the projection pixel value can be determined from the second pixel values and their corresponding weights, as sketched below. It should be noted that the feature points, spatial points, target points, and projection points discussed above mainly represent the position information of the corresponding pixels (or spatial points), so when the virtual video frame is generated, it can be generated from the projection points together with their projection pixel values, thereby achieving the effect of generating the virtual video frame accurately.
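One possible realization of this weighting; the Euclidean metric and the inverse-distance weight function are assumptions, since the patent leaves both open:

```python
import numpy as np

def projected_pixel_value(second_pixel_values, device_positions,
                          virtual_device_position):
    """Blend the second pixel values (e.g. RGB rows) of matched feature
    points into the projection pixel value, weighting each adjacent
    view angle acquisition device by the inverse of its distance to the
    virtual view angle acquisition device."""
    v = np.asarray(virtual_device_position, float)
    dists = np.array([np.linalg.norm(np.asarray(c, float) - v)
                      for c in device_positions])
    weights = 1.0 / (dists + 1e-9)   # closer device -> heavier weight
    weights /= weights.sum()
    return weights @ np.asarray(second_pixel_values, float)

# A black A1 from a nearby device outweighs a gray A2 from a farther
# one, so the projected pixel comes out close to black.
```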
Fig. 6 is a flowchart of another video reconstruction method provided in an embodiment of the present disclosure. This embodiment is optimized on the basis of the alternatives in the above embodiments. In this embodiment, optionally, generating a virtual video frame from the adjacent video frames may include: determining, among the adjacent video frames, a first video frame and a second video frame that are adjacent in a first direction; in the first direction, performing frame interpolation according to the first video frame and the second video frame to obtain intermediate video frames; and in a second direction perpendicular to the first direction, performing frame interpolation according to the intermediate video frames to obtain the virtual video frame. Terms identical or corresponding to those in the above embodiments are not explained in detail here.
Correspondingly, as shown in fig. 6, the method of this embodiment may specifically include the following steps:
s310, acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition equipment groups, each free view angle acquisition equipment group is arranged on different heights, and each free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment.
S320, for adjacent video frames among the synchronous video frames in the original video, determining, among the adjacent video frames, a first video frame and a second video frame that are adjacent in a first direction, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position.
S320-S340 are performed for each set of adjacent video frames. The first direction may be the horizontal direction or the vertical direction, depending on the actual situation, and is not specifically limited here. The first video frame and the second video frame may be two of the adjacent video frames that are adjacent in the first direction. For example, as shown in fig. 7, to simplify the description, the video frames in the figure are denoted view(row, column), with the coordinates giving the position of the corresponding device. view(0,0) and view(0,1) can be understood as two video frames adjacent in the horizontal direction, and similarly for view(1,0) and view(1,1); view(0,0) and view(1,0) can be understood as two video frames adjacent in the vertical direction, and similarly for view(0,1) and view(1,1).
S330, in a first direction, performing frame interpolation according to the first video frame and the second video frame to obtain intermediate video frames, and in a second direction perpendicular to the first direction, performing frame interpolation according to each intermediate video frame to obtain virtual video frames.
In the first direction there are at least two pairs of first and second video frames, and for each pair, frame interpolation is performed according to that pair to obtain an intermediate video frame. The specific interpolation scheme may be optical flow frame interpolation or Depth Image Based Rendering (DIBR) frame interpolation, among others, and is not specifically limited here. For example, taking the first direction as the horizontal direction, as shown in fig. 7 the intermediate video frames resulting from the interpolation may be view(0,0.3) and view(1,0.3). Further, in the second direction perpendicular to the first direction, frame interpolation is performed again according to the intermediate video frames to obtain the virtual video frame. Continuing the example, the second direction is the vertical direction, and interpolating vertically between view(0,0.3) and view(1,0.3) yields view(0.6,0.3), as in the following sketch.
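A two-pass sketch of this scheme. A plain linear blend stands in for the interpolator (the patent would use optical flow or DIBR interpolation, which this placeholder does not attempt); the coordinate convention follows fig. 7 and the first direction is assumed horizontal:

```python
import numpy as np

def interpolate(frame_a, frame_b, alpha):
    """Stand-in interpolator: linear blend at position alpha in [0, 1].
    Replace with optical flow or DIBR interpolation for real parallax."""
    return (1.0 - alpha) * frame_a + alpha * frame_b

def virtual_frame(views, row, col):
    """Two-pass synthesis of view(row, col): views maps (0,0), (0,1),
    (1,0), (1,1) to the four adjacent frames (arrays of equal shape)."""
    mid_top = interpolate(views[(0, 0)], views[(0, 1)], col)  # view(0, col)
    mid_bot = interpolate(views[(1, 0)], views[(1, 1)], col)  # view(1, col)
    return interpolate(mid_top, mid_bot, row)                 # view(row, col)

# view(0.6, 0.3): interpolate horizontally at 0.3 to get view(0,0.3)
# and view(1,0.3), then vertically at 0.6.
```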
S340, performing light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain the light field video.
According to the technical scheme of this embodiment, a first video frame and a second video frame that are adjacent in the first direction are determined among the adjacent video frames; in the first direction, frame interpolation is performed according to the first and second video frames to obtain intermediate video frames; and then, in the second direction perpendicular to the first direction, frame interpolation is performed according to the intermediate video frames to obtain the virtual video frame, achieving the effect of rapidly generating the virtual video frame by means of frame interpolation.
In an optional technical scheme, on the basis of the above embodiment, the video reconstruction method may further include: taking the direction parallel to the ground plane as the horizontal direction and the direction perpendicular to the ground plane as the vertical direction; acquiring the horizontal interval between any two free view angle acquisition devices in the horizontal direction and the vertical interval between any two free view angle acquisition devices in the vertical direction; and determining the horizontal direction or the vertical direction as the first direction according to the numerical relationship between the horizontal interval and the vertical interval. When the free view angle acquisition devices in the horizontal direction are placed at equal intervals, the horizontal interval between any two of them is the same; otherwise there are differences, which depends on the actual situation and is not limited here. Similarly, when the free view angle acquisition devices in the same group are at essentially the same height, the vertical interval between any two devices in the vertical direction is the same; otherwise this too depends on the actual situation and is not limited here. On this basis, the horizontal or vertical direction is chosen as the first direction according to the numerical relationship between the intervals: if the horizontal interval is smaller than or equal to the vertical interval, the horizontal direction is taken as the first direction; otherwise, the vertical direction is taken as the first direction. In other words, the intermediate video frames are obtained by interpolating first along the direction with the smaller interval, and the virtual video frame is then obtained by interpolating along the direction with the larger interval. The advantage of this arrangement is that the error of an intermediate frame interpolated along the smaller interval is smaller than that along the larger interval, so the accumulated error of the virtual video frame interpolated from the intermediate frames is reduced, ensuring the accuracy of the virtual video frame. This rule is sketched below.
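The selection rule in this paragraph, as a one-function sketch (names assumed):

```python
def choose_first_direction(horizontal_interval, vertical_interval):
    """Interpolate first along the direction with the smaller device
    interval, so the intermediate frames carry the smaller error."""
    return ("horizontal" if horizontal_interval <= vertical_interval
            else "vertical")
```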
Fig. 8 is a block diagram illustrating a video reconstruction apparatus according to an embodiment of the disclosure, which is configured to perform the video reconstruction method according to any of the embodiments. The apparatus and the video reconstruction method of the foregoing embodiments belong to the same concept, and details that are not described in detail in the embodiments of the video reconstruction apparatus may refer to the embodiments of the video reconstruction method described above. Referring to fig. 8, the apparatus may specifically include: an original video acquisition module 410, a virtual video frame generation module 420, and a light field video reconstruction module 430.
The original video acquisition module 410 is configured to acquire an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view angle acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view angle acquisition devices;
the virtual video frame generating module 420 is configured to generate, for adjacent video frames among the synchronous video frames in the original video, virtual video frames from the adjacent video frames, where the synchronous video frames are captured at the same frame instant and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position;
and the light field video reconstruction module 430 is configured to perform light field reconstruction according to the synchronous video frames captured at each frame instant in the original video and the virtual video frames corresponding to them, to obtain a light field video.
Optionally, the virtual video frame generating module 420 may include:
a feature point obtaining unit, configured to take the free view angle acquisition devices corresponding to the adjacent video frames as adjacent view angle acquisition devices, and to perform feature matching on the adjacent video frames to obtain the matched feature points in them;
a spatial point obtaining unit, configured to acquire, for each adjacent view angle acquisition device, the physical calibration result of that device, and to project the feature points in the adjacent video frame captured by it according to the physical calibration result to obtain spatial points;
a projection point obtaining unit, configured to determine a target point from the spatial points corresponding to the matched feature points in the adjacent video frames, to acquire the virtual calibration result of the virtual view angle acquisition device corresponding to the virtual video frame to be generated, and to project the target point according to the virtual calibration result to obtain a projection point;
and a virtual video frame generating unit, configured to generate the virtual video frame from the projection points corresponding to the feature points in any one adjacent video frame.
On this basis, optionally, the physical calibration result includes intrinsic parameters and a pose, and the spatial point obtaining unit may include:
a captured video frame acquisition subunit, configured to take the adjacent video frame captured by the adjacent view angle acquisition device as the captured video frame;
a first pixel value acquisition subunit, configured to obtain depth information of the physical view angle under the adjacent view angle acquisition device according to the intrinsic parameters and the pose, and to acquire the first pixel value of each feature point in the captured video frame;
a back projection matrix obtaining subunit, configured to perform spatial back projection according to the depth information and the first pixel values to obtain the back projection matrix of the intrinsic parameters and the back projection matrix of the pose;
and a spatial point obtaining subunit, configured to project the feature points in the captured video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain the spatial points.
Still optionally, the video reconstruction apparatus may further include:
a distance acquisition module, configured to respectively acquire second pixel values of the matched feature points in each adjacent video frame, and respectively acquire a distance between each adjacent view angle acquisition device and the virtual view angle acquisition device;
the weight determining module is used for determining the weight of the adjacent visual angle acquisition equipment corresponding to the distance according to the distance;
the projection pixel value determining module is used for determining the projection pixel value of the projection point according to each second pixel value and each weight;
the virtual video frame generation unit may specifically be configured to:
and generating the virtual video frame according to the projection points respectively corresponding to the feature points in any one adjacent video frame and the projection pixel values of those projection points.
Optionally, the virtual video frame generating module 420 may include:
a video frame determination unit configured to determine a first video frame and a second video frame adjacent to each other in a first direction among the adjacent video frames;
an intermediate video frame obtaining unit, configured to perform frame interpolation according to the first video frame and the second video frame in the first direction to obtain an intermediate video frame;
and the virtual video frame obtaining unit is used for carrying out frame interpolation according to each intermediate video frame in a second direction perpendicular to the first direction to obtain a virtual video frame.
On this basis, optionally, the video reconstruction apparatus may further include:
a vertical direction obtaining module for taking a direction parallel to the ground plane as a horizontal direction and a direction perpendicular to the ground plane as a vertical direction;
the vertical interval acquisition module is used for acquiring the horizontal interval of any two free visual angle acquisition devices in the horizontal direction and the vertical interval of any two free visual angle acquisition devices in the vertical direction;
and the first direction obtaining module is used for determining the horizontal direction or the vertical direction as the first direction according to the numerical relationship between the horizontal interval and the vertical interval.
Optionally, the viewing angle acquisition devices in each free viewing angle acquisition device group are annularly arranged; and/or the virtual visual angle corresponding to the virtual video frame is positioned in a target range, wherein the target range is a range formed by the physical visual angles corresponding to the adjacent video frames.
According to the video reconstruction apparatus provided by the embodiments of the present disclosure, the original video acquisition module acquires an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view angle acquisition device groups placed at different heights and each group includes a plurality of free view angle acquisition devices, so that the captured original video carries multi-view information, i.e., light field information, in both the vertical and horizontal directions. Further, in order to reconstruct dense light field information on the basis of sparsely deployed free view angle acquisition devices, the virtual video frame generation module generates virtual video frames from adjacent video frames among the synchronous video frames captured at the same frame instant in the original video, where the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position. The light field video reconstruction module then performs light field reconstruction according to the synchronous video frames captured at each frame instant and the virtual video frames corresponding to them, to obtain a light field video. With this apparatus, free view angle capture in the vertical direction is added to free view angle capture in the horizontal direction during acquisition, and virtual video frames generated from the adjacent video frames are used to reconstruct dense light field information, so that a light field video satisfying the 6DoF viewing requirement is obtained, allowing a user to watch it in 6DoF based on an augmented reality (AR) device, a head-mounted display device, or the like.
The video reconstruction device provided by the embodiments of the present disclosure can execute the video reconstruction method provided by any embodiment of the present disclosure, and has the corresponding functional modules and beneficial effects for executing the method.
It should be noted that, in the embodiment of the video reconstruction apparatus, the included units and modules are divided only according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only used to distinguish one functional unit from another and are not intended to limit the protection scope of the present disclosure.
Referring now to fig. 9, a schematic structural diagram of an electronic device 500 (e.g., a terminal device or a server) suitable for implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 9 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 9, electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While an electronic device 500 having various means is illustrated in FIG. 9, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be alternatively implemented or provided.
In particular, the processes described above with reference to the flow diagrams may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two groups of free view angle acquisition equipment groups, each group of free view angle acquisition equipment groups is arranged at different heights, and each group of free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment;
aiming at adjacent video frames in synchronous video frames in an original video, generating virtual video frames according to the adjacent video frames, wherein the synchronous video frames are acquired on the same frame, and free visual angle acquisition equipment corresponding to any two adjacent video frames are adjacent in the arrangement position;
and performing light field reconstruction according to each synchronous video frame collected on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain the light field video.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself. For example, the original video acquisition module may also be described as "a module that acquires an original video captured based on light field acquisition equipment, where the light field acquisition equipment includes at least two free view angle acquisition equipment groups, each group is placed at a different height, and each group includes a plurality of free view angle acquisition equipment".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, [ example one ] there is provided a video reconstruction method, which may include:
acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition equipment groups, each free view angle acquisition equipment group is placed at different heights, and each free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment;
aiming at adjacent video frames in synchronous video frames in an original video, generating virtual video frames according to the adjacent video frames, wherein the synchronous video frames are acquired on the same frame, and free visual angle acquisition equipment corresponding to any two adjacent video frames are adjacent in the arrangement position;
and performing light field reconstruction according to each synchronous video frame collected on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain the light field video.
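To make the three steps of example one concrete, the following minimal Python sketch wires them together over a toy video. The helper logic is illustrative only: plain averaging stands in for the virtual frame synthesis of examples two and five, and stacking the views stands in for the light field reconstruction, since the disclosure does not prescribe implementations at this level.

```python
import numpy as np

def adjacent_pairs(frames):
    # Pairs of views whose acquisition devices are adjacent in placement position.
    for i in range(len(frames) - 1):
        yield frames[i], frames[i + 1]

def generate_virtual_frame(frame_a, frame_b):
    # Stand-in for the view synthesis of examples two/five: a plain average,
    # used here only to keep the pipeline runnable end to end.
    return (frame_a + frame_b) / 2.0

def reconstruct_light_field(sync_frames, virtual_frames):
    # Stand-in for light field reconstruction: stack real and virtual views.
    return np.stack(list(sync_frames) + list(virtual_frames))

def reconstruct_light_field_video(original_video):
    # original_video: list over frame times; each entry holds the synchronous
    # views (H x W x 3 arrays) acquired on the same frame, ordered by placement.
    light_field_video = []
    for sync_frames in original_video:
        virtual = [generate_virtual_frame(a, b) for a, b in adjacent_pairs(sync_frames)]
        light_field_video.append(reconstruct_light_field(sync_frames, virtual))
    return light_field_video

# Toy usage: 2 frame times, 4 views of 8x8 RGB each.
video = [[np.random.rand(8, 8, 3) for _ in range(4)] for _ in range(2)]
print(reconstruct_light_field_video(video)[0].shape)  # (7, 8, 8, 3)
```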
According to one or more embodiments of the present disclosure, [ example two ] there is provided the method of example one, wherein generating a virtual video frame according to each adjacent video frame may include:
taking free visual angle acquisition equipment corresponding to adjacent video frames as adjacent visual angle acquisition equipment, and performing feature matching on each adjacent video frame to obtain matched feature points in each adjacent video frame;
acquiring a physical calibration result of adjacent visual angle acquisition equipment for each adjacent visual angle acquisition equipment, and projecting feature points in adjacent video frames acquired by the adjacent visual angle acquisition equipment according to the physical calibration result to obtain space points;
determining a target point according to the spatial points corresponding to the matched characteristic points in each adjacent video frame, acquiring a virtual calibration result of the virtual visual angle acquisition equipment corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point;
and generating a virtual video frame according to the projection points respectively corresponding to the characteristic points in any adjacent video frame.
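As a rough sketch of the feature matching step of example two, the snippet below pairs feature points between two adjacent video frames with OpenCV; ORB features with brute-force Hamming matching are an assumption made here, as the disclosure does not name a particular detector or matcher. The matched pixel coordinates it returns are what the subsequent projection steps operate on.

```python
import cv2
import numpy as np

def match_adjacent_frames(frame_a, frame_b, max_matches=200):
    # Matched feature point coordinates (N x 2 each) in two adjacent video
    # frames captured by adjacent view angle acquisition equipment.
    gray = lambda im: cv2.cvtColor(im, cv2.COLOR_BGR2GRAY) if im.ndim == 3 else im
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(gray(frame_a), None)
    kp_b, des_b = orb.detectAndCompute(gray(frame_b), None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    matches = matches[:max_matches]  # keep the most reliable pairs
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])
    return pts_a, pts_b
```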
According to one or more embodiments of the present disclosure, [ example three ] there is provided the method of example two, where the physical calibration result includes intrinsic parameters and a pose, and projecting the feature points in the adjacent video frames acquired by the adjacent view angle acquisition equipment according to the physical calibration result to obtain spatial points may include:
taking adjacent video frames acquired by adjacent visual angle acquisition equipment as acquired video frames;
obtaining depth information of a physical view angle under the adjacent view angle acquisition equipment according to the intrinsic parameters and the pose, and acquiring a first pixel value of the feature point in the acquired video frame;
performing spatial back projection according to the depth information and the first pixel value to obtain a back projection matrix of the intrinsic parameters and a back projection matrix of the pose;
and projecting the feature points in the acquired video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain spatial points.
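The projection arithmetic of example three can be written out under a standard pinhole camera model, which is an assumption here since the disclosure does not spell the model out: with intrinsic parameters K, pose (R, t), and per-point depth, a feature point back-projects to a space point, and projecting it again (into a physical or virtual camera) reverses the mapping.

```python
import numpy as np

def backproject(uv, depth, K, R, t):
    # Pixel (u, v) plus depth -> space point in world coordinates.
    # K: 3x3 intrinsic matrix; R, t: world-to-camera rotation and translation.
    uv1 = np.array([uv[0], uv[1], 1.0])
    p_cam = depth * (np.linalg.inv(K) @ uv1)  # point in camera coordinates
    return R.T @ (p_cam - t)                  # back into world coordinates

def project(point_w, K, R, t):
    # Space point -> pixel (u, v) in a (possibly virtual) camera.
    p_cam = R @ point_w + t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

# Round trip: projecting the back-projected point into the same camera
# recovers the original pixel coordinates.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.zeros(3)
X = backproject((100.0, 120.0), 2.5, K, R, t)
print(project(X, K, R, t))  # ~[100. 120.]
```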
According to one or more embodiments of the present disclosure, [ example four ] there is provided the method of example two, which may further include:
respectively acquiring second pixel values of the matched characteristic points in each adjacent video frame, and respectively acquiring the distance between each adjacent visual angle acquisition equipment and the virtual visual angle acquisition equipment;
determining the weight of adjacent visual angle acquisition equipment corresponding to the distance according to the distance;
determining a projection pixel value of the projection point according to each second pixel value and each weight;
generating a virtual video frame according to projection points corresponding to each feature point in any adjacent video frame, may include:
and generating the virtual video frame according to the projection points respectively corresponding to the feature points in any adjacent video frame and the projection pixel values of the projection points.
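For example four, the sketch below computes a projection pixel value as a weighted combination of the second pixel values, with weights derived from each adjacent device's distance to the virtual device. Inverse-distance weighting is an assumption: the disclosure only states that the weight is determined according to the distance.

```python
import numpy as np

def blend_pixel(second_pixel_values, device_positions, virtual_position, eps=1e-6):
    # second_pixel_values: one RGB value per adjacent video frame for a matched
    # feature point; device_positions / virtual_position: 3D device centers.
    dists = np.array([np.linalg.norm(np.asarray(p) - np.asarray(virtual_position))
                      for p in device_positions])
    weights = 1.0 / (dists + eps)  # closer devices contribute more
    weights /= weights.sum()       # normalize weights to sum to 1
    return np.average(np.asarray(second_pixel_values, dtype=np.float64),
                      axis=0, weights=weights)

# Virtual camera three times closer to device 0 than to device 1,
# so device 0's pixel dominates the blend.
print(blend_pixel([[200, 0, 0], [0, 200, 0]],
                  device_positions=[[0, 0, 0], [2, 0, 0]],
                  virtual_position=[0.5, 0, 0]))  # [150.  50.   0.]
```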
According to one or more embodiments of the present disclosure, [ example five ] there is provided the method of example one, wherein generating a virtual video frame according to each adjacent video frame may include:
determining a first video frame and a second video frame which are adjacent to each other in a first direction in each adjacent video frame;
in a first direction, performing frame interpolation according to a first video frame and a second video frame to obtain an intermediate video frame;
and in a second direction perpendicular to the first direction, performing frame interpolation according to each intermediate video frame to obtain a virtual video frame.
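The two-pass interpolation of example five can be sketched on a 2x2 block of adjacent views as below; linear blending stands in for a real frame interpolation method, which the disclosure leaves unspecified, and the horizontal/vertical roles of the two passes follow the direction choice of example six.

```python
import numpy as np

def interpolate(frame_a, frame_b, alpha=0.5):
    # Stand-in frame interpolation between two views at blend factor alpha.
    return (1 - alpha) * frame_a + alpha * frame_b

def synthesize_virtual_frame(top_left, top_right, bottom_left, bottom_right):
    # Pass 1 (first direction, here horizontal): an intermediate frame per row.
    mid_top = interpolate(top_left, top_right)
    mid_bottom = interpolate(bottom_left, bottom_right)
    # Pass 2 (second direction, perpendicular): interpolate the intermediates.
    return interpolate(mid_top, mid_bottom)

views = [np.random.rand(8, 8, 3) for _ in range(4)]
print(synthesize_virtual_frame(*views).shape)  # (8, 8, 3)
```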
According to one or more embodiments of the present disclosure, [ example six ] there is provided the method of example five, which may further include:
taking a direction parallel to the ground plane as a horizontal direction and a direction perpendicular to the ground plane as a vertical direction;
acquiring the horizontal interval of any two free visual angle acquisition devices in the horizontal direction and the vertical interval of any two free visual angle acquisition devices in the vertical direction;
and determining the horizontal direction or the vertical direction as the first direction according to the numerical relationship between the horizontal interval and the vertical interval.
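Example six leaves the exact decision rule open ("according to the numerical relationship"); a plausible reading, assumed in the sketch below, is to interpolate first along the direction with the smaller device interval, where adjacent views overlap most.

```python
def choose_first_direction(horizontal_interval, vertical_interval):
    # Return the first interpolation direction; the smaller-interval-first
    # rule is an assumption, not stated by the disclosure.
    return "horizontal" if horizontal_interval <= vertical_interval else "vertical"

# E.g., devices 0.2 m apart within a ring, rings 0.5 m apart in height:
print(choose_first_direction(0.2, 0.5))  # horizontal
```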
According to one or more embodiments of the present disclosure, [ example seven ] there is provided the method of example one, wherein the free view angle acquisition devices in each free view angle acquisition device group are annularly arranged; and/or,
the virtual view angle corresponding to the virtual video frame is located in a target range, and the target range is a range formed by the physical view angles corresponding to the adjacent video frames.
According to one or more embodiments of the present disclosure, [ example eight ] there is provided a video reconstruction apparatus, which may include:
the system comprises an original video acquisition module, a video acquisition module and a video acquisition module, wherein the original video acquisition module is used for acquiring an original video acquired based on light field acquisition equipment, the light field acquisition equipment comprises at least two groups of free visual angle acquisition equipment groups, each group of free visual angle acquisition equipment groups is arranged at different heights, and each group of free visual angle acquisition equipment groups comprises a plurality of free visual angle acquisition equipment;
the virtual video frame generation module is used for generating virtual video frames according to adjacent video frames in synchronous video frames in an original video, wherein the synchronous video frames are acquired on the same frame, and free visual angle acquisition equipment corresponding to any two adjacent video frames are adjacent in the arrangement position;
and the light field video reconstruction module is used for reconstructing a light field according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain the light field video.
The foregoing description is only illustrative of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to technical solutions formed by the particular combination of the features described above, and also covers other technical solutions formed by any combination of the features described above or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features with similar functions disclosed in the present disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A method for video reconstruction, comprising:
acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two groups of free view angle acquisition equipment groups, each group of free view angle acquisition equipment groups is placed at different heights, and each group of free view angle acquisition equipment groups comprises a plurality of free view angle acquisition equipment;
generating virtual video frames according to adjacent video frames in synchronous video frames in the original video, wherein the synchronous video frames are acquired on the same frame, and the free visual angle acquisition equipment corresponding to any two adjacent video frames is adjacent to each other in the arrangement position;
and performing light field reconstruction according to the synchronous video frames acquired on each frame in the original video and the virtual video frames corresponding to the synchronous video frames to obtain a light field video.
2. The method of claim 1, wherein said generating a virtual video frame from each of said neighboring video frames comprises:
taking the free visual angle acquisition equipment corresponding to the adjacent video frames as adjacent visual angle acquisition equipment, and performing feature matching on each adjacent video frame to obtain matched feature points in each adjacent video frame;
acquiring a physical calibration result of the adjacent visual angle acquisition equipment for each adjacent visual angle acquisition equipment, and projecting the characteristic points in the adjacent video frames acquired by the adjacent visual angle acquisition equipment according to the physical calibration result to obtain space points;
determining a target point according to the space points corresponding to the matched feature points in each adjacent video frame, acquiring a virtual calibration result of a virtual visual angle acquisition device corresponding to a virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point;
and generating the virtual video frame according to the projection points corresponding to the characteristic points in any adjacent video frame.
3. The method according to claim 2, wherein the physical calibration result comprises intrinsic parameters and a pose, and the projecting the feature points in the adjacent video frames acquired by the adjacent view angle acquisition equipment according to the physical calibration result to obtain spatial points comprises:
taking the adjacent video frames collected by the adjacent visual angle collecting equipment as collected video frames;
obtaining depth information of a physical view angle under the adjacent view angle acquisition equipment according to the intrinsic parameters and the pose, and acquiring a first pixel value of the feature point in the acquired video frame;
performing spatial back projection according to the depth information and the first pixel value to obtain a back projection matrix of the intrinsic parameters and a back projection matrix of the pose;
and projecting the feature points in the acquired video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain spatial points.
4. The method of claim 2, further comprising:
respectively acquiring second pixel values of the matched feature points in each adjacent video frame, and respectively acquiring the distance between each adjacent visual angle acquisition device and the virtual visual angle acquisition device;
determining the weight of the adjacent visual angle acquisition equipment corresponding to the distance according to the distance;
determining a projected pixel value of the projection point according to each second pixel value and each weight;
generating the virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame, including:
and generating the virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame and the projection pixel values of the projection points corresponding to the feature points.
5. The method of claim 1, wherein said generating a virtual video frame from each of said neighboring video frames comprises:
determining a first video frame and a second video frame adjacent to each other in a first direction in each adjacent video frame;
in the first direction, performing frame interpolation according to the first video frame and the second video frame to obtain an intermediate video frame;
and in a second direction perpendicular to the first direction, performing frame interpolation according to each intermediate video frame to obtain a virtual video frame.
6. The method of claim 5, further comprising:
taking a direction parallel to a ground plane as a horizontal direction and a direction perpendicular to the ground plane as a vertical direction;
acquiring a horizontal interval of any two free visual angle acquisition devices in the horizontal direction and a vertical interval of any two free visual angle acquisition devices in the vertical direction;
and determining the horizontal direction or the vertical direction as the first direction according to the numerical relationship between the horizontal interval and the vertical interval.
7. The method of claim 1, wherein:
the free visual angle acquisition equipment in each free visual angle acquisition equipment group is annularly arranged;
and/or,
the virtual visual angle corresponding to the virtual video frame is located in a target range, and the target range is a range formed by the physical visual angles corresponding to the adjacent video frames.
8. A video reconstruction apparatus, comprising:
the system comprises an original video acquisition module, a video acquisition module and a video processing module, wherein the original video acquisition module is used for acquiring an original video acquired based on light field acquisition equipment, the light field acquisition equipment comprises at least two free view angle acquisition equipment groups, each free view angle acquisition equipment group is arranged on different heights, and each free view angle acquisition equipment group comprises a plurality of free view angle acquisition equipment;
the virtual video frame generation module is used for generating virtual video frames according to adjacent video frames in each synchronous video frame in the original video, wherein each synchronous video frame is acquired on the same frame, and the free visual angle acquisition equipment corresponding to any two adjacent video frames is adjacent in the arrangement position;
and the light field video reconstruction module is used for reconstructing a light field according to the synchronous video frames acquired on each frame in the original video and the virtual video frames corresponding to the synchronous video frames to obtain the light field video.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video reconstruction method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the video reconstruction method according to any one of claims 1 to 7.
CN202210556761.XA 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium Active CN115022613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210556761.XA CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210556761.XA CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115022613A true CN115022613A (en) 2022-09-06
CN115022613B CN115022613B (en) 2024-07-26

Family

ID=83069441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210556761.XA Active CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115022613B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020136465A1 (en) * 2000-12-26 2002-09-26 Hiroki Nagashima Method and apparatus for image interpolation
US20160307372A1 (en) * 2015-04-15 2016-10-20 Lytro, Inc. Capturing light-field volume image and video data using tiled light-field cameras
CN111667438A (en) * 2019-03-07 2020-09-15 阿里巴巴集团控股有限公司 Video reconstruction method, system, device and computer readable storage medium
WO2021134178A1 (en) * 2019-12-30 2021-07-08 华为技术有限公司 Video stream processing method, apparatus and device, and medium
CN113192185A (en) * 2021-05-18 2021-07-30 清华大学 Dynamic light field reconstruction method, device and equipment
WO2021227360A1 (en) * 2020-05-14 2021-11-18 佳都新太科技股份有限公司 Interactive video projection method and apparatus, device, and storage medium
CN114143528A (en) * 2020-09-04 2022-03-04 北京大视景科技有限公司 Multi-video stream fusion method, electronic device and storage medium

Also Published As

Publication number Publication date
CN115022613B (en) 2024-07-26

Similar Documents

Publication Publication Date Title
CN108492364B (en) Method and apparatus for generating image generation model
CN113327318B (en) Image display method, image display device, electronic equipment and computer readable medium
CN109495733B (en) Three-dimensional image reconstruction method, device and non-transitory computer readable storage medium thereof
CN115170740A (en) Special effect processing method and device, electronic equipment and storage medium
CN111818265B (en) Interaction method and device based on augmented reality model, electronic equipment and medium
CN115002345B (en) Image correction method, device, electronic equipment and storage medium
CN109816791B (en) Method and apparatus for generating information
CN115022613B (en) Video reconstruction method and device, electronic equipment and storage medium
CN111489428B (en) Image generation method, device, electronic equipment and computer readable storage medium
CN116309137A (en) Multi-view image deblurring method, device and system and electronic medium
KR102534449B1 (en) Image processing method, device, electronic device and computer readable storage medium
CN114419299A (en) Virtual object generation method, device, equipment and storage medium
CN113873156A (en) Image processing method and device and electronic equipment
CN114202617A (en) Video image processing method and device, electronic equipment and storage medium
CN113066166A (en) Image processing method and device and electronic equipment
CN112070903A (en) Virtual object display method and device, electronic equipment and computer storage medium
CN112241999A (en) Image generation method, device, equipment and computer readable medium
CN115457200B (en) Method, device, equipment and storage medium for automatic true stereo display of 2.5-dimensional image
CN111275813B (en) Data processing method and device and electronic equipment
CN112883757B (en) Method for generating tracking attitude result
CN115564879A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113989388A (en) Calibration method, calibration device, electronic equipment and medium
CN116363338A (en) Image processing method, device, electronic equipment and storage medium
CN117911476A (en) Perspective method, apparatus, device, storage medium and computer program product
CN117906634A (en) Equipment detection method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant