CN115022613B - Video reconstruction method and device, electronic equipment and storage medium - Google Patents

Video reconstruction method and device, electronic equipment and storage medium

Info

Publication number
CN115022613B
CN115022613B (application CN202210556761.XA)
Authority
CN
China
Prior art keywords
video
adjacent
video frame
virtual
acquisition equipment
Prior art date
Legal status
Active
Application number
CN202210556761.XA
Other languages
Chinese (zh)
Other versions
CN115022613A (en)
Inventor
焦少慧
陈誉中
吴泽寰
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202210556761.XA priority Critical patent/CN115022613B/en
Publication of CN115022613A publication Critical patent/CN115022613A/en
Application granted granted Critical
Publication of CN115022613B publication Critical patent/CN115022613B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 Image reproducers
    • H04N 13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N 13/366 Image reproducers using viewer tracking
    • H04N 13/398 Synchronisation thereof; Control thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)

Abstract

The embodiments of the disclosure disclose a video reconstruction method and apparatus, an electronic device, and a storage medium. The method includes: acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, the free view acquisition device groups are placed at different heights, and each free view acquisition device group includes a plurality of free view acquisition devices; for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame from the adjacent video frames, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position; and performing light field reconstruction according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to the synchronous video frames, to obtain a light field video. This technical scheme can reconstruct a light field video that satisfies the 6DoF viewing requirement.

Description

Video reconstruction method and device, electronic equipment and storage medium
Technical Field
Embodiments of the disclosure relate to the technical field of data processing, and in particular to a video reconstruction method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of information technology, video plays an important role in many aspects of people's lives. At present, viewers are no longer satisfied with watching two-dimensional (2D) video; instead, they want to watch three-dimensional (3D) video with a stronger stereoscopic effect.
For light field video, one kind of 3D video, viewers often want to see multi-view information with 6 degrees of freedom (Degrees of Freedom, DoF) when watching. However, currently reconstructed light field video cannot meet this viewing requirement; that is, it does not allow viewing with 6DoF.
Disclosure of Invention
The embodiments of the disclosure provide a video reconstruction method and apparatus, an electronic device, and a storage medium, so as to reconstruct a light field video that can satisfy the 6DoF viewing requirement.
In a first aspect, an embodiment of the present disclosure provides a video reconstruction method, which may include:
acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, the free view acquisition device groups are placed at different heights, and each free view acquisition device group includes a plurality of free view acquisition devices;
generating, for adjacent video frames among the synchronous video frames in the original video, a virtual video frame from the adjacent video frames, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position;
and performing light field reconstruction according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to the synchronous video frames, to obtain a light field video.
In a second aspect, embodiments of the present disclosure further provide a video reconstruction apparatus, which may include:
an original video acquisition module, configured to acquire an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, the free view acquisition device groups are placed at different heights, and each free view acquisition device group includes a plurality of free view acquisition devices;
a virtual video frame generation module, configured to generate, for adjacent video frames among the synchronous video frames in the original video, a virtual video frame from the adjacent video frames, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position;
a light field video reconstruction module, configured to perform light field reconstruction according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to the synchronous video frames, to obtain a light field video.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, which may include:
one or more processors;
A memory for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video reconstruction method provided by any embodiment of the present disclosure.
In a fourth aspect, the embodiments of the present disclosure further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the video reconstruction method provided by any of the embodiments of the present disclosure.
According to the above technical scheme, an original video captured by a light field acquisition device is acquired. Because the light field acquisition device includes at least two free view acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view acquisition devices, the captured original video carries multi-view information, i.e., light field information, in both the vertical and the horizontal direction. Furthermore, in order to reconstruct dense light field information while deploying only sparse free view acquisition devices, a virtual video frame can be generated from adjacent video frames among the synchronous video frames captured on the same frame in the original video, where the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position. Light field reconstruction is then performed according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to them, to obtain a light field video. By adding free-view capture in the vertical direction on top of free-view capture in the horizontal direction, and by generating virtual video frames from the adjacent video frames to reconstruct dense light field information, the scheme obtains a light field video that satisfies the 6DoF viewing requirement and allows a user to watch it with 6DoF based on Augmented Reality (AR), a head-mounted display device, or the like.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of a video reconstruction method in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the placement of a light field acquisition device in a video reconstruction method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of adjacent video frames and virtual video frames in a video reconstruction method according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of yet another video reconstruction method in an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of generating a virtual video frame in yet another video reconstruction method according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of another video reconstruction method in an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of generating a virtual video frame in another video reconstruction method according to an embodiment of the present disclosure;
FIG. 8 is a block diagram of a video reconstruction apparatus in an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "one" and "a plurality" in this disclosure are illustrative rather than limiting; those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
FIG. 1 is a flowchart of a video reconstruction method provided in an embodiment of the present disclosure. This embodiment is applicable to scenarios in which a light field video is reconstructed, in particular reconstruction based on a free view acquisition scheme. The method may be performed by the video reconstruction apparatus provided by an embodiment of the present disclosure; the apparatus may be implemented in software and/or hardware and may be integrated on an electronic device, which may be any of various terminal devices or a server.
Referring to FIG. 1, the method of the embodiment of the disclosure specifically includes the following steps:
S110, acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, the free view acquisition device groups are placed at different heights, and each free view acquisition device group includes a plurality of free view acquisition devices.
The light field acquisition device may be a pre-deployed device for capturing the original video. It may include at least two free view acquisition device groups placed at different heights, so that multi-view information in the vertical direction can be captured, where the vertical direction can be understood as the direction of height, i.e., the direction perpendicular to the ground plane. On this basis, each free view acquisition device group may include a plurality of free view acquisition devices, so that multi-view information in the horizontal direction, understood as the direction parallel to the ground plane, can be captured. In practice, optionally, the view acquisition devices in each free view acquisition device group may be placed in a ring, i.e., 360-degree surround shooting (outside-in), so that light field information under more viewing angles in the horizontal direction can be captured. For example, referring to the placement schematic of a light field acquisition device shown in FIG. 2 (where the horizontal line represents the ground plane and the vertical line represents height), the light field acquisition device includes a first free view acquisition device group 20 and a second free view acquisition device group 21 placed at different heights. Taking the first free view acquisition device group 20 as an example, it includes a plurality of free view acquisition devices 201, which are located at the same height and arranged in one turn in the horizontal direction (only a portion of the free view acquisition devices 201 are illustrated in the figure).
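For intuition only, the following Python sketch computes a hypothetical outside-in placement of this kind: two free view acquisition device groups at different heights, each with eight devices evenly spaced on a horizontal ring and aimed at the ring axis. All numbers and names here are illustrative assumptions, not parameters disclosed by the embodiment.

```python
import numpy as np

def ring_camera_positions(heights=(1.0, 2.0), cams_per_ring=8, radius=3.0):
    """Illustrative outside-in rig: cameras on horizontal rings at
    different heights, all aimed at the vertical ring axis."""
    poses = []
    for h in heights:                      # one free view device group per height
        for k in range(cams_per_ring):     # devices evenly spaced on the ring
            theta = 2.0 * np.pi * k / cams_per_ring
            position = np.array([radius * np.cos(theta),
                                 radius * np.sin(theta), h])
            look_dir = np.array([0.0, 0.0, h]) - position  # toward the axis
            poses.append((position, look_dir / np.linalg.norm(look_dir)))
    return poses

for pos, d in ring_camera_positions():
    print(np.round(pos, 2), np.round(d, 2))
```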
The original video captured by the light field acquisition device can be regarded as a video obtained by the plurality of free view acquisition devices simultaneously capturing light field samples of the target space from different viewpoints, i.e., viewing angles. As described above, since the light field acquisition device captures light field information in the horizontal and vertical directions at the same time, the original video carries relatively complete light field information.
S120, for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame from the adjacent video frames, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position.
In order to reconstruct a light field video with complete, i.e., dense, light field information, one alternative is to deploy very densely placed free view acquisition devices, but this approach has the following problems: 1) the hardware cost is high; 2) the free view acquisition devices need to be consistent in device parameters such as white balance and brightness and consistent in time synchronization, and the more devices there are, the harder this consistency is to achieve; 3) each free view acquisition device needs to be calibrated, and the greater the number of devices, the higher the calibration complexity and hence the longer the calibration time.
On this basis, in order to obtain complete light field information while avoiding the above problems, the embodiments of the disclosure propose a technical scheme that obtains complete light field information by generating virtual video frames on the basis of sparsely deployed free view acquisition devices. Specifically, the synchronous video frames may be the video frames under different viewing angles captured on the same frame of the original video, that is, the video frames captured synchronously by the view acquisition devices in each free view acquisition device group. The adjacent video frames may be video frames among the synchronous video frames whose respective free view acquisition devices are adjacent in placement position, for example adjacent in the horizontal, vertical, or diagonal direction. For example, referring to FIG. 3, for simplicity the video frames in the illustration are denoted by "view", and a video frame at a given location (i.e., the location corresponding to the coordinates in the figure) has the following meaning: view(0,0), view(0,1), view(1,0), and view(1,1) each represent an adjacent video frame, where the free view acquisition devices corresponding to view(0,0) and view(0,1) are adjacent in the horizontal direction, and likewise for view(1,0) and view(1,1); the free view acquisition devices corresponding to view(0,0) and view(1,0) are adjacent in the vertical direction, and likewise for view(0,1) and view(1,1); the free view acquisition devices corresponding to view(0,0) and view(1,1) are adjacent in the diagonal direction.
Further, a virtual video frame is generated from the adjacent video frames. The virtual video frame can be understood as the video frame under a corresponding virtual viewing angle, that is, the video frame that would be captured by a virtual view acquisition device at that virtual viewing angle. In practice, optionally, the virtual video frame may be generated in various manners, such as by frame interpolation or by three-dimensional reconstruction, which is not limited here. Still optionally, the virtual viewing angle corresponding to the virtual video frame may lie within a target range formed by the physical viewing angles corresponding to the adjacent video frames, such as the range formed by the physical viewing angles corresponding to view(0,0), view(0,1), view(1,0), and view(1,1) in FIG. 3, in which case the virtual video frame may be view(0.6,0.3) in FIG. 3; or it may lie outside that range, in which case the virtual video frame may be view(0.6,1.2) in FIG. 3; and so on, without specific limitation here.
S130, performing light field reconstruction according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain a light field video.
The original video includes the synchronous video frames captured on each frame. For example, if the light field acquisition device includes M free view acquisition device groups and each group includes N free view acquisition devices, where M and N are integers greater than 2, then the original video includes M×N synchronous video frames captured on each frame. For the synchronous video frames on each frame, the virtual video frames corresponding to them are obtained through the above steps. Dense light field information can therefore be reconstructed based on the synchronous video frames captured on each frame and their corresponding virtual video frames, so that a light field video with dense light field information is obtained.
According to the technical scheme of this embodiment, an original video captured by a light field acquisition device is acquired. Because the light field acquisition device includes at least two free view acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view acquisition devices, the captured original video carries multi-view information, i.e., light field information, in both the vertical and the horizontal direction. Furthermore, in order to reconstruct dense light field information while deploying only sparse free view acquisition devices, a virtual video frame can be generated from adjacent video frames among the synchronous video frames captured on the same frame in the original video, where the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position. Light field reconstruction is then performed according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to them, to obtain a light field video. By adding free-view capture in the vertical direction on top of free-view capture in the horizontal direction, and by generating virtual video frames from the adjacent video frames to reconstruct dense light field information, this scheme obtains a light field video that satisfies the 6DoF viewing requirement, allowing a user to watch it with 6DoF based on AR, a head-mounted display device, or the like.
FIG. 4 is a flow chart of yet another video reconstruction method provided in an embodiment of the present disclosure. This embodiment is optimized on the basis of the alternatives in the embodiments described above. In this embodiment, optionally, generating a virtual video frame from the adjacent video frames may include: taking the free view acquisition devices corresponding to the adjacent video frames as adjacent view acquisition devices, and performing feature matching on the adjacent video frames to obtain the matched feature points in each adjacent video frame; for each adjacent view acquisition device, obtaining the physical calibration result of the adjacent view acquisition device, and projecting the feature points in the adjacent video frame captured by that device according to the physical calibration result to obtain spatial points; determining a target point from the spatial points respectively corresponding to the matched feature points in the adjacent video frames, obtaining the virtual calibration result of the virtual view acquisition device corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point; and generating the virtual video frame from the projection points corresponding to the feature points in any one adjacent video frame. Explanations of terms identical or corresponding to those in the above embodiments are not repeated here.
Accordingly, as shown in FIG. 4, the method of this embodiment may specifically include the following steps:
s210, acquiring an original video acquired based on light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view acquisition equipment groups, the free view acquisition equipment groups are placed on different heights, and the free view acquisition equipment groups comprise a plurality of free view acquisition equipment.
S220, for adjacent video frames among the synchronous video frames in the original video, taking the free view acquisition devices corresponding to the adjacent video frames as adjacent view acquisition devices, and performing feature matching on the adjacent video frames to obtain the matched feature points in each adjacent video frame, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position.
Steps S220-S250 are performed for each group of adjacent video frames. Specifically, the free view acquisition device that captures an adjacent video frame is taken as an adjacent view acquisition device. The feature matching process can be implemented with algorithms such as stereo matching, so as to obtain the matched feature points in each adjacent video frame, e.g., feature point A1 in adjacent video frame 1, feature point A2 in adjacent video frame 2, feature point A3 in adjacent video frame 3, and feature point A4 in adjacent video frame 4.
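The embodiment leaves the matching algorithm open; as one hedged possibility, the sketch below matches sparse ORB features between two adjacent video frames with OpenCV. The detector, matcher, and limits are illustrative assumptions rather than the disclosed method.

```python
import cv2

def match_adjacent_frames(frame_a, frame_b, max_matches=200):
    """Find matched feature points between two adjacent video frames.

    frame_a, frame_b: grayscale uint8 images from adjacent view devices.
    Returns two index-aligned lists of (x, y) pixel coordinates.
    """
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    pts_a = [kp_a[m.queryIdx].pt for m in matches[:max_matches]]
    pts_b = [kp_b[m.trainIdx].pt for m in matches[:max_matches]]
    return pts_a, pts_b
```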
S230, for each adjacent view acquisition device, obtaining the physical calibration result of the adjacent view acquisition device, and projecting the feature points in the adjacent video frame captured by the adjacent view acquisition device according to the physical calibration result to obtain spatial points.
The physical calibration result may be the result obtained by calibrating each adjacent view acquisition device, and may be represented by a pose (R, t) and an internal reference K, where R denotes a rotation matrix and t denotes a translation vector. Since an adjacent view acquisition device is a physically existing device, its calibration result is referred to here as a physical calibration result. Then, for each feature point in the adjacent video frame captured by the adjacent view acquisition device, the feature point is spatially projected according to the physical calibration result of that device, thereby obtaining a spatial point.
S240, determining a target point according to the spatial points respectively corresponding to the matched feature points in the adjacent video frames, obtaining the virtual calibration result of the virtual view acquisition device corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point.
For the spatial points corresponding to the matched feature points in the adjacent video frames, such as the spatial points corresponding to A1, A2, A3, and A4 in the above example, the target point is determined from these spatial points. In practice, optionally, each set of matched feature points across the adjacent video frames determines one target point, so there are as many target points as there are sets of matched feature points. Still optionally, the target point may be determined as follows: for each adjacent video frame, connect a feature point in that frame with its corresponding spatial point to obtain the straight line on which the feature point lies; after the straight lines of the matched feature points in all adjacent video frames are obtained, take the intersection point of these straight lines as the target point.
The virtual view acquisition device may be regarded as the device that would capture the virtual video frame to be generated, and the virtual calibration result may be its calibration result. In contrast to the physical calibration result, the calibration result of the virtual view acquisition device is called a virtual calibration result because the device does not physically exist. The target point is projected according to the virtual calibration result to obtain a projection point (i.e., a pixel point) on the virtual video frame to be generated. In practice, the above process of obtaining a projection point from a target point can be understood as rendering the reconstruction result (i.e., the spatial points), that is, drawing the target point onto the virtual video frame to be generated.
S250, generating a virtual video frame according to projection points corresponding to the feature points in any adjacent video frame.
The matched feature points in the adjacent video frames correspond to the same projection point, so the virtual video frame can be generated from the projection points respectively corresponding to the feature points in any one adjacent video frame. That is, once every projection point on the virtual video frame to be generated has been obtained, the virtual video frame can be generated from them.
To give a better overall understanding of how the virtual video frame is generated, an example follows. As shown in FIG. 5, take the adjacent video frame 51 captured by the adjacent view acquisition device 50 and the adjacent video frame 53 captured by the adjacent view acquisition device 52 as an example, where A1 on the adjacent video frame 51 and A2 on the adjacent video frame 53 are matched feature points. A1 is projected to obtain its corresponding spatial point K1; then A1 and K1 are connected to obtain the straight line L1 on which A1 lies. The process of obtaining the spatial point K2 corresponding to A2 and the straight line L2 on which A2 lies is similar and is not repeated here. The intersection point of the straight line L1 and the straight line L2 is taken as the target point M, and the target point M is then projected according to the virtual calibration result to obtain the projection point T. The projection points corresponding to the remaining matched feature points on the adjacent video frames 51 and 53 are obtained in a similar way and are not repeated here. After the projection points corresponding to the feature points on the adjacent video frame 51 or 53 are obtained, the virtual video frame 55 that would be captured by the virtual view acquisition device 54 can be obtained from these projection points.
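A minimal numpy sketch of the geometry in FIG. 5, under the assumption of calibrated pinhole cameras with x_cam = R X + t: each matched feature point is back-projected to a ray, the target point M is taken as the least-squares closest point of the rays (two back-projected lines rarely intersect exactly, so this closest-point choice is an implementation assumption, not something the patent fixes), and M is then projected with the virtual calibration result to obtain the projection point T.

```python
import numpy as np

def pixel_ray(K, R, t, pixel):
    """Ray through `pixel` for a camera with x_cam = R @ X + t."""
    center = -R.T @ t                                   # camera centre in world
    d = R.T @ np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    return center, d / np.linalg.norm(d)

def triangulate(rays):
    """Least-squares closest point to a set of (origin, direction) rays."""
    A = np.zeros((3, 3)); b = np.zeros(3)
    for c, d in rays:
        P = np.eye(3) - np.outer(d, d)                  # projector orthogonal to ray
        A += P; b += P @ c
    return np.linalg.solve(A, b)                        # target point M

def project(K, R, t, X):
    """Project world point X into the (virtual) camera, returning (u, v)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]                                 # projection point T
```

Usage would be, e.g., `M = triangulate([pixel_ray(K1, R1, t1, A1), pixel_ray(K2, R2, t2, A2)])` followed by `T = project(Kv, Rv, tv, M)` with the virtual calibration result (Kv, Rv, tv); all symbols here are illustrative names.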
And S260, performing light field reconstruction according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain a light field video.
According to the technical scheme of this embodiment, the free view acquisition devices corresponding to the adjacent video frames are taken as adjacent view acquisition devices, and feature matching is performed on the adjacent video frames to obtain the matched feature points in each adjacent video frame. Then, for each adjacent view acquisition device, its physical calibration result is obtained, and the feature points in the adjacent video frame captured by that device are projected according to the physical calibration result to obtain spatial points. A target point is determined from the spatial points respectively corresponding to the matched feature points in the adjacent video frames, the virtual calibration result of the virtual view acquisition device corresponding to the virtual video frame to be generated is obtained, and the target point is projected according to the virtual calibration result to obtain a projection point. After the projection points corresponding to the feature points in any one adjacent video frame are obtained, the virtual video frame is generated from these projection points, so that a high-precision virtual video frame is generated based on three-dimensional reconstruction.
In an optional technical solution, on the basis of the above embodiment, the physical calibration result includes an internal reference and a pose, and projecting the feature points in the adjacent video frame captured by the adjacent view acquisition device according to the physical calibration result to obtain spatial points may include: taking the captured adjacent video frame as the captured video frame; obtaining depth information of the physical viewing angle under the adjacent view acquisition device according to the internal reference and the pose, and obtaining first pixel values of the feature points in the captured video frame; performing spatial back projection according to the depth information and the first pixel values to obtain a back projection matrix of the internal reference and a back projection matrix of the pose; and projecting the feature points in the captured video frame according to the back projection matrix of the internal reference and the back projection matrix of the pose to obtain the spatial points.
Since each adjacent view acquisition device is described separately, for simplicity it is emphasized that what is processed is the adjacent video frame captured by this adjacent view acquisition device, rather than the adjacent video frames captured by the other devices; this adjacent video frame is taken as the captured video frame. Next, depth information of the physical viewing angle under the adjacent view acquisition device is obtained according to the internal reference and the pose, and the first pixel value of each feature point in the captured video frame is obtained, where the first pixel value may be represented by RGB information. Spatial back projection is then performed according to the depth information and the first pixel values to obtain a back projection matrix of the internal reference and a back projection matrix of the pose. Thus the feature points in the captured video frame can be projected according to these back projection matrices to obtain the spatial points. Continuing the above example, P = [R|t]^(-1) K^(-1) p_t, where p_t denotes a feature point in the captured video frame, K^(-1) denotes the back projection matrix of the internal reference, [R|t]^(-1) denotes the back projection matrix of the pose, and P denotes the spatial point. With this technical scheme, the spatial points can be obtained accurately.
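Written out as code, the back projection above might look as follows, assuming a pinhole model in which x_cam = R X + t and the depth z of the feature point is known from the depth information described above; the helper name and argument layout are hypothetical.

```python
import numpy as np

def backproject(K, R, t, pixel, depth):
    """Spatial point P = [R|t]^(-1) K^(-1) p_t, with the homogeneous
    pixel p_t scaled by its known depth so the mapping is well defined."""
    p_t = depth * np.array([pixel[0], pixel[1], 1.0])  # depth-scaled pixel
    x_cam = np.linalg.inv(K) @ p_t                     # camera coordinates
    return R.T @ (x_cam - t)                           # world coordinates (P)
```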
In another optional technical solution, on the basis of the above embodiment, the video reconstruction method may further include: respectively obtaining the second pixel values of the matched feature points in the adjacent video frames, and respectively obtaining the distance between each adjacent view acquisition device and the virtual view acquisition device; determining, from each distance, the weight of the adjacent view acquisition device corresponding to that distance; and determining the projection pixel value of the projection point according to the second pixel values and the weights. Generating the virtual video frame according to the projection points corresponding to the feature points in any one adjacent video frame may then include: generating the virtual video frame according to the projection points corresponding to the feature points in any one adjacent video frame and the projection pixel values of those projection points.
The second pixel values of the matched feature points in the adjacent video frames are obtained respectively, and a second pixel value may be represented by RGB information. Although the second pixel values are all pixel values of matched feature points, the poses of the adjacent view acquisition devices differ, so the second pixel values may differ as well, e.g., A1 being darker and A2 being greyish. On this basis, in order to accurately determine the projection pixel value of the projection point corresponding to the matched feature points, the distance between each adjacent view acquisition device and the virtual view acquisition device may be obtained respectively, where the distance may be a Euclidean distance, a Mahalanobis distance, a cosine distance, a Hamming distance, a Manhattan distance, or the like, without specific limitation here. In general, the smaller the distance, the greater the influence of the second pixel value under the adjacent view acquisition device at that distance on the projection pixel value, and vice versa; the weight of the adjacent view acquisition device corresponding to a distance can therefore be determined from that distance, and the projection pixel value can be determined from the second pixel values and their corresponding weights. It should be noted that the feature points, spatial points, target points, and projection points described above mainly carry the position information of the corresponding pixel points (or spatial points), so when the virtual video frame is generated, it can be realized from the projection points and their projection pixel values, thereby generating the virtual video frame accurately.
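The weighting function itself is not fixed by the embodiment; normalized inverse-distance weights are one common assumption, sketched below (function and variable names are hypothetical).

```python
import numpy as np

def blend_pixel(second_pixel_values, distances, eps=1e-6):
    """Projection pixel value as a distance-weighted mix of the second
    pixel values of the matched feature points.

    second_pixel_values: (N, 3) RGB values, one per adjacent view device.
    distances: (N,) distances from each adjacent device to the virtual one.
    """
    w = 1.0 / (np.asarray(distances) + eps)   # closer device, larger weight
    w = w / w.sum()                           # normalise the weights
    return w @ np.asarray(second_pixel_values, dtype=float)

# e.g. A1 darker, A2 lighter, virtual view closer to device 1 -> [90. 90. 90.]
print(blend_pixel([[80, 80, 80], [120, 120, 120]], [0.5, 1.5]))
```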
FIG. 6 is a flowchart of another video reconstruction method provided in an embodiment of the present disclosure. This embodiment is optimized on the basis of the alternatives in the embodiments described above. In this embodiment, optionally, generating a virtual video frame from the adjacent video frames may include: determining, among the adjacent video frames, a first video frame and a second video frame that are adjacent in a first direction; in the first direction, interpolating frames from the first video frame and the second video frame to obtain an intermediate video frame; and, in a second direction perpendicular to the first direction, interpolating frames from the intermediate video frames to obtain the virtual video frame. Explanations of terms identical or corresponding to those in the above embodiments are not repeated here.
Accordingly, as shown in FIG. 6, the method of this embodiment may specifically include the following steps:
S310, acquiring an original video captured by a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, the free view acquisition device groups are placed at different heights, and each free view acquisition device group includes a plurality of free view acquisition devices.
S320, for adjacent video frames among the synchronous video frames in the original video, determining, among the adjacent video frames, a first video frame and a second video frame that are adjacent in a first direction, where the synchronous video frames are captured on the same frame and the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position.
Steps S320-S340 are performed for each group of adjacent video frames. The first direction may be the horizontal direction or the vertical direction, depending on the actual situation, and is not specifically limited here. The first video frame and the second video frame may be two video frames that are adjacent in the first direction among the adjacent video frames. For simplicity, the video frames in the illustration are denoted by "view", as shown in FIG. 7, and a video frame at a given location (i.e., the location corresponding to the coordinates in the figure) has the following meaning: view(0,0) and view(0,1) can be understood as two video frames adjacent in the horizontal direction, and likewise view(1,0) and view(1,1); view(0,0) and view(1,0) can be understood as two video frames adjacent in the vertical direction, and likewise view(0,1) and view(1,1).
S330, in the first direction, interpolating frames from the first video frames and the second video frames to obtain intermediate video frames, and, in a second direction perpendicular to the first direction, interpolating frames from the intermediate video frames to obtain a virtual video frame.
In the first direction there are at least two pairs of first and second video frames; for each pair, frames are interpolated from the two to obtain an intermediate video frame. The specific interpolation scheme may be an optical-flow scheme or a Depth Image Based Rendering (DIBR) scheme, which is not specifically limited here. For example, taking the first direction as the horizontal direction, as shown in FIG. 7, the intermediate video frames obtained by interpolation may be view(0,0.3) and view(1,0.3), respectively. Then, in the second direction perpendicular to the first direction, frames are interpolated again from the intermediate video frames to obtain the virtual video frame. Continuing the example, the second direction is the vertical direction, and view(0.6,0.3) is obtained by interpolating in the vertical direction from view(0,0.3) and view(1,0.3).
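As an illustration of the optical-flow variant of this two-pass interpolation (the embodiment also allows DIBR), the sketch below warps one frame toward the other by a fraction alpha of the Farneback flow estimated with OpenCV. The single-sided warp is a deliberately crude assumption; practical interpolators usually blend warps from both sides.

```python
import cv2
import numpy as np

def interpolate(frame_a, frame_b, alpha=0.3):
    """Synthesise an approximate view at fractional position `alpha`
    between two grayscale frames by warping along estimated optical flow."""
    flow = cv2.calcOpticalFlowFarneback(frame_a, frame_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = frame_a.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    # crude backward warp: sample frame_a a fraction alpha along the flow
    map_x = (gx + alpha * flow[..., 0]).astype(np.float32)
    map_y = (gy + alpha * flow[..., 1]).astype(np.float32)
    return cv2.remap(frame_a, map_x, map_y, cv2.INTER_LINEAR)

# First direction (horizontal), then second direction (vertical):
# mid_top = interpolate(view_00, view_01, 0.3)   # view(0, 0.3)
# mid_bot = interpolate(view_10, view_11, 0.3)   # view(1, 0.3)
# virtual = interpolate(mid_top, mid_bot, 0.6)   # view(0.6, 0.3)
```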
S340, performing light field reconstruction according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain a light field video.
According to the technical scheme of this embodiment, a first video frame and a second video frame adjacent in the first direction are determined among the adjacent video frames; in the first direction, frames are interpolated from the first and second video frames to obtain intermediate video frames; and then, in the second direction perpendicular to the first direction, frames are interpolated from the intermediate video frames to obtain the virtual video frame, so that the virtual video frame is generated quickly based on frame interpolation.
In an optional technical solution, on the basis of the above embodiment, the video reconstruction method may further include: taking the direction parallel to the ground plane as the horizontal direction and the direction perpendicular to the ground plane as the vertical direction; obtaining the horizontal interval between any two free view acquisition devices in the horizontal direction and the vertical interval between any two free view acquisition devices in the vertical direction; and determining the horizontal direction or the vertical direction as the first direction according to the numerical relation between the horizontal interval and the vertical interval. When the view acquisition devices in the horizontal direction are placed at equal intervals, the horizontal interval between any two free view acquisition devices in the horizontal direction is the same; otherwise the intervals differ, which depends on the actual situation and is not limited here. Similarly, when the free view acquisition device groups are placed at equal height intervals, the vertical interval between any two free view acquisition devices in the vertical direction is the same; otherwise the intervals differ, which is likewise related to the actual situation and not limited here. On this basis, the horizontal direction or the vertical direction is determined as the first direction according to the numerical relation between the horizontal interval and the vertical interval: if the horizontal interval is smaller than or equal to the vertical interval, the horizontal direction is taken as the first direction; otherwise, the vertical direction is taken as the first direction. In other words, the intermediate video frames are obtained by interpolating first along the direction with the smaller interval, and the virtual video frame is then obtained by interpolating along the direction with the larger interval based on the intermediate video frames. The advantage of this arrangement is that the error of intermediate video frames interpolated along the smaller interval is smaller than along the larger interval, so the accumulated error of the virtual video frame obtained by interpolating from the intermediate video frames can be reduced, ensuring the accuracy of the virtual video frame.
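Reduced to code, the selection rule is a small comparison; a sketch follows, with names assumed for illustration.

```python
def choose_first_direction(horizontal_interval, vertical_interval):
    """Interpolate first along the smaller device spacing, so the
    intermediate frames accumulate less interpolation error."""
    if horizontal_interval <= vertical_interval:
        return "horizontal"
    return "vertical"
```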
FIG. 8 is a block diagram of a video reconstruction apparatus according to an embodiment of the present disclosure; the apparatus is configured to perform the video reconstruction method of any of the foregoing embodiments. The apparatus is based on the same conception as the video reconstruction methods of the embodiments above; for details not described in this apparatus embodiment, reference may be made to the method embodiments. Referring to FIG. 8, the apparatus may specifically include an original video acquisition module 410, a virtual video frame generation module 420, and a light field video reconstruction module 430.
The original video acquisition module 410 is configured to acquire an original video acquired based on a light field acquisition device, where the light field acquisition device includes at least two free view acquisition device groups, each free view acquisition device group is placed at a different height, and each free view acquisition device group includes a plurality of free view acquisition devices;
the virtual video frame generation module 420 is configured to generate, for adjacent video frames in each synchronous video frame in the original video, a virtual video frame according to each adjacent video frame, where each synchronous video frame is acquired on the same frame, and free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in a placement position;
The light field video reconstruction module 430 is configured to perform light field reconstruction according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame, so as to obtain a light field video.
Optionally, the virtual video frame generation module 420 may include:
The feature point obtaining unit is used for taking the free view angle acquisition equipment corresponding to the adjacent video frames as the adjacent view angle acquisition equipment, and carrying out feature matching on each adjacent video frame to obtain matched feature points in each adjacent video frame;
The space point obtaining unit is used for obtaining physical calibration results of the adjacent view angle acquisition devices according to each adjacent view angle acquisition device, and projecting characteristic points in the adjacent video frames acquired by the adjacent view angle acquisition devices according to the physical calibration results to obtain space points;
The projection point obtaining unit is used for determining a target point according to the spatial points corresponding to the matched characteristic points in each adjacent video frame, obtaining a virtual calibration result of the virtual visual angle acquisition equipment corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point;
And the virtual video frame generation unit is used for generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame.
On the basis, optionally, the physical calibration result includes internal parameters and pose, and the spatial point obtaining unit may include:
the acquisition video frame obtaining subunit is used for taking the adjacent video frames acquired by the adjacent visual angle acquisition equipment as acquisition video frames;
the first pixel value acquisition subunit is used for acquiring depth information of a physical view angle under the adjacent view angle acquisition equipment according to the internal parameters and the pose, and acquiring a first pixel value of a characteristic point in an acquired video frame;
the back projection matrix obtaining subunit is used for carrying out space back projection according to the depth information and the first pixel value to obtain a back projection matrix of an internal reference and a back projection matrix of a pose;
The space point obtaining subunit is used for projecting the characteristic points in the acquired video frames according to the back projection matrix of the internal reference and the back projection matrix of the pose to obtain the space points.
Still alternatively, the video reconstruction device may further include:
The distance acquisition module is used for respectively acquiring second pixel values of the matched characteristic points in each adjacent video frame and respectively acquiring the distance between each adjacent view acquisition device and the virtual view acquisition device;
The weight determining module is used for determining the weight of the adjacent visual angle acquisition equipment corresponding to the distance according to the distance;
The projection pixel value determining module is used for determining the projection pixel value of the projection point according to each second pixel value and each weight;
the virtual video frame generation unit may specifically be configured to:
and generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame and the projection pixel values of the projection points corresponding to the feature points.
Optionally, the virtual video frame generation module 420 may include:
a video frame determination unit configured to determine a first video frame and a second video frame adjacent in a first direction from among the adjacent video frames;
The intermediate video frame obtaining unit is used for inserting frames according to the first video frame and the second video frame in the first direction to obtain an intermediate video frame;
the virtual video frame obtaining unit is used for inserting frames according to the intermediate video frames in a second direction perpendicular to the first direction to obtain virtual video frames.
On this basis, optionally, the video reconstruction device may further include:
The vertical direction obtaining module is used for taking the direction parallel to the ground plane as a horizontal direction and taking the direction perpendicular to the ground plane as a vertical direction;
The vertical interval acquisition module is used for acquiring the horizontal interval of any two free view angle acquisition devices in the horizontal direction and the vertical interval of any two free view angle acquisition devices in the vertical direction;
The first direction obtaining module is used for determining the horizontal direction or the vertical direction as a first direction according to the numerical relation between the horizontal interval and the vertical interval.
Optionally, the view angle acquisition devices in each free view angle acquisition device group are respectively arranged in a ring shape; and/or the virtual view angle corresponding to the virtual video frame is positioned in a target range, wherein the target range is a range formed by the physical view angles corresponding to the adjacent video frames.
According to the video reconstruction apparatus provided by the embodiments of the disclosure, the original video acquisition module acquires an original video captured by a light field acquisition device. Because the light field acquisition device includes at least two free view acquisition device groups, the groups are placed at different heights, and each group includes a plurality of free view acquisition devices, the original video acquired by the module carries multi-view information, i.e., light field information, in both the vertical and the horizontal direction. Furthermore, in order to reconstruct dense light field information while deploying only sparse free view acquisition devices, the virtual video frame generation module generates, for adjacent video frames among the synchronous video frames captured on the same frame in the original video, a virtual video frame from the adjacent video frames, where the free view acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position. The light field video reconstruction module then performs light field reconstruction according to the synchronous video frames captured on each frame in the original video and the virtual video frames corresponding to them, to obtain a light field video. By adding free-view capture in the vertical direction on top of free-view capture in the horizontal direction, and by generating virtual video frames from the adjacent video frames to reconstruct dense light field information, the apparatus obtains a light field video that satisfies the 6DoF viewing requirement, allowing a user to watch it with 6DoF based on augmented reality (AR), a head-mounted display device, or the like.
The video reconstruction device provided by the embodiment of the disclosure can execute the video reconstruction method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the above embodiment of the video reconstruction device, the units and modules included are divided only according to functional logic, and the division is not limited thereto as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for ease of distinguishing them from each other and are not intended to limit the protection scope of the present disclosure.
Referring now to fig. 9, a schematic diagram of an electronic device (e.g., a terminal device or server in fig. 9) 500 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 9 is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 9, the electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 508 including, for example, magnetic tape, hard disk, etc.; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While an electronic device 500 having various means is shown in fig. 9, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or from the storage means 508, or from the ROM 502. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 501.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
acquiring an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition device groups, the free view angle acquisition device groups are placed at different heights, and each free view angle acquisition device group comprises a plurality of free view angle acquisition devices;
for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame according to each adjacent video frame, wherein the synchronous video frames are acquired on the same frame, and the free view angle acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position;
and performing light field reconstruction according to the synchronous video frames acquired on each frame in the original video and the virtual video frames corresponding to the synchronous video frames, to obtain a light field video.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software or by means of hardware. In some cases, the name of a unit does not limit the unit itself; for example, the original video acquisition module may also be described as "a module for acquiring an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition device groups, each group being placed at a different height and comprising a plurality of free view angle acquisition devices".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a video reconstruction method [ example one ], the method may include:
acquiring an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition device groups, the free view angle acquisition device groups are placed at different heights, and each free view angle acquisition device group comprises a plurality of free view angle acquisition devices;
for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame according to each adjacent video frame, wherein the synchronous video frames are acquired on the same frame, and the free view angle acquisition devices respectively corresponding to any two adjacent video frames are adjacent in placement position;
and performing light field reconstruction according to the synchronous video frames acquired on each frame in the original video and the virtual video frames corresponding to the synchronous video frames, to obtain a light field video.
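Purely for orientation, the following is a minimal runnable sketch of how these three steps could be chained; every function name and the toy data are illustrative assumptions of this sketch, and the simple frame averaging merely stands in for the actual view synthesis detailed in the later examples.

```python
import numpy as np

def generate_virtual_frame(frame_a, frame_b):
    # Stand-in for view synthesis: a plain average of two adjacent frames.
    mean = (frame_a.astype(np.float32) + frame_b.astype(np.float32)) / 2.0
    return mean.astype(np.uint8)

def reconstruct_light_field(sync_frames, virtual_frames):
    # Stand-in fusion step: the dense view set for one frame instant is
    # the physical views plus the synthesized virtual views.
    return list(sync_frames) + list(virtual_frames)

# Toy original video: 2 frame instants, 4 synchronized ring cameras, 64x64 RGB.
rng = np.random.default_rng(0)
original_video = [
    [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(4)]
    for _ in range(2)
]

light_field_video = []
for sync_frames in original_video:
    # Cameras adjacent in placement position: consecutive views on the ring.
    pairs = zip(sync_frames, sync_frames[1:] + sync_frames[:1])
    virtual_frames = [generate_virtual_frame(a, b) for a, b in pairs]
    light_field_video.append(reconstruct_light_field(sync_frames, virtual_frames))
```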
In accordance with one or more embodiments of the present disclosure, a method of example one is provided [ example two ], generating a virtual video frame from each adjacent video frame may include:
taking the free view angle acquisition devices corresponding to the adjacent video frames as adjacent view angle acquisition devices, and performing feature matching on each adjacent video frame to obtain matched feature points in each adjacent video frame;
acquiring a physical calibration result of each adjacent view angle acquisition device, and projecting the feature points in the adjacent video frames acquired by the adjacent view angle acquisition devices according to the physical calibration result to obtain spatial points;
determining a target point according to the spatial points corresponding to the matched feature points in each adjacent video frame, obtaining a virtual calibration result of the virtual view angle acquisition device corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain projection points;
and generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame.
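By way of illustration only, the snippet below shows one plausible way to obtain such matched feature points between two adjacent video frames with OpenCV; the choice of ORB features and brute-force Hamming matching is an assumption of this sketch, as the disclosure does not fix a particular feature detector or matcher.

```python
import cv2

def match_adjacent_frames(frame_a, frame_b, max_matches=200):
    # Detect ORB keypoints and descriptors in each adjacent video frame.
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return [], []  # no features detected in one of the frames
    # Brute-force Hamming matching with cross-checking for reliability.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    # Return pixel coordinates of the matched feature points in both frames.
    pts_a = [kp_a[m.queryIdx].pt for m in matches[:max_matches]]
    pts_b = [kp_b[m.trainIdx].pt for m in matches[:max_matches]]
    return pts_a, pts_b
```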
According to one or more embodiments of the present disclosure, a method of example two is provided [ example three ], wherein the physical calibration result includes intrinsic parameters and a pose, and projecting the feature points in the adjacent video frames acquired by the adjacent view angle acquisition devices according to the physical calibration result to obtain spatial points may include:
taking the adjacent video frames acquired by the adjacent view angle acquisition devices as acquired video frames;
obtaining depth information of the physical view angle under the adjacent view angle acquisition device according to the intrinsic parameters and the pose, and obtaining a first pixel value of the feature point in the acquired video frame;
performing spatial back projection according to the depth information and the first pixel value to obtain a back projection matrix of the intrinsic parameters and a back projection matrix of the pose;
and projecting the feature points in the acquired video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain the spatial points.
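As a concrete illustration of this back projection, the sketch below lifts a feature point with known depth into a spatial point using an intrinsic matrix K and a world-to-camera pose (R, t); the pinhole camera model and the variable names are assumptions of this sketch rather than definitions taken from the disclosure.

```python
import numpy as np

def back_project(pixel, depth, K, R, t):
    # Back projection of the intrinsics: lift pixel (u, v) to a viewing ray.
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Scale the ray by the depth under the adjacent view angle acquisition
    # device to obtain the point in camera coordinates.
    p_cam = ray * depth
    # Back projection of the pose: map camera coordinates back to world
    # coordinates, assuming x_cam = R @ x_world + t.
    return R.T @ (p_cam - t)

# Toy usage: the principal-point pixel at depth 2.5 maps to (0, 0, 2.5).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
point = back_project((320.0, 240.0), 2.5, K, np.eye(3), np.zeros(3))
```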
According to one or more embodiments of the present disclosure, a method of example two is provided [ example four ], and the above light field reconstruction method may further include:
Respectively acquiring second pixel values of the matched characteristic points in each adjacent video frame, and respectively acquiring the distance between each adjacent view acquisition device and the virtual view acquisition device;
determining, according to each distance, the weight of the adjacent view angle acquisition device corresponding to that distance;
determining a projection pixel value of the projection point according to each second pixel value and each weight;
Generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame, may include:
and generating a virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame and the projection pixel values of the projection points corresponding to the feature points.
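The sketch below shows one way this weighted blending could look, assuming, as a simplification not fixed by the disclosure, that each adjacent view angle acquisition device receives the normalized inverse of its distance to the virtual view angle acquisition device as its weight.

```python
import numpy as np

def blend_projected_pixel(second_pixel_values, distances, eps=1e-6):
    # Closer adjacent devices get larger weights (inverse-distance weighting,
    # an assumption of this sketch); weights are normalized to sum to 1.
    values = np.asarray(second_pixel_values, dtype=np.float64)
    d = np.asarray(distances, dtype=np.float64)
    weights = 1.0 / (d + eps)
    weights /= weights.sum()
    # Projection pixel value: weighted blend of the second pixel values.
    return weights @ values

# Two adjacent views (RGB values); the virtual view is closer to the first.
pixel = blend_projected_pixel([[200, 100, 50], [100, 50, 20]], [0.5, 1.5])
```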
In accordance with one or more embodiments of the present disclosure, a method of example one is provided [ example five ], generating a virtual video frame from each adjacent video frame may include:
determining a first video frame and a second video frame which are adjacent in a first direction in each adjacent video frame;
in the first direction, performing frame interpolation on the first video frame and the second video frame to obtain intermediate video frames;
and in a second direction perpendicular to the first direction, performing frame interpolation on the intermediate video frames to obtain the virtual video frame.
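A compact sketch of this two-pass interpolation follows; the linear cross-fade is a deliberately simple stand-in for a real frame interpolation method (optical-flow based or learned), and the 2x2 grid layout of adjacent video frames is an assumption used only to show the ordering of the two passes.

```python
import numpy as np

def interpolate(frame_a, frame_b, alpha=0.5):
    # Linear cross-fade between two frames; a stand-in for real interpolation.
    blend = (1.0 - alpha) * frame_a.astype(np.float32) \
            + alpha * frame_b.astype(np.float32)
    return blend.astype(np.uint8)

def synthesize_virtual_frame(grid):
    # `grid` holds adjacent video frames as a 2x2 layout indexed by
    # [row along the second direction][column along the first direction].
    # Pass 1: interpolate along the first direction within each row.
    intermediate = [interpolate(row[0], row[1]) for row in grid]
    # Pass 2: interpolate the intermediate frames along the second direction.
    return interpolate(intermediate[0], intermediate[1])
```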
According to one or more embodiments of the present disclosure, a method of example five is provided [ example six ], and the above light field reconstruction method may further include:
Taking the direction parallel to the ground plane as a horizontal direction and the direction perpendicular to the ground plane as a vertical direction;
acquiring the horizontal interval of any two free view angle acquisition devices in the horizontal direction and the vertical interval of any two free view angle acquisition devices in the vertical direction;
And determining the horizontal direction or the vertical direction as a first direction according to the numerical relation between the horizontal interval and the vertical interval.
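A one-function sketch of this selection rule is given below; treating the more densely spaced direction (the smaller interval) as the first interpolation direction is our reading of the numerical relation, which the disclosure leaves open.

```python
def choose_first_direction(horizontal_interval, vertical_interval):
    # Interpolate first along the direction in which the free view angle
    # acquisition devices are spaced more densely, since neighbouring views
    # there are more similar and easier to interpolate between (assumption).
    return "horizontal" if horizontal_interval <= vertical_interval else "vertical"
```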
According to one or more embodiments of the present disclosure, a method of example one is provided [ example seven ], wherein the free view angle acquisition devices in each free view angle acquisition device group are annularly arranged; and/or,
the virtual view angle corresponding to the virtual video frame is located within a target range, the target range being the range formed by the physical view angles corresponding to the adjacent video frames.
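For the annular arrangement, the target-range constraint can be pictured as an azimuth check on the camera ring, as in the sketch below; parameterizing view angles by azimuth in degrees is an assumption of this illustration.

```python
def virtual_view_in_target_range(virtual_az, physical_az_a, physical_az_b):
    # The target range is the arc spanned, going clockwise, from the physical
    # view angle of one adjacent device to that of the other (degrees).
    span = (physical_az_b - physical_az_a) % 360.0
    offset = (virtual_az - physical_az_a) % 360.0
    return offset <= span

virtual_view_in_target_range(15.0, 0.0, 30.0)   # True: inside the arc
virtual_view_in_target_range(45.0, 0.0, 30.0)   # False: outside the arc
```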
According to one or more embodiments of the present disclosure, there is provided a video reconstruction apparatus [ example eight ], the apparatus may include:
the original video acquisition module is used for acquiring an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view angle acquisition device groups, the free view angle acquisition device groups are placed at different heights, and each free view angle acquisition device group comprises a plurality of free view angle acquisition devices;
The virtual video frame generation module is used for generating virtual video frames according to adjacent video frames in synchronous video frames in an original video, wherein the synchronous video frames are acquired on the same frame, and free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in arrangement position;
The light field video reconstruction module is used for carrying out light field reconstruction according to each synchronous video frame acquired on each frame in the original video and the virtual video frame corresponding to each synchronous video frame to obtain the light field video.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by mutually substituting the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (8)

1. A method of video reconstruction, comprising:
acquiring an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view acquisition equipment groups, each free view acquisition equipment group is placed at a different height, and each free view acquisition equipment group comprises a plurality of free view angle acquisition devices;
for adjacent video frames among the synchronous video frames in the original video, generating a virtual video frame according to each adjacent video frame, wherein the synchronous video frames are acquired on the same frame, and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent in placement position;
performing light field reconstruction according to the synchronous video frames collected on each frame in the original video and the virtual video frames corresponding to the synchronous video frames to obtain a light field video;
Wherein the method further comprises: taking the direction parallel to the ground plane as a horizontal direction and the direction perpendicular to the ground plane as a vertical direction, and acquiring the horizontal interval of any two free view angle acquisition devices in the horizontal direction and the vertical interval in the vertical direction;
determining the horizontal direction or the vertical direction as a first direction according to the numerical relation between the horizontal interval and the vertical interval;
wherein the generating a virtual video frame from each of the adjacent video frames includes:
determining a first video frame and a second video frame which are adjacent in the first direction in each adjacent video frame, performing frame interpolation on the first video frame and the second video frame in the first direction to obtain an intermediate video frame, and performing frame interpolation on the intermediate video frames in a second direction perpendicular to the first direction to obtain the virtual video frame.
2. The method of claim 1, wherein said generating a virtual video frame from each of said adjacent video frames further comprises:
taking the free view angle acquisition equipment corresponding to the adjacent video frames as adjacent view angle acquisition equipment, and performing feature matching on each adjacent video frame to obtain matched feature points in each adjacent video frame;
for each adjacent view angle acquisition device, acquiring a physical calibration result of the adjacent view angle acquisition device, and projecting the characteristic points in the adjacent video frames acquired by the adjacent view angle acquisition device according to the physical calibration result to obtain space points;
determining a target point according to the spatial points corresponding to the feature points matched in each adjacent video frame, obtaining a virtual calibration result of virtual view angle acquisition equipment corresponding to the virtual video frame to be generated, and projecting the target point according to the virtual calibration result to obtain a projection point;
and generating the virtual video frame according to the projection points corresponding to the characteristic points in any adjacent video frame.
3. The method according to claim 2, wherein the physical calibration result includes intrinsic parameters and a pose, and the projecting the feature points in the adjacent video frames acquired by the adjacent view angle acquisition device according to the physical calibration result to obtain spatial points includes:
taking the adjacent video frames acquired by the adjacent view angle acquisition device as acquired video frames;
obtaining depth information of the physical view angle under the adjacent view angle acquisition device according to the intrinsic parameters and the pose, and obtaining a first pixel value of the feature point in the acquired video frame;
performing spatial back projection according to the depth information and the first pixel value to obtain a back projection matrix of the intrinsic parameters and a back projection matrix of the pose;
and projecting the feature points in the acquired video frame according to the back projection matrix of the intrinsic parameters and the back projection matrix of the pose to obtain spatial points.
4. The method as recited in claim 2, further comprising:
Respectively acquiring second pixel values of the matched characteristic points in each adjacent video frame, and respectively acquiring the distance between each adjacent view acquisition device and the virtual view acquisition device;
Determining the weight of the adjacent view angle acquisition equipment corresponding to the distance according to the distance;
determining a projection pixel value of the projection point according to each second pixel value and each weight;
The generating the virtual video frame according to the projection points corresponding to the feature points in any one of the adjacent video frames, includes:
And generating the virtual video frame according to the projection points corresponding to the feature points in any adjacent video frame and the projection pixel values of the projection points corresponding to the feature points.
5. The method according to claim 1, wherein:
the free view angle acquisition devices in each free view angle acquisition device group are annularly arranged;
and/or,
the virtual view angle corresponding to the virtual video frame is located within a target range, the target range being the range formed by the physical view angles corresponding to the adjacent video frames.
6. A video reconstruction apparatus, comprising:
an original video acquisition module, configured to acquire an original video captured by light field acquisition equipment, wherein the light field acquisition equipment comprises at least two free view acquisition equipment groups, the free view acquisition equipment groups are placed at different heights, and each free view acquisition equipment group comprises a plurality of free view angle acquisition devices;
The virtual video frame generation module is used for generating virtual video frames according to adjacent video frames in synchronous video frames in the original video, wherein the synchronous video frames are acquired on the same frame, and the free view angle acquisition devices corresponding to any two adjacent video frames are adjacent to each other in the arrangement position;
The light field video reconstruction module is used for carrying out light field reconstruction according to the synchronous video frames acquired on each frame in the original video and the virtual video frames corresponding to the synchronous video frames to obtain a light field video;
wherein, the video reconstruction device further includes:
The vertical direction obtaining module is used for taking a direction parallel to the ground plane as a horizontal direction and taking a direction perpendicular to the ground plane as a vertical direction;
A vertical interval acquisition module, configured to acquire a horizontal interval of any two free view angle acquisition devices in the horizontal direction and a vertical interval in the vertical direction;
the first direction obtaining module is used for determining the horizontal direction or the vertical direction as the first direction according to the numerical relation between the horizontal interval and the vertical interval;
The virtual video frame generation module comprises:
A video frame determining unit configured to determine a first video frame and a second video frame adjacent in the first direction in each of the adjacent video frames;
an intermediate video frame obtaining unit, configured to perform frame interpolation on the first video frame and the second video frame in the first direction to obtain an intermediate video frame;
and a virtual video frame obtaining unit, configured to perform frame interpolation on the intermediate video frames in a second direction perpendicular to the first direction to obtain the virtual video frame.
7. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video reconstruction method of any one of claims 1-5.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the video reconstruction method according to any one of claims 1-5.
CN202210556761.XA 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium Active CN115022613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210556761.XA CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210556761.XA CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115022613A CN115022613A (en) 2022-09-06
CN115022613B (en) 2024-07-26

Family

ID=83069441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210556761.XA Active CN115022613B (en) 2022-05-19 2022-05-19 Video reconstruction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115022613B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667438A (en) * 2019-03-07 2020-09-15 阿里巴巴集团控股有限公司 Video reconstruction method, system, device and computer readable storage medium
CN114143528A (en) * 2020-09-04 2022-03-04 北京大视景科技有限公司 Multi-video stream fusion method, electronic device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002259979A (en) * 2000-12-26 2002-09-13 Monolith Co Ltd Method and device for image interpolation, and method and device for image processing
US10085005B2 (en) * 2015-04-15 2018-09-25 Lytro, Inc. Capturing light-field volume image and video data using tiled light-field cameras
EP4072147A4 (en) * 2019-12-30 2022-12-14 Huawei Technologies Co., Ltd. Video stream processing method, apparatus and device, and medium
CN111640181A (en) * 2020-05-14 2020-09-08 佳都新太科技股份有限公司 Interactive video projection method, device, equipment and storage medium
CN113192185B (en) * 2021-05-18 2022-05-17 清华大学 Dynamic light field reconstruction method, device and equipment


Similar Documents

Publication Publication Date Title
CN108492364B (en) Method and apparatus for generating image generation model
WO2024104248A1 (en) Rendering method and apparatus for virtual panorama, and device and storage medium
CN115002442A (en) Image display method and device, electronic equipment and storage medium
CN111862342B (en) Augmented reality texture processing method and device, electronic equipment and storage medium
CN117608396A (en) Mixed reality-based processing method, device, terminal and storage medium
CN115002345B (en) Image correction method, device, electronic equipment and storage medium
CN115022613B (en) Video reconstruction method and device, electronic equipment and storage medium
CN109816791B (en) Method and apparatus for generating information
CN115937291B (en) Binocular image generation method and device, electronic equipment and storage medium
CN111489428B (en) Image generation method, device, electronic equipment and computer readable storage medium
KR102534449B1 (en) Image processing method, device, electronic device and computer readable storage medium
CN112070903A (en) Virtual object display method and device, electronic equipment and computer storage medium
CN111292245A (en) Image processing method and device
CN112668474B (en) Plane generation method and device, storage medium and electronic equipment
US20240269553A1 (en) Method, apparatus, electronic device and storage medium for extending reality display
CN116363338A (en) Image processing method, device, electronic equipment and storage medium
CN117389502A (en) Spatial data transmission method, device, electronic equipment and storage medium
CN118433467A (en) Video display method, device, electronic equipment and storage medium
CN117289454A (en) Display method and device of virtual reality equipment, electronic equipment and storage medium
CN115457200A (en) Method, device, equipment and storage medium for automatic true stereo display of 2.5-dimensional image
CN117354483A (en) Synchronous verification method and device, electronic equipment and storage medium
CN117768599A (en) Method, device, system, electronic equipment and storage medium for processing image
CN117764883A (en) Image processing method, device, electronic equipment and storage medium
CN118135090A (en) Grid alignment method and device and electronic equipment
CN118411487A (en) Model generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant