CN111988597B - Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium


Info

Publication number
CN111988597B
Authority
CN
China
Prior art keywords
depth map
original
map
background
texture map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010853423.3A
Other languages
Chinese (zh)
Other versions
CN111988597A (en)
Inventor
贝悦
王琦
程志鹏
顾嵩
蔡砚刚
王荣刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
Peking University
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, China Mobile Communications Group Co Ltd, MIGU Video Technology Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202010853423.3A
Publication of CN111988597A
Application granted
Publication of CN111988597B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106: Processing image signals
    • H04N13/122: Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues

Abstract

The invention discloses a virtual viewpoint synthesis method and device, electronic equipment and a readable storage medium, belonging to the technical field of image processing. The scheme comprises the following steps: acquiring an original texture map and an original depth map of a reference viewpoint; reconstructing a background texture map and a background depth map of the reference viewpoint from the original texture map and the original depth map; acquiring difference information between the original texture map and the background texture map; optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map; and performing virtual viewpoint synthesis according to the original texture map and the optimized depth map. The scheme of the application improves the depth map more effectively, thereby effectively improving the temporal stability of the virtual viewpoint video.

Description

Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a virtual viewpoint synthesis method and device, electronic equipment and a readable storage medium.
Background
At present, methods for improving the temporal stability of a virtual viewpoint video generally perform a preprocessing operation on the original depth map, for example median filtering, and then perform virtual viewpoint synthesis based on the preprocessed depth map. However, such simple preprocessing, e.g., median filtering, improves the depth map only to a limited extent, so the temporal stability of the virtual viewpoint video cannot be effectively improved.
Disclosure of Invention
Embodiments of the present invention provide a virtual viewpoint synthesis method, an apparatus, an electronic device, and a readable storage medium, so as to solve the problem that current methods cannot effectively improve the temporal stability of a virtual viewpoint video.
In order to solve the technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a virtual viewpoint synthesis method, including:
acquiring an original texture map and an original depth map of a reference viewpoint;
according to the original texture map and the original depth map, reconstructing to obtain a background texture map and a background depth map of the reference viewpoint;
acquiring difference information between the original texture map and the background texture map;
optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
Optionally, the reconstructing to obtain the background texture map and the background depth map of the reference viewpoint according to the original texture map and the original depth map includes:
and respectively carrying out time-domain median filtering on the original texture map and the original depth map according to the video frame sequence of the reference viewpoint to obtain the background texture map and the background depth map.
Optionally, the obtaining difference information between the original texture map and the background texture map includes:
calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map;
the optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map includes:
and optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
Optionally, the optimizing the original depth map by using the background depth map according to the difference to obtain an optimized depth map includes:
obtaining a binary image according to the comparison between the difference value and a preset threshold value; when the difference value is larger than or equal to the preset threshold value, the value of a corresponding pixel point in the binary image is a first value; or when the difference value is smaller than the preset threshold value, taking the value of the corresponding pixel point in the binary image as a second value;
reserving an original depth value for a pixel point corresponding to a first pixel point in the original depth map, and covering the pixel point corresponding to a second pixel point in the original depth map with the depth value by using the background depth map to obtain the optimized depth map; the first pixel points are pixel points with a first value in the binary image, and the second pixel points are pixel points with a second value in the binary image.
Optionally, the performing virtual viewpoint synthesis according to the original texture map and the optimized depth map includes:
and performing virtual viewpoint synthesis of DIBR by using the original texture map and the optimized depth map.
In a second aspect, an embodiment of the present invention provides a virtual viewpoint synthesis apparatus, including:
the first acquisition module is used for acquiring an original texture map and an original depth map of a reference viewpoint;
the reconstruction module is used for reconstructing to obtain a background texture map and a background depth map of the reference viewpoint according to the original texture map and the original depth map;
the second obtaining module is used for obtaining difference information between the original texture map and the background texture map;
the optimization module is used for optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and the synthesis module is used for carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
Optionally, the reconstruction module is specifically configured to:
and respectively carrying out time-domain median filtering on the original texture map and the original depth map according to the video frame sequence of the reference viewpoint to obtain the background texture map and the background depth map.
Optionally, the second obtaining module is specifically configured to: calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map;
the optimization module is specifically configured to: and optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
Optionally, the optimization module includes:
the obtaining unit is used for obtaining a binary image according to the comparison between the difference value and a preset threshold value; when the difference value is larger than or equal to the preset threshold value, the value of a corresponding pixel point in the binary image is a first value; or when the difference value is smaller than the preset threshold value, taking the value of the corresponding pixel point in the binary image as a second value;
the processing unit is used for reserving an original depth value for a pixel point corresponding to a first pixel point in the original depth map, and covering the depth value for the pixel point corresponding to a second pixel point in the original depth map by using the background depth map to obtain the optimized depth map; the first pixel points are pixel points with a first value in the binary image, and the second pixel points are pixel points with a second value in the binary image.
Optionally, the synthesis module is specifically configured to: perform DIBR virtual viewpoint synthesis by using the original texture map and the optimized depth map.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In the embodiment of the present invention, after the original texture map and the original depth map of the reference viewpoint are obtained, the background texture map and the background depth map of the reference viewpoint may be reconstructed from the original texture map and the original depth map, difference information between the original texture map and the background texture map may be obtained, the original depth map may be optimized by using the background depth map according to the difference information to obtain an optimized depth map, and virtual viewpoint synthesis may be performed according to the original texture map and the optimized depth map. Therefore, by optimizing the depth map with the help of the difference information between the original texture map and the background texture map, the depth map is improved more effectively, and the temporal stability of the virtual viewpoint video is effectively improved.
Furthermore, after the optimized depth map is obtained, subsequent virtual viewpoint synthesis only needs to use the optimized depth map; the depth map does not have to be optimized during the virtual viewpoint synthesis itself. This guarantees real-time performance and adds neither space occupation on the electronic device nor computation to the subsequent virtual viewpoint synthesis process.
Drawings
Fig. 1 is a flowchart of a virtual viewpoint synthesis method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a virtual viewpoint synthesis process in an embodiment of the present invention;
FIGS. 3A, 3B and 3C are the first set of texture map schematics in an embodiment of the present invention;
FIGS. 4A, 4B, 4C, 4D, 4E and 4F are the second set of texture map schematics in an embodiment of the present invention;
FIGS. 5A, 5B, 5C, 5D, 5E and 5F are the third set of texture map schematics in an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a virtual viewpoint synthesis apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The virtual viewpoint synthesis method provided by the embodiment of the present invention is described in detail below with reference to the accompanying drawings by specific embodiments and application scenarios thereof.
Referring to fig. 1, fig. 1 is a flowchart of a virtual viewpoint synthesis method according to an embodiment of the present invention, where the method is applied to an electronic device, and as shown in fig. 1, the method includes the following steps:
step 101: and acquiring an original texture map and an original depth map of the reference viewpoint.
In this embodiment, the reference viewpoints may be the cameras adjacent to the virtual viewpoint on the left and right.
Step 102: and reconstructing to obtain a background texture map and a background depth map of a reference viewpoint according to the original texture map and the original depth map.
Optionally, in step 102, the time domain information of the reference viewpoint video may be utilized to reconstruct and obtain a background texture map and a background depth map of the reference viewpoint. Methods of background reconstruction include, but are not limited to, temporal median filtering, and the like. For the median filtering method, the pixel value of each pixel point is set as the median of the pixel values of all the pixel points in a certain neighborhood window of the point.
Step 103: and acquiring difference information between the original texture map and the background texture map.
Optionally, in step 103, the difference information between the original texture map and the background texture map may be obtained by using an algorithm such as background subtraction, foreground and background segmentation. The difference information includes, but is not limited to, a difference value of pixel values of corresponding pixel points, and the like.
Step 104: and optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map.
It is noted that the purpose of this optimization is to preserve the depth values of target objects in the image, such as people, and to reconstruct the depth values of the image background, thereby making the textures of the static foreground and background regions more stable and thus improving temporal stability. Various optimization schemes are possible, and the embodiment of the present invention does not limit this.
Step 105: and carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
Optionally, after the optimized depth map is obtained, Depth-Image-Based Rendering (DIBR) virtual viewpoint synthesis may be performed by using the original texture map and the optimized depth map, so as to complete synthesis of the virtual viewpoint video. DIBR is an important method for virtual viewpoint synthesis: given only the texture map and the corresponding depth map of a reference viewpoint (camera), the view of a viewpoint where no camera is placed can be obtained through three-dimensional coordinate transformation.
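To make the three-dimensional coordinate transformation concrete, the following is a minimal Python/NumPy sketch of a DIBR forward warp. It is an illustration, not the patented implementation: the camera parameters K_ref, K_virt, R and t, the assumption of metric depth values, and the omission of occlusion handling (z-buffering) and hole filling are all simplifications introduced here.

```python
import numpy as np

def dibr_warp(texture, depth, K_ref, K_virt, R, t):
    """Illustrative DIBR forward warp: reference view -> virtual view.

    Assumptions (not from the patent): depth holds metric depth per pixel,
    K_ref/K_virt are 3x3 intrinsics, and (R, t) maps reference-camera
    coordinates into the virtual camera. Occlusions and holes are ignored.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])   # 3 x N homogeneous pixels
    rays = np.linalg.inv(K_ref) @ pix                        # back-projected rays
    pts = rays * depth.ravel()                               # 3D points, reference frame
    proj = K_virt @ (R @ pts + t.reshape(3, 1))              # into the virtual camera
    uu = np.round(proj[0] / proj[2]).astype(int)
    vv = np.round(proj[1] / proj[2]).astype(int)
    ok = (proj[2] > 0) & (uu >= 0) & (uu < w) & (vv >= 0) & (vv < h)
    out = np.zeros_like(texture)
    out[vv[ok], uu[ok]] = texture[v.ravel()[ok], u.ravel()[ok]]
    return out
```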
According to the virtual viewpoint synthesis method provided by the embodiment of the invention, after the original texture map and the original depth map of the reference viewpoint are obtained, the background texture map and the background depth map of the reference viewpoint can be reconstructed from the original texture map and the original depth map, the difference information between the original texture map and the background texture map is obtained, the original depth map is optimized by using the background depth map according to the difference information to obtain the optimized depth map, and virtual viewpoint synthesis is performed according to the original texture map and the optimized depth map. Therefore, by optimizing the depth map with the help of the difference information between the original texture map and the background texture map, the depth map is improved more effectively, and the temporal stability of the virtual viewpoint video is effectively improved.
Furthermore, after the optimized depth map is obtained, subsequent virtual viewpoint synthesis only needs to use the optimized depth map; the depth map does not have to be optimized during the virtual viewpoint synthesis itself. This ensures real-time performance and adds neither space occupation on the electronic device nor computation to the subsequent virtual viewpoint synthesis process.
In this embodiment of the present invention, optionally, the process of reconstructing the background texture map and the background depth map may include: performing temporal median filtering on the original texture map and the original depth map, respectively, according to the video frame sequence of the reference viewpoint, to obtain the background texture map and the background depth map. Understandably, because inaccurate depth at a static pixel point occurs only in a few frames, temporal median filtering can filter out these inaccurate points and obtain more stable depth values, so the background depth map can optimize the original depth map to a certain extent, achieving the purpose of improving temporal stability. Further, this optimization of the depth map with temporal information occurs in the preprocessing stage, so the subsequent virtual viewpoint synthesis process is unchanged and incurs no additional time or space consumption.
For example, for a video frame sequence of reference viewpoints, median filtering may be performed on a texture map and a corresponding depth map in the video frame sequence, respectively, by using a filtering method expressed by the following formula, to obtain a texture map and a depth map of a background:
P(x_t) = med({ I_{x,i} | i ∈ [t_1, t_2] })

where P(x_t) represents the background-reconstructed pixel value of pixel point x, I_{x,i} represents the pixel value of pixel point x at time i within [t_1, t_2], and med represents taking the median over the set of pixel values.
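As a concrete illustration of this formula, the following Python/NumPy sketch computes the per-pixel temporal median over a stack of frames. The function name and the (T, H, W) frame layout are assumptions made for illustration, not part of the patent.

```python
import numpy as np

def reconstruct_background(frames: np.ndarray) -> np.ndarray:
    """Temporal median filtering: P(x_t) = med({I_{x,i} | i in [t1, t2]}).

    frames: stacked video frames of shape (T, H, W) or (T, H, W, C),
    covering the window [t1, t2]. Returns the per-pixel median,
    i.e. the reconstructed background image.
    """
    return np.median(frames, axis=0).astype(frames.dtype)

# Applied separately to the texture and depth sequences of the reference
# viewpoint, this yields the background texture map and background depth map.
```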
Optionally, the process of obtaining difference information between the original texture map and the background texture map may include: and calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map. And then, optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
Further, the process of optimizing the original depth map by using the background depth map according to the difference to obtain the optimized depth map may include: firstly, acquiring a binary image according to the comparison between the difference value and a preset threshold value; when the difference value is larger than or equal to the preset threshold value, the value of a corresponding pixel point in the binary image is a first value; or when the difference value is smaller than the preset threshold value, taking the value of the corresponding pixel point in the binary image as a second value; then, reserving an original depth value for a pixel point corresponding to a first pixel point in the original depth map, and covering the depth value for a pixel point corresponding to a second pixel point in the original depth map by using the background depth map to obtain the optimized depth map; the first pixel points are pixel points with a first value in the binary image, and the second pixel points are pixel points with a second value in the binary image.
For example, the following formula can be adopted to obtain the binary map mask:
mask(i, j) = n, if |B(i, j) - Ori(i, j)| ≥ Cth
mask(i, j) = m, if |B(i, j) - Ori(i, j)| < Cth

where mask(i, j) represents the value of pixel point (i, j) in the binarization map, B(i, j) represents the pixel value of pixel point (i, j) in the background texture map, Ori(i, j) represents the pixel value of pixel point (i, j) in the original texture map, Cth represents the preset threshold, n represents the first value, and m represents the second value. The first value is different from the second value.
In one embodiment, n is equal to 255 and m is equal to 0. Then, the original depth value of the pixel point corresponding to the pixel point part with the value of 255 in the original depth map can be reserved, and meanwhile, the depth value of the corresponding pixel point in the background depth map is used for replacing the original depth value of the pixel point corresponding to the pixel point part with the value of 0 in the original depth map, so that the original depth map and the background depth map are fused, and the optimized depth map is generated.
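The binarization and fusion described above can be sketched as follows. This is an illustrative reading of the embodiment, with n = 255 and m = 0 as stated; the absolute-difference comparison and the example threshold cth = 30 are assumptions introduced here.

```python
import numpy as np

def optimize_depth(orig_texture, bg_texture, orig_depth, bg_depth, cth=30):
    """Fuse the original and background depth maps via a binarization mask.

    cth is an illustrative preset threshold; an absolute per-pixel texture
    difference is assumed. n = 255 marks moving foreground, m = 0 background.
    """
    diff = np.abs(bg_texture.astype(np.int32) - orig_texture.astype(np.int32))
    if diff.ndim == 3:                 # collapse color channels, if present
        diff = diff.max(axis=2)
    mask = np.where(diff >= cth, 255, 0).astype(np.uint8)
    # Keep the original depth where mask == 255; cover with the background
    # depth where mask == 0, yielding the optimized depth map.
    return np.where(mask == 255, orig_depth, bg_depth)
```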
For example, referring to fig. 2, in an embodiment of the present invention, the virtual viewpoint synthesis process may include:
S1: after obtaining the original texture maps and original depth maps of the left and right reference viewpoints, reconstruct the background texture maps and background depth maps of the left and right reference viewpoints by temporal median filtering according to the video frame sequences of the left and right reference viewpoints;
S2: using a background subtraction method, subtract the pixel values of the original texture map and the background texture map for each of the left and right reference viewpoints, and compare the results with a preset threshold to obtain a binarization mask. As shown in FIG. 2, the moving person, their shadow, and the spectators behind them are marked white in the mask, while the other background regions are marked black.
S3: optimize the original depth map by using the binarization mask: for pixel points in the original depth map corresponding to the first-value pixel points in the mask (the white area), retain the original depth values; for pixel points in the original depth map corresponding to the second-value pixel points in the mask (the black area), replace the original depth values with the depth values of the corresponding pixel points in the background depth map. The original depth map and the background depth map are thus fused to generate the optimized depth map. Owing to the characteristics of temporal median filtering, the foreground and background regions that remain static obtain stable and reasonable depth values in the background depth map, and covering the original depth map with these values effectively improves the temporal stability and quality of the depth map.
S4: and performing virtual viewpoint synthesis of DIBR by using the optimized depth map and the original texture map.
Based on the embodiment shown in fig. 2, owing to the characteristics of temporal median filtering, most depth-stable pixel points in the background and static foreground regions are preserved in the background depth map after filtering. Because a good background texture map is reconstructed, background subtraction can preserve the depth values of moving objects such as people while the depths of the other regions are optimized, so the textures of the static foreground and background regions become more stable and temporal stability is further improved.
The present application will be described in detail with reference to specific examples.
In a specific example of the invention, the cameras adjacent to the virtual viewpoint on the left and right are used as reference viewpoints, and two data sets captured in a real match scene, data set 1 and data set 2, are selected; frames 70-100 of each data set are synthesized. The specific results are as follows:
1) Referring to fig. 3A, 3B and 3C, fig. 3A is a virtual viewpoint synthesized by using the original depth map, fig. 3B is a virtual viewpoint synthesized by using the depth map optimized according to the method of the present application, and fig. 3C is a comparison of the surroundings of the basketball stand in fig. 3A and 3B. As can be seen from fig. 3A, 3B and 3C, the original depth map produces some erroneous textures around the basketball stand, especially around the timer, whereas the virtual viewpoint synthesized by using the optimized depth map eliminates these erroneous textures, thereby improving temporal stability.
2) Referring to fig. 4A, 4B, 4C, 4D, 4E and 4F, fig. 4A, 4B and 4C are respectively virtual viewpoints synthesized by using the video frames 76, 88 and 100 in the data set 1 and the original depth map, and fig. 4D, 4E and 4F are respectively virtual viewpoints synthesized by using the video frames 76, 88 and 100 in the data set 1 and the optimized depth map. Referring again to fig. 5A, 5B, 5C, 5D, 5E and 5F, wherein fig. 5A, 5B and 5C are respectively virtual viewpoints synthesized using the video frames 76, 88, 100 in the data set 2 and the original depth map, and fig. 5D, 5E and 5F are respectively virtual viewpoints synthesized using the video frames 76, 88, 100 in the data set 2 and the optimized depth map.
As can be seen from fig. 4A to 5F, when the virtual viewpoint is synthesized by using the original depth map, static textures (e.g., around the timer) exhibit a jitter phenomenon, whereas when it is synthesized by using the optimized depth map, those textures remain stable.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a virtual viewpoint synthesis apparatus according to an embodiment of the present invention, which is applied to an electronic device, and as shown in fig. 6, the virtual viewpoint synthesis apparatus 60 includes:
a first obtaining module 61, configured to obtain an original texture map and an original depth map of a reference viewpoint;
a reconstruction module 62, configured to reconstruct a background texture map and a background depth map of the reference viewpoint according to the original texture map and the original depth map;
a second obtaining module 63, configured to obtain difference information between the original texture map and the background texture map;
an optimizing module 64, configured to optimize the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and a synthesizing module 65, configured to perform virtual viewpoint synthesis according to the original texture map and the optimized depth map.
Optionally, the reconstruction module 62 is specifically configured to:
and respectively carrying out time-domain median filtering on the original texture map and the original depth map according to the video frame sequence of the reference viewpoint to obtain the background texture map and the background depth map.
Optionally, the second obtaining module 63 is specifically configured to: calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map;
the optimization module 64 is specifically configured to: and optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
Optionally, the optimizing module 64 includes:
the obtaining unit is used for obtaining a binary image according to the comparison between the difference value and a preset threshold value; when the difference value is larger than or equal to the preset threshold value, the value of a corresponding pixel point in the binary image is a first value; or when the difference value is smaller than the preset threshold value, taking the value of the corresponding pixel point in the binary image as a second value;
the processing unit is used for reserving an original depth value for a pixel point corresponding to a first pixel point in the original depth map, and covering the depth value for the pixel point corresponding to a second pixel point in the original depth map by using the background depth map to obtain the optimized depth map; the first pixel points are pixel points with a first value in the binary image, and the second pixel points are pixel points with a second value in the binary image.
Optionally, the synthesis module 65 is specifically configured to: perform DIBR virtual viewpoint synthesis by using the original texture map and the optimized depth map.
It can be understood that the virtual viewpoint synthesis apparatus 60 according to the embodiment of the present invention can implement the processes of the method embodiment shown in fig. 1, and can achieve the same technical effects, and for avoiding repetition, the details are not repeated here.
In addition, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, can implement each process of the method embodiment shown in fig. 1 and achieve the same technical effect, and is not described herein again to avoid repetition.
Referring to fig. 7, an electronic device 70 according to an embodiment of the present invention includes a bus 71, a transceiver 72, an antenna 73, a bus interface 74, a processor 75, and a memory 76.
In the embodiment of the present invention, the electronic device 70 further includes: a computer program stored on the memory 76 and executable on the processor 75. Optionally, the computer program may be adapted to implement the following steps when executed by the processor 75:
acquiring an original texture map and an original depth map of a reference viewpoint;
according to the original texture map and the original depth map, reconstructing to obtain a background texture map and a background depth map of the reference viewpoint;
acquiring difference information between the original texture map and the background texture map;
optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
It is understood that the computer program can implement the processes of the embodiment of the method shown in fig. 1 when executed by the processor 75, and achieve the same technical effects, and therefore, the detailed description is omitted here to avoid repetition.
In fig. 7, a bus architecture is represented by bus 71. Bus 71 may include any number of interconnected buses and bridges that link together various circuits, including one or more processors represented by processor 75 and memory represented by memory 76. Bus 71 may also link various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore not described further herein. Bus interface 74 provides an interface between bus 71 and transceiver 72. Transceiver 72 may be one element or a plurality of elements, such as multiple receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by processor 75 is transmitted over a wireless medium via antenna 73; furthermore, antenna 73 receives data and passes it to processor 75.
The processor 75 is responsible for managing the bus 71 and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory 76 may be used to store data used by the processor 75 in performing operations.
The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement each process of the method embodiment shown in fig. 1 and achieve the same technical effect, and is not described herein again to avoid repetition.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly also by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solution of the present invention, or the portion contributing to the prior art, may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disk) and includes instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (10)

1. A virtual viewpoint synthesis method, comprising:
acquiring an original texture map and an original depth map of a reference viewpoint;
according to the original texture map and the original depth map, reconstructing to obtain a background texture map and a background depth map of the reference viewpoint;
acquiring difference information between the original texture map and the background texture map;
optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
2. The method of claim 1, wherein reconstructing the background texture map and the background depth map of the reference viewpoint from the original texture map and the original depth map comprises:
and respectively carrying out time-domain median filtering on the original texture map and the original depth map according to the video frame sequence of the reference viewpoint to obtain the background texture map and the background depth map.
3. The method according to claim 1, wherein the obtaining difference information between the original texture map and the background texture map comprises:
calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map;
the optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map includes:
and optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
4. The method of claim 3, wherein the optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map comprises:
obtaining a binary image according to the comparison between the difference value and a preset threshold value; when the difference value is larger than or equal to the preset threshold value, the value of a corresponding pixel point in the binary image is a first value; or when the difference value is smaller than the preset threshold value, taking the value of the corresponding pixel point in the binary image as a second value;
reserving an original depth value for a pixel point corresponding to a first pixel point in the original depth map, and covering the pixel point corresponding to a second pixel point in the original depth map with the depth value by using the background depth map to obtain the optimized depth map; the first pixel points are pixel points with a first value in the binary image, and the second pixel points are pixel points with a second value in the binary image.
5. The method of claim 1, wherein the performing virtual viewpoint synthesis according to the original texture map and the optimized depth map comprises:
and performing virtual viewpoint synthesis of image rendering DIBR based on the depth map by using the original texture map and the optimized depth map.
6. A virtual viewpoint synthesis apparatus, comprising:
the first acquisition module is used for acquiring an original texture map and an original depth map of a reference viewpoint;
the reconstruction module is used for reconstructing to obtain a background texture map and a background depth map of the reference viewpoint according to the original texture map and the original depth map;
the second obtaining module is used for obtaining difference information between the original texture map and the background texture map;
the optimization module is used for optimizing the original depth map by using the background depth map according to the difference information to obtain an optimized depth map;
and the synthesis module is used for carrying out virtual viewpoint synthesis according to the original texture map and the optimized depth map.
7. The apparatus of claim 6, wherein the reconstruction module is specifically configured to:
and respectively carrying out time-domain median filtering on the original texture map and the original depth map according to the video frame sequence of the reference viewpoint to obtain the background texture map and the background depth map.
8. The apparatus of claim 6,
the second obtaining module is specifically configured to: calculating the difference value of the pixel values of each pixel point in the background texture map and the corresponding pixel point in the original texture map;
the optimization module is specifically configured to: and optimizing the original depth map by using the background depth map according to the difference value to obtain an optimized depth map.
9. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions when executed by the processor implementing the steps of the virtual viewpoint synthesis method according to any one of claims 1 to 5.
10. A computer-readable storage medium, on which a program or instructions are stored, which when executed by a processor, implement the steps of the virtual viewpoint synthesis method according to any one of claims 1 to 5.
CN202010853423.3A 2020-08-23 2020-08-23 Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium Active CN111988597B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010853423.3A CN111988597B (en) 2020-08-23 2020-08-23 Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010853423.3A CN111988597B (en) 2020-08-23 2020-08-23 Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN111988597A CN111988597A (en) 2020-11-24
CN111988597B 2022-06-14

Family

ID=73442986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010853423.3A Active CN111988597B (en) 2020-08-23 2020-08-23 Virtual viewpoint synthesis method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111988597B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007052191A2 (en) * 2005-11-02 2007-05-10 Koninklijke Philips Electronics N.V. Filling in depth results
CN101771893A (en) * 2010-01-05 2010-07-07 浙江大学 Video frequency sequence background modeling based virtual viewpoint rendering method
CN102055982A (en) * 2011-01-13 2011-05-11 浙江大学 Coding and decoding methods and devices for three-dimensional video
CN109819229A (en) * 2019-01-22 2019-05-28 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110660131A (en) * 2019-09-24 2020-01-07 宁波大学 Virtual viewpoint hole filling method based on depth background modeling

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5858380B2 (en) * 2010-12-03 2016-02-10 国立大学法人名古屋大学 Virtual viewpoint image composition method and virtual viewpoint image composition system
KR20120129313A (en) * 2011-05-19 2012-11-28 한국전자통신연구원 System and method for transmitting three-dimensional image information using difference information
JP5983935B2 (en) * 2011-11-30 2016-09-06 パナソニックIpマネジメント株式会社 New viewpoint image generation apparatus and new viewpoint image generation method
CN103108187B (en) * 2013-02-25 2016-09-28 清华大学 The coded method of a kind of 3 D video, coding/decoding method, encoder
JP6308748B2 (en) * 2013-10-29 2018-04-11 キヤノン株式会社 Image processing apparatus, imaging apparatus, and image processing method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007052191A2 (en) * 2005-11-02 2007-05-10 Koninklijke Philips Electronics N.V. Filling in depth results
CN101771893A (en) * 2010-01-05 2010-07-07 浙江大学 Video frequency sequence background modeling based virtual viewpoint rendering method
CN102055982A (en) * 2011-01-13 2011-05-11 浙江大学 Coding and decoding methods and devices for three-dimensional video
CN109819229A (en) * 2019-01-22 2019-05-28 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110660131A (en) * 2019-09-24 2020-01-07 宁波大学 Virtual viewpoint hole filling method based on depth background modeling

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ilkoo Ahn et al., "A Novel Depth-Based Virtual View Synthesis Method for Free Viewpoint Video", IEEE Transactions on Broadcasting, vol. 59, no. 4, Dec. 31, 2013, full text *
Bruno Macchiavello et al., "Loss-Resilient Coding of Texture and Depth for Free-Viewpoint Video Conferencing", IEEE Transactions on Multimedia, vol. 16, no. 3, Apr. 2014, full text *

Also Published As

Publication number Publication date
CN111988597A (en) 2020-11-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant