CN116723305B - Virtual viewpoint quality enhancement method based on a generative adversarial network - Google Patents
Virtual viewpoint quality enhancement method based on a generative adversarial network
- Publication number
- CN116723305B (application CN202310445621.XA)
- Authority
- CN
- China
- Prior art keywords
- network
- quality
- virtual viewpoint
- virtual
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a virtual viewpoint quality enhancement method based on a generative adversarial network, belonging to the technical field of virtual viewpoint synthesis for three-dimensional video. It solves the technical problem of virtual viewpoint distortion caused by three-dimensional video encoding/decoding and viewpoint synthesis. The technical scheme is as follows: first, the correlation between network and video characteristics and virtual viewpoint quality is analyzed; then, a virtual viewpoint quality enhancement flow framework is designed; finally, for the quality enhancement model part of the framework, a virtual viewpoint quality enhancement model based on a generative adversarial network is constructed. The beneficial effects of the invention are as follows: by designing a quality enhancement network model oriented to the virtual viewpoint, the quality of the synthesized virtual viewpoint is improved and better subjective visual quality is obtained.
Description
Technical Field
The invention relates to the technical field of virtual viewpoint synthesis for three-dimensional video, in particular to a virtual viewpoint quality enhancement method based on a generative adversarial network for virtual viewpoints at the three-dimensional video decoding end.
Background
In recent years, with the vigorous development of multimedia information technology and the continued expansion of the video field, television technology has been updated continuously. On the one hand, televisions have developed from standard definition to high definition and even full high definition, with a growing number of supported pixels. On the other hand, televisions have developed from two-dimensional planar display to three-dimensional stereoscopic and even free-viewpoint television, with a growing number of supported viewpoints. From standard definition to high definition, and from planar to stereoscopic, video technology has undergone several innovations and is moving toward the ultra-high-definition era. To meet these developments, the Three-Dimensional High Efficiency Video Coding (3D-HEVC) standard has emerged.
In the 3D-HEVC video coding standard, the texture map and the depth map are encoded jointly in sequence; at the decoder, virtual view synthesis is realized with the Depth Image Based Rendering (DIBR) technique. In measuring coding distortion, the conventional rate-distortion optimization method computes coding distortion as the sum of absolute errors or the sum of squared differences between the current coding block in the current frame and the reference block in the reference frame. However, since the depth map is not viewed directly by human eyes but is used only for synthesizing virtual views, the coding quality of the depth map is closely related to the degree of distortion of the virtual view. To ensure that the synthesized virtual view exhibits no obvious distortion, virtual view distortion must be taken into account when measuring the coding distortion of the depth map.
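The DIBR warping step described above can be sketched for a rectified horizontal camera setup. This is a simplified illustration, not the 3D-HEVC renderer: the 8-bit inverse-depth decoding and the disparity formula are standard textbook forms, and all function and parameter names are hypothetical.

```python
import numpy as np

def dibr_warp_row(texture_row, depth_row, focal, baseline, z_near, z_far):
    """Warp one image row from the reference view to a virtual view.

    depth_row holds 8-bit inverse-depth codes (0 = far, 255 = near),
    a convention common in 3D-HEVC test material; the rectified-camera
    disparity model below is illustrative, not the patent's renderer.
    """
    width = texture_row.shape[0]
    warped = np.zeros_like(texture_row)
    filled = np.zeros(width, dtype=bool)
    for x in range(width):
        # Decode the 8-bit code into metric depth Z.
        z = 1.0 / (depth_row[x] / 255.0 * (1.0 / z_near - 1.0 / z_far)
                   + 1.0 / z_far)
        disparity = int(round(focal * baseline / z))
        xv = x - disparity  # target column in the virtual view
        if 0 <= xv < width:
            warped[xv] = texture_row[x]
            filled[xv] = True
    # Columns with filled == False are disocclusion holes: background
    # that was hidden in the reference view.
    return warped, filled
```

The columns left unfilled correspond exactly to the de-occluded background regions discussed below as one cause of virtual viewpoint distortion.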
For virtual view distortion calculation, the current 3D-HEVC video coding standard employs the Synthesized View Distortion Change (SVDC) model. The SVDC model alleviates, to a certain extent, the problem that depth map distortion cannot be accurately mapped to virtual viewpoint distortion because of occlusion and disocclusion phenomena. In real application scenarios, however, distortion is always inevitably introduced. Since the DIBR technique uses depth map information to map pixels in the original view to pixels in the virtual view, this process depends on both the original texture map and the original depth map. Therefore, texture map distortion, depth map distortion, and distortion generated during virtual view synthesis can all cause distortion in the synthesized virtual view. Depending on the starting point of the optimization, quality enhancement of three-dimensional video virtual viewpoints can generally be achieved in four ways: texture map quality enhancement, depth map quality enhancement, view synthesis optimization, and video post-processing.
Existing learning-based methods effectively enhance the quality of images or video after compression coding. However, such methods are mainly used for artifact removal after image or H.265/HEVC video compression coding and cannot be directly used for enhancing the quality of virtual views in 3D-HEVC.
In terms of virtual viewpoint quality enhancement, Zhu et al. treat the quality enhancement of a virtual viewpoint as an image reconstruction task that considers both geometric distortion and compression distortion, and propose a CNN-based virtual viewpoint quality enhancement method. Pan et al. propose a method based on a dual-stream attention network, which learns global context information and extracts local texture information, removing distortion from the synthesized virtual viewpoint more comprehensively than the method of Zhu et al. However, research in this direction is still limited, and how to further mine the deep features of the synthesized virtual viewpoint is a key problem for improving the subjective visual quality experience of three-dimensional video.
Disclosure of Invention
The invention aims to provide a virtual viewpoint quality enhancement method based on a generative adversarial network, which solves the problem of virtual viewpoint distortion caused by three-dimensional video encoding/decoding and viewpoint synthesis, effectively improves the quality of the synthesized virtual viewpoint, raises PSNR by 1.127 dB on average, and achieves good subjective visual quality.
In order to achieve the above aim, the invention adopts the following technical scheme. A virtual viewpoint quality enhancement method based on a generative adversarial network comprises the following steps:
1.1, analyzing the correlation between network and video characteristics and virtual viewpoint quality;
1.2, designing a virtual viewpoint quality enhancement flow framework;
1.3, constructing a virtual viewpoint quality enhancement model based on a generative adversarial network.
As a further optimization of the virtual viewpoint quality enhancement method based on a generative adversarial network, step 1.1 specifically comprises the following steps:
2.1, analyzing the causes of virtual viewpoint distortion, mainly including compression coding, disocclusion of background regions, non-overlapping viewpoint regions and inaccurate depth information;
2.2, establishing a virtual viewpoint distortion evaluation criterion C, defined as C = f(p_video(d_texture, d_depth, d_synthesis), p_network(v, b, d)), where p_video(·) represents the video characteristic parameters, d_texture the texture map coding distortion, d_depth the depth map coding distortion, and d_synthesis the distortion of the virtual viewpoint synthesis process; p_network(·) represents the network characteristic parameters, v the network data transmission rate, b the network channel bandwidth, and d the total network data delay.
As a further optimization of the virtual viewpoint quality enhancement method based on a generative adversarial network, step 1.2 specifically comprises the following steps:
3.1, obtaining a low-quality virtual viewpoint to be enhanced by encoding and decoding an original texture map and a depth map and performing viewpoint synthesis;
3.2, preprocessing the low-quality virtual view and the high-quality virtual view data set obtained in the step 3.1 based on a virtual view distortion evaluation criterion C;
3.3, constructing a generating network, judging the network, defining a loss function, and constructing a virtual viewpoint quality enhancement model;
3.4, reconstructing the low-quality virtual viewpoint into a high-quality virtual viewpoint by using the trained virtual viewpoint quality enhancement model.
As a further optimization of the virtual viewpoint quality enhancement method based on a generative adversarial network, step 1.3 specifically comprises the following steps:
4.1, combining 1 generation network module, 1 discrimination network module and 1 loss feedback module into a virtual viewpoint quality enhancement model;
4.2, combining 1 convolutional layer of 64-channel 3×3 convolution kernels and 16 residual units into the generation network module;
4.3, combining 6 groups of convolutional layers, each comprising a 3×3 convolution kernel, a BN layer and a Leaky ReLU activation function, into the discrimination network module;
4.4, composing the mean square error L_MSE measuring pixel-level loss, the loss L_PSNR measuring the objective difference in image quality, the perceptual loss L_P measuring image style (color, texture, contrast, etc.), and the adversarial loss L_A measuring the cross entropy between the discrimination result and the real image into the loss function L_G of the generation network, with the formula L_G = λ_MSE·L_MSE + λ_PSNR·L_PSNR + λ_P·L_P + λ_A·L_A, where λ_MSE, λ_PSNR, λ_P and λ_A are parameters set during training of the generation network model; L_MSE = (1/(W·H·C))·Σ(I_o - I_y)², where W, H and C are respectively the width, height and number of channels of the image; L_PSNR is computed from the mean square error loss L_MSE; L_P = (1/(W_m,n·H_m,n))·Σ(φ_m,n(I_o) - φ_m,n(I_y))², where φ_m,n denotes the feature distribution of the n-th convolutional layer before the m-th pooling layer in the VGG19 network, φ_m,n(I_o) is the feature distribution of the original high-quality virtual viewpoint image, φ_m,n(I_y) is the feature distribution of the high-quality virtual viewpoint image generated by the generation network, and W_m,n and H_m,n are the sizes of the features;
4.5, obtaining the loss function of the discrimination network by calculating the probability that the original high-quality virtual viewpoint is judged true and the generated ("spurious") low-quality virtual viewpoint is judged false, with the formula L_D = -log(D(I_o)) - log(1 - D(G(I_x))), where I_x is the low-quality virtual viewpoint image synthesized from the compressed video, I_o is the original high-quality virtual viewpoint image, G(·) represents the image generated by the generation network module, and D(·) represents the probability with which the discrimination network module judges the generated image to be a real image.
Compared with the prior art, the invention has the beneficial effects that:
(1) Aiming at the problem of virtual viewpoint synthesis distortion caused by three-dimensional video encoding/decoding and viewpoint synthesis, and considering that generative adversarial networks perform well in image restoration, the invention analyzes the causes of virtual viewpoint distortion and designs a quality enhancement network model oriented to the virtual viewpoint to realize virtual viewpoint quality enhancement.
(2) The invention analyzes the quality characteristics of the synthesized virtual viewpoint and the main causes of virtual viewpoint synthesis distortion, laying a foundation for preprocessing the data set;
(3) The invention designs a virtual viewpoint quality enhancement flow framework, mainly comprising two parts, data preprocessing and virtual viewpoint quality enhancement, and establishes the overall technical route and implementation steps of the method;
(4) The invention provides a virtual viewpoint quality enhancement network model based on a generative adversarial network, realizing quality enhancement through alternate training of the generation network and the discrimination network. In objective evaluation, the PSNR of the virtual viewpoint is improved by 1.127 dB on average over the original HTM-16.0 method, and the SSIM is improved by 0.0267 on average. In subjective evaluation, the enhanced video shows essentially no visible difference from the original video and achieves good subjective visual quality.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
Fig. 1 is an overall flowchart of the virtual viewpoint quality enhancement method based on a generative adversarial network provided by the present invention.
Fig. 2 is a schematic diagram of virtual viewpoint distortion caused by background region de-occlusion and non-viewpoint overlapping regions in the present invention.
Fig. 3 is a schematic diagram of virtual viewpoint distortion caused by inaccurate depth information in the present invention.
Fig. 4 is the flow framework for virtual viewpoint quality enhancement in the present invention.
Fig. 5 is a schematic diagram of the virtual viewpoint quality enhancement model based on a generative adversarial network in the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. Of course, the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention.
Example 1
Referring to fig. 1, the technical scheme provided in this embodiment is as follows: a virtual viewpoint quality enhancement method based on a generative adversarial network comprises the following steps:
step 1, analyzing the relevance between network and video characteristics and virtual viewpoint quality;
step 2, designing a virtual viewpoint quality enhancement flow framework;
step 3, constructing a virtual viewpoint quality enhancement model based on a generative adversarial network.
Specifically, referring to fig. 2 and 3, in step 1, the correlation between the network and video features and the virtual viewpoint quality is analyzed, and the method specifically includes the following steps:
1) Analyzing the causes of virtual viewpoint distortion, mainly including compression coding, disocclusion of background regions, non-overlapping viewpoint regions and inaccurate depth information;
2) Establishing a virtual viewpoint distortion evaluation criterion C, defined as C = f(p_video(d_texture, d_depth, d_synthesis), p_network(v, b, d)), where p_video(·) represents the video characteristic parameters, d_texture the texture map coding distortion, d_depth the depth map coding distortion, and d_synthesis the distortion of the virtual viewpoint synthesis process; p_network(·) represents the network characteristic parameters, v the network data transmission rate, b the network channel bandwidth, and d the total network data delay.
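The patent leaves the combining function f of the criterion C unspecified. The sketch below is therefore a purely hypothetical linear instance: the weights, the inverse dependence on rate and bandwidth, and the function name are all assumptions made for illustration only.

```python
def distortion_criterion(d_texture, d_depth, d_synthesis, v, b, d,
                         video_weights=(0.4, 0.4, 0.2),
                         network_weights=(0.5, 0.3, 0.2)):
    """Hypothetical instance of C = f(p_video(...), p_network(v, b, d)).

    The patent does not specify f; this sketch combines the video-side
    distortions and the network terms linearly. All weights are
    illustrative assumptions, not values from the patent.
    """
    wt, wd, ws = video_weights
    p_video = wt * d_texture + wd * d_depth + ws * d_synthesis
    wv, wb, wdly = network_weights
    # Lower rate/bandwidth and higher delay are assumed to worsen quality.
    p_network = wv / max(v, 1e-9) + wb / max(b, 1e-9) + wdly * d
    return p_video + p_network
```

A larger C would then flag a viewpoint as needing stronger enhancement during data-set preprocessing.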
Specifically, referring to fig. 4, in step 2, a virtual viewpoint quality enhancement flow framework is designed, including the steps of:
1) The method comprises the steps of obtaining a low-quality virtual viewpoint to be enhanced by encoding and decoding an original texture map and a depth map and performing viewpoint synthesis;
2) Preprocessing the low-quality and high-quality virtual viewpoint data sets obtained in step 1) based on the virtual viewpoint distortion evaluation criterion C;
3) Constructing a generating network, judging the network, defining a loss function, and constructing a virtual viewpoint quality enhancement model;
4) And reconstructing the low-quality virtual view into a high-quality virtual view by using the trained virtual view quality enhancement model.
Specifically, referring to fig. 5, in step 3, a virtual viewpoint quality enhancement model based on a generative adversarial network is constructed, including the steps of:
1) Combining the 1 generation network module, the 1 discrimination network module and the 1 loss feedback module into a virtual viewpoint quality enhancement model;
2) Combining 1 convolutional layer of 64-channel 3×3 convolution kernels and 16 residual units into the generation network module;
3) Combining 6 groups of convolutional layers, each comprising a 3×3 convolution kernel, a BN layer and a Leaky ReLU activation function, into the discrimination network module;
4) The mean square error L_MSE measuring pixel-level loss, the loss L_PSNR measuring the objective difference in image quality, the perceptual loss L_P measuring image style (color, texture, contrast, etc.), and the adversarial loss L_A measuring the cross entropy between the discrimination result and the real image are combined into the loss function L_G of the generation network: L_G = λ_MSE·L_MSE + λ_PSNR·L_PSNR + λ_P·L_P + λ_A·L_A, where λ_MSE, λ_PSNR, λ_P and λ_A are parameters set during training of the generation network model; L_MSE = (1/(W·H·C))·Σ(I_o - I_y)², where W, H and C are respectively the width, height and number of channels of the image; L_PSNR is computed from the mean square error loss L_MSE; L_P = (1/(W_m,n·H_m,n))·Σ(φ_m,n(I_o) - φ_m,n(I_y))², where φ_m,n denotes the feature distribution of the n-th convolutional layer before the m-th pooling layer in the VGG19 network, φ_m,n(I_o) is the feature distribution of the original high-quality virtual viewpoint image, φ_m,n(I_y) is the feature distribution of the high-quality virtual viewpoint image generated by the generation network, and W_m,n and H_m,n are the sizes of the features;
5) Obtaining the loss function of the discrimination network by calculating the probability that the original high-quality virtual viewpoint is judged true and the generated ("spurious") low-quality virtual viewpoint is judged false: L_D = -log(D(I_o)) - log(1 - D(G(I_x))), where I_x is the low-quality virtual viewpoint image synthesized from the compressed video, I_o is the original high-quality virtual viewpoint image, G(·) represents the image generated by the generation network module, and D(·) represents the probability with which the discrimination network module judges the generated image to be a real image.
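The generator and discriminator structure of steps 2) and 3) can be sketched in PyTorch (the framework named later in this embodiment). The layer counts follow the text (one 64-channel 3×3 convolution plus 16 residual units; six conv-BN-LeakyReLU groups), but the internal layout of each residual unit, the discriminator's channel widths and strides, and the classifier head are assumptions not specified in the patent.

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    # The patent does not detail the residual units; a standard
    # conv-BN-PReLU-conv-BN block with a skip connection is assumed.
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.PReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    """One 64-channel 3x3 conv layer followed by 16 residual units, as
    in step 2); the output conv back to 3 channels is an assumption."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.PReLU())
        self.body = nn.Sequential(*[ResidualUnit() for _ in range(16)])
        self.tail = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        # Predict a residual correction on top of the low-quality input.
        return x + self.tail(self.body(self.head(x)))

class Discriminator(nn.Module):
    """Six groups of (3x3 conv, BN, LeakyReLU) per step 3); channel
    widths, strides and the sigmoid head are assumptions."""
    def __init__(self):
        super().__init__()
        layers, c_in = [], 3
        for i, c_out in enumerate([64, 64, 128, 128, 256, 256]):
            layers += [
                nn.Conv2d(c_in, c_out, 3, stride=1 + i % 2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.LeakyReLU(0.2),
            ]
            c_in = c_out
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, x):
        # Probability that x is an original (real) high-quality view.
        return self.head(self.features(x))
```

The sigmoid output of the discriminator plays the role of D(·) in the loss L_D above.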
In order to test the performance of the proposed method, the virtual viewpoint quality enhancement model was built with the PyTorch framework, and all training was performed on a server; the training environment configuration is shown in Table 1. The model was trained with the Adam method, the initial learning rate was set to 0.0001 and fine-tuned as training progressed, to obtain the optimal parameters of the generative adversarial loss function. The test sequences were Balloons, Kendo, Newspaper, Poznan_Hall2, Poznan_Street and Undo_Dancer.
Table 1 training environment settings
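A single alternate-training step matching the described procedure (discriminator first, then generator, both with Adam) might look as follows. The weight values, the ε stabilizer, and the omission of the VGG19 perceptual term L_P are simplifications; this is a sketch of the training scheme, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_step(G, D, opt_g, opt_d, low_q, high_q,
               weights=(1.0, 0.05, 0.001)):
    """One alternate-training step for the quality enhancement GAN.

    weights = (w_mse, w_psnr, w_adv) stand in for the patent's lambda
    parameters; their values here are arbitrary assumptions, and the
    VGG19 perceptual term is omitted for brevity.
    """
    eps = 1e-8
    # Discriminator update: L_D = -log D(I_o) - log(1 - D(G(I_x))).
    fake = G(low_q).detach()
    l_d = -(torch.log(D(high_q) + eps).mean()
            + torch.log(1.0 - D(fake) + eps).mean())
    opt_d.zero_grad(); l_d.backward(); opt_d.step()
    # Generator update with the weighted multi-term loss L_G.
    w_mse, w_psnr, w_adv = weights
    fake = G(low_q)
    mse = F.mse_loss(fake, high_q)
    l_psnr = 10.0 * torch.log10(mse + eps)  # minimizing this raises PSNR
    l_adv = -torch.log(D(fake) + eps).mean()
    l_g = w_mse * mse + w_psnr * l_psnr + w_adv * l_adv
    opt_g.zero_grad(); l_g.backward(); opt_g.step()
    return float(l_g), float(l_d)
```

In a full run this step would be repeated over mini-batches of paired low-/high-quality virtual viewpoint patches, with the learning rate decayed over epochs as described.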
Table 2 shows the comparison between the proposed method and two current mainstream virtual viewpoint quality enhancement methods on objective quality evaluation indices. All experimental results were obtained under the training environment of this embodiment. The evaluation indices are the difference between the SSIM of the image produced by the proposed method (against the original image) and the SSIM of the image produced by the HTM-16.0 test platform codec (against the original image), and the corresponding difference in PSNR. SSIM is computed as SSIM(X, Y) = ((2·μ_X·μ_Y + c_1)(2·σ_XY + c_2)) / ((μ_X² + μ_Y² + c_1)(σ_X² + σ_Y² + c_2)), where X and Y are the two images to be compared, μ_X and μ_Y are their means, σ_X² and σ_Y² their variances, σ_XY their covariance, and c_1 = (k_1·L)² and c_2 = (k_2·L)² are stabilizing constants with k_1 = 0.01 and k_2 = 0.03; L is the dynamic range of the pixel values (255 for 8-bit images). PSNR is computed as PSNR = 10·log10((2^n - 1)² / ((1/(M·N))·Σ_i,j(x_i,j - y_i,j)²)), where M and N are the image dimensions, x_i,j and y_i,j are the gray values of the original image and the processed image at row i, column j, and n is the image bit depth. Table 2 shows the results of the proposed method and the reference methods.
As shown in Table 2, the virtual viewpoint quality enhancement model proposed in this embodiment improves the PSNR of the synthesized virtual viewpoint by 1.127 dB on average over the HTM-16.0 codec, better than the 0.804 dB of TSAN [4] and the 0.315 dB of the method of Zhu et al. [6]. On SSIM gain, the proposed model improves by 0.0267 over HTM-16.0 codec and viewpoint synthesis, also better than the 0.0117 of TSAN [4] and the 0.0046 of the method of Zhu et al. [6]. It follows that the proposed method performs better in enhancing virtual viewpoint quality.
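The SSIM and PSNR evaluation measures can be implemented directly from the printed formulas. Note that the SSIM here uses global image statistics exactly as the formula states; common library implementations average over sliding Gaussian windows, so their values will differ slightly.

```python
import numpy as np

def psnr(x, y, bits=8):
    """PSNR per the formula in the text: 10*log10((2^n - 1)^2 / MSE)."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return 10.0 * np.log10((2 ** bits - 1) ** 2 / mse)

def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM with the constants given in the text."""
    L = 2 ** bits - 1
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    return (((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2))
            / ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))
```

The reported gains in Table 2 are differences of these two measures between the enhanced and the HTM-16.0-decoded images, each evaluated against the original.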
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (1)
1. A virtual viewpoint quality enhancement method based on a generative adversarial network, characterized by comprising the following steps:
Step1.1, analyzing the relevance of network and video characteristics and virtual viewpoint quality;
Step 1.2, designing a virtual viewpoint quality enhancement flow framework;
step 1.3, constructing a virtual viewpoint quality enhancement model based on a generative adversarial network;
the step 1.1 specifically comprises the following steps:
2.1, analyzing the causes of virtual viewpoint distortion, including compression coding, disocclusion of background regions, non-overlapping viewpoint regions and inaccurate depth information;
2.2, establishing a virtual viewpoint distortion evaluation criterion C, defined as C = f(p_video(d_texture, d_depth, d_synthesis), p_network(v, b, d)), where p_video(·) represents the video characteristic parameters, d_texture the texture map coding distortion, d_depth the depth map coding distortion, and d_synthesis the distortion of the virtual viewpoint synthesis process; p_network(·) represents the network characteristic parameters, v the network data transmission rate, b the network channel bandwidth, and d the total network data delay;
The step 1.2 specifically comprises the following steps:
3.1, obtaining a low-quality virtual viewpoint to be enhanced by encoding and decoding an original texture map and a depth map and performing viewpoint synthesis;
3.2, preprocessing the low-quality virtual view and the high-quality virtual view data set obtained in the step 3.1 based on a virtual view distortion evaluation criterion C;
3.3, constructing a generating network, judging the network, defining a loss function, and constructing a virtual viewpoint quality enhancement model;
3.4, rebuilding the low-quality virtual view into a high-quality virtual view by using the trained virtual view quality enhancement model;
the step 1.3 specifically comprises the following steps:
4.1, combining 1 generation network module, 1 discrimination network module and 1 loss feedback module into a virtual viewpoint quality enhancement model;
4.2, combining 1 convolutional layer of 64-channel 3×3 convolution kernels and 16 residual units into the generation network module;
4.3, combining 6 groups of convolutional layers, each comprising a 3×3 convolution kernel, a BN layer and a Leaky ReLU activation function, into the discrimination network module;
4.4, composing the mean square error L_MSE measuring pixel-level loss, the loss L_PSNR measuring the objective difference in image quality, the perceptual loss L_P measuring image style, and the adversarial loss L_A measuring the cross entropy between the discrimination result and the real image into the loss function L_G of the generation network, with the formula L_G = λ_MSE·L_MSE + λ_PSNR·L_PSNR + λ_P·L_P + λ_A·L_A, where λ_MSE, λ_PSNR, λ_P and λ_A are parameters set during training of the generation network model; L_MSE = (1/(W·H·C))·Σ(I_o - I_y)², where W, H and C are respectively the width, height and number of channels of the image; L_PSNR is computed from the mean square error loss L_MSE; L_P = (1/(W_m,n·H_m,n))·Σ(φ_m,n(I_o) - φ_m,n(I_y))², where φ_m,n denotes the feature distribution of the n-th convolutional layer before the m-th pooling layer in the VGG19 network, φ_m,n(I_o) is the feature distribution of the original high-quality virtual viewpoint image, φ_m,n(I_y) is the feature distribution of the high-quality virtual viewpoint image generated by the generation network, and W_m,n and H_m,n are the sizes of the features;
4.5, obtaining the loss function of the discrimination network by calculating the probability that the original high-quality virtual viewpoint is judged true and the generated ("spurious") low-quality virtual viewpoint is judged false, with the formula L_D = -log(D(I_o)) - log(1 - D(G(I_x))), where I_x is the low-quality virtual viewpoint image synthesized from the compressed video, I_o is the original high-quality virtual viewpoint image, G(·) represents the image generated by the generation network module, and D(·) represents the probability with which the discrimination network module judges the generated image to be a real image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310445621.XA CN116723305B (en) | 2023-04-24 | 2023-04-24 | Virtual viewpoint quality enhancement method based on generation type countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116723305A CN116723305A (en) | 2023-09-08 |
CN116723305B true CN116723305B (en) | 2024-05-03 |
Family
ID=87864931
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102892021A (en) * | 2012-10-15 | 2013-01-23 | 浙江大学 | New method for synthesizing virtual viewpoint image |
CN104853175A (en) * | 2015-04-24 | 2015-08-19 | 张艳 | Novel synthesized virtual viewpoint objective quality evaluation method |
CN108495110A (en) * | 2018-01-19 | 2018-09-04 | 天津大学 | A kind of virtual visual point image generating method fighting network based on production |
CN112489198A (en) * | 2020-11-30 | 2021-03-12 | 江苏科技大学 | Three-dimensional reconstruction system and method based on counterstudy |
WO2021093584A1 (en) * | 2019-11-13 | 2021-05-20 | 南京大学 | Free viewpoint video generation and interaction method based on deep convolutional neural network |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102325259A (en) * | 2011-09-09 | 2012-01-18 | 青岛海信数字多媒体技术国家重点实验室有限公司 | Method and device for synthesizing virtual viewpoints in multi-viewpoint video |
US11024009B2 (en) * | 2016-09-15 | 2021-06-01 | Twitter, Inc. | Super resolution using a generative adversarial network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||