WO2010025655A1 - 3D video communication method, transmitting device and system, and image reconstruction method and system - Google Patents
3D video communication method, transmitting device and system, and image reconstruction method and system
- Publication number: WO2010025655A1
- Application: PCT/CN2009/073542 (CN2009073542W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- depth
- video
- map
- information
- Prior art date
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/122—Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
- H04N13/128—Adjusting depth or disparity
- H04N13/156—Mixing image signals
- H04N13/158—Switching image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
- H04N13/167—Synchronising or controlling image signals
- H04N13/172—Processing image signals comprising non-image signal components, e.g. headers or format information
- H04N13/178—Metadata, e.g. disparity information
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/239—Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
- H04N13/246—Calibration of cameras
- H04N13/25—Image signal generators using stereoscopic image cameras using two or more image sensors with different characteristics other than in their location or field of view, e.g. having different resolutions or colour pickup characteristics; using image signals from one sensor to control the characteristics of another sensor
- H04N13/271—Image signal generators wherein the generated image signals comprise depth maps or disparity maps
- H04N13/282—Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
- H04N13/296—Synchronisation thereof; Control thereof
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding, specially adapted for multi-view video sequence encoding
- H04N2213/005—Details of stereoscopic systems: aspects relating to the "3D+depth" image format
Definitions
- The present invention relates to the field of video technologies, and in particular to a 3D video communication method, transmitting device and system, and an image reconstruction method and system.
Background
- Traditional video is a carrier of two-dimensional image information: it expresses only the content of a scene while ignoring depth information such as the distance and position of objects, and is therefore incomplete.
- As observers of the three-dimensional world, humans need more spatial information than a single two-dimensional image provides, so as to obtain pictures that match the visual experience of observing the world with two eyes.
- 3D video technology can provide images that carry depth information in accordance with the principles of human stereo vision, truly reproducing the objective world and showing the depth, layering, and realism of objects in the scene; this is the direction in which current video technology is developing.
- Obtaining the depth information of a scene, i.e., its depth map, is essential in a 3D video system; the scene depth map is also called the disparity (parallax) map of the scene.
- In the prior art, the depth map of a scene is mainly obtained by the following methods:
- One method is stereo image matching, in which multiple color images of a scene are captured with cameras; each color image is a 2D image of the scene, and the images are analyzed and computed to obtain a depth map of the scene.
- The basic principle is as follows: find the image points corresponding to a given scene point in the multiple color maps, then compute the point's spatial coordinates from its coordinates in those maps, thereby obtaining the depth information of the point.
- Stereo image matching technology mainly includes window-based matching and dynamic programming methods, both of which adopt gray-level-based matching algorithms.
- A gray-level matching algorithm divides one color image into small sub-regions, uses the gray values of a sub-region as a template, and searches the other color images for the sub-region with the most similar gray-value distribution; if two sub-regions satisfy the similarity requirement, the points within them are considered matched. A correlation function is usually used to measure the similarity of two regions.
- Gray-level matching algorithms generally produce a dense depth map of the scene.
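- As an illustration of the window-based matching just described, the following sketch searches each scanline for the best-matching window; it is illustrative only, assuming rectified grayscale inputs, and `window_match_disparity`, the SSD cost (one common stand-in for the correlation function mentioned above), and all parameters are our own assumptions, not the patent's.

```python
import numpy as np

def window_match_disparity(left, right, window=5, max_disp=64):
    """Brute-force window-based matching on rectified grayscale images."""
    half = window // 2
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            template = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
            best_d, best_cost = 0, np.inf
            # Search along the same scanline (the epipolar line for parallel cameras).
            for d in range(0, min(max_disp, x - half) + 1):
                candidate = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.sum((template - candidate) ** 2)  # sum of squared differences
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

The cost of this exhaustive search is exactly why the text later calls stereo matching complex and hard to run in real time.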
- Stereo image matching can also be performed with feature-based matching algorithms.
- Feature-based matching algorithms match features derived from the grayscale information of the color images. Compared with algorithms that use simple luminance and gray-level variation information, feature-based matching is more stable and accurate. Matching features are those that can describe the 3D structure of the scene, such as edges and corners.
- Feature-based matching generally yields a sparse depth map of the scene, which is then interpolated to obtain a dense depth map.
- The other method is to use a single depth camera to obtain the depth map of the scene directly.
- The basic principle of the depth camera is to determine the distance to objects by emitting infrared light and detecting the intensity of the infrared light reflected by objects in the scene. The depth map output by a depth camera is of good quality and high precision and has promising applications.
- At present, depth cameras are mainly used in fields such as gesture recognition and background replacement and synthesis; they are rarely used in 3D video systems, and when they are, a single depth camera generally collects the scene video.
- A single depth camera obtains the depth map of the scene quite accurately, but it yields only one color map of the scene and the corresponding depth map from a single viewpoint.
- When the image of a virtual viewpoint with small parallax is reconstructed from such data, the reconstruction effect may be acceptable; when the image of a virtual viewpoint with large parallax is reconstructed, however, the lack of sufficient color image information produces large "holes" in the reconstructed image that cannot be patched, so the reconstructed image is severely distorted and the reconstruction effect is poor.
- FIG. 1 is a schematic diagram of how holes are generated when a virtual view image is reconstructed from the video image acquired by a single depth camera in the prior art. If the video of object 1a and object 1b is captured at viewpoint o1, then, because object 1b occludes portion 1a0 of object 1a, the actually captured video contains only part of the image information of object 1a together with the image information of object 1b; there is no image information for portion 1a0 of object 1a.
- When a virtual viewpoint image is reconstructed, no information is available for portion 1a0 of object 1a, so a hole appears there; the reconstructed image is severely distorted and the reconstruction effect is poor.
- In summary, the prior art has at least the following drawbacks:
- The stereo matching algorithms of the prior art must rely on the brightness and chrominance information of the scene and are highly susceptible to uneven illumination, camera noise, repeated textures in the scene, and so on; the obtained disparity/depth maps therefore contain many errors, virtual view reconstruction based on such depth maps is poor, and the reconstructed images are inaccurate. Moreover, stereo matching algorithms are complex, the disparity/depth maps cannot be obtained in real time, and the technology cannot yet be commercialized.
- An object of the present invention is to provide a 3D video communication method, a transmitting device, a system, and an image reconstruction method and system for improving the reconstruction effect of a virtual view point image.
- the embodiment of the invention provides a 3D video communication method, including:
- acquiring video image data of a scene collected by image capturing devices, the video image data comprising at least one depth map and at least two color maps, wherein the video image data is obtained by at least one image capturing device capable of outputting scene depth information together with at least one image capturing device capable of outputting scene color/grayscale video information, or by at least one image capturing device capable of outputting both scene depth information and color/grayscale video information;
- encoding the video image data to obtain video image encoded data; and
- sending the video image encoded data.
- An embodiment of the present invention provides a 3D video communication sending device, including:
- a video capture unit configured to acquire video image data of a scene collected by image capture devices, where the video image data includes at least one depth map and at least two color maps, and where the video capture unit includes at least one image capture device capable of outputting scene depth information and at least one image capture device capable of outputting scene color/grayscale video information;
- a video encoding unit configured to encode the video image data to obtain video image encoded data
- a video output unit configured to send the video image encoded data.
- An embodiment of the present invention provides an image reconstruction method, including: acquiring a color map and a depth map of a known viewpoint; performing depth-disparity conversion on the depth map to obtain disparity information corresponding to the depth map; and reconstructing an image of a virtual viewpoint according to the color map of the known viewpoint and the disparity information.
- An embodiment of the present invention provides an image reconstruction system, including: a general image capturing device for acquiring a color map of a known viewpoint;
- a depth image acquisition device configured to acquire a depth map of the known viewpoint
- a conversion device configured to perform depth parallax conversion on the depth map, and obtain disparity information corresponding to the depth map
- a reconstruction device configured to reconstruct an image of the virtual viewpoint according to the color map of the known viewpoint and the disparity information.
- An embodiment of the present invention provides an image reconstruction system, including:
- a first general image capturing device configured to acquire a first color map of a known first viewpoint
- a second general image capturing device configured to acquire a second color map of a known second viewpoint
- a first depth image acquisition device configured to acquire a first depth map of the known first viewpoint;
- a first determining device configured to determine a first depth-disparity correspondence factor of the first depth map according to the first color map, the second color map, and the first depth map;
- a first conversion device configured to perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor, to obtain first disparity information
- a first reconstruction device configured to reconstruct a third image of the virtual view according to the first color map and the first disparity information.
- the embodiment of the present invention provides a 3D video communication system, including a sending device and a receiving device, where the sending device includes:
- a video capture unit configured to acquire video image data of a scene collected by image capture devices, where the video image data includes at least one depth map and at least two color maps, and where the video capture unit includes at least one image capture device capable of outputting scene depth information and at least one image capture device capable of outputting scene color/grayscale video information;
- a video encoding unit configured to encode the video image data to obtain video image encoded data;
- a video output unit configured to send the video image encoded data;
- the receiving device includes:
- a video receiving unit configured to receive the video image encoded data sent by the video output unit
- a video decoding unit configured to decode the video encoded data to obtain video image decoding data.
- In the embodiments of the present invention, the depth map of the scene is collected by an image capturing device that can directly output the scene depth map, so the obtained depth map is accurate and reliable and is collected in real time, and the video images of the virtual viewpoints obtained from the depth map are effective and accurate, reflecting the real appearance of the scene.
- Because at least two color maps are acquired, the holes produced when reconstructing from only one color map can be repaired, so the reconstructed video image is more accurate and the reconstruction effect of virtual viewpoint images is improved, making the method highly practical; at the same time, a large amount of high-complexity computation during image reconstruction is avoided, improving both the real-time performance of image reconstruction and the quality of the reconstructed image.
- FIG. 1 is a schematic diagram showing the principle of generating a void when reconstructing a virtual view image according to a video image acquired by a single depth image acquisition device in the prior art
- FIG. 2 is a schematic diagram of a principle of a parallel dual image acquisition device 3D video system
- Fig. 3 is a schematic diagram showing the principle of acquiring a depth image by a CCD image capturing device equipped with an ultra-fast shutter and an intensity-modulated illuminator;
- Figure 4 is a basic configuration diagram of an HDTV Axi-Vision image acquisition device
- FIG. 5 is a schematic flowchart of Embodiment 1 of a 3D video communication method according to the present invention.
- FIG. 6 is a schematic flowchart of Embodiment 2 of a 3D video communication method according to the present invention;
- FIG. 7 is a schematic diagram of a relationship between a scene and a viewpoint according to an embodiment of the present invention.
- FIG. 8 is a schematic diagram of a relationship between a scene and an image point according to an embodiment of the present invention.
- FIG. 9 is a flowchart of Embodiment 1 of an image reconstruction method according to the present invention;
- FIG. 10 is a flowchart of Embodiment 2 of an image reconstruction method according to the present invention
- FIG. 11 is a flowchart of Embodiment 3 of an image reconstruction method according to the present invention;
- FIG. 12 is a flowchart of Embodiment 4 of an image reconstruction method according to the present invention;
- FIG. 13 is a flowchart of Embodiment 5 of an image reconstruction method according to the present invention;
- FIG. 14 is a schematic structural diagram of Embodiment 1 of a 3D video communication transmitting device according to the present invention
- FIG. 15 is a schematic structural diagram of Embodiment 2 of a 3D video communication transmitting device according to the present invention
- FIG. 16 is a schematic structural diagram of a video collection unit in an embodiment of a 3D video communication transmitting device according to the present invention;
- FIGS. 17A-17C are schematic diagrams of combinations of image capture devices and their connections with an acquisition control module in an embodiment of a 3D video communication transmitting device according to the present invention;
- FIG. 18 is a schematic structural diagram of Embodiment 1 of an image reconstruction system according to the present invention;
- FIG. 19 is a schematic structural diagram of Embodiment 2 of an image reconstruction system according to the present invention.
- FIG. 20 is a schematic structural diagram of Embodiment 3 of an image reconstruction system according to the present invention;
- FIG. 21 is a schematic structural diagram of Embodiment 4 of an image reconstruction system according to the present invention.
- FIG. 22 is a schematic structural diagram of Embodiment 5 of an image reconstruction system according to the present invention;
- FIG. 23 is a schematic structural diagram of an embodiment of a 3D video communication system according to the present invention.
- FIG. 24 is a schematic structural diagram of a receiving device in an embodiment of a 3D video communication system according to the present invention.
- The embodiments of the present invention are based on the basic principle of 3D video: once a depth map and color maps of the scene are obtained, 3D video images at each viewpoint can be obtained by reconstruction.
- The embodiments mainly acquire a depth map and multiple color maps of a scene with image capturing devices, where the depth map of the scene can be obtained with a depth image capturing device capable of outputting the scene's depth information, and a color or grayscale video map of the scene can be obtained with a general image capturing device capable of outputting the scene's color/grayscale video information.
- FIG. 2 is a schematic diagram of the principle of a parallel dual-camera 3D video system. As shown in FIG. 2, camera 1d1 and camera 1d2 are placed horizontally with distance B between them, and the photographed point 1c is at distance Z. The disparity in the horizontal direction then satisfies the formula:

d = x1 − x2 = fB / Z

where f is the focal length, Z is the distance between the object 1c and the imaging plane, B is the distance between the optical centers of the two cameras, and d is the disparity, i.e., the offset between the imaging positions of the same spatial point on the two cameras. In general d includes a horizontal component and a vertical component; for a parallel camera system the vertical component is 0. It can be seen that the disparity of a 3D image is related to the distance Z of the observed point.
- Given the imaging position of a point in one image and its disparity, the imaging position in the other image can be obtained; therefore, as long as the depth map and a color map of the scene are available, 3D video images of the scene at various viewpoints can be reconstructed.
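- The relation above can be sketched directly in code; this is a minimal illustration assuming an ideally rectified parallel setup, and `focal_px` (focal length in pixels) and `baseline` are illustrative parameter names, not the patent's.

```python
import numpy as np

def depth_to_disparity(depth, focal_px, baseline):
    """Convert a depth map Z (same units as baseline) to disparity d = f*B/Z in pixels."""
    z = np.asarray(depth, dtype=np.float32)
    return focal_px * baseline / np.maximum(z, 1e-6)  # guard against divide-by-zero

def imaging_position_in_other_view(x_left, disparity):
    """Imaging position of the same point in the right image, given its left-image column."""
    return x_left - disparity
```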
- Multi-Viewpoint (MV) / Free Viewpoint (FV) video is another hot topic in the current video field.
- A scene can be captured simultaneously by multiple cameras at different angles, generating multiple video streams.
- the video streams of these different viewpoints are sent to the user terminal, and the user can select any viewpoint and direction to view the scene.
- the user-selected viewpoint can be a predefined fixed camera shooting viewpoint or a virtual viewpoint whose image is synthesized from images taken by real cameras around.
- 3D video and multi-view/free-view video are not mutually exclusive and can be combined into one system.
- Each viewpoint in the multi-view/free-view video system can be viewed in 2D or 3D.
- Depth maps benefit such systems in two ways: (1) depth maps enable efficient encoding and decoding of 3D video images; (2) depth maps can be used to effectively reconstruct virtual viewpoints, i.e., viewpoints at which no physical camera exists.
- Stereo/multi-view displays often need to display images from several positions at the same time; using a color map plus a depth map, the images of other viewpoints can be generated from the image of one viewpoint, so the 2D images of the different viewpoints need not all be transmitted simultaneously, which saves bandwidth.
- The embodiments of the present invention use a depth image acquisition device to acquire the depth map of the scene, and use depth image acquisition devices and/or ordinary image acquisition devices to acquire the color maps of the scene. In this way, a high-quality depth map is obtained from the depth image acquisition device, while the multiple color maps obtained from the ordinary image acquisition devices allow 3D video images at various viewpoints to be obtained.
- FIG. 3 is a schematic diagram showing the principle of acquiring a depth image by a CCD image acquisition device equipped with an ultra-fast shutter and an intensity-modulated illuminator.
- As shown in FIG. 3, the illumination intensity increases linearly with time, 2c and 2d being snapshots of the spatial distribution of the illumination intensity at given moments.
- Object 2a is a near, square object and object 2b is a far, triangular object.
- The instantaneous intensity I1 of the light reflected from the near object 2a toward the image capture device 2e is sampled by the ultra-fast shutter 2f of the device, producing the square distribution in image A; the light reflected by the far object 2b produces the triangular distribution in image A. Because the instantaneous intensity I1 detected by the image capturing device 2e is stronger than I2, the square appears brighter than the triangle, so the brightness differences in the captured image A can be used to detect the depth of objects.
- the brightness of the reflected light of the object is affected by parameters such as the reflectivity of the object, the distance from the object to the image capture device, the modulation index of the light source, and the spatial inhomogeneity of the illumination.
- To eliminate these effects, an image B is captured with the spatial distribution of the illumination intensity decreasing linearly; by combining image A and image B, a signal processing algorithm can remove the unwanted factors and an accurate depth map is obtained.
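- A hedged sketch of this two-ramp idea follows; it assumes one exposure A under a linearly rising intensity ramp and one exposure B under a falling ramp, and the normalized-ratio model and depth mapping are our own illustrative assumptions, not the patent's exact signal processing.

```python
import numpy as np

def ramp_ratio_depth(img_a, img_b, z_near, z_far):
    """Reflectivity cancels in A/(A+B): the ratio depends on arrival time, hence distance."""
    a = np.asarray(img_a, dtype=np.float32)
    b = np.asarray(img_b, dtype=np.float32)
    ratio = a / np.maximum(a + b, 1e-6)        # in [0, 1]
    return z_far - ratio * (z_far - z_near)    # brighter in A (nearer) -> smaller Z
```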
- FIG. 4 is a basic configuration diagram of an HDTV Axi-Vision image acquisition device.
- The device includes a depth image processing unit and a color image processing unit.
- Near-infrared LED arrays are used to modulate illuminators with fast direct modulation. Near-infrared LEDs emit light at 850 nm, which is outside the visible range and does not interfere with visible light.
- Four LED units surround the camera lens to evenly illuminate the scene.
- A visible light source, such as a fluorescent source, illuminates the object being imaged; its spectrum is kept outside the near-infrared region.
- the visible light and the near-infrared light are separated, wherein the visible light enters the color image processing unit and is processed by the color image processing unit to obtain a color image of the object, that is, a 2D image.
- the color image processing unit can be a color HDTV camera; the near-infrared light is processed by the depth image processing unit to obtain a depth image of the object.
- In the depth image processing unit, a short pulse bias is applied between the photocathode and the microchannel plate (MCP) to realize an ultra-fast shutter. Each opening of the shutter produces an optical image of the object on the phosphor, which is then focused by a relay lens onto a high-resolution progressive CCD camera and converted into a photoelectron image, and finally the signal processor forms the depth map of the object.
- The shutter is driven at the same frequency as the light modulation to obtain a better signal-to-noise ratio (SNR). It can be seen that a depth camera can obtain a high-quality depth map, and video images reconstructed from a depth map obtained by a depth camera show a better effect.
- FIG. 5 is a schematic flowchart of Embodiment 1 of a 3D video communication method according to the present invention. Specifically, as shown in FIG. 5, the embodiment may include the following steps:
- Step 101: Acquire video image data of a scene collected by image capturing devices, where the video image data includes at least one depth map and at least two color maps, and is obtained by at least one image capturing device capable of outputting scene depth information together with at least one image capturing device capable of outputting scene color/grayscale video information, or by at least one image capturing device capable of outputting both.
- In the 3D video communication method of this embodiment, the depth map and color maps of the scene are acquired with image acquisition devices: the depth map is acquired by a depth image acquisition device capable of outputting the scene's depth information, and the color maps are obtained by ordinary image acquisition devices capable of outputting the scene's color video information; in addition, grayscale maps can be obtained by ordinary image acquisition devices capable of outputting grayscale images.
- In this embodiment, multiple image capturing devices at different viewpoints may be deployed. They may consist of at least one depth image capturing device able to acquire the scene depth map together with at least one ordinary image capturing device able to acquire a scene color map, or of at least one depth image capturing device able to output the depth information and the color/grayscale video information of the scene simultaneously.
- With such an arrangement, the depth map and color maps of the scene can be collected in real time, and the collected depth map and color maps suffice for reconstructing the 3D video image of each virtual viewpoint.
- Among the multiple image acquisition devices, devices at suitable positions can be selected as needed to obtain the desired depth map and color maps of the scene, avoiding repeated or unnecessary shooting; before shooting, the position of each image acquisition device can also be adjusted to capture the video image of the scene over a larger viewing angle.
- Step 102: Encode the video image data to obtain video image encoded data. The video image data of the scene acquired in step 101 is encoded to obtain the video image encoded data of the scene.
- Encoding the video image data facilitates its transmission and storage. Before encoding, pre-processing operations such as correction can be performed on the video image data to ensure its accuracy and reliability.
- Step 103 Send the video image encoded data.
- In this embodiment, the video image encoded data may be sent to a video image receiving device, which decodes the encoded data, reconstructs the video images of the virtual viewpoints, and finally displays the video image of each viewpoint on a display device.
- The video image receiving device can display the required video images according to the received encoded data, and during display it can reconstruct and render the video images of various virtual viewpoints to obtain video images of the scene from different viewpoints.
- The video image of each viewpoint may also be displayed according to the viewer's needs. Since the depth map and color maps of the scene are obtained in step 101 by depth image capturing devices and ordinary image capturing devices, the obtained depth map is accurate and reliable, and during reconstruction the hole regions that arise can be repaired using the multiple color maps; this improves the reconstruction effect of the video images, allows reconstructed images of various virtual viewpoints to be obtained, and yields reconstructed video images that reflect the real appearance of the scene.
- In this embodiment, the depth map of the scene is collected by an image capturing device that can directly output the scene depth map, so the obtained depth map is accurate, reliable, and strongly real-time, and the video images of the virtual viewpoints obtained from it are effective, accurate, and reliable, reflecting the real appearance of the scene.
- The holes produced when reconstructing from only one color map can be repaired, making the reconstructed video image more accurate and improving the reconstruction effect of virtual viewpoint images; the method is thus highly practical.
- FIG. 6 is a schematic flowchart diagram of Embodiment 2 of a 3D video communication method according to the present invention. Specifically, as shown in FIG. 6, the embodiment may include the following steps:
- Step 201 Control each image acquisition device to synchronize the image acquisition of the scene.
- In this embodiment, multiple image collection devices at different viewpoint positions may be deployed. They may include at least one depth image capturing device that outputs the depth information of the scene and at least one general image capturing device that outputs the color/grayscale video information of the scene, or at least one depth image capturing device that outputs both the depth information and the color/grayscale video information of the scene.
- Any number of depth image capturing devices and ordinary image capturing devices may be deployed, as long as the collected video image data of the scene includes at least one depth map and at least two color maps.
- When images of the scene are acquired, the image acquisition devices can be controlled to shoot and capture synchronously, ensuring that the collected video images are synchronized and avoiding mismatch between the images acquired at different viewpoints at the same moment; synchronous acquisition yields better video image quality.
- The image acquisition devices can also be placed at different positions before data collection so as to obtain the best shooting angles, capture video over a larger viewing angle, support the reconstruction and display of the 3D video image of each viewpoint, and improve the reconstruction of virtual viewpoint video images.
- For example, the depth image capturing device can be placed in the middle of the ordinary image capturing devices, so that a larger shooting angle is obtained; when virtual viewpoint video images are reconstructed, video images of the scene with a large viewing angle can then also be obtained.
- A synchronization signal can also be generated under control, with each image capturing device's acquisition of the scene synchronized according to that signal. The synchronization signal may be generated by a hardware or software clock, or the video output signal of one image acquisition device may serve as the synchronization signal. For synchronous control, the synchronization signal can be fed directly into the external synchronization interface of each image acquisition device; the devices can also be synchronously controlled by an acquisition control module. Synchronous acquisition can achieve frame synchronization or line/field synchronization.
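- As a small illustration of verifying frame synchronization, the sketch below compares acquisition timestamps (introduced in step 203 below) across devices; the function name, frame rate, and half-frame tolerance are illustrative assumptions.

```python
def frames_synchronized(timestamps, fps=25.0, tolerance_frames=0.5):
    """True if all devices' timestamps for a frame lie within a fraction of the frame period."""
    period = 1.0 / fps
    return max(timestamps) - min(timestamps) <= tolerance_frames * period

# Three cameras at 25 fps, timestamps in seconds: spread 12 ms < 20 ms tolerance
print(frames_synchronized([10.000, 10.004, 10.012]))  # True
```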
- Step 202: Calibrate each image acquisition device using the video images it collects, obtaining the internal parameters and external parameters of each image acquisition device.
- The image acquisition devices can be calibrated with traditional calibration methods or with self-calibration.
- Traditional calibration methods include the direct linear transformation (DLT) calibration method, the radial alignment constraint (RAC) calibration method, and plane-based calibration methods.
- The basic principle of traditional calibration is to use a calibration reference object to set up linear equations of the imaging model of the image acquisition device, measure the world coordinates of a set of points on the reference object and their corresponding coordinates on the imaging plane, and substitute these coordinate values into the linear equations to solve for the internal and external parameters.
- Self-calibration refers to calibrating the image acquisition device using only the correspondences between image points, without a calibration reference object; it relies on special constraint relationships between the imaging points across images, such as the epipolar constraint, and does not require structural information about the scene.
- Through calibration, calibration information including the internal and external parameters of each image acquisition device is obtained, and the video images captured by the devices can be corrected according to these parameters, making the corrected video images conform better to the human-eye imaging model and yielding a better visual effect.
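- As a concrete illustration of traditional plane-based calibration, the sketch below uses OpenCV; the checkerboard pattern size, square size, and function name are illustrative assumptions, not the patent's procedure.

```python
import cv2
import numpy as np

def calibrate_from_checkerboard(images, pattern=(9, 6), square=0.025):
    """Solve internal and external parameters from views of a planar reference object."""
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts = [], []
    for img in images:  # assumes at least one view shows the full pattern
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)      # known world coordinates on the plane
            img_pts.append(corners)   # corresponding imaging-plane coordinates
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_pts, img_pts, gray.shape[::-1], None, None)
    return K, dist, rvecs, tvecs      # internal params K/dist, external rvecs/tvecs
```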
- Step 203: According to the internal and external parameters, establish the correspondence between the video images collected by each image capturing device and that device's attributes, and use it as the video image data of the scene; the image capturing device attributes include the device's internal parameters, external parameters, and the acquisition timestamp of each video frame.
- By establishing the correspondence between the device attributes and the acquired video images, the video images can later be corrected according to the attributes of the device that captured them.
- Step 204: Correct the video image data according to the image capturing device attributes, obtaining corrected video image data.
- The video image data is corrected according to the device attributes and the established correspondence between the video images and those attributes; the correction processing may include, for example, geometric correction based on the calibration parameters described above.
- Step 205 Encode the corrected video image data to obtain video image encoded data.
- The corrected color map and depth map data can be encoded with a codec standard such as MPEG-4 or H.264, where the description of depth can follow the MPEG standard.
- A layer-based 3D video coding method can also be used, which combines the SEI information of the H.264 protocol with hierarchical coding ideas: the video data of one channel, the color map data, is encoded by the conventional method into a base layer containing only I and P frames, and all the data of the other channel, such as the depth map data, is then encoded as P frames whose prediction reference is either the previous frame of its own channel or the corresponding frame in the base layer. This provides good 2D/3D compatibility at decoding time: the receiving display user can select 2D display or 3D display, and the video decoding module can be controlled to perform the corresponding decoding processing.
- Step 206: Packetize the video image encoded data, encapsulate it into data packets, and send them.
- The video image encoded data may be grouped, encapsulated into data packets, and sent to the video image receiving device, which processes the received packet data accordingly; the data can be sent over an existing network such as the Internet.
- Packetizing and sending the video image encoded data specifically includes the following steps:
- Step 2061: Multiplex the video image encoded data to obtain multiplexed data. For the encoded video data, multiple video data streams can be multiplexed in frame/field fashion: one video stream can be encoded as the odd field, the other as the even field, and the odd and even fields transmitted together as one frame, as sketched after these steps.
- Step 2062: Packetize the multiplexed data, encapsulate it into data packets, and transmit them.
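- The field multiplexing of step 2061 can be illustrated as follows, assuming two equally sized frames as NumPy arrays; the function names are our own.

```python
import numpy as np

def mux_fields(frame_a, frame_b):
    """Interleave two frames line by line: frame_a as the even field, frame_b as the odd field."""
    h = frame_a.shape[0]
    out = np.empty((2 * h, *frame_a.shape[1:]), dtype=frame_a.dtype)
    out[0::2] = frame_a
    out[1::2] = frame_b
    return out

def demux_fields(muxed):
    """Recover the two fields at the receiver."""
    return muxed[0::2], muxed[1::2]
```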
- This embodiment may further receive encoded voice data, system command data, and/or file data, packetize them together with the video image encoded data, and transmit them. Externally input control information may also be received, including the viewing viewpoint, display mode, display distance, and so on; according to this control information, the image capturing devices can be adjusted and devices with better capture angles can be selected to capture the video image of the scene, for example by adjusting the shooting angles of the image capturing devices or the number of devices used, which improves the practicality of video image acquisition.
- The video receiving device can receive the video image encoded data over a network or the like and process the received data accordingly, for example demultiplexing, decoding, reconstructing, rendering, and displaying the received video image data.
- It can also decode received encoded voice data, store received file data, and perform specific operations according to the system command data, for example displaying the received video images in the display mode specified by a system command.
- The video image receiving device can reconstruct the video image of the scene at each virtual viewpoint from the received depth map and color maps. Since the depth map of the scene is acquired by a depth image acquisition device, the obtained depth map is accurate and reliable, and multiple color maps or grayscale maps of the scene can be obtained with multiple ordinary image acquisition devices or depth image acquisition devices. When the video images of the various viewpoints of the scene are displayed, the multiple color maps can be used to repair the hole regions that arise when only one color map is used for reconstruction, improving the reconstruction of viewpoint video images. At the same time, the depth map and color maps acquired by the image acquisition devices are strongly real-time, so the collected video image data is also highly practical.
- In this embodiment, by controlling the image acquisition devices to synchronously capture the video images of the scene and by calibrating the devices, synchronized video image data and device calibration information are acquired, and the captured video images are corrected according to the calibration information, making video image processing more accurate. Encoding the video images makes storing and transmitting the video image data convenient and facilitates handling large volumes of video image data. This embodiment thus further improves the accuracy of video collection, processing, and image reconstruction, and the acquisition of video images can be controlled effectively according to the input control information, improving the practicality of video image acquisition.
- FIG. 7 is a schematic diagram of a relationship between a scene and a viewpoint according to an embodiment of the present invention
- FIG. 8 is a schematic diagram of a relationship between a scene and an image point according to an embodiment of the present invention.
- As shown in FIG. 7, scene images are captured by image acquisition devices at known viewpoint 1 and known viewpoint 2, and a depth image acquisition device is placed at known viewpoint 1 to acquire the depth map of the scene; scene images at virtual viewpoints between known viewpoint 1 and known viewpoint 2 (such as virtual viewpoint 1 and virtual viewpoint 2) are then calculated.
- As shown in FIG. 8, let the imaging coordinates of the spatial point P(X, Y, Z) in the two image acquisition devices be (x1, y1) and (x2, y2). Knowing the baseline length B and the focal length f, the depth Z can be calculated as:

Z = fB / (x1 − x2) = fB / d

- In the embodiments of the present invention, the image coordinate x of the point at a virtual viewpoint can be reconstructed from the known image coordinates x1 and x2 and the depth Z.
- The depth information Z of the depth map acquired by the depth image capturing device has only relative meaning: it represents the depth relationships within the scene but is not disparity information of practical significance. During reconstruction it is therefore necessary to convert this scene depth information into practically meaningful disparity information, i.e., to obtain the disparity Vx from the depth Z.
- During shooting, the system parameters, namely the camera focal length f and the distance between the two cameras, are fixed, so the depth-disparity correspondence factor is a constant. Once the factor has been determined, the conversion from depth to disparity can be completed with essentially negligible time overhead; compared with obtaining disparity through a matching algorithm, the advantage in real-time performance is obvious.
- For convenience of description, the known viewpoint 1 is called the left viewpoint and the known viewpoint 2 the right viewpoint; the image acquired at known viewpoint 1 is the left image, the image acquired at known viewpoint 2 is the right image, the depth information acquired at known viewpoint 1 is the left depth map, and the depth information acquired at known viewpoint 2 is the right depth map.
- The idea of the image reconstruction method embodiments of the present invention is described in detail below with an example. The example uses a configuration of two ordinary image acquisition devices and two depth image acquisition devices; it can be understood that other configurations also fall within the scope of protection of the invention.
- The image acquisition devices are arranged in parallel, and the optical centers of each depth image acquisition device and its companion ordinary image acquisition device should coincide as far as possible. If the distance between their optical centers is large, the captured images will not fully coincide, and registration is required, i.e., each point in the image acquired by the depth image acquisition device is brought into positional agreement with the corresponding point in the image acquired by the ordinary image acquisition device. For example, if a point in the scene has certain imaging coordinates in the ordinary image capturing device and corresponding imaging coordinates in the depth image capturing device, registration maps the latter onto the former.
- In the following, d denotes the disparity between the left and right images (known viewpoint 1 and known viewpoint 2).
- To determine the depth-disparity correspondence factor, the embodiment of the present invention preferably uses a feature-point-matching-based method to acquire the disparity between feature points in the two images. Since the depth map actually acquired by the depth image capturing device contains noise, the embodiment preferably computes the factor over N feature points and averages the results to remove the noise, thereby obtaining a more accurate factor value.
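- The factor determination and conversion just described might look as follows; this is a sketch assuming the parallel-camera relation d = fB/Z given earlier, so the factor plays the role of fB, and the averaging over N matched feature points follows the noise-suppression idea above. All names are illustrative.

```python
import numpy as np

def depth_disparity_factor(feature_disparities, feature_depths):
    """Average over N matched feature points to suppress depth-map noise."""
    d = np.asarray(feature_disparities, dtype=np.float32)
    z = np.asarray(feature_depths, dtype=np.float32)
    return float(np.mean(d * z))   # if d = factor / Z, then factor = mean(d * Z)

def disparity_map_from_depth(depth_map, factor):
    """Apply the constant factor to convert a whole depth map to a disparity map."""
    z = np.asarray(depth_map, dtype=np.float32)
    return factor / np.maximum(z, 1e-6)
```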
- The above method is used to determine the depth-disparity correspondence factor of the left depth map and the depth-disparity correspondence factor of the right depth map respectively, and thus to obtain a left disparity map and a right disparity map.
- For a virtual viewpoint between the two known viewpoints, the disparity Vx' between a point's image at the virtual viewpoint and its image in the left camera is proportional to its full left-right disparity Vx; for a parallel configuration, Vx' = (B'/B)·Vx, where B' is the distance from the left viewpoint to the virtual viewpoint and B is the baseline between the two known viewpoints.
- FIG. 9 is a flowchart of Embodiment 1 of an image reconstruction method according to the present invention.
- In this embodiment, a depth image capturing device capable of outputting a depth map and a color map may be placed at the first viewpoint, and an ordinary image capturing device capable of outputting a color map may be placed at the second viewpoint; the depth image capturing device acquires the depth map and the color map of the first viewpoint, and the ordinary image capturing device acquires the color map of the second viewpoint.
- As shown in FIG. 9, the method can include the following steps:
- Step 310 Acquire a first color map of a known first viewpoint and a second color map of a known second viewpoint.
- a color map of the scene is acquired by the ordinary image capturing device at the known first viewpoint and the known second viewpoint, respectively.
- Step 320 Acquire a first depth map of the known first viewpoint.
- Step 330 Determine a first depth disparity correspondence factor of the first depth map.
- Step 340 Perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor, and obtain first disparity information.
- Step 350 Reconstruct a third image of the virtual view according to the first color map and the first disparity information.
- In subsequent image reconstruction, the determined depth-disparity correspondence factor can be reused; it is not necessary to re-determine it. That is to say, once the depth-disparity correspondence factor has been determined, step 330 need not be performed again.
- In this embodiment, the depth map is obtained directly and converted into disparity information to reconstruct the image, so the disparity information need not be acquired with a stereo matching algorithm; this avoids a large amount of high-complexity calculation and improves the real-time performance of image reconstruction and the quality of the reconstructed image.
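- Steps 340-350 can be illustrated by forward-warping the known color map by a scaled disparity; this is a sketch only (occlusion ordering is ignored), and `alpha`, the virtual viewpoint's relative position between the known viewpoints, is our own parameterization.

```python
import numpy as np

def reconstruct_virtual_view(color, disparity, alpha=0.5):
    """Shift each pixel by alpha * disparity; pixels left unfilled are the 'holes'."""
    h, w = disparity.shape
    out = np.zeros_like(color)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xv = int(round(x - alpha * disparity[y, x]))
            if 0 <= xv < w:
                out[y, xv] = color[y, x]
                filled[y, xv] = True
    return out, filled   # ~filled marks the hole regions needing repair
```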
- FIG. 10 is a flowchart of Embodiment 2 of an image reconstruction method according to the present invention.
- the method of this embodiment may further include steps 313 and 314 before determining the depth disparity corresponding factor.
- the method of this embodiment may include the following steps:
- Step 311 Acquire a first color map of a known first viewpoint and a second color map of the known second viewpoint.
- a color map of the scene is acquired by the ordinary image capturing device at the known first viewpoint and the known second viewpoint, respectively.
- Step 312 Acquire a first depth map of the known first viewpoint.
- Step 313: Rectify the first color map and the second color map so that corresponding points in the first color map and the second color map are aligned on the same horizontal line.
- Step 313 may also be performed directly after step 311; the embodiment of the present invention does not limit this.
- Step 314: Register the first color map and the first depth map so that points in the first color map coincide with corresponding points in the first depth map.
- Step 315 Determine a first depth disparity correspondence factor of the first depth map.
- Step 316 Perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor, and obtain first disparity information.
- Step 317 Reconstruct a third image of the virtual view according to the first color map and the first disparity information.
- In subsequent image reconstruction, the determined depth-disparity correspondence factor can be reused; it is not necessary to re-determine it. That is to say, once the depth-disparity correspondence factor has been determined, step 315 need not be performed again.
- In this embodiment, the depth map is obtained directly and converted into disparity information to reconstruct the image, so the disparity information need not be acquired with a stereo matching algorithm; this avoids a large amount of high-complexity calculation and improves the real-time performance of image reconstruction and the quality of the reconstructed image.
- FIG. 11 is a flowchart of Embodiment 3 of an image reconstruction method according to the present invention. Specifically, as shown in FIG. 11, the following steps are included:
- Step 410 Acquire a first color map of a known first viewpoint and a second color map of a known second viewpoint.
- a color map of the scene is acquired by the ordinary image capturing device at the known first viewpoint and the known second viewpoint, respectively.
- Step 420 Acquire a first depth map of the known first viewpoint and a second depth map of the known second viewpoint.
- a depth map of the scene at the known first viewpoint and the known second viewpoint is acquired by the depth image acquisition device.
- Step 430 Determine a first depth disparity correspondence factor of the first depth map and a second depth disparity correspondence factor of the second depth map.
- Step 440 Perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor, obtain first disparity information, and perform depth parallax conversion on the second depth map according to the second depth disparity corresponding factor. , obtain the second disparity information.
- the principle and process of performing depth-disparity conversion on a depth map according to the depth-disparity correspondence factor to acquire disparity information have been described in detail above and are not repeated here for brevity.
- Step 450 Reconstruct a third image of the virtual view according to the first color map and the first disparity information, and reconstruct a fourth image of the virtual view according to the second color map and the second disparity information.
- Step 460 Perform hole filling according to the third image and the fourth image to generate a fifth image of the virtual view.
- the first color map and the first depth map are registered such that points in the first color map coincide with corresponding points in the first depth map.
- the second color map and the second depth map are registered such that points in the second color map coincide with corresponding points in the second depth map.
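- To make step 460 concrete, the sketch below (Python/NumPy, hypothetical names, assumed data layout) fills the holes of the third image with pixels from the fourth image, assuming each warped image carries a boolean mask of the pixels its warp actually wrote:

```python
import numpy as np

def merge_views(img_a: np.ndarray, mask_a: np.ndarray,
                img_b: np.ndarray, mask_b: np.ndarray):
    """Step 460: fill holes of img_a (the third image) with pixels from
    img_b (the fourth image); positions missing in both images remain
    holes for later interpolation-based repair."""
    out = img_a.copy()
    holes_a = ~mask_a
    take_b = holes_a & mask_b        # holes of img_a that img_b can supply
    out[take_b] = img_b[take_b]
    remaining = holes_a & ~mask_b    # still-empty positions
    return out, remaining
```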
- the depth map is directly obtained and converted into disparity information to reconstruct the image, so that the disparity information does not need to be acquired by a stereo matching algorithm, thereby avoiding a large amount of highly complex computation and improving the real-time performance of the reconstruction and the quality of the reconstructed image.
- the occlusion problem in the scene is solved by obtaining sufficient depth maps of the scene; this problem cannot be solved when the image is reconstructed using a stereo matching algorithm.
- the color maps and depth maps of two viewpoints can be used to reconstruct the virtual viewpoint image; color maps and depth maps of more viewpoints may also be used for the reconstruction, and the principle of the reconstruction process is the same as with two viewpoints.
- FIG. 12 is a flowchart of Embodiment 4 of an image reconstruction method according to the present invention. Specifically, as shown in FIG. 12, the method of this embodiment may include the following steps:
- Step 510 Obtain a color map of a known viewpoint.
- Step 520 Obtain a depth map of the known viewpoint.
- Step 530 Perform depth parallax conversion on the depth map to obtain disparity information corresponding to the depth map.
- specifically, a depth-disparity correspondence factor of the depth map is determined, and depth-disparity conversion is performed on the depth map according to this factor to acquire the disparity information corresponding to the depth map.
- Step 540 Reconstruct an image of the virtual view according to the color map of the known view and the disparity information.
- the embodiment employs a color map and a depth map of a known viewpoint.
- the application scenario is to generate other virtual viewpoint images with small parallax, which can be used for stereoscopic display.
- correction of the color map is not required.
- FIG. 13 is a flowchart of Embodiment 5 of an image reconstruction method according to the present invention. Specifically, as shown in FIG. 13, the method may include the following steps:
- Step 511 Obtain a color map of a known viewpoint.
- Step 512 Obtain a depth map of the known viewpoint.
- Step 513 Register a color map of the known viewpoint and a depth map of the known viewpoint, so that a point in the depth map coincides with a corresponding point in the color map.
- Step 514 Determine a depth disparity corresponding factor of the depth map.
- the depth-disparity correspondence factor in this embodiment has no physical meaning; it may be selected according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
- Step 515 Perform depth parallax conversion on the depth map according to the depth disparity corresponding factor, and obtain disparity information corresponding to the depth map.
- Step 516 Reconstruct an image of the virtual view according to the color map of the known view and the disparity information.
- the embodiment employs a color map and a depth map of a known viewpoint.
- the application scenario is to generate other virtual viewpoint images of small parallax, which can be used in stereoscopic display.
- correction of the color map is not required, but registration of the color map and the depth map is required.
- the registration process is the same as in the previous embodiment; in this embodiment the depth-disparity correspondence factor still needs to be determined, but here it has no physical meaning and may be selected according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
- FIG. 14 is a schematic structural diagram of Embodiment 1 of a 3D video communication transmitting apparatus according to the present invention.
- the 3D video communication transmitting apparatus of this embodiment includes a video capturing unit 11, a video encoding unit 12, and a video output unit 13.
- the video capture unit 11 is configured to acquire video image data of a scene collected by the image acquisition devices, where the video image data includes at least one depth map and at least two color maps; the video capture unit 11 includes at least one depth image acquisition device capable of outputting scene depth information and at least one ordinary image acquisition device capable of outputting scene color/grayscale video information, or includes at least one depth image acquisition device capable of outputting both scene depth information and color/grayscale video information.
- the video encoding unit 12 is configured to encode the video image data to obtain video image encoded data.
- the video output unit 13 is configured to receive the video image encoded data encoded by the video encoding unit 12, and send the video image encoded data.
- the depth map and/or color maps of the scene may be acquired by the depth image acquisition device in the video capture unit 11, with the ordinary image acquisition device acquiring color maps of the scene; the obtained depth map and color maps of the scene are then transmitted as 3D video image data to the video encoding unit 12, which encodes the captured video image data to obtain the video image encoded data of the scene and sends it to the video output unit 13, which in turn transmits the video image encoded data to the video image receiving device.
- the depth map of the scene is collected by the depth image acquisition device, so the obtained depth map is accurate and reliable; at the same time, multiple color maps or grayscale maps of the scene can be acquired by the depth image acquisition device and/or the ordinary image acquisition devices, so that 3D video image data of various viewpoints is available when the 3D video image of each virtual viewpoint of the scene is reconstructed. When performing video image reconstruction of a virtual viewpoint, the depth map and color map collected by the depth image acquisition device can be used for the reconstruction. By setting appropriate shooting viewpoints for the image acquisition devices, the obtained scene images cover a larger viewing angle, so that images of virtual viewpoints within a larger viewing-angle range can be reconstructed with a better reconstruction effect.
- the depth map of the scene is obtained by the depth image acquisition device, and the obtained depth map is accurate and reliable, and the real-time performance is strong.
- the 3D video images of various virtual viewpoints obtained according to the depth map are more accurate, and can reflect the real effect of the scene.
- multiple color maps of the scene are obtained by the depth image acquisition device and the ordinary image acquisition devices, so that 3D video data of a wide range of viewpoints is available when performing 3D video image reconstruction of a virtual viewpoint, and hole regions can be repaired with the help of the color maps; the repaired 3D video image is more accurate and reflects the real effect of the scene more faithfully, improving the reconstruction effect of virtual viewpoint images, so that the 3D video communication transmitting device of this embodiment of the present invention has strong practicability.
- FIG. 15 is a schematic structural diagram of Embodiment 2 of a 3D video communication transmitting device according to the present invention; FIG. 16 is a schematic structural diagram of a video capture unit in an embodiment of a 3D video communication transmitting device according to the present invention; FIGS. 17A-17C are schematic diagrams of combinations of image acquisition devices and their connections with the acquisition control module in embodiments of the 3D video communication transmitting device according to the present invention.
- on the basis of Embodiment 1 of the 3D video communication transmitting device of the present invention, as shown in FIG. 16, the video capture unit 11 in this embodiment may include a depth image acquisition device 110 capable of outputting a depth map of the scene, or a depth image acquisition device 111 capable of simultaneously outputting the depth map and a color map of the scene, and further includes an ordinary image acquisition device 112 capable of outputting a color map or grayscale map of the scene.
- the video capture unit 11 in this embodiment further includes at least one acquisition control module 113 for controlling the image acquisition devices connected to it to shoot the scene and for collecting and outputting the video image data of the captured scene.
- the depth image acquisition device 111 can simultaneously output a depth map and a color map of the scene; the ordinary image acquisition device 112 can only output a color map or grayscale map of the scene; and the depth image acquisition device 110 can only output the depth map of the scene.
- the acquisition control module 113 can be combined and connected with the image acquisition devices in the following ways:
- (a) the acquisition control module 113 is connected to one depth image acquisition device 111 and one ordinary image acquisition device 112;
- (b) the acquisition control module 113 is connected to one depth image acquisition device 110 and two ordinary image acquisition devices 112;
- the positions of the depth image acquisition device 110 and the ordinary image acquisition devices 112 can be set arbitrarily, but to obtain the maximum viewing angle the depth image acquisition device 110 may be placed between the ordinary image acquisition devices 112; in this way the obtained depth map and color maps of the scene cover a larger viewing angle, the 3D video image of a virtual viewpoint can be reconstructed within a larger range, and the synthesized 3D video images of the virtual viewpoints are better.
- (c) the acquisition control module 113 is connected to two or more depth image acquisition devices 111. Multiple depth image acquisition devices 111 can obtain more depth maps of the scene and the color maps corresponding to those depth maps; therefore a larger scene range is covered when reconstructing a virtual viewpoint, and the video data obtained by the individual depth image acquisition devices can be cross-referenced, improving the accuracy of virtual viewpoint reconstruction.
- the combinations of the acquisition control module 113 and the image acquisition devices described above are only the most basic connection forms; the number of image acquisition devices can be combined or increased arbitrarily according to actual needs to obtain better 3D video data of the scene, but the video image data output during video capture of the scene should include at least one depth map and multiple color maps of the scene.
- this embodiment uses the two basic combinations (a) and (b) above to form the video capture unit 11, which includes two acquisition control modules 113: one acquisition control module 113 is connected to a depth image acquisition device 111 and an ordinary image acquisition device 112, and the other acquisition control module 113 is connected to a depth image acquisition device 110 and ordinary image acquisition devices 112. The viewpoint positions captured by the image acquisition devices can be allocated reasonably, so that each depth map and color map of the captured scene has a good viewing angle, ensuring the reconstruction effect of each virtual viewpoint image of the scene.
- the video collection unit 11 may further include a synchronization module 114 and a calibration module 115.
- the synchronization module 114 is configured to generate a synchronization signal, and output the synchronization signal to the acquisition control module 113.
- so that the acquisition control module 113 synchronizes the shooting of the scene by the image acquisition devices; or the synchronization signal is output directly to the synchronization interfaces of the image acquisition devices to synchronize their shooting of the scene. The synchronization signal is generated by the synchronization module 114 itself or is the video output signal of one of the image acquisition devices during capture. The calibration module 115 is configured to receive the video images captured by the image acquisition devices, perform image acquisition device calibration according to the acquired video images, obtain the internal and external parameters of each image acquisition device, and send them to the acquisition control module 113.
- the acquisition control module 113 is further configured to establish, according to the internal and external parameters, a correspondence between the acquired video image data and the attributes of the image acquisition devices; the image acquisition device attributes include the internal and external parameters of the device and the acquisition time stamp of each frame of the video image. Synchronous acquisition by the image acquisition devices can be achieved through the synchronization module 114, ensuring that the acquired video images are synchronized; the obtained internal and external parameters of the image acquisition devices can serve as a reference basis for the correction processing of the video images, and the video images captured by different image acquisition devices are corrected to ensure the reconstruction effect of virtual viewpoints.
- the 3D video communication transmitting device of this embodiment may further include a pre-processing unit 14, configured to receive from the acquisition control module 113 the video images collected by each image acquisition device together with the attributes of each image acquisition device, correct the video image data according to the internal and external parameters of the corresponding image acquisition device, and output the corrected video image data; the video encoding unit 12 receives the video image data corrected by the pre-processing unit 14 and encodes the corrected video image data.
- each acquisition control module 113 has a corresponding pre-processing unit 14 connected to it; in this way, the video image data collected by each acquisition control module 113 can be processed quickly and accurately, improving the efficiency of data processing.
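- As one concrete example of the kind of correction the pre-processing unit 14 might apply, the sketch below undistorts a captured frame with OpenCV using internal parameters of the sort delivered by the calibration module; the numeric values, the file name, and the choice of OpenCV are illustrative assumptions, not part of the original design.

```python
import cv2
import numpy as np

# Internal parameters as delivered by the calibration module (illustrative values):
# camera matrix K (focal lengths, image center) and lens distortion coefficients.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])
dist = np.array([-0.12, 0.03, 0.0, 0.0, 0.0])

frame = cv2.imread("frame.png")            # one captured video frame
corrected = cv2.undistort(frame, K, dist)  # remove lens distortion
```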
- the video output unit 13 may also include an output processing module 131 and an output interface module 132.
- the output processing module 131 is configured to receive the video image encoded data encoded by the video encoding unit 12, and perform packet processing on the video image encoded data, and package the data into a data packet.
- the output interface module 132 is configured to send out the data packets that have been packet-processed and encapsulated.
- the embodiment may further include a multiplexing unit 15 for multiplexing video image encoded data to obtain multiplexed data.
- the output processing module 131 may be further configured to receive the multiplexed data, perform packet processing on it, and encapsulate it into data packets.
- This embodiment may also include an audio encoding unit, a system control unit, and a user data unit.
- the audio encoding unit is configured to encode the voice data and send it to the output processing module 131;
- the system control unit is configured to send the command data to the output processing module 131;
- the user data unit is configured to send the file data to the output processing module 131;
- the output processing module 131 is further configured to perform packet processing on the received encoded voice data, command data and/or file data, encapsulate them into data packets, and send the data packets to the output interface module 132.
- the local audio information can be transmitted together with the video information to the receiving end of the video through the audio encoding unit, thereby improving the practicability of the 3D video.
- this embodiment may further include a control input unit 16 connected to the acquisition control module 113 in the video capture unit 11, configured to acquire control information and transmit it to the acquisition control module; the control information may include display information such as the viewing or display viewpoint, display distance, and display mode.
- the control information, such as the display or viewing viewpoint, distance, and display mode, can be input by the user through a Graphical User Interface (GUI) or a remote control device, and the acquisition control module 113 can be controlled based on this information. If the display mode requires only 2D video, the acquisition control module 113 may be instructed to select only the ordinary image acquisition devices for scene shooting and capture; if 3D display is required, the depth image acquisition devices and the ordinary image acquisition devices may be selected to shoot and capture together. According to the viewing or display viewpoint, shooting and image capture can be performed selectively by only a subset of the image acquisition devices, which improves the efficiency of image collection and reduces the collection of useless or repeated data that would burden data transmission and processing.
- the acquisition control module is used to control the acquisition and output of video images by the image acquisition devices connected thereto.
- the acquisition control module can convert analog image signals into digital video image signals or directly receive digital image signals, and can save the collected image data frame by frame in its buffer.
- the acquisition control module can also provide the collected video data to the calibration module for image acquisition device calibration; the calibration module returns the calibration information of the internal and external parameters of each image acquisition device to the corresponding acquisition control module, which then establishes a one-to-one correspondence between the video image data and the attributes of the image acquisition device that acquired it according to the calibration information. The attributes of an image acquisition device include its unique code, its internal and external parameters, the acquisition time stamp of each frame, and so on; the attributes of the image acquisition device and the video image data are output in a certain format.
- according to the calibration information of the image acquisition devices, the acquisition control module can also pan, rotate, or zoom an image acquisition device through its remote control interface, and can provide a synchronization signal to the image acquisition devices through their synchronization interfaces to control synchronous acquisition.
- the acquisition control module can also select a part of the image acquisition device to perform the collection operation according to the viewing or display viewpoint received by the control input unit, and close the collection of the unnecessary depth image acquisition device to avoid repeated or useless collection.
- the synchronization module is used to control the synchronous acquisition of multiple image acquisition devices. For fast-moving objects, synchronous acquisition is very important; otherwise images of different viewpoints captured at the same moment will differ greatly, and the 3D video seen by the user will be distorted.
- the synchronization module can generate a synchronization signal with a hardware or software clock and output it either to the external synchronization interfaces of the image acquisition devices to control their synchronous acquisition, or to the acquisition control module, which then performs synchronous acquisition control of the image acquisition devices through the control line.
- the synchronization module can also use the video output signal of one image acquisition device as a control signal input to the other image acquisition devices for synchronous acquisition control. Synchronous acquisition can achieve frame synchronization or line/field synchronization.
- the calibration module mainly realizes the calibration of the image acquisition device, and the so-called image acquisition device calibration is to obtain the internal parameters and external parameters of the image acquisition device.
- Internal parameters include image acquisition device imaging image center, focal length, lens distortion, etc.
- External parameters include parameters such as rotation and translation of the image acquisition device position. Since the images captured by multiple image acquisition devices are often not aligned by the scan line, they do not conform to the imaging model of the human eye, and when viewed, they cause visual fatigue to the user. Therefore, it is necessary to correct the image captured by the image capturing device to an image conforming to the human eye imaging model, and the internal parameters and external parameters of the image capturing device obtained by the image capturing device calibration can be used as a basis for correcting the image.
- in the imaging model, an imaging point with image coordinates (u, v) corresponds to a scene point with world coordinates (X_w, Y_w, Z_w) through the internal and external parameters of the image acquisition device.
- the calibration of the image acquisition device can be performed by a conventional calibration method and a self-calibration method.
- the traditional calibration method includes direct linear transformation calibration method, radial alignment constraint calibration method and plane calibration method.
- the basic principle of the traditional calibration method is to use a calibration reference object to establish linear equations of the imaging model of the image acquisition device, measure the world coordinates of a set of points on the reference object and their corresponding coordinates on the imaging plane, and then substitute these coordinate values into the linear equations to solve for the internal and external parameters.
- the self-calibration method refers to a process in which the image acquisition device is calibrated only through the correspondences between image points, without any calibration reference object. Self-calibration is based on special constraint relationships between the imaging points in multiple images, such as the epipolar constraint, so structural information about the scene may not be needed.
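- For illustration only: the traditional reference-object approach described above corresponds closely to chessboard calibration as implemented in OpenCV. The sketch below (the file names and board size are placeholder assumptions) measures the imaging-plane coordinates of reference points with known world coordinates and solves for the internal parameters and the per-view external parameters (rotation and translation).

```python
import cv2
import numpy as np

board = (9, 6)  # inner-corner layout of the chessboard reference object
world = np.zeros((board[0] * board[1], 3), np.float32)
world[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2)  # known world coords

obj_pts, img_pts = [], []
for path in ["cal1.png", "cal2.png", "cal3.png"]:   # calibration shots
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board)
    if found:
        obj_pts.append(world)       # world coordinates of the reference points
        img_pts.append(corners)     # their measured imaging-plane coordinates

# Solve the imaging-model equations for the internal parameters (K, dist)
# and the external parameters of each view (rvecs, tvecs).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
```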
- the pre-processing unit receives the collected image buffer and the corresponding image acquisition device parameters from the acquisition control module, and processes the cached image according to the pre-processing algorithm.
- the preprocessing mainly includes the following contents:
- 3D video coding is mainly divided into two categories: block-based coding and object-based coding.
- block-based coding in addition to intra prediction and inter prediction to eliminate data redundancy in the spatial and temporal domains, spatial data redundancy between multi-channel images must also be eliminated.
- Parallax estimation and compensation techniques can be used to eliminate spatial redundancy between multi-channel images.
- the core of disparity estimation and compensation is to find the correlation between two or more images; it is similar to motion estimation and compensation, but disparity estimation and compensation is more complex than motion estimation and compensation.
- motion estimation and compensation operate on images captured by the same image acquisition device at different times, whereas disparity estimation and compensation operate on images captured by different image acquisition devices at the same time. In disparity estimation the positions of all pixels may change, while for objects at a great distance the disparity may be considered zero.
- the video encoding unit in the embodiment of the present invention can encode the color map and depth map data output by the pre-processing unit using a codec standard such as MPEG-4 or H.264, and the depth description may adopt the relevant MPEG standard.
- there are various methods for encoding color map + depth map data. One example is a layered 3D video coding method, which mainly combines the SEI information of the H.264 protocol with layered coding ideas: the video data of one channel, the color map data, is encoded by the conventional method into a base layer containing only I and P frames, and all the data of the other channel, such as the depth map data, is then encoded into P frames whose prediction reference is the previous frame of that channel or the corresponding frame in the base layer. This provides better 2D/3D compatibility when decoding: for conventional 2D display only the base layer data needs to be decoded, while for 3D display both layers are decoded.
- the control input unit is mainly used to receive the input of the video user or the video terminal, and feed back to the video acquisition unit and the video coding unit.
- the information included in the control input unit mainly includes viewing and display viewpoints, display modes, and distance information of the user.
- the information that controls the input unit can be input by the user through a graphical user interface or a remote control device, such as viewing or displaying viewpoints, distance information, and display methods.
- the control input unit can also selectively control the image capturing device according to information such as viewing viewpoints, for example, only one or more image capturing devices in the video capturing unit can be selected for video image collection.
- the video encoding unit can be controlled to encode only the color maps required for 2D display; if the display mode is 3D display, both the output color map data and the depth map data are encoded.
- in this embodiment, the image acquisition devices are controlled by the acquisition control module, and the shooting angles of the image acquisition devices are arranged during acquisition, so that 3D video data of the scene with a larger viewing angle can be obtained and a better reconstruction effect is achieved when reconstructing each virtual viewpoint of the scene. Through the synchronization module and the calibration module, synchronized video data and image acquisition device calibration parameters can be obtained, making the processed video images more accurate. At the same time, the video data is encoded, which improves the convenience of data storage and transmission and facilitates the storage and transmission of a large amount of video data. This embodiment further improves the accuracy of video capture and processing and improves the reconstruction effect of virtual viewpoint video images.
- FIG. 18 is a schematic structural diagram of Embodiment 1 of an image reconstruction system according to the present invention.
- the reconstruction system may include:
- the first normal image capturing device 610 is configured to acquire a first color map of a known first viewpoint.
- the second normal image capturing device 620 is configured to acquire a second color map of the known second viewpoint.
- the first depth image acquisition device 630 is configured to acquire a first depth map of the known first viewpoint.
- the first determining device 640 is configured to determine a first depth disparity corresponding factor of the first depth map according to the first color map, the second color map, and the first depth map.
- the first converting means 650 is configured to perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor to obtain first disparity information.
- the first reconstruction device 660 is configured to reconstruct a third image of the virtual view according to the first color map and the first disparity information.
- the image is reconstructed by converting the depth map into disparity information, so that the disparity information does not need to be acquired by a stereo matching algorithm, thereby avoiding a large amount of highly complex computation and improving the real-time performance of the reconstruction and the quality of the reconstructed image.
- FIG. 19 is a schematic structural diagram of Embodiment 2 of an image reconstruction system according to the present invention.
- the image reconstruction system of the present embodiment may further include a correction device 611 and a first registration device 612.
- the image reconstruction system of this embodiment may include:
- the first normal image capturing device 610 is configured to acquire a first color map of a known first viewpoint.
- the second normal image capturing device 620 is configured to acquire a second color map of the known second viewpoint.
- the first depth image acquisition device 630 is configured to acquire a first depth map of the known first viewpoint.
- the correction device 611 is configured to correct (rectify) the first color map and the second color map so that each point in the first color map and its corresponding point in the second color map lie on the same horizontal scan line.
- the first registration device 612 is configured to register the first color map and the first depth map such that a point in the first color map coincides with a corresponding point in the first depth map.
- the first determining device 640 is configured to determine a first depth disparity corresponding factor of the first depth map according to the first color map, the second color map, and the first depth map.
- the first converting means 650 is configured to perform depth parallax conversion on the first depth map according to the first depth disparity corresponding factor to obtain first disparity information.
- the first reconstruction device 660 is configured to reconstruct a third image of the virtual view according to the first color map and the first disparity information.
- the depth map is directly obtained and converted into disparity information to reconstruct the image, so that the disparity information does not need to be acquired by a stereo matching algorithm, thereby avoiding a large amount of highly complex computation and improving the real-time performance of the reconstruction and the quality of the reconstructed image.
- FIG. 20 is a schematic structural diagram of Embodiment 3 of an image reconstruction system according to the present invention.
- the embodiment may further include:
- the second depth image acquisition device 710 is configured to acquire a second depth map of the known second viewpoint.
- the second determining means 720 is configured to determine a second depth disparity corresponding factor of the second depth map according to the first color map, the second color map, and the second depth map.
- the second conversion device 730 is configured to perform depth parallax conversion on the second depth map according to the second depth disparity corresponding factor to obtain second disparity information.
- the second reconstruction device 740 is configured to reconstruct a fourth image of the virtual view according to the second color map and the second disparity information.
- the hole filling device 750 is configured to perform hole filling according to the third image and the fourth image to generate a fifth image of the virtual view.
- the first normal image capturing device and the first depth image capturing device are preferably coincident or integrated.
- if the points in the color maps acquired by the ordinary image acquisition devices do not coincide with the corresponding points in the depth maps acquired by the depth image acquisition devices, or if the two color maps acquired by the ordinary image acquisition devices are not rectified, the image reconstruction system further includes:
- the correction device 611, configured to correct (rectify) the first color map and the second color map so that each point in the first color map and its corresponding point in the second color map lie on the same horizontal scan line.
- the first registration device 612 is configured to register the first color map and the first depth map such that a point in the first color map coincides with a corresponding point in the first depth map.
- the second registration device 613 is configured to register the second color map and the second depth map such that a point in the second color map coincides with a corresponding point in the second depth map.
- the image is reconstructed by converting the depth map into disparity information, so that the disparity information does not need to be acquired by a stereo matching algorithm, thereby avoiding a large amount of highly complex computation and improving the real-time performance of the reconstruction and the quality of the reconstructed image.
- the occlusion problem in the scene is solved by obtaining sufficient depth maps of the scene; this problem cannot be solved when the image is reconstructed by a stereo matching algorithm.
- FIG. 21 is a schematic structural diagram of Embodiment 4 of an image reconstruction system according to the present invention.
- This embodiment may include: a general image capturing device 810 for acquiring a color map of a known viewpoint.
- the depth image acquisition device 820 is configured to acquire a depth map of the known viewpoint.
- the converting device 830 is configured to perform depth parallax conversion on the depth map, and obtain disparity information corresponding to the depth map.
- the reconstruction device 840 is configured to reconstruct an image of the virtual viewpoint according to the color map of the known viewpoint and the disparity information.
- the embodiment employs a color map and a depth map of a known viewpoint.
- the application scenario is to generate other virtual viewpoint images of small parallax, which can be used for stereoscopic display.
- correction of the color map is not required.
- FIG. 22 is a schematic structural diagram of Embodiment 5 of an image reconstruction system according to the present invention.
- This embodiment may include: a general image capturing device 810 for acquiring a color map of a known viewpoint.
- the depth image acquisition device 820 is configured to acquire a depth map of the known viewpoint.
- the converting device 830 is configured to perform depth parallax conversion on the depth map, and obtain disparity information corresponding to the depth map.
- the reconstruction device 840 is configured to reconstruct an image of the virtual viewpoint according to the color map of the known viewpoint and the disparity information.
- the determining device 850 is configured to determine a depth parallax corresponding factor of the depth map.
- a registration device 860, configured to register the image acquired by the ordinary image acquisition device and the image acquired by the depth image acquisition device, so that each point in the depth map coincides in position with the corresponding point in the color map.
- the converting means 830 performs depth parallax conversion on the depth map according to the depth disparity corresponding factor, and acquires disparity information corresponding to the depth map.
- a color map and a depth map of a known viewpoint are used; the application scenario is the generation of other virtual viewpoint images with small parallax, which can be used for stereoscopic display.
- correction of the color map is not required, but registration of the color map and the depth map is required.
- the registration process is the same as that of the previous embodiment.
- the depth-disparity correspondence factor still needs to be determined, but here it has no physical meaning; it can be selected according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
- FIG. 23 is a schematic structural diagram of an embodiment of a 3D video communication system according to the present invention.
- the present embodiment includes a transmitting device 1 and a receiving device 2.
- the transmitting device 1 includes a video capturing unit 11, a video encoding unit 12, and a video output unit 13.
- the video capture unit 11 is configured to acquire video image data of a scene collected by the image acquisition devices, where the video image data includes at least one depth map and at least two color maps; the video capture unit 11 includes at least one depth image acquisition device capable of outputting scene depth information and at least one ordinary image acquisition device capable of outputting scene color/grayscale video information, or includes at least one depth image acquisition device capable of outputting both scene depth information and color/grayscale video information.
- the video encoding unit 12 is configured to encode the video image data to obtain video image encoded data.
- the video output unit 13 is configured to receive the video image encoded data encoded by the video encoding unit 12, and send the video image encoded data.
- the receiving device 2 includes a video receiving unit 21 and a video decoding unit 22.
- the video receiving unit 21 is configured to receive video image encoded data sent by the video output unit 13;
- the video decoding unit 22 is configured to decode the video image encoded data to obtain video image decoded data.
- the transmitting device 1 and the receiving device 2 can be directly connected, or can be connected through an existing communication network, such as the Internet.
- the depth map and/or color maps of the scene may be acquired by the depth image acquisition device in the video capture unit 11, with the ordinary image acquisition device acquiring color maps of the scene; the obtained depth map and color maps of the scene are then transmitted as 3D video image data to the video encoding unit 12, which encodes the captured video image data to obtain the video image encoded data of the scene and sends it to the video output unit 13, which in turn transmits the video image encoded data to the video image receiving device.
- the depth map of the scene is collected by the depth image acquisition device, so the obtained depth map is accurate and reliable; at the same time, multiple color maps or grayscale maps of the scene can be acquired by the depth image acquisition device and/or the ordinary image acquisition devices, so that 3D video image data of various viewpoints is available when reconstructing the 3D video image of each virtual viewpoint of the scene.
- when performing video image reconstruction of a virtual viewpoint, the depth map and color map acquired by the depth image acquisition device can be used to reconstruct the virtual viewpoint, and the reconstructed image can then be repaired using the color maps collected by the ordinary image acquisition devices to eliminate possible hole regions, so that the reconstructed image better matches the real effect of the scene and satisfies the user's visual experience.
- the depth image acquisition device and the ordinary image acquisition devices can be set at appropriate shooting viewpoints, so that the obtained scene images cover a larger viewing angle, images of virtual viewpoints within a larger viewing-angle range can be reconstructed, and the reconstruction effect is better.
- the receiving device 2 may perform corresponding decoding, video image reconstruction, rendering, and display processing according to the received video image encoded data. Since the depth map in this embodiment is collected by the depth image acquisition device, the obtained depth map has good quality and is collected with strong real-time performance. When performing 3D video image reconstruction of each virtual viewpoint of the scene, the depth map and a color map acquired by the depth image acquisition device can be used to reconstruct the virtual viewpoint, and the reconstructed image can then be repaired using the color maps collected by the ordinary image acquisition devices to eliminate possible hole regions, so that the reconstructed image better matches the actual scene and satisfies the user's visual experience.
- FIG. 24 is a schematic structural diagram of a receiving device in an embodiment of a 3D video communication system according to the present invention.
- the receiving device 2 may further include an image reconstruction system 23 for reconstructing a video image of a viewpoint to be displayed according to the display information and the video image decoding data.
- the receiving device 2 in this embodiment may further include a demultiplexing unit 24 for demultiplexing the multiplexed data received by the video receiving unit 21, the multiplexed data being multiplexed data of the video image encoded data.
- the image reconstruction system 23 can receive the video image decoded data output by the video decoding unit 22, reconstruct the video image of the display viewpoint according to the depth map and color map in the decoded data to obtain a reconstructed image of the display viewpoint, and repair the hole regions in the reconstructed image according to the color maps in the decoded data and/or by linear or nonlinear interpolation, to obtain the video image of the display viewpoint.
- the receiving device 2 in this embodiment may further include a display input unit 25 for acquiring display information, where the display information includes information such as the display or viewing viewpoint, display mode, and display distance; according to the display information, the image reconstruction system 23 may reconstruct the video image of the viewpoint to be displayed from the video image decoded data.
- this embodiment may further include a rendering unit 26 and a display unit 27, where the rendering unit 26 is configured to receive the video image of the display viewpoint and render it, and the display unit 27 is configured to receive the image data of the display viewpoint rendered by the rendering unit 26 and display the video image of the display viewpoint.
- the rendering unit 26 can also receive the video image decoding data directly sent by the video decoding unit 22, render it, and send it to the display unit 27 for display.
- the receiving device 2 may further include a speech decoding unit, a system control unit, and/or a user data unit, where the speech decoding unit may decode the received encoded speech data, the system control unit may process the received system command data, and the user data unit may store, edit, and otherwise handle the received file data; the speech decoding unit, system control unit, and user data unit are not shown in the drawings.
- the image reconstruction system is configured to reconstruct the virtual viewpoint image based on the obtained color map and depth map data of the scene.
- Reconstruction of virtual viewpoint images can be performed using image-based rendering reconstruction techniques.
- in image-based rendering reconstruction techniques, I_o denotes the original texture image, I_v denotes the image of the newly reconstructed viewpoint, d denotes the depth map, d(x, y) denotes the disparity value at pixel (x, y), and α is a weight for the offset.
- for a parallel image acquisition device system, for example, each pixel (x, y) in the reconstructed viewpoint image satisfies the relationship I_v(x, y) = I_o(x + α·d(x, y), y).
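- As a worked instance of this relationship (illustrative numbers only): with α = 1 and a disparity d(100, 50) = 4, the reconstructed pixel I_v(100, 50) is taken from I_o(104, 50), while a distant point with d = 0 keeps its position unchanged; nearer points thus shift further between the two views.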
- the depth map and the color map obtained by each image capturing device can be reconstructed in different ways.
- if the image acquisition devices collecting the video data include only depth image acquisition devices 110/111, the image of the virtual viewpoint can be reconstructed according to the following steps:
- (1) using an output color map I_1 and depth map D_1, reconstruct with the general algorithm of the image-based rendering reconstruction technique described above to obtain a reconstructed image I_v1 of the virtual viewpoint V within the image acquisition device group;
- (2) using another output color map I_2 and depth map D_2, reconstruct with the same general algorithm to obtain another reconstructed image I_v2 of the same virtual viewpoint V, and fill the holes in I_v1 with I_v2;
- (3) for the remaining hole regions, the information of the pixels in a hole may be determined from the brightness, chromaticity and depth information of the pixels surrounding the hole and repaired, for example, by linear or nonlinear interpolation, finally yielding the video image of the virtual viewpoint.
- if the image acquisition devices collecting the video data include only one depth image acquisition device 110/111 and one ordinary image acquisition device 112, virtual viewpoint reconstruction can be performed according to the following steps:
- (1) reconstruct an image I_v of the virtual viewpoint from the color map and depth map D output by the depth image acquisition device, using the general algorithm described above;
- (2) use the color image I_2 output by the ordinary image acquisition device for filling. The basic method of filling is: first, find the positional relationship between the ordinary image acquisition device and the depth image acquisition device, for example from the calibration parameters obtained by image acquisition device calibration; then, use the depth map D to find the positions in I_2 corresponding to the hole regions of I_v, and map the pixels of I_2 at these positions into I_v to fill the holes in I_v;
- (3) for the hole regions remaining in I_v after step (2), repair by linear or nonlinear interpolation, finally obtaining the video image of the virtual viewpoint.
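- A minimal sketch of the interpolation-based repair in step (3), assuming holes are marked by a boolean mask and each hole pixel is filled by linearly interpolating the nearest valid pixels on its scan line (one simple realization of "linear interpolation"; the source does not fix the exact scheme):

```python
import numpy as np

def repair_holes_linear(img: np.ndarray, holes: np.ndarray) -> np.ndarray:
    """Fill each hole pixel by linear interpolation between the nearest
    non-hole pixels to its left and right on the same scan line."""
    out = img.astype(np.float32).copy()
    h, w = holes.shape
    for y in range(h):
        valid = np.where(~holes[y])[0]            # valid columns on this line
        if valid.size < 2:
            continue                              # nothing to interpolate from
        for x in np.where(holes[y])[0]:
            left = valid[valid < x]
            right = valid[valid > x]
            if left.size and right.size:
                x0, x1 = left[-1], right[0]
                t = (x - x0) / float(x1 - x0)
                out[y, x] = (1 - t) * out[y, x0] + t * out[y, x1]
            elif left.size:
                out[y, x] = out[y, left[-1]]      # extrapolate from one side
            else:
                out[y, x] = out[y, right[0]]
    return out.astype(img.dtype)
```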
- the image reconstruction system can also perform filtering and other image processing on the reconstructed video image of the viewpoint to improve the effect of the video image.
- as shown in FIG. 18, the image reconstruction system in the embodiment of the present invention may specifically include a first ordinary image acquisition device 610, a second ordinary image acquisition device 620, a first depth image acquisition device 630, a first determining device 640, a first conversion device 650, and a first reconstruction device 660; or, as shown in FIG. 19, it may specifically include a first ordinary image acquisition device 610, a second ordinary image acquisition device 620, a first depth image acquisition device 630, a correction device 611, a first registration device 612, a first determining device 640, a first conversion device 650, and a first reconstruction device 660; or, as shown in FIG. 20, it may specifically include a first ordinary image acquisition device 610, a second ordinary image acquisition device 620, a first depth image acquisition device 630, a correction device 611, a first registration device 612, a first determining device 640, a first conversion device 650, a first reconstruction device 660, a second depth image acquisition device 710, a second determining device 720, a second conversion device 730, a second reconstruction device 740, and a hole filling device 750; or, as shown in FIG. 21, it may specifically include an ordinary image acquisition device 810, a depth image acquisition device 820, a conversion device 830, and a reconstruction device 840; or, as shown in FIG. 22, it may specifically include an ordinary image acquisition device 810, a depth image acquisition device 820, a conversion device 830, a reconstruction device 840, a determining device 850, and a registration device 860.
- the image reconstruction system in this embodiment may have the same structure and function as the embodiment of the image reconstruction system of the present invention described above, and details are not described herein.
- the video collection unit 11 in the transmitting device 1 in this embodiment may further include at least one acquisition control module 113, a synchronization module 114, and a calibration module 115.
- the video output unit 13 may include an output processing module 131 and an output interface module 132.
- the transmitting device 1 may further include a pre-processing unit 14, a multiplexing unit 15, a control input unit 16, an audio encoding unit, a system control unit, and a user data unit.
- the acquisition control module 113 can be connected with a plurality of combinations of the depth image acquisition device and the general image acquisition device to control the image capture device to perform shooting and capturing of the scene.
- the structure of the transmitting device 1 in this embodiment is the same as that of the foregoing embodiment of the 3D video communication transmitting device of the present invention, and details are not described herein again.
- the transmitting device and the receiving device in this embodiment may be integrated, so that the integrated device can send video image data to other devices, and can also receive and process video image data sent by other devices as well as the video image data collected by the device itself, so that video images can be displayed in real time.
- the transmitting device and the receiving device in this embodiment can also be connected through various existing wireless or wired networks, and can be applied to remote video image acquisition and similar scenarios.
- the video image data including depth maps and color maps collected by the transmitting device in this embodiment is accurate and reliable and has strong real-time performance; the video image data can be transmitted to the receiving device, which processes it. Since the collected video image data includes depth maps and color maps, when performing video image reconstruction of a virtual viewpoint, multiple color maps can be used to repair the hole regions produced by reconstruction from only one color map, so that the reconstructed image has a good effect and strong practicability, which can meet the needs of 3D video.
Description
3D video communication method, transmitting device, system,
and image reconstruction method and system This application claims priority to Chinese Patent Application No. 200810119545.9, filed with the Chinese Patent Office on September 2, 2008 and entitled "3D video communication method, transmitting device and system", and to Chinese Patent Application No. 200810225195.4, filed with the Chinese Patent Office on October 30, 2008 and entitled "Image reconstruction method and image reconstruction system", the entire contents of which are incorporated herein by reference. Technical Field
The present invention relates to the field of video technologies, and in particular to a 3D video communication method, transmitting device and system, and an image reconstruction method and system. Background
Conventional video is a carrier of two-dimensional image information: it can only represent the content of objects while ignoring depth information such as their distance and position, and is therefore incomplete. As the subject of video observation, humans need more than the information of a single two-dimensional image to obtain the necessary spatial information, so as to obtain pictures that match the visual experience of observing the world with two eyes.
3D video technology can provide pictures with depth information that conform to the principles of human stereoscopic vision, thereby truly reproducing scenes of the objective world and expressing the sense of depth, layering and realism of objects in a scene; it is an important direction in the current development of video technology. Obtaining the depth information of a scene, i.e. the scene depth, is a very important part of a 3D video system; the scene depth map is also called the disparity map of the scene. At present, the prior art mainly obtains the depth map of a scene in the following ways:
One way is to obtain the depth map of a scene by stereo image matching. In stereo image matching, cameras capture multiple color images of a scene, the color images being the 2D images of the scene, and the depth map of the scene is obtained by analyzing and computing on the color images. The basic principle is: find the imaging points corresponding to a point of the scene in the multiple color images, and then compute the point's coordinates in space from its coordinates in the color images, thereby obtaining the depth information of the point.
Stereo image matching techniques mainly include window-based matching and dynamic programming, both of which use grayscale-based matching algorithms. A grayscale-based matching algorithm divides a color image into small sub-regions, uses their grayscale values as templates, and finds the sub-regions with the most similar grayscale distribution in the other color images; if two sub-regions satisfy the similarity requirement, the points in the two sub-regions are considered matched. In the matching process, a correlation function is usually used to measure the similarity of two regions. Grayscale-based matching algorithms can generally produce a dense depth map of the scene.
In addition, stereo image matching can also be performed with feature-based matching algorithms. Feature-based matching uses features derived from the grayscale information of the color images; compared with algorithms that match on simple brightness and grayscale variation, feature-based matching is more stable and accurate. The matched features can be regarded as potentially important features able to describe the 3D structure of the scene, such as edges and edge corner points. Feature-based matching generally obtains a sparse depth map of the scene first and then derives a dense depth map by interpolation and similar methods.
The other way is to use a single depth camera (Depth Camera) to obtain the depth map of a scene. The basic principle of a depth camera is to determine the distance of objects by emitting infrared light and detecting the intensity of the infrared light reflected by objects in the scene; the depth map output by a depth camera is therefore of good quality and high precision, with good application prospects. At present, depth cameras are mainly used in fields such as gesture recognition and background replacement and synthesis; they are rarely applied in 3D video systems, where generally a single depth camera is used to collect the scene video images.
When a single camera is used to collect video images of a scene, the obtained depth map is fairly accurate, but a single depth camera can obtain only one color map of the scene at one viewpoint and the corresponding depth map. A good reconstruction effect may be achieved when reconstructing the image of a virtual viewpoint with small parallax; but when reconstructing the image of a virtual viewpoint with large parallax, because few color maps are obtained and sufficient color image information is lacking, the reconstructed image of the virtual viewpoint will contain large "holes" that cannot be repaired, so the reconstructed image is severely distorted and the reconstruction effect is poor.
FIG. 1 is a schematic diagram of the principle of hole generation in the prior art when a virtual viewpoint image is reconstructed from the video image collected by a single depth camera. Suppose video images of object 1a and object 1b are obtained at viewpoint o1. Since object 1b occludes the part 1a0 of object 1a, the video image information actually obtained includes only part of the image information of object 1a plus the image information of object 1b, and contains no image information of the part 1a0 of object 1a. If a video image of object 1a and object 1b at virtual viewpoint o2 is desired, then, because the actually obtained video image information contains no image information of the part 1a0 of object 1a, the image reconstructed at viewpoint o2 will lack the 1a0 part of object 1a; a hole is therefore produced at the 1a0 part, so that the reconstructed image is severely distorted and the reconstruction effect is poor.
In the process of implementing the present invention, the inventors found that the prior art has at least the following defects. The stereo matching algorithms of the prior art must rely on the brightness and chromaticity information of the scene and are highly susceptible to non-uniform illumination, camera noise, repeated textures in the scene, and the like; the resulting disparity/depth maps therefore contain many errors, virtual viewpoint reconstruction based on such depth maps performs poorly, and the reconstructed images are inaccurate. Moreover, stereo matching algorithms are complex and obtaining the disparity/depth map is far from real-time, so the current technology cannot yet be commercialized. When a single depth camera of the prior art is used to obtain depth information for reconstructing the image of a virtual viewpoint with large parallax, large "holes" that cannot be repaired are produced; the reconstructed image is severely distorted, the reconstruction effect is poor, and the practicability is weak. Summary of the Invention
The objective of the present invention is to provide a 3D video communication method, transmitting device and system, and an image reconstruction method and system, so as to improve the reconstruction effect of virtual viewpoint images.
An embodiment of the present invention provides a 3D video communication method, including: acquiring video image data of a scene collected by image acquisition devices, where the video image data includes at least one depth map and at least two color maps, and the video image data is obtained by at least one image acquisition device capable of outputting scene depth information and at least one image acquisition device capable of outputting scene color/grayscale video information, or by at least one image acquisition device capable of outputting scene depth information and color/grayscale video information; encoding the video image data to obtain video image encoded data; and sending out the video image encoded data.
An embodiment of the present invention provides a 3D video communication transmitting device, including: a video capture unit, configured to acquire video image data of a scene collected by image acquisition devices, where the video image data includes at least one depth map and at least two color maps, and the video capture unit includes at least one image acquisition device capable of outputting depth information of the scene and at least one image acquisition device capable of outputting color/grayscale video information of the scene, or includes at least one image acquisition device capable of outputting depth information and color/grayscale video information of the scene; a video encoding unit, configured to encode the video image data to obtain video image encoded data; and a video output unit, configured to send out the video image encoded data.
An embodiment of the present invention provides an image reconstruction method, including: obtaining a color map of a known viewpoint; obtaining a depth map of the known viewpoint; performing depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map; and reconstructing an image of a virtual viewpoint according to the color map of the known viewpoint and the disparity information.

An embodiment of the present invention provides an image reconstruction method, including: obtaining a first color map of a known first viewpoint and a second color map of a known second viewpoint; obtaining a first depth map of the known first viewpoint; determining a first depth-disparity correspondence factor of the first depth map according to the first color map, the second color map and the first depth map; performing depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and reconstructing a third image of a virtual viewpoint according to the first color map and the first disparity information. An embodiment of the present invention provides an image reconstruction system, including:
an ordinary image acquisition device, configured to obtain a color map of a known viewpoint; a depth image acquisition device, configured to obtain a depth map of the known viewpoint; a conversion device, configured to perform depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map; and a reconstruction device, configured to reconstruct an image of a virtual viewpoint according to the color map of the known viewpoint and the disparity information.
An embodiment of the present invention provides an image reconstruction system, including: a first ordinary image acquisition device, configured to obtain a first color map of a known first viewpoint; a second ordinary image acquisition device, configured to obtain a second color map of a known second viewpoint; a first depth image acquisition device, configured to obtain a first depth map of the known first viewpoint; a first determining device, configured to determine a first depth-disparity correspondence factor of the first depth map according to the first color map, the second color map and the first depth map; a first conversion device, configured to perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and a first reconstruction device, configured to reconstruct a third image of a virtual viewpoint according to the first color map and the first disparity information.
An embodiment of the present invention provides a 3D video communication system, including a transmitting device and a receiving device. The transmitting device includes: a video capture unit, configured to acquire video image data of a scene collected by image acquisition devices, where the video image data includes at least one depth map and at least two color maps, and the video capture unit includes at least one image acquisition device capable of outputting depth information of the scene and at least one image acquisition device capable of outputting color/grayscale video information of the scene, or includes at least one image acquisition device capable of outputting depth information and color/grayscale video information of the scene; a video encoding unit, configured to encode the video image data to obtain video image encoded data; and a video output unit, configured to send out the video image encoded data. The receiving device includes: a video receiving unit, configured to receive the video image encoded data sent by the video output unit; and a video decoding unit, configured to decode the video image encoded data to obtain video image decoded data.
In the embodiments of the present invention, the depth map of a scene is collected by an image acquisition device capable of directly outputting the scene depth map, so the obtained depth map is accurate and reliable and the depth map collection has strong real-time performance; the video images of the virtual viewpoints obtained from the depth map are effective and accurate and can reflect the real effect of the scene. At the same time, based on the multiple color maps of the scene obtained by the image acquisition devices, the holes produced by reconstructing from only one color map can be repaired, making the reconstructed video image more accurate, improving the reconstruction effect of virtual viewpoint images, and giving strong practicability. In addition, a large amount of highly complex computation can be avoided during image reconstruction, improving the real-time performance of image reconstruction and the quality of the reconstructed image. Brief Description of the Drawings
FIG. 1 is a schematic diagram of the principle of hole generation in the prior art when a virtual viewpoint image is reconstructed from the video image collected by a single depth image acquisition device;
FIG. 2 is a schematic diagram of the principle of a parallel dual image acquisition device 3D video system;
FIG. 3 is a schematic diagram of the principle of depth image acquisition by a CCD image acquisition device equipped with an ultra-high-speed shutter and an intensity-modulated illuminator;
FIG. 4 is a basic structural diagram of an HDTV Axi-Vision image acquisition device;
FIG. 5 is a schematic flowchart of Embodiment 1 of a 3D video communication method according to the present invention;
FIG. 6 is a schematic flowchart of Embodiment 2 of a 3D video communication method according to the present invention;
FIG. 7 is a schematic diagram of the relationship between a scene and viewpoints according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of the relationship between a scene and image points according to an embodiment of the present invention;
FIG. 9 is a flowchart of Embodiment 1 of an image reconstruction method according to the present invention;
FIG. 10 is a flowchart of Embodiment 2 of an image reconstruction method according to the present invention;
FIG. 11 is a flowchart of Embodiment 3 of an image reconstruction method according to the present invention;
FIG. 12 is a flowchart of Embodiment 4 of an image reconstruction method according to the present invention;
FIG. 13 is a flowchart of Embodiment 5 of an image reconstruction method according to the present invention;
FIG. 14 is a schematic structural diagram of Embodiment 1 of a 3D video communication transmitting device according to the present invention;
FIG. 15 is a schematic structural diagram of Embodiment 2 of a 3D video communication transmitting device according to the present invention;
FIG. 16 is a schematic structural diagram of a video capture unit in an embodiment of a 3D video communication transmitting device according to the present invention;
FIGS. 17A-17C are schematic diagrams of combinations of image acquisition devices and their connections with an acquisition control module in an embodiment of a 3D video communication transmitting device according to the present invention;
FIG. 18 is a schematic structural diagram of Embodiment 1 of an image reconstruction system according to the present invention;
FIG. 19 is a schematic structural diagram of Embodiment 2 of an image reconstruction system according to the present invention;
FIG. 20 is a schematic structural diagram of Embodiment 3 of an image reconstruction system according to the present invention;
FIG. 21 is a schematic structural diagram of Embodiment 4 of an image reconstruction system according to the present invention;
FIG. 22 is a schematic structural diagram of Embodiment 5 of an image reconstruction system according to the present invention;
FIG. 23 is a schematic structural diagram of an embodiment of a 3D video communication system according to the present invention;
FIG. 24 is a schematic structural diagram of a receiving device in an embodiment of a 3D video communication system according to the present invention. Detailed Description of the Embodiments
Specific embodiments of the present invention are described in further detail below with reference to the drawings.
The embodiments of the present invention are based on the basic principles of 3D video: depth maps and color images of the scene are obtained, and 3D video images of any viewpoint can then be obtained by reconstruction. Specifically, the embodiments obtain the scene's depth map and several color images through image capture devices: a depth image capture device capable of outputting scene depth information provides the scene depth map, and ordinary image capture devices capable of outputting scene color/grayscale video information provide the color or grayscale video images of the scene.
The basic principle of 3D video is explained below with a parallel camera system as an example.
Fig. 2 is a schematic diagram of the principle of a parallel dual-capture-device 3D video system. As shown in Fig. 2, camera 1d1 and camera 1d2 are placed horizontally, the distance between them is B, and their distance to the photographed spatial point 1c is Z. The horizontal disparity then satisfies
$$d_x = \frac{fB}{Z}$$
where f is the focal length, Z is the distance between object 1c and the imaging plane, B is the distance between the optical centers of the two cameras, and d is the disparity/depth, i.e. the distance between the imaging pixels of the same spatial point in the two cameras; d includes the horizontal disparity $d_x$ and the vertical disparity $d_y$, and for a parallel camera system $d_y = 0$. It can be seen that the disparity of a 3D image depends on the distance Z to the observer. Therefore, once the imaging position of a spatial point in one image and that point's disparity/depth are known, its imaging position in the other image can be computed; as long as enough depth maps and color images of the scene are available, 3D video images of the scene at arbitrary viewpoints can be reconstructed.
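As a minimal illustration of this relation, the following Python sketch (the function name, sign convention and sample values are assumptions for illustration, not values from the embodiments) computes where a point seen in one image of a parallel pair should appear in the other image, given its depth:

```python
def corresponding_x(x_left, Z, f, B):
    """Horizontal image coordinate in the right view of a parallel
    stereo pair, given the left-view coordinate and the point's depth.

    Assumes the parallel-camera relation d_x = f * B / Z, with all
    quantities in consistent units (here f and d_x in pixels).
    """
    d_x = f * B / Z          # horizontal disparity; vertical disparity is 0
    return x_left - d_x      # assumed sign convention: right view shifts left

# Example: f = 1000 px, baseline B = 0.1 m, point at depth Z = 2 m
print(corresponding_x(x_left=640.0, Z=2.0, f=1000.0, B=0.1))  # -> 590.0
```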
Multi-Viewpoint (MV)/Free Viewpoint (FV) video is another current research focus in the video field. Several cameras can shoot the scene simultaneously from different angles, producing multiple video streams; these streams of different viewpoints are delivered to the user terminal, and the user can choose any viewpoint and direction from which to watch the scene. The chosen viewpoint may be a predefined, fixed camera viewpoint, or a virtual viewpoint whose image is synthesized from the images shot by the surrounding real cameras.
Furthermore, 3D video and multi-viewpoint/free-viewpoint video are not mutually exclusive and can be merged into one system: each viewpoint in a multi-viewpoint/free-viewpoint video system can be viewed in 2D or in 3D.
In current 3D video/multi-viewpoint video/free-viewpoint video systems, encoding and transmitting video as color image + depth map is a common approach. The main roles of the depth map are: (1) it enables efficient encoding and decoding of 3D video images; (2) it enables efficient reconstruction of virtual viewpoints, i.e. viewpoints at which no physical camera exists. Because a depth map allows images of other viewpoints to be reconstructed, transmitting one color image + depth map suffices for decoding images of multiple viewpoints; and since a depth map is a grayscale image, it compresses efficiently and markedly reduces the bitstream. In addition, stereoscopic/multi-view displays often need to show images of several positions simultaneously; with the color image + depth map approach, images of other viewing angles can be generated from the image of one angle, so multiple 2D images of different viewpoints need not be transmitted at once, saving considerable bandwidth.
To guarantee the accuracy and real-time availability of the scene depth map and improve the reconstruction of virtual-viewpoint scene video, the embodiments of the present invention use a depth image capture device to obtain the scene depth map, and a depth image capture device and/or ordinary image capture devices to obtain the scene color images. The depth image capture device thus provides a high-quality depth map, while the several color images from the ordinary image capture devices allow 3D video images of arbitrary viewpoints to be obtained. The principle of depth image capture devices is briefly introduced below.
Fig. 3 is a schematic diagram of depth image acquisition with a CCD image capture device equipped with an ultra-high-speed shutter and an intensity-modulated illuminator. Fig. 3 shows a snapshot of the spatial distribution of illumination intensity; the distribution increases linearly over time, and 2c and 2d show the trend of that spatial distribution at the same instant. The scene contains object 2a, a square object, and object 2b, a triangular object. The instantaneous illumination intensity $I_1$ of the light reflected from the nearer object 2a toward image capture device 2e is detected by the device's ultra-high-speed shutter 2f, producing the square distribution in image A; the light reflected from object 2b produces the triangular distribution in image A. Because object 2a is closer to image capture device 2e, the instantaneous intensity $I_1$ detected by the device is stronger than $I_2$, and the square image is brighter than the triangle; the brightness differences in the captured image A can therefore be used to detect object depth. However, the brightness of the reflected light is affected by the object's reflectivity, its distance from the device, the modulation index of the light source, the spatial non-uniformity of the illumination, and other parameters. An image B can then be captured in a manner where the illumination intensity distribution decreases linearly; combining image A with image B and applying signal processing removes these adverse effects and yields an accurate depth map.
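The dual-ramp idea this paragraph describes can be sketched as follows. This is only an illustrative approximation under stated assumptions: the function name, the exact normalization, and the claim that the ratio is monotone in distance are assumptions of this sketch, not the actual Axi-Vision signal chain, and calibration to metric depth is omitted.

```python
import numpy as np

def depth_proxy(img_a, img_b, eps=1e-6):
    """Per-pixel depth proxy from two exposures of the same scene:
    img_a taken while the illuminator intensity ramps up linearly,
    img_b taken while it ramps down linearly.

    Reflectance, vignetting and distance falloff scale both exposures
    identically, so the normalized ratio cancels them; what remains
    depends on the light's round-trip delay, i.e. on distance.
    """
    a = img_a.astype(np.float64)
    b = img_b.astype(np.float64)
    return b / (a + b + eps)   # in [0, 1]; larger ~ farther for a rising ramp
```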
Fig. 4 shows the basic construction of an HDTV Axi-Vision image capture device. As shown in Fig. 4, the High Definition Television (HDTV) Axi-Vision camera system includes a depth image processing unit and a color image processing unit. A near-infrared LED array serves as the intensity-modulated illuminator; it can be modulated directly and rapidly, and it emits at 850 nm, outside the visible range, so it does not interfere with visible light. Four LED units surround the camera lens and illuminate the photographed scene uniformly. There is also a visible light source, such as a fluorescent source, for illuminating the photographed object, whose spectrum extends beyond the near-infrared region. When light reflected from the object passes through the dichroic prism of the camera lens, the visible and near-infrared light are separated. The visible light enters the color image processing unit, which processes it to produce the color image of the object, i.e. the 2D image; this unit may be a color HDTV camera. The near-infrared light is processed by the depth image processing unit to produce the object's depth image. In the depth image processing unit, while the near-infrared light separated by the dichroic prism is focused onto the photocathode, a short bias pulse is applied between the photocathode and the Micro Channel Plate (MCP), realizing a nanosecond-scale shutter; each shutter opening produces an optical image of the object on the phosphor, which a relay lens focuses onto a high-resolution progressive CCD camera, converting it into a photoelectron image, and a signal processor finally forms the object's depth map. The shutter opens at the same frequency as the light modulation, to obtain a better Signal to Noise Ratio (SNR). It can be seen that a depth camera obtains a good depth map, and video images of strong quality can be reconstructed from the depth maps it captures.
Fig. 5 is a flowchart of Embodiment 1 of the 3D video communication method of the present invention. Specifically, as shown in Fig. 5, this embodiment may include the following steps:
Step 101: Acquire video image data of a scene captured by image capture devices, where the video image data includes at least one depth map and at least two color images, and is obtained by at least one image capture device capable of outputting scene depth information together with at least one image capture device capable of outputting scene color/grayscale video information, or by at least one image capture device capable of outputting both scene depth information and color/grayscale video information.
In this embodiment, the 3D video communication method may obtain the scene's depth map and color images through image capture devices: the depth map is captured by a depth image capture device capable of outputting scene depth information, and the color images by ordinary image capture devices capable of outputting scene color video information; an ordinary image capture device that can output grayscale images may also be used to obtain grayscale images. Specifically, video capture of the scene may use several image capture devices at different viewpoints; these may consist of at least one depth image capture device that obtains the scene depth map plus at least one ordinary image capture device that obtains a scene color image, or of at least one depth image capture device that can output both the scene's depth information and its color/grayscale video information. During capture, the scene's depth map and color images can be collected in real time, and the collected depth map and color images suffice for reconstructing 3D video images of any virtual viewpoint of the scene. In a capture setup composed of several image capture devices, a subset of devices with good viewpoint positions can be selectively controlled to shoot as needed, so as to obtain the required depth map and color images and avoid shooting duplicate or unnecessary scenes; before shooting, the devices' shooting positions can also be adjusted so as to capture scene video with a wider viewing angle.
Step 102: Encode the video image data to obtain coded video image data.
The scene video image data acquired in step 101 is encoded to obtain the coded video image data of the scene. Encoding the video image data facilitates its transmission and storage. Before encoding, the video image data may also undergo preprocessing such as rectification, to guarantee its accuracy and reliability.
Step 103: Send the coded video image data.
After the coded video image data is obtained, it can be sent to a video image receiving device, which decodes it, reconstructs the video images of the virtual viewpoints, and finally displays the video images of the various viewpoints on a display device. Specifically, the receiving device can display the required video images from the received coded video image data; during display, video images of various virtual viewpoints can be reconstructed and rendered to obtain scene video of different viewpoints. Video images of any viewpoint can also be displayed according to the viewer's needs. Because step 101 obtains the scene's depth map and color images through the depth image capture device and the ordinary image capture devices, the depth map is accurate and reliable; during virtual-viewpoint reconstruction, the several color images can be used to repair the hole regions in the reconstructed image, improving reconstruction quality. Reconstructed images of arbitrary virtual viewpoints can thus be obtained, of good quality and reflecting the true appearance of the scene.
In this embodiment of the present invention, the scene depth map is captured by an image capture device that outputs it directly, so the obtained depth map is accurate, reliable and strongly real-time; the virtual-viewpoint video images obtained from it are of good quality, accurate, and reflect the true appearance of the scene. At the same time, the several color images of the scene obtained by the image capture devices can repair the holes that arise when reconstructing from a single color image, making the reconstructed video images more accurate, improving virtual-viewpoint reconstruction, and giving strong practicality.
Fig. 6 is a flowchart of Embodiment 2 of the 3D video communication method of the present invention. Specifically, as shown in Fig. 6, this embodiment may include the following steps:
Step 201: Control the image capture devices so that their image capture of the scene is synchronized.
In this embodiment, several image capture devices may be set up at different viewpoint positions; they may include at least one depth image capture device capable of outputting scene depth information and at least one ordinary image capture device capable of outputting scene color/grayscale video information, or at least one depth image capture device capable of outputting both. According to actual needs, any number of depth image capture devices and ordinary image capture devices may be deployed before video capture, as long as the captured scene video image data includes at least one depth map and at least two color images. In this step, during image capture of the scene, the devices are controlled to shoot and capture synchronously, guaranteeing the synchronization of the captured video images and avoiding large differences between images captured at the same instant from the same or different viewpoints; for fast-moving objects in particular, synchronous capture yields better video.
Furthermore, before image data capture, the image capture devices may be placed at different positions to obtain the best shooting angles, so that video with a wider viewing angle is captured, guaranteeing the reconstruction and display of the 3D video images of the various viewpoints and improving virtual-viewpoint reconstruction. For example, in actual shooting, the depth image capture device may be placed in the middle of the ordinary image capture devices; this gives a wider shooting angle, and scene video with a wide viewing angle can likewise be obtained when reconstructing virtual-viewpoint video images.
In this step, when the image capture devices shoot synchronously, a synchronization signal may also be generated under control, and the devices' image capture of the scene synchronized according to that signal. Specifically, the synchronization signal may be produced by a hardware or software clock, or the video output signal of one of the image capture devices in the capture process may serve as the synchronization signal. Synchronous capture may be controlled either by feeding the synchronization signal directly into each device's external synchronization interface, or by having the capture control module synchronize all devices uniformly; synchronization can be achieved at frame level or at line/field level.
Step 202: Perform image-capture-device calibration on the video images captured by the devices, obtaining each device's intrinsic and extrinsic parameters.
Because images shot by multiple image capture devices are usually not scan-line aligned, they do not conform to the imaging model of the human eye and cause visual fatigue when viewed. The captured images therefore need to be rectified into images conforming to the human-eye imaging model, and the intrinsic and extrinsic parameters obtained through calibration serve as the basis for rectifying the captured video images. Calibration of the image capture devices may use traditional calibration methods or self-calibration. Traditional methods include direct linear transformation (DLT) calibration, radial alignment constraint (RAC) based calibration and planar calibration. Their basic principle is to use a calibration reference object to set up linear equations of the device's imaging model, measure the world coordinates of a set of points on the reference object and their corresponding coordinates on the imaging plane, then substitute these coordinates into the equations to solve for the intrinsic and extrinsic parameters. Self-calibration means calibrating the device without a reference object, purely from the correspondences between image points; it relies on special constraints that exist between imaging points across several images, such as the epipolar constraint, and therefore needs no structural information about the scene.
Through calibration, calibration information including the devices' intrinsic and extrinsic parameters is obtained; with these parameters, the video images shot by each device can be rectified so that they conform better to the human-eye imaging model, and the rectified video gives a better visual experience.
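As one concrete instance of the traditional reference-object calibration described above (not the specific procedure of the embodiments), the following Python sketch uses OpenCV's standard chessboard calibration; the file names, board geometry and square size are assumed for illustration:

```python
import cv2
import numpy as np

# 9x6 inner-corner chessboard; square size in metres (assumed values).
pattern, square = (9, 6), 0.025
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts = [], []
for path in ["calib_00.png", "calib_01.png", "calib_02.png"]:  # sample views
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)      # known world coordinates on the board
        img_pts.append(corners)   # measured imaging-plane coordinates

# Intrinsics (camera matrix K, distortion) and per-view extrinsics (R, t).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
```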
Step 203: According to the intrinsic and extrinsic parameters, establish the correspondence between the video images captured by each device and that device's attributes, and take this as the scene's video image data; the device attributes include the device's intrinsic parameters, extrinsic parameters and the capture timestamp of each video frame.
The correspondence between the video images and each device's attributes is established from the intrinsic and extrinsic parameters and output as the scene's video image data; the device attributes include the device's intrinsic parameters, extrinsic parameters and the capture timestamp of each video frame. With this correspondence established, the video images can be rectified according to the device attributes.
Step 204: Rectify the video image data according to the image-capture-device attributes, obtaining rectified video image data.
According to the device attributes and the correspondence between the video images and those attributes, the video image data is rectified to obtain rectified video image data. Specifically, rectification of the video images may include:
(1) Rectifying the color images and depth maps according to the calibration parameters so that they are aligned. To facilitate image reconstruction at a viewpoint, the content of that viewpoint's color image and depth map should be identical; but the positions of the ordinary image capture device and the depth image capture device cannot coincide exactly, so the calibration results are used to transform the color image and the depth map so that they coincide fairly precisely.
(2) Adjusting the brightness and chroma differences between color images caused by different device settings, so that the color images obtained by different devices are color-consistent, eliminating inter-device image differences.
(3) Correcting the color images or depth maps according to the devices' calibration parameters, e.g. correcting radial distortion.
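A minimal Python sketch of items (2) and (3) above, using OpenCV: K and dist are assumed to come from the calibration step, the file names are illustrative, and the per-channel mean matching is a deliberately crude stand-in for a real color-consistency adjustment:

```python
import cv2

# K and dist come from the calibration step above (assumed available).
raw = cv2.imread("color_view1.png")
corrected = cv2.undistort(raw, K, dist)   # removes radial/tangential distortion

# Crude brightness/chroma equalization between cameras (illustrative):
# scale each channel so its mean matches a reference camera's channel mean.
ref = cv2.imread("color_ref.png")
for c in range(3):
    gain = ref[:, :, c].mean() / max(corrected[:, :, c].mean(), 1e-6)
    corrected[:, :, c] = cv2.convertScaleAbs(corrected[:, :, c], alpha=gain)
```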
Step 205: Encode the rectified video image data to obtain coded video image data.
This embodiment may encode the rectified color image and depth map data with codec standards such as MPEG-4 or H.264, where the description of depth may follow the MPEG standard. There are several methods for encoding color image + depth map data. For example, a layer-based 3D video coding method combines the SEI messages of the H.264 standard with the idea of layered coding: one channel's video data (e.g. the color image data) is encoded conventionally into a base layer containing only I and P frames, and the other channel's data (e.g. the depth map data) is then encoded entirely as P frames, whose reference frame during prediction is the previous frame of the same channel or the corresponding base-layer frame. Decoding then has good 2D/3D compatibility: for traditional 2D display, only the base-layer data need be decoded; for 3D display, everything is decoded. The receiving/display user can thus choose 2D or 3D display and control the video decoding module to decode accordingly.
Step 206: Packetize the coded video image data, encapsulate it into data packets and send them.
Before the video images are sent, the coded video image data may be packetized and encapsulated into data packets sent to the video image receiving device, which processes the received packet data accordingly. The data may be sent over existing networks, such as the Internet.
Furthermore, packetizing and sending the coded video image data in step 206 of this embodiment may specifically include the following steps:
Step 2061: Multiplex the coded video image data to obtain multiplexed coded video image data.
This step may multiplex several coded video data streams frame by frame or field by field. When multiplexing field by field, one video stream may be encoded as the odd field and another as the even field, and the odd and even fields transmitted together as one frame.
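A minimal numpy sketch of the field multiplexing just described, under the assumption that both streams have equal frame sizes; a real system would do this inside the codec or container rather than on raw pixels:

```python
import numpy as np

def mux_fields(frame_a, frame_b):
    """Pack two equally sized frames into one frame for transmission:
    frame_a supplies the odd field (lines 1, 3, 5, ... in 1-based terms),
    frame_b the even field. Each stream contributes one field."""
    assert frame_a.shape == frame_b.shape
    out = np.empty_like(frame_a)
    out[0::2] = frame_a[0::2]   # odd field
    out[1::2] = frame_b[1::2]   # even field
    return out

def demux_fields(muxed):
    """Recover the two fields; the missing lines of each stream would
    then be restored by de-interlacing at the receiver."""
    return muxed[0::2], muxed[1::2]
```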
Step 2062: Packetize the multiplexed coded video image data, encapsulate it into data packets and send them.
In addition, this embodiment may receive coded speech data, system command data and/or file data, packetize them and send them together with the coded video image data. It may also receive externally input control information, including viewing-viewpoint, display-mode and display-distance information; according to this control information, the devices' shooting can be adjusted and image capture devices with better shooting angles selected to capture the scene video, e.g. adjusting the devices' shooting angles or the number of devices shooting, which improves the practicality of video capture. The video receiving device can receive the coded video image data over a network or the like and process it accordingly, e.g. demultiplexing, decoding, reconstructing, rendering and displaying the received video image data; it can also decode received coded speech data, store received file data, and execute specific operations according to system command data, e.g. displaying the received video images in the display mode specified in a system command.
The video image receiving device can reconstruct the scene video images of any virtual viewpoint from the received depth map and color images of the scene. Because the scene depth map is captured by a depth image capture device, it is accurate and reliable, and several color or grayscale images of the scene can be obtained from multiple ordinary or depth image capture devices; when displaying the scene video of the various viewpoints, the several color images can repair the hole regions that arise when reconstructing from only one color image, improving viewpoint-video reconstruction. Meanwhile, capturing the scene depth map and color images with image capture devices has strong real-time performance, and the captured video image data is highly practical.
In this embodiment, the image capture devices are controlled to capture the scene video synchronously and are calibrated, yielding synchronized video image data and the devices' calibration information, and the captured video images are rectified according to the calibration information, making video image processing more accurate. At the same time, encoding the video images makes storing and transmitting them more convenient, which matters for large volumes of video image data. This embodiment further improves the precision of video capture and processing and the quality of the reconstructed images; moreover, video image capture can be effectively controlled according to the input control information, improving the practicality of video capture.
After the depth maps and color images of the viewpoints of the scene are obtained, the embodiments of the present invention can reconstruct images at virtual positions from the color images and depth maps of the known viewpoints. Fig. 7 is a schematic diagram of the relation between the scene and the viewpoints in an embodiment of the present invention; Fig. 8 is a schematic diagram of the relation between the scene and the image points. As shown in Fig. 7, scene images are shot with image capture devices at known viewpoint 1 and known viewpoint 2, a depth image capture device is placed at known viewpoint 1 to obtain the scene depth map, and scene images of virtual viewpoints between known viewpoints 1 and 2 (such as virtual viewpoint 1 and virtual viewpoint 2) are then obtained by computation. As shown in Fig. 8, let a spatial point M(X, Y, Z) have image points $(x_1, y_1)$ and $(x_2, y_2)$ in the two image capture devices. With the baseline length B and the focal length f known, the depth Z can be computed:
$$Z = \frac{fB}{x_1 - x_2}$$
The disparity between the images obtained by the two image capture devices is then
$$v = x_1 - x_2 = \frac{fB}{Z}$$
and the disparity $v_{x_2 x_0}$ at an intermediate virtual viewpoint $x_0$ scales in proportion to the distance between $x_0$ and the known viewpoint.
The embodiments of the present invention can reconstruct the scene image at $x_0$ given the known viewpoints $x_1$ and $x_2$, the depth Z, and the viewpoint $x_0$. From the formulas above, reconstructing the scene image at $x_0$ requires knowing $x_2$ and the disparity $v_{x_2 x_0}$. The depth information Z of the depth map obtained by the depth image capture device only has relative meaning: it can express the depth relations of the scene, but it is not disparity information with actual physical meaning. For reconstruction, this scene depth information without actual meaning must be converted into disparity information with actual meaning, i.e. $v_{x_2 x_0}$ must be computed from the depth Z.
The relation between depth and disparity is
$$Z = \frac{fB}{v}$$
During shooting, the system parameters, namely the camera focal length f and the distance between the two cameras' optical centers, are constant, so the product fB is constant. Once fB is determined, the conversion from depth to disparity can be carried out, and its time cost is essentially negligible. Compared with obtaining disparity through matching algorithms, the real-time advantage is obvious.
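Under these definitions, the depth-to-disparity conversion is a single elementwise operation. A minimal Python sketch, assuming the factor fB (called phi here) has already been determined:

```python
import numpy as np

def depth_to_disparity(depth, phi):
    """Convert a depth map to a disparity map via v = phi / Z.

    phi is the depth-disparity correspondence factor f*B (focal length
    times baseline), constant for a fixed rig, so the conversion is one
    elementwise division per pixel; no stereo matching is involved.
    """
    depth = np.asarray(depth, dtype=np.float64)
    return phi / np.maximum(depth, 1e-6)   # guard against zero depth
```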
In the description of the image reconstruction method and image reconstruction system embodiments of the present invention, for convenience of explanation, known viewpoint 1 is the left viewpoint and known viewpoint 2 the right viewpoint. Accordingly, the image the capture device obtains at known viewpoint 1 is the left image, the image obtained at known viewpoint 2 is the right image, the depth information obtained at known viewpoint 1 is the left depth map, and the depth information obtained at known viewpoint 2 is the right depth map.
The idea of the image reconstruction method embodiments of the present invention is first explained in detail with an example, described for a configuration of ordinary image capture devices and two depth image capture devices; it will be understood that other configurations also fall within the protection scope of the present invention. The two image capture devices are placed in parallel, and the optical centers of the depth image capture device and the ordinary image capture device should coincide as closely as possible. If the distance between their optical centers is large, the captured images will not coincide completely; registration is then needed, so that each point in the image obtained by the depth image capture device has exactly the same position as its corresponding point in the image obtained by the ordinary image capture device. If a scene point has imaging coordinates $(x_1, y_1)$ in the ordinary image capture device and $(x_{d1}, y_{d1})$ in the depth image capture device, then
$$x_1 - x_{d1} = 0, \qquad y_1 - y_{d1} = 0$$
For the image information obtained by the ordinary image capture devices, the images must be rectified to a parallel state before reconstruction, i.e. the two images have only horizontal disparity and no vertical disparity. If a scene point has imaging coordinates $(x_1, y_1)$ and $(x_2, y_2)$ in the two ordinary image capture devices, then
$$y_1 = y_2, \qquad x_1 - x_2 = d$$
where d is the disparity between the left and right images (known viewpoint 1 and known viewpoint 2).
Next, the depth information Z obtained by the depth image capture device, which has no actual physical meaning, must be converted into disparity information $v_x$ with actual meaning, that is, the value of the constant fB in
$$Z = \frac{fB}{v_x}$$
must be determined. For this, the embodiments of the present invention preferably use a method based on feature-point matching to obtain the disparities between feature points of the two images. Because the depth image actually obtained by the depth image capture device contains noise, the embodiments preferably compute N feature points and take the average to remove the noise, obtaining a more precise value of fB.
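A minimal Python sketch of this averaging, assuming the N feature points have already been matched and their disparities measured: since $v_i = fB/Z_i$, each point gives one estimate $v_i Z_i$ of the factor, and the mean over the N points suppresses depth-sensor noise. The function name and sample numbers are illustrative:

```python
import numpy as np

def estimate_phi(disparities, depths):
    """Estimate the depth-disparity factor phi = f*B from N matched
    feature points: for each point v_i = phi / Z_i, so phi = v_i * Z_i;
    averaging over the N points suppresses depth-sensor noise."""
    v = np.asarray(disparities, dtype=np.float64)
    z = np.asarray(depths, dtype=np.float64)
    return float(np.mean(v * z))

# Illustrative values: 4 feature points with measured disparity and depth
print(estimate_phi([50.1, 49.7, 50.4, 49.8], [2.0, 2.02, 1.99, 2.01]))
```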
The depth-disparity correspondence factor $\varphi = fB$ relating depth and disparity is thus determined, and the depth information of every point in the depth map can then be converted into disparity information:
$$v_x = \varphi \cdot u, \qquad u = \frac{1}{Z}$$
Applying the above method to the left depth map and the right depth map separately determines the depth-disparity correspondence factor $\varphi_l$ of the left depth map and $\varphi_r$ of the right depth map, yielding the left disparity map and the right disparity map. For a virtual intermediate viewpoint $x'$ whose distance from the optical center of the left ordinary image capture device (camera) is D, the disparity $v_x'$ between that viewpoint and the left camera is
$$v_x' = \frac{D}{B}\, v_x$$
Therefore, every point of the virtual intermediate viewpoint can be computed from its disparity with the left camera image.
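As a worked numeric instance of these relations (the numbers are illustrative assumptions, not values from the embodiments): with focal length $f = 1000$ pixels, baseline $B = 0.10$ m and a point at depth $Z = 2$ m,
$$v_x = \frac{fB}{Z} = \frac{1000 \times 0.10}{2} = 50 \ \text{px}, \qquad v_x' = \frac{D}{B}\, v_x = \frac{0.04}{0.10} \times 50 = 20 \ \text{px}$$
for a virtual viewpoint at $D = 0.04$ m from the left camera.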
Fig. 9 is a flowchart of Embodiment 1 of the image reconstruction method of the present invention. In this embodiment, a depth image capture device that can output both a depth map and a color image may be placed at the first viewpoint, and an ordinary image capture device that can output a color image at the second viewpoint; the depth image capture device obtains the depth map and color image of the first viewpoint, and the ordinary image capture device obtains the color image of the second viewpoint. Specifically, the method may include the following steps:
Step 310: Obtain a first color image of a known first viewpoint and a second color image of a known second viewpoint.
Color images of the scene are obtained at the known first viewpoint and the known second viewpoint through ordinary image capture devices.
Step 320: Obtain a first depth map of the known first viewpoint.
Step 330: Determine a first depth-disparity correspondence factor of the first depth map.
The first depth-disparity correspondence factor of the first depth map is determined according to the first color image, the second color image and the first depth map.
The detailed procedure for determining the depth-disparity correspondence factor has been described in detail above and is not repeated here.
Step 340: Perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information.
The principle and procedure of performing the depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain the first disparity information have been described in detail above and, for brevity, are not repeated here.
Step 350: Reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
Once the depth-disparity factor of step 330 has been determined, it can simply be reused in each subsequent image reconstruction without being determined anew; in other words, after the depth-disparity correspondence factor has been determined, step 330 no longer needs to be performed.
In this embodiment of the present invention, the depth map is obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm. Large amounts of highly complex computation are thereby avoided, improving the real-time performance of image reconstruction; and because stereo matching is no longer used for reconstruction, there is no inter-frame flicker, improving the quality of the reconstructed images.
Fig. 10 is a flowchart of Embodiment 2 of the image reconstruction method of the present invention. Building on the technical solution of Embodiment 1 above, if the points in the image obtained by the ordinary image capture device do not coincide with the corresponding points in the depth image obtained by the depth image capture device, or the two images obtained by the ordinary image capture devices are not parallel, the method of this embodiment may further include steps 313 and 314 before the depth-disparity correspondence factor is determined. Specifically, the method of this embodiment may include the following steps:
Step 311: Obtain a first color image of a known first viewpoint and a second color image of a known second viewpoint.
Color images of the scene are obtained at the known first viewpoint and the known second viewpoint through ordinary image capture devices.
Step 312: Obtain a first depth map of the known first viewpoint.
Step 313: Rectify the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image.
Step 313 may also be performed directly after step 311; this embodiment of the present invention places no limitation on the order.
Step 314: Register the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map.
Step 315: Determine a first depth-disparity correspondence factor of the first depth map.
The first depth-disparity correspondence factor of the first depth map is determined according to the first color image, the second color image and the first depth map.
The detailed procedure for determining the depth-disparity correspondence factor has been described in detail above and is not repeated here.
Step 316: Perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information.
The principle and procedure of this conversion have been described in detail above and, for brevity, are not repeated here.
Step 317: Reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
Once the depth-disparity factor of step 315 has been determined, it can simply be reused in each subsequent image reconstruction without being determined anew; in other words, after the depth-disparity correspondence factor has been determined, step 315 no longer needs to be performed.
In this embodiment of the present invention, the depth map is obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm. Large amounts of highly complex computation are thereby avoided, improving the real-time performance of image reconstruction; and because stereo matching is no longer used for reconstruction, there is no inter-frame flicker, improving the quality of the reconstructed images.
Fig. 11 is a flowchart of Embodiment 3 of the image reconstruction method of the present invention. Specifically, as shown in Fig. 11, it includes the following steps:
Step 410: Obtain a first color image of a known first viewpoint and a second color image of a known second viewpoint.
Color images of the scene are obtained at the known first viewpoint and the known second viewpoint through ordinary image capture devices.
Step 420: Obtain a first depth map of the known first viewpoint and a second depth map of the known second viewpoint.
Depth maps of the scene at the known first viewpoint and the known second viewpoint are obtained through depth image capture devices.
Step 430: Determine a first depth-disparity correspondence factor of the first depth map and a second depth-disparity correspondence factor of the second depth map.
The first depth-disparity correspondence factor of the first depth map is determined according to the first color image, the second color image and the first depth map.
The second depth-disparity correspondence factor of the second depth map is determined according to the first color image, the second color image and the second depth map.
The method for determining the depth-disparity correspondence factors has been described in detail above and is not repeated here.
Step 440: Perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information, and perform depth-disparity conversion on the second depth map according to the second depth-disparity correspondence factor to obtain second disparity information.
The principle and procedure of performing depth-disparity conversion on a depth map according to a depth-disparity correspondence factor to obtain disparity information have been described in detail above and, for brevity, are not repeated here.
Step 450: Reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information, and reconstruct a fourth image of the virtual viewpoint according to the second color image and the second disparity information.
Step 460: Perform hole filling according to the third image and the fourth image, generating a fifth image of the virtual viewpoint.
If the points in the images obtained by the ordinary image capture devices do not coincide with the corresponding points in the depth images obtained by the depth image capture devices, or the two images obtained by the ordinary image capture devices are not parallel, the following steps are also included before the depth-disparity correspondence factors are determined:
rectifying the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image;
registering the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map; and
registering the second color image and the second depth map so that the points in the second color image coincide with the corresponding points in the second depth map.
In this embodiment of the present invention, the depth maps are obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm; large amounts of highly complex computation are avoided, the real-time performance of image reconstruction improves, and the quality of the reconstructed images improves. Furthermore, by obtaining enough depth maps of the scene, the occlusion problem within the scene is solved, a problem that cannot be solved when reconstructing images with stereo matching algorithms.
It will be understood that, in the image reconstruction method embodiments of the present invention, virtual-viewpoint images may be reconstructed from the color images and depth maps of two viewpoints; color images and depth maps of more viewpoints may equally be used, and the reconstruction follows the same principle as with two viewpoints.
Fig. 12 is a flowchart of Embodiment 4 of the image reconstruction method of the present invention. Specifically, as shown in Fig. 12, the method of this embodiment may include the following steps:
Step 510: Obtain a color image of a known viewpoint.
Step 520: Obtain a depth map of the known viewpoint.
Step 530: Perform depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map.
Before performing the depth-disparity conversion on the depth map to obtain its corresponding disparity information, it is also necessary to:
determine the depth-disparity correspondence factor of the depth map,
so that the depth-disparity conversion is performed on the depth map according to that factor to obtain the disparity information corresponding to the depth map.
Step 540: Reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
This embodiment uses the color image and depth map of one known viewpoint. Its application scenario is generating other virtual-viewpoint images with small parallax, usable for stereoscopic display. With a single known viewpoint, no rectification of the color image is needed.
Fig. 13 is a flowchart of Embodiment 5 of the image reconstruction method of the present invention. Specifically, as shown in Fig. 13, the method may include the following steps:
Step 511: Obtain a color image of a known viewpoint.
Step 512: Obtain a depth map of the known viewpoint.
Step 513: Register the color image of the known viewpoint and the depth map of the known viewpoint so that the points in the depth map coincide with the corresponding points in the color image.
Step 514: Determine the depth-disparity correspondence factor of the depth map.
The depth-disparity correspondence factor in this embodiment has no actual physical meaning; it can be chosen according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
Step 515: Perform depth-disparity conversion on the depth map according to the depth-disparity correspondence factor to obtain the disparity information corresponding to the depth map.
Step 516: Reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
This embodiment uses the color image and depth map of one known viewpoint. Its application scenario is generating other virtual-viewpoint images with small parallax, usable for stereoscopic display. With a single known viewpoint, no rectification of the color image is needed, but the color image and the depth map must be registered; the registration procedure is the same as in the preceding embodiments. In this embodiment the depth-disparity correspondence factor still needs to be determined, but it has no actual physical meaning and can be chosen according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
Fig. 14 is a structural diagram of Embodiment 1 of the 3D video communication sending device of the present invention. The 3D video communication sending device of this embodiment includes a video capture unit 11, a video coding unit 12 and a video output unit 13. The video capture unit 11 is configured to acquire video image data of a scene captured by image capture devices, where the video image data includes at least one depth map and at least two color images, and the video capture unit 11 includes at least one depth image capture device capable of outputting scene depth information and at least one ordinary image capture device capable of outputting scene color/grayscale video information, or includes at least one depth image capture device capable of outputting both scene depth information and color/grayscale video information. The video coding unit 12 is configured to encode the video image data to obtain coded video image data. The video output unit 13 is configured to receive the coded video image data from the video coding unit 12 and send it out.
In this embodiment, the depth image capture device in the video capture unit 11 may obtain the scene's depth map and/or color image, and the ordinary image capture devices obtain the scene's color images; the obtained depth map and color images of the scene are then passed as 3D video image data to the video coding unit 12, which encodes the captured video image data to obtain the coded video image data of the scene and sends it to the video output unit 13, which sends it to the video image receiving device. In this embodiment the scene depth map is captured by a depth image capture device, so it is accurate and reliable; at the same time, several color or grayscale images of the scene can be obtained by the depth image capture device and/or ordinary image capture devices. 3D video image data of any viewpoint can thus be obtained when reconstructing virtual-viewpoint 3D video of the scene: the depth map and color image captured by the depth image capture device are used to reconstruct the virtual viewpoint, and the color images captured by the ordinary image capture devices then repair the reconstructed image and eliminate any hole regions, so that the reconstructed image matches the true appearance of the scene better and satisfies the user's visual expectations. Moreover, during capture, the depth image capture device and the ordinary image capture devices can be set at suitable shooting viewpoints, so that the captured scene images cover a wider viewing angle, virtual-viewpoint images over a wider angular range can be reconstructed, and the reconstruction quality is good.
In this embodiment, the scene depth map is obtained by a depth image capture device, so it is accurate, reliable and strongly real-time, and the 3D video images of any virtual viewpoint obtained from it are correspondingly more accurate and reflect the true appearance of the scene. At the same time, several color images of the scene are obtained through the depth image capture device and ordinary image capture devices, so that when reconstructing virtual-viewpoint 3D video, 3D video data of a wide range of viewpoints is available and the hole regions arising when reconstructing from one color image can be repaired, making the reconstructed 3D video more accurate and truer to the scene, improving virtual-viewpoint reconstruction and giving the 3D video communication sending device of this embodiment strong practicality.
Fig. 15 is a structural diagram of Embodiment 2 of the 3D video communication sending device of the present invention; Fig. 16 is a structural diagram of the video capture unit in the sending-device embodiments of the present invention; Figs. 17A-17C are schematic diagrams of combinations of image capture devices and their connections to the capture control module in the sending-device embodiments of the present invention. Building on Embodiment 1 above, as shown in Fig. 16, the video capture unit 11 of this embodiment may include a depth image capture device 110 that outputs the scene depth map, or a depth image capture device 111 that outputs both the scene depth map and a color image, and also ordinary image capture devices 112 that output color or grayscale images of the scene. The video capture unit 11 of this embodiment further includes at least one capture control module 113, configured to control the image capture devices connected to it to shoot the scene and to collect and output the video image data of the shot scene. As shown in Figs. 17A-17C, the depth image capture device 111 can output both the depth map and a color image of the scene, the ordinary image capture device 112 can output only a color or grayscale image, and the depth image capture device 110 can output only the depth map. The capture control module 113 may be connected to combinations of the image capture devices in the following ways:
(a) As shown in Fig. 17A, the capture control module 113 is connected to one depth image capture device 111 and one ordinary image capture device 112.
(b) As shown in Fig. 17B, the capture control module 113 is connected to one depth image capture device 110 and two ordinary image capture devices 112.
The depth image capture device 110 and the ordinary image capture devices 112 may be placed arbitrarily, but to obtain the widest viewing angle, the depth image capture device 110 may be placed between the two ordinary image capture devices 112; the depth map and color images of the scene then cover a wider angle, virtual-viewpoint 3D video images over a wider range can be reconstructed, and the synthesized 3D video images of the virtual viewpoints are better.
(c) As shown in Fig. 17C, the capture control module 113 is connected to two or more depth image capture devices 111.
Several depth image capture devices 111 obtain more depth maps of the scene, together with the color images corresponding to those depth maps. A larger scene range is therefore available when reconstructing virtual viewpoints, and the video data obtained by the different depth image capture devices can cross-reference one another, improving the precision of virtual-viewpoint reconstruction.
The connections above between the capture control module 113 and the device combinations are only the most basic forms; devices may be combined or added arbitrarily as actually needed, so as to obtain better 3D video data of the scene, provided that during scene video capture the output video image data includes at least one depth map and several color images of the scene.
As shown in Fig. 16, to lower the deployment cost of the system while guaranteeing capture quality, this embodiment builds the video capture unit 11 as a mix of the two basic combinations (a) and (b) above, comprising two capture control modules 113: one capture control module 113 is connected to one depth image capture device 111 and one ordinary image capture device 112, and the other to one depth image capture device 110 and one ordinary image capture device 112. During scene video capture, the shooting viewpoint positions of the devices can be allocated sensibly so that every captured depth map and color image of the scene has a good viewing angle, guaranteeing the reconstruction of the scene's virtual-viewpoint images. It will be understood that the more image capture devices each capture control module 113 connects and the more capture control modules 113 are deployed, the more depth maps and color images of the scene are obtained, the wider the captured viewing angle, and the better the reconstruction of the scene's virtual-viewpoint video images; a suitable device combination and connection scheme can be chosen as actually needed.
In this embodiment, as shown in Fig. 16, the video capture unit 11 may further include a synchronization module 114 and a calibration module 115. The synchronization module 114 is configured to generate a synchronization signal and output it to the capture control module 113, which synchronizes the devices' shooting of the scene; or to output the synchronization signal to the devices' external synchronization interfaces to synchronize their shooting of the scene, the signal being generated by the synchronization module 114 itself or being the video output signal of one image capture device in the capture process. The calibration module 115 is configured to receive the video images captured by the devices, perform image-capture-device calibration on the captured video images, obtain each device's intrinsic and extrinsic parameters, and send these to the capture control module 113. The capture control module 113 is further configured to establish, from the intrinsic and extrinsic parameters, the correspondence between the captured video images and each device's attributes and output it as the scene's video image data; the device attributes include the device's intrinsic and extrinsic parameters and the capture timestamp of each video frame. Through the synchronization module 114, synchronized capture across devices is achieved, guaranteeing the synchronization of the captured video images. In addition, calibration yields the devices' intrinsic and extrinsic parameters, which serve as the reference basis for rectifying the video images shot by the different devices, guaranteeing the quality of virtual-viewpoint reconstruction.
As shown in Fig. 15, the 3D video image communication sending device of this embodiment may further include a preprocessing unit 14, configured to receive from the capture control module 113 the video image data comprising the video images captured by each device, the device attributes, and the correspondence between the video images and each device's attributes, rectify the video image data according to the devices' intrinsic and extrinsic parameters, and output the rectified video image data; the video coding unit 12 may receive the video image data rectified by the preprocessing unit 14 and encode the rectified video image data. Each capture control module 113 is connected to its own corresponding preprocessing unit 14; this guarantees fast and accurate processing of the video image data collected by every capture control module 113 and improves the efficiency of data processing.
Furthermore, as shown in Fig. 15, in this embodiment the video output unit 13 may also include an output processing module 131 and an output interface module 132. The output processing module 131 is configured to receive the coded video image data from the video coding unit 12, packetize it and encapsulate it into data packets; the output interface module 132 is configured to send out the packetized, encapsulated packet data. This embodiment may further include a multiplexing unit 15, configured to multiplex the coded video image data to obtain multiplexed data; the output processing module 131 may also be configured to receive the multiplexed data, packetize it and encapsulate it into data packets.
This embodiment may further include an audio coding unit, a system control unit and a user data unit. The audio coding unit is configured to encode speech data and send it to the output processing module 131; the system control unit is configured to send command data to the output processing module 131; the user data unit is configured to send file data to the output processing module 131. The output processing module 131 may also be configured to packetize received coded speech data, command data and/or file data, encapsulate them into data packets and send them to the output interface module 132. Through the audio coding unit, local speech information can be transmitted to the video receiving end together with the video information, improving the practicality of the 3D video; in addition, local file data, command information and the like can be sent to the video receiving end, satisfying users' varied needs. This embodiment may also include a control input unit 16 connected to the capture control module 113 in the video capture unit 11, configured to acquire control information and send it to the capture control module. The control information may include the viewing or display viewpoint, display distance and display mode, and may be input by the user through a Graphical User Interface (GUI) or a remote-control device. The capture control module 113 can be controlled according to this information: if the display mode requires only 2D video display, the capture control module 113 may be asked to select only the ordinary image capture devices for shooting and capturing the scene; if 3D video display is required, the depth image capture device and the ordinary image capture devices shoot and capture together. According to the viewing or display viewpoint, a subset of image capture devices can be selected for shooting and image capture, improving capture efficiency and avoiding the collection of excessive useless or duplicate data, which would burden data transmission and processing.
For a better understanding of the embodiments of the present invention, the main functional modules or units in the embodiments are described in detail below.
Capture control module 113
The capture control module is configured to control the image capture devices connected to it to capture and output video images. It can convert analog image signals into digital video image signals or receive digital image signals directly, and can store the captured image data frame by frame in the capture control module's buffer. In addition, it can provide the captured video data to the calibration module for image-capture-device calibration; the calibration module returns the obtained calibration information, the devices' intrinsic and extrinsic parameters, to the corresponding capture control module, which then establishes a one-to-one correspondence between the video image data and the attributes of the capturing device according to that calibration information. The device attributes include the device's unique code, its intrinsic and extrinsic parameters and the capture timestamp of each frame; the module outputs the device attributes and the video image data in a defined format. Meanwhile, the capture control module can also use the devices' calibration information to pan/rotate/zoom the devices through their remote-control interfaces, and can supply suitable synchronization signals to the devices through their synchronization interfaces to control synchronized capture. Based on the viewing or display viewpoint received by the control input unit, it can also select a subset of devices for capture and switch off capture on depth image capture devices that are not needed, avoiding duplicate or useless capture.
Synchronization module 114
The synchronization module is configured to control the synchronized capture of multiple image capture devices. For fast-moving objects, synchronized capture is essential; otherwise images of different viewpoints, or of the same viewpoint, differ greatly at the same instant and the 3D video the user sees is distorted. The synchronization module can generate a synchronization signal with a hardware or software clock and output it to the devices' external synchronization interfaces to control synchronized capture, or output it to the capture control module, which controls synchronized capture of the devices over the control lines. The synchronization module can also feed the video output signal of one image capture device into the other devices as the control signal for synchronized capture. Synchronized capture can achieve frame synchronization or line/field synchronization.
Calibration module 115
The calibration module mainly implements image-capture-device calibration, i.e. obtaining the devices' intrinsic and extrinsic parameters. Intrinsic parameters include the device's image center, focal length, lens distortion and the like; extrinsic parameters include the rotation and translation of the device's position. Because images shot by multiple devices are usually not scan-line aligned, they do not conform to the human-eye imaging model and cause visual fatigue when viewed; the captured images must therefore be rectified into images conforming to the human-eye imaging model, and the intrinsic and extrinsic parameters obtained by calibration serve as the basis for that rectification.
The imaging model can be written, in standard pinhole-camera form consistent with the parameters named below, as
$$s\,\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = K\,[R\ \ t]\begin{pmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{pmatrix}, \qquad K = \begin{pmatrix} f_u & 0 & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{pmatrix}$$
where (u, v) are the imaging-point coordinates; $(X_w, Y_w, Z_w)$ are the world coordinates; s is the image scale factor; $f_u$ and $f_v$ are the focal length f expressed in horizontal and vertical pixel units; $(u_0, v_0)$ are the image-center coordinates; R is the device's rotation matrix; and t is the device's translation vector. K constitutes the device's intrinsic parameters, and R and t its extrinsic parameters.
Device calibration may use traditional calibration methods or self-calibration. Traditional methods include direct linear transformation calibration, radial-alignment-constraint-based calibration and planar calibration; their basic principle is to use a calibration reference object to set up linear equations of the device's imaging model, measure the world coordinates of a set of points on the reference object and their corresponding coordinates on the imaging plane, then substitute these coordinates into the equations to solve for the intrinsic and extrinsic parameters. Self-calibration means calibrating the device without a reference object, purely from the correspondences between image points; it relies on special constraints between imaging points across several images, such as the epipolar constraint, and therefore needs no structural information about the scene.
Preprocessing unit 14
The preprocessing unit receives the captured image buffers and the corresponding image-capture-device parameters from the capture control module and processes the buffered images according to the preprocessing algorithms. Preprocessing mainly comprises:
(1) Rectifying the color images and depth maps according to the calibration information so that they are aligned. To facilitate image reconstruction at a viewpoint, the content of that viewpoint's color image and depth map should be identical; but the positions of the ordinary and depth image capture devices cannot coincide exactly, so the calibration results are used to transform the color image and the depth map so that they coincide fairly precisely.
(2) Eliminating the image differences introduced by different devices: the brightness and chroma differences between color images caused by different device settings can be adjusted so that the color images obtained by different devices are color-consistent.
(3) Correcting the color images or depth maps according to the devices' calibration parameters, e.g. correcting radial distortion.
Video coding unit 12
Because a 3D video system carries video data of multiple channels of images, the volume of video data is very large, making transmission and storage difficult; a good video coding unit is therefore needed to process the video data. Current 3D video coding falls mainly into two classes: block-based coding and object-based coding. In coding 3D images, besides removing spatial and temporal data redundancy with intra-frame and inter-frame prediction, the spatial data redundancy between the multiple channels of images must also be removed. Parallax (disparity) estimation and compensation can remove the spatial redundancy between multi-channel images; its core is finding the correlation between two or more images, similar to motion estimation and compensation but more complex: motion estimation and compensation operates on time-unsynchronized images of the same capture device, while disparity estimation and compensation operates on time-synchronized images of different capture devices. In disparity estimation and compensation, the positions of all pixels may change, and very distant objects can be regarded as having zero disparity.
The video coding unit described in the embodiments of the present invention may encode the color image and depth map data output by the preprocessing unit with codec standards such as MPEG-4 or H.264, where the description of depth may follow the MPEG standard. There are several methods of encoding color image + depth map data, such as the layer-based 3D video coding method, which combines the SEI messages of the H.264 standard with the idea of layered coding: one channel's video data (e.g. the color image data) is encoded conventionally into a base layer containing only I and P frames, and the other channel's data (e.g. the depth map data) is then encoded entirely as P frames, whose reference frame during prediction is the previous frame of the same channel or the corresponding base-layer frame. Decoding then has good 2D/3D compatibility: for traditional 2D display, only the base-layer data need be decoded; for 3D display, everything is decoded.
Control input unit 16
The control input unit mainly receives input from the video user or video terminal and feeds it back to the video capture unit and the video coding unit. Its information mainly includes the viewing and display viewpoint, the display mode and the user's distance information, and may be input by the user through a graphical user interface or a remote-control device, e.g. the viewing or display viewpoint, distance information and display mode. In addition, the control input unit can selectively control the image capture devices according to the viewing viewpoint and similar information, e.g. selecting only one or several of the devices in the video capture unit for video image capture. Likewise, if the display mode it receives is 2D display, it can control the video coding unit in the image processing unit to encode only the color images needed for 2D display; if the display mode is 3D display, the output color image and depth map data are both encoded.
In this embodiment, the capture control module controls the image capture of the devices, and the devices' shooting angles can be arranged during capture, so 3D video data of the scene with a wide viewing angle is obtained and the reconstruction of the scene's virtual viewpoints is good. Through the synchronization module and the calibration module, synchronized video data and the devices' calibration parameters are obtained, making the processing of the captured video images more accurate. At the same time, encoding the video data makes storage and transmission more convenient, which matters for large volumes of video data. This embodiment further improves the precision of video capture and processing and the reconstruction of virtual-viewpoint video images.
Fig. 18 is a structural diagram of Embodiment 1 of the image reconstruction system of the present invention. Specifically, as shown in Fig. 18, the reconstruction system may include:
a first ordinary image capture device 610, configured to obtain a first color image of a known first viewpoint;
a second ordinary image capture device 620, configured to obtain a second color image of a known second viewpoint;
a first depth image capture device 630, configured to obtain a first depth map of the known first viewpoint;
a first determining device 640, configured to determine a first depth-disparity correspondence factor of the first depth map according to the first color image, the second color image and the first depth map;
a first conversion device 650, configured to perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and
a first reconstruction device 660, configured to reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
For the operation and principles of this image reconstruction system, refer to the image reconstruction method embodiments of the present invention described above; they are not repeated here.
In this embodiment of the present invention, the depth map is obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm; large amounts of highly complex computation are avoided, improving the real-time performance of image reconstruction, and because stereo matching is no longer used for reconstruction, there is no inter-frame flicker, improving the quality of the reconstructed images.
Fig. 19 is a structural diagram of Embodiment 2 of the image reconstruction system of the present invention. Building on the technical solution of Embodiment 1 above, if the points in the image obtained by the ordinary image capture device do not coincide with the corresponding points in the depth image obtained by the depth image capture device, or the two images obtained by the ordinary image capture devices are not parallel, the image reconstruction system of this embodiment may further include a rectification device 611 and a first registration device 612. Specifically, the image reconstruction system of this embodiment may include:
a first ordinary image capture device 610, configured to obtain a first color image of a known first viewpoint;
a second ordinary image capture device 620, configured to obtain a second color image of a known second viewpoint;
a first depth image capture device 630, configured to obtain a first depth map of the known first viewpoint;
a rectification device 611, configured to rectify the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image;
a first registration device 612, configured to register the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map;
a first determining device 640, configured to determine a first depth-disparity correspondence factor of the first depth map according to the first color image, the second color image and the first depth map;
a first conversion device 650, configured to perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and
a first reconstruction device 660, configured to reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
In this embodiment of the present invention, the depth map is obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm; large amounts of highly complex computation are avoided, improving the real-time performance of image reconstruction, and because stereo matching is no longer used for reconstruction, there is no inter-frame flicker, improving the quality of the reconstructed images.
Fig. 20 is a structural diagram of Embodiment 3 of the image reconstruction system of the present invention. Building on the technical solution of Embodiment 2 above, this embodiment may further include:
a second depth image capture device 710, configured to obtain a second depth map of the known second viewpoint;
a second determining device 720, configured to determine a second depth-disparity correspondence factor of the second depth map according to the first color image, the second color image and the second depth map;
a second conversion device 730, configured to perform depth-disparity conversion on the second depth map according to the second depth-disparity correspondence factor to obtain second disparity information;
a second reconstruction device 740, configured to reconstruct a fourth image of the virtual viewpoint according to the second color image and the second disparity information; and
a hole-filling device 750, configured to perform hole filling according to the third image and the fourth image, generating a fifth image of the virtual viewpoint.
For the operation and principles of this image reconstruction system, refer to the relevant descriptions above; for brevity, they are not detailed here.
So that the points in the images obtained by the ordinary image capture devices coincide with the corresponding points in the depth images obtained by the depth image capture devices, the first ordinary image capture device and the first depth image capture device, and likewise the second ordinary image capture device and the second depth image capture device, are preferably co-located or integrated.
If the points in the images obtained by the ordinary image capture devices do not coincide with the corresponding points in the depth images obtained by the depth image capture devices, or the two images obtained by the ordinary image capture devices are not parallel, the image reconstruction system further includes:
a rectification device 611, configured to rectify the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image;
a first registration device 612, configured to register the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map; and
a second registration device 613, configured to register the second color image and the second depth map so that the points in the second color image coincide with the corresponding points in the second depth map.
In this embodiment of the present invention, the depth maps are obtained directly and converted into disparity information for image reconstruction, so the disparity information need not be obtained through a stereo matching algorithm; large amounts of highly complex computation are avoided, the real-time performance of image reconstruction improves, and the quality of the reconstructed images improves. Furthermore, by obtaining enough depth maps of the scene, the occlusion problem within the scene is solved, a problem that cannot be solved when reconstructing images with stereo matching algorithms.
Fig. 21 is a structural diagram of Embodiment 4 of the image reconstruction system of the present invention. This embodiment may include:
an ordinary image capture device 810, configured to obtain a color image of a known viewpoint;
a depth image capture device 820, configured to obtain a depth map of the known viewpoint;
a conversion device 830, configured to perform depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map; and
a reconstruction device 840, configured to reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
This embodiment uses the color image and depth map of one known viewpoint. Its application scenario is generating other virtual-viewpoint images with small parallax, usable for stereoscopic display. With a single known viewpoint, no rectification of the color image is needed.
Fig. 22 is a structural diagram of Embodiment 5 of the image reconstruction system of the present invention. This embodiment may include:
an ordinary image capture device 810, configured to obtain a color image of a known viewpoint;
a depth image capture device 820, configured to obtain a depth map of the known viewpoint;
a conversion device 830, configured to perform depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map;
a reconstruction device 840, configured to reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information;
a determining device 850, configured to determine the depth-disparity correspondence factor of the depth map; and
a registration device 860, configured to register the image obtained by the ordinary image capture device and the image obtained by the depth image capture device, so that the points in the depth map have exactly the same positions as the corresponding points in the image.
The conversion device 830 performs the depth-disparity conversion on the depth map according to the depth-disparity correspondence factor to obtain the disparity information corresponding to the depth map.
This embodiment uses the color image and depth map of one known viewpoint; its application scenario is generating other virtual-viewpoint images with small parallax, usable for stereoscopic display. With a single known viewpoint, no rectification of the color image is needed, but the color image and the depth map must be registered; the registration procedure is the same as in the preceding embodiments. In this embodiment the depth-disparity correspondence factor still needs to be determined, but it has no actual physical meaning and can be chosen according to the needs of the application scenario, for example according to the parameters of the stereoscopic display.
Fig. 23 is a structural diagram of an embodiment of the 3D video communication system of the present invention. Specifically, as shown in Fig. 23, this embodiment includes a sending device 1 and a receiving device 2. The sending device 1 includes a video capture unit 11, a video coding unit 12 and a video output unit 13. The video capture unit 11 is configured to acquire video image data of a scene captured by image capture devices, where the video image data includes at least one depth map and at least two color images, and the video capture unit 11 includes at least one depth image capture device capable of outputting scene depth information and at least one ordinary image capture device capable of outputting scene color/grayscale video information, or includes at least one depth image capture device capable of outputting both scene depth information and color/grayscale video information. The video coding unit 12 is configured to encode the video image data to obtain coded video image data; the video output unit 13 is configured to receive the coded video image data from the video coding unit 12 and send it out. The receiving device 2 includes a video receiving unit 21 and a video decoding unit 22: the video receiving unit 21 is configured to receive the coded video image data sent by the video output unit 13, and the video decoding unit 22 is configured to decode the coded video image data to obtain decoded video image data. The sending device 1 and the receiving device 2 may be connected directly, or through an existing communication network such as the Internet.
In this embodiment, the depth image capture device in the video capture unit 11 may obtain the scene's depth map and/or color image and the ordinary image capture devices obtain the scene's color images; the obtained depth map and color images of the scene are then passed as 3D video image data to the video coding unit 12, which encodes the captured video image data to obtain the coded video image data of the scene and sends it to the video output unit 13, which sends it to the video image receiving device.
In this embodiment, the scene depth map is captured by a depth image capture device, so it is accurate and reliable, while several color or grayscale images of the scene can be obtained through the depth image capture device and/or ordinary image capture devices. 3D video image data of any viewpoint is thus available when reconstructing virtual-viewpoint 3D video of the scene: the depth map and color image captured by the depth image capture device are used to reconstruct the virtual viewpoint, and the color images captured by the ordinary image capture devices then repair the reconstructed image, eliminating possible hole regions, so that the reconstructed image better matches the true appearance of the scene and satisfies the user's visual expectations. Moreover, during capture, the depth image capture device and the ordinary image capture devices can be set at suitable shooting viewpoints, so that the captured scene images cover a wider viewing angle, virtual-viewpoint images over a wider angular range can be reconstructed, and the reconstruction quality is good. After receiving the coded video image data from the sending device 1, the receiving device 2 can decode it and perform video image reconstruction, rendering, display and related processing according to the received coded video image data. Because the depth map in this embodiment is captured by a depth image capture device, its quality is good and its capture is strongly real-time; when reconstructing virtual-viewpoint 3D video of the scene, the depth map and one color image captured by the depth image capture device are used to reconstruct the virtual viewpoint, and the color images captured by the ordinary image capture devices then repair the reconstructed image, eliminating possible hole regions, so that the reconstructed image better matches the actual scene and satisfies the user's visual expectations.
Fig. 24 is a structural diagram of the receiving device in the 3D video communication system embodiment of the present invention. In this embodiment, the receiving device 2 may further include an image reconstruction system 23, configured to reconstruct the video image of the viewpoint to be displayed according to display information and the decoded video image data. The receiving device 2 of this embodiment may also include a demultiplexing unit 24, configured to demultiplex the multiplexed data received by the video receiving unit 21, the multiplexed data being multiplexed coded video image data. The image reconstruction system 23 can receive the decoded video image data output by the video decoding unit 22, rebuild the display-viewpoint video image from the depth map and color images in the decoded data to obtain the rebuilt image of the display viewpoint, repair the hole regions in that rebuilt image using the color images in the decoded video image data and/or using linear or nonlinear interpolation, and thus obtain the video image of the display viewpoint. The receiving device 2 of this embodiment may also include a display input unit 25, configured to acquire display information including the display or viewing viewpoint, display mode and display distance; the image reconstruction system 23 can reconstruct the decoded video image data according to this display information into the video image of the viewpoint that needs to be displayed. The receiving device of this embodiment may further include a rendering unit 26 and a display unit 27: the rendering unit 26 is configured to receive and render the display-viewpoint video image, and the display unit 27 is configured to receive the display-viewpoint image data rendered by the rendering unit 26 and display the display-viewpoint video image. The rendering unit 26 may also receive decoded video image data sent directly by the video decoding unit 22, render it and deliver it to the display unit 27 for display. In addition, the receiving device 2 may include a speech decoding unit, a system control unit and/or a user data unit: the speech decoding unit may decode received coded speech data; the system control unit may process received system command data accordingly; the user data unit may store, edit and otherwise handle received file data. The speech decoding unit, system control unit and user data unit described here are not shown in the drawings.
The principle and role of the image reconstruction system 23 are described in detail below.
The image reconstruction system is configured to rebuild virtual-viewpoint images from the obtained color-image and depth-map data of the scene, and image-based rendering can be used for this rebuilding. In image-based rendering reconstruction, let $I_O$ denote the original texture image, $I_N$ the newly rebuilt viewpoint image, d the depth map with $d(x, y)$ the disparity value at pixel (x, y), and $\alpha$ a weight on the offset. Taking a parallel capture-device system as an example, each pixel (x, y) of the rebuilt viewpoint image satisfies
$$I_N(x, y) = I_O\big(x + \alpha\, d(x, y),\ y\big)$$
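A minimal numpy sketch of this image-based-rendering relation; as a simplification it assumes the disparity map is already registered to the target pixel grid, and it marks as holes the pixels whose source coordinate falls outside the reference image:

```python
import numpy as np

def render_virtual_view(src, disp, alpha):
    """Backward-warp a new view per I_N(x,y) = I_O(x + alpha*d(x,y), y).

    src   : H x W x 3 reference color image I_O
    disp  : H x W disparity map d, assumed aligned with the target grid
    alpha : offset weight encoding the virtual viewpoint's position
    Returns the rendered view and a boolean hole mask.
    """
    h, w = disp.shape
    xs = np.rint(np.arange(w)[None, :] + alpha * disp).astype(int)
    holes = (xs < 0) | (xs >= w)          # source coordinate out of bounds
    xs = np.clip(xs, 0, w - 1)
    out = src[np.arange(h)[:, None], xs]  # gather shifted source pixels
    out[holes] = 0
    return out, holes
```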
Depending on the kind and number of image capture devices in the capture unit, virtual-viewpoint reconstruction from the depth maps and color images obtained by the devices can proceed in different ways.
If the image capture devices collecting the video data comprise only one or more depth image capture devices 110/111, the virtual-viewpoint image may be reconstructed according to the following steps:
(1) From one depth image capture device $DC_1$, using the color image $I_1$ and depth map $D_1$ it outputs, reconstruct with the general image-based-rendering algorithm above, obtaining one rebuilt image $I_V^1$ of the virtual viewpoint V within the capture-device group.
(2) From another depth image capture device $DC_2$, using the color image $I_2$ and depth map $D_2$ it outputs, reconstruct with the same general algorithm, obtaining another rebuilt image $I_V^2$ of the same virtual viewpoint V.
(3) The final rebuilt image $I_V$ of virtual viewpoint V may be the union of $I_V^1$ and $I_V^2$, i.e. $I_V = I_V^1 \cup I_V^2$; $I_V^2$ can fill the holes in $I_V^1$. The intersection of $I_V^1$ and $I_V^2$ may finally be synthesized by weighting, e.g. using the formula
$$I_V(x, y) = w_1 I_V^1(x, y) + w_2 I_V^2(x, y)$$
where $w_1$ and $w_2$ are weights related to the viewpoint position.
(4) For the hole regions remaining in the rebuilt image $I_V$ after step (3), the appropriate pixel information inside each hole region can be determined from the brightness, chroma and depth information of the pixels around the hole, e.g. repaired by linear or nonlinear interpolation, finally yielding the video image of the virtual viewpoint.
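Steps (3) and (4) can be sketched as follows, assuming the two reconstructions and their hole masks come from a warping routine like the one above; the row-wise linear interpolation is one simple instance of the interpolation-based repair the text mentions:

```python
import numpy as np

def merge_views(img1, holes1, img2, holes2, w1=0.5, w2=0.5):
    """Step (3): union of two reconstructions of the same virtual view.
    Where both are valid, blend with viewpoint-dependent weights w1, w2;
    where only one is valid, take it; elsewhere a hole remains."""
    out = np.zeros_like(img1)
    both = ~holes1 & ~holes2
    only1, only2 = ~holes1 & holes2, holes1 & ~holes2
    out[both] = (w1 * img1[both] + w2 * img2[both]).astype(img1.dtype)
    out[only1] = img1[only1]
    out[only2] = img2[only2]
    return out, holes1 & holes2

def fill_holes_rowwise(img, holes):
    """Step (4), simplified: linear interpolation along each row between
    the nearest valid neighbors of every remaining hole pixel."""
    out = img.astype(np.float64)
    for y in range(img.shape[0]):
        good = np.flatnonzero(~holes[y])
        if good.size == 0:
            continue
        bad = np.flatnonzero(holes[y])
        for c in range(img.shape[2]):
            out[y, bad, c] = np.interp(bad, good, out[y, good, c])
    return out.astype(img.dtype)
```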
Similarly, if the image capture devices collecting the video data comprise only one depth image capture device 110/111 and one ordinary image capture device 112, virtual-viewpoint reconstruction may proceed according to the following steps:
(1) From the depth image capture device DC, using the color image $I_1$ and depth map D it outputs, reconstruct with the general algorithm above, obtaining the rebuilt image $I_V$ of the virtual viewpoint V within the capture-device group.
(2) Fill the holes appearing in $I_V$ using the color image $I_2$ output by the ordinary image capture device. The basic filling method: first obtain the positional relation between the ordinary and depth image capture devices, e.g. from the calibration parameters of image-capture-device calibration; then use the depth map D to find the positions in $I_2$ corresponding to the hole regions of $I_V$, and map the pixels of $I_2$ at those positions into $I_V$ by projective transformation to fill the holes in $I_V$.
(3) For the hole regions remaining in $I_V$ after step (2), repair by methods such as linear or nonlinear interpolation, finally yielding the video image of the virtual viewpoint.
In addition, the image reconstruction system can also apply image processing such as filtering to the reconstructed viewpoint video images, improving their quality.
In practical applications, as shown in Fig. 18, the image reconstruction system in the embodiments of the present invention may specifically include a first ordinary image capture device 610, a second ordinary image capture device 620, a first depth image capture device 630, a first determining device 640, a first conversion device 650 and a first reconstruction device 660. Or, as shown in Fig. 19, it may specifically include a first ordinary image capture device 610, a second ordinary image capture device 620, a first depth image capture device 630, a rectification device 611, a first registration device 612, a first determining device 640, a first conversion device 650 and a first reconstruction device 660. Or, as shown in Fig. 20, it may specifically include a first ordinary image capture device 610, a second ordinary image capture device 620, a first depth image capture device 630, a rectification device 611, a first registration device 612, a first determining device 640, a first conversion device 650, a first reconstruction device 660, a second depth image capture device 710, a second determining device 720, a second conversion device 730, a second reconstruction device 740 and a hole-filling device 750. Or, as shown in Fig. 21, it may specifically include an ordinary image capture device 810, a depth image capture device 820, a conversion device 830 and a reconstruction device 840. Or, as shown in Fig. 22, it may specifically include an ordinary image capture device 810, a depth image capture device 820, a conversion device 830, a reconstruction device 840, a determining device 850 and a registration device 860. Specifically, the image reconstruction system of this embodiment may have the same structure and functions as the image reconstruction system embodiments of the present invention described above, not repeated here.
Furthermore, in this embodiment the video capture unit 11 of the sending device 1 may also include at least one capture control module 113, a synchronization module 114 and a calibration module 115; the video output unit 13 may include an output processing module 131 and an output interface module 132; and the sending device 1 may further include a preprocessing unit 14, a multiplexing unit 15, a control input unit 16, an audio coding unit, a system control unit and a user data unit. The capture control module 113 may be connected to various combinations of depth image capture devices and ordinary image capture devices, to control the devices' shooting and capture of the scene. Specifically, the structure of the sending device 1 in this embodiment is the same as in the 3D video communication sending device embodiments of the present invention above, not repeated here.
Furthermore, the sending device and receiving device of this embodiment may be integrated, so that the integrated equipment can send video image data to other devices, receive and process video image data sent by other devices, and also receive and process video image data captured by its own devices, displaying video images locally in real time. The sending device and receiving device of this embodiment may also be connected through various existing wireless or wired networks and applied to remote video image capture and the like.
In this embodiment, the sending device captures, through the image capture devices, video image data that includes depth maps and color images; the captured depth maps are accurate, reliable and strongly real-time, and the video image data can be transmitted to the receiving device for processing. Because the captured video image data includes depth maps and color images, the several color images can repair the hole regions that arise when a virtual-viewpoint video image is reconstructed from only one color image, so the reconstructed images are of good quality; the system has strong practicality and can satisfy the needs of 3D video.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may still be modified or equivalently replaced, and such modifications or equivalent replacements do not cause the modified technical solutions to depart from the spirit and scope of the technical solutions of the present invention.
Claims
1. A 3D video communication method, comprising:
acquiring video image data of a scene captured by image capture devices, wherein the video image data comprises at least one depth map and at least two color images, and is obtained by at least one image capture device capable of outputting scene depth information and at least one image capture device capable of outputting scene color/grayscale video information, or by at least one image capture device capable of outputting both scene depth information and color/grayscale video information;
encoding the video image data to obtain coded video image data; and
sending the coded video image data.
2. The 3D video communication method according to claim 1, wherein acquiring the video image data of the scene captured by the image capture devices comprises:
controlling the image capture devices so that their image capture of the scene is synchronized;
performing image-capture-device calibration on the video images captured by the devices to obtain each device's intrinsic and extrinsic parameters; and
establishing, according to the intrinsic and extrinsic parameters, the correspondence between the video images captured by each device and that device's attributes, taken as the scene's video image data, wherein the device attributes comprise the device's intrinsic parameters, extrinsic parameters and the capture timestamp of each video frame.
3. The 3D video communication method according to claim 2, wherein controlling the image capture devices so that their image capture of the scene is synchronized comprises:
providing a synchronization signal, and synchronizing the devices' image capture of the scene according to the synchronization signal.
4. The 3D video communication method according to claim 2, further comprising, before encoding the video image data:
rectifying the video image data according to the image-capture-device attributes to obtain rectified video image data.
5. The 3D video communication method according to claim 2, further comprising, before controlling the image capture devices so that their image capture of the scene is synchronized:
receiving externally input control information and setting each device's viewing angle and shooting distance according to the control information, wherein the control information comprises viewing-viewpoint, display-mode and display-distance information.
6. A 3D video communication sending device, comprising:
a video capture unit, configured to acquire video image data of a scene captured by image capture devices, wherein the video image data comprises at least one depth map and at least two color images, and the video capture unit comprises at least one image capture device capable of outputting scene depth information and at least one image capture device capable of outputting scene color/grayscale video information, or comprises at least one image capture device capable of outputting both scene depth information and color/grayscale video information;
a video coding unit, configured to encode the video image data to obtain coded video image data; and
a video output unit, configured to send the coded video image data.
7. The 3D video communication sending device according to claim 6, wherein the video capture unit comprises:
a capture control module, configured to control the image capture devices connected to it to capture images of the scene;
a synchronization module, configured to generate a synchronization signal and output it to the capture control module, which synchronizes the devices' image capture of the scene; or configured to output the synchronization signal to the devices' external synchronization interfaces to synchronize their image capture of the scene, the synchronization signal being generated by the synchronization module itself or being the video output signal of one of the image capture devices; and
a calibration module, configured to receive the video images captured by the image capture devices, perform image-capture-device calibration according to the captured video images, obtain each device's intrinsic and extrinsic parameters, and send these to the capture control module;
wherein the capture control module is further configured to establish, according to the intrinsic and extrinsic parameters, the correspondence between the video images captured by each device and that device's attributes and output it as the scene's video image data, the device attributes comprising the device's intrinsic parameters, extrinsic parameters and the capture timestamp of each video frame.
8. The 3D video communication sending device according to claim 7, further comprising:
a preprocessing unit, configured to receive from the capture control module the video image data comprising the video images, the device attributes and the correspondence between the video images and each device's attributes, rectify the video image data according to the devices' intrinsic and extrinsic parameters, and output the rectified video image data.
9. The 3D video communication sending device according to claim 7, further comprising:
a control input unit, configured to acquire control information and send it to the capture control module, wherein the control information comprises the viewing viewpoint, display distance and display mode.
10. An image reconstruction method, comprising:
obtaining a color image of a known viewpoint;
obtaining a depth map of the known viewpoint;
performing depth-disparity conversion on the depth map to obtain disparity information corresponding to the depth map; and
reconstructing an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
11. The image reconstruction method according to claim 10, wherein performing the depth-disparity conversion on the depth map to obtain the disparity information corresponding to the depth map is:
performing the depth-disparity conversion on the depth map according to the depth-disparity correspondence factor of the depth map to obtain the disparity information corresponding to the depth map.
12. The image reconstruction method according to claim 11, further comprising, before performing the depth-disparity conversion on the depth map to obtain its corresponding disparity information:
registering the color image of the known viewpoint and the depth map of the known viewpoint so that the points in the depth map coincide with the corresponding points in the color image.
13. An image reconstruction method, comprising:
obtaining a first color image of a known first viewpoint and a second color image of a known second viewpoint;
obtaining a first depth map of the known first viewpoint;
determining a first depth-disparity correspondence factor of the first depth map according to the first color image, the second color image and the first depth map;
performing depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and
reconstructing a third image of a virtual viewpoint according to the first color image and the first disparity information.
14. The image reconstruction method according to claim 13, further comprising:
obtaining a second depth map of the known second viewpoint;
determining a second depth-disparity correspondence factor of the second depth map according to the first color image, the second color image and the second depth map;
performing depth-disparity conversion on the second depth map according to the second depth-disparity correspondence factor to obtain second disparity information; and
reconstructing a fourth image of the virtual viewpoint according to the second color image and the second disparity information.
15. The image reconstruction method according to claim 14, further comprising, after reconstructing the third image and the fourth image of the virtual viewpoint:
performing hole filling according to the third image and the fourth image, generating a fifth image of the virtual viewpoint.
16. The image reconstruction method according to claim 13, wherein the first depth-disparity correspondence factor is determined as
$$\varphi = \frac{1}{N}\sum_{i=1}^{N}\frac{v_i}{u_i} \qquad (N = 1, 2, 3, \ldots)$$
wherein $\varphi$ is the depth-disparity correspondence factor, $v_i$ is the disparity between the i-th feature point in the first color image and the second color image, and $u_i$ is the reciprocal of the depth $Z_i$ of the i-th feature point in the depth map, i.e. $u_i = 1/Z_i$.
17. The image reconstruction method according to claim 13, 14 or 15, further comprising, before determining the first depth-disparity correspondence factor of the first depth map:
rectifying the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image; and
registering the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map.
18. The image reconstruction method according to claim 14 or 15, further comprising, before determining the second depth-disparity correspondence factor of the second depth map:
registering the second color image and the second depth map so that the points in the second color image coincide with the corresponding points in the second depth map.
19. An image reconstruction system, comprising:
an ordinary image capture device, configured to obtain a color image of a known viewpoint;
a depth image capture device, configured to obtain a depth map of the known viewpoint;
a conversion device, configured to perform depth-disparity conversion on the depth map to obtain disparity information corresponding to the depth map; and
a reconstruction device, configured to reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
20. The image reconstruction system according to claim 19, further comprising:
a determining device, configured to determine the depth-disparity correspondence factor of the depth map, wherein the conversion device performs the depth-disparity conversion on the depth map according to the depth-disparity correspondence factor to obtain the disparity information corresponding to the depth map; and
a registration device, configured to register the color image of the known viewpoint and the depth map of the known viewpoint so that the points in the depth map coincide with the corresponding points in the color image.
21. An image reconstruction system, comprising:
a first ordinary image capture device, configured to obtain a first color image of a known first viewpoint;
a second ordinary image capture device, configured to obtain a second color image of a known second viewpoint;
a first depth image capture device, configured to obtain a first depth map of the known first viewpoint;
a first determining device, configured to determine a first depth-disparity correspondence factor of the first depth map according to the first color image, the second color image and the first depth map;
a first conversion device, configured to perform depth-disparity conversion on the first depth map according to the first depth-disparity correspondence factor to obtain first disparity information; and
a first reconstruction device, configured to reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
22. The image reconstruction system according to claim 21, further comprising:
a second depth image capture device, configured to obtain a second depth map of the known second viewpoint;
a second determining device, configured to determine a second depth-disparity correspondence factor of the second depth map according to the first color image, the second color image and the second depth map;
a second conversion device, configured to perform depth-disparity conversion on the second depth map according to the second depth-disparity correspondence factor to obtain second disparity information; and
a second reconstruction device, configured to reconstruct a fourth image of the virtual viewpoint according to the second color image and the second disparity information.
23. The image reconstruction system according to claim 22, further comprising:
a hole-filling device, configured to perform hole filling according to the third image and the fourth image, generating a fifth image of the virtual viewpoint.
24. The image reconstruction system according to claim 21, 22 or 23, further comprising:
a rectification device, configured to rectify the first color image and the second color image so that the points in the first color image are parallel to the corresponding points in the second color image; and
a first registration device, configured to register the first color image and the first depth map so that the points in the first color image coincide with the corresponding points in the first depth map.
25. The image reconstruction system according to claim 22 or 23, further comprising:
a second registration device, configured to register the second color image and the second depth map so that the points in the second color image coincide with the corresponding points in the second depth map.
26. A 3D video communication system, comprising a sending device and a receiving device, wherein the sending device comprises:
a video capture unit, configured to acquire video image data of a scene captured by image capture devices, wherein the video image data comprises at least one depth map and at least two color images, and the video capture unit comprises at least one image capture device capable of outputting scene depth information and at least one image capture device capable of outputting scene color/grayscale video information, or comprises at least one image capture device capable of outputting both scene depth information and color/grayscale video information;
a video coding unit, configured to encode the video image data to obtain coded video image data; and
a video output unit, configured to send the coded video image data;
and the receiving device comprises:
a video receiving unit, configured to receive the coded video image data sent by the video output unit; and
a video decoding unit, configured to decode the coded video image data to obtain decoded video image data.
27. The 3D video communication system according to claim 26, wherein the video capture unit comprises:
a capture control module, configured to control the image capture devices connected to it to capture images of the scene;
a synchronization module, configured to generate a synchronization signal and output it to the capture control module, which synchronizes the image capture devices; or configured to output the synchronization signal to the devices' external synchronization interfaces to synchronize their image capture of the scene, the synchronization signal being generated by the synchronization module itself or being the video output signal of one of the image capture devices; and
a calibration module, configured to receive the video images captured by the image capture devices, perform image-capture-device calibration according to the captured video images, obtain each device's intrinsic and extrinsic parameters, and send these to the capture control module;
wherein the capture control module is further configured to establish, according to the intrinsic and extrinsic parameters, the correspondence between the video images captured by each device and that device's attributes and output it as the scene's video image data, the device attributes comprising the device's intrinsic parameters, extrinsic parameters and the capture timestamp of each video frame.
28. The 3D video communication system according to claim 27, wherein the sending device further comprises:
a preprocessing unit, configured to receive from the capture control module the video image data comprising the video images, the device attributes and the correspondence between the video images and each device's attributes, rectify the video image data according to the devices' intrinsic and extrinsic parameters, and output the rectified video image data.
29. The 3D video communication system according to claim 26, wherein the receiving device further comprises an image reconstruction system, the image reconstruction system comprising:
an ordinary image capture device, configured to obtain a color image of a known viewpoint;
a depth image capture device, configured to obtain depth information of the known viewpoint;
a conversion device, configured to perform depth-disparity conversion on the depth information to obtain disparity information corresponding to the depth information; and
a reconstruction device, configured to reconstruct an image of a virtual viewpoint according to the color image of the known viewpoint and the disparity information.
30. The 3D video communication system according to claim 26, wherein the receiving device further comprises an image reconstruction system, the image reconstruction system comprising:
a first ordinary image capture device, configured to obtain a first color image of a known first viewpoint;
a second ordinary image capture device, configured to obtain a second color image of a known second viewpoint;
a first depth image capture device, configured to obtain first depth information of the known first viewpoint;
a first determining device, configured to determine a first depth-disparity correspondence factor of the first depth information according to the first color image, the second color image and the first depth information;
a first conversion device, configured to perform depth-disparity conversion on the first depth information according to the first depth-disparity correspondence factor to obtain first disparity information; and
a first reconstruction device, configured to reconstruct a third image of a virtual viewpoint according to the first color image and the first disparity information.
31. The 3D video communication system according to claim 30, wherein the image reconstruction system further comprises:
a second depth image capture device, configured to obtain second depth information of the known second viewpoint;
a second determining device, configured to determine a second depth-disparity correspondence factor of the second depth information according to the first color image, the second color image and the second depth information;
a second conversion device, configured to perform depth-disparity conversion on the second depth information according to the second depth-disparity correspondence factor to obtain second disparity information; and
a second reconstruction device, configured to reconstruct a fourth image of the virtual viewpoint according to the second color image and the second disparity information.
32. The 3D video communication system according to claim 31, wherein the image reconstruction system further comprises:
a hole-filling device, configured to perform hole filling according to the third image and the fourth image, generating a fifth image of the virtual viewpoint.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP20090811030 EP2328337A4 (en) | 2008-09-02 | 2009-08-26 | 3D VIDEO COMMUNICATION, TRANSMISSION DEVICE, SYSTEM AND IMAGE RECONSTRUCTION, SYSTEM |
US13/038,055 US9060165B2 (en) | 2008-09-02 | 2011-03-01 | 3D video communication method, sending device and system, image reconstruction method and system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200810119545 CN101668219B (zh) | 2008-09-02 | 2008-09-02 | 3d视频通信方法、发送设备和系统 |
CN200810119545.9 | 2008-09-02 | ||
CN2008102251954A CN101754042B (zh) | 2008-10-30 | 2008-10-30 | 图像重构方法和图像重构系统 |
CN200810225195.4 | 2008-10-30 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/038,055 Continuation US9060165B2 (en) | 2008-09-02 | 2011-03-01 | 3D video communication method, sending device and system, image reconstruction method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010025655A1 true WO2010025655A1 (zh) | 2010-03-11 |
Family
ID=41796744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2009/073542 WO2010025655A1 (zh) | 2008-09-02 | 2009-08-26 | 3d视频通信方法、发送设备、系统及图像重构方法和系统 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9060165B2 (zh) |
EP (1) | EP2328337A4 (zh) |
WO (1) | WO2010025655A1 (zh) |
Families Citing this family (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101289269B1 (ko) * | 2010-03-23 | 2013-07-24 | 한국전자통신연구원 | 영상 시스템에서 영상 디스플레이 장치 및 방법 |
JP5549476B2 (ja) * | 2010-08-24 | 2014-07-16 | ソニー株式会社 | 画像処理装置と画像処理方法 |
WO2012128070A1 (ja) * | 2011-03-18 | 2012-09-27 | ソニー株式会社 | 画像処理装置および画像処理方法 |
WO2012131895A1 (ja) * | 2011-03-29 | 2012-10-04 | 株式会社東芝 | 画像符号化装置、方法及びプログラム、画像復号化装置、方法及びプログラム |
US9900662B2 (en) * | 2011-05-03 | 2018-02-20 | Vmtv, Inc. | Social data associated with bookmarks to multimedia content |
JP5863134B2 (ja) * | 2011-05-05 | 2016-02-16 | エンパイア テクノロジー ディベロップメント エルエルシー | レンチキュラ指向性ディスプレイ |
KR20130003135A (ko) * | 2011-06-30 | 2013-01-09 | 삼성전자주식회사 | 다시점 카메라를 이용한 라이트 필드 형상 캡처링 방법 및 장치 |
US9270974B2 (en) * | 2011-07-08 | 2016-02-23 | Microsoft Technology Licensing, Llc | Calibration between depth and color sensors for depth cameras |
AU2012295043A1 (en) * | 2011-08-09 | 2014-03-06 | Samsung Electronics Co., Ltd. | Multiview video data encoding method and device, and decoding method and device |
JP5978573B2 (ja) * | 2011-09-06 | 2016-08-24 | ソニー株式会社 | 映像信号処理装置および映像信号処理方法 |
EP2611169A1 (fr) * | 2011-12-27 | 2013-07-03 | Thomson Licensing | Dispositif d'acquisition d'images stereoscopiques |
US20150117514A1 (en) * | 2012-04-23 | 2015-04-30 | Samsung Electronics Co., Ltd. | Three-dimensional video encoding method using slice header and method therefor, and three-dimensional video decoding method and device therefor |
WO2014000664A1 (en) | 2012-06-28 | 2014-01-03 | Mediatek Inc. | Method and apparatus of disparity vector derivation in 3d video coding |
CN104429063B (zh) * | 2012-07-09 | 2017-08-25 | Lg电子株式会社 | 增强3d音频/视频处理装置和方法 |
CN103778023B (zh) * | 2012-10-19 | 2017-09-15 | 原相科技股份有限公司 | 存取装置与控制装置之间的通信方法以及存取装置 |
KR101720320B1 (ko) * | 2012-11-09 | 2017-03-28 | 한국전자통신연구원 | 다중 스트림 기반 3차원 영상의 에러 보정 방법 및 장치 |
JP6027143B2 (ja) * | 2012-12-27 | 2016-11-16 | 日本電信電話株式会社 | 画像符号化方法、画像復号方法、画像符号化装置、画像復号装置、画像符号化プログラム、および画像復号プログラム |
KR102081087B1 (ko) | 2013-06-17 | 2020-02-25 | 삼성전자주식회사 | 동기적 영상과 비동기적 영상을 위한 영상 정합 장치 및 이미지 센서 |
US10558881B2 (en) * | 2016-08-24 | 2020-02-11 | Electronics And Telecommunications Research Institute | Parallax minimization stitching method and apparatus using control points in overlapping region |
EP3528495A4 (en) * | 2016-10-13 | 2020-01-22 | Sony Corporation | IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD |
CN106713922B (zh) * | 2017-01-13 | 2020-03-06 | 京东方科技集团股份有限公司 | 图像处理方法和电子设备 |
CN106657973A (zh) * | 2017-01-21 | 2017-05-10 | 上海量明科技发展有限公司 | 用于展示图像的方法及系统 |
US10607096B2 (en) * | 2017-04-04 | 2020-03-31 | Princeton Identity, Inc. | Z-dimension user feedback biometric system |
JP6955147B2 (ja) * | 2017-07-13 | 2021-10-27 | 富士通株式会社 | 画像処理装置、画像処理方法、及び画像処理プログラム |
CN107492122A (zh) * | 2017-07-20 | 2017-12-19 | 深圳市佳创视讯技术股份有限公司 | 一种基于多层深度平面的深度学习视差估计方法 |
KR102095539B1 (ko) * | 2017-12-05 | 2020-03-31 | 농업회사법인 원스베리 주식회사 | 인삼의 영상 이미지 분석을 통한 생육량 측정 방법 |
CN107901424B (zh) * | 2017-12-15 | 2024-07-26 | 北京中睿华信信息技术有限公司 | 一种图像采集建模系统 |
KR102113285B1 (ko) * | 2018-08-01 | 2020-05-20 | 한국원자력연구원 | 평행축 방식의 양안 카메라 시스템에서 근거리 물체의 입체영상을 위한 영상처리 방법 및 장치 |
US10957025B2 (en) * | 2018-12-03 | 2021-03-23 | International Business Machines Corporation | Photograph with variable light source distribution |
JP2020136697A (ja) * | 2019-02-12 | 2020-08-31 | キヤノン株式会社 | 画像処理装置、撮像装置、画像処理方法、及びプログラム |
WO2020181104A1 (en) | 2019-03-07 | 2020-09-10 | Alibaba Group Holding Limited | Method, apparatus, medium, and server for generating multi-angle free-perspective video data |
CN112116602A (zh) * | 2020-08-31 | 2020-12-22 | 北京的卢深视科技有限公司 | 深度图修复方法、装置和可读存储介质 |
CN115379194B (zh) * | 2021-05-19 | 2024-06-04 | 北京小米移动软件有限公司 | 深度图像的量化方法及装置、终端设备、存储介质 |
US11694608B1 (en) * | 2022-04-01 | 2023-07-04 | Novatek Microelectronics Corp. | Calibrating device and method for adjust luminance-chrominance of pixels of LED panels |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1432969A (zh) * | 2001-11-27 | 2003-07-30 | 三星电子株式会社 | 基于深度图像表示三维物体的装置和方法 |
US20050254817A1 (en) * | 2004-05-13 | 2005-11-17 | Mckee William J | Autostereoscopic electronic camera |
US20060227132A1 (en) * | 2005-04-11 | 2006-10-12 | Samsung Electronics Co., Ltd. | Depth image-based representation method for 3D object, modeling method and apparatus, and rendering method and apparatus using the same |
US20070201859A1 (en) * | 2006-02-24 | 2007-08-30 | Logitech Europe S.A. | Method and system for use of 3D sensors in an image capture device |
EP1931150A1 (en) * | 2006-12-04 | 2008-06-11 | Koninklijke Philips Electronics N.V. | Image processing system for processing combined image data and depth data |
CN101312542A (zh) * | 2008-07-07 | 2008-11-26 | 浙江大学 | 一种自然三维电视系统 |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100414629B1 (ko) * | 1995-03-29 | 2004-05-03 | 산요덴키가부시키가이샤 | 3차원표시화상생성방법,깊이정보를이용한화상처리방법,깊이정보생성방법 |
US6084979A (en) * | 1996-06-20 | 2000-07-04 | Carnegie Mellon University | Method for creating virtual reality |
GB0007870D0 (en) | 2000-03-31 | 2000-05-17 | Koninkl Philips Electronics Nv | Methods and apparatus for making and replauing digital video recordings, and recordings made by such methods |
US20040070667A1 (en) * | 2002-10-10 | 2004-04-15 | Fuji Photo Optical Co., Ltd. | Electronic stereoscopic imaging system |
KR100585966B1 (ko) | 2004-05-21 | 2006-06-01 | 한국전자통신연구원 | 3차원 입체 영상 부가 데이터를 이용한 3차원 입체 디지털방송 송/수신 장치 및 그 방법 |
KR100714068B1 (ko) * | 2004-10-16 | 2007-05-02 | 한국전자통신연구원 | 계층적 깊이 영상을 이용한 다시점 동영상 부호화/복호화방법 및 장치 |
US7643672B2 (en) * | 2004-10-21 | 2010-01-05 | Kazunari Era | Image processing apparatus, image pickup device and program therefor |
KR100603601B1 (ko) * | 2004-11-08 | 2006-07-24 | 한국전자통신연구원 | 다시점 콘텐츠 생성 장치 및 그 방법 |
CN101453662B (zh) * | 2007-12-03 | 2012-04-04 | 华为技术有限公司 | 立体视频通信终端、系统及方法 |
CN101483770B (zh) | 2008-01-08 | 2011-03-16 | 华为技术有限公司 | 一种编解码方法及装置 |
KR101506217B1 (ko) * | 2008-01-31 | 2015-03-26 | 삼성전자주식회사 | 스테레오스코픽 영상의 부분 데이터 구간 재생을 위한스테레오스코픽 영상 데이터스트림 생성 방법과 장치, 및스테레오스코픽 영상의 부분 데이터 구간 재생 방법과 장치 |
CN101668219B (zh) | 2008-09-02 | 2012-05-23 | 华为终端有限公司 | 3d视频通信方法、发送设备和系统 |
WO2010093350A1 (en) * | 2009-02-13 | 2010-08-19 | Thomson Licensing | Depth map coding using video information |
EP2672713A4 (en) * | 2012-01-13 | 2014-12-31 | Sony Corp | TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEIVING DEVICE, AND RECEIVING METHOD |
- 2009-08-26: EP — EP20090811030, patent EP2328337A4, not active (ceased)
- 2009-08-26: WO — PCT/CN2009/073542, patent WO2010025655A1, active (application filing)
- 2011-03-01: US — US13/038,055, patent US9060165B2, active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1432969A (zh) * | 2001-11-27 | 2003-07-30 | 三星电子株式会社 | 基于深度图像表示三维物体的装置和方法 |
US20050254817A1 (en) * | 2004-05-13 | 2005-11-17 | Mckee William J | Autostereoscopic electronic camera |
US20060227132A1 (en) * | 2005-04-11 | 2006-10-12 | Samsung Electronics Co., Ltd. | Depth image-based representation method for 3D object, modeling method and apparatus, and rendering method and apparatus using the same |
US20070201859A1 (en) * | 2006-02-24 | 2007-08-30 | Logitech Europe S.A. | Method and system for use of 3D sensors in an image capture device |
EP1931150A1 (en) * | 2006-12-04 | 2008-06-11 | Koninklijke Philips Electronics N.V. | Image processing system for processing combined image data and depth data |
CN101312542A (zh) * | 2008-07-07 | 2008-11-26 | 浙江大学 | 一种自然三维电视系统 |
Non-Patent Citations (1)
Title |
---|
See also references of EP2328337A4 * |
Also Published As
Publication number | Publication date |
---|---|
US9060165B2 (en) | 2015-06-16 |
EP2328337A4 (en) | 2011-08-10 |
US20110150101A1 (en) | 2011-06-23 |
EP2328337A1 (en) | 2011-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2010025655A1 (zh) | 3d视频通信方法、发送设备、系统及图像重构方法和系统 | |
US8446459B2 (en) | Video communication method, device, and system | |
US11363240B2 (en) | System and method for augmented reality multi-view telepresence | |
CN101668219B (zh) | 3d视频通信方法、发送设备和系统 | |
CN101453662B (zh) | 立体视频通信终端、系统及方法 | |
Stankiewicz et al. | A free-viewpoint television system for horizontal virtual navigation | |
JP5763184B2 (ja) | 3次元画像に対する視差の算出 | |
CN101651841B (zh) | 一种立体视频通讯的实现方法、系统和设备 | |
US20120139906A1 (en) | Hybrid reality for 3d human-machine interface | |
US20100134599A1 (en) | Arrangement and method for the recording and display of images of a scene and/or an object | |
CN101662694B (zh) | 视频的呈现方法、发送、接收方法及装置和通信系统 | |
WO2009092233A1 (zh) | 多视角摄像及图像处理装置、系统及方法与解码处理方法 | |
CN206117890U (zh) | 一种全景360度虚拟现实成像系统 | |
CN105847778B (zh) | 360°多视点3d全息视频采集方法、设备及实现方法 | |
KR101158678B1 (ko) | 입체 영상 시스템 및 입체 영상 처리 방법 | |
KR20050083352A (ko) | 휴대용 단말장치에서 스테레오 카메라를 이용하여 파노라믹 영상과 3차원 영상을 획득 및 디스플레이를 할 수 있는 장치 및 그 방법. | |
CN114885147B (zh) | 融合制播系统及方法 | |
CN111787302A (zh) | 基于线扫描相机的立体全景直播拍摄系统 | |
Adhikarla et al. | Fast and efficient data reduction approach for multi-camera light field display telepresence systems | |
CN109379579A (zh) | 一种实时采集光场真三维数据的处理方法 | |
Domański et al. | Efficient transmission of 3d video using MPEG-4 AVC/H. 264 compression technology | |
Wang et al. | Designing a communication system for IVAS-Stereo Video Coding Based on H. 264 | |
KR20120041532A (ko) | 양안식 비디오 송신 장치 및 그 방법 | |
KR20120089603A (ko) | 모노스코픽 2d 비디오 및 대응하는 깊이 정보로부터 3d 비디오를 생성하기 위한 방법 및 시스템 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09811030 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2009811030 Country of ref document: EP |