CN103269423A - Scalable 3D display remote video communication method - Google Patents
- Publication number: CN103269423A (application CN201310176717.7A)
- Authority: CN (China)
- Prior art keywords: data, user, three-dimensional display, image, video communication
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Landscapes
- Processing Or Creating Images (AREA)
Abstract
Description
Technical Field
The invention relates to the field of remote video communication, and in particular to a scalable three-dimensional (3D) display remote video communication method.
Background Art
Remote video communication systems play an increasingly important role in modern society, and the industry has adopted many technologies to improve their virtual-reality effect so that remote-conference participants enjoy a more immersive experience. Three-dimensional display technology has undergone unprecedented development in both academia and industry and is now ready to be used in remote video communication systems. Because the latest 3D display technologies can provide natural visualization together with important visual cues such as eye contact, facial expressions, and body language, the advantages of applying them to remote video communication are self-evident.
Several 3D display technologies have already been applied to remote video communication in academia, among which parallax-based autostereoscopic 3D display is a relatively mature approach. Such multi-view autostereoscopic displays provide high-resolution 3D images at a set of fixed viewing angles. However, as the number of viewing positions increases, the display quality degrades: smooth motion parallax cannot be provided over a large range of viewing angles, which severely limits how freely participants of a remote video communication can move around. Volumetric 3D display systems based on a rotating screen are limited in display volume by the structural design of their rotation mechanism, and their cost, complexity, and poor scalability further restrict their use in commercial remote video communication systems.
A market-oriented 3D remote video communication system must satisfy several functional requirements. First and most basic, participants must be able to talk, walk around, and gesture, with as few restrictions as possible. Second, the system must provide a natural 3D display with smooth binocular parallax and motion parallax. At the same time, the appearance, speech, and other information of the participants must be captured quickly enough to be presented in real time. From a commercial standpoint, any end-to-end 3D display remote video communication system should also be low-cost, simple, and easy to deploy so that it can be marketed on a large scale.
For real-time capture of 3D information about people, the emergence of RGB-D cameras offers a convenient and inexpensive potential solution; however, reports of their application to video communication, and to 3D video communication in particular, remain rare.
Summary of the Invention
The purpose of the present invention is to overcome the shortcomings of existing remote video communication solutions in virtual-reality effect, user freedom, and implementation cost, and to provide a market-oriented, scalable 3D display remote video communication system and method.
A scalable 3D display remote video communication method is implemented as follows:
Data sending end: acquire the first user's image and voice information, and extract the feature information and texture data from the image.
1) Use an RGB-D camera to acquire an image of the first user; this image contains a texture image and a depth image;
extract facial feature information and expression feature information from the texture image;
extract body feature information from the depth image and reconstruct a point cloud of the person;
2) use the point cloud to optimize the feature information obtained in step 1), and generate 3D model A from the optimized feature information;
3) project 3D model A onto the corresponding texture image and extract the texture data associated with model A; acquire voice information while the image is being captured, and send the voice information, the optimized feature information, and the extracted texture data to the second user;
4) repeat steps 1) to 3), using the data of each subsequent moment to update the data of the previous moment and sending the updates to the second user;
Data receiving end:
After the second user receives the data from the first user over the network, the optimized feature information is extracted from it to generate 3D model B, the extracted texture data is used to render model B, the rendering result is output through a 3D display device, and the voice information is played back.
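The sending-end steps 1) to 4) and the receiving-end step above can be sketched as a per-frame pipeline. All function names and data shapes here are illustrative stand-ins (the patent does not specify an API); the stub callables only mirror the data flow of the claims.

```python
from dataclasses import dataclass

@dataclass
class FramePacket:
    """One per-frame transmission unit sent from the first to the second user."""
    features: dict   # optimized facial/expression/body feature information
    texture: bytes   # texture data extracted by projecting model A
    voice: bytes     # voice chunk captured alongside the image

def sender_frame(color_frame, depth_frame, voice_chunk,
                 extract_features, build_point_cloud,
                 optimize_features, build_model, extract_texture):
    """Steps 1)-3): capture -> extract -> optimize -> model A -> texture -> pack."""
    raw = extract_features(color_frame, depth_frame)   # step 1: face/expression/body
    cloud = build_point_cloud(depth_frame)             # step 1: person point cloud
    features = optimize_features(raw, cloud)           # step 2: optimize against cloud
    model_a = build_model(features)                    # step 2: generate model A
    texture = extract_texture(model_a, color_frame)    # step 3: project and extract
    return FramePacket(features=features, texture=texture, voice=voice_chunk)

def receiver_frame(packet, build_model, render, play_voice):
    """Receiving end: rebuild model B from features, render with texture, play voice."""
    model_b = build_model(packet.features)
    image = render(model_b, packet.texture)
    play_voice(packet.voice)
    return image
```

Step 4) is then just calling `sender_frame` in a loop, with each new packet superseding the previous one.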
At the data sending end, the first user's image and voice information are acquired, and the feature information and texture data are extracted from the image and sent to the data receiving end; the second user at the receiving end performs modeling and rendering based on the received data, outputs the result through a 3D display device, and plays the corresponding voice information, thereby realizing remote video communication between the data sending end and the data receiving end.
At the data sending end, when the RGB-D camera captures the first user, the camera's viewpoint relative to the user is changed at different moments so that images of the user are obtained from different angles; a 3D model is then built from the feature information in these images, achieving the 3D display effect.
The voice information, the optimized feature information, and the extracted texture data are compressed, packed, and sent to the second user over the network. The voice information and optimized feature information are small and may be transmitted without compression with little effect on transmission speed; the texture data, however, is large and resource-intensive. Without reduction and compression its transmission would be slow, introducing a large delay between the first and second users and degrading the real-time performance of the video communication.
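A minimal sketch of this pack-and-compress step, compressing only the bulky texture while leaving the small feature and voice payloads uncompressed. The wire format (a 12-byte length header followed by three payloads) and the use of zlib are assumptions for illustration, not the patent's actual codec.

```python
import json
import struct
import zlib

def pack_frame(features: dict, texture: bytes, voice: bytes) -> bytes:
    """Pack one frame for network transmission; only the texture is compressed."""
    feat = json.dumps(features).encode("utf-8")
    tex = zlib.compress(texture)
    header = struct.pack("!III", len(feat), len(tex), len(voice))
    return header + feat + tex + voice

def unpack_frame(blob: bytes):
    """Inverse of pack_frame, run at the data receiving end."""
    n_feat, n_tex, n_voice = struct.unpack("!III", blob[:12])
    body = blob[12:]
    feat = json.loads(body[:n_feat].decode("utf-8"))
    tex = zlib.decompress(body[n_feat:n_feat + n_tex])
    voice = body[n_feat + n_tex:n_feat + n_tex + n_voice]
    return feat, tex, voice
```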
When the data of the previous moment is updated with the data of the next moment, the optimized feature information of the next moment is used to update that of the previous moment, generating an updated 3D model A; the updated model A is projected onto the texture image of the next moment, and the updated texture data is extracted.
Preferably, the 3D display device comprises a projection module and a directional diffusing screen module, where the directional diffusing screen is a screen with bidirectional scattering characteristics and the projection module is a projection array. The projection array is either an array of projectors or a combination of a 2D display and a lens array. The 3D display device is a real-time spatial 3D presentation system based on integral light-field 3D display technology; it outputs a spatial 3D scene image with a wide viewing angle and continuous motion parallax, allowing observers to form 3D vision through binocular parallax and achieving a true 3D video effect.
The present invention combines integral light-field 3D display technology with real-time person capture based on RGB-D cameras. Each terminal uses integral light-field 3D display to deliver a realistic real-time 3D effect with fine binocular and motion parallax, while RGB-D cameras capture the participants and the terminals are interconnected over the Internet, exploiting both the visual advantages of the 3D display system and the low cost and efficient person capture of RGB-D cameras.
The main advantages of the present invention are as follows:
It integrates the relevant resources. An inexpensive and easy-to-use RGB-D depth camera captures and processes data in real time to obtain good-quality interaction information about the participants; this information is reduced and compressed to lower its network bandwidth requirements so that it can be transmitted over the Internet in real time; and a high-performance real-time light-field 3D display device is employed to achieve a convincing virtual-reality effect, giving the participants of the remote video communication an immersive interactive experience.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the basic logical structure of the 3D display remote video communication method of the present invention.
Fig. 2 is a flowchart of the data sending end acquiring the interaction data.
Fig. 3 is a flowchart of the data receiving end generating a virtual-reality scene from the interaction data.
Detailed Description of the Embodiments
As shown in Fig. 1, the system implementing the 3D display remote video communication method of the present invention comprises a data sending end 1 and a data receiving end 2 connected to it through a network; the two ends exchange data in real time to realize remote video communication. Both ends receive and send data simultaneously and have identical logical structures; they are designated data sending end 1 and data receiving end 2 only for convenience of description.
The data sending end 1 comprises a host computer 8, a 3D display device 5, and a person capture device 6, where the host computer 8 contains a network transmission module 9, a 3D display software module 4, and a person capture software module 3. The 3D display device 5 and the person capture device 6 are connected to the 3D display software module 4 and the person capture software module 3 of the host computer 8, respectively; both software modules are connected to the network transmission module 9, which connects to the data receiving end 2 over the Internet. The 3D display device 5 outputs image data into the communication interaction area 7, and the person capture software module 3 acquires images of the people within area 7; both the visible region of the 3D display device 5 and the capture region of the person capture device 6 must cover the communication interaction area 7.
The network transmission module 9 reduces and compresses the data to be transmitted over the network and sends it to the data receiving end 2 via the Internet; it also receives data sent by other terminals over the Internet. The data comprise the person's expression parameters, body-motion parameters, the texture updates, and voice information; the data volume is small and, after compression, satisfies real-time network transmission.
The 3D display device 5 is a real-time spatial 3D presentation system using integral light-field 3D display technology, which is one of the advantages of the present invention. Light-field 3D display works by reconstructing the spatial distribution of the light rays emitted by a 3D scene; a display device built on this principle mainly comprises a projection module and a directional diffusing screen module. These modules can be arranged in various configurations; here a multi-projector light-field display system providing 360-degree panoramic viewing is taken as an example. Projectors distributed in a ring around the screen project combined images of the 3D model or scene to be displayed, rendered from the corresponding angles, onto the central region of an annular screen. This annular screen is a directional diffusing screen with bidirectional scattering characteristics, i.e., a rotationally symmetric screen structure with a specific narrow scattering angle in the horizontal direction and a large scattering angle in the vertical direction.
The images projected by the ring of projectors are thereby converted into a spatial 3D scene image viewable over a full 360 degrees by observers in the viewing area surrounding the screen. According to the principle of light-field reconstruction and the characteristics of the annular directional diffusing screen, at each position in the viewing area only a narrow strip of the image projected by the projector corresponding to that position can be seen; the combination of the many narrow strips from the many projectors visible at that position forms the complete picture there, presenting complete image information in a manner similar to an integrally stitched light field. Different positions in the viewing area thus see different images corresponding to those positions, which guarantees that the two eyes of an observer receive different image information, forming 3D vision through binocular parallax; motion parallax is likewise obtained by moving between horizontal positions. The 3D display device 5 depends on the 3D display software module 4 only in that the device requires the module to supply the models, textures, and other data and parameters necessary for rendering the 3D effect; the specific implementation of the 3D display software module 4 is independent of the 3D display device 5.
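The horizontal selectivity described above (each viewing position sees, at each screen point, a narrow strip from one projector) can be illustrated with a toy flattened geometry. The coordinates, the angle model, and the scattering half-angle `sigma_deg` are all hypothetical; a real annular screen would use polar geometry.

```python
import math

def visible_projector(screen_x, viewer, projectors, sigma_deg=1.5):
    """Return the index of the projector whose ray reaches `viewer` through the
    screen point at `screen_x`, or None if no ray falls inside the cone.

    Toy model: the screen lies on z = 0, projectors are behind it (z < 0),
    the viewer is in front (z > 0), and the screen scatters light only within
    a horizontal cone of half-angle sigma_deg around each ray's continuation."""
    vx, vz = viewer
    out_angle = math.atan2(vx - screen_x, vz)          # direction screen -> viewer
    best, best_err = None, None
    for i, (px, pz) in enumerate(projectors):
        in_angle = math.atan2(screen_x - px, -pz)      # ray's continuation direction
        err = abs(math.degrees(in_angle - out_angle))
        if err <= sigma_deg and (best_err is None or err < best_err):
            best, best_err = i, err
    return best
```

Sweeping `screen_x` across the screen for a fixed viewer selects a different projector per strip, which is exactly the stitching behaviour the paragraph describes.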
The main function of the 3D display remote video communication system is to realize real-time information exchange in two directions. The first direction is the acquisition and transmission of interaction data such as the participants' appearance, expressions, voice, and motion. The person capture device 6 captures the appearance, motion, and voice of the participating users in the communication interaction area 7 to obtain raw image and audio data. These data are passed to the person capture software module 3, which computes from them the person's shape, voice, expressions, and motions, optimizes and accumulates this information so that it better matches the real scene, and passes it to the network transmission module 9. The network transmission module 9 compresses and packs the data and sends it over the Internet to the other communication terminal (the data receiving end), completing the transmission in this direction. The second direction is receiving the interaction data of the remote participant (appearance, expressions, voice, motion) and generating the virtual-reality scene. The network transmission module 9 obtains the user data sent by the other terminal over the Internet and passes the decompressed data to the 3D display software module 4, which computes from the received shape, expression, motion, and voice data the 3D model and texture information to be presented and delivers them to the 3D display device 5, so that the device renders the corresponding 3D effect and plays the voice for the participants in the communication interaction area 7. With real-time information exchange in both directions, users at the different terminals of the system can observe each other's appearance, expressions, and motions in 3D, hear each other's voices, and feed their own information back to the other side, all in real time.
The specific implementation of the real-time information exchange in the two directions described above comprises the following steps.
As shown in Fig. 2, at the data sending end:
1) Use an RGB-D camera to acquire an image of the first user; this image contains a texture image and a depth image.
Facial feature information and expression feature information are extracted from the texture image.
Body feature information is extracted from the depth image, and a point cloud of the person is reconstructed.
The person capture device is an inexpensive, easy-to-use, and easily deployed RGB-D camera; when capturing images of the first user, the camera's viewpoint relative to the user is changed at different moments. In this embodiment the RGB-D camera is a Kinect. The Kinect collects the color and depth data streams of the scene in real time and delivers them to the person capture software module as color frames and depth frames. These streams are themselves the texture and depth information of the scene, including the texture and depth data of the person. The person capture software module 3 extracts the person's facial and expression feature information from the texture data in real time and the body feature information from the depth data; together these define the person's geometry, facial expressions, and body motion. From the depth data a point cloud of the person can be reconstructed, reflecting the person's raw geometry.
In this embodiment a Kinect is used as the RGB-D camera. For Kinect software development, Microsoft provides the Kinect Software Development Kit (SDK); using the interfaces of the Kinect SDK, the feature information can be extracted and the person's point cloud reconstructed.
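For illustration, the depth-to-point-cloud reconstruction that the SDK performs can be sketched as a standard pinhole back-projection. The intrinsic parameters `FX, FY, CX, CY` below are assumed typical values for a Kinect-class camera; a real system would read the calibration from the SDK rather than hard-code it.

```python
import numpy as np

# Assumed pinhole intrinsics (focal lengths and principal point, in pixels).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def depth_to_point_cloud(depth_m: np.ndarray) -> np.ndarray:
    """Back-project a depth image (metres, H x W) into an N x 3 point cloud.

    Zero-depth pixels (no measurement) are dropped; the remaining pixels are
    inverted through the pinhole model x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    valid = z > 0
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.stack([x[valid], y[valid], z[valid]], axis=1)
```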
2) Use the point cloud to optimize the feature information obtained in step 1), and generate 3D model A from the optimized feature information.
The feature information here comprises the facial, expression, and body feature information described above; the reconstructed point cloud is used to optimize each feature signal, yielding the optimized feature information, from which 3D model A is generated.
Feature information optimization: an energy function is defined with the optimized feature information as the unknown. The function is the sum of two terms: the geometric discrepancy between the model generated from the optimized feature information and the model generated from the unoptimized feature information, and the geometric discrepancy between the model generated from the optimized feature information and the point cloud data. Solving for the minimum of this energy function yields the optimized feature information.
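The patent states this energy only verbally. One plausible formalization, with assumed symbols theta (the feature vector being optimized), M(theta) (the model it generates), theta_0 (the unoptimized features), P (the point cloud), and a trade-off weight lambda, is:

```latex
E(\theta) \;=\;
\underbrace{\big\| M(\theta) - M(\theta_0) \big\|^2}_{\text{deviation from the unoptimized model}}
\;+\;
\lambda \underbrace{\sum_{p \in \mathcal{P}} \operatorname{dist}\!\big(p,\, M(\theta)\big)^2}_{\text{deviation from the point cloud}},
\qquad
\theta^{\ast} = \arg\min_{\theta}\, E(\theta).
```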
Generating the 3D model from the feature information uses model deformation methods, including morph-target (Morph) techniques and skinned-mesh techniques, both of which are conventional.
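As a sketch of the Morph (blend-shape) half of that deformation step: vertices of a base mesh are offset by weighted morph-target displacements. Names and array shapes are illustrative; the skinned-mesh half (weighted bone transforms) is omitted.

```python
import numpy as np

def morph(base_verts: np.ndarray, targets: dict, weights: dict) -> np.ndarray:
    """Blend-shape deformation: offset base vertices (N x 3) by a weighted sum
    of morph-target displacement vectors. Each target is a full N x 3 vertex
    array; its displacement is (target - base). The base mesh is not modified."""
    out = base_verts.astype(float).copy()
    for name, w in weights.items():
        out += w * (targets[name] - base_verts)
    return out
```

The optimized expression features of step 2) would supply the `weights`; the pre-stored standard person model supplies `base_verts` and `targets`.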
3) Project 3D model A onto the corresponding texture image and extract the texture data associated with model A; acquire voice information while the image is being captured, and send the voice information, the optimized feature information, and the extracted texture data to the second user.
3D model A is projected into the texture image captured by the Kinect, and the texture data corresponding to model A is extracted; the voice information can be obtained directly from the Kinect's sound capture device. The network transmission module 9 then compresses and packs the voice information, the optimized feature information, and the extracted texture data, and sends them to the second user. From every color frame captured by the Kinect, the person's raw texture can be extracted; the raw texture data collected over many frames is accumulated and optimized to improve robustness and obtain a better-quality texture.
Texture optimization: each person has a base texture, and each newly extracted texture is used to update it. The update is cumulative and refines the texture. Specifically, for each pixel a probability distribution is estimated from the texture data accumulated for that pixel, and the pixel's color value is updated to the expected value of that distribution.
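A minimal instantiation of this cumulative update, assuming the per-pixel distribution is the empirical one over the samples seen so far, so its expectation is a running mean computed incrementally. The class and field names are invented for the sketch.

```python
import numpy as np

class BaseTexture:
    """Cumulative per-pixel texture refinement via an incremental running mean
    (the expectation of the empirical per-pixel distribution)."""

    def __init__(self, shape):
        self.mean = np.zeros(shape, dtype=float)       # current base texture (H, W, C)
        self.count = np.zeros(shape[:2], dtype=int)    # samples seen per pixel

    def update(self, frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
        """Fold in the pixels of `frame` where `mask` (H, W booleans) is True,
        i.e. where the new extraction actually observed the person's texture."""
        c = self.count[mask] + 1
        self.count[mask] = c
        self.mean[mask] += (frame[mask] - self.mean[mask]) / c[:, None]
        return self.mean
```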
4) Repeat steps 1) to 3), updating the data of the previous moment with the data of the next moment and sending the updates to the second user.
By capturing images of the person from repeatedly changed viewpoints and accumulating and optimizing the raw texture data collected over time, the complete appearance, expression, and motion data of the person is obtained in combination with the corresponding 3D model.
As shown in Fig. 3, at the data receiving end:
After the second user receives the first user's data over the network, the optimized feature information is extracted from it to generate 3D model B, the extracted texture data is used to render model B, the rendering result is output through the 3D display device, and the voice information is played back.
The network transmission module in the data receiving end receives the packets sent over the Internet, decompresses the data, and performs data synchronization. It then provides the data currently needed for virtual-reality presentation to the 3D display software module: the feature information of the person's geometry, facial expressions, and body motion, the texture updates, and the voice information. The 3D display software module proceeds as follows: the geometry, expression, and motion feature information is fed as parameters into an algorithm that deforms a standard person model pre-stored in the module; this deformation yields a person model matching the static and dynamic feature descriptions above. The module initially stores the base texture of the person model; each time texture updates arrive, it updates the base texture and accumulates the changes to obtain the currently usable texture. The voice data can be used directly.
The 3D display software module submits the computed person model data and texture data to the 3D display device for virtual-reality presentation, while playing the person's voice.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310176717.7A CN103269423B (en) | 2013-05-13 | 2013-05-13 | Can expansion type three dimensional display remote video communication method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310176717.7A CN103269423B (en) | 2013-05-13 | 2013-05-13 | Can expansion type three dimensional display remote video communication method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103269423A true CN103269423A (en) | 2013-08-28 |
CN103269423B CN103269423B (en) | 2016-07-06 |
Family
ID=49013031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310176717.7A Active CN103269423B (en) | 2013-05-13 | 2013-05-13 | Can expansion type three dimensional display remote video communication method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103269423B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104853134A (en) * | 2014-02-13 | 2015-08-19 | 腾讯科技(深圳)有限公司 | Video communication method and video communication device |
CN104935860A (en) * | 2014-03-18 | 2015-09-23 | 北京三星通信技术研究有限公司 | Method and device for realizing video call |
CN105763828A (en) * | 2014-12-18 | 2016-07-13 | 中兴通讯股份有限公司 | Instant communication method and device |
CN105812708A (en) * | 2016-03-18 | 2016-07-27 | 严俊涛 | Video call method and system |
CN106412562A (en) * | 2015-07-31 | 2017-02-15 | 深圳创锐思科技有限公司 | Method and system for displaying stereoscopic content in three-dimensional scene |
CN107846566A (en) * | 2017-10-31 | 2018-03-27 | 努比亚技术有限公司 | A kind of information processing method, equipment and computer-readable recording medium |
CN108139801A (en) * | 2015-12-22 | 2018-06-08 | 谷歌有限责任公司 | For performing the system and method for electronical display stabilization via light field rendering is retained |
WO2018120657A1 (en) * | 2016-12-27 | 2018-07-05 | 华为技术有限公司 | Method and device for sharing virtual reality data |
CN109299184A (en) * | 2018-07-31 | 2019-02-01 | 武汉大学 | A unified coding visualization method for 3D point cloud in near-Earth space |
JP2019512173A (en) * | 2016-01-22 | 2019-05-09 | 上海肇觀電子科技有限公司NextVPU (Shanghai) Co., Ltd. | Method and apparatus for displaying multimedia information |
CN109814718A (en) * | 2019-01-30 | 2019-05-28 | 天津大学 | A Multimodal Information Acquisition System Based on Kinect V2 |
CN113784109A (en) * | 2021-09-07 | 2021-12-10 | 太仓中科信息技术研究院 | Projection system and method for script killing environment |
CN114339190A (en) * | 2021-12-29 | 2022-04-12 | 中国电信股份有限公司 | Communication method, device, equipment and storage medium |
- 2013-05-13 CN CN201310176717.7A patent/CN103269423B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101101672A (en) * | 2007-07-13 | 2008-01-09 | 中国科学技术大学 | Stereo vision 3D face modeling method based on virtual image correspondence |
CN101668219B (en) * | 2008-09-02 | 2012-05-23 | 华为终端有限公司 | Communication method, transmitting equipment and system for 3D video |
WO2012126135A1 (en) * | 2011-03-21 | 2012-09-27 | Intel Corporation | Method of augmented makeover with 3d face modeling and landmark alignment |
US20120274745A1 (en) * | 2011-04-29 | 2012-11-01 | Austin Russell | Three-dimensional imager and projection device |
CN102520787A (en) * | 2011-11-09 | 2012-06-27 | 浙江大学 | Real-time spatial three-dimensional presentation system and real-time spatial three-dimensional presentation method |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104853134B (en) * | 2014-02-13 | 2019-05-07 | 腾讯科技(深圳)有限公司 | Video communication method and device |
CN104853134A (en) * | 2014-02-13 | 2015-08-19 | 腾讯科技(深圳)有限公司 | Video communication method and video communication device |
CN104935860A (en) * | 2014-03-18 | 2015-09-23 | 北京三星通信技术研究有限公司 | Method and device for realizing video call |
CN105763828A (en) * | 2014-12-18 | 2016-07-13 | 中兴通讯股份有限公司 | Instant communication method and device |
CN106412562B (en) * | 2015-07-31 | 2019-10-25 | 深圳超多维科技有限公司 | Method and system for displaying stereoscopic content in a three-dimensional scene |
CN106412562A (en) * | 2015-07-31 | 2017-02-15 | 深圳创锐思科技有限公司 | Method and system for displaying stereoscopic content in three-dimensional scene |
CN108139801A (en) * | 2015-12-22 | 2018-06-08 | 谷歌有限责任公司 | System and method for performing electronic display stabilization via preserving light field rendering |
CN108139801B (en) * | 2015-12-22 | 2021-03-16 | 谷歌有限责任公司 | System and method for performing electronic display stabilization via preserving light field rendering |
JP2019512173A (en) * | 2016-01-22 | 2019-05-09 | NextVPU (Shanghai) Co., Ltd. | Method and apparatus for displaying multimedia information |
CN105812708A (en) * | 2016-03-18 | 2016-07-27 | 严俊涛 | Video call method and system |
WO2018120657A1 (en) * | 2016-12-27 | 2018-07-05 | 华为技术有限公司 | Method and device for sharing virtual reality data |
CN107846566A (en) * | 2017-10-31 | 2018-03-27 | 努比亚技术有限公司 | Information processing method, device, and computer-readable storage medium |
CN109299184B (en) * | 2018-07-31 | 2022-04-29 | 武汉大学 | Unified coding visualization method for three-dimensional point cloud in near-earth space |
CN109299184A (en) * | 2018-07-31 | 2019-02-01 | 武汉大学 | A unified coding visualization method for 3D point cloud in near-Earth space |
CN109814718A (en) * | 2019-01-30 | 2019-05-28 | 天津大学 | A Multimodal Information Acquisition System Based on Kinect V2 |
CN113784109A (en) * | 2021-09-07 | 2021-12-10 | 太仓中科信息技术研究院 | Projection system and method for script killing environment |
CN114339190A (en) * | 2021-12-29 | 2022-04-12 | 中国电信股份有限公司 | Communication method, device, equipment and storage medium |
CN114339190B (en) * | 2021-12-29 | 2023-06-23 | 中国电信股份有限公司 | Communication method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN103269423B (en) | 2016-07-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103269423B (en) | Scalable 3D display remote video communication method | |
CN114631127B (en) | Small sample synthesis of talking heads | |
CN113099204B (en) | Remote live-action augmented reality method based on VR head-mounted display equipment | |
CN112533002A (en) | Dynamic image fusion method and system for VR panoramic live broadcast | |
WO2019041351A1 (en) | Real-time aliasing rendering method for 3d vr video and virtual three-dimensional scene | |
US20240296626A1 (en) | Method, apparatus, electronic device and storage medium for reconstructing 3d images | |
CN106302132A (en) | Augmented reality-based 3D instant messaging system and method | |
WO2022209129A1 (en) | Information processing device, information processing method and program | |
CN114900678B (en) | VR end-cloud combined virtual concert rendering method and system | |
JP7430411B2 (en) | 3D object streaming method, device, and program | |
CN113382275B (en) | Live broadcast data generation method and device, storage medium and electronic equipment | |
CN105869215A (en) | Virtual reality imaging system | |
US20250037356A1 (en) | Augmenting a view of a real-world environment with a view of a volumetric video object | |
CN110351514A (en) | Method for synchronously transmitting a virtual model with a video stream via remote assistance | |
KR102674577B1 (en) | Reference to a neural network model by immersive media for adaptation of media for streaming to heterogeneous client endpoints | |
CN112532963B (en) | AR-based three-dimensional holographic real-time interaction system and method | |
KR20220110787A (en) | Adaptation of 2D video for streaming to heterogeneous client endpoints | |
CN117596373A (en) | Method and electronic device for information display based on dynamic digital human image | |
CN116744027A (en) | Meta universe live broadcast system | |
CN114554232B (en) | Naked eye 3D-based mixed reality live broadcast method and system | |
CN113992921A (en) | New technology for virtual reality live video communication | |
TWI774063B (en) | Horizontal/vertical direction control device for three-dimensional broadcasting image | |
Hsu et al. | Holographic remote interactive operating technology for controlling networked communication | |
CN105812780A (en) | A helmet with dual cameras | |
CN113949929A (en) | Lifelike video communication technology | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |