WO2009074110A1 - Communication terminal and information system - Google Patents


Info

Publication number
WO2009074110A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
video
module
communication terminal
information
Prior art date
Application number
PCT/CN2008/073398
Other languages
English (en)
French (fr)
Inventor
Ping Fang
Yuan Liu
Jing Wang
Kai Li
Original Assignee
Shenzhen Huawei Communication Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huawei Communication Technologies Co., Ltd.
Priority to EP08860354A (EP2136602A4, en)
Priority to BRPI0819332-0A (BRPI0819332A2, pt)
Publication of WO2009074110A1 (zh)
Priority to US12/617,914 (US20100053307A1, en)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/139 Format conversion, e.g. of frame-rate or size
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/207 Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H04N13/221 Image signal generators using stereoscopic image cameras using a single 2D image sensor using the relative movement between cameras and objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10021 Stereoscopic video; Stereoscopic image sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2213/00 Details of stereoscopic systems
    • H04N2213/007 Aspects relating to detection of stereoscopic image format, e.g. for adaptation to the display format

Definitions

  • Embodiments of the present invention relate to the field of mobile communications, and in particular to a communication terminal and an information system.
  • Background
  • A two-dimensional image or video is a two-dimensional information carrier. It can express only the content of a scene; it cannot express depth information such as the distance and position of objects, and is therefore incomplete.
  • Three-dimensional (3D) video is based on the principle of human binocular parallax. A camera captures two slightly different images of the same scene, which are displayed to the left eye and the right eye respectively to form binocular parallax, so that the viewer obtains scene depth information and experiences a three-dimensional sense.
  • Stereoscopic images generally have two conventional forms, that is, a stereoscopic image composed of left and right image pairs respectively for the left and right eyes, or a stereoscopic image composed of a two-dimensional image and a depth map corresponding to the two-dimensional image.
  • Stereoscopic video technology can therefore provide depth information that conforms to the principle of stereoscopic vision, so that scenes of the objective world can be reproduced more realistically, conveying the depth, layering, and realism of the scene. As time goes on, stereoscopic image/video technology is becoming more and more important.
  • Stereoscopic image/video technology is widely used in stereoscopic movies, television, stereoscopic video conferencing, virtual reality systems, remote industrial control, robot navigation and telemedicine.
  • Stereoscopic image/video technology is also needed in daily life. For example, a user who encounters beautiful scenery while traveling may wish to send a stereoscopic message so that other users can also view the current scene stereoscopically; as another example, during a stereoscopic video call, or when stereoscopic image/video information is sent, both parties can perceive each other's stereoscopic scene while hearing each other's voice.
  • An existing three-dimensional image communication terminal includes: a three-dimensional image input module having a plurality of cameras for capturing images of a stereoscopic object; a three-dimensional image display module for displaying three-dimensional image information; and a communication module for transmitting at least the three-dimensional image information obtained by the three-dimensional image input module. The three-dimensional image display module consists of a horizontal/vertical parallax display device of the integral imaging type, and the cameras are arranged at least above/below and to the left/right, in the vicinity of the three-dimensional image display device.
  • In implementing the present invention, the inventors found that the prior art has at least the following problems: the existing three-dimensional communication terminal must use a plurality of cameras to complete the input of three-dimensional image information, so its structure is complicated, and it cannot store the three-dimensional image/video information.
  • Embodiments of the present invention provide a communication terminal and an information system, which realize stereoscopic image/video collection, implementation, and transmission by using a simple structure.
  • Embodiments of the present invention provide an information system for implementing image/video information transmission using a simple structure.
  • An embodiment of the present invention provides a communication terminal, including:
  • an image/video collection generation module for collecting images/videos and generating stereoscopic image/video data;
  • a communication module for transmitting images/videos and the generated stereoscopic image/video data, or receiving images/videos and stereoscopic image/video data; and
  • an image/video display module for displaying a stereoscopic image/video based on the collected image/video, or on the received or generated stereoscopic image/video data.
  • An embodiment of the present invention provides an information system, including a stereoscopic image/video information center, where the stereoscopic image/video information center includes:
  • a stereoscopic image/video information server for storing stereoscopic image/video information and for converting between stereoscopic image/video information and short messages.
  • The communication terminal of the embodiment of the present invention can collect images/videos and generate stereoscopic image/video data with the image/video collection generation module, and then either send the image/video and the generated stereoscopic image/video data through the communication module, or receive image/video and stereoscopic image/video data and display them on the image/video display module. The structure is therefore simple and the result is good.
  • The information system of the embodiment of the present invention can carry stereoscopic image/video information communication, so that both parties can perceive the depth information of the other party's scene while hearing the other party's voice. This gives an immersive call effect and makes the call feel more intimate and more realistic to both parties.
  • FIG. 1 is a schematic structural diagram of an embodiment of a communication terminal according to the present invention.
  • FIG. 2 is a schematic diagram of photographing the same scene twice with one planar image/video camera according to an embodiment of the communication terminal of the present invention.
  • FIG. 3 shows two images with parallel-aligned scan lines according to an embodiment of the communication terminal of the present invention.
  • FIG. 4 is a schematic diagram of a method for processing a two-dimensional to three-dimensional image sequence in an embodiment of a communication terminal according to the present invention.
  • FIG. 5 is a schematic structural diagram of an image/video generation module according to an embodiment of a communication terminal according to the present invention.
  • FIG. 6 is a schematic diagram of image collection by an image/video collection module of an embodiment of a communication terminal according to the present invention.
  • FIG. 7 is a schematic diagram of processing by an image/video generation module according to an embodiment of a communication terminal of the present invention.
  • FIG. 9 is a schematic structural diagram of another embodiment of a communication terminal according to the present invention.
  • FIG. 10 is a schematic diagram of encoding by a layered coding method in a communication module of another embodiment of a communication terminal according to the present invention.
  • FIG. 11 is a schematic structural diagram of still another embodiment of a communication terminal according to the present invention.
  • FIG. 12 is a schematic structural diagram of an embodiment of an information system according to the present invention.
  • Detailed description
  • As shown in FIG. 1, which is a schematic structural diagram of an embodiment of a communication terminal according to the present invention, the communication terminal includes: an image/video collection generation module 1 for collecting images/videos and generating stereoscopic image/video data; a communication module 2 for transmitting the image/video and the generated stereoscopic image/video data, or receiving image/video and stereoscopic image/video data; and an image/video display module 3 for displaying a stereoscopic image/video according to the collected image/video, or the received or generated stereoscopic image/video data.
  • The image/video collection generation module 1 includes: an image/video collection module 11 for collecting images/videos; and an image/video generation module 12 for generating stereoscopic image/video data from the collected images/videos.
  • The image/video collection module 11 performs image/video collection. It may be a single-view image/video camera; the image/video camera may be one planar image/video camera or multiple (e.g., two) planar image/video cameras.
  • A planar image/video camera can collect only planar images/videos, so the image/video generation module 12 is required to process the collected planar images/videos into stereoscopic image/video data. There are many ways to do this.
  • When the image/video collection module 11 is a single planar image/video camera, it can capture two images of the same scene from different angles; the image/video generation module 12 then performs stereo matching to obtain a three-dimensional image/video.
  • The image/video generation module 12 may perform alignment preprocessing on the collected images of different angles to generate stereoscopic image/video data. Alternatively, it may perform alignment preprocessing on the collected images of different angles, stereo-match the two aligned images to obtain depth information, and reconstruct stereoscopic image/video data from one of the images and the depth information.
  • As shown in FIG. 2, which is a schematic diagram of photographing the same scene twice with one planar image/video camera according to an embodiment of the communication terminal of the present invention, the position shown by the solid line in the figure is where the camera captures the first image/video.
  • The image/video generation module 12 can preprocess the images using various methods, such as fast epipolar rectification or other methods that preprocess the images and align their scan lines, thereby eliminating the effect that a change in camera orientation after the communication terminal moves would otherwise have on the subsequent stereo matching.
  • After alignment, the two images captured by the camera are stereo-matched, where r represents the parallax of the image in the same coordinate system. From the resulting parallax relationship, the depth information of the target in the scene is obtained, thereby obtaining stereoscopic image/video data.
  • This method captures two images of the same scene from different angles with a single planar image/video camera to complete the input of three-dimensional image information; it has a simple structure and is easy to implement.
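  • The stereo-matching step on scan-line-aligned images can be illustrated with a minimal sketch. The text does not specify a matching algorithm; the block below assumes a simple sum-of-absolute-differences (SAD) search along one rectified row, and the function name `match_scanline` and its parameters are hypothetical.

```python
from typing import List

Row = List[int]


def match_scanline(left: Row, right: Row, window: int = 1, max_disp: int = 4) -> List[int]:
    """Return a per-pixel disparity for one rectified scan line.

    For each pixel x in the left row, compare a small window around x with
    the window around x - d in the right row (d in 0..max_disp) and keep the
    d with the smallest sum of absolute differences (SAD).
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)       # clamp at the row borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])
            if cost < best_cost:                      # strict '<' keeps the smallest d on ties
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# A small textured patch appears two pixels further left in the second image.
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
patch_disparity = match_scanline(left_row, right_row)[4:7]
```

  • Because the rows are assumed to be already aligned, the search is one-dimensional; that is the reason alignment preprocessing is performed before matching.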
  • When the image/video collection module 11 is a single planar image/video camera and only a captured planar image is used, the image/video generation module 12 needs to determine the depth information of the targets in the scene from the planar two-dimensional image; that is, it needs to create a depth map. By recognizing and understanding the targets in the image, it obtains their relative depth relationship and creates a depth map, thereby obtaining stereoscopic image/video data. This method is very convenient for small screens that handle stereoscopic images containing only a small amount of depth information.
  • When the image/video collection module 11 is a single planar image/video camera, the image/video generation module 12 can also use a two-dimensional to three-dimensional image sequence processing method that divides the targets in the scene into foreground and background, sets different depth information for each, and generates stereoscopic image/video data from the set depth information and the image/video.
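  • The foreground/background approach can be sketched as follows. The text does not say how the scene is segmented or how the second view is rendered, so this sketch assumes a ready-made foreground mask and a simple depth-proportional pixel shift; `layered_depth`, `synthesize_right_view`, and their parameters are hypothetical names.

```python
from typing import List, Optional

Image = List[List[int]]


def layered_depth(foreground_mask: List[List[bool]], near: int = 255, far: int = 0) -> Image:
    """Assign one depth value to foreground pixels and another to background pixels."""
    return [[near if is_fg else far for is_fg in row] for row in foreground_mask]


def synthesize_right_view(image: Image, depth: Image, max_shift: int = 1) -> Image:
    """Shift each pixel left by a disparity proportional to its depth value (0..255)."""
    height, width = len(image), len(image[0])
    right: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for y in range(height):
        # Paint far pixels first so that nearer pixels overwrite them.
        for x in sorted(range(width), key=lambda col: depth[y][col]):
            shift = (max_shift * depth[y][x] + 127) // 255  # rounded integer disparity
            if 0 <= x - shift < width:
                right[y][x - shift] = image[y][x]
        # Fill disoccluded holes from the neighbour on the right (usually background).
        for x in range(width - 1, -1, -1):
            if right[y][x] is None:
                right[y][x] = right[y][x + 1] if x + 1 < width else image[y][x]
    return right


mask = [[False, False, True, True, False, False]]
left_view = [[1, 1, 9, 9, 1, 1]]
depth_map = layered_depth(mask)
right_view = synthesize_right_view(left_view, depth_map, max_shift=1)
```

  • In this example the foreground pixels (value 9) move one column to the left in the synthesized view, while the background stays in place and fills the uncovered column.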
  • As shown in FIG. 4, which is a schematic diagram of a method for processing a two-dimensional to three-dimensional image sequence according to an embodiment of the communication terminal of the present invention, the image/video generation module 12 obtains depth data for points in the collected image sequence; uses the depth data and a classifier to determine depth characteristics as a function of image features and relative position; and uses the image features to create a depth map for at least one frame of the image sequence, thereby obtaining stereoscopic image/video data.
  • This method uses a single planar image/video camera to acquire consecutive planar image frames and creates a depth map of the image sequence from the image features and the relative-position function, thereby realizing the input of three-dimensional image sequence information.
  • When the image/video collection module 11 is a single planar image/video camera, as shown in FIG. 5, which is a schematic structural diagram of an image/video generation module according to an embodiment of the communication terminal of the present invention, the image/video generation module 12 may include:
  • a sample image acquisition sub-module 121, configured to acquire the current image and the previous frame image;
  • a motion detection sub-module 122, configured to detect motion pixels and still pixels by comparing corresponding pixels of the current image and the previous frame image;
  • a segmentation sub-module 123, configured to divide the current image into small search regions and, according to the result of the motion detection, generate a representative value for the motion pixels of each region;
  • a depth map generation sub-module 124, configured to detect the moving-object regions and set a depth value for each motion pixel group, to generate a depth map of the same size as the original image; and
  • a disparity processing sub-module 125, configured to generate stereoscopic image/video data.
  • The image/video generation module 12 obtains the motion pixels in the current image by comparing the current image with the previous frame image, and sets depth values for the motion pixel groups that constitute the moving-object regions to generate a depth map of the same size as the original image, thereby realizing the input of three-dimensional image information.
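  • The motion-based pipeline (frame comparison, region division, depth assignment) might look roughly like the sketch below. The thresholds, the block size, and the choice of "fraction of moving pixels" as the per-region representative value are assumptions made for illustration, and `motion_depth_map` is a hypothetical name.

```python
from typing import List

Image = List[List[int]]


def motion_depth_map(prev: Image, curr: Image, block: int = 2, diff_thresh: int = 10,
                     ratio: float = 0.5, moving_depth: int = 255,
                     static_depth: int = 0) -> Image:
    """Build a depth map of the same size as `curr` from frame differencing."""
    height, width = len(curr), len(curr[0])
    # Motion detection: a pixel is "moving" if it changed by more than diff_thresh.
    motion = [[abs(curr[y][x] - prev[y][x]) > diff_thresh for x in range(width)]
              for y in range(height)]
    depth = [[static_depth] * width for _ in range(height)]
    # Region division: walk the image in small blocks (the search regions).
    for by in range(0, height, block):
        for bx in range(0, width, block):
            ys = range(by, min(by + block, height))
            xs = range(bx, min(bx + block, width))
            moving = sum(motion[y][x] for y in ys for x in xs)  # representative value
            # Depth assignment: mark the whole block as a moving-object region.
            if moving >= ratio * len(ys) * len(xs):
                for y in ys:
                    for x in xs:
                        depth[y][x] = moving_depth
    return depth


previous_frame = [[0] * 4 for _ in range(4)]
current_frame = [[100, 100, 0, 0], [100, 100, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
depth = motion_depth_map(previous_frame, current_frame)
```

  • The sketch assigns a single depth to every moving block; the disparity processing step that turns the depth map into a stereo pair is not shown.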
  • When the image/video collection module 11 consists of two planar image/video cameras, FIG. 6 is a schematic diagram of image collection by the image/video collection module of the communication terminal embodiment of the present invention, and FIG. 7 is a schematic diagram of the processing by the image/video generation module of the communication terminal embodiment of the present invention.
  • The corresponding imaging points of a target point M in the scene are found in the two images by binocular stereo matching; the coordinate difference between the two imaging points is the binocular parallax.
  • From the geometric relationship between the depth of the target point and the cameras, the depth $Z$ of the target point can be obtained according to Formula 2, where $f$ is the focal length, $Z$ is the distance from the scene point to the plane of the optical centers, $d$ is the distance between the image points in the image pair (the parallax), and $B$ is the distance between the optical centers of the two camera positions. The image/video generation module 12 thereby generates stereoscopic image/video data.
  • $Z = \dfrac{f\,B}{d}$ (Formula 2)
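  • As a worked example of the depth relation, the sketch below applies the standard pinhole stereo triangulation formula (depth equals focal length times baseline divided by parallax). The function name, the units, and the sample numbers are assumptions chosen for illustration only.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth (metres) of a scene point from its parallax in a rectified stereo pair.

    disparity_px: horizontal distance between the two image points, in pixels
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two optical centres, in metres
    """
    if disparity_px <= 0:
        raise ValueError("parallax must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical handset: 800 px focal length, 5 cm between the two camera positions.
near_point = depth_from_disparity(20.0, focal_px=800.0, baseline_m=0.05)
far_point = depth_from_disparity(10.0, focal_px=800.0, baseline_m=0.05)
```

  • Halving the parallax doubles the computed depth, which is why distant targets need either a longer baseline or finer parallax resolution.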
  • The image/video collection module 11 can also be a stereoscopic image/video camera. Stereoscopic image/video cameras come in two structures: a stereo camera composed of two lenses separated by a certain distance, and a stereo camera composed of a planar camera and a depth sensor. The processing by the image/video generation module 12 differs depending on which kind of stereoscopic image/video camera is used.
  • When the image/video collection module 11 is a binocular stereo camera composed of two lenses separated by a certain distance, the two lenses can simultaneously capture two different images of one scene, equivalent to the left-eye and right-eye images of a person. This is similar to an image/video collection module 11 built from two planar cameras, and the subsequent processing by the image/video generation module 12 can likewise determine the depth information of the targets in the scene through stereo matching. During video collection, the image pairs in the two video sequences can be matched to obtain depth information for each frame, thereby obtaining stereoscopic image/video data.
  • When the image/video collection module 11 is a stereo camera composed of a planar camera and a depth sensor, the planar camera collects the planar image/video and the depth sensor obtains the depth information of the targets in the two-dimensional image; the image/video generation module 12 generates stereoscopic image/video data from the collected planar image/video and the depth information.
  • The communication terminal of the embodiment of the present invention can use the image/video collection generation module 1 to collect images/videos and generate stereoscopic image/video data, and then either transmit the generated stereoscopic image/video data through the communication module 2, or receive other stereoscopic image/video data and display it on the image/video display module 3. The structure is therefore simple and the result is good: both parties can perceive the depth information of the other party's scene while hearing the other party's voice, which gives an immersive call effect, makes the call feel more intimate and more realistic, and provides a better user experience.
  • FIG. 9 is a schematic structural diagram of another embodiment of a communication terminal according to the present invention.
  • In this embodiment, a storage module 4 is added, which can be used to store generated and received stereoscopic image/video data and/or call information.
  • The communication module 2 may include: an encoding sub-module 21 for encoding the collected or stored image/video data; a communication sub-module 20 for receiving and transmitting encoded or unencoded image/video data; and a decoding sub-module 22 for decoding the received encoded image/video data.
  • The communication module 2 transmits and receives three-dimensional stereoscopic image/video data. It is not limited to transmitting three-dimensional images and can also transmit other data such as sound. The connection or transmission can be wired or wireless. Three-dimensional image/video data can be transmitted in various ways, such as transmitting two separate image or video streams, extracting the depth and then encoding and transmitting a reference image together with the depth information, or transmitting a reference image together with various prediction estimates.
  • The communication module 2 can transmit the image collected by the image/video collection module 11 or the stereoscopic image/video data read from the storage module 4.
  • The transmitted content can be encoded or unencoded stereoscopic image/video data.
  • When the image/video collection module 11 is composed of two planar image/video cameras, the encoding sub-module 21 of the communication module 2 can encode and transmit the two image streams separately, or can encode the original image for transmission based on the acquired depth information.
  • As shown in FIG. 10, the communication module of another embodiment of the communication terminal of the present invention encodes by means of layered coding. The base layer encodes and transmits a reference view selected from the original image pair (the left view is selected in the figure), and the depth information is encoded into an enhancement layer. The base layer is coded by a standard hybrid coding method, and the enhancement layer is predictively coded based on the corresponding base layer, using inter-layer prediction coding.
  • At the receiving end, the base layer content alone can be decoded for compatibility with a conventional two-dimensional display, or the base layer and enhancement layer content can be decoded together for stereoscopic display.
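  • The backward-compatible decoding choice can be sketched structurally as below. This is not a video codec: `zlib` merely stands in for the hybrid coder, inter-layer prediction is not modelled, and `LayeredFrame`, `encode_layered`, and `decode_layered` are hypothetical names.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LayeredFrame:
    base: bytes          # coded reference view (e.g. the left view)
    enhancement: bytes   # coded depth information


def encode_layered(reference_view: bytes, depth_map: bytes) -> LayeredFrame:
    """Code the reference view as the base layer and the depth as the enhancement layer."""
    return LayeredFrame(zlib.compress(reference_view), zlib.compress(depth_map))


def decode_layered(frame: LayeredFrame, stereo_capable: bool) -> Tuple[bytes, Optional[bytes]]:
    """A 2D receiver decodes only the base layer; a 3D receiver decodes both layers."""
    view = zlib.decompress(frame.base)
    if not stereo_capable:
        return view, None
    return view, zlib.decompress(frame.enhancement)


frame = encode_layered(b"left-view-pixels", b"depth-values")
planar_result = decode_layered(frame, stereo_capable=False)
stereo_result = decode_layered(frame, stereo_capable=True)
```

  • Keeping the two layers in separate fields is what lets a conventional two-dimensional terminal ignore the enhancement layer without parsing it.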
  • In addition to encoding and transmitting images, the communication module 2 can receive stereoscopic image/video data and other information transmitted over the network. The received stereoscopic image/video information is decoded by the decoding sub-module 22 and then passed to the image/video display module 3 for stereoscopic display.
  • The received data can also be stored in the storage module 4.
  • The image/video display module 3 may be any device capable of three-dimensional stereoscopic display, such as an autostereoscopic display device, stereo glasses, or a holographic display device.
  • The image/video display module 3 displays the stereoscopic image/video.
  • The displayed image/video content may be collected by the image/video collection module 11, received through the network, or read from the storage module 4.
  • If the image/video display module 3 is also able to display two-dimensional image information, compatible communication between the three-dimensional image communication terminal and traditional two-dimensional image communication devices can be realized, which facilitates a smooth transition of communication terminals from two-dimensional planar communication to three-dimensional communication.
  • The image/video collected by the image/video collection module 11 can be transmitted through the communication module 2; the stereoscopic image/video data generated by the image/video generation module 12 can be transmitted; and the image/video and stereoscopic image/video data stored in the storage module 4 can also be transmitted. At the same time, the image/video or stereoscopic image/video data sent by the opposite end and received by the communication module 2 is stereoscopically displayed by the image/video display module 3.
  • The communication terminal of this other embodiment of the present invention can use the storage module 4 to store the image/video collected by the image/video collection generation module 1 and the generated stereoscopic image/video data, which makes transmission more convenient and gives a better result.
  • The stereoscopic information generating module 5 is configured to generate stereoscopic image/video information according to stereoscopic image/video data and store it in the storage module 4, and the response module 6 is configured to automatically send the stereoscopic image/video information to the opposite end through the communication module 2.
  • The stereoscopic information generating module 5 may generate stereoscopic image/video information according to the stereoscopic image/video data generated by the image/video generation module 12 or the stereoscopic image/video data stored in the storage module 4. The stereoscopic image/video information may be sent directly by the communication module 2, or it may be stored in the storage module 4 and retrieved later for transmission.
  • When the communication terminal communicates using stereoscopic image/video information, the stereoscopic image/video information can be transmitted by the communication module 2, or it can be received and displayed by the image/video display module 3.
  • The response module 6 implements an automatic response mode as follows. First, the image/video collection module 11 is used to collect images/videos, or the image/video generation module 12 is used to generate stereoscopic image/video data, which is stored in the storage module 4 as response content (the same or different response content can be set for different calling users).
  • When a call arrives, the response module 6 reads the automatic-answer setting. If it is set to not process, no processing is performed; otherwise, the call is answered automatically, the calling user is identified, the corresponding preset response content is read from the storage module 4 according to the user and played to the calling user, and the message left by the calling user is recorded and stored in the storage module 4.
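  • The automatic response flow can be summarized in a short sketch. The class name `ResponseModule`, its fields, and the use of strings in place of stored stereoscopic content are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ResponseModule:
    auto_answer: bool                                            # the automatic-answer setting
    default_response: str                                        # response content for unknown callers
    responses: Dict[str, str] = field(default_factory=dict)     # per-caller response content
    recorded: List[Tuple[str, str]] = field(default_factory=list)  # stored caller messages

    def handle_call(self, caller: str, caller_message: str) -> Optional[str]:
        """Return the response content played to the caller, or None if not processed."""
        if not self.auto_answer:
            return None                                          # setting says: do not process
        response = self.responses.get(caller, self.default_response)
        self.recorded.append((caller, caller_message))           # keep the caller's message
        return response


module = ResponseModule(auto_answer=True, default_response="generic-3d-greeting",
                        responses={"alice": "beach-3d-clip"})
played_to_alice = module.handle_call("alice", "call me back")
played_to_bob = module.handle_call("bob", "hello")
```

  • The dictionary lookup with a default mirrors the option of setting the same or different response content for different calling users.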
  • The communication terminal of this embodiment can generate stereoscopic image/video information with the stereoscopic information generating module 5 and store it in the storage module 4, and can use the response module 6 to automatically send the stereoscopic image/video information to the opposite end through the communication module 2. Stereoscopic image/video information can thus be both generated and sent, so the terminal offers richer functions and a better result.
  • As shown in FIG. 12, which is a schematic structural diagram of an embodiment of an information system according to the present invention, the information system includes a stereoscopic image/video information center 91. The stereoscopic image/video information center 91 includes a stereoscopic image/video information relay 911 for interacting with the communication terminal 90, and a stereoscopic image/video information server 912 for storing stereoscopic image/video information and converting between stereoscopic image/video information and short messages.
  • The information system further includes an email server 92 for providing an email service.
  • The communication terminal 90 is the communication terminal described above, which supports stereoscopic image/video information and can generate, manage, transmit, and receive it. It is connected to the stereoscopic image/video information center 91 through a communication network, over either a wired or a wireless connection.
  • The stereoscopic image/video information relay 911 is a core component of the information system and mainly handles interaction with the communication terminal 90.
  • The stereoscopic image/video information server 912 is a core component of the information system. It stores stereoscopic image/video information and can also perform the corresponding format conversions, such as converting stereoscopic image/video information into planar image/video information or reducing the resolution of stereoscopic image/video information, thereby improving transmission efficiency and providing compatibility with the display capabilities of various terminals or with the short message system.
  • The email server 92 can provide a standard Internet mail service; it can receive stereoscopic image/video information from the communication terminal 90 and can also transmit stereoscopic image/video information to the communication terminal 90.
  • This information system can also interact with other short message centers.
  • For communication between a communication terminal 90 that supports stereoscopic image/video information and the various existing communication terminals 90, the following approaches can be used:
  • the sender, knowing the processing capability of the peer communication terminal 90, sends a short message in a format compatible with the existing short message system, and it is transmitted as an existing short message;
  • the sender communication terminal 90 directly sends the stereoscopic image/video information, and the stereoscopic image/video information center 91 converts the stereoscopic image/video information, according to the display capability of the receiver communication terminal 90, into a format compatible with the existing system, achieving compatible communication with existing short-message communication terminals 90;
  • the sender communication terminal 90 directly sends the stereoscopic image/video information, and the stereoscopic image/video information center 91 sends a simple notification to the existing short-message communication terminal 90 while sending the stereoscopic image/video information to an email mailbox; the user then views it over the Internet on a computer or another device that supports stereoscopic display.
  • The system can convert the stereoscopic image/video information into the corresponding format according to the display capability of the receiving communication terminal 90 while making full use of the network bandwidth.
  • When a user's on-demand request is received, the corresponding on-demand content can likewise be transmitted according to the display capability of the receiving communication terminal 90.
  • the information system of the embodiment of the present invention can perform stereoscopic image/video information communication, so that both parties can sense the depth information of the other party's scene while hearing the other party's voice, and obtain a body.
  • the effect of the call on the ground makes the callers feel more intimate and more realistic, and can get a better user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Description

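Communication terminal and information system
This application claims priority to Chinese Patent Application No. 200710179059.1, filed with the Chinese Patent Office on December 10, 2007 and entitled "Communication terminal and information system", which is incorporated herein by reference in its entirety.
Technical Field
Embodiments of the present invention relate to the field of mobile communications, and in particular to a communication terminal and an information system.
Background
A two-dimensional image or video is a two-dimensional information carrier. It can only show the content of a scene and cannot convey depth information such as the distance and position of objects, so it is incomplete.
Stereoscopic (three-dimensional, 3D) video is based on the principle of human binocular parallax: cameras capture two slightly different images of the same scene, which are shown to the left eye and the right eye respectively to produce binocular parallax, so that the viewer obtains the depth of the scene and experiences it in 3D. Stereoscopic images generally take one of two common forms: a pair of left and right images intended for the left and right eyes, or one two-dimensional image together with its corresponding depth map. Stereoscopic video technology can therefore provide depth information consistent with the principles of stereoscopic vision, reproducing real-world scenes more faithfully and conveying the depth, layering, and realism of a scene. Stereoscopic image and video technology is accordingly becoming more and more important.
Stereoscopic image/video technology is widely used in stereoscopic film and television, stereoscopic video conferencing, virtual reality systems, remote industrial control, robot navigation, telemedicine, and many other settings.
Stereoscopic image/video technology is equally needed in everyday life. For example, a traveler who comes across beautiful scenery may want to send a stereoscopic short message so that other users can also view the scene in 3D. Likewise, in a stereoscopic video call, or when stereoscopic image/video information is sent, both parties can perceive each other's scene in 3D while hearing each other's voice.
An existing three-dimensional image communication terminal includes: a three-dimensional image input module with multiple cameras for capturing images of a three-dimensional object; a three-dimensional image display module for displaying three-dimensional image information; and a communication module for transmitting at least the three-dimensional image information obtained by the three-dimensional image input module. The three-dimensional image display module consists of an integral-imaging type horizontal/vertical parallax display device, and the cameras are arranged at least above/below and to the left/right, in the vicinity of the three-dimensional image display device.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problems: the existing three-dimensional communication terminal must use multiple cameras to input three-dimensional image information, so its structure is complex, and it cannot store three-dimensional image/video information.
Summary of the Invention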
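Embodiments of the present invention provide a communication terminal and an information system that use a simple structure to capture, produce, and transmit stereoscopic images/video.
Embodiments of the present invention provide an information system that uses a simple structure to transmit image/video information.
An embodiment of the present invention provides a communication terminal, including:
an image/video capture and generation module, configured to capture and generate stereoscopic image/video data;
a communication module, configured to send images/video and the generated stereoscopic image/video data, or to receive images/video and stereoscopic image/video data;
an image/video display module, configured to display stereoscopic images/video according to the captured images/video, or according to the received and generated stereoscopic image/video data.
An embodiment of the present invention provides an information system including a stereoscopic image/video information center, the stereoscopic image/video information center including:
a stereoscopic image/video information relay, configured to interact with communication terminals;
a stereoscopic image/video information server, configured to store stereoscopic image/video information and to convert between stereoscopic image/video information and short messages.
The communication terminal of the embodiments can use the image/video capture and generation module to capture and generate stereoscopic image/video data. The communication module then sends images/video and the generated stereoscopic image/video data, or receives images/video and stereoscopic image/video data, which are displayed on the image/video display module. The structure is therefore simple and effective.
The information system of the embodiments supports stereoscopic image/video communication, so that both parties can perceive the depth of each other's scene while hearing each other's voice, obtaining an immersive call that feels more intimate and more realistic.
Brief Description of the Drawings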
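Figure 1 is a schematic structural diagram of an embodiment of the communication terminal of the present invention;
Figure 2 is a schematic diagram of an embodiment of the communication terminal using one planar image/video camera to photograph the same scene twice;
Figure 3 shows two images with parallel-aligned scan lines in an embodiment of the communication terminal;
Figure 4 is a schematic diagram of an embodiment of the communication terminal using a 2D-to-3D image sequence processing method;
Figure 5 is a schematic structural diagram of the image/video generation module in an embodiment of the communication terminal;
Figure 6 is a schematic diagram of the image/video capture module capturing images in an embodiment of the communication terminal;
Figure 7 is a schematic diagram of the processing performed by the image/video generation module in an embodiment of the communication terminal;
Figure 8 is a schematic diagram of the geometric relationship between the depth of a target point and the cameras in an embodiment of the communication terminal;
Figure 9 is a schematic structural diagram of another embodiment of the communication terminal;
Figure 10 is a schematic diagram of the communication module of another embodiment of the communication terminal encoding with layered coding;
Figure 11 is a schematic structural diagram of yet another embodiment of the communication terminal;
Figure 12 is a schematic structural diagram of an embodiment of the information system of the present invention.
Detailed Description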
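The technical solutions of the embodiments of the present invention are described in further detail below with reference to the drawings and embodiments.
Figure 1 is a schematic structural diagram of an embodiment of the communication terminal of the present invention. The communication terminal includes an image/video capture and generation module 1, configured to capture and generate stereoscopic image/video data; a communication module 2, configured to send images/video and the generated stereoscopic image/video data, or to receive images/video and stereoscopic image/video data; and an image/video display module 3, configured to display stereoscopic images/video according to the captured images/video, or according to the received and generated stereoscopic image/video data.
Further, as also shown in Figure 1, the image/video capture and generation module 1 includes an image/video capture module 11, configured to capture images/video, and an image/video generation module 12, configured to generate stereoscopic image/video data from the captured images/video.
The image/video capture module 11 captures images/video. It may be a single stereoscopic image/video camera, one planar image/video camera, or several (for example two) planar image/video cameras.
A planar image/video camera can only capture planar images/video, so the image/video generation module 12 must process the captured planar images/video into stereoscopic image/video data. This can be done in many ways.
First, the image/video capture module 11 is a single planar image/video camera. It captures two images of the same scene from different angles, and the image/video generation module 12 then performs stereo matching to obtain the three-dimensional image/video.
The image/video generation module 12 may perform alignment preprocessing on the captured images from different angles to generate stereoscopic image/video data. Alternatively, it may perform alignment preprocessing on the captured images from different angles, perform stereo matching on the two aligned images to obtain depth information, and reconstruct stereoscopic image/video data from one of the images and the depth information. Figure 2 is a schematic diagram of an embodiment of the communication terminal using one planar image/video camera to photograph the same scene twice. The position drawn with solid lines is where the camera captures the first image/video. The camera is then translated by a distance B, and the position drawn with dashed lines is where the translated camera captures the second image/video. When the scene content in the two captured images does not change, or changes only slowly, the result approximates a single shot by a binocular or stereoscopic camera. In the figure, f is the focal length, Z is the distance from the scene point to the optical-center plane, d is the distance between the corresponding image points of the image pair after transformation to the same plane, and B is the distance between the optical centers at the two positions where the camera captures the images. For the two images obtained by the camera from different positions, the image/video generation module 12 may apply various preprocessing methods, for example fast epipolar rectification or other methods of preprocessing and scan-line alignment, to eliminate the effect that changes in the orientation of the camera on the communication terminal would otherwise have on the subsequent stereo matching.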
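After the image/video generation module 12 has applied the above preprocessing to the images of the same scene taken at two different positions, two images with parallel-aligned scan lines are obtained, as shown in Figure 3. The image points of point A in the left and right images are $m_l$ and $m_r$ respectively, and using the epipolar constraint the depth can be solved as follows:
$$
\begin{cases}
\dfrac{x_l}{X_l} = \dfrac{f}{Z} \\[2ex]
\dfrac{x_r}{X_r} = \dfrac{f}{Z}
\end{cases}
\;\Longrightarrow\;
d_x(m_l, m_r) = x_l - x_r = \frac{f}{Z}\,(X_l - X_r) = \frac{fB}{Z}
\qquad \text{(Equation 1)}
$$
Here $d_x(m_l, m_r)$ denotes the disparity of the image points in a common coordinate system; the last step uses $X_l - X_r = B$, the distance between the two optical centers. Stereo matching is performed on the two images taken by the camera. Once the disparity of the same target in the two images has been obtained, the depth of the target in the scene can be derived from the relationship between disparity and depth in Equation 1, yielding stereoscopic image/video data. This method captures two images of the same scene from different angles with a single planar image/video camera, completing the input of three-dimensional image information with a structure that is simple and easy to implement.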
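The disparity-to-depth relation of Equation 1 can be inverted to give Z = fB / d_x. The following is a minimal sketch of that arithmetic only; the function name, parameter names, units, and numeric values are illustrative assumptions and do not come from the application.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Invert Equation 1 (d = f * B / Z) to obtain the depth Z = f * B / d.

    Units are an assumption made for this sketch: focal length and disparity
    in pixels, baseline in meters, so the result is in meters.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


# Illustrative values only: f = 800 px, B = 0.0625 m, d = 20 px.
example_depth = depth_from_disparity(800.0, 0.0625, 20.0)
print(example_depth)  # prints 2.5
```

Because depth is inversely proportional to disparity, nearby points produce large disparities and distant points produce small ones, which is why a small baseline B limits depth resolution for far objects.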
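Second, the image/video capture module 11 is a single planar image/video camera, and only the captured planar image is used. To obtain a three-dimensional image from a planar two-dimensional image, the image/video generation module 12 must determine the depth of the targets in the scene, that is, create a depth map. It identifies and interprets the targets in the image to obtain their relative depth relationships, creates the depth map from them, and thereby obtains stereoscopic image/video data. This method is very convenient for stereoscopic images intended for small screens, which need only sparse depth information.
Third, the image/video capture module 11 is a single planar image/video camera. When consecutive image frames need to be processed, the image/video generation module 12 may use a 2D-to-3D image sequence processing method: the targets in the scene are divided into foreground and background and assigned different depth information, and stereoscopic image/video data is generated from the assigned depth information and the images/video. Figure 4 is a schematic diagram of an embodiment of the communication terminal using the 2D-to-3D image sequence processing method. The image/video generation module 12 obtains depth data for points in the captured image sequence; uses the depth data and a classifier to determine depth characteristics as a function of image features and relative position; and uses the image features to create a depth map for at least one frame of the image sequence, thereby obtaining stereoscopic image/video data. This method acquires consecutive planar image frames with a single planar image/video camera and creates the depth map of the image sequence from the image features and the relative-position function, achieving the input of three-dimensional image sequence information.
Fourth, the image/video capture module 11 is a single planar image/video camera. Figure 5 is a schematic structural diagram of the image/video generation module in an embodiment of the communication terminal. The image/video generation module 12 may include: a sampled-image acquisition submodule 121, configured to capture the current image and the previous frame; a motion detection submodule 122, configured to detect moving pixels and still pixels by comparing corresponding pixels of the current image and the previous frame; a region segmentation submodule 123, configured to divide the current image into small search regions and to generate a representative value for the moving pixels of each region according to the motion detection result; a depth map generation submodule 124, configured to detect the groups of moving pixels that make up moving-object regions, assign a depth value to each group of moving pixels, and generate a depth map of the same size as the original image; and a parallax processing submodule 125, configured to generate the stereoscopic image/video data. By comparing the current image with the previous frame, the image/video generation module 12 finds the moving pixels in the current image, assigns depth values to the groups of moving pixels that make up moving-object regions, and generates a depth map of the same size as the original image, achieving the input of three-dimensional image information.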
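The motion-based pipeline of the fourth approach (frame comparison, region segmentation, per-region depth assignment) can be sketched as follows. This is a simplified illustration under stated assumptions, not the application's implementation: the threshold, the block size, the use of the moving-pixel ratio as the "representative value", and the choice of treating moving regions as near (255) and still regions as far (0) are all hypothetical. The parallax processing step is not covered.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of pixel intensities


def motion_mask(current: Frame, previous: Frame, threshold: int = 10) -> List[List[bool]]:
    """Mark a pixel as moving when it differs from the previous frame by more than threshold."""
    return [
        [abs(c - p) > threshold for c, p in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


def block_depth_map(mask: List[List[bool]], block: int = 2, near: int = 255,
                    far: int = 0, min_ratio: float = 0.5) -> Frame:
    """Split the image into block x block regions and give mostly-moving regions the near depth."""
    height, width = len(mask), len(mask[0])
    depth = [[far] * width for _ in range(height)]  # same size as the original image
    for top in range(0, height, block):
        for left in range(0, width, block):
            cells = [(y, x)
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            # Representative value of the region: fraction of its pixels that moved.
            moving_ratio = sum(mask[y][x] for y, x in cells) / len(cells)
            if moving_ratio >= min_ratio:
                for y, x in cells:
                    depth[y][x] = near
    return depth


previous_frame = [[0] * 4 for _ in range(4)]
current_frame = [[0] * 4 for _ in range(4)]
for y in range(2):
    for x in range(2):
        current_frame[y][x] = 100  # an object appears in the top-left corner

depth_map = block_depth_map(motion_mask(current_frame, previous_frame))
```

In this toy input only the top-left 2 x 2 region changes between frames, so only that region receives the near depth value.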
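Fifth, the image/video capture module 11 consists of two planar image/video cameras. Figure 6 is a schematic diagram of the image/video capture module capturing images in an embodiment of the communication terminal; the two cameras can simultaneously capture two images of the same scene content from different positions. Figure 7 is a schematic diagram of the processing performed by the image/video generation module. The image points of target point M in the two images are $m_l$ and $m_r$ respectively, and binocular stereo matching yields the coordinate difference between the corresponding image points $m_l$ and $m_r$, that is, the binocular disparity $d_x(m_l, m_r)$. Figure 8 shows the geometric relationship between the depth of the target point and the cameras, from which the depth Z of the target point can be obtained using Equation 2:
$$
Z = \frac{fB}{d_x(m_l, m_r)} \qquad \text{(Equation 2)}
$$
where f is the focal length, Z is the distance from the scene point to the optical-center plane, $d_x(m_l, m_r)$ is the distance between the image points in the two images, and B is the distance between the optical centers of the two cameras. The image/video generation module 12 generates the stereoscopic image/video data from this.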
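The application does not say which stereo matching algorithm is used to find corresponding image points. As one possible illustration, the sketch below uses a sum-of-absolute-differences (SAD) window search along a single rectified scan line and then applies Equation 2. The function names, the window size, the search range, and the sample data are assumptions made for this example.

```python
from typing import List


def match_disparity(left_row: List[int], right_row: List[int], x_left: int,
                    half_window: int = 1, max_disparity: int = 8) -> int:
    """Return the disparity d = x_l - x_r >= 0 whose window in the right row best matches."""
    if x_left - half_window < 0 or x_left + half_window >= len(left_row):
        raise ValueError("matching window falls outside the left row")
    best_disparity, best_cost = 0, float("inf")
    for disparity in range(max_disparity + 1):
        x_right = x_left - disparity
        if x_right - half_window < 0:
            break  # the window would leave the right row
        cost = sum(
            abs(left_row[x_left + k] - right_row[x_right + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_disparity, best_cost = disparity, cost
    return best_disparity


def depth_from_match(focal_length_px: float, baseline_m: float, disparity_px: int) -> float:
    """Equation 2: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


# The pattern 9, 5, 7 is centered at x = 6 in the left row and x = 3 in the right row.
left_scanline = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
right_scanline = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
found_disparity = match_disparity(left_scanline, right_scanline, x_left=6)
target_depth = depth_from_match(800.0, 0.06, found_disparity)
```

The search is restricted to one row because, after scan-line alignment, corresponding points lie on the same row of both images.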
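The image/video capture module 11 may also be a stereoscopic image/video camera. Stereoscopic image/video cameras come in two structures: a stereoscopic camera made up of two lenses separated by a certain distance, and a stereoscopic camera made up of one planar camera plus a depth sensor. The processing performed by the image/video generation module 12 differs depending on which kind of stereoscopic image/video camera is used.
In the first case, the image/video capture module 11 is a binocular stereoscopic camera made up of two lenses separated by a certain distance. The two lenses can simultaneously capture two different images of a scene, corresponding to the images seen by a person's left and right eyes. As with an image/video capture module 11 that uses two planar cameras, the subsequent processing by the image/video generation module 12 can determine the depth of the targets in the scene through stereo matching. During capture, the image pairs in the two image/video sequences can be matched to obtain the depth information for each frame, and thus the stereoscopic image/video data.
In the second case, the image/video capture module 11 is a stereoscopic camera made up of one planar camera plus a depth sensor. The planar camera captures planar images/video, the depth sensor obtains the depth of the targets in that two-dimensional image, and the image/video generation module 12 generates stereoscopic image/video data from the captured planar images/video and the depth information.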
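The application does not specify how stereoscopic data is produced from a planar image plus depth. One common possibility is depth-image-based rendering, in which each pixel is shifted horizontally by the disparity that its depth implies. The sketch below renders one row of a hypothetical right-eye view; the function name, the nearest-pixel-wins occlusion rule, and the use of `None` to mark holes are assumptions made for this example.

```python
from typing import List, Optional


def render_right_row(image_row: List[int], depth_row: List[float],
                     focal_length_px: float, baseline_m: float) -> List[Optional[int]]:
    """Shift each pixel left by its disparity d = f * B / Z to synthesize a right-eye row.

    Depth values are assumed to be strictly positive.
    """
    width = len(image_row)
    right: List[Optional[int]] = [None] * width  # None marks a hole (no source pixel)
    nearest = [float("inf")] * width             # depth of the pixel currently stored
    for x, (pixel, depth) in enumerate(zip(image_row, depth_row)):
        disparity = round(focal_length_px * baseline_m / depth)
        x_right = x - disparity
        # When two pixels land on the same target, keep the one closer to the camera.
        if 0 <= x_right < width and depth < nearest[x_right]:
            right[x_right] = pixel
            nearest[x_right] = depth
    return right


# Illustrative values: the pixel with value 30 is nearer (depth 2.0) than the rest (depth 4.0).
right_row = render_right_row([10, 20, 30, 40], [4.0, 4.0, 2.0, 4.0],
                             focal_length_px=8.0, baseline_m=0.5)
```

A practical renderer would also fill the holes, for example by interpolating from neighboring pixels; that step is omitted here.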
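The communication terminal of this embodiment can therefore use the image/video capture and generation module 1 to capture and generate stereoscopic image/video data, after which the communication module 2 sends the generated stereoscopic image/video data, or receives other stereoscopic image/video data for display on the image/video display module 3. The structure is simple and effective, and both parties can perceive the depth of each other's scene while hearing each other's voice, obtaining an immersive call that feels more intimate and more realistic and gives a better user experience.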
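Figure 9 is a schematic structural diagram of another embodiment of the communication terminal of the present invention. This embodiment differs from the previous one in that a storage module 4 is added, which can store the generated and received stereoscopic image/video data and/or call information.
Further, the communication module 2 may include an encoding submodule 21, configured to encode the captured or stored image/video data; a communication submodule 20, configured to receive and send encoded or unencoded image/video data; and a decoding submodule 22, configured to decode the received encoded image/video data. The communication module 2 sends and receives three-dimensional stereoscopic image/video data; this includes but is not limited to three-dimensional images, and other data such as sound may also be transmitted. Communication may use wired or wireless connection and transmission. Three-dimensional image/video data can be transmitted in several ways, for example by separately encoding and transmitting two independent image or video streams, by extracting depth and then encoding and transmitting a reference image and the depth information, or by transmitting a reference image and various prediction estimates.
The communication module 2 can transmit the images captured by the image/video capture module 11 or the stereoscopic image/video data read from the storage module 4. The transmitted content may be encoded or unencoded stereoscopic image/video data.
If the image/video capture module 11 consists of two planar image/video cameras, the encoding submodule 21 of the communication module 2 can encode and transmit the two captured image streams separately, or can encode and transmit them based on the obtained depth information and the original image pair. Figure 10 is a schematic diagram of the communication module of another embodiment of the communication terminal encoding with layered coding. The base layer encodes and transmits a reference view selected from the original image pair (the left view in the figure), and the depth information is encoded into the enhancement layer. The base layer is encoded with a standard hybrid coding method, while the enhancement layer is predictively coded from the corresponding base-layer frame and also uses inter-frame prediction within the layer. With this coding scheme, the receiving end can, as required, decode only the base layer for compatibility with conventional two-dimensional display, or decode both the base layer and the enhancement layer for stereoscopic display.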
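The layered scheme can be illustrated with a toy model in which each packet carries a base layer (the reference view) and an enhancement layer (depth). This sketch is a deliberate simplification: it performs no real compression, it models only inter-frame prediction within the enhancement layer (not prediction from the base layer), and the packet layout and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Frame = List[List[int]]


def encode_layers(views: List[Frame], depths: List[Frame]) -> List[Dict[str, Frame]]:
    """Put each reference view in the base layer and a depth residual in the enhancement layer."""
    packets = []
    previous_depth: Optional[Frame] = None
    for view, depth in zip(views, depths):
        if previous_depth is None:
            residual = [row[:] for row in depth]  # first frame: send depth as is
        else:
            residual = [[d - p for d, p in zip(d_row, p_row)]
                        for d_row, p_row in zip(depth, previous_depth)]
        packets.append({"base": view, "enhancement": residual})
        previous_depth = depth
    return packets


def decode_layers(packets: List[Dict[str, Frame]],
                  stereo_display: bool) -> List[Tuple[Frame, Optional[Frame]]]:
    """2D receivers read only the base layer; stereo receivers also rebuild depth."""
    frames = []
    previous_depth: Optional[Frame] = None
    for packet in packets:
        if not stereo_display:
            frames.append((packet["base"], None))
            continue
        residual = packet["enhancement"]
        if previous_depth is None:
            depth = [row[:] for row in residual]
        else:
            depth = [[r + p for r, p in zip(r_row, p_row)]
                     for r_row, p_row in zip(residual, previous_depth)]
        frames.append((packet["base"], depth))
        previous_depth = depth
    return frames


sample_views = [[[1, 2]], [[3, 4]]]
sample_depths = [[[10, 20]], [[12, 18]]]
sample_packets = encode_layers(sample_views, sample_depths)
```

The point of the design is backward compatibility: a conventional two-dimensional terminal can ignore the enhancement layer entirely and still decode a valid picture.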
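Besides encoding and transmitting images, the communication module 2 can also receive stereoscopic image/video data and other information sent over the network. Received stereoscopic image/video information is decoded by the decoding submodule 22 and passed to the image/video display module 3 for stereoscopic display; the received data can also be placed in the storage module 4 for storage.
The image/video display module 3 may be any device capable of three-dimensional stereoscopic display, such as an autostereoscopic display device, stereoscopic glasses, or a holographic display device. The image/video display module 3 displays the stereoscopic images/video. The displayed content may be images/video captured by the image/video capture module 11, content received over the network, or content read from the storage module 4.
In addition, if the image/video display module 3 is also able to display two-dimensional image information, the three-dimensional image communication terminal can communicate compatibly with conventional two-dimensional image communication devices, which eases the transition of communication terminals from two-dimensional planar communication to three-dimensional stereoscopic communication.
During stereoscopic video communication, the images/video captured by the image/video capture module 11 can be sent through the communication module 2; the stereoscopic image/video data generated by the image/video generation module 12 can be sent; or the images/video and stereoscopic image/video data stored in the storage module 4 can be sent. At the same time, the communication module 2 receives the images/video or stereoscopic image/video data sent by the peer, and the image/video display module 3 displays them in 3D.
The communication terminal of this other embodiment can therefore use the storage module 4 to store the images/video captured by the image/video capture and generation module 1, together with the generated stereoscopic image/video data, which makes sending more convenient and improves the result.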
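Figure 11 is a schematic structural diagram of yet another embodiment of the communication terminal of the present invention. Compared with the previous embodiment, this embodiment adds a stereoscopic information generation module 5, configured to generate stereoscopic image/video information from stereoscopic image/video data and store it in the storage module 4, and an answering module 6, configured to automatically send stereoscopic image/video information to the peer through the communication module 2.
The stereoscopic information generation module 5 can generate stereoscopic image/video information from the stereoscopic image/video data generated by the image/video generation module 12 or from the stereoscopic image/video data stored in the storage module 4. This stereoscopic image/video information can be sent directly by the communication module 2, or it can be stored in the storage module 4 and retrieved later for sending. When the communication terminal communicates using stereoscopic image/video information, it can both send stereoscopic image/video information through the communication module 2 and receive stereoscopic image/video information for display on the image/video display module 3.
The answering module 6 implements automatic answering as follows. First, images/video captured by the image/video capture module 11, or stereoscopic image/video data generated by the image/video generation module 12, are stored in the storage module 4 as the answer content (the same or different answer content may be used for different callers). When another user calls and the local communication terminal does not answer, the answering module 6 reads the auto-answer setting. If it is set to no handling, nothing is done. Otherwise the call is answered automatically, the caller is identified, the preset answer content for that caller is read from the storage module 4 and played to the caller, and the caller's message is recorded and stored in the storage module 4.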
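The control flow of the answering module can be sketched as a single function. This is a hypothetical illustration of the decision logic only: the function name, the dictionary of per-caller greetings, the `record_message` callback, and the file names are invented for the example, and no real telephony API is involved.

```python
from typing import Callable, Dict, List, Optional, Tuple


def handle_unanswered_call(
    caller: str,
    auto_answer_enabled: bool,
    greetings: Dict[str, str],
    default_greeting: str,
    inbox: List[Tuple[str, str]],
    record_message: Callable[[str], str],
) -> Optional[str]:
    """Return the greeting played to the caller, or None when auto-answer is off."""
    if not auto_answer_enabled:
        return None  # setting is "no handling": do nothing
    # Pick the answer content preset for this caller, falling back to a shared one.
    greeting = greetings.get(caller, default_greeting)
    # Record the caller's message and keep it in storage (here: a plain list).
    inbox.append((caller, record_message(caller)))
    return greeting


stored_messages: List[Tuple[str, str]] = []
played = handle_unanswered_call(
    caller="alice",
    auto_answer_enabled=True,
    greetings={"alice": "greeting_for_alice.3d"},
    default_greeting="greeting_default.3d",
    inbox=stored_messages,
    record_message=lambda who: f"message_from_{who}",
)
```

Keeping the greetings keyed by caller mirrors the option of using different answer content for different calling users.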
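The communication terminal of this further embodiment can therefore use the stereoscopic information generation module 5 to generate stereoscopic image/video information and store it in the storage module 4, and use the answering module 6 to automatically send stereoscopic image/video information to the peer through the communication module 2. Because it can both generate and send stereoscopic image/video information, it offers richer functionality and a better result.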
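Figure 12 is a schematic structural diagram of an embodiment of the information system of the present invention. The information system includes a stereoscopic image/video information center 91, which in turn includes a stereoscopic image/video information relay 911, configured to interact with the communication terminal 90, and a stereoscopic image/video information server 912, configured to store stereoscopic image/video information and to convert between stereoscopic image/video information and short messages. Further, as also shown in Figure 12, the information system includes an e-mail server 92, configured to provide an e-mail service.
The communication terminal 90 is the communication terminal described above that supports stereoscopic image/video information. It can generate, manage, send, and receive stereoscopic image/video information, and it is connected to the stereoscopic image/video information center 91 through a communication network, over either a wired or a wireless connection. The stereoscopic image/video information relay 911 is a core component of the information system and mainly handles interaction with the communication terminal 90. The stereoscopic image/video information server 912 is also a core component of the information system. It stores stereoscopic image/video information and can perform format conversion as required. The format conversion may convert stereoscopic image/video information into planar image/video information, or it may reduce the resolution of the stereoscopic image/video information, improving transmission efficiency while remaining compatible with the display capabilities of different terminals or with short message systems. The e-mail server 92 can provide a standard Internet mail service; it can receive stereoscopic image/video information from the communication terminal 90 and can also send stereoscopic image/video information to the communication terminal 90.
The information system can also interact with other short message centers.
Communication between a communication terminal 90 that supports stereoscopic image/video information and the various existing communication terminals 90 can take the following forms:
1) The sender knows the processing capability of the peer communication terminal 90 and sends a short message in a format compatible with the existing short message system, that is, an existing short message;
2) The sender communication terminal 90 sends the stereoscopic image/video information directly, and the stereoscopic image/video information center 91 converts it, according to the display capability of the receiver communication terminal 90, into a format compatible with the existing system, achieving compatible communication with existing short message communication terminals 90;
3) The sender communication terminal 90 sends the stereoscopic image/video information directly, and the stereoscopic image/video information center 91 sends a simple notification to the existing short message communication terminal 90 while delivering the stereoscopic image/video information to an e-mail mailbox; the user then views it over the Internet on a computer or another device that supports display of stereoscopic messages.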
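Based on the format conversion function of the stereoscopic image/video information center 91, when stereoscopic image/video information is sent to a group, the system can convert it into the appropriate format according to the display capability of each receiver communication terminal 90 before sending, making full use of the network bandwidth. Likewise, when a user's on-demand request is received, the corresponding on-demand content can be sent according to the display capability of the receiver communication terminal 90.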
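Capability-based conversion for group sending can be sketched as follows. The message layout (a reference view plus a depth map), the capability fields `stereo` and `max_width`, and the subsampling strategy are all assumptions chosen for this illustration; the application only states that conversion depends on the receiver's display capability.

```python
from typing import Dict, List

Frame = List[List[int]]


def subsample(rows: Frame, step: int) -> Frame:
    """Reduce resolution by keeping every step-th row and column."""
    return [row[::step] for row in rows[::step]]


def convert_for_receiver(message: Dict[str, Frame], capability: dict) -> Dict[str, Frame]:
    """Drop depth for 2D-only receivers and shrink the picture to fit max_width."""
    width = len(message["view"][0])
    step = max(1, -(-width // capability["max_width"]))  # ceiling division
    converted = {"view": subsample(message["view"], step)}
    if capability["stereo"]:
        converted["depth"] = subsample(message["depth"], step)
    return converted


def group_send(message: Dict[str, Frame],
               receivers: Dict[str, dict]) -> Dict[str, Dict[str, Frame]]:
    """Produce one converted copy of the message per receiver."""
    return {name: convert_for_receiver(message, capability)
            for name, capability in receivers.items()}


stereo_message = {
    "view": [[1, 2, 3, 4], [5, 6, 7, 8]],
    "depth": [[9, 9, 1, 1], [9, 9, 1, 1]],
}
delivered = group_send(stereo_message, {
    "stereo_phone": {"stereo": True, "max_width": 4},
    "legacy_phone": {"stereo": False, "max_width": 2},
})
```

Converting once per capability class rather than once per receiver would save work in a real system; the per-receiver loop is kept here for clarity.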
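The information system of the embodiments therefore supports stereoscopic image/video communication, so that both parties can perceive the depth of each other's scene while hearing each other's voice, obtaining an immersive call that feels more intimate and more realistic and gives a better user experience. It also allows communication terminals to exchange messages between stereoscopic image/video information and information systems that handle existing message types.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
The above is not limiting. Although the embodiments of the present invention have been described in detail with reference to preferred embodiments, a person of ordinary skill in the art should understand that the technical solutions of the embodiments may be modified or replaced with equivalents without departing from their spirit and scope.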

Claims

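1. A communication terminal, characterized by comprising:
an image/video capture and generation module, configured to capture and generate stereoscopic image/video data;
a communication module, configured to send images/video and the generated stereoscopic image/video data, or to receive images/video and stereoscopic image/video data;
an image/video display module, configured to display stereoscopic images/video according to the captured images/video, or according to the received and generated stereoscopic image/video data.
2. The communication terminal according to claim 1, characterized in that the image/video capture and generation module comprises:
an image/video capture module, configured to capture images/video;
an image/video generation module, configured to generate stereoscopic image/video data from the captured images/video.
3. The communication terminal according to claim 2, characterized in that the image/video capture module is a planar image/video camera used to capture two images of the same scene from different angles, and the image/video generation module generates stereoscopic image/video data from the captured images from different angles.
4. The communication terminal according to claim 3, characterized in that the image/video generation module generating stereoscopic image/video data from the captured images from different angles is specifically: the image/video generation module performs alignment preprocessing on the captured images from different angles, thereby generating stereoscopic image/video data.
5. The communication terminal according to claim 3, characterized in that the image/video generation module generating stereoscopic image/video data from the captured images from different angles is specifically: the image/video generation module performs alignment preprocessing on the captured images from different angles, performs stereo matching on the two aligned images to obtain depth information, and reconstructs stereoscopic image/video data from one of the images and the depth information.
6. The communication terminal according to claim 5, characterized in that the image/video generation module performing alignment preprocessing on the captured images from different angles and performing stereo matching on the two aligned images to obtain depth information is specifically: obtaining the depth information according to the formula
$$
\begin{cases}
\dfrac{x_l}{X_l} = \dfrac{f}{Z} \\[2ex]
\dfrac{x_r}{X_r} = \dfrac{f}{Z}
\end{cases}
\;\Longrightarrow\;
d_x(m_l, m_r) = x_l - x_r = \frac{f}{Z}\,(X_l - X_r) = \frac{fB}{Z}
$$
thereby generating stereoscopic image/video data, where f is the focal length, Z is the distance from the scene point to the optical-center plane, that is, the depth information, d is the distance between the corresponding image points of the image pair after transformation to the same plane, B is the distance between the optical centers at the two positions where the camera captures the images, $m_l$ and $m_r$ are the image points of the same point in the two images, and $d_x(m_l, m_r)$ denotes the disparity of the image points in a common coordinate system.
7. The communication terminal according to claim 2, characterized in that the image/video capture module is a planar image/video camera used to capture images/video, and the image/video generation module, based on the captured images, divides the targets in the scene into foreground and background, assigns them different depth information, and generates stereoscopic image/video data from the assigned depth information and the images/video.
8. The communication terminal according to claim 2, characterized in that the image/video capture module is a planar image/video camera used to capture images/video, and the image/video generation module comprises:
a sampled-image acquisition submodule, configured to capture the current image and the previous frame;
a motion detection submodule, configured to detect moving pixels and still pixels by comparing corresponding pixels of the current image and the previous frame;
a region segmentation submodule, configured to divide the current image into small search regions and to generate a representative value for the moving pixels of each region according to the motion detection result;
a depth map generation submodule, configured to detect the groups of moving pixels that make up moving-object regions, assign a depth value to each group of moving pixels, and generate a depth map of the same size as the original image;
a parallax processing submodule, configured to generate stereoscopic image/video data.
9. The communication terminal according to claim 2, characterized in that the image/video capture module is a stereoscopic image/video camera.
10. The communication terminal according to claim 9, characterized in that the stereoscopic image/video camera is a binocular stereoscopic camera made up of two planar image/video cameras and is used to capture two different images/videos simultaneously, and the image/video generation module matches the images in the two captured planar image/video sequences, obtains the depth information corresponding to each image, and generates stereoscopic image/video data.
11. The communication terminal according to claim 9, characterized in that the stereoscopic image/video camera is made up of one planar image/video camera and one depth sensor and is used to capture planar images/video and depth information, and the image/video generation module generates stereoscopic image/video data from the captured planar images/video and the depth information of the planar images/video.
12. The communication terminal according to claim 1, characterized in that the communication module comprises:
an encoding submodule, configured to encode the captured or stored image/video data;
a communication submodule, configured to receive and send encoded or unencoded image/video data;
a decoding submodule, configured to decode the received encoded image/video data.
13. The communication terminal according to any one of claims 1 to 12, characterized by further comprising a storage module, configured to store the generated and received stereoscopic image/video data and/or call information.
14. The communication terminal according to claim 13, characterized by further comprising a stereoscopic information generation module, configured to generate stereoscopic image/video information from stereoscopic image/video data and to store it in the storage module.
15. The communication terminal according to claim 14, characterized by further comprising an answering module, configured to automatically send stereoscopic image/video information to the peer through the communication module.
16. An information system, characterized by comprising a stereoscopic image/video information center, the stereoscopic image/video information center comprising:
a stereoscopic image/video information relay, configured to interact with communication terminals;
a stereoscopic image/video information server, configured to store stereoscopic image/video information and to convert between stereoscopic image/video information and short messages.
17. The information system according to claim 16, characterized by further comprising an e-mail server, configured to provide an e-mail service.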
PCT/CN2008/073398 2007-12-10 2008-12-09 Communication terminal and information system WO2009074110A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP08860354A EP2136602A4 (en) 2007-12-10 2008-12-09 COMMUNICATION TERMINAL AND INFORMATION SYSTEM
BRPI0819332-0A BRPI0819332A2 (pt) 2007-12-10 2008-12-09 Communication terminal and information system
US12/617,914 US20100053307A1 (en) 2007-12-10 2009-11-13 Communication terminal and information system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200710179059.1 2007-12-10
CN200710179059A CN101459857B (zh) 2007-12-10 2007-12-10 Communication terminal

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/617,914 Continuation US20100053307A1 (en) 2007-12-10 2009-11-13 Communication terminal and information system

Publications (1)

Publication Number Publication Date
WO2009074110A1 true WO2009074110A1 (en) 2009-06-18

Family

ID=40755241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2008/073398 WO2009074110A1 (en) 2007-12-10 2008-12-09 Communication terminal and information system

Country Status (5)

Country Link
US (1) US20100053307A1 (zh)
EP (1) EP2136602A4 (zh)
CN (1) CN101459857B (zh)
BR (1) BRPI0819332A2 (zh)
WO (1) WO2009074110A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102904953A (zh) * 2012-10-12 2013-01-30 Tcl集团股份有限公司 Telemedicine service system and method

Families Citing this family (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7921150B1 (en) * 2009-10-23 2011-04-05 Eastman Kodak Company Method for viewing videos on distributed networks
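KR101541197B1 (ko) * 2009-12-21 2015-08-05 한국전자통신연구원 Method for updating information on content in service in a streaming server group
CN101742224A (zh) * 2010-02-03 2010-06-16 中兴通讯股份有限公司 Method and apparatus for presenting a short message during video playback
CN102195894B (zh) * 2010-03-12 2015-11-25 腾讯科技(深圳)有限公司 System and method for implementing stereoscopic video communication in instant messaging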
US20120113229A1 (en) * 2010-06-24 2012-05-10 University Of Kentucky Research Foundation (Ukrf) Rotate and Hold and Scan (RAHAS) Structured Light Illumination Pattern Encoding and Decoding
CN102377982A (zh) * 2010-08-25 2012-03-14 深圳市捷视飞通科技有限公司 Online video system and video image capture method thereof
US20120050480A1 (en) * 2010-08-27 2012-03-01 Nambi Seshadri Method and system for generating three-dimensional video utilizing a monoscopic camera
US20120050491A1 (en) * 2010-08-27 2012-03-01 Nambi Seshadri Method and system for adjusting audio based on captured depth information
US20120050495A1 (en) * 2010-08-27 2012-03-01 Xuemin Chen Method and system for multi-view 3d video rendering
US8994792B2 (en) 2010-08-27 2015-03-31 Broadcom Corporation Method and system for creating a 3D video from a monoscopic 2D video and corresponding depth information
US20120050478A1 (en) * 2010-08-27 2012-03-01 Jeyhan Karaoguz Method and System for Utilizing Multiple 3D Source Views for Generating 3D Image
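JP2012053741A (ja) * 2010-09-02 2012-03-15 Sony Corp Image processing apparatus, image processing method, and computer program
KR101640404B1 (ko) 2010-09-20 2016-07-18 엘지전자 주식회사 Mobile terminal and operation control method thereof
CN102480632B (zh) * 2010-11-24 2014-10-22 群光电子股份有限公司 Three-dimensional image processing system, photographing device, and image generating device thereof
KR20120078649A (ko) * 2010-12-31 2012-07-10 한국전자통신연구원 Portable video call apparatus having a camera and method thereof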
EP2485495A3 (en) * 2011-02-03 2013-08-28 Broadcom Corporation Method and system for creating a 3D video from a monoscopic 2D video and corresponding depth information
EP2485494A1 (en) * 2011-02-03 2012-08-08 Broadcom Corporation Method and system for utilizing depth information as an enhancement layer
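CN102164265B (zh) * 2011-05-23 2013-03-13 宇龙计算机通信科技(深圳)有限公司 Method and system for three-dimensional video calls
CN102347951A (zh) * 2011-09-29 2012-02-08 云南科软信息科技有限公司 System and method supporting online three-dimensional display
CN102655597A (zh) * 2011-11-23 2012-09-05 上海华博信息服务有限公司 Playback system capable of dynamically adjusting the stereoscopic video parallax curve in real time
CN103680291B (zh) * 2012-09-09 2016-12-21 复旦大学 Method for simultaneous localization and mapping based on ceiling vision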
US9654762B2 (en) * 2012-10-01 2017-05-16 Samsung Electronics Co., Ltd. Apparatus and method for stereoscopic video with motion sensors
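CN102868901A (zh) * 2012-10-12 2013-01-09 歌尔声学股份有限公司 3D video communication device
CN103795961A (zh) * 2012-10-30 2014-05-14 三亚中兴软件有限责任公司 Video conference telepresence system and image processing method thereof
CN102946545A (zh) * 2012-11-22 2013-02-27 上海文广互动电视有限公司 3D television integrated broadcast platform
KR102214934B1 (ko) * 2014-07-18 2021-02-10 삼성전자주식회사 Stereo matching apparatus and method through learning of unary confidence and pairwise confidence
CN104580986A (zh) * 2015-02-15 2015-04-29 王生安 Video communication system combined with virtual reality glasses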
EP3404927A4 (en) * 2016-01-13 2018-11-21 Sony Corporation Information processing device and information processing method
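KR101675567B1 (ko) * 2016-03-29 2016-11-22 주식회사 투아이즈테크 Panorama photographing apparatus, panorama photographing system, panorama image generation method using the same, computer-readable recording medium, and computer program stored in a computer-readable recording medium
JP7020411B2 (ja) 2016-07-28 2022-02-16 ソニーグループ株式会社 Information processing device, information processing method, and program
CN106251388A (zh) * 2016-08-01 2016-12-21 乐视控股(北京)有限公司 Photo processing method and apparatus
CN106540447A (zh) * 2016-11-04 2017-03-29 宇龙计算机通信科技(深圳)有限公司 VR scene construction method and system, VR game building method and system, and VR device
CN106791764A (zh) * 2016-11-30 2017-05-31 努比亚技术有限公司 Method and apparatus for implementing video encoding and decoding
CN106713890A (zh) * 2016-12-09 2017-05-24 宇龙计算机通信科技(深圳)有限公司 Image processing method and apparatus thereof
CN108769654A (zh) * 2018-06-26 2018-11-06 李晓勇 Three-dimensional image display method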
US10816994B2 (en) * 2018-10-10 2020-10-27 Midea Group Co., Ltd. Method and system for providing remote robotic control
US10803314B2 (en) 2018-10-10 2020-10-13 Midea Group Co., Ltd. Method and system for providing remote robotic control
US10678264B2 (en) 2018-10-10 2020-06-09 Midea Group Co., Ltd. Method and system for providing remote robotic control
CN110324648B (zh) * 2019-07-17 2021-08-06 咪咕文化科技有限公司 Live broadcast presentation method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1541485A (zh) * 2001-08-15 2004-10-27 皇家飞利浦电子股份有限公司 3D video conferencing system
DE102004032191A1 (de) * 2004-07-02 2006-01-19 Scanbull Software Gmbh Method and device for displaying three-dimensional objects
CN1728180A (zh) * 2004-07-29 2006-02-01 张文涛 Three-dimensional stereoscopic animation, far-viewable three-dimensional stereoscopic picture, and production method thereof

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6094215A (en) * 1998-01-06 2000-07-25 Intel Corporation Method of determining relative camera orientation position to create 3-D visual images
EP1185112B1 (en) * 2000-08-25 2005-12-14 Fuji Photo Film Co., Ltd. Apparatus for parallax image capturing and parallax image processing
JP2004040445A (ja) * 2002-07-03 2004-02-05 Sharp Corp Portable device with 3D display function, and 3D conversion program
JP3989348B2 (ja) * 2002-09-27 2007-10-10 三洋電機株式会社 Multiple-image transmission method and portable device with simultaneous multiple-image capture function
JP4090896B2 (ja) * 2003-01-16 2008-05-28 シャープ株式会社 Information terminal device
US20050259148A1 (en) * 2004-05-14 2005-11-24 Takashi Kubara Three-dimensional image communication terminal
KR20070084277A (ko) * 2004-10-22 2007-08-24 비디에이터 엔터프라이즈 인크 System and method for mobile 3D graphical messaging
EP1851727A4 (en) * 2005-02-23 2008-12-03 Craig Summers AUTOMATIC SCENES MODELING FOR 3D CAMERA AND 3D VIDEO
CN101223552A (zh) * 2005-08-17 2008-07-16 Nxp股份有限公司 Video processing method and apparatus for depth extraction
US20070201859A1 (en) * 2006-02-24 2007-08-30 Logitech Europe S.A. Method and system for use of 3D sensors in an image capture device
KR100866491B1 (ko) * 2007-01-30 2008-11-03 삼성전자주식회사 Image processing method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2136602A4 *

Also Published As

Publication number Publication date
CN101459857A (zh) 2009-06-17
BRPI0819332A2 (pt) 2015-05-12
CN101459857B (zh) 2012-09-05
EP2136602A4 (en) 2010-04-14
US20100053307A1 (en) 2010-03-04
EP2136602A1 (en) 2009-12-23

Similar Documents

Publication Publication Date Title
WO2009074110A1 (en) Communication terminal and information system
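CN101002471B (zh) Method and apparatus for encoding an image, and method and apparatus for decoding image data
CN101453662B (zh) Stereoscopic video communication terminal, system, and method
CN101651841B (zh) Method, system, and device for implementing stereoscopic video communication
CN101610421B (zh) Video communication method, apparatus, and system
KR100813961B1 (ko) Image receiving apparatus
CN101472190B (zh) Multi-view imaging and image processing apparatus and system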
US20150358539A1 (en) Mobile Virtual Reality Camera, Method, And System
EP1737248A2 (en) Improvements in and relating to conversion apparatus and methods
US9654762B2 (en) Apparatus and method for stereoscopic video with motion sensors
You et al. Internet of Things (IoT) for seamless virtual reality space: Challenges and perspectives
WO2009052730A1 (en) Video encoding decoding method and device and video codec
US20110157312A1 (en) Image processing apparatus and method
CN102195894B (zh) System and method for implementing stereoscopic video communication in instant messaging
US20230410443A1 (en) Method and device for rendering content in mobile communication system
KR101645465B1 (ko) Apparatus and method for generating stereoscopic image data in a portable terminal
CN101291441B (zh) Mobile phone and method for processing image information
Hu et al. Mobile edge assisted live streaming system for omnidirectional video
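JP2004200814A (ja) Stereoscopic image generation method and stereoscopic image generation apparatus
KR100940209B1 (ko) Method and apparatus for switching image display mode, and computer-readable recording medium storing a program for executing the method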
US20230115563A1 (en) Method for a telepresence system
US20230360678A1 (en) Data processing method and storage medium
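KR101044952B1 (ko) Method and apparatus for transmitting and receiving images, and transport stream structure thereof
KR20220135939A (ko) Transmitting apparatus, receiving apparatus, and method for providing three-dimensional images using point cloud data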
Reddy et al. A client-driven 3D content creation system using 2D capable devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08860354

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2008860354

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2293/DELNP/2010

Country of ref document: IN

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: PI0819332

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20100517