CN101171841B - Region-of-interest extraction for video telephony - Google Patents

Region-of-interest extraction for video telephony

Info

Publication number
CN101171841B
Authority
CN
China
Prior art keywords
roi
video
selected
information
Prior art date
Application number
CN 200680014872
Other languages
Chinese (zh)
Other versions
CN101171841A (en)
Inventor
哈立德·希勒米·厄勒-马列
李彦辑
蔡明章
Original Assignee
高通股份有限公司
Priority date
Filing date
Publication date
Priority to US60/660,200 (US66020005P)
Priority to US11/183,072 (US8019175B2)
Application filed by 高通股份有限公司
Priority to PCT/US2006/008457 (WO2006130198A1)
Publication of CN101171841A
Application granted
Publication of CN101171841B

Abstract

The disclosure is directed to techniques for region-of-interest (ROI) processing for video telephony (VT) applications. According to the disclosed techniques, a recipient device defines ROI information for video information transmitted by a sender device, i.e., far-end video information. The recipient device transmits the ROI information to the sender device. Using the ROI information transmitted by the recipient device, the sender device applies preferential encoding to an ROI within a video scene. ROI extraction may be applied to process a user description of an ROI to generate information specifying the ROI based on the description. The user description may be textual, graphical, or speech-based. An extraction module applies appropriate processing to generate the ROI information from the user description. The extraction module may reside locally within a video communication device, or in a distinct intermediate server configured for ROI extraction.

Description

Region-of-interest extraction for video telephony
[0001] This application claims the benefit of U.S. Provisional Application No. 60/660,200, filed March 9, 2005, and of pending U.S. Patent Application No. 11/183,072, filed July 15, 2005, entitled REGION-OF-INTEREST PROCESSING FOR VIDEO TELEPHONY.
Technical Field
[0002] The disclosure relates to digital video encoding and decoding and, more particularly, to techniques for processing region-of-interest (ROI) information for video telephony (VT) applications.
Background
[0003] A number of different video encoding standards have been established for encoding digital video sequences. The Moving Picture Experts Group (MPEG), for example, has developed a number of standards, including MPEG-1, MPEG-2 and MPEG-4. Other examples include the International Telecommunication Union (ITU) H.263 standard and the emerging ITU H.264 standard. These video encoding standards generally support improved transmission efficiency of video sequences by encoding data in a compressed manner.
[0004] Video telephony (VT) permits users to share video and audio information to support applications such as videoconferencing. Exemplary video telephony standards include those defined by the Session Initiation Protocol (SIP), the ITU H.323 standard, and the ITU H.324 standard. In a VT system, users may send and receive video information, only receive video information, or only send video information.
A recipient generally views received video information in the form in which it is transmitted from a sender.
[0005] Preferential encoding of a selected portion of the video information has been proposed. For example, a sender may specify a region of interest (ROI) to be encoded with higher quality for transmission to a recipient. The sender may wish to emphasize the ROI to a remote recipient. A typical example of an ROI is a human face, although a sender may wish to focus attention on other objects within a video scene. With preferential encoding of the ROI, a recipient is able to view the ROI more clearly than non-ROI regions.
Summary
[0006] The disclosure is directed to techniques for region-of-interest (ROI) processing for video telephony (VT). According to the disclosed techniques, a local recipient device defines ROI information for video encoded and transmitted by a remote sender device, i.e., far-end video. The local recipient device transmits the ROI information to the remote sender device. Using the ROI information transmitted by the recipient device, the sender device applies preferential encoding, such as higher-quality encoding or error protection, to an ROI within a video scene. In this manner, the recipient device is able to remotely control ROI encoding of far-end video encoded by the sender device.
[0007] In addition to receiving far-end video, a recipient may be equipped to send video, i.e., near-end video.
Hence, devices participating in VT communication may act symmetrically as both senders and recipients of video information. Acting as a recipient, each device may define far-end ROI information for video encoded by a remote device acting as a sender. Acting as a sender, each device may define near-end ROI information for video information transmitted to another device acting as a recipient. A sender or recipient device may be referred to as "ROI-aware," meaning that it can process ROI information provided by another device to support remote control of ROI video encoding.
[0008] Far-end ROI information permits a recipient to control remote ROI encoding by a sender device in order to view objects or regions within a received video scene more clearly. Near-end ROI information permits a sender to control local ROI encoding to emphasize objects or regions within a transmitted video scene. Hence, preferential encoding of an ROI by a sender may be based on ROI information generated by either the recipient or the sender. In addition, a recipient device may preferentially decode an ROI based on the ROI information, e.g., by applying higher-quality post-processing such as error concealment, deblocking or deringing techniques.
[0009] To facilitate ROI processing, the disclosure further contemplates techniques for ROI selection, ROI mapping, ROI extraction, ROI signaling, ROI tracking, and access authentication of recipient devices to permit remote control of ROI encoding by a sender device. ROI selection may rely on pre-defined ROI patterns, verbal or textual ROI descriptions, or ROI drawing by a user. ROI mapping involves translating a selected ROI pattern into an ROI map, which may take the form of a macroblock (MB) map suitable for use by a video encoder.
[0010] ROI signaling may involve in-band or out-of-band signaling of ROI information from a recipient to a sender device. ROI tracking involves dynamic adjustment of the ROI map in response to ROI motion. Access authentication may involve granting access rights and levels to recipient devices for purposes of remote ROI control, as well as resolving ROI control conflicts between local and remote users or among multiple remote users.
[0011] ROI extraction may involve processing a user description of a region of interest (ROI) to generate information specifying the ROI based on the description. Near-end video may be encoded based on the information specifying the ROI to enhance image quality of the ROI relative to non-ROI areas of the near-end video. The user description may be textual, graphical, or speech-based. An extraction module applies appropriate processing to generate the ROI information from the user description.
The extraction module may reside locally within a video communication device, or in a distinct intermediate server configured for ROI extraction.
[0012] In one embodiment, the disclosure provides a method comprising receiving, from a remote device, information specifying a region of interest (ROI) within near-end video encoded by a local device and received by the remote device, and encoding the near-end video based on the ROI to enhance image quality of the ROI relative to non-ROI areas of the video.
[0013] In another embodiment, the disclosure provides a video encoding device comprising a region-of-interest (ROI) engine that receives, from a remote video communication device, information specifying an ROI within near-end video transmitted to the remote device, and a video encoder that encodes the near-end video to enhance image quality of the ROI relative to non-ROI areas of the video.
[0014] In an additional embodiment, the disclosure provides a method comprising generating information specifying an ROI within far-end video transmitted by a remote device and received by a local device, and transmitting the information to the remote device for use in encoding the far-end video based on the ROI to enhance image quality of the ROI relative to non-ROI areas of the video.
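As a rough illustration of the ROI mapping described above, the following Python sketch turns a rectangular ROI into a binary macroblock map. It is not taken from the patent: the function name `roi_rect_to_mb_map`, the pixel-coordinate convention, the 16x16 macroblock size, and the rule that any overlap marks a macroblock as ROI are all assumptions made for illustration.

```python
MB_SIZE = 16  # assumed macroblock edge length in pixels


def roi_rect_to_mb_map(frame_w, frame_h, rect, mb_size=MB_SIZE):
    """Return a row-major map with 1 for macroblocks that overlap the ROI, else 0.

    rect is a hypothetical (x0, y0, x1, y1) tuple in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = rect
    if x1 <= x0 or y1 <= y0:
        raise ValueError("ROI rectangle must have positive width and height")

    # Round up so partial macroblocks at the right and bottom edges are counted.
    cols = (frame_w + mb_size - 1) // mb_size
    rows = (frame_h + mb_size - 1) // mb_size

    mb_map = []
    for r in range(rows):
        top, bottom = r * mb_size, (r + 1) * mb_size
        row = []
        for c in range(cols):
            left, right = c * mb_size, (c + 1) * mb_size
            # Assumed rule: a macroblock belongs to the ROI if it overlaps the rectangle.
            overlaps = left < x1 and right > x0 and top < y1 and bottom > y0
            row.append(1 if overlaps else 0)
        mb_map.append(row)
    return mb_map


if __name__ == "__main__":
    # QCIF frame (176x144) with an illustrative ROI rectangle.
    for row in roi_rect_to_mb_map(176, 144, (40, 20, 100, 90)):
        print("".join(str(flag) for flag in row))
```

For a 176x144 frame this yields a map of 9 rows by 11 columns. A stricter rule, such as requiring a macroblock to lie mostly inside the rectangle, would be an equally plausible choice.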
[0015] In a further embodiment, the disclosure provides a video encoding device comprising a region-of-interest (ROI) engine that generates information specifying an ROI within far-end video received from a remote device, and a video encoder that encodes near-end video and transmits the information specifying the ROI with the encoded near-end video for use by the remote device in encoding the far-end video based on the ROI to enhance image quality of the ROI relative to non-ROI areas of the far-end video.
[0016] In another embodiment, the disclosure provides a method comprising receiving from a user a description of an ROI within near-end video generated by a local device, generating information specifying the ROI based on the description, and encoding the near-end video based on the information specifying the ROI to enhance image quality of the ROI relative to non-ROI areas of the near-end video.
[0017] In an additional embodiment, the disclosure provides a video encoding device comprising an ROI engine that receives a description of an ROI within near-end video encoded by the device and generates information specifying the ROI based on the description, and a video encoder that encodes the near-end video to enhance image quality of the ROI relative to non-ROI areas of the video.
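The embodiments above encode video so that the ROI has higher image quality than non-ROI areas. One possible way to express that, sketched below in Python, is to give ROI macroblocks a lower quantization parameter (finer quantization) than non-ROI macroblocks. The patent does not prescribe this scheme: the function name `assign_mb_qp`, the offsets, and the 0 to 51 range (borrowed from H.264) are assumptions for illustration only.

```python
def assign_mb_qp(mb_map, base_qp=30, roi_delta=-6, non_roi_delta=4,
                 qp_min=0, qp_max=51):
    """Return a per-macroblock QP map from a 0/1 ROI macroblock map.

    ROI macroblocks (flag 1) get base_qp + roi_delta, others get
    base_qp + non_roi_delta; results are clamped to [qp_min, qp_max].
    All default values are illustrative assumptions.
    """
    def clamp(qp):
        return max(qp_min, min(qp_max, qp))

    return [
        [clamp(base_qp + (roi_delta if flag else non_roi_delta)) for flag in row]
        for row in mb_map
    ]


if __name__ == "__main__":
    roi_map = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
    ]
    for row in assign_mb_qp(roi_map):
        print(row)
```

Lowering the QP inside the ROI while raising it elsewhere shifts bits toward the ROI. A real encoder would also need rate control to keep the total bit budget in check, which this sketch leaves out.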
[0018] In a further embodiment, the disclosure provides a video encoding system comprising: a first video communication device that encodes near-end video; a second video communication device that receives the near-end video from the first video communication device, wherein the second video communication device generates a user description of a region of interest (ROI) within the near-end video generated by the first video communication device; and an intermediate server, structurally distinct from the first and second video communication devices, that generates information specifying the ROI based on the description, wherein the first video communication device encodes the near-end video based on the information specifying the ROI to enhance image quality of the ROI relative to non-ROI areas of the near-end video.
[0019] The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the techniques may be realized in part by a computer-readable medium comprising program code containing instructions that, when executed, perform one or more of the methods described herein.
[0020] The details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Brief Description of Drawings
[0021] FIG. 1 is a block diagram illustrating a video encoding and decoding system incorporating ROI-aware video encoder-decoders (CODECs).
[0022] FIG. 2 is a diagram illustrating definition of an ROI within a video scene presented on a display associated with a wireless communication device.
[0023] FIG. 3 is a block diagram illustrating a communication device incorporating an ROI-aware CODEC.
[0024] FIG. 4 is a block diagram illustrating another communication device having an ROI-aware CODEC and further incorporating an ROI extraction module.
[0025] FIG. 5 is a block diagram illustrating distributed ROI extraction via an intermediate extraction server.
[0026] FIG. 6 is a block diagram illustrating distributed ROI extraction for multiple video telephony sessions.
[0027] FIGS. 7A-7D are diagrams illustrating pre-defined ROI patterns for selection by a user.
[0028] FIG. 8 is a flow diagram illustrating generation of ROI information at a recipient device to control preferential ROI encoding of near-end video at a remote sender device.
[0029] FIG. 9 is a flow diagram illustrating processing of ROI information from a recipient device for preferential ROI encoding of near-end video at a sender device in combination with ROI tracking.
[0030] FIG. 10 is a flow diagram illustrating processing of ROI information from a recipient device for preferential ROI encoding of near-end video at a sender device in combination with user authentication.
[0031] FIG. 11 is a flow diagram illustrating selection of pre-defined ROI patterns.
[0032] FIG. 12 is a diagram illustrating definition of an ROI pattern in a displayed video scene by expanding and contracting an ROI template.
[0033] FIG. 13 is a diagram illustrating definition of an ROI pattern in a displayed video scene by dragging an ROI template.
[0034] FIG. 14 is a diagram illustrating definition of an ROI pattern in a displayed video scene by drawing an ROI area on a touchscreen with a stylus.
[0035] FIG. 15 is a diagram illustrating definition of an ROI pattern in a displayed video scene using a drop-down menu with specified ROI objects to be dynamically extracted and tracked.
[0036] FIG. 16 is a diagram illustrating definition of an ROI pattern in a displayed video scene using a drop-down menu with specified ROI objects mapped to pre-defined ROI patterns as in FIGS. 7A-7D.
[0037] FIG. 17 is a flow diagram illustrating definition of an ROI pattern in a displayed video scene using an ROI description interface.
[0038] FIG. 18 is a flow diagram illustrating resolution of ROI conflicts between sender and recipient devices.
[0039] FIG. 19 is a flow diagram illustrating preferential decoding of ROI macroblocks within far-end video.
Detailed Description
[0040] FIG. 1 is a block diagram illustrating a video encoding and decoding system 10 incorporating ROI-aware video encoder-decoders (CODECs). As shown in FIG. 1, system 10 includes a first video communication device 12 and a second video communication device 14. Communication devices 12, 14 are connected by a transmission channel 16.
Transmission channel 16 may be a wired or wireless medium. System 10 supports two-way video transmission between video communication devices 12, 14 for video telephony. Devices 12, 14 may operate in a substantially symmetrical manner. In some embodiments, however, one or both video communication devices 12, 14 may be configured for only one-way communication to support ROI-aware video streaming.
[0041] For two-way applications, reciprocal encoding, decoding, multiplexing (MUX) and demultiplexing (DEMUX) components may be provided on opposite ends of channel 16. In the example of FIG. 1, video communication device 12 includes MUX/DEMUX component 18, ROI-aware video CODEC 20 and audio CODEC 22. Similarly, video communication device 14 includes MUX/DEMUX component 26, ROI-aware video CODEC 28 and audio CODEC 30. Each CODEC 20, 28 is "ROI-aware" in the sense that it can process ROI information provided remotely by the other video communication device 12, 14 or locally by its own video communication device.
[0042] Video communication devices 12, 14 may be implemented as wireless mobile terminals or wired terminals equipped for video streaming, video telephony, or both. To that end, video communication devices 12, 14 may further include appropriate wireless transmitter, receiver, modem, and processing electronics to support wireless communication.
Examples of wireless mobile terminals include mobile radio telephones, mobile personal digital assistants (PDAs), mobile computers, and other mobile devices equipped with wireless communication capabilities and video encoding and/or decoding capabilities. Examples of wired terminals include desktop computers, video telephones, network appliances, set-top boxes, interactive televisions, and the like. Either video communication device 12, 14 may be configured to send video information, receive video information, or both send and receive video information.
[0043] For video telephony applications, it is generally desirable that device 12 support both video send and video receive capabilities. However, streaming video applications are also contemplated. In video telephony, and particularly mobile video telephony by wireless communication, bandwidth is a significant concern. Selectively allocating additional encoding bits to an ROI, or applying other preferential encoding steps, can therefore improve the image quality of a portion of the video while maintaining overall encoding efficiency. For preferential encoding, additional bits may be allocated to the ROI, while a reduced number of bits may be allocated to non-ROI regions, such as the background in a video scene.
[0044] In general, system 10 employs region-of-interest (ROI) processing techniques for video telephony (VT) applications. However, such techniques also may be applicable to video streaming applications, as mentioned above.
For purposes of illustration, it will be assumed that each video communication device 12, 14 is capable of operating as both a sender and a recipient of video information, and thereby operating as a full participant in a VT session. For video information transmitted from video communication device 12 to video communication device 14, video communication device 12 is the sender device and video communication device 14 is the recipient device. Conversely, for video information transmitted from video communication device 14 to video communication device 12, video communication device 12 is the recipient device and video communication device 14 is the sender device. When discussing video information to be encoded and transmitted by a local video communication device 12, 14, the video information will be referred to as "near-end" video. When discussing video information to be encoded by and received from a remote video communication device 12, 14, the video information will be referred to as "far-end" video.
[0045] According to the disclosed techniques, when operating as a recipient device, video communication device 12 or 14 defines ROI information for far-end video information received from a sender device. Again, video information received from a sender device is referred to as "far-end" video information in the sense that it is received from the other (sender) device situated at the far end of the communication channel.
Likewise, ROI information defined for video information received from a sender device is referred to as "far-end" ROI information. Far-end ROI generally refers to a region within the far-end video that most interests a recipient of the far-end video. The recipient device decodes the far-end video information and presents the decoded far-end video to a user via a display device. The user selects an ROI within a video scene presented by the far-end video.
[0046] The recipient device generates far-end ROI information based on the ROI selected by the user, and sends the far-end ROI information to the sender device. The far-end ROI information may take the form of an ROI macroblock (MB) map, which defines the ROI in terms of the MBs that reside within the ROI. The ROI MB map may flag MBs that reside within the ROI with a 1 and MBs outside the ROI with a 0, to readily identify MBs included in (1) and excluded from (0) the ROI. An MB is a video block that forms part of a frame. The size of the MB may be 16 by 16 pixels. However, other MB sizes are possible. Accordingly, an MB may refer to any video block, including but not limited to a macroblock as defined within a particular video coding standard such as MPEG-1, MPEG-2, MPEG-4, ITU H.263, ITU H.264, or any other standard.
[0047] Using the far-end ROI information transmitted by the recipient device, the sender device applies preferential encoding to a corresponding ROI within the video scene.
In particular, additional encoding bits may be allocated to the ROI, while a reduced number of encoding bits may be allocated to non-ROI regions, thereby improving image quality of the ROI. In this manner, the recipient device is able to remotely control ROI encoding of far-end video information by the sender device. The preferential encoding applies higher-quality encoding to the ROI area than to non-ROI areas of the video scene, e.g., by preferential bit allocation or preferential quantization in the ROI area. The preferentially encoded ROI permits the user of the recipient device to view an object or region more clearly. For example, the user of the recipient device may wish to view a face or some other object more clearly than background regions of the video scene.
[0048] When operating as a sender device, video communication device 12 or 14 may also define ROI information for video information transmitted by the sender device. Again, video information generated in the sender device is referred to as "near-end" video in the sense that it is generated at the near end of the communication channel. ROI information generated by the sender device is referred to as "near-end" ROI information. Near-end ROI generally refers to a region of the near-end video that a sender wants to emphasize to a recipient. Hence, an ROI may be specified by a recipient device user as far-end ROI information, or by a sender device user as near-end ROI information. The sender device presents the near-end video to a user via a display device.
The user associated with the sender device selects an ROI within a video scene presented by the near-end video. The sender device encodes the near-end video using the user-selected ROI such that the ROI in the near-end video is preferentially encoded, e.g., with higher-quality encoding, relative to non-ROI areas.
[0049] The near-end ROI selected by a local user at the sender device allows a user of the sender device to emphasize regions or objects within the video scene, and thereby direct such regions or objects to the attention of the recipient device user. Notably, the near-end ROI selected by the sender device user need not be transmitted to the recipient device. Instead, the sender device uses the selected near-end ROI information to locally encode the near-end video before it is transmitted to the recipient device. In some embodiments, however, the sender device may send ROI information to the recipient device to permit application of preferential decoding techniques, such as higher-quality error correction (e.g., error concealment) or post-processing (e.g., deblocking and deringing filters).
[0050] If ROI information is provided by both the sender device and the recipient device, the sender device applies either the far-end ROI information received from the recipient device or the locally generated near-end ROI information to encode the near-end video.
ROI conflicts may arise between the near-end and far-end ROI selections provided by the sender device and recipient device. Such conflicts may require resolution, such as active resolution by a local user or resolution according to specified access rights and levels, as will be described elsewhere in this disclosure. In either case, the sender device preferentially encodes the ROI based on near-end ROI information provided locally by the sender device or ROI information provided remotely by the recipient device.
[0051] To facilitate ROI processing, the disclosure further contemplates techniques for ROI selection, ROI mapping, ROI signaling, ROI tracking, and access authentication of recipient devices to permit remote control of ROI encoding by a sender device. As will be described, different ROI selection techniques applied by a recipient device or sender device may involve selection of pre-defined ROI patterns, verbal or textual ROI description, or ROI drawing by a user. In a recipient device, ROI mapping involves translation of a selected far-end or near-end ROI pattern into an ROI map, which may take the form of a macroblock (MB) map. ROI signaling may involve in-band or out-of-band signaling of far-end ROI information from a recipient device to a sender device. ROI tracking involves dynamic adjustment of the far-end ROI map produced by the recipient device, or of the local near-end ROI produced by the sender itself, in response to ROI motion.
Access authentication may involve granting access rights and levels to a recipient device for purposes of remote control of the far-end ROI and for resolving ROI control conflicts between recipient and sender devices.

[0052] System 10 may support video telephony according to the Session Initiation Protocol (SIP), the ITU H.323 standard, the ITU H.324 standard, or other standards. Each video CODEC 20, 28 generates encoded video data according to a video compression standard such as MPEG-2, MPEG-4, ITU H.263, or ITU H.264. As further shown in FIG. 1, video CODECs 20, 28 may be integrated with respective audio CODECs 22, 30 and include appropriate MUX/DEMUX components 18, 26 to handle the audio and video portions of a data stream. MUX/DEMUX units 18, 26 may conform to the ITU H.223 multiplexer protocol or to other protocols such as the User Datagram Protocol (UDP).

[0053] FIG. 2 is a diagram illustrating definition of an ROI 32 within a video scene 34 presented on a display 36 associated with a wireless communication device 38. In the example of FIG. 2, ROI 32 is a rectangular region containing the face 39 of a person presented in video scene 34, although an ROI may contain any image or object for which improved or enhanced encoding is desired. In VT applications, the person presented in video scene 34 will typically be the user of a remote sender device who is a party to a videoconference with the user of wireless communication device 38, which operates as the recipient device.
ROI 32 constitutes a far-end ROI because it defines an ROI within the video scene transmitted from the remote sender device. In accordance with this disclosure, far-end ROI 32 is transmitted to the sender device to specify preferential encoding of the area of the video scene within the ROI. In this manner, the local user of recipient device 38 can remotely control the image quality of far-end ROI 32. As will be described, the size, shape, and position of far-end ROI 32 may be fixed or adjustable, and may be defined, described, or adjusted in a variety of ways.

[0054] ROI 32 permits the recipient-device user to view individual objects within video scene 34, such as a person's face 39, more clearly. Face 39 within ROI 32 is encoded with higher image quality relative to non-ROI areas of video scene 34, such as background regions. In this way, the user can more clearly view facial expressions, lip movement, eye movement, and so forth. ROI 32 may alternatively be used to specify any object other than a face, however. In general, the ROI in VT applications can be highly subjective and may differ from user to user. The desired ROI also depends on how VT is used. In some cases, VT may be used to view and evaluate objects rather than to hold a videoconference.

[0055] For example, a husband may use a VT application to show a gift he is thinking of buying at an airport gift shop. The husband may want a second opinion from his wife in a timely and interactive manner.
That way, he can make a decision immediately, because his flight is about to depart. In this case, the ROI is the region covering the gift the husband is considering. By permitting the wife (or the husband) to select the ROI, it is possible to achieve better encoding or better quality of service for that particular ROI, and thereby allow the wife to view the gift more clearly.

[0056] As another example, two or more engineers may hold a VT call that involves presenting and discussing various equations or diagrams on a whiteboard. In this case, a remote user may wish to view a region of the whiteboard with better image quality, e.g., to see the details of an equation more clearly. To that end, the remote user selects an ROI containing the equation. In addition, as an engineer adds to the whiteboard, the remote user may wish to move the ROI to follow the newly added material. The ability of the remote user to specify the ROI can significantly improve the exchange of information during a technical discussion.

[0057] The ROI techniques described herein improve not only the video quality of the ROI but also the video interaction between two users. In general, conventional VT applications merely combine two one-way video transmissions, and any interaction is verbal. In conventional VT applications there is typically no interaction on the video side. Permitting the recipient-device user to have at least limited control over the video content received from the sender device during a VT call allows for more video interaction.
[0058] In this manner, a VT application can be designed so that the recipient-device user can select an ROI and send the ROI information back to the sender device for preferential processing of the ROI, such as higher-quality encoding (e.g., by allocating more coding bits) or stronger error protection (e.g., intra-MB refresh). In effect, by specifying a far-end ROI, the recipient-device user can remotely control the sender device's encoder. In addition, the far-end ROI information can be used by an ROI-aware video decoder in the device that receives the far-end video, for better post-processing such as error concealment, deblocking, or deringing. Remote control of a video encoder by the recipient of the encoded video differs from mere control of the pan, tilt, zoom, or focus of a remote camera. With remote ROI processing, by contrast, the user is able to influence the quality of the encoding applied to a particular region. In some embodiments, however, remote camera control may be provided in combination with remote video encoder control.

[0059] FIG. 3 is a block diagram illustrating a video communication device 12 incorporating an ROI-aware CODEC. Although FIG. 3 depicts video communication device 12 of FIG. 1, video communication device 14 may be similarly constructed. Again, video communication device 12 or 14 may function as a recipient device, a sender device, and preferably both.
As shown in FIG. 3, video communication device 12 includes ROI-aware CODEC 20, video capture device 40, and user interface 42. Although channel 16 is shown in FIG. 3, the MUX/DEMUX and audio components are omitted for ease of illustration. Video capture device 40 may be a video camera integrated with, or operatively coupled to, video communication device 12. In some embodiments, for example, video capture device 40 may be integrated with a mobile telephone to form a so-called video camera phone. In this manner, video capture device 40 may support mobile VT applications.

[0060] User interface 42 may include a display device, such as a liquid crystal display (LCD), a plasma screen, a projector display, or any other display apparatus that may be integrated with, or operatively coupled to, video communication device 12. The display device presents video images to the user of video communication device 12. The video images may include near-end video obtained locally by video capture device 40 as well as far-end video transmitted remotely from a sender device. In addition, user interface 42 may include any of a variety of user input media, including hard keys, soft keys, various pointing devices, styluses, and the like, for entry of information by the user of video communication device 12. In some embodiments, the display device and the user input media of user interface 42 may be integrated with a mobile telephone.
The user of video communication device 12 relies on user interface 42 to view far-end video and, optionally, near-end video. The user also relies on user interface 42 to enter information for defining or selecting a far-end ROI and, optionally, a near-end ROI.

[0061] As further shown in FIG. 3, ROI-aware CODEC 20 includes ROI engine 44, ROI-aware video encoder 46, and ROI-aware video decoder 48. ROI-aware video encoder 46 encodes the near-end video ("Near-End Video") obtained from video capture device 40 for transmission to a remote recipient device. Again, the term "near-end" denotes video generated locally within video communication device 12, in contrast to "far-end" video received from a remote video communication device such as video communication device 14. In the example of FIG. 3, ROI-aware video encoder 46 uses near-end ROI information obtained from the remote recipient ("Remote Near-End ROI") to preferentially encode the near-end ROI. The remote recipient is the user associated with remote video communication device 14.

[0062] From the remote user's perspective, the remote near-end ROI is a remote far-end ROI when it is transmitted by remote device 14; from the perspective of the local user of device 12, it is referred to as a remote near-end ROI when it is received. That is, the perspective of device 12 or 14 as sender or recipient determines whether the video and ROI are considered near-end or far-end.
Again, the user of local device 12, who remotely controls video encoding at remote device 14, specifies a far-end ROI. When the user of remote device 14 receives that far-end ROI, however, it is considered a remote near-end ROI because it pertains to the near-end video being encoded locally by device 14. In general, perspective is important for purposes of the labels used in this disclosure.

[0063] Optionally, ROI-aware video encoder 46 may use near-end ROI information obtained from the local user of video communication device 12 ("Local Near-End ROI"). The local near-end ROI may also be referred to as a sender-driven ROI because it is generated by the sender of the encoded near-end video. The local near-end ROI information is used by local encoder 46 and ordinarily is not sent to the other video communication device 14, unless the video decoder in remote device 14 is designed to apply preferential decoding to the near-end ROI specified by the user of sender device 12. The remote near-end ROI may also be referred to as a receiver-driven ROI because it is generated by the remote recipient of the encoded near-end video. The remote near-end ROI permits the recipient of the video generated by video communication device 12 to control the ROI encoding performed by ROI-aware encoder 46, whereas the local near-end ROI permits the sender of that video to control the ROI encoding performed by ROI-aware encoder 46.
In some cases, as will be described, the remote and local ROI definitions may conflict, so that conflict resolution is required.

[0064] The local and remote near-end ROI information may be provided to ROI-aware encoder 46 as a near-end ROI macroblock (MB) map ("Near-End ROI MB Map"). The near-end ROI MB map identifies the particular MBs that reside within the receiver-driven near-end ROI or the sender-driven near-end ROI. ROI-aware encoder 46 preferentially encodes the ROI in the near-end video with higher-quality encoding, stronger error protection, or both, to improve the image quality of the ROI as viewed, e.g., by the remote user at remote video communication device 14. Better error protection for the ROI may be especially desirable in wireless telephony applications. The resulting encoded near-end video ("Encoded Near-End Video") is then transmitted to remote device 14.

[0065] As will be explained, ROI-aware video encoder 46 also transmits far-end ROI information ("Far-End ROI") that has been generated by the local user of video communication device 12 for the far-end video received from remote video communication device 14. The far-end ROI serves as a receiver-driven ROI for the video encoded by remote video communication device 14.
In effect, the far-end ROI information transmitted by video communication device 12 permits at least partial control of the encoder of the far-end video generated by remote video communication device 14, just as the remote near-end ROI received by ROI-aware decoder 48 is used by video communication device 12 to control ROI-aware video encoder 46. In this manner, each video communication device 12, 14 can influence the ROI encoding of the far-end video generated by the other device.

[0066] The far-end ROI information transmitted by video communication device 12 may be conveyed as in-band or out-of-band signaling information. In the case of in-band signaling, the far-end ROI information may be embedded in the encoded near-end video bitstream transmitted to remote video communication device 14. In the MPEG-4 bitstream format, for example, there is a field called "user_data" that can be used to embed information describing the bitstream. The "user_data" field, or a similar field in other bitstream formats, may be used to embed far-end ROI information without violating bitstream compliance. Alternatively, the ROI information may be embedded in the video bitstream by so-called data hiding techniques, such as steganography.

[0067] ROI-aware video decoder 48 is configured to look for ROI information in the user_data field or elsewhere within the far-end video arriving from the remote device. In the case of out-of-band signaling, a signaling protocol such as ITU H.245 or SIP may be used to convey the far-end ROI information. In either case, the far-end ROI information may take the form of an ROI MB map or of physical coordinates defining the position and/or size of the far-end ROI. Once decoder 48 receives the far-end video bitstream, it retrieves the ROI information based on the format agreed upon with the remote sender device and passes the ROI information to access authentication module 58 to obtain access permission for near-end ROI control before the remote near-end ROI is provided to video encoder 46.

[0068] In addition to controlling the remote video encoder to preferentially encode the ROI in the far-end video, the far-end ROI information may also be applied to the local video decoder to preferentially decode the MBs within the ROI in the far-end video. As further shown in FIG. 3, for example, the same far-end ROI MB map generated by ROI mapper 54 for transmission to the remote encoder may be provided to ROI-aware video decoder 48. ROI-aware video decoder 48 uses the ROI MB map to preferentially decode MBs within the far-end video received from remote video communication device 14. For example, ROI-aware video decoder 48 may apply better post-processing to ROI MBs than to non-ROI MBs. Additionally or alternatively, ROI-aware video decoder 48 may apply more robust error concealment techniques to ROI MBs than to non-ROI MBs.
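In this manner, ROI-aware video decoder 48 relies on the far-end ROI information generated by the local user to preferentially decode the ROI portion of the incoming far-end video for enhanced image quality.

[0069] ROI-aware video decoder 48 receives incoming far-end video from a remote video communication device (e.g., video communication device 14 of FIG. 1). ROI-aware video decoder 48 decodes the far-end video and provides the decoded video to user interface 42 for presentation to the local user on the display device. In addition, as described above, ROI-aware video decoder 48 receives remote near-end ROI information ("Remote Near-End ROI") from remote video communication device 14. The near-end ROI information received by ROI-aware video decoder 48 is generated by the user of remote video communication device 14 to specify an ROI in the video transmitted by video communication device 12. As described above, it is used to remotely control ROI-aware video encoder 46 so that the ROI in the near-end video generated by video communication device 12 is preferentially encoded. The remote near-end ROI is transmitted by in-band or out-of-band signaling techniques.

[0070] With further reference to FIG. 3, ROI-aware video encoder 46 and ROI-aware video decoder 48 interact with ROI engine 44. ROI engine 44 processes the local and remote near-end ROI information for use in encoding and transmitting the near-end video bitstream from video capture device 40. In addition, ROI engine 44 processes the far-end ROI information provided via user interface 42 for encoding and transmission to remote video communication device 14. ROI engine 44 includes ROI controller 52, ROI mapper 54, ROI tracking module 56, and authentication module 58. In some embodiments, ROI tracking module 56 and authentication module 58 may be optional.

[0071] ROI-aware video encoder 46, ROI-aware video decoder 48, ROI controller 52, ROI mapper 54, ROI tracking module 56, and authentication module 58 may be formed in a variety of ways, as discrete functional modules or as a monolithic module that encompasses the functionality ascribed to each module. In either case, the various components of ROI-aware CODEC 20, including ROI engine 44, video encoder 46, and video decoder 48, may be realized in hardware, software, firmware, or a combination thereof. For example, such components may operate as software processes executing on one or more microprocessors or digital signal processors (DSPs), one or more application specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. If implemented in software, the techniques may be realized in part by a computer-readable medium comprising program code containing instructions that, when executed in a processor or DSP, perform one or more of the methods described herein.

[0072] In operation, the user of video communication device 12 selects either the near-end video generated by video capture device 40 or the far-end video decoded by ROI-aware video decoder 48 for viewing on the display device associated with user interface 42. In some embodiments, picture-in-picture (PIP) functionality may permit the user to view near-end and far-end video simultaneously. To view near-end or far-end video for purposes of ROI definition, the user may manipulate user interface 42 to invoke an ROI definition mode. By default, video communication device 12 may handle video encoding and decoding without regard to ROI. By entering the ROI definition mode, the user activates the ROI-aware encoding and decoding aspects of video communication device 12. Alternatively, ROI-aware encoding and decoding may be the default mode.

[0073] When far-end video is presented, the user indicates an ROI in the far-end video using any of a variety of techniques, which will be described in greater detail. The far-end ROI highlights a region or object within the video scene that is of interest to the user or that requires higher image quality. User interface 42 generates a far-end ROI indication based on the user input. The ROI indication may be further processed by ROI engine 44 to produce far-end ROI information for transmission to video communication device 14.

[0074]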
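Alternatively, the user may select the near-end video obtained from video capture device 40 for ROI definition. When near-end video is presented, the user may optionally indicate an ROI in the near-end video using techniques similar or identical to those used for ROI indication in the far-end video. A near-end or far-end ROI may be specified initially at the start of a VT call or at any time during the course of the call. In some embodiments, the initial ROI may be updated by the local or remote user, or updated automatically by ROI tracking module 56. If the ROI is updated automatically, the user need not continue to enter ROI information. Instead, the ROI is maintained based on the user's initial input until the user changes or discontinues the ROI.

[0075] User interface 42 generates a local near-end ROI indication based on the indication provided by the user. Like the far-end ROI indication, the near-end ROI indication may be further processed by ROI engine 44. The near-end ROI indication highlights, i.e., by increased image quality, a region or object within the video scene that the user wishes to emphasize to the remote user. The local user may select a near-end or far-end ROI by choosing a predefined ROI pattern or drawing an ROI pattern via user interface 42. Drawing an ROI pattern may involve free-hand drawing with a stylus, or resizing or repositioning a default ROI pattern.

[0076] In the example of FIG. 3, user interface 42 provides the local near-end ROI indication (if provided) and the far-end ROI indication to ROI controller 52 within ROI engine 44. In addition, ROI controller 52 receives the remote near-end ROI from ROI-aware video decoder 48 via authentication module 58. In particular, ROI-aware video decoder 48 detects the presence of remote near-end ROI information within the received far-end video stream, or via out-of-band signaling, and provides it to authentication module 58. The local near-end ROI and far-end ROI indications may be expressed in terms of coordinates within a video frame of the respective near-end or far-end video. The ROI coordinates may be x-y coordinates within the video frame. The x-y coordinates are processed, however, to produce an ROI MB map for use by encoder 46 or decoder 48, as will be explained.

[0077] ROI controller 52 processes the local near-end ROI, the remote near-end ROI, and the far-end ROI and applies them to ROI mapper 54. ROI mapper 54 converts the respective ROI coordinates into macroblock (MB) maps. More specifically, ROI mapper 54 generates a far-end ROI MB map that specifies the MBs within the far-end video that correspond to the far-end ROI indicated by the local user. ROI mapper 54 also generates a near-end ROI MB map that specifies the MBs within the near-end video that correspond to the local near-end ROI, the remote near-end ROI, or a combination of both.

[0078] For predefined ROI patterns, ROI mapping is straightforward. Each predefined ROI pattern may have a designated MB map that is likewise predefined. For drawn, repositioned, or resized ROI patterns, however, ROI mapper 54 selects the MB boundaries that most closely conform to the coordinates of the ROI pattern specified by the user. If the specified ROI crosses an MB, for example, ROI mapper 54 places the ROI boundary at either the outer edge or the inner edge of the MB concerned. In other words, ROI mapper 54 may be configured to include in the ROI MB map only those MBs that lie entirely within the ROI, or also those MBs that lie partly within it. In either case, the ROI comprises a set of complete MBs that most closely approximates the specified ROI. Again, video encoder 46 or video decoder 48 operates at the MB level and ordinarily requires that the ROI be translated into an MB map. By designating individual MBs as included in or excluded from the ROI, the ROI MB map permits definition of ROIs with irregular or non-rectangular shapes.

[0079] ROI-aware video encoder 46 transmits the far-end ROI MB map to remote video communication device 14, either within the encoded near-end video or by out-of-band signaling. The near-end ROI MB map is not transmitted to remote video communication device 14. Instead, the near-end ROI MB map is used by ROI-aware video encoder 46 to preferentially encode the designated MBs in the near-end video, with higher-quality encoding or stronger error protection, before transmission to remote video communication device 14. Hence, ROI-aware video encoder 46 transmits the encoded near-end video, with its preferentially encoded ROI, together with the far-end ROI information to remote video communication device 14.

[0080] ROI tracking module 56 tracks changes in the ROI region of the near-end video. If the VT application resides within a mobile video communication device, for example, the user may move from time to time, changing the user's position relative to the previously specified ROI. In addition, even when the user's position is stable, other objects within the ROI may move out of the ROI region. A boat on a lake, for example, may bob up and down or move from side to side with the waves. To avoid the need for the user to redefine the ROI when movement occurs, ROI tracking module 56 may be provided to automatically track objects within the ROI region.

[0081] In the example of FIG. 3, ROI tracking module 56 receives motion information from the encoded near-end video produced by ROI-aware video encoder 46. The motion information may take the form of motion vectors for MBs within the encoded near-end video, permitting closed-loop control of the ROI MB map definition by ROI mapper 54. Based on the motion information, ROI tracking module 56 generates incremental position adjustments to the near-end ROI MB map and supplies the adjustments to ROI mapper 54. A position adjustment may take the form of a change in the status of an MB as included in or excluded from the ROI.

[0082] If the motion information indicates substantial movement of the ROI, then in the ROI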
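MB map the status of MBs may change. Typically, it is the status of MBs at the outer boundary of the ROI that changes. In response to the position adjustments, ROI mapper 54 shifts the ROI specified by the near-end ROI MB map so that the ROI position adapts to motion within the encoded near-end video on a frame-by-frame basis. ROI tracking module 56 and ROI mapper 54 cooperate to adjust the ROI position automatically when motion is detected within the video scene. In this manner, ROI engine 44 adjusts the ROI to track objects moving within it.

[0083] Authentication module 58 serves to resolve the ROI rights of remote users, including the rights of individual users and the priority of rights among multiple users. When ROI-aware video decoder 48 receives a remote near-end ROI from remote video communication device 14, it provides the remote near-end ROI to ROI engine 44. In some cases, however, the remote near-end ROI specified by the remote user may conflict with the local near-end ROI specified by the local user. For example, the local and remote users may specify overlapping ROIs or entirely different ROIs within the video scene. In this case, authentication module 58 may be provided to resolve the ROI conflict.

[0084] In some embodiments, authentication module 58 may apply a so-called "master-slave" mechanism to arbitrate which near-end ROI information (local or remote) should be used at a given time. In particular, before the sender receives receiver-driven ROI information, the sender is the near-end ROI master and controls its near-end ROI. In other words, until a remote near-end ROI is received at video communication device 12, the local user controls the near-end ROI. The remote user is therefore the near-end ROI "slave" and does not control the near-end ROI unless the master, i.e., the local user, grants access rights to control it.

[0085] Once the local user grants access rights to the remote user, the local user no longer controls its near-end ROI. Instead, the remote user associated with video communication device 14 obtains control of the near-end ROI for the near-end video generated by video communication device 12 and becomes the near-end ROI master. The remote user may retain control until the local user explicitly revokes the access privilege or otherwise denies access to the remote user, or until the remote user discontinues the ROI selection, in which case master ROI control may revert to the local user.

[0086] Once ROI-aware video decoder 48 receives encoded far-end video, if any, it retrieves the remote near-end ROI information from the video bitstream based on the format agreed upon with the sender. Again, the near-end ROI information may be embedded in the encoded far-end video or sent by out-of-band signaling. In either case, ROI-aware video decoder 48 passes the remote near-end ROI to authentication module 58 to obtain access permission before the remote near-end ROI is sent to ROI-aware video encoder 46 via ROI controller 52 and ROI mapper 54. Authentication module 58 restricts access rights to particular users so that a user cannot control the encoding process without authorization from the local user.

[0087] Authentication module 58 may be configured to grant and manage access rights and to arbitrate among one or more remote users. For example, the local user may grant access rights to selected remote users. Hence, the local user may permit some remote users to control the near-end ROI and prohibit others from doing so. The local user may also assign relative access levels or priorities to remote users. In this manner, the local user can specify a hierarchy of access levels among remote users so that, when multiple remote users request ROI control at the same time, some remote users take priority over others in controlling the near-end ROI. Multiple remote users may request ROI control simultaneously during a multi-party videoconference, for example. In such cases, ROI control ordinarily is granted exclusively to one user: the local user or, if control has been granted by the local user, a selected one of the remote users.

[0088] In some embodiments, authentication module 58 may also be responsible for resource monitoring to determine whether local video communication device 12 has the capability to enable ROI-aware video processing. If the local device does not have sufficient processing resources to support remote ROI control or to satisfy a particular type of ROI request at a given time, authentication module 58 revokes the remote ROI control access rights or denies the ROI request. As one example, bandwidth limitations imposed by the communication channel, or the local processing load, may result in denial of remote ROI control. As another example, such limitations may permit the use of preconfigured ROI patterns but not drawn or described ROI patterns. Authentication module 58 may notify the remote device of the ROI decision by embedding a status message in the outgoing encoded near-end video sent to the remote device.

[0089] In addition, individual remote users may be granted different access levels that govern the degree to which a remote user can control the near-end ROI. For example, a remote user may be limited to selecting a set of predefined ROI patterns, particular ROI positions or sizes, or ROI specifications only upon approval by the local user. Hence, authentication module 58 may resolve a remote user's control of the near-end ROI automatically, or negotiate active approval of the remote user's near-end ROI control through interaction with the local user. For example, when a remote user requests access rights to control the near-end ROI, authentication module 58 may present a query to the local user via user interface 42 requesting approval of ROI control by the remote user.

[0090]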
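Authentication module 58 may track the access levels of remote users in any of a variety of ways. As described above, the local user may actively approve requests from remote users to control the near-end ROI and actively control the access levels granted to remote users. Alternatively, the local user may maintain an address book, in memory within video communication device 12, that stores information associated with remote users, including access rights or levels. The address book may take the form of a database with a list of remote users and their associated access levels. When a remote user requests control of the near-end ROI, authentication module 58 retrieves the pertinent access rights information from the address book and automatically applies an authentication process to resolve ROI control between the local user and the remote user, and possibly among several remote users. If the remote user is not listed in the address book, the local user may elect to add the remote user to the address book with applicable access rights.

[0091] In some cases, the local user may override the default access level specified for a particular remote user in the address book. For example, authentication module 58 may permit the local user to actively reconfigure ROI control priorities among different remote users during the course of a VT call, or to intervene to regain exclusive control of the near-end ROI as the local user. The interaction between the local user and authentication module 58 in maintaining the address book or actively managing ROI control requests is represented by access control information (ACCESS CONTROL INFO) in FIG. 3.

[0092] When a remote user's near-end ROI control is approved, automatically or actively, authentication module 58 passes the remote near-end ROI to ROI controller 52 for processing and for mapping by ROI mapper 54. Alternatively, i.e., if no remote near-end ROI is provided or the local user has elected to control the near-end ROI to the exclusion of remote users, ROI controller 52 processes the local near-end ROI provided by the local user via user interface 42.

[0093] Authentication module 58 serves to resolve ROI conflicts between local and remote users. By default, authentication module 58 applies the master-slave concept, under which the local user has control of the near-end ROI. When a remote user is granted access rights with the highest level, the remote user has full control of the near-end ROI selection for ROI-aware video encoder 46 of video communication device 12. Otherwise, the local user has near-end ROI control, which overrides any near-end ROI selection made by the remote user.

[0094] Although access rights may be granted to remote users, the local user will prevail in near-end ROI control because a remote user's access rights ordinarily carry a lower level than those of the local user. Hence, if the local user elects to specify a near-end ROI, any near-end ROI selection made by the remote user is disregarded. If the local user does not specify a near-end ROI, on the other hand, the level of access rights assigned to the remote user takes effect, and the remote user is able to control the near-end ROI. As described above, however, the local user may still elect to override the default master-slave relationship and relinquish the highest-level access rights afforded to the local user.

[0095] FIG. 4 is a block diagram illustrating another video communication device 12' having an ROI-aware CODEC and further incorporating an ROI extraction module 60. Video communication device 12' of FIG. 4 is nearly identical to video communication device 12 of FIG. 3. Video communication device 12', however, further includes ROI extraction module 60 to form the local near-end ROI and the far-end ROI based on input from the user. In addition to simply processing the selection of preset ROI patterns or permitting the user to draw, reposition, or resize a default ROI, ROI extraction module 60 permits the local user to specify an ROI by a verbal or textual ROI description. In particular, ROI extraction module 60 generates the local near-end ROI or the far-end ROI based on an ROI description provided by the local user.

[0096] Examples of ROI descriptions include textual or verbal input of terms such as "face," "moving object," "lips," "human body," "background," and the like. Preferential encoding of such objects may be highly desirable. Preferential encoding of the lips or face, for example, can better convey facial expression, articulation, and so forth. Textual input may be typed or selected from a menu presented by user interface 42. Verbal input may be provided by speaking into a microphone associated with video communication device 12'. In each case, the local user "describes" the ROI rather than selecting or drawing it. ROI extraction module 60 converts the description into a set of coordinates within the applicable near-end or far-end video scene. Where verbal ROI descriptions are used, user interface 42 or ROI extraction module 60 may include conventional speech recognition capabilities. In particular, ROI extraction module 60 may generate the information specifying the ROI based on one or more recognized terms.

[0097] ROI extraction module 60 selects the ROI coordinates automatically by applying conventional pre-encoding processing algorithms configured to detect the desired ROI. In particular, ROI extraction module 60 may apply an algorithm to perform face detection, feature extraction, object segmentation, or tracking according to conventional techniques known to those skilled in the art of video ROI processing. For example, ROI extraction module 60 may apply conventional techniques that rely on ROI identification based on the luminance or chrominance values of pixels in the video input data.

[0098] Conventional face detection schemes typically involve the use of skin color as a guide to distinguish face pixels from non-face pixels. IEICE Trans. Inf. & Syst., vol. E86-D, no. 1, pp. 101-108, January 2003, C.-W. Lin, Y.-J. Chang and Y.-C.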
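Chen, "A low-complexity face-assisted coding scheme for low bit-rate video telephony," and IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 4, pp. 551-564, June 1999, D. Chai and K. N. Ngan, "Face segmentation using skin-color map in videophone applications," describe examples of conventional face detection schemes.

[0099] When the local user describes the ROI in terms of a "face," ROI extraction module 60 analyzes the near-end or far-end video, as applicable, to automatically identify a face and designate the coordinates associated with the identified face as the ROI. ROI extraction module 60 then passes the coordinates to ROI controller 52 for processing and for mapping by ROI mapper 54. Notably, ROI extraction module 60 processes the local near-end ROI description or the far-end ROI description, as applicable, maps the description to an appropriate extraction algorithm, and automatically analyzes the applicable pre-encoded near-end video or decoded far-end video to extract the appropriate ROI.

[0100] To support automatic ROI detection, ROI extraction module 60 receives near-end video from video capture device 40 and far-end video from ROI-aware video decoder 48. Using the local near-end ROI description or far-end ROI description from user interface 42, together with automated detection algorithms, ROI extraction module 60 generates the local near-end ROI and the far-end ROI, as applicable, for application to ROI controller 52. In each case, ROI extraction module 60 converts the local near-end ROI description or far-end ROI description into the coordinates that best fit the applicable description. In this case, the user need not draw the ROI, nor is the user confined to a set of predefined ROI patterns. Instead, ROI controller 52 proactively detects the appropriate region within the near-end video that matches the ROI description.

[0101] ROI mapper 54 maps the ROI coordinates to the pertinent macroblocks (MBs) within a video frame and generates a near-end or far-end ROI MB map. In effect, ROI mapper 54 translates the ROI coordinates from ROI controller 52 into a format that video encoder 46 can understand. In particular, video encoder 46 is equipped to handle encoding at the MB level, i.e., on an MB-by-MB basis. To that end, ROI mapper 54 generates an ROI MB map for the near-end or far-end video. The ROI MB map identifies the MBs that fall within the specified ROI so that video encoder 46 can apply preferential encoding to those MBs.

[0102] In addition to processing ROI descriptions, ROI extraction module 60 may be equipped to handle ROI patterns that the local user selects from a set of predefined patterns or draws, repositions, or resizes. Hence, video communication device 12' may generate ROI information substantially as described for video communication device 12 of FIG. 3, but further incorporates ROI extraction module 60 to process ROI descriptions entered by the local user in textual or verbal form. ROI extraction module 60 may be desirable in terms of convenience for the local user. Some video communication devices, however, may not have sufficient processing power to support ROI extraction module 60. Accordingly, ROI extraction module 60 represents a desirable but optional component of a video communication device in accordance with this disclosure.

[0103] In some embodiments, ROI extraction module 60 may process ROI descriptions generated not only by the local user but also by a remote user. In this manner, the extraction functionality may be performed remotely rather than locally in some devices. For example, a particular video communication device 14 may not have sufficient local resources or capabilities to support ROI extraction for ROI descriptions provided by the user of device 14. Another video communication device 12, however, may be better equipped to perform ROI extraction. In this case, it is contemplated that local ROI extraction may be offloaded or distributed to the remote video communication device.

[0104] To support remote extraction, the ROI description may be provided to the remote device in a variety of ways. For example, a verbal description may be included in the audio stream transmitted to the remote device. A textual ROI description, as well as a predefined or drawn ROI pattern, likewise may be transmitted to the remote device, e.g., by embedding the information in the encoded video stream. Hence, the ROI information sent from one device to the other may take the form of a pre-processed ROI MB map or of any other indication or description of the ROI, including indications or descriptions that require processing at the remote device before being applied to the remote encoder.

[0105] FIG. 5 is a block diagram illustrating distributed ROI extraction via an intermediate extraction server 61. As shown in FIG. 5, video communication devices 12, 14 may provide intermediate extraction server 61 with sufficient information to permit extraction of the ROI. For example, each device 12, 14 may provide the respective local near-end ROI description, far-end ROI description, encoded or raw near-end video, and encoded far-end video. As an alternative to providing encoded far-end video from the near-end device, ROI extraction server 61 may receive the far-end video directly from the far-end device. Using this information, extraction server 61 generates the far-end ROI and the local near-end ROI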
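(either one or both) and provides them to the respective devices 12, 14. Extraction server 61 may be a server located anywhere within the communication network and may be coupled to devices 12, 14 by wired media, wireless media, or a combination of both. Extraction server 61 may be located at a distance from video communication devices 12, 14 or co-located with one of them. In many cases, however, extraction server 61 will be a remote server. In general, extraction server 61 will be structurally distinct from video communication devices 12, 14.

[0106] Extraction server 61 may function much like extraction module 60 but operates remotely, in a distributed manner, so that ROI extraction need not be performed locally within devices 12, 14. In this manner, the processing cost of ROI extraction can be distributed to a different device that may have greater processing power. Like ROI extraction module 60, extraction server 61 may handle different types of ROI descriptions from users, such as verbal, textual, or graphical descriptions. To that end, ROI extraction server 61 may include suitable capabilities, e.g., speech recognition, to process the descriptions. In addition, ROI extraction server 61 may be equipped with video decoding capability to permit analysis of the video and extraction of the ROI, and with encoding capability to re-encode the video and embed ROI information, as needed.

[0107] FIG. 6 is a block diagram illustrating distributed ROI extraction for multiple video telephony sessions. In the example of FIG. 6, ROI extraction server 61 operates to handle ROI extraction for VT sessions between multiple video communication devices 12A-14A, 12B-14B, 12C-14C through 12N-14N. In this manner, ROI extraction server 61 performs multiple ROI extraction tasks in parallel to support the various VT sessions under way on a given communication network.

[0108] FIGS. 7A-7D are diagrams illustrating predefined ROI patterns for selection by a local or remote user. The ROI patterns of FIGS. 7A-7D are presented for purposes of example and should not be considered limiting. FIG. 7A shows an ROI 62 within video scene 34 presented on display 36 associated with wireless communication device 38. ROI 62 is a basic rectangle substantially centered within video scene 34. The major length of rectangular ROI 62 extends vertically within video scene 34. In many cases, the predefined centered rectangular ROI 62 will effectively capture a human face, i.e., the face of the remote user participating in the VT call.

[0109] FIG. 7B shows another ROI 64, which takes the form of a rectangle having a major length that extends horizontally within video scene 34. ROI 64 is substantially centered within video scene 34 and may be effective in capturing objects such as vehicles, boats, products, presentations, and the like.

[0110] FIG. 7C shows another ROI 66, which is shaped to capture the face and shoulders of a remote user participating in a VT call. Alternatively, ROI 66 may capture the face and shoulders of, e.g., a reporter delivering a newscast, the host of a meeting, or a speaker at a conference in a one-way video streaming application. In any case, predefined ROI 66 focuses on the human VT participant or presenter and achieves preferential encoding of that person's physical features.

[0111] FIG. 7D shows a set of two ROIs 68, 70 presented side by side within video scene 34. In the example of FIG. 7D, ROIs 68, 70 may be effective in capturing the faces of two persons seated or standing side by side. In this manner, the faces of both participants can be preferentially encoded to support higher image quality for facial expressions and movement.

[0112] The predefined ROI patterns depicted in FIGS. 7A-7D are for purposes of illustration. Other predefined ROI patterns with alternative positions or shapes may be provided. For example, some ROI patterns may have circular or irregular shapes, provided they can be mapped to MB boundaries.

[0113] In some embodiments, the user may be permitted to resize or reposition a selected ROI pattern. Conventional pointer and corner-drag techniques may be used for resizing and repositioning. In addition, rescaling of the ROI size may be accomplished by corner dragging or by explicitly specifying a scaling percentage. Of course, as the ROI becomes larger, the degree of preferential encoding is reduced due to bandwidth limitations. In some cases, therefore, a maximum ROI size may be enforced within video communication device 12.

[0114] FIG. 8 is a flow diagram illustrating generation of far-end ROI information at a recipient device to control preferential ROI encoding of near-end video at a sender device. The process depicted in FIG. 8 may be implemented within video communication device 12 of FIG. 3 or video communication device 12' of FIG. 4. In operation, ROI-aware video decoder 48 within video communication device 12 decodes far-end video from a remote sender device, e.g., video communication device 14 (FIG. 1) (72). Upon decoding the far-end video, user interface 42 of recipient device 12 displays the far-end video for viewing by the local user (74).

[0115] If the local user does not request ROI selection (76), no action is taken and the next frame of the far-end video is decoded (72). If ROI selection is requested (76), however, user interface 42 accepts far-end ROI information from the local user (78). ROI controller 52 and ROI mapper 54 then cooperate to generate a far-end ROI MB map (80). ROI-aware encoder 46 embeds the far-end ROI MB map in the encoded near-end video and thereby transmits the far-end ROI map to remote sender device 14, which encodes the far-end video (82). The far-end ROI MB map specifies that the encoder associated with remote video communication device 14 should apply preferential encoding to the MBs within the pertinent ROI of the far-end video to be sent to video communication device 12.

[0116]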
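FIG. 9 is a flow diagram illustrating processing of near-end ROI information from a recipient device for preferential ROI encoding of near-end video at a sender device, in combination with ROI tracking. In the example of FIG. 9, user interface 42 receives the near-end video stream generated by video capture device 40 and presents the near-end video to the local user (84). If neither the local user nor the remote user requests near-end ROI selection (86), all MBs within each video frame are encoded normally (88), i.e., without any preferential encoding of MBs within an ROI. The encoded near-end video is then sent to remote recipient device 14 (89).

[0117] If the local user or the remote user requests near-end ROI selection (86), however, ROI controller 52 and ROI mapper 54 process the pertinent near-end ROI information to generate a near-end ROI MB map (90). If a near-end ROI is specified by both the local user and the remote user, authentication module 58 may intervene to resolve the conflict in favor of one of the ROIs. Upon receiving the near-end ROI MB map (90), ROI-aware video encoder 46 preferentially encodes the MBs within the ROI by applying higher-quality encoding, stronger error protection, or both (92).

[0118] Tracking module 56 tracks the position of the ROI within the near-end video by monitoring the motion information generated by ROI-aware video encoder 46 (94). If no displacement of the ROI is detected (96), the existing ROI map is applied to encode the ROI MBs within the near-end video (100), and the encoded near-end video is sent to the remote recipient device (102). If displacement of the ROI is detected (96), ROI tracking module 56 adjusts the ROI MB map based on the motion information (98) before the near-end video is encoded (100).

[0119] FIG. 10 is a flow diagram illustrating processing of ROI information from a recipient device for preferential ROI encoding of near-end video at a sender device, in combination with user authentication. FIG. 10 depicts operation of authentication module 58 of FIG. 3 or 4 in permitting a remote user to control the near-end ROI, and assumes for simplicity that no local near-end ROI is specified. As shown in FIG. 10, for a near-end video stream generated by video capture device 40 in video communication device 12 (104), authentication module 58 determines whether the remote user of video communication device 14 has requested a remote near-end ROI (106).

[0120] If no remote near-end ROI has been requested (106), and no local near-end ROI is specified, all MBs in the near-end video are encoded normally (110). If a remote near-end ROI has been requested (106), however, authentication module 58 next determines whether the remote user requesting the near-end ROI is authenticated (108). In particular, authentication module 58 may determine the remote user's access rights automatically by reference to an address book stored locally on video communication device 12. Alternatively, authentication module 58 may actively query the local user via user interface 42 to obtain approval or denial of access rights for near-end ROI control by the remote user.

[0121] If the remote user is not authenticated (108), all MBs in the near-end video are encoded normally (110). If the remote user is authenticated (108), however, the remote user is granted near-end ROI control. In this case, ROI controller 52 and ROI mapper 54 process the near-end ROI information from the remote user and generate a near-end MB map (112). Using the near-end MB map, ROI-aware encoder 46 preferentially encodes the MBs identified by the map (114). Video communication device 12 then sends the encoded near-end video to remote video communication device 14 (116).

[0122] FIG. 11 is a flow diagram illustrating selection of a predefined ROI pattern. Once ROI-aware video decoder 48 decodes the far-end video received from remote video communication device 14 (118), the far-end video is displayed to the local user via user interface 42 (120). If the local user requests ROI selection (122), user interface 42 displays a menu of predefined ROI patterns, such as those shown in FIGS. 7A-7D (124). Alternatively, the user may provide an ROI description or may draw, reposition, or resize an ROI pattern. In the example of FIG. 11, however, the operation focuses on presentation of predefined ROI patterns. When the local user selects a predefined ROI pattern (126), ROI controller 52 and ROI mapper 54 define an ROI MB map based on the selected pattern (128). ROI-aware video encoder 46 embeds the ROI MB map within the encoded near-end video and transmits the ROI MB map to remote video communication device 14 (130) for use in preferentially encoding the MBs in the far-end video.

[0123] FIG. 12 is a diagram illustrating definition of an ROI pattern in the displayed video scene 34 by expanding and contracting an ROI template 132. FIG. 12 corresponds generally to FIG. 2 but illustrates presentation of an ROI template 132 that can be resized by the user. In the example of FIG. 12, ROI template 132 can be resized by dragging one of its corners to expand or contract the template. The result of corner dragging to expand ROI template 132 is represented by expanded ROI template 134. Corner dragging increases or decreases the size of ROI template 132 but maintains the relative aspect ratio. In some embodiments, however, the user may also be permitted to drag a side of ROI template 132 to increase or decrease the size of the template while also changing the aspect ratio. Dragging may be accomplished using a stylus in combination with a touchscreen or using another pointing device associated with user interface 42 of video communication device 12. Other pointing devices may include joysticks, touchpads, scroll wheels, trackballs, and the like.

[0124]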
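FIG. 13 is a diagram illustrating definition of an ROI pattern in a displayed video scene by dragging ROI template 132. In particular, FIG. 13 shows ROI template 132 being repositioned by dragging it to another location 135 within video scene 34. Dragging may be accomplished with a stylus and touchscreen, or with another pointing device associated with user interface 42.

[0125] FIG. 14 is a diagram illustrating definition of an ROI pattern in a displayed video scene by drawing ROI pattern 136 on a touchscreen with stylus 138. In the example of FIG. 14, ROI pattern 136 is produced by freehand drawing. ROI controller 52 and ROI mapper 54 cooperate to convert the coordinates associated with the drawn ROI pattern into an MB map that identifies the MBs within video scene 34 that fall substantially within ROI pattern 136. Definition of ROI patterns as shown in FIGS. 12, 13 and 14 is applicable to ROIs within either near-end or far-end video.

[0126] FIG. 15 is a diagram illustrating definition of an ROI pattern in a displayed video scene using drop-down menu 140 with designated ROI objects to be tracked dynamically. As shown in FIG. 15, user interface 42 presents drop-down menu 140, which lists ROI descriptions such as "face," "lips," "background," and "moving." The local user selects one of the entries in the drop-down menu as the desired ROI description. In response, ROI extraction module 60 (FIG. 4) analyzes the near-end or far-end video, as applicable, to detect an ROI pattern corresponding to the description. As an alternative to drop-down menu 140, the user may enter text via user interface 42 or speak the text into a microphone. In each case, conventional feature detection algorithms, such as skin-tone detection, object segmentation, or similar techniques, are used to match the selected ROI to an appropriate ROI pattern. When an ROI pattern is selected, ROI controller 52 and ROI mapper 54 generate the appropriate ROI MB map. The process in FIG. 15 is called "dynamic" because each ROI description must be dynamically matched to an ROI pattern within the particular video scene under consideration.

[0127] FIG. 16 is a diagram illustrating definition of an ROI pattern in a displayed video scene using drop-down menu 142 with designated ROI objects that map to pre-defined ROI patterns such as those in FIGS. 7A-7D. As shown in FIG. 16, user interface 42 presents drop-down menu 142, which lists ROI descriptions such as "single face," "two faces," "head/shoulders," and "object." The local user selects one of the entries in the drop-down menu as the desired ROI pattern. In response, ROI controller 52 matches the selected ROI pattern to the corresponding pre-defined ROI pattern, such as those depicted in FIGS. 7A-7D. Hence, unlike the ROI descriptions shown in FIG. 15, static ROI patterns do not require video analysis. Instead, ROI controller 52 and ROI mapper 54 generate a pre-configured ROI MB map corresponding to the selection in drop-down menu 142. Again, as an alternative to drop-down menu 142, the user may enter text via user interface 42 or speak the text into a microphone. The process in FIG. 16 is called "static" because each ROI pattern corresponds to a pre-defined ROI pattern and MB map.

[0128] FIG. 17 is a flow diagram illustrating definition of an ROI pattern in a displayed video scene using an ROI description interface. The process shown in FIG. 17 may be used with the drop-down menu of FIG. 15 or with other input media. As shown in FIG. 17, ROI-aware video decoder 48 decodes far-end video received from remote sender device 14 (144). User interface 42 then displays the far-end video to the local user (146). If the local user does not request ROI selection for the far-end video (148), no ROI information is sent to remote video communication device 14. If ROI selection is requested (148), however, user interface 42 presents an ROI description interface, such as drop-down menu 140 of FIG. 15 (150).

[0129] Upon receiving the local user's ROI description (152), ROI controller 52 and ROI mapper 54 select an ROI pattern based on the description (154) and define an ROI MB map based on the selected ROI pattern (156). Again, the selected ROI pattern may be determined by analyzing the far-end video using conventional detection techniques and matching the ROI description to particular MBs within the far-end video. When the far-end ROI MB map is generated, ROI-aware video encoder 46 embeds it in the encoded near-end video and transmits it to remote video communication device 14 for use in preferentially encoding the far-end ROI.

[0130] FIG. 18 is a flow diagram illustrating resolution of ROI conflicts between sender and recipient devices 12, 14. In particular, FIG. 18 illustrates operation of authentication module 58 (FIG. 3 or FIG. 4) in resolving a conflict between a near-end ROI specified by the local user and a near-end ROI specified by the remote user. When near-end video is generated at the sender device (160), authentication module 58 determines whether a near-end ROI has been requested by the local user or the remote user (162). If not, all MBs are encoded normally (164) without preferential ROI encoding, and the resulting encoded video is sent to recipient video communication device 14 (166).

[0131] If a near-end ROI is requested (162), authentication module 58 determines whether there is a conflict between the near-end ROI specified by the local user and the near-end ROI specified by the remote user (168). If no remote near-end ROI is specified, or if the local and remote near-end ROIs agree, authentication module 58 may pass the selected near-end ROI to ROI controller 52 for processing.

[0132]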
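If there is no local near-end ROI but a remote near-end ROI has been selected, authentication module 58 may permit the remote near-end ROI to be applied. Alternatively, in some embodiments, authentication module 58 may permit the remote near-end ROI to be applied only when explicit access has been granted to the remote user, either through local user interaction or through an access level recorded in the address book. If there is no ROI conflict, ROI mapper 54 generates a near-end MB map based on the applicable near-end ROI and applies it to ROI-aware video encoder 46. ROI-aware video encoder 46 then preferentially encodes the MBs within the ROI of the near-end video (172).

[0133] If there is a conflict between the local and remote near-end ROIs (168), authentication module 58 determines whether an access level has been assigned, e.g., in an address book stored locally within video communication device 12 (174). If an access level is assigned (174), authentication module 58 resolves the ROI conflict according to the access level (176). For example, the access level stored for the remote user may indicate that the remote user should be granted ROI control in preference to the local user. If no access level is assigned (174), authentication module 58 seeks permission for remote ROI control from the local user (178). In particular, authentication module 58 may present a query via user interface 42 requesting approval of near-end ROI control by the remote user.

[0134] If the local user gives approval, authentication module 58 passes the remote near-end ROI to ROI controller 52 for processing. If approval is not given, ROI controller 52 processes the local near-end ROI. In either case, ROI-aware video encoder 46 uses the selected ROI to preferentially encode the MBs of the near-end video that fall within that ROI (172), and sends the encoded near-end video to remote recipient device 14 (166). In some cases, authentication module 58 may resolve ROI conflicts not only between the local user and a remote user, but also among several remote users. The local user may actively grant one of the remote users access to control the near-end ROI, or assign relative access levels that prioritize the ROI control rights of the respective remote users. Typically, access to control the ROI is granted exclusively to one user, e.g., the local user or one of the remote users.

[0135] FIG. 19 is a flow diagram illustrating preferential decoding of ROI macroblocks within far-end video. As shown in FIG. 19, when far-end video is received from remote sender device 14 (180), ROI-aware video decoder 48 in local recipient device 12 determines whether a far-end ROI has been specified by the local user (182). If not, ROI-aware video decoder 48 decodes all MBs in the far-end video normally (184). If far-end ROI information has been specified by the local user, however, ROI-aware video decoder 48 preferentially decodes the ROI MBs in the received far-end video (186). ROI MBs may be preferentially decoded by applying higher-quality interpolation equations or more robust error concealment techniques than those applied to non-ROI MBs. Preferential decoding may also include preferential post-processing, such as higher-quality deblocking or deringing filters.

[0136] The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the techniques may be realized in part by a computer-readable medium comprising program code containing instructions that, when executed, perform one or more of the methods described above. In this case, the computer-readable medium may comprise random access memory (RAM) such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), FLASH memory, magnetic or optical data storage media, and the like.

[0137] The program code may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. In some embodiments, the functionality described herein may be provided within dedicated software modules or hardware units configured for encoding and decoding, or incorporated in a combined video encoder-decoder (CODEC).

[0138] Various embodiments have been described. These and other embodiments are within the scope of the following claims.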
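
The conflict-resolution flow of FIG. 18 and the conversion of a selected ROI into an MB map can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Roi` class, the function names, the 16x16 macroblock size, the rectangular ROI shape, and the rule that any MB overlapping the rectangle counts as an ROI MB are all assumptions made for illustration. The address-book access level is reduced to a single optional flag, and the query to the local user is modeled as a callback.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MB_SIZE = 16  # assumed macroblock edge in pixels; the description fixes no size


@dataclass(frozen=True)
class Roi:
    """Hypothetical rectangular ROI in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def select_near_end_roi(
    local: Optional[Roi],
    remote: Optional[Roi],
    remote_has_priority: Optional[bool],
    ask_local_user: Callable[[], bool],
) -> Optional[Roi]:
    """Pick one near-end ROI, loosely following steps 162-178 of FIG. 18.

    remote_has_priority stands in for the address-book access level:
    True or False if a level is assigned, None if none is assigned.
    """
    if remote is None or remote == local:
        return local                    # no conflict (may be None: no ROI at all)
    if local is None:
        return remote                   # only a remote ROI was requested
    if remote_has_priority is None:     # no access level assigned (174)
        remote_has_priority = ask_local_user()  # query the local user (178)
    return remote if remote_has_priority else local


def roi_to_mb_map(roi: Optional[Roi], frame_w: int, frame_h: int) -> List[List[bool]]:
    """Return a rows x cols grid; True marks an MB that overlaps the ROI."""
    cols = (frame_w + MB_SIZE - 1) // MB_SIZE
    rows = (frame_h + MB_SIZE - 1) // MB_SIZE
    mb_map = [[False] * cols for _ in range(rows)]
    if roi is None:
        return mb_map                   # all MBs are encoded normally
    for r in range(rows):
        for c in range(cols):
            x0, y0 = c * MB_SIZE, r * MB_SIZE
            mb_map[r][c] = (
                x0 < roi.x + roi.width and x0 + MB_SIZE > roi.x
                and y0 < roi.y + roi.height and y0 + MB_SIZE > roi.y
            )
    return mb_map


if __name__ == "__main__":
    local_roi = Roi(48, 32, 80, 80)
    remote_roi = Roi(0, 0, 32, 32)
    chosen = select_near_end_roi(local_roi, remote_roi, None, lambda: False)
    mb_map = roi_to_mb_map(chosen, 176, 144)  # QCIF frame
    print(sum(map(sum, mb_map)), "ROI macroblocks")
```

An encoder could then consult `mb_map[r][c]` per macroblock to decide where to apply higher-quality encoding or stronger error protection. A stricter inclusion rule, such as requiring most of an MB's area to lie inside the ROI, would be an equally reasonable reading of "fall substantially within."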

Claims (26)

1. A video communication method comprising: receiving, from a local user of a local device, a first description of a first region of interest (ROI) within near-end video to be encoded by the local device, wherein the first description defines the first ROI with respect to the near-end video to be encoded by the local device; receiving, from a remote user of a remote device, a second description of a second ROI within the near-end video to be encoded by the local device, wherein the second description defines the second ROI with respect to the near-end video to be encoded by the local device; selecting one of the first ROI and the second ROI; generating information specifying the selected ROI based on the respective description of the selected ROI; encoding the near-end video on the local device based on the information specifying the selected ROI to enhance image quality of the selected ROI of the near-end video relative to non-ROI areas; and transmitting the encoded near-end video and the information specifying the selected ROI from the local device to the remote device.
2. The method of claim 1, wherein the respective description of the selected ROI is a textual description.
3. The method of claim 1, wherein the respective description of the selected ROI is a spoken description.
4. The method of claim 3, further comprising processing the spoken description by speech recognition, and generating the information specifying the selected ROI based on one or more recognized terms.
5. The method of claim 1, wherein the respective description of the selected ROI is a graphical description.
6. The method of claim 5, wherein the graphical description is received as an area drawn on a user interface screen by at least one of the local user and the remote user.
7. The method of claim 1, further comprising processing the respective description of the selected ROI within an intermediate server distinct from the local device to generate the information specifying the selected ROI.
8. The method of claim 1, further comprising embedding the information specifying the selected ROI within the encoded near-end video.
9. The method of claim 1, further comprising communicating the information specifying the selected ROI from the local device to the remote device by out-of-band signaling.
10. The method of claim 1, further comprising generating information specifying a third ROI within encoded far-end video received from the remote device, and transmitting the third ROI information together with the encoded near-end video to the remote device.
11. The method of claim 1, further comprising decoding encoded far-end video received from the remote device to enhance image quality of a third ROI in the far-end video relative to non-ROI areas of the far-end video.
12. The method of claim 1, further comprising generating a macroblock (MB) map based on the information specifying the selected ROI, the MB map identifying MBs within the selected ROI.
13. The method of claim 1, further comprising: monitoring motion information associated with the encoded near-end video; adjusting the selected ROI based on the motion information; and encoding the near-end video based on the adjusted selected ROI.
14. The method of claim 13, further comprising generating a macroblock (MB) map based on the information specifying the selected ROI, the MB map identifying MBs within the selected ROI, wherein adjusting the selected ROI includes modifying the status of MBs, based on the motion information, to be included in or excluded from the selected ROI.
15. A video communication device comprising: a region-of-interest (ROI) engine that receives, from a local user of the device, a first description of a first ROI within near-end video to be encoded by the device, wherein the first description defines the first ROI with respect to the near-end video to be encoded by the device; receives, from a remote user of a remote device, a second description of a second ROI within the near-end video to be encoded by the device, wherein the second description defines the second ROI with respect to the near-end video to be encoded by the device; selects one of the first ROI and the second ROI; and generates information specifying the selected ROI based on the respective description of the selected ROI; and a video encoder that encodes the near-end video on the device based on the information specifying the selected ROI to enhance image quality of the selected ROI of the near-end video relative to non-ROI areas, and that further transmits the encoded near-end video and the information specifying the selected ROI to the remote device.
16. The device of claim 15, wherein the respective description of the selected ROI is a textual description.
17. The device of claim 15, wherein the respective description of the selected ROI is a spoken description.
18. The device of claim 17, further comprising an extraction module that processes the spoken description by speech recognition and generates the information specifying the selected ROI based on one or more recognized terms.
19. The device of claim 15, wherein the respective description of the selected ROI is a graphical description.
20. The device of claim 19, wherein the graphical description is received as an area drawn on a user interface screen by at least one of the local user and the remote user.
21. The device of claim 15, wherein the ROI engine generates information specifying a third ROI within encoded far-end video received from the remote device, and wherein the device transmits the information specifying the third ROI together with the encoded near-end video to the remote device.
22. The device of claim 15, further comprising a video decoder that decodes encoded far-end video received from the remote device to enhance image quality of a third ROI in the far-end video relative to non-ROI areas of the far-end video.
23. The device of claim 15, further comprising generating a macroblock (MB) map based on the information specifying the selected ROI, the MB map identifying MBs within the selected ROI.
24. The device of claim 15, further comprising a tracking module that monitors motion information associated with the encoded near-end video and adjusts the selected ROI based on the motion information, wherein the encoder encodes the near-end video based on the adjusted selected ROI.
25. The device of claim 24, further comprising a mapper module that generates a macroblock (MB) map based on the information specifying the selected ROI, the MB map identifying MBs within the selected ROI, wherein the tracking module adjusts the selected ROI by modifying the status of MBs, based on the motion information, to be included in or excluded from the selected ROI.
26. A video encoding system comprising: a first video communication device that encodes near-end video; a second video communication device that receives the near-end video from the first video communication device, wherein the first video communication device receives, from a local user of the first video communication device, a first description of a first region of interest (ROI) within the near-end video to be encoded by the first video communication device, wherein the first description defines the first ROI with respect to the near-end video to be encoded by the first video communication device; wherein the first video communication device receives, from a remote user of the second video communication device, a second description of a second ROI within the near-end video to be encoded by the first video communication device; wherein the second description defines the second ROI with respect to the near-end video to be encoded by the first video communication device; wherein the first video communication device selects one of the first ROI and the second ROI; and an intermediate server that is structurally distinct from the first and second video communication devices and that generates information specifying the selected ROI based on the respective description of the selected ROI, wherein the first video communication device encodes the near-end video based on the information specifying the selected ROI to enhance image quality of the selected ROI of the near-end video relative to non-ROI areas, and transmits the encoded near-end video and the information specifying the selected ROI to the second video communication device.
CN 200680014872 2005-03-09 2006-03-08 Region-of-interest extraction for video telephony CN101171841B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US66020005P 2005-03-09 2005-03-09
US60/660,200 2005-03-09
US11/183,072 2005-07-15
US11/183,072 US8019175B2 (en) 2005-03-09 2005-07-15 Region-of-interest processing for video telephony
PCT/US2006/008457 WO2006130198A1 (en) 2005-03-09 2006-03-08 Region-of-interest extraction for video telephony

Publications (2)

Publication Number Publication Date
CN101171841A CN101171841A (en) 2008-04-30
CN101171841B CN101171841B (en) 2012-06-27

Family

ID=39334927

Family Applications (2)

Application Number Title Priority Date Filing Date
CN 200680014872 CN101171841B (en) 2005-03-09 2006-03-08 Region-of-interest extraction for video telephony
CNA2006800145199A CN101167365A (en) 2005-03-09 2006-03-08 Region-of-interest processing for video telephony

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNA2006800145199A CN101167365A (en) 2005-03-09 2006-03-08 Region-of-interest processing for video telephony

Country Status (1)

Country Link
CN (2) CN101171841B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102170552A (en) * 2010-02-25 2011-08-31 株式会社理光 Video conference system and processing method used therein
EP2621167A4 (en) * 2010-09-24 2015-04-29 Gnzo Inc Video bit stream transmission system
CN102025965B (en) * 2010-12-07 2014-01-01 华为终端有限公司 Video talking method and visual telephone
EP2523145A1 (en) * 2011-05-11 2012-11-14 Alcatel Lucent Method for dynamically adapting video image parameters for facilitating subsequent applications
CN103024334B (en) * 2011-09-28 2015-11-25 中国移动通信集团公司 A method for video phone service, systems and devices
CN102438144B (en) * 2011-11-22 2013-09-25 苏州科雷芯电子科技有限公司 Video transmission
US20130279573A1 (en) * 2012-04-18 2013-10-24 Vixs Systems, Inc. Video processing system with human action detection and methods for use therewith
CN102750122B (en) * 2012-06-05 2015-10-21 华为技术有限公司 Multi-screen display control method, apparatus and system for
CN103581603B (en) * 2012-07-24 2017-06-27 联想(北京)有限公司 Transmission method and an electronic device multimedia data
CN103310411B (en) * 2012-09-25 2017-04-12 中兴通讯股份有限公司 An image enhancement method and apparatus for local
EP2936802A4 (en) * 2012-12-18 2016-08-17 Intel Corp Multiple region video conference encoding
US9386275B2 (en) 2014-01-06 2016-07-05 Intel IP Corporation Interactive video conferencing
US9516220B2 (en) 2014-10-02 2016-12-06 Intel Corporation Interactive video conferencing
CN105120366A (en) * 2015-08-17 2015-12-02 宁波菊风系统软件有限公司 A presentation method for an image local enlarging function in video call

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1291315A (en) 1998-03-20 2001-04-11 三菱电机株式会社 Lossy/lossless region of interest image coding

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6178204B1 (en) * 1998-03-30 2001-01-23 Intel Corporation Adaptive control of video encoder's bit allocation based on user-selected region-of-interest indication feedback from video decoder
US7559026B2 (en) * 2003-06-20 2009-07-07 Apple Inc. Video conferencing system having focus control
US20050024487A1 (en) * 2003-07-31 2005-02-03 William Chen Video codec system with real-time complexity adaptation and region-of-interest coding

Also Published As

Publication number Publication date
CN101171841A (en) 2008-04-30
CN101167365A (en) 2008-04-23


Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1117688

Country of ref document: HK

C14 Granted
REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1117688

Country of ref document: HK