KR20050032118A

KR20050032118A - System and method of streaming 3-d wireframe animations

Info

Publication number: KR20050032118A
Application number: KR1020057002778A
Authority: KR
Inventors: Jörn Ostermann; Socrates Varakliotis
Original assignee: AT&T Corp.
Priority date: 2002-08-20
Filing date: 2003-08-15
Publication date: 2005-04-06
Also published as: EP1532818A2; CA2495714A1; WO2004019619A3; JP2009181586A; WO2004019619A2; JP2005536802A

Abstract

Optimal resilience to errors in packetized streaming 3-D wireframe animation is achieved by partitioning the stream into layers and applying unequal error correction coding to each layer independently to maintain the same overall bitrate. The unequal error protection scheme for each of the layers combined with error concealment at the receiver achieves graceful degradation of streamed animation at higher packet loss rates than approaches that do not account for subjective parameters such as visual smoothness.

Description

System and method of streaming 3-D wireframe animations

This application claims priority to U.S. Provisional Patent Application No. 60/404,410, filed August 20, 2002, the contents of which are incorporated herein by reference.

Non-provisional Application No. 10/198,129, filed July 19, 2002, entitled "Coding of Animated 3-D Wireframe Models For Internet Streaming Applications: Methods, Systems and Program Products," is assigned to the assignee of the present application and is incorporated herein by reference.

The present invention relates to streaming data and, more specifically, to a system and method for streaming 3-D wireframe animations.

In recent years the Internet has evolved rapidly from a low-bandwidth, text-only collaboration medium into a rich, interactive, real-time audiovisual virtual world. It involves many applications, environments and users for which 3-D animations are a driving force. Animated 3-D models enable intuitive and realistic interaction with displayed objects and allow effects that cannot be achieved with conventional audiovisual animations. The current challenge is therefore to integrate animated 3-D geometry as a new data stream into the existing Internet infrastructure, leveraging the existing network environment while taking its limited resources into account. Although compression of static 3-D mesh geometry has been actively studied over the past decade, little research has addressed the compression of dynamic 3-D geometry, which extends static 3-D meshes into the time domain.

The most common representations of static 3-D models are polygonal or triangular meshes. These representations allow models of arbitrary shape and topology to be approximated to within a desired precision or quality. Efficient algorithms and data structures exist for creating, modifying, compressing, transmitting and storing such static meshes. In the future, non-static stream types that introduce the time dimension will require scalable solutions that can cope with the limited resources (bandwidth) and characteristics (channel errors) of the network.

The problem of streaming 3-D wireframe animation addressed herein can be stated as follows. It is assumed that (i) the time-dependent 3-D mesh is scalably compressed into a sequence of wireframe animation frames, (ii) the available transmission rate R is known (or determined with reference to the corresponding TCP-friendly rate), (iii) the channel error characteristics are known, and (iv) a fraction C of the available rate (C < R) can be reserved for channel coding. The issue is then to find the optimal number of bits to allocate to each level of importance (layer) in the animation scene so as to maximize the perceived quality of the time-dependent mesh at the receiver.

Most animation coding approaches use objective metrics to achieve hierarchical coding of static 3-D meshes. What is needed are animation approaches that exploit subjective quantities such as visual smoothness to improve the appearance of the animation. Described herein are a 3-D wireframe animation codec and its bitstream content, together with the associated forward error correction (FEC) codes. A visual distortion metric, an unequal error protection (UEP) method and a receiver-based concealment method are also described.

FIG. 1A is a block diagram of a 3-D animation codec.

FIG. 1B is a block diagram of a decoder.

FIG. 2 is a comparison of distortion metrics including PSNR, Hausdorff distance and visual smoothness.

FIG. 3A is a flowchart of a method of error-resilient wireframe streaming.

FIG. 3B is a flowchart in accordance with an aspect of the present invention.

FIG. 4 is a comparison of three error concealment methods for the wireframe animation sequence TELLY.

FIG. 5 is a comparison of visual smoothness (VS) for transmitted and decoded frames of three layers of the wireframe animation TELLY.

FIG. 6 is a comparison of visual smoothness for transmitted and decoded frames of two layers of the wireframe animation BOUNCEBALL.

The present invention focuses on source and channel coding techniques for error-resilient streaming of time-dependent 3-D meshes over the Internet, taking into account the network bandwidth and the bursty loss nature of the channel.

An exemplary embodiment of the present invention is a method of streaming data comprising computing a visual smoothness value for each node in a wireframe mesh, and layering the data associated with the wireframe mesh into a plurality of layers such that the average visual smoothness value associated with each layer reflects the importance of that layer in the animation sequence. Other embodiments of the invention may include a bitstream generated according to a process similar to the above method, and an apparatus for generating and transmitting the bitstream or for receiving the bitstream.

Additional features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the invention. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth herein.

In order to describe the manner in which the above and other advantages and features of the invention can be obtained, a more particular description of the invention briefly described above will be given by reference to specific embodiments illustrated in the accompanying drawings. It should be understood that these drawings depict only typical embodiments of the invention and are not to be considered limiting of its scope; the invention will be described and explained in further detail through the use of the accompanying drawings.

Much research is under way on streaming video over computer networks in general and over the Internet in particular. By comparison, little work has been done on streaming 3-D wireframe animation. The two processes share some similarities but differ in most respects. Different data passes through the network, and losses therefore affect signal reconstruction differently. The perceptual effects of such losses have rarely been addressed in the art. Much of the work in this area has relied on objective measures such as PSNR instead of considering subjective effects.

The present invention combines concepts from several fields to address the problem of how to achieve optimal resilience to errors in terms of the perceptual effect at the receiver. In this regard, the invention is concerned with the subjective quality of the animation, for example mesh surface smoothness. To achieve an improved coding scheme that takes subjective factors into account, one aspect of the invention comprises partitioning the animation stream into a number of layers and applying Reed-Solomon (RS) forward error correction (FEC) codes to each layer independently, in such a way that the same overall bitrate is maintained while the perceptual effects of errors, as measured by a distortion metric from static 3-D mesh compression, are minimized. Graceful degradation of streamed animations at higher packet loss rates than other approaches can be achieved by unequal error protection (UEP) combined with error concealment (EC) and an efficient packetization scheme.

The present disclosure first provides an overview of the 3-D animation codec, introducing the related notation, followed by an overview of error-correcting RS codes, the derivation of a channel model, and an exemplary description of UEP packetization for the encoded bitstream.

The vertices m_j of a time-dependent 3-D mesh form, at time t, the indexed set M_t = {m_jt; j = 1, 2, ..., n}, where n is the number of vertices in the mesh. Since a vertex has three spatial components (x_j, y_j, z_j), and assuming that no connectivity changes occur over time, the data of the indexed set at time t can be represented by the position matrix M_t as follows:
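
The matrix itself is not reproduced in this text. A layout consistent with the definitions above, assuming one column per vertex and one row per axis (the column/row orientation is an assumption), would be:

```latex
M_t =
\begin{bmatrix}
x_{1,t} & x_{2,t} & \cdots & x_{n,t} \\
y_{1,t} & y_{2,t} & \cdots & y_{n,t} \\
z_{1,t} & z_{2,t} & \cdots & z_{n,t}
\end{bmatrix}
```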

The indexed set of vertices is partitioned into intuitively natural partitions called nodes. The term "node" is used to emphasize the correspondence of these nodes with nodes as defined in VRML. The position matrix corresponding to the i-th node is denoted N_{i,t}. Note that, without loss of generality, the vertex matrix can now be expressed for k nodes as follows:
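
The expression is not reproduced in this text. Assuming that the vertices are ordered node by node, so that the node matrices are simply concatenated side by side (an assumption about the layout, not a statement from the source), it could be written as:

```latex
M_t =
\begin{bmatrix}
N_{1,t} & N_{2,t} & \cdots & N_{k,t}
\end{bmatrix}
```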

For notational convenience, the terms N_{i,t} (i = 1, 2, ..., k) may be used both for the matrix and to refer to the i-th node itself. The objective of the 3-D animation compression algorithm is to compress the sequence of matrices M_t that form the synthetic animation for transmission over a communication channel. Clearly, for free-form animations of a 3-D mesh, the mesh coordinates may exhibit high variance, which makes the M_t matrices unsuitable for compression. Hence, the signal may be defined as the set of non-zero displacements of all vertices in all nodes at time t:

In this notation, the above can be expressed using the displacement matrix D_t as follows:

Or, equivalently, using node matrices:

where F_{i,t} is the displacement matrix of node i, for l nodes (i = 1, 2, ..., l). Note that the dimension of D_t is reduced to p ≤ n compared to M_t, because D_t does not contain vertices whose displacement is zero on all axes. Note also that l ≤ k holds when the vertices of a node do not change (F_{i,t} = 0). In 3-D animation terminology, sequences with p < n and l < k are referred to as sparse animations, whereas an animation with p = n and l = k is called dense. It is evident that if the encoder can control the parameters p and l, it can generate a layered bitstream (by adjusting the parameter l) in which every layer L is scalable (by adjusting the parameter p). The sparsity (or density) of an animation is captured by a density factor in the range [0..1], defined as follows:

where F is the number of animation frames and k is the number of nodes in the reference model. As p → n and l → k, df → 1, and the animation becomes complete.

The concept described above and summarized in Equation (1) lends itself to a DPCM coder, as described below. The coding process assumes that the initial wireframe model M₀, referred to herein as the reference model, is already present at the receiver. The reference model can be compressed and streamed with an existing method for static 3-D mesh transmission, together with error protection if the transmission is assumed to take place over the same lossy channel as the time-dependent mesh. Such existing methods for static mesh transmission can be accommodated by, and interoperate with, the present scheme, and are not discussed in further detail here.

In the context of the 3-D animation codec, an I-frame describes the changes from the reference model M₀ to the model at the current time instant t. A P-frame describes the changes of the model from the previous time instant t-1 to the current time instant t. The position and displacement matrices corresponding to I-frames and P-frames are each given their own notation.

FIG. 1A shows an exemplary block diagram 100 of the coding process in accordance with an aspect of the invention. The diagram shows a DPCM encoder that exploits the temporal correlation of the displacement of each vertex along every axis in 3-D space. To encode a P-frame, the decoded set (animation frame or displacement matrix) of the previous instant is used as the predicted value (108, 106). (Likewise, for I-frame encoding, the predicted matrix is the displacement matrix for the reference model at t = 0.) The prediction error E_t, i.e. the difference (106) between the current displacement matrix and the predicted matrix, is then computed (102) and quantized (104). Finally, the quantized samples are entropy coded (C_t) using an arithmetic coding algorithm (110) that adapts to the unknown statistics of the data. This prediction scheme prevents the accumulation of quantization error.
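
The closed-loop prediction described above can be illustrated with a minimal Python sketch. It is not the codec of FIG. 1A: the function names, the flat list-of-floats frame layout, the uniform quantizer and the zero prediction for the first frame are all assumptions made for illustration, and the arithmetic coding stage is omitted. The point it shows is that predicting from the previously decoded frame keeps the reconstruction error within half a quantization step, so the error does not accumulate over time.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], step: float) -> List[int]:
    """Uniform quantizer: map each prediction error to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices: Sequence[int], step: float) -> List[float]:
    """Inverse of quantize(), up to the quantization error."""
    return [q * step for q in indices]


def dpcm_encode(frames: Sequence[Sequence[float]], step: float) -> List[List[int]]:
    """Closed-loop DPCM: each frame is predicted from the previously DECODED frame."""
    # Assumption: the first frame is predicted from zero displacement.
    predicted = [0.0] * len(frames[0])
    coded = []
    for frame in frames:
        error = [d - p for d, p in zip(frame, predicted)]
        indices = quantize(error, step)
        coded.append(indices)
        # Mirror the decoder so that encoder and decoder predictions stay in sync.
        predicted = [p + e for p, e in zip(predicted, dequantize(indices, step))]
    return coded


def dpcm_decode(coded: Sequence[Sequence[int]], step: float) -> List[List[float]]:
    """Rebuild the frames by adding each dequantized error to the previous output."""
    previous = [0.0] * len(coded[0])
    frames = []
    for indices in coded:
        previous = [p + e for p, e in zip(previous, dequantize(indices, step))]
        frames.append(previous)
    return frames


# Hypothetical displacement frames (three samples per frame).
original = [[0.10, -0.32, 0.05], [0.27, -0.30, 0.11], [0.41, -0.18, 0.30]]
decoded = dpcm_decode(dpcm_encode(original, step=0.05), step=0.05)
```

Because the encoder tracks the decoder state, every decoded sample differs from its original by at most half the step size, regardless of how many frames precede it.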

The DPCM decoder 120 is shown in FIG. 1B. The decoder 120 first arithmetically decodes the received samples 122 (C'_t) and then computes the decoded samples (124, 126). The quantization range of each node is determined by its bounding box. The quantization step size may be assumed to be the same for all nodes, or it may vary in order to shape the encoded bitstream rate. Allowing different quantization step sizes for different nodes may cause artifacts such as mesh cracks, especially at the boundaries between nodes.

As noted above, the dimension of D_t is reduced to p ≤ n compared to M_t, because D_t does not contain vertices whose displacement is zero on all axes. This property is an advantage over MPEG-4 BIFS-Anim, in which animation frames are not reduced. For sparse D_t matrices it may also be the case that an entire node is not animated, which allows good error resilience for the animation and produces a scalable bitstream. Furthermore, if F_{i,t} = 0 for all i ∈ [1..l], the displacement matrix D_t is zero, resulting in an "empty" frame. This property resembles the silence periods inherent in speech audio streams and can be exploited at the application layer of RTP-based receivers to absorb network jitter. Inter-stream synchronization, which is important in many applications, can also be achieved (e.g. lip synchronization of a 3-D animated virtual salesman with packet speech).

The channel model and the error-correcting codes are described next. The idea of forward error correction (FEC) is to transmit additional redundant packets that the receiver can use to reconstruct lost packets. In the FEC scheme according to a preferred embodiment of the invention, Reed-Solomon (RS) codes are applied across packets. RS codes are the only known non-trivial maximum distance separable codes, and are therefore well suited to protection against packet losses over bursty loss channels. An RS(n,k) code of length n and dimension k is defined over the Galois field GF(2^q) and encodes k q-bit information symbols into a codeword of n symbols, where n ≤ 2^q - 1. The sender needs to store copies of the k information packets in order to compute the n - k redundant packets. The resulting n packets are stacked in a block of packets (BOP) structure. The BOP structure is known in the art and is therefore not described in further detail here. To maintain a constant overall channel data rate, the source rate is reduced by the fraction k/n, called the code rate, which results in a reduced initial animation quality. The receiver starts decoding as soon as it has received any k correct symbols, or packets, of a BOP.

In practice, the underlying bursty loss process of the Internet is very complex, but it can be closely approximated by a two-state Markov model. The two states are state G (good), in which packets are received correctly and in time, and state B (bad), in which packets are either lost or delayed to the point that they can be considered lost. The state transition probabilities p_GB and p_BG fully describe the model, but because they are not sufficiently intuitive, the model can be expressed in terms of the average loss probability P_B and the average burst length L_B as follows:
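
The expressions are not reproduced in this text. For a stationary two-state Markov chain in which every packet sent in state B is lost, the standard relations are the following (offered as the textbook form, under the assumption that the source uses the same one):

```latex
P_B = \frac{p_{GB}}{p_{GB} + p_{BG}}, \qquad
L_B = \frac{1}{p_{BG}}
```

Inverting them gives $p_{BG} = 1/L_B$ and $p_{GB} = P_B \, p_{BG} / (1 - P_B)$.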

For the selection of the RS code parameters, the probability that a BOP cannot be reconstructed by the erasure decoder must be known as a function of the channel and RS code parameters. For an RS(n,k) code, this is the probability that more than n - k packets are lost within a BOP, and it is called the block error rate P_BER. Let P(m,n) be the probability of m lost packets within a block of n packets, also called the block error density function. The calculation is then as follows:
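
The expression is not reproduced in this text. Following directly from the definition just given (more than n - k losses among the n packets of a BOP), it can be written as:

```latex
P_{BER} = \sum_{m = n-k+1}^{n} P(m, n)
```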

The average loss probability P_B and the average loss burst length L_B of the two-state Markov model described above are related to the block error density function P(m,n). The exact nature of this relationship has been extensively studied and derived in the literature. Here, the derivation for bit-error channels is adapted to a packet-loss channel.

The Markov model described above is a renewal model, i.e. a loss event resets the loss process. Such a model is determined by the distribution of the error-free intervals (gaps). Let an event of gap length v occur when v - 1 packets are received between two lost packets. The gap density function g(v) then gives the probability of a gap of length v, i.e. g(v) = Pr(0^(v-1) 1 | 1). The gap distribution function G(v) gives the probability of a gap length greater than v - 1, i.e. G(v) = Pr(0^(v-1) | 1). With all packets lost in state B of the model and all packets received in state G, this yields:

Let R(m,n) be the probability of m - 1 packet losses within the next n - 1 packets following a lost packet. This probability can be computed from the recurrence:

The block error density function P(m,n), i.e. the probability of m lost packets within a block of n packets, is then given by:

where P_B is the average error probability.

From Equation 5 it can be seen that P(m,n) determines the performance of the FEC scheme and can be expressed as a function of P_B and L_B using Equations 3 and 4. As explained below, the expression for P(m,n) can be used with RS(m,n) FEC schemes for an optimized source/channel rate allocation that minimizes the visual distortion.
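
The quantities g(v), G(v), R(m,n), P(m,n) and P_BER discussed above can be evaluated numerically. The Python sketch below is an illustration under stated assumptions rather than a transcription of the equations referred to in the text, which are not reproduced here: it uses the usual renewal-model recursion for a two-state Markov channel, the standard conversion from (P_B, L_B) to the transition probabilities, and hypothetical function names.

```python
from functools import lru_cache
from typing import List, Tuple


def gilbert_params(p_b: float, l_b: float) -> Tuple[float, float]:
    """Assumed standard relations: L_B = 1/p_BG and P_B = p_GB / (p_GB + p_BG)."""
    p_bg = 1.0 / l_b
    p_gb = p_b * p_bg / (1.0 - p_b)
    return p_gb, p_bg


def block_error_density(n: int, p_b: float, l_b: float) -> List[float]:
    """Return [P(0,n), P(1,n), ..., P(n,n)] for a block of n packets."""
    p_gb, p_bg = gilbert_params(p_b, l_b)

    def g(v: int) -> float:
        # Gap density: v-1 received packets followed by a loss, given a loss.
        return 1.0 - p_bg if v == 1 else p_bg * (1.0 - p_gb) ** (v - 2) * p_gb

    def big_g(v: int) -> float:
        # Gap distribution: at least v-1 received packets after a loss.
        return 1.0 if v == 1 else p_bg * (1.0 - p_gb) ** (v - 2)

    @lru_cache(maxsize=None)
    def r(m: int, length: int) -> float:
        # m-1 losses within the length-1 packets that follow a lost packet.
        if m == 1:
            return big_g(length)
        return sum(g(v) * r(m - 1, length - v) for v in range(1, length - m + 2))

    density = [0.0] * (n + 1)
    for m in range(1, n + 1):
        # The first loss of the block falls on position v, then m-1 further losses.
        density[m] = sum(p_b * big_g(v) * r(m, n - v + 1) for v in range(1, n - m + 2))
    density[0] = 1.0 - sum(density[1:])
    return density


def block_error_rate(n: int, k: int, p_b: float, l_b: float) -> float:
    """Probability that more than n-k of the n packets of a BOP are lost."""
    return sum(block_error_density(n, p_b, l_b)[n - k + 1:])
```

As a sanity check on the recursion, the expected number of losses in a block, the sum of m·P(m,n) over m, should equal n·P_B for any burst length; stronger codes (smaller k for the same n) should give a lower `block_error_rate`.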

The bitstream format and the packetization process are described next. The output bitstream of the 3-D animation codec needs to be appropriately packetized for streaming with an application-level transport protocol such as RTP. This process is known for a single-layer bitstream, and its main features are summarized in the following three points:

(1) To describe which nodes of the model are to be animated, animation masks, namely a NodeMask and VertexMasks, are defined in a manner similar to BIFS-Anim. The NodeMask is essentially a bit-mask in which each bit, if set, indicates that the corresponding node in the node table will be animated. Since the reference wireframe model is already present, the node table (an ordered list of all nodes in the scene) is either known a priori at the receiver or downloaded by other means. In a similar way, VertexMasks are defined, one per axis, for the vertices to be animated.

(2) In its simplest form, one frame (representing one application data unit (ADU)) is contained in one RTP packet. In this sense, the output bitstream of the 3-D animation codec is "naturally packetizable" according to the well-known Application Level Framing (ALF) principle. The RTP packet payload format is taken to start with the NodeMask and the VertexMasks, followed by the encoded samples along each axis.

(3) The M bit in the RTP header should be set on the first of a series of "empty" frames, which, if present, may be grouped together.
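
The NodeMask of point (1) can be sketched in Python as follows. The byte layout (most significant bit first, padded to whole bytes) and the function names are assumptions made for illustration; the text does not specify how the mask is serialized.

```python
from typing import Iterable, List


def build_node_mask(animated: Iterable[int], node_count: int) -> bytes:
    """Set one bit per animated node; bit i refers to entry i of the node table."""
    mask = bytearray((node_count + 7) // 8)
    for index in animated:
        if not 0 <= index < node_count:
            raise ValueError(f"node index {index} is outside the node table")
        # Assumed layout: most significant bit first within each byte.
        mask[index // 8] |= 0x80 >> (index % 8)
    return bytes(mask)


def parse_node_mask(mask: bytes, node_count: int) -> List[int]:
    """Return the indices of the nodes whose bit is set."""
    return [i for i in range(node_count) if mask[i // 8] & (0x80 >> (i % 8))]


# Hypothetical scene with 10 nodes, of which nodes 0, 3 and 9 are animated.
node_mask = build_node_mask([0, 3, 9], node_count=10)
animated_nodes = parse_node_mask(node_mask, node_count=10)
```

A receiver that already holds the node table only needs this mask to know which node displacement blocks follow in the payload; a VertexMask per axis could be built the same way over the vertices of a node.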

This simple format suffices for simple animations with a modest number of vertices. However, sequences with high scene complexity or high-resolution meshes may generate a large amount of coded data after compression, potentially resulting in frames that exceed the path MTU. In such cases, raw packetization in a single layer would require the definition of fragmentation rules for the RTP payload, which may not always be consistent with the ALF principle. Moreover, frames packetized directly in RTP as described above produce a variable bitrate stream because of their varying lengths.

A more efficient packetization scheme is sought that satisfies the requirements set out above, namely (a) to accommodate layered bitstreams and (b) to produce a constant bitrate stream. This efficiency can be achieved by appropriately adapting a block structure known as a block of packets (BOP). In this method, the encoded frames of a single layer are placed sequentially, row by row, into a grid structure of n rows by S_P columns, and the RS codes are then generated vertically across the grid. For data frames protected by an RS(n,k) erasure code, error resilience information is appended so that the length of the grid is n for k source data frames. This method is best suited to packet networks with burst packet errors, and it can be fully described by the sequence frame rate FR, the packet size S_P, the data frame rate in a BOP F_BOP, and the RS code (n,k).
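
The grid layout can be illustrated with a deliberately simplified Python sketch. Instead of a Reed-Solomon code it uses a single XOR parity row, i.e. a (k+1, k) single-parity erasure code, which is a substitution made to keep the example short: it repairs at most one lost packet per BOP, whereas RS(n,k) repairs up to n - k. The function names and the zero padding are likewise assumptions.

```python
from typing import List, Optional, Sequence


def build_bop(frames: Sequence[bytes], k: int, packet_size: int) -> List[bytes]:
    """Fill k rows of packet_size bytes with the frames, then add one parity row."""
    stream = b"".join(frames)
    if len(stream) > k * packet_size:
        raise ValueError("frames do not fit into the data rows of the BOP")
    stream = stream.ljust(k * packet_size, b"\x00")  # zero padding (assumption)
    rows = [stream[i * packet_size:(i + 1) * packet_size] for i in range(k)]
    parity = bytearray(packet_size)
    for row in rows:
        # "Vertical" coding: column c of the parity row protects column c of every row.
        for column, byte in enumerate(row):
            parity[column] ^= byte
    return rows + [bytes(parity)]


def recover_bop(received: Sequence[Optional[bytes]], packet_size: int) -> List[bytes]:
    """Return the k data rows; None marks a lost packet (at most one is repairable)."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) > 1:
        raise ValueError("more losses than a single parity row can repair")
    rows = list(received)
    if missing:
        rebuilt = bytearray(packet_size)
        for packet in rows:
            if packet is not None:
                for column, byte in enumerate(packet):
                    rebuilt[column] ^= byte
        rows[missing[0]] = bytes(rebuilt)
    return rows[:-1]


# Hypothetical frames of varying length packed into 4 data packets of 8 bytes.
frames = [b"frame-one", b"f2", b"third frame!"]
bop = build_bop(frames, k=4, packet_size=8)
damaged = [bop[0], None, bop[2], bop[3], bop[4]]  # the second packet is lost
data_rows = recover_bop(damaged, packet_size=8)
```

Every packet has the same size, which is what yields a constant bitrate stream from variable-length frames. A real payload would also need frame length fields so that the receiver can find frame boundaries inside the grid; these are left out here.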

Intuitively, for a BOP consisting of F_BOP data frames carried in packets S_P bytes long, at a frame rate of FR, the total source and channel bitrate R is given by:

This equation serves as a guide to the design of efficient packetization schemes by appropriately balancing the parameters F_BOP, n and S_P. It also embodies a trade-off between delay and resilience. For a layered bitstream, one BOP structure must be designed per layer. By varying the parameters in Eq. 6, a different RS code rate can be assigned to each layer, thereby providing each layer with an unequal level of error protection. How these parameters are tuned in practice for the 3-D animation streaming application, taking a measure of visual error into account, is described next.
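By way of illustration only, the trade-off among F_BOP, n and S_P can be sketched in Python. The rate formula used here is an assumption derived from the definitions in the text (one BOP holds n packets of S_P bytes and spans F_BOP frames played at FR frames per second); the function names and sample values are hypothetical and are not taken from Table I.

```python
def bop_bitrate_bps(n, s_p, fr, f_bop):
    """Total (source + channel) bit rate of one layer, in bits per second.

    Assumption: a BOP holds n packets of s_p bytes and covers f_bop frames,
    so one BOP is emitted every f_bop / fr seconds.
    """
    if fr <= 0 or f_bop <= 0:
        raise ValueError("fr and f_bop must be positive")
    bits_per_bop = n * s_p * 8
    return bits_per_bop * fr / f_bop


def code_rate(n, k):
    """Fraction of each BOP that carries source data under an RS(n, k) code."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return k / n


# Hypothetical parameter values, chosen only to exercise the functions.
total_bps = bop_bitrate_bps(n=32, s_p=500, fr=30, f_bop=15)
source_bps = total_bps * code_rate(32, 28)
```

Under this assumed formula, lowering k for one layer leaves the total rate unchanged and shifts bits from source data to parity, which is the lever used for unequal protection.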

To measure the visual loss resulting from imperfect reconstruction of the animated mesh at the receiver, a metric is required that can capture the visual difference between the original mesh $M_t$ at time t and its decoded counterpart $\hat{M}_t$. The simplest measure is the RMS geometric distance between corresponding vertices. Alternatively, the Hausdorff distance is commonly used as an error metric. In the present case, the Hausdorff distance is defined as the maximum of the minimum distances between the vertices of the two sets $M_t$ and $\hat{M}_t$, such that every point of $M_t$ lies within that distance of some point of $\hat{M}_t$. This can be expressed as:

$$H(M_t, \hat{M}_t) = \max_{m_t \in M_t} \; \min_{\hat{m}_t \in \hat{M}_t} \lVert m_t - \hat{m}_t \rVert$$

where $m_t \in M_t$, $\hat{m}_t \in \hat{M}_t$, and $\lVert m_t - \hat{m}_t \rVert$ is the Euclidean distance between the two vertices $m_t$ and $\hat{m}_t$. Many other distortion metrics, such as SNR and PSNR, can be derived by analogy with natural video coding, but they are tailored to the statistical properties of the specific signal they encode and fail to give a uniform measure of user-perceived distortion across different signals and across encoding methods for different media. Moreover, for 3-D meshes in particular, all of these metrics give only objective indications of geometric closeness or signal-to-noise ratio, and they fail to capture the subtler visual properties that the human eye perceives, such as surface smoothness.
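As a minimal sketch of the two geometric measures just described, the following Python code computes the RMS distance between corresponding vertices and the max-min (directed) Hausdorff distance between two vertex sets. The function names and the brute-force search are illustrative assumptions; the symmetric variant is a commonly used extension and is not stated in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two 3-D points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rms_distance(original, decoded):
    """RMS distance between corresponding vertices (same order, same count)."""
    if not original or len(original) != len(decoded):
        raise ValueError("vertex lists must be non-empty and of equal length")
    total = sum(euclidean(a, b) ** 2 for a, b in zip(original, decoded))
    return math.sqrt(total / len(original))


def directed_hausdorff(src, dst):
    """Largest distance from a vertex of src to its nearest vertex in dst."""
    return max(min(euclidean(a, b) for b in dst) for a in src)


def symmetric_hausdorff(original, decoded):
    """Symmetric variant: the larger of the two directed distances."""
    return max(directed_hausdorff(original, decoded),
               directed_hausdorff(decoded, original))


# Hypothetical two-vertex meshes; the second vertex is displaced along z.
mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mesh_decoded = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
h = directed_hausdorff(mesh, mesh_decoded)
```

The brute-force search costs O(n²) distance evaluations per frame, which is acceptable for a sketch but would call for a spatial index on large meshes.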

FIG. 2 shows a comparison 200 of distortion metrics, namely PSNR, Hausdorff distance and visual smoothness, for 150 frames of the animated sequence BOUNCEBALL coded with I-frames at a frequency of 8 Hz. The two upper plots (PSNR and Hausdorff) show the expected correlation between the corresponding geometric-distance metrics they represent. Conversely, the two lower plots indicate that the visual distortion (Eq. 8) can be low in cases where the geometric distance is high.

One attempt to use surface smoothness in this way was reported by Karni and Gotsman, who undertook it while evaluating a spectral compression algorithm for 3-D mesh geometry. See Zachi Karni and Craig Gotsman, "Spectral compression for mesh geometry", in Siggraph 2000, Computer Graphics Proceedings, Kurt Akeley, Ed., 2000, pp. 279-286, ACM Press / ACM SIGGRAPH / Addison Wesley Longman. The 3-D mesh distortion metric proposed there normalizes the objective error, computed as the Euclidean distance between two vertices, by the distance of each vertex to its adjacent vertices. This type of error metric captures the surface smoothness of the 3-D mesh. It can be achieved with a Laplacian operator that takes both topology and geometry into account. The value of this geometric Laplacian at vertex $v_i$ is:

$$GL(v_i) = v_i - \frac{\sum_{j \in n(i)} l_{ij}^{-1} \, v_j}{\sum_{j \in n(i)} l_{ij}^{-1}}$$

where $n(i)$ is the set of indices of the neighbors of vertex $i$, and $l_{ij}$ is the geometric distance between vertices $i$ and $j$. The new metric is then defined as the average of the norm of the geometric distance between the meshes and the norm of the Laplacian difference ($m_t$ and $\hat{m}_t$ are the vertex sets of meshes $M_t$ and $\hat{M}_t$ respectively, and $n$ is the size of these sets):

$$\lVert M_t - \hat{M}_t \rVert = \frac{1}{2n} \left( \lVert m_t - \hat{m}_t \rVert + \lVert GL(m_t) - GL(\hat{m}_t) \rVert \right) \tag{8}$$

The metric of Eq. 8 is preferably used in the present invention and is referred to hereinafter as the visual smoothness (VS) metric. Other formulations relating to the visual smoothness of a mesh may also be used.
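A minimal Python sketch of a VS-style metric follows, assuming the geometric Laplacian form published by Karni and Gotsman (each neighbor weighted by the inverse of its distance) and assuming that the norms are Euclidean norms taken over all stacked vertex coordinates. The function names, the adjacency-list representation and the choice to compute neighbor distances separately in each mesh are illustrative assumptions.

```python
import math


def geometric_laplacian(vertices, neighbors):
    """Per-vertex geometric Laplacian: v_i minus the inverse-distance-weighted
    mean of its neighbors. Assumes every vertex has at least one neighbor and
    that neighboring vertices never coincide."""
    laplacians = []
    for i, v in enumerate(vertices):
        weight_sum = 0.0
        weighted = [0.0, 0.0, 0.0]
        for j in neighbors[i]:
            w = 1.0 / math.dist(v, vertices[j])
            weight_sum += w
            for c in range(3):
                weighted[c] += w * vertices[j][c]
        laplacians.append(
            tuple(v[c] - weighted[c] / weight_sum for c in range(3)))
    return laplacians


def stacked_norm(a, b):
    """Euclidean norm of the difference of two lists of 3-D vectors."""
    return math.sqrt(sum((x - y) ** 2
                         for p, q in zip(a, b)
                         for x, y in zip(p, q)))


def visual_smoothness(original, decoded, neighbors):
    """Average of the geometric norm and the Laplacian-difference norm."""
    n = len(original)
    geometric = stacked_norm(original, decoded)
    laplacian = stacked_norm(geometric_laplacian(original, neighbors),
                             geometric_laplacian(decoded, neighbors))
    return (geometric + laplacian) / (2 * n)


# Hypothetical triangle; connectivity is static, so adjacency is precomputed.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
adjacency = [[1, 2], [0, 2], [0, 1]]
shifted = [(x + 1.0, y, z) for x, y, z in triangle]
vs_shift = visual_smoothness(triangle, shifted, adjacency)
```

A rigid translation leaves the Laplacian term at zero, so only the geometric term contributes in this example; a local dent in the surface would raise the Laplacian term as well.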

The VS metric requires connectivity information, namely the adjacent vertices of every vertex m_t. In the case of the 3D-animation codec, which assumes that connectivity does not change during the animation, the vertex adjacencies can be precomputed.

The BOP structure described above is suitable for designing an efficient packetization scheme that employs redundancy information based on RS erasure codes. The relationship among the design parameters is given by Eq. 6. That equation, however, does not reflect any information about layering. An exemplary layering design approach is described next, followed by the proposed error resilience method for 3-D wireframe streaming.

Layering is performed such that the average VS value of each layer reflects its importance in the animation sequence. To achieve this, the VS of Eq. 8 is computed independently for every node in the mesh, and the nodes are ordered according to their average VS over the sequence. The node, or group of nodes, with the highest average VS forms the first and visually most important layer, L_0. This layer is made more resilient to packet errors than the other layers. Subsequent layers L_1, ..., L_M are created, in decreasing order of importance, from the subsequent nodes or groups of nodes in VS order.
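The ordering step described above can be sketched as follows. The sketch assumes that per-frame VS values are already available for each node, and it spreads the ordered nodes over the layers as evenly as possible; that grouping rule, the function name and the node names are hypothetical, since the number of nodes per layer is left as a design choice.

```python
def layer_nodes_by_vs(node_vs, num_layers):
    """Group nodes into layers by decreasing average VS.

    node_vs maps a node name to its list of per-frame VS values.
    Returns a list of layers; layer 0 holds the most important nodes.
    """
    if num_layers < 1 or num_layers > len(node_vs):
        raise ValueError("num_layers must be between 1 and the node count")
    averages = {name: sum(vals) / len(vals) for name, vals in node_vs.items()}
    ordered = sorted(averages, key=averages.get, reverse=True)
    # Assumed grouping rule: near-equal layer sizes, preserving VS order.
    base, extra = divmod(len(ordered), num_layers)
    layers, start = [], 0
    for j in range(num_layers):
        size = base + (1 if j < extra else 0)
        layers.append(ordered[start:start + size])
        start += size
    return layers


# Hypothetical node names and VS values, for illustration only.
sample_vs = {
    "face": [0.9, 0.7],
    "hair": [0.2, 0.4],
    "lips": [0.5, 0.5],
    "eyes": [0.1, 0.1],
    "nose": [0.05, 0.05],
}
layers = layer_nodes_by_vs(sample_vs, 3)
```

A rate-aware grouping rule could replace the even split where the output bit rate of each layer must meet a target.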

If the 3-D mesh has more nodes than the desired number of layers, the number of nodes grouped into the same layer is a design choice and determines the output bit rate of that layer. For meshes that have only a few nodes but many vertices per node, node partitioning may be desirable. Such partitioning reorganizes the vertices of the 3-D mesh into a new mesh with more nodes than the original. The process affects connectivity but not the overall model being represented. Where possible, the partitioning of the mesh into nodes should not be arbitrary but should instead reflect the natural objects that the new nodes will represent in the 3-D scene, together with their corresponding motion. If such partitioning is not possible in the scene, the mesh may be partitioned into sub-meshes (nodes) of arbitrary size to be allocated to the same layer. Mesh partitioning requires complex pre-processing steps that will be understood by those skilled in the art. Recall, however, that the 3D-animation codec assumes static connectivity.

In "natural video" it is common to build layers that have a cumulative effect. That is, the data of layer L_j adds to the data of layer L_{j-1} and improves the overall video quality. A receiver may, however, decode only up to layer L_{j-1} and forgo the refinement layers. This approach lends itself to adaptive streaming scenarios, in which the sender may choose to transmit only j-1 layers while the network is congested, and to transmit layer j and higher layers once network conditions improve, that is, once more bandwidth becomes available.

The 3-D animation layers disclosed here are not always cumulative in the same sense. Decoding layer L_j (built by appropriate node grouping or node partitioning) does not necessarily refine the quality of the data contained in the previous layers L_0, ..., L_{j-1}; rather, it adds animation detail to the animated model, for example by adding animation to more vertices in the model.

As an example, consider the sequence TELLY (discussed more fully below), a head-and-shoulders talking avatar. TELLY always faces the camera (fixed camera). Because the camera does not move, considerable bandwidth would be wasted animating the back of the hair. However, the visible and invisible parts of the hair (sets of vertices) can easily be identified, and an appropriate partitioning of the node "hair" assigns the visible part to layer L_{j-1} and the invisible part to layer L_j. With a fixed camera (and no interaction allowed), layer L_j is not transmitted. Thus, when a user views the wireframe mesh or animation in a fixed mode, only the visible portions of the animation can be seen, since the animation does not rotate or move. Where the user can inspect the animation by rotating or zooming on the avatar (or other model), or can look at its back, layer L_j is shown. In that case the user views the animation in an interactive mode, which lets the user see the parts of the animation that are hidden in the fixed mode for lack of motion. Layer L_j does not, however, refine the animation of the visible hair node in layer L_{j-1}; it contains additional animation data for the invisible vertices. This illustrates an exemplary outcome of the partitioning method.

Furthermore, the "interactive mode" does not necessarily require user interaction with the animation. The interactive mode refers to any viewing mode in which the animation can move or rotate so as to expose a part of the animation that was previously hidden. Thus, even when the viewer is simply watching, the animation may move and rotate in a more human or natural way while speaking. In that case, layer L_j or other invisible layers may be transmitted to supply the additional animation data needed to complete the viewing experience. In this respect, the choice between the fixed and the interactive mode may depend on the available bandwidth: if there is enough bandwidth to transmit both the visible and the invisible layers of the animation, the animation can be viewed in the interactive mode instead of the fixed mode. In another aspect of the invention, the user can select the fixed or the interactive mode and thereby control which layers are transmitted.

FIG. 3A shows an exemplary set of steps according to one aspect of the invention. The method comprises partitioning a 3-D wireframe mesh (302), computing a VS value for each node in the mesh (304), and layering the data associated with the wireframe mesh into a plurality of layers such that the average VS value associated with each layer reflects the importance of that layer in the animation sequence (306). The same overall bit rate is maintained when the plurality of layers is transmitted by applying an unequal error correction code to each layer according to the importance of the layer.

The term "partitioning" as used here can mean a pre-processing step, such as dividing a mesh into arbitrary or non-arbitrary sub-meshes that are to be allocated to the same layer. The term may also have another application, such as the process of generating the various layers, each comprising one or more nodes.

FIG. 3B shows a flowchart of another aspect of the invention. The method comprises allocating more redundancy to the layer, among the plurality of layers, that exhibits the greatest visual distortion (320). This may be, for example, the layer containing the visually coarse information. The redundancy is then gradually reduced on the layers that contribute less to visual smoothness (322). Where unrecoverable packet losses occur only within an individual layer, interpolation-based concealment is applied to each layer at the receiver (324). As packets belonging to a particular layer travel through the communication network, they may take different paths from the sender to the receiver and therefore experience variable delays and losses. When the receiver knows the overall packet loss rate, it will attempt to reduce that loss rate by using the redundant information (FEC) provided separately for each layer. The amount of FEC may not be sufficient to recover all lost packets, leaving residual losses. Interpolation-based concealment can be applied independently to each layer to reduce the distortion introduced by residual packet loss. In general, steps 320 and 322 are performed at the coder/transmitter end, and step 324 is performed at the receiver, across a communication network such as a peer-to-peer network.

At time t, the expected distortion of the animation at the receiver is the sum of the products P_jt · D_jt, where j is the layer index, D_jt is the visual distortion incurred by losing the information in layer j at time t, and P_jt is the probability of an unrecoverable packet loss in layer j. By the way the layers are constructed, the probabilities P_jt are independent, and a burst packet loss in one layer contributes its own visual distortion D_jt to the decoded sequence. Formally, the expected visual smoothness VS(t) at the decoder at time t can be expressed as:

$$VS(t) = \sum_{j=0}^{L-1} P_{jt} \, D_{jt} \tag{9}$$

where L is the number of layers. In this expression, P_jt is the block error rate P_BER given by Eq. 5, that is, the probability of losing more than n - k_jt packets in layer j. Using the block error density function P(m, n), this gives:

$$P_{jt} = \sum_{m = n - k_{jt} + 1}^{n} P(m, n) \tag{10}$$

From Eqs. 9 and 10, VS(t) can be written as:

$$VS(t) = \sum_{j=0}^{L-1} D_{jt} \sum_{m = n - k_{jt} + 1}^{n} P(m, n) \tag{11}$$

Eq. 11 estimates, in a statistical sense, the expected visual smoothness experienced per frame at the decoder. The objective is to minimize this distortion with respect to the values of k_jt in Eq. 11. From the way the bitstream is split into layers, the optimization process is expected to allocate more redundancy to the layer that exhibits the greatest visual distortion (the coarse layer) and to reduce the redundancy rate gradually on the layers that make a finer contribution to overall smoothness. There are L values of k_jt to be computed at every time t, subject to 0 ≤ k_jt ≤ n and to a rate constraint expressed in terms of R_C and q, where R_C is the redundancy bits and q is the symbol size. This formulation yields a nonlinear constrained optimization problem that can be solved numerically.
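The structure of this optimization can be sketched with a deliberately simplified model. The Python code below substitutes independent packet losses (a binomial law) for the burst-loss block error density function P(m, n), and it replaces the rate constraint with a hypothetical fixed budget of parity packets shared among the layers. Both substitutions, along with the function names and sample values, are assumptions made for illustration; an exhaustive search stands in for a numerical solver.

```python
from itertools import product
from math import comb


def block_loss_pmf(m, n, p):
    """Probability of losing exactly m of n packets (assumed i.i.d. losses)."""
    return comb(n, m) * p ** m * (1 - p) ** (n - m)


def unrecoverable_prob(n, k, p):
    """Probability that more than n - k of the n packets are lost."""
    return sum(block_loss_pmf(m, n, p) for m in range(n - k + 1, n + 1))


def expected_distortion(ks, distortions, n, p):
    """Sum over layers of (loss probability) x (distortion if layer is lost)."""
    return sum(d * unrecoverable_prob(n, k, p)
               for k, d in zip(ks, distortions))


def allocate(distortions, n, parity_budget, p):
    """Exhaustively search for the k_j that minimize expected distortion.

    Hypothetical constraint: the parity packets (n - k_j) over all layers
    must add up to parity_budget.
    """
    best_cost, best_ks = None, None
    for parities in product(range(parity_budget + 1),
                            repeat=len(distortions)):
        if sum(parities) != parity_budget or max(parities) > n:
            continue
        ks = tuple(n - r for r in parities)
        cost = expected_distortion(ks, distortions, n, p)
        if best_cost is None or cost < best_cost:
            best_cost, best_ks = cost, ks
    return best_ks


# Hypothetical per-layer distortions, most important layer first.
ks = allocate(distortions=[5.0, 2.0, 0.5], n=31, parity_budget=27, p=0.1)
```

In this simplified model the layer with the largest distortion never receives less parity than a less important layer, which mirrors the behavior the text expects from the full optimization.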

The expected behavior of the model for P_B = 0 is to produce equal values for the k_jt, whereas at high P_B unequally varying k_jt will be obtained. Note that, for the computation of the smoothness distortions in Eq. 11, it is assumed that no error concealment takes place at the receiver.

Techniques based on linear interpolation of vertices prove to be a sufficient and efficient error concealment method for 3D-animation frames. This relies on the 'locality of reference' principle, according to which high-frame-rate animations are unlikely to exhibit vertex trajectories other than linear or piecewise linear ones. If higher complexity can be accommodated, higher-order interpolation can be employed using information from neighboring frames. Known interpolation and other concealment methods are generic and can be used by any other decoder.
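A minimal sketch of vertex-wise linear interpolation is given below, assuming that the frames on either side of the gap were received intact and share the same vertex ordering. The function name and the sample data are hypothetical.

```python
def conceal_linear(prev_frame, next_frame, missing_count):
    """Rebuild missing_count lost frames between two received frames by
    interpolating every vertex along a straight line."""
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same number of vertices")
    frames = []
    for step in range(1, missing_count + 1):
        t = step / (missing_count + 1)  # fractional position inside the gap
        frames.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(prev_frame, next_frame)
        ])
    return frames


# Hypothetical single-vertex frames with three frames lost in between.
before = [(0.0, 0.0, 0.0)]
after = [(4.0, 0.0, 8.0)]
recovered = conceal_linear(before, after, 3)
```

Because the interpolation needs the frame after the gap, a receiver using this approach must buffer at least until that frame arrives.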

FIG. 4 is a graph 400 showing the relative performance of three error concealment methods adapted to the experimental parameters of this work, namely P_B = [0..30] and L_B = 4. Linear interpolation clearly outperforms the frame repetition and motion vector-based methods. The figure shows average values over eight iterations with different loss patterns. The interpolation-based concealment method exhibits very low variance, confirming the locality of reference principle (the mean loss burst length L_B = 4 is much smaller than the sequence frame rate of 30 Hz). The invention therefore preferably uses interpolation-based error concealment at the receiver whenever the channel decoder receives fewer than k_jt of the n BOP packets, that is, whenever more than n - k_jt packets are lost. The k_jt that solve the optimization problem will also give minimized distortion when combined with concealment techniques. In such cases the resulting distortion is expected to be lower than the distortion obtained without error concealment.

The following describes the optimization process, the method used to tune its values for a real-world case of 3-D wireframe animation, and the experimental procedure, together with a discussion of experimental results obtained using the invention.

The experiments below demonstrate, by simulation, the effectiveness of the proposed unequal error protection (UEP) combined with error concealment (EC) for streaming 3-D wireframe animations. Specifically, UEP with EC is compared against plain UEP, equal error protection (EEP) and no protection (NP). The comparison is based on the visual smoothness metric, which is known to yield a distortion measure that captures the surface smoothness of the time-dependent mesh during the animation. To compute the parameters k_jt, the constrained minimization problem of Eq. 11 is solved numerically for a given channel rate R_C. In addition, n is computed from Eq. 6 such that the rate characteristics of the original source signal match the particular BOP design. The other parameters used in Eq. 6 are given below for the two sequences used in the experiments and are also summarized in Table I.

Table I

Animation sequence parameters used in the redundancy experiments: TELLY & BOUNCEBALL

In the EEP case, a constant k is used, which follows directly from the choice of the channel rate, set to 15%. In the NP case, all of the available channel rate is allocated to the source. Finally, an interpolation-based EC scheme is used for the UEP case with residual losses. L_B = 4 is used in all experiments.

The sequences TELLY and BOUNCEBALL were used, with density factors df_TELLY = 0.75 and df_BBALL = 1.0 as given by Eq. 2. As shown in Table I, TELLY comprises 9 nodes (3 of which are relatively sparse, the remaining 6 being complete) and a total of 780 frames at 30 Hz. Its average source bit rate is R_S,TELLY = 220 kbps. BOUNCEBALL has 1 node and 528 frames at 24 Hz, forming a single layer with an average source rate R_S,BALL = 61 kbps. Both sequences are coded with an I-frame every 15 frames. Channel coding redundancy of approximately 15% was allowed, resulting in total source and channel coding rates of R_TELLY = 253 kbps and R_BBALL = 70.15 kbps. With n = 32 selected, the parameters for the packetization of each layer were computed from Eq. 6 and are tabulated in Table I. Because a higher n makes the RS codes more resilient at the expense of delay and buffer space, the value of n was chosen as a compromise between delay and efficiency.
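The total rates quoted above follow from the source rates and the 15% redundancy allowance, as the short Python check below shows. It assumes that the redundancy is expressed as a fraction of the source rate; the function name is hypothetical.

```python
def total_rate_kbps(source_kbps, redundancy_fraction):
    """Source rate plus channel coding redundancy, in kbps.

    Assumption: redundancy is a fraction of the source rate.
    """
    return source_kbps * (1.0 + redundancy_fraction)


telly_total = total_rate_kbps(220, 0.15)   # approximately 253 kbps
bball_total = total_rate_kbps(61, 0.15)    # approximately 70.15 kbps
```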

TELLY was partitioned into three layers according to the proposed layering method presented in Section V, each layer consisting of the nodes shown in Table I. As a fraction of the total number of animated vertices in the 3-D mesh, the layers account on average for (L_0, L_1, L_2) = (0.48, 0.42, 0.10). This split is expected to be reflected proportionally in the source bit rate of each layer. It was observed that the proposed layering scheme assigned two of the three sparse nodes to the same layer, L_1. The total number of vertices in these two sparse nodes represents 65% of the vertices in the reference mesh. The third sparse node, Nostril, was assigned to layer L_2, but its individual motion concerns only a very small fraction (about 1.3%) of the total number of vertices in the model. This fact may carry some weight if one wishes to relate the node-to-layer allocation (made using the VS metric) to the density factor df_L computed per layer (Eq. 2) and to the output bit rate. If such a relationship exists, a dynamic layering scheme could be developed for applications that need one.

The sequence BOUNCEBALL initially contains only one node. The sequence depicts a soft ball whose enclosing shape has an inherent symmetry about its center point. Given the symmetry of the shape, it was decided to partition the mesh into two nodes with equal numbers of vertices, without considering the VS metric for each node. The rationale behind this partitioning is to test the effect that the VS metric has on the proposed UEP resilience scheme. All other source coding parameters, most importantly the quantization step size, are held constant across the two layers. The two layers are expected to receive approximately the same average number of protection bits, so that UEP performance approaches that of EEP.

FIG. 5 shows a first plot 502 of VS as a function of the average packet loss rate P_B for TELLY. The four curves in the plot represent the resilience methods under comparison, for the (31, 22) code. The codes computed for UEP are reported as per-layer averages rounded to the nearest integer. It is evident that UEP and UEP+EC outperform NP and EEP at medium to high loss rates, P_B > 9%. Recall that layering is performed such that the lowest layer exhibits the highest average visual distortion. Because the UEP method allocates stronger codes to the lowest layer (L_0), better resilience is expected for L_0 at high loss rates. This factor dominates the average distortion and results in better performance. Note that at low loss rates EEP and UEP behave almost identically, because the RS codes are sufficient to recover all or most of the errors. Note also that under lossless conditions the NP method is better than the other methods. This is an understandable result: the source information takes up all of the available channel rate, which permits a better encoding of the signal. The effect of EC is also worth noting: the distortion of the UEP+EC scheme is slightly improved over the plain UEP case. This, too, is expected.

The results for the (31, 27) RS code on the sequence TELLY, shown in plot 504 of FIG. 5, are very similar. Here, the threshold at which the UEP methods (with or without EC) overtake EEP or NP is approximately P_B = 7%. Note how steep the initial NP performance (at low P_B) is compared with the (31, 22) case, which again underlines the fact that the channel coding bits are effectively 'wasted' in this low-loss region, since they provide little additional resilience there at the expense of the source rate. The corresponding codes are likewise reported as per-layer averages. Once again, the interpolation-based error concealment algorithm improves the performance of the UEP methods. Because this quantity was not accounted for in the optimization problem, it is expected to contribute only a small reduction in visual error.

FIG. 6 shows the results 602 obtained when the same experiment is repeated on the BOUNCEBALL sequence, which was layered 'symmetrically' as described earlier in this section. The same (31, 22) EEP code as before was used for comparison. Plot 602 shows the same trends and relative performance as for TELLY, with UEP+EC providing the best overall performance. Note, however, that the gap between the UEP curves and the EEP curve at high P_B values is considerably smaller than for the TELLY sequence. The rounded average RS codes computed for the UEP case are: [values not reproduced], i.e., equivalent to the EEP case. This may seem surprising at first, but careful reasoning suggests that layers that are equally balanced in the amount of animation they contain (the same number of vertices and nodes, very similar motion in the scene, and the same encoding parameters) correspond to visually balanced distortions. This is exactly the outcome expected from the layering of the BOUNCEBALL sequence described above. Indeed, the actual values of k_0t and k_1t computed as the solution to the optimization problem fluctuate around an average integer value of 22. Moreover, recall that the originally symmetric BOUNCEBALL mesh was split into two arbitrary nodes without regard to their individual visual distortions, which were assumed to be similar. In practice, the deformation of the soft ball at the bouncing points reduces the symmetry of the original shape. These facts reasonably explain why the UEP and EEP curves do not coincide exactly at higher P_B values, as would otherwise be expected. Finally, the UEP+EC method provides a slight, barely perceptible improvement in visual distortion, as in the previous experiment.
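The reasoning that visually balanced layers lead UEP back to the EEP allocation can be illustrated with a small sketch. It is not the optimization problem of this description: it assumes two layers of equal size, independent packet losses, a fixed sum k_0 + k_1 standing in for the constant overall bit rate, and a cost equal to the distortion-weighted probability that a block cannot be recovered. All names and weights are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def block_failure_probability(n: int, k: int, loss_rate: float) -> float:
    """Probability that more than n - k of n packets are lost (independent losses)."""
    recoverable = sum(
        comb(n, i) * loss_rate**i * (1.0 - loss_rate) ** (n - i)
        for i in range(n - k + 1)
    )
    return 1.0 - recoverable


def allocate_two_layers(
    n: int, total_k: int, weights: Sequence[float], loss_rate: float
) -> Tuple[int, int]:
    """Exhaustively pick (k0, k1) with k0 + k1 == total_k minimising weighted failure.

    weights[l] stands in for the visual distortion caused by losing layer l.
    A smaller k means more parity packets, i.e. stronger protection.
    """
    best = None
    for k0 in range(1, n + 1):
        k1 = total_k - k0
        if not 1 <= k1 <= n:
            continue
        cost = (
            weights[0] * block_failure_probability(n, k0, loss_rate)
            + weights[1] * block_failure_probability(n, k1, loss_rate)
        )
        if best is None or cost < best[0]:
            best = (cost, k0, k1)
    if best is None:
        raise ValueError("no feasible allocation for the given total_k")
    return best[1], best[2]
```

With equal weights the search settles on the same code for both layers, which mirrors the BOUNCEBALL observation; with a clearly larger weight on layer L₀ it moves parity toward L₀, which mirrors the TELLY case.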

The present invention addresses the fundamental problem of how best to use the available channel capacity for streaming 3-D wireframe animation so as to achieve optimal subjective resilience to errors. In short, the invention links channel coding, packetization, and layering to a subjective parameter that measures visual smoothness in the reconstructed image. On this basis, it is believed that the result can help open the way for 3-D animation to become an important networked media type. The disclosed methods attempt to optimize the distribution, among the different layers, of the bit budget left for channel coding, using a metric that reflects the human eye's ability to detect surface smoothness on time-dependent meshes. Using this metric, the encoded bitstream is first partitioned into layers of visual importance, and UEP combined with EC provides good protection against the burst packet errors that occur on the Internet.
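The partitioning into layers of visual importance can be sketched as below. The sketch assumes that a per-node score derived from the visual smoothness metric is already available, and it simply sorts the nodes by that score and cuts the ordered list into groups of nearly equal size. The node names, the scores, and the equal-size split are illustrative assumptions, not the layering procedure of this description.

```python
from typing import Dict, List


def layer_nodes_by_score(node_scores: Dict[str, float], num_layers: int) -> List[List[str]]:
    """Group nodes into layers, with layer 0 holding the highest-scoring nodes.

    node_scores maps a node name to a (hypothetical) visual distortion score.
    Ties are broken by node name so that the result is deterministic.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    ordered = sorted(node_scores, key=lambda name: (-node_scores[name], name))
    base, extra = divmod(len(ordered), num_layers)
    layers: List[List[str]] = []
    start = 0
    for index in range(num_layers):
        size = base + (1 if index < extra else 0)  # spread any remainder over the first layers
        layers.append(ordered[start:start + size])
        start += size
    return layers


if __name__ == "__main__":
    scores = {"head": 0.9, "jaw": 0.7, "hair": 0.2, "ear": 0.1}
    print(layer_nodes_by_score(scores, 2))
```

In this arrangement layer 0 would be the natural candidate for the strongest error correction code, since its loss is assumed to cause the largest visual distortion.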

Embodiments within the scope of the present invention may also include computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. When information is transferred or provided to a computer over a network or another communications connection (hardwired, wireless, or a combination thereof), the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above may also be included within the scope of computer-readable media.

Computer-executable instructions include, for example, instructions and data that cause a general-purpose computer, special-purpose computer, or special-purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.

Those skilled in the art will appreciate that other embodiments of the invention may be practiced in network computing environments with many types of computer system configurations, including personal computers, handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked through a communications network (by hardwired links, wireless links, or a combination thereof). For example, peer-to-peer distributed environments provide an ideal communications network in which the principles of the present invention may be applied to advantage. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

Although the above description contains specific details, they should not be construed as limiting the claims in any way. Other configurations of the described embodiments of the invention are part of the scope of this invention. Accordingly, the appended claims and their legal equivalents should only define the invention, rather than any specific examples given.

Claims

1. A method of streaming data, the method comprising:

a) calculating a visual smoothness value for each node in a wireframe mesh; and

b) layering data associated with the wireframe mesh into a plurality of layers such that an average visual smoothness value associated with each layer reflects the importance of the respective layer in an animation sequence.

2. The method of claim 1, further comprising maintaining the same overall bit rate when transmitting the plurality of layers.

3. The method of claim 1, wherein each layer of the plurality of layers comprises a node or a group of nodes in the wireframe mesh.

4. The method of claim 1, wherein the most important layer has the highest average visual smoothness value and is the most resilient to packet errors.

5. The method of claim 1, wherein the number of nodes included in a particular layer is associated with an output bit rate of the respective layer.

6. The method of claim 1, further comprising generating a 3-D packetized streaming signal representing a scene that includes an animation.

7. The method of claim 1, wherein maintaining the same overall bit rate is achieved by applying an unequal error correction code to each layer according to the importance of the respective layer in the animation sequence.

8. The method of claim 1, further comprising, before layering the data, partitioning the wireframe mesh to create a partitioned mesh having more nodes than the wireframe mesh.

9. The method of claim 8, wherein the partitioning further comprises partitioning the wireframe mesh according to the objects in a scene that the nodes represent and the corresponding motion of those objects.

10. The method of claim 8, wherein the partitioning further comprises partitioning the wireframe mesh into sub-meshes of arbitrary size to be assigned to the same layer.

11. A method of receiving streamed data to which an unequal error protection scheme for a wireframe mesh has been applied, the wireframe mesh having been divided into a plurality of layers, the method comprising:

applying an error concealment scheme to the plurality of layers, whereby graceful degradation of the streamed animation at high packet loss rates is achieved at a receiver.

12. The method of claim 11, wherein the unequal error protection scheme comprises optimizing the distribution of a bit budget allocation among the plurality of layers.

13. The method of claim 12, wherein the unequal error protection scheme uses a visual smoothness metric.

14. The method of claim 11, wherein the plurality of layers are divided according to visual importance.

15. The method of claim 11, wherein interpolation-based concealment is applied to residual packets that are not recovered by the error concealment.

16. A method of decoding, at a receiver, streaming data associated with a wireframe mesh, the streaming data comprising a plurality of layers organized according to visual importance within the wireframe mesh, a sender having assigned more redundancy to those layers of the plurality of layers that exhibit greater visual distortion, the method comprising:

using redundant information in the individual layers of the streaming data to recover lost packets; and

applying interpolation-based error concealment to reduce distortion due to residual packet loss.

17. The method of claim 16, wherein the layer exhibiting the greatest visual distortion is a coarse layer.

18. The method of claim 16, wherein the interpolation-based concealment is applied to each layer independently.

19. A method of streaming wireframe mesh data, the method comprising:

partitioning the wireframe mesh into layers including a layer having visible portions and a layer having non-visible portions;

when a user views the wireframe mesh in a fixed mode, transmitting only the layer having visible portions; and

when the user views the wireframe mesh in an interactive mode, transmitting both the layer having visible portions and the layer having non-visible portions.

20. The method of claim 19, wherein whether only the layer having visible portions is transmitted, or both the layer having visible portions and the layer having non-visible portions are transmitted, depends on the available bandwidth.

21. The method of claim 19, wherein whether only the layer having visible portions is transmitted, or both the layer having visible portions and the layer having non-visible portions are transmitted, is selected by the user.