MXPA01005522A

MXPA01005522A - Method and system for encoding rotations and normals in 3d generated scenes

Info

Publication number: MXPA01005522A
Application number: MXPA/A/2001/005522A
Authority: MX
Inventors: Julien Signes; Olivier Ondet
Original assignee: France Telecom
Priority date: 1998-12-04
Filing date: 2001-06-01
Publication date: 2002-03-05

Abstract

A method and system for encoding a video stream using a compact representation for rotations and normals. The method and system convert rotations and normals into normalized rotations and normals and then project the normalized versions onto a unit cube. Each rotation and normal is encoded according to the face onto which it was projected. In addition, motion can be compactly represented as movement across the unit cube.

Description

METHOD AND SYSTEM FOR ENCODING ROTATIONS AND NORMALS IN 3D GENERATED SCENES

CROSS REFERENCE TO RELATED APPLICATIONS

The present application is related to the co-pending application entitled "Method and Systems for Predictive Encoding of Data Arrays," attorney docket number 2167-0106-2, serial number 09/205,191, filed on even date herewith, also naming Julien Signes and Olivier Ondet as inventors. The contents of that co-pending application are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to the encoding of computer-generated images, and more particularly to the encoding of three-dimensional scenes using three-dimensional rotations and three-dimensional normals.

Discussion of the Background

The phrase "computer-generated images" encompasses an expanding area of video technology. Originally, the term was often equated with simple text images or two-dimensional images; however, the phrase now encompasses any type of digitally encoded video stream. The Moving Picture Experts Group (MPEG) was formed to investigate the technologies required for the encoding and decoding of image streams. The resulting standard (now called "MPEG-1") has served as a basis for two additional MPEG standards: MPEG-2 and MPEG-4. MPEG-4 is a standard that is "in progress" and forms the basis of the present invention. The final committee drafts are ISO/IEC FCD 14496-1 (MPEG-4 Systems) and 14496-2 (MPEG-4 Visual), the contents of which are incorporated herein by reference.

The draft standard departs from the model based on a single video stream and shifts the focus to a series of streams that act in concert. One portion of the standard is the Binary Format for Scenes (also known as "BIFS"). This format allows the description of three-dimensional objects and their motion, and provides the ability for greater interaction with that portion of the video stream. However, the draft standard does not directly provide a topologically coherent quantization scheme that supports a compensation process.

The representation of rotations with quaternions is known to be used in three-dimensional computing, but not in the context used in the present invention. One known use is in the Cosmo viewer, in which quaternions are used internally to interpolate between two rotations to avoid rendering artifacts.

SUMMARY OF THE INVENTION

The ability to efficiently calculate, store, and decode rotations and normals is not yet available in the BIFS portion of the standard. It is therefore an object of the present invention to address this deficiency. Another object of the present invention is to provide a computer-implemented method and system for more efficiently encoding rotations in a computer-generated image stream. It is yet another object of the present invention to provide a computer-implemented method and system for more efficiently encoding normals in a computer-generated image stream.

These and other objects of the present invention are addressed by one or more of (1) a computer-implemented method for encoding rotations and normals, (2) a system for encoding rotations and normals, and (3) a computer program product for encoding rotations and normals. Such a system is applicable to the improved storage and playback of games, virtual reality environments, and movies. Moreover, owing to the improved efficiency, video streams encoded according to the present invention can be played over lower-bandwidth communications links than less efficiently encoded streams.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete appreciation of the invention and many of its attendant advantages will become readily apparent by reference to the following detailed description, particularly when considered in conjunction with the accompanying drawings, in which:

Figure 1 is a schematic illustration of a computer for providing the encoding and/or decoding services of the present invention;

Figure 2 is an illustration of a unit sphere surrounded by a unit cube onto which normals are projected;

Figure 3 is an illustration of the unit cube of Figure 2 unfolded to show the movement from the original point 205 to a new point 205';

Figure 4 is an illustration of the point 205' on the unit cube of Figure 2; and

Figure 5 is an illustration of the difference between previous encodings and the encoding according to the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views, Figure 1 is a schematic illustration of a computer system for encoding and/or decoding rotations and/or normals in scene descriptions and/or animations. A computer 100 implements the method of the present invention, wherein the computer housing 102 houses a motherboard 104, which contains a CPU 106, memory 108 (e.g., DRAM, ROM, EPROM, EEPROM, SRAM, SDRAM, and Flash RAM), and other optional special-purpose logic devices (e.g., ASICs) or configurable logic devices (e.g., GAL and reprogrammable FPGA). The computer 100 also includes plural input devices (e.g., a keyboard 122 and a mouse 124) and a display card 110 for controlling the monitor 120. In addition, the computer system 100 further includes a floppy disk drive 114; other removable media devices (e.g., compact disc 119, tape, and removable magneto-optical media (not shown)); and a hard disk 112, or other fixed high-density media drives, connected using an appropriate device bus (e.g., a SCSI bus, an Enhanced IDE bus, or an Ultra DMA bus). Also connected to the same device bus or to another device bus, the computer 100 may further include a compact disc reader 118, a compact disc reader/writer unit (not shown), or a compact disc jukebox (not shown). Although the compact disc 119 is shown in a CD caddy, the compact disc 119 can be inserted directly into CD-ROM drives that do not require caddies. In addition, a printer (not shown) also provides printed listings of encoded and/or decoded rotations and/or normals.

As stated above, the system includes at least one computer-readable medium. Examples of computer-readable media are compact discs 119, hard disks 112, floppy disks, tape, magneto-optical disks, PROMs (EPROM, EEPROM, Flash EPROM), DRAM, SRAM, SDRAM, etc.
Stored on any one or on a combination of computer-readable media, the present invention includes software for controlling the hardware of the computer 100 and for enabling the computer 100 to interact with a human user. Such software may include, but is not limited to, device drivers, operating systems, and user applications, such as development tools. Such computer-readable media further include the computer program product of the present invention for encoding and/or decoding normals and/or rotations in scene descriptions and/or animations. The computer code mechanisms/devices of the present invention can be any interpreted or executable code mechanism, including but not limited to scripts, interpreters, dynamic link libraries, Java classes, and complete executable programs.

As shown in Figure 2, the method of the present invention is based on a quantization process that maps normals and rotations onto a "unit" cube surrounding the unit sphere. The process enables the decoder to reconstruct the normals and rotations. Post-processing (interpolation, analog filtering, and reconstruction) can be performed separately. For example, a normal 200, which is initially represented by a three-dimensional unit vector, is projected as a three-dimensional unit vector on the three-dimensional unit sphere 210. (A rotation, which is initially represented as a three-dimensional unit vector (for its axis) with an additional angle, is converted into a quaternion, i.e., a four-dimensional unit vector on the four-dimensional unit sphere.)
The normal 200 is then quantized by intersecting the normal 200 with a face of the surrounding "unit" cube 215, which yields a point on a particular face (i.e., the positive y face in the example) of the "unit" cube 215. In this way, the normal can be "compressed" by converting the three-dimensional vector into (1) the location (+1, +2) of the point 205, (2) an orientation (e.g., 0 for the x-axis, 1 for the y-axis, and 2 for the z-axis) identifying the axis on which the point 205 is located, and (3) a direction (e.g., +1 or -1) of the unit vector. (The same process can be performed for a rotation, except that the direction need not be stored, since quaternions that lie in opposite directions on the unit sphere represent equivalent rotations.) Figure 2 is shown using 3 bits per component on the face.

Once a normal 200 (or rotation) has been quantized, the movement of the point 205 between frames in the video stream can be encoded using the topological coherence that the cubic mapping provides. A compensation vector describes the movement of the point 205 on its present face. However, a point 205 may also move to a different face between frames. In this case, the face index used in the quantization process allows the new location to be computed.

The movement of the point 205 is described in relation to Figure 3. In Figure 3, the cube of Figure 2 has been unfolded. Each face is appropriately labeled in its upper left corner. The original point 205 is to be moved according to vDelta = [+6, -2] and Inverse = +1. Figure 3 shows on the unfolded cube how the original point 205 moves to become a new point 205'. Likewise, Figure 4 shows the same movement and the result on the original cube of Figure 2. As can be seen in Figure 4, the orientation has changed from the y-axis (ori = 1) to the z-axis (ori = 2), and the normal 200' points to the new location. The direction, however, has remained +1.
Interestingly, to move from the +y face to its opposite face (i.e., the -y face), the movement can be encoded by "jumping" to that face directly, inverting the direction of the previous point using the direction bit. Therefore, long displacements around the surface of the cube can be avoided by starting on an opposite face.

Although the example above illustrates the movement of a point between frames, the method may be generalized as described below. Each normal $(n_x, n_y, n_z)$ is renormalized according to:

$$v[0] = \frac{n_x}{\sqrt{n_x^2 + n_y^2 + n_z^2}}, \quad v[1] = \frac{n_y}{\sqrt{n_x^2 + n_y^2 + n_z^2}}, \quad v[2] = \frac{n_z}{\sqrt{n_x^2 + n_y^2 + n_z^2}}$$

Rotations (axis $n$, angle $\alpha$) are written as quaternions according to:

$$v[0] = \cos\frac{\alpha}{2}, \quad v[1] = \frac{n_x}{\lVert n \rVert}\sin\frac{\alpha}{2}, \quad v[2] = \frac{n_y}{\lVert n \rVert}\sin\frac{\alpha}{2}, \quad v[3] = \frac{n_z}{\lVert n \rVert}\sin\frac{\alpha}{2}$$

Normals and rotations are reduced to component form. The number of reduced components is defined to be N, where N = 2 for normals and N = 3 for rotations. Accordingly, the dimension of v is N + 1. The compression and quantization process is the same for both normals and rotations, and includes determining (1) the orientation, (2) the direction, and (3) the components of each normal or rotation.

The orientation, k, of a unit vector v is the index i (where i ranges from 0 to N) of the component with the largest absolute value (i.e., |v[i]|). This integer is encoded using two bits. Having found k, the direction, dir (where dir = 1 or -1), of the unit vector v is determined using the sign of the component v[k]. As discussed above, this value is not written for rotations (because of the properties of quaternions).

The N components of the compressed vector are calculated by mapping the square on the unit sphere to an N-dimensional square according to:

$$vc[i] = \frac{4}{\pi}\tan^{-1}\!\left(\frac{v[(i + k + 1) \bmod (N + 1)]}{v[k]}\right), \quad i = 0, \ldots, N - 1$$

Functions other than the arctangent are possible. Any coordinate transformation function that reduces the distortion of the mapping caused by the flatness of the cube faces can be used.
Essentially, the function should virtually bend the flat faces of the enclosing cube, so that a given number of bits corresponds to almost the same quantization error anywhere on the underlying sphere. In addition, although other functions of the form $a \cdot \tan^{-1}(b \cdot c)$ are possible, where $c = v[(i + k + 1) \bmod (N + 1)] / v[k]$ as described previously, the added complexity of these forms increases the encoding and decoding times. Although some values, such as b = 0.7, give a slight increase in compression, when speed is a controlling factor, a and b are set to 1.0.

The entire quantization process, however, is controlled by the number of bits, QuantNbBits (the number of quantization bits), used to represent the encoded normals or rotations. When encoding, if QuantNbBits = 0, the encoding process is trivial, since vq[i] = 0 for each i, and the values need not be encoded or written at all; that is, the values are "virtually" encoded using 0 bits. Otherwise, each component of vc (which lies between -1 and 1) is quantized as a signed integer as follows:

$$vq[i] = \mathrm{Round}\!\left(vc[i] \cdot 2^{\,QuantNbBits - 1}\right)$$

where Round() returns the integer part of its argument.

Having calculated the components vc, the components are encoded into a face index and a location on the face. The face index is encoded as an orientation (an unsigned integer, 2 bits) and, for normals only, a direction (an unsigned integer, 1 bit, e.g., where 0 is used as -1 and 1 is used as 1). The components themselves are then coded as unsigned or signed integers on QuantNbBits bits according to:

$$2^{\,QuantNbBits - 1} + vq[i]$$

That is, each normal is encoded in a number of bits given by:

bits = 1 (direction) + 2 (orientation) + 2 * QuantNbBits,

or, more specifically, when QuantNbBits = 3, as in the example:

bits = 1 (direction) + 2 (orientation) + 2 * 3 (i.e., QuantNbBits) = 9.
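The quantization steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the reference implementation: the function name `quantize` and its parameters are hypothetical, Round() is modeled as truncation toward zero (following the text's "integer part"), and each component is clamped to ±(2^(QuantNbBits-1) - 1) so that the offset value fits in the available bits.

```python
import math

def quantize(v, quant_nb_bits, is_normal):
    """Quantize a unit vector v: 3 components for a normal, 4 for a quaternion.

    Returns (orientation k, direction, quantized components vq, total bits).
    Hypothetical helper; names are illustrative, not from the specification.
    """
    n = len(v) - 1  # N reduced components: 2 for normals, 3 for rotations
    # Orientation: index of the component with the largest absolute value.
    k = max(range(n + 1), key=lambda i: abs(v[i]))
    # Direction: sign of v[k] (only written to the stream for normals).
    direction = 1 if v[k] >= 0 else -1
    if quant_nb_bits == 0:
        vq = [0] * n  # trivial case: nothing is written
    else:
        limit = 2 ** (quant_nb_bits - 1) - 1  # assumption: clamp to +/- limit
        vq = []
        for i in range(n):
            c = v[(i + k + 1) % (n + 1)] / v[k]
            vc = 4.0 / math.pi * math.atan(c)          # lies in [-1, 1]
            q = int(vc * 2 ** (quant_nb_bits - 1))     # integer part
            vq.append(max(-limit, min(limit, q)))
    bits = 2 + n * quant_nb_bits + (1 if is_normal else 0)
    return k, direction, vq, bits

# A normal pointing along +y lands at the center of the +y face.
print(quantize([0.0, 1.0, 0.0], 3, True))
```

With QuantNbBits = 3, the sketch reports 9 bits for a normal and 11 bits for a rotation, matching the bit counts given in the text.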
For rotations, the direction bit is omitted, and the number of bits for encoding is given by:

bits = 2 (orientation) + 3 * QuantNbBits,

or, more specifically, when QuantNbBits (the number of quantization bits) = 3, as in the example:

bits = 2 (orientation) + 3 * 3 (i.e., QuantNbBits) = 11.

The decoding process works in reverse, using the same number of bits to decode the face index (including the direction and orientation). The components themselves are decoded back into signed values according to:

$$vq'[i] = v_{decoded} - 2^{\,QuantNbBits - 1}$$

Having converted the encoded values back into decoded values, the quantization process must be undone through an inverse quantization process. As described above, if QuantNbBits is 0, the quantization process was trivial and no bits were actually written, so the decoding process is similarly trivial and does not read any bits, and vc'[i] is set to 0. However, when QuantNbBits is not zero, the inverse quantization process is performed according to:

$$vc'[i] = \frac{vq'[i]}{2^{\,QuantNbBits - 1}}$$

After extracting the orientation, k, and the direction, dir, the inverse mapping can be performed according to:

$$v'[(i + k + 1) \bmod (N + 1)] = \tan\!\left(\frac{\pi \cdot vc'[i]}{4}\right) \cdot v'[k], \quad i = 0, \ldots, N - 1$$

where, because v' is a unit vector,

$$v'[k] = \frac{dir}{\sqrt{1 + \sum_{i=0}^{N-1} \tan^2\!\left(\frac{\pi \cdot vc'[i]}{4}\right)}}$$

If the object is a rotation, v' can be either used directly or converted back from a quaternion to a rotation (axis $n$, angle $\alpha$):

$$\alpha = 2\cos^{-1}(v'[0]), \quad n_x = \frac{v'[1]}{\sin(\alpha/2)}, \quad n_y = \frac{v'[2]}{\sin(\alpha/2)}, \quad n_z = \frac{v'[3]}{\sin(\alpha/2)}$$

If the object is a normal, v' can be used directly as the components of the decoded normal.

The ability to encode and decode, however, is only a means to an end. The overall process of the present invention enables efficient compensation for motion. Accordingly, a difference between two reasonably close quantized values provides increased compression.
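The decoding steps described above can be sketched as follows. This is a hedged illustration: the function names `dequantize` and `quaternion_to_axis_angle` are hypothetical, the component along the orientation axis is recovered from the unit-length constraint, and the axis returned for a zero rotation angle is an arbitrary choice made here to avoid dividing by zero.

```python
import math

def dequantize(k, direction, vq, quant_nb_bits):
    """Rebuild a unit vector from orientation k, direction, and components vq.

    Hypothetical helper; for rotations, pass direction = 1.
    """
    n = len(vq)
    if quant_nb_bits == 0:
        vc = [0.0] * n  # trivial case: no bits were read
    else:
        vc = [q / 2 ** (quant_nb_bits - 1) for q in vq]  # inverse quantization
    t = [math.tan(math.pi * c / 4.0) for c in vc]
    # The component along the orientation axis follows from |v'| = 1.
    vk = direction / math.sqrt(1.0 + sum(x * x for x in t))
    v = [0.0] * (n + 1)
    v[k] = vk
    for i in range(n):
        v[(i + k + 1) % (n + 1)] = t[i] * vk
    return v

def quaternion_to_axis_angle(q):
    """Convert a unit quaternion (w, x, y, z) to (axis, angle in radians)."""
    w = max(-1.0, min(1.0, q[0]))  # guard acos against rounding error
    alpha = 2.0 * math.acos(w)
    s = math.sin(alpha / 2.0)
    if abs(s) < 1e-12:
        return (1.0, 0.0, 0.0), 0.0  # assumption: arbitrary axis for no rotation
    return (q[1] / s, q[2] / s, q[3] / s), alpha

# The center of the +y face decodes back to the +y unit normal.
print(dequantize(1, 1, [0, 0], 3))
```

Because the reconstruction divides out the largest component and then rescales, the decoded vector has unit length regardless of the quantized values.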
Although the compensation process is fully specified by the integer QuantNbBits (the number of quantization bits) defined in the quantization process, in order to encode the values output by the process, the following parameters have to be defined:

CompMin (minimum compensation): An array of integers that defines the minimum bounds of the compensation vector vDelta.

CompNbBits (number of compensation bits): An integer that defines the number of bits used to represent the components of the compensation vector.

Each component vDelta[i] is translated by CompMin[i] and coded on CompNbBits bits. Thus, the actual coded value is vDelta[i] - CompMin[i], and the encoding process defines how to calculate the compensation vector vDelta between the quantized normals or rotations vq1 and vq2.

Similar to the encoding of the face index in the quantization process, in the compensation process the parameters vq1 and vq2 are defined by respective orientations and directions, referred to as (1) ori1 and dir1 and (2) ori2 and dir2, respectively, for vq1 and vq2. All of ori1, dir1, ori2, and dir2 are integers. In addition, vq1 and vq2 each include a respective array of quantized integer components stored in corresponding arrays, vq1[] and vq2[].

Based on vq1 and vq2, the direction of the compensation vector vDelta is calculated as -1 or 1 (although for rotations the value is ignored), and is referred to as the integer inverse. Then, a set of compensation components is calculated and stored in an array of integers vDelta[]. The values in vDelta[] are obtained according to the method described below.

Let the number of reduced components be N, where N = 2 for normals and N = 3 for rotations. Initially, the variable inv = 1. Then, the differential orientation, dOri, and the differential direction, dDir, between vq1 and vq2 are calculated according to:

$$dOri = (ori2 - ori1) \bmod (N + 1)$$

$$dDir = dir1 \cdot dir2$$

$$scale = \max\!\left(\tfrac{1}{2},\; 2^{\,QuantNbBits - 1} - 1\right)$$
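As a small worked illustration of the three quantities just defined, the following Python sketch computes dOri, dDir, and scale. The function name `differential` and its argument order are hypothetical choices made for this example.

```python
def differential(ori1, dir1, ori2, dir2, n, quant_nb_bits):
    """Return (dOri, dDir, scale) for two quantized values.

    n is the number of reduced components (2 for normals, 3 for rotations).
    Hypothetical helper used only to illustrate the formulas in the text.
    """
    d_ori = (ori2 - ori1) % (n + 1)   # Python's % is non-negative for a positive modulus
    d_dir = dir1 * dir2               # +1 if the directions agree, -1 otherwise
    scale = max(0.5, 2 ** (quant_nb_bits - 1) - 1)
    return d_ori, d_dir, scale

# Moving from the +y face (ori = 1) to the +z face (ori = 2) with 3 bits:
print(differential(1, 1, 2, 1, 2, 3))
```

For QuantNbBits of 0 or 1 the sketch returns a scale of 0.5, which reflects the lower bound of 1/2 in the formula.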
The scale represents the maximum value that can be represented on one face of the cube, based on the number of bits used in the quantization. However, 0 and 1 bit are special cases for QuantNbBits (the number of quantization bits). In the case of 0 and 1 bit, only the centers of the faces of the cube can be represented. Thus, encoding with 0 bits (plus the direction and orientation bits) is more efficient than encoding with 1 bit, since 0 bits uses less space and allows the same information to be encoded. Moreover, the preferred embodiment of the present invention uses odd values for the limits of each face in order to ensure that 0 can be correctly represented on each face. Although representing face values in this way leaves one value (e.g., -128 when QuantNbBits = 8) unused, it is preferable to represent pure directions accurately at the expense of less efficient coding. In fact, to compensate for lower coding efficiencies in general, additional layers of coding (e.g., Huffman coding) can be used on top. However, in the preferred embodiment, no additional coding is applied to recover the lost bit, since the added complexity is not offset by a significant enough gain.

Then, depending on the differential orientation, dOri, each of the following two cases is considered separately.

Since the variable inv may have changed during the calculation, the variable inverse is calculated after the fact according to: inverse = inv * dDir.

Sometimes there are two ways of representing the same rotation, depending on the face from which the projected point is seen. However, because of this marginal non-injectivity in the quantization process, compensating vq1 by vDelta to create vq2' may not yield exactly vq2. Nevertheless, vq2 and vq2' will always represent the same normal or rotation.

Having calculated the compensation, the compensation must be encoded as was done during quantization. For normals only, the inverse is encoded on a single bit. The components of the compensation vector are then translated by CompMin (the minimum compensation) and coded on CompNbBits (the number of compensation bits) according to:

$$v_{encoded} = vDelta[i] - CompMin[i]$$

The encoding process is then reversed, and the compensation vectors are decoded. The decoding process transforms the normal or rotation vq1 by the compensation vector vDelta to yield the quantized normal or rotation vq2. For normals only, the single-bit inverse is decoded. Then, the components of the compensation vector are decoded on CompNbBits bits and translated by CompMin according to:

$$vDelta[i] = v_{decoded} + CompMin[i]$$

From vDelta, the compensation process can proceed. The initial parameter, vq1, includes (1) an orientation, ori1, (2) a direction, dir1, and (3) a set of quantized components stored in an array of integers vq1[]. Likewise, the compensation vector, vDelta, includes (1) an integer labeled "inverse" and (2) a set of compensation components stored in an array of integers vDelta[]. The output of the compensation process is a quantized parameter vq2, defined by (1) an orientation, (2) a direction, and (3) a set of quantized components stored in an array of integers, vq2[]. The values of vq2[] are calculated as follows. The scale is given by:

$$scale = \max\!\left(\tfrac{1}{2},\; 2^{\,QuantNbBits - 1} - 1\right)$$

where QuantNbBits is the number of quantization bits. Initially, a component-by-component addition is performed and stored in a temporary array according to:

$$vqTemp[i] = vq1[i] + vDelta[i]$$

As always, N is the number of reduced components, where N = 2 for normals and N = 3 for rotations. Based on the initial calculations, further processing is performed according to which of the following three cases is true.
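The translation of the compensation components by CompMin and the initial component-wise addition described above can be sketched as follows. This is an illustrative sketch only: the function names, the range check, and the CompMin value of [-8, -8] used in the example are assumptions, and the subsequent case analysis for points that leave the current face is not reproduced here.

```python
def encode_compensation(v_delta, comp_min, comp_nb_bits):
    """Translate each vDelta[i] by CompMin[i] so it fits in CompNbBits bits."""
    coded = [d - m for d, m in zip(v_delta, comp_min)]
    for c in coded:
        # Assumption: values outside the representable range are rejected.
        if not 0 <= c < 2 ** comp_nb_bits:
            raise ValueError("component does not fit in CompNbBits bits")
    return coded

def decode_compensation(coded, comp_min):
    """Recover vDelta[i] = decoded value + CompMin[i]."""
    return [c + m for c, m in zip(coded, comp_min)]

def add_delta(vq1, v_delta):
    """First compensation step: vqTemp[i] = vq1[i] + vDelta[i]."""
    return [a + d for a, d in zip(vq1, v_delta)]

# Example from the text: point (+1, +2) moved by vDelta = [+6, -2].
coded = encode_compensation([6, -2], [-8, -8], 4)   # hypothetical CompMin
v_delta = decode_compensation(coded, [-8, -8])
print(coded, v_delta, add_delta([1, 2], v_delta))
```

In this example the first temporary component exceeds the scale of 3 that applies with 3 quantization bits, which corresponds to the point leaving its original face.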

Accordingly, the present invention allows normals and rotations to be encoded efficiently. The method applied to rotations and normals is uniform, which allows an optimal system to be built for scene encoding. It is believed that the prediction/compensation process described herein, combined with appropriate quantization and entropy coding, will enable a reduction in data size for video streams, as shown graphically in Figure 5. Such a reduction is expected to be by a factor of 15 or more in view of a quantization reduction of approximately 5:1 and a compensation reduction of approximately another 2:1 to 3:1 from the quantized state. As shown in Table 1 below, the compression of rotations and normals according to the present invention can provide significant scene compression as compared to VRML ASCII files representing substantially similar video streams.

TABLE 1

As will be apparent to those skilled in the art, the method and system may be practiced otherwise than as explicitly described herein without departing from the spirit of the invention. Therefore, the specification is not intended to be limiting, and only the scope of the claims defines the limits of the invention.

Claims

1. A computer-implemented method of encoding a video stream, the method comprising the steps of: (a) calculating an original vector representing one of a normal and a rotation; (b) calculating a normalized vector from the original vector; (c) projecting the normalized vector onto a unit cube to produce a first point on a first face of the unit cube; and (d) encoding a video stream using an index of the first face and a location of the first point on the first face.

2. The computer-implemented method according to claim 1, further comprising the steps of: (e) calculating a motion vector based on motion, along the unit cube, between the first point on the first face and a second point on a second face; and (f) encoding the video stream using the motion vector.

3. The computer-implemented method according to claim 2, wherein the first and second faces are the same face on the unit cube.

4. The computer-implemented method according to claim 2, wherein the first and second faces are different faces on the unit cube.

5. The computer-implemented method according to claim 1, further comprising the step of decoding the video stream by receiving the index of the first face and the first point, and converting them back into one of a normal and a rotation.

6. The computer-implemented method according to claim 2, further comprising the step of decoding the video stream by receiving the motion vector and calculating a corresponding one of a normal and a rotation.