GB2475651A - Determining an animation variable response subspace in response to a set of training data - Google Patents


Info

Publication number
GB2475651A
Authority
GB
United Kingdom
Prior art keywords
animation
animation variable
variable response
response
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1103946A
Other versions
GB201103946D0 (en
GB2475651B (en
Inventor
John Anderson
Mark Meyer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pixar
Original Assignee
Pixar
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/626,214 (US7764288B2)
Priority claimed from US11/626,199 (US7839407B2)
Application filed by Pixar
Priority claimed from GB0812051A (GB2447388B)
Publication of GB201103946D0
Publication of GB2475651A
Application granted
Publication of GB2475651B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 - 2D [Two Dimensional] image generation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation

Abstract

A method for a computer system includes determining an animation variable response subspace within an animation variable response space associated with an animation variable. In response to a set of training data for the animation variable, a set of characteristic calculation key points configured to allow navigation within the animation variable response subspace is determined. A difference between a predicted animation variable response value and a calculated animation variable response value for a point in the set of points and in the set of characteristic calculation key points is also determined. When the difference exceeds a first threshold difference, the predicted animation variable response value and the calculated animation variable response value are combined to form the animation variable response value for the point for the specific frame of animation. The image is then rendered.

Description

METHODS AND APPARATUS FOR DETERMINING ANIMATION VARIABLE
VALUES IN COMPUTER ANIMATION
BACKGROUND OF THE INVENTION [0001] The present invention relates to computer animation. More specifically, the present invention relates to methods and apparatus for accelerating aspects of computer animation including surface calculation, rendering, and the like.
[0002] Throughout the years, movie makers have often tried to tell stories involving make-believe creatures, far away places, and fantastic things. To do so, they have often relied on animation techniques to bring the make-believe to "life." Two of the major paths in animation have traditionally included drawing-based animation techniques and stop motion animation techniques.
[0003] Drawing-based animation techniques were refined in the twentieth century, by moviemakers such as Walt Disney and used in movies such as "Snow White and the Seven Dwarfs" (1937) and "Fantasia" (1940). This animation technique typically required artists to hand-draw (or paint) animated images onto a transparent media or cels. After painting, each cel would then be captured or recorded onto film as one or more frames in a movie.
[0004] Stop motion-based animation techniques typically required the construction of miniature sets, props, and characters. The filmmakers would construct the sets, add props, and position the miniature characters in a pose. After the animator was happy with how everything was arranged, one or more frames of film would be taken of that specific arrangement. Stop motion animation techniques were developed by movie makers such as Willis O'Brien for movies such as "King Kong" (1933). Subsequently, these techniques were refined by animators such as Ray Harryhausen for movies including "Mighty Joe Young" (1948) and "Clash of the Titans" (1981).
[0005] With the wide-spread availability of computers in the later part of the twentieth century, animators began to rely upon computers to assist in the animation process. This included using computers to facilitate drawing-based animation, for example, by painting images, by generating in-between images ("tweening"), and the like. This also included using computers to augment stop motion animation techniques. For example, physical models could be represented by virtual models in computer memory, and manipulated. [0006] One of the pioneering companies in the computer-aided animation (CA) industry was Pixar. Pixar is more widely known as Pixar Animation Studios, the creators of animated features such as "Toy Story" (1995) and "Toy Story 2" (1999), "A Bug's Life" (1998), "Monsters, Inc." (2001), "Finding Nemo" (2003), "The Incredibles" (2004), "Cars" (2006), and others. In addition to creating animated features, Pixar developed computing platforms specially designed for CA, and CA software now known as RenderMan®. RenderMan® was particularly well received in the animation industry and recognized with two Academy Awards®. The RenderMan® software included a "rendering engine" that "rendered" or converted geometric and/or mathematical descriptions of objects into a two-dimensional image.
[0007] The definition of geometric object and/or illumination object descriptions has typically been a time consuming process; accordingly, statistical models have been used to represent such objects. The use of statistical models for calculations in graphics has included the creation of a kinematic articulation model which is trained from poses which can be generated from either physical simulations or hand-corrected posed models. Approaches have been based on a pose based interpolation scheme in animation variables or based upon approximation on multiple coefficient weighting of the positions of points in various skeletal articulation frames. Such approaches do not require the posing of key points, a limitation that precludes the use of models that depend upon history-based simulation for posing.
[0008] Drawbacks to kinematic articulation schemes are related to their generality: the required training sets for such techniques can be very large. Additionally, in practice, it is essentially impossible to place bounds on the errors of the reconstructions when new poses are specified which are far from those included in the training set of poses.
[0009] Some techniques for real time deformation of a character surface have used a principal component analysis of joint angles to compute a basis for surface displacements resulting from perturbing each joint of a model in example poses. Given several example poses consisting of joint angles and surface displacements, specific joint angle values can be associated with coordinates in the basis space by projecting the surface displacement into the basis. Surface displacements for novel joint angle configurations are thus computed by using the joint angles to interpolate the basis coordinates, and the resulting coordinates determine a surface displacement represented in the subspace formed by the basis.
[0010] Some techniques for accelerating computation of illumination in a scene have also relied upon principal component analysis to compute a basis for global illumination resulting from illumination sources in a scene. Based upon the computed basis, the illumination from a novel lighting position can be computed by using the light position to interpolate subspace coordinates of nearby example lights.
[0011] Drawbacks to the use of principal component analysis for animation, described above, include that such analyses rely upon the use of animation variables such as joint angles or light positions to drive a subspace model.
[0012] Fig. 1A illustrates a schematic block diagram of a general statistical modeling approach. In this general approach, a statistical model 10 is first determined to accept the animation variables 20 (posing controls, light positions and properties). Subsequently, the statistical model 10 produces the desired outputs 30 (point positions or illumination values).
[0013] Problems with this most general approach include that the relationships between the input controls (animation variables 20) and the outputs 30 can be highly nonlinear. As an example, it is common to use an articulation variable to set either the sensitivity or pivot point of another variable, thus the behavior may often be extremely nonlinear. Accordingly, it is essentially impossible to statistically discover a useful basis from typical training sets.
[0014] Accordingly, what is desired are improved methods and apparatus for solving the problems discussed above, while reducing the drawbacks discussed above.
BRIEF SUMMARY OF THE INVENTION
[0015] The present invention relates to computer animation. More specifically, the present invention relates to methods and apparatus for accelerating various time-consuming processes of computer animation such as posing of objects, global illumination computations, and the like.
[0016] Various embodiments of the present invention disclose Point Multiplication processes that are statistical acceleration schemes. Such embodiments first identify a set of characteristic key points in a graphical calculation, such as posing or rendering. Next, graphical calculations are performed for the various key points to provide a subspace based estimate of the entire or full geometric calculation. Embodiments of the Point Multiplication process are useful acceleration schemes when the geometric calculation requirements for evaluating key point values are substantially lower than for evaluating the full set of points.
Calculations of this sort occur in many areas of graphics such as articulation and rendering.
[0017] Embodiments of the present invention that are directed to accelerated global illumination computations involve the use of importance sampling. Embodiments use prior knowledge about the light transport within the scene geometry, as observed from the training set, to optimize importance sampling operations.
[0018] Fig. 1B illustrates a block diagram according to various embodiments of the present invention. In various embodiments, a training set of animation data is used to create a statistical subspace model, from which values of key points are determined.
The posing engine 50 is then driven at runtime with animation data 40 to determine values 60 for the set of key points. In some embodiments, the output values 70 are linearly related with respect to the key point values 60. Accordingly, the selected key points 60 from the training set may describe many, if not all, of the important nonlinearities of the problem. In cases where new nonlinearities appear that are not seen in the training set, a soft cache fail through condition arises.
[0019] Embodiments of a soft caching process are extensions to embodiments of the Point Multiplication technique described herein. In such embodiments, values of the key points are also used to provide a confidence estimate for the Point Multiplication result. In other words, values of certain key points are used for error checking, and not necessarily used for Point Multiplication computations.
[0020] In frames with high anticipated error, the Point Multiplication process described below may "fail through," e.g. a cache miss. In such cases a full evaluation of values for all points may be necessary. In cases that have a low error, e.g. a cache hit, the Point Multiplication process may be used to determine values for all points. As an example, with global illumination when embodiments are applied to a photon mapping renderer the illumination gather step is initially limited to only the key points. If error tolerances are exceeded, the fail through is easily accommodated through additional gathers for additional points. In various embodiments where calculations can easily be performed on a point by point basis, in the event of a cache miss, embodiments can "fail through" to perform the full calculation on all of the points.
[0021] One aspect of the invention provides a method for a computer system comprising determining an animation variable response subspace within an animation variable response space associated with an animation variable, in response to a set of training data for the animation variable; determining a set of characteristic calculation key points configured to allow navigation within the animation variable response subspace; calculating animation variable response values for the set of characteristic calculation key points in the animation variable response subspace for a specific frame of animation in response to input data for the animation variable; predicting animation variable response values for a set of points within the animation variable response space for the specific frame of animation in response to animation variable response values for at least some of the set of characteristic calculation key points in the animation variable response subspace; determining a difference between a predicted animation variable response value and a calculated animation variable response value for a point in the set of points and in the set of characteristic calculation key points; when the difference exceeds a first threshold difference, combining the predicted animation variable response value and the calculated animation variable response value to form the animation variable response value for the point for the specific frame of animation; and rendering an image using the animation variable response value for the point.
[0022] Another aspect of the invention provides a computer system comprising a memory configured to store a set of training data for an animation variable; and a processor coupled to the memory, wherein the processor is configured to determine an animation variable response subspace within an animation variable response space associated with the animation variable, in response to the set of training data for the animation variable, wherein the processor is configured to determine a set of characteristic calculation key points configured to allow navigation within the animation variable response subspace, wherein the processor is configured to calculate animation variable response values for the set of characteristic calculation key points in the animation variable response subspace for a specific frame of animation in response to input data for the animation variable; wherein the processor is configured to predict animation variable response values for a set of points within the animation variable response space for the specific frame of animation in response to animation variable response values for at least some of the set of characteristic calculation key points in the animation variable response subspace, wherein the processor is configured to determine a difference between a predicted animation variable response value and a calculated animation variable response value for a point in the set of points and in the set of characteristic calculation key points, and wherein when the difference exceeds a first threshold difference, the processor is configured to combine the predicted animation variable response value and the calculated animation variable response value to form the animation variable response value for the point; and wherein the memory is configured to store the animation variable response value for the point; wherein the processor is configured to render an image in response to the animation variable response value for the point.
[0023] In one embodiment a computer readable medium carries code for controlling a computer to carry out the method. The code typically resides on a computer-readable media, such as semiconductor media (e.g. RAM, flash memory), magnetic media (e.g. hard disk, SAN), optical media (e.g. CD, DVD, barcode), or the like.
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] In order to more fully understand the present invention, reference is made to the accompanying drawings. Understanding that these drawings are not to be considered limitations in the scope of the invention, the presently described embodiments and the presently understood best mode of the invention are described with additional detail through use of the accompanying drawings.
[0028] Fig. 1A illustrates a schematic block diagram of a general statistical modelling approach; [0029] Fig. 1B illustrates a block diagram according to various embodiments of the present invention; [0030] Fig. 2 is a block diagram of typical computer system 100 according to an embodiment of the present invention; [0031] Figs. 3A-B illustrate a flow diagram according to various embodiments of the present invention; [0032] Figs. 4A-C illustrate a flow diagram according to various embodiments of the present invention; [0033] Figs. 5A-C illustrate comparative examples of embodiments of the present invention; [0034] Figs. 6A-D2 illustrate examples according to embodiments of the present invention; and [0035] Figs. 7A-C illustrate examples according to embodiments of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0036] Hardware Overview [0037] Fig. 2 is a block diagram of typical computer system 100 according to an embodiment of the present invention. In the present embodiment, computer system 100 typically includes a monitor 110, computer 120, a keyboard 130, a user input device 140, computer interfaces 150, and the like.
[0038] In one embodiment, user input device 140 is typically embodied as a computer mouse, a trackball, a track pad, a joystick, wireless remote, drawing tablet, voice command system, eye tracking system, and the like. User input device 140 typically allows a user to select objects, icons, text and the like that appear on the monitor 110 via a command such as a click of a button or the like.
[0039] Embodiments of computer interfaces 150 typically include an Ethernet card, a modem (telephone, satellite, cable, ISDN), (asynchronous) digital subscriber line (DSL) unit, FireWire™ interface, USB interface, and the like. For example, computer interfaces 150 may be coupled to a computer network, to a FireWire bus, or the like.
In other embodiments, computer interfaces 150 may be physically integrated on the motherboard of computer 120, may be a software program, such as soft DSL, or the like.
[0040] In various embodiments, computer 120 typically includes familiar computer components such as a processor 160, and memory storage devices, such as a random access memory (RAM) 170, disk drives 180, and system bus 190 interconnecting the above components.
[0041] In one embodiment, computer 120 includes one or more Xeon microprocessors from Intel™. Further, in one embodiment, computer 120 typically includes a UNIX™-based operating system.
[0042] RAM 170 and disk drive 180 are examples of tangible media configured to store data such as an animation environment, models including geometrical descriptions of objects, descriptions of illumination sources, procedural descriptions of models, frames of training data, a specification of key points, embodiments of the present invention, including executable computer code, human readable code, or the like. Other types of tangible media include floppy disks, removable hard disks, optical storage media such as CD-ROMS, DVDs and bar codes, semiconductor memories such as flash memories, read-only-memories (ROMS), battery-backed volatile memories, networked storage devices, and the like.
[0043] In the present embodiment, computer system 100 may also include software that enables communications over a network such as the HTTP, TCP/IP, RTP/RTSP protocols, and the like. In alternative embodiments of the present invention, other communications software and transfer protocols may also be used, for example IPX, UDP or the like.
[0044] Fig. 2 is representative of a computer system capable of embodying the present invention. It will be readily apparent to one of ordinary skill in the art that many other hardware and software configurations are suitable for use with the present invention.
For example, the computer may be a desktop, portable, rack-mounted or tablet configuration. Additionally, the computer may be a series of networked computers.
Further, the use of other microprocessors is contemplated, such as Xeon™, Pentium™ or Core™ microprocessors; Turion™ 64, Opteron™ or AthlonXP™ microprocessors from Advanced Micro Devices, Inc; and the like. Further, other types of operating systems are contemplated, such as Windows®, WindowsXP®, WindowsNT®, or the like from Microsoft Corporation, Solaris™ from Sun Microsystems™, LINUX™, UNIX™, and the like. In still other embodiments, the techniques described above may be implemented upon a chip or an auxiliary processing board (e.g. graphics processor unit).
[0045] Mathematical Basis for Animation and Rendering Subspaces [0046] Embodiments of the present invention use subspace descriptions to reduce the number of degrees of freedom. Embodiments begin with a definition of a set of training frames of a state variable Q. In various embodiments, the training data may include a position value, an illumination value, or the like. Additionally, the training data are intended to exercise the pose space of the system to be accelerated. Next, a set of basis functions a_i(x) and b_i(t) are computed, where x is a generalized "space" coordinate which identifies the point and t is a "time" coordinate indexing the training frames:
[0047]

$$\tilde{Q}(x,t) = \sum_{i=1}^{M} a_i(x)\, b_i(t)$$

[0048] Here Q̃ is an M-dimensional subspace approximation to Q. Since, in some embodiments of the present invention, x and t are discrete variables, it is notationally convenient to rewrite the previous equation as:

[0049]

$$\tilde{Q}[x,t] = \sum_{i=1}^{M} a_i[x]\, b_i[t]$$

[0050] In this relationship, Q (similarly Q̃) is a matrix indexed by x and t, a is a vector indexed by x, and b is a vector indexed by t. Using an Empirical Orthogonal Function (EOF) analysis (and the closely related singular value decomposition (SVD)), values for the a's and b's are determined. The a's and b's are determined such that the least squares error is reduced (e.g. lessened, minimized) for all values of M, by finding the eigenvectors of the two covariance matrices QQ^T and Q^T Q. In various embodiments, these two matrices have the same set of eigenvalues up to the reduced (e.g. lesser, minimum) number of x's and t's. In the above representation, a_1[x] and b_1[t] are the eigenvectors with the largest eigenvalues, a_2[x] is associated with the second largest eigenvalue, etc. In the present disclosure, the a's represent the subspace basis vectors and the b's represent the pose space coordinates.
[0051] This particular decomposition has a number of advantageous properties. One important property is that it is sequential, i.e., for M = 1 the first set of vectors is determined. Next, by performing the same analysis on the residual data (Q - Q̃), a second set of vectors is determined, and so on. The iterative processes described herein are numerically attractive methods for computing the subspace, since the largest mode can be easily found from a factored power method without computing the eigenvalues or the covariance matrices.
Additionally, since the process can be iteratively performed, one can monitor the magnitude of the residual, which is the subspace projection error. In various embodiments, the process may be terminated when the magnitude of the error is acceptably small.
[0052] Another valuable aspect of the described embodiments is that no assumptions are made about the spatial or temporal adjacency of points and frames. In particular, although the a_i[x] vectors do tend to be smooth in space, this smoothness results only from the smoothness and spatial correlations of the original training frames.
[0053] Yet another valuable aspect of the described embodiments is that once the final subspace dimension, M, is chosen, the a_i[x] and b_i[t] vectors are not uniquely determined by the reduction (e.g. lessening, minimization) of error. What is desired is that the subspace be spanned by the basis vectors. In various embodiments, multiplication of the basis vectors by any nonsingular matrix should result in an additional set of vectors which span the same space. Further, the multiplication of the basis vectors by any orthogonal M dimensional rotation matrix should result in an orthonormal basis for the subspace. This property is quite useful to generate basis vectors that are more "local" than the original basis vectors. In various embodiments, as will be described below, the "localized" basis vectors are used to help in the selection of "key points." [0054] In various embodiments, the Point Multiplication process may include, after computing a subspace and set of key points using the training set, using values of selected key points and using a least squares projection onto the subspace to determine the pose space coordinates. The statistically determined pose is then taken as the value of Q̃ for that pose space coordinate.
[0055] Figs. 3A-B illustrate a flow diagram according to various embodiments of the present invention. Initially, a set of training frames Q, as described above, is provided, step 200. In various embodiments, each frame defines values for animation variables for a number of sample training points at a defined frame (time) as derived from data specified by an animator, lighting user, or the like. As merely an example, an animator may specify specific poses for an object. Next, based upon the pose data, the positions "x" for a defined number of surface locations on the object can be calculated for each point, for each specific pose, to determine Q. As another example, a lighter may specify specific illumination sources being positioned in a scene. Next, based upon these sources, illumination values "x" for the training points in the scene can be calculated for each illumination configuration to determine Q. In one example, the number of training points in a training set may range from about 1000 to about 4000 points. In other embodiments, a greater number of training points or a fewer number of points may be specified for a training frame.
[0056] In various embodiments, based upon the set of training frames Q, the basis vectors a_i[x] and the pose space coordinates b_i[t] are determined, step 210. As discussed above, techniques such as empirical orthogonal function analysis or singular value decomposition can be used to determine the basis vectors. In various embodiments, the number of basis vectors is on the order of 10 to 20. In other embodiments, a greater or fewer number of basis vectors may be used, depending upon performance constraints, accuracy constraints, and the like.
[0057] In response to the basis vectors, in various embodiments, key points are selected from training points from the set of training frames; key points are approximations of training points; or the like, step 220. In various embodiments, the key points may be divided into groups, e.g. a first group of key points and a second group of points. In some embodiments, the second group of points are also key points. The groups of points need not be specifically identified at this stage. Further detail regarding how key points may be identified is described below.
[0058] In various embodiments, steps 200-220 may be performed off-line, e.g. before run-time.
[0059] In various embodiments of the present invention, during production run-time, values of animation variables are specified, step 230. As examples of this step, an animator may specify displacements, geometric parameters, and the like as the animation values; or a lighter may specify a specific set of illumination sources in a scene, or the like. In various embodiments, any conventional animation environment or lighting environment may be used to specify these animation values. In some embodiments, this corresponds to animation control 40, in Fig. lB.
[0060] Next, in response to the given values for the animation variables, computed values are determined in a conventional manner for key points and other points identified above, step 240. In various embodiments, values of the key points in all the distinct groups of key points are determined. In various embodiments directed to object posing, based upon the animation values, the pose values for the key points are determined. In some embodiments, the posing engine and posing results determined in this step correspond to elements 50 and 60, in Fig. 1B. In other embodiments, the computed values need not be determined by inputting the animation values into a "full" animation engine, but may be computed by inputting the animation values into a more simplified animation engine.
[0061] In various embodiments, based upon the determined values for a first set of key points, values for other points may be predicted (e.g. determined), step 250. In some embodiments, the key points in the first set of key points are projected onto the subspace while reducing (e.g. lessening, minimizing) a least squares error. The result of this operation is the determination of pose space coordinates b_i for a given frame. In other words, the b_i are determined such that they weight the basis functions a_i to approximately match the computed values associated with the first key points (with reduced least squares error). Based upon b_i and a_i, the values for all points specified in the training set Q, or others, may be predicted. In various embodiments, this includes predicting values for points from the second set of points (e.g. key points), or other points. These predicted values may be used in the following steps. In other embodiments, values for points not specifically specified in training sets are predicted/determined based upon the projection onto the subspace. In some embodiments, the output of this step from posing engine 50 is posed points 70, in Fig. 1B.
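A hedged illustration of step 250, not taken from the patent text itself: the least squares projection can be written as below. The function name point_multiplication and the use of NumPy's lstsq are assumptions made for exposition.

```python
# Illustrative sketch of the run-time Point Multiplication prediction (step 250).
import numpy as np

def point_multiplication(a, key_idx, key_values):
    """a: (num_points, M) subspace basis vectors from the training set;
    key_idx: indices of the first set of key points;
    key_values: values computed for those key points for the current frame
    (e.g. positions or illumination values; shape (num_keys,) or (num_keys, k)).
    Returns predicted values for every point represented in the training set."""
    A_key = a[key_idx, :]                                    # basis restricted to the key points
    b, *_ = np.linalg.lstsq(A_key, key_values, rcond=None)   # pose space coordinates b_i
    return a @ b                                             # predicted values for all points
```

In this reading, the pose space coordinates b are simply the least squares solution that best reproduces the computed key point values, and the full-point prediction is the subspace reconstruction at those coordinates.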
[0062] Next, in various embodiments, a determination is made as to whether the predicted values of the second set of points are within an error range of the actual, determined values of the second set of points, step 255. As will be explained further below, the predicted values for the points in the second set of points (step 250) may be compared to the computed values for these points determined in step 240. In various embodiments, because values for these points were computed above, these values are conveniently used in this step to determine how accurate the subspace is.
[0063] Although the predicted values for points determined in step 250 are usually quite accurate, there can be times when the statistical reconstruction may be unable to produce an estimate with acceptable accuracy. In various embodiments, if the key points are well chosen, the error in the calculations in step 250 may be a good measure of the total projection error. Large projection errors may occur in cases when the range of animation control that is provided in step 230 was not specified in the training set, or a particular combination of animation controls was exercised that resulted in a novel pose.
[0064] In various embodiments of the present invention directed to animated sequences, discontinuous behavior between adjacent frames of animation is not desirable, as that will lead to popping or other high frequency artifacts being visible to an audience, for example.
One case of this may include, for example, if frames 1-10 rely on animation predictions based upon a subspace model, and frames 11-12 rely on full animation computations based upon a full space model. In such embodiments, it is not desirable for the predicted values to be used for frames 1-10, and full computed values for frames 11-12, because the positions of points may be visibly different.
[0065] To reduce the appearance of such discontinuities, a transition zone of projection error values e = [e_min, e_max], over which the result blends into the full animation calculation, is provided in various embodiments. In such cases the values for the points are interpolated between the predicted values and the fully calculated values, as a function of error, step 295. In various embodiments, if the error (measured by the RMS error of the key points) is below e_min, only the Point Multiplication result (e.g. predicted value) is used. In various embodiments, if the error is greater than e_max, the full calculation is used at every point. Additionally, if the error is within the e range, the result may be linearly interpolated between the Point Multiplication (e.g. predicted) and fully calculated results.
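One assumed formulation of this transition-zone blend, purely for illustration (the names soft_cache_blend, e_min and e_max are hypothetical, and the full calculation is passed as a callable so that it only runs on a miss), is:

```python
# Illustrative sketch of the soft caching transition zone described in [0065].
import numpy as np

def soft_cache_blend(predicted, predicted_at_check, computed_at_check,
                     full_calculation, e_min=0.1, e_max=0.15):
    """predicted: Point Multiplication result for all points;
    predicted_at_check / computed_at_check: values at the error-checking key points;
    full_calculation: callable returning fully calculated values for all points."""
    err = np.sqrt(np.mean((predicted_at_check - computed_at_check) ** 2))  # RMS key-point error
    if err <= e_min:
        return predicted                      # cache hit: the prediction alone is used
    if err >= e_max:
        return full_calculation()             # cache miss: fail through to the full calculation
    w = (err - e_min) / (e_max - e_min)       # linear blend inside the transition zone
    return (1.0 - w) * predicted + w * full_calculation()
```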
[0066] Figs. 7A-C illustrate examples according to other embodiments of the present invention. Specifically, they illustrate an example where the predicted results and the computed results are combined. Fig. 7A illustrates a facial pose resulting from posing all of the points with our posing engine. Next, the predicted result for the points generated via Point Multiplication of 170 key points is shown in Fig. 7B. Although the fully computed poses (Fig. 7A) are very close to the predicted poses (Fig. 7B), they may be different. In Fig. 7B, notice how the tight lip pucker of Fig. 7A is incorrectly predicted.
[0067] Fig. 7C illustrates a result of applying embodiments of the Soft Caching process to the incorrectly predicted facial pose in Fig. 7B, using e = [0.1, 0.15].
[0068] In Fig. 7B, the cache miss was identified, and as a result, the output illustrated in Fig. 7C was formed. As can be seen, Fig. 7C appears closer to the output of Fig. 7A. In various embodiments, other types of combinations between the predicted and fully calculated results may be performed, e.g. non-linear, etc. [0069] In some embodiments of the present invention, cache misses may be tagged and the animation inputs used for the cache miss may be input into the system and used as additional training data. Such embodiments would provide a system that learns more about the pose space as it is used. Training could, for example, be done each night, resulting in a more accurate subspace for the animators and lighters to use the next day.
[0070] In other embodiments, instead of interpolating between a full set of predicted data and a full set of fully computed data, localized interpolation may be used. For example, in Figs. 7A-C, if error is determined to be localized, e.g. near the mouth, the full computation based upon animation input may be performed near the mouth. In such cases, a spatial interpolation may also be applied which blends between regions of full computations and regions of predictions. Such embodiments would be beneficial to reduce the number of "fail-through" computations that would be required.
[0071] In various embodiments of the present invention, the object surface may be rendered, step 260. Any conventional rendering engine may be used to perform this step. In some embodiments, Pixar's RenderMan® product can be used. The resulting image may be further processed, if desired, and a representation of the image is stored into computer memory, step 270. The representation of the image may be retrieved from the memory, or transferred to another memory, such as film media, optical disk, magnetic disk, or the like.
Subsequently, the other memory is used to retrieve the representation of the image, and the image may be output for viewing by a user, step 280. For example, the image may be output to an audience from film media, viewed by the animator on a computer display, viewed by a consumer from an optical storage disk, or the like.
[0072] Selection of Key Points [0073] In various embodiments of the present invention, when using key points for determining a pose in a subspace there are two potential sources of error. A first one is projection error, and a second one is cueing error. Projection error occurs, for example, when the pose specified by the user, e.g. animator, may not be in the subspace. In other words, the specified pose is outside the training set of poses, for example. Such errors may also occur if a lighting configuration is outside the set of training lighting configurations.
[0074] Cueing error occurs when the subspace locations determined from the least squares fit to the key points, described in step 250, above are not the closest point in the subspace to the desired pose. In other words, cueing error may be caused by sub-optimal selection of the key points from the subspace, as key point values are used as distance proxies. In light of the potential sources of error, various embodiments of the present invention may include an iterative approach for selecting the key points from the subspace that attempts to minimize / reduce cueing error.
[0075] Figs. 4A-C illustrate a flow diagram according to various embodiments of the present invention. More specifically, Figs. 4A-C illustrate a more detailed process for step 220, in Fig. 3. In various embodiments of the present invention, as discussed in step 210, the basis vectors a_i[x] and the pose space coordinates b_i(t) are determined based upon the set of training frames Q. [0076] Next, a coordinate rotation is performed on the basis vectors, step 300. In various embodiments, the Varimax method is used, which computes an orthogonal rotation that maximizes the ratio of the 4th moment to the 2nd moment of the basis vectors. After rotation, the rotated basis vectors provide a measure of locality. In other words, the coordinate rotation localizes the basis vectors by maximizing the variation of a small set of points in each basis vector (in turn driving the variation of the other points towards zero). In other embodiments, other methods for performing coordinate rotation can also be used.
[0077] Next, in various embodiments, for each localized basis vector, step 310, two points are selected as key points from each of the rotated vectors, steps 320 and 330. In some embodiments, a first point may be the point with the largest positive magnitude in the basis vector, step 320; and the second point may be the point whose inner product with the first point has the largest negative value, step 330. In some embodiments, steps 320 and 330 may select key points with less than the largest magnitude or greater than the largest negative value, depending upon specific implementation or design.
[0078] This type of technique for discovering statistical structure is known as teleconnection analysis. In effect, this technique essentially chooses the point that moves the most when this (localized) basis function is excited as well as the point that is most negatively correlated to the rotated basis function. The process may then be repeated for each basis vector.
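A possible rendering of this rotation and selection step, offered only as an illustration: the sketch below uses a standard Varimax iteration, and reads the second key point per vector as the most negatively valued entry of the localized basis vector (one interpretation of the "largest negative" criterion in step 330). The names varimax and pick_key_points are hypothetical.

```python
# Illustrative sketch of basis localization (Varimax) and per-vector key point selection.
import numpy as np

def varimax(A, gamma=1.0, max_iter=100, tol=1e-6):
    """Standard Varimax rotation: returns an orthogonally rotated, more 'local' basis."""
    p, k = A.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = A @ R
        u, s, vt = np.linalg.svd(
            A.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.sum(L ** 2, axis=0))))
        R = u @ vt
        d_old, d = d, np.sum(s)
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return A @ R

def pick_key_points(localized_basis):
    """For each localized basis vector, pick the point with the largest positive value
    (the point that moves the most for that mode) and the point with the most negative
    value (the point most negatively correlated with the mode)."""
    keys = set()
    for i in range(localized_basis.shape[1]):
        v = localized_basis[:, i]
        keys.add(int(np.argmax(v)))
        keys.add(int(np.argmin(v)))
    return sorted(keys)
```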
[0079] In various embodiments, in addition to the key points determined by the above procedure, additional key points may be selected from the subspace, step 340. For example, in some embodiments, there may be critical fiducial points identified by the character (or lighting) designer that are manually selected as key points. For example, the designer may include key points in areas of high movement, such as eyelids, the corners of a mouth, or the like. As another example, for rendering applications a sparse grid of key points may be specified to avoid having large regions of a surface not being represented by key points.
[0080] In various embodiments of the present invention, given this set of key points and the subspace basis vectors, a Point Multiplication approximation for each of the columns of the training matrix Q is determined, step 350. In some embodiments, the Point Multiplication approximation may include using a least squares projection onto the subspace to determine pose space coordinates. [0081] In various embodiments, a residual matrix is then determined by subtracting the approximation Q̃ from the corresponding columns of Q, step 360. The residual matrix represents the error (both projection and cueing error) when approximating the training set Q (many points) using the current set of key points (fewer points).
[0082] Based upon the error described by the residual matrix, step 370, additional points may be selected as key points, until the error reaches an acceptable error bound. In various embodiments of the present invention, if the error is unacceptable, the residual matrix is set as the training set Q, step 380, and the process described above is repeated.
[0083] Using the residual matrix in this iterative process allows points to be added as key points, thereby reducing the cueing error. However, in various embodiments of the present invention, steps 370 and 380 need not be performed. The identification of the key points is then stored for later use, for example in step 240, step 390.
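Putting steps 300-390 together, an assumed outer loop might look like the sketch below. It reuses the illustrative compute_subspace, varimax, pick_key_points and point_multiplication functions from the earlier sketches, and the function name select_key_points and the error tolerance are hypothetical.

```python
# Illustrative sketch of the iterative key point selection loop (steps 350-380).
import numpy as np

def select_key_points(Q, M, err_tol=0.01, max_rounds=5):
    """Grow the key point set until the Point Multiplication approximation of the
    training matrix Q is within err_tol (relative Frobenius norm) of Q."""
    a, _, _ = compute_subspace(Q, M)       # run-time basis: always from the initial Q ([0084])
    keys = []
    residual = Q
    for _ in range(max_rounds):
        a_r, _, _ = compute_subspace(residual, M)            # basis of the current residual
        keys = sorted(set(keys) | set(pick_key_points(varimax(a_r))))
        Q_hat = point_multiplication(a, keys, Q[keys, :])     # approximate every training column
        residual = Q - Q_hat                                  # projection + cueing error
        if np.linalg.norm(residual) / np.linalg.norm(Q) < err_tol:
            break
    return keys, a
```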
[0084] A subtle point in the embodiments described above is that the basis vectors constructed during iterations of this process may be discarded. Instead, the basis vectors used for the run-time Point Multiplication reconstruction may be those determined in step 210, computed from the initial training matrix Q. [0085] Character Articulation Examples [0086] Embodiments of the present invention were applied to the area of non-skeletal character articulation, and more particularly to the area of facial articulation. Faces are particularly difficult to replace with statistical models because they often have hundreds of animation controls, many of which are very complex and non-linear, as described above. In the case of human faces, the number of degrees of freedom is small, thus the inventors verified that embodiments of the techniques described above work well.
[0087] By using the computed key point positions as input, embodiments of the Point Multiplication technique were able to avoid complex nonlinearities and were able to simulate facial articulation. As noted for Fig. 1A, using animation variables to directly drive a statistical model was difficult due to such complex nonlinearities. Additionally, faces do not have the benefit of the skeletal structure and joint angles used by other parts of the body as input to their statistical models.
[0088] In various experiments, facial animation of a boy character was used for testing purposes. The data set included about 8322 frames of animation from 137 animation sequences. In order to provide an independent data test for the Point Multiplication process, the animation sequences were divided into two sets: one was used to train the model (2503 poses chosen from 44 randomly selected sequences) and the second was used for validation (the remaining 5819 poses).
[0089] Figs. 5A-C illustrate comparative examples of embodiments of the present invention. Using embodiments of the process described above, 85 basis vectors posed by 170 key points were sufficient to pose the model to very small errors in the training segments. The key points identified are shown in Fig. 5A. The facial model computed by posing only the key points and then using Point Multiplication to determine the location of 2986 points is shown in Fig. 5B. In comparison, the facial model computed by directly posing all 2986 articulated facial points is illustrated in Fig. 5C. As illustrated, the posing of the character in Fig. 5B is very close to the pose in Fig. 5C. Experimentally, it was determined that the maximum error between any point of the pose in Fig. 5B and Fig. 5C is less than 1% of the diagonal of the head bounding box for over 99.5% of the test poses.
[0090] The reduction in computation by computing poses for key points and using Point Multiplication reduces the amount of time necessary to compute, for example, a surface pose.
In various embodiments, the characters typically modeled by the assignee of the present invention typically employed fairly complicated kinematic deformer networks. Presently, it has been determined that for faces, the computation time is in-between a near linear cost per point calculation and a constant cost (where posing a small set of points is no less expensive than posing the full face).
[0091] In some experiments, for example, the test character in Figs. 5A-C, it has been determined that the cost of posing 170 key points is about 10% of the cost of posing the full 2986 points. In particular, all 2986 points can be posed in 0.5 seconds on average, while the key points can be posed in 0.05 seconds on average. Additionally, the Point Multiplication process takes 0.00745 seconds (e.g., 7.45 ms) on average. As a result, the speed-up in posing time is approximately 8.7 times (0.5 s / (0.05 s + 0.00745 s) ≈ 8.7). In some embodiments, the speedup may not be linear with respect to the ratio of key points to total points. This is believed to be because the key points often require more complicated deformations than the other points. For example, the mouth points require more effort to pose than the points on the back of the head (representing a smoother surface). Another reason is believed to be because in the full set of points, multiple points can sometimes reuse computations. Thus, posing such related points does not require twice the computations.
[0092] In various embodiments, the number of key points with respect to the full number of points may range from approximately 1%-5%, 5%-10%, 10%-25%, or the like.
[0093] The examples described above relied upon a character posed using a kinematic deformation rig. However, it should be noted that embodiments may also be applied to poses defined by physical simulation, hand sculpting, or the like. In such embodiments, first a version of the face that can be manipulated by the animator at runtime (the control face) is created. Next, a training set of poses containing pairs of the physically simulated (or hand sculpted) face and the corresponding pose of the control face are used as training sets. Then, at runtime, for example, the animator poses the control face and the system uses the control face points to find the projection into the joint (control face, simulated face) subspace and computes the corresponding simulated face, based upon the key point concept and Point Multiplication concept, described above.
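As a hedged illustration of this control-face workflow (an assumption about how the joint subspace might be assembled, reusing the illustrative point_multiplication sketch above; the function names are hypothetical):

```python
# Illustrative sketch of the joint (control face, simulated face) subspace described in [0093].
import numpy as np

def build_joint_training(control_frames, simulated_frames):
    """Stack control-face and simulated-face point data for the same training frames
    into one joint training matrix; columns are frames, rows are points."""
    return np.vstack([control_frames, simulated_frames])

def pose_simulated_face(a_joint, num_control_points, control_values):
    """At run time the animator poses the control face; its point values act as the
    'key points' of the joint subspace, and the simulated-face portion is predicted."""
    control_idx = np.arange(num_control_points)
    joint = point_multiplication(a_joint, control_idx, control_values)
    return joint[num_control_points:]        # predicted simulated-face values
```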
[0094] Additional Examples [0095] Embodiments of the present invention may also be applied to computation of indirect illumination contributions, as mentioned above. These rendering problems are particularly well suited to acceleration using various embodiments as they often have a large fraction of the total computation in a final gather step. Typically, the computational costs vary nearly linearly with the number of points computed, thus it is desirable to reduce the number of computations.
[0096] Figs. 6A-D2 illustrate examples according to embodiments of the present invention.
More specifically, Figs. 6A-D2 illustrate an indirect illumination problem based on a Cornell box. In this example, each training frame was generated by illuminating the box using a single point light from a grid of 144 lights along the ceiling, see Fig. 6A. From the training set, 32 illumination basis functions were selected and 200 control points (key points 400) were identified.
[0097] Experimentally, embodiments of the Point Multiplication process were performed computing the indirect illumination contributions at a series of light locations scattered near the training illumination sources, as seen in Figs. 6B1, 6C1, and 6D1. For comparison, the fully computed indirect illumination solutions are respectively illustrated in Figs. 6B2, 6C2, and 6D2. As these images show, the lighting is accurately estimated via the embodiments of the Point Multiplication process. Further, the resulting errors are quite small.
[0098] In various embodiments of the present invention, the use of the lighting at key points as input to the system, rather than lighting variables (e.g. light position), allows the system to handle changes in the lighting variables rather easily. For instance, changing the light type, or distance falloff, etc. normally has a complex, nonlinear effect on the resulting indirect illumination. In contrast, because embodiments of the present invention compute the indirect illumination at the key points and use the illumination itself to drive the statistical model, there is no need to determine the result of changes of animation variables to the final indirect illumination as other methods would. In a production setting where lights have many complex and often interacting controls, this is a huge benefit.
[0099] Other embodiments of the present invention may be applied to interactive relighting techniques including global illumination, and other illumination-related applications.
[0100] The illumination rendering tests presented herein have performed the calculations using image space locations; however, in other embodiments of the present invention, the illumination data can easily be stored in a spatial data structure or texture map to accommodate moving cameras and geometry.
[0101] Further embodiments can be envisioned by one of ordinary skill in the art after reading this disclosure. In other embodiments, combinations or sub-combinations of the above disclosed invention can be advantageously made. The block diagrams of the architecture and graphical user interfaces are grouped for ease of understanding. However it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention.
[0102] The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader scope of the invention as set forth in the claims.

Claims (11)

  1. A method for a computer system comprising: determining an animation variable response subspace within an animation variable response space associated with an animation variable, in response to a set of training data for the animation variable; determining a set of characteristic calculation key points configured to allow navigation within the animation variable response subspace; calculating animation variable response values for the set of characteristic calculation key points in the animation variable response subspace for a specific frame of animation in response to input data for the animation variable; predicting animation variable response values for a set of points within the animation variable response space for the specific frame of animation in response to animation variable response values for at least some of the set of characteristic calculation key points in the animation variable response subspace; determining a difference between a predicted animation variable response value and a calculated animation variable response value for a point in the set of points and in the set of characteristic calculation key points; when the difference exceeds a first threshold difference, combining the predicted animation variable response value and the calculated animation variable response value to form the animation variable response value for the point for the specific frame of animation; and rendering an image using the animation variable response value for the point.
  2. The method of claim 1, further including when the difference exceeds a second threshold difference, using the predicted animation variable response value for the animation variable response value for the point.
  3. The method of claim 1 or claim 2, further including when the difference is less than the first threshold difference, using the calculated animation variable response value for the animation variable response value for the point.
  4. The method of claim 2 or claim 3, wherein combining is selected from a group consisting of: linearly combining in response to the difference, non-linearly combining in response to the difference; and wherein the animation variable is selected from a group consisting of: articulation variable, illumination variable.
  5. The method of any one of claims 1 to 4, wherein the animation variable response values are selected from a group consisting of: articulation values, and illumination values.
  6. A computer system comprising: a memory configured to store a set of training data for an animation variable; and a processor coupled to the memory, wherein the processor is configured to determine an animation variable response subspace within an animation variable response space associated with the animation variable, in response to the set of training data for the animation variable, wherein the processor is configured to determine a set of characteristic calculation key points configured to allow navigation within the animation variable response subspace, wherein the processor is configured to calculate animation variable response values for the set of characteristic calculation key points in the animation variable response subspace for a specific frame of animation in response to input data for the animation variable; wherein the processor is configured to predict animation variable response values for a set of points within the animation variable response space for the specific frame of animation in response to animation variable response values for at least some of the set of characteristic calculation key points in the animation variable response subspace, wherein the processor is configured to determine a difference between a predicted animation variable response value and a calculated animation variable response value for a point in the set of points and in the set of characteristic calculation key points, and wherein when the difference exceeds a first threshold difference, the processor is configured to combine the predicted animation variable response value and the calculated animation variable response value to form the animation variable response value for the point; wherein the memory is configured to store the animation variable response value for the point; and wherein the processor is configured to render an image in response to the animation variable response value for the point.
  7. The computer system of claim 6, wherein the processor is configured to use the predicted animation variable response value for the animation variable response value for the point.
  8. The computer system of claim 6 or claim 7, wherein the processor is configured to use the calculated animation variable response value for the animation variable response value for the point.
  9. The computer system of any one of claims 6 to 8, wherein the animation variable response values are selected from a group consisting of: articulation values, and illumination values.
  10. The computer system of any one of claims 6 to 9, wherein the processor is configured to calculate animation variable response values for additional points in the set of points in response to input data for the animation variable; and wherein the additional points are selected from a group consisting of: all points in the set of points, less than all points in the set of points.
  11. A computer-readable medium carrying computer readable code for controlling a computer to carry out the method of any one of claims 1 to 5.
GB1103946A 2006-01-25 2007-01-25 Methods and apparatus for determining animation variable response values in computer animation Active GB2475651B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US76247506P 2006-01-25 2006-01-25
US11/626,214 US7764288B2 (en) 2006-01-25 2007-01-23 Methods and apparatus for accelerated animation using point multiplication and soft caching
US11/626,199 US7839407B2 (en) 2006-01-25 2007-01-23 Methods and apparatus for accelerated animation using point multiplication
GB0812051A GB2447388B (en) 2006-01-25 2007-01-25 Methods and apparatus for determining animation variable response values in computer animation

Publications (3)

Publication Number Publication Date
GB201103946D0 GB201103946D0 (en) 2011-04-20
GB2475651A true GB2475651A (en) 2011-05-25
GB2475651B GB2475651B (en) 2011-08-17

Family

ID=43928119

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1103946A Active GB2475651B (en) 2006-01-25 2007-01-25 Methods and apparatus for determining animation variable response values in computer animation

Country Status (1)

Country Link
GB (1) GB2475651B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6069631A (en) * 1997-02-13 2000-05-30 Rockwell Science Center, Llc Coding of facial animation parameters (FAPs) for transmission of synthetic talking head video over band limited channels
US20050286764A1 (en) * 2002-10-17 2005-12-29 Anurag Mittal Method for scene modeling and change detection

Also Published As

Publication number Publication date
GB201103946D0 (en) 2011-04-20
GB2475651B (en) 2011-08-17

Similar Documents

Publication Publication Date Title
Shimada et al. Physcap: Physically plausible monocular 3d motion capture in real time
US11176725B2 (en) Image regularization and retargeting system
US10565792B2 (en) Approximating mesh deformations for character rigs
US9684996B2 (en) Rendering global light transport in real-time using machine learning
US10650599B2 (en) Rendering virtual environments utilizing full path space learning
US8169438B1 (en) Temporally coherent hair deformation
US9892549B2 (en) Adaptive rendering with linear predictions
US10964084B2 (en) Generating realistic animations for digital animation characters utilizing a generative adversarial network and a hip motion prediction network
US7995059B1 (en) Mid-field and far-field irradiance approximation
US8736616B2 (en) Combining multi-sensory inputs for digital animation
US8054311B1 (en) Rig baking for arbitrary deformers
US9041718B2 (en) System and method for generating bilinear spatiotemporal basis models
US20130243331A1 (en) Information processing device, information processing method, and program
US7764288B2 (en) Methods and apparatus for accelerated animation using point multiplication and soft caching
Purushwalkam et al. Bounce and learn: Modeling scene dynamics with real-world bounces
US7839407B2 (en) Methods and apparatus for accelerated animation using point multiplication
Tonneau et al. Character contact re‐positioning under large environment deformation
US7952582B1 (en) Mid-field and far-field irradiance approximation
Huang et al. Inverse kinematics using dynamic joint parameters: inverse kinematics animation synthesis learnt from sub-divided motion micro-segments
WO2007087444A2 (en) Methods and apparatus for accelerated animation using point multiplication and soft caching
US11893671B2 (en) Image regularization and retargeting system
GB2475651A (en) Determining an animation variable response subspace in response to a set of training data
Hwang et al. Primitive object grasping for finger motion synthesis
US11941739B1 (en) Object deformation network system and method
Schurischuster et al. In-Time 3D Reconstruction and Instance Segmentation from Monocular Sensor Data