WO2024012007A1 - Animation data generation method and apparatus, and related products
- Publication number
- WO2024012007A1 (PCT/CN2023/091117)
- Authority
- WO
- WIPO (PCT)
Classifications
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30196—Human being; Person
- G06T2207/30241—Trajectory
Definitions
- This application relates to the field of artificial intelligence technology, especially animation data generation technology.
- Action matching technology can select the most matching animation frame from a large number of animations to play, thereby obtaining animations of virtual objects with different actions.
- Embodiments of the present application provide an animation data generation method, device and related products, aiming to generate animation data with low memory usage.
- a first aspect of this application provides a method for generating animation data.
- the animation data generation method is executed by the animation data generation device, including:
- generating query features of a virtual object in a virtual scene according to running data of the virtual scene, where the query features include trajectory features and skeletal features of the virtual object;
- based on the trajectory features and skeletal features of the virtual object, increasing the feature dimensions of the virtual object through a feature generation network in a neural network to obtain combined features of the virtual object, where the neural network is pre-trained;
- based on the combined features of the virtual object, generating animation data of the virtual object through an animation generation network in the neural network.
- a second aspect of this application provides an animation data generating device.
- the animation data generating apparatus is deployed on the animation data generation device and includes:
- a query feature generation unit configured to generate query features of virtual objects in the virtual scene according to the operating data of the virtual scene; the query features include trajectory features and skeletal features of the virtual object;
- a combined feature generation unit configured to increase the feature dimensions of the virtual object through a feature generation network in a neural network based on the trajectory features and skeletal features of the virtual object to obtain the combined features of the virtual object, where the neural network is pre-trained;
- An animation data generating unit is configured to generate animation data of the virtual object through an animation generating network in the neural network based on the combined characteristics of the virtual object.
- the third aspect of this application provides an animation data generation device.
- the animation data generation device includes a processor and a memory:
- the memory is used to store a computer program and transmit the computer program to the processor
- the processor is configured to execute the steps of the animation data generating method introduced in the first aspect according to instructions in the computer program.
- a fourth aspect of this application provides a computer-readable storage medium.
- the computer-readable storage medium is used to store a computer program.
- when the computer program is executed by the animation data generation device, the steps of the animation data generation method introduced in the first aspect are implemented.
- a fifth aspect of this application provides a computer program product.
- the computer program product includes a computer program that implements the steps of the animation data generation method introduced in the first aspect when executed by the animation data generation device.
- the trajectory features and skeletal features of the virtual object in the virtual scene are obtained as query features, and based on the trajectory features and skeletal features, the animation data of the virtual object is generated through a pre-trained neural network. Since the pre-trained neural network has the function of adding feature dimensions of virtual objects based on query features and generating animation data of virtual objects based on high-dimensional features, it can meet the demand for animation data generation.
- Figure 1 is a schematic diagram of an animation state machine
- Figure 2 is a scene architecture diagram for implementing a method for generating animation data provided by an embodiment of the present application
- Figure 3 is a flow chart of an animation data generation method provided by an embodiment of the present application.
- Figure 4 is a schematic structural diagram of a neural network provided by an embodiment of the present application.
- Figure 5A is a schematic structural diagram of another neural network provided by an embodiment of the present application.
- Figure 5B is a flow chart of another animation data generation method provided by an embodiment of the present application.
- Figure 6 is a schematic structural diagram of a feature generation network provided by an embodiment of the present application.
- Figure 7 is a schematic structural diagram of a feature update network provided by an embodiment of the present application.
- Figure 8 is a schematic structural diagram of an animation generation network provided by an embodiment of the present application.
- Figure 9A is a training flow chart of a neural network provided by an embodiment of the present application.
- Figure 9B is a schematic diagram of the root skeleton trajectory before noise reduction provided by the embodiment of the present application.
- Figure 9C is a schematic diagram of the root skeleton trajectory after noise reduction provided by the embodiment of the present application.
- Figure 10A is a schematic structural diagram of a deep learning network capable of extracting auxiliary query features provided by an embodiment of the present application
- Figure 10B is a schematic diagram of the animation effect obtained by the traditional action matching method and the animation data generation method provided by the embodiment of the present application;
- Figure 11 is a schematic structural diagram of an animation data generating device provided by an embodiment of the present application.
- Figure 12 is a schematic structural diagram of another animation data generation device provided by an embodiment of the present application.
- Figure 13 is a schematic structural diagram of a server in an embodiment of the present application.
- Figure 14 is a schematic structural diagram of a terminal device in an embodiment of the present application.
- Figure 1 is a schematic diagram of an animation state machine.
- The four animation states shown in Figure 1 are: Defend, Upset, Victory, and Idle.
- The two-way arrows between the four animations represent switching between animations. If the traditional state machine method is used to generate animation in game development or animation production scenarios, then when the movements of virtual objects are relatively complex, the design workload of the state machine will be very large, and subsequent updates and maintenance will be difficult, time-consuming, and prone to failure.
- Action matching technology solves the problems of the large design workload, complex logic, and inconvenient maintenance of animation state machines.
- action matching technology needs to store massive animation data in advance for query matching, so it consumes a lot of memory, resulting in poor storage and query performance.
- this application provides an animation data generation method, device and related products.
- the pre-trained neural network can be used to generate the animation data of the virtual object.
- Because the weight data of the neural network occupies only a small memory footprint, storage and query performance can be improved.
- The animation data generation method mainly involves artificial intelligence (AI) technology, especially machine learning in artificial intelligence technology, and uses neural networks trained by machine learning to solve the storage and query performance issues of action matching technology in animation and film production.
- AI (Artificial Intelligence)
- Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how computers can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their performance.
- Machine learning is the core of artificial intelligence and a branch of artificial intelligence. It is the fundamental way to make computers intelligent, and its applications cover all fields of artificial intelligence.
- Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
- The history of artificial intelligence research has followed a natural and clear path from focusing on "reasoning" to focusing on "knowledge" to focusing on "learning". Obviously, machine learning is a way to realize artificial intelligence, that is, machine learning is used as a means to solve problems in artificial intelligence.
- Artificial neural network, referred to as neural network or neural-like network, is, in the fields of machine learning and cognitive science, a mathematical or computational model that imitates the structure and function of biological neural networks and is used to estimate or approximate functions. A neural network performs calculations through the connections of a large number of artificial neurons. In most cases, an artificial neural network can change its internal structure based on external information; it is an adaptive system which, in layman's terms, has a learning function.
- Motion capture, also known as dynamic capture, refers to the technology of recording and processing the movements of people or other objects. It is widely used in many fields such as entertainment, sports, medical applications, computer vision, and robotics. In fields such as animation production, film production, and video game development, it usually records the movements of human actors, converts them into the movements of digital models, and generates two-dimensional or three-dimensional computer animation. When it captures subtle movements of faces or fingers, it is often called performance capture.
- the virtual scene can be a simulation scene of the real world, a semi-simulation and semi-fictional three-dimensional scene, or a purely fictional three-dimensional scene.
- the virtual scene may be any one of a two-dimensional virtual scene, a 2.5-dimensional virtual scene, and a three-dimensional virtual scene.
- the following embodiments illustrate that the virtual scene is a three-dimensional virtual scene, but this is not limited.
- the virtual scene is also used for a virtual scene battle between at least two virtual objects.
- The virtual scene can be, for example, a game scene, a virtual reality scene, an extended reality scene, etc.; this is not restricted in the embodiments of the present application.
- the movable object may be at least one of a virtual character, a virtual animal, and an animation character.
- When the virtual scene is a three-dimensional virtual scene, the virtual object may be a three-dimensional model created based on animation skeleton technology.
- Each virtual object has its own shape and volume in the three-dimensional virtual scene and occupies a part of the space in the three-dimensional virtual scene.
- The animation data generation method provided by the embodiment of the present application can be executed by an animation data generation device, which can be, for example, a terminal device. That is, the query features are generated on the terminal device and the animation data is generated based on a pre-trained neural network.
- terminal devices may specifically include but are not limited to mobile phones, computers, intelligent voice interaction devices, smart home appliances, vehicle-mounted terminals, aircraft, etc.
- Embodiments of the present invention can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, digital humans, virtual humans, games, virtual reality, extended reality (XR, Extended Reality), etc.
- The above animation data generation device can also be a server, that is, the query features can be generated on the server and the animation data can be generated according to the pre-trained neural network.
- the animation data generation method provided in the embodiments of this application can also be implemented jointly by a terminal device and a server.
- Figure 2 is a scene architecture diagram for implementing a method for generating animation data provided by an embodiment of the present application.
- the implementation scenario of the solution is introduced below with reference to Figure 2.
- terminal devices and servers are involved.
- the running data of the virtual scene can be extracted on the terminal device to generate the query features of the virtual objects in the virtual scene
- the weight data of the neural network can be retrieved from the server
- the animation data of the virtual object can be generated on the terminal device based on the neural network.
- Alternatively, the query features of the virtual objects in the virtual scene can be generated in the server based on the running data of the virtual scene, the query features are sent to the terminal device, and the neural network is then used on the terminal device to generate the animation data.
- Therefore, in the embodiments of this application, there is no limitation on the entity that implements the technical solution of this application.
- the server shown in Figure 2 can be an independent physical server, or a server cluster or distributed system composed of multiple physical servers.
- The server can also be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
- FIG 3 is a flow chart of an animation data generation method provided by an embodiment of the present application. The following uses the terminal device as the execution subject to introduce the specific implementation of this method.
- the animation data generation method shown in Figure 3 includes:
- S301 Generate query features of virtual objects in the virtual scene according to the running data of the virtual scene.
- When the player controls a virtual object, the virtual object needs to display corresponding animation effects according to the player's control. For example, while a virtual object is performing a walking action, if the player controls the virtual object to perform a squatting action, the animation of the virtual object performing the squatting action needs to be displayed in the virtual scene (i.e., the game scene).
- the animation data of this squatting action needs to be generated through the technical solution provided by this application.
- the technical solution provided by this application first needs to generate the query features of the virtual object, which can be used as input to the neural network in subsequent steps to finally generate animation data.
- Query features may include trajectory features and skeletal features of virtual objects.
- the so-called trajectory features may refer to features related to the trajectory of virtual objects in the virtual scene.
- the trajectory characteristics are characteristics of the virtual object as a whole.
- bone features are features from the individual bones of the virtual object.
- the trajectory features in the query features may include: trajectory speed and trajectory direction.
- trajectory features can also include trajectory point locations.
- the bone features in the query features may include left foot bone position information, left foot bone rotation information, right foot bone position information, and right foot bone rotation information.
- the bone characteristics can also include left foot bone speed and right foot bone speed.
- the trajectory referred to in the trajectory feature may refer to the trajectory of the root joint of the virtual object. It may be a path formed based on the projection of the virtual object's hip bone on the ground. If the virtual object is a humanoid character, the generation method is to project the hip bone information of the humanoid skeleton onto the ground, so that multiple animation frames are connected to form the trajectory point information of the virtual object.
- the ground referred to here may specifically be the ground in the virtual scene coordinate system.
- The skeletal features of the feet are included in the query features because the feet are an important part of the human body for representing posture; the position, rotation and other information of the foot bones help the neural network generate matching animations.
- Trajectory features and bone features are used as query features to characterize the virtual object from two aspects: the object as a whole and its individual bones. Combining these two types of features is beneficial to accurately generating animation data and ensuring that the generated animation data realistically depicts the display effect of the virtual object's movements.
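- As an illustrative sketch, the trajectory and foot-bone features described above could be packed into a single query vector as follows; the function name, array shapes, and feature ordering are assumptions rather than details fixed by this application.

```python
import numpy as np

def build_query_feature(traj_speed, traj_dir, traj_points,
                        lfoot_pos, lfoot_rot, rfoot_pos, rfoot_rot):
    """Pack trajectory features and foot-bone features into one query vector.
    Illustrative shapes: traj_speed (T,), traj_dir (T, 2), traj_points (T, 2),
    foot positions (3,), foot rotations as quaternions (4,)."""
    parts = [traj_speed, traj_dir, traj_points,
             lfoot_pos, lfoot_rot, rfoot_pos, rfoot_rot]
    return np.concatenate([np.asarray(p, dtype=np.float32).ravel() for p in parts])
```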
- query features of virtual objects in the virtual scene are generated based on the running data of the virtual scene, which may include:
- the action control signal for the virtual object is extracted from the running data of the virtual scene. Then, based on the control parameters in the action control signal and the historical control parameters in the historical action control signals for the virtual object, the trajectory features and skeletal features of the virtual object are generated.
- The character's walking and running mainly depend on the player's input. If the player wants the character to run, the corresponding action control signal is input through the keyboard or gamepad, and the animation engine then calculates a reasonable motion based on the action control signal.
- the running speed is used as the trajectory feature.
- the historical control parameters in the previous historical action control signals can be combined during calculation.
- the control parameters may include action types (running, jumping, walking, etc.).
- the character attributes of virtual objects can also be combined to generate their trajectory features and skeletal features. For example, different character attributes have different maximum and minimum speed values.
- The historical action control signal may be an action control signal received before the most recently received action control signal; for example, it may be the action control signal received immediately before the most recent one, or an action control signal received within a set time before the most recent one.
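- As an illustrative sketch of combining the latest control parameters with historical control parameters, the snippet below blends a short history of desired speed and direction into smooth trajectory features; the exponential blend and the parameter names are assumptions.

```python
import numpy as np

def estimate_trajectory(control_history, blend=0.2):
    """Blend historical and latest action-control parameters (oldest first)
    into a smooth trajectory speed and direction. `control_history` is a list
    of dicts with 'speed' (float) and 'direction' (2-vector); the blend factor
    is illustrative."""
    speed, direction = 0.0, np.array([0.0, 1.0])
    for ctrl in control_history:
        speed = (1 - blend) * speed + blend * ctrl['speed']
        direction = (1 - blend) * direction + blend * np.asarray(ctrl['direction'], dtype=float)
    return speed, direction / (np.linalg.norm(direction) + 1e-8)

# Example: the player was walking forward and the latest input asks to run to the right.
history = [{'speed': 1.5, 'direction': [0, 1]},
           {'speed': 1.5, 'direction': [0, 1]},
           {'speed': 4.0, 'direction': [1, 0]}]
print(estimate_trajectory(history))
```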
- S302 Based on the trajectory features and skeletal features of the virtual object, increase the feature dimensions of the virtual object through the feature generation network in the neural network to obtain the combined features of the virtual object.
- Figure 4 is a schematic structural diagram of a neural network provided by an embodiment of the present application.
- the network structure illustrated in this figure includes a feature generation network and an animation generation network.
- the feature generation network can be used to increase the feature dimensions of the virtual object, that is, through the feature generation network, the query feature dimensions of the virtual object can be enriched.
- For example, the query features input to the feature generation network include trajectory speed, trajectory direction, left foot bone position information, left foot bone rotation information, right foot bone position information, and right foot bone rotation information; after processing by the feature generation network, the output features not only include the feature information in the input query features, but also include other auxiliary features that help to accurately generate animation data.
- the method of obtaining auxiliary features will be explained in more detail later.
- The higher-dimensional features obtained through feature generation network processing are called combined features. Since the combined features are obtained based on the input trajectory features and bone features, it can be understood that the combined features output by the feature generation network match the query features input to it and can be used as the combined features of the virtual object for generating animation data.
- S303 Based on the combination characteristics of the virtual object, generate animation data of the virtual object through the animation generation network in the neural network.
- the network structure shown in Figure 4 also includes an animation generation network.
- the function of this network is to generate animation data of virtual objects based on the combined features of the virtual objects input into it.
- the combined features of the virtual object in each frame of the animation engine can be used as input to the animation generation network to generate animation data for that frame.
- A coherent animation is formed in time sequence from the animation data of each frame of the animation engine. Therefore, when the neural network needs to meet the above functional requirements, the output of the feature generation network can be directly used as the input of the animation generation network during training and use.
- Since the pre-trained neural network has the functions of increasing the feature dimensions of virtual objects based on query features and generating animation data of virtual objects based on the high-dimensional features, it can satisfy the requirements for animation data generation.
- With neural networks, when generating animation data it is no longer necessary to use the previous action matching technology of storing massive data in memory and querying for matching animations; using the neural network only requires the weight data related to the neural network to be stored in advance, so the implementation of the entire solution has a low memory footprint, thus avoiding the problems of high memory usage and poor query performance when generating animation data. Therefore, the solutions of the embodiments of the present application can be better applied and developed in animation engines.
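- As an illustrative runtime sketch of the Figure 4 structure, one inference step could look as follows; the PyTorch usage and the network variable names are assumptions, and only the two-network structure (feature generation network followed by animation generation network) is taken from the description above.

```python
import torch

@torch.no_grad()
def generate_frame(query_feature, feature_gen_net, anim_gen_net):
    """One inference step: expand the query feature (trajectory + foot-bone
    features) into a higher-dimensional combined feature, then decode it into
    animation data for the current frame."""
    x = torch.as_tensor(query_feature, dtype=torch.float32).unsqueeze(0)
    combined = feature_gen_net(x)       # query features -> combined features
    anim_data = anim_gen_net(combined)  # combined features -> animation data
    return anim_data.squeeze(0)
```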
- In order to improve performance and reduce animation jitter when the solution runs, the feature generation network may not run every frame.
- In a possible implementation, the execution of S302 requires certain conditions. For example, when the changes in the trajectory features and skeletal features of the virtual object meet a first preset condition, and/or the time interval since the feature generation network last output combined features meets a second preset condition, the combined features of the virtual object are output through the feature generation network based on the latest input trajectory features and skeletal features of the virtual object.
- That is to say, in this possible implementation, the operation of the feature generation network needs to meet prerequisites, which may be conditions related to feature changes (such as the first preset condition), conditions related to its running time interval (such as the second preset condition), or a combination of these two types of conditions.
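- As an illustrative sketch of these prerequisites, the check below re-runs the feature generation network only when the query features changed enough (first preset condition) or enough time has passed since the last run (second preset condition); the thresholds are assumptions.

```python
import numpy as np

def should_run_feature_generation(query, last_query, time_since_last_run,
                                  change_threshold=0.1, max_interval=0.2):
    """Return True when the feature generation network should run this frame."""
    changed_enough = np.linalg.norm(np.asarray(query) - np.asarray(last_query)) > change_threshold
    waited_enough = time_since_last_run > max_interval
    return changed_enough or waited_enough
```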
- FIG. 5A is a schematic structural diagram of another neural network provided by an embodiment of the present application. Compared with the network structure shown in Figure 4, the neural network shown in Figure 5A also includes an additional feature update network. In the structure shown in Figure 5A, the output of the feature generation network is used as the input of the feature update network; the output of the feature update network is used as the input of the animation generation network. When the feature generation network is not running, the feature update network is used to drive the next frame of animation generation to ensure smooth and continuous animation.
- Figure 5B is a flow chart of another animation data generation method provided by an embodiment of the present application.
- the neural network structure used in the method shown in this figure is consistent with the neural network structure shown in Figure 5A. That is, the neural network includes a feature generation network, a feature update network, and an animation generation network.
- the animation data generation method shown in Figure 5B includes:
- S501 Generate query features of virtual objects in the virtual scene based on the running data of the virtual scene.
- S502 Based on the trajectory features and skeletal features of the virtual object, increase the feature dimensions of the virtual object through the feature generation network in the neural network to obtain the combined features of the virtual object.
- the implementation manner of S501-S502 is basically the same as the implementation manner of S301-S302 in the previous embodiment. Therefore, the relevant introduction can refer to the embodiment provided above, and will not be described again here.
- the combined features of the virtual object generated by the feature generation network at runtime may be the combined features of the virtual object in the current frame.
- S503 embodies the function of the feature update network in the neural network shown in Figure 5A.
- The implementation of S503 can be that, based on the combined features of the virtual object in the current frame output by the feature generation network and the inter-frame difference of the animation engine of the virtual scene, the feature update network outputs the combined features of the virtual object in the next frame of the current frame.
- the inter-frame difference (deltaTime) can refer to the time difference between two updates of the animation logic thread of the animation engine. Generally speaking, it is close to the game update time. For example, if the game update rate is 60 frames per second, then deltaTime is 1/60 second.
- the feature update network can obtain the combined features of the same dimension of the virtual object in the next frame from the combined features of the current frame. That is, the feature update network realizes the update of the combined features of the virtual object in adjacent frames, and updates the combined features of the next frame based on the combined features of the previous frame. In this way, when the feature generation network is not working all the time, the function of the feature update network can be used to achieve the continuity and smoothness of the animation data output by the subsequent animation generation network.
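- As an illustrative sketch of a frame in which the feature generation network is skipped, the feature update network below advances the combined features by one frame conditioned on the engine's inter-frame difference; feeding deltaTime as an extra input and the variable names are assumptions.

```python
import torch

@torch.no_grad()
def advance_one_frame(combined, feature_update_net, anim_gen_net, delta_time):
    """Advance the current frame's combined features to the next frame and
    decode that frame's animation data."""
    dt = torch.full((combined.shape[0], 1), float(delta_time), device=combined.device)
    next_combined = feature_update_net(torch.cat([combined, dt], dim=-1))
    return next_combined, anim_gen_net(next_combined)
```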
- S504 Based on the combined characteristics of the virtual object in the next frame of the current frame, generate animation data of the virtual object through the animation generation network in the neural network.
- the animation generation network since the output of the feature update network is used as the input of the animation generation network, the animation generation network directly based on the combination of the next frame input therein Features to generate animation data and output.
- the feature generation network is not run every frame, thereby improving the performance of the solution and reducing animation jitter.
- the animation can be guaranteed to be coherent and smooth.
- Figure 6 is a schematic structural diagram of a feature generation network provided by an embodiment of the present application.
- Figure 7 is a schematic structural diagram of a feature update network provided by an embodiment of the present application.
- Figure 8 is a schematic structural diagram of an animation generation network provided by an embodiment of the present application.
- the structure of the feature generation network is a six-layer fully connected network with four hidden layers, and the number of units in each hidden layer is 512.
- the feature update network is a four-layer fully connected network with two hidden layers, and the number of units in each hidden layer is 512.
- the animation generation network is a three-layer fully connected network with one hidden layer. The number of units in each hidden layer is 512.
- The above three networks may also contain other numbers of hidden layers, or the hidden layers may contain other numbers of units. Therefore, the 6-layer, 4-layer and 3-layer network structure and the 512 units per hidden layer are only one implementation and are not limited here.
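- As an illustrative sketch of the three fully connected networks described above, the definitions below follow the stated layer counts (counted including the input layer) and the 512-unit hidden layers; the input/output dimensions and the extra deltaTime input of the feature update network are assumptions.

```python
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=512, n_hidden=4):
    """Fully connected network with n_hidden hidden layers of `hidden` units."""
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

# Illustrative dimensions only; the application does not fix them.
QUERY_DIM, COMBINED_DIM, ANIM_DIM = 24, 64, 256

feature_gen_net    = mlp(QUERY_DIM,        COMBINED_DIM, n_hidden=4)  # "6-layer" network
feature_update_net = mlp(COMBINED_DIM + 1, COMBINED_DIM, n_hidden=2)  # "4-layer" network (+deltaTime)
anim_gen_net       = mlp(COMBINED_DIM,     ANIM_DIM,     n_hidden=1)  # "3-layer" network
```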
- FIG. 9A is a training flow chart of a neural network provided by an embodiment of the present application. As shown in Figure 9A, training the neural network includes the following steps:
- motion capture technology has been introduced before. It is a relatively mature technology currently used in film production, animation production, game development and other fields. In the embodiments of this application, this technology is used to obtain motion capture data of the human body in real scenes. As an example, in order to improve the accuracy of training, this step can be achieved in the following ways:
- An action subject (usually a person, such as an actor, or an animal) moves according to a preset motion capture route and performs preset actions in a real scene, and the action subject is motion captured to obtain initial motion capture data.
- Process the initial motion capture data through at least one of the following pre-processing methods to obtain processed motion capture data: noise reduction, data expansion, or generating data in a coordinate system of an animation engine adapted to the virtual scene.
- The processed motion capture data can generally be used directly in subsequent S902.
- Preprocessing the initial motion capture data through at least one of noise reduction and data expansion is beneficial: noise reduction can improve the quality of the motion capture data, and data expansion can increase the amount of motion capture data, thus providing massive data support for training the neural network. Therefore, the training effect can be improved through the above preprocessing methods.
- The collection equipment that collects motion capture data may introduce signal noise, so noise reduction measures can be applied to the initial motion capture data.
- For example, a Savitzky-Golay (SG) smoothing filter can be used to process the initial motion capture data.
- Figure 9B and Figure 9C are schematic diagrams of root bone trajectories before and after noise reduction. Combining Figure 9B and Figure 9C, it is not difficult to find that after noise reduction, the motion capture data taking the root skeleton trajectory as an example becomes less noisy and the trajectory is smoother.
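- As an illustrative sketch of the SG-filter noise reduction applied to the root skeleton trajectory, the snippet below uses SciPy's Savitzky-Golay filter; the window length and polynomial order are assumptions that would be tuned to the capture frame rate.

```python
import numpy as np
from scipy.signal import savgol_filter

# Stand-in for the captured root-bone positions, one row per mocap frame.
root_trajectory = np.cumsum(np.random.randn(300, 3), axis=0)

# Smooth each coordinate along the time axis.
denoised = savgol_filter(root_trajectory, window_length=31, polyorder=3, axis=0)
```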
- the amount of initial motion capture data is small.
- the initial motion capture data can be expanded.
- the expansion method may include data expansion of the initial motion capture data through a mirroring method, and/or data expansion of the initial motion capture data through scaling of the timeline.
- the mirroring method can mirror left walking into right walking in motion capture, and right walking into left walking, thereby increasing the amount of data in each mode.
- For example, the motion capture data may only contain data of the action subject walking once, in which the left foot moves forward first and then the right foot. To expand the data set with, for example, motion in which the right foot moves forward first and then the left foot, the mirroring method is used for expansion.
- the way to expand data by scaling the timeline is to expand the data by increasing or decreasing the trajectory speed.
- This method mainly adjusts the speed in the animation data to simulate and generate motion capture data at different action speeds.
- For example, suppose the initial motion capture data is a walking motion in which a 100-meter path is completed in 30 seconds.
- Stretching the timeline, for example to twice its length, transforms the original motion capture data into a walking motion that completes the 100-meter path in 60 seconds. It can be seen that stretching the timeline reduces the action speed of the execution subject corresponding to the data; similarly, shortening the timeline increases the action speed.
- For example, shortening the timeline to half its length transforms the original motion capture data into a walking motion that completes the 100-meter path in 15 seconds.
- When the timeline is stretched, the extra frames are filled in by linear interpolation.
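- As an illustrative sketch of timeline scaling, the function below stretches or compresses the frame sequence and fills the in-between frames by linear interpolation; the frame layout is an assumption.

```python
import numpy as np

def scale_timeline(frames, factor):
    """Resample motion-capture frames. frames: (N, D) per-frame pose data;
    factor > 1 stretches the timeline (slower motion, e.g. 30 s -> 60 s),
    factor < 1 compresses it (faster motion). New frames are obtained by
    linear interpolation along the time axis."""
    n = frames.shape[0]
    new_n = max(2, int(round(n * factor)))
    old_t = np.linspace(0.0, 1.0, n)
    new_t = np.linspace(0.0, 1.0, new_n)
    return np.stack([np.interp(new_t, old_t, frames[:, d])
                     for d in range(frames.shape[1])], axis=1)
```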
- Correspondingly, when the timeline is shortened, the data can be sampled at regular intervals along the time series.
- Data in the coordinate system of the animation engine adapted to the virtual scene can also be generated based on the initial motion capture data.
- the basic database for training neural networks can be constructed.
- For example, if the initial motion capture data is in a right-handed coordinate system and the animation engine uses a left-handed coordinate system (with the Z axis as the up direction), the data can be converted according to the relationship between the two coordinate systems to generate motion capture data in the coordinate system of the animation engine.
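- As an illustrative sketch of such a conversion (positions only), one axis can be mirrored to move from a right-handed capture coordinate system to a left-handed engine coordinate system; which axis to flip, and how rotations are handled, depends on the specific engine convention and is an assumption here.

```python
import numpy as np

def to_engine_coords(positions):
    """Convert right-handed mocap positions to an assumed left-handed engine
    coordinate system by negating one axis (illustratively, the X axis)."""
    converted = np.asarray(positions, dtype=np.float32).copy()
    converted[..., 0] *= -1.0
    return converted
```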
- the initial motion capture data can be processed through at least one of the following pre-processing methods to obtain processed motion capture data: noise reduction, data expansion or generation adapted to the virtual scene Data in the coordinate system of the animation engine.
- S902 Obtain the root motion data, skeletal posture information and basic query features of the action subject according to the motion capture data.
- Basic query features can include trajectory features and skeletal features of the action subject.
- the basic query features here are consistent with the data type of the query features that need to be input to the feature generation network after the neural network is trained.
- The trajectory features in the basic query features can be generated based on the movement direction and position of the action subject; the bone features in the basic query features can be obtained based on the motion information of the feet of the action subject.
- In addition to the basic query features, the root motion data and skeletal posture information of the action subject obtained from the motion capture data in this step are information that helps train the feature generation network and increase the dimensions of the query features.
- S903 Extract the feature value of the action subject from the root motion data and skeletal posture information of the action subject, and use the feature value as an auxiliary query feature.
- S903 can be completed by another trained deep learning network.
- the function of this neural network is to extract feature values as auxiliary query features.
- the features referred to in the embodiments of this application such as query features, basic query features, auxiliary query features, combined features, etc., can all be represented by feature vectors.
- the vector representation of auxiliary query features can also be called auxiliary vectors.
- The auxiliary vector is a numerical vector generated by the deep learning network that executes S903.
- the dimensions of the vector are consistent with the feature dimensions.
- Figure 10A is a schematic structural diagram of a deep learning network capable of extracting auxiliary query features provided by an embodiment of the present application.
- the deep learning network shown in Figure 10A can be a five-layer fully connected network with three hidden layers. After passing through each hidden layer, the low-dimensional feature vector representing the input data is gradually obtained. The final output is the auxiliary vector that needs to be used together with the vector representation of the basic query feature to train the feature generation network.
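- As an illustrative sketch of the five-layer fully connected network in Figure 10A (layers counted including the input layer), the extractor below progressively reduces the dimension of the root motion data and skeletal posture information down to a low-dimensional auxiliary vector; all dimensions are assumptions.

```python
import torch.nn as nn

POSE_DIM, AUX_DIM = 400, 16   # illustrative input/output dimensions

auxiliary_extractor = nn.Sequential(   # three hidden layers of decreasing width
    nn.Linear(POSE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 64),       nn.ReLU(),
    nn.Linear(64, 32),        nn.ReLU(),
    nn.Linear(32, AUX_DIM),
)
```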
- S904 Obtain the combined features of the moving subject based on the trajectory features of the action subject, the skeletal features of the action subject and the auxiliary query features.
- the basic query features i.e., the trajectory features of the action subject and the skeletal features of the action subject
- the auxiliary query features add dimensions to the query features based on the basic query features.
- the function of the feature generation network is to add feature dimensions to the query features. Therefore, in the embodiment of the present application, the feature generation network in the neural network can be trained by using basic query features and combined features as a set of training data. Among them, the basic query features are used as the input of the feature generation network in the training stage, and the combined features of the moving subject are used as the output results for the aforementioned inputs. See S905 below.
- S905 Use the trajectory characteristics of the action subject, the skeletal features of the action subject, and the combined features of the movement subject to train the feature generation network in the neural network.
- the training cutoff conditions for the feature generation network can be set.
- The number of training iterations and/or the value of the loss function can be used to determine whether to stop training.
- training cutoff conditions can also be set for the training of feature update networks and animation generation networks.
- the process of training the neural network is performed in sequence, first training the feature generation network, then training the feature update network, and finally training the animation generation network. By training the above networks, the performance of each network after training can be guaranteed as much as possible.
- the process of training the feature generation network and animation generation network please refer to S906 and S907 below.
- S906 After the feature generation network is trained, use the combined features of the current frame output by the feature generation network and the combined features of the action subject in the next frame to train the feature update network in the neural network.
- the combined features of the action subject in the next frame are obtained based on the motion capture data of the action subject.
- During this training, the combined features of the action subject in the next frame serve as the expected output of the feature update network being trained, and the combined features of the current frame output by the feature generation network serve as the input of the feature update network being trained.
- the animation generation network is trained using the root motion data and skeletal posture information of the action subject and the combined features of the action subject in the next frame output by the feature generation network.
- During this training, the root motion data and skeletal posture information of the action subject serve as the expected output of the animation generation network being trained, and the combined features of the action subject in the next frame output by the feature generation network serve as the input of the animation generation network being trained.
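- As an illustrative sketch of the sequential training described in S905 to S907, the loop below trains each network in turn on supervised (input, target) pairs built from the motion capture data; the MSE loss, the optimizer, and the placeholder batch names are assumptions.

```python
import torch
import torch.nn.functional as F

def train_net(net, batches, epochs=100, lr=1e-3):
    """Generic supervised stage: `batches` is a reusable collection of
    (input, target) tensor pairs derived from the motion capture data."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in batches:
            loss = F.mse_loss(net(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net

# Stage 1 (S905): basic query features -> combined features of the subject.
# Stage 2 (S906): current-frame combined features -> next-frame combined features.
# Stage 3 (S907): next-frame combined features -> root motion data + skeletal posture.
# train_net(feature_gen_net, query_to_combined_batches)
# train_net(feature_update_net, combined_to_next_combined_batches)
# train_net(anim_gen_net, combined_to_pose_batches)
```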
- Table 1 compares the amount of storage that the traditional action matching method needs to occupy for each content and the amount of storage that the animation generation method provided by the embodiment of the present application needs to occupy for each content.
- FIG. 10B is a schematic diagram of the animation effect obtained by the traditional action matching method and the animation data generation method provided by the embodiment of the present application.
- the humanoid animation on the left is obtained by the traditional action matching method, and the humanoid animation on the right is obtained through the technical solution of this application.
- Combining the animation renderings on the left and right sides of Figure 10B it is easy to find that the animation effect finally obtained by the technical solution of this application is very close to the animation effect obtained by the action matching method. That is, better results are achieved and the needs for animation data generation are met.
- The improvement in storage performance makes the game run more smoothly and the animation look smoother. It also provides more storage margin to support improvements in other aspects, such as further improving game image quality, storing more user game data, and adding richer virtual-character-related data or scene data, which further enhances the players' gaming experience.
- a certain game is run on a terminal device, and players operate in real time, using the mouse and keyboard to control virtual objects to run, jump, dodge and other actions in the game scene.
- the virtual object controlled by the player needs to make a jumping action in the virtual scene.
- the virtual object controlled by the player needs to run in the virtual scene.
- The terminal device can determine the player's action control intention through the control parameters and historical control parameters in the action control signal input by the player using the mouse and/or keyboard, and obtain the query features of the virtual object through calculation.
- the terminal device communicates with the remote server to call the neural network.
- As for the weight data of the neural network, it is stored locally on the terminal device.
- the terminal device takes the query features as input to the neural network.
- the neural network is pre-trained in the server based on motion capture data of some real scenes. Therefore, in fact, the terminal device can store the weight data of the neural network locally or retrieve the weight data of the neural network from the server and store it locally.
- the terminal device uses some rendering methods of the animation engine to render the animation data of the virtual object into an animation effect visible to the player in the game scene displayed on the terminal device.
- the virtual object controlled by the player jumps in the virtual scene displayed on the screen of the terminal device.
- The animation shows the changing posture of the virtual object during the jump and, unlike other postures, both feet leaving the ground.
- The virtual object controlled by the player assumes a running posture in the virtual scene displayed on the screen of the terminal device, swings its arms back and forth regularly, and displays alternating leg movements that go beyond the walking posture. From the moment the player starts to issue controls on the terminal device to the moment the corresponding animation effect is displayed in the virtual scene, the elapsed time is very short, and the other screen content of the game is not affected by the control instructions, so no stuttering or regional mosaic effects are produced.
- FIG 11 is a schematic structural diagram of an animation data generating device provided by an embodiment of the present application. As shown in Figure 11, the animation data generation device includes:
- Query feature generation unit 111 configured to generate query features of virtual objects in the virtual scene according to the operating data of the virtual scene; the query features include trajectory features and skeletal features of the virtual object;
- The combined feature generation unit 112 is configured to increase the feature dimensions of the virtual object through a feature generation network in a neural network based on the trajectory features and skeletal features of the virtual object, to obtain the combined features of the virtual object, where the neural network is pre-trained;
- the animation data generation unit 113 is configured to generate animation data of the virtual object through the animation generation network in the neural network based on the combined characteristics of the virtual object.
- Since the pre-trained neural network has the functions of adding feature dimensions of virtual objects based on query features and generating animation data of the virtual objects based on high-dimensional features, it can meet the demand for generating animation data.
- With neural networks, when generating animation data it is no longer necessary to use the previous action matching technology of storing massive data in memory and querying for matching animations; using the neural network only requires the weight data related to the neural network to be stored in advance, so the implementation of the entire solution has a low memory footprint, thus avoiding the problems of high memory usage and poor query performance when generating animation data.
- Figure 12 is a schematic structural diagram of another animation data generating device provided by an embodiment of the present application.
- the combined features of the virtual object are the combined features of the virtual object in the current frame.
- the animation data generating unit 113 specifically includes:
- a combined feature update subunit configured to output the combined feature of the virtual object in the current frame based on the feature generation network and output the combined feature of the virtual object in the next frame of the current frame through the feature update network in the neural network.
- An animation data generation subunit is configured to generate animation data of the virtual object through the animation generation network in the neural network based on the combined characteristics of the virtual object in the next frame of the current frame.
- the combined feature generating unit 112 is specifically used to:
- When the changes in the trajectory features and skeletal features of the virtual object meet the first preset condition and/or the time interval since the feature generation network last output combined features meets the second preset condition, output the combined features of the virtual object through the feature generation network according to the latest input trajectory features and skeletal features of the virtual object.
- the combined feature update subunit is specifically used for:
- the combined features of the virtual object in the next frame of the current frame are output through the feature update network.
- the animation data generating device may also include a network training unit for obtaining a neural network through training.
- the network training unit specifically includes:
- the motion capture data acquisition subunit is used to acquire motion capture data of real scenes
- a data analysis subunit configured to obtain the root motion data, skeletal posture information and basic query features of the action subject according to the motion capture data;
- the basic query features include the trajectory features and skeletal features of the action subject;
- a feature value extraction subunit is used to extract the feature value of the action subject from the root motion data and skeletal posture information of the action subject, and use the feature value as an auxiliary query feature;
- Feature combination subunit used to obtain the combined features of the moving subject based on the trajectory features of the action subject, the skeletal features of the action subject and the auxiliary query features;
- the first training subunit is used to train the feature generation network in the neural network using the trajectory characteristics of the action subject, the skeletal features of the action subject, and the combined features of the movement subject;
- The second training subunit is used to, after the feature generation network is trained, train the feature update network in the neural network using the combined features of the current frame output by the feature generation network and the combined features of the action subject in the next frame, where the combined features of the action subject in the next frame are obtained based on the motion capture data of the action subject;
- The third training subunit is used to, after the feature update network is trained, train the animation generation network using the root motion data and skeletal posture information of the action subject and the combined features of the action subject in the next frame output by the feature generation network.
- the motion capture data acquisition subunit is specifically used for:
- the initial motion capture data is processed through at least one of the following pre-processing methods to obtain processed motion capture data:
- data expansion methods may include but are not limited to:
- performing data expansion on the initial motion capture data through a mirroring method; and/or performing data expansion on the initial motion capture data by scaling the timeline.
- the query feature generation unit 111 includes:
- a signal extraction subunit used to extract action control signals for the virtual object from the operating data of the virtual scene
- a feature generation subunit configured to generate trajectory features and skeletal features of the virtual object based on control parameters in the action control signal and historical control parameters in historical action control signals for the virtual object.
- the trajectory features include trajectory speed and trajectory direction;
- the skeletal features include left foot skeletal position information, left foot skeletal rotation information, right foot skeletal position information and right foot skeletal rotation information; the trajectory is formed based on the projection of the hip bone (an illustrative assembly of these features is sketched below).
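To make the composition of the query feature concrete, the sketch below assembles one query feature vector from the quantities listed above. The encoding (quaternions for rotations, speed and direction derived from the last two projected hip positions, the flattening order) is an assumption for illustration only; the disclosure does not fix a particular layout.

```python
import numpy as np

def build_query_features(hip_positions_xy, dt,
                         left_foot_pos, left_foot_rot,
                         right_foot_pos, right_foot_rot):
    """Assemble trajectory features and skeletal features into one vector.

    hip_positions_xy: (num_frames, 2) ground-plane projection of the hip bone,
    from which trajectory speed and direction are derived.
    *_pos: (3,) foot bone positions; *_rot: (4,) foot bone rotations (quaternions).
    """
    # Trajectory features: speed and direction from the projected hip path.
    delta = hip_positions_xy[-1] - hip_positions_xy[-2]
    speed = np.linalg.norm(delta) / dt
    direction = delta / (np.linalg.norm(delta) + 1e-8)

    # Skeletal features: positions and rotations of the left and right feet.
    skeletal = np.concatenate([left_foot_pos, left_foot_rot,
                               right_foot_pos, right_foot_rot])
    return np.concatenate([[speed], direction, skeletal])
```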
- the following introduces the structure of the animation data generation device, first in the form of a server and then in the form of a terminal device.
- FIG. 13 is a schematic structural diagram of a server provided by an embodiment of the present application.
- the server 900 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPUs) 922 (for example, one or more processors), memory 932, and one or more storage media 930 (for example, one or more mass storage devices) storing applications 942 or data 944.
- the memory 932 and the storage medium 930 may be short-term storage or persistent storage.
- the program stored in the storage medium 930 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server.
- the central processor 922 may be configured to communicate with the storage medium 930 and execute a series of instruction operations in the storage medium 930 on the server 900 .
- Server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
- CPU 922 is used to perform the following steps:
- generating query features of a virtual object in a virtual scene according to the running data of the virtual scene; the query features include trajectory features and skeletal features of the virtual object;
- based on the trajectory features and skeletal features of the virtual object, increasing the feature dimensions of the virtual object through a feature generation network in a neural network to obtain the combined features of the virtual object, the neural network being pre-trained;
- based on the combined features of the virtual object, generating animation data of the virtual object through an animation generation network in the neural network (a pipeline sketch follows).
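Read as a pipeline, these steps amount to two network calls per update. The sketch below assumes the networks are callable PyTorch modules and that `extract_query_features` is a hypothetical helper that turns the running data of the virtual scene into the trajectory-plus-skeletal feature tensor; it illustrates the data flow only, not the claimed implementation.

```python
import torch

def generate_animation_frame(feature_gen_net, anim_gen_net,
                             scene_run_data, extract_query_features):
    """One update: query features -> combined features -> animation data."""
    query = extract_query_features(scene_run_data)   # trajectory + skeletal features
    with torch.no_grad():
        combined = feature_gen_net(query)            # feature dimension increased
        animation = anim_gen_net(combined)           # pose / root motion output
    return animation
```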
- the embodiment of the present application also provides another animation data generating device.
- the animation data generating device may be a terminal device, as shown in Figure 14.
- the terminal device may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, a vehicle-mounted computer, and so on; the following takes a mobile phone as an example of the terminal device:
- FIG. 14 shows a block diagram of a partial structure of a mobile phone related to the terminal device provided by the embodiment of the present application.
- the mobile phone includes components such as a radio frequency (RF) circuit 1010, a memory 1020, an input unit 1030, a display unit 1040, a sensor 1050, an audio circuit 1060, a wireless fidelity (WiFi) module 1070, a processor 1080, and a power supply 1090.
- the input unit 1030 may include a touch panel 1031 and other input devices 1032
- the display unit 1040 may include a display panel 1041
- the audio circuit 1060 may include a speaker 1061 and a microphone 1062.
- the structure of the mobile phone shown in FIG. 14 does not constitute a limitation on the mobile phone, and may include more or fewer components than shown in the figure, or combine certain components, or arrange different components.
- the memory 1020 can be used to store software programs and modules.
- the processor 1080 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1020 .
- the memory 1020 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the like; the data storage area may store data created according to the use of the mobile phone (such as audio data, a phone book, etc.), and the like.
- memory 1020 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
- the processor 1080 is the control center of the mobile phone; it uses various interfaces and lines to connect the various parts of the entire mobile phone, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 1020 and calling the data stored in the memory 1020, thereby performing overall data and information collection for the mobile phone.
- the processor 1080 may include one or more processing units; preferably, the processor 1080 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the above modem processor may alternatively not be integrated into the processor 1080.
- the processor 1080 included in the terminal also has the following functions:
- generating query features of a virtual object in a virtual scene according to the running data of the virtual scene; the query features include trajectory features and skeletal features of the virtual object;
- based on the trajectory features and skeletal features of the virtual object, increasing the feature dimensions of the virtual object through a feature generation network in a neural network to obtain the combined features of the virtual object, the neural network being pre-trained;
- based on the combined features of the virtual object, generating animation data of the virtual object through an animation generation network in the neural network.
- Embodiments of the present application also provide a computer-readable storage medium for storing a computer program.
- when the computer program is executed by an animation data generation device, any one of the animation data generation methods described in the foregoing embodiments can be implemented.
- Embodiments of the present application also provide a computer program product including instructions.
- the computer program product includes a computer program that, when run on a computer, causes the computer to execute any implementation of the animation data generation methods described in the foregoing embodiments.
- the systems described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
- the above integrated units can be implemented in the form of hardware or software functional units.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
- based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
- the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Processing Or Creating Images (AREA)
Abstract
The present application discloses an animation data generation method and apparatus, and related products, applicable to scenarios such as cloud technology, artificial intelligence, intelligent transportation, assisted driving, digital humans, virtual humans, games, virtual reality, and extended reality. Trajectory features and skeletal features of a virtual object in a virtual scene are acquired, and animation data of the virtual object is generated on the basis of these features through a trained neural network. The neural network increases the feature dimensions of the virtual object on top of the query features and generates the animation data of the virtual object based on the high-dimensional features, meeting the requirements for animation data generation. With the neural network, generating animation data no longer requires storing massive amounts of data in memory and searching them for a matching animation; only the weight data related to the neural network needs to be stored in advance, so memory usage is low, avoiding the problems of high memory usage and poor query performance when generating animation data.
Description
本申请要求于2022年7月15日提交中国专利局、申请号202210832558.0、申请名称为“一种基于神经网络的动画数据生成方法、装置及相关产品”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本申请涉及人工智能技术领域,尤其涉及动画数据生成技术。
游戏开发和动画制作等场景中,为了增强观看者的体验,往往需要为虚拟对象的动作设计出灵活、逼真的动画效果。以游戏场景来说,对于奔跑、跳跃、下蹲、空闲时轻微的呼吸或摇摆、跌落时恐慌地抬起手臂等动作,如果能够进行灵活、逼真的动画效果展示,可以丰富玩家的视觉体验,增强玩家在游戏中的交互感。
为了制作出具有不同动作的虚拟对象的动画,相关技术提供了动作匹配(Motion Matching)技术。动作匹配技术可以从海量的动画中选择最匹配的一个动画帧播放,从而得到具有不同动作的虚拟对象的动画。
但是动作匹配技术进行动画数据的驱动时,在运行期间要将海量的数据存储在内存中,并且还需要在这些海量数据中进行动作匹配,导致了内存占用量高、查询性能不佳的问题。这一问题限制了动作匹配技术在动画引擎中的发展。
发明内容
本申请实施例提供了一种动画数据生成方法、装置及相关产品,旨在以低内存占用量生成动画数据。
本申请第一方面,提供了一种动画数据生成方法。该动画数据生成方法由动画数据生成设备执行,包括:
根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;
基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;
基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
本申请第二方面,提供了一种动画数据生成装置。所述动画数据生成装置部署在动画数据生成设备上,包括:
查询特征生成单元,用于根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;
组合特征生成单元,用于基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;
动画数据生成单元,用于基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
本申请第三方面,提供了一种动画数据生成设备,动画数据生成设备包括处理器以及存储器:
所述存储器用于存储计算机程序,并将所述计算机程序传输给所述处理器;
所述处理器用于根据所述计算机程序中的指令执行第一方面介绍的动画数据生成方法的步骤。
本申请第四方面,提供了一种计算机可读存储介质。所述计算机可读存储介质用于存储计算机程序,所述计算机程序被动画数据生成设备执行时实现第一方面介绍的动画数据生成方法的步骤。
本申请第五方面,提供了一种计算机程序产品。该计算机程序产品包括计算机程序,该计算机程序被动画数据生成设备执行时实现第一方面介绍的动画数据生成方法的步骤。
从以上技术方案可以看出,本申请实施例具有以下优点:
在本申请的技术方案中,获取虚拟场景中虚拟对象的轨迹特征和骨骼特征作为查询特征,在轨迹特征和骨骼特征的基础上,通过预先训练好的神经网络来生成该虚拟对象的动画数据。由于该预先训练好的神经网络具备在查询特征基础上增加虚拟对象的特征维度并基于高维特征生成虚拟对象的动画数据的功能,因此能够满足对动画数据的生成需求。此外,由于神经网络的使用,在生成动画数据时不再需要沿用以往动作匹配技术的方式在内存中存储海量数据并从中查询匹配的动画;神经网络的使用只需要提前存储与神经网络相关的权重数据,因此整个方案的实施对于内存的占用量较低,且无需从海量数据中进行实时的查询。进而避免了内存占用量高、在生成动画数据时查询性能不佳的问题。此外,由于内存占用量降低,查询需求减少,以游戏场景为例,游戏运行更加顺畅,更多的存储空间可以转为他用,从而方便提升游戏的其他性能,例如游戏画质等。进而,提升玩家的游戏体验。
图1为一种动画状态机示意图;
图2为本申请实施例提供的一种实现动画数据生成方法的场景架构图;
图3为本申请实施例提供的一种动画数据生成方法的流程图;
图4为本申请实施例提供的一种神经网络的结构示意图;
图5A为本申请实施例提供的另一种神经网络的结构示意图;
图5B为本申请实施例提供的另一种动画数据生成方法的流程图;
图6为本申请实施例提供的一种特征生成网络的结构示意图;
图7为本申请实施例提供的一种特征更新网络的结构示意图;
图8为本申请实施例提供的一种动画生成网络的结构示意图;
图9A为本申请实施例提供的一种神经网络的训练流程图;
图9B为本申请实施例提供的根骨骼轨迹的降噪前示意图;
图9C为本申请实施例提供的根骨骼轨迹的降噪后示意图;
图10A为本申请实施例提供的一种能够提取辅助查询特征的深度学习网络的结构示意图;
图10B为传统动作匹配方法和本申请实施例提供的动画数据生成方法获得的动画效果示意图;
图11为本申请实施例提供的一种动画数据生成装置的结构示意图;
图12为本申请实施例提供的另一种动画数据生成装置的结构示意图;
图13为本申请实施例中服务器的一个结构示意图;
图14为本申请实施例中终端设备的一个结构示意图。
在动画制作场景或游戏场景中,可以通过设计状态机来控制动画的各种复杂播放和转换控制逻辑。图1为一种动画状态机示意图。在图1所示的状态机中防御(Defend)、紧张(Upset)、胜利(Victory)、空闲(Idle)分别代表四种不同的动画,四种动画之间的双向箭头,表示动画之间进行切换。若游戏开发或动画制作场景中采用状态机的传统方式来生成动画,当虚拟对象的动作比较复杂时,状态机的设计量会非常庞大,并且后续的更新维护非常困难,需要耗费大量的时间,且容易产生故障。
由此产生了动作匹配技术,动作匹配技术解决了动画状态机设计量庞大、逻辑复杂、不便维护的问题。但是动作匹配技术需要预先存储海量动画数据进行查询匹配,因此对内存占用量高,导致存储和查询性能不佳。
鉴于以上问题,在本申请中提供了一种动画数据生成方法、装置及相关产品。当需要生成虚拟对象的动画时,在已获得虚拟对象的查询特征(轨迹特征和骨骼特征)基础上,只需要借助预先训练好的神经网络便可以生成虚拟对象的动画数据。相比于预先存储海量动画数据进行查询匹配在获得虚拟对象动画效果的实现方式,由于神经网络的权重数据内存占用量小,因此可以提升存储和查询性能。这一优势使本申请实施例方案能够在动画引擎中获得更好的应用和发展。
需要说明的是,本申请实施例提供的动画数据生成方法主要涉及人工智能(Artificial Intelligence,AI)技术,尤其涉及人工智能技术中的机器学习,以机器学习训练得到的神经网络解决动作匹配技术技术在动画制作和电影制作方面存在的存储和查询性能问题。
首先对本申请下文的实施例中可能涉及的若干个名词术语进行解释。
1)机器学习(Machine Learning):
机器学习是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是人工智能的一个分支。是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、式教学习等技术。人工智能的研究历史有着一条从以“推理”为重点,到以“知识”为重点,再到以“学习”为重点的自然、清晰的脉络。显然,机器学习是实现人工智能的一个途径,即以机器学习为手段解决人工智能中的问题。
2)神经网络(Neutral Network):
人工神经网络,简称神经网络或类神经网络,在机器学习和认知科学领域,是一种模仿生物神经网络的结构和功能的数学模型或计算模型,用于对函数进行估计或近似。神经网络由大量的人工神经元联结进行计算。大多数情况下人工神经网络能在外界信息的基础上改变内部结构,是一种自适应系统,通俗地讲就是具备学习功能。
3)动作捕捉:
动作捕捉,又称为动态捕捉,是指记录并处理人或其他物体动作的技术。它广泛应用于娱乐、体育、医疗应用、计算机视觉以及机器人技术等诸多领域。在动画制作、电影制作和电子游戏开发等领域,它通常是记录人类演员的动作,并将其转换为数字模型的动作,并生成二维或三维的计算机动画。当它捕捉面部或手指的细微动作时,它通常被称为性能捕获。
4)虚拟场景:
是应用程序在终端上运行时显示(或提供)的虚拟场景。该虚拟场景可以是对真实世界的仿真场景,也可以是半仿真半虚构的三维场景,还可以是纯虚构的三维场景。虚拟场景可以是二维虚拟场景、2.5维虚拟场景和三维虚拟场景中的任意一种,下述实施例以虚拟场景是三维虚拟场景来举例说明,但对此不加以限定。在一种可能的实现方式中,该虚拟场景还用于至少两个虚拟对象之间的虚拟场景对战,虚拟场景例如可以是游戏场景、虚拟现实场景、扩展现实场景等,本申请实施例对此不做限定。
4)虚拟对象:
是指在虚拟场景中的可活动对象。该可活动对象可以是虚拟人物、虚拟动物、动漫人物中的至少一种。在一种可能的实现方式中,当虚拟场景为三维虚拟场景时,虚拟对象可以是基于动画骨骼技术创建的三维立体模型。每个虚拟对象在三维虚拟场景中具有自身的形状和体积,占据三维虚拟场景中的一部分空间。
本申请实施例提供的动画数据生成方法可以由动画数据生成设备执行,该动画数据生成设备例如可以为终端设备。即,在终端设备上生成查询特征并根据预先训练好的神经网络来动画数据。作为示例,终端设备具体可以包括但不限于手机、电脑、智能语音交互设备、智能家电、车载终端、飞行器等。本发明实施例可应用于各种场景,包括但不限于云技术、人工智能、数字人、虚拟人、游戏、虚拟现实、扩展现实(XR,Extended Reality)等。此外,上述动画数据生成设备也可以是服务器,即可以在服务器上生成查询特征并根据预先训练好的神经网络来动画数据。
在一些其他的实现方式中,本申请实施例中提供的动画数据生成方法还可以通过终端设备和服务器共同去实现。图2为本申请实施例提供的一种实现动画数据生成方法的场景架构图。为便于理解本申请实施例提供的技术方案,以下结合图2介绍方案的实现场景。在实现场景中,涉及到终端设备和服务器。例如,可以在终端设备上提取虚拟场景的运行数据生成虚拟场景中虚拟对象的查询特征,从服务器调取神经网络的权重数据,在终端设备上基于神经网络生成虚拟对象的动画数据。此外,还可以在服务器中根据虚拟场景的运行数据生成虚拟场景中虚拟对象的查询特征,将查询特征发送到终端设备,再在终端设备
上利用神经网络来实现动画数据的生成。故本申请实施例中对于执行本申请技术方案的实现主体不做限定。
图2所示的服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统。另外,服务器还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、CDN、以及大数据和人工智能平台等基础云计算服务的云服务器。
图3为本申请实施例提供的一种动画数据生成方法的流程图。下面以终端设备作为执行主体,介绍该方法的具体实现。如图3所示的动画数据生成方法,包括:
S301:根据虚拟场景的运行数据生成虚拟场景中虚拟对象的查询特征。
以游戏场景为例,玩家在操控虚拟对象时,需要虚拟对象根据玩家的操控展示相应的动画效果。例如虚拟对象执行走路动作,当玩家操控虚拟对象执行下蹲动作,则需要在虚拟场景(即游戏场景)中展示虚拟对象执行下蹲动作的动画。此下蹲动作的动画数据需要通过本申请提供的技术方案生成。为了生成该虚拟对象与玩家操作目的匹配的动画数据,本申请提供的技术方案中首先需要生成该虚拟对象的查询特征,以此在后续步骤中作为神经网络的输入,最终生成动画数据。
查询特征可以包括虚拟对象的轨迹特征和骨骼特征。所谓轨迹特征,可以是指与虚拟场景中虚拟对象的轨迹相关的特征。轨迹特征为从虚拟对象整体而言的特征。与之相对的,骨骼特征则是从虚拟对象的个别骨骼而言的特征。举例而言,查询特征中的轨迹特征可以包括:轨迹速度和轨迹方向。此外轨迹特征还可以包括轨迹点位置。查询特征中的骨骼特征可以包括左脚骨骼位置信息、左脚骨骼旋转信息、右脚骨骼位置信息和右脚骨骼旋转信息。此外,骨骼特征还可以包括左脚骨骼速度和右脚骨骼速度。
需要说明的是,轨迹特征中所指的轨迹可以是指虚拟对象根关节的轨迹。其可以是根据虚拟对象的臀部骨骼在地面的投影形成的路径。如果虚拟对象为人形角色,则生成方法是将人形骨骼的臀部骨骼信息投影到地面,这样多个动画帧连接起来就形成了虚拟对象的轨迹点信息。此处所指的地面具体可以是虚拟场景坐标系中的地面。采取双足的骨骼特征纳入到查询特征中,由于足部作为人体中表征姿态的重要部位,其骨骼在位置、旋转等方面的信息均有利于通过神经网络生成匹配的动画。在本申请中通过以轨迹特征和骨骼特征作为查询特征,分别从虚拟对象的整体和个别骨骼两个方面表征该虚拟对象的特征,从而结合这两种类型的特征,有利于实现动画数据的准确生成,保障生成的动画数据逼真刻画虚拟对象动作的展示效果。
作为本步骤的一种可选实现方式,根据虚拟场景的运行数据生成虚拟场景中虚拟对象的查询特征,具体可以包括:
首先从虚拟场景的运行数据中提取出针对虚拟对象的动作控制信号。接着根据动作控制信号中的控制参数以及针对虚拟对象的历史动作控制信号中的历史控制参数,生成虚拟对象的轨迹特征和骨骼特征。
在游戏运行时,人物的走跑主要取决于玩家的输入,如果玩家希望跑,那么通过键盘与手柄会输入对应的动作控制信号,然后动画引擎内部会根据动作控制信号计算一个合理
的跑步的速度作为轨迹特征。在运算时可以结合以往的历史动作控制信号中的历史控制参数。例如控制参数中可以包括动作类型(跑、跳跃、走路等)。此外,还可以结合虚拟对象的角色属性来生成其轨迹特征和骨骼特征。例如,不同的角色属性具有不同的速度最大值和速度最小值。此处,历史动作控制信号可以是在最新收到的动作控制信号之前收到的动作控制信号,例如可以是在最新收到的动作控制信号的前一次收到的动作控制信号,或者是此前预设时间内收到的动作控制信号。借助历史动作控制信号中的历史控制参数,有利于生成更加精准的、实时性更高的轨迹特征和骨骼特征。
S302:基于虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加虚拟对象的特征维度,得到虚拟对象的组合特征。
图4为本申请实施例提供的一种神经网络的结构示意图。在该图示意的网络结构中,包括特征生成网络和动画生成网络。其中,特征生成网络能够用于增加虚拟对象的特征维度,即通过该特征生成网络实现对该虚拟对象的查询特征维度的丰富。举例而言,输入该特征生成网络的查询特征包括轨迹速度、轨迹方向、左脚骨骼位置信息、左脚骨骼旋转信息、右脚骨骼位置信息和右脚骨骼旋转信息,而通过特征生成网络的处理,使输出的特征不仅包括输入的查询特征中的特征信息,还包括其他有助于准确生成动画数据的辅助特征。后文将对辅助特征的获取方式进行更加细致的说明。为了便于在文中区分输入的特征和输出的特征,本申请实施例中将通过特征生成网络处理得到的维度增加的特征称为组合特征。由于组合特征是以输入的轨迹特征和骨骼特征为基础获得的特征,因此,可以理解的是,该特征生成网络输出的组合特征与输入该特征生成网络的查询特征匹配,可以作为虚拟对象的组合特征,用于生成动画数据。
S303:基于虚拟对象的组合特征,通过神经网络中的动画生成网络生成虚拟对象的动画数据。
如图4所示的网络结构中,还包括动画生成网络。该网络的功能是在输入其中的虚拟对象的组合特征的基础上生成虚拟对象的动画数据。实际应用中,可以将动画引擎每一帧该虚拟对象的组合特征作为动画生成网络的输入,以生成该帧的动画数据。再根据动画引擎的各帧动画数据依照时序形成连贯的动画。因此,基于对神经网络的功能需求,当神经网络需要满足以上功能需求时,可以直接将特征生成网络的输出作为动画生成网络的输入进行训练和使用。
在本申请实施例介绍的动画数据生成方法中,由于该预先训练好的神经网络具备在查询特征基础上增加虚拟对象的特征维度并基于高维特征生成虚拟对象的动画数据的功能,因此能够满足对动画数据的生成需求。此外,由于神经网络的使用,在生成动画数据时不再需要沿用以往动作匹配技术的方式在内存中存储海量数据并从中查询匹配的动画;神经网络的使用只需要提前存储与神经网络相关的权重数据,因此整个方案的实施对于内存的占用量较低,进而避免了内存占用量高、在生成动画数据时查询性能不佳的问题。因此,本申请实施例方案能够在动画引擎中获得更好的应用和发展。
在一些可能的实现方式中,特征生成网络可能不是每一帧都运行,以便提高该方案运行时的性能和减少动画抖动。也就是说,S302的执行需要一定条件,例如,当虚拟对象的
轨迹特征和骨骼特征的变化满足第一预设条件和/或距离特征生成网络前一次输出组合特征的时间间隔满足第二预设条件时,根据最新输入的虚拟对象的轨迹特征和骨骼特征,通过特征生成网络输出虚拟对象的组合特征。也就是说,在该可能的实现方式中,特征生成网络的运行需要满足前提条件,该前提条件可能是关于特征变化的条件(例如第一预设条件),也可能是关于其运行时间间隔的条件(例如第二预设条件),还可以是这两方面条件的结合。
鉴于特征生成网络在一些可能的实现方式中不会每一帧都运行,为了保证动画效果,保证生成能够运行流畅的动画,本申请可以采用另一种神经网络的结构以生成动画数据。图5A为本申请实施例提供的另一种神经网络的结构示意图。相比图4所示的网络结构,在图5A所示的神经网络中还额外包括了特征更新网络。在图5A所示的结构中,特征生成网络的输出作为特征更新网络的输入;特征更新网络的输出作为动画生成网络的输入。当特征生成网络不运行时,通过特征更新网络去驱动下一帧动画生成,保证动画流畅、连续。图5B为本申请实施例提供的另一种动画数据生成方法的流程图,该图所示方法中所用的神经网络结构与图5A所示的神经网络结构一致。即,在神经网络中包括特征生成网络、特征更新网络和动画生成网络。
如图5B所示的动画数据生成方法,包括:
S501:根据虚拟场景的运行数据生成虚拟场景中虚拟对象的查询特征。
S502:基于虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加虚拟对象的特征维度,得到虚拟对象的组合特征。
在本申请实施例中S501-S502的实现方式与前述实施例S301-S302的实现方式基本相同,因此相关介绍可以参照前文提供的实施例,此处不再赘述。
需要说明的是,当特征生成网络并不是每一帧都运行时,特征生成网络在运行时生成的虚拟对象的组合特征可以是虚拟对象在当前帧的组合特征。
S503:基于特征生成网络输出的虚拟对象在当前帧的组合特征,通过神经网络中的特征更新网络输出虚拟对象在当前帧的下一帧的组合特征。
S503即体现了图5A所示神经网络中特征更新网络的功能。作为一种可选的实现方式,S503的实现方式可以是特征更新网络可基于特征生成网络输出的虚拟对象在当前帧的组合特征以及虚拟场景的动画引擎的帧间差,输出虚拟对象在当前帧的下一帧的组合特征。帧间差(deltaTime)可以是指动画引擎的动画逻辑线程两次更新之间的时间差。一般来说和游戏更新时间接近,比如游戏的更新速率60帧每秒,那么deltaTime就是1/60秒。也就是说,本申请实施例中,特征更新网络能够当前帧的组合特征得到该虚拟对象下一帧的同维度的组合特征。即,特征更新网络实现了对虚拟对象在前后相邻帧的组合特征更新,以前一帧的组合特征为基础更新下一帧的组合特征。如此,便可以在特征生成网络没有时时工作时,以特征更新网络的功能实现后续动画生成网络输出的动画数据的连贯性、流畅度。
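A minimal sketch of this per-frame update follows: when the feature generation network does not run, the current combined features and the engine's deltaTime drive the feature update network to produce the next frame's combined features. Concatenating deltaTime as one extra input channel is an assumption about how the two inputs are combined; only the deltaTime semantics (about 1/60 s at 60 fps) come from the text above.

```python
import torch

def update_combined_features(feature_update_net, combined, delta_time):
    """Advance the combined features by one frame via the feature update network.

    combined: 1-D tensor of combined features for the current frame.
    delta_time: animation-engine frame time in seconds (e.g. 1/60 at 60 fps).
    """
    dt = torch.tensor([delta_time], dtype=combined.dtype)
    with torch.no_grad():
        return feature_update_net(torch.cat([combined, dt]))
```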
S504:基于虚拟对象在当前帧的下一帧的组合特征,通过神经网络中的动画生成网络生成虚拟对象的动画数据。
区别于图3所示的流程的方法,在本申请实施例S504中,由于是以特征更新网络的输出作为动画生成网络的输入,因此,动画生成网络便直接根据输入其中的下一帧的组合特征来生成动画数据并输出。
在本申请实施例中,特征生成网络不是每一帧都运行,从而提高该方案运行时的性能和减少动画抖动。与此同时,通过特征更新网络,即便特征生成网络不是每一帧运行,也可以保证动画连贯和流畅。
图6为本申请实施例提供的一种特征生成网络的结构示意图。图7为本申请实施例提供的一种特征更新网络的结构示意图。图8为本申请实施例提供的一种动画生成网络的结构示意图。在图6至图8的示例中,特征生成网络的结构为六层的全连接网络,具有四个隐藏层,每一层隐藏层的单元数都是512。特征更新网络是一个四层的全连接网络,具有两个隐藏层,每一层隐藏层的单元数都是512。动画生成网络是一个三层的全连接网络,具有一个隐藏层。每一层隐藏层的单元数都是512。在其他的实现方式中,上述三种网络也可以包含其他数量的隐藏层数或者隐藏层包含其他数量的单元数。因此神经网络中6+4+2层级数量的网络结构和512单元数仅作为一个实现方式,此处不做限制。
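As an illustration of the layer counts given above (four, two and one hidden layers of 512 units for the feature generation, feature update and animation generation networks respectively), the following PyTorch sketch builds the three fully connected networks. The input/output dimensions, the ReLU activation, and the reading of an "N-layer network with K hidden layers" as K hidden Linear layers plus an output layer are assumptions; as the text notes, these sizes are only one possible configuration.

```python
import torch.nn as nn

def make_mlp(in_dim, out_dim, num_hidden, hidden=512):
    """Fully connected network with `num_hidden` hidden layers of `hidden` units."""
    layers, d = [], in_dim
    for _ in range(num_hidden):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

# The dimensions below are placeholders, not values from the disclosure.
QUERY_DIM, COMBINED_DIM, POSE_DIM = 27, 59, 256

feature_gen_net = make_mlp(QUERY_DIM, COMBINED_DIM, num_hidden=4)            # 4 hidden x 512
feature_update_net = make_mlp(COMBINED_DIM + 1, COMBINED_DIM, num_hidden=2)  # +1 input for deltaTime
anim_gen_net = make_mlp(COMBINED_DIM, POSE_DIM, num_hidden=1)                # 1 hidden x 512
```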
前面在方法实施例中介绍了在查询特征的基础上生成虚拟对象的动画数据所使用的神经网络,并对其结构进行了示例性的说明。下面结合图9A介绍对于图5A所示的网络结构的神经网络训练方法。图9A为本申请实施例提供的一种神经网络的训练流程图。如图9A所示,训练该神经网络包括如下步骤:
S901:获取真实场景的动作捕捉数据。
获取真实场景的动作捕捉数据的目的是用来训练神经网络。动作捕捉技术在前面已经介绍过,其属于一种当前在电影制作、动画制作和游戏开发等领域中应用比较成熟的技术。本申请实施例中借助此项技术来获得真实场景中人体的动作捕捉数据。作为示例,为了提高训练的准确性,本步骤可以通过以下方式实现:
设计动作捕捉路线和预设的多个需要捕捉的动作。动作主体(一般是人,例如演员,也可能为动物)在真实场景中按照预设的动作捕捉路线运动并执行预设的动作时,对动作主体进行动作捕捉,得到初始的动作捕捉数据。对初始的动作捕捉数据通过以下至少一种预处理方式进行处理,得到处理后的动作捕捉数据:降噪、数据扩充或者生成适配于虚拟场景的动画引擎的坐标系下的数据。处理后的工作捕捉数据一般可以直接应用到后续的S902中。
通过降噪、数据扩充中至少一种方式对初始的动作捕捉数据进行预处理,由于降噪可以提高动作捕捉数据的质量,数据扩充可以扩大动作捕捉数据的数量,进而为训练神经网络提供了海量数据支撑。因此,通过上述预处理方式可以提升训练效果。
在一些场景中,采集动作捕捉数据的采集设备可能存在信号噪声,为了避免捕捉到的数据中也存在了噪声,从而影响神经网络的训练效果,因此,对于初始的动作捕捉数据,可以采取降噪的措施。举例而言,可以采用平滑(Savitzky-Golay,SG)滤波器进行滤波的方案对初始的动作捕捉数据进行处理。对于每一帧的动作捕捉数据中的骨骼根节点的位置,采用前后N帧的数据,也就是总共2N+1帧的数据,进行一次最小二乘法的拟合。最小二
乘法要求的是数据的平方差越小。然后再这个拟合曲线上选取当前帧的值为拟合之后的结果。需要说明的是,N的数值选取与动画的帧数量以及动画帧与帧之间的数据变化有关系,如果动画帧比较多,而且帧与帧之间的变化不大,那么N需要更大才能成功平滑降噪。一般来说N越大,那么降噪效果越强。作为示例,选取N=50。实际应用中还可以选用其他方式进行滤波,SG滤波器仅作为一个实现示例。通过滤波是动作捕捉数据中轨迹曲线更加平滑,减少了扰动。图9B和图9C为降噪前后根骨骼轨迹的示意图。结合图9B和图9C不难发现,经过降噪,以根骨骼轨迹为例的动作捕捉数据变得噪声降低,轨迹更加平滑。
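For the Savitzky-Golay smoothing step described above (a least-squares fit over the current frame and the N frames before and after it, i.e. a window of 2N+1 frames, with N=50 given as an example), a sketch using SciPy follows. The polynomial order is an assumption; only the window size follows from the text.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_root_trajectory(root_positions, n=50, polyorder=3):
    """Smooth per-frame root bone positions with a Savitzky-Golay filter.

    root_positions: array of shape (num_frames, 3); the window covers the
    current frame plus the n frames before and after it (2n + 1 frames).
    """
    num_frames = root_positions.shape[0]
    window = min(2 * n + 1, (num_frames - 1) // 2 * 2 + 1)  # odd, <= num_frames
    return savgol_filter(root_positions, window_length=window,
                         polyorder=min(polyorder, window - 1), axis=0)
```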
在一些场景中,初始的动作捕捉数据的数据量较少,为了提升后续训练的神经网络的性能,可以对初始的动作捕捉数据进行扩充。其中,扩充方式可以包括通过镜像方法对初始的动作捕捉数据进行数据扩充,和/或,通过缩放时间轴的方式对初始的动作捕捉数据进行数据扩充。作为示例,镜像方法可以将动捕中左走镜像成右走,右走镜像成左走,从而增加每种模式的数据量。动画数据中,可能只捕捉了动作主体一次行走的数据,比如本次数据是先左脚前进然后右脚前进。为了扩充数据集,比如右脚先前进然后左脚跟着前进,就需要使用镜像方法进行扩充。缩放时间轴扩充数据的方式是指,使用增加或者减少轨迹速度的方式对数据进行扩充,此方法主要是把动画数据中的速率进行调整,用来模拟生成不同动作速度下的动作捕捉数据。比如初始的动作捕捉数据是在30秒的时间内完成了长达100米的路径的行走动作。通过放大时间轴,例如放大到2倍长度的时间轴,将原始的动作捕捉数据变换为在60秒的时间内完成长达100米的路径的行走动作。由此可见,放大时间轴即是降低了数据对应的执行主体的动作速度。类似地,缩短时间轴对应着提升了数据对应的执行主体的动作速度。例如,通过缩短时间轴,例如缩短到二分之一长度的时间轴,将原始的动作捕捉数据变换为在15秒的时间内完成长达100米的路径的行走动作。对于放大时间轴的实现方式,多出来的时间进行线性插值。对于缩短时间轴的实现方式,则可以依照时间序列对数据进行规律的过滤。通过以上方式对动作捕捉数据的扩充,为训练神经网络提供了海量数据支撑,有利于提升神经网络的性能。
除了降噪和数据扩充以外,由于真实场景与虚拟场景的坐标系差异,且最终获得的动画数据需要对应到虚拟场景的坐标系中,因此,还可以在本步骤中以初始的动作捕捉数据为基础生成适配于虚拟场景的动画引擎的坐标系下的数据。如此,便可以构建出用于训练神经网络的基础数据库。比如初始的动作捕捉数据是右手坐标系的数据,动画引擎的坐标系是Z轴向上的左手坐标系,可以根据坐标系关系进行转换,生成动画引擎的坐标系下的动作捕捉数据。
也就是在实际应用中,为了提升训练效果,可以对初始的动作捕捉数据通过以下至少一种预处理方式进行处理,得到处理后的动作捕捉数据:降噪、数据扩充或者生成适配于虚拟场景的动画引擎的坐标系下的数据。
S902:根据动作捕捉数据分别获取到动作主体的根运动数据、骨骼姿态信息和基础查询特征。
基础查询特征可以包括动作主体的轨迹特征和骨骼特征。此处基础查询特征与神经网络训练完毕后需要输入至特征生成网络的查询特征的数据类型一致。基础查询特征中的轨
迹特征可以是根据动作主体的移动方向和位置生成;基础查询特征中的骨骼特征可以是根据当前动作主体的双足的运动信息得到。
根据动作捕捉数据获取到的动作主体的根运动数据和骨骼姿态信息为除了基础查询特征以外,本步骤从动作捕捉数据中得到的有助于训练特征生成网络、增加查询特征维度的信息。
S903:从动作主体的根运动数据和骨骼姿态信息提取出动作主体的特征值,将特征值作为辅助查询特征。
在本申请实施例中,S903可由另一训练得到的深度学习网络来完成。该神经网络的功能是提取特征值作为辅助查询特征。需要说明的是,本申请实施例中所指的特征,例如查询特征、基础查询特征、辅助查询特征、组合特征等,均可以通过特征向量来表示。辅助查询特征的向量表示又可称为辅助向量。辅助向量是由执行S903的深度学习网络生成的数字。向量的维度与特征维度一致。图10A为本申请实施例提供的一种能够提取辅助查询特征的深度学习网络的结构示意图。在图10A所示的深度学习网络可以是五层全连接网络,具备3个隐藏层。通过各个隐藏层后,逐渐得到代表输入数据的低纬度特征向量。最终输出的就是需要与基础查询特征的向量表示一并用以训练特征生成网络的辅助向量。
S904:根据动作主体的轨迹特征、动作主体的骨骼特征和辅助查询特征得到运动主体的组合特征。
基础查询特征(即动作主体的轨迹特征和动作主体的骨骼特征)和辅助查询特征可得到运动主体的组合特征,通过辅助查询特征实现了在基础查询特征的基础上对查询特征增加维度。前面提到,特征生成网络的功能即是对查询特征增加特征维度,因此,本申请实施例中可以通过基础查询特征和组合特征作为一组训练数据对神经网络中的特征生成网络进行训练。其中,基础查询特征作为特征生成网络在训练阶段的输入,运动主体的组合特征作为针对前述输入的输出结果。参见下方S905。
S905:利用动作主体的轨迹特征、动作主体的骨骼特征和运动主体的组合特征,对神经网络中的特征生成网络进行训练。
在实际应用中可以设定对特征生成网络的训练截止条件。例如通过训练迭代的次数和/或损失函数来判定是否需要截止训练。类似地,对于特征更新网络和动画生成网络的训练也可以设置训练截止条件。在本申请实施例中,训练神经网络的过程分先后进行,先训练特征生成网络,再训练特征更新网络,最后训练动画生成网络。以此训练上述网络,能够尽可能保证每个网络训练后的性能。以此训练特征生成网络和动画生成网络的过程参见下方S906和S907。
S906:在特征生成网络训练完毕后,利用特征生成网络输出的当前帧的组合特征和动作主体在下一帧的组合特征,对神经网络中的特征更新网络进行训练。
其中,动作主体在下一帧的组合特征是根据动作主体的动作捕捉数据得到的,动作主体在下一帧的组合特征作为被训练的特征更新网络的输出结果,特征生成网络输出的当前帧的组合特征作为被训练的特征更新网络的实际输入。
S907:在特征更新网络训练完毕后,利用动作主体的根运动数据和骨骼姿态信息以及特征生成网络输出的动作主体在下一帧的组合特征,对动画生成网络进行训练。
其中,动作主体的根运动数据和骨骼姿态信息作为被训练的动画生成网络的输出结果,特征生成网络输出的动作主体在下一帧的组合特征作为被训练的动画生成网络的实际输入。
通过以上步骤训练得到了整个用于神经网络,即可用到本申请实施例提供的动画数据生成方法中。表1对比了传统动作匹配方法在各项内容上需要占据的存储量和本申请实施例提供的动画生成方法在各项内容上需要占据的存储量。
表1
从表1中可知,相比于动作匹配技术方案,本申请能够大大节省生成动画数据时对于存储空间的占用量,提升了存储性能。图10B为传统动作匹配方法和本申请实施例提供的动画数据生成方法获得的动画效果示意图。左侧的人形动画为传统动作匹配方法得到的,右侧的人形动画为通过本申请技术方案得到的。结合图10B左右两侧的动画效果图不难发现,本申请技术方案最终得到的动画效果与动作匹配方法得到的动画效果非常接近。即,达到了较好的效果,满足动画数据生成的需求。在保证动画效果的基础上,存储性能的提升使得游戏运行更加顺畅,动画观看更加流畅。而存储性能上的改进,使存储方面有更多的余量可以支撑其他方面的改进,例如可以支撑游戏画质的进一步提升、存储更多的用户的游戏数据、增加更丰富的虚拟角色的相关数据或者场景数据等。从而进一步提升玩家的游戏体验。
下面结合游戏场景描述本申请实施例提供的动画数据生成方法的实际应用。某款游戏在终端设备上运行,玩家实时操作,通过鼠标和键盘控制虚拟对象在游戏场景中做出奔跑、跳跃、闪躲等动作。当玩家敲击键盘的F键,根据游戏的设置,玩家所控制的虚拟对象需要在虚拟的场景中做出跳跃动作。当玩家敲击键盘的T键,根据游戏的设置,玩家所控制的虚拟对象需要在虚拟的场景中奔跑。应用本申请实施例提供的方法,终端设备能够通过玩家以鼠标和/或键盘输入的动作控制信号中的控制参数和历史控制参数来确定玩家的动作控制意图,通过计算得到该虚拟对象的查询特征。当计算得到虚拟对象的查询特征之后,终端设备与远程的服务器通信,以调用神经网络。在调用神经网络的权重数据后,存储在终端设备本地。终端设备将查询特征作为神经网络的输入。神经网络是服务器中基于一些真实场景的动作捕捉数据预先训练好的,因此实际上,终端设备可以在本地存储神经网络的权重数据或者从服务器上调取并存储在本地的神经网络的权重数据,以基于输入内容来运算,最终输出虚拟对象的动画数据。终端设备通过动画引擎的一些渲染方法,将虚拟对象的动画数据渲染为玩家在终端设备展示的游戏场景中可视的动画效果。当玩家敲击键盘
的F键,通过以上方法,玩家所控制的虚拟对象在终端设备屏幕展示的虚拟的场景中凌空跳跃,动画展示出跳跃时虚拟对象变化的身姿和区别于其他姿势的分离间距较大的双足。当玩家敲击键盘的T键,通过以上方法,玩家所控制的虚拟对象在终端设备屏幕展示的虚拟场景中做出奔跑姿态,双臂来回规律摆动,展示超出行走姿态的腿部交替运动。从玩家开始在终端设备上进行控制到相应的动画效果展示在虚拟场景画面中,整个耗时非常短暂,游戏的其他画面展示不会受控制指令影响而产生卡顿、区域马赛克效果。
图11为本申请实施例提供的一种动画数据生成装置的结构示意图。如图11所示,动画数据生成装置,包括:
查询特征生成单元111,用于根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;
组合特征生成单元112,用于基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;
动画数据生成单元113,用于基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
由于该预先训练好的神经网络具备在查询特征基础上增加虚拟对象的特征维度并基于高维特征生成所述虚拟对象的动画数据的功能,因此能够满足对动画数据的生成需求。此外,由于神经网络的使用,在生成动画数据时不再需要沿用以往动作匹配技术的方式在内存中存储海量数据并从中查询匹配的动画;神经网络的使用只需要提前存储与神经网络相关的权重数据,因此整个方案的实施对于内存的占用量较低,进而避免了内存占用量高、在生成动画数据时查询性能不佳的问题。
图12为本申请实施例提供的另一种动画数据生成装置的结构示意图。在图12示意的装置结构中,所述虚拟对象的组合特征为所述虚拟对象在当前帧的组合特征,所述动画数据生成单元113,具体包括:
组合特征更新子单元,用于基于所述特征生成网络输出的所述虚拟对象在当前帧的组合特征,通过所述神经网络中的特征更新网络输出所述虚拟对象在当前帧的下一帧的组合特征;
动画数据生成子单元,用于基于所述虚拟对象在当前帧的下一帧的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
在一种可能的实现方式中,所述组合特征生成单元112,具体用于:
当所述虚拟对象的轨迹特征和骨骼特征的变化满足第一预设条件和/或距离所述特征生成网络前一次输出组合特征的时间间隔满足第二预设条件时,根据最新输入的虚拟对象的轨迹特征和骨骼特征,通过所述特征生成网络输出所述虚拟对象的组合特征。
在一种可能的实现方式中,所述组合特征更新子单元,具体用于:
基于所述虚拟对象在当前帧的组合特征以及适配于所述虚拟场景的动画引擎的帧间差,通过所述特征更新网络输出所述虚拟对象在当前帧的下一帧的组合特征。
在一种可能的实现方式中,所述动画数据生成装置,还可以包括网络训练单元,用于通过训练得到神经网络。其中,网络训练单元具体包括:
动作捕捉数据获取子单元,用于获取真实场景的动作捕捉数据;
数据分析子单元,用于根据所述动作捕捉数据分别获取到动作主体的根运动数据、骨骼姿态信息和基础查询特征;所述基础查询特征包括所述动作主体的轨迹特征和骨骼特征;
特征值提取子单元,用于从所述动作主体的根运动数据和骨骼姿态信息提取出所述动作主体的特征值,将所述特征值作为辅助查询特征;
特征组合子单元,用于根据所述动作主体的轨迹特征、所述动作主体的骨骼特征和所述辅助查询特征得到所述运动主体的组合特征;
第一训练子单元,用于利用所述动作主体的轨迹特征、所述动作主体的骨骼特征和所述运动主体的组合特征,对所述神经网络中的特征生成网络进行训练;
第二训练子单元,用于在所述特征生成网络训练完毕后,利用所述特征生成网络输出的当前帧的组合特征和所述动作主体在下一帧的组合特征,对所述神经网络中的特征更新网络进行训练,所述动作主体在下一帧的组合特征是根据所述动作主体的动作捕捉数据得到的;
第三训练子单元,用于在所述特征更新网络训练完毕后,利用所述动作主体的根运动数据和骨骼姿态信息以及所述特征生成网络输出的所述动作主体在下一帧的组合特征,对所述动画生成网络进行训练。
在一种可能的实现方式中,所述动作捕捉数据获取子单元,具体用于:
当动作主体在所述真实场景中按照预设的动作捕捉路线运动并执行预设的动作时,对所述动作主体进行动作捕捉,得到初始的动作捕捉数据;
对所述初始的动作捕捉数据通过以下至少一种预处理方式进行处理,得到处理后的动作捕捉数据:
降噪、数据扩充或者生成适配于所述虚拟场景的动画引擎的坐标系下的数据。
在一种可能的实现方式中,数据扩充方式,可以包括但不限于:
通过镜像方法对所述初始的动作捕捉数据进行数据扩充;和/或,
通过缩放时间轴的方式对所述初始的动作捕捉数据进行数据扩充。
在一种可能的实现方式中,所述查询特征生成单元111,包括:
信号提取子单元,用于从所述虚拟场景的运行数据中提取出针对所述虚拟对象的动作控制信号;
特征生成子单元,用于根据所述动作控制信号中的控制参数以及针对所述虚拟对象的历史动作控制信号中的历史控制参数,生成所述虚拟对象的轨迹特征和骨骼特征。
在一种可能的实现方式中,所述轨迹特征包括轨迹速度和轨迹方向,所述骨骼特征包括左脚骨骼位置信息、左脚骨骼旋转信息、右脚骨骼位置信息和右脚骨骼旋转信息;其中,轨迹为根据臀部骨骼的投影形成的。
下面就服务器形式和终端设备形式分别介绍动画数据生成设备的结构。
图13是本申请实施例提供的一种服务器结构示意图,该服务器900可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)922(例如,一个或一个以上处理器)和存储器932,一个或一个以上存储应用程序942或数据944的存储介质930(例如一个或一个以上海量存储设备)。其中,存储器932和存储介质930可以是短暂存储或持久存储。存储在存储介质930的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对服务器中的一系列指令操作。更进一步地,中央处理器922可以设置为与存储介质930通信,在服务器900上执行存储介质930中的一系列指令操作。
服务器900还可以包括一个或一个以上电源926,一个或一个以上有线或无线网络接口950,一个或一个以上输入输出接口958,和/或,一个或一个以上操作系统941,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
其中,CPU 922用于执行如下步骤:
根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;
基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;
基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
本申请实施例还提供了另一种动画数据生成设备,该动画数据生成设备可以是终端设备,如图14所示,为了便于说明,仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端设备可以为包括手机、平板电脑、个人数字助理(英文全称:Personal Digital Assistant,英文缩写:PDA)、销售终端(英文全称:Point of Sales,英文缩写:POS)、车载电脑等任意终端设备,以终端设备为手机为例:
图14示出的是与本申请实施例提供的终端设备相关的手机的部分结构的框图。参考图14,手机包括:射频(英文全称:Radio Frequency,英文缩写:RF)电路1010、存储器1020、输入单元1030、显示单元1040、传感器1050、音频电路1060、无线保真(WiFi)模块1070、处理器1080、以及电源1090等部件。输入单元1030可包括触控面板1031以及其他输入设备1032,显示单元1040可包括显示面板1041,音频电路1060可以包括扬声器1061和传声器1062。可以理解的是,图14中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
存储器1020可用于存储软件程序以及模块,处理器1080通过运行存储在存储器1020的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器1020可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器1020可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
处理器1080是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器1020内的软件程序和/或模块,以及调用存储在存储器1020内的数据,执行手机的各种功能和处理数据,从而对手机进行整体数据及信息收集。可选的,处理器1080可包括一个或多个处理单元;优选的,处理器1080可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1080中。
在本申请实施例中,该终端所包括的处理器1080还具有以下功能:
根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;
基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;
基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
本申请实施例还提供一种计算机可读存储介质,用于存储计算机程序,该计算机程序被动画数据生成设备执行时实现前述各个实施例所述的动画数据生成方法中的任意一种实施方式。
本申请实施例还提供一种包括指令的计算机程序产品,该计算机程序产品包括计算机程序,当其在计算机上运行时,使得计算机执行前述各个实施例所述的动画数据生成方法中的任意一种实施方式。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、设备的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统和方法,可以通过其它的方式实现。例如,以上所描述的系统实施例仅仅是示意性的,例如,所述系统的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个系统可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的系统可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出
来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(英文全称:Read-Only Memory,英文缩写:ROM)、随机存取存储器(英文全称:Random Access Memory,英文缩写:RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。
Claims (13)
- 一种动画数据生成方法,所述方法由动画数据生成设备执行,包括:根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
- 根据权利要求1所述的方法,所述虚拟对象的组合特征为所述虚拟对象在当前帧的组合特征,所述基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据,包括:基于所述特征生成网络输出的所述虚拟对象在当前帧的组合特征,通过所述神经网络中的特征更新网络输出所述虚拟对象在当前帧的下一帧的组合特征;基于所述虚拟对象在当前帧的下一帧的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
- 根据权利要求2所述的方法,所述基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,包括:当所述虚拟对象的轨迹特征和骨骼特征的变化满足第一预设条件和/或距离所述特征生成网络前一次输出组合特征的时间间隔满足第二预设条件时,根据最新输入的虚拟对象的轨迹特征和骨骼特征,通过所述特征生成网络输出所述虚拟对象的组合特征。
- 根据权利要求2所述的方法,所述基于所述特征生成网络输出的所述虚拟对象在当前帧的组合特征,通过所述神经网络中的特征更新网络输出所述虚拟对象在当前帧的下一帧的组合特征,包括:基于所述虚拟对象在当前帧的组合特征以及适配于所述虚拟场景的动画引擎的帧间差,通过所述特征更新网络输出所述虚拟对象在当前帧的下一帧的组合特征。
- 根据权利要求2所述的方法,所述神经网络为通过以下方式训练得到:获取真实场景的动作捕捉数据;根据所述动作捕捉数据分别获取到动作主体的根运动数据、骨骼姿态信息和基础查询特征;所述基础查询特征包括所述动作主体的轨迹特征和骨骼特征;从所述动作主体的根运动数据和骨骼姿态信息提取出所述动作主体的特征值,将所述特征值作为辅助查询特征;根据所述动作主体的轨迹特征、所述动作主体的骨骼特征和所述辅助查询特征得到所述运动主体的组合特征;利用所述动作主体的轨迹特征、所述动作主体的骨骼特征和所述运动主体的组合特征,对所述神经网络中的特征生成网络进行训练;在所述特征生成网络训练完毕后,利用所述特征生成网络输出的当前帧的组合特征和所述动作主体在下一帧的组合特征,对所述神经网络中的特征更新网络进行训练,所述动作主体在下一帧的组合特征是根据所述动作主体的动作捕捉数据得到的;在所述特征更新网络训练完毕后,利用所述动作主体的根运动数据和骨骼姿态信息以及所述特征生成网络输出的所述动作主体在下一帧的组合特征,对所述动画生成网络进行训练。
- 根据权利要求5所述的方法,所述获取真实场景的动作捕捉数据,包括:当动作主体在所述真实场景中按照预设的动作捕捉路线运动并执行预设的动作时,对所述动作主体进行动作捕捉,得到初始的动作捕捉数据;对所述初始的动作捕捉数据通过以下至少一种预处理方式进行处理,得到处理后的动作捕捉数据:降噪、数据扩充或者生成适配于所述虚拟场景的动画引擎的坐标系下的数据。
- 根据权利要求6所述的方法,对所述初始的动作捕捉数据进行数据扩充,包括:通过镜像方法对所述初始的动作捕捉数据进行数据扩充;和/或,通过缩放时间轴的方式对所述初始的动作捕捉数据进行数据扩充。
- 根据权利要求1-7任一项所述的方法,所述根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征,包括:从所述虚拟场景的运行数据中提取出针对所述虚拟对象的动作控制信号;根据所述动作控制信号中的控制参数以及针对所述虚拟对象的历史动作控制信号中的历史控制参数,生成所述虚拟对象的轨迹特征和骨骼特征。
- 根据权利要求1-7任一项所述的方法,所述轨迹特征包括轨迹速度和轨迹方向,所述骨骼特征包括左脚骨骼位置信息、左脚骨骼旋转信息、右脚骨骼位置信息和右脚骨骼旋转信息;其中,轨迹为根据臀部骨骼的投影形成的。
- 一种动画数据生成装置,所述装置部署在动画数据生成设备上,包括:查询特征生成单元,用于根据虚拟场景的运行数据生成所述虚拟场景中虚拟对象的查询特征;所述查询特征包括所述虚拟对象的轨迹特征和骨骼特征;组合特征生成单元,用于基于所述虚拟对象的轨迹特征和骨骼特征,通过神经网络中的特征生成网络增加所述虚拟对象的特征维度,得到所述虚拟对象的组合特征,所述神经网络是预先训练得到的;动画数据生成单元,用于基于所述虚拟对象的组合特征,通过所述神经网络中的动画生成网络生成所述虚拟对象的动画数据。
- 一种动画数据生成设备,所述设备包括处理器以及存储器:所述存储器用于存储计算机程序,并将所述计算机程序传输给所述处理器;所述处理器用于根据所述计算机程序中的指令执行权利要求1至9中任一项所述的动画数据生成方法的步骤。
- 一种计算机可读存储介质,所述计算机可读存储介质用于存储计算机程序,所述计算机程序被动画数据生成设备执行时实现权利要求1至9任一项所述的动画数据生成方法的步骤。
- 一种计算机程序产品,包括计算机程序,该计算机程序被动画数据生成设备执行时实现权利要求1至9任一项所述的动画数据生成方法的步骤。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/739,539 US20240331257A1 (en) | 2022-07-15 | 2024-06-11 | Animation data generation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210832558.0 | 2022-07-15 | ||
CN202210832558.0A CN115222847A (zh) | 2022-07-15 | 2022-07-15 | 一种基于神经网络的动画数据生成方法、装置及相关产品 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/739,539 Continuation US20240331257A1 (en) | 2022-07-15 | 2024-06-11 | Animation data generation |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2024012007A1 true WO2024012007A1 (zh) | 2024-01-18 |
WO2024012007A9 WO2024012007A9 (zh) | 2024-09-06 |
Family
ID=83611189
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/091117 WO2024012007A1 (zh) | 2022-07-15 | 2023-04-27 | 一种动画数据生成方法、装置及相关产品 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240331257A1 (zh) |
CN (1) | CN115222847A (zh) |
WO (1) | WO2024012007A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118429494A (zh) * | 2024-07-04 | 2024-08-02 | 深圳市谜谭动画有限公司 | 一种基于虚拟现实的动画角色生成系统及方法 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115222847A (zh) * | 2022-07-15 | 2022-10-21 | 腾讯数码(深圳)有限公司 | 一种基于神经网络的动画数据生成方法、装置及相关产品 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111583364A (zh) * | 2020-05-07 | 2020-08-25 | 江苏原力数字科技股份有限公司 | 一种基于神经网络的群组动画生成方法 |
US11017560B1 (en) * | 2019-04-15 | 2021-05-25 | Facebook Technologies, Llc | Controllable video characters with natural motions extracted from real-world videos |
CN113570690A (zh) * | 2021-08-02 | 2021-10-29 | 北京慧夜科技有限公司 | 交互动画生成模型训练、交互动画生成方法和系统 |
CN114037781A (zh) * | 2021-11-12 | 2022-02-11 | 北京达佳互联信息技术有限公司 | 动画生成方法、装置、电子设备及存储介质 |
CN114170353A (zh) * | 2021-10-21 | 2022-03-11 | 北京航空航天大学 | 一种基于神经网络的多条件控制的舞蹈生成方法及系统 |
CN115222847A (zh) * | 2022-07-15 | 2022-10-21 | 腾讯数码(深圳)有限公司 | 一种基于神经网络的动画数据生成方法、装置及相关产品 |
Also Published As
Publication number | Publication date |
---|---|
CN115222847A (zh) | 2022-10-21 |
US20240331257A1 (en) | 2024-10-03 |
WO2024012007A9 (zh) | 2024-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2024012007A1 (zh) | 一种动画数据生成方法、装置及相关产品 | |
WO2021143261A1 (zh) | 一种动画实现方法、装置、电子设备和存储介质 | |
CN110930483B (zh) | 一种角色控制的方法、模型训练的方法以及相关装置 | |
US20090128567A1 (en) | Multi-instance, multi-user animation with coordinated chat | |
US10885691B1 (en) | Multiple character motion capture | |
Won et al. | Generating and ranking diverse multi-character interactions | |
US11238667B2 (en) | Modification of animated characters | |
CN112206517B (zh) | 一种渲染方法、装置、存储介质及计算机设备 | |
WO2023284634A1 (zh) | 一种数据处理方法及相关设备 | |
CN113633983A (zh) | 虚拟角色表情控制的方法、装置、电子设备及介质 | |
CN114125529A (zh) | 一种生成和演示视频的方法、设备及存储介质 | |
WO2019144346A1 (zh) | 虚拟场景中的对象处理方法、设备及存储介质 | |
US20230267668A1 (en) | Joint twist generation for animation | |
Kobayashi et al. | Motion capture dataset for practical use of AI-based motion editing and stylization | |
CN111739134B (zh) | 虚拟角色的模型处理方法、装置及可读存储介质 | |
CN117238448A (zh) | 孤独症干预训练元宇宙系统、学习监测和个性化推荐方法 | |
CN117058284A (zh) | 图像生成方法、装置和计算机可读存储介质 | |
CN115526967A (zh) | 虚拟模型的动画生成方法、装置、计算机设备及存储介质 | |
Lin et al. | Temporal IK: Data-Driven Pose Estimation for Virtual Reality | |
Sun | A digital feature recognition technology used in ballet training action correction | |
CN113559500B (zh) | 动作数据的生成方法、装置、电子设备及存储介质 | |
Liu et al. | Report on Methods and Applications for Crafting 3D Humans | |
WO2024169207A1 (zh) | 一种虚拟角色的表演内容展示方法及相关设备 | |
US11207593B1 (en) | Scalable state synchronization for distributed game servers | |
US11957976B2 (en) | Predicting the appearance of deformable objects in video games |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23838507 Country of ref document: EP Kind code of ref document: A1 |