CN111223170A - Animation generation method and device, electronic equipment and storage medium

Info

Publication number
CN111223170A
Authority
CN
China
Prior art keywords
virtual character
animation
target
target virtual
control strategy
Prior art date
Legal status
Granted
Application number
CN202010013366.8A
Other languages
Chinese (zh)
Other versions
CN111223170B (en)
Inventor
曾子骄
周志强
林群芬
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010013366.8A
Publication of CN111223170A
Application granted; publication of CN111223170B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 Animation
    • G06T 13/20 3D [Three Dimensional] animation
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/04 Indexing scheme for image data processing or generation, in general involving 3D image data

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application provides an animation generation method, an animation generation apparatus, an electronic device, and a storage medium; it belongs to the field of computer technology and relates to artificial intelligence and computer vision. The animation generation method comprises the following steps: obtaining an animation segment T0 of a target virtual character, the animation segment T0 comprising an animation frame A0; based on the posture of the target virtual character in the animation frame A0, adjusting the posture of the target virtual character according to a target task set for the target virtual character to obtain an animation frame A1; and composing an animation segment T1, in which the target virtual character completes the target task, from at least the two animation frames A0 and A1. The posture of the target virtual character is adjusted by adjusting the torques of N joints of the target virtual character in the animation frame A0, where N is a positive integer greater than or equal to 1. The method reduces manual effort and improves the efficiency of animation generation.

Description

Animation generation method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an animation generation method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of computer and internet technology, electronic games, especially online games, have become increasingly popular. As user expectations grow, game animation is expected to give the virtual characters on screen a more realistic and natural visual effect.
To pursue this effect, a virtual character in a game is usually designed with a skeleton, with joints between adjacent bones; the animation of such a character is also called skeletal animation. After the appearance of each virtual character in the game is designed, an animator must give each character its different actions, such as running, walking, jumping, and attacking.
The animator manually drags the character's skeleton and corrects its motion little by little until the required effect is reached. This process is not only inefficient but also consumes a significant amount of labor.
Disclosure of Invention
In order to solve these technical problems in the related art, embodiments of the present application provide an animation generation method, apparatus, electronic device, and storage medium that can generate an animation segment containing a pose sequence of a virtual character according to a target task to be executed by the virtual character, reducing manual effort, improving efficiency, and improving the motion quality of the virtual character.
In order to achieve the above purpose, the technical solution of the embodiment of the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides an animation generation method, where the method includes:
obtaining an animation segment T0 of a target virtual character, wherein the animation segment T0 comprises an animation frame A0;
acquiring the posture of the target virtual character in the animation frame A0, and adjusting the posture of the target virtual character according to a target task set for the target virtual character to obtain an animation frame A1, wherein the adjusted posture of the target virtual character is obtained by adjusting the torques of N joints of the target virtual character in the animation frame A0, and N is a positive integer greater than or equal to 1;
obtaining an animation segment T1 in which the target virtual character completes the target task, the animation segment T1 being composed of at least the two animation frames A0 and A1.
In a second aspect, an embodiment of the present application provides an animation generation apparatus, including:
an animation acquisition unit, configured to obtain an animation segment T0 of a target virtual character, wherein the animation segment T0 comprises an animation frame A0;
a posture adjustment unit, configured to acquire the posture of the target virtual character in the animation frame A0 and adjust the posture of the target virtual character according to a target task set for the target virtual character to obtain an animation frame A1, wherein the adjusted posture of the target virtual character is obtained by adjusting the torques of N joints of the target virtual character in the animation frame A0, and N is a positive integer greater than or equal to 1;
and an animation generation unit, configured to obtain an animation segment T1 in which the target virtual character completes the target task, wherein the animation segment T1 is composed of at least the two animation frames A0 and A1.
In an optional embodiment, the posture adjustment unit is further configured to:
acquiring the state information of the target virtual character in an A0 animation frame;
inputting the state information of the target virtual character in the animation frame A0 and the target task into a control strategy network to obtain the torques, output by the control strategy network, for adjusting each joint of the target virtual character, wherein the control strategy network is trained on a sample animation segment that comprises a reference pose sequence of a reference virtual character completing the target task.
In an optional embodiment, the posture adjustment unit is further configured to:
acquiring the state information of the target virtual character in the animation frame A0 and the environment information of the scene environment in which the target virtual character is located;
and inputting the state information of the target virtual character in the animation frame A0, the target task, and the environment information of the scene environment in which the target virtual character is located into a control strategy network to obtain the torques, output by the control strategy network, for adjusting each joint of the target virtual character, wherein the control strategy network is trained on a sample animation segment that comprises a reference pose sequence of a reference virtual character completing the target task.
In an alternative embodiment, the state information includes phase data, attitude data and velocity data of the target virtual character, wherein the phase data is used for representing the stage of the target virtual character in the current animation segment, the attitude data is used for representing the current attitude of the target virtual character, and the velocity data is used for representing the current velocity state of the target virtual character.
In an optional embodiment, the apparatus further comprises a network training unit configured to:
determining the initial state of the training object according to the sample animation segment; inputting the state information of the training object at the current moment and the set training task into a control strategy network to obtain the control strategy at the next moment output by the control strategy network, the control strategy being the torques acting on each joint of the training object at the next moment, each moment corresponding to one animation frame;
determining an expected reward value according to the obtained control strategy, the reference pose sequence of the reference virtual character in the sample animation segment, and the set target task;
and adjusting parameters of the control strategy network according to the expected reward value, and continuing training the control strategy network after the parameters are adjusted until a set training end condition is reached to obtain the trained control strategy network.
In an optional embodiment, the network training unit is further configured to:
acquiring environment information of a scene environment where the training object is located;
and inputting the environmental information, the state information of the training object at the current moment and the training task into the control strategy network to obtain a control strategy output by the control strategy network at the next moment.
In an optional embodiment, the network training unit is further configured to:
controlling the training object to interact with the scene environment according to the obtained control strategy, and determining the state information of the training object at the next moment;
determining the reward value at the current moment according to the state information of the training object and the reference virtual character at the next moment and the set training task;
and determining an expected reward value according to the control strategy of each moment, the state information of the training object at each moment and the reward value at each moment.
In an optional embodiment, the network training unit is further configured to:
inputting the state information of the training object and the reference virtual character at the next moment and the set training task into a value evaluation network, so that the value evaluation network determines the task target reward at the current moment according to the state of the training object at the next moment and the set training task;
determining simulated target reward at the current moment according to the state information of the training object at the next moment and the state information of the reference virtual character at the next moment;
and determining the reward value at the current moment according to the task target reward and the simulation target reward.
In an alternative embodiment, the simulated target reward includes at least one of: pose similarity, velocity similarity, end joint similarity, root joint similarity, and centroid pose similarity;
the pose similarity is used for representing the similarity of the pose data of the training object and the reference virtual character; the speed similarity is used for representing the similarity of the training object and the speed data of the reference virtual character; the end joint similarity is used for representing the similarity of the postures of the training object and the end joint of the reference virtual character; the root joint similarity is used for representing the similarity of the postures of the training object and the root joint of the reference virtual character; the centroid pose similarity is used for representing the similarity of the center of gravity positions of the training object and the reference virtual character.
In an optional embodiment, the network training unit is further configured to:
forming a strategy track by the control strategy at each moment and the state information of the training object at each moment;
taking the sum of the reward values at each moment as the whole reward value corresponding to the strategy track;
determining the probability of the strategy track according to the current parameters of the control strategy network;
and determining an expected reward value according to the overall reward value corresponding to the strategy track and the probability of the strategy track.
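In standard policy-gradient notation (a reconstruction consistent with the steps listed above, not a formula printed in this text), the expected reward value is

J(θ) = Σ_τ p_θ(τ) · R(τ), with R(τ) = Σ_t r_t,

where τ = (s_0, a_0, s_1, a_1, …) is the policy trajectory formed by the control strategies and the state information of the training object at each moment, R(τ) is the overall reward value of the trajectory, and p_θ(τ) is the probability of the trajectory under the current parameters θ of the control strategy network.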
In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the animation generation method of the first aspect is implemented.
In a fourth aspect, embodiments of the present application further provide an electronic device, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and when the computer program is executed by the processor, the processor is enabled to implement the animation generation method of the first aspect.
With the animation generation method, apparatus, electronic device, and storage medium of the embodiments of the present application, an animation segment T0 of a target virtual character is obtained, the animation segment T0 comprising an animation frame A0; based on the posture of the target virtual character in the animation frame A0, the posture of the target virtual character is adjusted according to a target task set for the target virtual character to obtain an animation frame A1; and at least the two animation frames A0 and A1 compose an animation segment T1 in which the target virtual character completes the target task. The posture of the target virtual character is adjusted by adjusting the torques of N joints of the target virtual character in the animation frame A0, where N is a positive integer greater than or equal to 1. The method can generate an animation segment containing a pose sequence of the target virtual character according to the target task to be executed by the target virtual character, thereby reducing manual effort and improving the efficiency of animation generation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic view of an application scenario of an animation generation method according to an embodiment of the present application;
FIG. 2 is a flowchart of an animation generation method according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of one implementation of step S201 in FIG. 2;
FIG. 4 is a flowchart of another implementation of step S201 in FIG. 2;
fig. 5 is a schematic diagram of training a control strategy network according to an embodiment of the present disclosure;
fig. 6 is a flowchart of a training process of a control strategy network according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a control policy network according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of another control policy network provided in an embodiment of the present application;
fig. 9 is a comparison graph of the effects of the PPO algorithm and the SAC algorithm provided in the embodiment of the present application;
FIG. 10 is a schematic diagram illustrating a training effect of a control strategy network according to an embodiment of the present disclosure;
FIG. 11 is a schematic diagram illustrating a convergence curve of each reward value fed back when a control strategy network is trained according to an embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of an animation generation apparatus according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of another animation generation apparatus according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The word "exemplary" is used hereinafter to mean "serving as an example, embodiment, or illustration. Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The terms "first" and "second" are used herein for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature, and in the description of embodiments of the application, unless stated otherwise, "plurality" means two or more.
Some terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.
(1) Moment (torque): in physics, torque is the tendency of an applied force to cause an object to rotate about an axis or pivot; in this application, torque refers to the tendency of a force to rotate a bone about a joint.
(2) And (3) target tasks: for instructing the virtual character to perform a task of a specified action, such as "go", "shoot", "heel strike", "whirlwind kick", and the like. Each virtual role can complete various types of target tasks, and different types of virtual roles can complete different types of target tasks. Different control instructions can be preset to indicate the virtual character to complete different target tasks, for example, a player can trigger the corresponding control instruction through a control key to set the current target task for the virtual character.
(3) Physics engine: an engine that simulates physical laws by means of a computer program, used mainly in physics computation, electronic games, and computer animation; using variables such as mass, velocity, friction, and resistance, it can predict the motion of a virtual character under different conditions.
The present application will be described in further detail with reference to the following drawings and specific embodiments.
In order to solve the problems that the efficiency of manually generating animation segments is low and the action effect of generated virtual characters is poor in the related art, embodiments of the present application provide an animation generation method, an animation generation device, an electronic device, and a storage medium. The embodiment of the present application relates to Artificial Intelligence (AI) and Machine Learning technologies, and is designed based on Computer Vision (CV) technology and Machine Learning (ML) in the AI.
Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. The artificial intelligence technology mainly comprises a computer vision technology, a voice processing technology, machine learning/deep learning and other directions.
With the research and progress of artificial intelligence technology, artificial intelligence is developed and researched in a plurality of fields, such as common smart home, image retrieval, video monitoring, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical treatment and the like.
Computer vision technology is an important application of artificial intelligence; it studies relevant theories and techniques in an attempt to build artificial intelligence systems capable of obtaining information from images, videos, or multidimensional data in place of human visual interpretation. Typical computer vision techniques generally include image processing and video analysis. The animation generation method provided by the embodiments of the present application draws on such video analysis techniques.
Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or realize human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to keep improving its own performance. Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning. In generating animation segments, the embodiments of the present application adopt a control strategy network based on deep reinforcement learning to learn from sample animation segments containing the pose sequence of a reference virtual character, and use the learned control strategy network to generate animation segments for different virtual characters.
The animation generation method provided by the embodiments of the present application can be applied to 3D (three-dimensional) games, 3D animated films, VR (Virtual Reality) scenes, and the like. For example, a 3D game generally includes a large number of virtual characters. A virtual character may also be referred to herein as a physical character: in a physics engine it may possess mass, be acted upon by gravity, and so on. In some embodiments, the virtual character consists of a skeleton, a movable frame constructed from bones and joints that drives the whole character to move. In other embodiments, the virtual character consists of a skeleton and a skin; the skin is a triangular mesh wrapped around the skeleton, each vertex of the mesh is controlled by one or more bones, and when the skin wraps the skeleton, the bones themselves are not rendered in the game.
In the game, a virtual character may be controlled by the player, or controlled automatically as the game progresses. Virtual characters also come in many types, such as "soldiers", "jurisprudents", "shooters", and "athletes". Different types of virtual characters share some action types, such as running, walking, jumping, and squatting, and differ in others, such as attack and defense modes. Moreover, different types of virtual characters may complete target tasks of the same type or of different types. With the animation generation method provided by the embodiments of the present application, animation segments can be generated according to the target task set for a virtual character.
An application scenario of the animation generation method provided in the embodiment of the present application can be seen in fig. 1, where the application scenario includes a terminal device 11 and a game server 12. The terminal device 11 and the game server 12 can communicate with each other via a communication network. The communication network may be a wired network or a wireless network.
The terminal device 11 is an electronic device that can install various applications and can display an operation interface of the installed applications, and the electronic device may be mobile or fixed. For example, a mobile phone, a tablet computer, various wearable devices, a vehicle-mounted device, or other electronic devices capable of implementing the above functions may be used. Each terminal device 11 is connected to the game server 12 through a communication network, and the game server 12 may be a server of a game platform, may be a server or a server cluster or a cloud computing center composed of a plurality of servers, or may be a virtualization platform.
The terminal device 11 may have a game client installed on it. In one embodiment, the client receives an operation, input by the user through a control key, instructing a target virtual character to perform a task (e.g., a whirlwind kick), and sends the operation setting the target task for the target virtual character to the game server 12. The game server 12 sequentially obtains at least two animation frames according to the target task set for the target virtual character. Specifically, the game server 12 stores a trained control strategy network, trained on a sample animation segment containing a pose sequence of a reference virtual character. The client acquires, in real time, the state information of the target virtual character in the previous animation frame and sends it to the game server 12. The game server 12 inputs the state information of the target virtual character in the previous animation frame and the set target task into the control strategy network, obtains the torques output by the control strategy network for adjusting each joint of the target virtual character, and sends the torques to the client. Based on a physics engine, the client adjusts each joint of the target virtual character in the previous animation frame according to the torque of each joint, obtains the posture of the target virtual character in the next animation frame, and thereby generates the next animation frame. The previous animation frame and the next animation frame are two adjacent animation frames. The client composes at least two animation frames into an animation segment in which the target virtual character completes the target task.
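As an illustrative sketch (with policy_net, physics, and character as hypothetical objects standing in for the control strategy network, the physics engine, and the target virtual character, none of them names defined by this application), the frame-by-frame loop described above might look like this in Python:

    # Hedged sketch of the per-frame generation loop; all names are
    # assumptions for illustration only.
    def generate_segment(policy_net, physics, character, task_vec, num_frames):
        frames = [character.current_pose()]          # frame A0 from segment T0
        for _ in range(num_frames - 1):
            state = character.state_vector()         # phase + pose + velocity
            torques = policy_net(state, task_vec)    # one torque per joint
            physics.apply_joint_torques(character, torques)
            physics.step()                           # advance one animation frame
            frames.append(character.current_pose())  # frames A1, A2, ...
        return frames                                # animation segment T1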
In another embodiment, according to a target task set for the target virtual character, the server may obtain, for each animation frame contained in the animation segment in which the target virtual character completes the target task, the torques for adjusting each joint of the target virtual character, where these torques are the torques required to adjust the posture of the target virtual character in the adjacent previous animation frame into its posture in the next animation frame. The server sends the obtained per-frame joint torques to the client, and the client, based on the physics engine, adjusts the posture of the target virtual character frame by frame according to those torques to obtain the frames of the animation.
In another embodiment, the above process may be performed independently by the client installed on the terminal device 11. The client receives an operation, input by the user through a control key, instructing a target virtual character to execute a task, and in response to the operation setting the target task for the target virtual character, sequentially obtains at least two animation frames. Specifically, the client generates an animation segment T1 in which the target virtual character completes the target task based on an obtained animation segment T0 of the target virtual character. The generation process includes: the animation segment T0 comprises an animation frame A0; the posture of the target virtual character in the animation frame A0 is acquired, and the posture of the target virtual character is adjusted according to the target task set for the target virtual character to obtain an animation frame A1, where the posture is adjusted by adjusting the torques of N joints of the target virtual character in the animation frame A0, N being a positive integer greater than or equal to 1; and an animation segment T1 in which the target virtual character completes the target task is obtained, the animation segment T1 being composed of at least the two animation frames A0 and A1.
The method can generate an animation segment containing a pose sequence of the target virtual character according to the target task to be executed by the target virtual character, thereby reducing manual effort and improving efficiency. Moreover, in the prior art an animator manually drags the skeleton of the target virtual character and corrects its motion little by little to achieve the required effect; the resulting motion rarely reaches a realistic and natural state, so the motion quality of the target virtual character is poor. In the present application, the state information of the target virtual character in the previous animation frame and the set target task are input into the control strategy network, the torques output by the control strategy network for adjusting each joint of the target virtual character are obtained, and each joint of the target virtual character in the previous animation frame is adjusted according to those torques to obtain the posture of the target virtual character in the next animation frame. Because the control strategy network is trained on a sample animation segment containing a pose sequence of a reference virtual character, and the posture of the target virtual character is adjusted according to the per-joint torques it outputs so that the character performs the corresponding action, the motion quality of the target virtual character can be improved.
The animation generation method provided by the present application may be applied to the game server 12, or may be applied to the client, and the terminal device 11 may implement the animation generation method provided by the present application, or may be completed by the game server 12 in cooperation with the client in the terminal device 11.
FIG. 2 is a flow chart illustrating an animation generation method according to an embodiment of the present application. As shown in fig. 2, the method comprises the steps of:
in step S201, an animation fragment T0 of the target virtual character is obtained.
The target virtual character may be in the form of a human figure, an animal, a cartoon or other forms, and the embodiments of the present application are not limited thereto. The target virtual character can be displayed in a three-dimensional form or a two-dimensional form. The target virtual character is provided with bones, joints are arranged between adjacent bones, the posture of the target virtual character can be changed by changing the position and the rotating angle of each joint, and a series of postures of the target virtual character are connected to form a continuous action.
The animation segment T0 includes an A0 animation frame. The animation segment T0 may be an animation segment input by the user, a pre-saved animation segment, or an animation segment generated in the game.
And S202, acquiring the posture of the target virtual character in the A0 animation frame, and adjusting the posture of the target virtual character according to the target task set by the target virtual character to obtain an A1 animation frame.
During the game, the user can control the target virtual character to execute different actions through control keys. In the embodiments of the present application, each control key corresponds to one task; the user sets a target task for the target virtual character through a control key, and the target virtual character executes the action corresponding to that target task.
The game client receives an operation, input by the user through a control key, instructing a target virtual character to execute a target task, acquires the posture of the target virtual character in the animation frame A0, and adjusts the posture of the target virtual character according to the target task set for the target virtual character to obtain the animation frame A1. The adjusted posture of the target virtual character is obtained by adjusting the torques of N joints of the target virtual character in the animation frame A0, where N is a positive integer greater than or equal to 1.
In step S203, an animation segment T1 of the target virtual character completing the target task is obtained.
The animation segment T1 is composed of at least the two animation frames A0 and A1, where A0 is the frame preceding A1.
In some embodiments, the posture of the target virtual character in the animation frame A1 may further be acquired, and the posture of the target virtual character may be adjusted according to the target task set for the target virtual character to obtain an animation frame A2, where the adjustment is made by adjusting the torques of the N joints of the target virtual character in the animation frame A1. By analogy, subsequent animation frames can be generated in sequence, and displaying all the animation frames in order yields the animation segment T1 in which the target virtual character completes the target task. Specifically, the postures of the target virtual character differ between adjacent frames, and a series of its postures, connected together, forms a continuous action, giving an animation segment in which the target virtual character performs the target task through that series of actions.
By the method, the animation segment containing the attitude sequence of the target virtual character can be automatically generated according to the target task to be executed by the target virtual character based on the existing animation segment, so that the consumption of manpower is reduced, and the animation generation efficiency is improved. And the posture of the target virtual character is adjusted by adjusting the moment of each joint of the target virtual character, so that the target virtual character executes corresponding action, and the action effect of the target virtual character can be improved.
In one embodiment, the process of obtaining each animation frame in the above step S202, such as obtaining the A1 or A2 animation frame, may proceed as shown in fig. 3 and includes the following steps:
step S2021, acquiring the state information of the target virtual character in the previous frame of animation.
The state information of the target virtual character is used to characterize the physical state of the target virtual character, and may include the phase data of the target virtual character in the previous animation frame, as well as the pose data and velocity data of each of its joints.
The phase data represents the stage of the target virtual character within the current animation segment, i.e., the progress of its action; the phase data in the previous animation frame refers to where the posture of the target virtual character in that frame falls among all of the action postures required to complete the target task. The phase data may be expressed as a fraction or percentage, i.e., one-dimensional data. For example, if 30 animation frames are required to complete a certain target task and the previous animation frame is the 5th frame, the phase data is 5/30 = 1/6. The phase data may also be expressed in terms of time. For example, if 30 animation frames are required to complete a certain target task, the total playing duration of the 30 frames is T, the playing time of the first frame is taken as the starting time, and the playing time of the previous animation frame is t, then the phase data Ph of the target virtual character in the previous animation frame can be expressed as Ph = t/T.
The pose data is used to characterize the pose morphology of the target virtual character, and in some embodiments, the pose data may be represented by rotation data of the respective joints of the target virtual character in a world coordinate system. The posture of the target virtual character can be accurately expressed by adopting a parameter matrix formed by the rotation data of each joint in the world coordinate system, and the used data is less, so that the calculation amount can be reduced, the calculation speed is accelerated, and the efficiency of generating the animation segments is further improved.
Illustratively, the rotation information of each joint in the world coordinate system may be described by quaternions. A quaternion is a hypercomplex number. Just as a complex number is composed of a real number plus an imaginary unit i, a quaternion is composed of a real number plus three imaginary units i, j, k, which satisfy i² = j² = k² = −1 and i⁰ = j⁰ = k⁰ = 1. Each quaternion is a linear combination of 1, i, j, and k, generally written as a + bi + cj + dk, where a, b, c, d are real numbers. The geometric significance of i, j, k themselves can be understood as rotations: the i rotation represents a rotation from the positive X axis toward the positive Y axis in the plane spanned by the X and Y axes; the j rotation represents a rotation from the positive Z axis toward the positive X axis in the plane spanned by the Z and X axes; the k rotation represents a rotation from the positive Y axis toward the positive Z axis in the plane spanned by the Y and Z axes; and −i, −j, −k represent the respective reverse rotations. If the target virtual character includes N joints, the input dimension of the pose data may be N × 4, where N is an integer greater than or equal to 1.
In other embodiments, the pose data may include position data and rotation data, and the position data may be represented as three-dimensional coordinates, with the three-dimensional coordinates representing the spatial coordinates of a joint. The rotation data can be expressed as quaternions, and the quaternions are used for expressing the rotation of one joint in a three-dimensional space. By representing the pose of the target virtual character using the position data and the rotation data, the pose of the target virtual character can be determined more accurately.
The velocity data characterizes the velocity state of the target virtual character. In some embodiments, the velocity data may be represented by the angular velocities of the joints of the target virtual character; since the angular velocity of each joint has dimension 3, if the target virtual character includes N joints, the velocity dimension of the target virtual character may be N × 3. The velocity data of the target virtual character in the current animation frame can be obtained from its pose data in the previous animation frame and its pose data in the current animation frame. The angular velocities of the joints express the velocity state of the target virtual character accurately with little data, which reduces the amount of computation and speeds it up.
In other embodiments, the velocity data may be composed of the linear velocities and angular velocities of the joints of the target virtual character in the world coordinate system, where the world coordinate system may be a three-dimensional coordinate system with mutually perpendicular X, Y, and Z axes; once the world coordinate system is determined, it does not change with the scene environment in which the target virtual character is located. For each joint, the linear velocity has dimension 3 and the angular velocity has dimension 3, giving 3 + 3 = 6; if the target virtual character includes N joints, the velocity dimension of the target virtual character may be N × 6. Representing the velocity data by the combination of linear and angular velocity determines the velocity of the target virtual character more accurately. The phase data of the target virtual character in the previous animation frame and the pose data and velocity data of each of its joints are acquired respectively, and the combination of the phase data, pose data, and velocity data serves as the state information of the target virtual character in the previous animation frame.
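As a concrete illustration (a sketch under assumed dimensions, not code from the original disclosure), the state information for a character with N joints could be assembled by concatenating one phase scalar, N × 4 quaternion rotations, and N × 6 linear and angular velocities:

    import numpy as np

    def build_state(phase, joint_rot, joint_lin_vel, joint_ang_vel):
        """Concatenate phase (1), rotations (N x 4 quaternions), and
        velocities (N x 3 linear + N x 3 angular) into one state vector."""
        return np.concatenate([
            np.array([phase]),           # stage within the current clip, t/T
            joint_rot.reshape(-1),       # N x 4 world-frame quaternions
            joint_lin_vel.reshape(-1),   # N x 3 linear velocities
            joint_ang_vel.reshape(-1),   # N x 3 angular velocities
        ])

    # Example: a character with N = 15 joints gives 1 + 15*4 + 15*6 = 151 dims.
    N = 15
    state = build_state(5 / 30, np.zeros((N, 4)), np.zeros((N, 3)), np.zeros((N, 3)))
    print(state.shape)  # (151,)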
Step S2022, inputting the state information and the target task of the target virtual character in the previous frame of animation picture into the trained control strategy network to obtain the torque which is output by the control strategy network and is used for adjusting each joint of the target virtual character.
The target task is a task input by the user for the target virtual character, for example, if the user inputs a control command of "jump" through a control key on the display interface, so that the target virtual character jumps from the ground, the target task set for the target virtual character is "jump". The target task may also be other tasks, such as making the target virtual character advance in a given direction, or making the target virtual character kick to a designated position using a whirlwind kick action, which is not limited by the embodiment of the present application.
The target task may use a vector representation when it enters the control policy network. For example, assuming that the target task is to advance the target avatar in a given direction, the given direction may be represented by a two-dimensional vector on the horizontal plane. The state information of the target virtual character in the previous frame of animation picture and the vector representing the target task can be spliced together and input into the control strategy network, and the control strategy network outputs the torque for adjusting each joint of the target virtual character.
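For instance (an assumption for illustration), a "move in a given direction" task could be encoded as a unit vector on the horizontal plane and spliced onto the state vector from the sketch above before being fed to the network:

    import numpy as np

    direction = np.array([1.0, 0.0])                  # target task: move along +X
    task_vec = direction / np.linalg.norm(direction)
    policy_input = np.concatenate([state, task_vec])  # 151 + 2 = 153 dims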
The control strategy network is obtained by training according to a sample animation segment containing the attitude sequence of the reference virtual character. The specific training process of the control strategy network will be described in detail below.
Step S2023, adjusting the posture of the target virtual character according to the moment of each joint for adjusting the target virtual character, to obtain the posture of the target virtual character in the next frame of animation.
In some embodiments, the physical engine may apply the moment of each joint to each joint of the target virtual character to adjust the posture of the target virtual character in the previous frame of animation image, so as to obtain the posture of the target virtual character in the next frame of animation image. The known moment of each joint can be directly used for accurately adjusting the posture of each joint of the target virtual character so as to enable the target virtual character to execute corresponding actions, so that the action effect of the target virtual character can be improved, and the actions of the target virtual character are more real and natural.
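The application does not tie itself to a particular physics engine; as one hedged example, per-joint torque control in the open-source PyBullet engine could look like the following (PyBullet's built-in joint motors must be disabled before raw torques take effect):

    import pybullet as p

    def apply_torques(body_id, joint_indices, torques):
        for j in joint_indices:
            # Disable the default velocity motor so the raw torque is applied.
            p.setJointMotorControl2(body_id, j, p.VELOCITY_CONTROL, force=0)
        for j, tau in zip(joint_indices, torques):
            p.setJointMotorControl2(body_id, j, p.TORQUE_CONTROL, force=tau)
        p.stepSimulation()  # gravity, contacts, and external pushes act here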
In other embodiments, when the pose of the target virtual character is adjusted according to the moment of each joint, the influence factor of the scene environment in which the target virtual character is located on the target virtual character may be considered, and the pose of the target virtual character in the next frame of animation picture may be determined by combining the moment of each joint and the influence factor of the scene environment. The scene environment may be a physical environment simulated by a physical engine, and in the simulated physical environment, the target virtual character follows a dynamics law, so that the action of the target virtual character is closer to a real situation. The influence factor of the scene environment may include an acting force of the scene environment on the target virtual character, for example, if an object in the scene environment is hit against the target virtual character, the target virtual character is subjected to a thrust action. And the moment of each joint and the acting force of the scene environment on the target virtual character are adjusted to jointly act on the target virtual character, so that the posture of the target virtual character in the next frame of animation picture is obtained.
Step S2024, generate the next frame of animation image based on the pose of the target virtual character in the next frame of animation image.
And integrating the obtained posture of the target virtual character into a scene environment where the target virtual character is located to obtain a next frame of animation picture. After one frame of animation picture is generated, the subsequent animation picture can be generated by circularly executing the processes.
In another embodiment, in the step S202, the process of obtaining each frame of animation picture may be as shown in fig. 4, and includes the following steps:
in step S202a, the state information of the target virtual character in the previous animation frame is obtained.
Step S202b, inputting the state information of the target virtual character in the previous animation picture, the target task, and the environment information of the scene environment where the target virtual character is located into the trained control policy network, and obtaining the torque output by the control policy network for adjusting each joint of the target virtual character.
The environment information of the scene environment may be, but is not limited to, a topographic map of the scene environment, the topographic map of the scene environment may reflect a change of a height of a terrain in the scene environment, the change of the terrain in the scene environment may affect a control policy output by the control policy network, and the control policy is a torque for adjusting each joint of the target virtual character. And the control strategy network integrates the state information of the target virtual character in the previous frame of animation picture, the target task and the environment information of the scene environment where the target virtual character is positioned, and outputs a corresponding control strategy. When the moment for adjusting each joint of the target virtual character is generated, the action of the target virtual character can be adaptive to the environment information by combining the environment information of the scene environment, so that the action of the target virtual character is more real and natural in the scene environment.
Step S202c, the posture of the target virtual character is adjusted according to the moment of each joint, and the posture of the target virtual character in the next frame of animation picture is obtained.
In step S202d, a next frame animation picture is generated based on the posture of the target virtual character in the next frame animation picture.
The following describes in detail a training process of a control strategy network employed in an embodiment of the present application. The embodiment of the application trains a control strategy network aiming at the problem of continuous control of the target virtual role attitude. As shown in fig. 5, the principle of the training control strategy network is that a training object corresponding to a target virtual character is designed in advance, and may be referred to as an agent. The training object is an action object of a control strategy output by the control strategy network in the training process, and the training object is in a set scene environment. The physical engine determines the action of the training object according to the control strategy output by the control strategy network, interacts the action of the training object with the scene environment, determines the reward value of the interaction and the state of the training object after the interaction is completed, and adjusts the control strategy output by the control strategy network according to the reward value and the state of the training object to obtain a better control strategy.
In the embodiment of the application, one control strategy network can be trained for the same type of target task, and corresponding control strategy networks can be trained for different types of target tasks respectively. For example, "shoot forward", "shoot left" and "shoot right" all belong to shots, only the direction of the shot is different, and therefore belonging to the same type of target task, a control strategy network can be trained. And the shooting and running belong to different types of target tasks, and corresponding control strategy networks can be trained respectively.
The training process of the control strategy network can be as shown in fig. 6, and includes the following steps:
step S601, the initial state of the training object is determined according to the initial state of the reference virtual character in the sample animation segment.
Wherein, the training object is the action object of the control strategy output by the control strategy network in the training process. Before training the control strategy network, a sample animation segment containing a sequence of poses of a reference virtual character needs to be obtained. In some embodiments, the reference avatar may be an already-fabricated virtual object, and the training object and the reference avatar are virtual objects with the same or similar skeletons.
Step S602, inputting the state information of the training object at the current moment and the set training task into the control strategy network to obtain the control strategy at the next moment output by the control strategy network, wherein each moment corresponds to one frame of animation picture.
Wherein the training task is set corresponding to the target task in the use process. Illustratively, the training task may be to advance the training subject in a given direction, or to have the training subject kick to a designated location using a whirlwind kick action.
The current moment is the playing moment corresponding to the current sample animation picture, and the next moment is the playing moment corresponding to the next frame of sample animation picture. The state information of the training object includes phase data, attitude data, and velocity data, and the state information of the target virtual character is the same as above, and is not described herein again.
The control policy network (actor network) may be a multilayer network comprising an input layer, a hidden layer, and an output layer. The hidden layer may include one or more neural network layers, set according to the actual situation, which is not limited in the embodiments of the present application. The neural network layers in the hidden layer may be fully connected layers. For example, the hidden layer may include two fully connected layers, where the first fully connected layer may include 1024 neurons and the second fully connected layer may include 512 neurons. When the hidden layer includes two or more neural network layers, the activation function between them is the ReLU (Rectified Linear Unit) function.
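A minimal PyTorch sketch of an actor network with the layer sizes given above (the input and output dimensions are assumptions for illustration):

    import torch
    import torch.nn as nn

    class PolicyNetwork(nn.Module):
        """Actor: maps (state, task) to one torque value per joint axis."""
        def __init__(self, state_dim, task_dim, torque_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + task_dim, 1024),  # first hidden layer
                nn.ReLU(),
                nn.Linear(1024, 512),                   # second hidden layer
                nn.ReLU(),
                nn.Linear(512, torque_dim),             # output: joint torques
            )

        def forward(self, state, task):
            return self.net(torch.cat([state, task], dim=-1))

    # Example: 151-dim state, 2-dim direction task, 15 joints x 3 torque axes.
    policy = PolicyNetwork(state_dim=151, task_dim=2, torque_dim=45)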
In an embodiment, as shown in fig. 7, the state information of the training object at the current time and the set training task may be input into the control strategy network, so as to obtain the control strategy at the next time output by the control strategy network. The control strategy is the moment acting on each joint of the training object at the next moment.
In another embodiment, as shown in fig. 8, the environment information of the scene environment in which the training object is located may be acquired; this environment information may be a topographic map of the scene environment. The environment information of the scene environment, the state information of the training object at the current moment, and the training task are input into the control strategy network to obtain the control strategy at the next moment output by the control strategy network. For example, for tasks that require adapting to the terrain, the control strategy network may further include a feature extraction network composed of multiple convolution layers and fully connected layers, which extracts terrain features from the input environment information of the scene environment, combines the extracted terrain features with the input state information of the training object and the training task, and determines the torques acting on each joint of the training object at the next moment. By combining the environment information of the scene environment when generating the control strategy, the actions of the training object can adapt to the environment, making them more realistic and natural in the scene environment.
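A hedged sketch of such a feature extraction branch (all layer sizes below are assumptions); its output would be concatenated with the state and task vectors before the fully connected policy layers shown earlier:

    import torch
    import torch.nn as nn

    class TerrainEncoder(nn.Module):
        """Extracts a compact feature vector from a height-map patch."""
        def __init__(self, feature_dim=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            self.fc = nn.LazyLinear(feature_dim)  # infers the flattened size

        def forward(self, height_map):  # height_map: (batch, 1, H, W)
            return self.fc(self.conv(height_map))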
And step S603, controlling the interaction of the training object and the scene environment according to the obtained control strategy, and determining the state information of the training object at the next moment.
The scene environment is the scene displayed when the physics engine runs on the terminal device, and may be a scene created for the training object to act in. The scene environment may be a simulation of the real world, a semi-simulated and semi-fictional environment, or a purely fictional environment. It may be a two-dimensional virtual environment or a three-dimensional virtual environment; when it is a three-dimensional virtual environment, the virtual object is a three-dimensional model created based on skeletal animation technology. Each virtual object has its own shape and volume in the three-dimensional virtual environment and occupies a portion of its space.
In one embodiment, the training object can be controlled to interact with the scene environment according to the torques applied to its joints at the next time, and the state information of the training object at the next time can thereby be determined.
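A single interaction step might look like the following sketch, where `engine` stands for a hypothetical physics-engine wrapper and `apply_torques`, `step` and `get_state` are placeholder names, not a real engine API:

```python
def rollout_step(policy, engine, state, task):
    """One interaction step: policy -> torques -> physics -> next state.

    `engine` is a hypothetical physics-engine wrapper; `apply_torques`,
    `step` and `get_state` are placeholder names for whatever interface
    the engine actually provides.
    """
    with torch.no_grad():
        torques = policy(state, task)    # control strategy for the next time
    engine.apply_torques(torques)        # act on each joint of the training object
    engine.step()                        # advance the simulation by one frame
    return engine.get_state()            # state information at the next time
```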
Step S604, determining the reward value at the current moment according to the state information of the training object and the reference virtual character at the next moment and the set training task.
The state information of the training object and of the reference virtual character at the next time, together with the set training task, are input into a value evaluation network. The value evaluation network determines the task goal reward at the current time according to the state of the training object at the next time and the set training task, determines the imitation target reward at the current time according to the state information of the training object at the next time and the state information of the reference virtual character at the next time, and determines the reward value at the current time according to the task goal reward and the imitation target reward.
The network structure of the value evaluation network (critic network) may be the same as or different from that of the control strategy network. The value evaluation network is used to evaluate the control strategies output by the control strategy network, determining the reward value the training object earns for imitating the reference virtual character and completing the training task. The value evaluation network is trained as well.
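A minimal critic sketch in the same PyTorch style, mirroring the actor's hidden structure as the text permits; the scalar-value output is the usual actor-critic convention, assumed here rather than specified by the application:

```python
class ValueEvaluationNetwork(nn.Module):
    """Critic: scores an input (state, task, ...) with a scalar value.

    Mirrors the actor's hidden structure, which the text above permits;
    the scalar output is the standard actor-critic convention, assumed here.
    """
    def __init__(self, input_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 1),  # scalar value estimate
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```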
As can be seen from the above description, the reward value at the current time includes both a task goal reward and an imitation target reward. The imitation target reward encourages the posture of the training object to stay consistent with that of the reference virtual character: the two postures are compared when they are at the same phase, and the closer they are, the higher the imitation target reward; conversely, the lower it is. The task goal reward is determined by how well the training object completes the training task. The reward value $r_t$ at the current time $t$ can be expressed as:

$$r_t = w^I r_t^I + w^G r_t^G$$

where $r_t^I$ denotes the imitation target reward corresponding to time $t$, $w^I$ is the weight of the imitation target reward, $r_t^G$ denotes the task goal reward corresponding to time $t$, and $w^G$ is the weight of the task goal reward. $w^I$ and $w^G$ are related to the network parameters of the value evaluation network.
Optionally, the imitation target reward $r_t^I$ may be composed of kinematic similarity measures along five dimensions: pose similarity $r_t^p$, velocity similarity $r_t^v$, end joint similarity $r_t^e$, root joint pose similarity $r_t^r$, and centroid pose similarity $r_t^c$. The imitation target reward $r_t^I$ can be expressed as:

$$r_t^I = w^p r_t^p + w^v r_t^v + w^e r_t^e + w^r r_t^r + w^c r_t^c$$

where $w^p$ is the weight of the pose similarity $r_t^p$, $w^v$ the weight of the velocity similarity $r_t^v$, $w^e$ the weight of the end joint similarity $r_t^e$, $w^r$ the weight of the root joint pose similarity $r_t^r$, and $w^c$ the weight of the centroid pose similarity $r_t^c$. The root joint is used to represent the spine in the torso.
The pose similarity $r_t^p$ describes the similarity between the postures of the training object and the reference virtual character, i.e. between the positions and rotations of their joints, and can be expressed in the form:

$$r_t^p = \exp\Big[-k_p \sum_j \big\|\hat{q}_t^j \ominus q_t^j\big\|^2\Big]$$

where $\hat{q}_t^j$ denotes the pose data of the $j$-th joint of the reference virtual character at time $t$, which may be represented by a quaternion, $q_t^j$ denotes the pose data of the $j$-th joint of the training object at time $t$, $\ominus$ denotes the quaternion difference, and $k_p$ is a positive scaling coefficient.
The velocity similarity $r_t^v$ describes the similarity between the velocities of the training object and the reference virtual character, including the angular velocity and linear velocity of each joint, and can be expressed in the form:

$$r_t^v = \exp\Big[-k_v \sum_j \big\|\hat{v}_t^j - v_t^j\big\|^2\Big]$$

where $\hat{v}_t^j$ denotes the velocity data of the $j$-th joint of the reference virtual character at time $t$, $v_t^j$ denotes the velocity data of the $j$-th joint of the training object at time $t$, and $k_v$ is a positive scaling coefficient.
The end joint similarity $r_t^e$ describes the similarity between the postures of the end joints (the joints at the ends of the four limbs) of the training object and of the reference virtual character, and can be expressed in the form:

$$r_t^e = \exp\Big[-k_e \sum_e \big\|\hat{p}_t^e - p_t^e\big\|^2\Big]$$

where $\hat{p}_t^e$ denotes the pose data of the $e$-th end joint of the reference virtual character at time $t$, $p_t^e$ denotes the pose data of the $e$-th end joint of the training object at time $t$, and $k_e$ is a positive scaling coefficient.
The root joint pose similarity $r_t^r$ describes the similarity between the root joint postures of the training object and the reference virtual character, and can be expressed in the form:

$$r_t^r = \exp\Big[-k_r \big\|\hat{q}_t^{root} \ominus q_t^{root}\big\|^2\Big]$$

where $\hat{q}_t^{root}$ denotes the pose data of the root joint of the reference virtual character at time $t$, $q_t^{root}$ denotes the pose data of the root joint of the training object at time $t$, and $k_r$ is a positive scaling coefficient.
The centroid pose similarity $r_t^c$ describes the similarity between the center-of-gravity positions of the training object and the reference virtual character, and can be expressed in the form:

$$r_t^c = \exp\Big[-k_c \big\|\hat{c}_t - c_t\big\|^2\Big]$$

where $\hat{c}_t$ denotes the position of the center of gravity of the reference virtual character in the world coordinate system at time $t$, $c_t$ denotes the position of the center of gravity of the training object in the world coordinate system at time $t$, and $k_c$ is a positive scaling coefficient.
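Putting the five terms together, the imitation target reward might be computed as in the following sketch; the weights, the scaling coefficient, and the componentwise squared error standing in for the quaternion difference $\ominus$ are all illustrative assumptions:

```python
import numpy as np

def similarity(ref, actual, k):
    """exp(-k * squared error): 1 when identical, decaying with distance."""
    return float(np.exp(-k * np.sum((ref - actual) ** 2)))

def imitation_reward(ref, agent, w=(0.5, 0.1, 0.2, 0.1, 0.1), k=2.0):
    """Weighted sum of the five similarity terms.

    `ref` and `agent` are dicts of per-frame arrays for the reference
    virtual character and the training object; the weights `w` and the
    scale `k` are assumptions, not values from this application.
    """
    r_p = similarity(ref["joint_rotations"],  agent["joint_rotations"],  k)  # pose
    r_v = similarity(ref["joint_velocities"], agent["joint_velocities"], k)  # velocity
    r_e = similarity(ref["end_joints"],       agent["end_joints"],       k)  # end joints
    r_r = similarity(ref["root_pose"],        agent["root_pose"],        k)  # root joint
    r_c = similarity(ref["center_of_mass"],   agent["center_of_mass"],   k)  # centroid
    w_p, w_v, w_e, w_r, w_c = w
    return w_p * r_p + w_v * r_v + w_e * r_e + w_r * r_r + w_c * r_c
```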
The setting of the task goal reward may also vary with the type of training task. For example, if the training task is for the training object to imitate the walking posture of a reference virtual character and to turn while walking, a task goal reward for the walking direction is set to encourage the training object to advance in a given direction at a given speed. The given direction can be represented by a two-dimensional vector $\hat{d}$ on the horizontal plane; in the input of the control strategy network, this two-dimensional vector is spliced together with the state information. The task goal reward can be expressed in the form:

$$r_t^G = \exp\Big[-k_G \big(v^{*} - \hat{d}^{\top} v_t\big)^2\Big]$$

where $v^{*}$ denotes the desired speed set in the given direction $\hat{d}$, $v_t$ denotes the velocity of the center of gravity of the training object in the horizontal plane, and $k_G$ is a positive scaling coefficient. This task goal reward penalizes the training object both for falling short of and for exceeding the desired speed in the given direction.
If the training task is for the training object to kick to a designated position using a whirlwind kick action, then, for example, a random target sphere may be designated around the training object, and the training object kicks the designated target sphere with a whirlwind kick. The task vector of this training task consists of two parts: one is the position $x^{*}$ of the given target sphere, which can be represented by a three-dimensional vector in space; the other is a binary flag $h$ indicating whether the target was hit in the previous time period. In this training task, the relative position of the target sphere and the training object should satisfy the following conditions:
(1) the distance between the target sphere and the training object is randomly selected in the range [0.6 m, 0.8 m];
(2) the height of the target sphere above the ground is in the range [0.8 m, 1.2 m];
(3) the direction of the target sphere is within 2 radians of the initial orientation of the training object.
Keeping the target sphere within these ranges ensures that it stays within reach of the training object: under the above conditions, the target sphere is at a certain distance from the training object, at a certain height above the ground, and positioned in front of the training object, so that it can be touched when the whirlwind kick of the training object is within its proper range of motion.
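A sampler satisfying conditions (1)-(3) might look like the sketch below; treating the distance as horizontal and "within 2 radians" as a symmetric offset are interpretive assumptions:

```python
import numpy as np

def sample_target_sphere(agent_pos, agent_heading):
    """Sample a target-sphere position satisfying conditions (1)-(3).

    `agent_pos` is the training object's (x, y, z) position, `agent_heading`
    its initial orientation in radians; reading the distance as horizontal
    and "within 2 radians" as a symmetric offset are interpretive assumptions.
    """
    dist = np.random.uniform(0.6, 0.8)                    # condition (1)
    height = np.random.uniform(0.8, 1.2)                  # condition (2)
    angle = agent_heading + np.random.uniform(-2.0, 2.0)  # condition (3)
    x = agent_pos[0] + dist * np.cos(angle)
    y = agent_pos[1] + dist * np.sin(angle)
    return np.array([x, y, height])                       # height measured from the ground
```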
In this training task, the task goal reward can be expressed in the form:

$$r_t^G = \begin{cases} 1, & \text{the target has been hit} \\ \exp\big[-k_G \big\|x^{*} - x_t\big\|^2\big], & \text{otherwise} \end{cases}$$

where $x^{*}$ is the given target position, $x_t$ denotes the position of the training object at time $t$, the first case applies once the binary flag $h$ indicates the target has been hit, and $k_G$ is a positive scaling coefficient. The aim of the training task is to hit the designated target accurately while maintaining the whirlwind kick posture, thereby completing the task.
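The two task goal rewards above might be computed as follows; the scale factors and the piecewise shaping of the strike reward are assumptions:

```python
import numpy as np

def heading_task_reward(v_desired, direction, com_velocity, k=2.5):
    """Walking task: highest when the center-of-gravity speed along the
    given direction matches the desired speed (scale k is assumed)."""
    v = float(np.dot(direction, com_velocity))  # speed along the given direction
    return float(np.exp(-k * (v_desired - v) ** 2))

def strike_task_reward(target_pos, agent_pos, hit, k=4.0):
    """Whirlwind-kick task: full reward once the target has been hit,
    otherwise a distance-shaped term (piecewise form and k are assumed)."""
    if hit:  # binary flag h from the task vector
        return 1.0
    diff = np.asarray(target_pos) - np.asarray(agent_pos)
    return float(np.exp(-k * np.sum(diff ** 2)))
```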
FIG. 11 is a graph showing the convergence of the various rewards during training; the abscissa of FIG. 11 represents the number of training iterations, and the ordinate represents the reward fed back. In FIG. 11, curve ① is the convergence curve of the centroid pose similarity, curve ② that of the velocity similarity, curve ③ that of the root joint pose similarity, curve ④ that of the end joint pose similarity, curve ⑤ that of the task goal reward, curve ⑥ that of the pose similarity, and curve ⑦ that of the imitation target reward. As can be seen from FIG. 11, when the number of training iterations reaches about 10,000, the various rewards substantially converge, and the training object substantially achieves the goal of imitating the reference virtual character in all respects.
After the reward value at each time is determined, the process proceeds to step S605.
In step S605, an expected reward value is determined according to the control strategy at each time, the state information of the training object at each time, and the reward value at each time.
Let the control strategy output by the control strategy network at time $t$ be $a_t$, and the state information of the training object at time $t$ be $s_t$. The control strategies at each time and the state information of the training object at each time form a strategy trajectory $\tau$, which can be expressed as $\tau = (s_0, a_0, s_1, \ldots, a_{T-1}, s_T)$, where $s_0$ denotes the state information of the training object at the start time, $a_0$ denotes the control strategy at the start time, and $T$ is determined by the total duration of the sample animation.

The sum of the reward values at each time is taken as the overall reward value corresponding to the strategy trajectory, which can be expressed as:

$$R(\tau) = \sum_{t=0}^{T} \gamma_t r_t$$

where $\gamma_t$ represents the weight corresponding to the reward value at time $t$.

The probability of the strategy trajectory occurring is determined according to the current parameters of the control strategy network, and can be expressed as:

$$p_\theta(\tau) = p(s_0) \prod_{t=0}^{T-1} p(s_{t+1} \mid s_t, a_t)\, \pi_\theta(a_t \mid s_t)$$

where $p_\theta(\tau)$ represents the probability distribution of the trajectory, $\pi_\theta$ represents the control strategy network, and $\theta$ represents the current parameters of the control strategy network.

The expected reward value $J(\theta)$ is determined according to the overall reward value corresponding to the strategy trajectory and the probability of the trajectory occurring, and can be expressed as:

$$J(\theta) = \mathbb{E}_{\tau \sim p_\theta(\tau)}\big[R(\tau)\big] = \int p_\theta(\tau)\, R(\tau)\, d\tau$$
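As a sketch, the expected reward value could be estimated by Monte-Carlo sampling of trajectories under the current policy; the geometric weights $\gamma_t = \gamma^t$ are an assumption, as the text only states that $\gamma_t$ weights the reward at time $t$:

```python
def trajectory_return(rewards, gamma=0.95):
    """R(tau) = sum_t gamma_t * r_t, with geometric weights gamma_t = gamma**t
    taken as an assumption."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def estimate_expected_reward(trajectories, gamma=0.95):
    """Monte-Carlo estimate of J(theta): trajectories sampled by running the
    current policy occur with probability p_theta(tau), so their average
    return approximates the expectation over p_theta."""
    returns = [trajectory_return(tr["rewards"], gamma) for tr in trajectories]
    return sum(returns) / len(returns)
```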
step S606, judging whether the set training end condition is reached; if yes, go to step S608; if not; step S607 is executed.
The training end condition may be that the number of training iterations reaches a set number, that the variation of the expected reward value over N consecutive training iterations stays within a set range, or that the expected reward value reaches a set threshold. In practice, the control strategy network usually converges well once the number of training iterations reaches about 7,000 to 10,000, so the set number of iterations may be chosen in this range.
Step S607: adjust the parameters of the control strategy network according to the expected reward value.
If the set training end condition is not reached, the parameters of the control strategy network are adjusted according to the expected reward value, and training of the control strategy network continues with the adjusted parameters.
Step S608, using the current parameter as a parameter of the control strategy network to obtain the trained control strategy network.
If the number of training iterations reaches the set number, or the variation of the expected reward value over N consecutive training iterations is small, or the expected reward value reaches the set threshold (which may be set above 85%), training ends, and the current parameters are taken as the parameters of the control strategy network to obtain the trained control strategy network.
Illustratively, the control strategy network may be trained using a deep reinforcement learning algorithm suited to continuous control problems, such as the PPO (Proximal Policy Optimization) algorithm, the SAC (Soft Actor-Critic) algorithm, or the DDPG (Deep Deterministic Policy Gradient) algorithm.
The embodiment of the present application tested control strategy networks trained with the PPO algorithm and with the SAC algorithm; the training task of the test was to make the training object imitate the walking posture of a reference virtual character and complete a turning task while walking. The training effects of the two algorithms are shown in FIG. 9, where the abscissa represents the number of training iterations and the ordinate represents the reward value fed back. In FIG. 9, curve ① is the task goal reward for the PPO algorithm, curve ② the task goal reward for the SAC algorithm, curve ③ the imitation target reward for the SAC algorithm, curve ④ the imitation target reward for the PPO algorithm, curve ⑤ the overall reward value for the PPO algorithm, curve ⑥ the overall reward value for the SAC algorithm, curve ⑦ the expected reward value for the PPO algorithm, and curve ⑧ the expected reward value for the SAC algorithm.
The training processes for training the control strategy network with the PPO algorithm and with the SAC algorithm are basically the same, and reference may be made to the training process shown in fig. 6; the two algorithms differ in how the function for the expected reward value $J(\theta)$ is constructed. As can be seen from fig. 9, the convergence of the overall reward values of the PPO and SAC algorithms is quite close. The SAC algorithm performs slightly better on the training task, has a stronger exploration capability, and converges relatively quickly, while the PPO algorithm has the advantage in posture imitation: the posture of the training object is more lifelike and better meets the requirements. Thus, the PPO algorithm may be employed to construct the function for the expected reward value $J(\theta)$ given above.
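For reference, PPO's usual clipped surrogate objective, one common way of constructing such an expected-reward function, can be sketched as follows (epsilon = 0.2 is the customary default, not a value taken from this application):

```python
import torch

def ppo_surrogate_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """PPO's clipped surrogate objective (negated for gradient descent).

    One common construction of the expected-reward objective; clip_eps=0.2
    is the customary default, not a value taken from this application.
    """
    ratio = torch.exp(log_probs_new - log_probs_old)  # pi_theta / pi_theta_old
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return -torch.mean(torch.min(ratio * advantages, clipped * advantages))
```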
Fig. 10 is a display diagram showing the effect of the control strategy output by the trained control strategy network, taking an imitated walking posture as an example. Fig. 10 (a) shows the posture of the reference virtual character in the animation picture at a certain time, and Fig. 10 (b) shows the posture of the training object in the animation picture at the corresponding time. To facilitate observation, the training object and the reference virtual character are not skinned during training. The reference virtual character and the training object each include N joints, which may be the 15 joints shown in Fig. 10. The joint indicated by a0 is the root joint, the topmost parent node, located at the pelvis of the humanoid character. The remaining 14 joints are the chest indicated by a1, the neck indicated by a2, the right leg indicated by a3, the left leg indicated by a4, the right knee indicated by a5, the left knee indicated by a6, the right ankle indicated by a7, the left ankle indicated by a8, the right forearm indicated by a9, the left forearm indicated by a10, the right elbow indicated by a11, the left elbow indicated by a12, the right wrist indicated by a13, and the left wrist indicated by a14. a7, a8, a13 and a14 may be understood as end joints. As can be seen from Fig. 10, the postures of the training object and the reference virtual character remain substantially consistent.
The following describes an implementation process of the animation generation method provided by the embodiment of the present application by using two specific examples.
In one embodiment, it is assumed that a user inputs a forward instruction through a control key in a game, instructing the target virtual character under their control to advance in a given direction. Based on the existing animation segment T0, the game client acquires the state information of the target virtual character in the A0 animation frame contained in T0, where the A0 animation frame may be the current animation frame being displayed in the display interface. Taking the A0 animation frame as the previous frame, the state information of the target virtual character in the A0 animation frame and the target task of advancing in the given direction are input into the control strategy network to obtain the torques for adjusting the N joints of the target virtual character in the A0 animation frame. Illustratively, the target virtual character may likewise include 15 joints. The target task of advancing in a given direction may be a task vector containing data for the given direction. The obtained torques are applied to the N joints of the target virtual character, and the posture of the target virtual character is adjusted to obtain the A1 animation frame. For example, the posture of the target virtual character in the A1 animation frame may be a state in which the left foot is on the ground and the right foot is lifted backward. The steps are then repeated: taking the A1 animation frame as the previous frame, the state information of the target virtual character in the A1 animation frame and the target task of advancing in the given direction are input into the control strategy network to obtain the torques for adjusting the N joints of the target virtual character in the A1 animation frame. The obtained torques are applied to the N joints, and the posture is adjusted to obtain the A2 animation frame; for example, the posture of the target virtual character in the A2 animation frame may be a state in which the left foot is on the ground and the right foot swings forward. By analogy, a plurality of animation frames can be generated until the forward instruction ends and the target virtual character completes the target task of advancing in the given direction, yielding the animation segment T1 in which the target virtual character executes that task. The animation segment T1 includes the A0 animation frame, the A1 animation frame, the A2 animation frame, and the subsequently generated animation frames.
In another embodiment, it is assumed that the target task set for the target virtual character is to kick a target sphere using a whirlwind kick. Based on the existing animation segment T0, the game client acquires the state information of the target virtual character in the A0 animation frame contained in T0, where the A0 animation frame may be the current animation frame being displayed in the display interface. Taking the A0 animation frame as the previous frame, the state information of the target virtual character in the A0 animation frame and the target task of the whirlwind kick are input into the control strategy network to obtain the torques for adjusting the N joints of the target virtual character in the A0 animation frame. The target task of the whirlwind kick may likewise be a task vector, containing the position coordinates of the target sphere. The obtained torques are applied to the N joints of the target virtual character, and the posture of the target virtual character is adjusted to obtain the A1 animation frame. For example, the posture of the target virtual character in the A1 animation frame may be a state in which the left foot is on the ground and the right foot is lifted forward. The steps are then repeated: taking the A1 animation frame as the previous frame, the state information of the target virtual character in the A1 animation frame and the target task of the whirlwind kick are input into the control strategy network to obtain the torques for adjusting the N joints of the target virtual character in the A1 animation frame. The obtained torques are applied to the N joints, and the posture is adjusted to obtain the A2 animation frame; for example, the posture of the target virtual character in the A2 animation frame may be a state in which the left foot is on the ground and the right foot kicks up to the right. By analogy, a plurality of animation frames can be generated to obtain the animation segment T1 in which the target virtual character completes the target task of the whirlwind kick. The animation segment T1 includes the A0 animation frame, the A1 animation frame, the A2 animation frame, and the subsequently generated animation frames. In this embodiment, the control strategy network is trained on sample animation segments in which the reference virtual character performs a whirlwind kick task; the control strategy network may therefore determine the number of animation frames contained in the animation segment T1.
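Both examples follow the same frame-by-frame loop, sketched below; it reuses `rollout_step` from the earlier sketch, and `engine.get_pose` is again a placeholder for whatever pose-readback call the physics engine provides:

```python
def generate_segment(policy, engine, state_a0, task, num_frames):
    """Frame-by-frame generation of an animation segment.

    Each iteration feeds the previous frame's state and the target task to
    the trained control strategy network, applies the resulting joint
    torques in the physics engine, and records the new pose as the next
    frame. `engine.get_pose` is a hypothetical placeholder call.
    """
    frames = [engine.get_pose()]   # the A0 animation frame
    state = state_a0
    for _ in range(num_frames - 1):
        state = rollout_step(policy, engine, state, task)  # torques -> physics -> next state
        frames.append(engine.get_pose())                   # A1, A2, ... animation frames
    return frames                                          # animation segment T1
```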
With the animation generation method provided by the embodiments of the present application, when a user's operation of setting a target task for the target virtual character is received, animation segments can be generated automatically during the game, taking into account the scene environment in which the target virtual character is located. This greatly reduces the workload of game artists and improves the efficiency of game animation production.
Corresponding to the method embodiment, the embodiment of the application also provides an animation generation device. FIG. 12 is a schematic structural diagram of an animation generation apparatus according to an embodiment of the present application; as shown in fig. 12, the animation generating apparatus includes an animation acquiring unit 121, a posture adjusting unit 122, and an animation generating unit 123. Wherein,
an animation obtaining unit 121, configured to obtain an animation segment T0 of the target virtual character, where the animation segment T0 includes a0 animation frame;
the posture adjusting unit 122 is configured to obtain a posture of the target virtual character in the a0 animation frame, and adjust the posture of the target virtual character according to a target task set by the target virtual character to obtain an a1 animation frame, where the posture of the adjusted target virtual character is obtained by adjusting moments of N joints of the target virtual character in the a0 animation frame, and N is a positive integer greater than or equal to 1;
and the animation generating unit 123 is used for obtaining an animation segment T1 of the target virtual character completing the target task, wherein the animation segment T1 is composed of at least two frames of animations A0 and A1.
In an optional embodiment, the posture adjustment unit 122 is further configured to:
acquiring the state information of the target virtual character in an A0 animation frame;
and inputting the state information of the target virtual character in an A0 animation frame and the target task into a control strategy network to obtain the torque which is output by the control strategy network and used for adjusting each joint of the target virtual character, wherein the control strategy network is obtained by training according to a sample animation segment, and the sample animation segment comprises a reference attitude sequence for finishing the target task by a reference virtual character.
In an optional embodiment, the posture adjustment unit 122 is further configured to:
acquiring state information of a target virtual character in an A0 animation frame and environment information of a scene environment where the target virtual character is located;
and inputting the state information of the target virtual character in the A0 animation frame, the target task and the environment information of the scene environment where the target virtual character is located into a control strategy network, and obtaining the torque which is output by the control strategy network and is used for adjusting each joint of the target virtual character, wherein the control strategy network is obtained by training according to a sample animation segment, and the sample animation segment comprises a reference attitude sequence for finishing the target task by the reference virtual character.
In an alternative embodiment, the state information includes phase data, pose data, and velocity data of the target virtual character, wherein the phase data is used to characterize the stage of the target virtual character within the current animation segment, the pose data is used to characterize the current pose of the target virtual character, and the velocity data is used to characterize the current velocity state of the target virtual character.
In an alternative embodiment, as shown in fig. 13, the apparatus further includes a network training unit 131, configured to:
determining the initial state of the training object according to the sample animation segment; inputting the state information of the training object at the current moment and the set training task into a control strategy network to obtain a control strategy at the next moment output by the control strategy network; the control strategy acts on the moment of each joint of the training object at the next moment, and each moment corresponds to one frame of animation;
according to the obtained control strategy, determining an expected reward value by referring to the reference attitude sequence of the virtual character in the sample animation segment and the set target task;
and adjusting parameters of the control strategy network according to the expected reward value, and continuing training the control strategy network after the parameters are adjusted until a set training end condition is reached to obtain the trained control strategy network.
In an alternative embodiment, the network training unit 131 is further configured to:
acquiring environment information of a scene environment where a training object is located;
and inputting the environment information, the state information of the training object at the current moment and the training task into the control strategy network to obtain the control strategy at the next moment output by the control strategy network.
In an alternative embodiment, the network training unit 131 is further configured to:
controlling interaction between the training object and the scene environment according to the obtained control strategy, and determining the state information of the training object at the next moment;
determining the reward value at the current moment according to the state information of the training object and the reference virtual character at the next moment and the set training task;
and determining an expected reward value according to the control strategy of each moment, the state information of the training object at each moment and the reward value at each moment.
In an alternative embodiment, the network training unit 131 is further configured to:
inputting the state information of the training object and the reference virtual character at the next moment and the set training task into a value evaluation network, so that the value evaluation network determines the task target reward at the current moment according to the state of the training object at the next moment and the set training task;
determining the simulated target reward at the current moment according to the state information of the training object at the next moment and the state information of the reference virtual character at the next moment;
and determining the reward value at the current moment according to the task target reward and the simulation target reward.
In an alternative embodiment, the imitation target reward includes at least one of: pose similarity, velocity similarity, end joint similarity, root joint pose similarity and centroid pose similarity;
the pose similarity is used for representing the similarity of the pose data of the training object and the reference virtual character; the velocity similarity is used for representing the similarity of the velocity data of the training object and the reference virtual character; the end joint similarity is used for representing the similarity of the postures of the end joints of the training object and the reference virtual character; the root joint pose similarity is used for representing the similarity of the postures of the root joints of the training object and the reference virtual character; the centroid pose similarity is used for representing the similarity of the center-of-gravity positions of the training object and the reference virtual character.
In an alternative embodiment, the network training unit 131 is further configured to:
forming a strategy track by the control strategy at each moment and the state information of the training object at each moment;
taking the sum of the reward values at each moment as the whole reward value corresponding to the strategy track;
determining the probability of the strategy track according to the current parameters of the control strategy network;
and determining an expected reward value according to the overall reward value corresponding to the strategy track and the probability of the strategy track.
The animation generation device of the embodiment of the application sequentially obtains at least two frames of animation pictures included in the animation segments according to the target task set for the target virtual character, and sequentially displays the obtained animation pictures to further obtain the animation segments. The posture of the target virtual character in each frame of animation picture is obtained by adjusting each joint of the target virtual character in the adjacent previous frame of animation picture, and the moment of adjusting each joint is obtained by inputting the state information and the target task of the target virtual character in the adjacent previous frame of animation picture into a trained control strategy network. The method can generate the animation segment containing the attitude sequence of the target virtual character according to the target task to be executed by the target virtual character, thereby reducing the consumption of manpower and improving the efficiency. Meanwhile, the control strategy network is obtained by training according to the sample animation segment containing the attitude sequence of the reference virtual character, and the attitude of the target virtual character is adjusted according to the torque of each joint output by the control strategy network, so that the target virtual character executes corresponding action, and the action effect of the target virtual character can be improved.
Corresponding to the method embodiments, the embodiments of the present application also provide an electronic device. The electronic device may be a server, such as the game server 12 shown in fig. 1, or a device such as a smartphone, tablet, laptop or desktop computer, and includes at least a memory for storing data and a processor for data processing. The processor for data processing may be implemented, when executing processing, by a microprocessor, a CPU, a GPU (Graphics Processing Unit), a DSP, or an FPGA. The memory stores operation instructions, which may be computer-executable code; the operation instructions implement the steps in the flow of the animation generation method according to the embodiments of the present application.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application; as shown in fig. 14, the electronic device 140 in the embodiment of the present application includes: processor 141, display 142, memory 143, input device 146, bus 145, and communication device 144; the processor 141, memory 143, input device 146, display 142 and communication device 144 are all connected by a bus 145, the bus 145 being used for data transfer between the processor 141, memory 143, display 142, communication device 144 and input device 146.
The memory 143 may be configured to store software programs and modules, such as program instructions/modules corresponding to the animation generation method in the embodiment of the present application, and the processor 141 executes various functional applications and data processing of the electronic device 140 by running the software programs and modules stored in the memory 143, such as the animation generation method provided in the embodiment of the present application. The memory 143 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program of at least one application, and the like; the storage data area may store data (such as animation segments, control policy networks) created according to the use of the electronic device 140, and the like. Further, the memory 143 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 141 is the control center of the electronic device 140. It connects the various parts of the electronic device 140 using the bus 145 and various interfaces and lines, and performs the various functions of the electronic device 140 and processes data by running or executing the software programs and/or modules stored in the memory 143 and calling the data stored in the memory 143. Optionally, the processor 141 may include one or more processing units, such as a CPU, a GPU (Graphics Processing Unit), or a digital processing unit.
In the embodiment of the present application, the processor 141 presents the generated animation segments to the user through the display 142.
The processor 141 may also be connected to a network via the communication device 144, and if the electronic device is a terminal device, the processor 141 may transmit data to and from the game server via the communication device 144. If the electronic device is a game server, the processor 141 may transmit data with the terminal device through the communication device 144.
The input device 146 is mainly used for obtaining input operations of a user, and when the electronic devices are different, the input device 146 may also be different. For example, when the electronic device is a computer, the input device 146 may be a mouse, a keyboard, or other input device; when the electronic device is a portable device such as a smart phone or a tablet computer, the input device 146 may be a touch screen.
The embodiment of the present application further provides a computer storage medium, where computer-executable instructions are stored in the computer storage medium, and the computer-executable instructions are used to implement the animation generation method according to any embodiment of the present application.
In some possible embodiments, the aspects of the animation generation method provided by the present application may also be implemented in the form of a program product, which includes program code for causing a computer device to execute the steps of the animation generation method according to various exemplary embodiments of the present application described above in this specification when the program product runs on the computer device, for example, the computer device may execute the animation generation flow in steps S201 to S202 shown in fig. 2.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application.

Claims (15)

1. A method of animation generation, the method comprising:
obtaining a first animation segment (T0) of a target virtual character, the first animation segment (T0) including A0 animation frames therein;
acquiring the posture of a target virtual character in an A0 animation frame, and adjusting the posture of the target virtual character in the A0 animation frame according to a target task set by the target virtual character to obtain an A1 animation frame, wherein the posture of the adjusted target virtual character is obtained by adjusting the moments of N joints of the target virtual character in the A0 animation frame, and N is a positive integer greater than or equal to 1;
obtaining a second animation segment (T1) of the target virtual character performing the target task, the second animation segment (T1) being composed of at least two frames of animations A0 and A1.
2. The method of claim 1, wherein the second animation segment (T1) further comprises an A2 animation frame, the A2 animation frame being the next frame of animation after the A1 animation frame, the A2 animation frame being derived by:
and acquiring the posture of the target virtual character in the A1 animation frame, and adjusting the posture of the target virtual character in the A1 animation frame according to the target task set by the target virtual character to obtain the A2 animation frame, wherein the posture of the adjusted target virtual character is obtained by adjusting the moments of N joints of the target virtual character in the A1 animation frame.
3. The method of claim 1, wherein adjusting the moments of the N joints of the target virtual character in the a0 animation frame is obtained by:
acquiring the state information of the target virtual character in an A0 animation frame;
inputting the state information of the target virtual character in an A0 animation frame and the target task into a control strategy network to obtain the moments which are output by the control strategy network and used for adjusting N joints of the target virtual character, wherein the control strategy network is obtained by training according to a sample animation segment, and the sample animation segment comprises a reference posture sequence for finishing the target task by a reference virtual character.
4. The method of claim 1, wherein adjusting the moments of the N joints of the target virtual character in the a0 animation frame is obtained by:
acquiring state information of the target virtual character in an A0 animation frame and environment information of a scene environment where the target virtual character is located;
and inputting the state information of the target virtual character in an A0 animation frame, the target task and the environment information of the scene environment where the target virtual character is located into a control strategy network, and obtaining the moments which are output by the control strategy network and used for adjusting N joints of the target virtual character, wherein the control strategy network is obtained by training according to a sample animation segment, and the sample animation segment comprises a reference posture sequence for finishing the target task by referring to the virtual character.
5. The method of claim 3 or 4, wherein the state information comprises phase data, pose data and velocity data of the target virtual character, wherein the phase data is used to characterize the stage of the target virtual character within the current animation segment, the pose data is used to characterize the current pose of the target virtual character, and the velocity data is used to characterize the current velocity state of the target virtual character.
6. The method according to claim 3 or 4, wherein the training process of the control strategy network comprises:
determining the initial state of the training object according to the sample animation segment; inputting the state information of the training object at the current moment and the set training task into a control strategy network to obtain a control strategy at the next moment output by the control strategy network; the control strategy acts on the moment of each joint of the training object at the next moment, and each moment corresponds to one frame of animation;
according to the obtained control strategy, determining an expected reward value by referring to the reference attitude sequence of the virtual character in the sample animation segment and the set target task;
and adjusting parameters of the control strategy network according to the expected reward value, and continuing training the control strategy network after the parameters are adjusted until a set training end condition is reached to obtain the trained control strategy network.
7. The method of claim 6, wherein inputting the state information of the training object at the current time and the set training task into the control strategy network to obtain the control strategy at the next time outputted by the control strategy network, comprises:
acquiring environment information of a scene environment where the training object is located;
and inputting the environmental information, the state information of the training object at the current moment and the training task into the control strategy network to obtain a control strategy output by the control strategy network at the next moment.
8. The method of claim 7, wherein determining the desired reward value based on the derived control strategy, the sequence of reference poses of the reference virtual character and the set target task in the sample animation segment, comprises:
controlling the training object to interact with the scene environment according to the obtained control strategy, and determining the state information of the training object at the next moment;
determining the reward value at the current moment according to the state information of the training object and the reference virtual character at the next moment and the set training task;
and determining an expected reward value according to the control strategy of each moment, the state information of the training object at each moment and the reward value at each moment.
9. The method of claim 8, wherein determining the reward value at the current time according to the state information of the training object and the reference virtual character at the next time and the set training task comprises:
inputting the state information of the training object and the reference virtual character at the next moment and the set training task into a value evaluation network, so that the value evaluation network determines the task target reward at the current moment according to the state of the training object at the next moment and the set training task;
determining simulated target reward at the current moment according to the state information of the training object at the next moment and the state information of the reference virtual character at the next moment;
and determining the reward value at the current moment according to the task target reward and the simulation target reward.
10. The method of claim 9, wherein the imitation target reward comprises at least one of: pose similarity, velocity similarity, end joint similarity, root joint pose similarity and centroid pose similarity;
the pose similarity is used for representing the similarity of the pose data of the training object and the reference virtual character; the speed similarity is used for representing the similarity of the training object and the speed data of the reference virtual character; the end joint similarity is used for representing the similarity of the postures of the training object and the end joint of the reference virtual character; the root joint similarity is used for representing the similarity of the postures of the training object and the root joint of the reference virtual character; the centroid pose similarity is used for representing the similarity of the center of gravity positions of the training object and the reference virtual character.
11. The method of claim 7, wherein determining an expected reward value based on the control strategy at each time, the state information of the training subject at each time, and the reward value at each time comprises:
forming a strategy track by the control strategy at each moment and the state information of the training object at each moment;
taking the sum of the reward values at each moment as the whole reward value corresponding to the strategy track;
determining the probability of the strategy track according to the current parameters of the control strategy network;
and determining an expected reward value according to the overall reward value corresponding to the strategy track and the probability of the strategy track.
12. An animation generation apparatus, characterized in that the apparatus comprises:
an animation obtaining unit, configured to obtain a first animation segment (T0) of a target virtual character, where the first animation segment (T0) includes a0 animation frame;
the posture adjusting unit is used for acquiring the posture of a target virtual character in the A0 animation frame and adjusting the posture of the target virtual character according to a target task set by the target virtual character to obtain an A1 animation frame, wherein the posture of the adjusted target virtual character is obtained by adjusting the moments of N joints of the target virtual character in the A0 animation frame, and N is a positive integer greater than or equal to 1;
an animation generation unit for obtaining a second animation segment (T1) of the target virtual character completing the target task, the second animation segment (T1) being composed of the at least two-frame animations A0 and A1.
13. The apparatus of claim 12, wherein the pose adjustment unit is further configured to:
acquiring the posture of a target virtual character in the A1 animation frame, and adjusting the posture of the target virtual character in the A1 animation frame according to a target task set by the target virtual character to obtain an A2 animation frame; wherein the posture of the target virtual character is obtained by adjusting the moments of the N joints of the target virtual character in the A1 animation frame.
14. A computer-readable storage medium having a computer program stored therein, the computer program characterized by: the computer program, when executed by a processor, implements the method of any of claims 1 to 11.
15. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, the computer program, when executed by the processor, causing the processor to carry out the method of any one of claims 1 to 11.
CN202010013366.8A 2020-01-07 2020-01-07 Animation generation method and device, electronic equipment and storage medium Active CN111223170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010013366.8A CN111223170B (en) 2020-01-07 2020-01-07 Animation generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111223170A true CN111223170A (en) 2020-06-02
CN111223170B CN111223170B (en) 2022-06-10

Family

ID=70829261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010013366.8A Active CN111223170B (en) 2020-01-07 2020-01-07 Animation generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111223170B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784809A (en) * 2020-07-09 2020-10-16 网易(杭州)网络有限公司 Virtual character skeleton animation control method and device, storage medium and electronic equipment
CN111773668A (en) * 2020-07-03 2020-10-16 珠海金山网络游戏科技有限公司 Animation playing method and device
CN112396494A (en) * 2020-11-23 2021-02-23 北京百度网讯科技有限公司 Commodity guide method, commodity guide device, commodity guide equipment and storage medium
CN112596515A (en) * 2020-11-25 2021-04-02 北京物资学院 Multi-logistics robot movement control method and device
CN115272541A (en) * 2022-09-26 2022-11-01 成都市谛视无限科技有限公司 Gesture generation method for driving intelligent agent to reach multiple target points
WO2023284634A1 (en) * 2021-07-14 2023-01-19 华为技术有限公司 Data processing method and related device
CN116420170A (en) * 2020-11-11 2023-07-11 索尼互动娱乐股份有限公司 Disambiguation of gestures
CN116468827A (en) * 2022-01-11 2023-07-21 腾讯科技(深圳)有限公司 Data processing method and related product
CN116570921A (en) * 2023-07-13 2023-08-11 腾讯科技(深圳)有限公司 Gesture control method and device for virtual object, computer equipment and storage medium
CN116883561A (en) * 2023-09-04 2023-10-13 腾讯科技(深圳)有限公司 Animation generation method, training method, device and equipment of action controller

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004078695A (en) * 2002-08-20 2004-03-11 Japan Science & Technology Corp Animation preparation system
KR20110089649A (en) * 2010-02-01 2011-08-09 Samsung Electronics Co., Ltd. Apparatus for parallel computing and method thereof
US20130127873A1 (en) * 2010-09-27 2013-05-23 Jovan Popovic System and Method for Robust Physically-Plausible Character Animation
CN103824316A (en) * 2014-03-26 2014-05-28 Guangzhou Boguan Information Technology Co., Ltd. Method and equipment for generating action pictures for an object
CN106600668A (en) * 2016-12-12 2017-04-26 Institute of Automation, Chinese Academy of Sciences Animation generation method and apparatus for interaction with a virtual character, and electronic equipment
CN108182719A (en) * 2017-12-28 2018-06-19 Beijing Juli Weidu Technology Co., Ltd. Artificial-intelligence-based walking animation generation method and device adaptive to obstacle terrain
CN109345614A (en) * 2018-09-20 2019-02-15 Shandong Normal University Animation simulation method for large-screen augmented reality (AR) interaction based on deep reinforcement learning
CN109529352A (en) * 2018-11-27 2019-03-29 Tencent Technology (Shenzhen) Co., Ltd. Method, device and equipment for evaluating scheduling strategies in a virtual environment
CN110310350A (en) * 2019-06-24 2019-10-08 Tsinghua University Action prediction generation method and device based on animation
CN110516389A (en) * 2019-08-29 2019-11-29 Tencent Technology (Shenzhen) Co., Ltd. Learning method, device, equipment and storage medium for a behavior control strategy

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ENG-JON ONG et al.: "Learnt inverse kinematics for animation synthesis", Graphical Models *
YAO-YANG TSAI et al.: "Real-Time Physics-Based 3D Biped Character Animation Using an Inverted Pendulum Model", IEEE Transactions on Visualization and Computer Graphics *
GUAN Dongdong et al.: "A Nonlinear Interpolation Algorithm for Intermediate Frames in 3D Animation", Journal of Image and Graphics *
ZHAO Jianjun et al.: "A Survey of Physics-Based Character Animation Synthesis Methods", Journal of Computer Research and Development *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111773668A (en) * 2020-07-03 2020-10-16 Zhuhai Kingsoft Online Game Technology Co., Ltd. Animation playing method and device
CN111773668B (en) * 2020-07-03 2024-05-07 Zhuhai Kingsoft Digital Network Technology Co., Ltd. Animation playing method and device
CN111784809B (en) * 2020-07-09 2023-07-28 NetEase (Hangzhou) Network Co., Ltd. Virtual character skeleton animation control method and device, storage medium and electronic equipment
CN111784809A (en) * 2020-07-09 2020-10-16 NetEase (Hangzhou) Network Co., Ltd. Virtual character skeleton animation control method and device, storage medium and electronic equipment
CN116420170B (en) * 2020-11-11 2024-03-12 Sony Interactive Entertainment Inc. Disambiguation of gestures
CN116420170A (en) * 2020-11-11 2023-07-11 Sony Interactive Entertainment Inc. Disambiguation of gestures
CN112396494A (en) * 2020-11-23 2021-02-23 Beijing Baidu Netcom Science and Technology Co., Ltd. Commodity guidance method, device, equipment and storage medium
CN112596515B (en) * 2020-11-25 2023-10-24 Beijing Wuzi University Movement control method and device for multiple logistics robots
CN112596515A (en) * 2020-11-25 2021-04-02 Beijing Wuzi University Movement control method and device for multiple logistics robots
WO2023284634A1 (en) * 2021-07-14 2023-01-19 Huawei Technologies Co., Ltd. Data processing method and related device
CN116468827A (en) * 2022-01-11 2023-07-21 Tencent Technology (Shenzhen) Co., Ltd. Data processing method and related product
CN115272541A (en) * 2022-09-26 2022-11-01 Chengdu Dishi Wuxian Technology Co., Ltd. Gesture generation method for driving an intelligent agent to reach multiple target points
CN116570921A (en) * 2023-07-13 2023-08-11 Tencent Technology (Shenzhen) Co., Ltd. Gesture control method and device for a virtual object, computer equipment and storage medium
CN116570921B (en) * 2023-07-13 2023-09-22 Tencent Technology (Shenzhen) Co., Ltd. Gesture control method and device for a virtual object, computer equipment and storage medium
CN116883561A (en) * 2023-09-04 2023-10-13 Tencent Technology (Shenzhen) Co., Ltd. Animation generation method, and training method, device and equipment for an action controller
CN116883561B (en) * 2023-09-04 2023-12-22 Tencent Technology (Shenzhen) Co., Ltd. Animation generation method, and training method, device and equipment for an action controller

Also Published As

Publication number Publication date
CN111223170B (en) 2022-06-10

Similar Documents

Publication Publication Date Title
CN111223170B (en) Animation generation method and device, electronic equipment and storage medium
CN111260762B (en) Animation implementation method and device, electronic equipment and storage medium
US11113860B2 (en) Particle-based inverse kinematic rendering system
CN111292401B (en) Animation processing method and device, computer storage medium and electronic equipment
US10022628B1 (en) System for feature-based motion adaptation
CN110827383B (en) Attitude simulation method and device of three-dimensional model, storage medium and electronic equipment
US8648864B2 (en) System and method for blended animation enabling an animated character to aim at any arbitrary point in a virtual space
CN102331840B (en) User selection and navigation based on looped motions
US20110119332A1 (en) Movement animation method and apparatus
Ishigaki et al. Performance-based control interface for character animation
US10497163B1 (en) Computer architecture for animation of a character in a simulation based on muscle activation data
CN111340211A (en) Training method of action control model, related device and storage medium
US11816772B2 (en) System for customizing in-game character animations by players
WO2022051460A1 (en) 3d asset generation from 2d images
US20230267668A1 (en) Joint twist generation for animation
Reda et al. Physics-based Motion Retargeting from Sparse Inputs
Gong et al. Motion simulation in a virtual basketball shooting teaching system
CN106910233B (en) Motion simulation method for a virtual insect animation character
Lee et al. Performance‐Based Biped Control using a Consumer Depth Camera
Chen Research on the VR Technology in Basketball Training
Lan Simulation of Animation Character High Precision Design Model Based on 3D Image
US20240135618A1 (en) Generating artificial agents for realistic motion simulation using broadcast videos
Liu et al. Character hit reaction animations using physics and inverse kinematics
Wu Virtual Shooting Action Simulation System Based on Intelligent VR Technology
Tsai et al. Physically based motion adaptation for diverse articulated figures

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40024263
Country of ref document: HK

GR01 Patent grant