CN116206370A - Driving information generation method, driving device, electronic equipment and storage medium - Google Patents

Driving information generation method, driving device, electronic equipment and storage medium Download PDF

Info

Publication number
CN116206370A
Authority
CN
China
Prior art keywords
rotation angle
determining
joint point
local
bone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310500623.4A
Other languages
Chinese (zh)
Other versions
CN116206370B (en)
Inventor
陈睿智
李丰果
冯志强
彭昊天
刘豪杰
刘玉大
赵晨
刘经拓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310500623.4A priority Critical patent/CN116206370B/en
Publication of CN116206370A publication Critical patent/CN116206370A/en
Application granted granted Critical
Publication of CN116206370B publication Critical patent/CN116206370B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Abstract

The invention provides a driving information generation method, a driving method, a driving device, electronic equipment and a storage medium, relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision, augmented reality, virtual reality, deep learning and the like, and can be applied to fields such as the metaverse and digital humans. The specific implementation scheme is as follows: determining a first local bone rotation angle of a first local bone at at least one moment in a target time period, the first local bone characterizing a bone between a first joint point and a second joint point of the object; and generating driving information corresponding to the motion of the object in the target time period according to a global skeleton rotation angle and the first local bone rotation angle, wherein the global skeleton rotation angle is determined according to joint point position information of object joint points of the object at the at least one moment.

Description

Driving information generation method, driving device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision, augmented reality, virtual reality, deep learning and the like, which can be applied to fields such as the metaverse and digital humans, and particularly relates to a driving information generation method, a driving method, a driving device, electronic equipment and a storage medium.
Background
Optical capture is a motion capture technique commonly used in the variety-show, motion picture and game industries. The motion capture task is accomplished by arranging camera towers in a fixed space and having the performer wear a light-capture suit fitted with optical markers, so that specific marker points are placed in batches on the performer. Optical capture technology allows the performer to move freely over a large range without being constrained by mechanical devices.
Disclosure of Invention
The invention provides a driving information generation method, a driving device, an electronic device and a storage medium.
According to an aspect of the present invention, there is provided a driving information generating method including: determining a first local bone rotation angle of a first local bone at least one moment in a target time period, the first local bone characterizing bone between a first articulation point and a second articulation point of a subject; and generating driving information corresponding to the motion of the object in the target time period according to a global skeleton rotation angle and the first local skeleton rotation angle, wherein the global skeleton rotation angle is determined according to joint point position information of an object joint point of the object at the at least one moment.
According to another aspect of the present invention, there is provided a driving method of an avatar, including: determining driving information for driving the target avatar in response to capturing the motion from the target object, the driving information being generated based on the driving information generating method of the present invention; and driving the target avatar to perform a corresponding action by using the driving information.
According to an aspect of the present invention, there is provided a driving information generating apparatus including: a first determination module for determining a first local bone rotation angle of a first local bone at least one moment in a target time period, the first local bone characterizing bone between a first articulation point and a second articulation point of a subject; and the generation module is used for generating driving information corresponding to the action of the object in the target time period according to a global skeleton rotation angle and the first local skeleton rotation angle, and the global skeleton rotation angle is determined according to joint point position information of an object joint point of the object at the at least one moment.
According to another aspect of the present invention, there is provided an avatar driving apparatus including: a sixth determining module for determining driving information for driving the target avatar in response to capturing the motion from the target object, the driving information being generated based on the driving information generating apparatus of the present invention; and the driving module is used for driving the target virtual image to perform corresponding actions by utilizing the driving information.
According to another aspect of the present invention, there is provided an electronic apparatus including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform at least one of the driving information generating method and the avatar driving method of the present invention.
According to another aspect of the present invention, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform at least one of the driving information generating method and the avatar driving method of the present invention.
According to another aspect of the present invention, there is provided a computer program product comprising a computer program stored on at least one of a readable storage medium and an electronic device, which when executed by a processor, implements at least one of the driving information generating method and the driving method of an avatar of the present invention.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
The drawings are included to provide a better understanding of the present invention and are not to be construed as limiting the invention. Wherein:
fig. 1 schematically illustrates an exemplary system architecture to which at least one of a driving information generation method and an avatar driving method and a corresponding apparatus may be applied according to an embodiment of the present invention;
fig. 2 schematically shows a flowchart of a driving information generating method according to an embodiment of the present invention;
FIG. 3 schematically illustrates a diagram of an object motion acquired at a certain moment in time, according to an embodiment of the invention;
fig. 4 schematically shows a schematic diagram of a driving information generating method according to an embodiment of the present invention;
fig. 5 schematically illustrates a flowchart of a driving method of an avatar according to an embodiment of the present invention;
fig. 6 schematically shows a block diagram of a drive information generating apparatus according to an embodiment of the present invention;
fig. 7 schematically illustrates a block diagram of a driving apparatus of an avatar according to an embodiment of the present invention; and
FIG. 8 shows a schematic block diagram of an example electronic device that may be used to implement an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the technical scheme of the invention, the collection, storage, use, processing, transmission, provision, disclosure, application and other processing of the user's personal information all comply with the provisions of relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
In the technical scheme of the invention, the authorization or the consent of the user is obtained before the personal information of the user is obtained or acquired.
Optical capture technology is widely used because of advantages such as ease of use, good results, a high sampling rate, and a mature industrial chain.
In the process of realizing the inventive concept, the inventors found that light capture data can exhibit penetration, jitter, unnatural motion and other unexpected conditions due to factors such as occlusion, precision deviation and the acquisition environment. For example, due to the high demands on action quality in the variety-show, motion picture and game industries, the light capture data directly output by a light capture system may not be satisfactory, requiring manual refinement of the data, which incurs high time and labor costs when using light capture data. For example, for a simple talking-and-broadcasting scenario, 10 minutes of data may require 1 week of repair. For scenes containing dance, fighting and the like, the cost increases significantly, and 10 minutes of data may require 2-3 weeks of repair.
Fig. 1 schematically illustrates an exemplary system architecture to which at least one of a driving information generating method and an avatar driving method and a corresponding apparatus may be applied according to an embodiment of the present invention.
It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present invention may be applied to help those skilled in the art understand the technical content of the present invention, and does not mean that the embodiments of the present invention may not be used in other devices, systems, environments, or scenarios. For example, in another embodiment, an exemplary system architecture to which at least one of the driving information generation method and the driving method of the avatar and the corresponding apparatus may be applied may include a terminal device, but the terminal device may implement at least one of the driving information generation method and the driving method of the avatar and the corresponding apparatus provided in the embodiment of the present invention without interacting with a server.
As shown in fig. 1, a system architecture 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and the like.
The user may interact with the server 105 via the network 104 using the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages etc. Various communication client applications, such as a knowledge reading class application, a web browser application, a search class application, an instant messaging tool, a mailbox client and/or social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, the third terminal device 103. Various information collecting means such as a sensor, a camera, etc. (by way of example only) may also be mounted on the first terminal device 101, the second terminal device 102, the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (merely an example) providing support for content browsed by the user with the first terminal device 101, the second terminal device 102, the third terminal device 103. The background management server may analyze and process received data such as user requests, and feed the processing result (e.g., a web page, information, or data obtained or generated according to the user request) back to the terminal device. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system, overcoming the defects of high management difficulty and weak service expansibility in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be noted that at least one of the driving information generating method and the driving method of the avatar provided in the embodiment of the present invention may be generally performed by the first terminal device 101, the second terminal device 102, or the third terminal device 103. Accordingly, at least one of the driving information generating apparatus and the driving apparatus for an avatar provided in the embodiment of the present invention may be provided in the first terminal device 101, the second terminal device 102, or the third terminal device 103.
Alternatively, at least one of the driving information generating method and the driving method of the avatar provided by the embodiment of the present invention may be generally performed by the server 105. Accordingly, at least one of the driving information generating device and the driving device for the avatar provided in the embodiment of the present invention may be generally provided in the server 105. At least one of the driving information generating method and the driving method of the avatar provided by the embodiment of the present invention may also be performed by a server or a server cluster which is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, at least one of the driving information generating apparatus and the driving apparatus for an avatar provided in the embodiment of the present invention may be provided in a server or a server cluster which is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
For example, in generating the driving information, the first terminal device 101, the second terminal device 102, and the third terminal device 103 may determine a first local bone rotation angle of the first local bone at at least one moment in the target time period, and generate driving information corresponding to the motion of the object in the target time period according to the global bone rotation angle and the first local bone rotation angle. The first local bone characterizes a bone between a first joint point and a second joint point of the object, and the global bone rotation angle is determined according to joint point position information of the object joint points of the object at the at least one moment. Alternatively, these operations may be performed by a server or server cluster capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105, which then generates the driving information corresponding to the motion of the object in the target time period.
For example, in driving the avatar, the first terminal device 101, the second terminal device 102, and the third terminal device 103 may, in response to capturing a motion from the target object, determine driving information for driving the target avatar according to the driving information generating method of the present invention, and drive the target avatar to perform the corresponding motion using the driving information. Alternatively, these operations may be performed by a server or server cluster capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105, which then drives the target avatar.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flowchart of a driving information generating method according to an embodiment of the present invention.
As shown in FIG. 2, the method includes operations S210-S220.
In operation S210, a first local bone rotation angle of a first local bone at at least one moment in a target time period is determined, the first local bone characterizing a bone between a first joint point and a second joint point of a subject.
In operation S220, driving information corresponding to the motion of the object in the target time period is generated according to the global bone rotation angle and the first local bone rotation angle, the global bone rotation angle being determined according to the joint point position information of the object joint points of the object at the at least one moment.
According to an embodiment of the present invention, the object may include at least one of a human, an animal, a plant, a robot, and the like, but is not limited thereto. The object joint points may include some or all of the joint points of the object. The object joint points may or may not include at least one of the first joint point and the second joint point.
According to an embodiment of the invention, the first local bone rotation angle and the global bone rotation angle may be determined from a bone reference pose of the object. The bone reference pose of the object may include, for example, a pose that the object assumes in a standing state, or may include other predefined poses, without limitation. For example, a first local bone reference pose of the first local bone may first be determined from the bone reference pose of the object. Then, the real-time pose of the first local bone acquired in real time may be compared with the first local bone reference pose to determine the first local bone rotation angle. Similarly, the global bone rotation angle can be determined by comparing the real-time pose of the global skeleton acquired in real time with the bone reference pose of the object.
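The comparison of a bone's real-time pose against its reference pose can be sketched as the axis-angle rotation between the two bone direction vectors. This is only an illustrative sketch, not the patent's implementation; the function name and vector convention are assumptions.

```python
import math

def bone_rotation(ref_dir, live_dir):
    """Return (axis, angle) rotating the reference bone direction onto
    the observed real-time direction. Inputs are 3-vectors; the axis is
    their (unnormalized) cross product. Parallel vectors give a zero
    axis and zero angle -- a sketch, not production code."""
    def unit(v):
        n = math.sqrt(sum(c * c for c in v))
        return tuple(c / n for c in v)

    a, b = unit(ref_dir), unit(live_dir)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    angle = math.acos(dot)                      # rotation magnitude
    axis = (a[1] * b[2] - a[2] * b[1],          # cross product a x b
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
    return axis, angle
```

For example, a bone pointing along x in the reference pose and along y in the live pose yields a 90-degree rotation about z.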
According to the embodiment of the invention, the three-dimensional position information of the object joint points can be acquired first. Then, the rotation angle of the local bone between each pair of object joint points may be determined according to the three-dimensional position information of each object joint point, and the global bone rotation angle may be determined according to the rotation angle of each local bone.
According to the embodiment of the invention, the global bone rotation angle can be adjusted according to the first local bone rotation angle. For example, the first local bone rotation angle may be locked first, and then the part of the global bone rotation angle other than the rotation angles related to the first local bone may be adjusted to obtain an adjusted global bone rotation angle. Driving information corresponding to the motion of the object in the target time period may then be generated according to the adjusted global bone rotation angle at the at least one moment in the target time period. The driving information may include a sequence of adjusted global bone rotation angles corresponding to the at least one moment.
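The locking step described above can be sketched as overlaying the locked first-local-bone angles onto the global set of per-joint angles, leaving every other joint free to be re-solved. The dictionary representation and names are illustrative assumptions, not the patent's data model.

```python
def merge_rotations(global_angles, locked_local_angles):
    """Overlay locked local-bone angles onto the global rotation set.

    global_angles: dict joint_name -> rotation angle (any unit)
    locked_local_angles: dict of joints belonging to the first local
    bone, whose values must take priority over the global solution.
    """
    merged = dict(global_angles)       # start from the global solve
    merged.update(locked_local_angles) # locked local angles win
    return merged
```

Repeating this per moment in the target time period yields the sequence of adjusted global rotation angles that forms the driving information.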
According to an embodiment of the present invention, after determining the adjusted global bone rotation angle at least one time, the motion information in the target time period may also be determined in combination with a kinematic equation. Then, the driving information may be determined according to the motion information.
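Deriving motion information from the per-moment rotation angles can be sketched with a simple finite-difference angular velocity between consecutive frames. The patent does not specify its kinematic equation, so this is an assumed illustration only.

```python
def finite_difference_velocity(angle_sequence, dt):
    """Per-joint angular velocity between consecutive frames.

    angle_sequence: list of dicts, each mapping joint -> angle at one
    moment; dt: time step between moments. Returns one dict per frame
    pair (a forward difference, the simplest kinematic estimate).
    """
    return [
        {joint: (later[joint] - earlier[joint]) / dt for joint in earlier}
        for earlier, later in zip(angle_sequence, angle_sequence[1:])
    ]
```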
According to the embodiment of the invention, based on the first local bone rotation angle, the local bone driving information with finer granularity can be determined, and the driving information determined by combining the global bone rotation angle can have finer action details on the basis of keeping the overall action characteristics, so that the accuracy and precision of the generated driving information can be effectively improved.
The method shown in fig. 2 is further described below in connection with the specific examples.
According to an embodiment of the present invention, the operation S210 may include: a first initial bone rotation angle of a first local bone is determined. And determining a first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point.
According to the embodiment of the invention, the first initial bone rotation angle can be determined according to the first local bone real-time pose and the first local bone reference pose. The first joint point location information may include position information and pose information of the first joint point. The spatial location of the first node of interest may be determined based on the location information. The joint orientation of the first articulation point may be determined from the pose information.
According to the embodiment of the invention, for example, the position and the orientation of the first articulation point can be locked according to the first articulation point position information, and the position of the second articulation point can be locked according to the second articulation point position information. The first initial bone rotation angle may then be adjusted based on the position and orientation of the locked first articulation point and the position of the locked second articulation point such that the position and orientation of the first articulation point, as characterized by the adjusted first initial bone rotation angle, is the same as the position and orientation of the locked first articulation point and the position of the second articulation point, as characterized by the adjusted first initial bone rotation angle, is the same as the position of the locked second articulation point. From the adjusted first initial bone rotation angle, a first local bone rotation angle may be determined.
Through the embodiment of the invention, the obtained rotation angle of the first local bone can represent a more natural expression effect on the basis of adapting to the local action of the first local bone.
According to an embodiment of the present invention, determining a first initial bone rotation angle of a first local bone may comprise: and determining the first joint point position information and the second joint point position information of the first joint point according to the feedback information of the optical mark points arranged on the object. And determining a first initial bone rotation angle according to the first joint position information and the second joint position information.
According to the embodiment of the invention, the first initial bone rotation angle can be acquired in real time, based on the light capture system, for an object wearing a light-capture suit. The light-capture suit may be provided with optical marker points. The optical marker points may be arranged on the light-capture suit according to the distribution of the object joint points. The feedback information may include position information of an optical marker point at the feedback moment. Once the object joint point corresponding to an optical marker point is determined, the joint point position information of that object joint point may be determined based on the feedback information of the optical marker point.
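Mapping optical-marker feedback to joint point position information can be sketched as taking the centroid of the markers assigned to a joint. The assignment scheme and names here are assumptions for illustration, not the patent's method.

```python
def joint_position(marker_feedback, marker_ids):
    """Estimate a joint point's position as the centroid of its markers.

    marker_feedback: dict marker_id -> (x, y, z) from the capture
    system's feedback at one moment; marker_ids: markers assigned to
    this joint point by the suit layout.
    """
    points = [marker_feedback[m] for m in marker_ids]
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))
```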
According to the embodiment of the invention, in the case where there are no other joint points between the first joint point and the second joint point, the first initial bone rotation angle of the first local bone can be determined directly from the first joint point position information and the second joint point position information.
In accordance with an embodiment of the present invention, in the case that other nodes are included between the first node and the second node, for example, a third node is included between the first node and the second node. A first bone rotation angle of a first bone between the first articulation point and the third articulation point may be first determined based on the first articulation point position information and the third articulation point position information, and a second bone rotation angle of a second bone between the third articulation point and the second articulation point may be determined based on the third articulation point position information and the second articulation point position information. The first initial bone rotation angle may then be determined based on the first bone rotation angle and the second bone rotation angle.
Fig. 3 schematically shows a schematic view of an object motion acquired at a certain moment in time according to an embodiment of the invention.
According to an embodiment of the present invention, as shown in fig. 3, taking an object as a human body 300 as an example, the object node may include some or all of the nodes 0 to 23. The nodes included in region 301, such as 6, 9, 12, 13, 16, 18, etc., may be indicative of nodes that are not captured by the light capture system. For example, in the case where the region 301 does not exist, the object node may include 0 to 23. In the case of the existence of the region 301, the object nodes may include the rest of the nodes except 6, 9, 12, 13, 16, 18. The global bone rotation angle may characterize the rotation angle of the bone determined by all joints in 0 to 23. In this embodiment, the first joint point may be, for example, the right elbow joint point 19, and the second joint point may be, for example, the right wrist joint point 21. In this case, the first local bone may characterize a bone between 19 and 21. The global bone rotation angle may be adjusted with the global bone rotation angle characterized by 0 to 23 and the first local bone rotation angle between 19 to 21 as constraints, generating driving information corresponding to the motion of the object at the respective moments.
By the embodiment of the invention, under the condition of determining the first initial bone rotation angle based on the light capturing data, the automatic finishing of the light capturing data can be realized by combining the embodiment, the finishing cost of the light capturing data is effectively reduced, and the expression effect of the light capturing data is enhanced.
According to an embodiment of the present invention, a third node may be included between the first node and the second node; determining the first local bone rotation angle based on the first initial bone rotation angle, the first joint point pose information of the first joint point, and the second joint point position information of the second joint point may include: and determining a third joint point rotation angle of a third joint point according to the first initial bone rotation angle, the first joint point position information and the second joint point position information. And determining a first local bone rotation angle according to the first joint point pose information, the second joint point position information and the third joint point rotation angle.
According to the embodiment of the invention, in the case that a third joint point is included between the first joint point and the second joint point, the position and the orientation of the first joint point can be locked according to the first joint point pose information, and the position of the second joint point can be locked according to the second joint point position information. Then, based on an IK (Inverse Kinematics) algorithm, the driving coefficient of the third joint point may be calculated by taking the first initial bone rotation angle as a constraint and combining the locking conditions, so that the rotation angle of the third joint point is obtained.
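A minimal two-bone IK step of the kind referenced here solves the intermediate joint (e.g. the elbow) angle from the law of cosines, given the two bone lengths and the distance from the first joint point to the target. This sketch omits the orientation locking and the first-initial-rotation constraint described above, and the names are illustrative.

```python
import math

def two_bone_ik(upper_len, lower_len, target_dist):
    """Interior angle (radians) at the middle joint that places the end
    joint at target_dist from the root joint, by the law of cosines.
    The target distance is clamped to the chain's reachable range."""
    d = max(abs(upper_len - lower_len),
            min(upper_len + lower_len, target_dist))
    cos_mid = (upper_len ** 2 + lower_len ** 2 - d ** 2) / (2 * upper_len * lower_len)
    return math.acos(max(-1.0, min(1.0, cos_mid)))  # guard rounding
```

With two unit-length bones, a target at distance 2 fully extends the joint (angle pi), while a target at distance sqrt(2) bends it to a right angle.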
It should be noted that the third node may include one or more nodes, which are not limited herein.
According to an embodiment of the present invention, after the rotation angle of the third joint point is determined, the position and orientation of the first joint point may be locked according to the first joint point pose information, the position of the second joint point may be locked according to the second joint point position information, and the orientation of the third joint point may be locked according to the third joint point rotation angle. A first local skeleton structure formed by the first joint point, the third joint point, and the second joint point may then be determined. By analyzing this first local skeleton structure, for example by analyzing at least one of its position information and pose information in a world coordinate system or a preset coordinate system, the first local bone rotation angle between the first joint point and the second joint point may be determined.
For example, in the embodiment shown in connection with fig. 3, the first joint point may be the left shoulder joint point 16 and the second joint point may be the left wrist joint point 20. In this case, the third joint point may be the left elbow joint point 18, and the first local bone may represent the bones identified by 16, 18, and 20. The first initial bone rotation angle between 16 and 20 may be used as a constraint, and the rotation angle of 18 may be calculated based on the IK algorithm from the pose information of the left shoulder joint point 16 and the position information of the left wrist joint point 20, so as to determine the first local bone rotation angle between 16 and 20. Driving information corresponding to the motion of the object at each moment may then be generated from the global bone rotation angle characterized by joint points 0 to 23 and the first local bone rotation angle between 16 and 20.
Through the above embodiments of the present invention, by determining the third joint point rotation angle, the determined first local bone rotation angle may represent a local motion with a more natural appearance.
According to an embodiment of the present invention, the object joint points may include a fourth joint point that is acquired and a fifth joint point that is not acquired. Before performing the above operation S220, in response to determining that a target object joint point of the object is acquired by a preset number of acquisition devices located at different viewing angles, the target object joint point may be determined as the fourth joint point.
According to an embodiment of the present invention, the fourth joint point may be determined, for example, based on information collected by the light capturing system. For example, a light tower in the light capturing system may include a plurality of cameras for acquiring information of the object from a plurality of viewing angles, resulting in multi-view video data. In this way, object images from a plurality of viewing angles can be obtained for the object at each moment, and the object images from different viewing angles may capture the same or different object joint points.
According to an embodiment of the present invention, the preset number may be used to define the confidence level of an object joint point, and its value may be one or more. For example, a preset threshold may be determined in advance, and the preset number may take any value greater than or equal to the preset threshold. In the case where an object joint point is acquired in object images from the preset number of viewing angles, it may be determined that the object joint point has a higher confidence, and the object joint point with the higher confidence may be determined as the fourth joint point. For object joint points that are detected from fewer viewing angles than the preset threshold, or are not detected from any viewing angle, it may be determined that they have a lower confidence, and an object joint point with the lower confidence may be determined as the fifth joint point.
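The view-count confidence rule above can be sketched as a small classification step. The data layout (a mapping from view id to the set of joint ids that view captured) and the function name are illustrative assumptions, not the patent's interface:

```python
def split_joints_by_view_count(view_detections, all_joint_ids, preset_threshold):
    """Split object joint points into the acquired ('fourth') set and the
    missed ('fifth') set by counting how many camera views detected each
    joint. A joint seen from at least `preset_threshold` views is treated
    as high-confidence; joints seen from fewer views, or from none, are
    treated as low-confidence."""
    counts = {j: 0 for j in all_joint_ids}
    for joints in view_detections.values():
        for j in joints:
            counts[j] += 1
    fourth = {j for j, c in counts.items() if c >= preset_threshold}
    fifth = set(all_joint_ids) - fourth
    return fourth, fifth
```

Passing the full joint list separately ensures that joints detected by no view at all still land in the fifth (missed) set.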
Through the above embodiments of the present invention, the accuracy of the determined fourth joint point can be improved.
According to an embodiment of the present invention, the manner of determining the fourth joint point may further include: in response to determining that the number of object joint points acquired for the object based on a target viewing angle is greater than a preset value, determining the object joint points acquired based on the target viewing angle as fourth joint points.
It should be noted that the above manners of determining the fourth joint point are merely exemplary embodiments and are not limited thereto; other methods known in the art may also be used, as long as a more accurate object joint point can be determined.
According to an embodiment of the present invention, before performing the above operation S220, the global bone rotation angle may be determined first. The process may include: determining fifth joint point position information of the fifth joint point according to fourth joint point position information of the fourth joint point; and determining the global bone rotation angle according to the fourth joint point position information and the fifth joint point position information.
According to an embodiment of the present invention, when capturing images or video of an object, some object joint points may fail to be captured due to viewing-angle limitations or the presence of occlusion, and these object joint points may be determined as fifth joint points. The fifth joint point may include at least one of the first joint point, the second joint point, and the third joint point, or may include none of them.
According to an embodiment of the present invention, a deep neural network model capable of calculating unknown joint point position information from known joint point position information may first be trained or acquired. Then, the fourth joint point position information acquired at each moment may be input into the deep neural network model, and the fifth joint point position information, or a set of the fourth joint point position information and the fifth joint point position information, may be obtained through the processing of the deep neural network model.
For example, in the case where the number of camera positions is small, occlusion may cause some of the optical mark points to be invisible from all viewing angles. In this case, the deep neural network model may receive the fourth joint point position information collected by the light capturing system as input, and output the position information of all object joint points, including the fifth joint point position information missed by the light capturing system, so as to facilitate subsequent processing.
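A minimal sketch of such a completion model is shown below, assuming a tiny two-layer MLP over flattened joint coordinates. The weight shapes, the zero-masking of missing joints, and the final overwrite of observed joints are all illustrative assumptions, not the patent's architecture; real weights would come from training on motion data:

```python
import numpy as np

def complete_joints(positions, visible_mask, w1, b1, w2, b2):
    """Sketch of the joint-completion step: a two-layer MLP maps the
    flattened, mask-zeroed joint positions to positions for ALL joints,
    including the missed (fifth) ones. Observed joints are kept exactly
    as captured; only the gaps are filled with predictions."""
    x = (positions * visible_mask[:, None]).reshape(-1)   # zero out missed joints
    h = np.maximum(0.0, x @ w1 + b1)                      # ReLU hidden layer
    full = (h @ w2 + b2).reshape(positions.shape)         # predict every joint
    # Keep captured joints as observed; fill only the missing ones.
    return np.where(visible_mask[:, None], positions, full)
```

The masking convention (zeroing missing inputs) is one common choice; concatenating the visibility mask as an extra input channel is another.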
According to an embodiment of the present invention, after the fifth joint point is determined, the rotation angle of a joint point between the fifth joint point and the fourth joint point, or between the fifth joint point and a target joint point, may be calculated based on the IK algorithm from the fourth joint point position information or the pose information of the target joint point, with the global bone rotation angle determined from the fourth joint point position information and the fifth joint point position information as a constraint. The global bone rotation angle may then be adjusted based on the calculated rotation angle.
According to an embodiment of the present invention, in the case where a fifth joint point missed by the light capturing system exists, the global bone rotation angle may be determined based on the fourth joint point position information acquired by the light capturing system and the fifth joint point position information calculated by the deep neural network model. The local bone rotation angles may be determined based on at least one of the following, without being limited thereto: both the fourth joint point position information acquired by the light capturing system and the fifth joint point position information calculated by the deep neural network model; the fourth joint point position information alone; or the fifth joint point position information alone.
For example, as shown in connection with fig. 3, in the case where the region 301 exists, the fourth joint points in this embodiment may include all joint points outside the region 301, and the fifth joint points may include all joint points inside the region 301. In this case, the position information of all joint points in the region 301 may be calculated from the position information of some or all of the joint points outside the region 301 in combination with the deep neural network model, so as to determine the position information of all of joint points 0 to 23 and facilitate calculating the global bone rotation angle and the local bone rotation angle of each part.
Through the above embodiments of the present invention, the problem of incomplete information caused by missing joint points can be alleviated, and the integrity of the generated driving information can be improved.
According to an embodiment of the present invention, the second joint point may include a foot joint point. Determining the first local bone rotation angle based on the first initial bone rotation angle, the first joint point pose information of the first joint point, and the second joint point position information of the second joint point may include: determining a target foot joint point according to touchdown state information of the foot joint point, where the target foot joint point may include at least one of a heel joint point and a toe joint point; and determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information, the touchdown state information, and target foot joint point position information of the target foot joint point.
According to an embodiment of the present invention, a touchdown sensing model capable of distinguishing the heel touchdown state and the toe touchdown state of a performer may first be trained or acquired. For example, the touchdown sensing model may be obtained by training with labeled touchdown states of the performer's feet. During training, images may be cropped according to the positions of the heel and toe to obtain a target frame containing only foot information; labeling information corresponding to the target frame may be determined based on the foot touchdown state in the target frame; and the training of the touchdown sensing model is completed by combining the labeling information and the target frame. The input of the touchdown sensing model may be an image containing only foot information, and the output may include the heel touchdown state and the toe touchdown state. Based on the touchdown sensing model, the touchdown states of the heel and toe can be distinguished at millisecond granularity.
According to an embodiment of the present invention, the touchdown state can also be determined from the acquisition results of multiple viewing angles together with confidence information. For example, where heel touchdown is detected from multiple viewing angles, it may be determined that the confidence of the heel touchdown is high and that the touchdown state information includes heel touchdown. The touchdown state information may include at least one of: both heel and toe touching the ground; heel touching and toe not touching; toe touching and heel not touching; neither heel nor toe touching the ground; and the like, without being limited thereto.
According to an embodiment of the present invention, the target foot joint point may represent the foot joint point corresponding to the touchdown state. Determining the target foot joint point according to the touchdown state information of the foot joint point may include: in response to determining, according to the touchdown state information, that both the heel and the toe are in a touchdown state or both are in an untouched state, determining the heel joint point or the toe joint point as the target foot joint point; in response to determining that the heel is in a touchdown state and the toe is in an untouched state, determining the heel joint point as the target foot joint point; and in response to determining that the toe is in a touchdown state and the heel is in an untouched state, determining the toe joint point as the target foot joint point.
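The branch logic above can be mirrored in a small selection function. The boolean-flag representation and the arbitrary pick of the heel when both flags agree are illustrative assumptions (the text also permits an empty result when neither joint touches, as noted in the following paragraph):

```python
def select_target_foot_joint(heel_down: bool, toe_down: bool) -> list:
    """Pick the target foot joint point(s) from the heel/toe touchdown
    flags, mirroring the three branches described in the text."""
    if heel_down == toe_down:
        # Both touching or both off the ground: either joint may serve as
        # the target (an empty result is also allowed when neither touches).
        return ["heel"]  # arbitrary pick between heel and toe
    return ["heel"] if heel_down else ["toe"]
```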
In the case where it is determined that neither the heel nor the toe touches the ground, the target foot joint point may also be determined to be empty, and the present invention is not limited thereto.
According to an embodiment of the present invention, after the target foot joint point is determined, on the basis of locking the position and orientation of the first joint point according to the first joint point pose information, the position of the target foot joint point may be locked according to the target foot joint point position information, and the target foot joint point may be locked to the ground or separated from the ground according to the touchdown state information. Then, based on the IK algorithm, the driving coefficients of the intermediate joint points between the first joint point and the foot joint point may be calculated with the first initial bone rotation angle as a constraint and in combination with the locking conditions, so as to obtain the rotation angles of the intermediate joint points. The first local bone rotation angle may then be determined according to the first joint point pose information, the touchdown state information, the target foot joint point position information, and the intermediate joint point rotation angles.
For example, in the embodiment shown in connection with fig. 3, the first joint point may be the root joint point 0, and the foot joint points may include a right heel joint point 8 and a right toe joint point 11. In the case where the right heel joint point 8 touches the ground and the right toe joint point 11 leaves the ground, the right heel joint point 8 may first be locked to the ground. Then, the rotation angles of joint points 2, 5, and so on may be calculated based on the IK algorithm from the pose information of the root joint point 0 and the position information of the right heel joint point 8, with the first initial bone rotation angle between 0 and 8 or between 0 and 11 as a constraint, so as to determine the first local bone rotation angle between 0 and 8 or between 0 and 11.
Through the above embodiments of the present invention, the determined first local bone rotation angle may have a more natural expression effect.
According to an embodiment of the present invention, the second joint point may include a first sub-joint point and a second sub-joint point. The driving information generating method may further include: determining, according to a second initial bone rotation angle sequence of a second local bone of the object in the target time period, local motion information generated by the second local bone in the target time period, where the second local bone characterizes the bone between the first sub-joint point and the second sub-joint point; and in response to determining that the local motion information matches preset motion information, adjusting the local motion information according to the preset motion information to obtain a second local bone rotation angle sequence.
According to an embodiment of the present invention, the target time period may include one or more acquisition moments. The second initial bone rotation angle sequence may include the second initial bone rotation angles acquired for the second local bone at the one or more acquisition moments. The local motion information can be obtained by converting the second initial bone rotation angle sequence into a motion representation. The preset motion information can represent standard motion information of various motions and can be stored in a database in advance.
According to an embodiment of the present invention, for example, the light capturing system may collect motion information of the second local bone part of the object, resulting in a local animation frame sequence characterizing the motion of the second local bone part. Alternatively, the light capturing system may acquire global motion information of the object to obtain a global animation frame sequence, from which the needed local animation frame sequence is then extracted. By analyzing the local animation frame sequence acquired by the light capturing system, for example by analyzing the joint point poses in each local animation frame, the second initial bone rotation angle sequence may be obtained.
According to an embodiment of the present invention, after the local motion information is obtained from the second initial bone rotation angle sequence, the local motion information can be matched against the preset motion information pre-stored in the database. After target motion information with high similarity to the local motion information is matched from the database, the local motion information can be adjusted according to the target motion information, so that the adjusted local motion information is consistent with the motion of the target motion information. The second local bone rotation angle sequence may be the sequence of second local bone rotation angles characterized by the adjusted local motion information.
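The match-then-rewrite step can be sketched as below, using a cosine similarity over flattened rotation angles as the matching criterion. The library layout, the similarity measure, and the 0.95 threshold are illustrative assumptions; the patent does not specify a particular similarity function:

```python
import math

def match_and_rewrite(local_seq, preset_library, threshold=0.95):
    """Compare a local rotation-angle sequence against each preset standard
    sequence and, when a preset clears the similarity threshold, rewrite the
    captured motion to the best-matching standard sequence."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    best_name, best_sim = None, threshold
    for name, standard_seq in preset_library.items():
        if len(standard_seq) != len(local_seq):
            continue                      # only compare equal-length clips
        sim = cosine(local_seq, standard_seq)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    if best_name is None:
        return local_seq                  # no match: keep the captured motion
    return list(preset_library[best_name])  # rewrite to the standard sequence
```

A production matcher would more likely use dynamic time warping or a learned embedding to tolerate timing differences between clips.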
According to an embodiment of the present invention, adjusting the local motion information according to the preset motion information to obtain the second local bone rotation angle sequence may include: determining a standard bone rotation angle sequence according to the preset motion information; and determining the second local bone rotation angle sequence from the standard bone rotation angle sequence.
According to an embodiment of the present invention, the second local bone may represent the bones of both hands and the arm parts connected to both hands. The first sub-joint point and the second sub-joint point may represent joint points of the hand and arm parts.
For example, taking a two-handed heart gesture as an example, if the motion is matched in the database, the motion matching module may rewrite the joint points on both hands into a standard heart gesture according to the standard two-handed heart motion pre-stored in the database. This step may include: rewriting the second initial bone rotation angle sequence of the bones of the object's hands and arm parts into the standard bone rotation angle sequence of the standard two-handed heart motion, so as to obtain the second local bone rotation angle sequence.
Through the above embodiments of the present invention, the resulting local motions can be made smoother, more standard, and more natural.
According to an embodiment of the present invention, on the basis of determining the second local bone rotation angle sequence, the operation S220 may further include: determining a third local bone rotation angle of a third local bone for at least one moment, where the third local bone characterizes the bone between the first joint point and the first sub-joint point; and generating the driving information corresponding to the motion of the object in the target time period according to the global bone rotation angle and the third local bone rotation angle corresponding to the same moment.
According to an embodiment of the present invention, the third local bone rotation angle may characterize the rotation angle of the third local bone adjusted based on the second local bone rotation angle. After the third local bone rotation angle is determined, the positions of the respective joint points in the third local bone may first be locked according to the third local bone rotation angle. Then, based on the IK algorithm, with the global bone rotation angle as a constraint and in combination with the locking conditions, the driving coefficient of each object joint point may be calculated to obtain the rotation angle of each object joint point, and the adjusted global bone rotation angle may be obtained based on these rotation angles. The driving information can then be determined according to the adjusted global bone rotation angle.
For example, taking the above two-handed heart motion as an example, after the joint points on the hands are rewritten into the standard heart motion, a constrained IK calculation may be performed on the part from the wrists to the root joint point. The constraints may include the global bone rotation angle output by the light capturing system.
Through the above embodiments of the present invention, more vivid and natural global motions can be obtained on the basis of adapting the local motions.
According to an embodiment of the present invention, determining the third local bone rotation angle of the third local bone for the at least one moment may include: determining first sub-joint point position information of the first sub-joint point for the at least one moment according to the second local bone rotation angle sequence; and determining the third local bone rotation angle according to the first sub-joint point position information corresponding to the same moment, a third initial bone rotation angle of the third local bone, and the first joint point pose information of the first joint point.
According to an embodiment of the present invention, after the second local bone rotation angle of the second local bone is rewritten, the positions of all joint points in the second local bone at the current moment can be locked according to the second local bone rotation angle; for example, the position of the first sub-joint point may be locked according to the first sub-joint point position information. Then, on the basis of locking the position and orientation of the first joint point according to the first joint point pose information, the driving coefficients between the first joint point and the first sub-joint point may be calculated based on the IK algorithm, with the third initial bone rotation angle between the first joint point and the first sub-joint point as a constraint and in combination with the locking conditions, so as to obtain the rotation angles of the joint points between the first joint point and the first sub-joint point, and the third local bone rotation angle may be obtained based on these rotation angles.
The determination methods of the third initial bone rotation angle and the second initial bone rotation angle may be the same as the determination method of the first initial bone rotation angle, and are not limited herein.
Through the above embodiments of the present invention, on the basis of adapting part of the local motions, the local motions of the remaining parts also have a more natural expression effect.
According to an embodiment of the present invention, the operation S220 may further include: a first weight configured for a global bone rotation angle is determined. Determining a second weight configured for a target local bone rotation angle, the first weight being less than the second weight, the target local bone rotation angle comprising at least one of: a first partial bone rotation angle, a second partial bone rotation angle, and a third partial bone rotation angle. And generating driving information corresponding to the motion of the object in the target time period according to the first weight, the global skeleton rotation angle, the second weight and the target local skeleton rotation angle.
According to an embodiment of the present invention, after the global bone rotation angle of the object and the target local bone rotation angle of each local bone are obtained, the first weight and the second weight may be combined so that the global bone rotation angle collected by the light capturing system for the object acts as a weak constraint, while the outputs of modules such as touchdown perception plus IK, 3D joint point completion plus IK, and motion matching plus IK act as strong constraints. With global motion smoothness as the optimization target, global motion planning may be performed on the whole object video output by the light capturing system, and the refined driving information can be solved and output.
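A toy version of the weighted combination is sketched below: per joint, it blends the weakly weighted global angle with the strongly weighted locally refined angle, which is the closed-form minimizer of the per-joint objective w_g*(x-g)^2 + w_l*(x-l)^2. The real planner described in the text additionally imposes smoothness terms across frames, which are omitted here; the function name and flat-list layout are assumptions:

```python
def plan_global_motion(global_angles, local_angles, w_global, w_local):
    """Blend per-joint rotation angles: the capture system's global angles
    act as the weaker constraint (w_global < w_local) and the locally
    refined angles as the stronger one. Returns the weighted average,
    which minimizes w_global*(x-g)^2 + w_local*(x-l)^2 for each joint."""
    assert w_global < w_local, "global capture acts as the weaker constraint"
    total = w_global + w_local
    return [
        (w_global * g + w_local * l) / total
        for g, l in zip(global_angles, local_angles)
    ]
```

With w_global=1 and w_local=3, a joint pulled to 0.0 by the global capture and to 1.0 by the local refinement settles at 0.75, closer to the strong constraint.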
Fig. 4 schematically shows a schematic diagram of a driving information generating method according to an embodiment of the present invention.
As shown in FIG. 4, the method includes operations S410-S440.
In operation S410, a subject video is acquired. The operations may include sub-operations S411-S412.
In operation S411, an object image is acquired.
In operation S412, a sequence of image frames characterizing the motion of the subject over a period of time is acquired.
In operation S420, local adjustment is performed.
According to an embodiment of the present invention, local adjustment may use local video or local motion information to infer more accurate motion details and express the relevant details as the target local bone rotation angle. The process may generally include operations S421-S425.
In operation S421, a touchdown state of the object is determined.
In operation S422, the position information of invisible 3D joint points is supplementarily estimated.
In operation S423, local bone rotation angles are calculated, based on the IK algorithm, for the local bones related to the touchdown-related joint points and the invisible 3D joint points.
In operation S424, the actions are matched.
In operation S425, the local bone rotation angle of the local bone matched to the standard motion is adjusted.
In operation S430, global motion planning is performed according to the local bone rotation angle and the global bone rotation angle.
In operation S440, the refined driving information is obtained.
According to an embodiment of the present invention, global motion planning may synthesize the global motion on the basis of the local adjustments: on the premise of ensuring fluency, the local adjustments are taken as constraints, global smoothness constraints and the like are applied, and the optimal solution of the global planning is solved to obtain and output the refined driving data.
It should be noted that, more specific implementations of local adjustment and global motion planning have been described in the foregoing embodiments, and are not described herein.
Through the above embodiments of the present invention, by introducing deep learning perception capabilities and combining them with post-processing refinement of the driving data (for example, deep learning perception may be used to assist driving data refinement), a technical framework for driving data refinement is defined, and an overall technical route from local perception and matching to global planning is established, so that the cost of refining light capture data can be effectively reduced, achieving cost reduction and efficiency improvement.
Fig. 5 schematically illustrates a flowchart of a driving method of an avatar according to an embodiment of the present invention.
As shown in FIG. 5, the method includes operations S510-S520.
In operation S510, in response to capturing a motion of a target object, driving information for driving a target avatar is determined.
In operation S520, the target avatar is driven to perform a corresponding action using the driving information.
According to an embodiment of the present invention, the driving information may be generated based on the driving information generating method described above. The target avatar may be an avatar having the same or similar appearance as the object. The driving information may be used to drive the avatar to perform an action corresponding to the object.
According to an embodiment of the present invention, deploying the above driving information generation method can substantially improve the industry productivity of light capture motion data production, and as light capture costs decrease and effects and stability improve, light capture technology may gradually transition from professional-grade to consumer-grade use.
Fig. 6 schematically shows a block diagram of a driving information generating apparatus according to an embodiment of the present invention.
As shown in fig. 6, the driving information generating apparatus 600 includes a first determining module 610 and a generating module 620.
A first determination module 610, configured to determine a first local bone rotation angle of a first local bone for at least one moment in a target time period, where the first local bone characterizes the bone between a first joint point and a second joint point of an object.
The generating module 620 is configured to generate driving information corresponding to an action of the object in the target time period according to the global skeleton rotation angle and the first local skeleton rotation angle, where the global skeleton rotation angle is determined according to the joint point position information of the object joint point of the object at least one time.
According to an embodiment of the invention, the first determination module comprises a first determination sub-module and a second determination sub-module.
A first determination submodule for determining a first initial bone rotation angle of the first local bone.
The second determining submodule is used for determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point.
According to an embodiment of the invention, the first determination submodule comprises a first determination unit and a second determination unit.
And the first determining unit is used for determining the first joint point position information of the first joint point and the second joint point position information of the second joint point according to feedback information of optical mark points arranged on the object.
And the second determining unit is used for determining the first initial bone rotation angle according to the first joint point position information and the second joint point position information.
According to an embodiment of the invention, a third articulation point is comprised between the first articulation point and the second articulation point. The second determination submodule includes a third determination unit and a fourth determination unit.
And the third determining unit is used for determining a third joint point rotation angle of a third joint point according to the first initial bone rotation angle, the first joint point pose information and the second joint point position information.
And the fourth determining unit is used for determining the first local bone rotation angle according to the first joint point pose information, the second joint point position information and the third joint point rotation angle.
According to an embodiment of the present invention, the object joint points include a fourth joint point that is acquired and a fifth joint point that is not acquired. The driving information generating apparatus further includes a second determining module and a third determining module.
And the second determining module is used for determining fifth joint point position information of the fifth joint point according to the fourth joint point position information of the fourth joint point.
And the third determining module is used for determining the global skeleton rotation angle according to the fourth joint point position information and the fifth joint point position information.
According to an embodiment of the present invention, the driving information generating apparatus further includes a fourth determining module.
And the fourth determining module is used for determining the target object joint point as the fourth joint point in response to determining that the target object joint point of the object is acquired by a preset number of acquisition devices located at different viewing angles.
According to an embodiment of the invention, the second joint point comprises a foot joint point. The second determination submodule includes a fifth determination unit and a sixth determination unit.
A fifth determining unit, configured to determine a target foot articulation point according to the touchdown state information of the foot articulation point, where the target foot articulation point includes at least one of: heel articulation point and toe articulation point.
And a sixth determining unit, configured to determine a first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information, the ground contact state information, and the target foot articulation point position information of the target foot articulation point.
According to an embodiment of the present invention, the fifth determining unit includes a first determining subunit, a second determining subunit, and a third determining subunit.
And the first determining subunit is used for determining the heel articulation point or the toe articulation point as a target foot articulation point in response to determining that the heel and the toe are both in a ground touching state or are both in an untouched state according to the ground touching state information.
And a second determination subunit configured to determine a heel articulation point as a target foot articulation point in response to determining that the heel is in a ground-contacting state and the toe is in an untouched state according to the ground-contacting state information.
And a third determining subunit, configured to determine the toe joint point as the target foot joint point in response to determining that the toe is in the ground contact state according to the ground contact state information and determining that the heel is in the non-ground contact state.
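The three-way case analysis performed by the first to third determining subunits reduces to simple flag logic. The sketch below is illustrative only, with hypothetical names; it returns both candidates when heel and toe share the same contact state, since the text permits either to serve as the target in that case.

```python
def target_foot_joints(heel_down, toe_down):
    # Case 1: both touching the ground, or both airborne -- either the heel
    # or the toe joint point may serve as the target foot joint point.
    if heel_down == toe_down:
        return ("heel", "toe")
    # Case 2: only the heel is in a ground-contacting state.
    if heel_down:
        return ("heel",)
    # Case 3: only the toe is in a ground-contacting state.
    return ("toe",)
```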
According to an embodiment of the invention, the second articulation point comprises a first sub-articulation point and a second sub-articulation point. The driving information generating device further includes a fifth determining module and an adjusting module.
And a fifth determining module, configured to determine local motion information generated by a second local bone of the object in the target time period according to a second initial bone rotation angle sequence of the second local bone in the target time period, where the second local bone characterizes a bone between the first sub-articular point and the second sub-articular point.
And the adjusting module is used for responding to the fact that the local action information is matched with the preset action information, and adjusting the local action information according to the preset action information to obtain a second local skeleton rotation angle sequence.
According to an embodiment of the invention, the adjustment module comprises a seventh determination unit and an eighth determination unit.
And the seventh determining unit is used for determining a standard skeleton rotation angle sequence according to the preset action information.
An eighth determining unit is used for determining a second local bone rotation angle sequence according to the standard bone rotation angle sequence.
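A minimal sketch of the adjusting module's behaviour, under the assumption that "matching" means per-frame agreement within a tolerance; the patent does not specify the matching criterion, so the threshold test and function name below are illustrative.

```python
def adjust_to_preset(observed, standard, tol=0.1):
    # If the captured per-frame rotation sequence stays within `tol` of the
    # standard (preset) sequence, snap it to the standard sequence;
    # otherwise keep the observed sequence unchanged.
    if len(observed) != len(standard):
        return list(observed)
    if all(abs(o - s) <= tol for o, s in zip(observed, standard)):
        return list(standard)
    return list(observed)
```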
According to an embodiment of the invention, the generation module comprises a third determination sub-module and a first generation sub-module.
A third determination sub-module for determining a third local bone rotation angle of a third local bone at least one time, the third local bone characterizing bone between the first articulation point and the first sub-articulation point.
The first generation sub-module is used for generating driving information corresponding to the motion of the object in the target time period according to the global skeleton rotation angle and the third local skeleton rotation angle which correspond to the same moment.
According to an embodiment of the invention, the third determination submodule includes a ninth determination unit and a tenth determination unit.
And a ninth determining unit, configured to determine, according to the second local bone rotation angle sequence, first sub-node position information of the first sub-node at least one moment.
And the tenth determining unit is used for determining the third local bone rotation angle according to the first sub-joint point position information corresponding to the same moment, the third initial bone rotation angle of the third local bone and the first joint point position information of the first joint point.
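The ninth and tenth determining units can be illustrated with a planar (2-D) simplification: at each moment, the bone's rotation is the angle of the joint-to-sub-joint direction relative to an initial (rest) angle. A production system would work with full 3-D rotations, so this is a sketch only, with assumed names.

```python
import math

def third_bone_rotation(first_joint, first_sub_joint, initial_angle):
    # 2-D illustration: rotation at one moment is the heading of the
    # first joint -> first sub-joint direction, minus the initial angle
    # derived from the third initial bone rotation.
    dx = first_sub_joint[0] - first_joint[0]
    dy = first_sub_joint[1] - first_joint[1]
    return math.atan2(dy, dx) - initial_angle
```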
According to an embodiment of the invention, the generation module comprises a fourth determination sub-module, a fifth determination sub-module and a second generation sub-module.
A fourth determination submodule for determining a first weight configured for a global bone rotation angle.
A fifth determination submodule for determining a second weight configured for the target local bone rotation angle, the first weight being less than the second weight, the target local bone rotation angle including at least one of: the first local bone rotation angle, the second local bone rotation angle, and the third local bone rotation angle.
The second generation sub-module is used for generating driving information corresponding to the action of the object in the target time period according to the first weight, the global skeleton rotation angle, the second weight and the target local skeleton rotation angle.
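The weighted generation described by the fourth and fifth determination submodules and the second generation submodule can be sketched as a normalized linear blend. The 0.3/0.7 default weights are assumptions, constrained only by the stated requirement that the first (global) weight be less than the second (local) weight.

```python
def fuse_rotations(global_angle, local_angle, w_global=0.3, w_local=0.7):
    # Weighted combination of the global and target local bone rotation
    # angles; per the text, the global weight must be the smaller one.
    assert w_global < w_local
    total = w_global + w_local
    return (w_global * global_angle + w_local * local_angle) / total
```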
Fig. 7 schematically illustrates a block diagram of a driving apparatus of an avatar according to an embodiment of the present invention.
As shown in fig. 7, the driving apparatus 700 of the avatar includes a sixth determination module 710 and a driving module 720.
The sixth determination module 710 is used for determining driving information for driving the target avatar in response to capturing an action from the target object, the driving information being generated based on the driving information generating apparatus of the present invention.
And a driving module 720 for driving the target avatar to perform a corresponding action using the driving information.
According to embodiments of the present invention, the present invention also provides an electronic device, a readable storage medium and a computer program product.
According to an embodiment of the present invention, an electronic apparatus includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform at least one of the driving information generating method and the avatar driving method of the present invention.
According to an embodiment of the present invention, a non-transitory computer-readable storage medium stores computer instructions for causing a computer to execute at least one of the driving information generating method and the avatar driving method of the present invention.
According to an embodiment of the present invention, a computer program product includes a computer program stored on at least one of a readable storage medium and an electronic device; the computer program, when executed by a processor, implements at least one of the driving information generating method and the avatar driving method of the present invention.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 8, the apparatus 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in device 800 are connected to an input/output (I/O) interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the various methods and processes described above, for example, at least one of the driving information generating method and the avatar driving method. For example, in some embodiments, at least one of the driving information generating method and the avatar driving method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of at least one of the above-described driving information generating method and avatar driving method may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform at least one of the driving information generating method and the avatar driving method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, the one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present invention may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution disclosed in the present invention can be achieved, and are not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (30)

1. A driving information generating method, comprising:
determining a first local bone rotation angle of a first local bone at least one moment in a target time period, the first local bone characterizing bone between a first articulation point and a second articulation point of a subject; and
and generating driving information corresponding to the motion of the object in the target time period according to a global skeleton rotation angle and the first local skeleton rotation angle, wherein the global skeleton rotation angle is determined according to joint point position information of an object joint point of the object at the at least one moment.
2. The method of claim 1, wherein the determining a first local bone rotation angle of the first local bone at least one instant in the target time period comprises:
determining a first initial bone rotation angle of the first local bone; and
and determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point.
3. The method of claim 2, wherein the determining a first initial bone rotation angle of the first local bone comprises:
determining the first joint point position information of the first joint point and the second joint point position information of the second joint point according to feedback information of optical marker points arranged on the object; and
and determining the first initial bone rotation angle according to the first joint point position information and the second joint point position information.
4. A method according to claim 2 or 3, wherein a third articulation point is included between the first articulation point and the second articulation point; the determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point comprises:
Determining a third joint point rotation angle of the third joint point according to the first initial bone rotation angle, the first joint point pose information and the second joint point position information; and
and determining the first local bone rotation angle according to the first joint point pose information, the second joint point position information and the third joint point rotation angle.
5. The method of claim 1, wherein the object joint points comprise an acquired fourth joint point and a non-acquired fifth joint point; the method further comprises, prior to said generating drive information corresponding to the motion of said object in said target time period based on the global bone rotation angle and said first local bone rotation angle:
determining fifth joint point position information of the fifth joint point according to fourth joint point position information of the fourth joint point; and
and determining the global skeleton rotation angle according to the fourth joint point position information and the fifth joint point position information.
6. The method of claim 5, further comprising: before the fifth joint point position information of the fifth joint point is determined according to the fourth joint point position information of the fourth joint point,
And determining the target object joint point as the fourth joint point in response to determining that the target object joint point of the object is acquired by a preset number of acquisition devices positioned at different view angles.
7. The method of claim 2, wherein the second joint point comprises a foot joint point; the determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point comprises:
determining a target foot articulation point according to the touchdown state information of the foot articulation point, wherein the target foot articulation point comprises at least one of the following: heel articulation point and toe articulation point; and
and determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information, the ground contact state information and the target foot articulation point position information of the target foot articulation point.
8. The method of claim 7, wherein the determining the target foot articulation point based on the touchdown status information of the foot articulation point comprises:
determining the heel articulation point or the toe articulation point as the target foot articulation point in response to determining that both the heel and the toe are in a ground contact state or both are in an untouched state according to the ground contact state information;
In response to determining that the heel is in a ground-contacting state and the toe is in an untouched state according to the ground-contacting state information, determining the heel articulation point as the target foot articulation point; and
and in response to determining that the toe is in a ground-contacting state and the heel is in an untouched state according to the ground-contacting state information, determining the toe-articulation point as the target foot-articulation point.
9. The method of claim 1, wherein the second articulation point comprises a first sub-articulation point and a second sub-articulation point; the method further comprises the steps of:
determining local motion information generated by a second local bone of the object in the target time period according to a second initial bone rotation angle sequence of the second local bone in the target time period, wherein the second local bone represents bones between the first sub-joint point and the second sub-joint point; and
and responding to the fact that the local action information is matched with the preset action information, and adjusting the local action information according to the preset action information to obtain a second local skeleton rotation angle sequence.
10. The method of claim 9, wherein the adjusting the local motion information according to the preset motion information to obtain a second sequence of local bone rotation angles comprises:
Determining a standard skeleton rotation angle sequence according to the preset action information; and
and determining the second local bone rotation angle sequence according to the standard bone rotation angle sequence.
11. The method of claim 9 or 10, wherein the generating drive information corresponding to the motion of the object over the target time period from the global bone rotation angle and the first local bone rotation angle comprises:
determining a third local bone rotation angle of a third local bone at the at least one time, the third local bone characterizing bone between the first articulation point and the first sub-articulation point; and
and generating driving information corresponding to the motion of the object in the target time period according to the global skeleton rotation angle and the third local skeleton rotation angle which correspond to the same moment.
12. The method of claim 11, wherein the determining a third local bone rotation angle of a third local bone at the at least one time instant comprises:
determining first sub-joint point position information of the first sub-joint point at the at least one moment according to the second local bone rotation angle sequence; and
And determining the third local bone rotation angle according to the first sub-joint point position information, the third initial bone rotation angle of the third local bone and the first joint point position information of the first joint point at the same moment.
13. The method of claim 1, wherein the generating drive information corresponding to the motion of the object over the target time period from the global bone rotation angle and the first local bone rotation angle comprises:
determining a first weight configured for the global bone rotation angle;
determining a second weight configured for a target local bone rotation angle, the first weight being less than the second weight, the target local bone rotation angle comprising at least one of: the first local bone rotation angle, the second local bone rotation angle, the third local bone rotation angle; and
and generating driving information corresponding to the action of the object in the target time period according to the first weight, the global skeleton rotation angle, the second weight and the target local skeleton rotation angle.
14. A driving method of an avatar, comprising:
Determining, in response to capturing an action from the target object, driving information for driving the target avatar, the driving information generated based on the method of any one of claims 1-13; and
and driving the target virtual image to perform corresponding actions by using the driving information.
15. A driving information generating apparatus comprising:
a first determination module for determining a first local bone rotation angle of a first local bone at least one moment in a target time period, the first local bone characterizing bone between a first articulation point and a second articulation point of a subject; and
the generation module is used for generating driving information corresponding to the motion of the object in the target time period according to a global skeleton rotation angle and the first local skeleton rotation angle, and the global skeleton rotation angle is determined according to joint point position information of an object joint point of the object at the at least one moment.
16. The apparatus of claim 15, wherein the first determination module comprises:
a first determination sub-module for determining a first initial bone rotation angle of the first local bone; and
And the second determining submodule is used for determining the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information of the first joint point and the second joint point position information of the second joint point.
17. The apparatus of claim 16, wherein the first determination submodule comprises:
A first determining unit, configured to determine the first joint point position information of the first joint point and the second joint point position information of the second joint point according to feedback information of optical marker points provided on the object; and
and the second determining unit is used for determining the first initial bone rotation angle according to the first joint point position information and the second joint point position information.
18. The apparatus of claim 16 or 17, wherein a third articulation point is included between the first articulation point and the second articulation point; the second determination submodule includes:
a third determining unit, configured to determine a third joint point rotation angle of the third joint point according to the first initial bone rotation angle, the first joint point pose information, and the second joint point position information; and
And the fourth determining unit is used for determining the first local bone rotation angle according to the first joint point pose information, the second joint point position information and the third joint point rotation angle.
19. The apparatus of claim 15, wherein the object joint points comprise an acquired fourth joint point and a non-acquired fifth joint point; the apparatus further comprises:
the second determining module is used for determining fifth joint point position information of the fifth joint point according to fourth joint point position information of the fourth joint point; and
and the third determining module is used for determining the global skeleton rotation angle according to the fourth joint point position information and the fifth joint point position information.
20. The apparatus of claim 19, further comprising:
and the fourth determining module is used for determining the target object joint point as the fourth joint point in response to determining that the target object joint point of the object is acquired by a preset number of acquisition devices positioned at different viewing angles.
21. The apparatus of claim 16, wherein the second articulation point comprises a foot articulation point; the second determination submodule includes:
A fifth determining unit, configured to determine a target foot articulation point according to the touchdown state information of the foot articulation point, where the target foot articulation point includes at least one of the following: heel articulation point and toe articulation point; and
and a sixth determining unit, configured to determine the first local bone rotation angle according to the first initial bone rotation angle, the first joint point pose information, the ground contact state information, and target foot articulation point position information of the target foot articulation point.
22. The apparatus of claim 21, wherein the fifth determining unit comprises:
a first determining subunit, configured to determine, in response to determining that, according to the touchdown state information, both the heel and the toe are in a touchdown state or are in an untouched state, the heel articulation point or the toe articulation point as the target foot articulation point;
a second determining subunit, configured to determine the heel articulation point as the target foot articulation point in response to determining that the heel is in a touchdown state and the toe is in an untouched state according to the touchdown state information; and
and a third determining subunit, configured to determine the toe-articulation point as the target foot-articulation point in response to determining that the toe is in a touchdown state and that the heel is in an untouched state according to the touchdown state information.
23. The apparatus of claim 15, wherein the second articulation point comprises a first sub-articulation point and a second sub-articulation point; the apparatus further comprises:
a fifth determining module, configured to determine local motion information generated by a second local bone of the object in the target time period according to a second initial bone rotation angle sequence of the second local bone in the target time period, where the second local bone characterizes a bone between the first sub-articular point and the second sub-articular point; and
and the adjusting module is used for responding to the fact that the local action information is matched with the preset action information, and adjusting the local action information according to the preset action information to obtain a second local skeleton rotation angle sequence.
24. The apparatus of claim 23, wherein the adjustment module comprises:
a seventh determining unit, configured to determine a standard bone rotation angle sequence according to the preset motion information; and
an eighth determining unit, configured to determine the second local bone rotation angle sequence according to the standard bone rotation angle sequence.
25. The apparatus of claim 23 or 24, wherein the generating means comprises:
A third determination sub-module for determining a third local bone rotation angle of a third local bone at the at least one time, the third local bone characterizing bone between the first articulation point and the first sub-articulation point; and
the first generation sub-module is used for generating driving information corresponding to the action of the object in the target time period according to the global skeleton rotation angle and the third local skeleton rotation angle which correspond to the same moment.
26. The apparatus of claim 25, wherein the third determination submodule comprises:
a ninth determining unit, configured to determine, according to the second local bone rotation angle sequence, first sub-joint point position information of the first sub-joint point at the at least one moment; and
a tenth determining unit, configured to determine the third local bone rotation angle according to the first sub-joint point position information, the third initial bone rotation angle of the third local bone, and the first joint point pose information of the first joint point, which correspond to the same time.
27. The apparatus of claim 15, wherein the generating means comprises:
A fourth determination submodule for determining a first weight configured for the global skeletal rotation angle;
a fifth determination submodule for determining a second weight configured for a target local bone rotation angle, the first weight being less than the second weight, the target local bone rotation angle including at least one of: the first partial bone rotation angle, the second partial bone rotation angle, the third partial bone rotation angle; and
and the second generation submodule is used for generating driving information corresponding to the action of the object in the target time period according to the first weight, the global skeleton rotation angle, the second weight and the target local skeleton rotation angle.
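Claim 27 combines the global skeleton rotation angle with a target local bone rotation angle, placing the smaller weight on the global term so that local limb detail dominates. A small illustrative blend (the specific weight values and the normalized linear blend of Euler angles are our assumptions, not the patent's method):

```python
import numpy as np

def blend_driving_angles(global_angle, local_angle, w_global=0.3, w_local=0.7):
    """Normalized weighted combination of the global and local bone
    rotation angles. Claim 27 only requires the global weight to be
    smaller than the local weight; the default values are ours."""
    assert w_global < w_local, "first weight must be less than second weight"
    total = w_global + w_local
    return (w_global * np.asarray(global_angle)
            + w_local * np.asarray(local_angle)) / total
```

For example, blending a zero global angle with a local angle of 10 degrees per axis at weights 0.3/0.7 yields 7 degrees per axis, biased toward the local motion as the claim requires.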
28. An avatar driving apparatus, comprising:
a sixth determining module for determining driving information for driving the target avatar in response to capturing the motion from the target object, the driving information being generated based on the apparatus of any one of claims 15 to 27; and
and the driving module is used for driving the target virtual image to perform corresponding actions by utilizing the driving information.
29. An electronic device, comprising:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-14.
30. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-14.
CN202310500623.4A 2023-05-06 2023-05-06 Driving information generation method, driving device, electronic equipment and storage medium Active CN116206370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310500623.4A CN116206370B (en) 2023-05-06 2023-05-06 Driving information generation method, driving device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310500623.4A CN116206370B (en) 2023-05-06 2023-05-06 Driving information generation method, driving device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116206370A true CN116206370A (en) 2023-06-02
CN116206370B CN116206370B (en) 2024-02-23

Family

ID=86517729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310500623.4A Active CN116206370B (en) 2023-05-06 2023-05-06 Driving information generation method, driving device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116206370B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116681809A * 2023-06-28 2023-09-01 Beijing Baidu Netcom Science and Technology Co., Ltd. Method and device for driving virtual image, electronic equipment and medium
CN116894894A * 2023-06-19 2023-10-17 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, device and storage medium for determining motion of avatar

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229332A * 2017-12-08 2018-06-29 Huawei Technologies Co., Ltd. Bone posture determination method, device and computer-readable storage medium
CN109816773A * 2018-12-29 2019-05-28 Shenzhen Realis Multimedia Technology Co., Ltd. Driving method, plug-in and terminal device for a skeleton model of a virtual character
CN112686976A * 2020-12-31 2021-04-20 MIGU Culture Technology Co., Ltd. Processing method and device of skeleton animation data and communication equipment
CN115147523A * 2022-07-07 2022-10-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Avatar driving method and apparatus, device, medium, and program product

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229332A * 2017-12-08 2018-06-29 Huawei Technologies Co., Ltd. Bone posture determination method, device and computer-readable storage medium
US20190251341A1 * 2017-12-08 2019-08-15 Huawei Technologies Co., Ltd. Skeleton Posture Determining Method and Apparatus, and Computer Readable Storage Medium
CN109816773A * 2018-12-29 2019-05-28 Shenzhen Realis Multimedia Technology Co., Ltd. Driving method, plug-in and terminal device for a skeleton model of a virtual character
CN112686976A * 2020-12-31 2021-04-20 MIGU Culture Technology Co., Ltd. Processing method and device of skeleton animation data and communication equipment
CN115147523A * 2022-07-07 2022-10-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Avatar driving method and apparatus, device, medium, and program product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI Hongbo; SUN Boyuan; LI Shuangsheng: "Virtual character control method based on skeleton information", Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), no. 01 *


Also Published As

Publication number Publication date
CN116206370B (en) 2024-02-23

Similar Documents

Publication Publication Date Title
CN116206370B (en) Driving information generation method, driving device, electronic equipment and storage medium
CN111460875B (en) Image processing method and apparatus, image device, and storage medium
KR102346320B1 (en) Fast 3d model fitting and anthropometrics
CN111488824B (en) Motion prompting method, device, electronic equipment and storage medium
CN111694429A Virtual object driving method and device, electronic equipment and readable storage medium
KR101711736B1 (en) Feature extraction method for motion recognition in image and motion recognition method using skeleton information
CN113706699B (en) Data processing method and device, electronic equipment and computer readable storage medium
CN111294665B (en) Video generation method and device, electronic equipment and readable storage medium
EP3628380B1 (en) Method for controlling virtual objects, computer readable storage medium and electronic device
CN109448099A (en) Rendering method, device, storage medium and the electronic device of picture
US10395404B2 (en) Image processing device for composite images, image processing system and storage medium
CN108564643A Performance capture system based on the UE engine
JP2019096113A (en) Processing device, method and program relating to keypoint data
CN113129450A (en) Virtual fitting method, device, electronic equipment and medium
CN114862992A (en) Virtual digital human processing method, model training method and device thereof
KR20170067673A (en) Method and apparatus for generating animation
CN114187392B Virtual idol generation method and device and electronic equipment
CN114677572B (en) Object description parameter generation method and deep learning model training method
CN111599002A (en) Method and apparatus for generating image
Yan et al. Human-object interaction recognition using multitask neural network
Amrutha et al. Human Body Pose Estimation and Applications
CN116524081A (en) Virtual reality picture adjustment method, device, equipment and medium
CN116092120B (en) Image-based action determining method and device, electronic equipment and storage medium
Xu et al. 3D joints estimation of the human body in single-frame point cloud
WO2023035725A1 (en) Virtual prop display method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant