CN116824014B - Data generation method and device for avatar, electronic equipment and medium

Info

Publication number: CN116824014B (grant of application CN116824014A)
Application number: CN202310787301.2A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: updated, joint, obtaining, human, collision
Legal status: Active (granted)
Inventors: 刘豪杰, 李丰果, 冯志强, 陈睿智, 赵晨
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The disclosure provides a data generation method, a device, electronic equipment and a medium for an avatar, relates to the technical field of artificial intelligence, and particularly relates to the technical fields of computer vision, augmented reality, virtual reality, deep learning and the like. The method may include: obtaining pose data for a target human body; obtaining a human joint model based on the pose data, the human joint model comprising two or more geometries respectively corresponding to human joints; responsive to determining that a collision exists between at least two geometries in the human joint model, determining respective identifications, directions of motion, and collision locations of the at least two geometries of the collision; and obtaining updated pose data based on the respective identifications, motion directions, and collision locations, the updated pose data for generating an avatar for the target human.

Description

Data generation method and device for avatar, electronic equipment and medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision, augmented reality, virtual reality, deep learning and the like, can be applied to scenes such as the metaverse and digital humans, and in particular provides a data generation method, a data generation device, electronic equipment, a computer-readable storage medium and a computer program product for an avatar.
Background
In computer vision, augmented reality, virtual reality, metaverse, and digital human scenarios, avatars are frequently generated, in particular in real time according to the pose of a human body. For such avatars, the self-collision or model-penetration ("clipping") problem is a very common phenomenon.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
The present disclosure provides a data generating method, apparatus, electronic device, computer-readable storage medium, and computer program product for an avatar.
According to an aspect of the present disclosure, there is provided a method for data generation of an avatar, including: obtaining pose data for a target human body; obtaining a human joint model based on the pose data, the human joint model comprising two or more geometries respectively corresponding to human joints; responsive to determining that a collision exists between at least two geometries in the human joint model, determining respective identifications, directions of motion, and collision locations of the at least two geometries of the collision; and obtaining updated pose data based on the respective identifications, motion directions, and collision locations, the updated pose data for generating an avatar for the target human.
According to another aspect of the present disclosure, there is provided an apparatus for data generation of an avatar, including: a pose obtaining unit for obtaining pose data for a target human body; a model obtaining unit configured to obtain a human joint model based on the pose data, the human joint model including two or more geometric bodies respectively corresponding to human joints; a collision determination unit for determining, in response to determining that a collision exists between at least two geometries in the human joint model, respective identifications, directions of motion, and collision positions of the at least two geometries of the collision; and a pose updating unit for obtaining updated pose data for generating an avatar for the target human body based on the corresponding identification, the movement direction, and the collision position.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a data generation method for an avatar according to one or more embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform a data generation method for an avatar according to one or more embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements a data generation method for an avatar according to one or more embodiments of the present disclosure.
According to one or more embodiments of the present disclosure, a self-collision in an avatar may be effectively eliminated.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, in accordance with an embodiment of the present disclosure;
Fig. 2 illustrates a flowchart of a data generation method for an avatar according to an embodiment of the present disclosure;
FIG. 3 illustrates a flow chart of a method for resolving motion-capture self-contact based on collision detection and inverse kinematics solving, in accordance with another embodiment of the present disclosure;
FIG. 4 illustrates a flow chart of a self-collision optimization method according to an exemplary embodiment of the present disclosure;
FIG. 5 shows a data flow diagram in accordance with an embodiment of the present disclosure;
Fig. 6 illustrates a block diagram of a data generating apparatus for an avatar according to an embodiment of the present disclosure;
fig. 7 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented, in accordance with an embodiment of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In an embodiment of the present disclosure, the server 120 may run one or more services or software applications enabling the execution of the data generation method for an avatar according to the present disclosure.
In some embodiments, server 120 may also provide other services or software applications, which may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof that are executable by one or more processors. A user operating client devices 101, 102, 103, 104, 105, and/or 106 may in turn utilize one or more client applications to interact with server 120 to utilize the services provided by these components. It should be appreciated that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may use the client devices 101, 102, 103, 104, 105, and/or 106 to generate an avatar, view an avatar, perform self-collision elimination on an avatar, view elimination results, or debug models, etc. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that the present disclosure may support any number of client devices.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., GOOGLE Chrome OS); or include various mobile operating systems such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head mounted displays (such as smart glasses) and other devices. The gaming system may include various handheld gaming devices, Internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), and Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a number of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. For example only, the one or more networks 110 may be a Local Area Network (LAN), an ethernet-based network, a token ring, a Wide Area Network (WAN), the internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a blockchain network, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., bluetooth, WIFI), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-range servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture that involves virtualization (e.g., one or more flexible pools of logical storage devices that may be virtualized to maintain virtual storage devices of the server). In various embodiments, server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above as well as any commercially available server operating systems. Server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, etc.
In some implementations, server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of client devices 101, 102, 103, 104, 105, and 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and 106.
In some implementations, the server 120 may be a server of a distributed system or a server that incorporates a blockchain. The server 120 may also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability in traditional physical hosts and Virtual Private Server (VPS) services.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of databases 130 may be used to store information such as audio files and video files. Database 130 may reside in various locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. Database 130 may be of different types. In some embodiments, the database used by server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key value stores, object stores, or conventional stores supported by the file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
A data generation method 200 for an avatar according to an exemplary embodiment of the present disclosure is described below with reference to fig. 2.
At step S201, pose data for a target human body is obtained.
At step S202, a human joint model is obtained based on the pose data, the human joint model including two or more geometries respectively corresponding to human joints.
At step S203, in response to determining that there is a collision between at least two geometries in the human joint model, respective identifications, directions of motion, and collision locations of the at least two geometries of the collision are determined.
At step S204, updated pose data is obtained based on the respective identification, direction of motion, and collision location, the updated pose data being used to generate an avatar for the target human.
According to the method of the embodiment of the present disclosure, the self-collision in the avatar can be effectively eliminated. By converting pose data into a simple geometric model, collision detection can be performed in near real time, thereby effectively eliminating the self-collision problem.
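As a concrete illustration, the four steps S201-S204 can be pictured as a per-frame loop. The following Python sketch is illustrative only: the callable names (perceive, build_model, detect, optimize) are assumed stand-ins for the perception, modeling, detection, and optimization modules described below, not the actual implementation of this disclosure.

```python
# Minimal per-frame sketch of method 200 (S201-S204); all callables are
# hypothetical stand-ins for the modules described in this disclosure.
def generate_frame_pose(frame, perceive, build_model, detect, optimize):
    pose = perceive(frame)          # S201: pose data for the target human body
    model = build_model(pose)       # S202: one simple geometry per joint
    collisions = detect(model)      # S203: ids, motion directions, contact points
    if collisions:                  # S204: iterate until the model is collision-free
        pose = optimize(pose, model, collisions)
    return pose                     # updated pose data drives the avatar
```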
According to some embodiments, obtaining updated pose data based on the respective identification, direction of motion, and collision location may include: obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location; and in response to determining that no collision exists between geometries in the updated human joint model, obtaining the updated pose data based on the updated human joint model.
According to such embodiments, the joint model can be updated to eliminate collisions based on the identification, direction of motion, and collision location. For example, the identification may be the block ID of a geometry, and the direction of motion may be the direction of motion of each colliding geometry. The collision location may be the coordinates of the contact point, such as absolute coordinates or coordinates relative to a corresponding geometry, and the present disclosure is not limited thereto.
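For illustration, the triple described above can be represented as a small record type. This is a hypothetical sketch under assumed names; it is not the data layout of this disclosure.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class CollisionRecord:
    """One detected self-collision between two joint geometries (illustrative)."""
    geometry_ids: tuple          # block IDs of the two colliding geometries
    motion_dirs: tuple           # motion direction of each colliding geometry
    contact_point: np.ndarray    # contact coordinates, absolute or geometry-relative
```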
According to some embodiments, obtaining an updated human joint model based on the identification of the at least two geometries, the direction of motion, and the collision location may include performing at least one of the following: obtaining updated joint positions of corresponding human joints of the human joint model based on the motion direction; obtaining an updated joint rotation angle based on the updated joint position; and performing collision detection based on the updated joint position and the updated joint rotation angle.
According to such an embodiment, it is possible to obtain an updated human joint model based on the updated joint rotation angle, and perform collision detection. For example, such collision-elimination steps may be iteratively performed until no collision is determined to exist.
According to some embodiments, obtaining an updated joint rotation angle based on the updated joint position may include: based on the updated joint position, the updated joint rotation angle is obtained by inverse kinematics.
Inverse kinematics (IK), sometimes referred to as inverse kinematics solving, computes joint angles, i.e., the pose of the human body, from joint positions. It is to be understood that the aspects of the present disclosure may be applied with various inverse kinematics algorithms known to those skilled in the art, and the present disclosure is not limited thereto.
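As a toy illustration of the position-to-angle computation, the analytic solution for a planar two-link chain is sketched below. It stands in for whatever full-body IK solver is actually used; the function name and link lengths are assumptions.

```python
import numpy as np

def two_link_ik(x, y, l1=0.30, l2=0.25):
    """Planar two-link inverse kinematics: target joint position -> joint angles.
    A toy stand-in for a full-body solver; l1 and l2 are assumed link lengths."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = np.clip(c2, -1.0, 1.0)          # clamp so unreachable targets still solve
    theta2 = np.arccos(c2)               # "elbow" joint angle
    theta1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(theta2),
                                           l1 + l2 * np.cos(theta2))
    return theta1, theta2
```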
According to some embodiments, obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion may comprise: obtaining an updating step length; and obtaining the updated joint position by moving the current joint position of the human joint corresponding to the human joint model by the update step length in the direction opposite to the movement direction.
In such an embodiment, upon detection of a self-collision, the corresponding joint may be moved by one step in the direction opposite to its direction of motion, returning it toward its state at the previous instant (or N steps earlier) to mitigate or eliminate the collision. For example, such steps may be performed iteratively. The step size may be preset, may be set by a user, or may be chosen according to actual conditions (e.g., compute budget, collision accuracy, real-time requirements, etc.).
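A minimal sketch of this update rule follows, assuming motion_dir is the joint's motion direction reported by collision detection and step is the chosen update step length; the function name is illustrative.

```python
import numpy as np

def step_back(joint_pos, motion_dir, step):
    """Move a colliding joint opposite to its motion direction by one step length."""
    v = np.asarray(motion_dir, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0.0:                      # no recorded motion: leave the joint as-is
        return np.asarray(joint_pos, dtype=float)
    return np.asarray(joint_pos, dtype=float) - step * (v / norm)
```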
According to some embodiments, obtaining pose data may include obtaining perceived joint pose data based on acquired images or video.
In such embodiments, the pose data may be obtained by motion capture of the human body, enabling the generation of an avatar from the human body pose acquired in real time.
According to some embodiments, obtaining a human joint model based on the pose data may include driving a predetermined initial human joint model using the pose data.
Thus, by establishing a predetermined initial human body model and driving it based on pose data, the model is made to reflect the current pose data.
According to some embodiments, the predetermined initial human joint model may be constructed by: obtaining joint parameters corresponding to a perception model of the pose data; and building the initial human joint model based on the joint parameters and a predetermined pool of geometries such that the number of joints of the initial human joint model is consistent with the number of joints of the perception model, wherein the pool of geometries comprises at least one of: capsule, sphere, ellipsoid, cylinder, hexahedron.
In such an embodiment, the number of joints of the manikin is consistent with the number of joints of the perception model, so the solution is highly scalable.
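The construction can be pictured as below: one geometry per perceived joint, drawn from the stated pool, so that the joint counts match. The names, the per-joint geometry choices, and the dimensions are illustrative assumptions, not the actual construction of this disclosure.

```python
from dataclasses import dataclass

GEOMETRY_POOL = ("capsule", "sphere", "ellipsoid", "cylinder", "hexahedron")

@dataclass
class JointGeometry:
    joint_id: int
    kind: str       # drawn from GEOMETRY_POOL
    params: dict    # e.g. radius / half_length; placeholder values below

def build_initial_model(num_perceived_joints, head_joint_id=0):
    """Build an initial joint model whose joint count matches the perception model."""
    model = []
    for j in range(num_perceived_joints):
        kind = "sphere" if j == head_joint_id else "capsule"  # assumed mapping
        model.append(JointGeometry(j, kind, {"radius": 0.05, "half_length": 0.12}))
    return model
```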
According to some embodiments, the method 200 may further include, after updating the pose data based on the identification of the at least two geometries, the direction of motion, and the collision location to eliminate the collision, driving a rendering model using the updated pose data to generate a rendered avatar.
For example, the generated avatar may be rendered only after all iteration and elimination steps are complete, to avoid artifacts such as jitter appearing in the avatar.
As the metaverse continues to gain popularity, enabling low-cost user interaction has become a key point of competition among metaverse-type apps. Good human-computer interaction can greatly improve the user's sense of immersion, and motion capture is a key enabling technology for human-computer interaction in the metaverse: when a person performs actions or dances, the corresponding digital human in the metaverse performs the same actions or dances.
Motion capture is a technique for transferring the motion of a real person in the real world onto a computer-generated virtual character or animation, and is widely used in the fields of games, entertainment, education, medical treatment, and the like. In motion capture systems, the self-contact problem is very important, because self-contact can cause problems such as model collapse in the virtual character or animation, thereby affecting the effect and quality of motion capture.
At present, digital humans driven by existing monocular or multi-view motion capture schemes almost always exhibit self-contact penetration, especially penetration between the hands and the body, between the two legs, and the like. The mainstream approach is to collect large amounts of training data to improve motion capture accuracy, which consumes considerable manpower and material resources, yet the self-contact problem remains difficult to solve.
Collision detection plays a very important role in the final effect, and the current UE4 lacks the capability to detect self-penetration and related collision-detection functionality. A physics engine may offer generic capabilities, but a capability specifically adapted to digital humans has not appeared in motion capture technology.
According to one or more embodiments of the present disclosure, effective and efficient collision avoidance is achieved.
The steps of a method 300 for resolving motion-capture self-contact based on collision detection and inverse kinematics solving, according to another embodiment of the present disclosure, are described below in connection with fig. 3.
At step S301, human body pose sensing is performed. There are many methods of acquiring joint pose data of a human body from captured images or videos, including but not limited to HRNet, VIBE, SPIN, TCMR, and the like. For example, an open-source human pose perception model may be employed. It is appreciated that the collision detection and inverse kinematics based method according to embodiments of the present disclosure may be applied after any motion-capture perception module.
At step S302, collision detection model modeling is performed. For example, a collision detection model specifically adapted to the digital human may be built according to embodiments of the present disclosure, in which each modeled joint corresponds one-to-one to a joint rotation angle perceived in the first step of human body pose sensing. By modeling with such a simple geometric model, efficiency can be greatly improved according to embodiments of the present disclosure.
At step S302, a model corresponding to the joints of the human body pose data, with the same number of joints, may thus be built. Unlike the points and faces of the original pose representation, the built model occupies concrete volumes: each joint may be replaced with a geometry, e.g., a capsule for a limb joint, a sphere for the head, and so on.
At step S303, collision detection is performed. After modeling is complete, the human body pose data and the modeled geometry are taken as input to a collision detection module, and a collision detection algorithm yields the IDs of the colliding geometries, the position of the contact point, and the movement directions of the joints.
It will be appreciated that, after modeling is complete, the model may be driven with the pose data, i.e., the joint angles, so that the model assumes the angles corresponding to the target human body. The input may be the pose data, and the output may be whether a collision occurred, the block IDs of the colliding geometries, and their directions of motion.
The direction of motion may be determined, for example, by comparing the previous frame with the current frame. The collision detection algorithm may check the positional relationship between the geometries. It is to be understood that the present disclosure is not so limited.
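For instance, with capsule geometries the positional check reduces to the distance between the capsules' axis segments, and the motion direction can be taken from the frame difference. The sketch below uses the standard closest-point-between-segments routine; it is an assumed concrete choice for illustration, not necessarily the algorithm used here.

```python
import numpy as np

def segment_distance(p1, q1, p2, q2, eps=1e-9):
    """Minimum distance between segments p1-q1 and p2-q2 (the capsule axes).
    All arguments are numpy 3-vectors."""
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r
    if a <= eps and e <= eps:                      # both segments degenerate to points
        return float(np.linalg.norm(r))
    if a <= eps:                                   # first segment is a point
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    elif e <= eps:                                 # second segment is a point
        c = d1 @ r
        s, t = np.clip(-c / a, 0.0, 1.0), 0.0
    else:
        b, c = d1 @ d2, d1 @ r
        denom = a * e - b * b
        s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
        t = (b * s + f) / e
        if t < 0.0:                                # re-clamp when t leaves [0, 1]
            s, t = np.clip(-c / a, 0.0, 1.0), 0.0
        elif t > 1.0:
            s, t = np.clip((b - c) / a, 0.0, 1.0), 1.0
    return float(np.linalg.norm((p1 + s * d1) - (p2 + t * d2)))

def capsules_collide(p1, q1, r1, p2, q2, r2):
    """Two capsules collide iff their axes come closer than the sum of radii."""
    return segment_distance(p1, q1, p2, q2) < r1 + r2

def motion_direction(joint_pos_prev, joint_pos_now):
    """Per-joint motion direction estimated by differencing consecutive frames."""
    return joint_pos_now - joint_pos_prev
```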
At step S304, self-collision optimization is performed.
From the previous step, the IDs of both colliding geometries, as well as the collision location and direction, are available. Self-collision optimization is concerned with making this colliding state disappear as quickly as possible and reaching a collision-free state.
After a collision is determined, step S304 may be entered. The optimization step size can be set manually, or varied according to compute budget and precision requirements. The subsequent rendering step may be entered after all collisions have been eliminated.
At step S305, engine driving and rendering are performed. The whole-body joint data after self-collision optimization can be used to drive and render digital assets, yielding the final motion capture result.
The steps of a self-collision optimization method 400 according to an exemplary embodiment of the present disclosure are described below in connection with fig. 4. In such an embodiment, the final state may be solved in an iterative optimization manner.
At step S401, an optimization step size is first determined.
At step S402, the joint position is updated with the movement direction and the step size.
In other words, each iteration moves the joint by one step size in the direction opposite to the motion direction.
At step S403, a new joint rotation angle is calculated by inverse kinematics using the new joint position. The model can then be driven again to check whether a collision remains, until the collision is finally eliminated.
At step S404, collision detection is performed.
At step S405, it is determined whether a collision exists; if a collision persists, step S402 may be repeated.
Finally, whole-body joint angle data free of self-contact collisions can be obtained.
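Putting steps S401-S405 together, the iteration can be sketched as follows. The callables and the max_iters safety cap are assumptions added for illustration; the description itself only requires iterating until no collision remains.

```python
def optimize_self_collision(pose, detect, update_positions, solve_ik,
                            step=0.01, max_iters=100):
    """Iterate S402-S405 until the driven model is free of self-collisions.
    detect(pose)          -> list of collision records (S404/S405), assumed callable
    update_positions(...) -> joint positions stepped against motion (S402)
    solve_ik(positions)   -> joint rotation angles via inverse kinematics (S403)"""
    for _ in range(max_iters):                   # safety cap (an added assumption)
        collisions = detect(pose)                # S404: drive the model and re-check
        if not collisions:                       # S405: done, pose is collision-free
            break
        positions = update_positions(pose, collisions, step)   # S402
        pose = solve_ik(positions)               # S403
    return pose                                  # whole-body joint angle data
```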
According to one or more embodiments of the present disclosure, a technical solution to the self-contact problem of motion capture models can be provided, which effectively solves the self-contact problem in motion capture. In accordance with one or more embodiments of the present disclosure, a method may include: establishing a collision detection model, and acquiring and processing the position and posture information of the person in the video in real time; processing the detected position and posture information of the moving subject, driving the collision detection model, and performing collision detection; and solving the collision optimization through inverse kinematics, so that the self-contact problem in the motion capture process is avoided. According to one or more embodiments of the present disclosure, the integrated application of collision detection and inverse kinematics achieves self-contact detection and inverse kinematics solving for the motion capture model, thereby ensuring the stability and accuracy of motion capture.
A data flow 500 according to an exemplary embodiment of the present disclosure is described with reference to fig. 5. As shown, the video capture device 510 obtains captured data and sends it to the human pose perception data module 520. Meanwhile, the collision detection model modeling module 530 generates a collision detection model 501, which is then driven by the human body pose sensed by the human pose perception data module 520. The driven model 502 is input to the collision detection module 540. If a collision exists in the model 502, the data is input to the self-collision optimization module 550, and after optimization, e.g., multiple iterative optimizations, the self-collision-free model 503 is obtained. Model 503 may then be output to the engine driven rendering module 560 for rendering output of the avatar (e.g., a digital human).
A data generating apparatus 600 for an avatar according to an embodiment of the present disclosure will now be described with reference to fig. 6. The data generating apparatus 600 for an avatar may include a pose obtaining unit 601, a model obtaining unit 602, a collision determining unit 603, and a pose updating unit 604. The pose obtaining unit 601 may be used to obtain pose data for a target human body. The model obtaining unit 602 may be configured to obtain a human joint model based on the pose data, the human joint model including two or more geometries respectively corresponding to human joints. The collision determination unit 603 may be adapted to determine, in response to determining that a collision exists between at least two geometries in the human joint model, a respective identity, a direction of movement and a collision position of the at least two geometries of the collision. The pose updating unit 604 may be configured to obtain updated pose data for generating an avatar for the target human body based on the respective identification, the direction of motion, and the collision position.
According to the device of the embodiment of the present disclosure, the self-collision in the avatar can be effectively eliminated.
According to some embodiments, obtaining updated pose data based on the respective identification, direction of motion, and collision location comprises: obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location; in response to determining that no collision exists between geometries in the updated human joint model, the updated pose data is obtained based on the updated human joint model.
According to some embodiments, obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location comprises performing at least one of the following: obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion; obtaining an updated joint rotation angle based on the updated joint position; and performing collision detection based on the updated joint position and the updated joint rotation angle.
According to some embodiments, obtaining an updated joint rotation angle based on the updated joint position comprises: based on the updated joint position, the updated joint rotation angle is obtained by inverse kinematics.
According to some embodiments, obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion comprises: obtaining an updating step length; and obtaining the updated joint position by moving the current joint position of the human joint corresponding to the human joint model by the update step length in the direction opposite to the movement direction.
According to some embodiments, obtaining pose data includes obtaining joint pose data of a human body based on acquired images or videos.
According to some embodiments, obtaining a human joint model based on the pose data includes driving a predetermined initial human joint model using the pose data.
According to some embodiments, the apparatus 600 may further comprise means for: after updating the pose data based on the identification of the at least two geometries, the direction of motion, and the collision location to eliminate the collision, a rendering model is driven using the updated pose data to generate a rendered avatar.
In the technical scheme of the disclosure, the collection, acquisition, storage, use, processing, transmission, provision, disclosure, and other handling of users' personal information all comply with the relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, there is also provided an electronic device, a readable storage medium and a computer program product.
Referring to fig. 7, a block diagram of an electronic device 700 that may be a server or a client of the present disclosure, which is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the electronic device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706, an output unit 707, a storage unit 708, and a communication unit 709. The input unit 706 may be any type of device capable of inputting information to the electronic device 700, the input unit 706 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a trackpad, a trackball, a joystick, a microphone, and/or a remote control. The output unit 707 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. Storage unit 708 may include, but is not limited to, magnetic disks, optical disks. The communication unit 709 allows the electronic device 700 to exchange information/data with other devices through computer networks, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as bluetooth devices, 802.11 devices, wiFi devices, wiMax devices, cellular communication devices, and/or the like.
The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the various methods and processes described above, such as methods 200, 300, and/or 400, variations thereof, and the like. For example, in some embodiments, the methods 200, 300, and/or 400, variations thereof, and the like, may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into RAM 703 and executed by computing unit 701, one or more steps of the methods 200, 300 and/or 400 described above, variations thereof, and the like, may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the methods 200, 300 and/or 400, variations thereof, etc., in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatus are merely exemplary embodiments or examples, and that the scope of the present disclosure is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalents thereof. Furthermore, the steps may be performed in an order different from that described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the disclosure.

Claims (18)

1. A method for data generation of an avatar, comprising:
Obtaining pose data for a target human body;
Obtaining a human joint model based on the pose data, the human joint model comprising two or more geometries respectively corresponding to human joints;
Responsive to determining that a collision exists between at least two geometries in the human joint model, determining respective identifications, directions of motion, and collision locations of the at least two geometries of the collision; and
Obtaining updated pose data based on the respective identifications, motion directions, and collision locations, the updated pose data for generating an avatar for the target human,
Wherein obtaining updated pose data based on the respective identifications, motion directions, and collision locations comprises:
Obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location; and
In response to determining that no collision exists between geometries in the updated human joint model, the updated pose data is obtained based on the updated human joint model.
2. The method of claim 1, wherein obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location comprises performing at least one of:
Obtaining updated joint positions of corresponding human joints of the human joint model based on the motion direction;
obtaining an updated joint rotation angle based on the updated joint position; and
Collision detection is performed based on the updated joint position and the updated joint rotation angle.
3. The method of claim 2, wherein obtaining an updated joint rotation angle based on the updated joint position comprises: based on the updated joint position, the updated joint rotation angle is obtained by inverse kinematics.
4. A method according to claim 2 or 3, wherein obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion comprises:
obtaining an updating step length; and
obtaining the updated joint position by moving the current joint position of the corresponding human joint of the human joint model by the updating step length in the direction opposite to the movement direction.
5. A method according to claim 2 or 3, wherein obtaining pose data comprises obtaining perceived joint pose data based on acquired images or video.
6. A method according to claim 2 or 3, wherein obtaining a human joint model based on the pose data comprises driving a predetermined initial human joint model using the pose data.
7. The method of claim 6, wherein the predetermined initial human joint model is constructed by:
obtaining joint parameters corresponding to a perception model of the pose data; and
Building the initial human joint model based on the joint parameters and a predetermined geometry pool such that a joint number of the initial human joint model is consistent with a joint number of the perception model, wherein the geometry pool comprises at least one of: capsule, sphere, ellipsoid, cylinder, hexahedron.
8. A method according to claim 2 or 3, further comprising, after updating the pose data based on the identity of the at least two geometries, the direction of motion, and the collision location to eliminate collisions, driving a rendering model using the updated pose data to generate a rendered avatar.
9. An apparatus for data generation of an avatar, comprising:
a pose obtaining unit for obtaining pose data for a target human body;
A model obtaining unit configured to obtain a human joint model based on the pose data, the human joint model including two or more geometric bodies respectively corresponding to human joints;
A collision determination unit for determining, in response to determining that a collision exists between at least two geometries in the human joint model, respective identifications, directions of motion, and collision positions of the at least two geometries of the collision; and
A pose updating unit for obtaining updated pose data based on the respective identifications, motion directions, and collision positions, the updated pose data for generating an avatar for the target human body,
Wherein obtaining updated pose data based on the respective identifications, motion directions, and collision locations comprises:
obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location;
In response to determining that no collision exists between geometries in the updated human joint model, the updated pose data is obtained based on the updated human joint model.
10. The apparatus of claim 9, wherein obtaining an updated human joint model based on the identity of the at least two geometries, the direction of motion, and the collision location comprises performing at least one of:
Obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion;
obtaining an updated joint rotation angle based on the updated joint position; and
Collision detection is performed based on the updated joint position and the updated joint rotation angle.
11. The apparatus of claim 10, wherein obtaining an updated joint rotation angle based on the updated joint position comprises: based on the updated joint position, the updated joint rotation angle is obtained by inverse kinematics.
12. The apparatus of claim 10 or 11, wherein obtaining respective updated joint positions of respective human joints of the human joint model based on the direction of motion comprises:
obtaining an updating step length; and
obtaining the updated joint position by moving the current joint position of the corresponding human joint of the human joint model by the updating step length in the direction opposite to the movement direction.
13. The apparatus of claim 10 or 11, wherein obtaining pose data comprises obtaining joint pose data of a human body based on acquired images or video.
14. The apparatus of claim 10 or 11, wherein obtaining a human joint model based on the pose data comprises driving a predetermined initial human joint model using the pose data.
15. The apparatus according to claim 10 or 11, further comprising means for: after updating the pose data based on the identification of the at least two geometries, the direction of motion, and the collision location to eliminate the collision, a rendering model is driven using the updated pose data to generate a rendered avatar.
16. An electronic device, comprising:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
17. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-8.
18. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method of any of claims 1-8.
CN202310787301.2A 2023-06-29 2023-06-29 Data generation method and device for avatar, electronic equipment and medium Active CN116824014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310787301.2A CN116824014B (en) 2023-06-29 2023-06-29 Data generation method and device for avatar, electronic equipment and medium


Publications (2)

Publication Number Publication Date
CN116824014A (en) 2023-09-29
CN116824014B (en) 2024-06-07

Family

ID=88118061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310787301.2A Active CN116824014B (en) 2023-06-29 2023-06-29 Data generation method and device for avatar, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116824014B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021005708A1 (en) * 2019-07-09 2021-01-14 株式会社ソニー・インタラクティブエンタテインメント Skeleton model updating device, skeleton model updating method, and program
US10919152B1 (en) * 2017-05-30 2021-02-16 Nimble Robotics, Inc. Teleoperating of robots with tasks by mapping to human operator pose
CN113129450A (en) * 2021-04-21 2021-07-16 北京百度网讯科技有限公司 Virtual fitting method, device, electronic equipment and medium
CN115100339A (en) * 2022-06-15 2022-09-23 北京百度网讯科技有限公司 Image generation method and device, electronic equipment and storage medium
WO2022218104A1 (en) * 2021-04-15 2022-10-20 北京字跳网络技术有限公司 Collision processing method and apparatus for virtual image, and electronic device and storage medium
CN116228867A (en) * 2023-03-15 2023-06-06 北京百度网讯科技有限公司 Pose determination method, pose determination device, electronic equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9245063B2 (en) * 2013-04-05 2016-01-26 The Boeing Company Creating ergonomic manikin postures and controlling computer-aided design environments using natural user interfaces


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Human motion capture, localization and environment reconstruction with 6 inertial sensors and 1 mobile phone; www.163.com; <https://www.163.com/dy/article/I4I4PBIL0511AQHO.html>; 2023-05-12; 1-15 *
Virtual human modeling for interactive assembly and disassembly operation in virtual reality environment; Shiguang Qiu et al.; ORIGINAL ARTICLE; 2013-08-03; Vol. 69; 2355-2372 *
Self-collision detection algorithm for soft tissue based on collidable sets in virtual surgery; Li Yanbo et al.; Journal of Computer Applications; 2009-08-01; Vol. 29, No. 8; 2101-2104 *

Also Published As

Publication number Publication date
CN116824014A (en) 2023-09-29


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant