CN117911632B

CN117911632B - Human body node three-dimensional virtual character action reconstruction method, equipment and computer readable storage medium

Info

Publication number: CN117911632B
Application number: CN202410309471.4A
Authority: CN
Inventors: 顾幸慈; 袁野
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2024-03-19
Filing date: 2024-03-19
Publication date: 2024-05-28
Anticipated expiration: 2044-03-19
Also published as: CN117911632A

Abstract

The invention provides a human body node three-dimensional virtual character action reconstruction method, equipment and a computer readable storage medium, and belongs to the technical field of computer vision and data processing. The method comprises the steps of establishing connection with data transmission equipment by using a monocular camera, capturing a frame data stream, obtaining a human body sampling node set and a model posture detection state by using frame data through a human body node detection model, and carrying out filtering processing on the data; acquiring a set of human body sampling node coordinates, packaging the set into a frame human body node three-dimensional coordinate packet, and transmitting the frame human body node three-dimensional coordinate packet into a network transmission channel; the data receiving equipment receives and decodes the data, calculates a skeleton rotation quaternion and constructs an intermediate change matrix; and obtaining the rotation quantity of the skeleton node through the skeleton rotation quaternion and the intermediate change matrix, reconstructing the human body action, and updating the corresponding virtual character action in the three-dimensional virtual environment. The invention has the advantages of simplifying the data acquisition process, reducing the cost and simultaneously maintaining high accuracy and efficiency.

Description

Human body node three-dimensional virtual character action reconstruction method, equipment and computer readable storage medium

Technical Field

The invention belongs to the technical field of computer vision and data processing, and particularly relates to a human body node three-dimensional virtual character action reconstruction method.

Background

In the field of computer vision, human body pose estimation is a key technology aimed at identifying and locating key nodes of the human body from images or videos. At present, human body posture estimation technology is widely applied to a plurality of fields such as man-machine interaction, animation production, sports analysis and the like. However, conventional methods typically rely on multiple cameras or special sensors to capture human motion, which are costly and complex to set up.

Monocular cameras are a more convenient and economical option and have the potential to achieve three-dimensional human body pose estimation, but they present a number of challenges. For example, compared with binocular cameras, monocular cameras suffer from viewing-angle limitations and a lack of depth information, and lower-resolution cameras have difficulty accurately capturing complex human motion and fine node positions in three-dimensional space.

Therefore, the prior art is limited in its ability to use monocular cameras for accurate human body node data acquisition and processing, and it faces great challenges in particular when efficient and accurate three-dimensional pose reconstruction must be performed in a resource-limited environment.

Disclosure of Invention

The invention provides a human body node three-dimensional virtual character action reconstruction method, which aims to improve acquisition accuracy and accurately map to a three-dimensional virtual space.

The invention provides a human body node three-dimensional virtual character action reconstruction method, which comprises the following steps:

Step S101, a monocular camera is used for establishing connection with data transmission equipment, the data transmission equipment initializes the monocular camera, and the monocular camera captures frame data stream and transmits the frame data stream to the data transmission equipment to obtain frame data;

Step S102, the frame data are transmitted into a human body node detection model to obtain a detected human body sampling node set and a model posture detection state, and the coordinates of each human body sampling node are subjected to multiple filtering to obtain a filtered set of all human body sampling node coordinates of the frame data;

Step S103, packaging the set of all human body sampling node coordinates of the frame data into a frame human body node three-dimensional coordinate packet, placing the frame human body node three-dimensional coordinate packet into a network transmission channel, and opening the network transmission channel by data receiving equipment to receive and decode the frame human body node three-dimensional coordinate packet;

Step S104, calculating skeleton rotation quaternion for each skeleton according to the decoded data of the frame human body node three-dimensional coordinate packet, and constructing an intermediate change matrix;

Step S105, obtaining the rotation quantity of skeleton nodes through the skeleton rotation quaternion and the intermediate change matrix, reconstructing human body actions, and updating corresponding virtual character actions in a three-dimensional virtual environment;

Step S106, storing or transmitting the rotation amount of each node to a predetermined storage system or network endpoint according to the current device on-line condition, or storing the rotation amount of each node in a storage medium of the data receiving device.

With reference to the first aspect, in a first implementation manner of the first aspect of the present invention, the establishing a connection with a data sending device by using a monocular camera, initializing the monocular camera by the data sending device, capturing a frame data stream by the monocular camera, and transmitting the frame data stream to the data sending device to obtain frame data, including:

the monocular camera establishes a wired or wireless connection with the data transmission equipment, and data are transmitted through a preset communication protocol;

The data transmission equipment initializes a monocular camera and sets sensor parameters of the monocular camera;

and capturing the frame data stream by the monocular camera, and transmitting the frame data stream to the data transmitting equipment in a wired or wireless transmission mode to obtain one frame of image data, namely frame data.

With reference to the first aspect, in a second implementation manner of the first aspect of the present invention, the transmitting the frame data into a human body node detection model to obtain a human body sampling node set and a model gesture detection state detected by the frame data, and performing multiple filtering on coordinates of each human body sampling node to obtain a set of coordinates of all human body sampling nodes of the frame data after filtering, where the method includes:

The frame data are transmitted into a human body node detection model, and the human body node detection model outputs a detected set of human body sampling nodes and a model posture detection state;

Determining the detection state of the frame data model according to the detection state of the model posture and the number of the human body sampling nodes;

and carrying out various filtering operations on the coordinates of each human body sampling node in the human body sampling node set according to the frame data model detection state to obtain a set of all human body sampling node coordinates of the frame data after the influence of noise and interference is filtered.

With reference to the first aspect, in a third implementation manner of the first aspect of the present invention, the packaging the set of coordinates of all human body sampling nodes of the frame data into a frame human body node three-dimensional coordinate packet, and placing the frame human body node three-dimensional coordinate packet into a network transmission channel, the data receiving device opens the network transmission channel to receive and decode the frame human body node three-dimensional coordinate packet, includes:

Packaging the set of all human body sampling node coordinates of the frame data into a frame human body node three-dimensional coordinate packet, wherein the content of the frame human body node three-dimensional coordinate packet comprises the three-dimensional coordinates of all detection nodes of the frame data;

And encoding the three-dimensional coordinate packet of the frame human body node, and then putting the encoded three-dimensional coordinate packet into a network transmission channel, wherein the network transmission channel comprises a wired network transmission channel and a wireless network transmission channel.

The data receiving device opens the network transmission channel to receive the three-dimensional coordinate packet of the frame human body node and stores the three-dimensional coordinate packet in a character string variable, wherein the three-dimensional coordinate packet represents the human body node.

With reference to the first aspect, in a fourth implementation manner of the first aspect of the present invention, the calculating, for each bone, a bone rotation quaternion by using the decoded data of the frame human node three-dimensional coordinate packet, and constructing an intermediate change matrix includes:

In calculating the intermediate change matrix of the bone, a rotation quaternion needs to be created, the creation of which needs to specify two vectors, the first vector being the forward direction and the second vector being the upward direction, the Z-axis of the rotation quaternion will be aligned with the forward direction, the X-axis will be aligned with the cross product between the forward direction and the upward direction, and the Y-axis will be aligned with the cross product between the Z-axis and the X-axis. If the upward direction is not specified, the upward direction is set as a world upward parameter, i.e., an upward vertical axis vector; creating the rotation quaternion for ensuring that the rotation conforms to the biomechanical properties of the human body;

Creating a skeletal rotation quaternion: the front direction is set as the front direction of the bone, and the upper direction is set as the upper direction of the bone;

Creating a rotation-oriented quaternion: the front direction is set to be the opposite direction of the front direction of the skeleton, and the upper direction is the self-defined upper direction;

multiplying the inverse quaternion of the bone rotation quaternion with the rotation-oriented quaternion to obtain the intermediate change matrix of the bone.

With reference to the first aspect, in a fifth implementation manner of the first aspect of the present invention, the obtaining, by the skeletal rotation quaternion and the intermediate change matrix, a rotation amount of a skeletal node, reconstructing a human body action, and updating a corresponding virtual character action in a three-dimensional virtual environment includes:

Multiplying the rotation quaternion of each skeleton node by the intermediate change matrix to obtain the rotation quantity of each node, wherein the rotation quantity is the rotation quaternion;

Taking the opposite direction of the front direction of the rotation quaternion of the skeleton of each frame as the front direction of a new rotation quaternion, taking the upper direction of the quaternion of the skeleton of each frame as the upper direction of the new rotation quaternion, and obtaining the rotation quantity of a skeleton node by inversely multiplying the new rotation quaternion and an intermediate change matrix;

Updating the rotation quaternion of each node by using the rotation quantity, and dynamically updating the gesture of the character model to reconstruct the action of the virtual character; the motion of the character model is synchronized with the actual motion data using the displacement and joint rotation, and is used as a map of the actual character motion.

With reference to the first aspect, in a sixth implementation manner of the first aspect of the present invention, the storing or transmitting, according to the current device online condition, the rotation amount of each node to a predetermined storage system or a network endpoint includes:

the rotation amount of each node will be stored in a predetermined storage medium or transmitted to other devices in the network through a network transmission channel port.

The invention provides a set of monocular camera-based human body node data acquisition processing and three-dimensional virtual character action reconstruction equipment, which comprises the following components: a monocular camera, a data transmitting device and a data receiving device;

with reference to the second aspect, the monocular camera includes a photosensitive element as a sensor;

The data transmission device comprises a memory, at least one processor and a network access device, wherein the memory is used for storing instructions, the at least one processor is used for calling the instructions in the memory, and the network access device is used for transmitting processed data through a network;

The data receiving device comprises a memory, at least one processor, a network access device and a display device, wherein the memory is used for storing instructions and processing results, the at least one processor is used for calling the instructions in the memory, the network access device is used for receiving processed data through a network, and the display device is used for displaying virtual three-dimensional roles.

A third aspect of the present invention provides a computer-readable storage medium having instructions stored therein that, when run on a computer, cause the computer to perform a monocular camera-based human body node data acquisition processing and three-dimensional virtual character action reconstruction method.

Compared with the prior art, the invention has the following beneficial technical effects:

In the technical scheme provided by the invention, a monocular camera is firstly utilized to establish connection with data transmission equipment, the monocular camera is initialized, and a frame data stream is captured to obtain frame data; the frame data are displayed on a display of the data sending equipment and are transmitted into a human body node detection model, the model detects a human body sampling node set and a gesture detection state, and each human body sampling node coordinate is filtered according to the detection state to obtain a filtered frame human body sampling node coordinate set; and packaging the filtered coordinate set into a frame human body node three-dimensional coordinate packet, and transmitting the frame human body node three-dimensional coordinate packet through a network transmission channel. On the basis, the data receiving equipment receives a frame human body node three-dimensional coordinate packet through a network transmission channel and converts the three-dimensional coordinate data into a skeleton rotation quaternion; and obtaining the rotation quantity of the skeleton node through the skeleton rotation quaternion and the intermediate change matrix, reconstructing human body actions, and updating corresponding virtual character actions in the three-dimensional virtual environment. According to the invention, the frame data captured by the monocular camera is accurately processed and analyzed, and then the three-dimensional reconstruction is carried out on the human body node data, so that the mapping of the three-dimensional virtual character action is more accurate, and the accuracy of data acquisition and the efficiency of three-dimensional mapping are improved.

The invention aims to use the monocular camera, so that the data acquisition and three-dimensional reconstruction effects consistent with those of the binocular or multi-view camera are achieved under the monocular condition while the cost is reduced. The invention has the advantages of simplifying the data acquisition process, reducing the cost, keeping high accuracy and efficiency, and being applicable to the fields of man-machine interaction, animation production, sports analysis and the like.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.

Fig. 1 is a schematic diagram of an embodiment of a method for acquiring and processing human body node data and reconstructing three-dimensional virtual character actions based on a monocular camera according to an embodiment of the present invention.

Fig. 2 is a topological diagram of human joint node coordinates and a node correspondence table in a method for acquiring and processing human node data and reconstructing three-dimensional virtual character actions based on a monocular camera in an embodiment of the present invention.

Fig. 3 is a schematic diagram of an embodiment of a monocular camera-based human body node data acquisition processing and three-dimensional virtual character action reconstruction device according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of an embodiment of a monocular camera-based human body node data acquisition processing and three-dimensional virtual character action reconstruction device according to an embodiment of the present invention.

Detailed Description

The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.

The embodiment of the invention provides a human body node three-dimensional virtual character action reconstruction method which is used for improving the stability of data acquisition and the accuracy of data analysis.

For convenience of understanding, a specific flow of an embodiment of the present invention is described below, referring to fig. 1, a method for reconstructing a three-dimensional virtual character action of a human node in an embodiment of the present invention includes the following steps:

Step S101, a monocular camera is used for establishing connection with a data transmission device, the data transmission device initializes the monocular camera, and the monocular camera captures frame data stream and transmits the frame data stream to the data transmission device to obtain frame data.

It is to be understood that the execution subject of the present invention may be a monocular camera-based data transmitting device and a data receiving device, and may also be a terminal or a server, which is not limited herein. The embodiment of the invention is described taking a monocular camera-based data transmitting device and a data receiving device as the execution subject.

Specifically, the data transmitting device determines the data transmission mode and protocol between itself and the monocular camera, for example IIC, USB, SPI or UART, and connects to the monocular camera with a corresponding tool or API. The monocular camera establishes a wired or wireless connection with the data transmitting device and transmits data through the preset communication protocol. The data transmitting device initializes the monocular camera, configures the camera sensor parameters, including color, frame rate, resolution and port selection, and confirms that the connection succeeded. A human body node detection model is then introduced, and its visualization, confidence level, frame rate and number of detection joint points are configured. A Kalman filter is introduced and its noise variance and measurement covariance are configured; a low-pass filter is introduced and its parameters are configured. A blank data packet is constructed. The blank data packet is the minimum transmission unit for data communication between the data transmitting device and the data receiving device; it is an information matrix that stores multi-frame information, where each frame stores skeleton point information for the number of detection joint points. The monocular camera captures the frame data stream and transmits it to the data transmitting device in a wired or wireless manner, yielding one frame of image data, namely the frame data. The data transmitting device processes each frame of the stream in turn.
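The initialization described above can be sketched as a configuration object plus a blank packet constructor. This is only an illustrative sketch: the class name `SenderConfig`, the function `make_blank_packet`, and every default value are assumptions made for the example and are not given in the text.

```python
from dataclasses import dataclass


@dataclass
class SenderConfig:
    """Hypothetical container for the parameters configured at start-up."""
    num_joints: int                    # number of detection joint points
    frame_rate: int = 30               # assumed default
    resolution: tuple = (640, 480)     # assumed default
    camera_port: int = 0               # assumed default
    min_confidence: float = 0.5        # assumed detection confidence level
    kalman_noise_var: float = 1e-3     # assumed Kalman noise variance
    kalman_measure_cov: float = 1e-2   # assumed Kalman measurement covariance
    lowpass_alpha: float = 0.5         # assumed low-pass smoothing factor


def make_blank_packet(config, num_frames=1):
    """Build an information matrix: num_frames x num_joints x 3 zero coordinates."""
    return [
        [[0.0, 0.0, 0.0] for _ in range(config.num_joints)]
        for _ in range(num_frames)
    ]
```

Each frame slot holds one three-dimensional coordinate per joint, so later stages can fill the packet in place before it is encoded for transmission.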

Step S102, displaying the frame data on a display of the data transmitting equipment, transmitting the frame data into a human body node detection model to obtain a detected human body sampling node set and a model posture detection state, and performing multiple filtering on coordinates of each human body sampling node to obtain a filtered set of coordinates of all human body sampling nodes of the frame data;

Specifically, the frame data is displayed as one frame of picture information on the display of the data transmitting device. The frame data is fed into the human body node detection model, whose input is the frame data and whose output is a set of human body node coordinates, sized to the number of detection joint points, together with a model posture detection state. The frame data model detection state is determined by whether the number of nodes detected by the human body node detection model meets the number of detection joint points: if it does, the frame data model detection state is true; if it does not, the state is false. When the state is false, the next frame image is captured directly. When the state is true, a Kalman filtering operation is applied to the three-dimensional coordinates of each human body sampling node, comparing the current frame with the previous frame. Because Kalman filtering alone may become trapped in a local optimum, a low-pass filter is introduced and cascaded after it; the low-pass filtering removes transient and burst motions so that the joint point data is more stable. The result is the set of human body sampling node coordinates with the influence of noise and interference filtered out.
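A minimal sketch of the cascaded filtering is given below. It assumes a scalar constant-position Kalman filter per coordinate and a first-order exponential low-pass filter; the text does not specify the filter models, so the class names, the variance values and the smoothing factor are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalarKalman:
    """1-D Kalman filter with a constant-position model (assumed model)."""
    process_var: float = 1e-3    # assumed process noise variance
    measure_var: float = 1e-2    # assumed measurement noise variance
    estimate: float = 0.0
    error_var: float = 1.0
    initialized: bool = False

    def update(self, measurement):
        if not self.initialized:
            self.estimate, self.initialized = measurement, True
            return self.estimate
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        self.error_var += self.process_var
        # Correct: blend the prediction and the measurement by the Kalman gain.
        gain = self.error_var / (self.error_var + self.measure_var)
        self.estimate += gain * (measurement - self.estimate)
        self.error_var *= 1.0 - gain
        return self.estimate


@dataclass
class LowPass:
    """First-order exponential low-pass filter."""
    alpha: float = 0.5           # assumed smoothing factor in (0, 1]
    state: Optional[float] = None

    def update(self, value):
        if self.state is None:
            self.state = value
        else:
            self.state = self.alpha * value + (1.0 - self.alpha) * self.state
        return self.state


class NodeFilter:
    """Cascade Kalman -> low-pass on the x, y, z coordinates of every node."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.kalman = [[ScalarKalman() for _ in range(3)] for _ in range(num_nodes)]
        self.lowpass = [[LowPass() for _ in range(3)] for _ in range(num_nodes)]

    def process(self, nodes, pose_detected):
        # Detection state is false if no pose was found or the node count is off.
        if not pose_detected or len(nodes) != self.num_nodes:
            return None          # caller skips to the next frame
        return [
            [self.lowpass[i][k].update(self.kalman[i][k].update(value))
             for k, value in enumerate(node)]
            for i, node in enumerate(nodes)
        ]
```

Returning `None` for a rejected frame leaves the filter state untouched, so a single bad detection does not disturb the smoothed trajectory.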

Specifically, the set of all human body sampling node coordinates of the frame data is a two-dimensional matrix. The matrix is flattened into an array, the array is converted into a character string, and the character string is encoded into a byte stream. The packed result is the frame human body node three-dimensional coordinate packet, whose content comprises the three-dimensional coordinates of all detection nodes of the frame data. The frame human body node three-dimensional coordinate packet is a network data packet and is placed in a network transmission message list. It is then sent through a network transmission channel, which may be wired or wireless.
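The packing sequence (matrix to array, array to string, string to byte stream) might look like the following sketch. The comma delimiter, the fixed decimal precision and the UTF-8 encoding are assumptions chosen for the example; the text does not name a wire format.

```python
def pack_frame(nodes, precision=4):
    """Pack an N x 3 coordinate matrix into a byte stream.

    The delimiter, precision and encoding are illustrative assumptions.
    """
    # Two-dimensional matrix -> flat array.
    flat = [coord for node in nodes for coord in node]
    # Flat array -> character string.
    text = ",".join(f"{value:.{precision}f}" for value in flat)
    # Character string -> byte stream ready for the message list.
    return text.encode("utf-8")


# Hypothetical message list holding packets awaiting transmission.
message_list = []
message_list.append(pack_frame([[0.5, 1.0, -2.0], [3.0, 4.0, 5.0]], precision=2))
```

A text encoding keeps the packet easy to inspect and to parse on the receiving side, at the cost of a larger payload than a binary layout.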

Specifically, the data receiving device must specify the network transmission channel used by the data transmitting device and initialize it, including building a thread delegate, setting a callback function and setting the background operation content. The data receiving device then opens and binds the network channel port to listen for network data. When data arrives through the network channel port of the network transmission channel, its content is stored in a data container and converted into a data format suitable for processing by the data receiving device. The data receiving device receives the frame human body node three-dimensional coordinate packet and stores it in a character string variable that represents the human body nodes. The string is split into individual coordinate values, which are scaled appropriately. The data of the frame human body node three-dimensional coordinate packet is parsed into the coordinates of specific human body joint nodes, which include at least the nose, left eye outside, right eye inside, right eye outside, left ear, right ear, left mouth corner, right mouth corner, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left little finger first finger joint, right little finger first finger joint, left index finger first finger joint, right index finger first finger joint, left thumb second finger joint, right thumb second finger joint, left hip, right hip, left knee, right knee, left ankle, right ankle, left heel, right heel, left toe, right toe, and the like. The specific positions of the joints are shown in fig. 2.
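On the receiving side, the splitting and scaling step could be sketched as follows. The function name `unpack_frame`, the comma delimiter and the `scale` parameter are assumptions for illustration; the actual scale factor depends on the units of the virtual environment and is not stated in the text.

```python
def unpack_frame(payload, num_nodes, scale=1.0):
    """Decode a received byte stream into a list of (x, y, z) node coordinates.

    Returns None when the packet does not contain exactly num_nodes nodes.
    The comma-separated text layout is an assumption of this sketch.
    """
    text = payload.decode("utf-8")            # byte stream -> string variable
    values = [float(token) * scale            # split and scale each coordinate
              for token in text.split(",") if token.strip()]
    if len(values) != num_nodes * 3:
        return None                           # incomplete frame
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
```

Rejecting incomplete frames here gives the later stages a single place to detect that fewer nodes than expected were received.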

Since the data transmitted by the data transmitting device does not contain all the node information required by the data receiving device, some node positions are calculated from the existing coordinate points. Three bone nodes are introduced: the neck, the root bone and the trunk. The neck bone node is located at the midpoint between the left shoulder node and the right shoulder node, the root bone node at the midpoint between the left hip node and the right hip node, and the trunk node at the midpoint between the left hip node and the right hip node.
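Deriving the additional nodes from received coordinates amounts to taking midpoints, as in this short sketch. The dictionary keys such as `left_shoulder` are hypothetical names chosen for the example, and only the neck and root nodes are shown.

```python
def midpoint(p, q):
    """Midpoint of two three-dimensional points."""
    return tuple((a + b) / 2.0 for a, b in zip(p, q))


def add_derived_nodes(nodes):
    """Return a copy of the node dictionary with neck and root nodes added."""
    derived = dict(nodes)
    derived["neck"] = midpoint(nodes["left_shoulder"], nodes["right_shoulder"])
    derived["root"] = midpoint(nodes["left_hip"], nodes["right_hip"])
    return derived
```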

In the initialization phase, a normalized cross product function is first used to calculate the normal vector as the current frontal orientation of the human body.

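Specifically, given three points a, b and c in space, the vector from b to a and the vector from c to a are computed, their cross product is taken, and the result is normalized. The formula is:

$$\mathbf{n} = \frac{(\mathbf{a}-\mathbf{b}) \times (\mathbf{a}-\mathbf{c})}{\lVert (\mathbf{a}-\mathbf{b}) \times (\mathbf{a}-\mathbf{c}) \rVert}$$

where a, b and c are the three-dimensional coordinates of the root bone, the left hip and the right hip, respectively.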
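The normalized cross product can be written as a small helper. The function name `triangle_normal` and the tolerance `eps` are assumptions of this sketch. Note that the normal is undefined when the three points are collinear, so the sketch raises an error in that case rather than dividing by zero.

```python
import math


def sub(u, v):
    """Component-wise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two three-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triangle_normal(a, b, c, eps=1e-9):
    """Unit normal of the plane through a, b, c: normalize((a - b) x (a - c))."""
    n = cross(sub(a, b), sub(a, c))
    length = math.sqrt(sum(x * x for x in n))
    if length < eps:
        raise ValueError("points are collinear; the normal is undefined")
    return tuple(x / length for x in n)
```

For example, `triangle_normal((0, 1, 0), (-1, 0, 0), (1, 0, 0))` yields a unit vector along the positive Z axis.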

A state vector of the kind used in modern control theory holds only the current motion position and velocity; with just these two state variables it cannot restore real human body dynamics. To simulate the bone rotation of a real person, rather than merely presenting the displacement of each bone key point, an intermediate change matrix is introduced.

Specifically, in calculating the intermediate change matrix of the bone, a rotation quaternion needs to be created, the creation of the rotation quaternion needs to specify two vectors, the first vector is the front direction, the second vector is the upper direction, the Z axis of the rotation quaternion will be aligned with the front direction, the X axis will be aligned with the cross product between the front direction and the upper direction, and the Y axis will be aligned with the cross product between the Z axis and the X axis. If the up direction is not specified, the up direction is set as a world up parameter, i.e., an up vertical axis vector. The rotation quaternion is created to ensure that the rotation conforms to the biomechanical properties of the human body.

Creating a skeletal rotation quaternion: the anterior direction is set to the anterior direction of the bone and the superior direction is set to the superior direction of the bone.

Creating a rotation-oriented quaternion: the anterior direction is set to be the opposite direction of the anterior direction of the bone, and the upward direction is the self-defined upward direction.

In this embodiment, the front direction of the root skeleton facing the rotation quaternion is the normal vector facing the front of the human body, and the upper direction is the world upward parameter; the self-defined upper directions of the trunk, the neck, the head, the left and right shoulders, the left and right elbows, the left and right hips and the left and right knees are normal vectors facing the right front of the human body; the self-defining upper direction of the wrists on the left side and the right side is the normal vector of the plane where the skeleton vectors of three skeletons of the little finger, the index finger and the wrist are located; the self-definition upper directions of the little finger and the index finger on the left and right sides are normal vectors of planes of the bone vectors of the three bones of the little finger, the index finger and the thumb; the custom upward direction of the ankle on the left and right sides is a vector directed from the ankle to the knee; the custom upward direction of the heels on the left and right sides is a vector directed from the heels to the ankle.
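The construction of the rotation quaternions and the intermediate change matrix can be sketched in plain Python as below. Several points are assumptions of the sketch rather than statements of the text: quaternions are stored as (w, x, y, z), the X axis is computed as up × forward so that the three axes form a proper rotation, and the function names `look_rotation` and `intermediate_change` are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)   # fails on a zero vector (degenerate input)


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def look_rotation(forward, up=(0.0, 1.0, 0.0)):
    """Quaternion (w, x, y, z) whose Z axis points along `forward`."""
    z = normalize(forward)
    x = normalize(cross(up, z))      # assumed operand order: up x forward
    y = cross(z, x)
    # Rotation matrix with columns x, y, z, converted to a quaternion.
    m00, m01, m02 = x[0], y[0], z[0]
    m10, m11, m12 = x[1], y[1], z[1]
    m20, m21, m22 = x[2], y[2], z[2]
    trace = m00 + m11 + m22
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)
        return (0.25 * s, (m21 - m12) / s, (m02 - m20) / s, (m10 - m01) / s)
    if m00 > m11 and m00 > m22:
        s = 2.0 * math.sqrt(1.0 + m00 - m11 - m22)
        return ((m21 - m12) / s, 0.25 * s, (m01 + m10) / s, (m02 + m20) / s)
    if m11 > m22:
        s = 2.0 * math.sqrt(1.0 + m11 - m00 - m22)
        return ((m02 - m20) / s, (m01 + m10) / s, 0.25 * s, (m12 + m21) / s)
    s = 2.0 * math.sqrt(1.0 + m22 - m00 - m11)
    return ((m10 - m01) / s, (m02 + m20) / s, (m12 + m21) / s, 0.25 * s)


def quat_mul(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_inverse(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(q, v):
    """Rotate vector v by unit quaternion q."""
    return quat_mul(quat_mul(q, (0.0, *v)), quat_inverse(q))[1:]


def intermediate_change(bone_rotation, bone_forward, custom_up):
    """Inverse bone rotation times the facing quaternion built from -forward."""
    facing = look_rotation(tuple(-c for c in bone_forward), custom_up)
    return quat_mul(quat_inverse(bone_rotation), facing)
```

The up vector must not be parallel to the forward direction, since the cross product would then vanish; the per-bone custom upward directions listed above serve to avoid that case.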

Step S105, obtaining the rotation quantity of skeleton nodes through the skeleton rotation quaternion and the intermediate change matrix, reconstructing human body actions, and updating corresponding virtual character actions in a three-dimensional virtual environment;

Specifically, the displacement of the character model is processed first. The initial frame is the first frame parsed from the data received by the data receiving device. The initial position offset of the root node is calculated from the initial frame, and in subsequent frames this initial offset is used to adjust the overall position of the character model, including its overall relative orientation and overall relative position, so as to simulate the movement of the character in three-dimensional space.
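The offset handling can be illustrated with a small tracker that records the offset on the first frame and applies it afterwards. The class name `RootTracker` and the idea of passing the model's starting position to the constructor are assumptions made for this sketch, and only the positional part is shown.

```python
class RootTracker:
    """Keep the character model aligned with the captured root node."""

    def __init__(self, model_position):
        self.model_position = model_position   # where the model starts
        self.offset = None                     # set from the initial frame

    def update(self, root):
        # Initial frame: record the offset between the model and the root node.
        if self.offset is None:
            self.offset = tuple(m - r for m, r in zip(self.model_position, root))
        # Every frame: shift the captured root by the initial offset.
        return tuple(r + o for r, o in zip(root, self.offset))
```

With a model starting at (0, 1, 0), a first root of (2, 3, 4) maps to (0, 1, 0), and a later root of (3, 3, 4) maps to (1, 1, 0), so only the movement relative to the initial frame is applied.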

Next, the rotation of the character model is updated. The forward direction of the virtual character is calculated, and the rotation state of each bone node is updated with a rotation quaternion. The front direction of this rotation quaternion is the opposite of the actual front direction of the bone, and its upper direction is calculated in the same way as the upper direction of each bone in step S104, that is, either the direction pointed to by the node or the calculated normal vector. The rotation quaternion of each bone node is multiplied by the intermediate change matrix to obtain the rotation amount of each node, which is itself a rotation quaternion.
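One possible reading of the per-frame update is sketched below. It assumes that the frame's facing quaternion is multiplied by the inverse of the intermediate change matrix, following the "inversely multiplying" wording used earlier in the document; under that assumption the initial pose reproduces the bone's initial rotation. Quaternions are stored as (w, x, y, z), and the function name `frame_rotation` is hypothetical.

```python
def quat_mul(a, b):
    """Hamilton product a * b of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_inverse(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def frame_rotation(facing, intermediate):
    """Rotation amount of a bone node for the current frame.

    Assumed interpretation: facing * inverse(intermediate), where
    intermediate = inverse(initial bone rotation) * initial facing.
    """
    return quat_mul(facing, quat_inverse(intermediate))
```

If the facing quaternion of the current frame equals the one used to build the intermediate change matrix, the result is the bone's initial rotation, which is a useful sanity check when wiring up a new character rig.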

When the received node data contains fewer nodes than the number of detected joint points, the character model is restored to its initial state by the built-in animation controller.

The relative displacement of the virtual character, each bone node and each rotation quaternion are mapped to the bone nodes of the virtual character in the three-dimensional virtual environment, and the bone nodes of the virtual character are adjusted to reconstruct its actions and positions, thereby visualizing the actions of the virtual character.

Step S106, the rotation amount of each node is stored or transmitted to a preset storage system or network endpoint according to the current equipment online condition, or the rotation amount of each node is stored in a storage medium of the data receiving equipment;

Specifically, depending on the online condition of the device, the rotation amount of each node is stored either in a predetermined storage system or network endpoint, or in a storage medium of the data receiving device. The online condition refers to whether the device can currently connect to the predetermined storage system or network endpoint. If the data receiving device and the predetermined storage system or network endpoint are in the same local area network, the corresponding data transmission port is open, and the data receiving device can obtain the network address of the predetermined storage system or network endpoint and access its service, the rotation amount of each node is stored in the predetermined storage system or network endpoint; if the data receiving device cannot obtain that network address, or cannot access the network services of the predetermined storage system or network endpoint, the rotation amount of each node is stored in the storage medium of the data receiving device.
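The online/offline decision can be sketched as a reachability probe followed by a fallback to local storage. This is a hedged illustration, not the patented implementation: the function names, the JSON-lines payload format, the one-second timeout and the use of a plain TCP connection are all assumptions.

```python
import json
import socket
from pathlib import Path

def endpoint_reachable(host, port, timeout=1.0):
    """True if a TCP connection to the endpoint can be opened right now."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failure, refused connection and timeout
        return False

def store_rotations(rotations, host, port, local_path, reachable=endpoint_reachable):
    """Send rotations to the endpoint when online, else append them to a local file.

    `rotations` maps a node name to its quaternion; returns "remote" or "local".
    """
    payload = json.dumps(rotations).encode("utf-8") + b"\n"
    if reachable(host, port):
        with socket.create_connection((host, port), timeout=1.0) as conn:
            conn.sendall(payload)
        return "remote"
    with Path(local_path).open("ab") as f:
        f.write(payload)
    return "local"
```

The `reachable` parameter is injectable so that the decision logic can be exercised without a network.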

The three-dimensional coordinates of the nodes, the bone rotation quaternions and the rotation amount of each node are stored in a local or cloud storage system to facilitate subsequent retrieval, use, analysis, display and sharing.

The human body node three-dimensional virtual character action reconstruction method of the embodiment of the present invention has been described above; the corresponding device is described below. Referring to Fig. 3, the human body node three-dimensional virtual character action reconstruction device in the embodiment of the present invention includes:

The monocular camera 301 is configured to acquire the data transmitted by the camera sensor;

The data processing module 302 is configured to obtain frame image data from the frame data stream captured by the camera sensor and process the image data into a three-dimensional node coordinate packet of the human skeleton, where the packet contains three-dimensional coordinate data of at least 33 skeleton nodes;

The data sending module 303 is configured to package the three-dimensional node coordinate packet of the human skeleton into a network transmission data packet, and send the network transmission data packet to a data receiving device through a wired or wireless network interface;

The data receiving module 304 is configured to receive a network transmission data packet through a wired or wireless network interface, and parse the network transmission data packet into data suitable for processing by the data receiving device;

The node transformation module 305 is configured to transform the three-dimensional human body node coordinates into an intermediate change matrix of each node and a displacement of the virtual character;

The node updating module 306 is configured to calculate and update a rotation amount of each node of the three-dimensional virtual character.
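The hand-off between the data sending module 303 and the data receiving module 304 can be sketched as a pack/unpack pair. The wire format below is an assumption made for illustration (a little-endian 16-bit node count followed by three 32-bit floats per node); the text does not specify the encoding, and the function names are hypothetical.

```python
import struct

def pack_nodes(nodes):
    """Encode a list of (x, y, z) node coordinates into one packet."""
    flat = [c for node in nodes for c in node]
    return struct.pack(f"<H{len(flat)}f", len(nodes), *flat)

def unpack_nodes(packet):
    """Decode a packet produced by pack_nodes back into (x, y, z) tuples."""
    (count,) = struct.unpack_from("<H", packet, 0)
    if len(packet) != 2 + 12 * count:
        raise ValueError("packet length does not match node count")
    flat = struct.unpack_from(f"<{3 * count}f", packet, 2)
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
```

A frame with 33 nodes occupies 2 + 33 × 12 = 398 bytes in this assumed format. Coordinates are rounded to 32-bit precision on the way through.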

Through the cooperation of the above components, the data transmitted by the camera sensor is acquired, and the frame image data of the frame data stream captured by the camera sensor is obtained and processed into a set of three-dimensional node coordinates of the human skeleton, yielding a three-dimensional node coordinate packet of the human skeleton; this packet is transmitted to the data receiving device through a wired or wireless network; and the three-dimensional human body node coordinates are converted into the rotation amount of each node, which is used to update the actions of the three-dimensional virtual character. The invention captures the actions of the user with a monocular camera, passes the camera data to the data transmitting device for processing, and then transmits the result to the data receiving device, where it is converted into the actions of the three-dimensional virtual character. The invention thereby improves the stability of data acquisition and the accuracy of data analysis at lower cost.

Fig. 3 above describes the human body node three-dimensional virtual character action reconstruction device in the embodiment of the present invention in detail from the perspective of a modularized functional entity; the device is described in detail below from the perspective of hardware processing.

Fig. 4 is a schematic structural diagram of a human body node three-dimensional virtual character action reconstruction device according to an embodiment of the present invention. The data sending device 600 and the data receiving device 680 of the device may differ considerably depending on configuration or performance, and each may include one or more processors (central processing units, CPU) 640, a memory 690, and one or more storage media 650 (e.g., one or more mass storage devices) storing applications 653 or data 652. The memory 690 and the storage medium 650 may be transitory or persistent storage. The program stored in the storage medium 650 may include one or more modules (not shown), each of which may include a series of instruction operations for the human body node three-dimensional virtual character action reconstruction device. Further, the processor 640 may be arranged to communicate with the storage medium 650 and to execute, on the human body node three-dimensional virtual character action reconstruction device, the series of instruction operations in the storage medium 650. An operating system 651 schedules and allocates the series of instruction operations in the storage medium 650 that are performed by the processor 640, and controls the image display of the display 620 and the transmission of the wired or wireless network interface 630. In the data transmitting device 600, the display 620 displays the image frame data stream captured by the monocular camera 670; in the data receiving device 680, the display 620 displays the motion of the virtual character.
The monocular camera 670 communicates with the input/output interface 660 through a data transmission channel, and the data transmitting device 600 communicates with the data receiving device 680 through the wired or wireless network interface 630; the data transmitting device 600 and the data receiving device 680 each further comprise a power source 610 for powering the device.

The invention also provides a computer readable storage medium, which may be a nonvolatile or a volatile computer readable storage medium. The computer readable storage medium stores instructions which, when run on a computer, cause the computer to execute the steps of the monocular camera-based human body node data acquisition processing and three-dimensional virtual character action reconstruction method.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied, in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. The human body node three-dimensional virtual character action reconstruction method is characterized by comprising the following steps of:

step S106, the rotation amount of each node is stored or transmitted to a preset storage system or network endpoint according to the current equipment online condition, or the rotation amount of each node is stored in a storage medium of the data receiving equipment; an online condition refers to whether the device can currently connect to a predetermined storage system or network endpoint;

In step S104, calculating a bone rotation quaternion for each bone according to the decoded data of the frame human node three-dimensional coordinate packet, and constructing an intermediate change matrix, including:

In calculating the intermediate change matrix of the skeleton, a rotation quaternion needs to be created, the creation of the rotation quaternion needs to specify two vectors, the first vector is the front direction, the second vector is the upper direction, the Z axis of the rotation quaternion is aligned with the front direction, the X axis is aligned with the cross product between the front direction and the upper direction, and the Y axis is aligned with the cross product between the Z axis and the X axis; if the upward direction is not specified, the upward direction is set as a world upward parameter, i.e., an upward vertical axis vector; creating the rotation quaternion for ensuring that the rotation conforms to the biomechanical properties of the human body;

2. The human node three-dimensional virtual character action reconstruction method according to claim 1, wherein: the method for establishing connection with a data transmission device by using a monocular camera, initializing the monocular camera by the data transmission device, capturing a frame data stream by the monocular camera and transmitting the frame data stream to the data transmission device to obtain frame data, comprises the following steps:

3. The human node three-dimensional virtual character action reconstruction method according to claim 1, wherein: the frame data is transmitted into a human body node detection model to obtain a detected human body sampling node set and a model gesture detection state, the coordinates of each human body sampling node are subjected to multiple filtering to obtain a filtered set of all human body sampling node coordinates of the frame data, and the method comprises the following steps:

4. The human node three-dimensional virtual character action reconstruction method according to claim 1, wherein: the step of packaging the set of all human body sampling node coordinates of the frame data into a frame human body node three-dimensional coordinate packet, and placing the frame human body node three-dimensional coordinate packet into a network transmission channel, the step of opening the network transmission channel by a data receiving device to receive and decode the frame human body node three-dimensional coordinate packet comprises the following steps:

Encoding the three-dimensional coordinate packet of the frame human body node, and then putting the encoded three-dimensional coordinate packet into a network transmission channel, wherein the network transmission channel comprises a wired network transmission channel and a wireless network transmission channel;

5. The human node three-dimensional virtual character action reconstruction method according to claim 1, wherein: obtaining the rotation quantity of the skeleton node through the skeleton rotation quaternion and the intermediate change matrix, reconstructing human body actions, and updating corresponding virtual character actions in a three-dimensional virtual environment, wherein the method comprises the following steps:

Taking the opposite direction of the front direction of the skeleton rotation quaternion of each frame as the front direction of the new rotation quaternion, taking the upper direction of the skeleton quaternion of each frame as the upper direction of the new rotation quaternion, and carrying out inverse multiplication on the new rotation quaternion and the intermediate change matrix to obtain the rotation quantity of skeleton nodes;

6. The human node three-dimensional virtual character action reconstruction method according to claim 1, wherein: the storing or transmitting the rotation amount of each node to a predetermined storage system or network endpoint according to the current device online condition comprises:

Storing the rotation amount of each node in a predetermined storage system or network endpoint according to the device on-line condition, or storing the rotation amount of each node in a storage medium of a data receiving device; if the data receiving device and the predetermined storage system or network endpoint are in the same local area network and the corresponding data transmission port is opened, the data receiving device can obtain the network address of the predetermined storage system or network endpoint and access the service of the predetermined storage system or network endpoint, then the rotation amount of each node is stored in the predetermined storage system or network endpoint; if the data receiving device cannot obtain the network address of the predetermined storage system or network endpoint, or the data receiving device cannot access the network services of the predetermined storage system or network endpoint, the rotation amount of each node will be stored in the storage medium of the data receiving device.

7. A human node three-dimensional virtual character action reconstruction device, characterized by comprising: a monocular camera, a data transmitting device and a data receiving device, for implementing the human node three-dimensional virtual character action reconstruction method according to any one of claims 1 to 6.

8. The human node three-dimensional virtual character motion reconstruction device according to claim 7, wherein: the monocular camera comprises a photosensitive element as a sensor;

9. A computer-readable storage medium, characterized by: the computer readable storage medium having instructions stored therein, which when run on a computer, cause the computer to perform the human node three-dimensional virtual character action reconstruction method of any one of claims 1-6.