CN114972441A - Motion synthesis framework based on deep neural network - Google Patents

Motion synthesis framework based on deep neural network Download PDF

Info

Publication number
CN114972441A
Authority
CN
China
Prior art keywords
motion
joint
sequence
frame
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210735748.0A
Other languages
Chinese (zh)
Inventor
何方展
薛鹏
夏贵羽
罗东
张泽远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202210735748.0A priority Critical patent/CN114972441A/en
Publication of CN114972441A publication Critical patent/CN114972441A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention relates to the field of computer technology, and in particular to a motion synthesis framework based on a deep neural network, which comprises the following steps: preparing training data and normalizing the joint coordinates; extracting the motion law of each motion sequence; training a motion-law extraction network; training a motion synthesis network to establish the relationship between the first and last frames of a motion sequence and its motion law; and generating the corresponding motion law from given first and last frames. The invention synthesizes realistic human motion when the first and last frames of a motion sequence are given, and addresses the problems of complex control and limited synthesis content in existing motion synthesis methods.

Description

Motion synthesis framework based on deep neural network
Technical Field
The invention relates to the technical field of computers, in particular to a motion synthesis framework based on a deep neural network.
Background
Motion data acquired by capture devices can be used to study the characteristics of human motion, for example for motion pattern recognition and motion tracking, and it also enables other promising applications in fields such as animation, robot driving and motion rehabilitation. However, motion capture is very expensive and is limited by the range of the actor's performance, so motion synthesis is an effective means of overcoming the high cost of motion capture.
Existing motion synthesis algorithms face two main problems. In one direction, methods that spare the user any professional operation of the synthesis process reduce the coordination of the synthesized motion, so the content of the results is limited, user requirements are hard to satisfy and creativity is hard to exercise. In the other direction, users often need professional motion synthesis knowledge to complete the synthesis task successfully. The invention provides a motion synthesis framework based on a deep neural network: a deep model is built to establish the relationship between the first and last frames of a motion sequence and its motion law, the corresponding motion sequence is synthesized from given first and last frames, and the controllability of motion synthesis is enhanced.
Disclosure of Invention
The present invention is directed to a motion synthesis framework based on a deep neural network, so as to solve the problems mentioned in the background art.
The technical scheme of the invention is as follows: a motion synthesis framework based on a deep neural network involves training data, joint coordinates, the motion law of a motion sequence, the relationship between a motion sequence and its motion law, and the relationship between the first and last frames of a motion sequence and its motion law. The motion synthesis method of the framework comprises the following steps:
S1, preparing training data and normalizing joint coordinates: collecting a number of motion sequences of a single motion type as training data and converting them into joint coordinates, then normalizing the joint coordinates and taking the coordinates of each joint relative to its parent joint as the features of that joint;
S2, extracting the motion law of part of the motion sequences involved in S1: calculating the angle between the position of a joint at any moment and its position in the starting frame, and taking the change curve of this angle as the motion law of the motion sequence;
S3, training a deep network on the normalized motion data and establishing the relationship between a motion sequence and its motion law: taking a motion sequence and the motion law extracted in S2 as a training data pair, and training an LSTM-based deep network to construct the relationship between the motion sequence and its motion law;
S4, extracting the motion laws of all motion sequences with the motion-law extraction network trained in S3;
S5, training a deep network on the normalized motion data and establishing the relationship between the first and last frames of a motion sequence and its motion law: taking the first and last frames of a motion sequence and the motion law extracted in S4 as a training data pair, and training an LSTM-based deep network to construct the relationship between the first and last frames and the motion law;
S6, generating the corresponding motion law, i.e. polynomial coefficients, from the given first and last frames with the network trained in S5;
and S7, computing the position of each joint at any time from the polynomial coefficients obtained in S6, thereby synthesizing a complete motion sequence.
Preferably, the position of each joint in the joint coordinates in S1 is represented by a three-dimensional vector and is normalized; the normalized coordinate is defined by a normalization formula (given as an equation image in the original).
Preferably, the motion law of the part of the motion sequences involved in S1 is extracted in S2. The angle θ is defined as the angle of the current-frame joint position relative to its starting position; θ is normalized, with the direction from the starting-frame joint position towards the corresponding end-frame joint position taken as positive.
The correspondence between the joint-position angle θ and the three-dimensional coordinates is expressed by a formula (given as an equation image in the original) whose terms denote, respectively, the position of each joint in the starting frame and in the end frame, and the angular change of the end frame relative to the starting frame. Solving this relation by the least-squares method then yields the sequence of the angle θ with respect to time t.
Preferably, in S3 the deep network is trained on the normalized motion data and the relationship between a motion sequence and its motion law is established: the input is a motion sequence and the output is the polynomial coefficients corresponding to its motion law. The preprocessed motion sequence is passed through a three-layer LSTM network to extract its temporal features, and the corresponding motion law is then output through a fully connected layer. The corresponding loss function (given as an equation image in the original) compares, at every sampling time point, the joint-angle value computed by the network with the actual joint-angle value, where the remaining symbols denote the number of sampling time points of the selected sequences and the number of input motion sequences.
Preferably, in S4, the motion law extraction network referred to in S3 is used to extract the motion laws of all motion sequences:
the network input is a motion sequence, and the output is a joint angle
Figure 403503DEST_PATH_IMAGE021
With respect to time
Figure 855344DEST_PATH_IMAGE022
Is represented by
Figure 453816DEST_PATH_IMAGE023
Wherein
Figure 268188DEST_PATH_IMAGE025
Representing the number of sampling time points of the selected sequence,
Figure 261289DEST_PATH_IMAGE026
representing the number of input motion sequences.
Preferably, in S5, the deep network is trained according to the normalized motion data and the association between the head and end frames and the motion law is established:
angle obtained in S2
Figure 731585DEST_PATH_IMAGE027
With respect to time
Figure 196064DEST_PATH_IMAGE028
Of (2)
Figure 5889DEST_PATH_IMAGE029
The corresponding relation of which can be used as a function
Figure 202515DEST_PATH_IMAGE030
Represents, i.e.:
Figure 118694DEST_PATH_IMAGE031
said function
Figure 324548DEST_PATH_IMAGE032
By a polynomial of order 5, using
Figure 113512DEST_PATH_IMAGE033
Representing the coefficients of the joint point corresponding polynomial.
The number of the synthesis modules is consistent with that of the human joints, namely, a single module is responsible for feature extraction of a single joint motion rule, a given head frame and a given tail frame are input, and a required motion rule is output. The synthesis module comprises three layers of LSTM units, namely a batch normalization layer and a final full-connection layer, and the LSTM network is responsible for extracting characteristic information of a first frame and a last frame; the loss function for each synthesis module is represented as:
Figure 684302DEST_PATH_IMAGE034
wherein
Figure 129190DEST_PATH_IMAGE035
Denotes the first
Figure 935472DEST_PATH_IMAGE036
The first and last frames of an input
Figure 218423DEST_PATH_IMAGE037
At each of the sampling time points, the sampling time point,
Figure 22431DEST_PATH_IMAGE038
represents the angle value extracted by the S4 motion law extraction network,
Figure 751353DEST_PATH_IMAGE039
and expressing polynomial coefficients corresponding to the motion law generated by the motion synthesis network.
Preferably, in S6, a corresponding motion law, i.e. polynomial coefficient, is generated through the trained network according to the given motion head and end frames:
the network inputs the frame head and tail of the motion sequence, and outputs polynomial coefficients corresponding to the motion law, namely:
Figure 299009DEST_PATH_IMAGE040
Figure 203511DEST_PATH_IMAGE041
indicating the number of human joints.
Preferably, in S7, the human motion is synthesized according to the motion law. For the first
Figure 975158DEST_PATH_IMAGE042
First of frame
Figure 896102DEST_PATH_IMAGE043
Angle corresponding to each joint
Figure 919553DEST_PATH_IMAGE044
It can be expressed as:
Figure 803196DEST_PATH_IMAGE045
wherein the content of the first and second substances,
Figure 683427DEST_PATH_IMAGE046
Figure 527886DEST_PATH_IMAGE047
is the first
Figure 479662DEST_PATH_IMAGE043
Polynomial coefficients of the individual joint motion curves; then converting the angle into corresponding three-dimensional coordinates according to the idea of spherical interpolation
Figure 388450DEST_PATH_IMAGE048
The formula is as follows:
Figure 908424DEST_PATH_IMAGE049
Figure 99234DEST_PATH_IMAGE050
Figure 526804DEST_PATH_IMAGE051
indicating the normalized coordinates of the first and last frames of a motion sequence,
Figure 260405DEST_PATH_IMAGE052
is the first
Figure 810335DEST_PATH_IMAGE043
The angle change from the starting frame to the ending frame of each joint; and when the normalized position of each joint is obtained, calculating the absolute position coordinate of each joint according to the structure of the human body and the length of the skeleton, and finally reconstructing the real human body motion.
Through these improvements, the invention provides a motion synthesis framework based on a deep neural network that, compared with the prior art, offers the following improvements and advantages:
First, the motion synthesis framework based on the deep neural network synthesizes realistic human motion when the first and last frames of a motion sequence are given, and addresses the problems of complex control and limited synthesis content in existing motion synthesis methods;
Second, the framework can generate natural intermediate motion from the first and last frames of a motion sequence provided by the user, which both ensures convenience of operation and allows rich motion content to be synthesized by controlling the first and last frames;
Third, the framework can be applied in many fields: in the film and television industry it can synthesize 3D human motion to drive virtual characters; in robotics it can synthesize special actions to drive humanoid robots; and in medical rehabilitation it can synthesize the normal motion posture of a patient with movement disorders to assist psychological therapy.
Drawings
The invention is further explained below with reference to the figures and examples:
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is a normalized graph of the present invention;
FIG. 3 is a diagram of a law of motion extraction network of the present invention;
fig. 4 is a diagram of a motion synthesis network of the present invention.
Detailed Description
The present invention is described in detail below, and the technical solutions in the embodiments of the invention are described clearly and completely. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the invention.
The invention provides a motion synthesis framework based on a deep neural network, and the technical scheme of the invention is as follows:
As shown in fig. 1, a motion synthesis framework based on a deep neural network involves training data, joint coordinates, the motion law of a motion sequence, the relationship between a motion sequence and its motion law, and the relationship between the first and last frames of a motion sequence and its motion law. The motion synthesis method of the framework comprises the following steps:
S1, preparing training data and normalizing joint coordinates: collecting a number of motion sequences of a single motion type as training data and converting them into joint coordinates, then normalizing the joint coordinates and taking the coordinates of each joint relative to its parent joint as the features of that joint;
S2, extracting the motion law of part of the motion sequences involved in S1: calculating the angle between the position of a joint at any moment and its position in the starting frame, and taking the change curve of this angle as the motion law of the motion sequence;
S3, training a deep network on the normalized motion data and establishing the relationship between a motion sequence and its motion law: taking a motion sequence and the motion law extracted in S2 as a training data pair, and training an LSTM-based deep network to construct the relationship between the motion sequence and its motion law;
S4, extracting the motion laws of all motion sequences with the motion-law extraction network trained in S3;
S5, training a deep network on the normalized motion data and establishing the relationship between the first and last frames of a motion sequence and its motion law: taking the first and last frames of a motion sequence and the motion law extracted in S4 as a training data pair, and training an LSTM-based deep network to construct the relationship between the first and last frames and the motion law;
S6, generating the corresponding motion law, i.e. polynomial coefficients, from the given first and last frames with the network trained in S5;
and S7, computing the position of each joint at any time from the polynomial coefficients obtained in S6, thereby synthesizing a complete motion sequence.
Here the position of each joint in the joint coordinates in S1 is represented by a three-dimensional vector and is normalized as shown in fig. 2; the normalized coordinate is defined by a normalization formula (given as an equation image in the original).
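As an illustration of this preprocessing, the following sketch normalizes joint coordinates by expressing each joint relative to its parent and scaling the offset to unit length. This is only an assumed reading of the normalization step (the defining formula is not reproduced in the text), and the function name, the parents array and the array shapes are hypothetical.

```python
import numpy as np

def normalize_joints(positions, parents):
    """Normalize joint coordinates frame by frame.

    positions: array of shape (num_frames, num_joints, 3) with absolute 3D joint positions.
    parents:   list of parent-joint indices, -1 for the root joint.
    Returns an array of the same shape holding, for every non-root joint, the
    unit-length offset from its parent joint (the root is left at the origin).
    """
    normalized = np.zeros_like(positions, dtype=float)
    for j, p in enumerate(parents):
        if p < 0:                       # root joint: no parent, keep zero offset
            continue
        offset = positions[:, j] - positions[:, p]            # coordinate relative to the parent joint
        length = np.linalg.norm(offset, axis=-1, keepdims=True)
        normalized[:, j] = offset / np.maximum(length, 1e-8)  # scale to unit bone length
    return normalized

# Example call (hypothetical skeleton): normalize_joints(raw_positions, parents=[-1, 0, 1, 2])
```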
The motion law of the part of the motion sequences involved in S1 is extracted in S2. The angle θ is defined as the angle of the current-frame joint position relative to its starting position; θ is normalized, with the direction from the starting-frame joint position towards the corresponding end-frame joint position taken as positive.
The correspondence between the joint-position angle θ and the three-dimensional coordinates is expressed by a formula (given as an equation image in the original) whose terms denote, respectively, the position of each joint in the starting frame and in the end frame, and the angular change of the end frame relative to the starting frame. Solving this relation by the least-squares method then yields the sequence of the angle θ with respect to time t.
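The angle curve and its polynomial fit could be computed along the following lines, a minimal sketch assuming the per-frame angle is the unsigned angle between the current and start-frame joint positions and that the least-squares fit is a 5th-order polynomial in normalized time; the sign convention towards the end frame and the exact formulas of the original are simplified away here.

```python
import numpy as np

def joint_angle_curve(joint_traj, degree=5):
    """joint_traj: (num_frames, 3) normalized positions of one joint over time.

    Returns (t, theta, coeffs): sample times in [0, 1], the per-frame angle of the
    joint relative to the start frame, and the least-squares polynomial coefficients."""
    start = joint_traj[0]

    def unit(v):
        return v / max(np.linalg.norm(v), 1e-8)

    theta = []
    for p in joint_traj:
        # unsigned angle between the current position and the start-frame position
        cos_a = np.clip(np.dot(unit(p), unit(start)), -1.0, 1.0)
        theta.append(np.arccos(cos_a))
    theta = np.asarray(theta)

    t = np.linspace(0.0, 1.0, len(joint_traj))
    coeffs = np.polyfit(t, theta, deg=degree)   # least-squares fit, highest power first
    return t, theta, coeffs
```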
In S3 the deep network is trained on the normalized motion data and the relationship between a motion sequence and its motion law is established. As shown in fig. 3, the input is a motion sequence and the output is the polynomial coefficients corresponding to its motion law: the preprocessed motion sequence is passed through a three-layer LSTM network to extract its temporal features, and the corresponding motion law is then output through a fully connected layer. The corresponding loss function (given as an equation image in the original) compares, at every sampling time point, the joint-angle value computed by the network with the actual joint-angle value, where the remaining symbols denote the number of sampling time points of the selected sequences and the number of input motion sequences.
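A PyTorch sketch of a motion-law extraction network of this kind is given below: three stacked LSTM layers over the normalized motion sequence followed by a fully connected layer that outputs per-joint polynomial coefficients, trained with a mean-squared-error loss on the reconstructed angles. The layer sizes, coefficient ordering and the exact loss form are assumptions, not the patent's own definitions.

```python
import torch
import torch.nn as nn

class MotionLawExtractor(nn.Module):
    """Maps a motion sequence to the polynomial coefficients of its motion law."""
    def __init__(self, num_joints, hidden_size=256, poly_degree=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 3, hidden_size=hidden_size,
                            num_layers=3, batch_first=True)        # three-layer LSTM
        self.fc = nn.Linear(hidden_size, num_joints * (poly_degree + 1))
        self.num_joints, self.poly_degree = num_joints, poly_degree

    def forward(self, seq):
        # seq: (batch, num_frames, num_joints * 3) normalized joint features
        features, _ = self.lstm(seq)
        coeffs = self.fc(features[:, -1])                           # last time step -> coefficients
        return coeffs.view(-1, self.num_joints, self.poly_degree + 1)

def angle_loss(pred_coeffs, true_angles, t):
    """Mean squared error between angles reconstructed from the predicted coefficients
    and the actual angles at the sampled time points (an assumed form of the loss).

    pred_coeffs: (batch, joints, degree + 1) with coefficient i multiplying t**i;
    true_angles: (batch, joints, T); t: (T,) sample times in [0, 1]."""
    powers = torch.stack([t ** i for i in range(pred_coeffs.shape[-1])], dim=0)  # (degree + 1, T)
    pred_angles = torch.einsum('bjc,ct->bjt', pred_coeffs, powers)
    return torch.mean((pred_angles - true_angles) ** 2)
```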
In S4 the motion-law extraction network of S3 is used to extract the motion laws of all motion sequences: the network input is a motion sequence and the output is the sequence of the joint angle θ with respect to time t, sampled at the selected time points for each of the input motion sequences.
In S5 the deep network is trained on the normalized motion data and the association between the first and last frames and the motion law is established:
the sequence of the angle θ with respect to time t obtained in S2 can be represented by a function f, i.e. θ = f(t); the function f is represented by a polynomial of order 5, whose coefficients are the polynomial coefficients associated with each joint point.
As shown in fig. 4, the number of synthesis modules equals the number of human joints, i.e. a single module is responsible for feature extraction of a single joint's motion law; the input is the given first and last frames and the output is the required motion law. Each synthesis module comprises three LSTM layers, a batch normalization layer and a final fully connected layer, with the LSTM network responsible for extracting the feature information of the first and last frames. The loss function of each synthesis module (given as an equation image in the original) measures, at each sampling time point of each pair of input first and last frames, the difference between the angle value extracted by the S4 motion-law extraction network and the angle obtained from the polynomial coefficients generated by the motion synthesis network.
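One synthesis module of this kind might look like the following PyTorch sketch: a three-layer LSTM reads the two-frame sequence formed by the first and last frames, followed by batch normalization and a fully connected layer that outputs that joint's polynomial coefficients. The dimensions, class names and the way the two frames are fed to the LSTM are assumptions.

```python
import torch
import torch.nn as nn

class JointSynthesisModule(nn.Module):
    """Predicts the motion-law polynomial coefficients of a single joint
    from the first and last frames of a motion sequence."""
    def __init__(self, num_joints, hidden_size=128, poly_degree=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 3, hidden_size=hidden_size,
                            num_layers=3, batch_first=True)   # feature extraction from the two frames
        self.bn = nn.BatchNorm1d(hidden_size)                  # batch normalization layer
        self.fc = nn.Linear(hidden_size, poly_degree + 1)      # final fully connected layer

    def forward(self, first_frame, last_frame):
        # first_frame, last_frame: (batch, num_joints * 3) normalized joint features
        pair = torch.stack([first_frame, last_frame], dim=1)   # (batch, 2, num_joints * 3)
        features, _ = self.lstm(pair)
        h = self.bn(features[:, -1])
        return self.fc(h)                                      # polynomial coefficients of this joint

# One module per human joint, as in the framework's synthesis network.
def build_synthesis_network(num_joints):
    return nn.ModuleList(JointSynthesisModule(num_joints) for _ in range(num_joints))
```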
In S6 the corresponding motion law, i.e. the polynomial coefficients, is generated by the trained network from the given first and last frames of the motion: the network input is the first and last frames of the motion sequence and the output is the polynomial coefficients corresponding to the motion law of each human joint, one set of coefficients per joint.
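Inference for this step can then be a single pass over the per-joint modules, as in the sketch below; synthesis_modules is assumed to be a collection of per-joint modules with the interface sketched above, or any callables taking the first and last frames and returning one joint's coefficients.

```python
import torch

@torch.no_grad()
def generate_motion_laws(synthesis_modules, first_frame, last_frame):
    """Run every per-joint synthesis module on the given first/last frames.

    first_frame, last_frame: tensors of shape (1, num_joints * 3).
    Returns a tensor of shape (num_joints, poly_degree + 1) with one row of
    polynomial coefficients per human joint.
    Assumes the modules have already been switched to eval() mode."""
    rows = [module(first_frame, last_frame).squeeze(0) for module in synthesis_modules]
    return torch.stack(rows)
```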
In S7 the human motion is synthesized according to the motion law. The angle of the j-th joint in the i-th frame is obtained by evaluating the motion-curve polynomial of the j-th joint with its polynomial coefficients (formula given as an equation image in the original). The angle is then converted into the corresponding three-dimensional coordinates following the idea of spherical interpolation, using the normalized coordinates of the first and last frames of the motion sequence and the angular change of the j-th joint from the starting frame to the end frame (formulas given as equation images in the original). Once the normalized position of each joint is obtained, the absolute position coordinates of each joint are calculated according to the structure of the human body and the bone lengths, and the real human motion is finally reconstructed.
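The reconstruction in S7 can be sketched as follows, under explicit assumptions: the joint angle at each sampled time is the value of the fitted polynomial, and the normalized position is obtained by rotating the start-frame position towards the end-frame position by that angle in their common plane, in the spirit of spherical linear interpolation. The conversion formula used by the patent is not reproduced in the text, so this is an illustrative stand-in rather than the patented formula.

```python
import numpy as np

def reconstruct_joint(coeffs, p_start, p_end, num_frames):
    """coeffs: polynomial coefficients (highest power first, as from np.polyfit);
    p_start, p_end: normalized 3D positions of one joint in the first and last frames.

    Returns (num_frames, 3) normalized positions obtained by rotating p_start toward
    p_end by the polynomial angle at each time step (slerp-style reconstruction)."""
    p_start, p_end = np.asarray(p_start, float), np.asarray(p_end, float)
    omega = np.arccos(np.clip(np.dot(p_start, p_end) /
                              (np.linalg.norm(p_start) * np.linalg.norm(p_end) + 1e-8), -1.0, 1.0))
    t = np.linspace(0.0, 1.0, num_frames)
    theta = np.polyval(coeffs, t)                  # joint angle at each sampled time
    frames = []
    for a in theta:
        if omega < 1e-6:                           # degenerate case: start and end coincide
            frames.append(p_start)
            continue
        # rotate within the plane spanned by p_start and p_end by angle a
        frames.append((np.sin(omega - a) * p_start + np.sin(a) * p_end) / np.sin(omega))
    return np.stack(frames)
```

Absolute joint positions would then be recovered, as the description states, by scaling each normalized offset by the corresponding bone length and accumulating the offsets along the skeleton hierarchy.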

Claims (10)

1. A motion synthesis framework based on a deep neural network, characterized by: preparing training data and normalizing joint coordinates; extracting the motion law of each motion sequence; training a motion-law extraction network; training a motion synthesis network to establish the relationship between the first and last frames of a motion sequence and its motion law; generating the corresponding motion law from given first and last frames; and converting the generated motion law into the position of each joint at any moment, thereby synthesizing a complete motion sequence, wherein the motion synthesis method of the motion synthesis framework comprises the following steps:
S1, preparing training data and normalizing joint coordinates: collecting a number of motion sequences of a single motion type as training data and converting them into joint coordinates, then normalizing the joint coordinates and taking the coordinates of each joint relative to its parent joint as the features of that joint;
S2, extracting the motion law of part of the motion sequences involved in S1: calculating the angle between the position of a joint at any moment and its position in the starting frame, and taking the change curve of this angle as the motion law of the motion sequence;
S3, training a deep network on the normalized motion data and establishing the relationship between a motion sequence and its motion law: taking a motion sequence and the motion law extracted in S2 as a training data pair, and training an LSTM-based deep network to construct the relationship between the motion sequence and its motion law;
S4, extracting the motion laws of all motion sequences with the motion-law extraction network trained in S3;
S5, training a deep network on the normalized motion data and establishing the relationship between the first and last frames of a motion sequence and its motion law: taking the first and last frames of a motion sequence and the motion law extracted in S4 as a training data pair, and training an LSTM-based deep network to construct the relationship between the first and last frames and the motion law;
S6, generating the corresponding motion law, i.e. polynomial coefficients, from the given first and last frames with the network trained in S5;
and S7, computing the position of each joint at any time from the polynomial coefficients obtained in S6, thereby synthesizing a complete motion sequence.
2. The deep neural network-based motion synthesis framework of claim 1, wherein: the position of each joint in the joint coordinates in S1 is represented by a three-dimensional vector and is normalized, the normalized coordinate being defined by a normalization formula (given as an equation image in the original).
3. The deep neural network-based motion synthesis framework of claim 1, wherein: in S2 the motion law of the part of the motion sequences involved in S1 is extracted; the angle θ is defined as the angle of the current-frame joint position relative to its starting position, θ is normalized, and the direction from the starting-frame joint position towards the corresponding end-frame joint position is taken as positive;
the correspondence between the joint-position angle θ and the three-dimensional coordinates is expressed by a formula (given as an equation image in the original) whose terms denote, respectively, the position of each joint in the starting frame and in the end frame, and the angular change of the end frame relative to the starting frame; solving this relation by least squares then yields the sequence of the angle θ with respect to time t.
4. The deep neural network-based motion synthesis framework of claim 1, wherein: in S3 the deep network is trained on the normalized motion data and the relationship between a motion sequence and its motion law is established; the input is a motion sequence and the output is the polynomial coefficients corresponding to its motion law; the preprocessed motion sequence first passes through a three-layer LSTM network that extracts its temporal features, and the corresponding motion law is then output through a fully connected layer; the corresponding loss function (given as an equation image in the original) compares, at every sampling time point, the joint-angle value computed by the network with the actual joint-angle value, where the remaining symbols denote the number of sampling time points of the selected sequences and the number of input motion sequences.
5. The deep neural network-based motion synthesis framework of claim 1, wherein: in S4 the motion-law extraction network of S3 is used to extract the motion laws of all motion sequences: the network input is a motion sequence and the output is the sequence of the joint angle θ with respect to time t, sampled at the selected time points for each of the input motion sequences.
6. The deep neural network-based motion synthesis framework of claim 1, wherein: in S5 the deep network is trained on the normalized motion data and the relationship between the first and last frames and the motion law is established:
the sequence of the angle θ with respect to time t obtained in S2 can be represented by a function f, i.e. θ = f(t); the function f is represented by a polynomial of order 5, whose coefficients are the polynomial coefficients associated with each joint point;
the number of synthesis modules equals the number of human joints, i.e. a single module is responsible for feature extraction of a single joint's motion law; the input is the given first and last frames and the output is the required motion law.
7. The synthesis module comprises three LSTM layers, a batch normalization layer and a final fully connected layer, with the LSTM network responsible for extracting the feature information of the first and last frames; the loss function of each synthesis module (given as an equation image in the original) measures, at each sampling time point of each pair of input first and last frames, the difference between the angle value extracted by the S4 motion-law extraction network and the angle obtained from the polynomial coefficients generated by the motion synthesis network.
8. The deep neural network-based motion synthesis framework of claim 1, wherein: in S6 the corresponding motion law, i.e. the polynomial coefficients, is generated by the trained network from the given first and last frames of the motion: the network input is the first and last frames of the motion sequence and the output is the polynomial coefficients corresponding to the motion law of each human joint, one set of coefficients per joint.
9. The deep neural network-based motion synthesis framework of claim 1, wherein: in S7, the human body motion is synthesized according to the motion law.
10. The angle of the j-th joint in the i-th frame is obtained by evaluating the motion-curve polynomial of the j-th joint with its polynomial coefficients (formula given as an equation image in the original); the angle is then converted into the corresponding three-dimensional coordinates following the idea of spherical interpolation, using the normalized coordinates of the first and last frames of the motion sequence and the angular change of the j-th joint from the starting frame to the end frame (formulas given as equation images in the original); and once the normalized position of each joint is obtained, the absolute position coordinates of each joint are calculated according to the structure of the human body and the bone lengths, and the real human motion is finally reconstructed.
CN202210735748.0A 2022-06-27 2022-06-27 Motion synthesis framework based on deep neural network Pending CN114972441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210735748.0A CN114972441A (en) 2022-06-27 2022-06-27 Motion synthesis framework based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210735748.0A CN114972441A (en) 2022-06-27 2022-06-27 Motion synthesis framework based on deep neural network

Publications (1)

Publication Number Publication Date
CN114972441A true CN114972441A (en) 2022-08-30

Family

ID=82965826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210735748.0A Pending CN114972441A (en) 2022-06-27 2022-06-27 Motion synthesis framework based on deep neural network

Country Status (1)

Country Link
CN (1) CN114972441A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110853670A (en) * 2019-11-04 2020-02-28 南京理工大学 Music-driven dance generating method
CN111310641A (en) * 2020-02-12 2020-06-19 南京信息工程大学 Motion synthesis method based on spherical nonlinear interpolation
WO2021234151A1 (en) * 2020-05-22 2021-11-25 Motorica Ab Speech-driven gesture synthesis
CN111681321A (en) * 2020-06-05 2020-09-18 大连大学 Method for synthesizing three-dimensional human motion by using recurrent neural network based on layered learning
CN114170353A (en) * 2021-10-21 2022-03-11 北京航空航天大学 Multi-condition control dance generation method and system based on neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUIYU XIA et al.: "A Deep Learning Framework for Start–End Frame Pair-Driven Motion Synthesis" *
WENLIN ZHUANG et al.: "Towards 3D Dance Motion Synthesis and Control" *
YI ZHOU et al.: "AUTO-CONDITIONED RECURRENT NETWORKS FOR EXTENDED COMPLEX HUMAN MOTION SYNTHESIS" *
庄文林: "Human Motion Modeling and Synthesis" (人体运动建模与合成) *
彭淑娟 et al.: "A Survey of Deep Learning Models for Human Motion Generation" (人体运动生成中的深度学习模型综述) *

Similar Documents

Publication Publication Date Title
KR102081854B1 (en) Method and apparatus for sign language or gesture recognition using 3D EDM
WO2021169839A1 (en) Action restoration method and device based on skeleton key points
Beymer et al. Example based image analysis and synthesis
CN107239728A (en) Unmanned plane interactive device and method based on deep learning Attitude estimation
CN110096156B (en) Virtual reloading method based on 2D image
CN111553968B (en) Method for reconstructing animation of three-dimensional human body
CN113706699B (en) Data processing method and device, electronic equipment and computer readable storage medium
US20160004905A1 (en) Method and system for facial expression transfer
CN110310351B (en) Sketch-based three-dimensional human skeleton animation automatic generation method
CN112837215B (en) Image shape transformation method based on generation countermeasure network
CN111062326A (en) Self-supervision human body 3D posture estimation network training method based on geometric drive
CN109116981A (en) A kind of mixed reality interactive system of passive touch feedback
Zhu et al. Human motion generation: A survey
CN108908353B (en) Robot expression simulation method and device based on smooth constraint reverse mechanical model
CN111310641A (en) Motion synthesis method based on spherical nonlinear interpolation
Liu et al. Real-time robotic mirrored behavior of facial expressions and head motions based on lightweight networks
CN112634413B (en) Method, apparatus, device and storage medium for generating model and generating 3D animation
CN113989928A (en) Motion capturing and redirecting method
CN111539288B (en) Real-time detection method for gestures of both hands
Tang et al. Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar
Agarwal et al. Imitating human movement with teleoperated robotic head
CN113079136A (en) Motion capture method, motion capture device, electronic equipment and computer-readable storage medium
CN114972441A (en) Motion synthesis framework based on deep neural network
US11734889B2 (en) Method of gaze estimation with 3D face reconstructing
JPH10255070A (en) Three-dimensional image generating device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220830