CN112700521A - Music-driven human skeleton dance motion generation system - Google Patents


Info

Publication number
CN112700521A
CN112700521A (application CN202110101178.5A / CN202110101178A)
Authority
CN
China
Prior art keywords
dance
music
module
action
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110101178.5A
Other languages
Chinese (zh)
Inventor
刘科成 (Liu Kecheng)
肖双九 (Xiao Shuangjiu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2021-01-26
Filing date: 2021-01-26
Publication date: 2021-04-23
Application filed by Shanghai Jiaotong University
Priority to CN202110101178.5A
Publication of CN112700521A
Legal status: Pending (Current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation
    • G06T 13/20 - 3D [Three Dimensional] animation
    • G06T 13/205 - 3D [Three Dimensional] animation driven by audio data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation
    • G06T 13/20 - 3D [Three Dimensional] animation
    • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/03 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the type of extracted parameters

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A music-driven human skeleton dance action generation system, comprising: a music feature extraction system, a GAN-based dance action generation system and a dance action evaluation system, connected in sequence, wherein: the dance action evaluation system receives dance actions from the dance action generation system and evaluates their quality in terms of dance authenticity, diversity and complexity. By introducing prior knowledge of human dance, taking an ordinary music file as input, and relying on the ability of the GAN-based dance action generation model to create new data, the invention can generate coherent human skeleton dance actions that match the characteristics of the input music.

Description

Music-driven human skeleton dance motion generation system
Technical Field
The invention relates to technology in the field of computer automatic dance, and in particular to a music-driven human skeleton dance motion generation system.
Background
Computer automatic dance technology automatically generates dance motion sequences, driven by music, through a specific computational model. The key to automatic dance is determining the mapping between music and dance: different choreographers create different dance movements for the same background music, and spatially modeling human dance movements while avoiding unnatural motions is very difficult. Even when a generated dance motion sequence deviates only slightly from normal human posture, the result can appear unnatural. Research in this field also suffers from a lack of high-quality datasets: most available data is collected for motion recognition tasks and either does not include dance motions or lacks the music matched to them.
Earlier research abstracted automatic dance into a similarity-based retrieval problem: according to features of the input music such as rhythm and melody, the segments that best match the music are selected from a pre-constructed, fixed set of dance actions and combined into a new dance motion sequence. Dance movements generated this way are limited to the pre-constructed set, so genuinely new movements outside the set cannot be produced, and the joins between basic dance movement units suffer from motion incoherence.
Disclosure of Invention
Aiming at the above defects in the prior art, the invention provides a music-driven human skeleton dance motion generation system which, by introducing prior knowledge of human dance, taking an ordinary music file as input, and relying on the ability of a GAN-based dance motion generation model to create new data, can generate coherent, smooth human skeleton dance motions that match the characteristics of the input music.
The invention is realized by the following technical scheme:
the invention comprises the following steps: music characteristic extraction system, dance action generation system and dance action evaluation system based on GAN that link to each other in proper order, wherein: the dance action evaluation index receives dance actions of the dance action generation system and outputs evaluation results from three aspects of dance authenticity, diversity and complexity respectively.
The music feature extraction system comprises: a short-time Fourier transform module, a music chroma feature extraction module and an onset strength detection module, connected in sequence, wherein: the short-time Fourier transform module receives the input music and converts it from the time domain to the time-frequency domain, the chroma feature extraction module extracts chroma features reflecting the melodic characteristics of the music, and the onset strength detection module detects the onset strength reflecting the rhythmic characteristics of the music.
The GAN-based dance action generation system comprises: a dance motion generator module and a dance motion discriminator module, wherein: the dance motion generator module, an encoder-decoder network, generates dance motions, and the dance motion discriminator module judges whether dance motion data is fake data created by the generator module.
The music encoder in the dance motion generator module extracts a music feature vector for each time step from the music data, specifically: the music data passes through several one-dimensional convolution layers, then through a bidirectional GRU layer and a fully connected layer; the convolution layers reduce the data dimensionality, the GRU layer accounts for the time dimension, and the fully connected layer finally outputs a feature map of size T × 256. The music encoder further introduces random noise into this feature map: randomly generated noise passes through a GRU layer, and the GRU output is combined with the music encoding result and output.
The dance motion decoder in the dance motion generator module is a multilayer perceptron (MLP) composed of several fully connected layers; the MLP extends a single-layer neural network with one or more hidden layers arranged between the input layer and the output layer.
The dance motion discriminator module comprises: a global dance motion discriminator and a local dance motion discriminator, wherein: the global dance motion discriminator measures whether the dance motions in the music data and the dance motion data output by the generator module match the music characteristics as a whole, judging the realism of the dance motions; the local dance motion discriminator divides the dance motion data into several subsequences and judges the realism of the dance motions by checking whether they are locally continuous.
The dance action evaluation system comprises: a Fréchet Inception Distance (FID) module, a mean variance module, and a mean instantaneous velocity module, wherein: the FID module scores the human skeleton dance actions output by the dance action generation system on a realism index, the mean variance module scores them on a diversity index, and the mean instantaneous velocity module scores them on a complexity index.
Technical effects
The invention as a whole solves the problems of the prior art that the generated dance motions do not match the music well and that completely new motions cannot be generated.
Compared with the prior art, the invention can generate multiple types of dance motions, including female dance, ballet and mechanical dance, for different types of input music. If the dance types in the dataset are further expanded, a trained model can acquire the ability to generate more dance types.
Drawings
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a schematic diagram of a music feature extraction algorithm implementation flow of the present invention;
FIG. 3 is a schematic diagram of a dance motion generator model according to the present invention;
FIG. 4 is a schematic diagram of an implementation flow of the global dance motion discriminator model according to the present invention;
FIG. 5 is a schematic diagram of an implementation process of the local dance motion discriminator model according to the present invention.
Detailed Description
As shown in fig. 1, the music-driven human skeleton dance motion generation system according to this embodiment comprises: a music feature extraction system, a GAN-based dance action generation system and a dance action evaluation system, connected in sequence, wherein: the dance action evaluation system receives dance actions from the dance action generation system and evaluates their quality in terms of dance authenticity, diversity and complexity.
The music feature extraction system comprises: a short-time Fourier transform module, a chroma feature extraction module and an onset strength detection module, connected in sequence, wherein: the short-time Fourier transform module receives the input music and converts it from the time domain to the time-frequency domain, after which the time-frequency information is processed along two branches: in one branch, the frequency information of each frame is mapped into a single octave based on a pitch salience algorithm, and the chroma feature extraction module extracts chroma features reflecting the melodic characteristics of the music; in the other branch, the onset strength detection module detects, based on the SuperFlux algorithm, the onset strength reflecting the rhythmic characteristics of the music.
The time-frequency information is a representation reflecting how the frequency content of the music changes over time.
The short-time Fourier transform module implements a variant of the basic Fourier transform, operating as follows: a long non-stationary signal is treated as a combination of a series of shorter stationary signals; the long signal is divided in time by windowing, and a discrete Fourier transform is then applied to each short segment.
The chroma features are music features defined according to twelve-tone equal temperament, forming a T × 12 matrix, where: T denotes the number of music frames and 12 denotes the 12 semitone pitch classes within one octave.
The onset strength is a rhythm feature that is continuous in time.
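As a concrete illustration of this feature extraction pipeline, the following is a minimal sketch using the librosa audio library (the patent does not name a library, so the library choice, file name and parameters are assumptions; librosa's default spectral-flux onset method stands in here for the SuperFlux variant):

```python
import numpy as np
import librosa

# Load an ordinary music file (path is a placeholder).
y, sr = librosa.load("input_music.wav", sr=22050)

# Short-time Fourier transform: window the long signal in time and
# apply a DFT to each short segment, giving a time-frequency map.
spec = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))

# Chroma: fold each frame's frequency content into one octave,
# yielding 12 semitone pitch classes per frame (librosa returns
# 12 x T; transpose for the T x 12 layout described above).
chroma = librosa.feature.chroma_stft(S=spec**2, sr=sr, hop_length=512)

# Onset strength: a time-continuous rhythm feature.
onset_env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=512)

print(chroma.T.shape, onset_env.shape)  # (T, 12) and (T,)
```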
The GAN-based dance action generation system comprises: a dance motion generator module and a dance motion discriminator module, connected to each other, wherein: the dance motion generator module generates dance motions, and the dance motion discriminator module judges whether dance motion data is fake data created by the generator module.
The dance action generator module adopts an encoder-decoder network structure and comprises: a music encoder and a dance motion decoder.
The music encoder extracts a music feature vector for each time step from the music data, specifically: the music data passes through several one-dimensional convolution layers, then through a bidirectional GRU layer and a fully connected layer; the convolution layers reduce the data dimensionality, the GRU layer accounts for the time dimension, and the fully connected layer finally outputs a feature map of size T × 256.
In a conventional GAN model the generator's input is only random noise, whereas here the input is music data. Although inputting music data alone largely achieves the intended research goal, the diversity of the generated dance movements remains limited. Dance is in part a stochastic art: even the same dancer, dancing to the same music, never dances exactly the same way twice. Random noise is therefore introduced into the music feature vector computed by the music encoder so that the model can generate more diverse dance movements. To add this noise perturbation to the sequence data, randomly generated noise is passed through a GRU layer, and the GRU output is then combined with the music encoding result.
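A minimal PyTorch sketch of such an encoder follows. The T × 256 output size comes from the text and the 10-dimensional noise from the experiments below; the layer counts, kernel sizes, hidden widths and the 13-dimensional per-frame input (12 chroma bins plus onset strength) are assumptions:

```python
import torch
import torch.nn as nn

class MusicEncoder(nn.Module):
    """Conv1d stack -> bidirectional GRU -> fully connected layer,
    producing a T x 256 music feature map, combined with a
    GRU-processed random noise stream."""

    def __init__(self, feat_dim=13, noise_dim=10, hidden=128):
        super().__init__()
        # 1D convolutions over time reduce the feature dimensionality.
        self.convs = nn.Sequential(
            nn.Conv1d(feat_dim, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        # The bidirectional GRU accounts for the time dimension.
        self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, 256)            # T x 256 feature map
        # Separate GRU for the randomly generated noise.
        self.noise_gru = nn.GRU(noise_dim, 256, batch_first=True)

    def forward(self, music, noise):
        # music: (B, T, feat_dim); noise: (B, T, noise_dim)
        x = self.convs(music.transpose(1, 2)).transpose(1, 2)
        x, _ = self.gru(x)
        x = self.fc(x)                                   # (B, T, 256)
        z, _ = self.noise_gru(noise)                     # (B, T, 256)
        return torch.cat([x, z], dim=-1)                 # fuse encoding and noise
```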
The dance motion decoder is a multilayer perceptron (MLP) composed of several fully connected layers; the MLP extends a single-layer neural network with one or more hidden layers arranged between the input layer and the output layer. The MLP used in the invention improves on this basic structure: the output of each hidden layer is added to that layer's input, and the sum is fed to the next hidden layer; batch normalization (BN) layers and ReLU activation functions are inserted between the hidden layers, which effectively avoids the degradation problem of deep networks.
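A sketch of this residual MLP in PyTorch (the hidden width, block count and 23-joint output layout are assumptions; the input size matches the encoder sketch above):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Hidden layer whose output is added to its own input, followed
    by BN and ReLU, as described for the decoder MLP."""

    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.bn = nn.BatchNorm1d(dim)

    def forward(self, x):
        return torch.relu(self.bn(self.fc(x) + x))  # residual connection

class DanceDecoder(nn.Module):
    def __init__(self, in_dim=512, hidden=512, n_joints=23, n_blocks=3):
        super().__init__()
        self.inp = nn.Linear(in_dim, hidden)
        self.blocks = nn.Sequential(*[ResidualBlock(hidden)
                                      for _ in range(n_blocks)])
        self.out = nn.Linear(hidden, n_joints * 3)   # 3D joint coordinates

    def forward(self, x):
        # x: (B*T, in_dim), one fused music+noise vector per frame.
        return self.out(self.blocks(self.inp(x)))
```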
As shown in fig. 3, the dance motion discriminator module comprises: a global dance motion discriminator and a local dance motion discriminator, wherein: the global dance motion discriminator receives both music and dance motion data, and judges whether the input dance motions match the music characteristics as a whole, assessing the realism of the dance motions; the local dance motion discriminator takes only dance motion data as input, divides it into several subsequences, and judges the realism of the dance motions by checking whether they are locally continuous.
As shown in fig. 4, the global dance motion discriminator processes the input data with a music encoder and a dance motion encoder, combines the outputs of the two encoders, and feeds the result into a subsequent binary classification network; the structure of its music encoder is identical to that of the music encoder in the dance motion generator. The dance motion data and the motion frame-difference data each pass through a series of two-dimensional convolution layers and are then combined; the combined result is merged with the music encoding through two convolution layers and two fully connected layers. Finally, the data is fed into a binary classification network comprising a one-dimensional convolution layer and a fully connected layer to obtain a yes/no output, where "yes" means the discriminator judges the dance motion to be a real motion matching the music, and "no" means it judges the motion to be a fake, machine-generated motion that does not match the music.
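A condensed PyTorch sketch of this discriminator, under stated assumptions: the channel sizes are guesses, and the merge and classification stages are simplified into pooled convolutions plus fully connected layers rather than the exact layer sequence of fig. 4:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalDiscriminator(nn.Module):
    """Sketch: motion and frame-difference streams through 2D convs,
    merged, fused with the music encoding, then a real/fake head."""

    def __init__(self, music_dim=256, coord_dim=3):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(coord_dim, 16, 3, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(16, 32, 3, padding=1), nn.LeakyReLU(0.2),
            )
        self.motion_stream = stream()
        self.diff_stream = stream()
        self.merge = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(0.2),
        )
        self.classify = nn.Sequential(
            nn.Linear(64 + music_dim, 64), nn.LeakyReLU(0.2),
            nn.Linear(64, 1), nn.Sigmoid(),   # "yes"/"no" output
        )

    def forward(self, motion, music_code):
        # motion: (B, coord_dim, T, J); music_code: (B, music_dim),
        # e.g. the time-pooled output of the generator's music encoder.
        diff = motion[:, :, 1:] - motion[:, :, :-1]   # motion frame differences
        diff = F.pad(diff, (0, 0, 1, 0))              # restore T frames
        x = torch.cat([self.motion_stream(motion),
                       self.diff_stream(diff)], dim=1)
        x = self.merge(x).mean(dim=(2, 3))            # pool over time and joints
        return self.classify(torch.cat([x, music_code], dim=1))
```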
As shown in FIG. 5, the local dance motion discriminator is essentially identical to the global dance motion discriminator, except that the music processing part is omitted and an unfolding (unfold) operation is added on the input dance motions. The unfold operation extracts sliding local blocks from a batch of input samples, with input parameters similar to those of a two-dimensional convolution. It divides the complete dance motion sequence into several partially overlapping dance motion subsequences; a network similar to the global discriminator then processes the subsequence data to obtain a judgment of whether the input dance motion is real.
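A small illustration of this unfold operation on a joint-coordinate sequence, using torch.Tensor.unfold (the sequence length, window size, step and skeleton layout are assumptions):

```python
import torch

# A dance sequence of T = 100 frames, each frame holding 69 values
# (an assumed layout of 23 joints x 3 coordinates).
seq = torch.randn(100, 69)

# Slide a 20-frame window with a step of 10 along the time axis,
# yielding partially overlapping subsequences for the local discriminator.
windows = seq.unfold(dimension=0, size=20, step=10)  # (9, 69, 20)
subseqs = windows.permute(0, 2, 1)                   # (9, 20, 69)
print(subseqs.shape)
```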
The dance action evaluation system comprises: an FID module, a mean variance module, and a mean instantaneous velocity module, wherein: the FID module measures the quality of the human skeleton dance actions on a realism index, the mean variance module measures it on a diversity index, and the mean instantaneous velocity module measures it on a complexity index.
The dance authenticity index is a key quantitative index measuring GAN model performance from the perspective of generated-sample quality, specifically: $\mathrm{FID} = \|\mu_r - \mu_g\|^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$, where: $\mu$ is an empirical mean, $\Sigma$ an empirical covariance, $\mathrm{Tr}$ the trace of a matrix, $r$ the real data set, and $g$ the generated data set. A smaller FID value indicates that the generated data is closer to the real data, since the means and covariances of the two are then very close.
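A direct numpy/scipy implementation of this formula, assuming feature vectors have already been extracted from the real and generated motions (the patent does not specify the feature extractor):

```python
import numpy as np
from scipy import linalg

def fid(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    """FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^{1/2}).

    real_feats, gen_feats: (n_samples, feat_dim) feature matrices.
    """
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(gen_feats, rowvar=False)

    # Matrix square root of the covariance product; discard the tiny
    # imaginary components that arise from numerical error.
    covmean = linalg.sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(sigma_r + sigma_g - 2 * covmean))
```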
The diversity index is obtained by averaging the variance of each joint point's spatial position; it measures the dispersion of the data distribution, with a larger variance indicating greater dispersion. The premise is that the larger the mean variance, the more diverse the generated dance movements.
The complexity index is the average of the instantaneous velocity of each joint point at each moment, taken over both the time and space dimensions; faster dance movements tend to be more complex and to rate higher in viewing quality.
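Under these definitions, both indices reduce to simple statistics over a joint-position tensor; a sketch (the data layout and frame rate are assumptions):

```python
import numpy as np

def mean_variance(motion: np.ndarray) -> float:
    """Diversity: average, over joints and axes, of the variance of
    each joint's spatial position over time. motion: (T, J, 3)."""
    return float(motion.var(axis=0).mean())

def mean_instantaneous_velocity(motion: np.ndarray, fps: float = 25.0) -> float:
    """Complexity: average speed of every joint at every moment,
    approximating velocity by frame differences times the frame rate."""
    vel = np.diff(motion, axis=0) * fps      # (T-1, J, 3)
    speed = np.linalg.norm(vel, axis=-1)     # (T-1, J)
    return float(speed.mean())
```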
In concrete experiments, experimental data were obtained using the PyTorch machine learning library on an i7-4770K CPU with an NVIDIA GTX 970 discrete graphics card, as follows:
TABLE 1. FID comparison (data reproduced as an image in the original filing)
TABLE 2. Mean variance comparison (data reproduced as an image in the original filing)
TABLE 3. Mean instantaneous velocity comparison (data reproduced as an image in the original filing)
As the data in the tables show, the model performs best, and comes closest to real dance movement, when 10-dimensional noise is introduced. Because ballet contains many turning movements with large motion amplitude, making it the most complex of the three types, the FID of the generated ballet is much higher. Diversity is better for female dance and ballet, while the diversity of the generated mechanical dance is inferior to the reference method; in practice the mean variance of mechanical dance should not be too large, and the mechanical dance generated by the reference method looks more like female dance, which inflates its mean variance. For female dance and ballet, the results of this method are slightly faster than those of the reference method; for mechanical dance the result is slightly slower, because the model in the reference method cannot distinguish K-POP from electronic music, and given electronic music as input it outputs a dance that looks more like female dance.
Compared with the prior art, the invention improves the realism, diversity and complexity of automatically generated dance motions.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims; all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (7)

1. A music-driven human skeletal dance motion generation system, comprising: a music feature extraction system, a GAN-based dance action generation system and a dance action evaluation system, connected in sequence, wherein: the dance action evaluation system receives dance actions from the dance action generation system and outputs evaluation results in terms of dance authenticity, diversity and complexity;
the music feature extraction system comprises: a short-time Fourier transform module, a music chroma feature extraction module and an onset strength detection module, connected in sequence, wherein: the short-time Fourier transform module receives the input music and converts it from the time domain to the time-frequency domain, the chroma feature extraction module extracts chroma features reflecting the melodic characteristics of the music, and the onset strength detection module detects the onset strength reflecting the rhythmic characteristics of the music;
the chroma features are music features defined according to twelve-tone equal temperament, forming a T × 12 matrix, where: T denotes the number of music frames and 12 denotes the 12 semitone pitch classes within one octave;
the onsetsentngth is a rhythm characteristic continuous in time.
2. The music-driven human skeletal dance motion generation system according to claim 1, wherein said GAN-based dance action generation system comprises: a dance motion generator module and a dance motion discriminator module, wherein: the dance motion generator module, an encoder-decoder network, generates dance motions, and the dance motion discriminator module judges whether dance motion data is fake data created by the generator module.
3. The music-driven human skeletal dance motion generation system according to claim 1, wherein the music encoder in said dance motion generator module extracts a music feature vector for each time step from the music data, specifically: the music data passes through several one-dimensional convolution layers, then through a bidirectional GRU layer and a fully connected layer; the convolution layers reduce the data dimensionality, the GRU layer accounts for the time dimension, and the fully connected layer finally outputs a feature map of size T × 256; the music encoder further introduces random noise into this feature map: randomly generated noise passes through a GRU layer, and the GRU output is combined with the music encoding result and output.
4. The music-driven human skeletal dance motion generation system according to claim 1, wherein the dance motion decoder of said dance motion generator module is a multilayer perceptron composed of several fully connected layers; the multilayer perceptron extends a single-layer neural network with one or more hidden layers arranged between the input layer and the output layer.
5. The music-driven human skeletal dance motion generation system according to claim 1, wherein said dance motion discriminator module comprises: a global dance motion discriminator and a local dance motion discriminator, wherein: the global dance motion discriminator measures whether the dance motions in the music data and the dance motion data output by the dance motion generator module match the music characteristics as a whole, judging the realism of the dance motions; the local dance motion discriminator divides the dance motion data into several subsequences and judges the realism of the dance motions by checking whether they are locally continuous.
6. The music-driven human skeletal dance motion generation system according to claim 1, wherein said dance action evaluation system comprises: a Fréchet Inception Distance (FID) module, a mean variance module, and a mean instantaneous velocity module, wherein: the FID module scores the human skeleton dance actions in the dance action data output by the dance action generation system on a dance authenticity index, the mean variance module scores them on a diversity index, and the mean instantaneous velocity module scores them on a complexity index.
7. The music-driven human skeletal dance motion generation system according to claim 6, wherein said dance authenticity index is a key quantitative index measuring GAN model performance from the perspective of generated-sample quality, specifically: $\mathrm{FID} = \|\mu_r - \mu_g\|^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$, where: $\mu$ is an empirical mean, $\Sigma$ an empirical covariance, $\mathrm{Tr}$ the trace of a matrix, $r$ the real data set, and $g$ the generated data set;
the diversity index is obtained by averaging the variance of each joint point's spatial position;
the complexity index is the average of the instantaneous velocity of each joint point at each moment, over both the time and space dimensions.
CN202110101178.5A 2021-01-26 2021-01-26 Music-driven human skeleton dance motion generation system Pending CN112700521A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110101178.5A CN112700521A (en) 2021-01-26 2021-01-26 Music-driven human skeleton dance motion generation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110101178.5A CN112700521A (en) 2021-01-26 2021-01-26 Music-driven human skeleton dance motion generation system

Publications (1)

Publication Number Publication Date
CN112700521A true CN112700521A (en) 2021-04-23

Family

ID=75516073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110101178.5A Pending CN112700521A (en) 2021-01-26 2021-01-26 Music-driven human skeleton dance motion generation system

Country Status (1)

Country Link
CN (1) CN112700521A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200042143A (en) * 2018-10-15 2020-04-23 주식회사 더브이엑스 Dancing room service system and method thereof
CN110955786A (en) * 2019-11-29 2020-04-03 网易(杭州)网络有限公司 Dance action data generation method and device
CN110992449A (en) * 2019-11-29 2020-04-10 网易(杭州)网络有限公司 Dance action synthesis method, device, equipment and storage medium
CN111986295A (en) * 2020-08-14 2020-11-24 腾讯科技(深圳)有限公司 Dance synthesis method and device and electronic equipment
CN111968202A (en) * 2020-08-21 2020-11-20 北京中科深智科技有限公司 Real-time dance action generation method and system based on music rhythm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HSIN-YING LEE ET AL.: "Dancing to Music", 33rd Conference on Neural Information Processing Systems *
XUANCHI REN ET AL.: "Self-supervised Dance Video Synthesis Conditioned on Music", Proceedings of the 28th ACM International Conference on Multimedia *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113521711A (en) * 2021-07-13 2021-10-22 济南幼儿师范高等专科学校 Dance training auxiliary system and method
CN115035221A (en) * 2022-06-17 2022-09-09 广州虎牙科技有限公司 Dance animation synthesis method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
Yang et al. A regression approach to music emotion recognition
CN112700521A (en) Music-driven human skeleton dance motion generation system
Won et al. Toward interpretable music tagging with self-attention
Ioannidis et al. Gait recognition using compact feature extraction transforms and depth information
Desmet et al. Assessing a clarinet player's performer gestures in relation to locally intended musical targets
Shin et al. Skeleton-based dynamic hand gesture recognition using a part-based GRU-RNN for gesture-based interface
Kao et al. Temporally guided music-to-body-movement generation
Zhu et al. Quantized gan for complex music generation from dance videos
CN110175551A (en) A kind of sign Language Recognition Method
Oka et al. Marker-less piano fingering recognition using sequential depth images
Meng et al. Improving speech related facial action unit recognition by audiovisual information fusion
Essid et al. Fusion of multimodal information in music content analysis
Lim et al. Emotion Recognition by Facial Expression and Voice: Review and Analysis
Jiang et al. Forgery-free signature verification with stroke-aware cycle-consistent generative adversarial network
Sahoo et al. MIC_FuzzyNET: Fuzzy integral based ensemble for automatic classification of musical instruments from audio signals
Itohara et al. Particle-filter based audio-visual beat-tracking for music robot ensemble with human guitarist
Tavakoli et al. Study of Gabor and local binary patterns for retinal image analysis
Aleksandrova et al. Face recognition systems based on Neural Compute Stick 2, CPU, GPU comparison
Yu et al. A neural harmonic-aware network with gated attentive fusion for singing melody extraction
Hassan et al. An effective combination of textures and wavelet features for facial expression recognition
Shiu et al. Robust on-line beat tracking with kalman filtering and probabilistic data association (kf-pda)
Tralie Geometric multimedia time series
Kannapiran et al. Voice-based gender recognition model using FRT and light GBM
Yuan et al. Research on the Evaluation Model of Dance Movement Recognition and Automatic Generation Based on Long Short-Term Memory
Mancusi Harmonizing deep learning: a journey through the innovations in signal processing, source separation and music generation

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2021-04-23)