CN113851146A - Performance evaluation method and device based on feature decomposition - Google Patents

Performance evaluation method and device based on feature decomposition

Info

Publication number
CN113851146A
Authority
CN
China
Prior art keywords
performance
information
audio information
music score
reconstructed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111131753.2A
Other languages
Chinese (zh)
Inventor
张剑
蒋慧军
徐伟
陈又新
韩宝强
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202111131753.2A
Publication of CN113851146A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Auxiliary Devices For Music (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

The embodiment of the invention discloses a performance evaluation method and device based on feature decomposition. The method comprises the following steps: acquiring performance audio information of a user, and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information; determining an accuracy level according to the degree of matching between the reconstructed music score information and a target performance music score; determining a skill level according to an information vector distance value between the performance audio information and preset master performance audio information; and comprehensively evaluating the performance audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result. The scheme of the embodiment of the invention can evaluate both the accuracy and the performance skill of the input audio signal, show intuitively the difference between the performance audio information and the master performance audio information, and improve the rationality and accuracy of performance evaluation.

Description

Performance evaluation method and device based on feature decomposition
Technical Field
The invention relates to an artificial intelligence technology, in particular to a performance evaluation method and a performance evaluation device based on feature decomposition.
Background
With rising living standards, people's demand for improving their musical literacy is increasing day by day, and learning to play a musical instrument is an important way to do so. However, learning an instrument requires one-on-one instruction from professional teachers and a great deal of practice, so educational resources are strained, learning costs are high, and many people cannot obtain enough professional instruction time. In recent years, a large number of training products have appeared to reduce people's dependence on expensive educational resources. Their aim is to improve function and performance through artificial intelligence technology so that, in guiding instrument performance, the products approach the professional level of human teachers. One of the core technologies in these products is the instrument performance evaluation method. Existing methods generally extract the musical features of the user's performance (such as pitch, onset, and offset) using professional equipment or audio signal processing to form a user performance spectrogram, and then evaluate the user's performance by comparing it with a standard spectrogram. The existing evaluation methods therefore differ greatly from the evaluation given by professional teachers, and users of these products cannot improve their overall performance level as they would under the guidance of a professional.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides a performance evaluation method and device based on feature decomposition, which can evaluate the performance skill of an input audio signal and improve the reasonability and accuracy of performance evaluation.
In a first aspect, an embodiment of the present invention provides a performance evaluation method based on feature decomposition, where the method includes:
acquiring performance audio information of a user, and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information;
determining an accuracy level according to the matching degree between the reconstructed music score information and the target performance music score;
determining a skill level according to an information vector distance value between the performance audio information and preset master performance audio information;
and comprehensively evaluating the playing audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result.
In a second aspect, an embodiment of the present invention provides a performance evaluation apparatus based on feature decomposition, including:
the acquisition module is used for acquiring the performance audio information of a user and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information;
the matching module is used for determining the accuracy level according to the matching degree between the reconstructed music score information and the target performance music score;
the computing module is used for determining a skill level according to an information vector distance value between the performance audio information and preset master performance audio information;
and the evaluation module is used for comprehensively evaluating the playing audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result.
In a third aspect, an embodiment of the present invention provides an electronic device including a processor and a computer program, wherein the performance evaluation method based on feature decomposition provided by the embodiment of the invention is implemented when the processor executes the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the performance evaluation method based on feature decomposition according to the embodiment of the present invention is implemented.
According to the embodiment of the invention, performance audio information of a user is acquired and input into a preset performance evaluation model to obtain reconstructed music score information. An accuracy level is obtained from the reconstructed music score information and the target performance music score, and a skill level is obtained from the performance audio information and the master performance audio information. In this way the accuracy and the performance skill of the performance audio information are both evaluated, the difference between the performance audio information and the master performance audio information is shown intuitively, and the user's overall performance level can be improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a schematic flow chart of a performance evaluation method based on feature decomposition according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a performance evaluation method based on feature decomposition according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of an implementation of step S200 in FIG. 2;
FIG. 4 is a schematic diagram of an implementation of step S300 in FIG. 1;
FIG. 5 is a schematic diagram of an implementation of step S400 in FIG. 1;
FIG. 6 is a schematic diagram of an implementation of step S500 in FIG. 1;
FIG. 7 is a schematic diagram of a performance evaluation method based on feature decomposition according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a performance evaluation device based on feature decomposition according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the structure of the training module of FIG. 8;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It should be understood that, in the description of the embodiments of the present invention, terms such as "first" and "second" are used only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their order. "At least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and covers three relationships; for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone, where A and B can each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, and c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be single or multiple.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The embodiment of the application can acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best result.
Artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
With the increasing living standard, the demand of people for improving the music literacy is increasing day by day, and the application of the artificial intelligence technology to the learning of musical instrument playing becomes an important means. The directions of voice processing technology, deep learning and the like in the artificial intelligence software technology provide a means for processing and learning the performance audio information. The performance evaluation function is improved through the artificial intelligence technology, and the performance audio information of the user is evaluated quickly and scientifically.
The performance evaluation method provided by the embodiment of the invention replaces professional music teachers with computer equipment to provide evaluation for performance trainees, and provides real-time reminding and guidance for the performance trainees, so that the performance evaluation method is widely applied to the fields of musical instrument learning, singing exercise and the like.
In the related art, audio evaluation methods address only the accuracy of the player's rendition: an accuracy level is obtained by comparing the audio data with an existing music score. This approach has a clear disadvantage: because it only compares audio data with the score, it cannot comprehensively evaluate the player's sense of rhythm, expressiveness, musicality, style, and so on, and so it cannot help the user quickly improve their overall performance level.
Based on the above, the embodiment of the present invention provides a performance evaluation method and device based on feature decomposition (eigendecomposition). Performance audio information of a user is acquired and input into a performance evaluation model to obtain reconstructed music score information and reconstructed audio information. An accuracy level is obtained from the reconstructed music score information and the target performance music score, and a skill level is obtained from the performance audio information and the master performance audio information. In this way the accuracy and the performance skill of the performance audio information are both evaluated, the difference between the performance audio information and the master performance audio information is shown intuitively, and the user's overall performance level can be improved.
It should be understood that eigendecomposition is a method of factoring a matrix into a product of matrices built from its eigenvalues and eigenvectors. In practical applications, autocorrelated feature components are usually decomposed from the acquired data to obtain feature vectors of that data, and feature recognition on these vectors extracts the information that deserves attention. In addition, by calculating the distance between the feature vector of the target data and the feature vector of the user data, the user data can be evaluated in each of its aspects.
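As a small illustration of the factorization mentioned above, the following sketch decomposes a symmetric matrix with NumPy and rebuilds it from its eigenvalues and eigenvectors. The matrix `A` and the helper names are illustrative assumptions and are not taken from the patent.

```python
import numpy as np

def eigendecompose(a):
    """Return (eigenvalues, eigenvectors) of a symmetric matrix.

    np.linalg.eigh returns eigenvalues in ascending order and the
    matching eigenvectors as the columns of the second result.
    """
    a = np.asarray(a, dtype=float)
    return np.linalg.eigh(a)

def reconstruct(eigenvalues, eigenvectors):
    """Rebuild the matrix as V * diag(w) * V^T (valid for symmetric input)."""
    return eigenvectors @ np.diag(eigenvalues) @ eigenvectors.T

# Illustrative symmetric matrix (an assumption, not data from the patent).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, V = eigendecompose(A)
A_rec = reconstruct(w, V)
```

Here `A_rec` equals `A` up to floating-point error, which is what "decomposing a matrix into a product" means in practice.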
FIG. 1 and FIG. 2 show the flow of a performance evaluation method based on feature decomposition according to an embodiment of the present invention. As shown in FIG. 2, the performance evaluation method based on feature decomposition according to the embodiment of the present invention includes the following steps:
S100, collecting target performance music scores and master performance audio information to form a training set in which each target performance music score corresponds one-to-one to master performance audio information.
It should be understood that, in order to establish the standard for performance evaluation, the target performance music scores and the corresponding master performance audio information need to be collected so that the user's performance audio information can be evaluated against them. Collecting a wide range of target performance music scores and master performance audio information also expands the application range of the training set; the target performance music scores include, but are not limited to, music scores for guitar, piano, violin, drums, vertical bamboo flute, and other instruments. The master performance audio information is the master's performance audio corresponding to each target performance music score and is stored in a unified format to form the training set for performance evaluation.
S200, inputting the training set into the performance evaluation model for training, and updating the performance evaluation model.
It should be understood that the performance evaluation model is a self-supervised model based on an autoencoder (AE). During training, the parameters of the performance evaluation model are obtained by minimizing the reconstruction error, so that the target performance music score and the master performance audio information in the training set, after being reconstructed by the model, are as close as possible to the originals. Therefore, the more target performance music scores and master performance audio information the training set contains, the more fully the model is trained and the more meaningful the resulting model parameters are.
It should be understood that the performance evaluation model is a deep learning network model. Its four component units, each represented by a neural network (a score encoder, an audio encoder, a score decoder, and a global decoder), are trained end to end by minimizing the music score information error and the performance audio information error before and after reconstruction.
Referring to FIG. 3, step S200 may be implemented by the following steps:
S210, generating a reconstructed performance score according to the target performance score.
It should be understood that the score encoder has multiple network layers, and the output obtained after the target performance score passes through the score encoder is generally set to a vector of length 256, that is, the score information vector includes 256 target audio sequences. In addition, the score decoder has a recurrent layer and outputs a parameter value for each target audio sequence. The score information vector is therefore input into the score decoder, which converts it into the reconstructed performance score.
S220, generating reconstructed audio information according to the target performance score and the master performance audio information.
It should be understood that the audio encoder has multiple network layers, and the output obtained after the master performance audio information passes through the audio encoder is generally set to a vector of length 256, that is, the audio information vector includes 256 target audio sequences. In addition, the score decoder has a recurrent layer and outputs a parameter value for each target audio sequence. The score information vector and the audio information vector are therefore combined to obtain a global information vector, which is input into the global decoder and converted into reconstructed audio information.
It should be appreciated that, by combining the score information vector and the performance skill information vector (the audio information vector above), the global information vector can characterize both the accuracy and the skill of the performance audio information.
Referring to FIG. 7, FIG. 7 is a schematic diagram of a performance evaluation method based on feature decomposition according to an embodiment of the present invention.
It should be understood that the global information vector is decoded to obtain the reconstructed audio information. During training, the score information vector plays two roles, contributing to both the reconstruction of the music score information and the reconstruction of the audio information, whereas the performance skill information vector participates only in the reconstruction of the audio information. Because the error value of the reconstructed audio information is kept below the preset audio error threshold, the performance skill information vector contains only performance-related information and can therefore represent the performance skill in the performance audio information.
S230, updating the parameters of the performance evaluation model so that the error between the reconstructed performance score and the target performance score and the error between the reconstructed audio information and the master performance audio information are each smaller than a preset error threshold.
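The data flow through the four component units can be sketched as follows. This is only a shape-level illustration under stated assumptions: each unit is replaced by a single random linear map with a tanh activation (the real units are trained multi-layer networks), the input feature sizes `SCORE_DIM` and `AUDIO_DIM` are hypothetical, and "combining" the two vectors is assumed to mean concatenation, which the text does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT = 256                    # vector length stated in the text
SCORE_DIM, AUDIO_DIM = 88, 128  # assumed input feature sizes (hypothetical)

def linear(in_dim, out_dim):
    """Hypothetical stand-in for a trained network: one random linear map."""
    weights = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: np.tanh(x @ weights)

score_encoder = linear(SCORE_DIM, LATENT)
audio_encoder = linear(AUDIO_DIM, LATENT)
score_decoder = linear(LATENT, SCORE_DIM)
global_decoder = linear(2 * LATENT, AUDIO_DIM)

def forward(score, audio):
    """Return (reconstructed score, reconstructed audio, skill vector)."""
    score_vec = score_encoder(score)   # score information vector
    skill_vec = audio_encoder(audio)   # performance skill information vector
    # Assumption: the two vectors are combined by concatenation.
    global_vec = np.concatenate([score_vec, skill_vec], axis=-1)
    rec_score = score_decoder(score_vec)    # uses the score vector only
    rec_audio = global_decoder(global_vec)  # uses both vectors
    return rec_score, rec_audio, skill_vec

score = rng.random(SCORE_DIM)
audio = rng.random(AUDIO_DIM)
rec_score, rec_audio, skill_vec = forward(score, audio)
```

The sketch makes the asymmetry visible: the score vector feeds both decoders, while the skill vector reaches only the global decoder.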
It should be understood that the error of the reconstructed performance score is composed of the cross entropy of each generated audio sequence and is calculated as follows:
$$L_{NS} = -\sum_{t=1}^{N} p_t \log \hat{p}_t$$
where $L_{NS}$ is the error value of the reconstructed score information, $N$ is the length of the generated score sequence, $t$ is the step number in the sequence, $\hat{p}_t$ is the distribution over all notes predicted by the score decoder, and $p_t$ is the true note distribution.
It should be understood that, by updating the parameters of the score encoder and the score decoder so that the error value $L_{NS}$ between the reconstructed performance score and the target performance score is less than the preset music score error threshold, the distortion of the performance evaluation model can be effectively reduced and the reference value of the reconstructed performance score is ensured.
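A minimal sketch of the score reconstruction error, assuming it is the cross entropy between the true and predicted note distributions summed over the sequence steps. The function name, the clipping constant `eps`, and the toy distributions are illustrative assumptions; whether the original formula also normalizes by the sequence length is not recoverable from the text.

```python
import numpy as np

def score_reconstruction_error(p_true, p_pred, eps=1e-12):
    """Sum over steps t of the cross entropy between p_t and p_hat_t.

    p_true, p_pred: arrays of shape (N, K), one distribution over K notes
    per step. eps avoids log(0); it is an implementation assumption.
    """
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    per_step = -np.sum(p_true * np.log(p_pred), axis=-1)
    return float(per_step.sum())

# Toy example: two steps, two possible notes, one-hot ground truth.
p_true = [[1.0, 0.0], [0.0, 1.0]]
p_uncertain = [[0.5, 0.5], [0.5, 0.5]]
err_uncertain = score_reconstruction_error(p_true, p_uncertain)
err_perfect = score_reconstruction_error(p_true, p_true)
```

A perfect prediction gives an error of essentially zero, while the maximally uncertain prediction above costs log 2 per step.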
It should be understood that the error of the reconstructed audio information is composed of the error between the reconstructed audio information generated at each moment and the master performance audio information, and is calculated as follows:
$$L_{MS} = \frac{1}{N}\sum_{t=1}^{N} \lVert \hat{v}_t - v_t \rVert^2$$
where $L_{MS}$ is the error value of the reconstructed audio information, $N$ is the length of the generated score sequence, $t$ is the step number in the sequence, $\hat{v}_t$ is the audio vector data reconstructed at time $t$, and $v_t$ is the audio vector data of the master performance audio information at time $t$.
It should be appreciated that, by updating the parameters of the audio encoder and the global decoder so that the error value $L_{MS}$ between the reconstructed audio information and the master performance audio information is smaller than the preset audio error threshold, the fidelity of the performance evaluation model is ensured and the reference value of the reconstructed audio information is improved.
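A minimal sketch of the audio reconstruction error, assuming a mean squared error between the reconstructed and master audio vectors at each time step. The squared Euclidean norm and the averaging over steps are assumptions; the text only states that the error is formed per moment from the reconstructed audio and the master performance audio.

```python
import numpy as np

def audio_reconstruction_error(v_true, v_pred):
    """Mean over steps t of the squared distance between v_hat_t and v_t.

    v_true, v_pred: arrays of shape (N, D), one D-dimensional audio
    vector per time step. The squared-error form is an assumption.
    """
    v_true = np.asarray(v_true, dtype=float)
    v_pred = np.asarray(v_pred, dtype=float)
    if v_true.shape != v_pred.shape:
        raise ValueError("v_true and v_pred must have the same shape")
    per_step = np.sum((v_pred - v_true) ** 2, axis=-1)
    return float(per_step.mean())

# Toy example: two time steps of two-dimensional audio vectors.
v_master = [[0.0, 0.0], [1.0, 1.0]]
v_reconstructed = [[1.0, 0.0], [1.0, 3.0]]
err = audio_reconstruction_error(v_master, v_reconstructed)
```

In training, this value would be compared against the preset audio error threshold described above.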
S300, acquiring the performance audio information of a user, and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information.
Referring to FIG. 4, step S300 can be implemented by the following steps:
S310, inputting the performance audio information into the performance evaluation model;
S320, extracting a music score information vector from the performance audio information;
S330, generating reconstructed music score information according to the music score information vector.
It should be understood that, as in step S210 of the above embodiment, the music score information is passed through the score encoder to extract a score information vector, and the score information vector is input into the score decoder to obtain the reconstructed score information. This reconstruction process is the same as the one by which the score encoder and score decoder generate the reconstructed performance score from the target performance score, and is not described again here.
S400, determining the accuracy level according to the degree of matching between the reconstructed music score information and the target performance music score.
Referring to FIG. 5, step S400 may be implemented by the following steps:
S410, calculating the cross entropy between the reconstructed music score information and the corresponding note sequence in the target performance music score.
It should be understood that the calculation of the cross entropy between the reconstructed score information and the corresponding note sequence in the target performance score is consistent with the calculation process of step S230 in the above embodiment, and is not described again here.
S420, obtaining the number of matched notes in the reconstructed music score information according to the cross entropy and a preset music score matching threshold.
It should be understood that, by comparing the cross entropy with the preset score matching threshold, the number of notes in the reconstructed score information that match the target performance score within the threshold can be obtained, which reflects the accuracy of the performance.
S430, determining the accuracy level according to the number of matches.
It should be understood that the accuracy level is obtained from the ratio of the number of matches to the total number of notes, as shown in the following equation:
$$S_{accuracy} = \frac{M_{matched}}{M_{total}} \times 100$$
where $M_{matched}$ is the number of matches and $M_{total}$ is the total number of notes.
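Steps S410 to S430 can be sketched as follows. The sketch assumes that a note counts as matched when its per-note cross entropy is at or below the matching threshold, and that the ratio is scaled to a 0 to 100 range so that it is comparable with the skill level; both points, along with the function name and the toy data, are assumptions rather than details confirmed by the text.

```python
import numpy as np

def accuracy_level(p_true, p_pred, threshold, eps=1e-12):
    """Percentage of notes whose cross entropy is within the threshold.

    p_true, p_pred: arrays of shape (M_total, K) holding, for each note,
    the target distribution and the reconstructed distribution.
    """
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    total = p_true.shape[0]
    if total == 0:
        raise ValueError("the score must contain at least one note")
    per_note = -np.sum(p_true * np.log(p_pred), axis=-1)   # S410
    matched = int(np.count_nonzero(per_note <= threshold))  # S420
    return 100.0 * matched / total                          # S430

# Toy example: four notes, two possible note classes.
target = [[1, 0], [1, 0], [1, 0], [1, 0]]
reconstructed = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.6, 0.4]]
score = accuracy_level(target, reconstructed, threshold=0.5)
```

With the threshold of 0.5 used here, the first two notes match (cross entropies of about 0.11 and 0.22) and the last two do not (about 1.61 and 0.51), so `score` is 50.0.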
S500, determining the skill level according to the information vector distance value between the performance audio information and the master performance audio information.
Referring to FIG. 6, step S500 may be implemented by the following steps:
S510, extracting the performance skill information vector from the performance audio information.
S520, extracting the master skill information vector from the master performance audio information.
It should be understood that the process of extracting the performance skill information vector and the master skill information vector is consistent with the manner of extracting the score information vector from the performance audio information in step S320 of the above embodiment, and is not described again here.
It should be understood that extracting the master skill information vector from the master performance audio information with the trained audio encoder further ensures the fidelity of the master skill information vector and prevents distortion and errors in that vector from affecting the accuracy of the skill level.
It should be understood that the score encoder influences both the reconstructed score information and the reconstructed audio information. Its parameter updates are therefore driven by the supervisory signals of both reconstructions, and they balance the error of the reconstructed score information against the error of the reconstructed audio information.
S530, calculating the skill distance between the performance skill information vector and the master skill information vector.
It should be understood that the skill distance is calculated from the performance skill information vector and the master skill information vector as follows:
$$\mathrm{Distance}(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}$$
where $v_1$ is the performance skill information vector, $v_2$ is the master skill information vector, and $\mathrm{Distance}(v_1, v_2)$ has a value range of $[-1, +1]$.
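A minimal sketch of the skill distance in step S530, assuming a cosine-style measure. That form is inferred only from the stated value range of [-1, +1]; the function name and the toy vectors are illustrative.

```python
import numpy as np

def skill_distance(v1, v2):
    """Cosine-style measure between two skill vectors, in [-1, +1].

    Assumption: the distance has the cosine form, chosen because it
    matches the value range given in the text.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    if denom == 0.0:
        raise ValueError("skill vectors must be non-zero")
    # Clip to guard against tiny floating-point overshoot outside [-1, 1].
    return float(np.clip(np.dot(v1, v2) / denom, -1.0, 1.0))

user_vec = np.array([1.0, 0.0])
d_orthogonal = skill_distance(user_vec, [0.0, 2.0])
d_parallel = skill_distance(user_vec, [3.0, 0.0])
d_opposite = skill_distance(user_vec, [-1.0, 0.0])
```

In the method itself, the two inputs would be the 256-dimensional vectors produced by the audio encoder for the user's audio and the master's audio.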
S540, determining the skill level according to the skill distance.
It should be understood that, after the skill distance $\mathrm{Distance}(v_1, v_2)$ is obtained, since the greater the skill distance, the lower the skill level, the skill level is calculated as follows:
$$S_{perform} = (1 - |\mathrm{Distance}(v_1, v_2)|) \times 100$$
where $S_{perform}$ is the skill level.
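The mapping from skill distance to skill level can be sketched directly from the formula above. The function name and the range check are illustrative assumptions.

```python
def skill_level(distance):
    """Map a skill distance in [-1, +1] to a skill level in [0, 100]."""
    if not -1.0 <= distance <= 1.0:
        raise ValueError("distance must lie in [-1, +1]")
    return (1.0 - abs(distance)) * 100.0

level_zero_distance = skill_level(0.0)
level_quarter = skill_level(-0.25)
level_unit_distance = skill_level(1.0)
```

The mapping depends only on the magnitude of the distance: a distance of 0 gives a level of 100, and a distance of +1 or -1 gives a level of 0.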
S600, obtaining a comprehensive evaluation result according to the accuracy level and the skill level.
It should be understood that the weighted average of the accuracy level and the skill level is calculated to obtain the comprehensive evaluation result of the performance audio information. The influence of both the accuracy level and the skill level on the comprehensive evaluation result is thereby reflected, the calculation of the comprehensive evaluation result is more reasonable, and the result obtained by the user has higher reference value.
Illustratively, the comprehensive evaluation result takes the average of the accuracy level and the skill level. With the accuracy level and the skill level calculated in steps S400 and S500, the comprehensive evaluation result $S_{total}$ is calculated as follows:
$$S_{total} = (S_{accuracy} + S_{perform}) / 2$$
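A minimal sketch of step S600. The text describes a weighted average in general and a plain average as the illustrative case, so the sketch exposes a weight that defaults to 0.5; the parameter name `accuracy_weight` is a hypothetical choice.

```python
def overall_score(accuracy, skill, accuracy_weight=0.5):
    """Weighted average of the accuracy level and the skill level.

    With the default weight of 0.5 this reduces to
    (S_accuracy + S_perform) / 2.
    """
    if not 0.0 <= accuracy_weight <= 1.0:
        raise ValueError("accuracy_weight must lie in [0, 1]")
    return accuracy_weight * accuracy + (1.0 - accuracy_weight) * skill

plain_average = overall_score(80.0, 60.0)
accuracy_heavy = overall_score(80.0, 60.0, accuracy_weight=0.75)
```

Raising `accuracy_weight` above 0.5 favors note accuracy over performance skill; lowering it does the opposite.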
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a performance evaluation device based on feature decomposition according to another embodiment of the present invention. The performance evaluation device based on feature decomposition provided by the embodiment of the invention comprises:
the acquisition module 710 is configured to acquire performance audio information of a user, and input the performance audio information to a preset performance evaluation model to obtain reconstructed music score information;
a matching module 720, configured to determine an accuracy level according to a matching degree between the reconstructed score information and the target performance score;
the calculating module 730 is used for determining the skill level according to the information vector distance value between the performance audio information and the master performance audio information;
and the evaluation module 740 is used for obtaining a comprehensive evaluation result according to the accuracy level and the skill level.
The performance evaluation device based on feature decomposition provided by the embodiment of the present invention further comprises:
a collection module 750, configured to collect target performance music scores and master performance audio information to form a training set in which the target performance music scores and the master performance audio information are in one-to-one correspondence; and
a training module 760, configured to input the training set into the performance evaluation model for training and to update the performance evaluation model.
Referring to Fig. 9, in the performance evaluation device based on feature decomposition according to the embodiment of the present invention, the training module 760 further includes:
a score reconstruction module 761, configured to generate a reconstructed performance music score according to the target performance music score;
an audio reconstruction module 762, configured to generate reconstructed audio information according to the target performance music score and the master performance audio information; and
an updating module 763, configured to update parameters of the performance evaluation model so that the error between the reconstructed performance music score and the target performance music score and the error between the reconstructed audio information and the master performance audio information are both smaller than a preset error threshold.
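The stopping condition applied by the updating module 763 can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the names `train_until_below_threshold` and `make_toy_step` are hypothetical, and the toy step simply halves two error values on each call instead of updating a real model. Only the criterion itself (both reconstruction errors below a preset threshold) comes from the description.

```python
from typing import Callable, Tuple


def train_until_below_threshold(step: Callable[[], Tuple[float, float]],
                                error_threshold: float,
                                max_steps: int = 1000) -> int:
    """Call step() until both reconstruction errors are below the threshold.

    step() is assumed to perform one parameter update and return
    (score_error, audio_error). Returns the number of steps taken.
    """
    for n in range(1, max_steps + 1):
        score_error, audio_error = step()
        if score_error < error_threshold and audio_error < error_threshold:
            return n
    raise RuntimeError("errors did not fall below the threshold")


def make_toy_step(decay: float = 0.5) -> Callable[[], Tuple[float, float]]:
    """Toy stand-in for a real update: both errors shrink by `decay` per call."""
    state = {"score": 1.0, "audio": 2.0}

    def step() -> Tuple[float, float]:
        state["score"] *= decay
        state["audio"] *= decay
        return state["score"], state["audio"]

    return step


if __name__ == "__main__":
    print(train_until_below_threshold(make_toy_step(), error_threshold=0.1))
```

In a real system the `step` callable would wrap one optimizer update of the performance evaluation model and return the two measured reconstruction errors.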
In the performance evaluation device based on feature decomposition according to the embodiment of the present invention, the score reconstruction module 761 further includes:
a score encoder, configured to extract a score information vector from the performance audio information; and
a score decoder, configured to generate reconstructed music score information according to the score information vector.
In the performance evaluation device based on feature decomposition provided in the embodiment of the present invention, the matching module 720 is further configured to:
calculate the cross entropy between the reconstructed music score information and the corresponding note sequence in the target performance music score;
obtain the number of matching notes in the reconstructed music score information according to the cross entropy and a preset music score matching threshold; and
determine the accuracy level according to the number of matches.
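The three operations of the matching module 720 can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not stated in this section: each reconstructed note is represented as a probability distribution over candidate notes, each target note is an index into that distribution, a note counts as matched when its cross entropy does not exceed the threshold, and the accuracy level is the percentage of matched notes. The names `note_cross_entropy` and `accuracy_level` are hypothetical.

```python
import math
from typing import Sequence


def note_cross_entropy(probs: Sequence[float], target_index: int,
                       eps: float = 1e-12) -> float:
    """Cross entropy between a predicted distribution and a one-hot target."""
    # eps guards against log(0) when the target note has zero probability.
    return -math.log(max(probs[target_index], eps))


def accuracy_level(predicted: Sequence[Sequence[float]],
                   target: Sequence[int],
                   match_threshold: float) -> float:
    """Percentage of notes whose cross entropy is within the threshold."""
    if not target or len(predicted) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for probs, idx in zip(predicted, target)
        if note_cross_entropy(probs, idx) <= match_threshold
    )
    return matches / len(target) * 100.0


if __name__ == "__main__":
    predicted = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    target = [0, 1, 1]
    print(accuracy_level(predicted, target, match_threshold=0.5))
```

In the usage example, the first two notes have cross entropies of about 0.105 and 0.223 and are counted as matches, while the third (about 0.916) exceeds the assumed threshold of 0.5.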
In the performance evaluation device based on feature decomposition provided in the embodiment of the present invention, the audio reconstruction module 762 further includes:
an audio encoder, configured to extract a performance skill information vector from the performance audio information and to extract a master skill information vector from the master performance audio information.
The calculation module 730 is further configured to:
calculate the skill distance between the performance skill information vector and the master skill information vector; and
determine the skill level according to the skill distance.
It should be noted that, because the content of information interaction, execution process, and the like between the modules of the apparatus is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be referred to specifically in the method embodiment section, and are not described herein again.
Fig. 10 illustrates an electronic device 800 provided by an embodiment of the invention. The electronic device 800 includes, but is not limited to:
a memory 820 for storing programs;
and a control processor 810 for executing the program stored in the memory 820, wherein when the control processor 810 executes the program stored in the memory 820, the control processor 810 is configured to execute the performance evaluation method based on feature decomposition.
Control processor 810 and memory 820 may be connected by a bus or other means.
The memory 820 is a non-transitory computer-readable storage medium that can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the performance evaluation method based on feature decomposition described in any of the embodiments of the present invention. The control processor 810 implements the above-described performance evaluation method based on feature decomposition by executing the non-transitory software programs and instructions stored in the memory 820.
The memory 820 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data for executing the above-described performance evaluation method based on feature decomposition. Further, the memory 820 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 820 may optionally include memory located remotely from the control processor 810, which may be connected to the control processor 810 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions necessary to implement the above-described feature-decomposition-based performance evaluation method are stored in the memory 820 and, when executed by the one or more control processors 810, perform the feature-decomposition-based performance evaluation method provided by any of the embodiments of the present invention.
The embodiment of the invention also provides a storage medium, which stores computer executable instructions, and the computer executable instructions are used for executing the performance evaluation method based on the feature decomposition.
In one embodiment, the storage medium stores computer-executable instructions, which are executed by one or more control processors 810, for example, by one control processor 810 in the electronic device 800, so that the one or more control processors 810 can execute the method for evaluating performance based on feature decomposition according to any embodiment of the present invention.
The above described embodiments are merely illustrative, wherein elements illustrated as separate components may or may not be physically separate, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data, as is well known to those of ordinary skill in the art. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media, as is known to those skilled in the art.
While the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and such equivalent modifications or substitutions are included in the scope of the invention defined by the claims.

Claims (10)

1. A performance evaluation method based on feature decomposition is characterized by comprising the following steps:
acquiring performance audio information of a user, and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information;
determining an accuracy level according to the matching degree between the reconstructed music score information and the target performance music score;
calculating an information vector distance value between the performance audio information and preset master performance audio information, and determining a skill level according to the information vector distance value;
and comprehensively evaluating the performance audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result.
2. The method of claim 1, wherein, before the acquiring of the performance audio information of the user, the method further comprises:
collecting a target performance music score and the master performance audio information to form a training set in which the target performance music score and the master performance audio information are in one-to-one correspondence;
and inputting the training set into a performance evaluation model for training, and updating the performance evaluation model.
3. The method according to claim 2, wherein the inputting the training set into a performance evaluation model for training and updating the performance evaluation model comprises:
generating a reconstructed performance music score according to the target performance music score;
generating reconstructed audio information according to the target performance music score and the master performance audio information;
and updating the parameters of the performance evaluation model, so that the errors between the reconstructed performance music score and the target performance music score and between the reconstructed audio information and the master performance audio information are smaller than a preset error threshold.
4. The method of claim 3, wherein the acquiring of the user performance audio information and the inputting of the user performance audio information into a preset performance evaluation model to obtain reconstructed score information comprises:
inputting the performance audio information into the performance evaluation model;
extracting a score information vector from the performance audio information;
and generating the reconstructed music score information according to the music score information vector.
5. The method of claim 4, wherein determining an accuracy level based on a degree of match between the reconstructed score information and a target performance score comprises:
calculating the cross entropy between the reconstructed music score information and the corresponding note sequence in the target performance music score;
acquiring the matching number of notes in the reconstructed music score information according to the cross entropy and a preset music score matching threshold;
determining the accuracy level based on the number of matches.
6. The method of claim 4, wherein the calculating of an information vector distance value between the performance audio information and preset master performance audio information, and the determining of a skill level according to the information vector distance value, comprise:
extracting a performance skill information vector from the performance audio information;
extracting a master skill information vector from the master performance audio information;
calculating a skill distance between the performance skill information vector and the master skill information vector;
determining the skill level based on the skill distance.
7. The method according to claim 1, wherein the comprehensive evaluation of the performance audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result comprises:
and calculating the weighted average of the accuracy level and the skill level to obtain the comprehensive evaluation result of the performance audio information.
8. A performance evaluation device based on feature decomposition is characterized by comprising:
the acquisition module is used for acquiring the performance audio information of a user and inputting the performance audio information into a preset performance evaluation model to obtain reconstructed music score information;
the matching module is used for determining the accuracy level according to the matching degree between the reconstructed music score information and the target performance music score;
the calculation module is used for calculating an information vector distance value between the performance audio information and the master performance audio information and determining the skill level according to the information vector distance value;
and the evaluation module is used for comprehensively evaluating the performance audio information according to the accuracy level and the skill level to obtain a comprehensive evaluation result.
9. An electronic device, comprising: a memory, a control processor, and a computer program stored on the memory and executable on the control processor, wherein the control processor, when executing the computer program, implements the performance evaluation method based on feature decomposition according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when executed by a processor, carries out the performance evaluation method based on feature decomposition according to any one of claims 1 to 7.
CN202111131753.2A 2021-09-26 2021-09-26 Performance evaluation method and device based on feature decomposition Pending CN113851146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111131753.2A CN113851146A (en) 2021-09-26 2021-09-26 Performance evaluation method and device based on feature decomposition

Publications (1)

Publication Number Publication Date
CN113851146A true CN113851146A (en) 2021-12-28

Family

ID=78980221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111131753.2A Pending CN113851146A (en) 2021-09-26 2021-09-26 Performance evaluation method and device based on feature decomposition

Country Status (1)

Country Link
CN (1) CN113851146A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107767847A (en) * 2017-09-29 2018-03-06 小叶子(北京)科技有限公司 A kind of intelligent piano performance assessment method and system
CN108492817A (en) * 2018-02-11 2018-09-04 北京光年无限科技有限公司 A kind of song data processing method and performance interactive system based on virtual idol
KR102107588B1 (en) * 2018-10-31 2020-05-07 미디어스코프 주식회사 Method for evaluating about singing and apparatus for executing the method
CN111554256A (en) * 2020-04-21 2020-08-18 华南理工大学 Piano playing ability evaluation system based on strong and weak standards
CN111554255A (en) * 2020-04-21 2020-08-18 华南理工大学 MIDI playing style automatic conversion system based on recurrent neural network
CN111898753A (en) * 2020-08-05 2020-11-06 字节跳动有限公司 Music transcription model training method, music transcription method and corresponding device
CN112669796A (en) * 2020-12-29 2021-04-16 西交利物浦大学 Method and device for converting music into music book based on artificial intelligence
CN113780811A (en) * 2021-09-10 2021-12-10 平安科技(深圳)有限公司 Musical instrument performance evaluation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40062551; Country of ref document: HK)