Disclosure of Invention
The invention addresses two shortcomings of existing piano sound quality evaluation. First, evaluation relies on subjective judgment and ignores music style, so the analysis is not comprehensive. Second, data are acquired with a single microphone, so the spatial-domain features of the performance sound signal are never analyzed. To overcome these shortcomings, the invention provides a piano performance sound quality evaluation system that takes music style into account, together with a performance sound signal acquisition arrangement that combines a microphone array recording device with a high-quality microphone recording device. The sound quality evaluation module is built on a fuzzy neural network. After the system receives the performance sound signal of a piece of a specific style played by a user on a piano, it analyzes the corresponding features in the spatial domain, the time domain, the frequency domain and the time-frequency diagram. The resulting feature vector and the music style label are used as the input of the fuzzy neural network, which outputs a score that evaluates the sound quality of the played piano for the selected music style.
To achieve this, the system must be established before use. Establishment consists of collecting performance sound signals of different music styles on different pianos, collecting expert subjective evaluation data, extracting and analyzing signal features, and determining the structure of the fuzzy neural network and training it. Once trained, the fuzzy neural network is ready for use. The specific technical scheme of the invention is as follows.
A piano performance sound quality evaluation system that takes music style into account comprises a piano music library, a microphone array recording device, a high-quality microphone recording device, a music database, an expert listening evaluation module, a signal feature extraction module, a sample library, a signal feature analysis module and a sound quality evaluation module;
the piano music library carries music style labels, a music style label being a serial number assigned to each music style on the basis of a music style classification; the piano music library serves two functions: first, during system establishment it provides a large number of piano pieces for subsequent analysis and for training the fuzzy neural network; second, during use it provides the user with piano pieces to choose from;
the microphone array recording device comprises a plurality of microphones arranged at different spatial positions, which collect the performance sound signals of the piano at those positions, namely spatial-domain signals, so that the performance sound quality of the piano can be analyzed macroscopically;
the signal feature extraction module extracts macroscopic and microscopic signal features from the audio files formed from the piano performance sound collected by the microphone array recording device and the high-quality microphone recording device, the signal features comprising spatial-domain features, time-domain features, frequency-domain features and time-frequency diagram features; a sample library is then established for storing the input samples of the fuzzy neural network, the input samples being of two types, training samples and samples to be evaluated; each sample is a signal feature vector comprising the extracted signal features and the corresponding music style label;
the signal feature analysis module establishes the fuzzy sets and the fuzzy inference rules: it statistically compares the extracted signal features with the subjective evaluation data obtained from the expert listening evaluation module, thereby establishing fuzzy sets for the signal features and the subjective evaluation data as well as the fuzzy inference rules of the evaluation process;
the sound quality evaluation module outputs a sound quality evaluation score after the samples in the sample library have been processed by the fuzzy neural network; before the module is used, the structure of the fuzzy neural network is determined from the fuzzy inference rules established by the signal feature analysis module, all samples in the sample library are used as training samples, the subjective evaluation data of the audio file corresponding to each sample is used as the expected output, namely the supervision signal, and the fuzzy neural network is trained; once training is complete the module is ready for use; when a user uses the module, it outputs an evaluation score automatically for each sample to be evaluated, with no need to obtain subjective evaluation data manually.
Based on the above technical scheme, the evaluation method of the piano performance sound quality evaluation system comprises two processes: a system establishment process and a user usage process.
Before the system is used, performance sound signals of different music styles on different pianos must be collected, expert subjective evaluation data must be collected, signal features must be extracted and analyzed, and the structure of the fuzzy neural network must be determined and the network trained.
The steps of the system setup procedure are as follows:
(1) A piano music library with music style labels is established by analyzing representative music style classification rules; the music style labels number the music styles on the basis of the music style classification.
(2) Performance sound signals played on a plurality of pianos are recorded on site using the microphone array recording device and the high-quality microphone recording device; the pieces played include all pieces in the piano music library. The signals collected by the microphone array recording device are processed into multi-channel audio files, and the signals collected by the high-quality microphone recording device are processed into high-fidelity audio files.
(3) The high-fidelity audio files are played back in a listening room, and a plurality of professionals take part in listening experiments in which they evaluate the sound quality. The subjective evaluation data are collected and summarized statistically, and the resulting subjective evaluation data, the audio files and the music style labels are stored together in the music database.
(4) Based on the short-time stationarity of audio signals, pre-emphasis and framing are applied to the multi-channel audio files and the high-fidelity audio files.
(5) All audio files in the music database are input into the signal feature extraction module to extract signal features, and a sample library is established; the extracted signal features and the corresponding music style labels form signal feature vectors, which are stored in the sample library as training samples.
(6) The extracted signal features are summarized statistically and compared with the subjective evaluation data from step (3), thereby establishing fuzzy sets for the signal features and the subjective evaluation data and the fuzzy inference rules of the evaluation process. The fuzzy sets and fuzzy inference rules express, in the terms of fuzzy mathematics, the conditions required for judging sound quality and the logic of the judgment. For example: if the fundamental-to-harmonic proportion is in the range x1 and the spatial balance is in the range y1, then the sound quality evaluation score is in the range z1, where x1, y1 and z1 belong to the respective fuzzy sets X, Y and Z.
(7) The structure of the fuzzy neural network is determined from the size of the signal feature vector obtained in step (5) and the fuzzy inference rules obtained in step (6); all samples in the sample library are used as training samples, the subjective evaluation data of the audio file corresponding to each sample is used as the expected output, namely the supervision signal, and the fuzzy neural network is trained, which completes the establishment of the sound quality evaluation module.
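As a non-limiting illustration of the pre-emphasis and framing of step (4), a minimal Python sketch follows. The first-order pre-emphasis filter, its coefficient of 0.97, and the frame length and hop size used below are assumptions chosen for illustration; the description does not specify them.

```python
from typing import List


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha = 0.97 is an assumed, commonly used value, not taken from the source.
    """
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def split_frames(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]
```

Each frame is short enough that the signal can be treated as approximately stationary within it, which is the property the step relies on; for example, `split_frames(pre_emphasis(samples), frame_len=16, hop=8)` yields frames that overlap by half.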
Once the system has been established, a user can use it: it outputs an evaluation score automatically for each sample to be evaluated, with no need to obtain subjective evaluation data manually. The steps of the user usage process are as follows:
(1) The user selects a music style in the piano music library and plays a piece of that style on the piano to be evaluated, while the microphone array recording device and the high-quality microphone recording device simultaneously collect the performance sound signals.
(2) The performance sound signals are processed into audio files, and the audio files and the music style label are stored in the music database.
(3) The audio files are preprocessed and input into the signal feature extraction module; the extracted signal features and the music style label together form a signal feature vector, which is stored in the sample library as a sample to be evaluated.
(4) The sample to be evaluated is input into the established sound quality evaluation module, which outputs an evaluation score in the range [0,100] as the performance sound quality evaluation result of the piano for the selected music style.
Compared with the existing tone quality evaluation system, the invention has the following advantages:
(1) Once the system has been established, a segment of piano performance sound signal only needs to be input into the computer, and the performance sound quality of the piano is evaluated without further manual work. Compared with subjective evaluation by experts, this greatly improves convenience and automation and reduces labor and time costs.
(2) The system takes both subjective evaluation data and objective signal features into account during system establishment. The objective signal features comprise spatial-domain, time-domain, frequency-domain and time-frequency diagram features. This multi-feature analysis greatly improves the objectivity of the evaluation and avoids problems such as drifting evaluation standards caused by purely subjective human evaluation.
(3) The system also takes music style, a factor that reflects personal preference, into account by including the music style label in the evaluation of the piano performance. After processing, the user obtains the sound quality of the preferred music style on the piano to be evaluated. By comparing the evaluation scores of several candidate pianos for the preferred music style, the user is helped to select the piano that truly suits them and gives the best performance result; likewise, a piano manufacturer producing a personalized custom piano can use the system to assist in adjusting the parameters of the piano according to the user's preference.
Detailed Description
To make the objects, technical solutions, innovative points and advantages of the present invention clearer, embodiments of the invention are described below with reference to the accompanying drawings; the practice of the invention is not limited to these embodiments.
As shown in fig. 1, the system of the present invention comprises nine modules: a piano music library, a microphone array recording device, a high-quality microphone recording device, a music database, an expert listening evaluation module, a signal feature extraction module, a sample library, a signal feature analysis module and a sound quality evaluation module.
The piano performance sound quality evaluation system that takes music style into account mainly comprises the following parts:
(1) Piano music library: it contains a plurality of piano pieces with music style labels, the music style labels numbering each music style on the basis of the music style classification.
(2) Microphone array recording device: it collects the spatial-domain signals of the piano performance sound for macroscopic analysis.
(3) High-quality microphone recording device: it collects the high-fidelity signal of the piano performance sound for microscopic analysis.
(4) Expert listening evaluation module: during system establishment, the performance sound signals of all pieces in the piano music library played on different pianos are collected and processed into audio files; subjective evaluation data are then obtained by playing back the audio files for expert listening evaluation.
(5) Music database: the music style labels, the audio files and the subjective evaluation data are stored together in the music database; during use of the system by a user, audio file playback and expert listening evaluation are not needed, so the data stored in the music database then comprise only the audio files and the music style labels.
(6) Signal feature extraction module: after the performance audio files in the music database have been preprocessed, this module extracts the macroscopic and microscopic signal features of the piano performance sound, mainly spatial-domain, time-domain, frequency-domain and time-frequency diagram features.
(7) Sample library: it stores the input samples of the fuzzy neural network; each sample is a signal feature vector comprising the extracted signal features and the corresponding music style label, and the sample types are training samples and samples to be evaluated.
(8) Signal feature analysis module: during system establishment, this module establishes the fuzzy inference rules; it statistically compares the extracted signal features with the subjective evaluation data, establishes fuzzy sets for the signal features and the subjective evaluation data, and establishes the fuzzy inference rules of the evaluation process.
(9) Sound quality evaluation module: this module outputs a sound quality evaluation score after the samples in the sample library have been processed by the fuzzy neural network. Before the module is used, the structure of the fuzzy neural network is determined from the fuzzy inference rules established by the signal feature analysis module of item (8), all samples in the sample library of item (7) are used as training samples, the subjective evaluation data of the audio file corresponding to each sample is used as the expected output, namely the supervision signal, and the fuzzy neural network is trained. Once the network has been trained, the module is ready for use. When a user uses the module, it outputs an evaluation score automatically for each sample to be evaluated, with no need to obtain subjective evaluation data manually.
Before use, the system must be established, that is, performance sound signals of different music styles on different pianos are collected, expert subjective evaluation data are collected, signal features are extracted and analyzed, and the structure of the fuzzy neural network is determined and the network trained; once trained, the fuzzy neural network is ready for use. Thus the system establishment process involves all nine modules, while the user usage process involves only seven of them: the piano music library, the microphone array recording device, the high-quality microphone recording device, the music database, the signal feature extraction module, the sample library and the sound quality evaluation module.
The piano music library in this example may include four styles, classified by the period of development of piano music: the baroque style, the classical style, the romantic style and the modern style, with music style labels 1, 2, 3 and 4 respectively. Three typical études are selected for each style, so the music library contains 12 piano pieces in total.
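The labeling of this example library can be sketched as follows. The label numbers and the count of three pieces per style come from the example above; the piece identifiers are hypothetical placeholders, since the description does not name the études.

```python
from typing import Dict, List, Tuple

# Style numbering taken from the example: 1 = baroque ... 4 = modern.
STYLE_LABELS: Dict[str, int] = {
    "baroque": 1,
    "classical": 2,
    "romantic": 3,
    "modern": 4,
}
PIECES_PER_STYLE = 3


def build_library() -> List[Tuple[str, int]]:
    """Return (piece_id, style_label) pairs; the piece ids are hypothetical names."""
    return [(f"{style}_etude_{k}", label)
            for style, label in STYLE_LABELS.items()
            for k in range(1, PIECES_PER_STYLE + 1)]
```

Every piece of a given style carries the same label, so the label identifies the style rather than the individual piece.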
The microphone array recording device in this embodiment is shown in fig. 2. It comprises 18 microphones, an interface assembly board, a USB-to-serial signal transmission control circuit, amplification band-pass circuits, A/D modules and ARM signal processing and storage modules. An amplification band-pass circuit, an A/D module and an ARM signal processing and storage module are integrated into one circuit set; each circuit set board receives the performance sound signals collected by 3 microphones, and there are 6 circuit set boards in total, namely 1 master and 5 slaves. The sampling frequency is 100 kHz and the quantization precision is 12 bits. The USB-to-serial signal transmission control circuit receives the start-recording and stop-recording signals from the computer: when the computer sends the start command, the command is passed through this circuit to the master, the master relays it simultaneously to every slave over the signal lines connecting them, and the other modules then start working. The data in the ARM signal processing and storage modules are stored on SD memory cards as .bin files and are converted into multi-channel wav audio files by a corresponding program.
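The conversion program itself is not described. A minimal sketch of what such a conversion could look like is given below, using only the Python standard library. The layout of the .bin data is an assumption made for illustration: channel-interleaved, little-endian 16-bit words each holding one unsigned 12-bit sample, already merged from the six boards into a single file. The channel count, sampling frequency and bit depth are taken from the embodiment.

```python
import struct
import wave

NUM_CHANNELS = 18       # microphones in the array (from the embodiment)
SAMPLE_RATE = 100_000   # Hz (from the embodiment)
ADC_BITS = 12           # quantization precision (from the embodiment)


def adc_words_to_pcm16(raw: bytes) -> bytes:
    """Map unsigned 12-bit ADC words (assumed little-endian uint16) to signed 16-bit PCM."""
    count = len(raw) // 2
    words = struct.unpack(f"<{count}H", raw[:count * 2])
    midpoint = 1 << (ADC_BITS - 1)   # 2048 represents zero signal
    shift = 16 - ADC_BITS            # scale the 12-bit range up to 16 bits
    pcm = [((w & 0x0FFF) - midpoint) << shift for w in words]
    return struct.pack(f"<{count}h", *pcm)


def bin_to_wav(bin_path: str, wav_path: str) -> int:
    """Convert an assumed channel-interleaved .bin dump to a WAV file; return the frame count."""
    with open(bin_path, "rb") as f:
        raw = f.read()
    frame_bytes = 2 * NUM_CHANNELS
    raw = raw[:len(raw) - len(raw) % frame_bytes]   # drop an incomplete trailing frame
    with wave.open(wav_path, "wb") as out:
        out.setnchannels(NUM_CHANNELS)
        out.setsampwidth(2)          # bytes per sample in the output file
        out.setframerate(SAMPLE_RATE)
        out.writeframes(adc_words_to_pcm16(raw))
    return len(raw) // frame_bytes
```

If each board writes its own .bin file, the three-channel streams would first have to be aligned and interleaved; that step depends on the firmware and is not shown.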
The high-quality microphone recording device comprises a single microphone capable of collecting high-fidelity performance sound signals and is used together with Adobe Audio software to generate high-fidelity wav audio files; the selected sampling frequency is 96 kHz and the quantization precision is 16 bits.
The music database in this embodiment mainly stores the multi-channel wav audio files and the high-fidelity wav audio files. The audio files, the subjective evaluation data obtained for them from the expert listening evaluation module, and the music style labels are stored together in the database, which establishes a music database covering multiple styles and multiple pianos and labeled with the different sound quality evaluation scores.
The expert listening evaluation module works as follows: excerpts of the high-fidelity wav audio files are played back in a listening room, and 5 experts are invited to take part in listening experiments. Each excerpt is evaluated on indexes such as the stability, richness, brightness and fullness of the sound; the sound quality of the audio files is then compared pairwise and given an overall score; and the collected subjective evaluation data are summarized and analyzed statistically.
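The statistical treatment of the expert ratings is not detailed. One simple possibility, given purely as an assumption, is to average each index over the experts and to report the spread between experts as a consistency check; the equal weighting of the four indexes and the 0 to 100 rating scale below are illustrative choices, and the pairwise comparison stage is not modeled.

```python
from statistics import mean, pstdev
from typing import Dict, List

# The four indexes named in the embodiment.
CRITERIA = ("stability", "richness", "brightness", "fullness")


def aggregate_ratings(ratings: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each index over the experts; add an overall mean and the spread between experts."""
    summary = {c: mean(r[c] for r in ratings) for c in CRITERIA}
    # Equal weighting of the indexes is an assumption made for illustration.
    per_expert = [mean(r[c] for c in CRITERIA) for r in ratings]
    summary["overall"] = mean(per_expert)
    summary["spread"] = pstdev(per_expert)
    return summary
```

A large spread would indicate that the experts disagree on an excerpt, in which case the excerpt could be re-evaluated before its score is used as a supervision signal.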
The signal feature extraction module in this embodiment takes the audio files in the music database as input and extracts features in the spatial domain, the time domain, the frequency domain and the time-frequency diagram; these features relate to psychoacoustic indexes as follows:
1) macroscopic spatial-domain signal features, such as spatial balance, extracted from the multi-channel wav audio files are related to the stability of the sound and to the naturalness of the registers and of the transitions between them;
2) microscopic time-domain envelope features, such as the onset time and the duration of a single note, extracted from the high-fidelity wav audio files are related to whether the timbre is bright or dull and directly influence the sound quality;
3) microscopic frequency-domain features, such as the amplitudes and amplitude proportions of the fundamental and the overtones, extracted from the high-fidelity wav audio files are related to the timbral expressiveness of the sound;
4) the overall characteristics of the fundamental and the overtones shown in the time-frequency diagram are related to the fullness and harmony of the sound.
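The embodiment names these features without defining them mathematically. The following sketch shows one plausible way to compute three of them; every definition here is an assumption made for illustration: spatial balance as the ratio of the quietest to the loudest channel level, onset time as the 10% to 90% rise time of the signal magnitude, and the fundamental's share as its amplitude divided by the summed amplitudes of the first few partials, each estimated with a single-bin DFT at a known fundamental frequency. Time-frequency diagram features are not covered.

```python
import cmath
import math
from typing import Sequence


def rms(x: Sequence[float]) -> float:
    """Root-mean-square level of one channel."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def spatial_balance(channels: Sequence[Sequence[float]]) -> float:
    """Assumed balance measure: quietest channel level over loudest (1.0 means even)."""
    levels = [rms(ch) for ch in channels]
    loudest = max(levels)
    return min(levels) / loudest if loudest > 0 else 0.0


def onset_time(signal: Sequence[float], sample_rate: float,
               lo: float = 0.1, hi: float = 0.9) -> float:
    """Seconds taken by |signal| to rise from lo*peak to hi*peak (assumed definition)."""
    peak = max(abs(v) for v in signal)
    start = next(i for i, v in enumerate(signal) if abs(v) >= lo * peak)
    end = next(i for i, v in enumerate(signal) if abs(v) >= hi * peak)
    return (end - start) / sample_rate


def partial_amplitude(frame: Sequence[float], sample_rate: float, freq: float) -> float:
    """Amplitude of one sinusoidal partial, estimated with a single-bin DFT."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * k / sample_rate)
              for k, v in enumerate(frame))
    return 2.0 * abs(acc) / len(frame)


def fundamental_share(frame: Sequence[float], sample_rate: float,
                      f0: float, n_partials: int = 5) -> float:
    """Fundamental amplitude divided by the summed amplitudes of the first n_partials partials."""
    mags = [partial_amplitude(frame, sample_rate, f0 * h)
            for h in range(1, n_partials + 1)]
    total = sum(mags)
    return mags[0] / total if total > 0 else 0.0
```

The single-bin estimate is only accurate when the frame spans a whole number of periods of the fundamental and all partials lie below half the sampling frequency; a practical implementation would apply a window and estimate the fundamental frequency first.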
The sample library in this embodiment stores the input samples of the sound quality evaluation module and holds two sample types, training samples and samples to be evaluated; each sample is a signal feature vector comprising the extracted signal features and the corresponding music style label.
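A possible in-memory representation of such a sample library is sketched below. The class and field names are hypothetical; the only facts taken from the description are the two sample types, the composition of the feature vector, and the use of expert scores as supervision signals for training samples.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SampleType(Enum):
    TRAINING = "training"
    TO_EVALUATE = "to_evaluate"


@dataclass
class Sample:
    features: List[float]                 # extracted signal features
    style_label: int                      # music style label, e.g. 3 for romantic
    sample_type: SampleType
    expert_score: Optional[float] = None  # supervision signal, training samples only

    def as_vector(self) -> List[float]:
        """Signal feature vector: the features followed by the style label."""
        return [*self.features, float(self.style_label)]


class SampleLibrary:
    """Minimal store holding both sample types (hypothetical helper)."""

    def __init__(self) -> None:
        self._samples: List[Sample] = []

    def add(self, sample: Sample) -> None:
        # A training sample is only usable if its expert score is known.
        if sample.sample_type is SampleType.TRAINING and sample.expert_score is None:
            raise ValueError("training samples need an expert score")
        self._samples.append(sample)

    def of_type(self, sample_type: SampleType) -> List[Sample]:
        return [s for s in self._samples if s.sample_type is sample_type]
```

Appending the style label as one more numeric component is the simplest encoding; a one-hot encoding of the four styles would be an alternative if the network should not treat the labels as ordered.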
The sound quality evaluation module is the main module of the system. Its main structure is a multi-input, single-output, five-layer neural network whose internal logic and weight functions are chosen according to the fuzzy sets and fuzzy inference rules established by the signal feature analysis module.
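The description fixes only the number of layers and the multi-input, single-output shape. One common way to arrange five layers in a fuzzy neural network is input, fuzzification, rule, normalization and output; the forward pass below follows that arrangement as an assumption, with Gaussian membership functions, product rule firing and constant rule consequents chosen for illustration.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

Gaussian = Tuple[float, float]  # (center, width) of one membership function


def membership(x: float, mf: Gaussian) -> float:
    """Degree to which x belongs to a Gaussian fuzzy set (assumed shape)."""
    center, width = mf
    return math.exp(-((x - center) ** 2) / (2.0 * width * width))


def forward(inputs: Sequence[float],
            mfs: Sequence[Sequence[Gaussian]],
            consequents: Sequence[float]) -> float:
    """Forward pass of an assumed five-layer fuzzy network with one output score."""
    # Layer 1: the input layer passes the feature vector through unchanged.
    x = list(inputs)
    # Layer 2: fuzzification, one membership degree per input and fuzzy set.
    degrees: List[List[float]] = [[membership(xi, mf) for mf in sets]
                                  for xi, sets in zip(x, mfs)]
    # Layer 3: one rule per combination of fuzzy sets, fired with the product.
    firing = [math.prod(combo) for combo in product(*degrees)]
    if len(firing) != len(consequents):
        raise ValueError("one consequent is needed per rule")
    # Layer 4: normalization of the firing strengths.
    total = sum(firing)
    if total == 0.0:  # every rule underflowed: fall back to the mean consequent
        return sum(consequents) / len(consequents)
    weights = [f / total for f in firing]
    # Layer 5: output, the weighted sum of the rule consequents.
    return sum(w * c for w, c in zip(weights, consequents))
```

Because the output is a weighted average of the consequents, it stays within [0,100] whenever every consequent does, which matches the score range stated for the module.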
The signal feature analysis module establishes the fuzzy inference rules: it statistically compares the extracted signal features with the subjective evaluation data, thereby establishing fuzzy sets for the signal features and the subjective evaluation data as well as the fuzzy inference rules of the evaluation process. The fuzzy sets and fuzzy rules express, in the terms of fuzzy mathematics, the conditions required for judging sound quality and the logic of the judgment. For example: if the fundamental-to-harmonic proportion is in the range x1 and the spatial balance is in the range y1, then the sound quality evaluation score is in the range z1, where x1, y1 and z1 belong to the respective fuzzy sets X, Y and Z.
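To make the example rule concrete, the sketch below evaluates how strongly a single rule of this form applies to a measured pair of features. The triangular membership shape, the use of the minimum as the fuzzy AND, and the numeric ranges in the usage note are assumptions for illustration; the description does not give the shapes or bounds of the fuzzy sets.

```python
from typing import Tuple

Triangle = Tuple[float, float, float]  # (left foot, peak, right foot)


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def rule_strength(proportion: float, balance: float,
                  x1: Triangle, y1: Triangle) -> float:
    """Firing strength of: IF proportion is in x1 AND balance is in y1 THEN score is in z1."""
    # The minimum is one common choice for the fuzzy AND (an assumption here).
    return min(triangular(proportion, *x1), triangular(balance, *y1))
```

For instance, with x1 = (0.0, 0.5, 1.0) and y1 = (0.0, 0.5, 1.0), a proportion of 0.25 and a balance of 0.5 fire the rule with strength 0.5, so the conclusion "score is in z1" would be supported to that degree.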
Before use, the system must be established, that is, performance sound signals of different music styles on different pianos are collected, expert subjective evaluation data are collected, signal features are extracted and analyzed, and the structure of the fuzzy neural network is determined and the network trained; once trained, the fuzzy neural network is ready for use. A flow chart of the system establishment process is shown in fig. 3.
The system establishment process is as follows:
(1) Establishing the piano music library: as described above, the library contains 12 piano pieces in four styles.
(2) Turning on the recording devices while players perform the pieces in the music library: performance sound signals played on a plurality of pianos are recorded on site using the microphone array recording device and the high-quality microphone recording device, and the pieces played include all pieces in the piano music library.
(3) Forming audio files: the performance sound signals collected by the microphone array recording device are processed into multi-channel audio files, and the performance sound signals collected by the high-quality microphone recording device are processed into high-fidelity audio files. So that every étude of every style in the piano music library of this example is covered, 3 players play on 4 distinctly different pianos, which yields 48 high-fidelity audio files and 48 16-channel audio files in total.
(4) Audio file playback, expert listening evaluation and establishment of the music database: the high-fidelity audio files are played back in a listening room, and a plurality of professionals take part in listening experiments in which they evaluate the sound quality. The subjective evaluation data are collected and summarized statistically, and the resulting subjective evaluation data, the audio files and the music style labels are stored together in the music database.
(5) Preprocessing the audio files: based on the short-time stationarity of audio signals, pre-emphasis and framing are applied to the multi-channel audio files and the high-fidelity audio files.
(6) Extracting signal features to form training samples: all audio files in the music database are input into the signal feature extraction module to extract signal features, and a sample library is established; the extracted signal features and the corresponding music style labels form signal feature vectors, which are stored in the sample library as training samples.
(7) Analyzing the signal features and the subjective evaluation data and establishing the fuzzy sets and fuzzy inference rules: the extracted signal features are summarized statistically and compared with the subjective evaluation data obtained in step (4), thereby establishing fuzzy sets for the signal features and the subjective evaluation data and the fuzzy inference rules of the evaluation process. The fuzzy sets and fuzzy inference rules express, in the terms of fuzzy mathematics, the conditions required for judging sound quality and the logic of the judgment. For example: if the fundamental-to-harmonic proportion is in the range x1 and the spatial balance is in the range y1, then the sound quality evaluation score is in the range z1, where x1, y1 and z1 belong to the respective fuzzy sets X, Y and Z.
(8) Determining the structure of the fuzzy neural network and training it: the structure of the fuzzy neural network is determined from the size of the signal feature vector obtained in step (6) and the fuzzy inference rules obtained in step (7); all samples in the sample library are used as training samples, the subjective evaluation data of the audio file corresponding to each sample is used as the expected output, namely the supervision signal, and the fuzzy neural network is trained.
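The training algorithm is not specified. As one possible illustration of supervised training against the expert scores, the sketch below adjusts only the rule consequents of a fuzzy network by stochastic gradient descent on the squared error, taking the normalized rule firing strengths of each training sample as given. The learning rate, the number of epochs and the mid-scale initial value are assumptions.

```python
from typing import List, Sequence


def predict(weights: Sequence[float], consequents: Sequence[float]) -> float:
    """Network output: normalized firing strengths times rule consequents."""
    return sum(w * c for w, c in zip(weights, consequents))


def train_consequents(firing: Sequence[Sequence[float]],
                      targets: Sequence[float],
                      n_rules: int,
                      lr: float = 0.5,
                      epochs: int = 200) -> List[float]:
    """Fit the rule consequents to the expert scores (the supervision signals).

    firing[i] holds the normalized firing strengths of training sample i;
    targets[i] is the expert score of the same sample.
    """
    consequents = [50.0] * n_rules          # start every rule at mid-scale
    for _ in range(epochs):
        for w, target in zip(firing, targets):
            error = predict(w, consequents) - target
            for j in range(n_rules):
                # Gradient of 0.5 * error**2 with respect to consequent j is error * w[j].
                consequents[j] -= lr * error * w[j]
    return consequents
```

Because the normalized firing strengths of a sample sum to one, their squared norm is at most one and a learning rate below 2 keeps each update stable. A fuller implementation could also adapt the membership function parameters, and would hold back part of the samples to check the trained network.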
After the evaluation results have been verified to be optimal, the establishment of the sound quality evaluation module is complete. Once the system has been successfully established, a user can use it: it outputs an evaluation score automatically for each sample to be evaluated, with no need to obtain subjective evaluation data manually. A flow chart of the user usage process is shown in fig. 4.
The user usage process is as follows:
(1) Selecting the preferred music style: the user selects a preferred music style in the piano music library, for example the romantic style, so that the music style label is set to 3, and correctly plays the selected piece on the piano whose sound quality is to be evaluated.
(2) Turning on the recording devices while the user plays a piece of the selected music style on the piano to be evaluated: the microphone array recording device and the high-quality microphone recording device simultaneously collect the user's performance sound signals.
(3) Forming audio files and storing them in the music database together with the music style label: after the performance is finished, the collected performance sound signals are processed into audio files, and the audio files and the music style label are stored in the music database.
(4) Preprocessing the audio files: based on the short-time stationarity of audio signals, pre-emphasis and framing are applied to the audio files.
(5) Extracting signal features to form a sample to be evaluated: the preprocessed audio files are input into the signal feature extraction module; the extracted signal features and the music style label form a signal feature vector, which is stored in the sample library as a sample to be evaluated.
(6) Evaluating the sound quality and outputting the sound quality evaluation score: the sample to be evaluated is input into the established sound quality evaluation module, which outputs an evaluation score in the range [0,100], for example 65, as the performance sound quality evaluation result of the played piano for the selected music style.
The user can therefore compare the evaluation scores of several candidate pianos for the preferred music style, which helps in selecting the piano that truly suits the user and gives the best performance result; likewise, a piano manufacturer producing a personalized custom piano can use the system to assist in adjusting the parameters of the piano according to the user's preference, which removes a large part of the cost and time of manual evaluation.
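The comparison of candidate pianos amounts to ranking the scores that the system outputs for the same music style. A minimal sketch is given below; the piano names and scores in the usage note are invented for illustration.

```python
from typing import Dict, List, Tuple


def rank_pianos(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order candidate pianos from highest to lowest score for one music style."""
    for name, score in scores.items():
        # The evaluation module outputs scores in the range [0, 100].
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"score for {name} is outside [0, 100]")
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, `rank_pianos({"piano A": 65.0, "piano B": 72.5, "piano C": 58.0})` places piano B first, so piano B would be the suggested choice for that style.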
The above embodiments are preferred embodiments of the present invention, but the invention is not limited to them; any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the invention is to be regarded as an equivalent and falls within the scope of protection of the invention.