CN114332903A - Lute music score identification method and system based on end-to-end neural network

Info

Publication number: CN114332903A
Application number: CN202111458277.5A
Authority: CN
Inventors: 姚俊峰; 何瑞晨; 颜彬彬
Original assignee: Xiamen University
Current assignee: Xiamen University
Priority date: 2021-12-02
Filing date: 2021-12-02
Publication date: 2022-04-12

Abstract

The invention provides a lute music score recognition method and system based on an end-to-end neural network, belonging to the technical field of music score recognition. The method comprises the following steps: step S10, acquiring a large number of lute music score pictures and the MusicXML files corresponding to them, and preprocessing the lute music score pictures; step S20, cutting each lute music score picture line by line to obtain lute music score sub-pictures, and cutting the corresponding MusicXML file based on the lute music score sub-pictures to obtain MusicXML sub-files; step S30, converting each MusicXML sub-file into an MEI file, and converting each MEI file into a semantic file based on a preset semantic dictionary; step S40, establishing a lute music score recognition model based on an end-to-end neural network, and training the model with the lute music score sub-pictures and the semantic files; and step S50, recognizing lute music scores with the trained lute music score recognition model. The invention has the advantage that lute music scores can be recognized automatically, which greatly improves the efficiency of digitizing them.

Description

Lute music score identification method and system based on end-to-end neural network

Technical Field

The invention relates to the technical field of music score identification, in particular to a lute music score identification method and system based on an end-to-end neural network.

Background

A music score records music by means of symbols. Traditionally, music has mostly been passed on through handwritten scores, so digitizing and storing scores is important for protecting and spreading this musical heritage. However, transcribing scores manually one by one is time-consuming, laborious and error-prone, which creates a need for automatic music score recognition.

However, in the prior art, there is no method for automatically identifying the lute music score. Therefore, how to provide a lute music score recognition method and system based on an end-to-end neural network to realize automatic recognition of the lute music score becomes a technical problem to be solved urgently.

Disclosure of Invention

The invention aims to solve the technical problem of providing a lute music score recognition method and system based on an end-to-end neural network, and realizing automatic recognition of the lute music score.

In a first aspect, the invention provides a lute music score recognition method based on an end-to-end neural network, which comprises the following steps:

step S10, acquiring a large number of lute music score pictures and the MusicXML files corresponding to the lute music score pictures, and preprocessing the lute music score pictures;

step S20, cutting each lute music score picture line by line to obtain a plurality of lute music score sub-pictures, and cutting the corresponding MusicXML file based on the lute music score sub-pictures to obtain a plurality of MusicXML sub-files;

step S30, converting each MusicXML sub-file into an MEI file, and converting each MEI file into a semantic file based on a preset semantic dictionary;

step S40, establishing a lute music score recognition model based on an end-to-end neural network, and training the lute music score recognition model using the lute music score sub-pictures and the semantic files;

and step S50, automatically recognizing the lute music score by using the trained lute music score recognition model.

Further, in the step S10, the preprocessing of the lute music score pictures specifically includes:

sequentially carrying out graying, binarization, noise reduction and tilt correction on each lute music score picture.

Further, the step S20 is specifically:

identifying the top staff line of each line of music in each lute music score picture based on a template matching method, and cutting each lute music score picture line by line based on the top staff lines to obtain a plurality of lute music score sub-pictures;

identifying the bar lines (vertical lines) in each lute music score sub-picture, determining from the bar lines the number of measures contained in the lute music score sub-picture, and cutting the MusicXML file based on the number of measures and the tags carried by the MusicXML file to obtain a plurality of MusicXML sub-files.
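As a non-limiting illustration, the line cutting and the counting of measures can be sketched as follows. This is a minimal sketch under stated assumptions: the picture is already binarized into a 2-D list in which 1 means ink, a simple projection profile is used here in place of the template matching named above, and every measure is assumed to close with a bar line. The function names and the thresholds `line_ratio`, `max_gap`, `margin` and `bar_ratio` are hypothetical.

```python
def staff_line_rows(img, line_ratio=0.6):
    """Rows whose ink fraction is high enough to be a horizontal staff line."""
    width = len(img[0])
    return [y for y, row in enumerate(img) if sum(row) >= line_ratio * width]


def group_into_lines(rows, max_gap=4):
    """Merge staff-line rows that lie close together into (top, bottom) bands."""
    bands = []
    for y in rows:
        if bands and y - bands[-1][1] <= max_gap:
            bands[-1][1] = y          # still inside the current line of music
        else:
            bands.append([y, y])      # first staff line of a new line of music
    return [tuple(b) for b in bands]


def cut_by_lines(img, max_gap=4, margin=1):
    """Cut the page into one sub-picture per line of music."""
    subs = []
    for top, bottom in group_into_lines(staff_line_rows(img), max_gap):
        subs.append(img[max(0, top - margin):min(len(img), bottom + margin + 1)])
    return subs


def count_measures(sub, bar_ratio=0.75):
    """Count runs of near-full-height ink columns, taken here as bar lines."""
    height = len(sub)
    is_bar = [sum(row[x] for row in sub) >= bar_ratio * height
              for x in range(len(sub[0]))]
    # A bar line may be several pixels wide, so only rising edges are counted.
    return sum(1 for x, b in enumerate(is_bar) if b and (x == 0 or not is_bar[x - 1]))
```

In practice the margin would have to be large enough to keep notes and fingering marks that sit above or below the staff; the small default here only keeps the sketch short.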

Further, the step S30 is specifically:

converting each MusicXML subfile into an MEI file through an OMR tool website, and converting each MEI file into a semantic file based on a preset semantic dictionary;

the semantic file is a note information sequence comprising note information, fingering information and rhythm information.

Further, the step S40 is specifically:

establishing a lute music score recognition model based on an end-to-end convolutional recurrent neural network, taking each lute music score sub-picture as the input of the lute music score recognition model, taking the semantic file corresponding to each lute music score sub-picture as the output of the lute music score recognition model, and training the lute music score recognition model;

wherein a CTC function is adopted as the loss function of the lute music score recognition model.

In a second aspect, the invention provides a lute music score recognition system based on an end-to-end neural network, which comprises the following modules:

the lute music score picture and MusicXML file acquisition module is used for acquiring a large number of lute music score pictures and the MusicXML files corresponding to the lute music score pictures, and for preprocessing the lute music score pictures;

the lute music score picture and MusicXML file cutting module is used for cutting each lute music score picture line by line to obtain a plurality of lute music score sub-pictures, and for cutting the corresponding MusicXML file based on the lute music score sub-pictures to obtain a plurality of MusicXML sub-files;

the semantic conversion module is used for converting each MusicXML subfile into an MEI file and converting each MEI file into a semantic file based on a preset semantic dictionary;

the lute music score recognition model training module is used for establishing a lute music score recognition model based on an end-to-end neural network and training the lute music score recognition model using the lute music score sub-pictures and the semantic files;

and the lute music score automatic recognition module is used for automatically recognizing lute music scores using the trained lute music score recognition model.

Further, in the lute music score picture and MusicXML file acquisition module, the preprocessing of each lute music score picture specifically comprises:

Further, the lute music score picture and MusicXML file cutting module specifically comprises:

Further, the semantic conversion module is specifically:

Further, the lute music score recognition model training module specifically comprises:

The invention has the advantages that:

1. Each lute music score picture is cut line by line into a plurality of lute music score sub-pictures; the corresponding MusicXML file is cut based on those sub-pictures and converted into MEI files; and each MEI file is converted into a semantic file comprising note information, fingering information and rhythm information. The lute music score recognition model is trained on the sub-pictures and the semantic files, and the trained model then recognizes lute music scores automatically, which greatly improves the efficiency of digitizing lute music scores compared with traditional manual transcription.

2. Cutting the lute music score pictures line by line and preprocessing them by graying, binarization, noise reduction and tilt correction greatly improves recognition accuracy compared with recognizing a whole lute music score picture directly.

Drawings

The invention is further described below by way of embodiments with reference to the accompanying drawings.

Fig. 1 is a flow chart of a lute score identification method based on an end-to-end neural network.

Fig. 2 is a schematic structural diagram of a lute score recognition system based on an end-to-end neural network.

Fig. 3 is a schematic structural diagram of the lute music score recognition model of the invention.

Detailed Description

The general idea of the technical scheme in the embodiments of the application is as follows: the lute music score pictures are cut line by line to obtain a plurality of lute music score sub-pictures; the MusicXML files corresponding to the lute music score sub-pictures are cut and semantically converted to obtain semantic files; a lute music score recognition model is trained on the lute music score sub-pictures and the semantic files; and finally the trained lute music score recognition model is used to recognize lute music scores automatically.

Referring to fig. 1 to fig. 3, a lute score recognition method according to a preferred embodiment of the present invention includes the following steps:

MusicXML (Music Extensible Markup Language) is an open, XML-based music notation file format used for exchanging and distributing music; MusicXML aims to provide a universal music notation format;

and step S50, automatically recognizing the lute music score using the trained lute music score recognition model; that is, the lute music score picture to be recognized is cut line by line, preprocessed, and input into the lute music score recognition model for recognition.
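By way of illustration, a model trained with a CTC loss emits one score vector per image column (frame), and these frames have to be collapsed into a note information sequence. The sketch below uses greedy (best-path) decoding, which is an assumption made here and is not stated in the source; the function name, the blank index 0 and the `id_to_token` table are likewise hypothetical.

```python
def ctc_greedy_decode(frame_scores, id_to_token, blank=0):
    """Collapse per-frame scores into a token sequence (best-path decoding).

    frame_scores: one list of class scores per frame.
    id_to_token:  mapping from class id to semantic token (hypothetical).
    """
    # Pick the highest-scoring class in every frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    tokens, prev = [], blank
    for k in best:
        # Drop blanks and merge repeats that are not separated by a blank.
        if k != blank and k != prev:
            tokens.append(id_to_token[k])
        prev = k
    return tokens
```

A repeated symbol survives decoding only when a blank frame separates its two occurrences, which is why the blank class is needed.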

The lute music score recognition model differs from other models in how it is trained: it does not expect the data set to provide the positions of the music symbols in the lute music score. From a musical perspective, the position of each music symbol in the lute music score picture does not need to be retrieved; only the corresponding note information, fingering information and rhythm information in the lute music score (staff) need to be known. It is therefore sufficient to recover the note information sequence in context and to output that single sequence. Every item in the note information sequence is explained in the semantic dictionary, so a user can use the dictionary to convert the sequence into control commands for a robot.
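For illustration only, the conversion of a note information sequence into robot control commands might look like the sketch below. The source does not disclose the token format or the command format, so both are assumptions: tokens are taken to look like `note-C4_quarter_fing-tan`, a quarter note is taken to last one beat, and the command is a plain dictionary.

```python
# Hypothetical rhythm table: beats per duration name, assuming quarter = 1 beat.
BEATS = {"whole": 4.0, "half": 2.0, "quarter": 1.0, "eighth": 0.5, "sixteenth": 0.25}


def token_to_command(token):
    """Turn one semantic token into a command dict, or None for a bar line."""
    if token == "barline":
        return None
    head, duration, fingering = token.split("_")
    return {
        "pitch": head.split("-", 1)[1],           # "note-C4"  -> "C4"
        "beats": BEATS[duration],                 # "quarter"  -> 1.0
        "fingering": fingering.split("-", 1)[1],  # "fing-tan" -> "tan"
    }


def sequence_to_commands(tokens):
    """Convert a whole note information sequence, skipping bar lines."""
    commands = (token_to_command(t) for t in tokens)
    return [c for c in commands if c is not None]
```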

In the step S10, the preprocessing of the lute music score pictures specifically includes:

sequentially carrying out graying, binarization, noise reduction and tilt correction on each lute music score picture. Graying converts a color lute music score picture into a grayscale image with a single channel; binarization converts the lute music score picture into pure black and white; noise reduction removes noise points from the lute music score picture; and tilt correction straightens the picture so that each line of the score can be identified and cut more reliably.
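The first three preprocessing operations can be sketched as follows. This is a minimal illustration under assumptions not taken from the source: the picture is a list of rows of (r, g, b) tuples, graying uses the common luma weights 0.299/0.587/0.114, binarization uses a fixed threshold, and noise reduction simply clears ink pixels that have no ink neighbor. Tilt correction is omitted to keep the sketch short, and all function names are hypothetical.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to rows of 0..255 gray values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def binarize(gray, threshold=128):
    """Map dark pixels to 1 (ink) and light pixels to 0 (paper)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def remove_specks(binary):
    """Clear ink pixels that have no ink among their 8 neighbors."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            # Sum the 3x3 window (clipped at the border), minus the pixel itself.
            neighbors = sum(binary[ny][nx]
                            for ny in range(max(0, y - 1), min(h, y + 2))
                            for nx in range(max(0, x - 1), min(w, x + 2))) - 1
            if neighbors == 0:
                out[y][x] = 0
    return out


def preprocess(rgb):
    """Graying, binarization and noise reduction, applied in that order."""
    return remove_specks(binarize(to_gray(rgb)))
```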

The step S20 specifically includes:

identifying the bar lines (vertical lines) in each lute music score sub-picture, determining from the bar lines the number of measures contained in the lute music score sub-picture, and cutting the MusicXML file based on the number of measures and the tags carried by the MusicXML file to obtain a plurality of MusicXML sub-files. That is, the MusicXML file is cut using the correspondence between the number attribute of its measure tags and the counted number of measures.
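Cutting the MusicXML file by measure can be sketched as below. The sketch assumes a `score-partwise` MusicXML document with a single `part`, and assumes the number of measures in each line of the picture has already been counted; the function name and its `measures_per_line` parameter are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def split_musicxml(xml_text, measures_per_line):
    """Return one MusicXML string per line of music.

    measures_per_line: number of measures counted in each sub-picture,
    in reading order (assumed to be known already).
    """
    root = ET.fromstring(xml_text)
    measures = root.find("part").findall("measure")
    sub_files, start = [], 0
    for count in measures_per_line:
        # Measure numbers (the "number" attribute) that belong to this line.
        keep = {m.get("number") for m in measures[start:start + count]}
        start += count
        sub_root = copy.deepcopy(root)      # keeps the header and part-list
        sub_part = sub_root.find("part")
        for m in sub_part.findall("measure"):
            if m.get("number") not in keep:
                sub_part.remove(m)
        sub_files.append(ET.tostring(sub_root, encoding="unicode"))
    return sub_files
```

Clef, key and time signature are normally written only in the first measure, so a fuller implementation would probably copy those attributes into the first measure of every sub-file.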

The step S30 specifically includes:

converting each MusicXML subfile into an MEI file through an OMR tool website, and converting each MEI file into a semantic file based on a preset semantic dictionary; the semantic dictionary stores the one-to-one correspondence of each measure of the lute music score with note information, fingering information and rhythm information;

the semantic file is a note information sequence comprising note information, fingering information and rhythm information, and the semantic file is a complete data unit in a data set and is used for training the lute music score recognition model.
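As an illustration of how a semantic dictionary can turn score events into a note information sequence, consider the sketch below. The content of the actual dictionary is not disclosed in the source, so everything here is an assumption: events are taken to be already extracted from the MEI file as dictionaries, the duration codes follow the usual 1/2/4/8/16 convention, the fingering labels are placeholders, and the token layout is hypothetical. Index 0 is left free in the vocabulary for the CTC blank.

```python
# Hypothetical semantic dictionary entries for rhythm information.
DURATION_NAMES = {"1": "whole", "2": "half", "4": "quarter",
                  "8": "eighth", "16": "sixteenth"}


def to_semantic_tokens(events):
    """Map extracted score events to semantic tokens, one token per event."""
    tokens = []
    for ev in events:
        if ev["type"] == "barline":
            tokens.append("barline")
        else:
            pitch = f"{ev['pname'].upper()}{ev['oct']}"       # note information
            rhythm = DURATION_NAMES[ev["dur"]]                # rhythm information
            tokens.append(f"note-{pitch}_{rhythm}_fing-{ev['fing']}")
    return tokens


def build_vocabulary(token_sequences):
    """Assign each distinct token an integer id, starting at 1."""
    vocab = {}
    for seq in token_sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab) + 1)  # 0 is reserved for the blank
    return vocab


def encode(tokens, vocab):
    """Turn a token sequence into the integer labels used for training."""
    return [vocab[t] for t in tokens]
```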

The OMR tool is a set of highly integrated open-source C and C++ components that can be used to build runtimes for many languages and that supports a number of different hardware and operating system platforms at run time; these components include, but are not limited to: memory management, thread processing, a platform port (abstraction) library, diagnostic support, monitoring support, garbage collection, and native just-in-time compilation.

The step S40 specifically includes:

establishing a lute music score recognition model based on an end-to-end convolutional recurrent neural network (CRNN), taking each lute music score sub-picture as the input of the lute music score recognition model, taking the semantic file corresponding to each lute music score sub-picture as the output of the lute music score recognition model, and performing supervised training on the lute music score recognition model;

The convolutional recurrent neural network is formed by connecting the output of the last CNN layer to the input of the first RNN layer: all output channels of the convolutional part are concatenated into a single image, and each column of that image is treated as one unit of the sequence passed on from the convolutional block. Regarding recognition accuracy, in the end-to-end setting the training set provides, for each music score, only the expected note information sequence (the semantic file), without the position of that sequence in the image; a CTC function is therefore adopted as the loss function. The CTC (Connectionist Temporal Classification) function removes the need to align input and output manually and is well suited to tasks such as speech recognition and OCR.
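To make the role of the CTC loss concrete, the sketch below computes the CTC negative log-likelihood of a label sequence with the standard forward recursion. It is an illustrative reference computation rather than the training code of the invention: it works on plain probabilities instead of log-probabilities (so it would underflow on long inputs), and the function name and the blank index 0 are assumptions.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """CTC loss for one sample via the forward algorithm.

    probs:  probs[t][k] is the softmax probability of class k at frame t.
    target: label ids without blanks, e.g. the encoded semantic file.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank].
    ext = [blank]
    for label in target:
        ext += [label, blank]
    n_states, n_frames = len(ext), len(probs)

    alpha = [0.0] * n_states
    alpha[0] = probs[0][ext[0]]
    if n_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, n_frames):
        new = [0.0] * n_states
        for s in range(n_states):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if n_states > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)
```

The recursion sums over every frame-level alignment that collapses to the target, which is why no symbol positions are needed in the training data.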

The invention discloses a preferred embodiment of a lute music score recognition system based on an end-to-end neural network, which comprises the following modules:

the lute music score automatic recognition module is used for automatically recognizing lute music scores using the trained lute music score recognition model; that is, the lute music score picture to be recognized is cut line by line, preprocessed, and input into the lute music score recognition model for recognition.

In the lute music score picture and MusicXML file acquisition module, the preprocessing of the lute music score pictures specifically comprises:

The lute music score picture and MusicXML file cutting module specifically comprises:

The semantic conversion module is specifically as follows:

The lute music score recognition model training module specifically comprises:

In summary, the invention has the advantages that:

Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims

1. A lute music score recognition method based on an end-to-end neural network, characterized in that the method comprises the following steps:

2. The lute music score recognition method based on the end-to-end neural network as claimed in claim 1, wherein: in the step S10, the preprocessing of the lute music score pictures specifically includes:

3. The lute music score recognition method based on the end-to-end neural network as claimed in claim 1, wherein: the step S20 specifically includes:

4. The lute music score recognition method based on the end-to-end neural network as claimed in claim 1, wherein: the step S30 specifically includes:

5. The lute music score recognition method based on the end-to-end neural network as claimed in claim 1, wherein: the step S40 specifically includes:

6. A lute music score recognition system based on an end-to-end neural network is characterized in that: the system comprises the following modules:

7. The lute music score recognition system based on end-to-end neural network as claimed in claim 6, wherein: in the lute music score picture and MusicXML file acquisition module, the preprocessing of the lute music score pictures specifically comprises:

8. The lute music score recognition system based on end-to-end neural network as claimed in claim 6, wherein: the lute music score picture and MusicXML file cutting module specifically comprises:

9. The lute music score recognition system based on end-to-end neural network as claimed in claim 6, wherein: the semantic conversion module is specifically as follows:

10. The lute music score recognition system based on end-to-end neural network as claimed in claim 6, wherein: the lute music score recognition model training module specifically comprises: