CN113076967B

CN113076967B - Image and audio-based music score dual-recognition system

Info

Publication number: CN113076967B
Application number: CN202011420871.0A
Authority: CN
Inventors: 袁存鼎; 秦兴辰; 黄煌
Original assignee: Wuxi Leqi Technology Co ltd
Current assignee: Shanghai Leqiai Technology Co ltd
Priority date: 2020-12-08
Filing date: 2020-12-08
Publication date: 2022-09-23
Anticipated expiration: 2040-12-08
Also published as: CN113076967A

Abstract

The invention discloses a music score dual-recognition system based on images and audio, belonging to the technical field of music theory. The system recognizes the original bitmap of a paper music score by combining image recognition and audio recognition, which generate first note information and second note information respectively. The two are converted to the same format and matched: where they agree, the note information is confirmed; where they differ, the user checks the original bitmap and confirms whichever of the first or second note information is correct. The resulting music score is then output. The invention addresses two limitations of the prior art: recognition by image alone is difficult to verify, and recognition by audio alone cannot produce a music score image. When a large number of music scores must be recognized, manual one-by-one rechecking is no longer needed; a second, manual recheck is required only where the first, audio-assisted check finds a discrepancy. This greatly improves recognition accuracy and efficiency while reducing cost.

Description

Image and audio-based music score dual-recognition system

Technical Field

The invention relates to the technical field of music theory, and in particular to a music score dual-recognition system based on images and audio.

Background

With the development of science and technology, daily life is gradually going paperless, since electronic documents are easier to preserve and share than traditional paper materials. In the field of music theory, the prior art generally digitizes existing paper music scores by image recognition. For example, Chinese invention patent application No. 201810193256.7 provides a music score recognition system and method that outputs the final music score through an image input module, an image preprocessing module, a low-rank image module, a difference image module, a staff-line generation module, a staff-line deletion module, a note image module, a note comparison and recognition module, and a note output module. Chinese invention patent publication No. CN106446952B provides a music score image recognition method and device that: obtains a staff image to be processed; describes the edge information of the staff image with an edge detection method and detects the position coordinates of the staff lines with a straight-line detection method; locates and segments the notes in the staff image with a preset note classifier to obtain the position of each complete note; uses a preset convolutional neural network to recognize each segmented note head, determining whether it is solid or hollow and obtaining its position; and identifies each complete note from the staff-line coordinates, the relative position of each complete note, whether its note head is solid or hollow, and the note-head position, finally outputting the music score. Obtaining a final score by image recognition is therefore a relatively mature technology. However, image recognition accuracy often falls short of 100%, so recognition failures or recognition errors can occur. When a large number of scores must be recognized, manual checking is time-consuming, labor-intensive, and inefficient. Improving verification efficiency and recognition accuracy when digitizing paper scores is therefore particularly important.

Disclosure of Invention

To solve the prior-art problem that, when a music score is recognized from an image, a blurred image easily fails to be recognized or is recognized incorrectly, the invention adds audio-assisted recognition on top of image recognition, improving the accuracy of music score recognition.

In view of the above, the present invention provides an image and audio based score dual recognition system, comprising:

an image input module, configured to receive an input music score image and transmit it to the image recognition module;

an image recognition module, configured to recognize the information in the original bitmap music score from the image input module, generate a recognition image, and obtain the corresponding first note information, so that the music score information is obtained by image recognition; it specifically comprises an image preprocessing module, a low-rank image module, a difference image module, a staff-line generation module, a staff-line deletion module, a note image module, and a note comparison and recognition module;

an audio recognition module, configured to obtain second note information corresponding to the original bitmap music score from original audio information. Audio alone yields only the note information and the duration of each note and cannot produce an image, so the audio recognition module is connected to the image recognition module: once the image recognition module has obtained the tempo mark, the tempo mark information is passed to the audio recognition module, which generates vectorized note information matching the tempo mark from the tempo mark information and the duration of each played note, and forms a corresponding recognition image;

a calibration module, configured to calibrate the first note information recognized from the image against the second note information recognized from the audio, producing first and second note information that correspond to each other and are identical;

a music theory analysis module, configured to generate the corresponding music score vector graphic by music theory analysis of the calibrated note information;

and a music score output module, configured to output the music score vector graphic obtained by the music theory analysis module.

As a preferred embodiment, the calibration module comprises a matching sub-module and a prompting sub-module. The matching sub-module matches the first note information against the second note information, passes note information that matches to the music score output module, and passes information that does not match to the prompting sub-module. The prompting sub-module lets the user check the note information and select whichever of the first or second note information agrees with the original bitmap.
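The routing between the matching sub-module and the prompting sub-module described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Note` type, the `calibrate` function, and the `choose` callback (which stands in for the user's decision) are hypothetical names, and the sketch assumes the two note sequences are already aligned one-to-one.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Note:
    """Hypothetical common note format shared by both recognizers."""
    pitch: str    # e.g. "C4"
    beats: float  # note value in beats


# Stand-in for the prompting sub-module: given the position and both
# candidates, return the one the user confirms against the original bitmap.
Chooser = Callable[[int, Note, Note], Note]


def calibrate(first: List[Note], second: List[Note],
              choose: Chooser) -> Tuple[List[Note], List[int]]:
    """Match image-derived and audio-derived notes position by position.

    Matching notes are confirmed directly; differing notes are routed to
    `choose`. Returns the confirmed notes and the prompted positions.
    """
    if len(first) != len(second):
        # Assumption of this sketch: sequences are already aligned.
        raise ValueError("sequences must have the same length")
    confirmed: List[Note] = []
    prompted: List[int] = []
    for i, (a, b) in enumerate(zip(first, second)):
        if a == b:
            confirmed.append(a)
        else:
            prompted.append(i)
            confirmed.append(choose(i, a, b))
    return confirmed, prompted


image_notes = [Note("C4", 1.0), Note("E4", 1.0), Note("G4", 2.0)]
audio_notes = [Note("C4", 1.0), Note("F4", 1.0), Note("G4", 2.0)]

# Here the "user" always trusts the audio result; a real prompt would ask.
result, prompted = calibrate(image_notes, audio_notes, lambda i, a, b: b)
```

Only the second note differs in this example, so only that position would be shown to the user; the other notes pass straight through.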

As a preferred embodiment, the image input module is provided with a camera or a scanner for obtaining the original bitmap of the music score by photographing or scanning; once the original bitmap is input to the image recognition module, a corresponding vector graphic can be obtained.

As a preferred embodiment, the first note information includes a tempo mark, notes, bar lines, repeat signs, ending marks, a time signature, and an anacrusis (pickup), and the second note information includes notes and the corresponding note playing durations.

In a preferred embodiment, the second note information is used to generate a corresponding comparison image by means of the tempo mark and the time signature of the first note information.
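To illustrate how a tempo (speed) mark and time signature let a played duration be turned into a written note value, the following Python sketch may help. It is an assumption-laden illustration rather than the method of the invention: the function names, the table of note values, and the convention that the tempo counts the time-signature beat unit are all hypothetical choices made here.

```python
# Note values expressed in quarter-note units (assumed table, no dotted notes).
NOTE_VALUES = {
    4.0: "whole",
    2.0: "half",
    1.0: "quarter",
    0.5: "eighth",
    0.25: "sixteenth",
}


def duration_to_quarters(seconds: float, bpm: float, beat_unit: int = 4) -> float:
    """Convert a played duration to quarter-note units.

    bpm is assumed to count the beat unit of the time signature
    (beat_unit=4 for x/4 time, beat_unit=8 for x/8 time).
    """
    beats = seconds * bpm / 60.0
    return beats * 4.0 / beat_unit


def note_value(seconds: float, bpm: float, beat_unit: int = 4) -> str:
    """Quantize a played duration to the nearest note value in NOTE_VALUES."""
    quarters = duration_to_quarters(seconds, bpm, beat_unit)
    nearest = min(NOTE_VALUES, key=lambda v: abs(v - quarters))
    return NOTE_VALUES[nearest]


# A note held 0.48 s at 120 beats per minute is about 0.96 beats,
# which quantizes to a quarter note.
print(note_value(0.48, 120))
```

Quantizing to the nearest value absorbs small timing deviations in the performance, which is why the tempo information from the image side is needed before the audio durations can be drawn as notes.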

As a preferred embodiment, the recognition image includes only note information and does not include staff lines, bar lines, repeat signs, ending marks, the time signature, or the anacrusis.

As a preferred implementation, the music score output module is provided with an electronic display screen for displaying the output music score.

In summary, the invention has the following beneficial effects. After the information in the music score bitmap has been recognized from the image, a corresponding comparison image is generated with the help of audio recognition, which overcomes the limitation that conventional audio recognition alone cannot produce an image. The note information produced by audio recognition thus supports the recognition image produced by image recognition and identifies the parts that image recognition cannot recognize accurately, effectively addressing recognition errors in image recognition. When a large number of music scores must be recognized, one-by-one manual review is avoided, recognition accuracy and efficiency are greatly improved, and cost is reduced.

Drawings

FIG. 1 is a schematic diagram of the recognition system according to the present invention.

FIG. 2 is a schematic diagram of the calibration module according to the present invention.

FIG. 3 is a chart of the frequencies corresponding to the piano keys.

Detailed Description

The present invention is described in further detail below with reference to figures 1-3.

Taking a piano score as an example, FIGS. 1-2 are schematic diagrams of the image and audio-based music score dual-recognition system provided by the invention. The system comprises an image input module, which is provided with a camera or a scanner for obtaining the original bitmap of the music score by photographing or scanning; once the original bitmap is input to the image recognition module, a corresponding vector graphic can be obtained;

an image recognition module, which recognizes the information in the original bitmap music score from the image input module, generates a recognition image, and obtains the corresponding first note information, so that the music score information is obtained by image recognition; it comprises an image preprocessing module, a low-rank image module, a difference image module, a staff-line generation module, a staff-line deletion module, a note image module, and a note comparison and recognition module;

an audio recognition module, which obtains second note information corresponding to the original bitmap music score from original audio information, the second note information comprising notes and the corresponding note playing durations. Audio alone yields only the note information and the duration of each note and cannot produce an image, so the audio recognition module is connected to the image recognition module: once the image recognition module has obtained the tempo mark, the tempo mark information is passed to the audio recognition module, which generates vectorized note information matching the tempo mark from the tempo mark information and the duration of each played note, and forms a corresponding recognition image. The recognition image includes only note information and does not include staff lines, bar lines, repeat signs, ending marks, the time signature, or the anacrusis;

a calibration module, which comprises a matching sub-module and a prompting sub-module. The matching sub-module matches the first note information against the second note information and passes note information that matches to the music score output module; for information that does not match, it obtains the position of the corresponding first note information and reports that position to the prompting sub-module. The prompting sub-module lets the user check the note information and select the first note information that agrees with the original bitmap;

and a music score output module, which is provided with an electronic display screen for displaying the music score obtained through recognition and calibration.
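For the audio recognition module, a detected pitch has to be related to a piano key, which is the kind of correspondence FIG. 3 tabulates. The figure's values are not reproduced in the text, so the Python sketch below assumes standard 12-tone equal temperament on an 88-key piano with A4 (key 49) tuned to 440 Hz; the function names are hypothetical.

```python
import math

A4_KEY = 49       # A4 is the 49th key of a standard 88-key piano
A4_FREQ = 440.0   # assumed concert pitch in Hz


def key_frequency(key: int) -> float:
    """Frequency in Hz of piano key number `key` (1..88), equal temperament."""
    if not 1 <= key <= 88:
        raise ValueError("key must be in 1..88")
    return A4_FREQ * 2.0 ** ((key - A4_KEY) / 12.0)


def frequency_to_key(freq: float) -> int:
    """Nearest piano key number for a detected fundamental frequency."""
    if freq <= 0:
        raise ValueError("frequency must be positive")
    key = round(12.0 * math.log2(freq / A4_FREQ)) + A4_KEY
    if not 1 <= key <= 88:
        raise ValueError("frequency is outside the piano range")
    return key


# Middle C is key 40; a slightly sharp 262 Hz reading still maps to it.
middle_c = frequency_to_key(262.0)
```

Rounding to the nearest key means a slightly mistuned instrument still resolves to the intended note, at the cost of being unable to represent pitches between keys.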

When a user needs to analyze a music score to obtain a vector-graphic score that a computer can process, the original bitmap of the score is first analyzed by image recognition: after image preprocessing, symbol recognition is performed, and the recognized symbols are parsed into formats such as pitch (the correspondence for the notes is shown in FIG. 3). The notes recognized from the image are stored in a database and compared with the note database obtained by audio recognition. The two databases are compared in the same format, for example [key, energy value, start frame number, end frame number]. Entries found to be identical are output from the matching sub-module: the calibration module passes the first note information of the successfully matched notes to the music score output module, which outputs it as a vector graphic. When matching fails, the entries that failed the comparison go to the prompting sub-module, which feeds the first note information and the second note information separately to the music score output module. The music score output module prompts the user, who determines whether the first or the second note information is accurate, and the correct note information is then output through keyed input.
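The comparison in a shared format such as [key, energy value, start frame number, end frame number] can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how image-derived notes acquire frame numbers or energy values, so the sketch converts beats to frames using the tempo, sets the image-side energy to 0.0, ignores energy when comparing, and allows a small frame tolerance. The names `Record`, `image_notes_to_records`, and `mismatches`, and the tolerance value, are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Record(NamedTuple):
    key: int          # piano key number
    energy: float     # energy value (only meaningful for audio)
    start_frame: int
    end_frame: int


def image_notes_to_records(notes: List[Tuple[int, float]], bpm: float,
                           frames_per_second: float) -> List[Record]:
    """Turn (key, beats) pairs from image recognition into frame records.

    Assumes a single melodic line played back to back at a constant tempo.
    """
    frames_per_beat = frames_per_second * 60.0 / bpm
    records: List[Record] = []
    position = 0.0  # running position in beats
    for key, beats in notes:
        start = round(position * frames_per_beat)
        position += beats
        end = round(position * frames_per_beat)
        records.append(Record(key, 0.0, start, end))
    return records


def mismatches(image: List[Record], audio: List[Record],
               tolerance_frames: int = 5) -> List[int]:
    """Positions where the two databases disagree on key or timing."""
    failed: List[int] = []
    for i in range(max(len(image), len(audio))):
        if i >= len(image) or i >= len(audio):
            failed.append(i)  # one side has no entry at this position
            continue
        a, b = image[i], audio[i]
        same = (a.key == b.key
                and abs(a.start_frame - b.start_frame) <= tolerance_frames
                and abs(a.end_frame - b.end_frame) <= tolerance_frames)
        if not same:
            failed.append(i)
    return failed


image_records = image_notes_to_records([(40, 1.0), (44, 1.0), (47, 2.0)],
                                       bpm=120, frames_per_second=100)
audio_records = [Record(40, 0.8, 2, 48), Record(45, 0.7, 51, 99),
                 Record(47, 0.9, 101, 203)]
failed = mismatches(image_records, audio_records)
```

In this example only the second entry is reported, because its key differs; the small timing offsets in the other entries fall within the tolerance and count as matches.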

This embodiment only explains the present invention and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present invention.

Claims

1. An image and audio-based music score dual-recognition system, comprising:

an image input module, configured to receive an input music score image and transmit it to the image recognition module; an image recognition module, configured to recognize the information in the original bitmap music score from the image input module, generate a recognition image, and obtain corresponding first note information; the first note information comprises a tempo mark, notes, bar lines, repeat signs, ending marks, a time signature and an anacrusis;

a music theory analysis module, configured to store the first note information in vector form through music theory analysis;

an audio recognition module, configured to obtain second note information corresponding to the original bitmap music score by acquiring original audio information, the second note information comprising notes and corresponding note playing durations; the audio recognition module is connected to the image recognition module, and after the tempo mark is obtained from the image recognition module, the tempo mark information is transmitted to the audio recognition module, so that vectorized note information matching the tempo mark information is generated from the tempo mark information and the duration information corresponding to each played note;

a calibration module, configured to calibrate the first note information recognized from the image against the second note information recognized from the audio, generate corresponding identical note information, and output it to the music score output module; the calibration module comprises a matching sub-module and a prompting sub-module, the matching sub-module being configured to match the first note information against the second note information, input note information that matches to the music score output module, and input information that does not match to the prompting sub-module, and the prompting sub-module being configured for the user to check the note information;

and a music score output module, configured to output, as a vector graphic, the first note information corresponding to the note information obtained by the calibration module.

2. The image and audio based score dual recognition system of claim 1, wherein: the image input module is provided with a camera or a scanner and is used for obtaining an original bitmap of the music score through shooting or scanning.

3. The image and audio based score dual recognition system of claim 2, wherein: the first note information comprises a tempo mark, notes, bar lines, repeat signs, ending marks, a time signature and an anacrusis, and the second note information comprises notes and corresponding note playing durations.

4. The image and audio based score dual recognition system of claim 3, wherein: the second note information is used to generate a corresponding comparison image by means of the tempo mark and the time signature of the first note information.

5. The image and audio based score dual recognition system of claim 1, wherein: the recognition image includes only notes, and does not include staff lines, bar lines, repeat signs, ending marks, the time signature, or the anacrusis.

6. The image and audio based score dual recognition system of claim 1, wherein: the music score output module is provided with an electronic display screen for displaying the output music score.