CN113076967B - Image and audio-based music score dual-recognition system - Google Patents

Info

Publication number
CN113076967B
Authority
CN
China
Prior art keywords
module
image
note information
note
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011420871.0A
Other languages
Chinese (zh)
Other versions
CN113076967A (en)
Inventor
袁存鼎
秦兴辰
黄煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Leqiai Technology Co ltd
Original Assignee
Wuxi Leqi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Leqi Technology Co ltd
Priority to CN202011420871.0A
Publication of CN113076967A
Application granted
Publication of CN113076967B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Acoustics & Sound (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention discloses a music score dual-recognition system based on images and audio, belonging to the technical field of music theory. The system recognizes the original bitmap of a paper music score by combining image recognition and audio recognition, which generate first note information and second note information respectively. The two are matched in the same format: where they agree, the note information is confirmed; where they differ, the user checks the original bitmap and confirms whether the first or the second note information is correct. The generated music score is then output. The invention addresses two limitations of the prior art: results obtained by image recognition alone are difficult to verify, and audio recognition alone cannot produce a music score image. When a large number of music scores must be recognized, manual one-by-one rechecking is no longer needed; manual secondary rechecking is required only where the audio-assisted first round of checking finds a discrepancy. This greatly improves recognition accuracy and efficiency while reducing cost.

Description

Image and audio-based music score dual-recognition system
Technical Field
The invention relates to the technical field of music theory, and in particular to a music score dual-recognition system based on images and audio.
Background
With the development of science and technology, everyday life is gradually becoming paperless. Paperless materials are easy to preserve and easy to share, and compared with traditional paper materials they are more distinctly electronic in character. In the technical field of music theory, the prior art generally uses image recognition to digitize existing paper music scores. For example, Chinese invention patent application No. 201810193256.7 provides a music score recognition system and recognition method that produces the final music score through an image input module, an image preprocessing module, a low-rank image module, a difference image module, a staff line generation module, a staff line deletion module, a note image module, a note comparison and recognition module, and a note output module. Chinese invention patent publication No. CN106446952B provides a music score image recognition method and device that works as follows: a staff image to be processed is obtained; the edge information of the staff image is described using an edge detection method, and the position coordinates of the staff lines are detected using a straight-line detection method; a preset note classifier locates and segments the notes in the staff image to obtain the position of each complete note; a preset convolutional neural network recognizes each segmented note head, determines whether it is solid or hollow, and obtains its position; and each complete note is identified from the position coordinates of the staff lines, the relative position of each complete note, and the type (solid or hollow) and position of its note head, after which the music score is output.
Obtaining the final music score output through image recognition is therefore a relatively mature technology. However, image recognition accuracy often falls short of 100%, so recognition failures or recognition errors can occur. When a large number of music scores must be recognized, manual checking and verification is time-consuming, labor-intensive, and inefficient. Improving verification efficiency and recognition accuracy when digitizing paper music scores is therefore particularly important.
Disclosure of Invention
To solve the prior-art problem that, when a music score is recognized from an image, blurred images are difficult to recognize and recognition errors readily occur, the invention adds audio-assisted recognition on top of image recognition, improving the accuracy of music score recognition.
In view of the above, the present invention provides an image and audio-based music score dual-recognition system, comprising:
the image input module is used for receiving an input music score image and transmitting it to the image recognition module;
the image recognition module is used for recognizing the information in the original bitmap music score from the image input module, generating a recognition image, and obtaining the corresponding first note information, so that the music score information is acquired by image recognition; it specifically comprises an image preprocessing module, a low-rank image module, a difference image module, a staff line generation module, a staff line deletion module, a note image module, and a note comparison and recognition module;
the audio recognition module is used for obtaining second note information corresponding to the original bitmap music score by acquiring the original audio information. Audio alone yields only the notes and the duration of each note; it cannot generate an image. The audio recognition module is therefore connected to the image recognition module: once the image recognition module has obtained the tempo marking, the tempo marking information is transmitted to the audio recognition module, which generates vectorized note information matching the tempo marking from the tempo marking information and the duration of each played note, and forms a corresponding recognition image;
the calibration module is used for calibrating the first note information recognized from the image against the second note information recognized from the audio, to produce note information on which the two correspond and agree;
the music theory analysis module is used for generating a corresponding music score vector diagram from the calibrated note information through music theory analysis;
and the music score output module is used for outputting the music score vector diagram obtained by the music theory analysis module.
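As a rough illustration of how a tempo marking obtained from the image could turn audio durations into notated note values, the following Python sketch maps a played duration to the nearest note value. It is not taken from the patent: the `NOTE_VALUES` table, the function name `duration_to_note_value`, and the assumption that the tempo marking gives quarter-note beats per minute are all hypothetical.

```python
# Hypothetical table of note values, expressed in quarter-note beats.
# Dotted notes and tuplets are deliberately left out of this sketch.
NOTE_VALUES = {
    "whole": 4.0,
    "half": 2.0,
    "quarter": 1.0,
    "eighth": 0.5,
    "sixteenth": 0.25,
}


def duration_to_note_value(seconds: float, bpm: float) -> str:
    """Map a played duration to the nearest note value at the given tempo.

    Assumption: `bpm` counts quarter notes per minute, as a tempo marking
    such as "quarter = 120" would indicate.
    """
    beats = seconds * bpm / 60.0  # duration expressed in quarter-note beats
    return min(NOTE_VALUES, key=lambda name: abs(NOTE_VALUES[name] - beats))


if __name__ == "__main__":
    # At 120 beats per minute one quarter note lasts half a second.
    print(duration_to_note_value(0.5, 120))
    print(duration_to_note_value(2.0, 60))
```

A real performance never matches the nominal durations exactly, which is why the sketch snaps to the nearest value instead of requiring equality.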
As a preferred embodiment, the calibration module includes a matching sub-module and a prompting sub-module. The matching sub-module matches the first note information against the second note information, passes note information that matches to the music score output module, and passes note information that does not match to the prompting sub-module. The prompting sub-module lets the user check the note information and select whichever of the first note information or the second note information agrees with the original bitmap.
As a preferred embodiment, the image input module is provided with a camera or a scanner for obtaining an original bitmap of the music score by photographing or scanning; once the original bitmap has been input to the image recognition module, a corresponding vector diagram can be obtained.
As a preferred embodiment, the first note information includes the tempo marking, notes, bar lines, repeat signs, ending signs, time signature, and anacrusis (pickup), and the second note information includes the notes and the corresponding playing duration of each note.
As a preferred embodiment, a corresponding comparison image is generated from the second note information using the tempo marking and time signature of the first note information.
As a preferred embodiment, the recognition image includes only note information and does not include staff lines, bar lines, repeat signs, ending signs, the time signature, or the anacrusis.
As a preferred embodiment, the music score output module is provided with an electronic display screen for displaying the output music score.
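The embodiment in which the audio-derived notes are laid out using information read from the image can be pictured with a small Python sketch that groups note lengths into bars. This is only an illustration under stated assumptions: the function name `group_into_bars`, the `pickup_beats` parameter (standing in for an anacrusis), and the choice not to split notes that cross a bar line are hypothetical and are not described in the patent.

```python
def group_into_bars(beats_per_note, beats_per_bar, pickup_beats=0.0):
    """Split a sequence of note lengths (in beats) into bars.

    beats_per_bar would come from the time signature read from the image;
    pickup_beats is the length of an optional incomplete first bar.
    Notes that cross a bar line are not split in this sketch.
    """
    bars, current, filled = [], [], 0.0
    # The first bar is shorter when the piece starts with a pickup.
    capacity = pickup_beats if pickup_beats > 0 else beats_per_bar
    for length in beats_per_note:
        current.append(length)
        filled += length
        if filled >= capacity - 1e-9:  # small tolerance for float rounding
            bars.append(current)
            current, filled = [], 0.0
            capacity = beats_per_bar
    if current:  # keep a trailing incomplete bar
        bars.append(current)
    return bars


if __name__ == "__main__":
    print(group_into_bars([1, 1, 1, 1, 2, 2], 4))
    print(group_into_bars([1, 1, 1, 1, 1], 4, pickup_beats=1))
```

The bar boundaries produced this way give the audio-derived notes the same layout as the image-derived notes, which makes a position-by-position comparison possible.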
In conclusion, the invention has the following beneficial effects. After the information in the music score bitmap has been recognized from the image, a corresponding comparison image is generated with the help of audio recognition, which overcomes the limitation that conventional audio recognition alone cannot generate an image. The note information generated by audio recognition thus assists the recognition image generated by image recognition and identifies the parts that image recognition cannot recognize accurately, effectively addressing recognition errors in the image recognition method. When a large number of music scores must be recognized, manual one-by-one review is no longer required, which greatly improves recognition accuracy and efficiency while reducing cost.
Drawings
FIG. 1 is a schematic diagram of the recognition system according to the present invention.
FIG. 2 is a schematic diagram of the calibration module according to the present invention.
FIG. 3 is a diagram of the frequencies corresponding to the keys of a piano.
Detailed Description
The present invention is described in further detail below with reference to figures 1-3.
Taking a piano music score as an example, FIGS. 1-2 are schematic diagrams of the image and audio-based music score dual-recognition system provided by the invention. The system comprises:
the image input module, which is provided with a camera or a scanner for obtaining an original bitmap of the music score by photographing or scanning; once the original bitmap has been input to the image recognition module, a corresponding vector diagram can be obtained;
the image recognition module, which is used for recognizing the information in the original bitmap music score from the image input module, generating a recognition image, and obtaining the corresponding first note information, so that the music score information is acquired by image recognition; it specifically comprises an image preprocessing module, a low-rank image module, a difference image module, a staff line generation module, a staff line deletion module, a note image module, and a note comparison and recognition module;
the audio recognition module, which is used for obtaining second note information corresponding to the original bitmap music score by acquiring the original audio information, the second note information comprising the notes and the corresponding playing duration of each note. Audio alone yields only the notes and the duration of each note; it cannot generate an image. The audio recognition module is therefore connected to the image recognition module: once the image recognition module has obtained the tempo marking, the tempo marking information is transmitted to the audio recognition module, which generates vectorized note information matching the tempo marking from the tempo marking information and the duration of each played note, and forms a corresponding recognition image. The recognition image includes only note information and does not include staff lines, bar lines, repeat signs, ending signs, the time signature, or the anacrusis;
the calibration module, which comprises a matching sub-module and a prompting sub-module. The matching sub-module matches the first note information against the second note information and passes note information that matches to the music score output module; for note information that does not match, it obtains the position of the corresponding first note information and reports that position to the prompting sub-module. The prompting sub-module lets the user check the note information and select the first note information that agrees with the original bitmap;
the music theory analysis module, which is used for generating a corresponding music score vector diagram from the calibrated note information through music theory analysis;
and the music score output module, which is provided with an electronic display screen for displaying the music score obtained through recognition and calibration.
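The division of work between the matching sub-module and the prompting sub-module can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the `Note` record, its fields, the position-based alignment, and the `choose` callback that stands in for the user's decision are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    position: int  # index of the note within the score (hypothetical)
    pitch: str     # e.g. "C4"
    value: str     # e.g. "quarter"


def calibrate(first, second):
    """Matching step: split notes into confirmed matches and conflicts.

    `first` holds notes from image recognition, `second` notes from audio
    recognition. A conflict records the position so the user can be pointed
    to the right place in the original bitmap.
    """
    second_by_pos = {n.position: n for n in second}
    confirmed, conflicts = [], []
    for note in first:
        other = second_by_pos.get(note.position)
        if other is not None and (other.pitch, other.value) == (note.pitch, note.value):
            confirmed.append(note)
        else:
            conflicts.append((note.position, note, other))
    return confirmed, conflicts


def resolve(conflicts, choose):
    """Prompting step: `choose(image_note, audio_note)` models the user's pick."""
    return [choose(img, aud) for _, img, aud in conflicts]


if __name__ == "__main__":
    image_notes = [Note(0, "C4", "quarter"), Note(1, "D4", "quarter")]
    audio_notes = [Note(0, "C4", "quarter"), Note(1, "E4", "quarter")]
    ok, bad = calibrate(image_notes, audio_notes)
    print(ok, bad)
    print(resolve(bad, lambda img, aud: img))
```

Only the conflicting positions reach the user, which is the source of the reduced manual review that the description claims.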
When a user needs to analyze a music score to obtain a vector-diagram music score that a computer can read, the original bitmap of the music score is first analyzed by image recognition: after image preprocessing, symbol recognition is performed, and the recognized symbols are parsed into formats such as pitch (the correspondence for each note is shown in FIG. 3). The notes recognized from the image are stored in a database and compared with the note database obtained by audio recognition. The two databases are compared in the same format, for example [key, energy value, start frame number, end frame number]. Entries that agree are output from the matching sub-module: the calibration module passes the first note information of each successfully matched note to the music score output module, which outputs it as a vector diagram. When matching fails, the mismatched data are passed to the prompting sub-module, which feeds both the first note information and the second note information back to the music score output module. The music score output module prompts the user, who decides whether the first or the second note information is accurate and then enters the correct note information.
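The comparison of records in a shared [key, energy value, start frame number, end frame number] format can be illustrated with the Python sketch below. The key numbering follows the standard 88-key piano convention in twelve-tone equal temperament (A4 is key 49 at 440 Hz), which is general knowledge and not a detail taken from FIG. 3. The function names, the frame tolerance, and the decision to ignore the energy value when comparing are assumptions made for the example.

```python
import math


def frequency_to_key(freq_hz: float) -> int:
    """Nearest key number on an 88-key piano (A4 = key 49 = 440 Hz)."""
    return round(12 * math.log2(freq_hz / 440.0)) + 49


def key_to_frequency(key: int) -> float:
    """Equal-temperament frequency in Hz of a piano key number."""
    return 440.0 * 2 ** ((key - 49) / 12)


def records_match(image_rec, audio_rec, frame_tolerance=2):
    """Compare two [key, energy, start_frame, end_frame] records.

    Assumption: the key must agree exactly, frame boundaries may differ by
    a small tolerance, and the energy value is not compared.
    """
    key_a, _, start_a, end_a = image_rec
    key_b, _, start_b, end_b = audio_rec
    return (
        key_a == key_b
        and abs(start_a - start_b) <= frame_tolerance
        and abs(end_a - end_b) <= frame_tolerance
    )


if __name__ == "__main__":
    middle_c = frequency_to_key(261.63)
    print(middle_c, key_to_frequency(middle_c))
    print(records_match([middle_c, 0.8, 10, 20], [middle_c, 0.5, 11, 21]))
```

Comparing on the key number instead of the raw frequency keeps small tuning deviations in the recording from being reported as mismatches.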
This embodiment merely explains the invention and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the invention.

Claims (6)

1. An image and audio-based music score dual-recognition system, comprising:
the image input module is used for receiving an input music score image and transmitting it to the image recognition module; the image recognition module is used for recognizing the information in the original bitmap music score from the image input module, generating a recognition image, and obtaining corresponding first note information; the first note information comprises a tempo marking, notes, bar lines, repeat signs, ending signs, a time signature, and an anacrusis;
the music theory analysis module is used for storing the first note information in vector form through music theory analysis;
the audio recognition module is used for obtaining second note information corresponding to the original bitmap music score by acquiring original audio information, the second note information comprising notes and the corresponding playing duration of each note; the audio recognition module is connected with the image recognition module, and after the tempo marking is obtained from the image recognition module, the tempo marking information is transmitted to the audio recognition module, so that vectorized note information matching the tempo marking information is generated from the tempo marking information and the duration information corresponding to each played note;
the calibration module is used for calibrating the first note information recognized from the image against the second note information recognized from the audio, generating corresponding identical note information, and outputting it to the music score output module; the calibration module comprises a matching sub-module and a prompting sub-module, the matching sub-module is used for matching the first note information with the second note information, inputting note information that matches into the music score output module, and inputting information that does not match into the prompting sub-module, and the prompting sub-module is used for the user to check the note information;
and the music score output module is used for outputting, as a vector diagram, the first note information corresponding to the note information obtained by the calibration module.
2. The image and audio-based music score dual-recognition system of claim 1, wherein: the image input module is provided with a camera or a scanner and is used for obtaining an original bitmap of the music score by photographing or scanning.
3. The image and audio-based music score dual-recognition system of claim 2, wherein: the first note information comprises a tempo marking, notes, bar lines, repeat signs, ending signs, a time signature, and an anacrusis, and the second note information comprises notes and the corresponding playing duration of each note.
4. The image and audio-based music score dual-recognition system of claim 3, wherein: a corresponding comparison image is generated from the second note information using the tempo marking and the time signature of the first note information.
5. The image and audio-based music score dual-recognition system of claim 1, wherein: the recognition image includes only notes, and does not include staff lines, bar lines, repeat signs, ending signs, the time signature, or the anacrusis.
6. The image and audio-based music score dual-recognition system of claim 1, wherein: the music score output module is provided with an electronic display screen for displaying the output music score.
CN202011420871.0A 2020-12-08 2020-12-08 Image and audio-based music score dual-recognition system Active CN113076967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011420871.0A CN113076967B (en) 2020-12-08 2020-12-08 Image and audio-based music score dual-recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011420871.0A CN113076967B (en) 2020-12-08 2020-12-08 Image and audio-based music score dual-recognition system

Publications (2)

Publication Number Publication Date
CN113076967A CN113076967A (en) 2021-07-06
CN113076967B CN113076967B (en) 2022-09-23

Family

ID=76609106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011420871.0A Active CN113076967B (en) 2020-12-08 2020-12-08 Image and audio-based music score dual-recognition system

Country Status (1)

Country Link
CN (1) CN113076967B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114419634B (en) * 2022-03-28 2022-07-26 之江实验室 Feature rule-based music score analysis method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
CN1573915A (en) * 2003-06-06 2005-02-02 明基电通股份有限公司 Method of creating music file with main melody and accompaniment
CN107369359A (en) * 2017-09-20 2017-11-21 郑州幼儿师范高等专科学校 A kind of vocal music pronunciation training system
CN111179691A (en) * 2019-12-31 2020-05-19 苏州缪斯谈谈科技有限公司 Note duration display method and device, electronic equipment and storage medium
CN111553260A (en) * 2020-04-26 2020-08-18 苏州缪斯谈谈科技有限公司 Interactive teaching method and system
CN111968675A (en) * 2020-07-10 2020-11-20 南京邮电大学 Stringed instrument note comparison system based on hand recognition and use method thereof

Family Cites Families (15)

Publication number Priority date Publication date Assignee Title
US8094891B2 (en) * 2007-11-01 2012-01-10 Sony Ericsson Mobile Communications Ab Generating music playlist based on facial expression
CN101600118B (en) * 2008-06-06 2012-09-19 株式会社日立制作所 Device and method for extracting audio/video content information
CN102222227B (en) * 2011-04-25 2013-07-31 中国华录集团有限公司 Video identification based system for extracting film images
CN108028040B (en) * 2015-09-07 2022-06-07 雅马哈株式会社 Musical performance assisting apparatus and method
CN107146631B (en) * 2016-02-29 2020-11-10 北京搜狗科技发展有限公司 Music identification method, note identification model establishment method, device and electronic equipment
CN107566863A (en) * 2016-06-30 2018-01-09 中兴通讯股份有限公司 A kind of exchange of information methods of exhibiting, device and equipment, set top box
CN106297755B (en) * 2016-09-28 2023-06-13 北京邮电大学 Electronic equipment and identification method for music score image identification
CN106446952B (en) * 2016-09-28 2019-08-16 北京邮电大学 A kind of musical score image recognition methods and device
CN108416359A (en) * 2018-03-09 2018-08-17 湖南女子学院 A kind of music score identifying system and recognition methods
CN110010112A (en) * 2019-04-22 2019-07-12 咸阳师范学院 A kind of electronic music editing system
CN210516214U (en) * 2019-04-30 2020-05-12 张玄武 Service equipment based on video and voice interaction
CN110299049B (en) * 2019-06-17 2021-12-17 韶关市启之信息技术有限公司 Intelligent display method of electronic music score
CN111176544B (en) * 2019-12-30 2023-07-18 河海大学常州校区 Multifunctional musical instrument spectrum display device based on image processing and image processing method
CN111787346B (en) * 2020-07-09 2024-02-09 腾讯科技(深圳)有限公司 Music score display method, device, equipment and storage medium based on live broadcast
CN111737589A (en) * 2020-08-25 2020-10-02 北京圈清文化传媒有限公司 Artificial intelligence based recommendation method, device and system

Also Published As

Publication number Publication date
CN113076967A (en) 2021-07-06

Similar Documents

Publication Publication Date Title
JP4504702B2 (en) Document processing apparatus, document processing method, and document processing program
Fornés et al. CVC-MUSCIMA: a ground truth of handwritten music score images for writer identification and staff removal
KR101376863B1 (en) Grammatical parsing of document visual structures
Truong et al. Improvement of end-to-end offline handwritten mathematical expression recognition by weakly supervised learning
CN111144191B (en) Font identification method, font identification device, electronic equipment and storage medium
US20160300555A1 (en) System and method for optical music recognition
CN111968649A (en) Subtitle correction method, subtitle display method, device, equipment and medium
CN106485984B (en) Intelligent teaching method and device for piano
US8768241B2 (en) System and method for representing digital assessments
CN112509661B (en) Methods, computing devices, and media for identifying physical examination reports
KR20120063170A (en) System and method for recognizing a music score image
CN113076967B (en) Image and audio-based music score dual-recognition system
CN116645683A (en) Signature handwriting identification method, system and storage medium based on prompt learning
CN111052221A (en) Chord information extraction device, chord information extraction method, and chord information extraction program
KR102152260B1 (en) Apparatus and method for recognizing key-value relationship
CN115393875B (en) MobileNet V3-based staff identification and numbered musical notation conversion method and system
CN111552830A (en) User-selected music score exercise method and system
CN114202763B (en) Music numbered musical notation semantic translation method and system
CN115391506A (en) Question and answer content standard detection method and device for multi-section reply
CN109949813A (en) A kind of method, apparatus and system converting speech into text
CN114511858A (en) AI and RPA-based official document file processing method, device, equipment and medium
CN113255470A (en) Multi-mode piano partner training system and method based on hand posture estimation
JPH06332443A (en) Score recognizing device
Schramm et al. Audiovisual tool for solfège assessment
JP3727173B2 (en) Speech recognition method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 214000 Ping An wealth center, financial first street, Binhu District, Wuxi City, Jiangsu Province

Applicant after: Wuxi Leqi Technology Co.,Ltd.

Address before: Building a, Cetus, Wuxi Software Park, No.18 Zhenze Road, Xinwu District, Wuxi City, Jiangsu Province, 214000

Applicant before: Wuxi Le Chi Technology Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231206

Address after: Room 301, Building 6, Building B4, No. 221 Huangxing Road, Yangpu District, Shanghai, 200082

Patentee after: Shanghai Leqiai Technology Co.,Ltd.

Address before: 214000 Ping An wealth center, financial first street, Binhu District, Wuxi City, Jiangsu Province

Patentee before: Wuxi Leqi Technology Co.,Ltd.
