CN106297755B - Electronic equipment and identification method for music score image identification - Google Patents


Info

Publication number
CN106297755B
Authority
CN
China
Prior art keywords
image
note
staff
complete
control circuit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610859907.2A
Other languages
Chinese (zh)
Other versions
CN106297755A (en)
Inventor
宋晴
杨录
贾文赫
王智慧
杨李怡
刘小欧
辛学仕
陈海鹏
杨敏
姜佳男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201610859907.2A
Publication of CN106297755A
Application granted
Publication of CN106297755B
Legal status: Active

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments
    • G10H1/32: Constructional details
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00: Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/155: User input interfaces for electrophonic musical instruments
    • G10H2220/441: Image sensing, i.e. capturing images or optical patterns for musical purposes or musical control purposes
    • G10H2220/455: Camera input, e.g. analyzing pictures from a video camera and using the analysis results as control data

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an electronic device and a recognition method for music score image recognition. The electronic device comprises a housing, a sound generating component, a main board arranged in the housing, and an image scanning component arranged at a first end of the housing. The main board carries a main control circuit, together with a sound card circuit and a power supply circuit that are each electrically connected to the main control circuit. In the method, a staff image to be processed is acquired through a camera and transmitted to the main control circuit; the main control circuit processes the staff image and recognizes each complete note; the main control circuit then sends the corresponding digital sound signals to the sound card circuit, which converts them into playable analog signals and passes these to the sound generating component for playback. The device solves the prior-art problem that the image acquisition module is separate from the recognition module and is therefore inconvenient to use. The method cascades a note classifier with a convolutional neural network to recognize notes, and has the advantages of high recognition speed and high recognition accuracy.

Description

Electronic equipment and identification method for music score image identification
Technical Field
The invention relates to the technical field of image recognition, in particular to electronic equipment and a recognition method for recognizing music score images.
Background
Image recognition refers to the technique of using a computer to process, analyze, and understand images in order to recognize targets and objects of various patterns.
The music score image recognition device in the prior art comprises an image acquisition module and a computer, wherein the image acquisition module acquires image data of a music score in a photographing or music score scanning mode, inputs the image data into the computer, and analyzes and recognizes the acquired image data through a recognition module in the computer.
However, with the above-described musical score image recognition apparatus, there are the following technical problems: the image acquisition module is separated from the identification module, and the image acquisition module and the identification module are required to work by a computer, so that the working process is longer, and the convenience of use is affected.
Most prior-art music score image recognition methods are based on traditional computer vision techniques. Their recognition accuracy and speed are unsatisfactory, so rapid and accurate recognition cannot be achieved, and they place strict demands on how standardized the score to be recognized is, which makes them ill-suited to everyday use.
Disclosure of Invention
The embodiment of the invention aims to provide electronic equipment and an identification method for identifying music score images, which can solve the problems that an image acquisition module and an identification module of the music score image identification equipment in the prior art are separated, the use is inconvenient, and the identification precision and the identification speed of the music score image identification method in the prior art are not ideal.
To achieve the above object, an embodiment of the present invention discloses an electronic device for music score image recognition, including a housing, a sound emitting part, a main board disposed in the housing, and an image scanning part disposed at a first end of the housing;
the main board is provided with a main control circuit, a sound card circuit and a power supply circuit which are respectively and electrically connected with the main control circuit;
the image scanning component comprises a scanning roller and a camera arranged above the scanning roller, and the scanning roller and the camera are electrically connected with the main control circuit; the camera sends the shot music score image to the main control circuit for processing;
the sound generating component is connected with the sound card circuit and generates sound according to a sound signal sent by the main control circuit;
the power supply circuit is respectively and electrically connected with the scanning roller, the camera and the sounding component to supply power to the scanning roller, the camera and the sounding component;
the second end of the shell is provided with a battery compartment and a compartment cover, and the battery compartment is connected with a power circuit on the main board.
Preferably, the shell is a pen-shaped shell; the image scanning component is arranged at the first end part of the pen-shaped shell;
the sound generating component is arranged above the image scanning component, and the image scanning component and the sound generating component form a first end part into a pen point shape;
the main board is arranged at a position close to the pen point in the pen-shaped shell;
at least 2 main board mounting columns are arranged in the pen-shaped shell; the main board is fixed in the pen-shaped shell through the at least 2 main board mounting columns.
Preferably, a battery compartment and a compartment cover are arranged at the second end of the pen-shaped shell, and the battery compartment is connected with a power circuit on the main board.
Preferably, an external power line is arranged at the second end part of the pen-shaped shell, and the external power line is connected with a power circuit on the main board.
The embodiment of the invention also discloses a music score image recognition method, which comprises the following steps,
acquiring a staff image to be processed through a camera and transmitting the staff image to a main control circuit;
the main control circuit identifies the staff images to be processed and identifies each complete note;
the main control circuit sends corresponding sound digital signals to the sound card circuit according to the recognized complete notes, and the sound card circuit converts the received sound digital signals into playable analog signals and transmits the playable analog signals to the sounding component for playing;
the main control circuit identifies the staff image to be processed, including,
drawing the edge information of the image by adopting an edge detection method on the staff image to be processed, and detecting the position coordinates of the staff by adopting a straight line detection method;
carrying out note positioning segmentation on the staff image to be processed by adopting a preset note classifier to obtain the position of each complete note in the image;
identifying the note heads obtained by segmentation by adopting a preset convolutional neural network, judging whether each note head is solid or hollow, and obtaining the position of each note head;
and identifying each complete note according to the obtained staff line position coordinates, the relative position of each complete note, whether its note head is solid or hollow, and the position of the note head.
Preferably, the training process of the note classifier includes:
establishing a positive sample data set and a negative sample data set, wherein each data set comprises position data of positioning frames and image data of the staff image within each positioning frame; the positive sample data set comprises image data of complete notes, and the negative sample data set comprises image data of score content other than complete notes;
the channel characteristics of each sample in the positive sample data set and the negative sample data set are extracted, and a note classifier is trained.
Preferably, the musical note location segmentation is performed on the staff image to be processed, including,
randomly selecting a plurality of candidate positioning frames on the staff image to be processed and scanning the positioning frames one by one; extracting the channel characteristics from the image in each positioning frame and inputting them into the note classifier, which judges whether the image in the positioning frame is a positive sample or a negative sample; a positive sample is a complete note in the music score, while a negative sample is score background and is discarded; the complete notes in the staff image to be processed are thereby obtained, and the position of each complete note in the image is obtained from the position data of the corresponding positioning frames in the note classifier.
Preferably, the training process of the convolutional neural network comprises,
establishing a note head data set which comprises three classes of data: solid note heads, hollow note heads and background;
constructing a convolutional neural network, which comprises 2 convolutional layers, 2 downsampling layers and 1 fully connected layer;
and inputting the note head image data in the note head data set into the convolutional neural network to complete training.
Preferably, identifying the note heads obtained by segmentation with the convolutional neural network includes:
inputting each complete note obtained by note positioning segmentation into the convolutional neural network, which classifies it, based on the note head data set, as a solid note head, a hollow note head or background; discarding the background; and determining the position of the note head within the complete note by comparison with the note head position data in the note head data set.
Preferably, the staff image to be processed is obtained as follows: the captured staff image is denoised, contrast-enhanced and converted to grayscale to reduce the effect of noise and uneven illumination, yielding a binary image.
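The preprocessing chain can be illustrated with a minimal sketch in plain Python. The patent does not specify the individual filters, so everything concrete here is an assumption made for illustration: the luminance weights, the 3×3 median filter, the linear contrast stretch, the fixed threshold of 128, the order of the operations, and all function names.

```python
from statistics import median

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to 0-255 grayscale (assumed luminance weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in rgb]

def median_denoise(gray):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [gray[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

def stretch_contrast(gray):
    """Linearly stretch the gray levels to the full 0-255 range."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        return [row[:] for row in gray]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in gray]

def binarize(gray, threshold=128):
    """Return 1 for dark (ink) pixels and 0 for light (paper) pixels."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]
```

A real device would more likely use an adaptive or Otsu threshold to cope with uneven illumination, and a 3×3 median filter can erase one-pixel-thick staff lines; the fixed choices above only keep the sketch short.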
According to the technical scheme, the sound generating component, the main board and the image scanning component are all integrated into one device, so that the portability of a product is greatly improved, and the problem that an image acquisition module and an identification module are separated and inconvenient to use in the prior art is solved.
According to the identification method embodiment, an edge detection method is applied to the staff image to be processed to draw the edge information of the image, and a straight line detection method is then used to detect the position coordinates of the staff lines; a preset note classifier performs note positioning segmentation on the staff image to be processed to obtain the position of each complete note in the image; a preset convolutional neural network identifies the note heads obtained by segmentation, judges whether each note head is solid or hollow, and obtains the position of each note head; and each complete note is identified according to the obtained staff line position coordinates, the relative position of each complete note, whether its note head is solid or hollow, and the position of the note head. Compared with traditional computer vision methods, the method cascades the note classifier with the convolutional neural network to recognize notes, and has the advantages of high recognition speed and high recognition accuracy.
Of course, it is not necessary for any one product or method of practicing the invention to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an embodiment of an electronic device of the present invention;
FIG. 2 is a schematic circuit diagram of the main board in an embodiment of the electronic device of the present invention;
FIG. 3 is a control schematic diagram of the main board in an embodiment of the electronic device of the present invention;
FIG. 4 is a flowchart of a first embodiment of the score recognition method of the present invention;
FIG. 5 is a flow chart of the identification of a staff image to be processed by the master control circuit in a first embodiment of the identification method of the present invention;
FIG. 6 is a flow chart of the identification of a staff image to be processed by the master control circuit in a second embodiment of the identification method of the present invention;
FIG. 7 is a schematic diagram of the single edge detection method in a second embodiment of the score recognition method of the present invention;
FIG. 8 is an effect diagram of staff line position coordinate detection in the second embodiment of the score recognition method of the present invention;
FIG. 9 is a diagram showing the training process of the note classifier in a second embodiment of the score recognition method of the present invention;
FIG. 10 is a sample schematic of a positive sample data set and a negative sample data set in a second embodiment of the score recognition method of the present invention;
FIG. 11 is a flowchart of note positioning segmentation in a second embodiment of the score recognition method of the present invention;
FIG. 12 is an effect diagram of note positioning segmentation in a second embodiment of the score recognition method of the present invention;
FIG. 13 is a schematic diagram of a training process of a convolutional neural network in a second embodiment of the score recognition method of the present invention;
FIG. 14 is a block diagram of a convolutional neural network in a second embodiment of the score recognition method of the present invention;
FIG. 15 is a flowchart of note head recognition in a second embodiment of the score recognition method of the present invention;
In the figures: 1, compartment cover; 2, battery compartment; 3, main board; 4, camera; 5, scanning roller; 6, main board mounting post; 7, sound generating component; 8, LED fill light.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the structure of one embodiment of the electronic device for music score image recognition of the present invention, as shown in fig. 1, the housing is a pen-shaped housing, the image scanning component is disposed at a first end of the pen-shaped housing, the sound generating component 7 is mounted above the image scanning component, and the image scanning component and the sound generating component 7 form the first end into a pen point shape; the image scanning means comprises a scanning roller 5 and a camera 4 arranged above the scanning roller 5.
The main board 3 is mounted in the pen-like housing at a position close to the pen point. At least 2 main board mounting posts 6 are arranged in the pen-shaped shell, and the main board 3 is fixed in the pen-shaped shell through the at least 2 main board mounting posts 6. As shown in fig. 2, a main control circuit, a sound card circuit and a power supply circuit are arranged on the main board 3, and the sound card circuit and the power supply circuit are respectively and electrically connected with the main control circuit; the scanning roller 5 and the camera 4 are electrically connected with a main control circuit; the camera 4 sends the shot music score image to the main control circuit for processing; the sounding component 7 is connected with the sound card circuit and sounds according to the sound signal sent by the main control circuit.
The second end of the pen-shaped shell is provided with a battery compartment 2 and a compartment cover 1, and the battery compartment 2 is connected with the power circuit on the main board 3. It should be noted that the battery compartment 2 and the compartment cover 1 are provided to supply power to the power circuit on the main board 3, and other structures may be chosen for supplying power, for example: an external power line is arranged at the second end of the pen-shaped shell and is connected with the power circuit on the main board 3.
Preferably, the camera 4 is further provided with an LED light supplementing lamp 8 for supplementing light to the camera 4.
Preferably, the sound generating component 7 is a speaker. It should be noted that the sound emitting component 7 is a sound emitting device in the prior art, and is intended to perform the function of sound emission.
Preferably, the camera 4 is implemented by an OV7620 CMOS image sensor, and the main control circuit is implemented by an Argus3 microprocessor chip. As shown in FIG. 3, the Argus3 chip embeds an ARM9TDMI core and integrates a cache, a dedicated RAM and a rich set of application interfaces; it supports formats such as SPAM and FLASH and provides a video processing engine and an image processor.
Preferably, a protective sleeve movably connected with the pen-shaped shell is arranged outside the image scanning component, and the shape of the protective sleeve is matched with the shape of the pen point so as to protect the camera 4.
A first embodiment of the score image recognition method of the present invention, as shown in fig. 4, includes,
step 101: acquiring a staff image to be processed through a camera and transmitting the staff image to a main control circuit;
step 102: the main control circuit identifies the staff images to be processed and identifies each complete note;
step 103: the main control circuit sends corresponding sound digital signals to the sound card circuit according to the recognized complete notes, and the sound card circuit converts the received sound digital signals into playable analog signals and transmits the playable analog signals to the sounding component for playing;
the main control circuit identifies the staff images to be processed, as shown in fig. 5, including,
step 1021: drawing the edge information of the image by adopting an edge detection method on the staff image to be processed, and detecting the position coordinates of the staff by adopting a straight line detection method;
step 1022: carrying out note positioning segmentation on the staff image to be processed by adopting a preset note classifier to obtain the position of each complete note in the image;
step 1023: identifying the note heads obtained by segmentation by adopting a preset convolutional neural network, judging whether each note head is solid or hollow, and obtaining the position of each note head;
step 1024: and identifying each complete note according to the obtained staff line position coordinates, the relative position of each complete note, whether its note head is solid or hollow, and the position of the note head.
A second embodiment of the music score image recognition method of the present invention, as shown in fig. 6, is different from the first embodiment of the recognition method in that the main control circuit recognizes the staff image to be processed, including,
step 2021: denoising, contrast enhancement and graying the obtained staff image, and reducing noise or uneven illumination to obtain a binary image;
step 2022: drawing the edge information of the image by adopting a single edge detection method on the obtained binary image, and detecting the staff line position coordinates by adopting the Hough straight line detection method;
step 2023: carrying out note positioning segmentation on the obtained binary image by adopting a preset note classifier to obtain the position of each complete note in the image;
step 2024: identifying the note heads obtained by segmentation by adopting a preset convolutional neural network, judging whether each note head is solid or hollow, and obtaining the position of each note head;
step 2025: and identifying each complete note according to the obtained staff line position coordinates, the relative position of each complete note, whether its note head is solid or hollow, and the position of the note head.
Other steps in the second embodiment of the music score image recognition method of the present invention may refer to the first embodiment, and will not be described herein.
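The final identification step combines the staff line coordinates with the note head position. As a rough illustration only, the sketch below converts a note head's vertical coordinate into a staff step (0 = bottom line, 1 = first space, and so on) and then into a pitch name. The treble clef, the absence of accidentals and key signatures, and the function names `staff_step` and `step_to_pitch` are all assumptions; the patent does not state how pitch is derived.

```python
def staff_step(line_ys, head_y):
    """Steps above the bottom staff line: 0 = bottom line, 1 = first space, 2 = second line."""
    lines = sorted(line_ys)                  # image y grows downward
    interval = (lines[-1] - lines[0]) / 4    # spacing between adjacent staff lines
    return round((lines[-1] - head_y) / (interval / 2))

def step_to_pitch(step):
    """Map a staff step to a pitch name, assuming a treble clef (bottom line = E4)."""
    names = "CDEFGAB"
    index = step + 2                         # E is the third letter of the C-based scale
    return names[index % 7] + str(4 + index // 7)

lines = [100, 110, 120, 130, 140]            # hypothetical staff line y-coordinates
print(step_to_pitch(staff_step(lines, 120))) # note head on the middle line
```

Whether the note head is solid or hollow would then narrow down the note's duration; that mapping is left out because the text does not spell it out.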
Preferably, the single edge detection method of step 2022 in the second embodiment of the identification method of the present invention includes:
a) Select the Sobel operator to calculate the gradient values in the horizontal and vertical directions respectively:
horizontal gradient: $s_x = (a_2 + 2a_3 + a_4) - (a_0 + 2a_7 + a_6)$
vertical gradient: $s_y = (a_0 + 2a_1 + a_2) - (a_6 + 2a_5 + a_4)$
amplitude: [equation image BDA0001122338150000091, not reproduced in the text]
Sobel template: [figure BDA0001122338150000092, not reproduced in the text]
where $a_0$ to $a_7$ denote the 8 neighborhood pixels;
b) Apply non-maximum suppression to the gradient values in the horizontal and vertical directions, i.e. retain only the maximum point on the gradient line in each direction and set the values of all other points to 0;
c) Use an adaptive threshold method to obtain the threshold to be set in each region, use that threshold as the condition for deciding whether edges are connected, and draw the edge information of the image.
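Step a) can be sketched for a single pixel as follows. The neighbor layout is an assumption inferred from the two gradient formulas (a0, a1, a2 across the top row, a3 on the right, a4, a5, a6 across the bottom row from right to left, a7 on the left), because the template figure is not reproduced in the text. The amplitude is computed as the Euclidean norm, which is the common choice but is likewise an assumption here.

```python
import math

def sobel_at(img, y, x):
    """Return (s_x, s_y, amplitude) at interior pixel (y, x) of a 2-D list image."""
    # Assumed layout of the 8 neighbors around the center pixel:
    #   a0 a1 a2
    #   a7  c a3
    #   a6 a5 a4
    a0, a1, a2 = img[y - 1][x - 1], img[y - 1][x], img[y - 1][x + 1]
    a7, a3 = img[y][x - 1], img[y][x + 1]
    a6, a5, a4 = img[y + 1][x - 1], img[y + 1][x], img[y + 1][x + 1]
    s_x = (a2 + 2 * a3 + a4) - (a0 + 2 * a7 + a6)   # right column minus left column
    s_y = (a0 + 2 * a1 + a2) - (a6 + 2 * a5 + a4)   # top row minus bottom row
    return s_x, s_y, math.hypot(s_x, s_y)

# A vertical edge (dark on the left, bright on the right) responds only in s_x.
print(sobel_at([[0, 0, 10], [0, 0, 10], [0, 0, 10]], 1, 1))
```

Because s_y is signed (top row minus bottom row), the upper and lower edges of a dark staff line produce responses of opposite sign, which is the property the single edge method relies on.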
To better illustrate the beneficial effects of the single edge detection method, the conventional Canny edge detection method is compared below with the single edge detection method adopted by the invention:
1) The traditional Canny edge detection method comprises the following steps:
a) The first-order partial derivatives of each pixel in the image are computed, and the gradient direction and amplitude are calculated from them, giving the amplitude of each point in different directions; different operator templates, such as the Roberts operator and the Prewitt operator, may be used in this process;
b) Non-extremum suppression is applied to the gradient amplitude. The larger an element of the image's gradient amplitude matrix, the larger the gradient at that point, but this alone is not sufficient to classify the point as an edge point. The extremum among the pixels along a line must therefore be found, and the gray value of every non-extremum point is set to 0, which removes most non-edge points;
c) Edges are detected and connected using a double-threshold algorithm: two thresholds are selected, and an edge image is obtained from the high threshold. Edges in the high-threshold image are linked into contours; when the end of a contour is reached, the 8-neighborhood of the break point is searched for a point that satisfies the low threshold, and new edges are collected from that point until the edges of the whole image are closed, forming the complete edge image.
2) The single edge detection method adopted by the invention comprises the following steps:
a) The template operator commonly used by the original Canny algorithm is replaced with the Sobel operator ($a_0$ to $a_7$ denote the 8 neighborhood pixels), and the gradient values in the horizontal and vertical directions are calculated respectively:
horizontal gradient: $s_x = (a_2 + 2a_3 + a_4) - (a_0 + 2a_7 + a_6)$
vertical gradient: $s_y = (a_0 + 2a_1 + a_2) - (a_6 + 2a_5 + a_4)$
amplitude: [equation image BDA0001122338150000101, not reproduced in the text]
Sobel template: [figure BDA0001122338150000102, not reproduced in the text]
b) The gradient values in each direction are likewise suppressed, but because a single-sided edge of each straight line is required, the suppression method is changed: the non-extremum suppression of the original method is replaced by non-maximum suppression, i.e. only the maximum point on the gradient line in each direction is retained and all other points are set to 0. As shown in fig. 7, a 3×3 region is used as the comparison block, the central pixel is compared with the neighbor pairs (1, 5), (2, 6), (3, 7) and (4, 8) respectively, and non-maximum points are set to 0;
c) The double-threshold step is replaced by an adaptive threshold method, which obtains the threshold to be set in each region; that threshold is used as the condition for deciding whether edges are connected.
It should be noted that the adaptive threshold method is a common method in the prior art.
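The difference between non-extremum suppression and non-maximum suppression can be shown on a one-dimensional gradient profile taken across a single staff line. This is one reading of the text, stated as an assumption: the gradient is treated as signed, so suppressing everything except positive local maxima keeps one side of the line, whereas suppression on the absolute value keeps both sides. The 3×3 pairwise comparison of fig. 7 is reduced here to a comparison with the two neighbors along one direction, and both function names are illustrative.

```python
def suppress_non_extrema(grad):
    """Canny-style: keep local maxima of |gradient| (both sides of a line survive)."""
    out = [0] * len(grad)
    for i in range(1, len(grad) - 1):
        g = abs(grad[i])
        if g > 0 and g >= abs(grad[i - 1]) and g > abs(grad[i + 1]):
            out[i] = grad[i]
    return out

def suppress_non_maxima(grad):
    """Single-edge variant: keep only positive local maxima of the signed gradient."""
    out = [0] * len(grad)
    for i in range(1, len(grad) - 1):
        g = grad[i]
        if g > 0 and g >= grad[i - 1] and g > grad[i + 1]:
            out[i] = g
    return out

# Hypothetical signed gradient across one staff line: a positive lobe at one
# edge of the line and a negative lobe at the other.
profile = [0, 40, 90, 40, 0, -40, -90, -40, 0]
print(suppress_non_extrema(profile))   # two responses: a double edge
print(suppress_non_maxima(profile))    # one response: a single edge
```

With one response per staff line, the subsequent line detection is less likely to report each staff line twice.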
The comparison shows that the traditional Canny method detects double edges for each staff line, which impairs positioning. The invention uses non-maximum suppression to retain only the single-sided gradient extremum and adds the adaptive threshold condition, so that each staff line is better represented by a single edge.
It should be noted that the Hough straight line detection method in step 2022 is a conventional line detection method in the prior art, which can detect the position coordinates of the staff lines from the obtained edge information of the image; fig. 8 shows the effect of staff positioning in this embodiment.
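For readers unfamiliar with the Hough transform, the following is a minimal voting sketch in plain Python. Each edge point (x, y) votes for every line rho = x·cos(theta) + y·sin(theta) that passes through it, and cells with enough votes are reported as lines. The one-degree angular resolution, the integer rounding of rho, the vote threshold and the helper `staff_line_rows`, which keeps only exactly horizontal lines (theta = 90°), are simplifying assumptions rather than details from the patent.

```python
import math
from collections import Counter

def hough_votes(edge_points, thetas_deg=range(0, 180)):
    """Accumulate votes for (rho, theta) cells; rho = x*cos(theta) + y*sin(theta)."""
    votes = Counter()
    for x, y in edge_points:
        for t in thetas_deg:
            rad = math.radians(t)
            rho = round(x * math.cos(rad) + y * math.sin(rad))
            votes[(rho, t)] += 1
    return votes

def staff_line_rows(edge_points, min_votes):
    """Return the y-coordinates of horizontal lines with at least min_votes votes."""
    votes = hough_votes(edge_points)
    return sorted(rho for (rho, t), n in votes.items() if t == 90 and n >= min_votes)

# Two synthetic horizontal edges at y = 10 and y = 20.
points = [(x, 10) for x in range(20)] + [(x, 20) for x in range(20)]
print(staff_line_rows(points, min_votes=15))
```

A scanned score is rarely perfectly level, so a practical implementation would accept a small range of angles around 90° and merge neighboring accumulator cells.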
Preferably, the training process of the note classifier in step 2023 in the second embodiment of the identification method of the present invention, as shown in fig. 9, includes:
step 301: establishing a positive sample data set and a negative sample data set, wherein the positive sample data set contains image data of complete notes and the negative sample data set contains image data of score content other than complete notes; as shown in fig. 10, both data sets include the image data of the staff image inside each positioning frame;
step 302: the channel characteristics of each sample in the positive sample data set and the negative sample data set are extracted, and a note classifier is trained.
It should be noted that the negative samples here may be incomplete note images, staff images, score background images and the like, but are not limited to those listed.
Preferably, the channel characteristics of each sample include gray scale and color, linear filtering, nonlinear transformation, point-by-point transformation, and gradient histogram. These 5 channel characteristics are the integral channel features of the prior art, defined and explained as follows:
gray scale and color: gray scale is the simplest channel; the three channels of the LUV color space are also commonly used;
linear filtering: channels are obtained by linear transformation, for example by convolving the image with Gabor filters of different orientations; each such channel contains edge information in a different direction, so that texture information of the image in different dimensions is obtained;
nonlinear transformation: computing the gradient magnitude of the image captures edge strength information, while the full gradient captures both edge strength and edge direction. For a color image, the gradient is calculated in each of the 3 channels and the maximum response of the 3 gradients at each position is taken as the final output. The image may also be binarized, using two different thresholds respectively;
point-by-point transformation: as a post-processing step, any pixel in a channel may be transformed by an arbitrary function. For example, taking the logarithm yields local products, since $\exp(\sum_i \log(x_i)) = \prod_i x_i$; similarly, raising each pixel to the power p can be used to compute the generalized mean;
gradient histogram: a weighted histogram whose bin index is determined by the gradient direction and whose weight is determined by the gradient magnitude, i.e. the channel is calculated as $Q_\theta(x, y) = G(x, y) \cdot \mathbf{1}[\Theta(x, y) = \theta]$, where $G(x, y)$ and $\Theta(x, y)$ denote the gradient magnitude and the quantized gradient direction of the image, respectively. Blurring the image at different scales allows gradient information at different scales to be calculated. Furthermore, the calculated histogram is normalized using the gradient magnitude information, in a way similar to the HOG feature.
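The gradient histogram channel can be sketched as follows: for each orientation bin θ, the channel holds the gradient magnitude where the quantized direction equals θ and zero elsewhere. The number of bins, the use of unsigned orientation in [0, π), and the function name are assumptions; multi-scale blurring and normalization are omitted.

```python
import math

def gradient_histogram_channels(gx, gy, n_bins=6):
    """Return n_bins channels Q[b][y][x] = G(x, y) if the quantized direction is b, else 0."""
    h, w = len(gx), len(gx[0])
    channels = [[[0.0] * w for _ in range(h)] for _ in range(n_bins)]
    for y in range(h):
        for x in range(w):
            magnitude = math.hypot(gx[y][x], gy[y][x])
            # Unsigned orientation in [0, pi); the clamp guards against rounding up to pi.
            angle = math.atan2(gy[y][x], gx[y][x]) % math.pi
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            channels[b][y][x] = magnitude
    return channels

# One pixel with a horizontal gradient and one with a vertical gradient.
chans = gradient_histogram_channels([[1.0, 0.0]], [[0.0, 2.0]], n_bins=2)
print(chans)
```

Summing the channels at any pixel recovers the gradient magnitude there, which is a convenient sanity check on the binning.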
Preferably, the positioning frame is a rectangular positioning frame whose size is determined by the interval between the five staff lines; its height and width are calculated respectively as:
height = 5 * interval; width = 2.5 * interval.
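As a small worked sketch of the formula above, the frame size can be derived from the detected staff line coordinates. The function name is hypothetical, and taking the interval as the mean spacing between adjacent staff lines is an assumption, since the text does not say how the interval is measured.

```python
def positioning_frame_size(staff_line_ys):
    """Return (height, width) of the positioning frame.

    staff_line_ys holds the y-coordinates of the five staff lines.
    The interval is taken as the mean spacing between adjacent lines,
    which is an assumption made for this sketch.
    """
    ys = sorted(staff_line_ys)
    if len(ys) != 5:
        raise ValueError("expected five staff lines")
    interval = (ys[-1] - ys[0]) / 4.0
    return 5 * interval, 2.5 * interval
```

For staff lines 10 pixels apart this gives a frame 50 pixels high and 25 pixels wide.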
Preferably, the note-location segmentation of the staff image to be processed in step 2023 of the second embodiment of the identification method of the present invention, as shown in fig. 11, comprises:
randomly selecting a plurality of candidate positioning frames on the binary image to be identified and scanning the positioning frames one by one; extracting the channel characteristics from the image in each positioning frame and inputting them into the note classifier, which judges whether the image in the positioning frame is a positive sample or a negative sample, a positive sample being judged to be a complete note in the music score and a negative sample being judged to be background of the music score; discarding the background of the music score, thereby obtaining the complete notes in the binary image to be identified; and comparing the position data of the positioning frames in the note classifier to obtain the position of each complete note in the image, as shown in fig. 12.
In this embodiment, 2000 candidate positioning frames are randomly selected.
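The random-frame scan described above can be sketched as follows. This is only an outline under stated assumptions: `locate_notes` and its parameters are hypothetical names, and `classify` stands in for the channel-feature extraction plus the trained note classifier, which are not reproduced here.

```python
import random

def locate_notes(image, classify, frame_h, frame_w, n_candidates=2000, seed=0):
    """Randomly sample candidate frames and keep those judged positive.

    image is a binary image given as a list of rows. classify is a
    hypothetical callable returning True for a complete note and False
    for score background. Returns (left, top, width, height) boxes.
    """
    h, w = len(image), len(image[0])
    rng = random.Random(seed)
    notes = []
    for _ in range(n_candidates):
        top = rng.randint(0, h - frame_h)
        left = rng.randint(0, w - frame_w)
        patch = [row[left:left + frame_w] for row in image[top:top + frame_h]]
        if classify(patch):
            # Positive sample: keep the frame position as the note position.
            notes.append((left, top, frame_w, frame_h))
        # Negative samples (score background) are simply discarded.
    return notes
```

Random sampling can return the same note several times; merging overlapping boxes would be a natural follow-up step, although the text does not describe one.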
Preferably, the training process of the convolutional neural network in step 2024 in the second embodiment of the identification method of the present invention, as shown in fig. 13, includes,
Step 401: establishing a note head data set comprising three classes of data: solid note heads, hollow note heads and background;
Step 402: as shown in fig. 14, constructing a convolutional neural network comprising 2 convolutional layers, 2 downsampling layers and 1 fully connected layer;
Step 403: inputting the note head image data in the note head data set into the convolutional neural network to complete training.
The note head data set in this embodiment includes 2000 solid note heads, 1500 hollow note heads and 4000 background images.
This embodiment implements the convolutional neural network with the Caffe framework, a clear, highly readable and fast deep learning framework. Because the model has a simple structure and few parameters, note identification in many environments (notebook computers, mobile phones and the like) only requires implementing a simple convolution and fully connected forward network, and the Caffe framework does not need to be additionally configured, which makes the method very convenient and simple.
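A framework-free forward pass of the kind described above might look like the following sketch. Only the layer count (2 convolutional layers, 2 downsampling layers, 1 fully connected layer) and the three output classes come from the text; the 20x20 input, 5x5 kernels, channel counts, ReLU activation, max pooling and random weights are assumptions, and all function names are hypothetical.

```python
import math
import random

LABELS = ("solid", "hollow", "background")

def conv2d(maps, kernels, biases):
    """Valid convolution followed by ReLU (the activation is an assumption)."""
    h, w = len(maps[0]), len(maps[0][0])
    k = len(kernels[0][0])
    out = []
    for o, bank in enumerate(kernels):
        fmap = []
        for y in range(h - k + 1):
            row = []
            for x in range(w - k + 1):
                s = biases[o]
                for c, kern in enumerate(bank):
                    for dy in range(k):
                        for dx in range(k):
                            s += kern[dy][dx] * maps[c][y + dy][x + dx]
                row.append(max(s, 0.0))
            fmap.append(row)
        out.append(fmap)
    return out

def max_pool2(maps):
    """Downsampling layer: 2x2 max pooling with stride 2 (an assumption)."""
    return [[[max(m[y][x], m[y][x + 1], m[y + 1][x], m[y + 1][x + 1])
              for x in range(0, len(m[0]) - 1, 2)]
             for y in range(0, len(m) - 1, 2)] for m in maps]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(image, params):
    """conv -> pool -> conv -> pool -> fully connected -> softmax."""
    x = max_pool2(conv2d([image], params["k1"], params["b1"]))
    x = max_pool2(conv2d(x, params["k2"], params["b2"]))
    flat = [v for m in x for row in m for v in row]
    logits = [sum(wi * v for wi, v in zip(w, flat)) + b
              for w, b in zip(params["w"], params["b"])]
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))], probs

def random_params(in_size=20, k=5, c1=4, c2=8, seed=0):
    """Untrained placeholder weights; trained weights would be loaded instead."""
    rng = random.Random(seed)
    def kern(n_out, n_in):
        return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
                 for _ in range(n_in)] for _ in range(n_out)]
    s1 = (in_size - k + 1) // 2
    s2 = (s1 - k + 1) // 2
    n_flat = c2 * s2 * s2
    return {"k1": kern(c1, 1), "b1": [0.0] * c1,
            "k2": kern(c2, c1), "b2": [0.0] * c2,
            "w": [[rng.uniform(-0.1, 0.1) for _ in range(n_flat)] for _ in range(3)],
            "b": [0.0] * 3}
```

With weights exported from a trained model in place of `random_params`, `forward` would return one of the three class labels together with the class probabilities, without any deep learning framework installed on the target device.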
Preferably, in the second embodiment of the identification method of the present invention, identifying the note heads obtained by segmentation using the convolutional neural network, as shown in fig. 15, comprises:
inputting each complete note obtained by note-location segmentation into the convolutional neural network; classifying it as a solid note head, a hollow note head or background by comparison with the data in the note head data set; discarding the background; and simultaneously determining the position of the note head within the complete note by comparison with the note head position data in the note head data set.
In practical application, playable electronic music score can be generated according to the identified note information for playing.
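One way the recognized note head position and the staff line coordinates could be turned into a playable pitch is sketched below. The text does not specify this mapping, so the treble clef, the absence of key signatures and accidentals, and the function name `head_to_pitch` are all assumptions.

```python
def head_to_pitch(head_y, staff_line_ys):
    """Map a note head's vertical position to (note name, MIDI number).

    Assumes a treble clef, so the bottom staff line is E4, and ignores
    key signatures and accidentals. Image y grows downward, so the bottom
    line has the largest y-coordinate.
    """
    ys = sorted(staff_line_ys)
    half_interval = (ys[-1] - ys[0]) / 8.0  # one line-to-space step
    steps = round((ys[-1] - head_y) / half_interval)
    # Diatonic index counted from C0; E4 is 4 * 7 + 2 = 30.
    octave, idx = divmod(30 + steps, 7)
    name = "CDEFGAB"[idx]
    midi = 12 * (octave + 1) + (0, 2, 4, 5, 7, 9, 11)[idx]
    return f"{name}{octave}", midi
```

For staff lines at y = 100, 110, 120, 130 and 140, a head centered on the middle line (y = 120) maps to B4, and a head one ledger line below the staff (y = 150) maps to C4.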
When note identification was performed with the second embodiment on a Samsung Galaxy S3, tested on its CPU, the note identification speed reached 500 fps with an accuracy of 98.71%.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, each embodiment is described in a related manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (8)

1. A music score image recognition method for an electronic device for music score image recognition, characterized by comprising,
acquiring a staff image to be processed through a camera and transmitting the staff image to a main control circuit;
the main control circuit identifies the staff images to be processed and identifies each complete note;
the main control circuit sends corresponding sound digital signals to the sound card circuit according to the recognized complete notes, and the sound card circuit converts the received sound digital signals into playable analog signals and transmits the playable analog signals to the sounding component for playing;
the main control circuit identifies the staff image to be processed, including,
drawing the edge information of the image by adopting an edge detection method on the staff image to be processed, and detecting the position coordinates of the staff by adopting a straight line detection method;
carrying out note positioning segmentation on the staff image to be processed by adopting a preset note classifier to obtain the position of each complete note in the image, wherein the training process of the note classifier comprises the following steps: establishing a positive sample data set and a negative sample data set, wherein the positive sample data set comprises position data of a positioning frame and image data of the staff image in the positioning frame, the positive sample data set comprises image data of complete notes, and the negative sample data set comprises image data of the parts of the music score other than the complete notes; extracting channel characteristics of each sample in the positive sample data set and the negative sample data set, and training the note classifier; and wherein the note positioning segmentation of the staff image to be processed comprises: randomly selecting a plurality of candidate positioning frames on the staff image to be processed, scanning the positioning frames one by one, extracting channel characteristics from the image in each positioning frame, inputting the extracted channel characteristics into the note classifier, and judging whether the image in the positioning frame is a positive sample or a negative sample, a positive sample being judged to be a complete note in the music score and a negative sample being judged to be background of the music score; discarding the background of the music score, thereby obtaining the complete notes in the staff image to be processed; and obtaining the position of each complete note in the image by comparing the position data of the positioning frames in the note classifier;
identifying the note heads obtained by segmentation by adopting a preset convolutional neural network, judging whether each note head is a solid note head or a hollow note head, and obtaining the positions of the note heads;
identifying each complete note according to the obtained staff line position coordinates, the position of each complete note in the image, whether the complete note has a solid note head or a hollow note head, and the position of the note head;
the electronic equipment for music score image recognition comprises a shell, a sounding component, a main board arranged in the shell and an image scanning component arranged at the first end part of the shell;
the main board is provided with a main control circuit and a sound card circuit electrically connected with the main control circuit;
the image scanning component comprises the camera, and the camera sends the shot music score image to the main control circuit for processing;
the sound generating component is connected with the sound card circuit and generates sound according to the sound signal sent by the main control circuit.
2. The score image recognition method of claim 1, wherein the training process of the convolutional neural network comprises,
establishing a note head data set comprising three classes of data: solid note heads, hollow note heads and background;
constructing a convolutional neural network comprising 2 convolutional layers, 2 downsampling layers and 1 fully connected layer;
and inputting the note head image data in the note head data set into the convolutional neural network to complete training.
3. The music score image recognition method of claim 2, wherein identifying the note heads obtained by segmentation using the convolutional neural network comprises,
inputting each complete note obtained by note positioning segmentation into the convolutional neural network; classifying it as a solid note head, a hollow note head or background by comparison with the data in the note head data set; discarding the background; and simultaneously determining the position of the note head within the complete note by comparison with the note head position data in the note head data set.
4. The music score image recognition method of claim 1, wherein the staff image to be processed is specifically a binary image obtained by denoising, contrast-enhancing and graying the staff image so as to reduce noise and uneven illumination.
5. The music score image recognition method of claim 1, wherein a power circuit electrically connected to the main control circuit is further provided on the main board;
the image scanning component further comprises a scanning roller, and the scanning roller and the camera are electrically connected with the main control circuit;
the power supply circuit is respectively and electrically connected with the scanning roller, the camera and the sounding component to supply power to the scanning roller, the camera and the sounding component;
the second end of the shell is provided with a battery compartment and a compartment cover, and the battery compartment is connected with a power circuit on the main board.
6. The musical score image recognition method as claimed in claim 1, wherein the housing is a pen-shaped housing; the image scanning component is arranged at the first end part of the pen-shaped shell;
the sound generating component is arranged above the image scanning component, and the image scanning component and the sound generating component form a first end part into a pen point shape;
the main board is arranged at a position close to the pen point in the pen-shaped shell;
at least 2 main board mounting columns are arranged in the pen-shaped shell; the main board is fixed in the pen-shaped shell through the at least 2 main board mounting columns.
7. The music score image recognition method of claim 6, wherein the second end of the pen-shaped housing is provided with a battery compartment and a hatch cover, the battery compartment being connected to a power circuit on the main board.
8. The music score image recognition method of claim 6, wherein the second end of the pen-shaped case is provided with an external power line connected to a power circuit on the main board.
CN201610859907.2A 2016-09-28 2016-09-28 Electronic equipment and identification method for music score image identification Active CN106297755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610859907.2A CN106297755B (en) 2016-09-28 2016-09-28 Electronic equipment and identification method for music score image identification

Publications (2)

Publication Number Publication Date
CN106297755A CN106297755A (en) 2017-01-04
CN106297755B true CN106297755B (en) 2023-06-13

Family

ID=57715584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610859907.2A Active CN106297755B (en) 2016-09-28 2016-09-28 Electronic equipment and identification method for music score image identification

Country Status (1)

Country Link
CN (1) CN106297755B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107945780A (en) * 2017-11-23 2018-04-20 北京物灵智能科技有限公司 A kind of instrument playing method and device based on computer vision
CN108766463B (en) * 2018-04-28 2019-05-10 平安科技(深圳)有限公司 Electronic device, the music playing style recognition methods based on deep learning and storage medium
CN108665888A (en) * 2018-05-11 2018-10-16 西安石油大学 A kind of system and method that written symbol, image are converted into audio data
CN110796146A (en) * 2019-10-11 2020-02-14 上海上湖信息技术有限公司 Bank card number identification method, model training method and device
CN112133264B (en) * 2020-08-31 2023-09-22 广东工业大学 Music score recognition method and device
CN113076967B (en) * 2020-12-08 2022-09-23 无锡乐骐科技股份有限公司 Image and audio-based music score dual-recognition system
CN112925944A (en) * 2021-03-10 2021-06-08 上海妙克信息科技有限公司 Music score identification method, terminal equipment and computer readable storage medium
CN115019600A (en) * 2022-01-17 2022-09-06 滁州职业技术学院 Music staff recognizer and music recognition method thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0997060A (en) * 1995-09-29 1997-04-08 Kawai Musical Instr Mfg Co Ltd Musical score recognition device
CN1283832A (en) * 1999-08-10 2001-02-14 曾平蔚 Optical scan method and device for reading music score
JP2003242439A (en) * 2003-02-07 2003-08-29 Kawai Musical Instr Mfg Co Ltd Musical score recognizing device
CN103646247A (en) * 2013-09-26 2014-03-19 惠州学院 Music score recognition method
CN105022993A (en) * 2015-06-30 2015-11-04 北京邮电大学 Stave playing system based on image recognition technology
CN206097909U (en) * 2016-09-28 2017-04-12 北京邮电大学 A electronic equipment for music book image recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘晓翔等.乐谱识别中音符结构分析方法.计算机工程与设计.2009,30(3),709-712、778. *

Also Published As

Publication number Publication date
CN106297755A (en) 2017-01-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant