CN111950552A - Method for recognizing southern music score by using computer - Google Patents

Method for recognizing southern music score by using computer

Info

Publication number
CN111950552A
CN111950552A (application CN202010819712.1A)
Authority
CN
China
Prior art keywords
image
southern
music score
model
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010819712.1A
Other languages
Chinese (zh)
Inventor
徐凌云
肖继华
唐文千
卓佳源
杨晓琪
郭奕晗
武星
晃国清
郁抒思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Huasheng Intelligent Technology Co Ltd
Original Assignee
Shanghai Huasheng Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Huasheng Intelligent Technology Co Ltd filed Critical Shanghai Huasheng Intelligent Technology Co Ltd
Priority to CN202010819712.1A
Publication of CN111950552A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/243Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • G06F40/117Tagging; Marking up; Designating a block; Setting of attributes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration by the use of local operators
    • G06T5/30Erosion or dilatation, e.g. thinning
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words

Abstract

The invention relates to the technical field of Nanyin (southern music) score information acquisition, and discloses a method for recognizing a Nanyin score by using a computer, which solves the technical problem that existing Nanyin scores cannot be recognized by a computer. The method comprises the following steps: first, model training, through which a note recognizer is obtained; and second, image recognition, which comprises the following steps: S1, image loading; S2, image scaling; S3, graying; S4, binarization; S5, tilt correction; S6, column cutting; S7, character cutting; S8, note recognition; S9, XML conversion. According to the technical scheme, on the basis of image recognition, a series of preprocessing operations are performed on the image, and an image classifier capable of recognizing the special characters of the Nanyin score is then obtained by training samples with a convolutional neural network, so that a computer can recognize the Nanyin score and typeset an electronic version of it, achieving the purposes of convenient copying, popularization and inheritance.

Description

Method for recognizing southern music score by using computer
Technical Field
The invention relates to the technical field of southern music score information acquisition, in particular to a method for identifying a southern music score by using a computer.
Background
Nanyin of Quanzhou, Fujian, has long attracted attention as a precious "living fossil" of traditional music. However, under the impact of the multicultural environment of the new century, the complexity and uniqueness of the Nanyin notation have made its protection and inheritance difficult. First, the Nanyin score uses its own special notation: its unique pitch names, fingering marks and beat marks cannot be recognized by a computer, so scores have always been copied and handed down by hand. This makes Nanyin scores hard to recognize, edit and typeset, and leads to high learning costs and a high learning threshold, which hinders popularization and inheritance.
Disclosure of Invention
Aiming at the technical problem, raised in the background art, that a Nanyin music score cannot be recognized by a computer, the method of the invention performs a series of preprocessing operations on the image on the basis of image recognition, and then obtains an image classifier capable of recognizing the special characters of the Nanyin score by training samples with a convolutional neural network (CNN), so that a computer can recognize the Nanyin score and typeset an electronic version of it, thereby achieving the purposes of convenient copying, popularization and inheritance.
In order to achieve the purpose, the invention provides the following technical scheme:
an identification method for identifying a southern musical score by using a computer, comprising the steps of:
firstly, model training: obtaining a note recognizer through the model training steps;
and secondly, image recognition:
the image recognition comprises the following steps:
s1, image loading: loading the scanned whole batch of southern music score images into a memory;
s2, image scaling: adjusting the image to a set size;
s3, graying: converting the color image into a grayscale image;
s4, binarization: processing the image into black and white;
s5, tilt correction: for the image with the inclination angle, performing inclination correction on the image;
s6, column cut: performing opening operation on the binary image in the vertical direction, wherein the opening operation is used for eliminating characters in the image, and leaving a vertical frame, so that each column of the music score can be distinguished;
s7, character cutting: obtaining cut and square character images as samples needed by the model;
s8, note identification: outputting the character images preprocessed in steps S1 to S7 to the note recognizer obtained by model training, outputting the codes corresponding to the notes, and finally mapping the codes to the specific Nanyin pitch names, fingering marks and beat marks;
s9, XML conversion: and outputting the text into an XML file to form formatted text.
Through the technical scheme, the core of the Nanyin score recognition algorithm is that, after special processing of the images, a convolutional neural network (CNN) is used to train samples and obtain a note recognizer capable of recognizing the special characters of the Nanyin score, so that the special characters on the Nanyin score are recognized. The Nanyin score is thus successfully recognized by the algorithm, and traditional scores of different forms and versions are converted into standard electronic scores. The learning cost and threshold of Nanyin are greatly reduced, so that this precious "living fossil" of traditional music can be better popularized and inherited.
The invention is further configured to: the model training comprises the following steps:
first step, image scanning: scanning the written southern music score into an image;
secondly, image loading: loading the scanned whole batch of southern music score images into a memory;
thirdly, zooming the image: adjusting the image to a set size;
fourthly, graying: processing the color image;
fifthly, binarization: processing the image into black and white;
sixthly, correcting the inclination: for the image with the inclination angle, performing inclination correction on the image;
seventh step, cutting: performing opening operation on the binary image in the vertical direction, wherein the opening operation is used for eliminating characters in the image, and leaving a vertical frame, so that each column of the music score can be distinguished;
eighth step, character cutting: obtaining cut and square character images as samples needed by the model;
ninth step, note labeling: mapping the character image to a unique ASCII code, labeling the sample, and generating a set of high-quality data set after labeling;
step ten, model training: training the network model by adopting an Adam optimizer and a cross entropy loss function, and iterating for a set number of times;
step eleven, outputting a model result: outputting the optimal model for use in the note recognizer for image recognition in claim 1.
Through the technical scheme, the note recognizer can be obtained through the model training step.
The invention is further configured to: in the image scaling step, the image is adjusted to a uniform size of 2000x3000.
Through the technical scheme, the images are adjusted to the uniform size of 2000x3000, which facilitates subsequent processing.
The invention is further configured to: the color image is grayed by using the average value method:
Gray(x, y) = (R(x, y) + G(x, y) + B(x, y)) / 3
the image processed by the average value method retains only a single-channel grayscale image.
By the technical scheme, when the color image is processed, three channels are required to be processed in sequence, and time overhead is large. Therefore, in order to increase the processing speed of the entire application system, it is necessary to reduce the amount of data to be processed by graying a color image.
The invention is further configured to: an optimal threshold value T is determined by an algorithm; a pixel whose gray value is greater than T is set to 255, and a pixel whose gray value is not greater than T is set to 0:
g(x, y) = 255 if f(x, y) > T, and g(x, y) = 0 otherwise;
the processed image contains only black and white, so that the gray-scale range is divided into a target class and a background class, realizing the binarization of the image.
Through the technical scheme, the purpose of binarization is to convert the image of the gray scale of the previous step into a black-and-white binary image, so that a cleaned edge contour line can be obtained, and follow-up processing services such as edge extraction, image segmentation, target identification and the like can be better served.
The invention is further configured to: the network model in the model training comprises four convolutional layers, two pooling layers and two fully connected layers.
The invention is further configured to: the fixed number of iterations is 1000.
Through the technical scheme, experiments show that the optimal model for image recognition can be output when the fixed number of iterations is 1000.
In conclusion, the invention has the following beneficial effects:
(1) the core of the Nanyin score recognition algorithm is that, after special processing of the image, samples are trained with a convolutional neural network (CNN) to obtain a note recognizer capable of recognizing the special characters of the Nanyin score, so that the special characters on the Nanyin score are recognized;
(2) the Nanyin score is successfully recognized by the algorithm, and traditional scores of different forms and versions are converted into standard electronic scores;
(3) the learning cost and threshold of Nanyin are greatly reduced, so that this precious "living fossil" of traditional music can be better popularized and inherited.
Drawings
FIG. 1 is a schematic diagram of the Nanyin pitch-name characters;
FIG. 2 is a schematic diagram of the Nanyin fingering characters;
FIG. 3 is a schematic diagram of the Nanyin beat (liaopai) characters;
FIG. 4 is a schematic block diagram of model training;
FIG. 5 is a schematic block diagram of an image recognition principle;
FIG. 6 is a schematic view of a projection, i.e., a horizontal projection, of a two-dimensional image on the y-axis;
FIG. 7 is a schematic view of the projection of a two-dimensional image onto the x-axis, i.e., a vertical projection;
FIG. 8 is a schematic diagram of a network training model.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
A method for recognizing the southern music score by computer includes such steps as generating a note recognizer by model training, and recognizing the image of southern music score.
Specifically, as shown in fig. 1, fig. 2, fig. 3 and fig. 4, the model training steps are as follows:
first step, image scanning: the whole algorithm identifies the specific notes in a written southern music score, so the written score is first scanned into images and stored on a computer, with one page forming one image.
Secondly, image loading: with the scanned images, we load the images of the southern music score into memory. In the training phase, we load a batch of such images; and in the subsequent identification phase, one sheet is loaded at a time.
Thirdly, zooming the image: adjusting the image to a uniform size of 2000x 3000;
fourthly, graying: when a color image is processed, the three channels often need to be processed in turn, which incurs a large time overhead. Therefore, in order to increase the processing speed of the whole application system, the color image is grayed to reduce the amount of data to be processed. In the RGB model, if R = G = B, the color is a gray color, and the common value of R, G and B is called the gray value; each pixel of a grayscale image therefore needs only one byte to store the gray value (also called the intensity or luminance value), with a gray range of 0-255. A color image can be grayed by four methods: the component method, the maximum value method, the average value method and the weighted average method. We chose the average value method:
Gray(x, y) = (R(x, y) + G(x, y) + B(x, y)) / 3
The processed image retains only a single-channel grayscale image.
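As an illustration of the graying step, the following is a minimal Python sketch assuming OpenCV and NumPy; the function name to_gray_average and the file path are illustrative only, not part of the patent.

```python
import numpy as np
import cv2

def to_gray_average(image_bgr: np.ndarray) -> np.ndarray:
    """Convert a color image to grayscale with the average-value method:
    Gray = (R + G + B) / 3, keeping a single 8-bit channel."""
    # OpenCV loads images as BGR; the channel order does not matter for an average.
    b, g, r = cv2.split(image_bgr.astype(np.float32))
    gray = (r + g + b) / 3.0
    return np.clip(gray, 0, 255).astype(np.uint8)

# Example usage on a scanned score page (path is illustrative):
# page = cv2.imread("nanyin_page_001.png")
# page = cv2.resize(page, (2000, 3000))
# gray = to_gray_average(page)
```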
Fifthly, binarization: the purpose of binarization is to convert the grayscale image of the previous step into a black-and-white binary image, so that clean edge contours can be obtained, which better serves subsequent processing such as edge extraction, image segmentation and target recognition. The specific method is to set the gray value of each pixel in the image's pixel matrix to 0 (black) or 255 (white), so that the whole image shows only a black-and-white effect. In the grayscale image the gray values range from 0 to 255; an optimal threshold value T is determined by an algorithm, a pixel whose gray value is greater than T is set to 255, and a pixel whose gray value is smaller than T is set to 0:
g(x, y) = 255 if f(x, y) > T, and g(x, y) = 0 otherwise
The processed image contains only black and white, so that the gray-scale range is divided into a target class and a background class, realizing the binarization of the image.
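A minimal sketch of the binarization step follows. The patent only states that the optimal threshold T is determined "by an algorithm"; Otsu's method is assumed here as one such algorithm.

```python
import cv2

def binarize(gray, threshold=None):
    """Map each pixel to 0 (black) or 255 (white).
    If no threshold is given, let Otsu's method pick the optimal T (an assumption;
    the patent only requires that T be chosen algorithmically)."""
    if threshold is None:
        # THRESH_OTSU ignores the supplied threshold value and computes T itself.
        t, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    else:
        t, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
    return t, binary
```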
Sixthly, tilt correction: for images with a tilt angle, tilt correction is performed before training or subsequent recognition. Before the tilt correction, we need to process the image to eliminate the characters in it. The specific method is to perform an opening operation on the processed binary image in the horizontal direction; the opening of image I by structuring element S is written
I ∘ S and is defined as
I ∘ S = (I ⊖ S) ⊕ S
The opening operation erodes the image and then dilates it. Erosion: the highlighted parts of the image are eroded and the highlighted region shrinks, so the result has a smaller highlighted area than the original image; during the operation each pixel is replaced by the minimum value in its neighbourhood, shrinking the highlighted area. Dilation: the highlighted parts of the image are expanded and the highlighted region grows, so the result has a larger highlighted area than the original image; during the operation each pixel is replaced by the maximum value in its neighbourhood, enlarging the highlighted area.
Opening operation:
1) The opening operation can remove isolated dots and burrs, while the overall position and shape remain unchanged.
2) The opening operation is a filter based on geometric operations.
3) Differences in the size of the structuring element will result in different filtering effects.
4) The selection of different structural elements results in different segmentations, i.e. different features are extracted.
After the opening operation, the text in the image disappears and the horizontal frame lines remain. With the horizontal frame lines obtained, we calculate the angle between the score frame and the image border using the angle formula for two lines with slopes k1 and k2:
tan θ = |(k2 − k1) / (1 + k1·k2)|
After the angle is obtained by calculation, the image is rotated, finally yielding a set of corrected images.
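The tilt-correction step could be sketched as follows: a horizontal opening removes the characters and keeps the frame lines, the skew angle is estimated from the remaining line pixels, and the page is rotated back. The kernel size and the use of cv2.minAreaRect for the angle estimate are assumptions, not prescribed by the patent.

```python
import cv2
import numpy as np

def deskew(binary):
    """Estimate the skew angle from the horizontal frame lines and rotate the page."""
    # Opening with a wide, flat structuring element erases characters and keeps
    # only long horizontal strokes (the frame lines). The kernel size is a guess.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (60, 1))
    lines = cv2.morphologyEx(255 - binary, cv2.MORPH_OPEN, kernel)

    # Fit a rotated rectangle to the remaining line pixels to estimate the tilt angle.
    ys, xs = np.nonzero(lines)
    if len(xs) == 0:
        return binary  # nothing to correct
    rect = cv2.minAreaRect(np.column_stack([xs, ys]).astype(np.float32))
    angle = rect[-1]
    if angle > 45:  # minAreaRect's angle convention varies between OpenCV versions
        angle -= 90

    h, w = binary.shape
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(binary, rotation, (w, h),
                          flags=cv2.INTER_NEAREST, borderValue=255)
```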
Seventh step, column cutting: after the tilt correction, an opening operation is performed on the binary image in the vertical direction. The method is the same as in the tilt-correction step, only the chosen structuring element S differs; the purpose is to eliminate the characters in the image and leave the vertical frame lines, so that each column of the score can be distinguished.
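Under the same assumptions, the column-cutting step might look like the sketch below: a tall, narrow structuring element keeps only the vertical frame lines, whose x-positions then bound the columns. The kernel size and the grouping threshold are illustrative.

```python
import cv2
import numpy as np

def cut_columns(binary, min_gap=20):
    """Split a deskewed binary page into single-column images."""
    # A tall, narrow kernel keeps only long vertical strokes (the column frame lines).
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 60))
    frames = cv2.morphologyEx(255 - binary, cv2.MORPH_OPEN, kernel)

    # x-positions that contain frame-line pixels mark the column boundaries.
    line_cols = np.where(frames.sum(axis=0) > 0)[0]
    if len(line_cols) < 2:
        return [binary]

    # Group neighbouring x-positions into individual frame lines.
    boundaries = [int(line_cols[0])]
    for x in line_cols[1:]:
        if x - boundaries[-1] > min_gap:
            boundaries.append(int(x))

    # Every pair of adjacent frame lines bounds one score column.
    return [binary[:, left:right]
            for left, right in zip(boundaries, boundaries[1:])]
```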
Eighth step, character cutting: referring to fig. 6 and fig. 7, after column cutting there are still several characters in one column, arranged both horizontally and vertically, so the image cannot yet be fed into a recognizer, and a further character-cutting step is needed. We cut the characters in the horizontal and vertical directions by horizontal and vertical projection. Horizontal projection is the projection of the two-dimensional image onto the y-axis, and vertical projection is the projection onto the x-axis. From the projection curves we can see many gaps in both directions, which correspond to the gaps between characters in the image; in this way the image can be cut into individual characters. This image preprocessing greatly reduces the complexity of the model, improves its performance, and shortens the time for training and subsequent image recognition.
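A sketch of projection-based character cutting, assuming the column image is a binary array with dark ink on a light background; the thresholds min_ink and min_size are illustrative.

```python
import numpy as np

def split_by_projection(image, axis, min_ink=1, min_size=8):
    """Cut an image along one axis wherever the ink projection drops to (near) zero.

    axis=1 sums over x and gives the horizontal projection (per-row ink counts);
    axis=0 sums over y and gives the vertical projection (per-column ink counts).
    """
    ink = (image < 128).sum(axis=axis)          # black-pixel count per row/column
    segments, start = [], None
    for i, value in enumerate(ink):
        if value >= min_ink and start is None:
            start = i                           # a character run begins
        elif value < min_ink and start is not None:
            if i - start >= min_size:
                segments.append((start, i))     # a character run ends
            start = None
    if start is not None and len(ink) - start >= min_size:
        segments.append((start, len(ink)))
    return segments

def cut_characters(column_img):
    """First cut rows with the horizontal projection, then cut each row with the vertical one."""
    chars = []
    for top, bottom in split_by_projection(column_img, axis=1):
        row = column_img[top:bottom, :]
        for left, right in split_by_projection(row, axis=0):
            chars.append(row[:, left:right])
    return chars
```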
Ninth step, note labeling: the preceding image preprocessing yields cut, upright character images, which are the samples needed by the final model. The samples are available, but the computer cannot know what each sample represents, so the samples must first be labeled. Since most of the Nanyin note characters fall outside the UTF-8 encoding set, we need to encode the characters and map these special characters to unique ASCII codes. After encoding, the samples can be labeled. Finally, a high-quality data set is produced for model training.
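The labeling step only needs a stable mapping from each special Nanyin glyph to a code the classifier can output. The sketch below uses a hypothetical code table; the patent does not disclose the actual character-to-ASCII assignments.

```python
# Hypothetical code table: each special Nanyin character is assigned a unique ASCII code
# to be used as its training label. The names and codes here are placeholders only.
NOTE_TO_CODE = {
    "pitch_gong":     65,   # e.g. mapped to ASCII 'A'
    "pitch_si":       66,   # e.g. mapped to ASCII 'B'
    "fingering_dian": 97,   # e.g. mapped to ASCII 'a'
    "beat_liao":      48,   # e.g. mapped to ASCII '0'
}
CODE_TO_NOTE = {code: name for name, code in NOTE_TO_CODE.items()}

def label_sample(image_path, note_name):
    """Pair a cut character image with its class code for the training data set."""
    return image_path, NOTE_TO_CODE[note_name]
```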
Step ten, model training: referring to FIG. 8, with the data set we can train the model. This is the heart of the Nanyin score recognition algorithm. We have devised a Nanyin score recognition method based on a convolutional neural network (CNN). The network model includes four convolutional layers, two pooling layers and two fully connected layers. Convolutional layers: kernel size 3x3, stride 2x2, ReLU activation function. Pooling layers: kernel size 2x2, stride 2x2. Fully connected layers: probability values are output through a Softmax activation function. The network model is trained with an Adam optimizer and a cross entropy loss function for a fixed number of iterations, about 1000 being recommended. The optimal model is output for image recognition.
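A minimal PyTorch sketch of a network matching this description — four 3x3/stride-2x2 convolutional layers with ReLU, two 2x2 pooling layers, two fully connected layers, trained with Adam and cross entropy for about 1000 iterations. The layer ordering, channel widths, 64x64 input size and class count are assumptions, not specified by the patent.

```python
import torch
import torch.nn as nn

class NanyinNoteNet(nn.Module):
    """Four 3x3/stride-2 conv layers, two 2x2 max-pool layers, two fully connected layers.
    Layer ordering, channel widths and the 64x64 single-channel input are assumptions."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.MaxPool2d(kernel_size=2, stride=2),                             # 16 -> 8
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 8 -> 4
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(), # 4 -> 2
            nn.MaxPool2d(kernel_size=2, stride=2),                             # 2 -> 1
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_classes),  # softmax is applied by the loss / at inference
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def train(model, loader, iterations=1000, device="cpu"):
    """Adam + cross-entropy training loop, iterating a fixed number of steps (~1000)."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()  # includes log-softmax internally
    step, data = 0, iter(loader)
    while step < iterations:
        try:
            images, labels = next(data)
        except StopIteration:
            data = iter(loader)
            images, labels = next(data)
        optimizer.zero_grad()
        loss = criterion(model(images.to(device)), labels.to(device))
        loss.backward()
        optimizer.step()
        step += 1
    return model
```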
And step eleven, outputting a model training result, namely outputting an optimal model for a note recognizer of subsequent image recognition.
As shown in fig. 1, 2, 3 and 5, the image recognition steps are as follows:
s1, image loading: with the scanned images, we load the images of the southern music score into memory. In the training phase, we load a batch of such images; and in the subsequent identification phase, one sheet is loaded at a time.
S2, image scaling: adjusting the image to a uniform size of 2000x3000;
s3, graying: when a color image is processed, the three channels often need to be processed in turn, which incurs a large time overhead. Therefore, in order to increase the processing speed of the whole application system, the color image is grayed to reduce the amount of data to be processed. In the RGB model, if R = G = B, the color is a gray color, and the common value of R, G and B is called the gray value; each pixel of a grayscale image therefore needs only one byte to store the gray value (also called the intensity or luminance value), with a gray range of 0-255. A color image can be grayed by four methods: the component method, the maximum value method, the average value method and the weighted average method. We chose the average value method:
Gray(x, y) = (R(x, y) + G(x, y) + B(x, y)) / 3
The processed image retains only a single-channel grayscale image.
S4, binarization: the purpose of binarization is to convert the grayscale image of the previous step into a black-and-white binary image, so that clean edge contours can be obtained, which better serves subsequent processing such as edge extraction, image segmentation and target recognition. The specific method is to set the gray value of each pixel in the image's pixel matrix to 0 (black) or 255 (white), so that the whole image shows only a black-and-white effect. In the grayscale image the gray values range from 0 to 255; an optimal threshold value T is determined by an algorithm, a pixel whose gray value is greater than T is set to 255, and a pixel whose gray value is smaller than T is set to 0:
g(x, y) = 255 if f(x, y) > T, and g(x, y) = 0 otherwise
The processed image contains only black and white, so that the gray-scale range is divided into a target class and a background class, realizing the binarization of the image.
S5, tilt correction: for images with a tilt angle, tilt correction is performed before training or subsequent recognition. Before the tilt correction, we need to process the image to eliminate the characters in it. The specific method is to perform an opening operation on the processed binary image in the horizontal direction; the opening of image I by structuring element S is written
I ∘ S and is defined as
I ∘ S = (I ⊖ S) ⊕ S
The opening operation erodes the image and then dilates it. Erosion: the highlighted parts of the image are eroded and the highlighted region shrinks, so the result has a smaller highlighted area than the original image; during the operation each pixel is replaced by the minimum value in its neighbourhood, shrinking the highlighted area. Dilation: the highlighted parts of the image are expanded and the highlighted region grows, so the result has a larger highlighted area than the original image; during the operation each pixel is replaced by the maximum value in its neighbourhood, enlarging the highlighted area.
Opening operation:
1) The opening operation can remove isolated dots and burrs, while the overall position and shape remain unchanged.
2) The opening operation is a filter based on geometric operations.
3) Differences in the size of the structuring element will result in different filtering effects.
4) The selection of different structural elements results in different segmentations, i.e. different features are extracted.
After the opening operation, the text in the image disappears and the horizontal frame lines remain. With the horizontal frame lines obtained, we calculate the angle between the score frame and the image border using the angle formula for two lines with slopes k1 and k2:
tan θ = |(k2 − k1) / (1 + k1·k2)|
After the angle is obtained by calculation, the image is rotated, finally yielding a set of corrected images.
S6, column cutting: after the tilt correction, an opening operation is performed on the binary image in the vertical direction. The method is the same as in the tilt-correction step, only the chosen structuring element S differs; the purpose is to eliminate the characters in the image and leave the vertical frame lines, so that each column of the score can be distinguished.
S7, character cutting: referring to fig. 6 and fig. 7, after column cutting there are still several characters in one column, arranged both horizontally and vertically, so the image cannot yet be fed into a recognizer, and a further character-cutting step is needed. We cut the characters in the horizontal and vertical directions by horizontal and vertical projection. Horizontal projection is the projection of the two-dimensional image onto the y-axis, and vertical projection is the projection onto the x-axis. From the projection curves we can see many gaps in both directions, which correspond to the gaps between characters in the image; in this way the image can be cut into individual characters. This image preprocessing greatly reduces the complexity of the model, improves its performance, and shortens the time for training and subsequent image recognition.
S8, note recognition: the character images are fed into the note recognizer obtained by model training, which outputs the codes corresponding to the notes; the codes are finally mapped back to the specific Nanyin pitch names, fingering marks and beat marks.
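Inference could then look like the following sketch: each preprocessed character image is classified, the class index is mapped to its character code, and the code to a human-readable pitch/fingering/beat name. The mapping tables are the hypothetical ones from the labeling sketch above.

```python
import torch

def recognize_characters(model, char_images, code_for_class, name_for_code, device="cpu"):
    """Run the note recognizer on cut character images and map outputs to note names."""
    model.to(device).eval()
    results = []
    with torch.no_grad():
        for img in char_images:
            # img: single-channel float tensor of shape (1, H, W), already resized/normalized.
            logits = model(img.unsqueeze(0).to(device))
            probs = torch.softmax(logits, dim=1)      # probability per class
            class_idx = int(probs.argmax(dim=1))
            code = code_for_class[class_idx]           # class index -> character code
            results.append((name_for_code[code], float(probs.max())))
    return results
```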
S9, XML conversion: the recognized notes, together with their position information in the score, are output to an XML file, forming formatted text.
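Finally, a sketch of the XML output: each recognized note is written together with its position in the score. The element and attribute names are illustrative; the patent does not specify the XML schema.

```python
import xml.etree.ElementTree as ET

def notes_to_xml(recognized, path="score.xml"):
    """Write recognized notes and their positions in the score to a formatted XML file.

    `recognized` is assumed to be a list of dicts such as
    {"column": 3, "row": 7, "code": 65, "name": "pitch_gong"} — illustrative fields only.
    """
    root = ET.Element("nanyinScore")
    for note in recognized:
        ET.SubElement(root, "note",
                      column=str(note["column"]),
                      row=str(note["row"]),
                      code=str(note["code"]),
                      name=note["name"])
    tree = ET.ElementTree(root)
    ET.indent(tree)  # pretty-print (Python 3.9+)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```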
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (7)

1. A recognition method for recognizing a southern musical score by using a computer is characterized by comprising the following steps:
firstly, model training: obtaining a note recognizer through the model training steps;
and secondly, image recognition:
the image recognition comprises the following steps:
s1, image loading: loading the scanned whole batch of southern music score images into a memory;
s2, image scaling: adjusting the image to a set size;
s3, graying: processing the color image to convert the color image into a gray image with one channel;
s4, binarization: the image is processed into black and white, so that the gray scale range is divided into a target class and a background class, and the binaryzation of the image is realized;
s5, tilt correction: for the image with the inclination angle, performing inclination correction on the image;
s6, column cut: performing opening operation on the binary image in the vertical direction, wherein the opening operation is used for eliminating characters in the image, and leaving a vertical frame, so that each column of the music score can be distinguished;
s7, character cutting: after column cutting, a plurality of characters are arranged in the horizontal row and the vertical row of the single column, and the characters in the horizontal direction and the vertical direction are cut through horizontal projection and vertical projection to obtain character images which are cut well and are square and upright, and the character images are used as samples needed by a model;
s8, note identification: outputting the character images preprocessed in steps S1 to S7 to the note recognizer obtained by model training, outputting the codes corresponding to the notes, and finally mapping the codes to the specific Nanyin pitch names, fingering marks and beat marks;
s9, XML conversion: the recognized notes, together with their position information in the score, are output to an XML file, forming formatted text.
2. The method for recognizing a southern musical score according to claim 1, wherein:
the model training comprises the following steps:
first step, image scanning: scanning the written southern music score into images and storing them on a computer, one page forming one image;
secondly, image loading: loading the scanned whole batch of southern music score images into a memory;
thirdly, zooming the image: adjusting the image to a set size;
fourthly, graying: processing the color image to convert the color image into a gray image with one channel;
fifthly, binarization: the image is processed into black and white, so that the gray scale range is divided into a target class and a background class, and the binaryzation of the image is realized;
sixthly, correcting the inclination: for the image with the inclination angle, performing inclination correction on the image;
seventh step, cutting: performing opening operation on the binary image in the vertical direction, wherein the opening operation is used for eliminating characters in the image, and leaving a vertical frame, so that each column of the music score can be distinguished;
eighth step, character cutting: after column cutting, a plurality of characters are arranged in the horizontal row and the vertical row of the single column, and the characters in the horizontal direction and the vertical direction are cut through horizontal projection and vertical projection to obtain character images which are cut well and are square and upright, and the character images are used as samples needed by a model;
ninth step, note labeling: mapping the character image to a unique ASCII code, labeling the sample, and generating a set of high-quality data set after labeling;
step ten, model training: training the network model by adopting an Adam optimizer and a cross entropy loss function, and iterating for a set number of times;
step eleven, outputting a model result: outputting the optimal model for use in the note recognizer for image recognition in claim 1.
3. A recognition method for recognizing a southern musical score using a computer according to claim 1 or 2, wherein: in the image scaling step, the image is adjusted to a uniform size of 2000x 3000.
4. A recognition method for recognizing a southern musical score using a computer according to claim 1 or 2, wherein: the color image is grayed in the graying step to reduce the amount of data to be processed; in the RGB model, the value of R, G, B is called the gray value, and the color image is grayed by selecting the average value method:
Gray(x, y) = (R(x, y) + G(x, y) + B(x, y)) / 3
the image processed by the average value method retains only a single-channel grayscale image.
5. A recognition method for recognizing a southern musical score using a computer according to claim 1 or 2, wherein: the binarization step specifically sets the gray value of each pixel in the pixel matrix of the image to 0 (black) or 255 (white), so that the whole image shows only a black-and-white effect; the gray values in the grayed image range from 0 to 255, an optimal threshold value T is determined by an algorithm, a pixel greater than the optimal threshold is set to 255, and a pixel smaller than the optimal threshold is set to 0:
g(x, y) = 255 if f(x, y) > T, and g(x, y) = 0 otherwise
the processed image contains only black and white, so that the gray-scale range is divided into a target class and a background class, realizing the binarization of the image.
6. The method for recognizing a southern musical score using a computer as claimed in claim 2, wherein: the network model in the model training comprises four convolutional layers, two pooling layers and two fully connected layers;
convolutional layers: kernel size 3x3, stride 2x2, ReLU activation function;
pooling layers: kernel size 2x2, stride 2x2;
fully connected layers: probability values are output through a Softmax activation function;
and training the network model by adopting an Adam optimizer and a cross entropy loss function, iterating for a fixed number of times, and outputting the optimal model for image recognition.
7. The method of claim 6, wherein the step of identifying the southern musical score comprises: the fixed number of iterations is 1000.
CN202010819712.1A 2020-08-14 2020-08-14 Method for recognizing southern music score by using computer Pending CN111950552A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010819712.1A CN111950552A (en) 2020-08-14 2020-08-14 Method for recognizing southern music score by using computer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010819712.1A CN111950552A (en) 2020-08-14 2020-08-14 Method for recognizing southern music score by using computer

Publications (1)

Publication Number Publication Date
CN111950552A true CN111950552A (en) 2020-11-17

Family

ID=73342398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010819712.1A Pending CN111950552A (en) 2020-08-14 2020-08-14 Method for recognizing southern music score by using computer

Country Status (1)

Country Link
CN (1) CN111950552A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139866A (en) * 2015-08-10 2015-12-09 泉州师范学院 Nanyin music recognition method and device
CN106446952A (en) * 2016-09-28 2017-02-22 北京邮电大学 Method and apparatus for recognizing score image
CN110598581A (en) * 2019-08-25 2019-12-20 南京理工大学 Optical music score recognition method based on convolutional neural network
CN111104869A (en) * 2019-11-26 2020-05-05 杭州电子科技大学 Method for digitizing work-ruler spectrum capable of identifying content of small characters
CN111291696A (en) * 2020-02-19 2020-06-16 南京大学 Handwritten Dongba character recognition method based on convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘乃歌 等: "AI自动翻译南音——以古典名曲《夫为功名》为例" (AI automatic translation of Nanyin, taking the classical piece "Fu Wei Gong Ming" as an example), 《艺术教育》 (Art Education) *

Similar Documents

Publication Publication Date Title
CN111723585B (en) Style-controllable image text real-time translation and conversion method
JP3133403B2 (en) Neighborhood block prediction bit compression method
JP2014106961A (en) Method executed by computer for automatically recognizing text in arabic, and computer program
CN109886174A (en) A kind of natural scene character recognition method of warehouse shelf Sign Board Text region
CN110766020A (en) System and method for detecting and identifying multi-language natural scene text
Jacobs et al. Text recognition of low-resolution document images
CN113255659A (en) License plate correction detection and identification method based on MSAFF-yolk 3
CN113435436A (en) Scene character recognition method based on linear constraint correction network
Sethy et al. Off-line Odia handwritten numeral recognition using neural network: a comparative analysis
Anam et al. An approach for recognizing Modi Lipi using Otsu’s Binarization algorithm and kohenen neural network
US20220262006A1 (en) Device for detecting an edge using segmentation information and method thereof
JP2997403B2 (en) Handwritten character recognition method and apparatus
Rahiman et al. Printed Malayalam character recognition using back-propagation neural networks
Herwanto et al. Zoning feature extraction for handwritten Javanese character recognition
CN111950552A (en) Method for recognizing southern music score by using computer
CN112149644A (en) Two-dimensional attention mechanism text recognition method based on global feature guidance
Chandio et al. Multi-font and multi-size printed Sindhi character recognition using Convolutional Neural Networks
CN112036290A (en) Complex scene character recognition method and system based on class mark coding representation
Jameel et al. A REVIEW ON RECOGNITION OF HANDWRITTEN URDU CHARACTERS USING NEURAL NETWORKS.
Farkya et al. Hindi speech synthesis by concatenation of recognized hand written devnagri script using support vector machines classifier
KR100456620B1 (en) Hangul character recognition method
CN112926603A (en) Music score recognition method, device, equipment and storage medium
CN111738255A (en) Guideboard text detection and recognition algorithm based on deep learning
Rahiman et al. Bilingual OCR system for printed documents in Malayalam and English
Radhi Text Recognition using Image Segmentation and Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201117)