CN110599987A - Piano note recognition algorithm based on convolutional neural network - Google Patents


Info

Publication number
CN110599987A
CN110599987A · Application CN201910787062.4A
Authority
CN
China
Prior art keywords
note
neural network
piano
short
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910787062.4A
Other languages
Chinese (zh)
Inventor
董瓒
马学健
郭玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN201910787062.4A priority Critical patent/CN110599987A/en
Publication of CN110599987A publication Critical patent/CN110599987A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G10H1/0008 — Associated control or indicating means
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 — Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 — Speech or voice analysis techniques characterised by the analysis technique using neural networks
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 — Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 — Speech or voice analysis techniques specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Auxiliary Devices For Music (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a piano note recognition algorithm based on a convolutional neural network, which mainly comprises the following steps: find the starting point and end point of each note in a continuous piano audio with an endpoint detection algorithm; divide the complete piano audio into a set of single-note audio files; draw a spectrogram of each note; and input the spectrograms into a trained neural network to complete the recognition. The invention provides an algorithm that searches for peak points of the short-time energy difference and combines them with double thresholds, overcoming the traditional double-threshold algorithm's excessive dependence on threshold settings. By drawing spectrograms, the audio signal is converted into digital images for recognition, which avoids the frequency-doubling errors produced when traditional time-domain methods extract the fundamental frequency, and greatly improves computation speed and accuracy compared with traditional frequency-domain methods.

Description

Piano note recognition algorithm based on convolutional neural network
Technical Field
The invention belongs to the field of audio signal processing, and particularly relates to a piano note identification algorithm based on a convolutional neural network.
Background
With economic development and cultural progress, the number of music enthusiasts keeps growing, but constrained by factors such as time and energy, a considerable share of them choose to teach themselves and practice in their spare time. Lacking professional guidance, they often play wrong notes without being able to tell; software that automatically recognizes the sounds of a piano performance can help them to a great extent. At the same time, piano note recognition can also reduce the workload of music professionals and benefits intelligent music processing and composition.
The piano note identification algorithm mainly comprises an end point detection part, a note segmentation part and a pitch identification part.
Endpoint detection and note segmentation are key steps before note recognition, and accurate endpoint detection is a precondition for accurate note recognition. The double-threshold algorithm is the most classical endpoint-detection method: it sets high and low thresholds for the short-time energy (denoted δ1 and δ2) and for the short-time zero-crossing rate (denoted Z1 and Z2), dividing a complete audio file into four stages. 1. Silent segment: short-time energy below δ2. 2. Transition segment: short-time energy above δ2 but below δ1, and short-time zero-crossing rate above Z2. 3. Music segment: short-time energy above δ1 and short-time zero-crossing rate above Z1. 4. Ending segment: short-time energy below δ2 or short-time zero-crossing rate below Z2. In practice noise must also be considered, so besides these four thresholds, a shortest tone-segment length and a longest transition-segment length are additionally set to distinguish noise and to prevent a tone from being truncated early. The accuracy of the algorithm therefore depends mainly on the threshold settings, and the thresholds are usually taken from the background sound of the first few frames of the recording, which places requirements on the recording itself: if a small pop occurs at the start of the recording, accuracy drops sharply, so the method lacks practicality.
Conventional pitch identification has focused on research in both the time domain and the frequency domain. The short-time autocorrelation function measures the similarity of two signals in the time domain and is commonly used to detect the synchrony and periodicity of signals. The property that the autocorrelation necessarily has a maximum at integer multiples of the period provides an important basis for extracting the piano pitch, i.e. the fundamental frequency, with the short-time autocorrelation function. The traditional autocorrelation method extracts the fundamental frequency by plotting the short-time autocorrelation curve: the autocorrelation function shows a peak at the pitch period, so the interval between two adjacent peaks is the pitch period. In general, however, the fundamental component is not the strongest component, and rich harmonic components make the waveform of the audio signal very complex, so frequency-doubling errors often occur, i.e. the estimated fundamental frequency is double or half the actual fundamental frequency. The wavelet analysis method, a technique from applied mathematics, performs local transformations of the signal in time and frequency and can effectively extract the fundamental-frequency information in a music signal. Concretely, a wavelet-component curve at a given scale is drawn; the number n of sampling points between two maxima of the curve reflects the pitch period. The scale is then varied continuously and the number of sampling points between adjacent maxima is recalculated at each scale; if this number no longer changes, the fundamental frequency is determined.
Therefore, although the wavelet analysis method can effectively extract the fundamental frequency, the calculation amount is huge because wavelet components under different levels are calculated.
In summary, for endpoint detection the traditional double-threshold algorithm depends too heavily on the threshold settings. For fundamental-frequency extraction in pitch identification, the traditional time-domain methods are prone to frequency-doubling errors and have low accuracy, while the traditional frequency-domain methods have high algorithmic complexity, a large computational load, and low efficiency. Both require a high signal-to-noise ratio and cannot accurately process audio signals with a low signal-to-noise ratio.
Disclosure of Invention
The invention aims to provide a piano note identification algorithm based on a convolutional neural network.
The technical solution for realizing the purpose of the invention is as follows: a piano note identification algorithm based on a convolutional neural network comprises the following steps:
step 1, finding out a starting point and an end point of each note from a continuous piano audio through an end point detection algorithm;
step 2, dividing the complete piano audio into a set of single note audio files according to the starting point and the ending point of each note;
step 3, drawing a spectrogram of each note;
and 4, inputting the spectrogram into the trained neural network to finish recognition.
Compared with the prior art, the invention has the following remarkable advantages: (1) compared with the traditional double-threshold algorithm, the short-time energy difference and double-threshold-based endpoint detection algorithm provided by the invention does not excessively depend on the setting of the threshold value, and has high accuracy; (2) compared with the traditional time-frequency domain method, the algorithm for identifying the piano pitch by using the convolutional neural network provided by the invention has the advantages of no frequency doubling error, strong noise resistance, simple algorithm, high operation speed and high accuracy.
Drawings
FIG. 1 is a flow chart of the piano note identification algorithm based on the convolutional neural network of the present invention.
FIG. 2 is a diagram of a neural network used in the present invention.
Fig. 3 is a short time energy plot.
Fig. 4 is a graph illustrating a short-time energy difference curve.
Fig. 5 is a diagram illustrating a short-time energy difference peak point.
FIG. 6 is a schematic diagram of short-term energy difference peak screening.
Detailed Description
As shown in FIG. 1, the piano note identification algorithm based on the convolutional neural network of the present invention comprises the following steps:
step 1, reading a section of audio signal, performing framing and windowing on the audio signal, and performing normalization pretreatment.
Framing and windowing represent the music signal, a non-stationary process, as a combination of frame sequences that are approximately stationary and time-invariant; this is the basis of the subsequent steps that compute the relevant features of the music signal.
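As an illustration, this preprocessing step might be sketched in plain Python as follows; the Hanning window matches the embodiment later in this description, while the frame and hop lengths are free parameters chosen by the caller:

```python
import math

def hann_window(n):
    # Hanning window: w(k) = 0.5 * (1 - cos(2*pi*k / (n - 1)))
    return [0.5 * (1 - math.cos(2 * math.pi * k / (n - 1))) for k in range(n)]

def normalize(signal):
    # Scale the signal so its largest absolute amplitude is 1.
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]

def frame_signal(signal, frame_len, hop_len):
    # Split the signal into overlapping frames and window each one.
    window = hann_window(frame_len)
    return [[signal[start + k] * window[k] for k in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop_len)]
```

With a 10-sample signal, frame length 4, and hop 2, this yields four windowed frames.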
Step 2, calculate and plot the short-time energy difference curve of adjacent frames. The short-time energy and short-time energy difference formulas are:
Ei = Σ(m=1..L) Si(m)²
ΔEi = Ei - Ei-1
where Si(m) is the amplitude of the m-th point of the i-th frame and L is the frame length.
Because the short-time energy difference is computed between frames, ΔEi filters out small energy fluctuations present in the original signal and smooths the energy variation of the whole audio. Using the difference ΔEi of two adjacent frames makes the note onset easier to determine than using the short-time energy of each individual frame.
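A minimal sketch of the short-time energy and energy-difference computation, operating on the windowed frames (function names are illustrative):

```python
def short_time_energy(frames):
    # E_i: sum of squared amplitudes over each frame.
    return [sum(s * s for s in frame) for frame in frames]

def energy_difference(energies):
    # Delta E_i = E_i - E_{i-1}; the first frame has no predecessor.
    return [energies[i] - energies[i - 1] for i in range(1, len(energies))]
```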
And step 3, searching and marking all maximum value points (peak value points) in the curve as candidate note starting points.
At this stage the peak points include both extrema caused by background noise in the audio signal and the extrema of the note signal, and must be filtered.
And 4, setting the minimum peak height according to the background environment sound, and setting the shortest distance between adjacent peak points according to the playing speed.
The minimum peak height is mainly used to filter background noise, while the shortest distance between adjacent peak points is mainly used to filter pseudo endpoints within a note, preventing one note from being cut multiple times; it must be adjusted according to the tempo at which the piano is played.
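Steps 3-5 amount to peak picking with a height constraint and a spacing constraint. A dependency-free sketch (keeping the taller of two close peaks is an assumption consistent with the patent's treatment of pseudo endpoints):

```python
def find_note_onsets(delta_e, min_height, min_distance):
    # Candidate onsets: local maxima of the energy-difference curve
    # that reach the minimum peak height (drops background noise).
    peaks = [i for i in range(1, len(delta_e) - 1)
             if delta_e[i] > delta_e[i - 1] and delta_e[i] >= delta_e[i + 1]
             and delta_e[i] >= min_height]
    # Enforce the minimum spacing, keeping the taller of two close peaks
    # (the shorter one is treated as a pseudo onset inside the same note).
    peaks.sort(key=lambda i: delta_e[i], reverse=True)
    kept = []
    for i in peaks:
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)
```

For example, the curve `[0, 5, 0, 4, 0, 0, 6, 0]` with minimum height 3 and minimum distance 3 keeps the peaks at frames 1 and 6 and discards the close, lower peak at frame 3.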
And 5, screen the peak points from step 3 according to the minimum peak height and shortest peak distance set in step 4; the frames corresponding to the retained points are the starting points of the notes.
Step 6, calculate the short-time zero-crossing rate of each frame using the formula:
Zi = (1/2) Σm |sgn[Si(m)] - sgn[Si(m-1)]| · w(n-m)
where w(n) is a window function and sgn is the sign function, defined as sgn[x] = 1 for x ≥ 0 and sgn[x] = -1 for x < 0.
the short-time zero-crossing rate measurement has the significance that the periodic change of the signal can be reflected to a certain extent. For sampled sinusoidal periodic signals, the average zero crossing rate must be twice the signal frequency multiplied by the sampling period, and when the sampling period is fixed, the zero crossing rate reflects the signal frequency information. Especially for regular musical tone signals, the zero-crossing rate is distributed in a certain range, and the rule can be used for distinguishing musical tones from noise because the zero-crossing rate of the noise is larger.
And 7, setting two thresholds of short-time energy and short-time zero-crossing rate, and respectively calculating corresponding end points of each starting point obtained in the step 5.
And 8, check the position of the endpoint corresponding to each starting point; if an endpoint falls after the next starting point, take the frame 10 frames before that next starting point as the corresponding endpoint.
And 9, calculating the difference value of each pair of start and stop points, judging the difference value as noise if the difference value is smaller than the set shortest note length, deleting the pair of start and stop points from the set, and finally obtaining the start and stop points of all notes.
Because steps 8 and 9 re-judge and re-screen every start and stop point, the algorithm's dependence on threshold settings is greatly reduced and its accuracy is improved.
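Steps 8 and 9 can be sketched as two small filters over the list of (start, end) frame pairs; the 10-frame backoff follows step 8, and the minimum note length is a caller-supplied threshold:

```python
def fix_endpoints(pairs):
    # Step 8: if an endpoint overruns the next note's onset, pull it
    # back to 10 frames before that onset.
    fixed = []
    for k, (start, end) in enumerate(pairs):
        if k + 1 < len(pairs) and end > pairs[k + 1][0]:
            end = pairs[k + 1][0] - 10
        fixed.append((start, end))
    return fixed

def filter_short_notes(pairs, min_note_frames):
    # Step 9: start/end pairs shorter than the minimum note length
    # are judged to be noise and dropped.
    return [(s, e) for s, e in pairs if e - s >= min_note_frames]
```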
And step 10, dividing the continuous notes in the audio into single notes according to the start and stop point information obtained in the step 9.
And step 11, drawing a spectrogram of each note.
And step 12, input the spectrogram into the trained neural network to obtain the pitch. The neural network structure is shown in Fig. 2. All convolution kernels in the network are 3 × 3, the pooling layers use max pooling, fully-connected layer 1 has 1024 neurons, and fully-connected layer 2 has 88 neurons, corresponding to the 88 pitches of the piano.
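The description fixes only the kernel size (3 × 3), the pooling type, and the fully-connected sizes (1024 and 88); the number of convolution/pooling blocks, input resolution, stride, and padding below are assumptions used purely to trace feature-map sizes:

```python
def conv2d_out(size, kernel=3, stride=1, pad=0):
    # Output side length of a square convolution layer.
    return (size + 2 * pad - kernel) // stride + 1

def maxpool_out(size, kernel=2, stride=2):
    # Output side length of a square max-pooling layer.
    return (size - kernel) // stride + 1

def trace_network(input_size=128, blocks=2):
    # Feature-map side length after the assumed conv + max-pool blocks;
    # the flattened result would feed fully-connected layers of 1024 and
    # then 88 neurons (one per piano pitch).
    size = input_size
    for _ in range(blocks):
        size = maxpool_out(conv2d_out(size))
    return size
```

Under these assumptions, a 128 × 128 spectrogram shrinks to 126 after the first 3 × 3 convolution, 63 after pooling, and 30 after the second block.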
The present invention will be described in detail below with reference to the accompanying drawings and examples.
Examples
The audio file used in the embodiment is an artificially recorded piano playing, and comprises 8 notes in total.
Step 1, after obtaining the recording file, perform framing and windowing on it. The sampling rate is 44100 Hz, and the window function is the commonly used Hanning window, defined as:
w(n) = 0.5[1 - cos(2πn/(N-1))], 0 ≤ n ≤ N-1
where N is the window length.
Step 2, after framing, compute for each frame the short-time energy and the short-time zero-crossing rate, and for adjacent frames the short-time energy difference, according to the formulas:
Ei = Σ(m=1..L) Si(m)²
ΔEi = Ei - Ei-1
Store the results in arrays and plot the curves; the short-time energy is shown in Fig. 3 and the short-time energy difference in Fig. 4.
And 3, after obtaining the short-time energy difference curve, find all peaks (maximum points) in the curve, mark them with red asterisks, and store the peak points and peak values in an array for later use. As shown in Fig. 5, the peaks caused by background noise are generally small and differ significantly from the peaks near note onsets. Several peaks may be detected within the duration of one note: the highest is the true note starting point, while the neighboring peaks are pseudo endpoints, whose peak values are slightly lower than that of the true starting point and which lie close to it.
And 4, set the minimum peak height and the shortest peak distance. The minimum peak height is an empirical value that only needs to separate piano tones from the ambient background sound, while the shortest peak distance is related to the tempo adopted when playing. Screen all marked peak points against these two thresholds: peak points lower than the minimum peak height, i.e. noise, are deleted from the array together with their peak values; among points whose peaks exceed the minimum height but whose distance to an adjacent peak point is below the shortest peak distance, only the point with the larger peak value is retained and the others, i.e. pseudo endpoints, are deleted. The final screening result is shown in Fig. 6.
And 5, with the candidate starting points of all notes obtained, start from each starting point and judge frame by frame whether the short-time energy and short-time zero-crossing rate of each frame simultaneously meet the threshold conditions. If a frame meets the endpoint condition, continue to check whether the next 9 frames also meet it; if they do, the frame is a candidate endpoint, otherwise continue searching until an endpoint is found. After a candidate endpoint is obtained, check whether it lies before the next starting point; if so, it is taken as the endpoint of the current starting point, otherwise the endpoint search has failed and the endpoint of the current starting point is set to the frame 5 frames before the next starting point.
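The frame-by-frame endpoint search with 10-frame confirmation might be sketched as follows; treating "meets the endpoint condition" as both features falling below their thresholds is an interpretation of the text:

```python
def find_endpoint(start, energies, zcrs, energy_th, zcr_th, confirm=10):
    # Scan frame by frame from the note onset. A frame is a candidate
    # endpoint when both short-time energy and zero-crossing rate fall
    # below their thresholds; it is accepted only if the following
    # confirm-1 frames satisfy the same condition (the 9-frame check).
    i = start
    while i < len(energies):
        if energies[i] < energy_th and zcrs[i] < zcr_th:
            run = 0
            while (i + run < len(energies) and run < confirm
                   and energies[i + run] < energy_th
                   and zcrs[i + run] < zcr_th):
                run += 1
            if run == confirm:
                return i
            i += run  # confirmation failed; resume after the short run
        else:
            i += 1
    return len(energies) - 1  # fall back to the last frame
```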
Repeat the above until the endpoints corresponding to all starting points have been found, then store the starting points and endpoints in one-to-one correspondence in an array and record them as the candidate note endpoints.
And 6, calculate the difference of each pair of start and stop points; if the difference is greater than the shortest note length, keep the pair, otherwise judge it as noise and delete it from the candidate note endpoints. This completes the endpoint detection.
And 7, having obtained the start and stop points of each note, read from the original recording the segment between each pair of start and end points obtained in step 6 and extract it into an independent audio file. Repeating this for every pair completes the audio segmentation, yielding one audio file per note, 8 in total, named 1 to 8 in note order.
And 8, drawing a spectrogram of all the note audio files obtained in the step 7, wherein the abscissa of the spectrogram represents time, the ordinate represents frequency, the color represents energy, and the picture name is consistent with the audio name.
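A dependency-free sketch of the spectrogram computation, using a naive DFT per Hanning-windowed frame (a real implementation would use an FFT; the frame and hop lengths are illustrative):

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop_len=32):
    # Magnitude spectrogram: rows are time frames, columns are frequency
    # bins, matching the plot convention (x = time, y = frequency,
    # value = energy) described above.
    window = [0.5 * (1 - math.cos(2 * math.pi * n / (frame_len - 1)))
              for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2):  # keep the non-redundant bins
            s = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))
            row.append(abs(s))
        rows.append(row)
    return rows
```

A pure sine at bin 8 of a 64-point frame produces a spectrogram whose every row peaks at bin 8.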
And 9, input the spectrograms obtained in step 8 into the neural network, which automatically scales each image to the required input size; the calculation of the neural network finally yields the pitch corresponding to each note, output as a pitch name. The audio of this example contains 8 piano notes, and the final recognition results are A5, G5, E5, G5, C6, A5, G5 and A5, consistent with the pitches actually played; all are correct.

Claims (6)

1. A piano note identification algorithm based on a convolutional neural network is characterized by comprising the following steps:
step 1, finding out a starting point and an end point of each note from a continuous piano audio through an end point detection algorithm;
step 2, dividing the complete piano audio into a set of single note audio files according to the starting point and the ending point of each note;
step 3, drawing a spectrogram of each note;
and 4, inputting the spectrogram into the trained neural network to finish recognition.
2. The convolutional neural network-based piano note identification algorithm as claimed in claim 1, wherein step 1 finds the start point of each note using the energy mutation information on the time domain, and calculates the end point of each note using the double threshold algorithm in combination with the start point information;
the short-time energy formula is:
Ei = Σ(m=1..L) Si(m)²
where Si(m) is the amplitude of the m-th point of the i-th frame audio signal and L is the frame length;
the short-time energy difference is the energy difference delta E between two adjacent framesiNamely:
ΔEi=Ei-Ei-1
3. the convolutional neural network-based piano note identification algorithm as claimed in claim 2, wherein the end point detection algorithm based on short-time energy difference comprises the following steps:
A) calculating and drawing a short-time energy difference curve of two adjacent frames;
B) searching and marking all maximum value points in the curve as candidate note starting points;
C) setting a minimum peak height according to background environment sounds, and setting a shortest distance between adjacent peak points according to playing speed;
D) screening the peak points in the step B according to the minimum peak height and the minimum peak distance set in the step C, wherein the frame corresponding to the reserved points is the starting point of each note;
E) calculating the short-time zero-crossing rate of each frame using the formula:
Zi = (1/2) Σm |sgn[Si(m)] - sgn[Si(m-1)]| · w(n-m)
where w(n) is a window function and sgn is the sign function, defined as sgn[x] = 1 for x ≥ 0 and sgn[x] = -1 for x < 0;
F) setting thresholds for the short-time energy and the short-time zero-crossing rate, and calculating the endpoint corresponding to each starting point obtained in step D;
G) judging the position of the endpoint corresponding to each starting point, and if an endpoint falls after the next starting point, taking the frame 10 frames before that next starting point as the corresponding endpoint.
4. The convolutional neural network-based piano note identification algorithm as claimed in claim 3, wherein the read-in audio signal is subjected to frame windowing and normalization before endpoint detection.
5. The convolutional neural network-based piano note identification algorithm as claimed in claim 3, wherein the difference between each pair of start and stop points is calculated, if the difference is smaller than the set shortest note length, it is determined as noise, the pair of start and stop points is deleted from the set, and finally the start and stop points of all notes are obtained.
6. The convolutional neural network-based piano note identification algorithm as claimed in claim 1, wherein step 4 inputs the spectrogram into the trained neural network to obtain the pitch; all convolution kernel sizes in the neural network are 3 x 3, the pooling layer is maximum pooling, the number of neurons of the full connection layer 1 is 1024, the number of neurons of the full connection layer 2 is 88, and the size corresponds to 88 pitches of the piano.
CN201910787062.4A 2019-08-25 2019-08-25 Piano note recognition algorithm based on convolutional neural network Pending CN110599987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910787062.4A CN110599987A (en) 2019-08-25 2019-08-25 Piano note recognition algorithm based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910787062.4A CN110599987A (en) 2019-08-25 2019-08-25 Piano note recognition algorithm based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN110599987A true CN110599987A (en) 2019-12-20

Family

ID=68855426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910787062.4A Pending CN110599987A (en) 2019-08-25 2019-08-25 Piano note recognition algorithm based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN110599987A (en)



Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1093310A2 (en) * 1999-09-28 2001-04-18 Nortel Networks Limited Tone detection using neural network
US20060095254A1 (en) * 2004-10-29 2006-05-04 Walker John Q Ii Methods, systems and computer program products for detecting musical notes in an audio signal
US20080188967A1 (en) * 2007-02-01 2008-08-07 Princeton Music Labs, Llc Music Transcription
CN101652807A (en) * 2007-02-01 2010-02-17 缪斯亚米有限公司 Music transcription
US20090151544A1 (en) * 2007-12-17 2009-06-18 Sony Corporation Method for music structure analysis
CN103325382A (en) * 2013-06-07 2013-09-25 大连民族学院 Method for automatically identifying Chinese national minority traditional instrument audio data
CN104021789A (en) * 2014-06-25 2014-09-03 厦门大学 Self-adaption endpoint detection method using short-time time-frequency value
CN104143324A (en) * 2014-07-14 2014-11-12 电子科技大学 Musical tone note identification method
CN104217731A (en) * 2014-08-28 2014-12-17 东南大学 Quick solo music score recognizing method
CN105976803A (en) * 2016-04-25 2016-09-28 南京理工大学 Note segmentation method based on music score
CN108038146A (en) * 2017-11-29 2018-05-15 无锡同芯微纳科技有限公司 Musical performance artificial intelligence analysis method, system and equipment
CN110136730A (en) * 2019-04-08 2019-08-16 华南理工大学 A kind of automatic allocation system of piano harmony and method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
伍洋 (Wu Yang), "Recognition of the main frequency of musical tones based on MFCC and BP neural network", 《信息科技辑》 (Information Science and Technology Series) *
黎思泉 et al. (Li Siquan et al.), "A piano note endpoint detection algorithm fusing time-frequency information", 《科技与创新》 (Science and Technology & Innovation) *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111415681A (en) * 2020-03-17 2020-07-14 北京奇艺世纪科技有限公司 Method and device for determining musical notes based on audio data
CN111415681B (en) * 2020-03-17 2023-09-01 北京奇艺世纪科技有限公司 Method and device for determining notes based on audio data
CN111508526B (en) * 2020-04-10 2022-07-01 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN111508526A (en) * 2020-04-10 2020-08-07 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN111540378A (en) * 2020-04-13 2020-08-14 腾讯音乐娱乐科技(深圳)有限公司 Audio detection method, device and storage medium
CN111508480A (en) * 2020-04-20 2020-08-07 网易(杭州)网络有限公司 Training method of audio recognition model, audio recognition method, device and equipment
CN113593504A (en) * 2020-04-30 2021-11-02 小叶子(北京)科技有限公司 Pitch recognition model establishing method, pitch recognition method and pitch recognition device
CN112259063A (en) * 2020-09-08 2021-01-22 华南理工大学 Multi-pitch estimation method based on note transient and steady-state dictionaries
CN112420071A (en) * 2020-11-09 2021-02-26 上海交通大学 Polyphonic electronic organ note recognition method based on the constant-Q transform
CN112509601A (en) * 2020-11-18 2021-03-16 中电海康集团有限公司 Note starting point detection method and system
CN113658612A (en) * 2021-08-25 2021-11-16 桂林智神信息技术股份有限公司 Method and system for identifying played keys based on audio
CN113658612B (en) * 2021-08-25 2024-02-09 桂林智神信息技术股份有限公司 Method and system for identifying played keys based on audio frequency
CN114283841A (en) * 2021-12-20 2022-04-05 天翼爱音乐文化科技有限公司 Audio classification method, system, device and storage medium
CN116884438A (en) * 2023-09-08 2023-10-13 杭州育恩科技有限公司 Method and system for detecting musical instrument training sound level based on acoustic characteristics
CN116884438B (en) * 2023-09-08 2023-12-01 杭州育恩科技有限公司 Method and system for detecting musical instrument training sound level based on acoustic characteristics

Similar Documents

Publication Publication Date Title
CN110599987A (en) Piano note recognition algorithm based on convolutional neural network
Gillet et al. Transcription and separation of drum signals from polyphonic music
US8193436B2 (en) Segmenting a humming signal into musical notes
Ryynänen et al. Automatic transcription of melody, bass line, and chords in polyphonic music
JP5282548B2 (en) Information processing apparatus, sound material extraction method, and program
Kroher et al. Automatic transcription of flamenco singing from polyphonic music recordings
CN111369982A (en) Training method of audio classification model, audio classification method, device and equipment
CN109979488B (en) System for converting human voice into music score based on stress analysis
JP2009511954A (en) Neural network discriminator for separating audio sources from mono audio signals
CN110136730B (en) Deep-learning-based automatic piano harmony arrangement system and method
CN106997765B (en) Quantitative characterization method for human voice timbre
CN110516102B (en) Lyric time stamp generation method based on spectrogram recognition
Kirchhoff et al. Evaluation of features for audio-to-audio alignment
Azarloo et al. Automatic musical instrument recognition using K-NN and MLP neural networks
CN113192471B (en) Musical main melody track recognition method based on neural network
CN112420071B (en) Polyphonic electronic organ note recognition method based on the constant-Q transform
Arumugam et al. An efficient approach for segmentation, feature extraction and classification of audio signals
CN105895079A (en) Voice data processing method and device
TWI299855B (en) Detection method for voice activity endpoint
Gao et al. Vocal melody extraction via dnn-based pitch estimation and salience-based pitch refinement
Oudre et al. Chord recognition using measures of fit, chord templates and filtering methods
Gurunath Reddy et al. Predominant melody extraction from vocal polyphonic music signal by time-domain adaptive filtering-based method
CN112634841B (en) Guitar music automatic generation method based on voice recognition
CN111681674B (en) Musical instrument type identification method and system based on naive Bayesian model
CN114678039A (en) Singing evaluation method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191220)