CN102930873B - Information entropy based music humming detecting method - Google Patents


Info

Publication number
CN102930873B
CN102930873B (application CN201210371373.0A)
Authority
CN
China
Prior art keywords
information entropy
frame
voice
humming
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210371373.0A
Other languages
Chinese (zh)
Other versions
CN102930873A (en)
Inventor
张栋
谢志成
叶东毅
余春艳
刘会彬
张玉溪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN201210371373.0A priority Critical patent/CN102930873B/en
Publication of CN102930873A publication Critical patent/CN102930873A/en
Application granted granted Critical
Publication of CN102930873B publication Critical patent/CN102930873B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention relates to a music humming detection method based on information entropy. Exploiting the similarity in pronunciation between consecutive syllables when a person hums, the singing voice is segmented by an information-entropy method, and the segmentation result is then compared with the result of a standard file, thereby detecting whether the performer is actually singing. The detection method provided by the invention is flexible to implement and highly efficient.

Description

Music humming detection method based on information entropy
Technical field
The present invention relates to the field of speech segmentation, and more particularly to a music humming detection method based on information entropy.
Background technology
In recent years, with the popularity of music entertainment, recognition, retrieval, and scoring based on music humming have become a focus of research and application, attracting wide attention from academia and industry. As is well known, humming a song takes less effort than singing it and makes it easier to stay on pitch, so some users hum during karaoke (K-song) to raise their scores, which makes accurate scoring more difficult. The ability to distinguish whether a karaoke performer is humming or singing therefore has a large effect on scoring precision, and in turn on the user experience and product quality. When humming, the articulation of consecutive syllables is similar, so it is difficult for speech segmentation to separate the individual syllables one by one. How to use speech segmentation to accurately distinguish humming from singing is thus an important problem, both for improving scoring systems and for developing music entertainment.
Summary of the invention
The object of the present invention is to provide a music humming detection method based on information entropy that effectively detects humming during karaoke.
The present invention adopts the following scheme: a music humming detection method based on information entropy, characterized in that: exploiting the similarity in pronunciation between consecutive syllables of the voice when humming, the song is cut sentence by sentence by an information-entropy method, and the segmentation result is then compared with the result of a standard file, so as to detect whether humming occurs, comprising the following steps:
(1) after obtaining the input digital music speech signal, preprocessing the whole speech signal by filtering and normalization;
(2) dividing the speech signal into frames and computing the information entropy of each frame;
(3) dividing the whole speech signal into several segments according to the information entropy;
(4) reading the standard file; if the number of segments found is less than half the number of entries read from the standard file, judging that this section of speech is humming, otherwise judging that the speech signal is normal.
In an embodiment of the present invention, the frame length W of each frame is described as the number of sampling points within 10–30 ms, W = frame duration × sampling frequency; the frame shift WF is described as the non-overlapping part of two adjacent frames, WF = W/2.
In an embodiment of the present invention, the information entropy is described as representing the degree of disorder of a time series: the more disordered the distribution of the series, the larger the entropy, and conversely the smaller.
In an embodiment of the present invention, the standard file is described as a series of triples O_i(begin_i, end_i, C_i), where 1 <= i <= n, C_i is a lyric character, begin_i is the start time of the i-th character, end_i is the end time of the i-th character, and n is the total number of lyric characters in the speech segment.
In an embodiment of the present invention, the information entropy of each frame in said step (2) is calculated according to the following scheme: after the speech is divided into frames of length W, each frame is processed as follows: find the maximum value max in the frame; divide [0, max] into k equal-width intervals [0, x_1], [x_1, x_2], ..., [x_{k-1}, max]; count the number of frame values falling in each interval and compute the probabilities p_1, p_2, ..., p_k; then, according to the formula h = -(p_1 log p_1 + p_2 log p_2 + ... + p_k log p_k), obtain the information entropy of the frame; the information entropy sequence of the whole speech signal is denoted H.
In an embodiment of the present invention, dividing the whole speech signal into segments according to the information entropy in said step (3) is realized according to the following scheme: determine the threshold flag = max(H)/3 from the maximum of H; a valid voice segment must be longer than 150 ms, which corresponds to a length in the entropy sequence of L = (0.15 × fs)/WF; starting from the first point of H, find a point h_i > flag; if the following L consecutive points h_{i+1}, h_{i+2}, ..., h_{i+L} are all greater than flag, and the run above the threshold ends after L' points (L' > L), then the speech signal segment corresponding to the points h_i through h_{i+L'} is one independent voice segment to be split out; proceeding in the same way, all independent voice segments are found from H; their number is denoted n, and the resulting n voice segments are the segmentation result.
The beneficial effect of the present invention is as follows: the present invention proposes a music humming detection method based on information-entropy segmentation, which cuts the song sentence by sentence with an entropy-based speech segmentation method and effectively detects humming. The method is simple, flexible to implement, and highly practical.
Brief description of the drawings
Fig. 1 is a flowchart of the music humming detection method based on information entropy.
Embodiment
Referring to Fig. 1, the music humming detection method based on information entropy of the present invention exploits the similarity in pronunciation between consecutive syllables of the voice when humming: the song is cut sentence by sentence by an information-entropy method, and the segmentation result is then compared with the result of the standard file, so as to detect whether humming occurs. The details are as follows:
1. Frame division of the speech signal: first preprocess the whole speech signal by filtering and normalization. Then divide the resulting speech into short frames of length W with frame shift WF, where W is the frame length, W = frame duration × sampling frequency, and WF is the frame shift, WF = W/2. Each frame is processed as follows: find the maximum value max in the frame; divide [0, max] into k equal-width intervals [0, x_1], [x_1, x_2], ..., [x_{k-1}, max]; count the number of frame values falling in each interval and compute the probabilities p_1, p_2, ..., p_k; then obtain the information entropy of the frame from the formula h = -(p_1 log p_1 + p_2 log p_2 + ... + p_k log p_k). The information entropy sequence of the whole speech signal is denoted H.
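The frame division and per-frame entropy computation above can be sketched in Python with NumPy as follows. The number of bins k, the use of absolute sample values for the [0, max] histogram, and the 20 ms frame duration are illustrative assumptions, not fixed by the description:

```python
import numpy as np

def frame_entropy(frame, k=10):
    """Histogram-based information entropy of one speech frame.

    The frame's absolute values are binned into k equal-width
    intervals over [0, max]; h = -sum(p_i * log(p_i))."""
    x = np.abs(frame)
    m = x.max()
    if m == 0:
        return 0.0                      # silent frame: zero entropy
    counts, _ = np.histogram(x, bins=k, range=(0.0, m))
    p = counts / counts.sum()
    p = p[p > 0]                        # 0 * log 0 is taken as 0
    return float(-(p * np.log(p)).sum())

def entropy_sequence(signal, fs, frame_ms=20, k=10):
    """Entropy sequence H of the whole signal, frame shift WF = W/2."""
    w = int(frame_ms / 1000 * fs)       # frame length W in samples
    wf = w // 2                         # frame shift WF
    return np.array([frame_entropy(signal[i:i + w], k)
                     for i in range(0, len(signal) - w + 1, wf)])
```

With k bins, each frame's entropy lies between 0 (all values in one bin) and log k (values spread evenly), which is what makes it usable as a voiced/unvoiced indicator in the next step.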
2. Segmentation of the speech signal: determine the threshold flag = max(H)/3 from the maximum of H. A valid voice segment must be longer than 150 ms, which corresponds to a length in the entropy sequence of L = (0.15 × fs)/WF. Starting from the first point of H, find a point h_i > flag; if the following L consecutive points h_{i+1}, h_{i+2}, ..., h_{i+L} are all greater than flag, and the run above the threshold ends after L' points (L' > L), then the speech signal segment corresponding to the points h_i through h_{i+L'} is one independent voice segment to be split out. Proceeding in the same way, all independent voice segments are found from H; their number is denoted N, and the resulting N voice segments are the segmentation result.
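Step 2 can be sketched in the same Python setting, where `H` is the entropy sequence and `wf` the frame shift from step 1. The rule implemented is exactly the one above: keep every maximal run of points above flag = max(H)/3 whose length exceeds L = (0.15 × fs)/WF:

```python
import numpy as np

def segment_by_entropy(H, fs, wf):
    """Return (start, end) index pairs into H, one per voice segment.

    flag = max(H)/3; a run of consecutive points above flag is kept
    only if it is longer than L frames, i.e. longer than 150 ms."""
    flag = H.max() / 3.0
    L = int(0.15 * fs / wf)             # minimum run length L in frames
    segments = []
    i = 0
    while i < len(H):
        if H[i] > flag:
            j = i
            while j < len(H) and H[j] > flag:
                j += 1                  # extend the run to its end h_{i+L'}
            if j - i > L:               # keep only runs longer than 150 ms
                segments.append((i, j - 1))
            i = j
        else:
            i += 1
    return segments
```

Multiplying each returned index by `wf` recovers the sample positions of the segment boundaries in the original signal.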
3. Compare the segmentation result with the standard file to detect humming: the standard file consists of triples O_i(begin_i, end_i, C_i), where 1 <= i <= n, C_i is a lyric character, begin_i is the start time of the i-th character, end_i is its end time, and n is the total number of lyric characters in the speech segment. Suppose a sentence spans the interval from time T_m to T_{m+1}; searching the standard file gives the number M of characters contained in this interval. If N < M/2 holds, the sentence contains humming, and points are deducted in scoring.
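The comparison in step 3 reduces to counting the standard-file characters that fall inside the sentence interval and applying the rule N < M/2. A minimal sketch, assuming the standard file has already been parsed into (begin, end, character) triples with times in seconds:

```python
def is_humming(n_segments, triples, t_start, t_end):
    """Decision rule of step 3: the sentence spanning [t_start, t_end]
    is judged to contain humming when the number N of voice segments
    found by entropy segmentation is less than half the number M of
    lyric characters the standard file places inside the interval."""
    m = sum(1 for begin, end, _ in triples
            if begin >= t_start and end <= t_end)
    return n_segments < m / 2
```

The intuition is that slurred humming merges neighbouring syllables, so the segmenter finds far fewer voice segments than the lyric file predicts.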
The foregoing are only preferred embodiments of the present invention; all equivalent changes and modifications made according to the scope of the claims of the present application shall fall within the scope of the present invention.

Claims (5)

1. A music humming detection method based on information entropy, characterized in that: exploiting the similarity in pronunciation between consecutive syllables of the voice when humming, the song is cut sentence by sentence by an information-entropy method, and the segmentation result is then compared with the result of a standard file, so as to detect whether humming occurs, the method comprising the steps of:
(1) after obtaining the input digital music speech signal, preprocessing the whole speech signal by filtering and normalization;
(2) dividing the speech signal into frames and computing the information entropy of each frame;
(3) dividing the whole speech signal into several segments according to the information entropy;
(4) reading the standard file; if the number of segments found is less than half the number of entries read from the standard file, judging that this section of speech is humming, otherwise judging that the speech signal is normal; the information entropy is described as representing the degree of disorder of a time series: the more disordered the distribution of the series, the larger the entropy, and conversely the smaller.
2. The music humming detection method based on information entropy according to claim 1, characterized in that: the frame length W of each frame is described as the number of sampling points within 10–30 ms, W = frame duration × sampling frequency; the frame shift WF is described as the non-overlapping part of two adjacent frames, WF = W/2.
3. The music humming detection method based on information entropy according to claim 1, characterized in that: the standard file is described as a series of triples O_i(begin_i, end_i, C_i), where 1 <= i <= n, C_i is a lyric character, begin_i is the start time of the i-th character, end_i is the end time of the i-th character, and n is the total number of lyric characters in the speech segment.
4. The music humming detection method based on information entropy according to claim 1, characterized in that: the information entropy of each frame in said step (2) is calculated according to the following scheme: after the speech is divided into frames of length W, each frame is processed as follows: find the maximum value max in the frame; divide [0, max] into k equal-width intervals [0, x_1], [x_1, x_2], ..., [x_{k-1}, max]; count the number of frame values falling in each interval and compute the probabilities p_1, p_2, ..., p_k; then, according to the formula h = -(p_1 log p_1 + p_2 log p_2 + ... + p_k log p_k), obtain the information entropy of the frame; the information entropy sequence of the whole speech signal is denoted H.
5. The music humming detection method based on information entropy according to claim 4, characterized in that: dividing the whole speech signal into segments according to the information entropy in said step (3) is realized according to the following scheme: determine the threshold flag = max(H)/3 from the maximum of H; a valid voice segment must be longer than 150 ms, which corresponds to a length in the entropy sequence of L = (0.15 × fs)/WF; starting from the first point of H, find a point h_i > flag; if the following L consecutive points h_{i+1}, h_{i+2}, ..., h_{i+L} are all greater than flag, and the run above the threshold ends after L' points (L' > L), then the speech signal segment corresponding to the points h_i through h_{i+L'} is one independent voice segment to be split out; proceeding in the same way, all independent voice segments are found from H; their number is denoted n, and the resulting n voice segments are the segmentation result.
CN201210371373.0A 2012-09-29 2012-09-29 Information entropy based music humming detecting method Expired - Fee Related CN102930873B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210371373.0A CN102930873B (en) 2012-09-29 2012-09-29 Information entropy based music humming detecting method


Publications (2)

Publication Number Publication Date
CN102930873A CN102930873A (en) 2013-02-13
CN102930873B 2014-04-09

Family

ID=47645654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210371373.0A Expired - Fee Related CN102930873B (en) 2012-09-29 2012-09-29 Information entropy based music humming detecting method

Country Status (1)

Country Link
CN (1) CN102930873B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1703734A (en) * 2002-10-11 2005-11-30 松下电器产业株式会社 Method and apparatus for determining musical notes from sounds
CN1737798A (en) * 2005-09-08 2006-02-22 上海交通大学 Music rhythm sectionalized automatic marking method based on eigen-note
CN101383149A (en) * 2008-10-27 2009-03-11 哈尔滨工业大学 Stringed music vibrato automatic detection method
WO2010136722A1 (en) * 2009-05-29 2010-12-02 Voxler Method for detecting words in a voice and use thereof in a karaoke game
US7962530B1 (en) * 2007-04-27 2011-06-14 Michael Joseph Kolta Method for locating information in a musical database using a fragment of a melody
CN102568456A (en) * 2011-12-23 2012-07-11 深圳市万兴软件有限公司 Notation recording method and a notation recording device based on humming input


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yang Jianfeng et al., "A New Method for Pitch Segmentation of Humming Notes" (一种新的哼唱音符音高划分方法), Computer Knowledge and Technology (《电脑知识与技术》), Oct. 2011. *

Also Published As

Publication number Publication date
CN102930873A (en) 2013-02-13

Similar Documents

Publication Publication Date Title
US9262521B2 (en) Apparatus and method for extracting highlight section of music
CN105096932A (en) Voice synthesis method and apparatus of talking book
CN101159834A (en) Method and system for detecting repeatable video and audio program fragment
CN105338327A (en) Video monitoring networking system capable of achieving speech recognition
Fayolle et al. CRF-based combination of contextual features to improve a posteriori word-level confidence measures
Zheng et al. Acoustic texttiling for story segmentation of spoken documents
CN113658594A (en) Lyric recognition method, device, equipment, storage medium and product
CN104167211B (en) Multi-source scene sound abstracting method based on hierarchical event detection and context model
CN102708859A (en) Real-time music voice identification system
CN102930873B (en) Information entropy based music humming detecting method
Katte et al. Techniques for Indian classical raga identification-a survey
Bohac et al. Post-processing of the recognized speech for web presentation of large audio archive
Jeong et al. Dlr: Toward a deep learned rhythmic representation for music content analysis
Bhattacharyya et al. An approach to identify thhat of Indian Classical Music
CN106649643B (en) A kind of audio data processing method and its device
CN109829061A (en) A kind of multimedia messages lookup method and system
Leng et al. Classification of overlapped audio events based on AT, PLSA, and the combination of them
Wu et al. Singing voice detection of popular music using beat tracking and SVM classification
JP2012159717A (en) Musical-data change point detection device, musical-data change point detection method, and musical-data change point detection program
Akiba et al. DTW-Distance-Ordered Spoken Term Detection and STD-based Spoken Content Retrieval: Experiments at NTCIR-10 SpokenDoc-2.
Wu et al. Interruption point detection of spontaneous speech using inter-syllable boundary-based prosodic features
Szaszak, Idiap Research Report Idiap-RR-25-2013, July 2013
Pham Van et al. Deep Learning Approach for Singer Voice Classification of Vietnamese Popular Music
Goto et al. PodCastle and Songle: crowdsourcing-based web services for spoken document retrieval and active music listening
Nawasalkar et al. Performance analysis of different audio with raga Yaman

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140409

Termination date: 20170929