TW202042217A - Method for detecting baby cry - Google Patents
- Publication number
- TW202042217A (application number TW108116218A)
- Authority
- TW
- Taiwan
- Prior art keywords
- sound
- frame
- crying
- peak
- sound frame
- Prior art date
Landscapes
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
Description
The present invention relates to a method for detecting infant crying.
As smart electronic products become more widespread, the baby monitors of the past, which worked much like walkie-talkies, are no longer adequate, and additional functions such as automatic detection of infant crying are increasingly favored by parents. Among known cry detection methods, a common approach takes the zero-crossing rate of the sound as a feature and combines it with thresholds and specific decision rules to judge whether the received sound contains an infant's cry. However, the zero-crossing rate is strongly affected by interference from sounds other than infant crying, which degrades the accuracy of the judgment. Another common approach takes cepstral coefficients as features and discriminates with machine learning or pattern recognition algorithms; its drawbacks are that a large number of labeled samples must be collected for training and that the computational load is high.
An embodiment of the present invention discloses a method for detecting infant crying, comprising the following steps: a feature extraction step, in which a sound signal under test is input in time order to extract at least one feature set of the sound signal under test; and a feature judgment step, in which the feature sets are input in time order and used to judge whether the sound signal under test contains an infant cry, yielding a detection result. The feature extraction step further includes: framing the sound signal under test to produce at least one framed sound signal; calculating the fundamental frequency of each framed sound signal to obtain a frame fundamental frequency; performing a DC removal operation on the framed sound signal to produce a DC-removed framed sound signal; and calculating the signal intensity and signal zero-crossing rate of the DC-removed framed sound signal to obtain a frame intensity and a frame zero-crossing rate, respectively, where the frame intensity, frame zero-crossing rate, and frame fundamental frequency together constitute a feature set. The feature judgment step further includes: detecting a frame attribute from the feature set, and then judging from the frame attribute whether the sound signal under test contains infant crying, to obtain the detection result.
In a preferred embodiment, calculating the signal intensity of the sound signal under test further includes the following steps: performing a time-domain energy calculation on the DC-removed framed sound signal to produce a frame energy; and performing an energy smoothing operation on the frame energy to obtain the frame intensity.
In a preferred embodiment, calculating the signal zero-crossing rate of the sound signal under test further includes the following steps: counting the zero crossings of the DC-removed framed sound signal to produce a frame zero-crossing count; and performing a zero-crossing count smoothing operation on the frame zero-crossing count to obtain the frame zero-crossing rate.
In a preferred embodiment, calculating the signal fundamental frequency of the sound signal under test further includes the following steps: generating an energy spectrum from the framed sound signal; generating a fundamental frequency estimate from the energy spectrum; and performing a smoothing operation on the fundamental frequency estimate to obtain the frame fundamental frequency.
In a preferred embodiment, the step of generating an energy spectrum includes: windowing the framed sound signal to produce a windowed framed sound signal; applying a time-frequency transform to the windowed framed sound signal to produce a spectrum; and applying a spectral energy calculation to the spectrum to produce the energy spectrum.
In a preferred embodiment, the step of generating a fundamental frequency estimate further includes the following. First, a local peak set is generated from the energy spectrum: a frequency bin on the energy spectrum is selected as a candidate peak, and a local energy comparison is performed with the candidate peak as the reference point. If the candidate peak wins the local energy comparison, it is labeled a local peak; otherwise it is labeled as other. This continues until every frequency bin on the energy spectrum has been labeled, and the set of all local peaks is the local peak set. In the local energy comparison, the candidate peak wins if its energy is greater than the energy of every other frequency bin within a frequency range centered on the candidate peak. Next, the peak intervals are calculated: if the number of local peaks in the local peak set is higher than a local peak count threshold, the intervals between adjacent peaks in the local peak set are calculated to produce a peak interval set; otherwise, the fundamental frequency estimation result is judged unstable. Finally, the fundamental frequency is calculated from the peak interval set to produce a fundamental frequency estimation result, which further includes: excluding abnormal intervals, that is, removing abnormal extreme values from the peak interval set to obtain a normal peak interval set; checking peak interval variability, that is, calculating the difference between the extreme values of the normal peak interval set, and proceeding to the average peak interval calculation if the difference is less than a difference threshold, or otherwise judging the fundamental frequency estimation result unstable; calculating the average peak interval, that is, taking the mean of the normal peak interval set; searching for the fundamental frequency peak on the energy spectrum at the position given by the average peak interval; and fundamental frequency weighted averaging, that is, taking a weighted average of the fundamental frequency peak and whichever of its upper and lower neighboring frequency bins has the higher energy, which yields the fundamental frequency estimate.
In a preferred embodiment, the step of detecting a frame attribute from the feature set further includes: performing strong-frame detection on the frame, where the frame is judged to have the strong-frame attribute if the frame intensity is greater than an intensity threshold, and the weak-frame attribute otherwise; and, if the frame has the strong-frame attribute, performing crying-frame detection on the frame, where the frame is judged to have the crying-frame attribute if the frame zero-crossing rate falls between the upper and lower zero-crossing rate bounds, or the frame fundamental frequency falls between the upper and lower fundamental frequency bounds.
In a preferred embodiment, the step of judging from the frame attribute whether infant crying is present further includes: counting the strong frames and the crying frames; performing a sound length check if the attributes of two adjacent frames are, in order, strong then weak; and performing a crying level check if the sound length check is passed, or otherwise judging the detection result to be non-crying and resetting the frame counts for each attribute to zero.
In a preferred embodiment, the crying level check works as follows: if the crying frame count exceeds a crying frame count threshold, the detection result is judged to be crying and the frame counts for each attribute are reset to zero; if the ratio of the crying frame count to the strong frame count is higher than a crying ratio threshold, the sound signal under test is judged to be a cry-like sound, and if the number of cry-like occurrences exceeds a cry-like count threshold, the detection result is judged to be crying and the frame counts for each attribute are reset to zero; and if the interval between two adjacent cry-like sounds is greater than a cry-like interval threshold, the cry-like count is reset to zero.
The following specific embodiments illustrate how the present invention may be implemented. Those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified and changed from different viewpoints and for different applications without departing from the spirit of the present invention.
The structures, proportions, and sizes shown in the drawings accompanying this specification serve only to complement the disclosed content for the understanding of those skilled in the art. They are not intended to limit the conditions under which the present invention can be implemented and therefore carry no substantive technical meaning. Any modification of structure, change of proportion, or adjustment of size that does not affect the effects the present invention can produce or the objectives it can achieve still falls within the scope covered by the technical content disclosed herein.
As shown in FIG. 1, an embodiment of the present invention discloses a method for detecting infant crying, comprising: step 100, a feature extraction step, in which a sound signal under test is input in time order to extract at least one feature set of the sound signal under test; and step 200, a feature judgment step, in which the feature sets are input in time order and used to judge whether the sound signal under test contains an infant cry, yielding a detection result. The feature extraction step further includes: step 110, framing the sound signal under test to produce at least one framed sound signal; step 120, calculating the fundamental frequency of each framed sound signal to obtain a frame fundamental frequency; step 130, performing a DC removal operation on the framed sound signal to produce a DC-removed framed sound signal; and step 140, calculating the signal intensity and signal zero-crossing rate of the DC-removed framed sound signal to obtain a frame intensity and a frame zero-crossing rate, respectively, where the frame intensity, frame zero-crossing rate, and frame fundamental frequency together constitute a feature set. The feature judgment step further includes: step 210, detecting a frame attribute from the feature set; and step 220, judging from the frame attribute whether the sound signal under test contains infant crying, to obtain the detection result.
A frame is an observation unit formed by grouping N sampling points. N is typically 256 or 512, covering roughly 20 to 30 ms. To avoid excessive change between two adjacent frames, adjacent frames share an overlap region of M sampling points, where M is typically about one half or one third of N. Accordingly, the frames produced during framing (step 110) partially overlap. Note that the values of N and M, the time span covered, and whether frames overlap are conventional choices used only to illustrate this embodiment; practical applications are not limited to them.
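The framing in step 110 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_frames` and the default values for `frame_len` (N) and `overlap` (M) are assumptions chosen to match the typical figures quoted above.

```python
def split_into_frames(samples, frame_len=256, overlap=128):
    """Split a sample sequence into overlapping frames.

    frame_len plays the role of N and overlap the role of M in the text.
    Both defaults are illustrative assumptions, not values fixed by the source.
    """
    if not 0 <= overlap < frame_len:
        raise ValueError("overlap must satisfy 0 <= overlap < frame_len")
    hop = frame_len - overlap  # samples between consecutive frame starts
    frames = []
    # Only full frames are emitted; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(list(samples[start:start + frame_len]))
    return frames
```

With `frame_len=4` and `overlap=2`, a ten-sample input yields four frames, each sharing two samples with its neighbor.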
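Note that the main purpose of step 130 is to extract from the sound signal under test a set of features that discriminate infant crying; as configured in the present invention, this feature set includes at least a signal intensity, a signal zero-crossing rate, and a signal fundamental frequency. The frame intensity, frame zero-crossing rate, and frame fundamental frequency can each be calculated separately, in any order.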
FIG. 2 is a flowchart of calculating the signal intensity of the sound signal under test in the infant cry detection method of the present invention; FIG. 3 is a flowchart of calculating the signal zero-crossing rate of the sound signal under test; and FIG. 4 is a flowchart of calculating the signal fundamental frequency of the sound signal under test.
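Specifically, as shown in FIG. 2, calculating the signal intensity of the sound signal under test further includes: step 1301, performing a time-domain energy calculation on the DC-removed framed sound signal to produce a frame energy; and step 1302, performing an energy smoothing operation on the frame energy to obtain the frame intensity. In a preferred embodiment, the time-domain energy is calculated as the mean of the absolute values of all sampling points in the frame, although the method is not limited to this. Likewise, in a preferred embodiment, the energy smoothing operation takes a weighted average of the current frame energy and the previous frame energy, although it is not limited to this either.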
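Steps 1301 and 1302 can be sketched as below. The mean-absolute-value energy and the weighted average with the previous frame follow the preferred embodiment; the mean-subtraction form of DC removal, the function names, and the weight of 0.5 are assumptions, since the source does not specify them.

```python
def remove_dc(frame):
    """DC removal, assumed here to mean subtracting the frame mean."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def frame_energy(frame):
    """Time-domain energy: mean of the absolute sample values (step 1301)."""
    return sum(abs(x) for x in frame) / len(frame)


def smooth(current, previous, weight=0.5):
    """Weighted average of the current and previous frame values (step 1302).

    The weight is a hypothetical tuning parameter.
    """
    return weight * current + (1.0 - weight) * previous
```

The frame intensity of a frame is then `smooth(frame_energy(remove_dc(frame)), previous_energy)`, where `previous_energy` is the energy computed for the preceding frame.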
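As shown in FIG. 3, calculating the signal zero-crossing rate of the sound signal under test further includes: step 1303, counting the zero crossings of the DC-removed framed sound signal to produce a frame zero-crossing count; and step 1304, performing a zero-crossing count smoothing operation on the frame zero-crossing count to obtain the frame zero-crossing rate. In a preferred embodiment, the smoothing operation takes a weighted average of the current frame's zero-crossing count and the previous frame's zero-crossing count, although it is not limited to this.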
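A minimal sketch of steps 1303 and 1304 follows. The source does not say how an exact zero sample is treated, so counting a zero as non-negative is an assumption, as are the function names and the smoothing weight.

```python
def count_zero_crossings(frame):
    """Count sign changes between consecutive samples (step 1303).

    A sample equal to zero is treated as non-negative (assumed convention).
    """
    crossings = 0
    for prev, cur in zip(frame, frame[1:]):
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings


def smooth_count(current, previous, weight=0.5):
    """Weighted average of current and previous counts (step 1304).

    The weight is a hypothetical tuning parameter.
    """
    return weight * current + (1.0 - weight) * previous
```

Because the count is taken on the DC-removed frame, a constant offset in the microphone signal does not suppress crossings that would otherwise be counted.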
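Similarly, as shown in FIG. 4, calculating the signal fundamental frequency of the sound signal under test further includes: step 1305, generating an energy spectrum from the framed sound signal; step 1306, generating a fundamental frequency estimate from the energy spectrum; and step 1307, performing a smoothing operation on the fundamental frequency estimate to obtain the frame fundamental frequency.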
FIG. 5 is a flowchart of generating the energy spectrum in the infant cry detection method of the present invention; FIG. 6 is a flowchart of generating the fundamental frequency estimate.
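As shown in FIG. 5, the step of generating an energy spectrum (step 1305) includes: step 1305a, windowing the framed sound signal to produce a windowed framed sound signal; step 1305b, applying a time-frequency transform to the windowed framed sound signal to produce a spectrum; and step 1305c, applying a spectral energy calculation to the spectrum to produce the energy spectrum. Windowing means multiplying each frame by a window function, for example a Hanning window (given in the original as "Hamming window"), to improve continuity at the left and right ends of the frame, although the window is not limited to this. In a preferred embodiment, the time-frequency transform is the fast Fourier transform, although it is not limited to this. Likewise, in a preferred embodiment, the spectral energy calculation uses the absolute value function, although it is not limited to this.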
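Steps 1305a to 1305c can be sketched as follows. To keep the example dependency-free, a direct discrete Fourier transform stands in for the fast Fourier transform named in the preferred embodiment; it computes the same values, only more slowly. The Hamming coefficients are an assumption (the source's window naming is ambiguous between Hann and Hamming), and the function names are hypothetical.

```python
import cmath
import math


def hamming_window(n):
    """Symmetric Hamming window of length n (window choice is an assumption)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def energy_spectrum(frame):
    """Window the frame, transform it, and take magnitudes (steps 1305a-c).

    A direct O(n^2) DFT is used here instead of an FFT for self-containment.
    Returns magnitudes for bins 0 .. n // 2.
    """
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hamming_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(abs(acc))  # absolute value as the energy measure
    return spectrum
```

For a pure sinusoid that completes four cycles in a 32-sample frame, the largest value in the returned list sits at bin 4.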
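As shown in FIG. 6, the step of generating a fundamental frequency estimate further includes: step 1306a, generating a local peak set from the energy spectrum; step 1306b, calculating the peak intervals; and step 1306c, calculating the fundamental frequency. Each is detailed below.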
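In step 1306a, a local peak set is generated from the energy spectrum. A frequency bin on the energy spectrum is first selected as a candidate peak, and a local energy comparison is performed with the candidate peak as the reference point. If the candidate peak wins the local energy comparison, it is labeled a local peak; otherwise it is labeled as other. This continues until every frequency bin on the energy spectrum has been labeled. The set of all local peaks is the local peak set. In the local energy comparison, the candidate peak wins if its energy is greater than the energy of every other frequency bin within a frequency range centered on the candidate peak.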
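The local peak labeling of step 1306a can be sketched as below. The width of the comparison range is not given in the source, so `half_width` (in bins) is a hypothetical parameter, as is the function name.

```python
def find_local_peaks(energy, half_width=2):
    """Return indices of bins that beat every other bin in their neighborhood.

    half_width is the assumed half-size, in bins, of the frequency range
    centered on each candidate peak.
    """
    peaks = []
    for i, value in enumerate(energy):
        lo = max(0, i - half_width)
        hi = min(len(energy), i + half_width + 1)
        neighbours = energy[lo:i] + energy[i + 1:hi]
        # Strict comparison: ties do not win, so flat plateaus yield no peak.
        if neighbours and all(value > other for other in neighbours):
            peaks.append(i)
    return peaks
```

For a harmonic signal, the returned indices tend to be spaced by roughly the fundamental frequency expressed in bins, which is what the interval calculation in the next step relies on.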
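In step 1306b, the peak intervals are calculated: if the number of local peaks in the local peak set is higher than a local peak count threshold, the intervals between adjacent peaks in the local peak set are calculated to produce a peak interval set; otherwise, the fundamental frequency estimation result is judged unstable.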
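A minimal sketch of step 1306b follows. Returning `None` to signal an unstable result is a convention of this example, and the threshold value is a hypothetical placeholder.

```python
def peak_intervals(peak_bins, peak_count_threshold=3):
    """Return the spacing between adjacent local peaks, or None if unstable.

    The estimate is treated as unstable unless the number of peaks is
    strictly higher than peak_count_threshold (an assumed value).
    """
    if len(peak_bins) <= peak_count_threshold:
        return None
    return [b - a for a, b in zip(peak_bins, peak_bins[1:])]
```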
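In step 1306c, the fundamental frequency is calculated from the peak interval set to produce a fundamental frequency estimation result. This further includes: excluding abnormal intervals, that is, removing abnormal extreme values from the peak interval set to obtain a normal peak interval set; checking peak interval variability, that is, calculating the difference between the extreme values of the normal peak interval set, and proceeding to the average peak interval calculation if the difference is less than a difference threshold, or otherwise judging the fundamental frequency estimation result unstable; calculating the average peak interval, that is, taking the mean of the normal peak interval set; searching for the fundamental frequency peak on the energy spectrum at the position given by the average peak interval; and fundamental frequency weighted averaging, that is, taking a weighted average of the fundamental frequency peak and whichever of its upper and lower neighboring frequency bins has the higher energy, which yields the fundamental frequency estimate. In a preferred embodiment, the smoothing of the fundamental frequency estimate works as follows: if the current estimate is stable, the frame fundamental frequency is the current estimate; otherwise, the frame fundamental frequency is that of the previous frame.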
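Step 1306c and the smoothing rule can be sketched as below. Several details are assumptions because the source leaves them open: outliers are removed by dropping the single smallest and largest interval, the search covers a small window around the rounded average interval, the weighted average uses bin energies as weights, and the result is a fractional bin index rather than a frequency in hertz. All names and default values are hypothetical.

```python
def estimate_fundamental_bin(energy, intervals, max_spread=2, search_radius=1):
    """Return a fractional bin index for the fundamental, or None if unstable."""
    if not intervals:
        return None
    ordered = sorted(intervals)
    # Assumed outlier rule: drop one extreme from each end when enough remain.
    normal = ordered[1:-1] if len(ordered) >= 4 else ordered
    if max(normal) - min(normal) >= max_spread:
        return None  # interval variability too high
    mean_interval = sum(normal) / len(normal)
    centre = round(mean_interval)
    lo = max(1, centre - search_radius)
    hi = min(len(energy) - 2, centre + search_radius)
    if lo > hi:
        return None
    # Fundamental peak: strongest bin near the average peak interval.
    peak = max(range(lo, hi + 1), key=lambda k: energy[k])
    # Pick whichever adjacent bin carries more energy.
    neighbour = peak - 1 if energy[peak - 1] > energy[peak + 1] else peak + 1
    total = energy[peak] + energy[neighbour]
    if total == 0:
        return float(peak)
    return (peak * energy[peak] + neighbour * energy[neighbour]) / total


def smooth_fundamental(estimate, previous):
    """Keep the previous frame's value when the current estimate is unstable."""
    return previous if estimate is None else estimate
```

Multiplying the returned bin index by the sample rate divided by the frame length would convert it to hertz; that conversion is not described in the source and is mentioned only as a natural follow-up.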
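FIG. 7 is a flowchart of detecting the frame attribute of a feature set in the infant cry detection method of the present invention. As shown in FIG. 7, the step of detecting a frame attribute from the feature set further includes: step 2101, performing strong-frame detection on the frame, where the frame is judged to have the strong-frame attribute if the frame intensity is greater than an intensity threshold, and the weak-frame attribute otherwise; and step 2102, if the frame has the strong-frame attribute, performing crying-frame detection on the frame, where the frame is judged to have the crying-frame attribute if the frame zero-crossing rate falls between the upper and lower zero-crossing rate bounds, or the frame fundamental frequency falls between the upper and lower fundamental frequency bounds.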
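Steps 2101 and 2102 reduce to two threshold tests, sketched below. The source gives no numeric thresholds, so every bound must be supplied by the caller; the function name, the inclusive bounds, and the tuple return value are choices made for this example.

```python
def classify_frame(intensity, zcr, f0, *, intensity_threshold, zcr_bounds, f0_bounds):
    """Return (is_strong, is_crying) for one feature set.

    zcr_bounds and f0_bounds are (lower, upper) pairs; all thresholds are
    hypothetical tuning values supplied by the caller.
    """
    is_strong = intensity > intensity_threshold
    if not is_strong:
        return (False, False)  # weak frame: crying-frame detection is skipped
    zcr_ok = zcr_bounds[0] <= zcr <= zcr_bounds[1]
    f0_ok = f0_bounds[0] <= f0 <= f0_bounds[1]
    # Either feature falling inside its bounds is enough.
    return (True, zcr_ok or f0_ok)
```

Because the two feature tests are combined with "or", a frame whose fundamental frequency estimate is off can still be counted as a crying frame on the strength of its zero-crossing rate alone, and vice versa.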
FIG. 8 is a flowchart of judging from the frame attribute whether the sound signal under test contains infant crying in the infant cry detection method of the present invention; FIG. 9 is a flowchart of the crying level check. As shown in FIG. 8, the step of judging from the frame attribute whether the sound signal under test contains infant crying further includes: step 2201, counting the strong frames and the crying frames; step 2202, performing a sound length check if the attributes of two adjacent frames are, in order, strong then weak; and step 2203, performing a crying level check if the sound length check is passed, or otherwise judging the detection result to be non-crying and resetting the frame counts for each attribute to zero. The sound length check of step 2202 is passed if the strong frame count is lower than a strong frame count threshold.
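The counting and sound length check of steps 2201 to 2203 can be sketched as follows. The sketch only records the outcome of each strong-to-weak transition; the crying level check that follows a passed length check, and the counter reset it may trigger, are not modeled here. The function name and the threshold value are assumptions.

```python
def length_check_events(frame_flags, strong_count_threshold=50):
    """Scan (is_strong, is_crying) pairs and report strong-to-weak transitions.

    Returns a list of (strong_count, crying_count, passed) tuples, where
    passed means the strong frame count is below strong_count_threshold
    (a hypothetical value). Counters reset only when the check fails.
    """
    events = []
    strong_count = 0
    crying_count = 0
    previous_strong = False
    for is_strong, is_crying in frame_flags:
        if is_strong:
            strong_count += 1
            if is_crying:
                crying_count += 1
        elif previous_strong:
            passed = strong_count < strong_count_threshold
            events.append((strong_count, crying_count, passed))
            if not passed:
                # Non-crying result: reset the per-attribute frame counts.
                strong_count = 0
                crying_count = 0
        previous_strong = is_strong
    return events
```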
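As shown in FIG. 9, the crying level check in step 2203 further includes: step 2203a, if the crying frame count exceeds a crying frame count threshold, the detection result is judged to be crying and the frame counts for each attribute are reset to zero; step 2203b, if the ratio of the crying frame count to the strong frame count is higher than a crying ratio threshold, the sound signal under test is judged to be a cry-like sound; step 2203c, if the number of cry-like occurrences exceeds a cry-like count threshold, the detection result is judged to be crying and the frame counts for each attribute are reset to zero; and step 2203d, if the interval between two adjacent cry-like sounds is greater than a cry-like interval threshold, the cry-like count is reset to zero.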
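The crying level check of steps 2203a to 2203d can be sketched as a small stateful class. The order in which the interval rule is applied relative to the cry-like tally, the clearing of the tally after a cry decision, the use of frame indices to measure intervals, and all threshold values are assumptions; the class and method names are hypothetical. Resetting the strong and crying frame counters after a "cry" result is left to the caller.

```python
class CryLevelDetector:
    """Sketch of the crying level check; all thresholds are hypothetical."""

    def __init__(self, crying_frames_threshold=20, ratio_threshold=0.6,
                 cry_like_threshold=2, max_gap=100):
        self.crying_frames_threshold = crying_frames_threshold
        self.ratio_threshold = ratio_threshold
        self.cry_like_threshold = cry_like_threshold
        self.max_gap = max_gap  # cry-like interval threshold, in frames
        self.cry_like_count = 0
        self.last_cry_like_at = None

    def check(self, strong_count, crying_count, frame_index):
        """Return 'cry', 'cry-like', or 'none' for one sound segment."""
        # Step 2203a: enough crying frames on their own.
        if crying_count > self.crying_frames_threshold:
            return "cry"
        # Step 2203b: a high share of crying frames marks a cry-like sound.
        if strong_count > 0 and crying_count / strong_count > self.ratio_threshold:
            # Step 2203d: forget earlier cry-like sounds that are too far back.
            if (self.last_cry_like_at is not None
                    and frame_index - self.last_cry_like_at > self.max_gap):
                self.cry_like_count = 0
            self.cry_like_count += 1
            self.last_cry_like_at = frame_index
            # Step 2203c: repeated cry-like sounds count as crying.
            if self.cry_like_count > self.cry_like_threshold:
                self.cry_like_count = 0  # assumption: also clear the tally
                return "cry"
            return "cry-like"
        return "none"
```

The two paths to a "cry" result cover different cases: one long, clearly cry-dominated segment, or several shorter cry-like segments in quick succession.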
Although implementations have been described with reference to a number of illustrative embodiments of the present application, it should be understood that those skilled in the art can devise many other changes and embodiments that fall within the spirit and scope of the principles of this disclosure. In particular, various changes and modifications can be made to the component parts and/or arrangements of the subject combination within the scope of this disclosure, the drawings, and the appended claims. Beyond changes and modifications to the component parts and/or arrangements, alternative uses will also be apparent to those skilled in the art.
100: Extract feature values
110: Frame the sound signal under test
120: Calculate the frame fundamental frequency of each framed sound signal
130: Perform the DC removal operation on the framed sound signal
140: Calculate the frame intensity and frame zero-crossing rate of the DC-removed framed sound signal; the frame intensity, frame zero-crossing rate, and frame fundamental frequency constitute a feature set
200: Feature judgment
210: Detect the frame attribute of the feature set
220: Judge from the frame attribute whether infant crying is present
1301: Perform the time-domain energy calculation
1302: Perform the energy smoothing operation
1303: Count the zero crossings
1304: Perform the zero-crossing count smoothing operation
1305: Generate the energy spectrum
1306: Generate the fundamental frequency estimate
1307: Perform the fundamental frequency estimate smoothing operation
1305a: Window the frame
1305b: Perform the time-frequency transform
1305c: Perform the spectral energy calculation
1306a: Generate the local peak set from the energy spectrum
1306b: Calculate the peak intervals
1306c: Calculate the fundamental frequency
2101: Perform strong-frame detection to judge whether the frame has the strong-frame attribute
2102: If the frame has the strong-frame attribute, perform crying-frame detection
2201: Count the strong frames and the crying frames
2202: If the attributes of two adjacent frames are, in order, strong then weak, perform the sound length check
2203: If the sound length check is passed, perform the crying level check
2203a: If the crying frame count exceeds a crying frame count threshold, judge the detection result to be crying and reset the frame counts for each attribute to zero
2203b: If the ratio of the crying frame count to the strong frame count is higher than a crying ratio threshold, judge the sound signal under test to be a cry-like sound
2203c: If the number of cry-like occurrences exceeds a cry-like count threshold, judge the detection result to be crying and reset the frame counts for each attribute to zero
2203d: If the interval between two adjacent cry-like sounds is greater than a cry-like interval threshold, reset the cry-like count to zero
FIG. 1 is a flowchart of the infant cry detection method of the present invention;
FIG. 2 is a flowchart of calculating the signal intensity of the sound signal under test in the infant cry detection method of the present invention;
FIG. 3 is a flowchart of calculating the signal zero-crossing rate of the sound signal under test in the infant cry detection method of the present invention;
FIG. 4 is a flowchart of calculating the signal fundamental frequency of the sound signal under test in the infant cry detection method of the present invention;
FIG. 5 is a flowchart of generating the energy spectrum in the infant cry detection method of the present invention;
FIG. 6 is a flowchart of generating the fundamental frequency estimate in the infant cry detection method of the present invention;
FIG. 7 is a flowchart of detecting the frame attribute of a feature set in the infant cry detection method of the present invention;
FIG. 8 is a flowchart of judging from the frame attribute whether the sound signal under test contains infant crying in the infant cry detection method of the present invention;
FIG. 9 is a flowchart of the crying level check in the infant cry detection method of the present invention.
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108116218A TWI687920B (en) | 2019-05-10 | 2019-05-10 | Method for detecting baby cry |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108116218A TWI687920B (en) | 2019-05-10 | 2019-05-10 | Method for detecting baby cry |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI687920B (en) | 2020-03-11 |
TW202042217A (en) | 2020-11-16 |
Family
ID=70766916
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108116218A TWI687920B (en) | 2019-05-10 | 2019-05-10 | Method for detecting baby cry |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI687920B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114765032A (en) * | 2021-01-14 | 2022-07-19 | 漳州立达信光电子科技有限公司 | Sound detection method, device and equipment |
CN113707180A (en) * | 2021-08-10 | 2021-11-26 | 漳州立达信光电子科技有限公司 | Crying sound detection method and device |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915728B (en) * | 2011-08-01 | 2014-08-27 | 佳能株式会社 | Sound segmentation device and method and speaker recognition system |
TWI474315B (en) * | 2012-05-25 | 2015-02-21 | Univ Nat Taiwan Normal | Infant cries analysis method and system |
CN102881284B (en) * | 2012-09-03 | 2014-07-09 | 江苏大学 | Unspecific human voice and emotion recognition method and system |
CN103778916B (en) * | 2013-12-31 | 2016-09-28 | 三星电子(中国)研发中心 | The method and system of monitoring ambient sound |
CN105139869B (en) * | 2015-07-27 | 2018-11-30 | 安徽清新互联信息科技有限公司 | A kind of baby crying detection method based on section Differential Characteristics |
TWI597720B (en) * | 2017-01-04 | 2017-09-01 | 晨星半導體股份有限公司 | Baby cry detection circuit and associated detection method |
- 2019-05-10: TW application TW108116218A, patent TWI687920B (en), status active
Also Published As
Publication number | Publication date |
---|---|
TWI687920B (en) | 2020-03-11 |