CN106486133B - Howling scene recognition method and device - Google Patents

Howling scene recognition method and device

Info

Publication number
CN106486133B
Authority
CN
China
Prior art keywords
window
howling
long
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510532929.3A
Other languages
Chinese (zh)
Other versions
CN106486133A (en)
Inventor
徐绍君
王亮
鲜柯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Dingqiao Communication Technology Co Ltd
Original Assignee
Chengdu Dingqiao Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Dingqiao Communication Technology Co Ltd
Priority to CN201510532929.3A
Publication of CN106486133A
Application granted
Publication of CN106486133B
Legal status: Active
Anticipated expiration


Abstract

This application discloses a howling scene recognition method, comprising: for each speech frame in a detection window, judging according to a howling frame condition whether a howling feature is present and, if so, determining that the frame is a howling frame; and judging whether the current detection window satisfies a howling scene condition and, if so, deciding that the current scene is a howling scene, otherwise deciding that it is a non-howling scene. This application also discloses a howling scene recognition device. The disclosed technical solution can improve the accuracy of howling detection, making it suitable as input to subsequent howling suppression processing.

Description

Howling scene recognition method and device
Technical field
This application relates to the field of communication technology, and in particular to a howling scene recognition method and device.
Background
The voice services of industry terminals are mainly trunking (cluster) mode and direct mode operation (DMO) services, which mostly use loudspeaker playback. Because industry terminals largely operate outdoors or in workshops with high ambient noise, they need high volume, so the uplink and downlink volume gains of the terminal are usually set high. Once sound is amplified by the loop gain, its energy keeps accumulating and forms howling. Howling seriously affects normal use of the voice service and is very unpleasant for the user, so recognizing howling scenes is of great significance.
However, solutions for howling scene recognition on industry terminals are still immature and at an exploratory stage. Most recognition schemes suffer from low efficiency and inaccurate recognition, which seriously degrades the overall performance of howling suppression.
Summary of the invention
This application provides a howling scene recognition method and device to improve the accuracy of howling detection.
The howling scene recognition method provided by this application comprises:
for each speech frame in a detection window, judging according to a howling frame condition whether a howling feature is present and, if so, determining that the frame is a howling frame;
judging whether the current detection window satisfies a howling scene condition and, if so, deciding that the current scene is a howling scene; otherwise, deciding that it is a non-howling scene.
Preferably, judging according to the howling frame condition whether a howling feature is present comprises:
A. Judge whether the frequency of the peak point in the speech frame is greater than a set first threshold. If so, continue with B; otherwise, end the judgment.
B. Denote the position of the peak point as Po_peak, delimit a Peak_window window of the set width centered on Po_peak, and delimit a before_window window and an after_window window on the two sides of the Peak_window window, where the widths of the before_window and after_window windows may be the same as or different from that of the Peak_window window.
C. Judge whether the power at Po_peak and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
If so, continue with D; otherwise, end the judgment. Here Pv is a preset value.
D. Judge whether the mean power of the Peak_window window and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
If so, determine that a howling feature is present in the speech frame.
Preferably, the howling scene condition is: the number of howling frames in the detection window is greater than or equal to a set count threshold.
Preferably, the howling scene condition is divided into a howling scene condition under a long detection window mechanism and one under a short detection window mechanism, where the detection window of the long mechanism is wider than that of the short mechanism.
Preferably, the count threshold is proportional to the number of speech frames contained in the detection window, and is less than or equal to that number.
This application also provides a howling scene recognition device, comprising a howling frame decision module and a howling scene decision module, wherein:
the howling frame decision module is configured to judge, for each speech frame in a detection window and according to a howling frame condition, whether a howling feature is present and, if so, to determine that the frame is a howling frame;
the howling scene decision module is configured to judge whether the current detection window satisfies a howling scene condition and, if so, to decide that the current scene is a howling scene; otherwise, to decide that it is a non-howling scene.
As the above technical solution shows, the howling scene recognition method and device provided by this application first judge, according to the howling frame condition, whether a howling feature is present in each speech frame in the detection window and, if so, determine that the frame is a howling frame. They then judge whether the current detection window satisfies the howling scene condition and decide accordingly whether the current scene is a howling scene or a non-howling scene. The technical solution of this application can effectively identify the speech features of howling and improve the accuracy of howling detection, making it suitable as input to subsequent howling suppression processing.
Brief description of the drawings
Fig. 1 is a flow diagram of a preferred howling scene recognition method of the present invention;
Fig. 2 is a schematic time-domain waveform in the presence of howling;
Fig. 3 is a schematic frequency-domain waveform in the presence of howling;
Fig. 4 is a schematic diagram of how the present invention judges a howling point;
Fig. 5 is a schematic diagram of the structure of a preferred device of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments.
Fig. 1 is a flow diagram of a preferred howling scene recognition method of the present invention. The method comprises:
First, for each speech frame in the detection window, judge according to the howling frame condition whether a howling feature is present and, if so, determine that the frame is a howling frame.
Then, judge whether the current detection window satisfies the howling scene condition. If so, decide that the current scene is a howling scene; otherwise, decide that it is a non-howling scene.
In general, the energy of howling is relatively concentrated in the time domain and shows saturation, and it is concentrated mainly in a narrow band of the frequency domain, as shown by the elliptical region in Fig. 2. Fig. 3 shows two howling points. In Fig. 2 the horizontal axis is time in seconds and the vertical axis is power in mW; in Fig. 3 the horizontal axis is frequency in Hz and the vertical axis is power in dB. Because the energy of a howling point is concentrated in one or a few frequency bins, this application uses this feature to identify howling frames and proposes that a howling frame must satisfy the following conditions:
(1) The frequency of the howling point is greater than the set threshold min_frequency.
(2) The power of the howling point is the maximum within the Peak_window window centered on it; denote the position of the howling point as Po_peak.
(3) The power of the howling point and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
where Pd is a preset value with a recommended value of 10.
(4) The mean power of the Peak_window window and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
The relationship among the Peak_window, before_window, and after_window windows is shown in Fig. 4. In the example of Fig. 4 the three windows have the same width, and the before_window and after_window windows lie on the two sides of the Peak_window window. In practice, the widths of the three windows may be the same or different; the recommended width range is 5 to 12 sampling points.
If the current speech frame satisfies the above conditions, it is judged to contain a howling point and can be decided to be a howling frame.
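The spectral quantities used in conditions (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: it computes a per-bin power spectrum in dB with a naive DFT, locates the peak bin Po_peak, and delimits the three windows around it. The sample rate, frame length, window width of 7 bins, the use of one shared width for all three windows, and the helper names `power_spectrum_db`, `find_po_peak`, and `delimit_windows` are all hypothetical choices made for the example.

```python
import cmath
import math


def power_spectrum_db(frame):
    """Power in dB for bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(acc) ** 2 / n
        spectrum.append(10.0 * math.log10(power + 1e-12))  # small floor avoids log(0)
    return spectrum


def find_po_peak(spectrum_db):
    """Index of the bin with the highest power (Po_peak in the text)."""
    return max(range(len(spectrum_db)), key=spectrum_db.__getitem__)


def delimit_windows(po_peak, width, n_bins):
    """Return (before, peak, after) as half-open (start, stop) bin ranges.

    Assumption: all three windows share one width; ranges are clipped to [0, n_bins).
    """
    peak_start = po_peak - width // 2
    peak_stop = peak_start + width

    def clip(start, stop):
        return (max(0, min(n_bins, start)), max(0, min(n_bins, stop)))

    return (
        clip(peak_start - width, peak_start),
        clip(peak_start, peak_stop),
        clip(peak_stop, peak_stop + width),
    )


# Hypothetical frame: a pure 2 kHz tone, 160 samples at 8 kHz (20 ms).
SAMPLE_RATE = 8000
FRAME_LEN = 160
frame = [math.sin(2 * math.pi * 2000 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]

spectrum = power_spectrum_db(frame)
po_peak = find_po_peak(spectrum)
peak_hz = po_peak * SAMPLE_RATE / FRAME_LEN  # compared against min_frequency in condition (1)
before_window, peak_window, after_window = delimit_windows(po_peak, 7, len(spectrum))
```

For this synthetic tone the peak falls on bin 40, which corresponds to 2000 Hz at a bin spacing of 50 Hz. A real implementation would normally use an FFT routine rather than the naive DFT shown here.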
Based on the above howling frame condition, the detailed process for judging whether a howling feature is present in a given speech frame is as follows:
A. Judge whether the frequency of the peak point in the speech frame is greater than the set first threshold (the min_frequency described above). If so, continue with B; otherwise, end the judgment.
B. Denote the position of the peak point as Po_peak, delimit a Peak_window window of the set width centered on Po_peak, and delimit a before_window window and an after_window window on the two sides of the Peak_window window. The widths of the before_window and after_window windows may be the same as or different from that of the Peak_window window; the recommended width range is 5 to 12 sampling points.
C. Judge whether the power at Po_peak and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
If so, continue with D; otherwise, end the judgment. Here Pv is a preset value with a recommended value of 5.
D. Judge whether the mean power of the Peak_window window and the mean power of the before_window and after_window windows satisfy:
[inequality missing from the extracted text]
If so, determine that a howling feature is present in the speech frame, that is, the speech frame is a howling frame.
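Steps A to D can be strung together as a single early-exit function. The sketch below is hedged heavily: the inequalities for steps C and D are not available in this text, so the code assumes a simple dB-margin form (peak power minus side-window mean for C, Peak_window mean minus side-window mean for D). The function name `is_howling_frame`, the parameters `peak_margin_db` and `window_margin_db`, and the values used in the example are hypothetical and do not reproduce the patent's Pv or Pd definitions.

```python
def mean_power(spectrum_db, ranges):
    """Mean of the dB values over half-open (start, stop) bin ranges, or None if empty."""
    values = [spectrum_db[i] for start, stop in ranges for i in range(start, stop)]
    return sum(values) / len(values) if values else None


def is_howling_frame(spectrum_db, bin_hz, min_frequency_hz, width,
                     peak_margin_db, window_margin_db):
    """Early-exit check following steps A-D; C and D use ASSUMED margin tests."""
    n = len(spectrum_db)

    # Step A: the peak point must lie above the frequency threshold.
    po_peak = max(range(n), key=spectrum_db.__getitem__)
    if po_peak * bin_hz <= min_frequency_hz:
        return False

    # Step B: Peak_window centered on Po_peak, with before/after windows on each side.
    # Assumption: all three windows share one width and are clipped to the spectrum.
    peak_start = max(0, po_peak - width // 2)
    peak_stop = min(n, po_peak - width // 2 + width)
    before = (max(0, peak_start - width), peak_start)
    after = (peak_stop, min(n, peak_stop + width))
    side_mean = mean_power(spectrum_db, [before, after])
    if side_mean is None:  # no side bins available to compare against
        return False

    # Step C (assumed form): the peak stands out from the side windows.
    if spectrum_db[po_peak] - side_mean < peak_margin_db:
        return False

    # Step D (assumed form): the Peak_window as a whole stands out from the side windows.
    peak_mean = mean_power(spectrum_db, [(peak_start, peak_stop)])
    return peak_mean - side_mean >= window_margin_db


# Hypothetical spectra with 50 Hz bin spacing: a narrow peak at bin 40, and a flat floor.
tone = [-60.0] * 81
for i in (38, 39, 41, 42):
    tone[i] = -10.0
tone[40] = 0.0
flat = [-60.0] * 81

howling = is_howling_frame(tone, 50.0, 500.0, 7, 10.0, 5.0)
not_howling = is_howling_frame(flat, 50.0, 500.0, 7, 10.0, 5.0)
```

The early returns mirror the "otherwise, end the judgment" branches in steps A and C, so the window averages for step D are only computed for frames that already passed the cheaper tests.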
In a howling scene, howling keeps being generated, so the howling feature appears in multiple consecutive speech frames; this is the time-domain characteristic described above. Based on an analysis of this characteristic, this application proposes a howling scene decision method based on a sliding window that uses both a long detection window mechanism and a short detection window mechanism. The short detection window mechanism analyzes the proportion of speech frames containing a howling point within a short period to decide whether to enter the howling scene, and is mainly used to detect sudden strong howling. The long detection window mechanism analyzes the proportion of speech frames containing a howling point within a long period to decide whether to enter the howling scene, and is mainly used to detect slowly varying howling.
The long and short detection window mechanisms use essentially the same algorithm and processing; the main difference lies in the threshold and the detection window size. The short detection window mechanism is taken as the example here. The short detection window is a sliding window. Assume its size is HORING_DURATION_SHORT, so that it contains the most recent HORING_DURATION_SHORT speech frames. This application first judges whether each of these HORING_DURATION_SHORT speech frames is a howling frame, and then judges whether the number of howling frames among them satisfies the following condition:
number of howling frames ≥ PEAK_NUM_THD_SHORT
If the condition is satisfied, the howling scene is entered; otherwise it is not. The count threshold PEAK_NUM_THD_SHORT is proportional to HORING_DURATION_SHORT and must satisfy PEAK_NUM_THD_SHORT ≤ HORING_DURATION_SHORT.
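The sliding-window decision can be illustrated with a bounded queue of per-frame flags. The sketch below is one possible reading, with several assumptions: the class name `SlidingHowlingWindow`, the `ratio` parameter used to derive a threshold proportional to the window size, the window sizes, and the rule that either window may trigger the scene (a logical OR) are hypothetical. The text states only that both mechanisms are used, that the threshold is proportional to the window size, and that it must not exceed that size.

```python
import math
from collections import deque


class SlidingHowlingWindow:
    """Keeps the most recent `duration` frame flags and applies a count threshold."""

    def __init__(self, duration, ratio):
        if duration < 1 or not 0.0 < ratio <= 1.0:
            raise ValueError("duration must be >= 1 and ratio must be in (0, 1]")
        self.duration = duration
        # Threshold proportional to the window size and never larger than it.
        self.count_threshold = max(1, math.ceil(ratio * duration))
        self._flags = deque(maxlen=duration)  # oldest flag drops out automatically

    def push(self, frame_is_howling):
        """Add the newest frame flag; return True if the scene condition holds."""
        self._flags.append(bool(frame_is_howling))
        return sum(self._flags) >= self.count_threshold


# Hypothetical sizes: a short window for sudden howling, a long one for slow build-up.
short_window = SlidingHowlingWindow(duration=4, ratio=0.5)
long_window = SlidingHowlingWindow(duration=16, ratio=0.25)

frame_flags = [False, True, True, False, False, False]
scene_flags = []
for flag in frame_flags:
    short_hit = short_window.push(flag)  # both windows must see every frame
    long_hit = long_window.push(flag)
    scene_flags.append(short_hit or long_hit)  # assumed combination rule
```

With these values the short window needs 2 howling frames out of the last 4 and the long window needs 4 out of the last 16. This sketch evaluates the condition even before a window has filled; whether a real detector should wait for a full window is left open here.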
Corresponding to the above method, this application also provides a howling scene recognition device, whose structure is shown in Fig. 5. It comprises a howling frame decision module and a howling scene decision module, wherein:
the howling frame decision module is configured to judge, for each speech frame in a detection window and according to a howling frame condition, whether a howling feature is present and, if so, to determine that the frame is a howling frame;
the howling scene decision module is configured to judge whether the current detection window satisfies a howling scene condition and, if so, to decide that the current scene is a howling scene; otherwise, to decide that it is a non-howling scene.
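The two-module structure maps naturally onto two small classes composed by a third. The sketch below only illustrates that division of responsibility. The class names, the `margin_db` stand-in rule in the frame module (peak minus spectrum mean), and the sizes in the example are hypothetical; the frame rule is a placeholder and is not the howling frame condition of steps A to D.

```python
from collections import deque


class HowlingFrameJudge:
    """Frame decision module. Uses a stand-in rule, NOT the patent's conditions."""

    def __init__(self, margin_db):
        self.margin_db = margin_db

    def is_howling_frame(self, spectrum_db):
        mean = sum(spectrum_db) / len(spectrum_db)
        return max(spectrum_db) - mean >= self.margin_db


class HowlingSceneJudge:
    """Scene decision module: counts howling frames in a sliding detection window."""

    def __init__(self, window_size, count_threshold):
        if not 1 <= count_threshold <= window_size:
            raise ValueError("count_threshold must be between 1 and window_size")
        self._flags = deque(maxlen=window_size)
        self._count_threshold = count_threshold

    def update(self, frame_is_howling):
        self._flags.append(bool(frame_is_howling))
        return sum(self._flags) >= self._count_threshold


class HowlingSceneRecognizer:
    """Device sketch: feeds each frame decision into the scene decision."""

    def __init__(self, frame_judge, scene_judge):
        self.frame_judge = frame_judge
        self.scene_judge = scene_judge

    def process(self, spectrum_db):
        return self.scene_judge.update(self.frame_judge.is_howling_frame(spectrum_db))


# Hypothetical four-bin spectra, just large enough to exercise the pipeline.
quiet = [0.0, 0.0, 0.0, 0.0]
loud = [0.0, 0.0, 50.0, 0.0]

recognizer = HowlingSceneRecognizer(HowlingFrameJudge(margin_db=20.0),
                                    HowlingSceneJudge(window_size=3, count_threshold=2))
decisions = [recognizer.process(s) for s in (quiet, loud, loud, quiet, quiet)]
```

Keeping the frame rule behind its own class means it can be swapped for a fuller implementation of the frame condition without touching the scene logic.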
The foregoing describes only preferred embodiments of this application and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims (5)

  1. A howling scene recognition method, characterized by comprising:
    for each speech frame in a detection window, judging according to a howling frame condition whether a howling feature is present and, if so, determining that the speech frame is a howling frame;
    judging whether the current detection window satisfies a howling scene condition and, if so, deciding that the current scene is a howling scene; otherwise, deciding that it is a non-howling scene;
    wherein judging according to the howling frame condition whether a howling feature is present comprises:
    A. judging whether the frequency of the peak point in the speech frame is greater than a set first threshold and, if so, continuing with B; otherwise, ending the judgment;
    B. denoting the position of the peak point as Po_peak, delimiting a Peak_window window of the set width centered on Po_peak, and delimiting a before_window window and an after_window window on the two sides of the Peak_window window, wherein the widths of the before_window and after_window windows are the same as or different from that of the Peak_window window;
    C. judging whether the power at Po_peak and the mean power of the before_window and after_window windows satisfy:
    [inequality missing from the extracted text]
    if so, continuing with D; otherwise, ending the judgment; wherein Pv is a preset value;
    D. judging whether the mean power of the Peak_window window and the mean power of the before_window and after_window windows satisfy:
    [inequality missing from the extracted text]
    if so, determining that a howling feature is present in the speech frame.
  2. The method according to claim 1, characterized in that:
    the howling scene condition is: the number of howling frames in the detection window is greater than or equal to a set count threshold.
  3. The method according to claim 2, characterized in that:
    the howling scene condition is divided into a howling scene condition under a long detection window mechanism and one under a short detection window mechanism, wherein the detection window of the long detection window mechanism is wider than that of the short detection window mechanism.
  4. The method according to claim 2, characterized in that:
    the count threshold is proportional to the number of speech frames contained in the detection window, and the count threshold is less than or equal to the number of speech frames contained in the detection window.
  5. A howling scene recognition device, characterized by comprising a howling frame decision module and a howling scene decision module, wherein:
    the howling frame decision module is configured to judge, for each speech frame in a detection window and according to a howling frame condition, whether a howling feature is present and, if so, to determine that the frame is a howling frame;
    the howling scene decision module is configured to judge whether the current detection window satisfies a howling scene condition and, if so, to decide that the current scene is a howling scene; otherwise, to decide that it is a non-howling scene;
    wherein judging according to the howling frame condition whether a howling feature is present comprises:
    A. judging whether the frequency of the peak point in the speech frame is greater than a set first threshold and, if so, continuing with B; otherwise, ending the judgment;
    B. denoting the position of the peak point as Po_peak, delimiting a Peak_window window of the set width centered on Po_peak, and delimiting a before_window window and an after_window window on the two sides of the Peak_window window, wherein the widths of the before_window and after_window windows are the same as or different from that of the Peak_window window;
    C. judging whether the power at Po_peak and the mean power of the before_window and after_window windows satisfy:
    [inequality missing from the extracted text]
    if so, continuing with D; otherwise, ending the judgment; wherein Pv is a preset value;
    D. judging whether the mean power of the Peak_window window and the mean power of the before_window and after_window windows satisfy:
    [inequality missing from the extracted text]
    if so, determining that a howling feature is present in the speech frame.
CN201510532929.3A 2015-08-27 2015-08-27 Howling scene recognition method and device Active CN106486133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510532929.3A CN106486133B (en) 2015-08-27 2015-08-27 One kind is uttered long and high-pitched sounds scene recognition method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510532929.3A CN106486133B (en) 2015-08-27 2015-08-27 One kind is uttered long and high-pitched sounds scene recognition method and equipment

Publications (2)

Publication Number Publication Date
CN106486133A (en) 2017-03-08
CN106486133B (en) 2019-11-15

Family

ID=58234495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510532929.3A Active CN106486133B (en) 2015-08-27 2015-08-27 Howling scene recognition method and device

Country Status (1)

Country Link
CN (1) CN106486133B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109102819A (en) * 2017-06-20 2018-12-28 中移(杭州)信息技术有限公司 Howling detection method and device
CN111724811B (en) * 2019-03-21 2023-01-24 成都鼎桥通信技术有限公司 Squeaking identification method and device based on subaudio frequency
CN110838301B (en) * 2019-11-20 2022-04-12 北京雷石天地电子技术有限公司 Method, device terminal and non-transitory computer readable storage medium for suppressing howling

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1398054A (en) * 2001-07-16 2003-02-19 松下电器产业株式会社 Whistler detection and suppresser thereof, its method and computer program products
CN102111707A (en) * 2009-12-29 2011-06-29 Gn瑞声达公司 A method for the detection of whistling in an audio system and a hearing aid executing the method
CN103871418A (en) * 2014-03-06 2014-06-18 北京飞利信电子技术有限公司 Method and device for detecting howling frequency point of acoustic amplification system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5774138B2 (en) * 2012-01-30 2015-09-02 三菱電機株式会社 Reverberation suppressor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Frequency-domain howling detection and suppression in digital hearing aids (基于频域的数字助听器中的啸叫检测与抑制); 何艳辉 et al.; 《电声技术》; 2012-12-31; Vol. 36, No. 8; pp. 338-341 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant