CN101393741A - Audio signal classification apparatus and method used in wideband audio encoder and decoder - Google Patents

Audio signal classification apparatus and method used in wideband audio encoder and decoder

Info

Publication number
CN101393741A
Authority
CN
China
Prior art keywords
signal
module
parameter
classification
sorting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2007101522352A
Other languages
Chinese (zh)
Inventor
钟毅睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CNA2007101522352A
Publication of CN101393741A
Pending

Abstract

The invention discloses an audio signal classification apparatus in a wideband audio codec. A background noise estimation control module receives the spectral distribution parameter from a classification parameter extraction module and sends an update rate to a preliminary signal classification module. The preliminary signal classification module performs a preliminary classification of the audio input signal according to the subband energy parameter and the update rate, and sends the preliminary classification result to the classification parameter extraction module and to a signal classification decision module. The classification parameter extraction module extracts classification parameters from the input signal, sends the resulting signal classification feature parameters to the signal classification decision module, and feeds the resulting spectral distribution parameter back to the background noise estimation control module. The signal classification decision module sets a final classification flag from the classification feature parameters according to the preliminary classification result; the final classification flag defines the decided type of the output signal. The invention further discloses an audio signal classification method in the wideband audio codec.

Description

Audio signal classification apparatus and classification method in a wideband audio codec
Technical field
The present invention relates to digital audio signal classification and, specifically, to an audio signal classification apparatus and classification method in a wideband audio codec.
Background art
In the field of speech signal processing, existing sound activity detection (SAD) was developed for speech signals only, and divides the input audio signal into just two classes: noise and non-noise.
Although AMR-WB+ includes detection of music signals, it does so only as a correction and supplement on top of the SAD decision. The ACELP/TCX mode selection algorithm of the AMR-WB+ encoding algorithm comes in two variants of differing complexity: an open-loop selection algorithm and a closed-loop selection algorithm. Closed-loop selection is the high-complexity variant and the default option. It is an exhaustive search based on the perceptually weighted signal-to-noise ratio, so its computational complexity is very high and its code size is also large.
The application scenarios of codec algorithms are shifting from mainly processing speech to processing multimedia audio (including music), and the codec algorithms themselves are extending from narrowband to wideband. As the scenarios change, the simple output classes of existing SAD algorithms are clearly no longer sufficient to describe the characteristics of the audio signal.
Summary of the invention
The technical problem solved by the invention is to provide an audio signal classification apparatus in a wideband audio codec that addresses the limited classification performance and the structural redundancy of the sound classifier in the AMR-WB+ encoder.
The technical solution is as follows:
An audio signal classification apparatus in a wideband audio codec, characterized in that it comprises a background noise estimation control module, a preliminary signal classification module, a classification parameter extraction module and a signal classification decision module, wherein:
The background noise estimation control module is configured to receive the spectral distribution parameter from the classification parameter extraction module, the spectral distribution parameter being used to control the update rate of the background noise, and to send the update rate to the preliminary signal classification module;
The preliminary signal classification module is configured to receive the audio input signal, to receive the subband energy parameter input by an encoder parameter extraction module and the update rate, to perform a preliminary classification of the audio input signal according to the subband energy parameter and the update rate, and to send the preliminary classification result to the classification parameter extraction module and the signal classification decision module;
The classification parameter extraction module is configured to receive the encoder parameters input by the encoder parameter extraction module and the preliminary classification result input by the preliminary signal classification module, to extract classification parameters from the input signal, to send the resulting signal classification feature parameters to the signal classification decision module, and to feed the resulting spectral distribution parameter back to the background noise estimation control module;
The signal classification decision module is configured to receive the signal classification feature parameters and the preliminary classification result, and to set a final classification flag from the classification feature parameters according to the preliminary classification result, the final classification flag defining the decided type of the output signal.
Preferably, the preliminary classification result of the preliminary signal classification module comprises noise and non-noise.
Preferably, the signal classification feature parameters of the classification parameter extraction module comprise a pitch parameter, an average gain, a zero-crossing rate, a subband energy time-domain fluctuation, a high-to-low subband energy ratio, a subband energy frequency-domain fluctuation, or a short-time average of the line spectral distance.
Preferably, the final classification flag of the signal classification decision module comprises a non-useful signal class, a speech class and a music class, and the intermediate classification flag comprises an uncertain class.
Another technical problem solved by the invention is to provide an audio signal classification method in a wideband audio codec that addresses the limited classification performance and the structural redundancy of the sound classifier in the AMR-WB+ encoder.
The technical solution is as follows:
An audio signal classification method in a wideband audio codec, comprising the steps of:
(1) the background noise estimation control module receives the spectral distribution parameter from the classification parameter extraction module, the spectral distribution parameter being used to control the update rate of the background noise, and sends the update rate to the preliminary signal classification module;
(2) the preliminary signal classification module receives the audio input signal, receives the subband energy parameter input by the encoder parameter extraction module and the update rate, performs a preliminary classification of the audio input signal according to the subband energy parameter and the update rate, and sends the preliminary classification result to the classification parameter extraction module and the signal classification decision module;
(3) the classification parameter extraction module receives the encoder parameters input by the encoder parameter extraction module and the preliminary classification result input by the preliminary signal classification module, extracts classification parameters from the input signal, sends the resulting signal classification feature parameters to the signal classification decision module, and feeds the resulting spectral distribution parameter back to the background noise estimation control module;
(4) the signal classification decision module receives the signal classification feature parameters and the preliminary classification result, and sets a final classification flag from the classification feature parameters according to the preliminary classification result to obtain the decided type of the output signal, the final classification flag defining the decided type of the output signal.
Further, in step (2), the preliminary signal classification module receives the decision result fed back by the signal classification decision module and adapts the hangover length according to that decision result.
Compared with the mode selection algorithm of AMR-WB+, the main advantages of the technical solution of the present invention are as follows:
1. The accuracy of sound classification is improved;
2. The efficiency of the encoding algorithm is improved while accuracy is maintained;
3. The architecture is thoroughly optimized, removing the unnecessary code redundancy and complexity redundancy that the AMR-WB+ mode selection algorithm imposes on the encoder.
Description of the drawings
Fig. 1 is a structural diagram of the audio signal classification apparatus in the wideband audio codec.
Embodiments
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawing.
With reference to Fig. 1, the audio signal classification apparatus 10 in the wideband audio codec is described in detail. The audio signal classification apparatus 10 comprises a background noise estimation control module 11, a preliminary signal classification module (PSC) 12, a classification parameter extraction module 13 and a signal classification decision module 14.
The encoder parameter extraction module 20 provides the audio signal classification apparatus 10 with the necessary subband energy parameters, which reduces resource consumption and computational complexity. The encoder parameters provided by the encoder parameter extraction module 20 include: subband energies, the LSF coefficient vector, the open-loop pitch gain, the open-loop pitch delay and the pitch flag. After the subband energy parameters are calculated, whether the LSF computation is performed depends on the result of the preliminary signal classification. If the current frame is a non-useful signal, the encoder's own mechanism decides: if the encoder needs LSF coefficients to encode non-useful signals, the LSF computation is performed; if not, the encoder parameter extraction module finishes. If the current frame is a useful signal, the LSF computation is performed. Most coding modes need the LSF parameters of a useful signal anyway, so this adds no redundant complexity to the encoder.
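The LSF decision described above reduces to a small predicate. The following is a minimal sketch; the function and parameter names are illustrative assumptions, not identifiers from the patent or from AMR-WB+.

```python
def should_compute_lsf(frame_is_useful: bool,
                       encoder_needs_lsf_for_noise: bool) -> bool:
    """Decide whether the LSF computation runs for the current frame.

    frame_is_useful: preliminary classification result (True = useful signal).
    encoder_needs_lsf_for_noise: hypothetical encoder capability flag saying
        whether the encoder uses LSF coefficients when coding non-useful frames.
    """
    if frame_is_useful:
        # Useful signal: the LSF computation is always performed.
        return True
    # Non-useful signal: follow the encoder's own mechanism.
    return encoder_needs_lsf_for_noise
```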
The encoder mode and rate selection module 30 uses the ACELP/TCX open-loop mode selection module of AMR-WB+. This module receives the signal decision type output by the audio signal classification apparatus 10 and selects the corresponding coding mode for each type. The signal decision type is one of three classes, non-useful signal, speech and music, which correspond respectively to VAD_flag=0, ACELP and TCX in the AMR-WB+ open-loop mode selection.
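The correspondence between decision type and coding mode can be written as a lookup table. This is a sketch only: the enum and function names are assumptions made for illustration, while the three pairings are the ones stated in the text.

```python
from enum import Enum


class SignalClass(Enum):
    NOISE = "non-useful signal"
    SPEECH = "speech"
    MUSIC = "music"


# Pairings stated in the description for AMR-WB+ open-loop mode selection.
MODE_FOR_CLASS = {
    SignalClass.NOISE: "VAD_flag=0",
    SignalClass.SPEECH: "ACELP",
    SignalClass.MUSIC: "TCX",
}


def select_coding_mode(signal_class: SignalClass) -> str:
    """Return the coding mode label for a final signal decision type."""
    return MODE_FOR_CLASS[signal_class]
```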
The background noise estimation control module 11 uses the spectral distribution parameter from the classification parameter extraction module 13 to control the update rate of the background noise, and sends the update rate to the preliminary signal classification module (PSC) 12.
In real application environments the energy level of the background noise may rise suddenly. The background noise estimate is then prone to getting stuck, never updating because the signal keeps being judged a useful signal. To address this, the background noise estimation control module 11 uses some of the spectral distribution parameters computed in the classification parameter extraction module 13 to control the update rate of the background noise.
The preliminary signal classification module (PSC) 12 receives the audio input signal and performs a preliminary classification of it according to the subband energy parameter from the encoder parameter extraction module 20 and the update rate from the background noise estimation control module 11, for example dividing the audio input signal into noise and non-noise. The preliminary classification result is fed back to the classification parameter extraction module 13; it is also sent to the signal classification decision module 14 as input to the subsequent classification processing.
The module builds on the VAD algorithm in AMR-WB+ and improves it to address that VAD's unsatisfactory discrimination between noise and certain kinds of music:
First, the estimation of the background noise is controlled by the update rate (acc) provided by the background noise estimation control module 11; the noise update itself can follow the scheme in AMR-WB+.
Second, the VAD of AMR-WB+ generally uses a hangover to protect useful signal from being misjudged as noise, and the hangover length should be a compromise between protecting the signal and improving transmission efficiency. For a traditional speech encoder, the hangover length can be a constant obtained through training. A multi-rate encoder, however, targets audio signals that include music, which often have long low-energy tails. A conventional VAD has difficulty detecting these tails, so a longer hangover is needed to protect them. In the present invention, the hangover length in the preliminary signal classification module (PSC) 12 is designed to adapt to the decision result fed back by the signal classification decision module 14.
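One way to picture a decision-adaptive hangover is sketched below. The text does not give hangover lengths or the exact adaptation rule, so the frame counts, the class name and the string labels here are hypothetical; the only idea taken from the description is that the hangover is longer when the fed-back decision is music.

```python
# Hypothetical hangover lengths in frames; the patent gives no numbers.
HANGOVER_FRAMES = {"SPEECH": 5, "MUSIC": 15}
DEFAULT_HANGOVER = 5


class AdaptiveHangoverVad:
    """Wraps a raw per-frame activity decision with an adaptive hangover."""

    def __init__(self) -> None:
        self.remaining = 0  # hangover frames still available

    def update(self, raw_active: bool, fed_back_class: str) -> bool:
        """Return the protected activity decision for the current frame.

        raw_active: activity decision before hangover.
        fed_back_class: last decision fed back by the decision module.
        """
        if raw_active:
            # Reload the hangover with a length that depends on the feedback.
            self.remaining = HANGOVER_FRAMES.get(fed_back_class, DEFAULT_HANGOVER)
            return True
        if self.remaining > 0:
            # Low-energy tail: keep the frame marked as useful signal.
            self.remaining -= 1
            return True
        return False
```

With these assumed values, a music context keeps 15 inactive frames marked as useful after the last active frame, while a speech context keeps 5.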
The classification parameter extraction module 13 receives the encoder parameters input by the encoder parameter extraction module 20 and the preliminary classification result from the preliminary signal classification module (PSC) 12, extracts classification parameters from these inputs, sends the resulting signal classification feature parameters to the signal classification decision module 14, and sends the resulting spectral distribution parameter to the background noise estimation control module 11.
The signal classification feature parameters to be extracted include:
(1) Pitch parameter (pitch)
The pitch parameter (pitch) compares the differences between consecutive open-loop pitch delays. If the change in open-loop pitch delay is less than a preset threshold, a delay counter is incremented. If the sum of the delay counters of two consecutive frames is large enough, pitch=1 is set; otherwise pitch=0.
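A minimal sketch of the pitch flag follows. The number of open-loop delays per frame and both thresholds are not given in the text, so the default values and the function names are assumptions for illustration.

```python
def stable_lag_count(lags, prev_lag, lag_threshold=4):
    """Count open-loop pitch delays in one frame that stay close to the previous delay.

    lags: open-loop pitch delays of the current frame, in order.
    prev_lag: last open-loop pitch delay of the previous frame.
    lag_threshold: hypothetical preset threshold on the delay change.
    """
    count = 0
    for lag in lags:
        if abs(lag - prev_lag) < lag_threshold:
            count += 1
        prev_lag = lag
    return count


def pitch_flag(count_prev_frame, count_cur_frame, count_threshold=4):
    """Return pitch=1 when the counters of two consecutive frames sum high enough."""
    return 1 if count_prev_frame + count_cur_frame >= count_threshold else 0
```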
(2) Average gain
If the open-loop pitch gain exceeds a threshold, a high-gain flag is set; the average gain is the mean of this flag value over several consecutive frames.
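Read as a thresholded flag averaged over a sliding window, the average gain could be sketched as follows. The gain threshold and the window length are hypothetical, and the interpretation of the flag as 1.0/0.0 is an assumption.

```python
from collections import deque


class AverageGain:
    """Running mean of a high-gain flag over the most recent frames."""

    def __init__(self, gain_threshold=0.6, window=5):
        # Both defaults are illustrative; the text gives no values.
        self.gain_threshold = gain_threshold
        self.flags = deque(maxlen=window)

    def update(self, open_loop_gain):
        """Add one frame's open-loop pitch gain and return the average gain."""
        self.flags.append(1.0 if open_loop_gain > self.gain_threshold else 0.0)
        return sum(self.flags) / len(self.flags)
```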
(3) Zero-crossing rate (zcr)
The usual calculation is:
$$zcr = \frac{1}{T} \sum_{i=1}^{T-1} \mathbb{I}\{x(i)\,x(i-1) < 0\}$$
where $\mathbb{I}\{A\}$ is 1 when the condition A is true and 0 when it is false.
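The zero-crossing formula translates directly into code. In this sketch x is assumed to be the list of samples in one frame and T its length; the function name is illustrative.

```python
def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose product is negative.

    Implements zcr = (1/T) * sum_{i=1}^{T-1} I{x(i) * x(i-1) < 0}.
    """
    T = len(x)
    if T < 2:
        return 0.0
    crossings = sum(1 for i in range(1, T) if x[i] * x[i - 1] < 0)
    return crossings / T
```

For the four-sample frame [1, -1, 1, -1] there are three sign changes, so the function returns 0.75.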
(4) Subband energy time-domain fluctuation (t_flux)
$$t\_flux = \frac{\sum_{i=1}^{12} \left| level_m(i) - level_{m-1}(i) \right|}{short\_mean\_level\_energy}$$
where $level_m(i)$ denotes the signal energy of the i-th subband in frame m, and short_mean_level_energy denotes the short-time average energy.
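As a sketch of the t_flux formula, the function below takes the 12 subband energies of the current and previous frames as plain lists. How short_mean_level_energy is computed is not specified in the text, so it is passed in as an argument.

```python
def t_flux(level_cur, level_prev, short_mean_level_energy):
    """Subband energy time-domain fluctuation between two consecutive frames.

    level_cur, level_prev: subband energies of frame m and frame m-1.
    short_mean_level_energy: short-time average energy (must be positive).
    """
    if len(level_cur) != len(level_prev):
        raise ValueError("both frames must have the same number of subbands")
    if short_mean_level_energy <= 0:
        raise ValueError("short_mean_level_energy must be positive")
    total = sum(abs(c - p) for c, p in zip(level_cur, level_prev))
    return total / short_mean_level_energy
```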
(5) High-to-low subband energy ratio (ra)
$$ra = \frac{sublevel\_high\_energy}{sublevel\_low\_energy}$$
where sublevel_high_energy denotes the high-subband energy and sublevel_low_energy denotes the low-subband energy.
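The text does not say which subbands count as high and which as low. The sketch below therefore assumes a simple split index, which is purely hypothetical, and guards against division by zero with a small floor.

```python
def high_low_ratio(levels, split=6, eps=1e-12):
    """Ratio of high-subband energy to low-subband energy.

    levels: subband energies of one frame, ordered from low to high frequency.
    split: hypothetical index separating low subbands from high subbands.
    eps: floor on the denominator to avoid division by zero.
    """
    sublevel_low_energy = sum(levels[:split])
    sublevel_high_energy = sum(levels[split:])
    return sublevel_high_energy / max(sublevel_low_energy, eps)
```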
(6) Subband energy frequency-domain fluctuation (f_flux)
$$f\_flux = \frac{\sum_{i=2}^{12} \left| level_m(i) - level_m(i-1) \right|}{short\_mean\_level\_energy}$$
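The f_flux formula sums differences between adjacent subbands within one frame. A sketch under the same assumptions as for t_flux (energies passed as a list, short-time average energy supplied by the caller):

```python
def f_flux(level_cur, short_mean_level_energy):
    """Subband energy frequency-domain fluctuation within one frame.

    level_cur: subband energies of frame m, ordered by subband index.
    short_mean_level_energy: short-time average energy (must be positive).
    """
    if short_mean_level_energy <= 0:
        raise ValueError("short_mean_level_energy must be positive")
    # The 1-based sum over i = 2..12 becomes 0-based indices 1..11.
    total = sum(abs(level_cur[i] - level_cur[i - 1])
                for i in range(1, len(level_cur)))
    return total / short_mean_level_energy
```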
(7) Short-time average of the line spectral distance (lsf_meanSD), which is the mean of the line spectral distance (lsf_SD) over five adjacent frames, where
$$lsf\_SD = \sum_{i=1}^{16} \left| lsf_m(i) - lsf_{m-1}(i) \right|$$
and lsf denotes the line spectral frequency coefficient vector, m the frame index and i the index of an element within the vector.
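The line spectral distance and its five-frame average might be computed as in the sketch below. The class and function names are illustrative, and the behavior during the first four frames (averaging over the frames seen so far) is an assumption.

```python
from collections import deque


def lsf_sd(lsf_cur, lsf_prev):
    """Line spectral distance between frame m and frame m-1."""
    if len(lsf_cur) != len(lsf_prev):
        raise ValueError("LSF vectors must have the same length")
    return sum(abs(c - p) for c, p in zip(lsf_cur, lsf_prev))


class LsfMeanSD:
    """Short-time average of lsf_SD over the most recent frames."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, sd):
        """Add one frame's lsf_SD and return the current lsf_meanSD."""
        self.history.append(sd)
        return sum(self.history) / len(self.history)
```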
The signal classification decision module 14 receives the signal classification feature parameters from the classification parameter extraction module 13 and the preliminary classification result from the preliminary signal classification module 12, and sets a final classification flag from the classification feature parameters according to the preliminary classification result; the final classification flag defines the decided type of the output signal. The final classification flags of the signal classification decision are: non-useful signal class (NOISE), speech class (SPEECH) and music class (MUSIC). The intermediate classification flags additionally include an uncertain class (UNCERTAIN).
The classification decision mainly comprises the following steps:
1. Feature parameter hangover.
A hangover scheme is used to keep the signal decision stable and avoid frequent changes in the decision result. For example, a hangover is applied to the flags of the feature parameters, or the hangover length is controlled according to the error rate (ER) of each internal node of the decision tree that corresponds to the trained parameters.
2. Initial classification.
If the current signal is classified as a useful signal, an initial speech/music classification is performed. First, a speech decision is made: if the signal meets the speech characteristic criteria, the speech signal flag is set. Second, a music decision is made: if the signal meets the music characteristic criteria, it is considered a music signal and the music signal flag is set.
3. Correction of the classification.
The classification is corrected in the following steps:
A) First, clear the speech and music hangover flags.
If the current class is still uncertain after the initial decision, the signal class is corrected according to the context and some specific parameters.
B) If the frames before this one were continuously of the speech class, with strong continuity, make a speech decision using the speech feature parameters; if the speech condition is met, set the speech hangover flag speech_hangover_flag.
C) If the frames before this one were continuously of the music class, with strong continuity, make a music decision using the music feature parameters; if the music condition is met, set the music hangover flag music_hangover_flag.
D) If the speech hangover flag is 1, set the current signal class to speech.
E) If the music hangover flag is 1, set the current signal class to music.
F) If the speech hangover flag and the music hangover flag are both set, set the signal class to uncertain. If the preceding music run has lasted more than 2 frames and the value of lsf_meanSD is small, set the signal class to music.
G) If the signal class is still uncertain after this preliminary hangover processing, correct it according to the preceding context, that is, revert the current uncertain signal class to the preceding signal class.
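Steps A) to G) can be sketched as one function. Several points are assumptions here: the thresholds run_threshold and lsf_threshold are hypothetical, "strong continuity" is modeled as a run-length test, the speech and music feature decisions are passed in as booleans, and the lsf_meanSD rule in F) is treated as a tie-break that applies only when both hangover flags are set.

```python
def correct_class(initial, prev_class, speech_run, music_run,
                  speech_ok, music_ok, lsf_mean_sd,
                  run_threshold=3, lsf_threshold=0.1):
    """Correct an UNCERTAIN initial class using context (steps A to G).

    initial: class after the initial decision.
    prev_class: class of the preceding frame.
    speech_run, music_run: consecutive preceding frames of each class.
    speech_ok, music_ok: whether the speech / music feature conditions hold.
    """
    if initial != "UNCERTAIN":
        return initial
    # A) clear both hangover flags.
    speech_hangover_flag = False
    music_hangover_flag = False
    # B) strong speech context and speech condition met.
    if speech_run >= run_threshold and speech_ok:
        speech_hangover_flag = True
    # C) strong music context and music condition met.
    if music_run >= run_threshold and music_ok:
        music_hangover_flag = True
    current = "UNCERTAIN"
    if speech_hangover_flag:  # D)
        current = "SPEECH"
    if music_hangover_flag:   # E)
        current = "MUSIC"
    if speech_hangover_flag and music_hangover_flag:  # F)
        current = "UNCERTAIN"
        if music_run > 2 and lsf_mean_sd < lsf_threshold:
            current = "MUSIC"
    if current == "UNCERTAIN":  # G) fall back to the preceding class.
        current = prev_class
    return current
```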
4. Final correction of the classification.
After the initial classification correction, the class continues to be corrected according to the current context. If the current context is music and its persistence is strong, exceeding a set time (for example 3 seconds), a forced correction can be made according to the value of lsf_meanSD. Likewise, if the current context is speech and its persistence is strong, exceeding a set time (for example 3 seconds), a forced correction can be made according to the value of lsf_meanSD. If the instantaneous energy of the signal is too small, the class decision for the current frame is set to that of the previous frame.
5. Parameter update.
The parameter update covers the three class counters as well as each threshold in the signal classification decision module 14.
If the current class is music, the music counter music_countinue_counter is incremented by 1; otherwise it is cleared. The other classes are handled in the same way.
The thresholds are updated according to the signal-to-noise ratio output by the preliminary signal classification module.
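The counter part of the parameter update can be sketched as follows. The dictionary keys stand in for the three class counters; only music_countinue_counter is named in the text, so the other names are assumptions. The threshold update is omitted because the text gives no rule beyond its dependence on the signal-to-noise ratio.

```python
def update_counters(counters, current_class):
    """Increment the counter of the current class and clear the others.

    counters: mapping from class label to consecutive-frame count,
        e.g. {"NOISE": 0, "SPEECH": 0, "MUSIC": 0}.
    current_class: final class of the current frame.
    """
    return {name: (count + 1 if name == current_class else 0)
            for name, count in counters.items()}
```

For example, starting from {"NOISE": 2, "SPEECH": 0, "MUSIC": 3}, a music frame yields {"NOISE": 0, "SPEECH": 0, "MUSIC": 4}.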

Claims (6)

1. An audio signal classification apparatus in a wideband audio codec, characterized in that it comprises a background noise estimation control module, a preliminary signal classification module, a classification parameter extraction module and a signal classification decision module, wherein:
The background noise estimation control module is configured to receive the spectral distribution parameter from the classification parameter extraction module, the spectral distribution parameter being used to control the update rate of the background noise, and to send the update rate to the preliminary signal classification module;
The preliminary signal classification module is configured to receive the audio input signal, to receive the subband energy parameter input by an encoder parameter extraction module and the update rate, to perform a preliminary classification of the audio input signal according to the subband energy parameter and the update rate, and to send the preliminary classification result to the classification parameter extraction module and the signal classification decision module;
The classification parameter extraction module is configured to receive the encoder parameters input by the encoder parameter extraction module and the preliminary classification result input by the preliminary signal classification module, to extract classification parameters from the input signal, to send the resulting signal classification feature parameters to the signal classification decision module, and to feed the resulting spectral distribution parameter back to the background noise estimation control module;
The signal classification decision module is configured to receive the signal classification feature parameters and the preliminary classification result, and to set a final classification flag from the classification feature parameters according to the preliminary classification result, the final classification flag defining the decided type of the output signal.
2. The audio signal classification apparatus in a wideband audio codec according to claim 1, characterized in that the preliminary classification result of the preliminary signal classification module comprises noise and non-noise.
3. The audio signal classification apparatus in a wideband audio codec according to claim 1, characterized in that the signal classification feature parameters of the classification parameter extraction module comprise a pitch parameter, an average gain, a zero-crossing rate, a subband energy time-domain fluctuation, a high-to-low subband energy ratio, a subband energy frequency-domain fluctuation, or a short-time average of the line spectral distance.
4. The audio signal classification apparatus in a wideband audio codec according to claim 1, characterized in that the final classification flag of the signal classification decision module comprises a non-useful signal class, a speech class and a music class, and the intermediate classification flag comprises an uncertain class.
5. An audio signal classification method in a wideband audio codec, comprising the steps of:
(1) the background noise estimation control module receives the spectral distribution parameter from the classification parameter extraction module, the spectral distribution parameter being used to control the update rate of the background noise, and sends the update rate to the preliminary signal classification module;
(2) the preliminary signal classification module receives the audio input signal, receives the subband energy parameter input by the encoder parameter extraction module and the update rate, performs a preliminary classification of the audio input signal according to the subband energy parameter and the update rate, and sends the preliminary classification result to the classification parameter extraction module and the signal classification decision module;
(3) the classification parameter extraction module receives the encoder parameters input by the encoder parameter extraction module and the preliminary classification result input by the preliminary signal classification module, extracts classification parameters from the input signal, sends the resulting signal classification feature parameters to the signal classification decision module, and feeds the resulting spectral distribution parameter back to the background noise estimation control module;
(4) the signal classification decision module receives the signal classification feature parameters and the preliminary classification result, and sets a final classification flag from the classification feature parameters according to the preliminary classification result to obtain the decided type of the output signal, the final classification flag defining the decided type of the output signal.
6. The audio signal classification method in a wideband audio codec according to claim 5, characterized in that, in step (2), the preliminary signal classification module receives the decision result fed back by the signal classification decision module and adapts the hangover length according to that decision result.
CNA2007101522352A 2007-09-19 2007-09-19 Audio signal classification apparatus and method used in wideband audio encoder and decoder Pending CN101393741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2007101522352A CN101393741A (en) 2007-09-19 2007-09-19 Audio signal classification apparatus and method used in wideband audio encoder and decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2007101522352A CN101393741A (en) 2007-09-19 2007-09-19 Audio signal classification apparatus and method used in wideband audio encoder and decoder

Publications (1)

Publication Number Publication Date
CN101393741A true CN101393741A (en) 2009-03-25

Family

ID=40494004

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2007101522352A Pending CN101393741A (en) 2007-09-19 2007-09-19 Audio signal classification apparatus and method used in wideband audio encoder and decoder

Country Status (1)

Country Link
CN (1) CN101393741A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142924A (en) * 2010-02-03 2011-08-03 中兴通讯股份有限公司 Versatile audio code (VAC) transmission method and device
CN102142924B (en) * 2010-02-03 2014-04-09 中兴通讯股份有限公司 Versatile audio code (VAC) transmission method and device
CN102194457A (en) * 2010-03-02 2011-09-21 中兴通讯股份有限公司 Audio encoding and decoding method, system and noise level estimation method
CN102194457B (en) * 2010-03-02 2013-02-27 中兴通讯股份有限公司 Audio encoding and decoding method, system and noise level estimation method
CN102446506A (en) * 2010-10-11 2012-05-09 华为技术有限公司 Classification identifying method and equipment of audio signals
CN102446506B (en) * 2010-10-11 2013-06-05 华为技术有限公司 Classification identifying method and equipment of audio signals
CN104040626A (en) * 2012-01-13 2014-09-10 高通股份有限公司 Multiple coding mode signal classification
US9761238B2 (en) 2012-03-21 2017-09-12 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding high frequency for bandwidth extension
CN104321815B (en) * 2012-03-21 2018-10-16 三星电子株式会社 High-frequency coding/high frequency decoding method and apparatus for bandwidth expansion
US10339948B2 (en) 2012-03-21 2019-07-02 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding high frequency for bandwidth extension
CN104321815A (en) * 2012-03-21 2015-01-28 三星电子株式会社 Method and apparatus for high-frequency encoding/decoding for bandwidth extension
CN108074579A (en) * 2012-11-13 2018-05-25 三星电子株式会社 For determining the method for coding mode and audio coding method
US11004458B2 (en) 2012-11-13 2021-05-11 Samsung Electronics Co., Ltd. Coding mode determination method and apparatus, audio encoding method and apparatus, and audio decoding method and apparatus
US10090003B2 (en) 2013-08-06 2018-10-02 Huawei Technologies Co., Ltd. Method and apparatus for classifying an audio signal based on frequency spectrum fluctuation
CN104347067A (en) * 2013-08-06 2015-02-11 华为技术有限公司 Audio signal classification method and device
CN104347067B (en) * 2013-08-06 2017-04-12 华为技术有限公司 Audio signal classification method and device
US10529361B2 (en) 2013-08-06 2020-01-07 Huawei Technologies Co., Ltd. Audio signal classification method and apparatus
US11289113B2 (en) 2013-08-06 2022-03-29 Huawei Technolgies Co. Ltd. Linear prediction residual energy tilt-based audio signal classification method and apparatus
US11756576B2 (en) 2013-08-06 2023-09-12 Huawei Technologies Co., Ltd. Classification of audio signal as speech or music based on energy fluctuation of frequency spectrum
CN110992965A (en) * 2014-02-24 2020-04-10 三星电子株式会社 Signal classification method and apparatus and audio encoding method and apparatus using the same
US11545160B2 (en) 2019-06-10 2023-01-03 Axis Ab Method, a computer program, an encoder and a monitoring device
WO2023216119A1 (en) * 2022-05-10 2023-11-16 北京小米移动软件有限公司 Audio signal encoding method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090325