CN101046955A - PCM code flow voice detection method - Google Patents


Info

Publication number
CN101046955A
Authority
CN
China
Prior art keywords
amplitude
voice signal
detection
frame length
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200610075906
Other languages
Chinese (zh)
Inventor
黄育延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN 200610075906 priority Critical patent/CN101046955A/en
Priority to PCT/CN2007/000419 priority patent/WO2007121648A1/en
Publication of CN101046955A publication Critical patent/CN101046955A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Use Of Switch Circuits For Exchanges And Methods Of Control Of Multiplex Exchanges (AREA)

Abstract

The present invention provides a PCM code stream voice detection method in which the voice signal is detected at a trunk interface and a detection threshold is set in advance. The method comprises the following steps: A, obtaining PCM code stream data from the trunk interface at a sampling frequency fs; B, obtaining, from the PCM code stream data, the amplitude of the voice signal within a detection frame length; and C, judging whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold. If it does, a voice signal is judged to be present within the detection frame length; otherwise, no voice signal is judged to be present within it.

Description

A PCM code stream voice detection method
Technical field
The present invention relates to the field of telecommunications, and in particular to a PCM (Pulse Code Modulation) code stream voice detection method for telecommunications switches.
Background
When a telecommunications switch locates voice quality problems on its own, it needs to detect whether voice is present in the PCM code streams on its incoming and outgoing trunk interfaces, so as to identify voice quality problems such as one-way audio, no audio, and crosstalk.
Current voice detection methods mainly include GSM VAD, ITU-T G.723.1 Annex A, ITU-T G.729 Annex B, noise range calibration analysis, and soft computing. Of these, GSM VAD and ITU-T G.723.1 Annex A are essentially energy detectors that adapt the decision threshold to the noise, and both use linear prediction analysis; ITU-T G.729 Annex B detects voice from differences in time-domain energy, zero-crossing rate, and frequency-domain energy.
The prior-art voice detection methods are described in more detail below:
Prior art one, the time-domain energy detection method:
$$E = \sum_{n=1}^{N} x^{2}(n)$$
E is the energy of the signal within one frame, where N is the frame length and x(n) is the digitized voice amplitude. Speech can be divided into two broad classes, unvoiced and voiced. Against background noise that resembles white noise and is not too strong, the short-time energy of voiced speech is much higher than that of the noise, so it can be used to distinguish speech from background noise.
The shortcomings of prior art one are:
1. The computational load is large: many multiplications and additions are required;
2. The amount of sampled data is large: the signal must be sampled continuously, which for a PCM code stream means 8K bytes of data per second (sampling frequency 8 kHz).
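The frame energy of prior art one can be illustrated with a minimal Python sketch. The function name `frame_energy` and the sample values are hypothetical and chosen only for illustration; the input is assumed to be linear (already decoded) samples.

```python
def frame_energy(samples):
    """Return E = sum of x(n)^2 over one frame of linear samples."""
    return sum(x * x for x in samples)


# Hypothetical four-sample frame; a real 8 kHz frame holds far more samples,
# which is where the multiplication cost noted above comes from.
frame = [3, -4, 0, 12]
energy = frame_energy(frame)
```

Each sample costs one multiplication and one addition, so the cost grows linearly with the number of samples per frame and with the number of channels.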
Prior art two, the short-time zero-crossing rate detection method:
$$ZCR = \frac{1}{2} \sum_{n=1}^{N-1} \left| \operatorname{sign}[x(n)] - \operatorname{sign}[x(n-1)] \right|$$
The short-time zero-crossing rate is the number of times the voice signal crosses the zero level within one frame, where
$$\operatorname{sign}[x(n)] = \begin{cases} 1, & x(n) \ge 0 \\ -1, & x(n) < 0 \end{cases}$$
is the sign function. Unvoiced speech has low short-time energy, sometimes close to that of the background noise, so it is hard to distinguish by short-time energy; its zero-crossing rate, however, is very high and can serve as one criterion for judging whether voice is present.
The shortcoming of prior art two is:
The amount of sampled data is large: the signal must be sampled continuously, which for a PCM code stream means 8K bytes of data per second.
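The zero-crossing rate of prior art two can be sketched in Python as follows. The names `sign` and `zero_crossing_rate` are illustrative, and samples are assumed to be indexed from 0 to N-1 so that the sum runs over n = 1 to N-1.

```python
def sign(x):
    """Sign function from the text: 1 for x >= 0, -1 for x < 0."""
    return 1 if x >= 0 else -1


def zero_crossing_rate(samples):
    """Count zero-level crossings in one frame of linear samples."""
    total = sum(
        abs(sign(samples[n]) - sign(samples[n - 1]))
        for n in range(1, len(samples))
    )
    # Each crossing contributes |1 - (-1)| = 2, hence the factor 1/2.
    return total // 2


crossings = zero_crossing_rate([1, -1, 2, 3, -4])
```

For the hypothetical frame above the signal changes sign three times, so `crossings` is 3.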
Prior art three, the short-time autocorrelation/cross-correlation detection method:
$$R(m) = \sum_{n=0}^{N-1-m} x(n)\, x(n+m)$$
The autocorrelation of noise is very small except at m = 0, whereas a voice signal is short-time stationary and highly autocorrelated, so noise and voice can be distinguished by computing the autocorrelation function of the signal. In the autocorrelation waveform of a voice signal there are, besides the main peak, relatively high secondary peaks, which can also be used to distinguish voice from noise;
$$R(m) = \sum_{n=0}^{N-1-m} x(n)\, y(n+m)$$
Here x(n) is the incoming voice signal and y(n) is the outgoing voice signal.
Cross-correlation can also be used to check the incoming and outgoing voice: only when the voice connection is normal are the incoming and outgoing voice data identical apart from a certain delay, and therefore cross-correlated; the cross-correlation of crosstalk or noise is very small and can be clearly distinguished.
The cross-correlation method is well suited to detecting voice quality on the incoming and outgoing trunk interfaces of a switch, and can detect voice quality problems such as one-way audio, no audio, and crosstalk.
The shortcomings of prior art three are:
1. The correlation computation is complex and computationally heavy, and requires a dedicated digital signal processor;
2. The amount of sampled data is large: the signal must be sampled continuously, which for a PCM code stream means 8K bytes of data per second.
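The correlation sums of prior art three can be sketched in Python. The function name `cross_correlation` and the sample values are hypothetical; passing the same sequence for both arguments gives the autocorrelation. The sketch assumes both sequences hold the same N samples.

```python
def cross_correlation(x, y, m):
    """R(m) = sum over n = 0 .. N-1-m of x(n) * y(n + m), with N = len(x)."""
    return sum(x[n] * y[n + m] for n in range(len(x) - m))


# Hypothetical incoming signal and an outgoing copy delayed by one sample.
incoming = [1, -1, 2, -2]
outgoing = [0, 1, -1, 2]

# The lag with the largest R(m) estimates the delay between the two paths.
lags = {m: cross_correlation(incoming, outgoing, m) for m in range(3)}
best_lag = max(lags, key=lags.get)
```

Every lag requires its own pass over the frame, which illustrates why the text calls the correlation method computationally heavy.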
The voice detection methods above are mainly used in communication systems and have good real-time performance.
However, when PCM code stream voice detection is to be implemented on the incoming and outgoing trunk interfaces of a telecommunications switch, the large trunk-port capacity of the switch and the limited processing power of its boards become constraints. A typical telecommunications switch has hundreds or thousands of E1/T1 trunks and many trunk interface boards, and each trunk interface board generally supports multiple E1/T1 links. Taking E1 as an example, each PCM30 link has 30 usable time slots. If one trunk interface board carries 16 E1 links and voice detection is performed on both the incoming and outgoing trunk PCM code streams, 16 × 2 × 30 = 960 circuit channels must be detected and computed simultaneously. If the data of every channel is obtained at a sampling frequency of 8 kHz, both the amount of sampled data and the computational load are large. The prior-art voice detection methods are therefore not suitable.
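The scale of the problem can be checked with a few lines of arithmetic. The sketch below uses the figures from the text (16 E1 links, two directions, 30 time slots, one byte per sample at 8 kHz); the 100 Hz figure is the interval-sampling rate used later in the description, and the variable names are illustrative.

```python
E1_LINKS = 16          # E1 links on one trunk interface board
DIRECTIONS = 2         # incoming and outgoing
TIMESLOTS = 30         # usable time slots per PCM30 link

channels = E1_LINKS * DIRECTIONS * TIMESLOTS

# One byte per sample: full sampling versus 100 Hz interval sampling.
full_bytes_per_second = channels * 8000
interval_bytes_per_second = channels * 100
reduction_factor = full_bytes_per_second // interval_bytes_per_second
```

Under these assumptions a single board would have to read 7,680,000 bytes per second with full sampling, against 96,000 bytes per second at 100 Hz, a reduction by a factor of 80.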
Summary of the invention
In view of the large amount of sampled data and the heavy computation required by the prior-art voice detection methods above, the object of the present invention is to provide a PCM code stream voice detection method that can detect the presence or absence of voice in PCM code streams on the high-capacity incoming and outgoing trunk interfaces of a telecommunications switch.
To achieve this object, the present invention provides a PCM code stream voice detection method in which a voice signal is detected at a trunk interface and a detection threshold is set in advance, comprising:
Step (A): obtaining PCM code stream data from the trunk interface at a sampling frequency fs;
Step (B): obtaining, from the PCM code stream data, the amplitude of the voice signal within a detection frame length;
Step (C): judging whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold; if it does, a voice signal is judged to be present within the detection frame length; if it does not, no voice signal is judged to be present within the detection frame length.
In step (B), the amplitude of the voice signal within the detection frame length may be obtained with a simplified voice signal amplitude calculation, which further comprises:
Step (B1): obtaining the 3-bit segment code of each sampled PCM code;
Step (B2): looking up the linear amplitude value corresponding to the 3-bit segment code in a table;
Step (B3): accumulating the amplitude values within the detection frame length to obtain the amplitude of the voice signal within the detection frame length.
Alternatively, in step (B), the 8-bit non-uniformly coded data is converted into 13-bit linear voice data according to the correspondence between PCM codes and 13-bit linear codes, restoring the amplitude information of the actual voice signal and yielding the amplitude of the voice signal within the detection frame length.
The sampling frequency in step (A) satisfies fs >> 1/ts, where ts is the short-time stationary time of the voice signal; the detection frame length satisfies T >> 1/fs.
The detection threshold is either a fixed amplitude detection threshold or an adaptive amplitude detection threshold.
The adaptive amplitude detection threshold is obtained by computing the average amplitude of the voice signal over the preceding frames and adding a detection margin to that average.
The beneficial effects of the present invention are:
1. When PCM code stream voice detection is performed, the amount of data sampled from the PCM code stream is greatly reduced;
2. The data processing flow of PCM code stream voice detection can be simplified, since linear conversion of the PCM code stream is no longer required, which increases the computation speed of voice detection;
3. PCM code stream voice detection can be performed simultaneously on high-capacity trunks in a telecommunications switch.
Description of drawings
Fig. 1 is a flowchart of embodiment one of the PCM code stream voice detection method of the present invention;
Fig. 2 is the first comparison chart of voice signal amplitude waveforms over time according to the present invention;
Fig. 3 is the second comparison chart of voice signal amplitude waveforms over time according to the present invention;
Fig. 4 is the third comparison chart of voice signal amplitude waveforms over time according to the present invention;
Fig. 5 is a flowchart of embodiment two of the PCM code stream voice detection method of the present invention.
Embodiments
The PCM code stream voice detection method of the present invention is described further below with reference to the accompanying drawings:
As shown in Fig. 1, embodiment one of the method comprises the following steps:
Step (101): obtain PCM code stream data from the trunk interface at a sampling frequency fs;
Step (102): obtain the 3-bit segment code of each sampled PCM code;
Step (103): look up the linear amplitude value corresponding to the 3-bit segment code in a table;
Step (104): accumulate the amplitude values within the detection frame length to obtain the amplitude of the voice signal within the detection frame length;
Step (105): threshold detection: judge whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold.
These steps are explained in detail below:
Step (101): obtain PCM code stream data from the trunk interface at a sampling frequency fs.
Because a voice signal is short-time stationary, that is, its signal characteristics remain essentially unchanged (stationary) within a short interval (a few tens of ms), full sampling at 8 kHz is unnecessary if only the amplitude and energy information of the voice signal over a period of time is needed. It is sufficient to sample at fixed intervals with a sampling frequency fs >> 1/ts, where ts is the short-time stationary time of the voice signal, about 50 ms.
To achieve consistent incoming and outgoing voice detection, the choice of detection frame length is very important. First, the detection frame length must satisfy T >> 1/fs; only with enough sampling points is the result computed for each frame stable. Second, the detection frame length must not differ too much from the short-time stationary period of speech, otherwise the variation characteristics of the speech cannot be captured. The detection frame length is therefore generally taken as 250 ms.
Tests of the present invention confirmed that with a sampling frequency fs >> 1/ts and a detection frame length T >> 1/fs, the variation pattern of the speech can be obtained approximately. Fig. 2 compares, for the same segment of voice signal, full sampling of the PCM code stream with fs = 8 kHz and T = 250 ms against the interval sampling of the present invention with fs = 100 Hz and T = 250 ms; the two waveforms of voice signal amplitude over time are essentially the same. In Fig. 2, series 1 is the waveform of voice signal amplitude over time with full sampling at fs = 8 kHz, and series 2 is the waveform with interval sampling at fs = 100 Hz. As the figure shows, the two waveforms agree well and are also essentially consistent with the variation trend of the actual voice signal.
In addition, the trunk interface boards of a typical telecommunications switch have no synchronization mechanism, so the time at which each board starts computing a frame differs, and the maximum offset may approach one frame length. Fig. 3 shows the case where the frame start times differ (in this embodiment, three tests were run with frame start times 50 ms apart): with the interval sampling of the present invention at fs = 100 Hz and T = 250 ms, the waveforms of voice signal amplitude over time after sampling (series 1, 2, and 3) remain essentially consistent.
This shows that interval sampling of the PCM code stream with a sampling frequency fs >> 1/ts and a detection frame length T >> 1/fs meets the needs of PCM code stream voice detection.
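The interval sampling and framing described above can be simulated in Python. This is only a sketch: it decimates an already captured 8 kHz byte sequence, whereas a switch board would presumably read one byte per channel every 10 ms directly. The function names and the default arguments (8 kHz, 100 Hz, 250 ms, taken from the text) are illustrative.

```python
def interval_sample(stream, full_rate_hz=8000, fs_hz=100):
    """Keep one sample out of every full_rate_hz / fs_hz samples."""
    step = full_rate_hz // fs_hz
    return stream[::step]


def split_frames(samples, fs_hz=100, frame_ms=250):
    """Split interval-sampled data into complete detection frames."""
    per_frame = fs_hz * frame_ms // 1000   # 25 samples at 100 Hz, 250 ms
    last_start = len(samples) - per_frame
    return [samples[i:i + per_frame] for i in range(0, last_start + 1, per_frame)]


one_second = list(range(8000))             # stand-in for 1 s of PCM bytes
sampled = interval_sample(one_second)
frames = split_frames(sampled)
```

With these defaults one second of input yields 100 samples and four frames of 25 samples each, which is the frame size used in the accumulation formula of step (104).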
Step (102): obtain the 3-bit segment code of each sampled PCM code;
Step (103): look up the linear amplitude value corresponding to the 3-bit segment code in a table;
Step (104): accumulate the amplitude values within the detection frame length to obtain the amplitude of the voice signal within the detection frame length.
To reduce the computation required for voice detection and avoid the multiplications that energy detection needs, the present invention uses amplitude detection. PCM code stream voice uses non-uniform coding. Taking A-law PCM coding as an example, a folded binary code is used: the most significant bit represents the polarity, the input signal range is divided into 8 non-uniform segments represented by a 3-bit code, and each segment is further divided into 16 levels, represented by a 4-bit code, to meet the quantization signal-to-noise ratio requirement. The voice signal data sampled from the trunk interface therefore cannot be used directly; the PCM code stream must first be converted to linear form, which requires operations such as multiplication and greatly increases the computation of voice detection.
The 3-bit segment code in the PCM code stream likewise represents the amplitude of the voice signal, only with lower quantization precision. Because the present invention uses interval sampling of the PCM code stream, quantization precision has little influence on the final statistics, so the 3-bit segment code in the PCM code stream can be used directly as the amplitude of the voice signal.
A PCM code b1b2b3b4b5b6b7b8 has 8 bits. Bit b1 is the sign bit; since the absolute value is needed when computing the amplitude, b1 is discarded. What is used is the 3-bit segment code b2b3b4, which represents the amplitude of the voice signal. The b2b3b4 bits of the PCM code have only 8 one-to-one correspondences with linear amplitudes, so the linear amplitude value corresponding to the segment code can be obtained by table lookup; the correspondence is shown in Table 1:
Table 1
PCM code bits b2b3b4    Corresponding linear amplitude
000                     0
001                     1
010                     2
011                     4
100                     8
101                     16
110                     32
111                     64
(Note: for A-law PCM codes, the even bits must be inverted before conversion.)
The accumulated voice signal amplitude within the detection frame length is then:
$$V = \sum_{n=1}^{25} x_s(n)$$
V is the amplitude of the voice signal within the detection frame length when fs = 100 Hz and T = 250 ms, and x_s(n) is the amplitude value of the n-th interval-sampled voice sample.
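Steps (102) to (104) can be sketched in Python as a table lookup followed by a sum. The amplitude table is Table 1. The even-bit inversion is implemented here as an XOR with 0x55, which is an assumption based on the usual G.711 A-law convention rather than something the text spells out; the function names and the `a_law_inverted` flag are illustrative.

```python
# Table 1: linear amplitude for each 3-bit segment code b2b3b4.
SEGMENT_AMPLITUDE = (0, 1, 2, 4, 8, 16, 32, 64)


def segment_code(pcm_byte, a_law_inverted=True):
    """Extract b2b3b4 from an 8-bit PCM code (b1 is the most significant bit)."""
    if a_law_inverted:
        pcm_byte ^= 0x55        # assumed: undo A-law even-bit inversion
    return (pcm_byte >> 4) & 0x07


def frame_amplitude(pcm_bytes, a_law_inverted=True):
    """V = sum of the looked-up amplitudes over one detection frame."""
    return sum(
        SEGMENT_AMPLITUDE[segment_code(b, a_law_inverted)] for b in pcm_bytes
    )


silent_frame = [0xD5] * 25      # 0xD5 ^ 0x55 = 0x80, i.e. segment 0
loud_frame = [0x2A] * 25        # 0x2A ^ 0x55 = 0x7F, i.e. segment 7
```

Only a shift, a mask, a lookup, and an addition are needed per sample, so no multiplication is involved. With 25 samples per frame, V ranges from 0 to 25 × 64 = 1600.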
Tests show that this simplified voice amplitude calculation captures the actual amplitude variation of the PCM code stream voice. In Fig. 4, series 1 is the waveform of voice signal amplitude over time obtained after normal linear conversion of the PCM code stream, and series 2 is the waveform obtained with the simplified voice amplitude calculation above. The variation trends of the two waveforms are fully consistent, with only very small differences in the details of the amplitude values, so the simplified voice amplitude calculation of the present invention meets the needs of PCM code stream voice detection.
Step (105): threshold detection: judge whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold.
The amplitude of the voice signal within the detection frame length is obtained with the simplified voice amplitude calculation described above. If this amplitude satisfies the detection threshold, a voice signal is judged to be present within the detection frame length; if it does not, no voice signal is judged to be present within the detection frame length.
The detection threshold can be set in two ways. The first is a fixed amplitude detection threshold: the amplitude of the voice signal on the trunk port is compared with the amplitude of an ordinary voice signal. If it exceeds the amplitude detection threshold, a voice signal is judged to be present; if not, no voice signal is judged to be present. The fixed threshold is simple to implement and suits systems that do not demand high detection accuracy; its shortcoming is that in a noisy environment the noise amplitude may also exceed the configured threshold.
The second is an adaptive amplitude detection threshold. A relatively simple implementation computes the average amplitude of the voice signal over the preceding frames and adds a detection margin ΔV to that average to obtain the adaptive amplitude detection threshold. If the amplitude exceeds this threshold, a voice signal is judged to be present; if not, no voice signal is judged to be present.
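The two threshold schemes can be sketched in Python as follows. The function names, the numeric values, and the handling of an empty history (treated here as an average of 0) are assumptions made for illustration; the text does not specify how many preceding frames are averaged.

```python
def detect_fixed(frame_amp, threshold):
    """Fixed threshold: voice is present if the frame amplitude exceeds it."""
    return frame_amp > threshold


def detect_adaptive(frame_amp, previous_amps, delta_v):
    """Adaptive threshold: mean of preceding frame amplitudes plus delta_v."""
    if previous_amps:
        mean_amp = sum(previous_amps) / len(previous_amps)
    else:
        mean_amp = 0.0          # assumption: no history yet
    return frame_amp > mean_amp + delta_v


history = [10, 20, 30]          # hypothetical amplitudes of preceding frames
voice_now = detect_adaptive(50, history, delta_v=25)
```

In this example the adaptive threshold is 20 + 25 = 45, so a frame amplitude of 50 is classified as voice while 40 would not be.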
The steps above are the preferred embodiment of the PCM code stream voice detection method of the present invention. Alternatively, after obtaining the PCM code stream data at the sampling frequency fs, the present invention can obtain the amplitude of the voice signal by linear conversion of the sampled PCM code stream data. As shown in Fig. 5, embodiment two of the method comprises the following steps:
Step (501): obtain PCM code stream data from the trunk interface at a sampling frequency fs;
Step (502): convert the PCM code stream to linear form to obtain the amplitude data of the voice signal within the detection frame length;
Step (503): threshold detection: judge whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold.
Steps (501) and (503) are the same as the corresponding steps above and are not repeated; step (502) is described below:
A PCM code stream is non-uniformly coded, using either A-law or μ-law coding, in order to improve the resolution of small signals and to approach the quantization precision of 13-bit coding with an 8-bit code. In the A-law 13-segment companding method, for example, a folded binary code is used: the most significant bit represents the polarity, the input signal range is divided into 8 non-uniform segments represented by a 3-bit code, and each segment is further divided into 16 levels, represented by a 4-bit code, to meet the quantization signal-to-noise ratio requirement.
The correspondence between PCM codes and 13-bit linear codes is shown in Table 2:
Table 2
PCM code b1b2b3b4b5b6b7b8    13-bit linear code
k000wxyz                     s0000000wxyz1
k001wxyz                     s0000001wxyz1
k010wxyz                     s000001wxyz10
k011wxyz                     s00001wxyz100
k100wxyz                     s0001wxyz1000
k101wxyz                     s001wxyz10000
k110wxyz                     s01wxyz100000
k111wxyz                     s1wxyz1000000
Linear conversion of the PCM code stream uses the correspondence between PCM codes and 13-bit linear codes shown in Table 2 to convert the 8-bit non-uniformly coded data into 13-bit linear voice data, restoring the amplitude information of the actual voice signal and yielding the amplitude of the voice signal within the detection frame length.
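The mapping of Table 2 can be expressed with shifts alone, as in the following Python sketch. The function name `pcm_to_linear13` is illustrative. The input is assumed to be the PCM code b1..b8 with any line-side bit inversion already undone, and the sign bit is returned as-is because the text does not state which value denotes positive polarity.

```python
def pcm_to_linear13(code):
    """Map an 8-bit PCM code to (sign bit, 12-bit magnitude) per Table 2."""
    sign_bit = (code >> 7) & 0x01      # k in Table 2, copied to s
    segment = (code >> 4) & 0x07       # b2b3b4
    wxyz = code & 0x0F                 # b5b6b7b8
    if segment == 0:
        magnitude = (wxyz << 1) | 1                      # 0000000wxyz1
    else:
        # Prepend the leading 1, append the trailing 1, then shift by segment.
        magnitude = (((0x10 | wxyz) << 1) | 1) << (segment - 1)
    return sign_bit, magnitude


def frame_amplitude_linear(codes):
    """Sum of linear magnitudes over one detection frame."""
    return sum(pcm_to_linear13(c)[1] for c in codes)
```

For instance, code `0x2A` (segment 010, wxyz = 1010) maps to the magnitude bits 000001101010, that is 106, and the largest magnitude, for segment 111 with wxyz = 1111, is 4032.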
The steps of the present invention described above are built on the characteristics of PCM code stream voice, which are as follows:
Although a voice signal is non-stationary and time-varying, it has instantaneous stationarity: it is a quasi-periodic signal with a period of 3.3 to 16 ms, and its basic characteristic is short-time stationarity, that is, within a short interval (a few tens of ms) its signal characteristics remain essentially unchanged (stationary). Over longer intervals, however, the voice signal is non-stationary and changes sharply.
The characteristics of PCM code stream voice detection on the incoming and outgoing trunk interfaces of a telecommunications switch are:
1. Real-time requirements are low
When a telecommunications switch performs PCM code stream voice detection, it does not need to detect the exact start and end points of speech. It only needs to judge whether a voice signal was present in the incoming and outgoing directions during the past period (a few seconds) of the connected call, as a basis for locating voice quality problems. The real-time requirements on voice detection are therefore low, and a detection time precision within 1 s is sufficient;
2. Consistency matters more than accuracy
When a telecommunications switch performs PCM code stream voice detection to locate voice quality problems, the key is to compare the incoming and outgoing trunk PCM code stream voice detection results and check whether they are consistent over a period of the connected call. The voice detection result on a trunk port therefore does not need to be 100% correct, but it must be consistent: for the same input voice, the results detected on the incoming and outgoing sides should agree.
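The consistency check described in point 2 could be sketched as a comparison of per-frame detection results. The agreement ratio used here is a hypothetical metric chosen for illustration; the text only requires that incoming and outgoing results be compared over a period of the call and does not define how.

```python
def consistency(incoming_flags, outgoing_flags):
    """Fraction of frames whose incoming and outgoing detection results agree."""
    pairs = list(zip(incoming_flags, outgoing_flags))
    if not pairs:
        return 1.0              # assumption: nothing to disagree about
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical per-frame results for one second of a call (four 250 ms frames).
incoming = [True, True, False, False]
outgoing = [True, False, False, False]
ratio = consistency(incoming, outgoing)
```

A ratio that stays low over several seconds would point to a problem such as one-way audio or crosstalk on that call, whereas an occasional disagreeing frame is tolerable under the low accuracy requirement stated above.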
Based on the discussion above, the present invention is feasible and brings the following beneficial effects:
1. When PCM code stream voice detection is performed, the amount of data sampled from the PCM code stream is greatly reduced;
2. The data processing flow of PCM code stream voice detection can be simplified, since linear conversion of the PCM code stream is no longer required, which increases the computation speed of voice detection;
3. PCM code stream voice detection can be performed simultaneously on high-capacity trunks in a telecommunications switch.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (7)

1. A PCM code stream voice detection method, characterized in that a voice signal is detected at a trunk interface and a detection threshold is set in advance, the method comprising:
Step (A): obtaining PCM code stream data from the trunk interface at a sampling frequency fs;
Step (B): obtaining, from the PCM code stream data, the amplitude of the voice signal within a detection frame length;
Step (C): judging whether the amplitude of the voice signal within the detection frame length satisfies the detection threshold; if it does, a voice signal is judged to be present within the detection frame length; if it does not, no voice signal is judged to be present within the detection frame length.
2. The method of claim 1, characterized in that the amplitude of the voice signal within the detection frame length in step (B) is obtained with a simplified voice signal amplitude calculation, which further comprises:
Step (B1): obtaining the 3-bit segment code of each sampled PCM code;
Step (B2): looking up the linear amplitude value corresponding to the 3-bit segment code in a table;
Step (B3): accumulating the amplitude values within the detection frame length to obtain the amplitude of the voice signal within the detection frame length.
3. The method of claim 1, characterized in that the amplitude of the voice signal within the detection frame length in step (B) is obtained by converting the 8-bit non-uniformly coded data into 13-bit linear voice data according to the correspondence between PCM codes and 13-bit linear codes, restoring the amplitude information of the actual voice signal.
4. The method of claim 1, characterized in that the sampling frequency in step (A) satisfies fs >> 1/ts, where ts is the short-time stationary time of the voice signal.
5. The method of claim 4, characterized in that the detection frame length further satisfies T >> 1/fs.
6. The method of claim 1, characterized in that the detection threshold is a fixed amplitude detection threshold or an adaptive amplitude detection threshold.
7. The method of claim 6, characterized in that the adaptive amplitude detection threshold is obtained by computing the average amplitude of the voice signal over the preceding frames and adding a detection margin to that average.
CN 200610075906 2006-04-24 2006-04-24 PCM code flow voice detection method Pending CN101046955A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN 200610075906 CN101046955A (en) 2006-04-24 2006-04-24 PCM code flow voice detection method
PCT/CN2007/000419 WO2007121648A1 (en) 2006-04-24 2007-02-07 A method of pcm code stream speech detection and the apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200610075906 CN101046955A (en) 2006-04-24 2006-04-24 PCM code flow voice detection method

Publications (1)

Publication Number Publication Date
CN101046955A true CN101046955A (en) 2007-10-03

Family

ID=38624540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610075906 Pending CN101046955A (en) 2006-04-24 2006-04-24 PCM code flow voice detection method

Country Status (2)

Country Link
CN (1) CN101046955A (en)
WO (1) WO2007121648A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103780770A (en) * 2012-10-24 2014-05-07 中国移动通信集团河南有限公司 Crosstalk detecting method and device
CN105825864A (en) * 2016-05-19 2016-08-03 南京奇音石信息技术有限公司 Double-talk detection and echo cancellation method based on zero-crossing rate
CN106612168A (en) * 2016-12-23 2017-05-03 中国电子科技集团公司第三十研究所 Voice out-of-synchronism detection method based on PCM coding characteristics
CN106941644A (en) * 2016-01-05 2017-07-11 中兴通讯股份有限公司 The sounds trigger method and smart machine of a kind of smart machine
CN110600060A (en) * 2019-09-27 2019-12-20 云知声智能科技股份有限公司 Hardware audio active detection HVAD system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5619564A (en) * 1995-05-31 1997-04-08 Lucent Technologies Inc. Tone detector with improved performance in the presence of speech
KR0149759B1 (en) * 1995-11-20 1998-11-02 김광호 Dtmf detector using dsp chip
US6219635B1 (en) * 1997-11-25 2001-04-17 Douglas L. Coulter Instantaneous detection of human speech pitch pulses

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103780770A (en) * 2012-10-24 2014-05-07 中国移动通信集团河南有限公司 Crosstalk detecting method and device
CN106941644A (en) * 2016-01-05 2017-07-11 中兴通讯股份有限公司 The sounds trigger method and smart machine of a kind of smart machine
WO2017118037A1 (en) * 2016-01-05 2017-07-13 中兴通讯股份有限公司 Voice-triggering method for smart device and smart device
CN105825864A (en) * 2016-05-19 2016-08-03 南京奇音石信息技术有限公司 Double-talk detection and echo cancellation method based on zero-crossing rate
CN105825864B (en) * 2016-05-19 2019-10-25 深圳永顺智信息科技有限公司 Both-end based on zero-crossing rate index is spoken detection and echo cancel method
CN106612168A (en) * 2016-12-23 2017-05-03 中国电子科技集团公司第三十研究所 Voice out-of-synchronism detection method based on PCM coding characteristics
CN106612168B (en) * 2016-12-23 2019-07-16 中国电子科技集团公司第三十研究所 A kind of voice step failing out detecting method based on pcm encoder feature
CN110600060A (en) * 2019-09-27 2019-12-20 云知声智能科技股份有限公司 Hardware audio active detection HVAD system
CN110600060B (en) * 2019-09-27 2021-10-22 云知声智能科技股份有限公司 Hardware audio active detection HVAD system

Also Published As

Publication number Publication date
WO2007121648A1 (en) 2007-11-01

Similar Documents

Publication Publication Date Title
CN1248190C (en) Fast frequency-domain pitch estimation
CN1266674C (en) Closed-loop multimode mixed-domain linear prediction (MDLP) speech coder
CN1188831C (en) System and method for voice recognition with a plurality of voice recognition engines
CN1175398C (en) Sound activation detection method for identifying speech and music from noise environment
CN101046955A (en) PCM code flow voice detection method
CN1675684A (en) Distributed speech recognition with back-end voice activity detection apparatus and method
CN1146862C (en) Pitch extraction method and device
CN1991976A (en) Phoneme based voice recognition method and system
CN1271594C (en) Pitch determination method and apparatus on spectral analysis
CN1922659A (en) Coding model selection
PT1239465E (en) METHOD AND APPARATUS FOR SELECTING A CODE RATE ON A VARIABLE VOWAL CODE
CN1703734A (en) Method and apparatus for determining musical notes from sounds
CN1737903A (en) Method and apparatus for speech decoding
CN1273662A (en) Vocoder-based voice recognizer
CN1123862C (en) Speech recognition special-purpose chip based speaker-dependent speech recognition and speech playback method
CN1212607C (en) Predictive speech coder using coding scheme selection patterns to reduce sensitivity to frame errors
CN1815558A (en) Low bit-rate coding of unvoiced segments of speech
CN1447963A (en) Method for noise robust classification in speech coding
CN1949364A (en) System and method for testing identification degree of input speech signal
CN1300049A (en) Method and apparatus for identifying speech sound of chinese language common speech
JPS5870299A (en) Discrimination of and analyzer for voice signal
CN1238513A (en) Speech recognition method
CN101030374A (en) Method and apparatus for extracting base sound period
CN1311421C (en) Apparatus and method for voice activity detection
CN1234898A (en) Transmitter with improved speech encoder and decoder

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20071003