WO2000079516A1 - Speech/non-speech frame discriminator and method for discriminating between speech/non-speech frames - Google Patents
- Publication number
- WO2000079516A1 (PCT/JP2000/003954)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice
- determination
- frame
- input signal
- band
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04J—MULTIPLEX COMMUNICATION
- H04J3/00—Time-division multiplex systems
- H04J3/17—Time-division multiplex systems in which the transmission channel allotted to a first user may be taken away and re-allotted to a second user if the first user becomes inactive, e.g. TASI
Definitions
- the present invention relates to a voice / non-voice frame determination device and a determination method thereof, and in particular to a voice / non-voice frame determination method that determines whether an input signal is voice or non-voice for each fixed section (fixed-length frame).
- Conventional technology relates to a voice / non-voice frame determination method.
- Conventional devices for this type of voice frame detection distinguish non-voice sections from voice sections so that the average transmission rate can be reduced by encoding the non-voice sections at a lower rate than the voice sections. One example is the determination device used in ITU-T Recommendation G.729 Annex B. This conventional device determines whether each 10 msec frame is a speech section (speech frame) or a non-speech section (non-speech frame) using four types of feature parameters extracted from the input signal. The determination is made by comparing the extracted feature parameters with predetermined determination parameters (thresholds).
- the unit length dividing circuit 20 divides the signal input from the input terminal 10 into frames of a fixed length (for example, 10 msec) and passes them to the test circuit 40.
- the test circuit 40 determines whether the input signal passed from the unit length dividing circuit 20 in frame units is a voice section or a non-voice section, and outputs the determination result in frame units from the output terminal 60.
- a speech / non-speech frame determination device that determines whether an input signal is speech or non-speech for each fixed section is provided, comprising a band conversion unit that performs band conversion of the input signal and a determination means that makes the determination based on the signal after the band conversion.
- the determination means is designed for a signal limited to a predetermined band.
- a speech / non-speech frame determination device that determines whether an input signal is speech or non-speech for each fixed section is also provided, comprising a dividing unit that divides the fixed section into shorter short sections, a determination unit that makes the determination for each short section, and a unit length conversion unit that makes the determination for the fixed section based on those determination results.
- the unit length conversion means is characterized in that, when any one of the short sections is determined to be speech, the determination for the fixed section is also set to speech.
- the apparatus further includes a band conversion unit that performs band conversion on the input signal, and the determination unit performs the determination based on the signal after the band conversion. Further, the determination means is designed for a signal limited to a predetermined band and unit length.
- a voice / non-voice frame determination method for determining whether an input signal is voice or non-voice for each fixed section is provided, which includes a step of performing band conversion on the input signal and a step of performing the determination based on the signal after the band conversion.
- a speech / non-speech frame determination method for determining whether an input signal is speech or non-speech for each fixed section is also provided, which includes a step of dividing the fixed section into shorter short sections, a step of performing the determination for each short section, and a step of determining the result for the fixed section based on those determination results.
- the method further includes a step of performing band conversion on the input signal, wherein the determination for each short section is performed based on the signal after band conversion.
- the determination results obtained for each predetermined unit length are combined to determine the determination result corresponding to the frame. For example, when any one of the determination results for the predetermined unit length is “voice”, the determination result corresponding to the frame can be “voice”.
- the frame length must be equal to or longer than the predetermined unit time length.
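The combination rule and the length constraint stated above can be written compactly. The symbols are illustrative and do not appear in the source: d_{m,k} is the determination for the k-th unit of frame m (1 for voice, 0 for non-voice), D_m is the determination for frame m, T_f is the frame length and T_u is the predetermined unit length. The sketch also assumes that T_f is a whole multiple of T_u, which the source does not state.

```latex
\[
K = \frac{T_f}{T_u} \ge 1, \qquad
D_m = \bigvee_{k=0}^{K-1} d_{m,k} = \max_{0 \le k < K} d_{m,k}
\]
% Example with the lengths used in the description:
% T_f = 10 msec and T_u = 2.5 msec give K = 4 units per frame.
```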
- FIG. 1 is a block diagram illustrating a configuration example of a first voice / non-voice frame determination device according to the present invention.
- FIG. 2 is a flowchart showing the operation of the block in FIG.
- FIG. 3 is a block diagram illustrating a configuration example of a second voice / non-voice frame determination device according to the present invention.
- FIG. 4 is a flowchart showing the operation of the block in FIG.
- FIG. 5 is a diagram showing a frame configuration for explaining the operation of the block in FIG.
- FIG. 6 is a block diagram illustrating a configuration example of a third voice / non-voice frame determination device according to the present invention.
- FIG. 7 is a flowchart showing the operation of the block in FIG.
- FIG. 8 is a block diagram illustrating a configuration example of a conventional voice / non-voice frame determination device.
- FIG. 1 is a block diagram showing the configuration of a first voice / non-voice frame determination device according to the present invention, and the same parts as those in FIG. 8 are denoted by the same reference numerals.
- FIG. 2 is a flowchart showing a flow of the operation.
- the unit length dividing circuit 20 divides the signal input from the input terminal 10 into frames of a predetermined fixed length (for example, 10 msec) (step S10) and passes them to the band conversion circuit 30.
- the band conversion circuit 30 limits the frequency band of the frame-length input signal passed from the unit length dividing circuit 20 to a frequency band that can be tested by the test circuit 40 (step S11) and passes it to the test circuit 40.
- the test circuit 40 determines whether the frame-unit input signal passed from the band conversion circuit 30 is a voice section or a non-voice section (step S12). The determination result is output from the output terminal 60.
- as the band conversion circuit 30, for example, a circuit having a band-pass filter function or a low-pass filter function can be used. The band of the input signal must, of course, be the same as or wider than the band after conversion by this band conversion circuit.
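Steps S10 to S12 of the first device can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit described in the source: the 31-tap Hamming-windowed FIR low-pass design, the mean-power comparison standing in for the test circuit 40, and all function names are hypothetical. Filtering each frame on its own with zero padding is a further simplification.

```python
import math


def split_into_frames(samples, sample_rate_hz, frame_ms=10.0):
    """Step S10: cut the signal into fixed-length frames (a trailing partial frame is dropped)."""
    n = int(round(sample_rate_hz * frame_ms / 1000.0))
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=31):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unity gain at DC (assumed design)."""
    fc = cutoff_hz / sample_rate_hz          # cutoff as a fraction of the sampling rate
    mid = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        x = k - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def band_convert(frame, taps):
    """Step S11: limit the band of one frame by centred FIR filtering with zero padding."""
    half = len(taps) // 2
    out = []
    for n in range(len(frame)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = n + half - k
            if 0 <= j < len(frame):
                acc += t * frame[j]
        out.append(acc)
    return out


def is_voice(frame, threshold):
    """Step S12 stand-in: compare the mean power of the frame with one fixed threshold."""
    power = sum(s * s for s in frame) / len(frame)
    return power > threshold
```

For instance, assuming a 16 kHz input, `lowpass_taps(4000, 16000)` and a threshold of 0.1, a unit-amplitude 1 kHz tone still passes the test after `band_convert`, while a 6 kHz tone that would pass unfiltered no longer does. The test therefore only sees the band it was designed for.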
- FIG. 3 is a block diagram showing a second speech / non-speech frame discriminating apparatus according to the present invention, and the same parts as those in FIGS.
- FIG. 4 is a flowchart showing the flow of the operation.
- the unit length dividing circuit 20 divides the signal input from the input terminal 10 into unit lengths (for example, 2.5 msec) shorter than the frame length (for example, 10 msec) as shown in FIG. (step S20) and passes them to the test circuit 40.
- the test circuit 40 determines whether each short unit length passed from the unit length dividing circuit 20 is a voice section or a non-voice section (step S21), and passes the determination results for each frame of the input signal to the unit length conversion circuit 50.
- the unit length conversion circuit 50 derives the determination result for one frame from the multiple determination results (the "Yes" and "No" results for the short sections in FIG. 5 (A)) passed from the test circuit 40 for each frame (step S22), and outputs it from the output terminal 60.
- as shown in FIG. 5 (A), if at least one of the short sections constituting one frame is judged as "Yes" (voice present), the judgment result for that frame is "Yes", as shown in FIG. 5 (B).
- the frame length must be equal to or longer than the predetermined unit time length.
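Steps S20 to S22 of the second device can be sketched as below. The sketch is illustrative only: the mean-power comparison standing in for the test circuit 40, the default threshold, and the function names are assumptions. It also requires the frame length to be a whole multiple of the unit length, which is stricter than the source's condition that the frame length be equal to or longer than the unit length.

```python
def split_into_units(samples, unit_len):
    """Step S20: cut the signal into short units of unit_len samples (trailing remainder dropped)."""
    return [samples[i:i + unit_len]
            for i in range(0, len(samples) - unit_len + 1, unit_len)]


def unit_is_voice(unit, threshold):
    """Step S21 stand-in: mean power of one short unit compared with a fixed threshold."""
    return sum(s * s for s in unit) / len(unit) > threshold


def frame_decisions(samples, sample_rate_hz, frame_ms=10.0, unit_ms=2.5, threshold=0.01):
    """Return one voice / non-voice flag per frame, derived from the short-unit flags."""
    unit_len = int(round(sample_rate_hz * unit_ms / 1000.0))
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    # The frame must be at least one unit long; the whole-multiple check is an added assumption.
    if unit_len <= 0 or frame_len < unit_len or frame_len % unit_len != 0:
        raise ValueError("frame length must be a positive multiple of the unit length")
    units_per_frame = frame_len // unit_len

    unit_flags = [unit_is_voice(u, threshold) for u in split_into_units(samples, unit_len)]

    # Step S22: a frame is voice if at least one of its short units is voice.
    return [any(unit_flags[i:i + units_per_frame])
            for i in range(0, len(unit_flags) - units_per_frame + 1, units_per_frame)]
```

Because the per-unit test always runs on the same unit length, the same threshold can be reused when only `frame_ms` changes.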
- FIG. 6 is a block diagram showing a third speech / non-speech frame determination device according to the present invention, and the same parts as those in FIGS.
- FIG. 7 is a flowchart showing the flow of the operation.
- the unit length dividing circuit 20 divides the signal input from the input terminal 10 into unit lengths (for example, 2.5 msec) shorter than the frame length (for example, 10 msec) as shown in FIG. (step S30) and passes them to the band conversion circuit 30.
- the band conversion circuit 30 limits the frequency band of the input signal passed from the unit length dividing circuit 20 to a frequency band that can be tested by the test circuit 40 (step S31) and passes it to the test circuit 40.
- the test circuit 40 determines whether each short unit length passed from the band conversion circuit 30 is a voice section or a non-voice section (step S32), and passes the determination results for each frame of the input signal to the unit length conversion circuit 50.
- the unit length conversion circuit 50 derives the determination result for one frame from the multiple determination results (the "Yes" and "No" results for the short sections in FIG. 5 (A)) passed from the test circuit 40 for each frame (step S33), and outputs it from the output terminal 60.
- the first effect is that a voice / non-voice frame determination device that handles various input signals with different frequency bands can obtain appropriate determination results in all cases.
- the reason is that the judgment can be made with a single judgment parameter.
- the second effect is that a voice / non-voice frame determination device that handles various input signals with different unit lengths (frame lengths) for the determination can obtain appropriate determination results in all cases. The reason is that these cases can be handled with a single judgment parameter.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Time-Division Multiplex Systems (AREA)
- Telephone Function (AREA)
Description
Claims
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002375330A CA2375330A1 (en) | 1999-06-23 | 2000-06-16 | Speech/non-speech frame discriminator and method for discriminating between speech/non-speech frames |
EP00939086A EP1217607A4 (en) | 1999-06-23 | 2000-06-16 | DEVICE AND METHOD FOR DETERMINING VOICED / NON-VOICED FRAMES |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP11176167A JP2001005473A (ja) | 1999-06-23 | 1999-06-23 | Voice / non-voice frame determination device and determination method thereof |
JP11/176167 | 1999-06-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000079516A1 (fr) | 2000-12-28 |
Family
ID=16008844
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2000/003954 WO2000079516A1 (fr) | Speech/non-speech frame discriminator and method for discriminating between speech/non-speech frames | 1999-06-23 | 2000-06-16 |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1217607A4 (ja) |
JP (1) | JP2001005473A (ja) |
CA (1) | CA2375330A1 (ja) |
WO (1) | WO2000079516A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09152894A (ja) * | 1995-11-30 | 1997-06-10 | Denso Corp | 有音無音判別器 |
JP2656586B2 (ja) * | 1988-11-30 | 1997-09-24 | 株式会社日立製作所 | 音声検出方法及びその装置 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB1486868A (en) * | 1974-05-17 | 1977-09-28 | Int Standard Electric Corp | Conference circuit |
JP3355473B2 (ja) * | 1996-12-18 | 2002-12-09 | 京セラ株式会社 | 音声検出方法 |
-
1999
- 1999-06-23 JP JP11176167A patent/JP2001005473A/ja active Pending
-
2000
- 2000-06-16 EP EP00939086A patent/EP1217607A4/en not_active Withdrawn
- 2000-06-16 WO PCT/JP2000/003954 patent/WO2000079516A1/ja not_active Application Discontinuation
- 2000-06-16 CA CA002375330A patent/CA2375330A1/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See also references of EP1217607A4 * |
Also Published As
Publication number | Publication date |
---|---|
CA2375330A1 (en) | 2000-12-28 |
EP1217607A4 (en) | 2005-05-04 |
JP2001005473A (ja) | 2001-01-12 |
EP1217607A1 (en) | 2002-06-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6807525B1 (en) | SID frame detection with human auditory perception compensation | |
US7171357B2 (en) | Voice-activity detection using energy ratios and periodicity | |
US5848384A (en) | Analysis of audio quality using speech recognition and synthesis | |
JP4995913B2 (ja) | 信号変化検出のためのシステム、方法、および装置 | |
KR100986957B1 (ko) | 토널 컴포넌트들을 감지하는 시스템들, 방법들, 및 장치들 | |
JP5284477B2 (ja) | 音声データの伝送にエラーがある際のエラー隠蔽方法 | |
JP2002366174A (ja) | G.729の付属書bに準拠した音声アクティビティ検出回路を収束させるための方法 | |
AU689300B2 (en) | Test method | |
JPH08505715A (ja) | 定常的信号と非定常的信号との識別 | |
US6950511B2 (en) | Detection of both voice and tones using Goertzel filters | |
JPS62204652A (ja) | 可聴周波信号識別方式 | |
PT1554717E (pt) | Pré-processamento de dados digitais áudio para codificadores/descodificadores de áudio móveis | |
WO2000079516A1 (fr) | Dispositif et procede de determination de trames voisees/non voisees | |
EP1424684A1 (en) | Voice activity detection apparatus and method | |
JP2003514262A (ja) | 割込みのない言語品質の評価 | |
US20070100611A1 (en) | Speech codec apparatus with spike reduction | |
US7277537B2 (en) | Tone, modulated tone, and saturated tone detection in a voice activity detection device | |
CA2279264C (en) | Speech immunity enhancement in linear prediction based dtmf detector | |
IL108401A (en) | Method and apparatus for indicating the emotional state of a person | |
JP4309749B2 (ja) | 帯域制限を考慮した音声品質客観評価装置 | |
JP2905112B2 (ja) | 環境音分析装置 | |
JPH03241400A (ja) | 音声検出器 | |
JPS5915228B2 (ja) | シングルチヤネルコ−デツク監視装置 | |
JPH0836400A (ja) | 音声状態判定回路 | |
JPS62260442A (ja) | 故障検出回路 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): CA US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): DE FR GB |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 10019123; Country of ref document: US |
| | WWE | Wipo information: entry into national phase | Ref document number: 2000939086; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2375330; Country of ref document: CA; Kind code of ref document: A; Format of ref document f/p: F |
| | WWP | Wipo information: published in national office | Ref document number: 2000939086; Country of ref document: EP |
| | WWW | Wipo information: withdrawn in national office | Ref document number: 2000939086; Country of ref document: EP |