KR970050112A - Real time voice recognition - Google Patents
- Publication number
- KR970050112A
- Authority
- KR
- South Korea
- Prior art keywords
- speech
- signal
- feature
- neural network
- value
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/04—Segmentation; Word boundary detection
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Telephonic Communication Services (AREA)
Abstract
The present invention relates to a speech recognition method comprising: a signal difference accumulation process that stores signal differences obtained from a plurality of simple voice signals; a segmentation process that computes a sequence made up of the maximum values of the stored signal differences and, from that sequence, separates the speech into syllables and distinguishes consonants from vowels; a normalization process that normalizes the speech signal in the time domain according to the consonants and vowels separated in the segmentation process and extracts speech features; and a classification process that applies the extracted features to an integer-input multilayer perceptron neural network, which has the same learning constant and activation function as a real-valued neural network, to classify the speech from the features.
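The accumulation, segmentation, and time-normalization steps can be pictured with a minimal sketch. The abstract gives no frame sizes, thresholds, or exact difference measure, so every name below (`frame_len`, `threshold`, `target_len`), the absolute-difference measure, and the nearest-index resampling are assumptions made for illustration, not the patented method. The sketch also stops at locating active regions; it does not attempt the consonant/vowel distinction the abstract mentions.

```python
from typing import List, Tuple


def frame_max_differences(samples: List[int], frame_len: int) -> List[int]:
    """For each non-overlapping frame, keep the largest absolute difference
    between consecutive samples (an assumed reading of the accumulation step)."""
    maxima = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        diffs = [abs(b - a) for a, b in zip(frame, frame[1:])]
        maxima.append(max(diffs) if diffs else 0)
    return maxima


def segment(maxima: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, where the per-frame
    maximum stays at or above the (hypothetical) threshold."""
    segments = []
    start = None
    for i, m in enumerate(maxima):
        if m >= threshold and start is None:
            start = i
        elif m < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(maxima)))
    return segments


def time_normalize(values: List[int], target_len: int) -> List[int]:
    """Stretch or shrink a segment to a fixed length by nearest-index lookup,
    so every segment yields a feature vector of the same size."""
    if not values:
        return [0] * target_len
    n = len(values)
    return [values[(i * n) // target_len] for i in range(target_len)]


samples = [0, 0, 0, 0, 0, 50, -40, 30, 10, 60, -50, 20, 0, 1, 0, 0]
maxima = frame_max_differences(samples, frame_len=4)
regions = segment(maxima, threshold=10)
features = [time_normalize(maxima[s:e], target_len=4) for s, e in regions]
```

Fixing the feature length here is what lets a network with a fixed number of inputs consume segments of varying duration.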
According to the present invention, a speech recognition system can be implemented that is low-cost and compact, yet capable of real-time processing with a high recognition rate.
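One way integer inputs can support low-cost, real-time hardware is a fixed-point forward pass through a multilayer perceptron. The sketch below is only an illustration under stated assumptions: the abstract specifies integer inputs but not the number format, so the fixed-point scale `SCALE`, the integer weights, and the helper names `sigmoid_fixed`, `layer_forward`, and `classify` are all hypothetical.

```python
import math
from typing import List, Tuple

SCALE = 256  # assumed fixed-point scale: the integer 256 represents 1.0


def sigmoid_fixed(x: int) -> int:
    """Sigmoid of a fixed-point value, returned as a fixed-point value.
    A small device would likely use a precomputed lookup table instead;
    math.exp is used here only to keep the sketch short."""
    return round(SCALE / (1.0 + math.exp(-x / SCALE)))


def layer_forward(inputs: List[int], weights: List[List[int]],
                  biases: List[int]) -> List[int]:
    """One fully connected layer using integer multiply-accumulate."""
    outputs = []
    for w_row, b in zip(weights, biases):
        # Products carry scale SCALE*SCALE, so the bias is lifted to match.
        acc = b * SCALE + sum(w * x for w, x in zip(w_row, inputs))
        outputs.append(sigmoid_fixed(acc // SCALE))
    return outputs


def classify(features: List[int],
             layers: List[Tuple[List[List[int]], List[int]]]) -> int:
    """Run the features through every layer and return the index of the
    most active output unit."""
    activations = features
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return max(range(len(activations)), key=activations.__getitem__)


# Hypothetical two-class network with a single layer.
layers = [([[256, 0], [0, 256]], [0, 0])]
label = classify([512, -512], layers)
```

Keeping the accumulation in integers avoids floating-point hardware; only the activation needs a nonlinear lookup.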
Description
Because this publication discloses only the essential parts, the full text is not included.
FIG. 1 is a block diagram showing the hardware configuration of a voice recognition toy according to the present invention.
FIG. 2 is a detailed block diagram of the speech recognition apparatus shown in FIG. 1.
FIG. 3 is a block diagram explaining the speech recognition algorithm according to the present invention.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1019950047885A KR100202424B1 (en) | 1995-12-08 | 1995-12-08 | Real time speech recognition method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1019950047885A KR100202424B1 (en) | 1995-12-08 | 1995-12-08 | Real time speech recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
KR970050112A | 1997-07-29 |
KR100202424B1 | 1999-06-15 |
Family
ID=19438634
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1019950047885A KR100202424B1 (en) | 1995-12-08 | 1995-12-08 | Real time speech recognition method |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR100202424B1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102413692B1 (en) | 2015-07-24 | 2022-06-27 | 삼성전자주식회사 | Apparatus and method for caculating acoustic score for speech recognition, speech recognition apparatus and method, and electronic device |
KR102192678B1 (en) | 2015-10-16 | 2020-12-17 | 삼성전자주식회사 | Apparatus and method for normalizing input data of acoustic model, speech recognition apparatus |
- 1995-12-08: KR application KR1019950047885A, patent KR100202424B1; status: not active (IP right cessation)
Also Published As
Publication number | Publication date |
---|---|
KR100202424B1 (en) | 1999-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zazo et al. | Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection. | |
Thomas et al. | Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions | |
US5805731A (en) | Adaptive statistical classifier which provides reliable estimates or output classes having low probabilities | |
Chang et al. | Robust CNN-based speech recognition with Gabor filter kernels. | |
CA2247006C (en) | Speech processing | |
CN109003625B (en) | Speech emotion recognition method and system based on ternary loss | |
CA2122575C (en) | Speaker independent isolated word recognition system using neural networks | |
CN110033756B (en) | Language identification method and device, electronic equipment and storage medium | |
US6151592A (en) | Recognition apparatus using neural network, and learning method therefor | |
US5809461A (en) | Speech recognition apparatus using neural network and learning method therefor | |
EP0586714B1 (en) | Speech recognition apparatus using neural network, and learning method therefor | |
US5101434A (en) | Voice recognition using segmented time encoded speech | |
US9972310B2 (en) | System and method for neural network based feature extraction for acoustic model development | |
Scherer et al. | Real-time emotion recognition from speech using echo state networks | |
CN113889099A (en) | Voice recognition method and system | |
KR970050112A (en) | Real time voice recognition | |
Sunny et al. | Feature extraction methods based on linear predictive coding and wavelet packet decomposition for recognizing spoken words in malayalam | |
Yousfi et al. | Isolated Iqlab checking rules based on speech recognition system | |
JPH06119476A (en) | Time sequential data processor | |
Sekhar et al. | Neural network models for spotting stop consonant-vowel (SCV) segments in continuous speech | |
Wang et al. | Speaker verification and identification using gamma neural networks | |
KR100211113B1 (en) | Learning method and speech recognition using chaotic recurrent neural networks | |
Sankar et al. | Noise immunization using neural net for speech recognition | |
Surampudi et al. | Speech signal processing using neural networks mapping the phonology of sanskrit language using neural networks | |
dos Santos Moura et al. | Source Extraction based on Binary Masking and Machine Learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
LAPS | Lapse due to unpaid annual fee |