CN85100180B - Method for recognizing Chinese speech using a computer

Info

Publication number: CN85100180B
Application number: CN85100180A
Authority: CN
Inventors: 严普强; 施昊; 靳怀义
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 1985-04-01
Filing date: 1985-04-01
Publication date: 1987-05-13
Also published as: CN85100180A

Abstract

The present invention relates to a method for recognizing Chinese speech with a computer and belongs to the field of speech recognition. Speech recognition devices built with this method do not depend on a specific user and are not limited to a particular vocabulary. The invention recognizes Chinese speech by phonemes, syllables and tones. The fundamental frequency of voiced sound is extracted, and a phase-locked loop generates frequency-multiplied pulses from it. These pulses drive synchronous sampling of the speech signal, and the sampled signal sequence is then analyzed, its features are extracted, and recognition is performed. The method can be applied to man-machine systems that use natural Chinese speech as input.

Description

A method for recognizing Chinese speech using a computer

The invention belongs to the field of speech recognition and uses a computer to recognize Chinese speech.

Current general-purpose speech analysis and recognition methods all sample the speech signal at uniform time intervals, divide it into frames by time, extract features from the time series of each frame, and then perform recognition. Such methods depend heavily on intonation and speaking rate, so recognition devices built this way depend on a specific speaker, and the vocabulary they can recognize is very limited. The speech recognition plug-in cards now on the market that are listed in the second issue of 1985 of "International Electronics Newspaper" are examples of this.

The present invention proposes a recognition device that does not depend on a specific user and is not restricted to a particular vocabulary. The device is characterized in that its processing, analysis and recognition of the speech signal take full account of the characteristics of Chinese speech and of the mechanisms of human speech production and hearing. The invention recognizes Chinese speech by phoneme, syllable and tone. For speech signals produced by vocal cord vibration, the invention proposes extracting the fundamental frequency and sampling synchronously with it; the sampled signal sequence is then analyzed, speech features are extracted, and recognition is performed.

Chinese speech is monosyllabic, and each syllable consists of one to several phonemes. The numbers of syllables and phonemes are both limited. The four tones of Chinese, and the voiced phonemes within a syllable that are produced by vocal cord vibration, play an important role. The invention takes full account of these characteristics of Chinese. A voiced signal is periodic or quasi-periodic, and its fundamental frequency changes as the intonation changes. If sampling at uniform time intervals is used, the data volume is very large and information ambiguities such as leakage errors are inevitably introduced. The synchronous sampling technique used in the invention significantly compresses the amount of voiced-sound data, and it also fully provides the features of the intonation and of its variation.
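
The leakage argument above can be illustrated with a small numerical sketch. It is not part of the patent: the toy three-harmonic waveform, the 130 Hz fundamental, the 8000 Hz uniform rate and the 64-sample frame are all assumptions chosen for illustration. The sketch compares the DFT of a frame sampled at uniform time intervals with the DFT of a frame sampled synchronously, that is, with exactly 64 samples per pitch period.

```python
import cmath
import math


def voiced(t, f0):
    """Toy periodic 'voiced' waveform: three harmonics of f0 (assumed amplitudes)."""
    return (1.00 * math.sin(2 * math.pi * f0 * t)
            + 0.50 * math.sin(2 * math.pi * 2 * f0 * t)
            + 0.25 * math.sin(2 * math.pi * 3 * f0 * t))


def dft_magnitudes(x):
    """Magnitudes of DFT bins 0 .. n/2-1, scaled so a sine of amplitude A reads A."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n))) * 2 / n
            for m in range(n // 2)]


def leakage_fraction(mags, n_peaks=3):
    """Share of spectral energy lying outside the n_peaks strongest bins."""
    energy = sorted((m * m for m in mags), reverse=True)
    return sum(energy[n_peaks:]) / sum(energy)


F0 = 130.0        # assumed fundamental frequency in Hz
N = 64            # samples per frame (and per pitch period when synchronous)
FS_UNIFORM = 8000.0  # assumed fixed sampling rate in Hz

# Uniform sampling: the 64-sample frame spans about 1.04 pitch periods.
uniform = [voiced(k / FS_UNIFORM, F0) for k in range(N)]
# Synchronous sampling: the sampling rate is 64 * F0, so the frame is one period.
synchronous = [voiced(k / (N * F0), F0) for k in range(N)]

uniform_mags = dft_magnitudes(uniform)
sync_mags = dft_magnitudes(synchronous)
leak_uniform = leakage_fraction(uniform_mags)
leak_sync = leakage_fraction(sync_mags)

print("synchronous bins 1-3:", [round(m, 3) for m in sync_mags[1:4]])
print("leakage, uniform    :", leak_uniform)
print("leakage, synchronous:", leak_sync)
```

With synchronous sampling each harmonic falls exactly on one DFT bin, so the three bins carry the harmonic amplitudes and essentially nothing leaks elsewhere; with the fixed rate the frame holds a non-integer number of periods and energy spreads into neighbouring bins.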

The invention can be developed into man-machine systems that take natural Chinese speech as input. The recognition method of the invention can be widely applied in many fields, for example: voice control of various semi-automatic devices and working machinery; voice-controlled prostheses and nursing machinery; programming a computer by voice; voice-controlled typewriters; and security devices that verify identity by voice.

The block diagram of the speech recognition device proposed by the invention is shown in Fig. 1. A is the speech, which is picked up by a microphone (1) and converted into an electrical signal, then passed through a preamplifier (2). The amplified speech signal passes through a low-pass filter (3), whose cutoff frequency can automatically search for and track the fundamental frequency of voiced sound. (4) is a decision device. The voiced fundamental frequency C then triggers a phase-locked frequency multiplier (5), which produces a sampling pulse sequence d at a multiple of the voiced fundamental frequency (for example, 64 times). (6) is a frequency divider that provides the feedback for the phase-locked loop. At the same time, the speech signal b passes through an anti-aliasing filter (7) and is then sampled synchronously by an A/D converter (8): voiced sound is sampled in externally triggered mode by the frequency-multiplied pulse sequence d derived from its fundamental frequency, while unvoiced sound is still sampled by clock pulses. Both the sampled speech sequence and the fundamental frequency information are sent to the computer (9) for analysis, feature extraction and recognition. In Fig. 1, (10) is the tone template and (11) is the phoneme and syllable template; all templates are preset. The computer output e is the recognition of phonemes and syllables, f is the recognition of the four tones, and g is the recognition of speaker characteristics.
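
The role of the frequency multiplier (5) can be sketched in software. The following is only an idealised stand-in, not the patent's circuit: it assumes the fundamental's phase is known exactly, whereas a real phase-locked loop has to acquire and track it. The function names, the linearly rising pitch contour and the integration step are hypothetical. The sketch emits 64 sampling pulses per pitch period and then recovers a per-period fundamental frequency from the pulse spacing, which is the kind of information the text says is passed on for tone recognition.

```python
MULT = 64  # pulses per pitch period; the text gives 64 only as an example


def rising_f0(t):
    """Hypothetical pitch contour in Hz: rises linearly from 120 Hz."""
    return 120.0 + 200.0 * t


def sampling_pulses(f0_of_t, duration, mult=MULT, dt=1e-5):
    """Idealised stand-in for the phase-locked multiplier (5).

    Integrates the fundamental's phase (in cycles) and records one pulse time
    each time that phase advances by 1/mult of a cycle.
    """
    pulses = []
    phase = 0.0       # accumulated phase of the fundamental, in cycles
    next_mark = 0.0   # phase at which the next pulse fires
    steps = int(round(duration / dt))
    for i in range(steps):
        t = i * dt
        advance = f0_of_t(t) * dt  # phase gained during this step
        while next_mark <= phase + advance:
            # Linear interpolation of the crossing time inside the step.
            pulses.append(t + (next_mark - phase) / advance * dt)
            next_mark += 1.0 / mult
        phase += advance
    return pulses


def f0_per_period(pulses, mult=MULT):
    """One F0 estimate per pitch period, taken from the spacing of the pulses."""
    return [1.0 / (pulses[i + mult] - pulses[i])
            for i in range(0, len(pulses) - mult, mult)]


pulses = sampling_pulses(rising_f0, duration=0.2)
contour = f0_per_period(pulses)
print(len(pulses), "pulses,", len(contour), "pitch periods")
print("first / last F0 estimate (Hz):", round(contour[0], 1), round(contour[-1], 1))
```

Because every pitch period receives the same number of samples regardless of pitch, the sample count per period stays fixed while the pulse spacing carries the pitch contour separately.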

Claims

1. A device for recognizing Chinese speech using a computer, comprising a low-pass filter (3), an A/D converter (8), a computer (9), a number of tone templates (10), phoneme and syllable templates (11), etc., characterized in that a filter is used to extract the fundamental frequency of voiced sound, which triggers a phase-locked-loop frequency multiplier (5) to obtain a frequency-multiplied sampling pulse sequence; this sampling pulse sequence triggers the A/D converter (8) to sample the voiced signal synchronously; the sampled voiced signal is sent to the computer for recognition; and at the same time the fundamental frequency information of the voiced sound is also sent to the computer for recognition of the four tones.