CN1264107A

CN1264107A - Environment control system by speech recognition for disabled people

Info

Publication number: CN1264107A
Application number: CN00103360A
Authority: CN
Inventors: 唐庆玉
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2000-05-12
Filing date: 2000-05-12
Publication date: 2000-08-23
Anticipated expiration: 2020-05-12
Also published as: CN1123861C

Abstract

An environment control system for disabled people consists of a multimedia computer, a control box connected to the computer's parallel port, and system software comprising a learning module and a speech recognition module. The control box consists of a power supply controller for electrical appliances, a telephone controller connected to a telephone set, and an infrared remote controller connected to a TV set. Household appliances can thus be controlled by speech.

Description

Environment control system by speech recognition for disabled people

The invention belongs to the technical field of electronic instruments, and in particular relates to an improved design of environmental control systems for disabled people.

An environmental control unit for disabled people lets a person with high-level paraplegia use whatever motor ability remains to operate household appliances, giving that person a capability equal or close to that of an able-bodied person. The applicant successfully developed the ECU-1 environmental control unit for disabled people in 1998. It is an electronic device whose structure and working principle are shown in Fig. 1. The device mainly comprises an appliance selection control module. Connected to one end of this module are a control signal monitor, a television control unit and infrared controller, a switch output connected to a call buzzer, and power outputs connected to appliances such as lamps and electric fans. The other end of the control module is connected to three kinds of control switch: a pneumatic switch, a chin-operated touch key switch, and a mouth-stick key switch. With these specially designed switches, a person with high-level paraplegia can switch the power of household appliances such as lamps, electric fans, and televisions on or off, and can operate the television by infrared remote control.

The advantages of the ECU-1 environmental control unit are its simple circuit and low price, but it is still not convenient enough for severely disabled people, because the device requires the user to operate it with head movements.

The object of the invention is to overcome the shortcomings of the prior art by proposing an environment control system by speech recognition for disabled people. Controlled by a PC and using speech recognition technology, it lets a disabled person switch lamps and electric fans and watch television simply by speaking commands. It is not only easy to use but also adds control functions for answering and making telephone calls.

The environment control system by speech recognition for disabled people proposed by the invention is characterized in that it comprises a multimedia computer, a control box connected to the computer's parallel interface, and system software pre-stored in the computer and consisting of a learning module and a speech recognition module; the control box comprises one or more of: an appliance power control circuit connected to electrical appliances, a telephone control circuit connected to a telephone, and a television infrared remote controller connected to a television set.

The appliance power control circuit consists of a driving circuit, a miniature relay group connected to it, and intermediate relays; the telephone control circuit consists of a driving circuit and a miniature relay group connected to it; the television infrared remote controller consists of a driving circuit, a miniature relay group connected to it, and an infrared remote control chip. Each driving circuit is built from an MC1413 chip.

The learning module consists of the following units connected in sequence: a data acquisition unit, a syllable segmentation unit, a feature vector extraction unit, a feature vector blocking unit, and a voice template building unit.

The speech recognition module consists of the following units connected in sequence: a data acquisition unit, a syllable segmentation unit, a feature vector extraction unit, a feature vector blocking unit, a discrimination unit, a noise removal unit, and a decision output unit.

The environment control system by speech recognition for disabled people of the invention lets a disabled person control household appliances simply by speaking, which is clearly much more convenient than control by key switches. It enables disabled people to take care of themselves in part and lightens the burden on their families, so it has significant social value. It can therefore be put into practical use and has a certain market demand and economic benefit.

Brief description of the drawings:

Fig. 1 is a schematic diagram of the structure and working principle of the prior art.

Fig. 2 is a schematic diagram of the overall hardware structure of the system of the invention.

Fig. 3 is a schematic circuit block diagram of the control box of the invention.

Fig. 4 is the program flowchart of the learning module of the control system software of the invention.

Fig. 5 is the program flowchart of the speech recognition module of the control system software of the invention.

Fig. 6 is the flowchart of program initialization and data acquisition in the learning module of the embodiment of the invention.

Fig. 7 is the flowchart of the wavePre function of the embodiment of the invention.

Fig. 8 is the general flowchart of the recognition program of the embodiment of the invention.

Fig. 9 is the flowchart of the OnBufferReturn function of the embodiment of the invention.

Fig. 10 is the flowchart of the Recognize function of the embodiment of the invention.

Fig. 11 is the first circuit schematic of the control box of the embodiment of the invention.

Fig. 12 is the second circuit schematic of the control box of the embodiment of the invention.

Fig. 13 is the third circuit schematic of the control box of the embodiment of the invention.

The hardware circuit, the software, and an embodiment of the system of the invention are described in detail below with reference to the drawings:

The overall hardware structure of the system is shown in Fig. 2. It consists of a 586-class PC (including a display), a sound card (with microphone and speakers), and a control box. The sound card is plugged into an expansion slot of the PC, and the microphone and speakers are connected to the sound card. The disabled user speaks commands into the microphone; the sound card converts the speech into data and passes it to the PC. The PC identifies the user's command with the speech recognition software, gives a text prompt on the display, and gives a voice prompt through the sound card and speakers. Once the PC has identified the command type, it outputs the corresponding control code to the control box to control the different functions (the control functions are listed under "Functions and key technical specifications").

The circuit structure and working principle of the control box are shown in Fig. 3. It consists of four parts: 1. the PC parallel interface; 2. the appliance power control circuit; 3. the telephone control circuit; 4. the television infrared remote controller. The PC parallel interface output is divided into three groups: one controls appliance power, one controls the telephone set, and one controls the television. After the PC identifies the type of voice command, it outputs the corresponding control code through these three groups. If it is an appliance power control code, the code is amplified by driving circuit 1 and energizes miniature relay group 1, which in turn energizes the intermediate relays and so connects power to the appliance. If it is a telephone control code, the code is amplified by driving circuit 2 and energizes miniature relay group 2, thereby operating the dialing functions of the telephone. If it is a television control code, the code is amplified by driving circuit 3 and energizes miniature relay group 3, which selects a function of the infrared remote control chip and makes the television perform that function.

The system software is divided into two main modules, the learning module and the speech recognition module. Their program flowcharts are shown in Fig. 4 and Fig. 5 respectively, and the two modules share some parts. The program is written in Visual C++ 5.0 under Windows 98.

The learning module is used to build voice templates. The process is as follows: the user speaks a command; the module first acquires the data, then segments the syllables, extracts feature vectors, blocks the feature vectors, and finally builds the voice template. One template is built for each command. During speech recognition these templates are matched against the unknown input command.

The speech recognition module determines which command has been input and outputs a different control code to the interface for each command. Speech recognition consists of 7 program parts: speech data acquisition, syllable segmentation, feature vector extraction, feature vector blocking, discrimination, noise removal, and decision output.

The function, algorithm principle, and embodiment flowchart of each program module are described in detail below:

(1) Learning module

The learning module allocates a memory buffer that can hold 4 seconds of speech data. The user pronounces each keyword (5 times per word) while the module records (acquires the data), and the feature vector analysis function is then called to analyze the recording and obtain the voice template. Recording uses the low-level recording control functions of the waveIn family provided by Windows; the sampling rate is 8 kHz and the sample precision is 16 bits.
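As a rough check of the buffer size implied by these figures, the following sketch computes the number of bytes needed for 4 seconds of audio at 8 kHz and 16 bits per sample. The single (mono) channel and the function name are assumptions for illustration; the text does not state the channel count.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate stated in the text
BITS_PER_SAMPLE = 16    # sample precision stated in the text
BUFFER_SECONDS = 4      # recording buffer length stated in the text
CHANNELS = 1            # assumption: mono recording (not stated in the text)


def buffer_size_bytes(seconds=BUFFER_SECONDS, rate=SAMPLE_RATE_HZ,
                      bits=BITS_PER_SAMPLE, channels=CHANNELS):
    """Return the number of bytes needed to hold `seconds` of PCM audio."""
    return seconds * rate * channels * (bits // 8)


print(buffer_size_bytes())  # 64000
```

Under the mono assumption the recording buffer is therefore about 64 KB, small relative to the 32 MB of memory listed for the host.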

The general flowchart of the learning module is shown in Fig. 6. Program initialization calls the OnInitDialog function, which sets the prompt text on the interface and loads the user's vocabulary. The learning module interface is then displayed and the program waits for user input. There are four kinds of user input: "Select", "OK", "Cancel", and "Demo"; the program branches to the corresponding handler according to the input. If the input is "Select", the OnStart function is called to prepare for recording. The RecordWaveStart function is then called to start the recording process. Each time a keyword is recorded, the program checks whether recording is complete; if not, it continues recording. Once all 5 recordings have been made, the OnRecordWaveStop function is called. OnRecordWaveStop checks whether the returned buffer is full. If it is not full, nothing is done; if it is full, the wavePre function is called.

The flowchart of the wavePre function is shown in Fig. 7. It performs syllable segmentation, feature vector extraction, and feature vector blocking on the input speech, and returns the number of syllables to the OnRecordWaveStop function. The flow first computes the dynamic zero point and determines the energy-frequency value threshold, then checks whether the end of the buffer has been reached. If not, it searches for the start of a syllable. To find the start, it first checks whether the energy-frequency value of the speech data exceeds the threshold; if so, it looks at the next two frames. If those two frames also exceed the threshold, the frame where the threshold was first exceeded is taken as the start of the syllable; otherwise it is treated as noise. Once the start of a syllable has been determined, the search continues forward for its end; the criterion is the first frame whose energy-frequency value falls below the threshold. Throughout the search, care is taken not to run past the end of the buffer. After the start and end of a syllable have been found, the program also checks whether its length lies within the allowed range (0.25 to 1 second), in order to reject noise. This procedure is called syllable segmentation. When the end of the buffer has been reached, the program moves on to feature vector extraction, using the linear predictive coding (LPC) cepstrum and the delta cepstrum as the feature vector. Finally the feature vectors are blocked, which establishes a voice template.
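The segmentation rule described above can be sketched as follows. This is a minimal illustration, not the patent's wavePre code: the function name `split_syllables`, the frame duration, and the input format (one precomputed measure per frame, compared against the threshold) are assumptions, and the way the per-frame measure and threshold are derived is not reproduced here.

```python
def split_syllables(values, threshold, frame_seconds,
                    min_len=0.25, max_len=1.0):
    """Return (start, end) frame index pairs, end exclusive.

    values        -- hypothetical per-frame measure compared with the threshold
    threshold     -- level a frame must exceed to count as speech
    frame_seconds -- duration of one frame (assumed, not given in the text)
    """
    syllables = []
    n = len(values)
    i = 0
    while i < n:
        # A start is accepted only if this frame AND the next two frames
        # all exceed the threshold; otherwise the frame is treated as noise.
        if (i + 2 < n and values[i] > threshold
                and values[i + 1] > threshold
                and values[i + 2] > threshold):
            j = i + 3
            # The syllable ends at the first frame that no longer exceeds
            # the threshold, without running past the end of the buffer.
            while j < n and values[j] > threshold:
                j += 1
            duration = (j - i) * frame_seconds
            # Keep only syllables of plausible length (0.25 to 1 second).
            if min_len <= duration <= max_len:
                syllables.append((i, j))
            i = j
        else:
            i += 1
    return syllables


frames = [0, 0, 5, 5, 5, 5, 5, 5, 0, 0, 5, 0, 0, 5, 5, 5, 0]
print(split_syllables(frames, threshold=1, frame_seconds=0.05))  # [(2, 8)]
```

In the usage example, the single spike at frame 10 fails the two-frame confirmation, and the three-frame burst starting at frame 13 is discarded because it lasts only about 0.15 seconds.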

In the general flowchart of the learning module, the "OK" button becomes active once the learning process has finished; clicking it then calls the OnOK function. OnOK checks whether a voice template file already exists for the current user. If not, it saves the blocked voice template data under the current user name. If the file already exists, the program asks the operator whether to overwrite it or create a new user.

In the general flowchart of the learning module, the OnCancel function closes the audio input device, releases memory, and closes the dialog box. Clicking "Cancel" during learning calls OnCancel and aborts the learning process.

(2) Speech recognition module

The recognition module is the main working part of the whole program; its general flowchart is shown in Fig. 8. The multimedia functions it uses differ little from those of the learning module, and the sampling rate and sample precision are the same, but the program structure is somewhat more complex. Program initialization calls the OnInitDialog function, which allocates the memory buffers used during recognition, sets the prompt text on the interface, and loads the user's vocabulary. The recognition module interface is then displayed and the program waits for user input. There are two buttons: "Select" and "Cancel". Clicking "Select" calls the OnStart2 function, which performs the preparatory work before recognition; the program then proceeds to calls of the OnBufferReturn function. "Cancel" is used when the user ends the recognition module. It mainly does the following: closes the communication port, forces the return of all buffers not yet returned, closes the waveform input device, and releases memory.

The flowchart of the OnBufferReturn function is shown in Fig. 9. This function is called each time a buffer is returned from the voice input device (the sound card). On the first call, parameters such as the energy average and the energy-frequency value threshold are obtained, "Please start speaking" is displayed on the screen together with a voice prompt, and buffers are then sent to the voice input device to begin acquiring speech data. From the second call onward, the function mainly performs recognition and further processes the result. If fewer than 5 buffers have been returned, nothing is done. If 5 buffers have been returned, they are re-queued in chronological order and checked for speech input. If there is no speech input, the current first buffer is sent back to the voice input device to continue acquiring speech data. If there is speech input, the Recognize function is called to perform speech recognition. If recognition succeeds, the result is displayed and the control code is output to the control box through the parallel interface. If the result is a rejection (recognition refused), "Rejected" is displayed and the user is prompted to speak again.
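The buffer handling on the second and later calls can be sketched as a small queue. This is only an illustration of the control flow described above: the class name `BufferQueue`, the `has_speech` callback, and the returned action labels are hypothetical, and clearing the queue after recognition is an assumption because the text does not say what happens to the buffers afterwards.

```python
from collections import deque


class BufferQueue:
    """Collects returned audio buffers in chronological order."""

    def __init__(self, needed=5):
        self.needed = needed      # the text waits for 5 returned buffers
        self.returned = deque()   # oldest buffer on the left

    def on_buffer_return(self, buffer, has_speech):
        """Handle one returned buffer and report what should happen next.

        has_speech is a hypothetical callback that inspects the queued
        buffers and returns True when they contain speech.
        """
        self.returned.append(buffer)
        if len(self.returned) < self.needed:
            return ("wait", None)            # fewer than 5 buffers: do nothing
        buffers = list(self.returned)        # already in time order
        if not has_speech(buffers):
            # No speech: hand the oldest buffer back to the input device.
            return ("requeue", self.returned.popleft())
        self.returned.clear()                # assumption: start fresh afterwards
        return ("recognize", buffers)


queue = BufferQueue()
is_loud = lambda bufs: any(max(b) > 10 for b in bufs)
for _ in range(4):
    queue.on_buffer_return([0, 1], is_loud)
print(queue.on_buffer_return([1, 0], is_loud)[0])   # requeue
print(queue.on_buffer_return([50, 2], is_loud)[0])  # recognize
```

Keeping a window of several buffers means an utterance that straddles a buffer boundary is still seen whole before recognition is attempted.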

The flowchart of the Recognize function is shown in Fig. 10. It cuts independent syllables out of a section of continuous speech, using a method and program flow similar to those of the learning module. After syllable segmentation, feature vector extraction, and feature vector blocking, the input command is compared with the voice templates by the sliding template method to find the closest template, and the result is then screened by the dual-threshold method. When there is a recognition result, the function returns it; in the rejection case and in the case where the upper threshold is exceeded, the return value is empty.
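The matching and screening steps might look like the sketch below. The text names the sliding template method and the dual-threshold method but gives no formulas, so everything specific here is an assumption: the Euclidean frame distance, sliding the shorter sequence along the longer one, and treating the band between the two thresholds as a rejection. The function names and threshold values are hypothetical, and feature sequences are assumed to be non-empty.

```python
import math


def euclidean(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sliding_distance(template, utterance):
    """Smallest mean frame distance over all alignments (assumed method)."""
    short, long_ = sorted((template, utterance), key=len)
    best = math.inf
    for offset in range(len(long_) - len(short) + 1):
        total = sum(euclidean(s, long_[offset + k]) for k, s in enumerate(short))
        best = min(best, total / len(short))
    return best


def screen(distance, lower, upper):
    """Dual-threshold screening (assumed interpretation)."""
    if distance > upper:
        return "no_match"   # beyond the upper threshold
    if distance > lower:
        return "refuse"     # too uncertain: recognition refused
    return "accept"


def recognize(utterance, templates, lower, upper):
    """Return the closest command name, or None when the result is screened out."""
    name, dist = min(((n, sliding_distance(t, utterance))
                      for n, t in templates.items()), key=lambda p: p[1])
    return name if screen(dist, lower, upper) == "accept" else None


templates = {"light on": [[0, 0], [1, 1]], "fan on": [[5, 5], [6, 6]]}
print(recognize([[0, 0.1], [1, 1]], templates, lower=0.5, upper=2.0))  # light on
print(recognize([[3, 3], [4, 4]], templates, lower=0.5, upper=2.0))    # None
```

Both screened-out cases return an empty result here, which mirrors the statement that the return value is empty for rejection and for exceeding the upper threshold.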

The circuit of the control box embodiment is described in detail below with reference to Figs. 11, 12, and 13:

1. PC parallel interface

The control box uses the standard PC parallel interface and is connected through connector CN1 to any free parallel port of the PC. The parallel interface outputs 8 data bits, of which the low 6 bits are the control code and the high 2 bits are the address code. The high 2 address bits are decoded by address decoder U5 (74LS138), whose 4 outputs serve as the chip-select signals of the 4 control code interfaces U1, U2, U10, and U11 (74LS273). The 4 control code interfaces are cleared to 0 at power-up by a power-on clear circuit (composed of R1, R4, C1, U4:B, and U4:C). After the speech recognition program has determined the command issued by the operator, it outputs the corresponding control code through these 4 control code interfaces.
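The byte layout described above (2 address bits above 6 control bits) can be illustrated with a short packing routine. The function names are hypothetical, and the sketch deliberately does not say which address value selects which interface chip, since the text does not give that assignment; writing the byte to the parallel port is also left out.

```python
ADDRESS_BITS = 2   # high bits: select one of 4 control code interfaces
CONTROL_BITS = 6   # low bits: control code for the selected interface


def pack_control_byte(address, control_code):
    """Combine a 2-bit address and a 6-bit control code into one byte."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address must fit in 2 bits")
    if not 0 <= control_code < (1 << CONTROL_BITS):
        raise ValueError("control code must fit in 6 bits")
    return (address << CONTROL_BITS) | control_code


def unpack_control_byte(byte):
    """Split a byte back into (address, control_code)."""
    return byte >> CONTROL_BITS, byte & ((1 << CONTROL_BITS) - 1)


byte = pack_control_byte(address=2, control_code=0b000101)
print(hex(byte))                  # 0x85
print(unpack_control_byte(byte))  # (2, 5)
```

With 6 control bits per interface, one interface can drive at most 6 relays directly, which is consistent with the telephone circuit using two interfaces for its 12 relays.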

2. Appliance power control circuit

The control code for power control is output through control code interface U1. The output of U1 is first buffered by driver U6 (MC1413) and then controls the 6 miniature relays RL0 to RL5 individually. The normally open contacts of RL0 to RL5 in turn control the intermediate relays RL25 to RL30, whose normally open contacts switch the 220 V supply, thereby connecting 220 V power to 6 household appliances.
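Because one 6-bit code sets all six relay outputs at once, software that switches a single appliance needs to remember the state of the others. The sketch below keeps such a software copy of the code. It assumes that bit n of the control code drives relay RLn and that a 1 energizes the relay; neither mapping is stated in the text, and the class name is hypothetical.

```python
class AppliancePowerState:
    """Software copy of the 6-bit code last sent to the power interface."""

    CHANNELS = 6  # six appliance outputs, per the text

    def __init__(self):
        self.code = 0  # interfaces are cleared to 0 at power-up

    def switch(self, channel, on):
        """Turn one channel on or off and return the new 6-bit code."""
        if not 0 <= channel < self.CHANNELS:
            raise ValueError("channel must be 0..5")
        mask = 1 << channel  # assumption: bit n drives relay RLn
        if on:
            self.code |= mask
        else:
            self.code &= ~mask
        return self.code


power = AppliancePowerState()
print(bin(power.switch(0, True)))   # 0b1
print(bin(power.switch(3, True)))   # 0b1001
print(bin(power.switch(0, False)))  # 0b1000
```

Switching channel 3 on leaves channel 0 energized, and switching channel 0 off later leaves channel 3 untouched.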

3. Television infrared remote controller

The control code for the television is output through control code interface U2. The output of U2 is first buffered by driver U7 (MC1413) and then controls the 5 miniature relays RL6 to RL10 individually. The normally open contacts S1 to S6 of RL6 to RL10 in turn control the infrared remote control chip TC9012, which drives the infrared light-emitting diode L1, thereby controlling channel switching, volume up and down, and power on and off of the television.

4. Telephone control circuit

The control code for the telephone set is output through control code interfaces U10 and U11. The outputs of U10 and U11 are first buffered by drivers U12 and U13 (MC1413) and then control the 12 miniature relays RL11 to RL22 individually. The normally open contacts of RL11 to RL22 (key "0" to key "9", key "G", and key "M") are connected through connector CN2 to the ten digit keys 0 to 9, the on-hook key, and the hands-free key of the telephone set respectively.

Functions and key technical specifications of this embodiment:

1. Functions

The environment control system by speech recognition for disabled people (Speech-recognition-based Environmental Control System for the Disabled, SECS) is an electronic device designed to let disabled people control household appliances. The SECS-1 model is intended for people with high-level paraplegia (quadriplegia). The user issues commands by voice; the computer recognizes the speech automatically and determines the type of command, and the control circuit then switches the power of household appliances (lamps, electric fans, televisions, etc.), freely selects television channels by infrared remote control, and operates the telephone set to make or answer calls.

2. Key technical specifications

Host: 586-class PC, 32 MB memory, 540 MB hard disk, equipped with sound card, speakers, and microphone.

Speech recognition: recognizes up to 30 isolated-word commands, independent of accent. A voice prompt is given both on correct recognition and on rejection. Recognition rate 96%, rejection rate 4%, misrecognition rate 0%.

Appliance control: controlled by speech recognition; can switch the power of up to 6 appliances; rated output 220 V, 5 A per channel; total rated output 220 V, 10 A.

Television control: speech recognition, infrared remote control, infrared remote control range 3 meters. Controlled functions: channel+, channel-, volume+, volume-, power on, power off. The television's original infrared remote control can still be used as usual.

Telephone control: speech recognition, wired control; can control hands-free, dialing (0 to 9), on-hook, redial, and other functions; can dial domestic and international calls. The original functions of the telephone set can still be used as usual.

Man-machine interface: dual prompting by screen display and voice.

Claims

1. An environment control system by speech recognition for disabled people, characterized in that it comprises a multimedia computer, a control box connected to the computer's parallel interface, and system software pre-stored in the computer and consisting of a learning module and a speech recognition module; the control box comprises one or more of: an appliance power control circuit connected to electrical appliances, a telephone control circuit connected to a telephone, and a television infrared remote controller connected to a television set.

2. The environment control system by speech recognition for disabled people as claimed in claim 1, characterized in that the appliance power control circuit consists of a driving circuit, a miniature relay group connected to it, and intermediate relays; the telephone control circuit consists of a driving circuit and a miniature relay group connected to it; and the television infrared remote controller consists of a driving circuit, a miniature relay group connected to it, and an infrared remote control chip.

3. The environment control system by speech recognition for disabled people as claimed in claim 2, characterized in that the driving circuit is built from an MC1413 chip.

4. The environment control system by speech recognition for disabled people as claimed in claim 1, characterized in that the learning module consists of the following units connected in sequence: a data acquisition unit, a syllable segmentation unit, a feature vector extraction unit, a feature vector blocking unit, and a voice template building unit.

5. The environment control system by speech recognition for disabled people as claimed in claim 1, characterized in that the speech recognition module consists of the following units connected in sequence: a data acquisition unit, a syllable segmentation unit, a feature vector extraction unit, a feature vector blocking unit, a discrimination unit, a noise removal unit, and a decision output unit.