CN1241746A - Universal phonetic control command generator - Google Patents


Info

Publication number
CN1241746A
CN1241746A CN99116106A CN99116106A CN1241746A CN 1241746 A CN1241746 A CN 1241746A CN 99116106 A CN99116106 A CN 99116106A CN 99116106 A CN99116106 A CN 99116106A CN 1241746 A CN1241746 A CN 1241746A
Authority
CN
China
Prior art keywords
control command
speech
command generator
keyboard
flash memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN99116106A
Other languages
Chinese (zh)
Inventor
江太辉
张歆奕
宋国栋
张有为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuyi University
Original Assignee
Wuyi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuyi University
Priority to CN99116106A
Publication of CN1241746A
Legal status: Pending

Abstract

The universal speech control command generator of the present invention enables a machine to act intelligently on natural human speech commands. It consists of a digital signal processor, a microprocessor unit, flash memory, an A/D converter, a D/A converter, a liquid crystal display (LCD), a speech receiver, a loudspeaker or earphone, a keyboard, a power source controller, and so on. It operates in dual-CPU mode and supports a maximum of 256 commands. It is universal, small and low in cost, is suitable for various speech recognition algorithms, and is applicable wherever a machine is to be controlled through natural human language.

Description

Universal phonetic control command generator
The present invention relates to a universal phonetic control command generator that enables a machine to execute human natural-language instructions and thereby realizes machine intelligence.
In the prior art, speech recognition technology is used to realize dialogue between human natural language and machines, that is, human-machine dialogue, so that a machine can understand a person's spoken instructions and correctly execute them. Considerable progress has been made in recent years, the degree of machine intelligence has risen quite rapidly, and many speech recognition algorithms are just entering the practical stage. For example, US 08/254,844 applied for by Motorola Inc. of the United States, US 08/413,146, and European patent EP 95021139.3 applied for by Philips Electronics of the Netherlands all provide speech recognition algorithms based on, for example, neural networks and hidden Markov models. However, none of the above provides a hardware design for a phonetic control command generator that realizes machine intelligence.
The purpose of this invention is to provide a phonetic control command generator that is highly versatile, compact and low in cost, and that can adopt different speech recognition algorithms.
The present invention comprises flash memory (I), flash memory (II), an analog-to-digital and digital-to-analog converter (A/D and D/A), a liquid crystal display (LCD), a receiver, a loudspeaker (or earphone), a keyboard, a power supply and other parts. It is characterized in that a digital signal processor (DSP) and a microprocessor (MPU) are also provided: the digital signal processor is connected to the A/D and D/A converter through a serial port; the microprocessor is connected to the digital signal processor through a serial interface; the keyboard, the liquid crystal display and the interface circuit are connected directly to the microprocessor; and the receiver and the loudspeaker are connected to the A/D and D/A converter.
In the universal phonetic control command generator of the present invention, the microprocessor (MPU) and the digital signal processor (DSP) work together as two coordinated CPUs. The invention solves the communication interface between the MPU and the DSP and defines specific commands for MPU-DSP communication. The MPU also performs the functions of keyboard interface, LCD interface, external interface, power management and watchdog, which minimizes the system. Because flash memory (I) stores the program code and initialization data of the speech recognition algorithm, a different algorithm can be selected without changing the hardware structure. Apart from the LCD, keyboard, receiver and loudspeaker (or earphone), which are mounted on the machine panel, all remaining hardware can be integrated on a printed circuit board of 4 × 7 cm. The maximum number of output control instructions is 2^8 = 256. Because the universal phonetic control command generator of the present invention is highly versatile, compact, low in cost and high in recognition rate, it can be widely used wherever a machine needs to be controlled by natural human speech to make the equipment intelligent, for example production machinery, household electrical appliances, communication equipment, transport vehicles, and instruments.
The present invention is further described below with reference to the drawings and specific embodiments.
Fig. 1 is a composition diagram of the universal phonetic control command generator;
Fig. 2 is a circuit diagram of the universal phonetic control command generator;
Fig. 3 is a flow chart of the communication between the microprocessor SMC88308 and the digital signal processing chip ADSP2186;
Fig. 4 is the software master control flow chart of the universal phonetic control command generator;
Fig. 5 is a flow chart of the recognition module;
Fig. 6 is a flow chart of the management module;
Fig. 7 is a flow chart of the training module.
As shown in the drawings, the universal phonetic control command generator of the present invention is composed of a digital signal processor (DSP) 1, a microprocessor (MPU) 2, flash memory (I) 3, flash memory (II) 4, an A/D and D/A converter 5, a liquid crystal display (LCD) 6, a receiver (microphone) 7, a loudspeaker 8 (or earphone), a keyboard, a power controller and so on. The receiver 7 picks up the voice instructions of the person issuing them; each instruction is one phrase, and several instructions are several phrases. The analog voice instruction is converted to digital information by the A/D converter and input to the DSP 1 for processing. Voice feedback is converted to analog information by the D/A converter and sent to the loudspeaker 8 (or earphone), so as to give the person a prompt or to confirm the instruction that was issued. The digital signal processor (DSP) 1 is the core component for speech recognition and performs algorithms such as speech recognition and speech compression. It is connected directly to flash memory (I) 3 and flash memory (II) 4 through the data bus and address bus, and to the A/D and D/A converter 5 through a serial port. Flash memory (I) 3 stores the program code and initialization data of the selected speech recognition algorithm; flash memory (II) 4 stores the trained phonetic control command samples. The microprocessor (MPU) 2 and the digital signal processor (DSP) 1 form a dual-CPU arrangement: the MPU and the DSP are connected through a serial interface and communicate and operate by means of the special commands designed in the present invention. The MPU can be connected directly to the keyboard, the liquid crystal glass panel and the interface circuit, and includes a watchdog circuit internally. The liquid crystal display (LCD) 6 displays prompt messages. The power controller manages power so as to reduce DSP power consumption. A 4 × 4 keyboard is used for command input during training and management. Control instructions are output to the external controlled object; with an 8-bit code, the maximum number of control instructions is 2^8 = 256.
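The 8-bit limit on control instructions can be illustrated with a short sketch. The function names `encode_command` and `decode_command` and the representation of the output port as a list of line levels are assumptions made for illustration; the text only states that the control instruction is output as an 8-bit code.

```python
CODE_BITS = 8                    # width of the instruction code (from the text)
MAX_COMMANDS = 2 ** CODE_BITS    # 256 distinct control instructions


def encode_command(code: int) -> list[int]:
    """Return the 8 output-line levels (most significant bit first) for a code."""
    if not 0 <= code < MAX_COMMANDS:
        raise ValueError(f"code must be in 0..{MAX_COMMANDS - 1}")
    return [(code >> bit) & 1 for bit in range(CODE_BITS - 1, -1, -1)]


def decode_command(lines: list[int]) -> int:
    """Rebuild the command code from the line levels (most significant bit first)."""
    value = 0
    for level in lines:
        value = (value << 1) | level
    return value
```

Any code outside 0..255 is rejected, which is where the ceiling of 256 commands comes from.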
Fig. 1 clearly shows the composition of the universal phonetic control command generator and the connections between its parts; the LCD 6 is in practice a bare liquid crystal glass panel without a driver chip. As can be seen from Fig. 2, the universal phonetic control command generator of the present invention consists mainly of five chips, so the system is very simple. The five chips are:
(1) U1, ADSP2186, the digital signal processing (DSP) chip 1: 16.67 MHz clock, 33 MIPS, 30 ns instruction cycle, with 8K words of program memory and 8K words of data memory on chip; it implements the speech recognition and speech compression algorithms.
(2) U3, AT29C020, flash memory (I) 3: stores the program code and initialization data.
(3) U2, AT29C020, flash memory (II) 4: stores the voice command templates.
(4) U5, AD73311, the A/D and D/A conversion chip 5: contains a 16-bit D/A and A/D with gain control. It digitizes the analog voice signal from the microphone, fed in through J052, and sends it to the serial port of the ADSP2186 over the DR signal line. It also receives serial data output by the ADSP2186 over the DT signal line, performs D/A conversion, and is connected through CON2 to the loudspeaker 8, which turns the signal back into sound.
(5) U7, SMC88308, an 8-bit single-chip microcomputer from EPSON. It contains 8K bytes of ROM and 256K bytes of RAM, used to hold the user program; it contains an LCD drive circuit and can drive the LCD panel directly, eliminating an external LCD driver; it contains a watchdog timer, eliminating the corresponding external circuit; its input/output ports are plentiful, so it can connect directly to the keyboard matrix without an extra keyboard encoding circuit and can also directly output the code corresponding to a command to control external circuits; it contains a serial interface and can communicate directly with the DSP chip over signal lines such as SIN and SOUT; and it has a supply voltage monitoring circuit, which makes power management easier.
The combined use of the MPU and the DSP is therefore a main feature of the present invention. It simplifies the whole system as far as possible, which minimizes the board area, lowers cost and improves reliability. In addition, the MPU and the DSP divide the work between them: the DSP mainly performs speech recognition and compressed-speech playback, and all other functions are handled by the MPU. This keeps the DSP's active time to a minimum and so reduces the power consumption of the whole system, because the DSP consumes a lot of power whereas the MPU consumes very little; the present invention can therefore also be used in battery-powered portable products.
The remaining parts are: (6) U6, MC7805, a voltage regulator chip that provides the stable supply VCC for the system; (7) U8, MAX705, used here to generate the power-on reset signal RESET. In addition, J5 is the connector between the keyboard and the MPU, J4 is the connector between the system and the liquid crystal glass panel, J105 is the interface between the system and an emulator, and J6 is the instruction code output port.
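As an illustration of how a microcontroller can read a 4 × 4 keyboard matrix directly from its I/O ports without a separate keyboard encoder, the following is a minimal simulation sketch. It is not taken from the patent: the helper `read_columns`, the row-drive and column-read convention, and the key numbering 0..15 are all assumptions.

```python
from typing import Callable, Optional

ROWS, COLS = 4, 4  # 4 x 4 keyboard, as stated in the text


def scan_keyboard(read_columns: Callable[[int], int]) -> Optional[int]:
    """Scan the matrix once and return the key index 0..15, or None.

    read_columns(row) is a hypothetical port-access helper: it drives one
    row line active and returns the four column inputs as a bit mask.
    """
    for row in range(ROWS):
        mask = read_columns(row) & 0x0F
        for col in range(COLS):
            if mask & (1 << col):
                return row * COLS + col
    return None


def make_fake_port(pressed: Optional[int]) -> Callable[[int], int]:
    """Simulated port for testing: reports a single pressed key, if any."""
    def read_columns(row: int) -> int:
        if pressed is not None and pressed // COLS == row:
            return 1 << (pressed % COLS)
        return 0
    return read_columns
```

On real hardware the `read_columns` helper would be replaced by port writes and reads, and key debouncing would normally be added; both are outside what the text describes.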
Serial communication takes place between the MPU and the DSP; the data transfer procedure is shown in Fig. 3. The MPU controls the operation of the DSP and obtains the required data in return by sending specially designed commands. There are three main groups of commands:
1. Training command:
Command | Parameter | Returned data | Description
01H | None | None | Training function; records a command voice template
2. Recognition command:
Command | Parameter | Returned data | Description
02H | None | Code corresponding to the recognized command | Command recognition function; recognizes the voice command currently being input
3. Management commands:
Command | Parameter | Returned data | Description
03H | Key=1 | None | New template
04H | Key=2 | Code | Previous template
05H | Key=3 | Code | Next template
06H | Key=4 | None | Delete template
07H | Key=5 | Code | Play back command word
08H | Key=6 | Code | Play back system word
09H | Key=7 | None | Input system word
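The command set listed above can be captured in a small lookup structure. The sketch below uses the opcodes 01H to 09H and the Key values from the text; the Python names and the one-byte reply framing are assumptions, since the patent does not specify how the returned data are framed on the serial link.

```python
from enum import IntEnum


class Command(IntEnum):
    """Opcodes sent from the MPU to the DSP (values from the command tables)."""
    TRAIN = 0x01
    RECOGNIZE = 0x02
    NEW_TEMPLATE = 0x03
    PREV_TEMPLATE = 0x04
    NEXT_TEMPLATE = 0x05
    DELETE_TEMPLATE = 0x06
    PLAY_COMMAND_WORD = 0x07
    PLAY_SYSTEM_WORD = 0x08
    INPUT_SYSTEM_WORD = 0x09


# Commands for which the tables list returned data (a command code).
RETURNS_CODE = {
    Command.RECOGNIZE,
    Command.PREV_TEMPLATE,
    Command.NEXT_TEMPLATE,
    Command.PLAY_COMMAND_WORD,
    Command.PLAY_SYSTEM_WORD,
}

# Management commands 03H..09H and the Key parameter listed for each.
MANAGEMENT_KEYS = {Command(0x02 + key): key for key in range(1, 8)}


def expected_reply_length(command: Command) -> int:
    """Assumed framing: one reply byte when a code is returned, otherwise none."""
    return 1 if command in RETURNS_CODE else 0
```

Keeping the opcodes in one enumeration lets both the MPU-side and DSP-side code refer to the same names rather than to bare hexadecimal constants.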
The software master control flow chart of the universal phonetic control command generator is shown in Fig. 4. The operation of the generator is now described with reference to this flow chart. After the system starts, it waits for a keyboard command and can then enter one of three modes: recognition mode, training mode or management mode. In recognition mode, the MPU sends a command through the serial port so that the ADSP2186 starts the speech recognition program and performs recognition; the result, namely information such as the code of the recognized command, is returned to the SMC88308 and sent to the display. The detailed procedure is shown in Fig. 5. In training mode, a command is sent through the serial port so that the ADSP2186 starts the training program and trains a voice command; during this process the code of the command has to be entered, and data are transferred through the serial port. The detailed procedure is shown in Fig. 7. In management mode, a command is sent through the serial port so that the ADSP2186 starts the management program, performs the corresponding management operation and returns the relevant data; see Fig. 6.
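The mode dispatch described above can be sketched as follows. This is an illustrative model only: `send_to_dsp` and `display` are hypothetical stand-ins for the serial link and the LCD, the mode names are invented, and only the opcodes 01H to 09H come from the text.

```python
from typing import Callable, Optional


def select_opcode(mode: str, key: Optional[int] = None) -> int:
    """Map the selected mode (and management key 1..7) to a DSP opcode."""
    if mode == "train":
        return 0x01
    if mode == "recognize":
        return 0x02
    if mode == "manage":
        if key is None or not 1 <= key <= 7:
            raise ValueError("management mode needs a key in 1..7")
        return 0x02 + key  # Key=1 -> 03H ... Key=7 -> 09H
    raise ValueError(f"unknown mode: {mode}")


def run_once(mode: str,
             send_to_dsp: Callable[[int], Optional[int]],
             display: Callable[[str], None],
             key: Optional[int] = None) -> Optional[int]:
    """Send one command to the DSP and show any code that comes back."""
    reply = send_to_dsp(select_opcode(mode, key))
    if reply is not None:
        display(f"code {reply}")
    return reply
```

For example, `run_once("recognize", link, lcd)` sends 02H and displays the code the DSP returns, where `link` and `lcd` are whatever callables wrap the real serial port and display.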
Fig. 5 is the flow chart of speech recognition. As the figure shows, recognition begins with speech detection, which determines whether there is voice input. If there is, feature extraction is performed on the speech, that is, the MFCC parameters of the input speech are extracted. After parameter extraction comes parameter comparison: the feature parameters of the input speech are compared with the feature parameters (templates) of the voice commands stored in flash memory to determine whether one of the templates matches. There are two cases. In the first, the match is complete: the matched template is the input voice command, the code corresponding to that template is the code of the input voice command, and it is sent back to the MPU through the serial port. In the second, the match is incomplete: the three closest voice command templates are found and their speech is played back one after another for the user to judge. If one of them is the input voice command, then after the user confirms it, its code is returned to the MPU. If none of the three is the input voice command, the user is prompted to input the voice command again, and the above recognition process is repeated until a result is obtained.
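The two-case comparison step can be sketched as below. The patent does not say how templates are compared or what counts as a complete match, so this sketch assumes fixed-length feature vectors, Euclidean distance and a caller-supplied threshold `match_threshold`; a practical recognizer would more likely align variable-length MFCC sequences, for example with dynamic time warping.

```python
import math
from typing import Dict, List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features: Sequence[float],
              templates: Dict[int, Sequence[float]],
              match_threshold: float) -> Tuple[str, List[int]]:
    """Return ("match", [code]) or ("candidates", up to three nearest codes).

    templates maps a command code to its stored feature vector.
    """
    if not templates:
        raise ValueError("no templates stored")
    ranked = sorted(templates,
                    key=lambda code: distance(features, templates[code]))
    best = ranked[0]
    if distance(features, templates[best]) <= match_threshold:
        return "match", [best]          # first case: treated as a full match
    return "candidates", ranked[:3]     # second case: user picks by playback
```

In the second case the caller would play back the three candidates in turn and return the code the user confirms, or prompt for new input if none is accepted.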
Fig. 6 is the flow chart of the management program. According to the keyboard command entered by the user, it performs template search, template deletion, playback of a command word, playback of a system word, or recording of a system word.
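Template search and deletion can be modelled as a cursor over the stored command codes. The class below is a hypothetical sketch: the name `TemplateBrowser`, the in-memory list standing in for flash memory (II), and the wrap-around behaviour at either end of the list are assumptions not stated in the text.

```python
from typing import Iterable, List, Optional


class TemplateBrowser:
    """Cursor over stored command codes (a stand-in for flash memory (II))."""

    def __init__(self, codes: Iterable[int]) -> None:
        self.codes: List[int] = list(codes)
        self.index = 0

    def current(self) -> Optional[int]:
        """Code of the template under the cursor, or None if none are stored."""
        return self.codes[self.index] if self.codes else None

    def previous(self) -> Optional[int]:
        """Move to the previous template (wraps around) and return its code."""
        if self.codes:
            self.index = (self.index - 1) % len(self.codes)
        return self.current()

    def next(self) -> Optional[int]:
        """Move to the next template (wraps around) and return its code."""
        if self.codes:
            self.index = (self.index + 1) % len(self.codes)
        return self.current()

    def delete(self) -> None:
        """Delete the template under the cursor, if any."""
        if self.codes:
            del self.codes[self.index]
            if self.index >= len(self.codes):
                self.index = 0
```

The codes returned by `previous` and `next` correspond to the "Code" values that the management commands return to the MPU.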
Fig. 7 is the flow chart of the voice command training program. Training begins with speech detection, which determines whether there is voice input. Once voice input is confirmed, the speech is processed in two ways: first, its features are extracted, that is, its MFCC parameters are calculated; second, the speech data are compressed and encoded. The recorded speech is then played back for the user to judge. If the user's key input indicates dissatisfaction with the quality of the voice command, the above operations are repeated. If the key input indicates satisfaction, the user is prompted to key in the code of the voice command; the feature parameters (template) of the input voice command, the compressed voice command and its code are then stored in flash memory, which completes one training operation.
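The training loop can be sketched as a function that repeats recording until the user accepts the playback. Every callable passed in (`record`, `extract_features`, `compress`, `play_back`, `user_accepts`, `ask_code`) is a hypothetical placeholder for hardware or DSP routines, and the dictionary `store` stands in for flash memory; none of these names come from the patent.

```python
from typing import Any, Callable, Dict, Tuple


def train_command(record: Callable[[], Any],
                  extract_features: Callable[[Any], Any],
                  compress: Callable[[Any], Any],
                  play_back: Callable[[Any], None],
                  user_accepts: Callable[[], bool],
                  ask_code: Callable[[], int],
                  store: Dict[int, Tuple[Any, Any]]) -> int:
    """Run one training operation and return the code that was stored."""
    while True:
        samples = record()                     # speech detected and captured
        features = extract_features(samples)   # e.g. MFCC parameters
        compressed = compress(samples)         # compressed speech for playback
        play_back(compressed)                  # let the user judge the quality
        if user_accepts():
            break                              # satisfied: stop re-recording
    code = ask_code()                          # user keys in the command code
    store[code] = (features, compressed)       # template + compressed speech
    return code
```

Passing the steps in as callables keeps the control flow of the flow chart separate from the signal processing, which the text leaves to the selected recognition algorithm.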

Claims (1)

1. A universal phonetic control command generator, comprising flash memory (I) 3, flash memory (II) 4, an analog-to-digital and digital-to-analog converter (A/D and D/A) 5, a liquid crystal display (LCD) 6, a receiver 7, a loudspeaker (or earphone) 8, a keyboard, a power supply and other parts, characterized in that a digital signal processor (DSP) 1 and a microprocessor (MPU) 2 are also provided, the digital signal processor 1 is connected to the A/D and D/A converter 5 through a serial port, the microprocessor 2 and the digital signal processor 1 are connected through a serial interface, the keyboard, the liquid crystal display 6 and the interface circuit are connected directly to the microprocessor 2, and the receiver 7 and the loudspeaker 8 are connected to the A/D and D/A converter 5.
CN99116106A 1999-03-31 1999-03-31 Universal phonetic control command generator Pending CN1241746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN99116106A CN1241746A (en) 1999-03-31 1999-03-31 Universal phonetic control command generator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN99116106A CN1241746A (en) 1999-03-31 1999-03-31 Universal phonetic control command generator

Publications (1)

Publication Number Publication Date
CN1241746A (en) 2000-01-19

Family

ID=5278951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN99116106A Pending CN1241746A (en) 1999-03-31 1999-03-31 Universal phonetic control command generator

Country Status (1)

Country Link
CN (1) CN1241746A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002091357A1 (en) * 2001-05-08 2002-11-14 Intel Corporation Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (lvcsr) system
CN1308797C (en) * 2002-10-30 2007-04-04 爱特梅尔股份有限公司 Method for identification of SPI compatible serial memory devices
CN100365599C (en) * 2005-07-15 2008-01-30 中国船舶重工集团公司第七○九研究所 Flash array storage method and module for real-time data record in digital signal processor
CN108364646A (en) * 2018-02-08 2018-08-03 上海智臻智能网络科技股份有限公司 Embedded speech operating method, device and system
CN108364646B (en) * 2018-02-08 2020-12-29 上海智臻智能网络科技股份有限公司 Embedded voice operation method, device and system

Similar Documents

Publication Publication Date Title
US7292678B2 (en) Voice activated, voice responsive product locator system, including product location method utilizing product bar code and aisle-situated, aisle-identifying bar code
US7136465B2 (en) Voice activated, voice responsive product locator system, including product location method utilizing product bar code and product-situated, location-identifying bar code
CN108010531A (en) A kind of visible intelligent inquiry method and system
CN106023995A (en) Voice recognition method and wearable voice control device using the method
CN108632653B (en) Voice control method, smart television and computer readable storage medium
CN203232565U (en) Multifunctional remote network remote controller
CN109903748A (en) A kind of phoneme synthesizing method and device based on customized sound bank
CN1241746A (en) Universal phonetic control command generator
CN110534096A (en) A kind of artificial intelligent voice recognition methods and system based on microcontroller
JP2759267B2 (en) Method and apparatus for synthesizing speech from a speech recognition template
CN208256287U (en) Control device and smart home device based on speech recognition
CN105867148A (en) System and method for intelligent home control based on flexible electronic skin
CN2548056Y (en) Sound-control air-conditioner remote controller
CN201845546U (en) Device capable of controlling mobile phone through speech
CN1664924A (en) Aquatic bionic animal voice control system
CN2486896Y (en) Voice control air conditioner
CN108873713A (en) A kind of man-machine interaction method and system applied in smart home
CN1256460A (en) Phonetic command controller
CN101840640B (en) Interactive voice response system and method
CN2862265Y (en) Audio control MP3 player
CN110085212A (en) A kind of audio recognition method for CNC program controller
CN201374087Y (en) Multifunctional voter
CN111245690A (en) Shortcut control system based on voice control
CN209165801U (en) Power carrier controller and water heater
CN108154883A (en) A kind of compact shelving management system for having voice control function

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication