CN203419063U - Voice control system of taxi top light - Google Patents

Voice control system of taxi top light

Info

Publication number
CN203419063U
CN203419063U CN201320172221.8U CN201320172221U CN203419063U CN 203419063 U CN203419063 U CN 203419063U CN 201320172221 U CN201320172221 U CN 201320172221U CN 203419063 U CN203419063 U CN 203419063U
Authority
CN
China
Prior art keywords
module
audio
voice
cpu
sound identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201320172221.8U
Other languages
Chinese (zh)
Inventor
洪海峰
楼远志
周艳会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHEJIANG HAILIAN ELECTRONIC CO., LTD.
Original Assignee
ZHEJIANG HAILIAN ELECTRONICS CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZHEJIANG HAILIAN ELECTRONICS CO Ltd
Priority to CN201320172221.8U
Application granted
Publication of CN203419063U
Anticipated expiration
Expired - Fee Related

Landscapes

  • Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)

Abstract

The utility model belongs to the technical field of intelligent taxi top lights and intelligent vehicle-mounted equipment and discloses a voice control system for a taxi top light. The system comprises a controller and an in-vehicle display, both disposed inside the taxi, and an LED top light disposed on the taxi roof. The controller comprises an audio acquisition module for capturing the driver's voice, an audio coding module for encoding the audio signal, a voice recognition module for recognizing the audio signal, a power drive circuit that supplies power to all modules, a CPU, and a memory that stores a voice feature library and program code. By designing a dedicated LED top light controller with a voice input function, the utility model builds a taxi top light control system that effectively solves the problems of manually entering the top light's display content: inconvenient operation, wasted passenger time, and safety hazards.

Description

A voice control system for a taxi top light
Technical field
The utility model belongs to the technical field of intelligent taxi top lights and intelligent vehicle-mounted equipment, and relates to controlling a taxi top light and in-vehicle display equipment by voice.
Background art
As a commercial vehicle, a taxi needs to show status information such as "vacant" or "occupied" to people at the roadside and to passengers inside the vehicle while it is driving. As urban traffic grows more congested and limited taxi resources increasingly fail to meet the growing demand for rides, ride sharing with the passenger's consent has become an effective way to save resources and ease the difficulty of hailing a taxi at peak hours. Conveying the vehicle's destination direction to roadside passengers while driving has therefore become a key requirement for ride sharing.
Early taxis generally had a "vacant" sign mounted at the front window of the cab. After a passenger got in, the driver manually flipped the sign down; after the passenger got out, the driver manually flipped it back up. The information on such a sign is very limited and the operation is cumbersome. With the development of LED display and single-chip microcomputer technology, an intelligent sign appeared that shows "vacant" or "occupied" on an LED display. Compared with the traditional flip sign, it does not need to be flipped, but the driver must still switch the displayed content between "vacant" and "occupied" by hand.
To distinguish themselves more clearly from private vehicles, taxis generally carry a top light on the roof. Following the intelligent sign, an intelligent top light that also contains an LED display appeared; it shows operating status such as "vacant" and "occupied" in a larger and more conspicuous way. Its larger display area, compared with the intelligent sign, also makes it possible to show the destination direction in addition to the operating status. However, because passenger destinations are arbitrary, they cannot all be preset on an operating keypad; the driver has to type the destination in manually after the passenger gets in so that it appears on the top light. Entering characters is troublesome for the driver and wastes the passenger's time. If the driver enters the destination direction after the vehicle has started moving, it also creates a traffic safety hazard.
Summary of the utility model
The purpose of the utility model is to address the above shortcomings of the prior art by providing a scheme that frees the driver from manually controlling the sign or the taxi top light. The scheme discloses a voice-controlled control system for a taxi top light and in-vehicle display equipment.
To achieve this purpose, the technical scheme adopted by the utility model is a voice control system for a taxi top light, comprising: a controller and an in-vehicle display arranged in the passenger compartment, and an LED top light arranged on the roof;
The controller comprises: an audio acquisition module for capturing the driver's voice, an audio coding module for encoding the audio signal, a voice recognition module for recognizing the audio signal, a power drive circuit that supplies power to each module, a CPU, and a memory for storing the voice feature library and program code;
The audio acquisition module comprises a microphone and an A/D sampling circuit; the A/D sampling circuit samples the analog audio signal generated by the microphone. The input of the audio coding module is connected to the output of the A/D sampling circuit. The audio coding module encodes the digital audio signal output by the A/D sampling circuit according to a coding rule that the voice recognition module can recognize, and the data input of the voice recognition module is connected to the data output interface of the audio coding module;
The voice recognition module reads the audio data from the buffer of the audio coding module, extracts the features of the audio data, searches the text feature library pre-stored in the memory, performs feature matching, and outputs the text information of the matched features;
The data output of the voice recognition module is connected to the CPU through a digital interface. The CPU is connected to the in-vehicle display and the LED top light through an RS232 interface. The CPU receives the text information output by the voice recognition module and displays it on the in-vehicle display and the LED top light.
In addition, the utility model discloses an improved scheme, namely a voice control system for a taxi top light, comprising: a controller and an in-vehicle display arranged in the passenger compartment, and an LED top light arranged on the roof;
The controller comprises: an audio acquisition module for capturing the driver's voice, an audio coding module for encoding the audio signal, a voice recognition module for recognizing the audio signal, a power drive circuit that supplies power to each module, a CPU, a memory for storing the voice feature library and program code, and a button for starting and stopping voice capture;
The audio acquisition module comprises a microphone and an A/D sampling circuit; the A/D sampling circuit samples the analog audio signal generated by the microphone. The input of the audio coding module is connected to the output of the A/D sampling circuit. The audio coding module encodes the digital audio signal output by the A/D sampling circuit according to a coding rule that the voice recognition module can recognize, and the data input of the voice recognition module is connected to the data output interface of the audio coding module. The data output interface of the voice recognition module is connected to the CPU through a digital interface;
The CPU is connected through digital interfaces to the interrupt control interfaces of the audio acquisition module, the audio coding module, and the voice recognition module. In response to the trigger signal of the button, the CPU controls the audio acquisition module and the audio coding module to start capturing and encoding the audio signal. Once no voice signal has been detected for a preset delay, the CPU stops the audio acquisition module and the audio coding module and controls the voice recognition module to start speech recognition;
The voice recognition module reads the audio data from the buffer of the audio coding module, extracts the features of the audio data, searches the text feature library pre-stored in the memory, performs feature matching, and outputs the text information of the matched features;
The CPU is connected to the in-vehicle display and the LED top light through an RS232 interface, and through this interface displays the text information output by the voice recognition module on the in-vehicle display and the LED top light.
By designing a dedicated taxi LED top light controller with a voice input function, the utility model builds a taxi top light control system that effectively solves the problems of manually entering the top light's display content, namely inconvenient operation, wasted passenger time, and safety hazards.
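The button-triggered capture sequence of the improved scheme can be sketched as a small state machine. This is only an illustration: the class name `Controller`, the amplitude threshold, and the silence delay counted in frames are assumptions, not values given in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical model of the CPU's capture control (names are illustrative)."""
    silence_threshold: float = 0.02   # assumed level below which a frame counts as "no voice"
    silence_frames: int = 3           # assumed preset delay, counted in frames
    buffer: list = field(default_factory=list)
    capturing: bool = False
    quiet: int = 0

    def press_button(self):
        """Button trigger: clear the buffer and start capturing."""
        self.buffer.clear()
        self.quiet = 0
        self.capturing = True

    def feed(self, frame):
        """Feed one non-empty frame of samples.

        Returns True on the frame at which capture stops, i.e. the moment
        the recognizer would be started on the buffered audio.
        """
        if not self.capturing:
            return False
        self.buffer.append(frame)
        level = sum(abs(x) for x in frame) / len(frame)
        self.quiet = self.quiet + 1 if level < self.silence_threshold else 0
        if self.quiet >= self.silence_frames:
            self.capturing = False
            return True
        return False
```

When `feed` returns True, the buffered frames would be handed to the recognizer and the resulting text sent to the in-vehicle display and the LED top light.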
Brief description of the drawings
Fig. 1 is the functional block diagram of the taxi top light control system of the utility model.
Fig. 2 is a specific hardware topology diagram of the controller of the taxi top light control system.
Fig. 3 is a specific software flow chart of the taxi top light control system.
Fig. 4 is a schematic diagram of the speech recognition algorithm.
Fig. 5 is the MFCC calculation flow chart.
Fig. 6 is the search path diagram of the DTW algorithm.
Detailed description of the embodiment
A specific example of the above scheme is given below to further illustrate the taxi top light control system of the utility model.
In this example, the speech recognition of the controller uses an improved Mel-frequency cepstral coefficient (MFCC) extraction algorithm together with the dynamic time warping (DTW) algorithm. The improved algorithm does not extract the short-time magnitude spectrum of the speech directly; it first smooths the magnitude spectrum and then computes the MFCC parameters from the spectral envelope, which reduces the influence of the fundamental frequency. By combining MFCC extraction with DTW template matching and performing endpoint detection on the input speech signal, the system recognizes speaker-dependent isolated words well. MFCC parameter extraction and endpoint detection were also analyzed and evaluated for performance. Experimental results show that the algorithm achieves high recognition accuracy.
I. Hardware design
The hardware topology of the controller is shown in Fig. 2. It is built mainly from an audio decoding module, a noise reduction and synthesis module, a CPU core processing module, a power module (the power drive circuit) with an uninterruptible power supply (UPS), I/O interfaces, serial input and output, a 3G communication module, an A/D conversion module, and an audio capture device. The communication module is optional; through it, the device uploads voice and other information to a back end for centralized information management. The core processor is an ARM high-speed processing chip. The voice signal is sampled, filtered, and noise-reduced to provide high-fidelity audio, and the A/D conversion module converts the analog signal into a digital signal. The processor recognizes keywords with a dedicated algorithm and can edit the keyword list dynamically. The output is confirmed through the I/O port, and the correct display information or control command is sent over the bus to the electronic operation certificate for execution and to the intelligent top light for display. Using embedded speech recognition technology, the system can input and output audio data flexibly within a complex audio processing system. In the firmware and hardware design, access to the FLASH memory can be completely disabled in protected mode. After programming, the device can boot from the embedded memory and thus act as a fully customized speech recognition device, which improves its recognition rate more effectively.
The audio acquisition module uses a noise-resistant microphone, and the audio coding chip is the VS1005. Its main function is to capture the speech signal into the signal processing module, where the corresponding voice feature data are extracted by the speech recognition algorithm. The VS1005 delivers very good audio quality and consumes less power than a software solution. The VS1005 is a flexible audio platform device based on the VS_DSP4, a powerful DSP (digital signal processor) core. In standalone applications, its digital interfaces provide flexible access to external devices, so audio data can be input and output flexibly in a complex audio processing system. Its analog interfaces provide high-fidelity audio input and output; the control ADC, for example, can be used as a resistive touch screen interface. The VS1005 has 8 Mbit (1 MByte) of embedded flash memory for code from VLSI, the customer, or third parties. In the firmware and hardware design, access to the flash memory can be completely disabled in protected mode. After programming, the VS1005 can boot from the embedded memory and act as a fully customized standalone audio processor.
The CPU is the S3C2440, a high-speed ARM9 processor built around the ARM920T core with 0.13 µm CMOS standard macrocells and memory cells. Its low-power, simple, elegant, fully static design is particularly suitable for cost- and power-sensitive applications.
The voice recognition module uses the LD3320, a dedicated speech recognition chip. The chip integrates a speech recognition processor and some peripheral circuits, including A/D and D/A converters, a microphone interface, and a voice output interface. It needs no external auxiliary chips such as flash or RAM and can be integrated directly into an existing product to provide speech recognition, voice control, and human-machine dialogue functions. The list of keywords to be recognized can be edited dynamically.
II. Software design
The software flow of the system is shown in Fig. 3.
2.1 System scheduling
The system uses the VS1005 audio codec chip as the audio module to capture the speech signal, and a timer interrupt controls the operation of the system.
The main program enables the audio coding module and reads the audio data from its buffer into DDR SDRAM. When the audio module's buffer is completely empty, the main program performs preprocessing, endpoint detection, and MFCC parameter extraction on the data in DDR. Pattern matching uses the dynamic time warping (DTW) algorithm.
2.2 Algorithm principle
The basic structure of the voice recognition module is shown in Fig. 4, and the MFCC calculation flow is shown in Fig. 5.
2.2.1 Framing and pre-emphasis
Pre-emphasis boosts the high-frequency part of the speech to increase its high-frequency resolution. Its transfer function is generally H(z) = 1 − αz^{−1}, with α = 0.98. Because a speech signal is stationary over short intervals, it can be processed frame by frame, which reduces the adverse effects of its strong time variation.
Pre-emphasis algorithm:
$$\mathrm{sign}(n) = s(n) - \alpha\, s(n-1)$$
In the formula, α is the pre-emphasis coefficient, given here as 0.9 (0.98 in the transfer function above); s(n) is the digitized speech signal; sign(n) is the speech signal after pre-emphasis.
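A minimal sketch of the pre-emphasis difference equation follows. The function name `pre_emphasis` and the choice to treat the sample before the first one as zero are assumptions made for illustration.

```python
def pre_emphasis(s, alpha=0.9):
    """Return y(n) = s(n) - alpha * s(n-1), taking s(-1) as 0 (an assumption)."""
    out = []
    prev = 0.0
    for x in s:
        out.append(x - alpha * prev)
        prev = x
    return out
```

A constant (low-frequency) input is attenuated after the first sample, while rapid sample-to-sample changes pass through largely intact, which is the high-frequency boost described in the text.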
Framing algorithm:
$$s_w(n) = \sum_{m=-\infty}^{\infty} s(m)\, w(n-m)$$
In the formula, s(n) is the original signal, w(n) is the window function, and s_w(n) is the signal after framing.
The window function used for framing is defined in terms of N, the frame length (the number of samples in one frame):
[Equation image not reproduced in the text: window function w(n)]
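Framing with a window can be sketched as below. The window formula itself is not legible in this text, so a Hamming window is assumed purely for illustration; the function names and the hop size parameter are also assumptions.

```python
import math


def hamming(N):
    """Hamming window of length N >= 2 (assumed; the patent's window is not legible)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def split_frames(s, frame_len, hop):
    """Cut s into overlapping frames of frame_len samples and apply the window."""
    w = hamming(frame_len)
    frames = []
    for start in range(0, len(s) - frame_len + 1, hop):
        frames.append([s[start + n] * w[n] for n in range(frame_len)])
    return frames
```

Each frame is the product of a shifted window with the signal, which is the per-frame view of the windowed sum given in the framing formula.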
2.2.2 Endpoint detection algorithm
Endpoint detection locates the start and end of a voice command within a segment of speech signal. The system uses the short-time average magnitude method, which detects the start and end points of the speech accurately and thereby ensures a high recognition rate.
The short-time average magnitude is computed as shown in formula (3):
$$E = \sum_{n=0}^{N-1} |s(n)|$$
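Endpoint detection based on the short-time magnitude can be sketched as follows. The single fixed threshold and the function names are assumptions; the patent does not state how the decision threshold is chosen.

```python
def short_time_magnitude(frame):
    """E = sum of |s(n)| over the N samples of one frame."""
    return sum(abs(x) for x in frame)


def detect_endpoints(frames, threshold):
    """Return (first, last) indices of frames whose magnitude exceeds the
    threshold, or None if no frame does. The threshold is an assumed parameter."""
    active = [i for i, f in enumerate(frames) if short_time_magnitude(f) > threshold]
    if not active:
        return None
    return active[0], active[-1]
```

Frames outside the returned range would be discarded before feature extraction, so only the spoken command is matched against the templates.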
2.2.3 MFCC extraction algorithm
A speech signal is a typical time-varying signal, but if the observation window is shortened to a few tens of milliseconds, it yields a series of quasi-stationary signals. The human vocal organs can be modeled as several acoustic tube sections connected in series, the so-called vocal tube model.
After preprocessing, each sample of the speech signal can be approximated by a linear combination of several past samples, and a set of prediction coefficients a can be solved for by minimizing the mean square error between the actual speech samples and the linearly predicted samples. These are the initial features of the signal from which the MFCCs are extracted.
The MFCC extraction process is as follows.
The speech signal is preprocessed, then windowed and split into frames to obtain short-time signals. Each short-time time-domain signal is converted into a frequency-domain signal by the discrete Fourier transform (DFT), and its short-time energy is computed. The time-domain signal x(n) is first zero-padded to a sequence of length N (typically 512), and the DFT then gives the linear spectrum X(k):
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}, \quad 0 \le n, k \le N-1$$
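The zero-padding and DFT step can be sketched directly from the definition. This direct O(N²) form is for illustration only; a real implementation would normally use an FFT. The function name `dft` is an assumption.

```python
import cmath


def dft(x, N=512):
    """Zero-pad (or truncate) x to length N and return its DFT X(0..N-1)."""
    padded = list(x)[:N] + [0.0] * max(0, N - len(x))
    return [
        sum(padded[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
        for k in range(N)
    ]
```

The power spectrum used in the following step is then `abs(X[k]) ** 2` for each bin k.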
Logarithmic energy. To make the result more robust to noise and spectral estimation error, the logarithm is generally taken of the energy of the Mel spectrum obtained by passing the spectrum through the Mel filter bank, where H_m(k) is the response of the m-th of M Mel filters:
$$e(m) = \sum_{k=0}^{N-1} |X(k)|^2 H_m(k), \quad 0 \le m < M$$
$$S(m) = \ln\big(e(m)\big), \quad 0 \le m < M$$
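The filter-bank and logarithm step can be sketched as below. The patent does not specify the filter shapes, so triangular filters spaced evenly on the common Mel scale, mel(f) = 2595·log10(1 + f/700), are assumed; the filters are taken as zero above the Nyquist bin, so only bins 0..N/2 are stored. All function names and the small energy floor are illustrative.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(M, N, sample_rate):
    """M triangular filters H_m(k) over DFT bins 0..N//2 (assumed shape)."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (M + 1)) for i in range(M + 2)]
    bins = [min(N // 2, int(math.floor((N + 1) * f / sample_rate))) for f in edges]
    bank = []
    for m in range(1, M + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        h = [0.0] * (N // 2 + 1)
        for k in range(lo, mid):            # rising edge
            h[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi + 1):        # peak and falling edge
            h[k] = (hi - k) / (hi - mid) if hi > mid else 1.0
        bank.append(h)
    return bank


def log_mel_energies(power_spectrum, bank, floor=1e-10):
    """S(m) = ln(sum_k |X(k)|^2 H_m(k)); the floor avoids ln(0)."""
    return [
        math.log(max(sum(p * h for p, h in zip(power_spectrum, hm)), floor))
        for hm in bank
    ]
```

With a 512-point DFT, `power_spectrum` would hold the 257 values `abs(X[k]) ** 2` for k = 0..256.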
Discrete cosine transform (DCT). The log energies are transformed by the DCT to obtain the cepstral parameters. The standard cepstral parameters reflect only the static characteristics of the speech; in practice, because of the physical constraints on articulation, the speech in different frames is necessarily correlated and changes continuously, so first-order difference cepstral parameters are also used among the recognition parameters. With the sum running over the M Mel filters:
$$c(n) = \sum_{m=0}^{M-1} S(m) \cos\!\left(\frac{\pi n (m + 1/2)}{M}\right)$$
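The DCT and the first-order difference can be sketched as follows. The function names are illustrative, and the difference is taken as a plain frame-to-frame subtraction, which is one simple reading of "first-order difference"; the patent does not give its exact form.

```python
import math


def dct_cepstrum(S, num_coeffs):
    """c(n) = sum_{m=0}^{M-1} S(m) * cos(pi * n * (m + 1/2) / M), for n < num_coeffs."""
    M = len(S)
    return [
        sum(S[m] * math.cos(math.pi * n * (m + 0.5) / M) for m in range(M))
        for n in range(num_coeffs)
    ]


def delta(cepstra):
    """First-order difference between consecutive frames (assumed simple form)."""
    return [
        [b - a for a, b in zip(prev, cur)]
        for prev, cur in zip(cepstra, cepstra[1:])
    ]
```

For a flat log spectrum all the energy lands in c(0) and the higher coefficients vanish, which is a quick sanity check on the transform.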
2.2.4 Pattern matching
The pattern matching part of the system uses the dynamic time warping (DTW) algorithm.
The reference template is expressed as:
$$R = \{R_1, R_2, \ldots, R_m, \ldots, R_M\}$$
The test template is expressed as:
$$T = \{T_1, T_2, \ldots, T_n, \ldots, T_N\}$$
Here R_m and T_n are the feature parameters of the m-th frame of the reference speech and the n-th frame of the test speech. The reference and test templates generally use the same type of MFCC feature parameters, and both are vectors of dimension L = 16.
As shown in Fig. 6, the frames T_1, T_2, …, T_n, …, T_N of the test template are marked on the horizontal axis of a rectangular coordinate system, and the frames R_1, R_2, …, R_m, …, R_M of the reference template are marked on the vertical axis.
In Fig. 6, each grid intersection (n, m) represents the pairing of one frame of the test template with one frame of the reference template. The DTW algorithm finds an optimal path through these grid intersections, obtaining the overall minimum cumulative distance from the optimized local distances. The local distance is computed with the Euclidean formula, see formula (7), where d(n, m) is the distance between the feature vectors of frames T_n and R_m.
$$d(n, m) = \sum_{l=1}^{L} \left[T_n(l) - R_m(l)\right]^2$$
The cumulative distance at the data point (T_n, R_m) is denoted D(n, m):
$$D(n, m) = d(n, m) + \min\{D(n-1, m),\; D(n-1, m-1),\; D(n-1, m-2)\}$$
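A minimal DTW sketch follows. It assumes the common local path constraint in which grid point (n, m) may be reached from (n−1, m), (n−1, m−1), or (n−1, m−2), and it uses the squared Euclidean local distance from the text. The function names and the template dictionary are hypothetical.

```python
import math


def local_distance(t, r):
    """d(n, m): sum over the L feature dimensions of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(t, r))


def dtw_distance(T, R):
    """Minimum cumulative distance between test frames T and reference frames R."""
    N, M = len(T), len(R)
    INF = math.inf
    D = [[INF] * M for _ in range(N)]
    D[0][0] = local_distance(T[0], R[0])
    for n in range(1, N):
        for m in range(M):
            best = min(
                D[n - 1][m],
                D[n - 1][m - 1] if m >= 1 else INF,
                D[n - 1][m - 2] if m >= 2 else INF,
            )
            if best < INF:
                D[n][m] = local_distance(T[n], R[m]) + best
    return D[N - 1][M - 1]


def recognize(T, templates):
    """Return the key of the template with the smallest DTW distance to T."""
    return min(templates, key=lambda word: dtw_distance(T, templates[word]))
```

In use, `templates` would map each trained command to its sequence of 16-dimensional MFCC vectors, and `recognize` would return the text to show on the display and the top light.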
III. System performance
The speech recognition algorithm is implemented entirely in C. The development software was used to instantiate, compile, simulate, synthesize, verify, and implement the IP cores, and to create, edit, compile, link, load, and debug the C program code. To test the system, 20 different people each trained it; after training, each person ran 50 voice command tests, with each command 5 to 8 characters long. For each person, the test recorded the average recognition accuracy and the average time from capturing the speech to the recognition result appearing on the hyper terminal. The results are shown in Table 1.
Table 1 System performance test
Tester   Average system run time (s)   Accuracy (%)
1        1.5                           98
2        1.6                           96
3        1.5                           98
4        1.9                           92
5        1.2                           94
6        1.8                           96
7        1.9                           98
8        1.7                           92
9        1.8                           96
10       1.4                           92
11       1.7                           95
12       1.5                           94
13       1.5                           91
14       1.2                           100
15       1.3                           94
16       1.9                           89
17       1.8                           92
18       1.5                           91
19       1.3                           90
20       2.1                           97
avg      1.605                         94.25
As Table 1 shows, the recognition success rate for an individual speaker averages 94.25% and the average system run time is 1.605 s, so the system as a whole meets the performance requirements of an embedded device for speech recognition.
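The averages quoted from Table 1 can be reproduced from the per-tester rows with a few lines; the variable names are illustrative.

```python
# Per-tester values copied from Table 1 (testers 1 to 20).
times = [1.5, 1.6, 1.5, 1.9, 1.2, 1.8, 1.9, 1.7, 1.8, 1.4,
         1.7, 1.5, 1.5, 1.2, 1.3, 1.9, 1.8, 1.5, 1.3, 2.1]
accuracy = [98, 96, 98, 92, 94, 96, 98, 92, 96, 92,
            95, 94, 91, 100, 94, 89, 92, 91, 90, 97]

mean_time = sum(times) / len(times)            # seconds
mean_accuracy = sum(accuracy) / len(accuracy)  # percent
print(round(mean_time, 3), mean_accuracy)
```

The computed means agree with the "avg" row of the table.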

Claims (2)

1. A voice control system for a taxi top light, characterized in that it comprises: a controller and an in-vehicle display arranged in the passenger compartment, and an LED top light arranged on the roof;
The controller comprises: an audio acquisition module for capturing the driver's voice, an audio coding module for encoding the audio signal, a voice recognition module for recognizing the audio signal, a power drive circuit that supplies power to each module, a CPU, and a memory for storing the voice feature library and program code;
The audio acquisition module comprises a microphone and an A/D sampling circuit, the A/D sampling circuit sampling the analog audio signal generated by the microphone; the input of the audio coding module is connected to the output of the A/D sampling circuit; the audio coding module encodes the digital audio signal output by the A/D sampling circuit according to a coding rule that the voice recognition module can recognize, and the data input of the voice recognition module is connected to the data output interface of the audio coding module;
The data output of the voice recognition module is connected to the CPU through a digital interface; the CPU is connected to the in-vehicle display and the LED top light through an RS232 interface; the CPU receives the text information output by the voice recognition module and displays it on the in-vehicle display and the LED top light.
2. A voice control system for a taxi top light, characterized in that it comprises: a controller and an in-vehicle display arranged in the passenger compartment, and an LED top light arranged on the roof;
The controller comprises: an audio acquisition module for capturing the driver's voice, an audio coding module for encoding the audio signal, a voice recognition module for recognizing the audio signal, a power drive circuit that supplies power to each module, a CPU, a memory for storing the voice feature library and program code, and a button for starting and stopping voice capture;
The audio acquisition module comprises a microphone and an A/D sampling circuit, the A/D sampling circuit sampling the analog audio signal generated by the microphone; the input of the audio coding module is connected to the output of the A/D sampling circuit; the audio coding module encodes the digital audio signal output by the A/D sampling circuit according to a coding rule that the voice recognition module can recognize, and the data input of the voice recognition module is connected to the data output interface of the audio coding module; the data output interface of the voice recognition module is connected to the CPU through a digital interface;
The CPU is connected through digital interfaces to the interrupt control interfaces of the audio acquisition module, the audio coding module, and the voice recognition module;
The CPU is connected to the in-vehicle display and the LED top light through an RS232 interface, and through this interface displays the text information output by the voice recognition module on the in-vehicle display and the LED top light.
CN201320172221.8U 2013-04-08 2013-04-08 Voice control system of taxi top light Expired - Fee Related CN203419063U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201320172221.8U CN203419063U (en) 2013-04-08 2013-04-08 Voice control system of taxi top light

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201320172221.8U CN203419063U (en) 2013-04-08 2013-04-08 Voice control system of taxi top light

Publications (1)

Publication Number Publication Date
CN203419063U (en) 2014-02-05

Family

ID=50018164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201320172221.8U Expired - Fee Related CN203419063U (en) 2013-04-08 2013-04-08 Voice control system of taxi top light

Country Status (1)

Country Link
CN (1) CN203419063U (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104786911A (en) * 2015-04-15 2015-07-22 上海电机学院 Vehicle-mounted expression display system and method based on voice control
CN105323902A (en) * 2014-11-03 2016-02-10 苏州朗米尔照明科技有限公司 Intelligent display OLED ceiling dome light
CN105774688A (en) * 2014-12-19 2016-07-20 田坡 Information communication and indication system used in automobile
CN112509576A (en) * 2020-04-13 2021-03-16 安徽中科新辰技术有限公司 Voice-controlled large-screen display system

Similar Documents

Publication Publication Date Title
CN103204100B (en) A kind of Taxi roof beacon voice control system
CN102982811B (en) Voice endpoint detection method based on real-time decoding
CN110706690A (en) Speech recognition method and device
CN102930866B (en) Evaluation method for student reading assignment for oral practice
CN102999161B (en) A kind of implementation method of voice wake-up module and application
CN102163427B (en) Method for detecting audio exceptional event based on environmental model
CN1119794C (en) Distributed voice recognition system
CN104428766B (en) Speech recognition equipment
CN103617799B (en) A kind of English statement pronunciation quality detection method being adapted to mobile device
CN203419063U (en) Voice control system of taxi top light
Schluter et al. Using phase spectrum information for improved speech recognition performance
CN106782591A (en) A kind of devices and methods therefor that phonetic recognization rate is improved under background noise
CN104575504A (en) Method for personalized television voice wake-up by voiceprint and voice identification
CN102870156A (en) Audio communication device, method for outputting an audio signal, and communication system
CN102982803A (en) Isolated word speech recognition method based on HRSF and improved DTW algorithm
CN102231278A (en) Method and system for realizing automatic addition of punctuation marks in speech recognition
CN103065629A (en) Speech recognition system of humanoid robot
CN109508402A (en) Violation term detection method and device
CN102360187A (en) Chinese speech control system and method with mutually interrelated spectrograms for driver
CN112581963B (en) Voice intention recognition method and system
CN103366729A (en) Speech dialogue system, terminal apparatus, and data center apparatus
CN113393828A (en) Training method of voice synthesis model, and voice synthesis method and device
CN111508469A (en) Text-to-speech conversion method and device
CN107104994A (en) Audio recognition method, electronic installation and speech recognition system
CN102237083A (en) Portable interpretation system based on WinCE platform and language recognition method thereof

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: Daxi River town Xiaoshan District Hangzhou road 311265 Zhejiang province No. 391

Patentee after: ZHEJIANG HAILIAN ELECTRONIC CO., LTD.

Address before: 311265, Hangzhou Town, Xiaoshan District, Zhejiang Province

Patentee before: Zhejiang Hailian Electronics Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140205

Termination date: 20190408
