CN1545696A - Method of employing prefetch instructions in speech recognition

Method of employing prefetch instructions in speech recognition

Info

Publication number
CN1545696A
CN1545696A (application CNA018235549A / CN01823554A)
Authority
CN
China
Prior art keywords
speech data
acoustic processing
memory
group
speech
Prior art date
Legal status
Granted
Application number
CNA018235549A
Other languages
Chinese (zh)
Other versions
CN1223986C (en)
Inventor
赖春荣
赵庆伟
潘杰林
Current Assignee
Intel China Ltd
Intel Corp
Original Assignee
Intel China Ltd
Intel Corp
Priority date
Filing date
Publication date
Application filed by Intel China Ltd, Intel Corp
Publication of CN1545696A
Application granted
Publication of CN1223986C
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/28 Constructional details of speech recognition systems
    • G10L 15/285 Memory allocation or algorithm optimisation to reduce hardware requirements

Abstract

In general, the prefetching method according to one embodiment of the invention, employed by a computer system performing human speech recognition, provides an efficient way of computing and searching speech features based on a Gaussian distribution of acoustic hidden Markov model states. The method transfers the next speech data to be processed while the processor is acoustically processing the current speech data. Accordingly, the prefetching method reduces or eliminates the memory latency that would otherwise be incurred while the processor idles waiting for the memory to transfer the speech data to be processed.

Description

Method of employing prefetch instructions in speech recognition
Technical field
The present invention relates to speech recognition. More particularly, the present invention relates to a new apparatus and method that employ prefetch instructions to transfer speech data to be acoustically processed from main memory to a cache while the system is acoustically processing other speech data during the speech recognition stage of speech recognition processing.
Background technology
In the past few years, the science and technology of machine recognition of human speech have developed considerably. Today there are many application programs for large-vocabulary continuous speech recognition (LVCSR) in automatic speech recognition (ASR). To perform speech recognition, a computer system can serve as a speech engine that handles a large amount of computation and searching in order to analyze and recognize a speech signal carrying a person's speech features. The efficiency of the computer system in carrying out these operations therefore affects the performance of the speech engine.
Typically, a speech recognition system performs several operations on a person's speech signal to determine what was said. For example, when a person speaks the sentence "my name is John", a speech capture device such as a microphone captures the utterance as an analog audio signal. The analog signal is then converted into a digital signal so that it can be handled by a digital computer. The captured signal carrying speech features can be quantized with a mathematical model and represented as a plurality of feature vectors. For example, Mel-frequency cepstral coefficients (MFCC) can be used to represent speech features.
The computed features are then acoustically processed by a computer system. During acoustic processing, the features are compared with known phonetic units contained in an acoustic model. An example of an acoustic model is the hidden Markov model (HMM). The comparison of the speech features with the known phonetic units in the model may yield one or more matches. The matched phonetic units are then linguistically processed, for example using a dictionary or a grammar lexicon, to form a recognized word string.
To perform acoustic processing, the speech engine uses a large number of probability distributions, for example a mixture of M Gaussian distribution functions over the N-dimensional space of the feature vectors of the speech signal. The mean and variance associated with each feature vector are computed and stored in the memory of the computer system. Later, these parameters are fetched from memory so that the speech engine can complete the Gaussian computation.
Fig. 1 is a schematic diagram of the memory and execution cycles of an existing computer system involved in human speech recognition. It shows a timing comparison between the execution unit and the memory bus during acoustic processing of a speech signal. While the memory bus is transferring the speech data to be processed from memory, the execution unit stays idle until the data become available to the processor. Because of the total amount of computation required in speech analysis, this memory latency, i.e. the time wasted while the memory transfers the data to be processed, grows quickly. The problem is especially severe in LVCSR, where the speech signal is received continuously. Many operations must be completed per second, and this shortcoming severely limits the speed and efficiency of the system.
Description of drawings
Fig. 1 is a schematic diagram of the memory and execution cycles of a computer system used for acoustic processing according to the prior art.
Fig. 2 is a block diagram of an illustrative speech recognition system according to an embodiment of the method of the invention.
Fig. 3 is a flow chart illustrating a speech recognition system according to an embodiment of the invention.
Fig. 4 illustrates a method of computing speech features during acoustic processing of a speech signal.
Fig. 5 is illustrative C-language computer code employing the new prefetching technique according to the method of the invention.
Fig. 6 is illustrative assembly-language computer code employing the new prefetching technique according to an embodiment of the invention.
Fig. 7 is a schematic diagram illustrating the memory and execution cycles of a computer system according to an embodiment of the invention.
Embodiment
In the following detailed description of embodiments of the invention, numerous specific details are provided. It will be apparent, however, to one of ordinary skill in the art that the method according to an embodiment of the invention can be practiced without these details. In other instances, well-known methods, procedures, components and circuits are not described in detail so as not to obscure the embodiments of the invention.
The method according to the invention comprises various functional steps described below. These functional steps may be implemented by hardware components, or may be embodied in machine-executable instructions that can be used to cause a general-purpose processor programmed with the instructions to perform the functional steps. Alternatively, the functional steps may be performed by a combination of hardware and software.
Embodiments of the invention disclose a new prefetching technique to be applied during the acoustic processing stage of human speech recognition. The technique can be used to reduce or eliminate the memory latency caused by the execution unit waiting idle while data to be processed are transferred from main memory during acoustic processing. In a preferred embodiment, for example, while the execution unit is busy computing speech features, the application concurrently executes prefetch instructions for the data that will be processed next. Accordingly, while the execution unit is busy computing, the memory bus is busy prefetching the data the execution unit will need for its next computation.
Referring now to Fig. 2, a block diagram of an illustrative speech recognition system 200 is shown. The system comprises a speech capture device 210, an analog-to-digital converter 212, a computer system 250, and a set of I/O devices such as a controller device 240, a display device 242, a network interface card 244 and a printing device 246. The computer system 250 in turn comprises a processor 252, a memory 280, a cache 260, a cache controller 262, a memory bus 272 and an I/O bus 270. Preferably, the computer system may further include a direct memory access (DMA) unit 274.
The system works as follows: a person speaks into the microphone 210 and an analog speech signal is obtained. The signal then passes through the analog-to-digital converter 212 to form a digitized representation of the analog speech signal. The digitized representation is then input to the computer system 250. The processor 252 then begins to identify the speech features associated with the speech signal, and these features are stored in the memory 280 of the computer system 250. A cache 260 is used to store the prefetched data needed in computing the speech features. A cache controller 262, connected to the processor 252 and the cache 260, assists the data transfers between the processor 252 and the cache 260.
Also stored in the memory 280 is a plurality of known phonetic units, referred to as an acoustic model. The acoustic model employed by the present embodiment can be a speaker-dependent (SD) model or a speaker-independent (SI) model. The SD model is tailored to a specific person's voice, and the recognition system is expected to be used by that same person. For example, a mobile phone or a personal digital assistant usually adopts the SD model because it is expected to be used by the same person (the owner of the device). The SI model, on the other hand, is used when the person using the system changes. For example, an automatic teller machine (ATM) generally uses the SI model.
After the processor 252 has finished computing the features of the speech signal and has stored them in the memory 280, it can search for matches in the acoustic model, which is also stored in the memory 280. The particular search method used does not affect the method of this embodiment. For example, a single-best or N-best hypothesis can be used. In addition, a word graph or a phonetic word graph can be used to represent the matches obtained during the search of the acoustic model.
In either case, the matches are linguistically processed to determine the recognized word string. In addition to utilizing the display device 242, the processor 252 can send the matching results to another computer, for example a server device (not shown) capable of performing the language processing. If the processor 252 is also programmed to perform language processing on the matching results, it can print the corresponding recognized word string using the printing device 246. The recognized word string may also be displayed on the display device 242, or sent, for example, to the controller device 240, which sends a control signal to another system to control a device.
Referring now to Fig. 3, a flow chart of the use of a speech recognition system according to an embodiment is shown. In step 306, a person's speech signal is captured in analog form. The captured speech signal carries the speech features associated with what the speaker said. The particular speech features selected do not affect the method of this embodiment. For example, the selected speech features can be the energy of the speech signal measured over frequency intervals. As the person speaks, these features change, and the features can be represented by a plurality of feature vectors, each having a direction and a magnitude. The speech signal can then be expressed mathematically as a sum of feature vectors measured at different time intervals. The shorter the time interval, i.e. the higher the sampling frequency, the more accurate the representation of the speech signal. To compute the features, the signal is first converted into digital form so that it can be processed by a digital computer, as shown in step 308. In step 310, the features of the digitized speech signal are computed and stored in a storage unit of the system. For example, a mathematical model commonly used to represent speech features is the Mel-frequency cepstral coefficient (MFCC).
Also stored in the storage unit of the system are an acoustic model 330 and a language model 332. Step 340 represents acoustic and language processing. During this step, a search is carried out according to a search algorithm, for example a token-passing-based search (decoding) algorithm. During this "search processing" or "matching processing", the execution unit searches for matches between the features computed in step 310 (for example, the MFCCs of the speech signal) and the known phonetic features contained in the acoustic model. At this stage, the best candidates, for example a list of phonetic units, are obtained by selecting the candidates with the highest matching probabilities.
The search space varies according to the specific recognition application the system is programmed to perform. For example, for a dictation task the search space can be organized as a word tree, while for a command-and-control task the search space can be organized as a word graph. Any known search method can be used, for example single-best or N-best hypotheses. In any case, after the search a word graph can be produced by the execution unit. The word graph of alternative word candidates matched against the acoustic model is then linguistically processed, and a recognized word string is produced in step 350. The matching of the feature vectors against the known features contained in the acoustic model, that is, the acoustic model matching process, can use the method according to various embodiments of the invention.
During language processing, a language model can be used to form a single best sentence. The language model can employ a dictionary and a grammar lexicon to eliminate unlikely or disallowed words from the matched candidates. The resulting best sentence can be used as a control signal, or it can simply be stored in a dictation application.
Referring now to Fig. 4, an illustrative method of acoustically processing a speech signal is shown. In general, the speech signal is represented by a mathematical model based on, for example, MFCC. The model is computed from Gaussian distribution functions representing the states associated with a plurality of feature vectors. An example of such a mathematical model is formed using a Gaussian distribution probability function according to formula 410, where x = (x1, x2, ..., xN) is the feature vector with components 1 to N of the speech signal, and the mean 412 and variance 413 are the N-dimensional vectors of the m-th mixture component of the Gaussian distribution of an acoustic HMM state. In general, logarithmic computation is used to speed up the feature-vector computation. For example, when computing expression 408, the following formula is commonly used to accelerate the calculation, because log(Wm * fm(x)) can be computed as follows:
log(y1 + y2) = log(y1) + log(1 + y2/y1) = log(y1) + log(1 + e^(log(y2) - log(y1)))
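Reading formula 410 as a Gaussian mixture, log(Wm * fm(x)) is a log-domain term, and the identity above lets such terms be accumulated without leaving the log domain. A minimal C sketch of that log-add step follows; the function name logadd and the use of log1p/exp are illustrative assumptions, not code taken from the patent's figures:

#include <math.h>

/* Computes log(y1 + y2) from a = log(y1) and b = log(y2) using
 * log(y1 + y2) = log(y1) + log(1 + e^(log(y2) - log(y1))).
 * Ordering the arguments so that a >= b keeps the exponent <= 0,
 * which avoids overflow in exp(). */
static double logadd(double a, double b)
{
    if (a < b) { double t = a; a = b; b = t; }
    return a + log1p(exp(b - a));
}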
To have the processor carry out this computation, a counted loop can be used. Within the loop block, the arithmetic instructions depend on the preceding data transfers: before the computation is performed, data such as the values of the mean vector 412 and the variance vector 413 associated with each feature vector must be made available to the processor. A prefetch instruction can be used to transfer the mean and variance values for each feature vector. In a preferred embodiment, the prefetch instruction is executed while the execution unit is busy computing on the current data. The prefetch instruction can be executed at any point in the cycle during which the execution unit is busy with the current computation. The two events need not be exactly simultaneous, but in a preferred embodiment the prefetch instruction is executed concurrently with the current computation cycle of the execution unit.
This Gaussian computation may be performed many times to compute Gaussian probabilities from the feature vectors, mean vectors and variance vectors until the speech signal has been fully processed. In general, a loop is used to carry out the computation. While the execution unit is busy with one group of mean and variance vectors, the software can, for example, include a prefetch instruction that prefetches the next several mean and variance vectors, so that by the time the execution unit has finished its computation and is ready for the next group of mean and variance vectors, the values are already present in the cache. Having the values prefetched into the cache means that the execution unit does not need to sit idle waiting for data. The data to be processed are available, and after finishing the current computation the execution unit can simply proceed to its next computation.
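As a concrete illustration of the loop just described, the following C sketch evaluates one HMM state's Gaussian mixture while prefetching the next mixture component's mean and variance vectors. The data layout, the dimension DIM, the best-component approximation, and all names are assumptions for illustration only, not the code of Fig. 5 or Fig. 6:

#include <math.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */

#define DIM 39            /* assumed feature-vector dimension */

/* Scores feature vector x against M diagonal-covariance Gaussian components.
 * mean[m*DIM + i] and var[m*DIM + i] hold component m's parameters,
 * logw[m] its log mixture weight.  Returns the best component's log score. */
float score_state(const float *x, const float *mean, const float *var,
                  const float *logw, int M)
{
    float best = -1.0e30f;
    for (int m = 0; m < M; m++) {
        /* While component m is being evaluated, ask the memory system to
         * start moving component m+1's mean and variance into the cache. */
        if (m + 1 < M) {
            _mm_prefetch((const char *)&mean[(m + 1) * DIM], _MM_HINT_T0);
            _mm_prefetch((const char *)&var[(m + 1) * DIM], _MM_HINT_T0);
        }
        float s = logw[m];
        for (int i = 0; i < DIM; i++) {
            float d = x[i] - mean[m * DIM + i];
            s -= 0.5f * d * d / var[m * DIM + i];   /* log-Gaussian, constants omitted */
        }
        if (s > best)
            best = s;
    }
    return best;
}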
Fig. 5 shows illustrative C-language computer code employing a prefetch instruction according to an embodiment of the invention. In line 514, a prefetch instruction is set up to prefetch the data required by the computation of the function ippsLogGauss1_32f_D2 shown in line 518. The function _mm_prefetch() is an illustrative prefetch instruction available in a C language library. Any other prefetch instruction in any other machine language can be used, as long as the instruction causes the memory to send the data located at the prefetch address to the cache. Any computer language can be used in this embodiment.
When the prefetch instruction is executed, a whole cache line is generally prefetched. In a system with a cache line equal to 32 bytes, _mm_prefetch loads 8 floating-point numbers into the cache, since each floating-point number occupies 4 bytes. Accordingly, the prefetch address can be computed by adding an increment to the current address. The increment should ensure that the prefetched data will be needed soon after the prefetch completes; otherwise the operation may cause cache pollution and degrade the efficiency of the overall system. If the increment is too small, the prefetch will not effectively hide the prefetch latency before the next computation cycle of the execution unit begins. If the increment is too large, the startup cost of the initial iterations, whose data have not been prefetched, reduces the benefit of prefetching, and the prefetched data may replace earlier prefetched data before the earlier data are actually used. For large loops, the increment can be set to 32 bytes, i.e. 8 floating-point numbers.
In general, the value of the increment depends on the ratio between the computation cost and the memory fill cost of the loop. The desired value of the increment can be obtained from experience and design parameters. For large loops, the value of the increment can be set to 16. This causes the third cache line ahead to be prefetched during the computation. By using an increment value of 16, cache misses can be reduced by half.
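A hedged C sketch of how the prefetch address can be formed by adding a fixed increment to the address currently being consumed, using the increment of 16 floats (two 32-byte lines ahead, i.e. the third cache line counting the current one) discussed above; the loop body and all names are illustrative assumptions:

#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */

#define PREFETCH_AHEAD 16   /* floats: two 32-byte lines ahead of the current one */

/* Accumulates dst[i] += w * src[i] while prefetching src two cache lines ahead.
 * _mm_prefetch is only a hint, so addresses slightly past the end of src are harmless. */
void weighted_accumulate(float *dst, const float *src, float w, int n)
{
    for (int i = 0; i < n; i++) {
        if ((i & 7) == 0)   /* one prefetch per 32-byte (8-float) cache line */
            _mm_prefetch((const char *)&src[i + PREFETCH_AHEAD], _MM_HINT_T0);
        dst[i] += w * src[i];
    }
}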
The increment can also vary with the computer language used. For example, experience shows that in the C language the best results are obtained when the third cache line is prefetched, whereas in assembly language the best results are obtained when the fourth cache line is prefetched. The reason for this difference is the specific compiler used for the selected language; in C, the compiler causes the prefetch instructions to be issued somewhat more randomly. With an out-of-order core processor the performance difference is small and can be ignored, but the best performance is obtained with code written in assembly language.
Prefetch instructions can also be added to the main loop of ippsLogGauss1_32f_D2, as shown in lines 528 and 529. This illustrates, in particular, a prefetch placed after the memory load, which can achieve a similar effect.
Fig. 6 shows a modified version of the main loop shown at line 529 of Fig. 5. This illustrative assembly-language computer code employs prefetch instructions according to an embodiment of the invention. The loop is unrolled so that it processes 32 bytes per iteration, and the data in the fourth cache line ahead are prefetched. This method can reduce the decoding cost of speech recognition. For example, an experiment on a speech recognition system with a Chinese (51K) language model showed a 9% improvement.
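The assembly listing of Fig. 6 is not reproduced in this text; the following C sketch only approximates its structure, with a loop unrolled to consume one 32-byte cache line (8 floats) per iteration while prefetching three lines (24 floats) further on, i.e. the fourth cache line counting the current one. All names and the arithmetic in the body are assumptions, not the patent's code:

#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */

/* Sums (x[i]-mean[i])^2 / var[i] over n floats, unrolled by 8 so each
 * iteration consumes exactly one 32-byte cache line of mean and var.
 * For brevity, n is assumed to be a multiple of 8. */
float weighted_sq_distance(const float *x, const float *mean,
                           const float *var, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; i += 8) {
        /* Prefetch the fourth cache line (counting the current one as the first). */
        _mm_prefetch((const char *)&mean[i + 24], _MM_HINT_T0);
        _mm_prefetch((const char *)&var[i + 24], _MM_HINT_T0);
        for (int j = 0; j < 8; j++) {
            float d = x[i + j] - mean[i + j];
            s += d * d / var[i + j];
        }
    }
    return s;
}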
Fig. 7 is a schematic time-action diagram of the execution unit and memory cycles of an illustrative computer system involved in human speech recognition according to an embodiment of the invention. The method of this embodiment takes advantage of the long computation cycle of the Gaussian probability distribution function by prefetching the next mean and variance values of the corresponding feature vectors. As shown in Fig. 7, while the execution unit is performing the computation for vertex (n-1), the memory bus prefetches the data for vertex (n). Similarly, during the next cycle, while the execution unit is busy computing vertex (n), the memory bus is busy prefetching vertex (n+1). In this manner, the execution unit does not idle while waiting for the memory bus to load the data it needs to complete the computation. As a result, the latency inherent in prior-art speech recognition processing is eliminated.

Claims (27)

1. A method, comprising:
receiving a person's speech signal;
acoustically processing a first group of speech data associated with said person's speech signal;
while said first group of speech data is being acoustically processed, transferring a second group of speech data to be acoustically processed from a first memory to a second memory;
linguistically processing said acoustically processed first and second groups of speech data; and
forming a recognized word string associated with said person's speech signal.
2. The method according to claim 1, wherein said first memory comprises a main memory.
3. The method according to claim 1, wherein said second memory comprises a cache.
4. The method according to claim 1, wherein said first and second groups of speech data comprise mean vectors and variance vectors based on a Gaussian distribution of acoustic hidden Markov model states.
5. The method according to claim 4, wherein said mean vectors and said variance vectors are used to compute a feature vector, which is then used to search an acoustic model.
6. The method according to claim 1, wherein said recognized word string is used to control a device.
7. A method, comprising:
acoustically processing a first group of speech data; and
while said first group of speech data is being acoustically processed, transferring a second group of speech data to be acoustically processed from a first memory to a second memory.
8. The method according to claim 7, wherein said first and second groups of speech data comprise mean vectors and variance vectors based on a Gaussian distribution of acoustic hidden Markov model states.
9. The method according to claim 7, wherein said first memory is slower than said second memory.
10. The method according to claim 7, further comprising:
linguistically processing said acoustically processed first and second groups of speech data; and
recognizing at least one word corresponding to said speech data.
11. A system, comprising:
a client device, comprising:
a processor to acoustically process first and second groups of speech data,
a main memory coupled to said processor, the main memory storing said first and second groups of speech data,
a cache coupled to said processor and said main memory, wherein said processor acoustically processes said first group of speech data while said second group of speech data is transferred from said main memory to said cache, and
a transmitter module coupled to said processor of the client device, the transmitter module to send said acoustically processed first and second groups of speech data to a server.
12. The system according to claim 11, further comprising:
a human speech capture module to capture a person's speech signal;
an analog-to-digital converter module to convert said person's speech signal into a digital speech signal; and
a speech feature identifier module to identify features of said digital speech signal.
13. The system according to claim 11, wherein said client device is selected from a mobile phone, a personal digital assistant and a portable computer system.
14. The system according to claim 12, wherein said speech feature identifier module also performs end-point detection, pre-emphasis filtering and quantization on said person's speech signal.
15. The system according to claim 11, wherein said speech data comprise mean vectors and variance vectors based on a Gaussian distribution of acoustic hidden Markov model states.
16. The system according to claim 11, wherein said acoustically processed speech data is a word graph.
17. The system according to claim 16, wherein said transmitter module forms a binary representation of said word graph and, before sending said word graph, places said binary representation together with a source address and a destination address into a data packet.
18. An apparatus, comprising:
a main memory storing first and second groups of speech data;
a cache; and
a processor to acoustically process said first group of speech data while said second group of speech data is transferred from said main memory to said cache.
19. The apparatus according to claim 18, wherein said speech data are the mean and variance vectors of feature vectors associated with a person's speech signal.
20. The apparatus according to claim 18, wherein said apparatus is selected from a wireless device, a personal digital assistant and a mobile device.
21. The apparatus according to claim 18, further comprising:
a direct memory access module coupled to said main memory to send acoustically processed speech data over a network for language processing.
22. The apparatus according to claim 21, wherein said network is the Internet.
23. A computer-readable medium comprising a program executable by a processor, the program comprising:
a first subroutine to receive a person's speech signal;
a second subroutine to acoustically process a first group of speech data associated with said person's speech signal;
a third subroutine to transfer a second group of speech data to be acoustically processed from a first memory to a second memory while said first group of speech data is being acoustically processed;
a fourth subroutine to linguistically process said acoustically processed first group of speech data; and
a fifth subroutine to form a recognized word string associated with said person's speech signal.
24. The computer-readable medium according to claim 23, wherein said first and second groups of speech data comprise mean vectors and variance vectors based on a Gaussian distribution of acoustic hidden Markov model states.
25. The computer-readable medium according to claim 24, wherein said acoustically processed speech data comprise a word graph.
26. The computer-readable medium according to claim 25, further comprising:
a sixth subroutine to package said word graph into a data packet; and
a seventh subroutine to send said data packet over a network.
27. The computer-readable medium according to claim 26, wherein said network is the Internet.
CN01823554.9A 2001-06-19 2001-06-19 Method of employing prefetch instructions in speech recognition Expired - Fee Related CN1223986C (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2001/001028 WO2002103677A1 (en) 2001-06-19 2001-06-19 Method of employing prefetch instructions in speech recognition

Publications (2)

Publication Number Publication Date
CN1545696A true CN1545696A (en) 2004-11-10
CN1223986C CN1223986C (en) 2005-10-19

Family

ID=4574815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN01823554.9A Expired - Fee Related CN1223986C (en) 2001-06-19 2001-06-19 Method of employing prefetch instructions in speech recognition

Country Status (2)

Country Link
CN (1) CN1223986C (en)
WO (1) WO2002103677A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035743A (en) * 2013-03-07 2014-09-10 亚德诺半导体技术公司 System and method for processor wake-up based on sensor data
CN113068410A (en) * 2019-10-15 2021-07-02 谷歌有限责任公司 Efficient and low latency automated assistant control for smart devices

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5998387A (en) * 1982-11-26 1984-06-06 Nec Corp Memory circuit
IT1229782B (en) * 1989-05-22 1991-09-11 Face Standard Ind METHOD AND APPARATUS TO RECOGNIZE UNKNOWN VERBAL WORDS BY EXTRACTION OF PARAMETERS AND COMPARISON WITH REFERENCE WORDS

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035743A (en) * 2013-03-07 2014-09-10 亚德诺半导体技术公司 System and method for processor wake-up based on sensor data
CN104035743B (en) * 2013-03-07 2017-08-15 亚德诺半导体集团 System for carrying out processor wake-up based on sensing data
CN113068410A (en) * 2019-10-15 2021-07-02 谷歌有限责任公司 Efficient and low latency automated assistant control for smart devices

Also Published As

Publication number Publication date
WO2002103677A1 (en) 2002-12-27
CN1223986C (en) 2005-10-19

Similar Documents

Publication Publication Date Title
US10403266B2 (en) Detecting keywords in audio using a spiking neural network
EP3477633A1 (en) Systems and methods for robust speech recognition using generative adversarial networks
US6374212B2 (en) System and apparatus for recognizing speech
US5384892A (en) Dynamic language model for speech recognition
WO2017076222A1 (en) Speech recognition method and apparatus
KR101970041B1 (en) Methods for Hybrid GPU/CPU Data Processing
US6178401B1 (en) Method for reducing search complexity in a speech recognition system
US11107461B2 (en) Low-power automatic speech recognition device
US20040162729A1 (en) Assigning meanings to utterances in a speech recognition system
US10013974B1 (en) Compact HCLG FST
CN108735201A (en) Continuous speech recognition method, apparatus, equipment and storage medium
CN105118501A (en) Speech recognition method and system
CN112071310A (en) Speech recognition method and apparatus, electronic device, and storage medium
CN1221937C (en) Voice identification system of voice speed adaption
EP0938076B1 (en) A speech recognition system
CN108847251B (en) Voice duplicate removal method, device, server and storage medium
CN1223986C (en) Method of employing prefetch instructions in speech recognition
US9269355B1 (en) Load balancing for automatic speech recognition
US20220147570A1 (en) Information processing apparatus and information processing method
Liu et al. Speech recognition systems on the Cell Broadband Engine processor
Dixon et al. Recent development of wfst-based speech recognition decoder
JP2004053745A (en) Method, apparatus, and program for language model generation
KR100464420B1 (en) Apparatus for calculating an Observation Probability for a search of hidden markov model
US20230386458A1 (en) Pre-wakeword speech processing
Lin et al. In silico vox: Towards speech recognition in silicon

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20051019

Termination date: 20130619