CN110459216A - Canteen card-swiping device with speech recognition and method of use - Google Patents

Canteen card-swiping device with speech recognition and method of use

Info

Publication number
CN110459216A
Authority
CN
China
Prior art keywords
probability
state
speech recognition
voice signal
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910748064.2A
Other languages
Chinese (zh)
Other versions
CN110459216B (en)
Inventor
高兴宇
贺晓莹
宁黎华
廖斌
丁畅
侯晓玲
于方津
李煜
陆佳琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN201910748064.2A
Publication of CN110459216A
Application granted
Publication of CN110459216B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/148 Duration modelling in HMMs, e.g. semi HMM, segmental models or transition probabilities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07F COIN-FREED OR LIKE APPARATUS
    • G07F7/00 Mechanisms actuated by objects other than coins to free or to actuate vending, hiring, coin or paper currency dispensing or refunding apparatus
    • G07F7/08 Mechanisms actuated by objects other than coins to free or to actuate vending, hiring, coin or paper currency dispensing or refunding apparatus by coded identity card or credit card or other personal identification means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention provides a canteen card-swiping device with speech recognition and a method of using it, and relates to the technical field of card-swiping devices. The device comprises a mobile terminal, a central processing unit and an LED display card reader. The central processing unit comprises a wireless network module, a voice signal receiver, a speech recognition system and an A/D converter. The mobile terminal is connected to the voice signal receiver through the wireless network module; the voice signal receiver is connected to the speech recognition system; and the speech recognition system is connected to the A/D converter, whose output drives the LED display card reader. The speech recognition system denoises the voice instruction, extracts features from the voice signal, builds a GMM-HMM model by training Gaussian mixture models with the hidden Markov model (HMM) method, obtains the maximum likelihood probability of each feature and, according to that probability, outputs the letter corresponding to each feature. The device improves the responsiveness of the operation and increases the rate at which meals are collected by card in a canteen.

Description

Canteen card-swiping device with speech recognition and method of use
Technical field
The present invention relates to the technical field of card-swiping devices, and in particular to a canteen card-swiping device with speech recognition and a method of using it.
Background art
Queuing up and swiping a card to collect a meal is common in most canteens, but with existing card-swiping devices the number must first be entered manually by key-press, after which the people in the queue swipe their cards by induction. When many people are queuing for meals, entering the number by key-press and then swiping the card is slow, and manual key-press entry is error-prone, so the queue waits too long and throughput is low.
Summary of the invention
The technical problem to be solved by the present invention is to provide, in view of the above shortcomings of the prior art, a canteen card-swiping device with speech recognition and a method of using it. The device uses a speech recognition system, which improves the responsiveness of the operation and increases the rate at which meals are collected by card in the canteen.
In order to solve the above technical problems, the present invention adopts the following technical solution:
In one aspect, the present invention provides a canteen card-swiping device with speech recognition, comprising a mobile terminal, a central processing unit and an LED display card reader;
The central processing unit comprises a wireless network module, a voice signal receiver, a speech recognition system and an A/D converter;
The mobile terminal is connected to the voice signal receiver through the wireless network module;
The wireless network module is used to communicatively couple the mobile terminal and the voice signal receiver; the voice signal receiver is used to receive the voice signal sent through the wireless network module, and its output is connected to the speech recognition system;
The speech recognition system is used to denoise the voice instruction input at the mobile terminal and to extract features from the voice signal; a GMM-HMM model is established by modeling with Gaussian mixture models (GMM) and training the Gaussian mixture models with a hidden Markov model (HMM); each extracted feature is used as an input of the GMM-HMM model, the maximum likelihood probability of each feature is obtained, and the letter corresponding to the feature is output according to the maximum likelihood probability; the letters corresponding to all features are merged in the correct order into a set file; the output of the speech recognition system is connected to the input of the A/D converter;
The output of the A/D converter is connected to the input of the LED display card reader.
In another aspect, the present invention provides a method of using the canteen card-swiping device with speech recognition, implemented by the above canteen card-swiping device with speech recognition and comprising the following steps:
Step 1: connect the mobile terminal to the wireless network module to ensure that voice instructions can be transmitted through the mobile terminal; the mobile terminal receives the user's voice instruction and transmits it to the speech recognition system through the wireless network module;
Step 2: the speech recognition system performs speech recognition on the voice instruction by means of the trained GMM-HMM model and outputs the letter signals corresponding to the voice instruction; the obtained letter signals are merged, in the order in which the voice signal was cut, into a set file in the correct sequence, and this set file is output to the A/D converter;
Step 3: the A/D converter converts the analog signal into a digital signal and transmits it to the LED display card reader, which displays the digital signal.
The specific sub-steps of step 2 are as follows:
Step 2.1: the voice signal output by the mobile terminal is pre-processed; the pre-processing includes denoising and enhancement. The pre-processed voice signal is cut into segments of equal frame length, giving the voice signal cutting set H = {h_1, h_2, …, h_g, …, h_G}, where h_g denotes the g-th voice segment;
Step 2.2: acoustic features are extracted from h_g by means of the Fourier transform, and the resulting feature sequence forms the feature-vector sequence O_g = (o_1, o_2, …, o_T);
Step 2.3: the probability density function of the Gaussian mixture model (GMM) obeyed by the continuous voice instruction h_g is established, and the Gaussian mixture model is integrated into the hidden Markov model (HMM) to fit the state-conditioned output distribution; training is carried out according to the hidden Markov model HMM to obtain the maximum likelihood probability of the voice instruction h_g;
The probability density function p(h_g) is expressed as:

p(h_g) = \sum_{n=1}^{N} c_n \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left(-\frac{(h_g - \mu_n)^2}{2\sigma_n^2}\right)

where \mu_n denotes the mean of the n-th voice variable, \sigma_n^2 its variance and \sigma_n its standard deviation, c_n is its weight, and the mixture weights satisfy \sum_{n=1}^{N} c_n = 1;
The HMM is given by \lambda_g = (S_g, K_g, A_g, B_g, \pi_g), where S_g denotes the set of hidden states, K_g the set of output symbols, A_g the state transition probability matrix, B_g the observation probability distribution matrix and \pi_g the initial state probability distribution, with state transition probabilities A_g = [a_{ij}]_{N \times N}. Let the observable feature sequence be O_g = (o_1, o_2, …, o_T) and the corresponding state sequence of time span T be Q' = {q_1, q_2, …, q_t, …, q_T}; that is, the voice signal comprises T frames, where q_t denotes the state at time t and contains one or more observable features;
Suppose the hidden state s_i occupies the state q_{t-1} at time t-1 and the hidden state s_j occupies the state q_t at time t; then the probability of the hidden state transferring from s_i to s_j is:

a_{ij} = P(q_t = s_j \mid q_{t-1} = s_i)

where 1 ≤ i ≤ N, 1 ≤ j ≤ N and \sum_{j=1}^{N} a_{ij} = 1; P(\cdot) denotes probability. The state transition probabilities A_g are found from this formula;
The probability distribution matrix B_g is found from K_g and S_g. Under the condition that the hidden state s_j occupies the state q_t at time t, the probability of the output symbol v_k being observed is expressed as:

b_j(k) = P(o_t = v_k \mid q_t = s_j)

where 1 ≤ j ≤ N, 1 ≤ k ≤ M and \sum_{k=1}^{M} b_j(k) = 1; v_k denotes the symbol extracted in the observable state. The probability distribution matrix B_g is obtained from this formula;
The initial state probability distribution \pi_g = {\pi_i} over all states is found, where

\pi_i = P(q_1 = s_i), \quad 1 ≤ i ≤ N, \quad \sum_{i=1}^{N} \pi_i = 1,

q_i denoting the observable state under the hidden state s_i;
The probability of the observable feature sequence, P(O_g | \lambda_g), is calculated with the forward algorithm. At time t = 1 the forward probability is initialized as

\alpha_1(i) = \pi_i b_i(o_1),

where b_i(o_1) denotes the probability of the observation o_1 in state s_i. At time t+1 the forward probability is calculated as

\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \quad 1 ≤ t ≤ T-1,

where b_j(o_{t+1}) denotes the probability of the observable feature at time t+1. Then

P(O_g \mid \lambda_g) = \sum_{i=1}^{N} \alpha_T(i),

where \alpha_t(i) denotes the forward probability of state s_i at time t;
The HMM model parameters are decoded: for the feature sequence O_g the corresponding optimal state sequence is Q = (q'_1, q'_2, …, q'_T), found by the Viterbi algorithm. The Viterbi variable is

\delta_t(i) = \max_{q_1, \dots, q_{t-1}} P(q_1, \dots, q_{t-1}, q_t = s_i, o_1, \dots, o_t \mid \lambda_g)

and the path function is

\psi_t(j) = \arg\max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}],

from which the path quantity at time t follows by the recursion

\delta_t(j) = \max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}]\, b_j(o_t).

The optimal state at time T is then

q'_T = \arg\max_{1 ≤ i ≤ N} \delta_T(i),

and the remaining states are recovered by backtracking, q'_t = \psi_{t+1}(q'_{t+1});
The HMM is trained by adjusting the parameters of the HMM so that P(O_g | \lambda_g) is maximized. The expectation-maximization algorithm is applied to the speech signal observation sequence O_g to adjust the model parameters so that P(O_g | \lambda_g) is maximal; with the expected state occupancy \gamma_t(i) = P(q_t = s_i \mid O_g, \lambda_g) and the expected transition probability \xi_t(i, j) = P(q_t = s_i, q_{t+1} = s_j \mid O_g, \lambda_g), the state probability distribution, the state transition probabilities and the symbol probability distribution matrix are respectively updated in memory;
The update formulas are:

\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)},

where \gamma_1(i) denotes the updated initial state probability distribution. Iterating these updates yields the maximum likelihood probability;
Step 2.4: let g = g + 1 and repeat steps 2.2 to 2.3 to obtain the maximum likelihood probability of every voice signal in the cutting set H = {h_1, h_2, …, h_g, …, h_G};
Step 2.5: the letter signal corresponding to each voice signal is obtained from its maximum likelihood probability: among the candidate letter signals W, the one whose feature-matching probability χ is largest is taken as the letter signal of the g-th voice segment.
The beneficial effects of adopting the above technical solution are as follows. Compared with the prior art, the canteen card-swiping device with speech recognition and its method of use provided by the present invention have the following notable advantages:
1. The canteen card-swiping device with speech recognition and its method of use transmit the signal wirelessly; connection is fast, operation is simple and the cost is relatively low.
2. With the canteen card-swiping device with speech recognition and its method of use, the user speaks into the device, the speech recognition system extracts features and identifies the voice signal transmitted by the user, and the signal conversion and card-swiping operation are carried out quickly; the device is accurate, fast and stable.
3. The canteen card-swiping device with speech recognition and its method of use are simple, convenient and fast, and achieve accurate speech recognition. Applied in a canteen, they largely eliminate the time consumed by manual key-press card swiping, greatly increase the rate at which meals are collected, process the speech signal more accurately, and avoid the errors that currently occur with manual key-press card swiping; they therefore have great practical significance.
Detailed description of the invention
Fig. 1 is a block diagram of the canteen card-swiping device with speech recognition provided by an embodiment of the present invention;
Fig. 2 is a flow chart of the method of using the canteen card-swiping device with speech recognition provided by an embodiment of the present invention;
Fig. 3 is a flow chart of the speech recognition provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the HMM training provided by an embodiment of the present invention.
Specific embodiment
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples are intended to illustrate the present invention and not to limit its scope.
The method of the present embodiment is as described below.
In one aspect, the present invention provides a canteen card-swiping device with speech recognition which, as shown in Fig. 1, comprises a mobile terminal, a central processing unit and an LED display card reader;
The central processing unit is embedded in an ARM intelligent control chip and comprises a wireless network module, a voice signal receiver, a speech recognition system and an A/D converter;
The mobile terminal is connected to the voice signal receiver through the wireless network module. The mobile terminal is preferably a headset: short-range transmission not only delivers the voice signal accurately and clearly and is convenient and efficient, but also avoids the noise and distortion caused by transmitting the voice signal over a long distance, thereby guaranteeing the speech recognition rate in a real scene.
The wireless network module is used to communicatively couple the mobile terminal and the voice signal receiver; as the link over which the mobile terminal and the speech recognition system communicate, it is convenient, efficient, easy to operate and easy to implement. The voice signal receiver is used to receive the voice signal sent through the wireless network module, and its output is connected to the speech recognition system;
The speech recognition system is used to denoise the voice instruction input at the mobile terminal and to extract features from the voice signal. A GMM-HMM model is established by modeling with Gaussian mixture models (GMM) and training the Gaussian mixture models with a hidden Markov model (HMM); each extracted feature is used as an input of the GMM-HMM model, the maximum likelihood probability of each feature is obtained, and the letter corresponding to the feature is output according to the maximum likelihood probability; the letters corresponding to all features are merged in the correct order into a set file. The output of the speech recognition system is connected to the input of the A/D converter. Features are first extracted from the voice signal input at the mobile terminal and checked for a similarity match; acoustic-model and pattern matching are then applied to the voice signal to ensure that the input acoustic pattern is a command for the card-swiping system;
The A/D converter (ADC0809CCN) is installed in the internal circuitry of the canteen card-swiping device; it converts the analog signal of the set file input by the speech recognition system into a digital signal, and its output is connected to the input of the LED display card reader.
The LED display card reader is used to display the digital signal output by the A/D converter. It is simple, intuitive, convenient and fast, and can quickly display the number and prompt the card swipe; the LED display card reader uses a P3 full-color LED display screen.
In another aspect, the present invention provides a method of using the canteen card-swiping device with speech recognition, implemented by the above canteen card-swiping device with speech recognition and, as shown in Fig. 2, comprising the following steps:
Step 1: connect the mobile terminal to the wireless network module to ensure that voice instructions can be transmitted through the mobile terminal; the mobile terminal receives the user's voice instruction and transmits it to the speech recognition system through the wireless network module;
Step 2: the speech recognition system performs speech recognition on the voice instruction by means of the trained GMM-HMM model and outputs the letter signals corresponding to the voice instruction; as shown in Fig. 3, the obtained letter signals are merged, in the order in which the voice signal was cut, into a set file in the correct sequence, and this set file is output to the A/D converter;
The specific sub-steps are as follows:
Step 2.1: the voice signal output by the mobile terminal is pre-processed; the pre-processing includes denoising and enhancement. The pre-processed voice signal is cut into segments of equal frame length, giving the voice signal cutting set H = {h_1, h_2, …, h_g, …, h_G}, where h_g denotes the g-th voice segment;
Step 2.2: acoustic features are extracted from h_g by means of the Fourier transform, and the resulting feature sequence forms the feature-vector sequence O_g = (o_1, o_2, …, o_T);
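By way of illustration only, a minimal sketch of this feature-extraction step is given below; the choice of MFCC features, the librosa library, the sampling rate and the frame parameters are assumptions made for the example and are not specified by the present embodiment.

```python
# Illustrative sketch of step 2.2: Fourier-transform-based acoustic feature
# extraction for one voice segment h_g. MFCCs, librosa and the frame parameters
# are assumed for the example; the embodiment only requires that a feature-vector
# sequence o_1 .. o_T be produced from h_g.
import numpy as np
import librosa

def extract_features(h_g: np.ndarray, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Return a (T, n_mfcc) feature-vector sequence for one voice segment."""
    mfcc = librosa.feature.mfcc(
        y=h_g, sr=sr, n_mfcc=n_mfcc,
        n_fft=int(0.025 * sr),        # 25 ms analysis window (assumed)
        hop_length=int(0.010 * sr),   # 10 ms hop (assumed)
    )
    return mfcc.T                     # one feature vector per frame
```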
Step 2.3: the probability density function of the Gaussian mixture model (GMM) obeyed by the continuous voice instruction h_g is established, and the Gaussian mixture model is integrated into the hidden Markov model (HMM) to fit the state-conditioned output distribution; training is carried out according to the hidden Markov model HMM to obtain the maximum likelihood probability of the voice instruction h_g;
The probability density function p(h_g) is expressed as:

p(h_g) = \sum_{n=1}^{N} c_n \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left(-\frac{(h_g - \mu_n)^2}{2\sigma_n^2}\right)

where \mu_n denotes the mean of the n-th voice variable, \sigma_n^2 its variance and \sigma_n its standard deviation, c_n is its weight, and the mixture weights satisfy \sum_{n=1}^{N} c_n = 1;
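A minimal numerical sketch of the mixture density above follows, written for scalar observations; the function name and the example component values are illustrative assumptions.

```python
# Illustrative sketch of the GMM density of step 2.3 (scalar observations):
# p(x) = sum_n c_n * N(x; mu_n, sigma_n^2), with the weights summing to 1.
import numpy as np

def gmm_density(x: float, c: np.ndarray, mu: np.ndarray, sigma: np.ndarray) -> float:
    """Evaluate the mixture density at x; c, mu and sigma are per-component arrays."""
    assert np.isclose(c.sum(), 1.0), "mixture weights must sum to 1"
    comp = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return float(np.dot(c, comp))

# Example with two assumed components:
# gmm_density(0.3, c=np.array([0.4, 0.6]), mu=np.array([0.0, 1.0]), sigma=np.array([1.0, 0.5]))
```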
The HMM is given by \lambda_g = (S_g, K_g, A_g, B_g, \pi_g), as shown in Fig. 4, where S_g denotes the set of hidden states, K_g the set of output symbols, A_g the state transition probability matrix, B_g the observation probability distribution matrix and \pi_g the initial state probability distribution, with state transition probabilities A_g = [a_{ij}]_{N \times N}. Let the observable feature sequence be O_g = (o_1, o_2, …, o_T) and the corresponding state sequence of time span T be Q' = {q_1, q_2, …, q_t, …, q_T}; that is, the voice signal comprises T frames, where q_t denotes the state at time t and contains one or more observable features;
Suppose the hidden state s_i occupies the state q_{t-1} at time t-1 and the hidden state s_j occupies the state q_t at time t; then the probability of the hidden state transferring from s_i to s_j is:

a_{ij} = P(q_t = s_j \mid q_{t-1} = s_i)

where 1 ≤ i ≤ N, 1 ≤ j ≤ N and \sum_{j=1}^{N} a_{ij} = 1; P(\cdot) denotes probability. The state transition probabilities A_g are found from this formula;
The probability distribution matrix B_g is found from K_g and S_g. Under the condition that the hidden state s_j occupies the state q_t at time t, the probability of the output symbol v_k being observed is expressed as:

b_j(k) = P(o_t = v_k \mid q_t = s_j)

where 1 ≤ j ≤ N, 1 ≤ k ≤ M and \sum_{k=1}^{M} b_j(k) = 1; v_k denotes the symbol extracted in the observable state. The probability distribution matrix B_g is obtained from this formula;
The initial state probability distribution \pi_g = {\pi_i} over all states is found, where

\pi_i = P(q_1 = s_i), \quad 1 ≤ i ≤ N, \quad \sum_{i=1}^{N} \pi_i = 1,

q_i denoting the observable state under the hidden state s_i;
The probability of the observable feature sequence, P(O_g | \lambda_g), is calculated with the forward algorithm. At time t = 1 the forward probability is initialized as

\alpha_1(i) = \pi_i b_i(o_1),

where b_i(o_1) denotes the probability of the observation o_1 in state s_i. At time t+1 the forward probability is calculated as

\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \quad 1 ≤ t ≤ T-1,

where b_j(o_{t+1}) denotes the probability of the observable feature at time t+1. Then

P(O_g \mid \lambda_g) = \sum_{i=1}^{N} \alpha_T(i),

where \alpha_t(i) denotes the forward probability of state s_i at time t;
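The forward recursion above can be sketched as follows for an HMM with discrete output symbols; working directly with probabilities, rather than log probabilities or scaled values, is a simplification assumed only for readability.

```python
# Illustrative sketch of the forward algorithm: P(O | lambda) for an observation
# index sequence obs, given pi (N,), transition matrix A (N, N) and emission
# matrix B (N, M). Unscaled probabilities are used only to keep the sketch short.
import numpy as np

def forward_likelihood(pi: np.ndarray, A: np.ndarray, B: np.ndarray,
                       obs: np.ndarray) -> float:
    alpha = pi * B[:, obs[0]]                 # alpha_1(i) = pi_i * b_i(o_1)
    for t in range(len(obs) - 1):
        # alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
        alpha = (alpha @ A) * B[:, obs[t + 1]]
    return float(alpha.sum())                 # P(O | lambda) = sum_i alpha_T(i)
```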
The HMM model parameters are decoded: for the feature sequence O_g the corresponding optimal state sequence is Q = (q'_1, q'_2, …, q'_T), found by the Viterbi algorithm. The Viterbi variable is

\delta_t(i) = \max_{q_1, \dots, q_{t-1}} P(q_1, \dots, q_{t-1}, q_t = s_i, o_1, \dots, o_t \mid \lambda_g)

and the path function is

\psi_t(j) = \arg\max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}],

from which the path quantity at time t follows by the recursion

\delta_t(j) = \max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}]\, b_j(o_t).

The optimal state at time T is then

q'_T = \arg\max_{1 ≤ i ≤ N} \delta_T(i),

and the remaining states are recovered by backtracking, q'_t = \psi_{t+1}(q'_{t+1});
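The Viterbi decoding described above admits the following minimal sketch for discrete output symbols; the array layout and the use of unscaled probabilities are assumptions made for brevity.

```python
# Illustrative sketch of Viterbi decoding: the optimal state sequence
# q'_1 .. q'_T for observation indices obs, given pi, A and B as above.
import numpy as np

def viterbi(pi: np.ndarray, A: np.ndarray, B: np.ndarray, obs: np.ndarray) -> np.ndarray:
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))                  # delta_t(i): best path score ending in state i
    psi = np.zeros((T, N), dtype=int)         # psi_t(j): best predecessor of state j
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A    # delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()             # q'_T = argmax_i delta_T(i)
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]     # q'_t = psi_{t+1}(q'_{t+1})
    return path
```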
The HMM is trained by adjusting the parameters of the HMM so that P(O_g | \lambda_g) is maximized. The expectation-maximization algorithm (EM algorithm) is applied to the speech signal observation sequence O_g to adjust the model parameters so that P(O_g | \lambda_g) is maximal; with the expected state occupancy \gamma_t(i) = P(q_t = s_i \mid O_g, \lambda_g) and the expected transition probability \xi_t(i, j) = P(q_t = s_i, q_{t+1} = s_j \mid O_g, \lambda_g), the state probability distribution, the state transition probabilities and the symbol probability distribution matrix are respectively updated in memory;
The update formulas are:

\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)},

where \gamma_1(i) denotes the updated initial state probability distribution. Iterating these updates yields the maximum likelihood probability;
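In practice the EM training of one GMM-HMM per spoken command, and the scoring of an utterance against a trained model, can be sketched with an off-the-shelf library; the hmmlearn library and all hyperparameter values below are assumptions made for the example and are not part of the present embodiment.

```python
# Illustrative sketch of the EM training step: fit one GMM-HMM per voice command
# from its feature sequences, then score utterances against it. The hmmlearn
# library and the state/mixture/iteration counts are assumed for the example.
import numpy as np
from hmmlearn.hmm import GMMHMM

def train_command_model(feature_seqs: list) -> GMMHMM:
    """feature_seqs: list of (T_i, D) feature matrices for one spoken command."""
    X = np.vstack(feature_seqs)
    lengths = [len(f) for f in feature_seqs]
    model = GMMHMM(n_components=5, n_mix=4, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)           # EM re-estimation of pi, A and the GMM emissions
    return model

def log_likelihood(model: GMMHMM, features: np.ndarray) -> float:
    return model.score(features)    # log P(O | lambda) via the forward algorithm
```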
Step 2.4: let g = g + 1 and repeat steps 2.2 to 2.3 to obtain the maximum likelihood probability of every voice signal in the cutting set H = {h_1, h_2, …, h_g, …, h_G};
Step 2.5: the letter signal corresponding to each voice signal is obtained from its maximum likelihood probability: among the candidate letter signals W, the one whose feature-matching probability χ is largest is taken as the letter signal of the g-th voice segment;
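Step 2.5 then amounts to a comparison of per-model likelihoods, as in the following sketch; the dictionary of per-letter models is an assumed data structure, since the embodiment only states that the letter signal with the highest feature-matching probability is selected.

```python
# Illustrative sketch of step 2.5: assign to the segment h_g the letter whose
# trained model yields the largest (log-)likelihood for the segment's features.
import numpy as np

def decode_letter(features_g: np.ndarray, letter_models: dict) -> str:
    """features_g: (T, D) features of segment h_g; letter_models: {letter: model}."""
    scores = {letter: model.score(features_g) for letter, model in letter_models.items()}
    return max(scores, key=scores.get)        # the letter signal W_g for segment h_g
```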
Step 3: the A/D converter converts the analog signal into a digital signal and transmits it to the LED display card reader, which displays the digital signal.
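Tying the sketches above together, the software side of steps 2.1 to 2.5 for one utterance can be outlined as follows; equal-length segmentation with numpy, the model dictionary and the returned string are assumptions made for the example, and the hardware side of step 3 (A/D conversion and the LED display card reader) is outside its scope.

```python
# Illustrative end-to-end sketch of step 2 for one (already denoised) utterance,
# reusing extract_features and decode_letter from the sketches above.
import numpy as np

def recognize_voice_instruction(voice: np.ndarray, sr: int, letter_models: dict,
                                n_segments: int) -> str:
    # Step 2.1: cut the pre-processed signal into G equal-length segments h_1 .. h_G.
    segments = np.array_split(voice.astype(np.float32), n_segments)
    # Steps 2.2 to 2.5: extract features and decode each segment, preserving the
    # cutting order so the merged result forms the set file in the correct sequence.
    letters = [decode_letter(extract_features(h, sr), letter_models) for h in segments]
    return "".join(letters)
```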
Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.

Claims (3)

1. A canteen card-swiping device with speech recognition, characterized by comprising a mobile terminal, a central processing unit and an LED display card reader;
the central processing unit comprises a wireless network module, a voice signal receiver, a speech recognition system and an A/D converter;
the mobile terminal is connected to the voice signal receiver through the wireless network module;
the wireless network module is used to communicatively couple the mobile terminal and the voice signal receiver; the voice signal receiver is used to receive the voice signal sent through the wireless network module, and its output is connected to the speech recognition system;
the speech recognition system is used to denoise the voice instruction input at the mobile terminal and to extract features from the voice signal; a GMM-HMM model is established by modeling with Gaussian mixture models (GMM) and training the Gaussian mixture models with a hidden Markov model (HMM); each extracted feature is used as an input of the GMM-HMM model, the maximum likelihood probability of each feature is obtained, and the letter corresponding to the feature is output according to the maximum likelihood probability; the letters corresponding to all features are merged in the correct order into a set file; the output of the speech recognition system is connected to the input of the A/D converter;
the output of the A/D converter is connected to the input of the LED display card reader.
2. A method of using the canteen card-swiping device with speech recognition, implemented by the canteen card-swiping device with speech recognition according to claim 1, characterized by comprising the following steps:
Step 1: connect the mobile terminal to the wireless network module to ensure that voice instructions can be transmitted through the mobile terminal; the mobile terminal receives the user's voice instruction and transmits it to the speech recognition system through the wireless network module;
Step 2: the speech recognition system performs speech recognition on the voice instruction by means of the trained GMM-HMM model and outputs the letter signals corresponding to the voice instruction; the obtained letter signals are merged, in the order in which the voice signal was cut, into a set file in the correct sequence, and this set file is output to the A/D converter;
Step 3: the A/D converter converts the analog signal into a digital signal and transmits it to the LED display card reader, which displays the digital signal.
3. The method of using the canteen card-swiping device with speech recognition according to claim 2, characterized in that the specific sub-steps of step 2 are as follows:
Step 2.1: the voice signal output by the mobile terminal is pre-processed; the pre-processing includes denoising and enhancement. The pre-processed voice signal is cut into segments of equal frame length, giving the voice signal cutting set H = {h_1, h_2, …, h_g, …, h_G}, where h_g denotes the g-th voice segment;
Step 2.2: acoustic features are extracted from h_g by means of the Fourier transform, and the resulting feature sequence forms the feature-vector sequence O_g = (o_1, o_2, …, o_T);
Step 2.3: the probability density function of the Gaussian mixture model (GMM) obeyed by the continuous voice instruction h_g is established, and the Gaussian mixture model is integrated into the hidden Markov model (HMM) to fit the state-conditioned output distribution; training is carried out according to the hidden Markov model HMM to obtain the maximum likelihood probability of the voice instruction h_g;
The probability density function p(h_g) is expressed as:

p(h_g) = \sum_{n=1}^{N} c_n \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left(-\frac{(h_g - \mu_n)^2}{2\sigma_n^2}\right)

where \mu_n denotes the mean of the n-th voice variable, \sigma_n^2 its variance and \sigma_n its standard deviation, c_n is its weight, and the mixture weights satisfy \sum_{n=1}^{N} c_n = 1;
The HMM is given by \lambda_g = (S_g, K_g, A_g, B_g, \pi_g), where S_g denotes the set of hidden states, K_g the set of output symbols, A_g the state transition probability matrix, B_g the observation probability distribution matrix and \pi_g the initial state probability distribution, with state transition probabilities A_g = [a_{ij}]_{N \times N}. Let the observable feature sequence be O_g = (o_1, o_2, …, o_T) and the corresponding state sequence of time span T be Q' = {q_1, q_2, …, q_t, …, q_T}; that is, the voice signal comprises T frames, where q_t denotes the state at time t and contains one or more observable features;
Suppose the hidden state s_i occupies the state q_{t-1} at time t-1 and the hidden state s_j occupies the state q_t at time t; then the probability of the hidden state transferring from s_i to s_j is:

a_{ij} = P(q_t = s_j \mid q_{t-1} = s_i)

where 1 ≤ i ≤ N, 1 ≤ j ≤ N and \sum_{j=1}^{N} a_{ij} = 1; P(\cdot) denotes probability. The state transition probabilities A_g are found from this formula;
The probability distribution matrix B_g is found from K_g and S_g. Under the condition that the hidden state s_j occupies the state q_t at time t, the probability of the output symbol v_k being observed is expressed as:

b_j(k) = P(o_t = v_k \mid q_t = s_j)

where 1 ≤ j ≤ N, 1 ≤ k ≤ M and \sum_{k=1}^{M} b_j(k) = 1; v_k denotes the symbol extracted in the observable state. The probability distribution matrix B_g is obtained from this formula;
The initial state probability distribution \pi_g = {\pi_i} over all states is found, where

\pi_i = P(q_1 = s_i), \quad 1 ≤ i ≤ N, \quad \sum_{i=1}^{N} \pi_i = 1,

q_i denoting the observable state under the hidden state s_i;
The probability of the observable feature sequence, P(O_g | \lambda_g), is calculated with the forward algorithm. At time t = 1 the forward probability is initialized as \alpha_1(i) = \pi_i b_i(o_1); at time t+1 it is calculated as

\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \quad 1 ≤ t ≤ T-1,

where b_j(o_{t+1}) denotes the probability of the observable feature at time t+1; then P(O_g | \lambda_g) = \sum_{i=1}^{N} \alpha_T(i), where \alpha_t(i) denotes the forward probability of state s_i at time t;
The HMM model parameters are decoded: for the feature sequence O_g the corresponding optimal state sequence is Q = (q'_1, q'_2, …, q'_T), found by the Viterbi algorithm. With the Viterbi variable \delta_t(i) = \max_{q_1, \dots, q_{t-1}} P(q_1, \dots, q_{t-1}, q_t = s_i, o_1, \dots, o_t \mid \lambda_g) and the path function \psi_t(j) = \arg\max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}], the recursion \delta_t(j) = \max_{1 ≤ i ≤ N} [\delta_{t-1}(i)\, a_{ij}]\, b_j(o_t) gives the path quantity at time t; the optimal state at time T is q'_T = \arg\max_{1 ≤ i ≤ N} \delta_T(i), and the remaining states are recovered by backtracking, q'_t = \psi_{t+1}(q'_{t+1});
The HMM is trained by adjusting the parameters of the HMM so that P(O_g | \lambda_g) is maximized. The expectation-maximization algorithm is applied to the given speech signal observation sequence O_g to adjust the model parameters so that P(O_g | \lambda_g) is maximal; with the expected state occupancy \gamma_t(i) = P(q_t = s_i \mid O_g, \lambda_g) and the expected transition probability \xi_t(i, j) = P(q_t = s_i, q_{t+1} = s_j \mid O_g, \lambda_g), the state probability distribution, the state transition probabilities and the symbol probability distribution matrix are respectively updated in memory;
The update formulas are:

\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)},

where \gamma_1(i) denotes the updated initial state probability distribution; iterating these updates yields the maximum likelihood probability;
Step 2.4: let g = g + 1 and repeat steps 2.2 to 2.3 to obtain the maximum likelihood probability of every voice signal in the cutting set H = {h_1, h_2, …, h_g, …, h_G};
Step 2.5: the letter signal corresponding to each voice signal is obtained from its maximum likelihood probability: among the candidate letter signals W, the one whose feature-matching probability χ is largest is taken as the letter signal of the g-th voice segment.
CN201910748064.2A 2019-08-14 2019-08-14 Canteen card swiping device with voice recognition function and using method Active CN110459216B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910748064.2A CN110459216B (en) 2019-08-14 2019-08-14 Canteen card swiping device with voice recognition function and using method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910748064.2A CN110459216B (en) 2019-08-14 2019-08-14 Canteen card swiping device with voice recognition function and using method

Publications (2)

Publication Number Publication Date
CN110459216A true CN110459216A (en) 2019-11-15
CN110459216B CN110459216B (en) 2021-11-30

Family

ID=68486465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910748064.2A Active CN110459216B (en) 2019-08-14 2019-08-14 Canteen card swiping device with voice recognition function and using method

Country Status (1)

Country Link
CN (1) CN110459216B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1423802A (en) * 1999-11-18 2003-06-11 怀尔卡通信加拿大有限公司 Electronic system having variable functions
CN103117060A (en) * 2013-01-18 2013-05-22 中国科学院声学研究所 Modeling approach and modeling system of acoustic model used in speech recognition
CN103400461A (en) * 2013-07-22 2013-11-20 孙伟 POS (point-of-sale) machine, card service realization system and method
CN104656513A (en) * 2015-01-19 2015-05-27 北京联合大学 Intelligent showering behavior control system and method
CN105321111A (en) * 2015-11-26 2016-02-10 李伟 Convenient shopping system for community
US20160240190A1 (en) * 2015-02-12 2016-08-18 Electronics And Telecommunications Research Institute Apparatus and method for large vocabulary continuous speech recognition
CN106372891A (en) * 2016-08-23 2017-02-01 努比亚技术有限公司 Payment method and apparatus, and mobile terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱春山 (Zhu Chunshan): "基于Kaldi的语音识别的研究" [Research on Speech Recognition Based on Kaldi], China Master's Theses Full-text Database, Information Science and Technology Series *

Also Published As

Publication number Publication date
CN110459216B (en) 2021-11-30

Similar Documents

Publication Publication Date Title
CN110610707B (en) Voice keyword recognition method and device, electronic equipment and storage medium
CN108346427A (en) A kind of audio recognition method, device, equipment and storage medium
CN108281137A (en) A kind of universal phonetic under whole tone element frame wakes up recognition methods and system
CN110970018B (en) Speech recognition method and device
CN109887484A (en) A kind of speech recognition based on paired-associate learning and phoneme synthesizing method and device
CN110992932B (en) Self-learning voice control method, system and storage medium
CN109754790A (en) A kind of speech recognition system and method based on mixing acoustic model
CN107257996A (en) The method and system of environment sensitive automatic speech recognition
CN101345819B (en) Speech control system used for set-top box
WO2019062931A1 (en) Image processing apparatus and method
CN108899044A (en) Audio signal processing method and device
CN110534099A (en) Voice wakes up processing method, device, storage medium and electronic equipment
CN108109613A (en) For the audio training of Intelligent dialogue voice platform and recognition methods and electronic equipment
CN108735199B (en) Self-adaptive training method and system of acoustic model
CN102324232A (en) Method for recognizing sound-groove and system based on gauss hybrid models
CN107146615A (en) Audio recognition method and system based on the secondary identification of Matching Model
CN104123930A (en) Guttural identification method and device
CN106875944A (en) A kind of system of Voice command home intelligent terminal
CN110459216A (en) A kind of dining room brushing card device and application method with speech recognition
CN112347788A (en) Corpus processing method, apparatus and storage medium
CN115858747A (en) Clustering-combined Prompt structure intention identification method, device, equipment and storage medium
CN116343797A (en) Voice awakening method and corresponding device
CN113851113A (en) Model training method and device and voice awakening method and device
CN114283791A (en) Speech recognition method based on high-dimensional acoustic features and model training method
CN106981287A (en) A kind of method and system for improving Application on Voiceprint Recognition speed

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant