CN1760973A - Method of speech recognition based on qualitative mapping - Google Patents
- Publication number: CN1760973A
- Application number: CN200410067084 (CNA2004100670847A)
- Authority: CN (China)
- Legal status: Pending
Abstract
The method learns the image of the voice (the pattern) regarded as a set of points; for recognition, the features of the points are integrated into the features of the pattern. The method can therefore learn and recognize patterns quickly and with low difficulty. Using a conversion degree function to obtain the degree of similarity gives the generation and recognition processes a fuzziness that reflects the fuzziness of the human thinking process. The method generates actual, visible memory patterns. Compared with the prior art, the invention needs neither a large sample set nor a recognition system of complex structure. Its features are fast recognition speed and a high recognition rate for speech that has been learned.
Description
Technical field
The present invention relates to a speech recognition method, and in particular to a speech recognition method based on qualitative mapping.
Background technology
In the prior art, a dictation machine is usually used for speech recognition. A dictation machine is built on the topological method of a hidden Markov model (HMM) over an acoustic model and a language model. It performs large-vocabulary, speaker-independent, continuous speech recognition.
In the prior art, interactive methods are also commonly used to realize man-machine spoken dialogue. Such dialogue systems are usually oriented toward a narrow field with a limited vocabulary.
Both of these speech recognition approaches require a large sample set and a recognition system of complex structure; their recognition speed is slow and their recognition rate is low.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the prior art by providing a speech recognition method based on qualitative mapping. No speech recognition or dialogue system containing a large sample set and a complex structure needs to be built. The method is based on qualitative mapping and a conversion degree function. For speech that has been learned, recognition speed is fast and the recognition rate is high.
To achieve the above purpose, the technical scheme adopted by the present invention is: first, convert the speech into a waveform diagram; then, use the method of qualitative mapping to extract the pattern and recognize it as a whole; in the recognition process, use the conversion degree function to obtain the similarity.
The qualitative mapping used in pattern extraction stands in contrast to a quantitative mapping. The qualitative mapping method treats the pattern as a whole for both learning and recognition. In the learning process, the waveform diagram is memorized as a whole in a three-dimensional vector, one dimension of which represents the file sequence number. In the recognition process, the pattern to be recognized is treated as a whole, and a qualitative result is given according to the quantity-quality transformation relation: the similarities are compared, and the maximum similarity is taken as the result. A quantitative mapping, by contrast, is a quantity-to-quantity correspondence that yields a result for every quantity it measures; using such a quantitative mapping in this kind of speech recognition is clearly impractical, and also unnecessary.
A principal formula of qualitative mapping is:

τ(x, (α_i, β_i)) = x ⊥ (α_i, β_i) = τ_i(x)

where τ_i(x) is the truth value of the property p_i(o) corresponding to the quantity x, and ⊥ is the test operation of the relation "x ∈ (α_i, β_i)" (the horizontal stroke of "⊥" represents the interval (α_i, β_i), and the vertical stroke represents x lying within it). The operator ⊥ is also called the quantity-quality transformation operator, or the qualitative operator of the quality feature (property). Since "x ∈ (α_i, β_i)" is equivalent to "α_i < x < β_i", the operator ⊥ can be realized with the simplest arithmetic comparisons "≥" and "≤", or ">" and "<".
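The comparison-based realization of ⊥ noted above can be sketched in a few lines; a minimal illustration (the interval bounds and example values are hypothetical, not from the patent):

```python
def qualitative_mapping(x, alpha, beta):
    """Test operator '⊥': truth value of the relation x ∈ (alpha, beta).

    Returns True when the quantity x falls inside the qualitative
    criterion (alpha, beta), i.e. the object exhibits property p_i(o).
    """
    # As the text notes, ⊥ reduces to the simplest comparisons '>' and '<'.
    return alpha < x < beta

print(qualitative_mapping(5.0, 2.0, 8.0))  # inside the interval -> True
print(qualitative_mapping(9.0, 2.0, 8.0))  # outside the interval -> False
```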
The conversion degree function captures the fact that different quantities cause different degrees of quality transformation, a difference that is prevalent in attribute quantity-quality conversions. The conversion degree function is therefore introduced as a concept that can characterize this difference.
In general, the boundary points α_i and β_i of the qualitative criterion (α_i, β_i) correspond to two properties p_i(α_i), p_i(β_i) ∈ p_i(o). These are the properties within p_i(o) that transform most easily into another quality feature (property) p_j(o) or p_k(o), so they may be called the critical properties of the class p_i(o). The property p_i(ξ_i) corresponding to the midpoint ξ_i is the most stable within p_i(o): it transforms into other quality features least easily and best embodies the essence of the quality-feature class p_i(o). Hence p_i(ξ_i) may be taken as the intrinsic property of p_i(o), and ξ_i may be called the intrinsic point of p_i(o). If k_1(x) (for x < ξ_i) and k_2(x) (for x > ξ_i) denote the degree to which the property p_i(x) corresponding to x departs from p_i(ξ_i), then one may speak of the degree to which p_i(x) approaches the intrinsic property p_i(ξ_i), that is, the degree to which p_i(x) embodies its quality-feature class p_i(o) (or p_i(ξ_i)), or the degree of transformation into the quality-feature class p_i(o).
Definition of the conversion degree function η_i(x): since k_1(x), k_2(x) ∈ [0, 1], we have η_i(x) ∈ [−1, 1]. The mapping η: X × Γ → [−1, 1] is called the degree function with which p_i(x) embodies its quality-feature class p_i(ξ_i) if, for every (x, N(ξ_i, δ_i)) ∈ X × Γ, it satisfies:

η(x, ξ_i, δ_i) = |x − ξ_i| ⊥ δ_i = η_i(x)

Thus the mathematical essence of the conversion degree function η_i(x) is to compare |x − ξ_i| with the given limit value δ_i; the size of this ratio reflects the difference between the property p_i(x) corresponding to the quantity x and the intrinsic property p_i(ξ_i).
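The definition above can be made concrete in code. The patent only fixes the essence of η_i(x) (compare |x − ξ_i| with the limit value δ_i, with η_i(x) ∈ [−1, 1]); the specific form 1 − |x − ξ_i|/δ_i used below is an assumption for illustration:

```python
def conversion_degree(x, xi, delta):
    """Conversion degree eta_i(x): compares |x - xi| against the limit delta.

    The concrete form 1 - |x - xi| / delta, clamped below at -1, is an
    illustrative assumption; the patent only states that eta_i(x) compares
    |x - xi| with delta and lies in [-1, 1].
    """
    ratio = abs(x - xi) / delta
    return max(-1.0, 1.0 - ratio)

# At the intrinsic point xi the degree is maximal (1.0) and it falls off
# as x departs from xi.
print(conversion_degree(3.0, 3.0, 2.0))  # -> 1.0
print(conversion_degree(4.0, 3.0, 2.0))  # -> 0.5
```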
The pattern recognition referred to here is recognition of a set of patterns that share certain common features. A pattern can be regarded as a description of the quantities and structure of some object, and a set of patterns with common features can be called a pattern class. Therefore, the pattern recognition referred to in the present invention is recognition of pattern classes.
The advantages of the speech recognition method of the present invention are significant.
As described above, the present invention is based on pattern extraction and recognition by qualitative mapping (QM). This qualitative mapping recognition method treats the image of the voice (the pattern) as a set of points to be learned; during recognition, the features of the points are integrated into the features of the pattern. Patterns can therefore be learned and recognized quickly, with low learning difficulty. Using the conversion degree function to obtain the similarity gives the generation and recognition processes a fuzziness that reflects the fuzziness inherent in the human thinking process. The method of the present invention generates tangible, visible memory patterns, and does not require the large sample set and structurally complex recognition system demanded by the prior art. For speech that has been learned, the method not only recognizes quickly but also achieves a high recognition rate.
Description of drawings
Fig. 1 is the flow chart of the first step of the speech recognition method of the present invention.
Fig. 2 is the flow chart of the learning process, the second step of the speech recognition method of the present invention.
Fig. 3 is the flow chart of the recognition process, the third step of the speech recognition method of the present invention.
Embodiment
The method of the present invention is further described below in conjunction with the accompanying drawings.
As shown in Fig. 1, Fig. 2 and Fig. 3, the concrete steps of the method of the invention are:
(1) Collect the voice signal and convert the speech pattern into a waveform diagram;
(2) Learn from the set of points in the above waveform diagram to obtain the memory pattern of the voice;
(3) Recognize against the learned memory patterns: use the conversion degree function to obtain the similarity between the memory pattern and the reference pattern, compare the similarities, and take the maximum similarity as the recognition result.
The flow of Fig. 1 is the first step described above: converting the speech pattern into a waveform diagram, that is, turning the audio file into a waveform diagram. First the sound channel is determined; if the voice file is hexadecimal, it is read section by section in hexadecimal, and connecting the data points that were read constitutes the waveform diagram.
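As a sketch of this first step: the patent reads the voice file section by section and connects the values into a waveform diagram. The code below assumes a mono 16-bit PCM WAV file and the standard-library `wave` and `struct` modules; the file format is an assumption, since the patent only mentions hexadecimal reading.

```python
import struct
import wave

def speech_to_waveform_points(path):
    """Convert a speech file into a waveform point set of (t, amplitude).

    Sketch assuming a mono 16-bit PCM WAV file: read all frames, unpack
    them as signed 16-bit samples, and pair each sample with its index.
    Connecting these points yields the waveform diagram.
    """
    with wave.open(path, "rb") as wav:
        raw = wav.readframes(wav.getnframes())
    # Unpack 16-bit little-endian samples; each (index, value) pair is
    # one point of the waveform diagram.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(enumerate(samples))
```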
The flow of Fig. 2 is the second step described above, the learning process. In learning, the waveform diagram is memorized as a whole in a three-dimensional vector, one dimension of which represents the file sequence number. The concrete learning process is:
a. first, build the three-dimensional array;
b. then read the sample diagram;
c. then collect the valid points in the sample diagram and assign them to the three-dimensional array;
d. then obtain the key points (the highest and lowest points);
e. then determine the center row and the starting point;
f. finally, record the memory sample pattern.
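The learning steps a–f above can be sketched as follows; the array layout, the key-point rule (indices of the highest and lowest amplitudes), and the field names are illustrative assumptions, not taken from the patent:

```python
def learn_memory_pattern(file_id, points):
    """Learning-step sketch: store a waveform as a set of three-dimensional
    vectors [file_id, t, amplitude] and record its key points.

    `points` is the (t, amplitude) list produced when the speech file is
    converted into a waveform diagram.
    """
    # a/c. Build the three-dimensional array: assign the file sequence
    #      number plus the coordinates of every valid point.
    pattern = [[file_id, t, a] for t, a in points]
    amps = [a for _, a in points]
    # d. Key points: the indices of the highest and lowest amplitudes.
    key_points = (amps.index(max(amps)), amps.index(min(amps)))
    # e. Center row (here: mean amplitude) and the waveform's starting point.
    center = sum(amps) / len(amps)
    start = points[0]
    # f. Record the memory sample pattern.
    return {"pattern": pattern, "key_points": key_points,
            "center": center, "start": start}
```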
The flow of Fig. 3 is the third step described above, the recognition process, which is:
a. first select the sample pattern to be recognized, and read the diagram according to the memory sample patterns recorded in the learning process;
b. then assign the two dimensions (removing the file-sequence-number dimension from the three);
c. then apply the key-point vector comparison method, comparing the key-point vector of the reference sample pattern with that of the memory sample pattern;
d. if the key-point vectors of a memory sample pattern and the reference sample pattern have the required matching points, that is the recognition result;
e. if there is no matching point, preprocess the memory sample pattern with the conversion degree function and then obtain the similarity between the memory sample pattern and the reference sample pattern;
f. compare the similarities; the maximum similarity is the recognition result.
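Steps a–f of the recognition process can be sketched as follows. The tolerance `key_tol`, the limit value `delta`, and the data layout are assumptions, since the patent gives no concrete values:

```python
def recognize(reference, memories, key_tol=1, delta=10.0):
    """Recognition sketch: first try key-point vector matching (steps c/d),
    then fall back to a conversion-degree similarity (steps e/f).
    """
    def eta(x, xi):
        # Conversion degree: 1 at a perfect match, decreasing with |x - xi|.
        return max(-1.0, 1.0 - abs(x - xi) / delta)

    # c/d. Compare the key-point vector of the reference pattern with that
    #      of each memory pattern; a match within tolerance is the result.
    for name, mem in memories.items():
        if all(abs(r - m) <= key_tol
               for r, m in zip(reference["key_points"], mem["key_points"])):
            return name

    # e/f. No match: score each memory by the average conversion degree of
    #      corresponding amplitudes and take the maximum similarity.
    def similarity(mem):
        degs = [eta(r, m) for (_, r), (_, m)
                in zip(reference["points"], mem["points"])]
        return sum(degs) / len(degs)

    return max(memories, key=lambda name: similarity(memories[name]))
```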
The detailed process of recognition with the conversion degree function is: first obtain the amplitude ratio between the learned waveform diagrams and equalize the waveform amplitudes; then obtain the offsets of the starting point and the center row, and find the closest points and their weights; then find the nearest matching point of each point and assign values, obtaining and assigning the similarity; finally compare the similarities, and the maximum similarity is the recognition result.
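A minimal sketch of this detailed similarity computation, under stated assumptions: amplitude equalization by peak ratio, a nearest-match search over amplitudes, and the peak amplitude used as the limit value of the conversion degree function. None of these specifics are fixed by the patent text.

```python
def similarity_with_preprocessing(ref_amps, mem_amps):
    """Sketch of the detailed similarity computation: equalize amplitude
    scale, then score nearest matching points with the conversion degree.
    """
    # 1. Amplitude ratio between the two waveforms, used to bring the
    #    memory pattern onto the same amplitude scale as the reference.
    ref_peak = max(abs(a) for a in ref_amps)
    mem_peak = max(abs(a) for a in mem_amps)
    mem_scaled = [a * ref_peak / mem_peak for a in mem_amps]

    # 2. For each reference point, find the nearest matching point in the
    #    memory pattern and score it with the conversion degree function
    #    (limit value delta assumed to be the reference peak amplitude).
    delta = ref_peak
    degs = []
    for r in ref_amps:
        nearest = min(mem_scaled, key=lambda m: abs(m - r))
        degs.append(max(-1.0, 1.0 - abs(r - nearest) / delta))

    # 3. The similarity is the mean conversion degree over all points; the
    #    memory pattern with the maximum similarity is the result.
    return sum(degs) / len(degs)
```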
Claims (4)
1. A speech recognition method based on qualitative mapping, characterized in that the speech is first converted into a waveform diagram; the method of qualitative mapping is used to extract the pattern and recognize it as a whole; and in the recognition process the conversion degree function is used to obtain the similarity.
2. The speech recognition method based on qualitative mapping according to claim 1, characterized in that the concrete steps of the method are:
(1) collect the voice signal and convert the speech pattern into a waveform diagram;
(2) learn from the set of points in the above waveform diagram to obtain the memory pattern of the voice;
(3) recognize against the learned memory patterns: use the conversion degree function to obtain the similarity between the memory pattern and the reference pattern, compare the similarities, and take the maximum similarity as the recognition result.
3. The speech recognition method based on qualitative mapping according to claim 1 or 2, characterized in that the learning memorizes the waveform diagram as a whole in a three-dimensional vector, one dimension of which represents the file sequence number; the concrete learning process is:
a. first, build the three-dimensional array;
b. then read the sample diagram;
c. then collect the valid points in the sample diagram and assign them to the three-dimensional array;
d. then obtain the key points (the highest and lowest points);
e. then determine the center row and the starting point;
f. finally, record the memory sample pattern.
4. The speech recognition method based on qualitative mapping according to claim 1 or 2, characterized in that the recognition process is:
a. first select the sample pattern to be recognized, and read the diagram according to the memory sample patterns recorded in the learning process;
b. then assign the two dimensions (removing the file-sequence-number dimension from the three);
c. then apply the key-point vector comparison method, comparing the key-point vector of the reference sample pattern with that of the memory sample pattern;
d. if the key-point vectors of a memory sample pattern and the reference sample pattern have the required matching points, that is the recognition result;
e. if there is no matching point, preprocess the memory sample pattern with the conversion degree function and then obtain the similarity between the memory sample pattern and the reference sample pattern;
f. compare the similarities; the maximum similarity is the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2004100670847A CN1760973A (en) | 2004-10-12 | 2004-10-12 | Method of speech recognition based on qualitative mapping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1760973A true CN1760973A (en) | 2006-04-19 |
Family
ID=36707015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2004100670847A Pending CN1760973A (en) | 2004-10-12 | 2004-10-12 | Method of speech recognition based on qualitative mapping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1760973A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106888392A (en) * | 2017-02-14 | 2017-06-23 | 广东九联科技股份有限公司 | A kind of Set Top Box automatic translation system and method |
CN107393538A (en) * | 2017-07-26 | 2017-11-24 | 上海与德通讯技术有限公司 | Robot interactive method and system |
CN108520752A (en) * | 2018-04-25 | 2018-09-11 | 西北工业大学 | A kind of method for recognizing sound-groove and device |
CN108520752B (en) * | 2018-04-25 | 2021-03-12 | 西北工业大学 | Voiceprint recognition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |