CN101645064A - Superficial natural spoken language understanding system and method thereof

Info

Publication number
CN101645064A
Authority
CN
China
Prior art keywords: entity, module, word, current, double word
Prior art date
Legal status
Granted
Application number
CN200810239727A
Other languages
Chinese (zh)
Other versions
CN101645064B (en)
Inventor
徐为群
包长春
李亚丽
潘接林
颜永红
Current Assignee
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Original Assignee
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Institute of Acoustics CAS and Beijing Kexin Technology Co Ltd
Priority to CN200810239727XA
Publication of CN101645064A
Application granted
Publication of CN101645064B
Status: Expired - Fee Related


Abstract

The invention relates to a superficial natural spoken language understanding system and a method thereof. The system comprises a preprocessing module, a lexical feature extraction module, a context feature extraction module, an entity fuzzy matching module, a maximum entropy classification module and a Viterbi search module. The method comprises the following steps: first, part of the spoken-language phenomena are removed by preprocessing to simplify subsequent processing; sentence features are then extracted, including basic word features, context word features and entity features; a maximum entropy classifier is used for recognition; the whole sentence is optimized to obtain the final classification label sequence; and finally, named entities are extracted from the label sequence. The system and method effectively and robustly handle disfluencies peculiar to spoken language, such as repetitions, pauses and filler words, as well as problems that may occur in speech recognition, such as recognition errors.

Description

Superficial natural spoken language understanding system and method
Technical field
The present invention relates to the field of natural language understanding, and in particular to a system and method for spoken language understanding.
Background technology
A spoken language understanding (Spoken Language Understanding, SLU) system converts an input character string into a corresponding semantic representation. In a spoken dialogue system, the speech recognition module converts the user's voice signal into a word sequence; the word sequence is then fed to the spoken language understanding module, which recognizes its semantics and passes the result to the dialogue management module; the dialogue management module in turn generates the information returned to the user, completing the dialogue with the user, as shown in Fig. 1.
Usually, the spoken language understanding task can be decomposed into a named entity recognition (Named Entity Recognition, NER) subtask and an intention (or speech act) recognition subtask. Named entity recognition and extraction are also widely used in natural language information extraction (Information Extraction, IE).
The usual approach to NER is to perform sequence classification on the input word sequence: by judging whether each word in the sequence belongs to some named entity, the named entities occurring in the whole sequence are determined. As shown in Fig. 2, X denotes the observation sequence and C denotes the classification label obtained at each observation point. Two entities are obtained from the label sequence C in the figure: "Zhongguancun" belongs to the class loc (location) and "Bank of China" belongs to the class bank.
Commonly used sequence classification techniques based on statistical learning include the hidden Markov model (Hidden Markov Model), the maximum entropy model (Maximum Entropy Model), the conditional random field (Conditional Random Field), the AdaBoost model, hybrid models, and so on.
The maximum entropy model is a discriminative model whose working principle is shown in Equation (1), where p(c_j | x_t) denotes the posterior probability that the observation x_t at time t of the input sequence x belongs to class c_j. For a given observation x_t, the class with the maximum posterior probability is the optimal class. Computing the optimal class at every point of the input sequence yields the corresponding classification label sequence C.
p(c_j \mid x_t) = \frac{1}{Z(x_t)} \exp\left[ \sum_{m=1}^{M} \lambda_m f_m(c_j, x_t) \right] \qquad (1)
Here {c_j, j = 1, ..., J} is the set of classes, and the entities of interest can be defined in advance according to the characteristics of the task domain; for example, in a local life-information search application one may define entities such as location, bank, restaurant, hotel, cinema, hospital, gas station and sports facility. {f_m, m = 1, ..., M} are predefined feature functions, {λ_m, m = 1, ..., M} are the parameters corresponding to the f_m, and Z(x_t) is a normalization factor.
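For illustration only, the following minimal Python sketch shows how the posterior in Equation (1) can be evaluated for a single observation; the class names, binary feature functions and weights are hypothetical and are not taken from the patent.

```python
import math

# Hypothetical classes, binary feature functions f_m(c, x) and weights lambda_m for Equation (1).
CLASSES = ["loc", "bank", "none"]

def f_bank_word(c, x):      # fires when a bank-related word is labelled "bank"
    return 1.0 if c == "bank" and x in {"Bank", "ATM"} else 0.0

def f_place_suffix(c, x):   # fires when a place-like word is labelled "loc"
    return 1.0 if c == "loc" and x.endswith("cun") else 0.0

FEATURES = [(f_bank_word, 1.2), (f_place_suffix, 0.8)]   # pairs (f_m, lambda_m)

def posterior(x_t):
    """p(c_j | x_t) = exp(sum_m lambda_m * f_m(c_j, x_t)) / Z(x_t), as in Equation (1)."""
    scores = {c: math.exp(sum(lam * f(c, x_t) for f, lam in FEATURES)) for c in CLASSES}
    z = sum(scores.values())          # normalization factor Z(x_t)
    return {c: s / z for c, s in scores.items()}

p = posterior("Zhongguancun")
print(p, max(p, key=p.get))           # the class with the largest posterior is the optimal class
```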
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention draws on the named entity recognition methods used in natural language understanding and designs a shallow processing scheme for spoken language understanding applications in which the input sentence contains a large amount of noise (spoken phenomena such as repetitions and filler words, ASR recognition errors, and so on). The scheme tolerates most of the noise and extracts the key information in the sentence, thereby achieving robust spoken language understanding.
To achieve the above object, the superficial natural spoken language understanding system and method provided by the invention are based on a maximum entropy model and realize shallow understanding of named entities and speech acts.
Within the framework of the maximum entropy statistical learning model, a carefully designed set of feature functions avoids the influence of noise as far as possible while making maximal use of the available context information, thereby improving recognition performance. The feature function set can be divided into the following three major parts:
1. Lexical information: the words that frequently occur in each entity class and in the non-entity class help discriminate the class of the current word. For example, "may I ask" is a common non-entity expression, so if the current word is "may I ask" it is easily judged to belong to the non-entity class.
2. Sentence context information: specific words often appear before and after an entity. For example, in "I am in Zhongguancun", keywords such as "in", "from" and "to" often occur in front of the location entity "Zhongguancun", so this context information guides the judgement of the entity class.
3. Entity knowledge information: the system uses an existing entity knowledge base and a restricted fuzzy matching algorithm to detect and identify, in the input sentence, named entities that are present in the knowledge base.
The superficial natural spoken language understanding system provided by the invention comprises:
a preprocessing module, which removes meaningless filler words from the input spoken sentence and outputs the preprocessed word sequence;
a lexical feature extraction module, which judges the entity class of each word in the preprocessed word sequence from the words that frequently occur in each entity class and in the non-entity class, and sends this entity class to the maximum entropy classification module;
a context feature extraction module, which judges the entity class of each word in the preprocessed word sequence from the specific words that appear before and after entities, and sends this entity class to the maximum entropy classification module;
an entity fuzzy matching module, which uses the entity knowledge base and a fuzzy matching algorithm to detect and identify, in the preprocessed word sequence, the entity classes present in the knowledge base, and sends these entity classes to the maximum entropy classification module;
a maximum entropy classification module, which computes the optimal class at each point from the input entity-class features, obtains the corresponding classification label sequence, and sends this label sequence to the Viterbi search module; and
a Viterbi search module, which searches for the optimal path over the input classification label sequence and finally obtains the named entities.
Wherein the lexical feature extraction module comprises:
a single-character feature inspection module, which uses the corpus to generate single-character feature functions and, according to these feature functions, inspects the single-character features in the word sequence and judges the entity class of the current character;
a bigram feature inspection module, which inspects the bigram features in the word sequence and, according to the generated bigram feature functions, judges the entity class of the current bigram; and
a common word and bigram inspection module, which obtains the sets of common words and bigrams of each class from the corpus by statistical methods, defines a common-word feature function for each named entity class, then obtains the common-word feature of the current word or bigram from these sets and feature functions, and judges the entity class of the current word or bigram.
Wherein the context feature extraction module further comprises:
a module that inspects the class label of the observation point preceding the current observation and uses this history information to help discriminate the class of the current observation point; and
a module that inspects whether the word in front of the current observation is a "trigger word" of some entity class, the appearance of a "trigger word" helping to decide whether the current observation belongs to that class.
Wherein the entity fuzzy matching module comprises:
a match offset point computation module, which applies an offset to the match point in the input word sequence and sends the result to the pre-matching module;
a pre-matching module, which matches the current input string against the entities of known classes: first, the first two character bigrams x'_0x'_1 and x'_1x'_2 of every entity in the known entity base are extracted to form a map data structure m_ne_bg; the keys of m_ne_bg are these bigrams, and the value of each key is a list of entities; then the current offset bigram x_{t+s}x_{t+s+1} is inspected, and if it is identical to some key of m_ne_bg (that is, to one of the first two bigrams of an entity), pre-matching succeeds and the entities to be matched are all entities in the corresponding value, where t denotes the current time and s the offset;
an entity matching degree computation module, which uses the Levenshtein minimum edit distance to define the matching degree measure and outputs the entity class with the highest matching degree, according to the formula
\rho = \frac{len - D_{Levenshtein}}{len}
where len is the length of the entity to be matched and D_{Levenshtein} is the Levenshtein minimum edit distance between the current string and the entity; for a complete match D_{Levenshtein} is 0 and ρ is 1, the highest matching degree, while for a complete mismatch D_{Levenshtein} is len and ρ is 0, the lowest matching degree; and
a ρ threshold setting module, which sets the threshold of ρ; a string whose matching degree is greater than or equal to the ρ threshold is recognized as the entity class.
The superficial natural spoken language understanding method provided by the invention comprises the following steps:
(1) Preprocess the input sentence:
the preprocessing module removes meaningless filler words from the input sentence and outputs the preprocessed word sequence.
(2) After preprocessing, extract features at each observation instant of the sentence, comprising the following substeps:
(21) Extraction of lexical features:
the lexical feature extraction module judges the entity class of each word in the preprocessed word sequence from the words that frequently occur in each entity class and in the non-entity class, and sends this entity class to the maximum entropy classification module.
(22) Extraction of context features:
the context feature extraction module judges the entity class of each word in the preprocessed word sequence from the specific words that appear before and after entities, and sends this entity class to the maximum entropy classification module.
(23) Fuzzy matching of entities:
the entity fuzzy matching module uses the entity knowledge base and a fuzzy matching algorithm to detect and identify, in the preprocessed word sequence, the entity classes present in the knowledge base, and sends these entity classes to the maximum entropy classification module.
(3) Maximum entropy classification:
the maximum entropy classification module computes the optimal class at each point from the input features, obtains the corresponding classification label sequence, and sends this label sequence to the Viterbi search module.
(4) Search for the optimal path and extract named entities:
the Viterbi search module searches for the optimal path over the input classification label sequence and finally obtains the named entities.
Wherein step (21) further comprises the following substeps:
(211) the single-character feature inspection module generates single-character feature functions from the corpus and, according to these feature functions, inspects the single-character features in the word sequence and judges the entity class of the current character;
(212) the bigram feature inspection module inspects the bigram features in the word sequence and, according to the generated bigram feature functions, judges the entity class of the current bigram;
(213) the common word and bigram inspection module obtains the sets of common words and bigrams of each class from the corpus by statistical methods, defines a common-word feature function for each named entity class, then obtains the common-word feature of the current word or bigram from these sets and feature functions, and judges the entity class of the current word or bigram.
Wherein step (22) further comprises the following substeps:
(221) inspect the class label of the observation point preceding the current observation and use this history information to help discriminate the class of the current observation point;
(222) inspect whether the word in front of the current observation is a "trigger word" of some entity class, the appearance of a "trigger word" helping to decide whether the current observation belongs to that class.
Wherein step (23) further comprises the following substeps:
(231) the match offset point computation module applies an offset to the match point in the input word sequence and sends the result to the pre-matching module;
(232) the pre-matching module matches the current input string against the entities of known classes: first, the first two character bigrams x'_0x'_1 and x'_1x'_2 of every entity in the known entity base are extracted to form the map data structure m_ne_bg; the keys of m_ne_bg are these bigrams, and the value of each key is a list of entities; then the current offset bigram x_{t+s}x_{t+s+1} is inspected, and if it is identical to some key of m_ne_bg (that is, to one of the first two bigrams of an entity), pre-matching succeeds and the entities to be matched are all entities in the corresponding value, where t denotes the current time and s the offset;
(233) the entity matching degree computation module uses the Levenshtein minimum edit distance to define the matching degree measure and outputs the entity class with the highest matching degree, according to the formula
\rho = \frac{len - D_{Levenshtein}}{len}
where len is the length of the entity to be matched and D_{Levenshtein} is the Levenshtein minimum edit distance between the current string and the entity; for a complete match D_{Levenshtein} is 0 and ρ is 1, the highest matching degree, while for a complete mismatch D_{Levenshtein} is len and ρ is 0, the lowest matching degree;
(234) the ρ threshold setting module sets the threshold of ρ; a string whose matching degree is greater than or equal to the ρ threshold is recognized as the entity class.
The invention has the advantages that:
The superficial natural spoken language understanding system and method provided by the invention effectively and robustly handle disfluencies peculiar to spoken language, such as repetitions, pauses and filler words, as well as problems that may occur in speech recognition, such as recognition errors, and are therefore better suited to spoken-language environments.
Description of drawings
Fig. 1 is the basic framework diagram of a prior-art spoken dialogue system;
Fig. 2 illustrates prior-art extraction of entities by sequence classification;
Fig. 3 is the framework diagram of the superficial natural spoken language understanding system of the present invention;
Fig. 4 is the recognition framework flowchart of the superficial natural spoken language understanding system and method of the present invention;
Fig. 5 is the flowchart of the fuzzy matching between the current string and an entity according to the present invention.
Embodiment
The superficial natural spoken language understanding system and method of the present invention are described in detail below with reference to a specific embodiment. The framework of the superficial natural spoken language understanding system of the present invention is shown in Fig. 3.
As shown in Fig. 4, the superficial natural spoken language understanding system of this embodiment comprises: a preprocessing module, a lexical feature extraction module, a context feature extraction module, an entity fuzzy matching module, a maximum entropy classification module, and a Viterbi search module.
The lexical feature extraction module comprises: a single-character feature inspection module, a bigram feature inspection module, and a common word and bigram inspection module.
The context feature extraction module comprises:
a module that inspects the class label of the observation point preceding the current observation and uses this history information to help discriminate the class of the current observation point; and
a module that inspects whether the word in front of the current observation is a "trigger word" of some entity class, the appearance of a "trigger word" helping to decide whether the current observation belongs to that class.
The entity fuzzy matching module comprises: a match offset point computation module, a pre-matching module, an entity matching degree computation module, and a ρ threshold setting module.
As shown in Fig. 4, the superficial natural spoken language understanding method of this embodiment comprises the following steps:
1. Preprocess the input sentence:
rule-based methods partly remove spoken repetitions such as "I want to ask, want to ask" and meaningless filler words such as "that" and similar hesitation words.
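As an illustration of this preprocessing step, the sketch below removes filler tokens and collapses immediately repeated short phrases by rule; the filler list and the three-token repetition window are hypothetical stand-ins for the patent's own language-specific rules.

```python
FILLERS = {"uh", "um", "that"}        # hypothetical filler list; the real rules are language-specific

def preprocess(tokens):
    """Rule-based cleanup: drop filler tokens, then collapse immediately repeated phrases (up to 3 tokens)."""
    toks = [t for t in tokens if t.lower() not in FILLERS]
    out, i = [], 0
    while i < len(toks):
        for n in (3, 2, 1):                                   # try the longest repeated phrase first
            if toks[i:i + n] == toks[i + n:i + 2 * n]:
                i += n                                        # skip the first copy of the repetition
                break
        else:
            out.append(toks[i])
            i += 1
    return out

print(preprocess("I want to ask want to ask um where is the bank".split()))
# -> ['I', 'want', 'to', 'ask', 'where', 'is', 'the', 'bank']
```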
2. After preprocessing, extract features at each observation instant of the sentence:
2.1 Extraction of lexical features
First, the commonly used maximum entropy lexical features are extracted: within a window of 5 words, the corresponding word and bigram features are inspected, and the following series of feature function templates P_m are defined.
The P_1 series inspects single-word features: P_{1,0} inspects the current word x_t, P_{1,-1} inspects the preceding word x_{t-1}, P_{1,-2} inspects x_{t-2}, P_{1,1} inspects the following word x_{t+1}, and P_{1,2} inspects x_{t+2}. Note that a feature template is not the same as a feature function; a template generates feature functions from the corpus. For template P_{1,0}, for example, if the current word is "I" and belongs to the non-entity class, the template generates a corresponding feature function; for the many different current words in the corpus, the template generates the corresponding feature functions. The templates described below behave in the same way.
The P_2 series inspects bigram features: P_{2,0} inspects the combination of the current word x_t and x_{t+1}, P_{2,-1} the combination of the preceding word x_{t-1} and the current word x_t, P_{2,-2} the combination of x_{t-2} and x_{t-1}, and P_{2,1} the combination of x_{t+1} and x_{t+2}.
The P_3 series inspects the common words and bigrams of each class; the sets of common words and bigrams of each class are obtained from the corpus by statistical methods, and during classification these sets are used to obtain the common-word feature of the current word or bigram. A common-word feature function P_{3,k} is defined for each named entity class. For example, P_{3,0} inspects whether the current word or bigram is a common non-entity word: if the current bigram is "hello", which is common in the non-entity class of the training data, the feature function corresponding to P_{3,0} takes the value 1. Analogously, each other P_{3,k} inspects whether a common word of its class occurs in the current word or bigram.
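A minimal sketch of how the P_1, P_2 and P_3 templates could expand into active features at one position is shown below; the common-word sets, window handling and feature-name format are illustrative assumptions rather than the patent's actual implementation.

```python
# Hypothetical common-word sets per class, as the P_3 templates assume; the real sets
# would be collected from the training corpus by frequency counting.
COMMON = {"none": {"hello", "may", "ask"}, "loc": {"district", "road"}}

def lexical_features(x, t):
    """Active lexical features at position t of sequence x (P_1 unigram, P_2 bigram, P_3 common-word)."""
    feats = []
    for off in (-2, -1, 0, 1, 2):                      # P_1 series: 5-token window around x_t
        if 0 <= t + off < len(x):
            feats.append(f"P1[{off}]={x[t + off]}")
    for off in (-2, -1, 0, 1):                         # P_2 series: adjacent pairs inside the window
        if 0 <= t + off and t + off + 1 < len(x):
            feats.append(f"P2[{off}]={x[t + off]}_{x[t + off + 1]}")
    for k, words in COMMON.items():                    # P_3 series: membership in a class's common set
        if x[t] in words:
            feats.append(f"P3[{k}]=1")
    return feats

print(lexical_features("may I ask where Zhongguancun is".split(), 0))
# -> ['P1[0]=may', 'P1[1]=I', 'P1[2]=ask', 'P2[0]=may_I', 'P2[1]=I_ask', 'P3[none]=1']
```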
2.2 Extraction of context features
The P_4 series inspects the class label c_{t-1} of the observation point preceding the current observation.
The P_5 series inspects whether the word preceding the current word or bigram is a common leading word of some particular class; for example, the leading words of the location class include "in", "from", "to", and the like.
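The sketch below illustrates the P_4 and P_5 context features under the same assumptions; the trigger-word table is hypothetical and stands in for lists learned from the corpus.

```python
# Hypothetical trigger (leading) words for the P_5 templates.
TRIGGERS = {"loc": {"in", "from", "to"}, "bank": {"deposit", "withdraw"}}

def context_features(x, t, prev_label):
    """Context features: P_4 uses the previous label, P_5 checks class trigger words before position t."""
    feats = [f"P4:prev_label={prev_label}"]            # P_4 series: label of the previous observation
    if t > 0:
        for cls, words in TRIGGERS.items():
            if x[t - 1].lower() in words:
                feats.append(f"P5:trigger_{cls}=1")    # a trigger word of class `cls` precedes x_t
    return feats

print(context_features("I am in Zhongguancun".split(), 3, "none"))
# -> ['P4:prev_label=none', 'P5:trigger_loc=1']
```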
2.3 Fuzzy matching of entities
The P_6 series uses an existing named entity list and performs forward fuzzy matching starting at the current word. If an entity of some class matches the string starting at the current word, the current word most likely belongs to the matched entity class. Because the input sentence contains noise and considerable spoken randomness, the matching must be fuzzy in order to strengthen robustness. Fig. 5 is the flowchart of the fuzzy matching.
Match offset point: in spoken Chinese, the name of a place or service facility usually carries a rather arbitrary prefix; for example, "Dinghao Electronics Plaza, Haidian District, Beijing", "Dinghao Electronics Plaza, Beijing" and "Dinghao Electronics Plaza, Haidian District" all refer to the same place, "Dinghao Electronics Plaza". The influence of these prefixes should be ignored during matching, so the match point in the sentence is offset before matching.
Pre-matching serves two main purposes. First, it checks whether matching is necessary: if pre-matching fails, no deeper matching is needed, which saves processing time. Second, through pre-matching the entities that need to be matched against the current string are limited to those for which pre-matching succeeded, which typically narrows the matching scope from 4000-5000 entities to about 10 on average, greatly reducing lookup and matching time. Pre-matching is implemented as follows: for every entity in the entity base, its first two character bigrams (x'_0x'_1 and x'_1x'_2) are extracted to form a map data structure m_ne_bg, whose keys are exactly these bigrams and whose values are entity lists; the first or second bigram of every entity in a list equals the list's key. During pre-matching, the offset current bigram x_{t+s}x_{t+s+1} is inspected (where t denotes the current time and s the offset); if this bigram is identical to some key of m_ne_bg, pre-matching succeeds and the entities to be matched are all entities in the corresponding value.
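The bigram-keyed pre-matching index can be sketched as follows, assuming a toy entity base with illustrative names; Latin letters stand in for the Chinese characters that form the bigrams x'_0x'_1 and x'_1x'_2.

```python
from collections import defaultdict

def build_prematch_index(entities):
    """m_ne_bg: map from each entity's first two character bigrams to the list of entities sharing them."""
    m_ne_bg = defaultdict(list)
    for e in entities:
        for bigram in (e[0:2], e[1:3]):        # x'0x'1 and x'1x'2
            if len(bigram) == 2:
                m_ne_bg[bigram].append(e)
    return m_ne_bg

def prematch(sentence, t, s, m_ne_bg):
    """Look up the offset bigram x_{t+s} x_{t+s+1}; an empty result means no deeper matching is needed."""
    bigram = sentence[t + s:t + s + 2]
    return m_ne_bg.get(bigram, [])

# Toy entity base (illustrative names only).
index = build_prematch_index(["Dinghao Plaza", "Hailong Plaza", "Bank of China"])
print(prematch("at Dinghao Plaza please", t=3, s=0, m_ne_bg=index))   # -> ['Dinghao Plaza']
```

Keying the index on only the first two bigrams of each entity is what narrows the candidate set (from thousands of entities to around ten on average, per the text) before the more expensive edit-distance comparison.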
The matching degree is defined using the Levenshtein minimum edit distance, as shown in Equation (2), where len is the length of the entity to be matched and D_{Levenshtein} is the Levenshtein minimum edit distance between the current string and the entity; for a complete match D_{Levenshtein} is 0 and ρ is 1, the highest matching degree, while for a complete mismatch D_{Levenshtein} is len and ρ is 0, the lowest matching degree.
\rho = \frac{len - D_{Levenshtein}}{len} \qquad (2)
From the computation of the matching degree it follows that, by setting a threshold on ρ, partially matched strings can be accepted as entities, which improves the robustness of the system. For example, consider a noisy input string in which part of the entity name is repeated and a filler word is inserted; for the knowledge-base entity "Zhongguancun Hailong Electronics Plaza", the edit distance between the two strings is 2 and the matching degree is 0.78. If the threshold is set to 0.7, the string is successfully recognized as the entity "Zhongguancun Hailong Electronics Plaza", which improves the robustness of the system against ungrammatical phenomena and spoken recognition errors. A series of feature functions related to the entity classes can be derived from this template.
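A sketch of the matching degree of Equation (2), using a standard Levenshtein edit distance, is given below; the candidate and entity strings are illustrative, while the 0.7 threshold mirrors the example above.

```python
def levenshtein(a, b):
    """Standard edit distance (insert/delete/substitute, each of cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def matching_degree(candidate, entity):
    """rho = (len - D_Levenshtein) / len, with len the length of the entity (Equation (2))."""
    return (len(entity) - levenshtein(candidate, entity)) / len(entity)

THRESHOLD = 0.7                                   # threshold used in the example in the text
rho = matching_degree("dinghhao", "dinghao")      # one spurious character, as a disfluency might cause
print(round(rho, 2), rho >= THRESHOLD)            # 0.86 True -> accepted as the entity
```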
3. Maximum entropy classification:
All features are fed into the maximum entropy classifier, and the formula
p(c_j \mid x_t) = \frac{1}{Z(x_t)} \exp\left[ \sum_{m=1}^{M} \lambda_m f_m(c_j, x_t) \right]
is used to obtain the posterior probabilities p(c_j | x_t) of all classes at each time instant.
4. Search for the optimal path:
The Viterbi algorithm is used to search for the optimal path over the input sequence:
\delta_t(j) = \max_{i \in S} \left\{ p(c_j \mid c_i)\, p(c_j \mid x_t)\, \delta_{t-1}(i) \right\} \qquad (3)
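A minimal sketch of the recursion in Equation (3), with backtracking to recover the optimal label sequence, is given below; the transition probabilities p(c_j | c_i) and the per-instant posteriors are hypothetical toy values.

```python
def viterbi(posteriors, transition, classes):
    """delta_t(j) = max_i transition[i][j] * posteriors[t][j] * delta_{t-1}(i), with backtracking."""
    delta = [{c: posteriors[0][c] for c in classes}]   # initialise with the first posteriors
    back = [{}]
    for t in range(1, len(posteriors)):
        delta.append({})
        back.append({})
        for j in classes:
            best_i = max(classes, key=lambda i: transition[i][j] * delta[t - 1][i])
            delta[t][j] = transition[best_i][j] * posteriors[t][j] * delta[t - 1][best_i]
            back[t][j] = best_i
    path = [max(delta[-1], key=delta[-1].get)]          # best final label
    for t in range(len(posteriors) - 1, 0, -1):
        path.append(back[t][path[-1]])                  # follow the back pointers
    return list(reversed(path))

classes = ["loc", "none"]
transition = {"loc": {"loc": 0.6, "none": 0.4}, "none": {"loc": 0.3, "none": 0.7}}  # p(c_j | c_i)
posteriors = [{"loc": 0.2, "none": 0.8}, {"loc": 0.7, "none": 0.3}, {"loc": 0.6, "none": 0.4}]
print(viterbi(posteriors, transition, classes))   # -> ['none', 'loc', 'loc']
```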
5. Extract the named entities from the resulting classification label sequence.

Claims (10)

1. A superficial natural spoken language understanding system, the system comprising:
a preprocessing module, which removes meaningless filler words from the input spoken sentence and outputs the preprocessed word sequence;
a maximum entropy classification module, which, for every point of the input sequence, selects features of that point, such as lexical features and context features, and obtains by the maximum entropy algorithm the posterior probability distribution over all possible classes of that point; and
a Viterbi search module, which searches for the optimal path over the lattice of classification labels output by the maximum entropy module, obtains the optimal classification label sequence, and thereby derives the named entities;
characterized in that the feature modules selected by the maximum entropy model comprise:
a lexical feature extraction module, which judges the entity class of each word in the preprocessed word sequence from the words that frequently occur in each entity class and in the non-entity class, and sends this entity class to the maximum entropy classification module;
a context feature extraction module, which judges the entity class of each word in the preprocessed word sequence from the specific words that appear before and after entities, and sends this entity class to the maximum entropy classification module; and
an entity fuzzy matching module, which uses the entity knowledge base and a fuzzy matching algorithm to detect and identify, in the preprocessed word sequence, the entity classes present in the knowledge base, and sends these entity classes to the maximum entropy classification module.
2. The superficial natural spoken language understanding system according to claim 1, characterized in that the lexical feature extraction module comprises:
a single-character feature inspection module, which uses the corpus to generate single-character feature functions and, according to these feature functions, inspects the single-character features in the word sequence and judges the entity class of the current character;
a bigram feature inspection module, which inspects the bigram features in the word sequence and, according to the generated bigram feature functions, judges the entity class of the current bigram; and
a common word and bigram inspection module, which obtains the sets of common words and bigrams of each class from the corpus by statistical methods, defines a common-word feature function for each named entity class, then obtains the common-word feature of the current word or bigram from these sets and feature functions, and judges the entity class of the current word or bigram.
3. The superficial natural spoken language understanding system according to claim 1, characterized in that the context feature extraction module further comprises:
a module that inspects the class label of the observation point preceding the current observation and uses this history information to help discriminate the class of the current observation point; and
a module that inspects whether the word in front of the current observation is a "trigger word" of some entity class, the appearance of a "trigger word" helping to decide whether the current observation belongs to that class.
4. The superficial natural spoken language understanding system according to claim 1, characterized in that the entity fuzzy matching module comprises:
a match offset point computation module, which applies an offset to the match point in the input word sequence and sends the result to the pre-matching module;
a pre-matching module, which matches the current input string against the entities of known classes: first, the first two character bigrams x'_0x'_1 and x'_1x'_2 of every entity in the known entity base are extracted to form a map data structure m_ne_bg; the keys of m_ne_bg are these bigrams, and the value of each key is a list of entities; then the current offset bigram x_{t+s}x_{t+s+1} is inspected, and if it is identical to one of the first two bigrams of some entity in m_ne_bg, pre-matching succeeds and the entities to be matched are all entities in the corresponding value, where t denotes the current time and s the offset; and
an entity matching degree computation module, which uses the Levenshtein minimum edit distance to define the matching degree measure and outputs the entity class with the highest matching degree, according to the formula
\rho = \frac{len - D_{Levenshtein}}{len}
where len is the length of the entity to be matched and D_{Levenshtein} is the Levenshtein minimum edit distance between the current string and the entity; for a complete match D_{Levenshtein} is 0 and ρ is 1, the highest matching degree, while for a complete mismatch D_{Levenshtein} is len and ρ is 0, the lowest matching degree.
5. The superficial natural spoken language understanding system according to claim 4, characterized in that the entity fuzzy matching module further comprises a ρ threshold setting module, which sets the threshold of ρ; a string whose matching degree is greater than or equal to the ρ threshold is recognized as the entity class.
6. A superficial natural spoken language understanding method, the method comprising the following steps:
(1) preprocessing the input sentence:
the preprocessing module removes meaningless filler words from the input sentence and outputs the preprocessed word sequence;
(2) after preprocessing, extracting features at each observation instant of the sentence, comprising the following substeps:
(21) extraction of lexical features:
the lexical feature extraction module judges the entity class of each word in the preprocessed word sequence from the words that frequently occur in each entity class and in the non-entity class, and sends this entity class to the maximum entropy classification module;
(22) extraction of context features:
the context feature extraction module judges the entity class of each word in the preprocessed word sequence from the specific words that appear before and after entities, and sends this entity class to the maximum entropy classification module;
(23) fuzzy matching of entities:
the entity fuzzy matching module uses the entity knowledge base and a fuzzy matching algorithm to detect and identify, in the preprocessed word sequence, the entity classes present in the knowledge base, and sends these entity classes to the maximum entropy classification module;
(3) maximum entropy classification:
the maximum entropy classification module computes the optimal class at each point from the input features, obtains the corresponding classification label sequence, and sends this label sequence to the Viterbi search module;
(4) searching for the optimal path and extracting named entities:
the Viterbi search module searches for the optimal path over the input classification label sequence and finally obtains the named entities.
7. The superficial natural spoken language understanding method according to claim 6, characterized in that step (21) further comprises the following substeps:
(211) the single-character feature inspection module generates single-character feature functions from the corpus and, according to these feature functions, inspects the single-character features in the word sequence and judges the entity class of the current character;
(212) the bigram feature inspection module inspects the bigram features in the word sequence and, according to the generated bigram feature functions, judges the entity class of the current bigram; and
(213) the common word and bigram inspection module obtains the sets of common words and bigrams of each class from the corpus by statistical methods, defines a common-word feature function for each named entity class, then obtains the common-word feature of the current word or bigram from these sets and feature functions, and judges the entity class of the current word or bigram.
8. The superficial natural spoken language understanding method according to claim 6, characterized in that step (22) further comprises the following substeps:
(221) inspecting the class label of the observation point preceding the current observation and using this history information to help discriminate the class of the current observation point;
(222) inspecting whether the word in front of the current observation is a "trigger word" of some entity class, the appearance of a "trigger word" helping to decide whether the current observation belongs to that class.
9. The superficial natural spoken language understanding method according to claim 6, characterized in that step (23) further comprises the following substeps:
(231) the match offset point computation module applies an offset to the match point in the input word sequence and sends the result to the pre-matching module;
(232) the pre-matching module matches the current input string against the entities of known classes: first, the first two character bigrams x'_0x'_1 and x'_1x'_2 of every entity in the known entity base are extracted to form a map data structure m_ne_bg; the keys of m_ne_bg are these bigrams, and the value of each key is a list of entities; then the current offset bigram x_{t+s}x_{t+s+1} is inspected, and if it is identical to one of the first two bigrams of some entity in m_ne_bg, pre-matching succeeds and the entities to be matched are all entities in the corresponding value, where t denotes the current time and s the offset;
(233) the entity matching degree computation module uses the Levenshtein minimum edit distance to define the matching degree measure and outputs the entity class with the highest matching degree, according to the formula
\rho = \frac{len - D_{Levenshtein}}{len}
where len is the length of the entity to be matched and D_{Levenshtein} is the Levenshtein minimum edit distance between the current string and the entity; for a complete match D_{Levenshtein} is 0 and ρ is 1, the highest matching degree, while for a complete mismatch D_{Levenshtein} is len and ρ is 0, the lowest matching degree.
10. The superficial natural spoken language understanding method according to claim 9, characterized in that step (23) further comprises the step in which the ρ threshold setting module sets the threshold of ρ, and a string whose matching degree is greater than or equal to the ρ threshold is recognized as the entity class.
CN200810239727XA (priority date 2008-12-16, filing date 2008-12-16): Superficial natural spoken language understanding system and method thereof. Status: Expired - Fee Related. Granted as CN101645064B.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810239727XA CN101645064B (en) 2008-12-16 2008-12-16 Superficial natural spoken language understanding system and method thereof


Publications (2)

Publication Number Publication Date
CN101645064A (en) 2010-02-10
CN101645064B CN101645064B (en) 2011-04-06

Family

ID=41656952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810239727XA Expired - Fee Related CN101645064B (en) 2008-12-16 2008-12-16 Superficial natural spoken language understanding system and method thereof

Country Status (1)

Country Link
CN (1) CN101645064B (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512195B (en) * 2015-11-26 2019-08-23 中国航空工业集团公司沈阳飞机设计研究所 A kind of product F MECA report analysis decision assistant method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1190773C (en) * 2002-09-30 2005-02-23 中国科学院声学研究所 Voice identifying system and compression method of characteristic vector set for voice identifying system
CN1223985C (en) * 2002-10-17 2005-10-19 中国科学院声学研究所 Phonetic recognition confidence evaluating method, system and dictation device therewith

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102236664A (en) * 2010-04-28 2011-11-09 百度在线网络技术(北京)有限公司 Retrieval system, retrieval method and information processing method based on semantic normalization
CN102236664B (en) * 2010-04-28 2016-04-13 百度在线网络技术(北京)有限公司 Based on the normalized searching system of semanteme, search method and information processing method
CN103412861A (en) * 2011-11-03 2013-11-27 微软公司 Knowledge based parsing
CN103186523A (en) * 2011-12-30 2013-07-03 富泰华工业(深圳)有限公司 Electronic device and natural language analyzing method thereof
CN103186523B (en) * 2011-12-30 2017-05-10 富泰华工业(深圳)有限公司 Electronic device and natural language analyzing method thereof
CN103020230A (en) * 2012-12-14 2013-04-03 中国科学院声学研究所 Semantic fuzzy matching method
CN105869640A (en) * 2015-01-21 2016-08-17 上海墨百意信息科技有限公司 Method and device for recognizing voice control instruction for entity in current page
CN104899413A (en) * 2015-04-14 2015-09-09 湘潭大学 Team competition result prediction method based on maximum entropy model
CN107924680A (en) * 2015-08-17 2018-04-17 三菱电机株式会社 Speech understanding system
CN105677636A (en) * 2015-12-30 2016-06-15 上海智臻智能网络科技股份有限公司 Information processing method and device for intelligent question-answering system
CN105702252A (en) * 2016-03-31 2016-06-22 海信集团有限公司 Voice recognition method and device
CN105702252B (en) * 2016-03-31 2019-09-17 海信集团有限公司 A kind of audio recognition method and device
CN105912625A (en) * 2016-04-07 2016-08-31 北京大学 Linked data oriented entity classification method and system
CN105912625B (en) * 2016-04-07 2019-05-14 北京大学 A kind of entity classification method and system towards link data
CN107767717A (en) * 2017-05-17 2018-03-06 青岛陶知电子科技有限公司 A kind of intelligent interaction tutoring system applied to foreign language teaching
CN107481718A (en) * 2017-09-20 2017-12-15 广东欧珀移动通信有限公司 Audio recognition method, device, storage medium and electronic equipment
CN110310623B (en) * 2017-09-20 2021-12-28 Oppo广东移动通信有限公司 Sample generation method, model training method, device, medium, and electronic apparatus
CN110310623A (en) * 2017-09-20 2019-10-08 Oppo广东移动通信有限公司 Sample generating method, model training method, device, medium and electronic equipment
CN107481718B (en) * 2017-09-20 2019-07-05 Oppo广东移动通信有限公司 Audio recognition method, device, storage medium and electronic equipment
CN108256063B (en) * 2018-01-15 2020-11-03 中国人民解放军国防科技大学 Knowledge base construction method for network security
CN108256063A (en) * 2018-01-15 2018-07-06 中国人民解放军国防科技大学 Knowledge base construction method for network security
CN108538291A (en) * 2018-04-11 2018-09-14 百度在线网络技术(北京)有限公司 Sound control method, terminal device, cloud server and system
US11127398B2 (en) 2018-04-11 2021-09-21 Baidu Online Network Technology (Beijing) Co., Ltd. Method for voice controlling, terminal device, cloud server and system
CN110428823A (en) * 2018-04-30 2019-11-08 现代自动车株式会社 Speech understanding device and the speech understanding method for using the device
CN109192211A (en) * 2018-10-29 2019-01-11 珠海格力电器股份有限公司 A kind of method, device and equipment of voice signal identification
CN110262664A (en) * 2019-06-21 2019-09-20 济南大学 A kind of intelligent interaction gloves with cognitive ability
CN110262664B (en) * 2019-06-21 2022-05-17 济南大学 Intelligent interactive glove with cognitive ability
CN110516253A (en) * 2019-08-30 2019-11-29 苏州思必驰信息科技有限公司 Chinese spoken language semantic understanding method and system
CN110516253B (en) * 2019-08-30 2023-08-25 思必驰科技股份有限公司 Chinese spoken language semantic understanding method and system
WO2021218564A1 (en) * 2020-04-29 2021-11-04 北京字节跳动网络技术有限公司 Semantic understanding method and apparatus, and device and storage medium
US11776535B2 (en) 2020-04-29 2023-10-03 Beijing Bytedance Network Technology Co., Ltd. Semantic understanding method and apparatus, and device and storage medium
WO2022016580A1 (en) * 2020-07-21 2022-01-27 南京智金科技创新服务中心 Intelligent voice recognition method and device

Also Published As

Publication number Publication date
CN101645064B (en) 2011-04-06


Legal Events

C06, PB01: Publication
C10, SE01: Entry into force of request for substantive examination
C14, GR01: Grant of patent or utility model
CF01, EXPY: Termination of patent right due to non-payment of annual fee (granted publication date: 2011-04-06; termination date: 2015-12-16)