CN108255956A

CN108255956A - Method and system for adaptively obtaining a lexicon based on historical data and machine learning

Info

Publication number: CN108255956A
Application number: CN201711391038.6A
Authority: CN
Inventors: 蔡劲松; 苏少炜; 陈孝良; 冯大航; 常乐
Original assignee: BEIJING WISDOM TECHNOLOGY Co Ltd
Current assignee: BEIJING WISDOM TECHNOLOGY Co Ltd; Beijing SoundAI Technology Co Ltd
Priority date: 2017-12-21
Filing date: 2017-12-21
Publication date: 2018-07-06
Anticipated expiration: 2037-12-21
Also published as: CN108255956B

Abstract

The present disclosure provides a method for adaptively obtaining a voice lexicon based on historical data and machine learning, comprising: step S1, performing semantic-plane sentence model classification on a speech recognition result to find the predicate core of the voice command and the arguments related to it; step S2, extracting the arguments from the voice command and selecting several lexicons by combining machine learning with the user's historical data; step S3, performing syntactic-plane word segmentation in the selected lexicons using a natural language processing method, evaluating the combined results from the multiple lexicon domains, taking the domain with the highest evaluation score as the optimal result, outputting the optimal result, and updating the user's historical data; and step S4, combining the optimal result with pragmatic-plane sentence class analysis to determine the final lexicon domain. By combining the user's historical lexicon usage with machine learning, the corresponding domain is obtained adaptively from the user's historical data, which improves flexibility and accuracy.

Description

Method and system for adaptively obtaining a lexicon based on historical data and machine learning

Technical field

The present disclosure relates to the field of artificial-intelligence voice interaction, and in particular to a method and system for adaptively obtaining a lexicon domain based on historical data and machine learning.

Background

Smart speakers are one direction being explored for human-computer interaction. As they have developed in recent years, manufacturers have been building ASR and using Chinese word segmentation for lexical understanding. Because of the characteristics of Chinese, analyzing complex sentences is complicated and time-consuming. ASR vendors therefore generally let users select a lexicon that matches the relevant domain, such as the music domain or the chat domain, to reduce the complexity of the algorithm.

However, most voice commands received by a smart speaker are simple sentences, that is, sentences with only one predicate-core structure, and most are imperative or interrogative sentences whose patterns are fairly limited. This makes it possible to optimize the recognition result and the corresponding domain according to the characteristics of the voice commands the smart speaker receives.

Lexicon domain selection in existing smart speakers lacks flexibility: the domain usually has to be specified manually, or filled in by hand through call parameters when applying for the ASR service. Moreover, once the lexicon domain has been specified, it cannot be adjusted according to the user's usage scenario and historical data.

Summary of the disclosure

(1) Technical problem to be solved

The present disclosure provides a method and system for adaptively obtaining a voice lexicon based on historical data and machine learning, to at least partly solve the technical problems set out above.

(2) Technical solution

According to one aspect of the present disclosure, there is provided a method for adaptively obtaining a voice lexicon based on historical data and machine learning, comprising: step S1, performing semantic-plane sentence model classification on a speech recognition result to find the predicate core of the voice command and the arguments related to it; step S2, extracting the arguments from the voice command and selecting several lexicons by combining machine learning with the user's historical data; step S3, performing syntactic-plane word segmentation in the selected lexicons using a natural language processing method, evaluating the combined results from the multiple lexicon domains, taking the domain with the highest evaluation score as the optimal result, outputting the optimal result, and updating the user's historical data; and step S4, combining the optimal result with pragmatic-plane sentence class analysis to determine the final lexicon domain.

In some embodiments of the present disclosure, the semantic-plane sentence model classification in step S1 uses a pattern matching algorithm to obtain the predicate core of the voice command and the arguments related to it.

In some embodiments of the present disclosure, step S2 comprises: separating the predicate core of the voice command from the arguments related to it and extracting the arguments; selecting several lexicons according to the arguments in combination with deep learning; and selecting, according to the user's historical data, the several lexicon domains the user uses most frequently.

In some embodiments of the present disclosure, in step S3, syntactic-plane word segmentation is performed in the selected lexicons using the N-shortest-path method from natural language processing; the shortest path is selected using either a greedy algorithm or Dijkstra's shortest-path algorithm.

In some embodiments of the present disclosure, in step S3, evaluating the combined results from the multiple lexicon domains comprises: evaluating the degree of association between words, and evaluating the result of the shortest-path algorithm. Updating the user's historical data comprises: updating the user's historical lexicon domain usage, and optimizing the weights of nouns in the lexicons using the historical data and machine learning.

In some embodiments of the present disclosure, the method further comprises, before step S1: step S0, in which an ASR engine receives the voice command issued by the user and performs speech recognition to obtain the speech recognition result.

According to another aspect of the present disclosure, there is provided a system for adaptively obtaining a voice lexicon domain based on historical data and machine learning, comprising: a semantic plane analysis module, which performs sentence model classification on the speech recognition result and sends the classification result to the several selected lexicons; a syntactic plane analysis module, which performs syntactic word segmentation in the selected lexicons, evaluates the combined results from the multiple lexicon domains, and outputs the optimal result; and a pragmatic plane analysis module, which combines the optimal result with pragmatic-plane sentence class analysis to determine the final lexicon domain.

In some embodiments of the present disclosure, the semantic plane analysis module comprises: a sentence model classification submodule, which performs semantic-plane sentence model classification on the recognition result to find the predicate core of the voice command and the arguments related to it; a machine selection submodule, which, after the arguments are extracted, sends them to the several lexicons selected in combination with deep learning; and a history selection submodule, which, according to the user's historical data, sends them to the several lexicon domains the user uses most frequently.

In some embodiments of the present disclosure, the syntactic plane analysis module comprises: a word segmentation submodule, which performs syntactic-plane word segmentation in the selected lexicons using the N-shortest-path method from natural language processing; and an evaluation and update submodule, which evaluates the degree of association between words and the result of the shortest-path algorithm, takes the domain with the highest evaluation score as the optimal result, updates the user's historical lexicon domain usage, and optimizes the weights of nouns in the lexicons using the historical data and machine learning.

In some embodiments of the present disclosure, the system further comprises an ASR engine for receiving the voice command issued by the user and performing speech recognition to obtain the speech recognition result.

(3) Advantageous effects

As can be seen from the above technical solution, the method and system of the present disclosure for adaptively obtaining a voice lexicon based on historical data and machine learning have at least one of the following advantages:

(1) Based on the user's historical lexicon usage, the most frequently used lexicons are found first, with machine learning as a supplement, so that the corresponding domain is obtained adaptively from the user's historical data. This avoids forcing the user into a specific domain through parameters or other means, which improves flexibility and accuracy;

(2) Sentence analysis is divided into three different levels, and the results from the syntactic, sentence-model (semantic), and pragmatic levels are combined, which reduces the complexity of the analysis and improves recognition accuracy.

Description of the drawings

Fig. 1 is a flowchart of the method for adaptively obtaining a voice lexicon domain based on historical data and machine learning according to an embodiment of the present disclosure.

Fig. 2 is a schematic structural diagram of the system for adaptively obtaining a voice lexicon domain based on historical data and machine learning according to an embodiment of the present disclosure.

Detailed description

The present disclosure provides a method and system for adaptively obtaining a voice lexicon domain based on historical data and machine learning. The disclosure classifies sentence types using a three-plane analysis method: syntactic, semantic, and pragmatic. Sentence types distinguished on the syntactic plane are called sentence patterns; for example, sentences are divided into subject-predicate sentences and non-subject-predicate sentences. Sentence types distinguished on the semantic plane are called sentence models; for example, sentences are divided into "predicate core + agent" and "predicate core + agent + object". Sentence types distinguished on the pragmatic plane are called sentence classes; for example, sentences are divided into declarative, interrogative, imperative, and so on.

Because the sentence types obtained from the three planes differ, combining the different levels makes the sentence analysis result and the domain selection more reasonable. The disclosure selects the lexicon domain for a user's voice command adaptively, so that neither the user nor the developer needs to specify the domain; the domain can be selected quickly from the historical data of the user's commands, supplemented by the domains obtained through machine learning. Analyzing the sentence on three levels reduces the complexity of the analysis.

Before describing the solution, it is helpful to define some specific terms.

ASR: Automatic Speech Recognition;

Predicate core: generally formed by the predicate of the sentence, or by the verb or adjective at the head of the predicate;

Argument: an obligatory semantic component associated with the predicate core.

To make the objectives, technical solutions, and advantages of the present disclosure clearer, the disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.

Some embodiments of the disclosure are described more fully below with reference to the accompanying drawings, in which some but not all embodiments are shown. Indeed, the various embodiments of the disclosure may be realized in many different forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that the disclosure satisfies applicable legal requirements.

In a first exemplary embodiment of the present disclosure, a method for adaptively obtaining a voice lexicon domain based on historical data and machine learning is provided. Fig. 1 is a flowchart of adaptive lexicon domain selection according to the first embodiment. As shown in Fig. 1, the method comprises:

Step S0: an ASR engine receives the voice command issued by the user and performs speech recognition to obtain a recognition result;

Step S1: semantic-plane sentence model classification is performed on the recognition result to find the predicate core of the voice command and the arguments related to it. The arguments are the agent, the object, and so on, and most are expressed by noun components;

Step S2: the arguments are extracted from the voice command and several lexicons are selected in combination with deep learning; at the same time, the several lexicon domains the user uses most frequently are selected according to the user's historical data;

Step S3: syntactic-plane word segmentation is performed in the selected lexicons using the N-shortest-path method from natural language processing, the combined results from the multiple lexicon domains are evaluated, the domain with the highest evaluation score is taken as the optimal result, the optimal result is output, and the user's historical data is updated;

Step S4: pragmatic-plane sentence class analysis is performed to determine the final lexicon domain.
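The steps above can be read as a simple pipeline. The following Python sketch only illustrates that control flow and is not the patented implementation: the function name `run_pipeline`, the callables passed to it, and the toy stand-ins are all hypothetical, and step S0 (speech recognition) is assumed to have already produced the text.

```python
from typing import Callable, Dict, List, Tuple


def run_pipeline(
    text: str,
    classify_semantic: Callable[[str], Tuple[str, List[str]]],
    select_lexicons: Callable[[List[str]], List[str]],
    score_domain: Callable[[str, str], float],
    classify_pragmatic: Callable[[str], str],
) -> Dict[str, object]:
    """Chain steps S1 to S4 on text that step S0 has already recognized."""
    core, arguments = classify_semantic(text)                  # S1
    candidates = select_lexicons(arguments)                    # S2
    scores = {d: score_domain(text, d) for d in candidates}    # S3
    best = max(scores, key=scores.get)
    sentence_class = classify_pragmatic(text)                  # S4
    return {"core": core, "arguments": arguments, "scores": scores,
            "domain": best, "sentence_class": sentence_class}


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_semantic(text: str) -> Tuple[str, List[str]]:
    verb, _, rest = text.partition(" ")
    return verb, ([rest] if rest else [])


def toy_select(arguments: List[str]) -> List[str]:
    return ["music", "navigation"]


def toy_score(text: str, domain: str) -> float:
    return 1.0 if domain == "music" and "song" in text else 0.1


def toy_pragmatic(text: str) -> str:
    return "interrogative" if text.endswith("?") else "imperative"


result = run_pipeline("play a song", toy_semantic, toy_select, toy_score, toy_pragmatic)
print(result["domain"], result["sentence_class"])
```

In this sketch step S4 simply reports the sentence class next to the best-scoring domain; how the two are combined into the final decision is left open here because the source does not specify it.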

Each step of the method of this embodiment is described in detail below.

In step S1, the semantic-plane sentence model classification uses a pattern matching algorithm to obtain the predicate core of the voice command and the arguments related to it.
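The pattern matching used in step S1 could look like the following Python sketch. It is a minimal illustration under stated assumptions: the English command templates, the `PATTERNS` table, and the name `classify_sentence_model` are hypothetical and not taken from the source, which targets Chinese commands and does not specify the matching rules.

```python
import re
from typing import List, NamedTuple, Optional


class SentenceModel(NamedTuple):
    name: str              # e.g. "core+agent+object"
    core: str              # the core verb of the command
    arguments: List[str]   # the noun-like elements attached to the core verb


# Hypothetical templates; a real system would use patterns for Chinese commands.
PATTERNS = [
    ("core+agent+object",
     re.compile(r"^(?P<agent>I) want to (?P<core>listen to|play|hear) (?P<object>.+)$")),
    ("core+object",
     re.compile(r"^(?P<core>play|open|navigate to) (?P<object>.+)$")),
]


def classify_sentence_model(text: str) -> Optional[SentenceModel]:
    """Return the first template that matches, or None if nothing matches."""
    for name, regex in PATTERNS:
        match = regex.match(text.strip())
        if match:
            groups = match.groupdict()
            arguments = [groups[k] for k in ("agent", "object") if k in groups]
            return SentenceModel(name, groups["core"], arguments)
    return None


print(classify_sentence_model("I want to listen to Looking Back Again"))
print(classify_sentence_model("play jazz"))
```

Because simple commands contain a single core verb, a short ordered list of templates is enough for this illustration; the first match wins.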

In step S2, the predicate core of the voice command is separated from the arguments related to it and the arguments are extracted. Several lexicons, for example music and navigation, are selected according to the arguments in combination with deep learning; and the several lexicon domains the user uses most frequently, for example chat and celebrities, are selected according to the historical data.
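A minimal Python sketch of this two-source selection follows. The learned model is replaced by a caller-supplied function, since the source does not describe the network; `select_candidate_lexicons`, `model_predict`, `toy_model`, and the `top_k` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List


def select_candidate_lexicons(
    arguments: Iterable[str],
    model_predict: Callable[[str], List[str]],
    usage_history: Counter,
    top_k: int = 2,
) -> List[str]:
    """Union of model-suggested domains and the user's top_k most used domains."""
    selected: List[str] = []
    # Domains suggested by the (stand-in) learned model for each extracted element.
    for argument in arguments:
        for domain in model_predict(argument):
            if domain not in selected:
                selected.append(domain)
    # Domains the user has used most often in the past.
    for domain, _count in usage_history.most_common(top_k):
        if domain not in selected:
            selected.append(domain)
    return selected


def toy_model(argument: str) -> List[str]:
    # Hypothetical stand-in for the deep learning model.
    return ["music"] if "song" in argument else ["navigation"]


history = Counter({"chat": 5, "celebrities": 3, "music": 1})
print(select_candidate_lexicons(["a song"], toy_model, history))
```

Keeping the candidate list small is the point of the step: segmentation in step S3 then runs only in these lexicons rather than in every available domain.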

In step S3, syntactic-plane word segmentation is performed in the selected lexicons using the N-shortest-path method from natural language processing. The principle of the N-shortest-path method is as follows. Each sentence generates a directed acyclic graph in which each character is a vertex of the graph and each edge represents a possible word. Each edge has a weight (initial value 1) representing the probability that the word occurs; preferably, the weight is the TF-IDF value obtained from the lexicon. In this directed acyclic graph, the N paths with the largest weight sum are found. In general there is more than one shortest path; the shortest path is selected either with a greedy algorithm, which yields a suboptimal solution, or with Dijkstra's shortest-path algorithm. Because the optimal and suboptimal shortest paths differ little in segmentation quality, the greedy algorithm is preferred for solving the path.
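The graph search can be sketched in Python as below. Two things in this sketch are assumptions rather than statements from the source: edge weights are treated as log-probability-like scores (negative numbers, higher is better), so that maximizing the weight sum favors fewer, more likely words; and the N best paths are found with a dynamic-programming pass over the graph instead of the greedy or Dijkstra variants named in the text. The names `n_best_segmentations`, `unknown_weight`, and `toy_lexicon` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def n_best_segmentations(
    text: str,
    lexicon: Dict[str, float],
    n: int = 2,
    unknown_weight: float = -10.0,
) -> List[Tuple[float, List[str]]]:
    """Return up to n segmentations of text with the highest total weight."""
    length = len(text)
    max_word_len = max((len(word) for word in lexicon), default=1)
    # best[i] holds up to n (score, words) pairs covering the prefix text[:i].
    best: List[List[Tuple[float, List[str]]]] = [[] for _ in range(length + 1)]
    best[0] = [(0.0, [])]
    for end in range(1, length + 1):
        candidates = []
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            if word in lexicon:
                weight = lexicon[word]
            elif end - start == 1:
                # Single-character fallback keeps the graph connected.
                weight = unknown_weight
            else:
                continue
            for score, words in best[start]:
                candidates.append((score + weight, words + [word]))
        best[end] = heapq.nlargest(n, candidates, key=lambda c: c[0])
    return best[length]


toy_lexicon = {"now": -1.0, "here": -1.0, "no": -1.25, "where": -1.5, "nowhere": -3.0}
for score, words in n_best_segmentations("nowhere", toy_lexicon, n=2):
    print(score, words)
```

Keeping the top n partial paths at every position is sufficient here because the graph is acyclic and path scores are additive.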

Evaluating the combined results from the multiple lexicon domains comprises: evaluating the degree of association between words (for example, if the voice command is "I want to listen to Jiang Yuhuan's Looking Back Again", the association between "Jiang Yuhuan" and "Looking Back Again" is higher than that between the single-character fragment "Yu" ("educate") and "Looking Back Again"), and evaluating the result of the shortest-path algorithm (for example, the word "Jiang Yuhuan" receives a higher score than "Yu" in the shortest-path algorithm).
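One way to combine the two evaluations is a weighted sum, sketched below in Python. The source does not give a scoring formula, so the linear combination, the `alpha` parameter, the co-occurrence tables, and the function names are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Cooccurrence = Dict[FrozenSet[str], float]


def association_score(words: List[str], cooccurrence: Cooccurrence) -> float:
    """Average association strength over all pairs of words in the segmentation."""
    pairs = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]]
    if not pairs:
        return 0.0
    total = sum(cooccurrence.get(frozenset((a, b)), 0.0) for a, b in pairs)
    return total / len(pairs)


def pick_best_domain(
    results: Dict[str, Tuple[float, List[str]]],
    cooccurrence_by_domain: Dict[str, Cooccurrence],
    alpha: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """results maps domain -> (path score in [0, 1], segmented words)."""
    scored = {}
    for domain, (path_score, words) in results.items():
        table = cooccurrence_by_domain.get(domain, {})
        scored[domain] = alpha * path_score + (1 - alpha) * association_score(words, table)
    best = max(scored, key=scored.get)
    return best, scored


results = {
    "music": (0.8, ["Jiang Yuhuan", "Looking Back Again"]),
    "chat": (0.6, ["Yu", "Looking Back Again"]),
}
cooccurrence_by_domain = {
    "music": {frozenset(("Jiang Yuhuan", "Looking Back Again")): 0.9},
    "chat": {},
}
best, scored = pick_best_domain(results, cooccurrence_by_domain)
print(best, scored)
```

With these toy numbers the music domain wins on both criteria, which mirrors the example in the text where the full name associates more strongly with the song title than a single-character fragment does.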

Updating the user's historical data comprises: updating the user's historical lexicon domain usage, and optimizing the weights of nouns in the lexicons using the historical data and machine learning.
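The bookkeeping side of this update can be sketched as follows in Python. The additive weight rule and the `step` parameter are placeholders for the unspecified machine learning optimization, and `update_history`, `usage`, and `noun_weights` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def update_history(
    usage: Counter,
    noun_weights: Dict[str, Dict[str, float]],
    domain: str,
    nouns: List[str],
    step: float = 0.1,
) -> None:
    """Record one use of domain and nudge the weights of the nouns that occurred."""
    usage[domain] += 1
    weights = noun_weights.setdefault(domain, {})
    for noun in nouns:
        # Weights start at 1, matching the initial edge weight described in the text.
        weights[noun] = weights.get(noun, 1.0) + step


usage: Counter = Counter()
noun_weights: Dict[str, Dict[str, float]] = {}
update_history(usage, noun_weights, "music", ["Looking Back Again"])
update_history(usage, noun_weights, "music", ["Looking Back Again"])
print(usage.most_common(1), noun_weights)
```

The usage counter feeds the history-based selection in step S2, and the noun weights feed the edge weights used for segmentation in step S3, so each command slightly shifts later decisions toward what the user actually asks for.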

In step S4, the pragmatic-plane sentence class analysis comprises: analyzing whether the sentence raises a question, issues a command, and so on, and then determining the final lexicon domain from the analysis result in combination with the optimal result.
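A rule-based Python sketch of this last step is shown below. The source does not state how the sentence class changes the domain decision, so the `overrides` table is purely an assumption, as are the English keyword lists and the function names.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists for English stand-in commands.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "which")
COMMAND_VERBS = ("play", "open", "stop", "navigate", "call")


def classify_sentence_class(text: str) -> str:
    """Label the text as interrogative, imperative, or declarative."""
    cleaned = text.strip().lower()
    first_word = cleaned.split(" ", 1)[0]
    if cleaned.endswith("?") or first_word in QUESTION_WORDS:
        return "interrogative"
    if first_word in COMMAND_VERBS:
        return "imperative"
    return "declarative"


def final_domain(
    best_domain: str,
    sentence_class: str,
    overrides: Dict[Tuple[str, str], str],
) -> str:
    """Keep the best-scoring domain unless an override exists for this sentence class."""
    return overrides.get((sentence_class, best_domain), best_domain)


# Hypothetical rule: questions about music go to a question-answering lexicon.
overrides = {("interrogative", "music"): "music_qa"}
print(final_domain("music", classify_sentence_class("Who sings this song"), overrides))
print(final_domain("music", classify_sentence_class("play some jazz"), overrides))
```

The override table keeps the pragmatic rule separate from the scoring in step S3, so the two can be tuned independently.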

The disclosure combines the user's historical data with machine learning to adaptively find the matching user lexicon domain, and dynamically updates the usage of the user's lexicon domains. For sentence analysis, the predicate core and arguments of the sentence are found on the sentence-model level, and the arguments are analyzed syntactically in the lexicons. Finally, the sentence-model and syntactic analyses are combined and the pragmatic analysis is performed. The disclosed method can improve the accuracy of lexicon domain selection and the recognition accuracy of voice commands.

This concludes the description of the method of the first embodiment of the present disclosure for adaptively obtaining a voice lexicon domain based on historical data and machine learning.

In a second exemplary embodiment of the present disclosure, a system for adaptively obtaining a voice lexicon domain based on historical data and machine learning is provided. Fig. 2 is a schematic structural diagram of the system according to an embodiment of the present disclosure. As shown in Fig. 2, the system comprises: an ASR engine, a semantic plane analysis module, a syntactic plane analysis module, and a pragmatic plane analysis module.

Each part of the system of this embodiment is described in detail below.

ASR engine: receives the voice command issued by the user and performs speech recognition to obtain a recognition result;

Semantic plane analysis module, which comprises:

Sentence model classification submodule: performs semantic-plane sentence model classification on the recognition result to find the predicate core of the voice command and the arguments related to it (the agent, the object, and so on, most of which are expressed by noun components);

Machine selection submodule: after the arguments are extracted, sends them to the several lexicons selected in combination with deep learning;

History selection submodule: after the arguments are extracted, also sends them, according to the user's historical data, to the several lexicon domains the user uses most frequently.

Syntactic plane analysis module, which comprises:

Word segmentation submodule: performs syntactic-plane word segmentation in the selected lexicons using the N-shortest-path method from natural language processing. The principle of the N-shortest-path method is as follows. Each sentence generates a directed acyclic graph in which each character is a vertex of the graph and each edge represents a possible word. Each edge has a weight (initial value 1) representing the probability that the word occurs; preferably, the weight is the TF-IDF value obtained from the lexicon. In this directed acyclic graph, the N paths with the largest weight sum are found. In general there is more than one shortest path; the shortest path is selected either with a greedy algorithm, which yields a suboptimal solution, or with Dijkstra's shortest-path algorithm. Because the optimal and suboptimal shortest paths differ little in segmentation quality, the greedy algorithm is preferred for solving the path;

Evaluation and update submodule: evaluates the combined results from the multiple lexicon domains, takes the domain with the highest evaluation score as the optimal result, outputs the optimal result, and updates the user's historical data;

Taking the domain with the highest evaluation score comprises: evaluating the degree of association between words (for example, if the voice command is "I want to listen to Jiang Yuhuan's Looking Back Again", the association between "Jiang Yuhuan" and "Looking Back Again" is higher than that between the single-character fragment "Yu" ("educate") and "Looking Back Again"), and evaluating the result of the shortest-path algorithm (for example, the word "Jiang Yuhuan" receives a higher score than "Yu" in the shortest-path algorithm).

Updating the user's historical data comprises: updating the user's historical lexicon domain usage, and optimizing the weights of nouns in the lexicons using the historical data and machine learning.

Pragmatic plane analysis module: performs pragmatic-plane sentence class analysis and determines the final lexicon domain.
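The TF-IDF weights preferred for the segmentation graph could be derived as in the following Python sketch. It assumes that each domain's tokenized corpus is treated as one document and uses the plain, unsmoothed TF-IDF formula; the source does not say which corpus or variant is used, and `tf_idf_by_domain` and the toy corpora are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_by_domain(domain_corpora: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Compute tf * log(N / df) for every word, treating each domain as one document."""
    n_docs = len(domain_corpora)
    doc_freq: Counter = Counter()
    for tokens in domain_corpora.values():
        doc_freq.update(set(tokens))  # count each word once per domain
    weights: Dict[str, Dict[str, float]] = {}
    for domain, tokens in domain_corpora.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[domain] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return weights


corpora = {
    "music": ["play", "song", "song"],
    "navigation": ["play", "route"],
}
weights = tf_idf_by_domain(corpora)
print(weights["music"])
```

A word that appears in every domain, such as "play" here, gets weight zero, while domain-specific words such as "song" get positive weight, which is what makes the value useful for telling domains apart.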

In order to achieve the purpose that brief description, in above-described embodiment 1, any technical characteristic narration for making same application is all And in this, without repeating identical narration.

So far, the second embodiment of the present disclosure is based on what historical data and machine learning adaptively obtained voice lexicon field The various pieces introduction of system finishes.

The embodiments of the present disclosure have now been described in detail with reference to the accompanying drawings. It should be noted that implementations not shown in the drawings or not described in the text are of a form known to a person of ordinary skill in the art and are not described in detail. In addition, the above definitions of the elements and methods are not limited to the specific structures, shapes, or modes mentioned in the embodiments, and a person of ordinary skill in the art may simply modify or replace them.

Furthermore, the word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.

In addition, unless steps are specifically described as sequential or must occur in sequence, the order of the above steps is not limited to that listed and may be changed or rearranged according to the desired design. The above embodiments may also be mixed and matched with one another or with other embodiments based on design and reliability considerations; that is, technical features of different embodiments may be freely combined to form further embodiments.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such systems is apparent from the description above. Moreover, the disclosure is not directed to any particular programming language. It should be understood that the content of the disclosure described herein may be implemented in a variety of programming languages, and that any description of a specific language above is given to disclose the best mode of the disclosure.

The disclosure may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. The component embodiments of the disclosure may be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device according to the embodiments of the disclosure. The disclosure may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the disclosure may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

Those skilled in the art will appreciate that the modules of the device in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into a single module, unit, or component, and may furthermore be divided into multiple submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Furthermore, in a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware.

Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various disclosed aspects, features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This manner of disclosure, however, should not be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the disclosed aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the disclosure.

The specific embodiments described above further explain the objectives, technical solutions, and advantageous effects of the present disclosure in detail. It should be understood that the foregoing is merely specific embodiments of the disclosure and is not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the disclosure shall fall within its scope of protection.

Claims

1. A method for adaptively obtaining a voice lexicon based on historical data and machine learning, comprising:

Step S1, performing semantic-plane sentence model classification on a speech recognition result to find the predicate core of the voice command and the arguments related to it;

Step S2, extracting the arguments from the voice command and selecting several lexicons by combining machine learning with the user's historical data;

Step S3, performing syntactic-plane word segmentation in the selected lexicons using a natural language processing method, evaluating the combined results from the multiple lexicon domains, taking the domain with the highest evaluation score as the optimal result, outputting the optimal result, and updating the user's historical data;

Step S4, combining the optimal result with pragmatic-plane sentence class analysis to determine the final lexicon domain.

2. The method according to claim 1, wherein the semantic-plane sentence model classification in step S1 uses a pattern matching algorithm to obtain the predicate core of the voice command and the arguments related to it.

3. The method according to claim 1, wherein step S2 comprises:

separating the predicate core of the voice command from the arguments related to it and extracting the arguments; selecting several lexicons according to the arguments in combination with deep learning; and selecting, according to the user's historical data, the several lexicon domains the user uses most frequently.

4. The method according to claim 1, wherein,

in step S3, syntactic-plane word segmentation is performed in the selected lexicons using the N-shortest-path method from natural language processing; the shortest path is selected using either a greedy algorithm or Dijkstra's shortest-path algorithm.

5. The method according to claim 1, wherein, in step S3,

evaluating the combined results from the multiple lexicon domains comprises: evaluating the degree of association between words, and evaluating the result of the shortest-path algorithm;

updating the user's historical data comprises: updating the user's historical lexicon domain usage, and optimizing the weights of nouns in the lexicons using the historical data and machine learning.

6. The method according to claim 1, further comprising, before step S1:

Step S0, in which an ASR engine receives the voice command issued by the user and performs speech recognition to obtain the speech recognition result.

7. A system for adaptively obtaining a voice lexicon domain based on historical data and machine learning, comprising:

a semantic plane analysis module, which performs sentence model classification on a speech recognition result and sends the classification result to the several selected lexicons;

a syntactic plane analysis module, which performs syntactic word segmentation in the selected lexicons, evaluates the combined results from the multiple lexicon domains, and outputs the optimal result;

a pragmatic plane analysis module, which combines the optimal result with pragmatic-plane sentence class analysis to determine the final lexicon domain.

8. The system according to claim 7, wherein the semantic plane analysis module comprises:

a sentence model classification submodule, which performs semantic-plane sentence model classification on the recognition result to find the predicate core of the voice command and the arguments related to it;

a machine selection submodule, which, after the arguments are extracted, sends them to the several lexicons selected in combination with deep learning;

a history selection submodule, which, according to the user's historical data, sends them to the several lexicon domains the user uses most frequently.

9. The system according to claim 7, wherein the syntactic plane analysis module comprises:

a word segmentation submodule, which performs syntactic-plane word segmentation in the selected lexicons using the N-shortest-path method from natural language processing;

an evaluation and update submodule, which evaluates the degree of association between words and the result of the shortest-path algorithm, takes the domain with the highest evaluation score as the optimal result, updates the user's historical lexicon domain usage, and optimizes the weights of nouns in the lexicons using the historical data and machine learning.

10. The system according to claim 7, further comprising:

an ASR engine for receiving the voice command issued by the user and performing speech recognition to obtain the speech recognition result.