Background
As one avenue of exploration for human-machine interaction, smart speakers have developed continuously in recent years, and manufacturers are all developing ASR and using Chinese word segmentation for lexical understanding. Owing to the characteristics of Chinese, analyzing complex sentences is both complicated and time-consuming. ASR vendors therefore generally let the user select a lexicon that matches the relevant domain, such as the music domain or the chat domain, to reduce the complexity of the algorithm.
However, most voice instructions received by a smart speaker are simple sentences, that is, they contain only one verb-core structure, and most are imperative or interrogative sentences whose patterns are fairly limited. This makes it possible to optimize the recognition result and the corresponding domain according to the characteristics of the voice instructions the smart speaker receives.
The selection of the voice lexicon domain in existing smart speakers lacks flexibility: the domain usually has to be specified manually, or filled in by hand through call parameters when applying for the ASR service. Moreover, once the lexicon domain has been specified, there is no way to adjust it according to the user's usage scenario and historical data.
Summary of the disclosure
(1) Technical problem to be solved
The present disclosure provides a method and system for adaptively obtaining a voice lexicon based on historical data and machine learning, so as to at least partially solve the technical problem set forth above.
(2) Technical solution
According to one aspect of the present disclosure, there is provided a method for adaptively obtaining a voice lexicon based on historical data and machine learning, comprising: Step S1, performing sentence-model classification on the semantic plane on a speech recognition result, and finding the verb core in the voice instruction and the arguments related to it; Step S2, extracting the arguments from the voice instruction, and selecting several lexicons with reference to machine learning and the user's historical data; Step S3, performing word segmentation on the syntactic plane in the selected lexicons using a natural language processing method, evaluating the results across the multiple lexicon domains, taking the domain with the highest evaluation score as the optimal result, outputting the optimal result, and at the same time updating the user's historical data; and Step S4, combining the optimal result with sentence-class analysis on the pragmatic plane to determine the final lexicon domain.
In some embodiments of the present disclosure, the sentence-model classification on the semantic plane in step S1 uses a pattern matching algorithm to obtain the verb core in the voice instruction and the arguments related to it.
In some embodiments of the present disclosure, step S2 comprises: separating the verb core in the voice instruction from the arguments related to it, extracting the arguments, and selecting several lexicons according to the arguments with reference to deep machine learning; and selecting, according to the user's historical data, the several lexicon domains the user uses most frequently.
In some embodiments of the present disclosure, in step S3, word segmentation on the syntactic plane is performed in the selected lexicons using the N-shortest-path method from natural language processing; the shortest path is selected using a greedy algorithm or Dijkstra's shortest path algorithm.
In some embodiments of the present disclosure, in step S3, evaluating the results across the multiple lexicon domains comprises: evaluating the degree of association between words, and evaluating the result of the shortest path algorithm; and updating the user's historical data comprises: updating the usage of the user's historical lexicon domains, and optimizing the weights of nouns in the lexicon through historical data and machine learning.
In some embodiments of the present disclosure, the method further comprises, before step S1: Step S0, in which an ASR recognition engine receives a voice instruction issued by the user and performs speech recognition to obtain a speech recognition result.
According to another aspect of the present disclosure, there is provided a system for adaptively obtaining a voice lexicon domain based on historical data and machine learning, comprising: a semantic plane analysis module, which performs sentence-model classification on a speech recognition result and sends the classification result to a plurality of selected lexicons; a syntactic plane analysis module, which performs syntactic word segmentation in the selected lexicons, evaluates the results across the multiple lexicon domains, and outputs the optimal result; and a pragmatic plane analysis module, which combines the optimal result with sentence-class analysis on the pragmatic plane to determine the final lexicon domain.
In some embodiments of the present disclosure, the semantic plane analysis module comprises: a sentence-model classification submodule, which performs sentence-model classification on the semantic plane on the recognition result and finds the verb core in the voice instruction and the arguments related to it; a machine selection submodule, which, after the arguments are extracted, sends them to several lexicons selected with reference to deep machine learning; and a history selection submodule, which, according to the user's historical data, sends them to the several lexicon domains the user uses most frequently.
In some embodiments of the present disclosure, the syntactic plane analysis module comprises: a segmentation submodule, which performs word segmentation on the syntactic plane in the selected lexicons using the N-shortest-path method from natural language processing; and an evaluation and update submodule, which evaluates the degree of association between words and the result of the shortest path algorithm, takes the domain with the highest evaluation score as the optimal result, updates the usage of the user's historical lexicon domains, and optimizes the weights of nouns in the lexicon through historical data and machine learning.
In some embodiments of the present disclosure, an ASR recognition engine is used to receive a voice instruction issued by the user and to perform speech recognition to obtain a speech recognition result.
(3) Beneficial effects
As can be seen from the above technical solution, the method and system of the present disclosure for adaptively obtaining a voice lexicon based on historical data and machine learning have at least one of the following beneficial effects:
(1) Based on the usage of the user's historical lexicons, frequently used lexicons are found first, with machine learning as a supplement, so that the corresponding domain is obtained adaptively from the user's historical data. This avoids forcing the user into a specific domain through parameters or other means, and thus greatly increases flexibility and accuracy;
(2) By dividing sentence analysis into three different planes and combining the analysis results for syntax, sentence model, and pragmatics, the complexity of the analysis is reduced and recognition accuracy is improved.
Detailed description of embodiments
The present disclosure provides a method and system for adaptively obtaining a voice lexicon domain based on historical data and machine learning. The disclosure classifies sentence types using an analysis method with three planes: syntax, semantics, and pragmatics. The sentence types distinguished on the syntactic plane are called sentence patterns; for example, sentences are divided into subject-predicate sentences and non-subject-predicate sentences. The sentence types distinguished on the semantic plane are called sentence models; for example, sentences are divided into "verb core + agent" and "verb core + agent + patient". The sentence types distinguished on the pragmatic plane are called sentence classes; for example, sentences are divided into declarative sentences, interrogative sentences, imperative sentences, and so on.
Because the sentence types obtained from the three planes differ, combining the different levels makes the sentence analysis result and the domain selection more reasonable. With the adaptive selection of the lexicon domain for user voice instructions in the present disclosure, the user or developer need not specify a domain; instead, the corresponding domain can be selected quickly according to the historical data of the user's instructions, supplemented by domains obtained through machine learning. Analyzing sentences on the three planes reduces the complexity of the analysis.
Before describing the solution, it is helpful to define some specific terms.
ASR: Automatic Speech Recognition;
Verb core: generally formed by the predicate of a sentence, or by the verb or adjective at the head of the predicate;
Argument: an obligatory semantic component associated with the verb core.
To make the purpose, technical solution, and advantages of the present disclosure clearer, the disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
Some embodiments of the present disclosure are described more fully below with reference to the accompanying drawings, in which some, but not all, embodiments are shown. Indeed, the various embodiments of the disclosure may be realized in many different forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that the disclosure satisfies applicable legal requirements.
In a first exemplary embodiment of the present disclosure, a method for adaptively obtaining a voice lexicon domain based on historical data and machine learning is provided. Fig. 1 is a flowchart of adaptive lexicon domain selection according to the first embodiment of the present disclosure. As shown in Fig. 1, the method of the present disclosure for adaptively obtaining a voice lexicon based on historical data and machine learning comprises:
Step S0: an ASR recognition engine receives a voice instruction issued by the user and performs speech recognition to obtain a recognition result;
Step S1: sentence-model classification on the semantic plane is performed on the recognition result to find the verb core in the voice instruction and the arguments related to it; the arguments are the agent, the patient, and so on, and are mostly expressed by noun components;
Step S2: the arguments are extracted from the voice instruction and several lexicons are selected with reference to deep machine learning; at the same time, the several lexicon domains the user uses most frequently are selected according to the user's historical data;
Step S3: word segmentation on the syntactic plane is performed in the selected lexicons using the N-shortest-path method from natural language processing, the results across the multiple lexicon domains are evaluated, the domain with the highest evaluation score is taken as the optimal result, the optimal result is output, and the user's historical data is updated at the same time;
Step S4: sentence-class analysis on the pragmatic plane is performed to determine the final lexicon domain.
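As a rough orientation, steps S1 to S4 can be sketched as a pipeline of pluggable functions. The function and parameter names below (`adapt_lexicon_domain`, `classify_model`, `select_lexicons`, `score_in_lexicon`, `classify_class`, `decide_domain`) are hypothetical and chosen only for illustration; the disclosure does not prescribe an interface. Step S0 (ASR) is assumed to have already produced `text`, and the history update of step S3 is left out for brevity.

```python
def adapt_lexicon_domain(text, classify_model, select_lexicons,
                         score_in_lexicon, classify_class, decide_domain):
    """Chain steps S1 to S4; every callable is a hypothetical plug-in."""
    # S1: semantic plane, find the verb core and its arguments.
    parsed = classify_model(text)
    # S2: pick candidate lexicons from the arguments (ML plus history).
    lexicons = select_lexicons(parsed["arguments"])
    # S3: segment and score the text in every candidate lexicon,
    # then keep the domain with the highest evaluation score.
    scores = {name: score_in_lexicon(text, name) for name in lexicons}
    optimal = max(scores, key=scores.get)
    # S4: pragmatic plane, combine the sentence class with the optimal result.
    return decide_domain(optimal, classify_class(text))
```

Passing the steps in as callables keeps the sketch independent of any particular classifier, segmenter, or storage backend.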
Each step of the method of this embodiment for adaptively obtaining a voice lexicon is described in detail below.
In step S1, the sentence-model classification on the semantic plane uses a pattern matching algorithm to obtain the verb core in the voice instruction and the arguments related to it.
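One simple way to realize such pattern matching is with regular expressions whose named groups capture the verb core and its arguments. The sketch below is only an illustration under stated assumptions: the two English templates, the group names, and the function `classify_sentence_model` are all hypothetical, and a real system would need templates for the target language.

```python
import re

# Hypothetical templates: each pairs a surface pattern with a sentence model.
PATTERNS = [
    (re.compile(r"^(?P<agent>I) want to (?P<core>listen to|play|call) (?P<patient>.+)$"),
     "verb core + agent + patient"),
    (re.compile(r"^(?P<core>play|stop|open) (?P<patient>.+)$"),
     "verb core + patient"),
]

def classify_sentence_model(text):
    """Return the sentence model, verb core and arguments, or None if no template fits."""
    for pattern, model in PATTERNS:
        match = pattern.match(text.strip())
        if match:
            groups = match.groupdict()
            core = groups.pop("core")  # what remains are the arguments
            return {"model": model, "core": core, "arguments": groups}
    return None
```

For example, `classify_sentence_model("play some jazz")` yields the verb core `"play"` with the single argument `{"patient": "some jazz"}`.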
In step S2, the verb core in the voice instruction is separated from the arguments related to it and the arguments are extracted. Several lexicons, for example music and navigation, are selected according to the arguments with reference to deep machine learning; and the several lexicon domains the user uses most frequently, for example chat and celebrities, are selected according to the historical data.
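The two selection sources in step S2 can be combined as a simple union. In the minimal sketch below, `predict_domains` is a hypothetical stand-in for the unspecified deep learning model (it is assumed to return domains ranked best first), `history` is assumed to be a per-user usage counter, and the cutoff `k` is an illustrative parameter.

```python
from collections import Counter

def select_lexicons(arguments, predict_domains, history, k=2):
    """Union of the model-suggested domains and the user's k most-used domains."""
    suggested = list(predict_domains(arguments))[:k]
    frequent = [domain for domain, _ in history.most_common(k)]
    # dict.fromkeys keeps the first occurrence of each domain, in order.
    return list(dict.fromkeys(suggested + frequent))
```

With `history = Counter({"chat": 5, "celebrities": 3, "music": 1})` and a model that suggests `["music", "navigation"]`, the candidates are music, navigation, chat, and celebrities.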
In step S3, word segmentation on the syntactic plane is performed in the selected lexicons using the N-shortest-path method from natural language processing. The principle of the N-shortest-path method is as follows: each sentence generates a directed acyclic graph in which each character is a vertex and each edge represents a possible word. Each edge has a weight (initial value 1) representing the probability that the word occurs; preferably, the weight is the TF-IDF value obtained from the lexicon. In this directed acyclic graph, the N paths with the largest weight sums are found. In general there is more than one shortest path; the shortest path is selected either with a greedy algorithm, which yields a suboptimal solution, or with Dijkstra's shortest path algorithm. Because the optimal and suboptimal solutions differ little in segmentation quality, the greedy algorithm is preferred for solving the path.
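The graph construction can be illustrated with a short sketch. It rests on several assumptions that are not taken from the disclosure: vertices are the positions between characters, an edge from i to j stands for the candidate word `sentence[i:j]`, edge cost is the negative log of a word probability (so the most probable segmentation is the lowest-cost path, rather than the largest weight sum), and unknown single characters get a small fallback probability. Instead of the greedy or Dijkstra search named above, it uses a plain dynamic program over the graph, which is possible because positions are already in topological order.

```python
import heapq
import math

def n_shortest_segmentations(sentence, word_probs, n=2, unknown_prob=1e-6):
    """Return up to n (cost, words) pairs, lowest cost first."""
    length = len(sentence)
    # best[j] keeps up to n cheapest paths that end at position j.
    best = [[] for _ in range(length + 1)]
    best[0] = [(0.0, [])]
    for j in range(1, length + 1):
        candidates = []
        for i in range(j):
            piece = sentence[i:j]
            if piece in word_probs:
                prob = word_probs[piece]
            elif j - i == 1:
                prob = unknown_prob  # fallback so every position is reachable
            else:
                continue
            cost = -math.log(prob)
            for prev_cost, words in best[i]:
                candidates.append((prev_cost + cost, words + [piece]))
        best[j] = heapq.nsmallest(n, candidates, key=lambda c: c[0])
    return best[length]
```

With the toy lexicon `{"now": 0.4, "here": 0.4, "no": 0.3, "where": 0.3, "nowhere": 0.05}`, the two cheapest segmentations of `"nowhere"` are `["now", "here"]` followed by `["no", "where"]`.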
Evaluating the results across the multiple lexicon domains comprises: evaluating the degree of association between words (for example, if the voice instruction is "I want to listen to Jiang Yuhuan's Turning One's Head Again", the association between "Jiang Yuhuan" and "Turning One's Head Again" is higher than that between the single-character fragment "Yu" ("educating") and "Turning One's Head Again"), and evaluating the result of the shortest path algorithm (for example, the word "Jiang Yuhuan" is rated higher than "Yu" by the shortest path algorithm);
Updating the user's historical data comprises: updating the usage of the user's historical lexicon domains, and optimizing the weights of nouns in the lexicon through historical data and machine learning.
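The evaluation and the history update can be sketched together. The text does not say how the two criteria are combined or how noun weights are optimized, so the weighted average controlled by `alpha` and the multiplicative boost controlled by `rate` are assumptions made purely for illustration, as are the function names and the score ranges.

```python
from collections import Counter

def evaluate_domains(results, association, alpha=0.5):
    """Score each domain and return (best_domain, scores).

    results maps domain -> (segmented words, path score in [0, 1]);
    association maps (word, next_word) -> association score in [0, 1].
    """
    scores = {}
    for domain, (words, path_score) in results.items():
        pairs = list(zip(words, words[1:]))
        if pairs:
            assoc = sum(association.get(p, 0.0) for p in pairs) / len(pairs)
        else:
            assoc = 0.0
        # Assumed combination rule: weighted average of the two criteria.
        scores[domain] = alpha * assoc + (1 - alpha) * path_score
    return max(scores, key=scores.get), scores

def update_history(history, noun_weights, domain, nouns, rate=0.1):
    """Record one more use of the domain and boost the nouns that matched."""
    history[domain] += 1
    for noun in nouns:
        # Weights start at 1, as in the text; the boost itself is assumed.
        noun_weights[noun] = noun_weights.get(noun, 1.0) * (1 + rate)
```

Here `history` is assumed to be a `Counter` keyed by domain, which can also serve as the source for choosing the most frequently used domains in step S2.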
In step S4, the sentence-class analysis on the pragmatic plane comprises: analyzing whether the sentence raises a question, issues a command, and so on, and finally determining the lexicon domain according to the analysis result in combination with the optimal result.
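A rule-based sketch of this last step follows. The cue words, the three class labels, and the `compatible` table that maps a sentence class to the domains it may select are all hypothetical; the disclosure does not specify how the sentence class and the optimal result are combined, so the fallback rule here is only one plausible reading.

```python
QUESTION_WORDS = ("what", "who", "where", "when", "why", "how")
COMMAND_WORDS = ("play", "stop", "open", "please")

def classify_sentence_class(text):
    """Very rough sentence-class guess based on surface cues."""
    lowered = text.strip().lower()
    if lowered.endswith("?") or lowered.startswith(QUESTION_WORDS):
        return "interrogative"
    if lowered.startswith(COMMAND_WORDS):
        return "imperative"
    return "declarative"

def final_domain(optimal_domain, sentence_class, compatible):
    """Keep the optimal domain unless the sentence class rules it out."""
    allowed = compatible.get(sentence_class, ())
    if not allowed or optimal_domain in allowed:
        return optimal_domain
    # Assumed fallback: first domain that suits this sentence class.
    return allowed[0]
```

For instance, an imperative such as "Play some jazz" keeps the music domain when the table lists music as compatible with imperatives.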
The present disclosure combines the user's historical data with machine learning to adaptively find the matching user lexicon domain, and dynamically updates the usage of the user's lexicon domains. For sentence analysis, the verb core and arguments of the sentence are found through sentence-model analysis, and the arguments are analyzed syntactically within the lexicons. Finally, the sentence-model and syntactic analyses are combined and the pragmatic analysis is performed. The method of the present disclosure can improve the accuracy of lexicon domain selection and the recognition accuracy of voice instructions.
This concludes the description of the method of the first embodiment of the present disclosure for adaptively obtaining a voice lexicon domain based on historical data and machine learning.
In a second exemplary embodiment of the present disclosure, a system for adaptively obtaining a voice lexicon domain based on historical data and machine learning is provided. Fig. 2 is a schematic structural diagram of the system for adaptively obtaining a voice lexicon domain based on historical data and machine learning according to an embodiment of the present disclosure. As shown in Fig. 2, the system comprises: an ASR recognition engine, a semantic plane analysis module, a syntactic plane analysis module, and a pragmatic plane analysis module.
Each part of the system of this embodiment for adaptively obtaining a voice lexicon domain based on historical data and machine learning is described in detail below.
The ASR recognition engine is used to receive a voice instruction issued by the user and to perform speech recognition to obtain a recognition result;
The semantic plane analysis module comprises:
a sentence-model classification submodule, which performs sentence-model classification on the semantic plane on the recognition result and finds the verb core in the voice instruction and the arguments related to it (the arguments are the agent, the patient, and so on, and are mostly expressed by noun components);
a machine selection submodule, which, after the arguments are extracted, sends them to several lexicons selected with reference to deep machine learning;
a history selection submodule, which, after the arguments are extracted, also sends them, according to the user's historical data, to the several lexicon domains the user uses most frequently.
The syntactic plane analysis module comprises:
a segmentation submodule, which performs word segmentation on the syntactic plane in the selected lexicons using the N-shortest-path method from natural language processing. The principle of the N-shortest-path method is as follows: each sentence generates a directed acyclic graph in which each character is a vertex and each edge represents a possible word. Each edge has a weight (initial value 1) representing the probability that the word occurs; preferably, the weight is the TF-IDF value obtained from the lexicon. In this directed acyclic graph, the N paths with the largest weight sums are found. In general there is more than one shortest path; the shortest path is selected either with a greedy algorithm, which yields a suboptimal solution, or with Dijkstra's shortest path algorithm. Because the optimal and suboptimal solutions differ little in segmentation quality, the greedy algorithm is preferred for solving the path;
an evaluation and update submodule, which evaluates the results across the multiple lexicon domains, takes the domain with the highest evaluation score as the optimal result, outputs the optimal result, and at the same time updates the user's historical data.
Taking the domain with the highest evaluation score comprises: evaluating the degree of association between words (for example, if the voice instruction is "I want to listen to Jiang Yuhuan's Turning One's Head Again", the association between "Jiang Yuhuan" and "Turning One's Head Again" is higher than that between the single-character fragment "Yu" ("educating") and "Turning One's Head Again"), and evaluating the result of the shortest path algorithm (for example, the word "Jiang Yuhuan" is rated higher than "Yu" by the shortest path algorithm).
Updating the user's historical data comprises: updating the usage of the user's historical lexicon domains, and optimizing the weights of nouns in the lexicon through historical data and machine learning.
The pragmatic plane analysis module performs sentence-class analysis on the pragmatic plane and determines the final lexicon domain.
For brevity, any description of a technical feature in Embodiment 1 above that applies equally here is incorporated by reference, and the same description is not repeated.
This concludes the description of each part of the system of the second embodiment of the present disclosure for adaptively obtaining a voice lexicon domain based on historical data and machine learning.
The embodiments of the present disclosure have thus been described in detail with reference to the accompanying drawings. It should be noted that implementations not shown or described in the drawings or in the text of the specification are in forms known to those of ordinary skill in the art and are not described in detail. In addition, the above definitions of the elements and methods are not limited to the specific structures, shapes, or modes mentioned in the embodiments, and those of ordinary skill in the art may simply modify or replace them.
Furthermore, the word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
In addition, unless steps are specifically described as sequential or must occur in sequence, the order of the above steps is not limited to that listed above and may be changed or rearranged according to the desired design. The above embodiments may also, based on design and reliability considerations, be mixed and matched with one another or with other embodiments; that is, technical features in different embodiments may be freely combined to form further embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Furthermore, the present disclosure is not directed to any particular programming language. It should be understood that the content of the disclosure described herein may be implemented in a variety of programming languages, and that any description of a specific language above is given to disclose the best mode of the disclosure.
The present disclosure may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. The component embodiments of the disclosure may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the related device according to embodiments of the present disclosure. The disclosure may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the disclosure may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may furthermore be divided into a plurality of submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Moreover, in a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various disclosed aspects, features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, disclosed aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the disclosure.
The specific embodiments described above further explain the purpose, technical solution, and beneficial effects of the present disclosure in detail. It should be understood that the foregoing are merely specific embodiments of the disclosure and are not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present disclosure shall fall within its scope of protection.