CN110489517A - Automatic learning method and system for a virtual assistant
- Publication number
- CN110489517A (application CN201810436639.2A)
- Authority
- CN
- China
- Prior art keywords
- corpus
- data
- model
- intention
- vocabulary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
An automatic learning method and system for a virtual assistant. The automatic learning method includes: receiving audio input and recognizing the audio to form corpus data; analyzing the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data; performing functional scenario analysis on the language feature information according to functional contextual information to determine the operation corresponding to one of the intentions; if the functional scenario analysis cannot determine the operation corresponding to one of the intentions, performing word segmentation on the corpus data; determining, from the word segmentation result, whether there is a new term or new corpus data; if there is a new term, updating the natural language processing model according to the meaning of the new term, and if there is new corpus data, updating the functional scenario analysis according to the intention of the new corpus data. As a result, users can work with an ERP system more quickly and conveniently.
Description
Technical field
This disclosure relates to an automatic learning method and system, and in particular to an automatic learning method and system for a virtual assistant.
Background technique
An enterprise resource planning (ERP) system is a management platform built on information technology that supports decision-making by an enterprise's management. It manages the enterprise's flows of people, goods, information, and cash in a unified way so that the enterprise's resources are used to the fullest extent. An ERP system covers three broad functional areas (production control, logistics management, and financial management), so ERP systems are very large.
Applying a virtual assistant to an ERP system helps users interact with the large ERP system more quickly and saves the time they spend using it. However, because each user's habits in using the ERP system differ, the virtual assistant sometimes cannot understand a user's question, which instead makes the ERP system harder to use.
Summary of the invention
The main object of the present invention is to provide an automatic learning method and system for a virtual assistant that give the virtual assistant an automatic learning function, so that while communicating with a user it automatically learns the user's speaking habits or the specialized vocabulary of the user's industry, allowing the user to use the ERP system more quickly and conveniently.
To achieve the above object, a first aspect of this disclosure provides an automatic learning method for a virtual assistant. The method comprises the following steps: receiving audio input and recognizing the audio to form corpus data; analyzing the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data, wherein the language feature information includes multiple intentions, the probabilities corresponding to the multiple intentions, and multiple vocabulary words; performing functional scenario analysis on the language feature information according to functional contextual information to determine the operation corresponding to one of the multiple intentions; if the functional scenario analysis cannot determine the operation corresponding to one of the multiple intentions, performing word segmentation on the corpus data; determining, from the word segmentation result, whether there is a new term or new corpus data; if there is a new term, updating the natural language processing model according to the meaning of the new term, and if there is new corpus data, updating the functional scenario analysis according to the intention of the new corpus data; wherein the operation is one of a query data operation and an execute instruction operation.
According to an embodiment of this disclosure, the method further includes: generating a system domain vocabulary set according to a working knowledge database and a domain knowledge database; forming a critical entities set from the system domain vocabulary set and multiple service application parameters, the critical entities set including multiple system domain vocabulary words; classifying each of multiple training corpora as one of the query data operation and the execute instruction operation; distinguishing, according to the categories in the enterprise database, the intentions of the training corpora corresponding to the query data operation to form multiple query data operation intentions, and distinguishing, according to the service behaviors provided by the enterprise resource system, the intentions of the training corpora corresponding to the execute instruction operation to form multiple execute instruction operation intentions; establishing models of the query data operation intentions and models of the execute instruction operation intentions; establishing the global database according to the critical entities set, the models of the query data operation intentions, and the models of the execute instruction operation intentions; identifying multiple first probabilities with which the system domain vocabulary words in the critical entities set occur in the training corpora, analyzing, by means of the identified system domain vocabulary words, multiple sentence pattern structures of the training corpora and multiple associations among the system domain vocabulary words, and establishing a common lexicon model according to the first probabilities and the associations; and analyzing multiple second probabilities with which the system domain vocabulary words occur in the query data operation intentions and the execute instruction operation intentions, and establishing a common semanteme model according to the sentence pattern structures and the second probabilities.
According to an embodiment of this disclosure, the method further includes: classifying the data in a historical database by relationship strength using a classifier to generate a functional situation model; and performing word segmentation and analysis on the training corpora, and generating a functional lexicon model according to the data in the historical database.
According to an embodiment of this disclosure, the functional scenario analysis further includes: comparing the corpus data and the functional contextual information with the functional situation model to generate a functional situation identification result; and determining, according to the functional situation identification result, which of the query data operation and the execute instruction operation corresponds to one of the multiple intentions.
According to an embodiment of this disclosure, the word segmentation processing further includes: segmenting the corpus data according to the functional lexicon model to generate multiple segments; and calculating the frequency of each segment.
According to an embodiment of this disclosure, the method further includes: determining whether the frequency of each segment calculated by the word segmentation processing is lower than a threshold; if the frequency of one of the segments is lower than the threshold, treating that segment as the new term and receiving a definition of the new term to update the common lexicon model and the common semanteme model; and if the frequencies of all the segments are higher than the threshold, treating the corpus data as the new corpus data and receiving the intention of the new corpus data to update the functional situation model.
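The threshold test described in this embodiment can be illustrated with a minimal Python sketch. The function name, the use of a frequency table built from earlier data, and the handling of a frequency exactly equal to the threshold are assumptions made for illustration only; the disclosure does not specify them.

```python
from collections import Counter


def classify_unknown_utterance(segments, segment_frequency, threshold):
    """Decide whether unrecognized corpus data contains a new term.

    segments: the segments produced by word segmentation (hypothetical input).
    segment_frequency: mapping from segment to observed frequency (assumed
        here to come from previously seen data).
    threshold: frequencies strictly below this value mark a segment as new.
    """
    # Segments never seen before count as frequency 0.
    rare = [s for s in segments if segment_frequency.get(s, 0) < threshold]
    if rare:
        # At least one low-frequency segment: treat those segments as new terms.
        return "new_term", rare
    # Every segment is familiar, so the utterance as a whole is new corpus data.
    return "new_corpus", []


# Illustrative frequency table and utterance (made-up values).
history = Counter({"please": 120, "find": 80, "packing": 40, "slip": 40, "shipment": 1})
result = classify_unknown_utterance(["please", "find", "shipment", "slip"], history, 5)
```

In this sketch a "new_term" result would be followed by asking for a definition of each rare segment, and a "new_corpus" result by asking for the intention of the utterance, matching the two update paths described above.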
According to an embodiment of this disclosure, the method further includes: determining whether the new corpus data is a common corpus and, if so, updating the system domain vocabulary set according to the new corpus data; and updating the system domain vocabulary set according to the new term.
According to an embodiment of this disclosure, analyzing the corpus data using the natural language processing model further includes: using the common lexicon model to identify whether the corpus data contains system domain vocabulary words in the critical entities set, setting the identification result as the multiple vocabulary words, and analyzing the probability with which the vocabulary words occur; analyzing the sentence pattern structure of the corpus data according to the vocabulary words; and using the common semanteme model to identify, according to the probability with which the vocabulary words occur and the sentence pattern structure of the corpus data, the multiple intentions of the corpus data and the probabilities corresponding to the intentions.
A second aspect of this disclosure provides an automatic learning system for a virtual assistant, connected to an enterprise database and to an enterprise resource system, and comprising a processor, a storage device, and an input/output device. The storage device is electrically connected to the processor and stores a global database, a working knowledge database, a domain knowledge database, and a historical database. The input/output device is electrically connected to the processor and provides an interface for inputting audio. The processor includes a voice identification module, a corpus analysis module, a situation identification module, an unknown corpus judgment module, and an update information module. The voice identification module recognizes the audio to form corpus data. The corpus analysis module is electrically connected to the voice identification module and analyzes the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data, wherein the language feature information includes multiple intentions, the probabilities corresponding to the intentions, and multiple vocabulary words. The situation identification module is electrically connected to the corpus analysis module and performs functional scenario analysis on the language feature information according to functional contextual information to determine the operation corresponding to one of the intentions. The unknown corpus judgment module is electrically connected to the situation identification module and, when the situation identification module cannot identify the operation corresponding to one of the intentions, performs word segmentation on the corpus data and determines from the word segmentation result whether there is a new term or new corpus data. The update information module is electrically connected to the unknown corpus judgment module; when a new term is generated, it updates the natural language processing model according to the meaning of the new term, and when new corpus data is generated, it updates the functional scenario analysis according to the intention of the new corpus data. The operation is one of a query data operation and an execute instruction operation.
According to an embodiment of this disclosure, the processor further includes: a training module, electrically connected to the corpus analysis module, which generates a system domain vocabulary set according to the working knowledge database and the domain knowledge database, forms a critical entities set (including multiple system domain vocabulary words) from the system domain vocabulary set and multiple service application parameters, classifies each of multiple training corpora as one of the query data operation and the execute instruction operation, distinguishes, according to the categories in the enterprise database, the intentions of the training corpora corresponding to the query data operation to form multiple query data operation intentions, and distinguishes, according to the service behaviors provided by the enterprise resource system, the intentions of the training corpora corresponding to the execute instruction operation to form multiple execute instruction operation intentions; a model establishing module, electrically connected to the training module, which establishes models of the query data operation intentions and models of the execute instruction operation intentions, and establishes the global database according to the critical entities set, the models of the query data operation intentions, and the models of the execute instruction operation intentions; a lexicon model establishing module, electrically connected to the model establishing module, which identifies multiple first probabilities with which the system domain vocabulary words in the critical entities set occur in the training corpora, analyzes, by means of the identified system domain vocabulary words, multiple sentence pattern structures of the training corpora and multiple associations among the system domain vocabulary words, and establishes a common lexicon model according to the first probabilities and the associations; and a semanteme model establishing module, electrically connected to the model establishing module, which analyzes multiple second probabilities with which the system domain vocabulary words occur in the query data operation intentions and the execute instruction operation intentions, and establishes a common semanteme model according to the sentence pattern structures and the second probabilities.
According to an embodiment of this disclosure, the processor further includes: a situation training module, electrically connected to the situation identification module, which classifies the data in the historical database by relationship strength using a classifier to generate a functional situation model; and a vocabulary training module, electrically connected to the unknown corpus judgment module, which performs word segmentation and analysis on the training corpora and generates a functional lexicon model according to the data in the historical database.
According to an embodiment of this disclosure, the situation identification module further compares the corpus data and the functional contextual information with the functional situation model to generate a functional situation identification result, and determines, according to the functional situation identification result, which of the query data operation and the execute instruction operation corresponds to one of the multiple intentions.
According to an embodiment of this disclosure, the unknown corpus judgment module further segments the corpus data according to the functional lexicon model to generate multiple segments, and calculates the frequency of each segment.
According to an embodiment of this disclosure, the update information module further determines whether the frequency of each segment calculated by the word segmentation processing is lower than a threshold; if the frequency of one of the segments is lower than the threshold, that segment is the new term, and a definition of the new term is received to update the common lexicon model and the common semanteme model; if the frequencies of all the segments are higher than the threshold, the corpus data is the new corpus data, and the intention of the new corpus data is received to update the functional situation model.
According to an embodiment of this disclosure, the update information module further determines whether the new corpus data is a common corpus and, if so, updates the system domain vocabulary set according to the new corpus data; it also updates the system domain vocabulary set according to the new term.
According to an embodiment of this disclosure, the corpus analysis module further uses the common lexicon model to identify whether the corpus data contains system domain vocabulary words in the critical entities set, sets the identification result as the multiple vocabulary words, analyzes the probability with which the vocabulary words occur, analyzes the sentence pattern structure of the corpus data according to the vocabulary words, and uses the common semanteme model to identify, according to the probability with which the vocabulary words occur and the sentence pattern structure of the corpus data, the multiple intentions of the corpus data and the probabilities corresponding to the intentions.
The automatic learning method and the automatic learning system for a virtual assistant of the invention give the virtual assistant an automatic learning function, so that while communicating with a user it automatically learns the user's speaking habits or the specialized vocabulary of the user's industry, allowing the user to use the ERP system more quickly and conveniently.
Brief description of the drawings
To make the above and other objects, features, advantages, and embodiments of the invention clearer and easier to understand, the accompanying drawings are described as follows:
Fig. 1 is a schematic diagram of an automatic learning system for a virtual assistant according to some embodiments of this disclosure;
Fig. 2 is a schematic diagram of the processor according to some embodiments of this disclosure;
Fig. 3 is a flow chart of an automatic learning method for a virtual assistant according to some embodiments of this disclosure;
Fig. 4 is a flow chart of training the data models according to some embodiments of this disclosure;
Fig. 5 is a flow chart of step S320 according to some embodiments of this disclosure;
Fig. 6 is a flow chart of step S330 according to some embodiments of this disclosure;
Fig. 7 is a flow chart of step S340 according to some embodiments of this disclosure; and
Fig. 8 is a flow chart of step S360 according to some embodiments of this disclosure.
Specific embodiment
The following disclosure provides many different embodiments or examples for implementing different features of the invention. Elements and configurations in specific examples are used in the following discussion to simplify this disclosure. Any example discussed is for illustrative purposes only and does not limit the scope or meaning of the invention or of its examples in any way. In addition, this disclosure may repeat reference numerals and/or letters in different examples; this repetition is for simplicity and clarity and does not in itself specify a relationship between the different embodiments and/or configurations discussed below.
Unless otherwise noted, the terms used throughout the specification and claims have the ordinary meaning each term carries in the art, in the content disclosed here, and in the specific context. Certain terms used to describe this disclosure are discussed below or elsewhere in this specification to give those skilled in the art additional guidance on the description of this disclosure.
As used herein, "coupled" or "connected" may mean that two or more elements are in direct physical or electrical contact with each other, or in indirect physical or electrical contact with each other; "coupled" or "connected" may also mean that two or more elements operate or act on each other.
It will be understood that terms such as first, second, and third are used herein to describe various elements, components, regions, layers, and/or blocks. These elements, components, regions, layers, and/or blocks should not be limited by these terms, which serve only to distinguish one element, component, region, layer, and/or block from another. Therefore, a first element, component, region, layer, and/or block below could also be called a second element, component, region, layer, and/or block without departing from the intent of the invention. As used herein, the term "and/or" includes any combination of one or more of the associated listed items; "and/or" in this document refers to any one of, all of, or any combination of at least one of the listed elements.
Please refer to Fig. 1. Fig. 1 is a schematic diagram of an automatic learning system 100 for a virtual assistant according to some embodiments of this disclosure. As shown in Fig. 1, the automatic learning system 100 includes a processor 110, a storage device 130, and an input/output device 150. The storage device 130 stores a global database 131, a working knowledge database 132, a domain knowledge database 133, and a historical database 134, and is electrically connected to the processor 110. The input/output device 150 is electrically connected to the processor 110 and provides an interface for inputting audio. In an embodiment, the input/output device 150 may be a keyboard, a touch screen, a microphone, a loudspeaker, or another suitable input/output device. A user can input audio through the interface provided by the input/output device 150.
In various embodiments of the invention, the processor 110 may be implemented as an integrated circuit such as a microcontroller, a microprocessor, a digital signal processor, an application-specific integrated circuit (ASIC), a logic circuit, another similar element, or a combination of these elements. The storage device 130 may be implemented as a memory, a hard disk, a flash drive, a memory card, and so on.
Please refer to Fig. 2. Fig. 2 is a schematic diagram of the processor 110 according to some embodiments of this disclosure. The processor 110 includes a voice identification module 111, a corpus analysis module 112, a situation identification module 113, an unknown corpus judgment module 114, an update information module 115, a training module 121, a model establishing module 122, a semanteme model establishing module 123, a lexicon model establishing module 124, a situation training module 125, and a vocabulary training module 126. The corpus analysis module 112 is electrically connected to the voice identification module 111, the situation identification module 113 is electrically connected to the corpus analysis module 112, the unknown corpus judgment module 114 is electrically connected to the situation identification module 113, and the update information module 115 is electrically connected to the unknown corpus judgment module 114. The training module 121 is electrically connected to the corpus analysis module 112, the model establishing module 122 is electrically connected to the training module 121, the semanteme model establishing module 123 and the lexicon model establishing module 124 are electrically connected to the model establishing module 122, the situation training module 125 is electrically connected to the situation identification module 113, and the unknown corpus judgment module 114 is electrically connected to the vocabulary training module 126.
Please refer to Fig. 1 to Fig. 3 together. Fig. 3 is a flow chart of an automatic learning method 300 for a virtual assistant according to some embodiments of this disclosure. As shown in Fig. 3, the automatic learning method 300 comprises the following steps:
Step S310: receiving audio input and recognizing the audio to form corpus data;
Step S320: analyzing the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data;
Step S330: performing functional scenario analysis on the language feature information according to functional contextual information to determine the operation corresponding to one of the intentions;
Step S340: if the functional scenario analysis cannot determine the operation corresponding to one of the intentions, performing word segmentation on the corpus data;
Step S350: determining, from the word segmentation result, whether there is a new term or new corpus data; and
Step S360: if there is a new term, updating the natural language processing model according to the meaning of the new term, and if there is new corpus data, updating the functional scenario analysis according to the intention of the new corpus data.
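The control flow of steps S320 to S360 can be sketched in Python as follows. This is only an illustration under stated assumptions: every function and class name is hypothetical, each processing stage is passed in as a callable rather than implemented, and step S310 (speech recognition) is assumed to have already produced the text in `corpus_data`.

```python
from dataclasses import dataclass


@dataclass
class LanguageFeatures:
    """Language feature information (hypothetical container)."""
    intents: dict      # intention name -> probability
    vocabulary: list   # vocabulary words found in the corpus data


def handle_utterance(corpus_data, analyze, resolve_operation, segment,
                     classify, update_nlp, update_context):
    """Run one utterance through steps S320 to S360.

    All callables are supplied by the caller; their behavior is assumed,
    not taken from the disclosure.
    """
    features = analyze(corpus_data)            # S320: NLP model analysis
    operation = resolve_operation(features)    # S330: functional scenario analysis
    if operation is not None:
        return operation                       # operation found, nothing to learn
    segments = segment(corpus_data)            # S340: word segmentation
    kind, new_terms = classify(segments)       # S350: new term or new corpus data?
    if kind == "new_term":                     # S360: update the matching model
        update_nlp(new_terms)
    else:
        update_context(corpus_data)
    return None
```

Passing the stages in as callables keeps the sketch independent of any particular speech recognizer, segmenter, or model; a real system would bind them to the modules of the processor 110.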
In step S310, audio input is received and the audio is recognized to form corpus data. In an embodiment, the audio received through the input/output device 150 undergoes speech recognition in the voice identification module 111 of the processor 110, which converts the user's natural language into corpus data. In another embodiment, the audio can instead be sent over the Internet to a cloud speech recognition system, and the recognition result returned by the cloud speech recognition system is used as the corpus data. For example, the cloud speech recognition system may be implemented as Google's speech recognition system.
Before step S320 is executed, the common lexicon model and the common semanteme model must first be established. Please refer to Fig. 4, which is a flow chart of training the data models according to some embodiments of this disclosure. As shown in Fig. 4, the data model training stage comprises the following steps:
Step S410: generating a system domain vocabulary set according to the working knowledge database and the domain knowledge database;
Step S420: forming a critical entities set from the system domain vocabulary set and multiple service application parameters;
Step S430: classifying each of multiple training corpora as one of the query data operation and the execute instruction operation;
Step S440: distinguishing, according to the categories in the enterprise database, the intentions of the training corpora corresponding to the query data operation to form multiple query data operation intentions, and distinguishing, according to the service behaviors provided by the enterprise resource system, the intentions of the training corpora corresponding to the execute instruction operation to form multiple execute instruction operation intentions;
Step S450: establishing models of the query data operation intentions and models of the execute instruction operation intentions;
Step S460: establishing the global database according to the critical entities set, the models of the query data operation intentions, and the models of the execute instruction operation intentions;
Step S470: identifying multiple first probabilities with which the system domain vocabulary words in the critical entities set occur in the training corpora, analyzing, by means of the identified system domain vocabulary words, multiple sentence pattern structures of the training corpora and multiple associations among the system domain vocabulary words, and establishing the common lexicon model according to the first probabilities and the associations; and
Step S480: analyzing multiple second probabilities with which the system domain vocabulary words occur in the query data operation intentions and the execute instruction operation intentions, and establishing the common semanteme model according to the sentence pattern structures and the second probabilities.
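As an illustration of the first probabilities in step S470, the following Python sketch estimates, for each system domain vocabulary word, the fraction of training corpora that contain it. The disclosure does not state how the probabilities are computed; the per-sentence fraction, the plain substring match, and all names and sample sentences here are assumptions.

```python
def first_probabilities(domain_vocabulary, training_corpora):
    """Estimate how often each domain vocabulary word occurs in the corpora.

    Returns a mapping from word to the fraction of training sentences that
    contain it (an assumed definition of the "first probability").
    """
    total = len(training_corpora)
    if total == 0:
        # No training data: every probability is defined as zero.
        return {word: 0.0 for word in domain_vocabulary}
    return {
        word: sum(word in sentence for sentence in training_corpora) / total
        for word in domain_vocabulary
    }


# Made-up training sentences for illustration.
corpora = [
    "please find the packing slip of XX company",
    "I want to take annual leave tomorrow",
    "show me the packing slip for last week",
    "book a meeting room",
]
probabilities = first_probabilities(["packing slip", "annual leave"], corpora)
```

The second probabilities of step S480 could be estimated in the same way by grouping the training corpora by intention before counting, although the disclosure leaves that calculation open as well.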
In steps S410 and S420, a system domain vocabulary set is generated according to the working knowledge database 132 and the domain knowledge database 133, and a critical entities set is then formed from the system domain vocabulary set and multiple service application parameters; the critical entities set includes multiple system domain vocabulary words. For example, the critical entities set includes information such as enterprise domain vocabulary and the service application parameters of the business system. Enterprise domain vocabulary refers to the vocabulary that enterprises in different fields may need to use. For example, the vocabulary used in the hospitality industry is not necessarily the same as the vocabulary used in the transport industry, so the enterprise domain vocabulary varies with each enterprise that uses the ERP system. The service application parameters of the business system are the parameters corresponding to each service the business system provides. For example, the leave function in the business system may require information such as the leave time and the leave type, so the system domain vocabulary in the critical entities set must include terms such as leave of absence, annual leave, sick leave, and business-trip leave.
Specifically, the critical entities set also includes the names of the data fields available when accessing data, the names of the services that the enterprise system provides to the user, the restrictive conditions that the user sets when querying, the parameter values of the service applications, and the operating functions of the enterprise system. The operating functions of the enterprise system may be, for example, requesting leave, applying for overtime, applying for a business trip, or filing expense claims. Each of these items may also have corresponding aliases, which must be entered into the training database as well. For example, manufacturers in a particular field may call a packing slip a shipment detail list or a sales slip.
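The alias handling described above can be sketched as follows. This is a minimal illustration only: the class name `KeyEntitySet`, its methods, and the sample entries are assumptions introduced for the example and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyEntitySet:
    """Hypothetical container for system regions vocabulary and its aliases."""
    # canonical term -> set of aliases (all entries are illustrative)
    aliases: dict[str, set[str]] = field(default_factory=dict)

    def add(self, term: str, *alias: str) -> None:
        """Register a canonical term together with any aliases."""
        self.aliases.setdefault(term, set()).update(alias)

    def canonical(self, word: str) -> Optional[str]:
        """Return the canonical term for a word or alias, or None if unknown."""
        if word in self.aliases:
            return word
        for term, names in self.aliases.items():
            if word in names:
                return term
        return None


entities = KeyEntitySet()
entities.add("packing slip", "shipment detail list", "sales slip")
entities.add("sick leave")
```

With this sketch, `entities.canonical("sales slip")` resolves to the canonical term "packing slip", while a word that was never registered resolves to `None`.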
In step S430, multiple training corpora are each classified as one of an inquiry data manipulation and an executing instruction operation. A training corpus may be natural-language data such as an instruction that the user might give or a question that the user might ask. After the critical entities set is established, the training corpora can be classified by intention. In one embodiment, the user's intention is divided into inquiry data manipulation and executing instruction operation, but the user's intention may also be classified more finely; the invention is not limited in this respect. For example, if the user says to the virtual assistant, "Please help me find the packing slip of XX company," the utterance is classified as an inquiry data manipulation under the intention classification of the invention, and the virtual assistant queries the enterprise database for the packing slip of XX company on the user's behalf. If the user says to the virtual assistant, "Help me request business-trip leave on January 30," the utterance is classified as an executing instruction operation, and the virtual assistant enters the enterprise resource system and requests the leave on the user's behalf.
In step S440, the intentions of the training corpora that correspond to the inquiry data manipulation are distinguished according to the categories in the enterprise database to form multiple inquiry data manipulation intentions, and the intentions of the training corpora that correspond to the executing instruction operation are distinguished according to the service behaviors provided by the enterprise resource system to form multiple executing instruction operation intentions. In one embodiment, the inquiry data manipulation intentions can first be distinguished according to the enterprise database of each field. For example, the data fields stored in the enterprise database of the healthcare industry are not necessarily the same as those in the enterprise database of the transport industry, so the needs of their users are not necessarily the same either. A user in the healthcare industry may, for example, query medical record data or query ward vacancies, and a user in the transport industry may query shipment records or query the shipping status of a package; each of these is a different intention of the inquiry data manipulation. Likewise, the executing instruction operation intentions can be distinguished according to the service behaviors provided by the enterprise resource system of each field. As noted above, the services provided by the enterprise resource system of the healthcare industry differ from those of the transport industry, and the inquiry data manipulations or service behaviors provided by enterprises in different fields are not necessarily interchangeable, so intentions must also be distinguished for the services provided by the enterprises of each field. For example, a user in the healthcare industry may be offered a registration service or a service for ordering healthy meals during a hospital stay, and a user in the transport industry may be offered a service for automatically classifying cargo or a service for arranging the order of cargo shipments; each of these is a different intention of the service behavior operation.
In step S450 and step S460, the models of the inquiry data manipulation intentions and the models of the executing instruction operation intentions are established, and the global database 131 is established according to the critical entities set, the models of the inquiry data manipulation intentions, and the models of the executing instruction operation intentions. For example, once the inquiry data manipulation intentions and executing instruction operation intentions that a user may have when operating the virtual assistant of an enterprise in a given field have all been distinguished, a corresponding model can be generated for each intention. Following the example above, the healthcare industry has four models, corresponding to querying medical record data, querying ward vacancies, providing the registration service, and providing the service for ordering healthy meals during a hospital stay. The transport industry likewise has four models, corresponding to querying shipment records, querying the shipping status of a package, providing the service for automatically classifying cargo, and providing the service for arranging the order of cargo shipments. The global database 131 can then be established according to these models and the critical entities set.
In step S470, multiple first probabilities with which the system regions vocabulary in the critical entities set occurs in the training corpora are identified, multiple sentence pattern structures of the training corpora and multiple relevances among the system regions vocabulary are analyzed by means of the identified system regions vocabulary, and the common lexicon model is established according to the first probabilities and the relevances. In one embodiment, two algorithms, n-gram and context-free grammar (CFG), are used to calculate the probability with which each system regions vocabulary occurs in the training corpora, and the sentence pattern structures of the training corpora and the relevances among the system regions vocabulary are analyzed by means of the system regions vocabulary to establish the common lexicon model. For example, suppose the training corpora contain "I want to query the price list of XX company" and "I want to query the packing slip of XX company," where "XX company," "price list," and "packing slip" are all system regions vocabulary. Because "XX company" may appear evenly across every inquiry data manipulation intention, its probability is almost the same in each of those intentions. "Price list" and "packing slip," by contrast, appear frequently only in the training corpora of the intentions that query those specific data, and not in the training corpora of intentions that query other data. The probabilities of "price list" and "packing slip" are therefore especially high in their corresponding intentions and lower in the other intentions.
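The contrast between evenly distributed terms and intention-specific terms can be illustrated with a small sketch. It uses a plain relative-frequency count per intention, which is a simplification and not the n-gram or context-free grammar processing named in the embodiment; the intent labels and sample data are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Illustrative training corpora: (intent label, system regions vocabulary found in the sentence).
TRAINING = [
    ("query_price_list", ["XX company", "price list"]),
    ("query_price_list", ["YY company", "price list"]),
    ("query_packing_slip", ["XX company", "packing slip"]),
    ("query_packing_slip", ["YY company", "packing slip"]),
]


def term_probabilities(samples):
    """Return {intent: {term: fraction of that intent's sentences containing the term}}."""
    sentence_count = Counter()
    term_count = defaultdict(Counter)
    for intent, terms in samples:
        sentence_count[intent] += 1
        term_count[intent].update(set(terms))  # count each term once per sentence
    return {
        intent: {term: n / sentence_count[intent] for term, n in counts.items()}
        for intent, counts in term_count.items()
    }


probs = term_probabilities(TRAINING)
```

In this toy data "XX company" has the same probability under both intentions, whereas "price list" is certain under one intention and absent from the other, which mirrors the behaviour the paragraph describes.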
In step S480, multiple second probabilities with which the system regions vocabulary occurs in the inquiry data manipulation intentions and the executing instruction operation intentions are analyzed, and the common semanteme model is established according to the sentence pattern structures and the second probabilities. In one embodiment, a hidden Markov model (HMM) algorithm is used to calculate the probabilities with which the system regions vocabulary occurs in the inquiry data manipulation intentions and the executing instruction operation intentions, so as to establish the common semanteme model. For example, many training corpora are input during the stage of training the data model, and the hidden Markov model algorithm calculates the probability with which the system regions vocabulary occurs in each intention. Continuing the example above, if the training corpora contain "I want to query the packing slip of XX company," n-gram and context-free grammar identify "XX company" and "packing slip" as system regions vocabulary. The hidden Markov model algorithm then uses the probabilities of all identified system regions vocabulary in the inquiry data manipulation intentions and the executing instruction operation intentions, together with the relationships among the system regions vocabulary, to determine that "packing slip" is associated with the intention of querying shipment data. Combined with the system regions vocabulary "XX company," this allows the virtual assistant to automatically query the enterprise database for the shipment-related data of XX company on the user's behalf.
After the common lexicon model and the common semanteme model have been established, step S320 is carried out: the corpus data is analyzed using the natural language processing model to generate language feature information corresponding to the corpus data, where the language feature information includes multiple intentions, the probabilities corresponding to the intentions, and multiple vocabulary. For the detailed flow of step S320, refer to Fig. 5, which is a flow chart of step S320 according to some embodiments of the present disclosure. As shown in Fig. 5, step S320 comprises the following steps:
Step S321: use the common lexicon model to identify whether the corpus data contains vocabulary that matches the system regions vocabulary in the critical entities set, set the identification result as the vocabulary in the language feature information, and analyze the probability with which the vocabulary in the language feature information occurs;
Step S322: analyze the sentence pattern structure of the corpus data according to the vocabulary in the language feature information; and
Step S323: use the common semanteme model to identify the intentions of the corpus data and the probabilities corresponding to the intentions according to the probability with which the vocabulary in the language feature information occurs and the sentence pattern structure of the corpus data.
In step S321 and step S322, the common lexicon model is used to identify whether the corpus data contains vocabulary that matches the system regions vocabulary in the critical entities set, the identification result is set as the vocabulary in the language feature information, the probability with which that vocabulary occurs is analyzed, and the sentence pattern structure of the corpus data is then analyzed according to that vocabulary. In other words, the common lexicon model picks out the vocabulary in the user's corpus data that matches system regions vocabulary, and the sentence pattern structure of the corpus data is then determined. For example, if the user says to the virtual assistant, "I want to look up the packing slip of XX company for last month," the common lexicon model identifies "XX company," "last month," and "packing slip" as vocabulary that matches the system regions vocabulary.
In step S323, the common semanteme model is used to identify the intentions of the corpus data and the probabilities corresponding to the intentions according to the probability with which the vocabulary in the language feature information occurs and the sentence pattern structure of the corpus data. Following the example above, after vocabulary such as "XX company," "last month," and "packing slip" has been identified, the probabilities of this vocabulary in all intentions can be determined. "All intentions" here covers all of the inquiry data manipulation intentions and all of the executing instruction operation intentions.
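How identified vocabulary might be turned into a probability for every intention can be sketched as below. The sketch deliberately swaps in a naive product of per-term probabilities with a uniform prior instead of the hidden Markov model named in the embodiment, and it ignores sentence pattern structure. The table `TERM_GIVEN_INTENT`, its values, and the `SMOOTHING` floor are all assumptions made for illustration.

```python
import math

# Hypothetical "second probabilities" P(term | intent); the values are made up.
TERM_GIVEN_INTENT = {
    "query_packing_slip": {"XX company": 0.5, "last month": 0.3, "packing slip": 0.9},
    "query_price_list": {"XX company": 0.5, "last month": 0.3, "price list": 0.9},
    "request_leave": {"last month": 0.1, "sick leave": 0.8},
}
SMOOTHING = 0.01  # assumed floor for terms never seen with an intent


def rank_intents(terms):
    """Return [(intent, probability)] sorted from most to least likely."""
    log_scores = {
        intent: sum(math.log(table.get(term, SMOOTHING)) for term in terms)
        for intent, table in TERM_GIVEN_INTENT.items()
    }
    # Normalise so the probabilities sum to 1 (uniform prior over intents).
    peak = max(log_scores.values())
    weights = {intent: math.exp(score - peak) for intent, score in log_scores.items()}
    total = sum(weights.values())
    return sorted(
        ((intent, weight / total) for intent, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


ranking = rank_intents(["XX company", "last month", "packing slip"])
```

The result is a list covering every intention, inquiry and executing alike, each with its own probability; under these made-up values the packing-slip inquiry ranks first.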
In step S330, functional scenario analysis is performed on the language feature information according to the functional contextual information to determine the operation corresponding to one of the intentions. Before functional scenario analysis can be performed, a functional situational model and a functional lexicon model must first be established. To build the functional situational model, the data in the historical database 134 is first converted into feature vectors and classified according to a variety of contexts; a machine learning algorithm then calculates how strongly each feature vector is related to each context, and the functional situational model is generated from the result. Machine learning algorithms suitable for establishing the functional situational model include the support vector machine (SVM), which is common in conventional machine learning, and deep learning algorithms such as convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM).
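The idea of relating context features to operations can be sketched without any of the named learning algorithms. The example below replaces the SVM or neural network with a simple co-occurrence estimate, so it only illustrates what a "relationship strength" between a feature and an operation could look like. The record layout, feature names, and operation names are assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical history records: (context features, operation chosen in that context).
HISTORY = [
    ({"department": "purchasing", "position": "buyer"}, "vendor_purchase_report"),
    ({"department": "purchasing", "position": "manager"}, "vendor_purchase_report"),
    ({"department": "finance", "position": "accountant"}, "vendor_payment_report"),
    ({"department": "finance", "position": "manager"}, "vendor_payment_report"),
]


def build_context_model(history):
    """Return {feature: {operation: strength}}, where strength = P(operation | feature)."""
    counts = defaultdict(Counter)
    for features, operation in history:
        for key, value in features.items():
            counts[f"{key}={value}"][operation] += 1
    return {
        feature: {op: n / sum(ops.values()) for op, n in ops.items()}
        for feature, ops in counts.items()
    }


model = build_context_model(HISTORY)
```

In this toy history the department is a strong signal (finance always maps to the payment report), while the position "manager" is a weak one because it is split evenly between the two operations.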
Continuing from the above, the functional lexicon model is established from a large number of input training corpora: the corpora are analyzed using the hidden Markov model algorithm and then broken into words, and the frequency of occurrence of each segmented word is counted to generate a word frequency table, from which the functional lexicon model is established. For the detailed flow of step S330, refer to Fig. 6, which is a flow chart of step S330 according to some embodiments of the present disclosure. As shown in Fig. 6, step S330 comprises the following steps:
Step S331: compare the corpus data and the functional contextual information with the functional situational model to generate a functional situation identification result; and
Step S332: determine, according to the functional situation identification result, which of the inquiry data manipulation and the executing instruction operation corresponds to one of the intentions.
In step S331, the corpus data and the functional contextual information are compared with the functional situational model to generate the functional situation identification result. The functional contextual information includes the identity, position, and department of the user, as well as the time and the place. Part of the functional contextual information can be sensed by the input/output device 150; for example, the current state of the user can be detected (such as whether the user has returned from a business trip). Based on the probabilities corresponding to all intentions and the vocabulary obtained earlier by analyzing the user's corpus data, the functional contextual information can be used to further estimate how similar the user's corpus data is to the data in the training data model, and this similarity serves as the probability of the corresponding intention.
In step S332, which of the inquiry data manipulation and the executing instruction operation corresponds to one of the intentions is determined according to the functional situation identification result. The training data model contains multiple inquiry data manipulation intentions and multiple executing instruction operation intentions, and the calculation of the common semanteme model described above yields a probability for each intention. Intentions with low probability values can be filtered out using a threshold, so that the most likely intention is obtained and the corresponding operation is confirmed. In the earlier example, after vocabulary such as "XX company," "last month," and "packing slip" has been identified, this vocabulary is combined with the functional contextual information to find the best-matching inquiry data manipulation intention or executing instruction operation intention. This determines that a user who says to the virtual assistant, "I want to look up the packing slip of XX company for last month," most likely wants to look up the packing slip of XX company, so the operation the user wishes to perform is an inquiry data manipulation. The functional context must be taken into account because users have different needs depending on information such as their position, department, operating time, and operating place. For example, both purchasing staff and finance staff may view a [vendor monthly report], but the statistics they are interested in differ: one concerns the purchases made from the vendor, and the other concerns the payments the company has made to the vendor. When talking to the virtual assistant, however, a user will not necessarily state which [vendor monthly report] is needed and may say only, "I need last month's vendor monthly report." The user's functional contextual information is therefore all the more necessary for an accurate determination.
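The combination of threshold filtering and context can be sketched as follows. The threshold value, the `CONTEXT_WEIGHT` table, and the operation names are assumptions; the disclosure does not specify how the probabilities and the context are combined, so the multiplication used here is only one plausible choice.

```python
THRESHOLD = 0.5  # assumed cut-off; the source does not give a value

# Hypothetical context weights: how strongly a department is tied to an operation.
CONTEXT_WEIGHT = {
    "purchasing": {"vendor_purchase_report": 0.9, "vendor_payment_report": 0.1},
    "finance": {"vendor_purchase_report": 0.1, "vendor_payment_report": 0.9},
}


def resolve_operation(intent_probs, department):
    """Return the most likely operation, or None if nothing clears the threshold."""
    weights = CONTEXT_WEIGHT.get(department, {})
    combined = {op: p * weights.get(op, 0.0) for op, p in intent_probs.items()}
    total = sum(combined.values())
    if total == 0:
        return None  # the context gives no support to any candidate
    best_op, best_score = max(combined.items(), key=lambda item: item[1])
    return best_op if best_score / total >= THRESHOLD else None


# "I need last month's vendor monthly report" is ambiguous on its own.
ambiguous = {"vendor_purchase_report": 0.5, "vendor_payment_report": 0.5}
```

Calling `resolve_operation(ambiguous, "finance")` selects the payment report and `resolve_operation(ambiguous, "purchasing")` selects the purchase report. A `None` result corresponds to the case in which functional scenario analysis cannot determine the operation and word segmentation processing is needed.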
In step S340, if the functional scenario analysis cannot determine the operation corresponding to one of the intentions, word segmentation processing is performed on the corpus data. For the detailed flow of step S340, refer to Fig. 7, which is a flow chart of step S340 according to some embodiments of the present disclosure. As shown in Fig. 7, step S340 comprises the following steps:
Step S341: break the corpus data into words according to the functional lexicon model to generate multiple segmented words; and
Step S342: calculate the frequencies of the segmented words.
In step S341 and step S342, the corpus data is broken into words according to the functional lexicon model to generate multiple segmented words, and the frequencies of the segmented words are then calculated. If the functional scenario analysis in step S330 cannot determine the operation corresponding to the input corpus data, word segmentation processing must be performed on the corpus data. First, the corpus data is broken into words according to the vocabulary stored in the previously established functional lexicon model; the frequencies of the resulting segmented words are then calculated.
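Steps S341 and S342 can be sketched with a lexicon-driven segmenter followed by a frequency lookup. The sketch uses greedy forward maximum matching over whitespace-separated English tokens, which is an assumption made for readability; the disclosure does not name a segmentation algorithm, and the lexicon contents and counts are invented for the example.

```python
from collections import Counter

# Hypothetical functional lexicon: known words and how often training saw them.
LEXICON_FREQ = Counter({"I": 120, "want to find": 45, "contact person": 30, "of": 200})


def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching over whitespace-separated tokens."""
    tokens, out, i = text.split(), [], 0
    while i < len(tokens):
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            # Accept the longest lexicon entry; fall back to a single token.
            if candidate in lexicon or size == 1:
                out.append(candidate)
                i += size
                break
    return out


def frequencies(words, lexicon_freq):
    """Look up each segmented word's frequency; unseen words get 0."""
    return {word: lexicon_freq[word] for word in words}  # Counter yields 0 when missing


words = segment("I want to find XX company contact person", LEXICON_FREQ)
freqs = frequencies(words, LEXICON_FREQ)
```

Words that the lexicon has never seen, such as "XX" here, come back with a frequency of zero, which is the signal that the later threshold check relies on.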
In step S350 and step S360, whether there is a new term or new corpus data is determined according to the result of the word segmentation processing; if there is a new term, the natural language processing model is updated according to the meaning of the new term, and if there is new corpus data, the functional scenario analysis is updated according to the intention of the new corpus data. For the detailed flow of step S360, refer to Fig. 8, which is a flow chart of step S360 according to some embodiments of the present disclosure. As shown in Fig. 8, step S360 comprises the following steps:
Step S361: determine whether the frequencies of the segmented words calculated in the word segmentation processing are lower than a threshold;
Step S362: if the frequency of one of the segmented words is lower than the threshold, treat that segmented word as a new term and receive a definition of the new term to update the common lexicon model and the common semanteme model; and
Step S363: if the frequencies of the segmented words are all above the threshold, treat the corpus data as new corpus data and receive the intention of the new corpus data to update the functional situational model.
In step S361 and step S362, whether the frequencies of the segmented words calculated in the word segmentation processing are lower than the threshold is determined; if the frequency of one of the segmented words is lower than the threshold, that segmented word is a new term, and a definition of the new term is received to update the common lexicon model and the common semanteme model. In one embodiment, after the frequencies of the segmented words have been calculated, any segmented word whose frequency is lower than the threshold is set as a new term; the virtual assistant asks the user for the definition of the new term and stores the new term together with its definition in the common lexicon model and the common semanteme model. For example, suppose the user inputs the corpus "I want to find the contact person of XX company" and the virtual assistant cannot determine its meaning. Word segmentation processing separates it into words such as "I," "want to find," "XX company," "of," and "contact person." If the frequency of "XX company" is lower than the threshold, the virtual assistant asks the user what "XX company" means and then stores the user's answer together with "XX company" in the common lexicon model and the common semanteme model. The new term must also be stored in the system regions lexical set so that it is shared with everyone.
In step S363, if the frequencies of the segmented words are all above the threshold, the corpus data is new corpus data, and the intention of the new corpus data is received to update the functional situational model. Continuing the example above, "I want to find the contact person of XX company" is separated by word segmentation processing into words such as "I," "want to find," "XX company," "of," and "contact person." If none of these words is below the threshold, what the virtual assistant fails to understand is the intention of the corpus. It may be that the training corpora used to train the virtual assistant were all phrased as "Help me look up the contact person of XX company," so the virtual assistant cannot understand the intention of "I want to find the contact person of XX company." The virtual assistant then asks the user what "I want to find the contact person of XX company" means and stores the user's answer together with the new corpus in the functional situational model. Before the new corpus is stored in the functional situational model, whether it is a common corpus must also be determined. If it is, other people may also use the new corpus when using the virtual assistant, so the new corpus must be stored in the system regions lexical set for everyone to share. If it is not, the new corpus reflects the user's own speaking habits and wording, so only the functional situational model needs to be updated and the system regions lexical set does not.
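The decision logic of steps S361 to S363 can be sketched as below. The threshold value, the function names, and the store names in the returned lists are assumptions chosen to mirror the terms used in the description; the sketch only lists which stores would be updated and does not model the dialogue with the user.

```python
NEW_TERM_THRESHOLD = 5  # assumed cut-off; the source does not specify a value


def classify_unknown(freqs, threshold=NEW_TERM_THRESHOLD):
    """Return ("new_term", rare_words) or ("new_corpus", [])."""
    rare = [word for word, freq in freqs.items() if freq < threshold]
    return ("new_term", rare) if rare else ("new_corpus", [])


def stores_to_update(kind, is_common=False):
    """List the (illustratively named) stores that the description says to update."""
    if kind == "new_term":
        return ["common_lexicon_model", "common_semanteme_model", "system_regions_lexical_set"]
    stores = ["functional_situational_model"]
    if is_common:
        # A commonly used phrasing is also shared through the lexical set.
        stores.append("system_regions_lexical_set")
    return stores


kind, rare = classify_unknown({"I": 120, "want to find": 45, "XX company": 0})
```

Here "XX company" falls below the threshold, so the input is treated as containing a new term. An input whose words all clear the threshold is treated as new corpus data, and only a commonly used phrasing is additionally shared through the system regions lexical set.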
As the embodiments above show, the present disclosure mainly gives the virtual assistant the ability to learn automatically. While communicating with the user, if the virtual assistant encounters vocabulary it does not understand, it can ask the user and then update its database, so that it automatically learns the user's speaking habits and the special terms of the user's industry. This makes the user's use of the ERP system faster and more convenient.
In addition, the examples above include steps in a particular sequence, but the steps need not be performed in the order shown. Performing the steps in a different order is within the contemplated scope of this disclosure. Within the spirit and scope of the embodiments of this disclosure, steps may be added, replaced, reordered, and/or omitted as appropriate.
Although the present disclosure has been described above by way of embodiments, they are not intended to limit it. Anyone skilled in the art may make various modifications and variations without departing from the spirit and scope of the present disclosure, and the scope of protection is therefore defined by the appended claims.
Claims (16)
1. An auto-learning method of a virtual assistant, characterized by comprising:
Receiving an audio input and recognizing the audio to form corpus data;
Analyzing the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data, wherein the language feature information includes multiple intentions, probabilities corresponding to the multiple intentions, and multiple vocabulary;
Performing a functional scenario analysis on the language feature information according to functional contextual information to determine an operation corresponding to one of the multiple intentions;
If the functional scenario analysis cannot determine the operation corresponding to one of the multiple intentions, performing word segmentation processing on the corpus data;
Determining, according to a result of the word segmentation processing, whether there is a new term or new corpus data; and
If there is the new term, updating the natural language processing model according to the meaning of the new term, and if there is the new corpus data, updating the functional scenario analysis according to the intention of the new corpus data;
Wherein the operation includes one of an inquiry data manipulation and an executing instruction operation.
2. The auto-learning method of the virtual assistant according to claim 1, characterized by further comprising:
Generating a system regions lexical set according to a working knowledge database and a domain knowledge database;
Forming a critical entities set from the system regions lexical set and multiple service application parameters, the critical entities set including multiple system regions vocabulary;
Classifying multiple training corpora as one of the inquiry data manipulation and the executing instruction operation;
Distinguishing, according to categories in the enterprise database, the intentions of the multiple training corpora corresponding to the inquiry data manipulation to form multiple inquiry data manipulation intentions, and distinguishing, according to service behaviors provided by the enterprise resource system, the intentions of the multiple training corpora corresponding to the executing instruction operation to form multiple executing instruction operation intentions;
Establishing models of the multiple inquiry data manipulation intentions and models of the multiple executing instruction operation intentions;
Establishing the global database according to the critical entities set, the models of the multiple inquiry data manipulation intentions, and the models of the multiple executing instruction operation intentions;
Identifying multiple first probabilities with which the multiple system regions vocabulary in the critical entities set occur in the multiple training corpora, analyzing, by means of the identified system regions vocabulary, multiple sentence pattern structures of the multiple training corpora and multiple relevances among the multiple system regions vocabulary, and establishing a common lexicon model according to the multiple first probabilities and the multiple relevances; and
Analyzing multiple second probabilities with which the multiple system regions vocabulary occur in the multiple inquiry data manipulation intentions and the multiple executing instruction operation intentions, and establishing a common semanteme model according to the multiple sentence pattern structures and the multiple second probabilities.
3. The auto-learning method of the virtual assistant according to claim 2, characterized by further comprising:
Classifying the data in a historical database by relationship strength using a classifier to generate a functional situational model; and
Breaking the multiple training corpora into words and analyzing them, and generating a functional lexicon model according to the data in the historical database.
4. The auto-learning method of the virtual assistant according to claim 3, characterized in that the functional scenario analysis further comprises:
Comparing the corpus data and the functional contextual information with the functional situational model to generate a functional situation identification result; and
Determining, according to the functional situation identification result, which of the inquiry data manipulation and the executing instruction operation corresponds to one of the multiple intentions.
5. The auto-learning method of the virtual assistant according to claim 4, characterized in that the word segmentation processing further comprises:
Breaking the corpus data into words according to the functional lexicon model to generate multiple segmented words; and
Calculating the frequencies of the multiple segmented words.
6. The auto-learning method of the virtual assistant according to claim 5, characterized by further comprising:
Determining whether the frequencies of the multiple segmented words calculated in the word segmentation processing are lower than a threshold;
If the frequency of one of the multiple segmented words is lower than the threshold, treating that segmented word as the new term and receiving a definition of the new term to update the common lexicon model and the common semanteme model; and
If the frequencies of the multiple segmented words are all above the threshold, treating the corpus data as the new corpus data and receiving the intention of the new corpus data to update the functional situational model.
7. The auto-learning method of the virtual assistant according to claim 6, characterized by further comprising:
Determining whether the new corpus data is a common corpus and, if so, updating the system regions lexical set according to the new corpus data; and
Updating the system regions lexical set according to the new term.
8. The auto-learning method of the virtual assistant according to claim 2, characterized in that analyzing the corpus data using the natural language processing model further comprises:
Using the common lexicon model to identify whether the corpus data contains vocabulary that matches the multiple system regions vocabulary in the critical entities set, setting the identification result as the multiple vocabulary, and analyzing the probability with which the multiple vocabulary occur;
Analyzing the sentence pattern structure of the corpus data according to the multiple vocabulary; and
Using the common semanteme model to identify the multiple intentions of the corpus data and the probabilities corresponding to the multiple intentions according to the probability with which the multiple vocabulary occur and the sentence pattern structure of the corpus data.
9. An automatic learning system of a virtual assistant, connected to an enterprise database and an enterprise resource system respectively, characterized by comprising:
A processor;
A storage device, electrically connected to the processor, to store a global database, a working knowledge database, a domain knowledge database, and a historical database;
An input/output device, electrically connected to the processor, to provide an interface for inputting an audio;
Wherein the processor includes:
A voice identification module, to recognize the audio to form corpus data;
A Concordance module, electrically connected to the voice identification module, to analyze the corpus data using a natural language processing model to generate language feature information corresponding to the corpus data, wherein the language feature information includes multiple intentions, probabilities corresponding to the multiple intentions, and multiple vocabulary;
A situation identification module, electrically connected to the Concordance module, to perform a functional scenario analysis on the language feature information according to functional contextual information and determine an operation corresponding to one of the multiple intentions;
An unknown corpus judgment module, electrically connected to the situation identification module, to perform word segmentation processing on the corpus data when the situation identification module cannot identify the operation corresponding to one of the multiple intentions, and to determine, according to a result of the word segmentation processing, whether there is a new term or new corpus data; and
An update information module, electrically connected to the unknown corpus judgment module, to update the natural language processing model according to the meaning of the new term when the new term is produced, and to update the functional scenario analysis according to the intention of the new corpus data when the new corpus data is produced;
Wherein the operation includes one of an inquiry data manipulation and an executing instruction operation.
10. The automatic learning system of a virtual assistant according to claim 9, characterized in that the processor further comprises:
a training module, electrically connected to the concordance module, to generate a system domain vocabulary set according to the working knowledge database and the domain knowledge database; to form the system domain vocabulary set and a plurality of service application parameters into a critical entity set, wherein the system domain vocabulary set includes a plurality of system domain vocabulary items; to classify a plurality of training corpora as one of the query data operation and the execute instruction operation; to distinguish, according to the categories in the enterprise database, the intentions of the training corpora corresponding to the query data operation so as to form a plurality of query data operation intentions; and to distinguish, according to the service behaviors provided by the enterprise resource system, the intentions of the training corpora corresponding to the execute instruction operation so as to form a plurality of execute instruction operation intentions;
a model establishing module, electrically connected to the training module, to establish models of the plurality of query data operation intentions and models of the plurality of execute instruction operation intentions, and to establish the global database according to the critical entity set, the models of the query data operation intentions, and the models of the execute instruction operation intentions;
a lexicon model establishing module, electrically connected to the model establishing module, to recognize a plurality of first probabilities with which the system domain vocabulary items in the critical entity set occur in the training corpora; to analyze, by means of the recognized system domain vocabulary items, a plurality of sentence pattern structures of the training corpora and a plurality of associations between the system domain vocabulary items; and to establish a common lexicon model according to the first probabilities and the associations; and
a semantic model establishing module, electrically connected to the model establishing module, to analyze a plurality of second probabilities with which the system domain vocabulary items occur in the query data operation intentions and the execute instruction operation intentions, and to establish a common semantic model according to the sentence pattern structures and the second probabilities.
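As an illustration only, the first probabilities and associations recited in claim 10 could be estimated from tokenized training corpora as sketched below. The function name `build_lexicon_model`, the use of the share of training utterances containing a word as its "first probability", and the use of pairwise co-occurrence counts as the "association" are all assumptions made for this sketch; the claim does not specify how either quantity is computed.

```python
from collections import Counter
from itertools import combinations


def build_lexicon_model(training_corpora, domain_vocab):
    """Estimate occurrence probabilities and co-occurrence counts.

    training_corpora: list of token lists (one list per training utterance).
    domain_vocab: iterable of domain vocabulary items (hypothetical input).
    Returns (probabilities, associations).
    """
    vocab = set(domain_vocab)
    if not training_corpora:
        raise ValueError("at least one training utterance is required")

    occurrences = Counter()    # utterances in which each word appears
    associations = Counter()   # utterances in which a pair appears together
    for tokens in training_corpora:
        present = sorted(set(tokens) & vocab)
        occurrences.update(present)
        associations.update(combinations(present, 2))

    total = len(training_corpora)
    # Assumed definition: probability = share of utterances containing the word.
    probabilities = {word: occurrences[word] / total for word in vocab}
    return probabilities, dict(associations)
```

Sorting the matched words before pairing them keeps each association key in a stable order, so ("inventory", "order") and ("order", "inventory") are counted as the same pair.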
11. The automatic learning system of a virtual assistant according to claim 10, characterized in that the processor further comprises:
a situation training module, electrically connected to the scenario analysis module, to use a classifier to classify the data in the historical database by relationship strength and thereby generate a functional situation model; and
a vocabulary training module, electrically connected to the unknown corpus judgment module, to perform word segmentation and analysis on the plurality of training corpora, and to generate a functional lexicon model according to the data in the historical database.
12. The automatic learning system of a virtual assistant according to claim 11, characterized in that the scenario analysis module is further configured to compare the corpus data and the functional context information with the functional situation model to generate a functional situation recognition result, and to determine, according to the functional situation recognition result, which one of the query data operation and the execute instruction operation corresponds to one of the plurality of intentions.
13. The automatic learning system of a virtual assistant according to claim 12, characterized in that the unknown corpus judgment module is further configured to perform word segmentation on the corpus data according to the functional lexicon model to generate a plurality of segmented words, and to calculate the frequency of the plurality of segmented words.
14. The automatic learning system of a virtual assistant according to claim 13, characterized in that the update information module is further configured to judge whether the frequency of the plurality of segmented words calculated by the word segmentation process is lower than a threshold; if one of the plurality of segmented words is lower than the threshold, that segmented word is the new term, and the definition of the new term is received to update the common lexicon model and the common semantic model; if the plurality of segmented words are all above the threshold, the corpus data is the new corpus data, and the intention of the new corpus data is received to update the functional situation model.
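The threshold test of claims 13 and 14 can be pictured with the minimal sketch below. It assumes that the frequency of each segmented word is looked up in a table of historical counts; the function name `classify_unknown`, the `known_freq` table, and the treatment of a frequency exactly equal to the threshold as "not below" are assumptions of this sketch and are not stated in the claims.

```python
def classify_unknown(segmented_words, known_freq, threshold):
    """Decide whether unrecognized corpus data holds a new term or is a new corpus.

    segmented_words: words produced by word segmentation of the corpus data.
    known_freq: hypothetical mapping from word to its historical frequency.
    threshold: frequency below which a word is treated as a new term.
    """
    # Words never seen before have frequency 0 and therefore fall below
    # any positive threshold.
    rare = [w for w in segmented_words if known_freq.get(w, 0) < threshold]
    if rare:
        # At least one word is below the threshold: a definition of the new
        # term would then be requested to update the lexicon and semantic models.
        return "new_term", rare
    # Every word is familiar, so the sentence as a whole is the unknown part:
    # its intention would be requested to update the situation model.
    return "new_corpus", []
```

For example, `classify_unknown(["check", "inventory", "zorp"], {"check": 10, "inventory": 7}, 2)` reports `"zorp"` as a new term, while the same call without `"zorp"` classifies the input as new corpus data.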
15. The automatic learning system of a virtual assistant according to claim 14, characterized in that the update information module is further configured to judge whether the new corpus data is a common corpus and, if so, to update the system domain vocabulary set according to the new corpus data; and to update the system domain vocabulary set according to the new term.
16. The automatic learning system of a virtual assistant according to claim 10, characterized in that the concordance module is further configured to use the common lexicon model to recognize whether the corpus data contains any of the plurality of system domain vocabulary items in the critical entity set, to set the recognition result as the plurality of vocabulary items, to analyze the probability that the plurality of vocabulary items occur, to analyze the sentence pattern structure of the corpus data according to the plurality of vocabulary items, and to use the common semantic model to recognize, according to the probability that the plurality of vocabulary items occur and the sentence pattern structure of the corpus data, the plurality of intentions of the corpus data and the probabilities corresponding to the plurality of intentions.
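The two-stage recognition of claim 16 (vocabulary lookup, then intention ranking) might be sketched as follows. This is a toy weighted-sum scoring, not the claimed models: the functions `recognize_vocab` and `rank_intents`, the dictionary layouts, and the normalization step are assumptions, and the sentence pattern structure that the claim also uses is deliberately left out to keep the example short.

```python
def recognize_vocab(tokens, lexicon_probs):
    """Keep only tokens found in the (hypothetical) lexicon, with their probability."""
    return {t: lexicon_probs[t] for t in tokens if t in lexicon_probs}


def rank_intents(vocab_probs, intent_vocab_probs):
    """Score each intention and normalize the scores so they sum to 1.

    vocab_probs: output of recognize_vocab.
    intent_vocab_probs: hypothetical mapping intent -> {word: probability of
    that word occurring in utterances of this intent}.
    Returns an empty dict when no intention receives any score.
    """
    scores = {}
    for intent, word_probs in intent_vocab_probs.items():
        # Assumed scoring rule: sum of (word probability x word-given-intent probability).
        scores[intent] = sum(
            p * word_probs.get(word, 0.0) for word, p in vocab_probs.items()
        )
    total = sum(scores.values())
    if total == 0:
        return {}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {intent: score / total for intent, score in ranked}
```

Returning every intention with its normalized score, rather than only the best one, mirrors the claim language, which outputs a plurality of intentions together with their corresponding probabilities.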
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810436639.2A CN110489517B (en) | 2018-05-09 | 2018-05-09 | Automatic learning method and system of virtual assistant |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810436639.2A CN110489517B (en) | 2018-05-09 | 2018-05-09 | Automatic learning method and system of virtual assistant |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110489517A (en) | 2019-11-22 |
CN110489517B (en) | 2023-10-31 |
Family ID: 68543225
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810436639.2A Active CN110489517B (en) | 2018-05-09 | 2018-05-09 | Automatic learning method and system of virtual assistant |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110489517B (en) |
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104778945A (en) * | 2005-08-05 | 2015-07-15 | 沃伊斯博克斯科技公司 | Systems and methods for responding to natural language speech utterance |
JP2007094291A (en) * | 2005-09-30 | 2007-04-12 | Tetsuo Suga | Learning system of linguistic knowledge of natural language learning system and recording medium which records natural language learning program |
TW201140559A (en) * | 2010-05-10 | 2011-11-16 | Univ Nat Cheng Kung | Method and system for identifying emotional voices |
CN103226949A (en) * | 2011-09-30 | 2013-07-31 | 苹果公司 | Using context information to facilitate processing of commands in a virtual assistant |
US20130173252A1 (en) * | 2011-12-30 | 2013-07-04 | Hon Hai Precision Industry Co., Ltd. | Electronic device and natural language analysis method thereof |
CN104346406A (en) * | 2013-08-08 | 2015-02-11 | 北大方正集团有限公司 | Training corpus expanding device and training corpus expanding method |
TW201516756A (en) * | 2013-10-28 | 2015-05-01 | Univ Kun Shan | Intelligent voice control system and method therefor |
US20150220511A1 (en) * | 2014-02-04 | 2015-08-06 | Maluuba Inc. | Method and system for generating natural language training data |
US20150379414A1 (en) * | 2014-06-27 | 2015-12-31 | Nuance Communications, Inc. | Utilizing large-scale knowledge graphs to support inference at scale and explanation generation |
US20160026634A1 (en) * | 2014-07-28 | 2016-01-28 | International Business Machines Corporation | Corpus Quality Analysis |
CN104360994A (en) * | 2014-12-04 | 2015-02-18 | 科大讯飞股份有限公司 | Natural language understanding method and natural language understanding system |
US20170213545A1 (en) * | 2016-01-22 | 2017-07-27 | Electronics And Telecommunications Research Institute | Self-learning based dialogue apparatus and method for incremental dialogue knowledge |
WO2017212783A1 (en) * | 2016-06-09 | 2017-12-14 | 真由美 稲場 | Program for realizing function to assist in understanding personality and preferences of other party and communicating |
CN106057200A (en) * | 2016-06-23 | 2016-10-26 | 广州亿程交通信息有限公司 | Semantic-based interaction system and interaction method |
CN107015969A (en) * | 2017-05-19 | 2017-08-04 | 四川长虹电器股份有限公司 | Can self-renewing semantic understanding System and method for |
CN107818781A (en) * | 2017-09-11 | 2018-03-20 | 远光软件股份有限公司 | Intelligent interactive method, equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
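孟晋 (Meng Jin): "智能语音助手抢占AI入口市场" [Intelligent voice assistants seize the AI entry-point market], 《新经济导刊》, no. 04, 5 April 2017 (2017-04-05) * |
陈克健 (Chen Kejian): "电子词典与词汇知识表达" [Electronic dictionaries and lexical knowledge representation], 中文信息学报, no. 04 * |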
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3709295A1 (en) * | 2019-03-11 | 2020-09-16 | Beijing Baidu Netcom Science And Technology Co. Ltd. | Methods, apparatuses, and storage media for generating training corpus |
US11348571B2 (en) | 2019-03-11 | 2022-05-31 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Methods, computing devices, and storage media for generating training corpus |
CN112559699A (en) * | 2020-11-09 | 2021-03-26 | 联想(北京)有限公司 | Information interaction method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN110489517B (en) | 2023-10-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11553080B2 (en) | Detecting fraud using machine-learning and recorded voice clips | |
US20190272269A1 (en) | Method and system of classification in a natural language user interface | |
US7912714B2 (en) | Method for segmenting communication transcripts using unsupervised and semi-supervised techniques | |
EP3610396A1 (en) | Voice identification feature optimization and dynamic registration methods, client, and server | |
CN109635117A (en) | A kind of knowledge based spectrum recognition user intention method and device | |
CN107886949A (en) | A kind of content recommendation method and device | |
CN107833059B (en) | Service quality evaluation method and system for customer service | |
CN110032724B (en) | Method and device for recognizing user intention | |
CN106844335A (en) | Natural language processing method and device | |
CN108027814A (en) | Disable word recognition method and device | |
US9910909B2 (en) | Method and apparatus for extracting journey of life attributes of a user from user interactions | |
CN110489517A (en) | The Auto-learning Method and system of virtual assistant | |
KR102307380B1 (en) | Natural language processing based call center support system and method | |
CN111190973A (en) | Method, device, equipment and storage medium for classifying statement forms | |
JP2002351899A (en) | Data analysis device, data analysis method and program | |
WO2020139865A1 (en) | Systems and methods for improved automated conversations | |
CN112116457B (en) | Bank counter business supervision method, device and equipment | |
TWI674530B (en) | Method and system for operating a virtual assistant | |
CN110209776A (en) | Operate the method and system of virtual assistant | |
TWI679548B (en) | Method and system for automated learning of a virtual assistant | |
TWI665566B (en) | System and method for product classification | |
CN114546326A (en) | Virtual human sign language generation method and system | |
TWM555499U (en) | Product classification system | |
CN115203382A (en) | Service problem scene identification method and device, electronic equipment and storage medium | |
CN113094471A (en) | Interactive data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||