CN108038234A - Method and device for automatically generating question templates - Google Patents

Method and device for automatically generating question templates

Info

Publication number
CN108038234A
Authority
CN
China
Prior art keywords
question sentence
template
sentence template
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711436114.0A
Other languages
Chinese (zh)
Other versions
CN108038234B (en)
Inventor
邹辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongan Information Technology Service Co ltd
Original Assignee
Zhongan Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongan Information Technology Service Co Ltd
Priority to CN201711436114.0A
Publication of CN108038234A
Application granted
Publication of CN108038234B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

The invention discloses a method and device for automatically generating question templates, belonging to the technical field of intelligent question answering. The method comprises: preparing a question log corpus; performing word segmentation and part-of-speech tagging on the log corpus; performing named entity recognition and replacement; performing semantic replacement; and performing frequent itemset mining to generate question templates. The method and device improve the efficiency of question template generation and greatly reduce manual effort. The generated templates can also be evaluated, so that the question template library is extended autonomously and continuously and the quality of the knowledge base of the intelligent question answering system is improved.

Description

Method and device for automatically generating question templates
Technical field
The present invention relates to the technical field of intelligent question answering, and in particular to a method and device for automatically generating question templates.
Background
At present, more and more enterprises handle a large volume of after-sales service or pre-sales consultation for their users. As the number of users grows exponentially, answering every user inquiry manually consumes substantial human resources. Moreover, many knowledge points are relatively concentrated, so manual replies involve a great deal of repeated work. Intelligent question answering systems arose in response: such a system answers user-entered questions automatically, greatly improving efficiency.
Intelligent question answering systems are based on techniques such as question template matching and knowledge base retrieval. Question template matching is one of the most widely used. A question template is a specific sequence of symbol labels formed by recognizing and replacing elements of a question, and a corresponding answer is attached to each template. When a question that is identical or highly similar to a template is encountered, the template matching technique matches it to that answer. The difficulty lies in generating question templates efficiently and sustainably. Traditionally, templates must be set manually for specific sentence patterns, which is tedious and gives poor coverage. When the knowledge base is updated, new templates must also be set and evaluated manually for questions that the template library cannot cover, so maintainability and self-learning capability are poor. Among currently published related patents, no technical solution has been found that addresses these problems. For example, patent application 201611076382.1 ("Method and device for automatic question answering template matching") determines a subset of the template question set for each word segment of the question to be answered and thereby obtains the matching question. This improves the template matching efficiency and accuracy of an automatic question answering system, but it does not address automatic template generation or the quality evaluation of generated templates.
Summary of the invention
To solve the problems of the prior art, embodiments of the present invention provide a method and device for automatically generating question templates. The technical solution is as follows:
In a first aspect, a method for automatically generating question templates is provided, the method comprising:
preparing a question log corpus; performing word segmentation and part-of-speech tagging on the log corpus; performing named entity recognition and replacement; performing semantic replacement; and performing frequent itemset mining to generate question templates.
With reference to the first aspect, in a first possible implementation, preparing the question log corpus comprises:
obtaining the question log corpus and preprocessing it, including punctuation removal, illegal symbol removal, and letter case conversion.
With reference to the first aspect, in a second possible implementation, performing word segmentation and part-of-speech tagging on the log corpus comprises:
segmenting the log corpus using a segmentation method that incorporates an industry dictionary.
With reference to the first aspect, in a third possible implementation, performing named entity recognition and replacement comprises:
performing named entity recognition on the general entities that occur in the question log corpus, including times, numbers, and/or place names, and replacing each general entity with the corresponding entity tag.
With reference to the first aspect, in a fourth possible implementation, performing semantic replacement comprises:
looking up each word of the segmented questions in the question log corpus in a semantic network, abstracting words with identical or similar definitions into a unified label according to their definitions, replacing them accordingly, and generating a symbol label sequence composed of named entity tags and the semantic concepts produced by the replacement.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, performing frequent itemset mining to generate question templates comprises:
obtaining frequent itemsets from the candidate itemsets of the question log corpus by setting threshold ranges, and generating question templates.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation, performing frequent itemset mining to generate question templates comprises:
according to a preset frequency threshold range and a preset itemset length threshold range, screening frequent itemsets from the symbol label sequences using a predetermined association rule algorithm, and generating question templates from the sequences formed by arranging the items in their default order.
With reference to the fifth or sixth possible implementation of the first aspect, in a seventh or eighth possible implementation, the method further comprises:
generating sentence vector representations of the questions under each screened question template using a preset sentence vector model;
calculating the cluster compactness of the question template using the following formula:
[formula image not reproduced in this text]
according to a preset template cluster compactness threshold, selecting the question templates whose cluster compactness is greater than the threshold;
looking up each selected question template in the template library and, if it is not present in the template library, saving it to the template library;
where, in the formula, CP_j is the calculated cluster compactness of the j-th question template, X_i is the sentence vector of the i-th question under the j-th question template, W_j is the mean of all vectors in the cluster corresponding to the j-th question template, Ω_j is the sum of the norms of all vectors in that cluster, and i and j are integers greater than or equal to 1.
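The compactness formula itself is not reproduced in this text, so the following is only an illustrative sketch. It assumes one common definition of cluster compactness, the mean Euclidean distance between each sentence vector and the cluster mean, which may differ from the formula in the patent. Under this assumed definition a smaller value indicates a tighter cluster, so the direction of the threshold comparison may also differ. The function name `compactness` is hypothetical.

```python
import math

def compactness(vectors):
    """Mean Euclidean distance from each vector to the cluster mean.

    Assumed definition for illustration only; not the patent's formula.
    """
    n = len(vectors)
    dim = len(vectors[0])
    # W_j in the text: the component-wise mean of all vectors in the cluster
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return sum(math.dist(v, centroid) for v in vectors) / n

# Two toy "sentence vectors" under one template
score = compactness([[0.0, 0.0], [2.0, 0.0]])
```

In practice the vectors would come from the preset sentence vector model rather than being written by hand.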
With reference to the seventh or eighth possible implementation of the first aspect, in a ninth or tenth possible implementation, the preset sentence vector model is the deep learning encoder model Skip-Thoughts.
With reference to the seventh or eighth possible implementation of the first aspect, in an eleventh or twelfth possible implementation, the method further comprises:
adding an answer corresponding to the selected question template, forming a complete template question-answer pair with the selected question template, and saving the pair to the template library.
In a second aspect, a device for automatically generating question templates is provided, comprising:
a preparation module for preparing a question log corpus; a word segmentation and part-of-speech tagging module for performing word segmentation and part-of-speech tagging; a named entity recognition module for performing named entity recognition and replacement; a semantic replacement module for performing semantic replacement; and a frequent itemset mining module for performing frequent itemset mining to generate question templates.
With reference to the second aspect, in a first possible implementation, the preparation module comprises an acquisition module and a preprocessing module. The acquisition module is used to obtain the question log corpus, and the preprocessing module is used to preprocess it, including punctuation removal, illegal symbol removal, and letter case conversion.
With reference to the second aspect, in a second possible implementation, the named entity recognition module is used to perform named entity recognition on the general entities that occur in the question log corpus, including times, numbers, and/or place names, and to replace each general entity with the corresponding entity tag.
With reference to the second aspect, in a third possible implementation, the semantic replacement module is used to look up each word of the segmented questions in the question log corpus in a semantic network, abstract words with identical or similar definitions into a unified label according to their definitions, replace them accordingly, and generate a symbol label sequence composed of named entity tags and the semantic concepts produced by the replacement.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation, the frequent itemset mining module is used to screen frequent itemsets from the symbol label sequences using a predetermined association rule algorithm, according to a preset frequency threshold range and a preset itemset length threshold range, and to generate question templates from the sequences formed by arranging the items in their default order.
With reference to the third or fourth possible implementation of the second aspect, in a fifth or sixth possible implementation, the device further comprises:
a sentence vector representation module for generating sentence vector representations of the questions under each screened question template using a preset sentence vector model;
a cluster compactness calculation module for calculating the cluster compactness of the question template using the following formula:
[formula image not reproduced in this text]
a screening module for selecting, according to a preset template cluster compactness threshold, the question templates whose cluster compactness is greater than the threshold;
a determination and saving module for looking up each selected question template in the template library and, if it is not present in the template library, saving it to the template library;
where, in the formula, CP_j is the calculated cluster compactness of the j-th question template, X_i is the sentence vector of the i-th question under the j-th question template, W_j is the mean of all vectors in the cluster corresponding to the j-th question template, Ω_j is the sum of the norms of all vectors in that cluster, and i and j are integers greater than or equal to 1.
With reference to the fifth or sixth possible implementation of the second aspect, in a seventh or eighth possible implementation, the preset sentence vector model is the deep learning encoder model Skip-Thoughts.
With reference to the fifth or sixth possible implementation of the second aspect, in a ninth or tenth possible implementation, the device further comprises:
an answer adding module for adding an answer corresponding to the selected question template, forming a complete template question-answer pair with the selected question template, and saving the pair to the template library.
The technical solutions provided by the embodiments of the present invention have the following beneficial effects:
1. Through the semantic replacement step, different words that share one meaning are abstracted and unified according to their definitions, which increases semantic generalization ability;
2. By performing frequent itemset mining to find frequent itemsets among the candidate itemsets and generate question templates, the efficiency of template generation is improved and manual effort is greatly reduced;
3. By screening the frequent itemsets according to a preset itemset frequency threshold range and a preset itemset length threshold range to select the itemsets that meet the requirements, sentences with a similar structure and a common word sequence can be clustered, yielding higher-quality question templates;
4. By representing sentences as vectors with the preset sentence vector model, calculating cluster compactness, and selecting the question templates that meet the requirements, the quality of generated templates can be evaluated in the semantic dimension, yielding high-quality question templates of higher accuracy;
5. Each selected question template is looked up in the template library and saved to the library if it is not already present, which facilitates effective updating of the templates in the library;
6. An answer corresponding to the selected question template is added, forming a complete template question-answer pair that is saved to the template library. This ensures the completeness of the question-answer pairs in the library and matches answers to automatically generated question templates.
In general, the method and device for automatically generating question templates provided by the embodiments of the present invention not only generate question templates automatically and efficiently, but also evaluate the quality of the generated templates, autonomously and continuously extend or update the question template library, and improve the quality of the knowledge base of the intelligent question answering system. They can be widely applied in intelligent question answering and other technical fields that require customer service.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the method for automatically generating question templates provided by Embodiment 1 of the present invention;
Fig. 2 is a flow chart of the method for automatically generating question templates provided by Embodiment 2 of the present invention;
Fig. 3 is a structural diagram of the device for automatically generating question templates provided by Embodiment 3 of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that the terms "first" and "second" are used for description only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the present invention, "multiple" means two or more, unless otherwise specifically defined.
The purpose of the present invention is to solve the problem of generating and evaluating question templates in the template matching technique of intelligent question answering systems. Starting from the question logs of a question answering system, the method and device provided by the embodiments mine frequent itemsets from the question log corpus and automatically generate a candidate set of question templates, thereby generating question templates automatically, evaluating template quality, and establishing a self-learning mechanism for the system. Compared with the traditional approach of building templates from manual rules, the method and device improve the efficiency of template generation and greatly reduce manual effort. The generated templates can also be evaluated, the template library is extended autonomously and continuously, and the quality of the knowledge base is improved. The method and device can therefore be widely applied in intelligent question answering and other technical fields that require customer service.
Embodiment 1
Fig. 1 is a flow chart of the method for automatically generating question templates provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
101. Prepare the question log corpus.
Specifically, the question log corpus is obtained and preprocessed. Preprocessing includes, but is not limited to, punctuation removal, illegal symbol removal, and letter case conversion.
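As a rough illustration of the preprocessing in step 101, the sketch below lowercases a question and drops punctuation and other symbols using only the Python standard library. Treating Unicode control characters and symbol characters as "illegal symbols" is an assumption made for this example, and the function name `preprocess` is hypothetical; the patent does not specify either.

```python
import unicodedata

def preprocess(question: str) -> str:
    """Lowercase a question and strip punctuation, symbols and control characters."""
    # NFKC folds full-width forms (e.g. full-width punctuation) to their ASCII equivalents
    text = unicodedata.normalize("NFKC", question).lower()
    kept = []
    for ch in text:
        if ch.isspace():
            kept.append(" ")  # keep word boundaries, collapsed below
        elif unicodedata.category(ch)[0] in ("C", "P", "S"):
            continue          # control chars, punctuation, symbols (assumed "illegal")
        else:
            kept.append(ch)
    return " ".join("".join(kept).split())

cleaned = preprocess("Hello,\tWORLD!!")
```

Which characters count as illegal is a per-deployment decision; the category filter above is only one plausible choice.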
102. Perform word segmentation and part-of-speech tagging on the log corpus.
Specifically, the log corpus is segmented using a segmentation method that incorporates an industry dictionary. Different industry dictionaries can be created as needed or for different industries. For a specific vertical industry in particular, segmenting the corresponding log corpus with a method that incorporates the industry dictionary gives better segmentation results.
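The patent does not name a segmentation algorithm. As a minimal sketch of how an industry dictionary can steer segmentation, the example below uses forward maximum matching, a simple dictionary-based technique chosen here for illustration only; it does not perform part-of-speech tagging. The dictionary contents and the function name `segment` are hypothetical.

```python
def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        # try the longest candidate first; fall back to a single character
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical insurance-industry dictionary
industry_dict = {"请问", "遗传病", "可以", "甲亢", "投保"}
tokens = segment("请问遗传病可以买吗", industry_dict)
```

Without the entry "遗传病" in the dictionary, the disease name would be split into single characters, which is the kind of error an industry dictionary is meant to prevent.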
103. Perform named entity recognition and replacement.
Specifically, named entity recognition is performed on the general entities that occur in the question log corpus, including times, numbers, and/or place names, and each general entity is replaced with the corresponding entity tag.
It should be noted that the preprocessing, word segmentation and part-of-speech tagging, and named entity recognition and replacement described above are commonly used sentence analysis techniques in natural language processing. Any feasible prior-art technique may also be used to implement these steps, so they are not described in detail here.
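The entity replacement in step 103 can be sketched with simple pattern matching. This is only an illustration: the tag names `<TIME>`, `<NUM>` and `<LOC>`, the date pattern, and the small place-name list are all assumptions, and a production system would more likely use a trained named entity recognizer.

```python
import re

# Hypothetical patterns and gazetteer, for illustration only
TIME_RE = re.compile(r"\d{4}-\d{2}-\d{2}|\d{1,2}:\d{2}")
NUM_RE = re.compile(r"\d+(\.\d+)?")
PLACES = {"shanghai", "beijing"}

def replace_entities(tokens):
    """Replace time, number and place-name tokens with entity tags."""
    out = []
    for tok in tokens:
        if TIME_RE.fullmatch(tok):      # check times before plain numbers
            out.append("<TIME>")
        elif NUM_RE.fullmatch(tok):
            out.append("<NUM>")
        elif tok in PLACES:
            out.append("<LOC>")
        else:
            out.append(tok)
    return out

tagged = replace_entities(["premium", "in", "shanghai", "2017-12-26", "300"])
```

Replacing concrete values with tags means that questions differing only in a date, amount, or city map to the same symbol sequence.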
104. Perform semantic replacement.
Specifically, each word of the segmented questions in the question log corpus is looked up in a semantic network; words with identical or similar definitions are abstracted into a unified label according to their definitions and replaced accordingly, generating a symbol label sequence composed of named entity tags and the semantic concepts produced by the replacement. The Chinese semantic network HowNet is preferably used. The purpose of this step is to increase semantic generalization ability through semantic abstraction. Words in a sentence that cannot be found in the semantic network and cannot be recognized as named entities can simply be ignored.
105. Perform frequent itemset mining to generate question templates.
Specifically, by setting threshold ranges, frequent itemsets are obtained from the candidate itemsets of the question log corpus, and question templates are generated. Template mining is performed on the log corpus that has been converted into symbol sequences. The key to template generation is to cluster frequent itemsets out of the label symbol sequences obtained from a large number of different questions. These frequent itemsets express, to some extent, the backbone of a sentence pattern, so question templates can be generated automatically from them. Preferably, according to a preset frequency threshold range and a preset itemset length threshold range, the desired frequent itemsets are obtained from the symbol label sequences using a predetermined association rule algorithm, in order to generate question templates. That is, frequent itemset mining is performed with the predetermined association rule algorithm to find frequent itemsets among the candidate itemsets. Any suitable association rule algorithm from the prior art may be chosen as needed; the Apriori algorithm is preferred.
Embodiment 2
Fig. 2 is a flow chart of the method for automatically generating question templates provided by Embodiment 2 of the present invention. As shown in Fig. 2, the method comprises the following steps:
201. Obtain the question log corpus and preprocess it. Preprocessing includes, but is not limited to, punctuation removal, illegal symbol removal, and letter case conversion.
It should be noted that the question log corpus may be obtained by any feasible method in the prior art, and the embodiment of the present invention places no particular limitation on this. The preprocessing of the log corpus is likewise not limited to the above and may use any feasible prior-art technique, which is not detailed here.
202. Perform word segmentation and part-of-speech tagging on the log corpus using a segmentation method that incorporates an industry dictionary.
Different industry dictionaries can be created as needed or for different industries. For a specific vertical industry in particular, segmenting the corresponding log corpus with a method that incorporates the industry dictionary gives better segmentation results.
It should be noted that step 202 is not limited to the manner described above; any feasible prior-art manner or method may be used, and the embodiment of the present invention places no particular limitation on it.
203. Perform named entity recognition on the general entities that occur in the question log corpus, including times, numbers, and/or place names, and replace each general entity with the corresponding entity tag.
It should be noted that the named entity recognition and replacement may also be implemented with any feasible prior-art technique, so it is not described in detail here.
204. Look up each word of the segmented questions in the question log corpus in a semantic network, abstract words with identical or similar definitions into a unified label according to their definitions, replace them accordingly, and generate a symbol label sequence composed of named entity tags and the semantic concepts produced by the replacement.
Because Chinese contains many groups of different words that share one meaning, specific words are abstracted into a single meaning representation to increase semantic generalization ability. The Chinese semantic network HowNet is preferably used for this semantic abstraction and replacement. Specifically, each segmented word is looked up in the semantic network, which contains definitions of words; words with identical or similar definitions are abstracted and unified into a label and replaced accordingly. For example, the definition of the word "hepatitis" in HowNet is "disease", and "flu" also corresponds to "disease" in the semantic network, so the words in a sentence that denote a "disease" are uniformly replaced with "disease", achieving semantic abstraction. Words in a sentence that cannot be found in the semantic network and cannot be recognized as named entities can simply be ignored. After this step, each corpus question has been converted from a sequence of words into a symbol sequence of named entity tags and semantic labels.
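The replacement in step 204 can be sketched as a dictionary lookup. The small `CONCEPTS` table below is a hypothetical stand-in for a semantic network such as HowNet, not its actual API or data; the concept labels are borrowed from the example sequences in this embodiment, and the convention that entity tags look like `<...>` is also an assumption.

```python
# Hypothetical word-to-concept table standing in for a semantic network lookup
CONCEPTS = {
    "hepatitis": "disease",
    "flu": "disease",
    "diabetes": "disease",
    "can": "question_feasible",
    "buy": "apply_v",
    "insure": "apply_v",
}

def to_label_sequence(tokens, concepts=CONCEPTS):
    """Map words to concept labels, keep entity tags, ignore everything else."""
    labels = []
    for tok in tokens:
        if tok.startswith("<") and tok.endswith(">"):
            labels.append(tok)            # named entity tag from the previous step
        elif tok in concepts:
            labels.append(concepts[tok])  # unify words that share a definition
        # words found in neither source are ignored
    return labels

labels = to_label_sequence(["can", "someone", "with", "flu", "buy", "<NUM>"])
```

Here "flu" and "hepatitis" would both yield the label "disease", which is what lets questions about different illnesses collapse into one template.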
205. According to a preset frequency threshold range and a preset itemset length threshold range, screen frequent itemsets from the symbol label sequences using a predetermined association rule algorithm, and generate question templates from the sequences formed by arranging the items in their default order.
Specifically, template mining is performed on the log corpus that has been converted into symbol sequences. The key to template generation is to cluster frequent itemsets out of the label symbol sequences obtained from a large number of different questions. These frequent itemsets express, to some extent, the backbone of a sentence pattern, so question templates can be generated automatically from them. That is, frequent itemset mining is performed with the predetermined association rule algorithm to find frequent itemsets among the candidate itemsets. Any suitable association rule algorithm from the prior art may be chosen as needed; the Apriori algorithm is preferred.
To obtain good generalization ability and semantic compactness, the quality of the templates generated in the preceding step can be evaluated so as to obtain higher-quality question templates. Candidate template sequences that meet the requirements are selected according to a predetermined sequence frequency threshold range and a predetermined sequence length threshold range. Specifically, two threshold indicators are used here: first, the frequency k1 with which the sequence occurs across different corpus questions; second, the sequence length k2. k1 and k2 can generally be set empirically. Preferably, the length threshold k2 is set within [3, 5]: templates shorter than 3 generalize well but have low semantic compactness, while templates longer than 5 have good semantic compactness but generalize poorly. With appropriate threshold ranges, this step clusters sentences that share a similar structure and a common word sequence, yielding higher-quality question templates.
For example, there is so several question sentences in daily record language material, " may I ask hereditary disease can buy", " have a hyperthyroidism can To buy", " there is can insuring for mild diabetes”.By before the step of handle, three sentences are converted into following respectively Three symbol sebolic addressings:
[question verb_you disease question_feasible apply_v question_polar];
[disease question_feasible apply_v question_polar];
[disease question_feasible apply_v].
For convenience of description, the semantic concepts in the three sequences are replaced by the letters a, b, c, d, e, f. The sequences become [a b c d e f], [c d e f], and [c d e]. Thresholds on frequency and itemset length are then set for the frequent itemset mining algorithm (the predetermined association rule algorithm); here both thresholds are set to 3. The frequent itemsets present in the sequences are computed first: [c d e: 3], [c d: 3], [c e: 3], [c: 3], [d: 3], [e: 3], and so on. Only combinations that occur 3 times across all sequences are listed here. The letters before the colon denote a frequent itemset found across the different question sequences, and the number after the colon is the frequency with which that itemset occurs in the corpus as a whole. For example, the sequence [c d e] appears, in this order, in all three example sequences. The sequence [c d e] therefore meets the requirements, and this symbol sequence can be output as the template for the three sentences in this example.
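The worked example above can be reproduced with a short level-wise miner. The following is only a minimal sketch in the spirit of Apriori, not the patented implementation: the function names `mine_frequent_itemsets` and `order_items` are hypothetical, support is counted as the number of sequences that contain every item of an itemset, and the length threshold is applied as a simple filter afterwards.

```python
from itertools import combinations

def mine_frequent_itemsets(sequences, min_support):
    """Level-wise (Apriori-style) mining; returns {sorted item tuple: support}."""
    transactions = [set(seq) for seq in sequences]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    current = []
    # Level 1: keep single items that occur in enough sequences.
    for item in items:
        support = sum(1 for t in transactions if item in t)
        if support >= min_support:
            frequent[(item,)] = support
            current.append((item,))
    k = 2
    while current:
        # Candidate generation: join pairs of frequent (k-1)-itemsets.
        candidates = {tuple(sorted(set(a) | set(b)))
                      for a, b in combinations(current, 2)
                      if len(set(a) | set(b)) == k}
        current = []
        for cand in sorted(candidates):
            # Apriori pruning: every (k-1)-subset must already be frequent.
            if any(sub not in frequent for sub in combinations(cand, k - 1)):
                continue
            support = sum(1 for t in transactions if set(cand) <= t)
            if support >= min_support:
                frequent[cand] = support
                current.append(cand)
        k += 1
    return frequent

def order_items(itemset, reference_sequence):
    """Arrange the items in the order in which they appear in a reference sequence."""
    return sorted(itemset, key=reference_sequence.index)

sequences = [list("abcdef"), list("cdef"), list("cde")]
frequent = mine_frequent_itemsets(sequences, min_support=3)
# Apply the length threshold (3) and restore the original item order.
templates = [order_items(s, sequences[0]) for s in frequent if len(s) >= 3]
print(templates)  # [['c', 'd', 'e']]
```

With both thresholds set to 3, the only surviving itemset is [c d e], which matches the template derived in the text.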
It should be noted that the description of step 205 above, in which a predetermined association rule algorithm is used to mine frequent itemsets, frequent itemsets are found among the candidate itemsets, and question templates are generated, is only exemplary. Without departing from the inventive concept of this step of the embodiment of the present invention, any other feasible process or manner may be used; the embodiment of the present invention does not specifically limit it.
To further improve the quality of the question templates, additional evaluation is needed. The templates extracted in the above steps characterize questions that are similar in sentence structure, but the different questions under one template are not necessarily fully similar in meaning, so the templates must also be evaluated along the semantic dimension. This process is implemented in steps 206 to 209 described below.
206. Use the deep learning encoder model Skip-Thoughts to represent the questions under the selected question templates as sentence vectors.
This step calculates the cluster compactness of the clustered questions to ensure that the different questions under one template are highly similar semantically, thereby evaluating the generated templates along the semantic dimension. Questions are represented as sentence vectors, and the sentence vector model uses the Skip-Thoughts algorithm open-sourced by Google. This is an unsupervised model that represents a sentence as a vector of fixed dimension and, when trained on a large-scale corpus, represents semantics well. The model is trained offline. Training is based on word vectors derived from the log corpus; when out-of-vocabulary words are encountered, an external Chinese Wikipedia corpus can preferably be used to extend the vocabulary.
207. Calculate the cluster compactness of the question template using the following formula:
$$\overline{CP}_j = \frac{1}{|\Omega_j|} \sum_{x_i \in \Omega_j} \lVert x_i - W_j \rVert$$
The cluster compactness of multiple question templates of different categories can be calculated with this formula. In the formula, CP_j is the calculated cluster compactness of the j-th question template; x_i is the sentence vector of the i-th question under the j-th question template; W_j is the mean of all vectors in the cluster corresponding to the j-th question template; Ω_j is the set of all vectors in the cluster corresponding to the j-th question template, so that |Ω_j| is the number of vectors in that cluster; and i and j are integers greater than or equal to 1.
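The compactness calculation can be sketched in a few lines. This is an illustrative sketch only: it assumes that the sentence vectors are already available as plain lists of floats (the Skip-Thoughts encoding itself is not shown), that the norm is Euclidean, and that the value is averaged over the number of vectors in the cluster. The function name `cluster_compactness` is hypothetical.

```python
import math

def cluster_compactness(vectors):
    """Mean Euclidean distance from each sentence vector to the cluster centroid."""
    n = len(vectors)
    dim = len(vectors[0])
    # W_j: component-wise mean of all vectors in the cluster.
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Average of ||x_i - W_j|| over the cluster.
    return sum(math.dist(v, centroid) for v in vectors) / n

# Toy 2-dimensional "sentence vectors" for the questions under one template.
print(cluster_compactness([[0.0, 0.0], [2.0, 0.0]]))  # 1.0
```

Under this definition, identical vectors give a value of 0, and the value grows as the questions under a template spread further from their mean.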
208. According to a preset template cluster compactness threshold, select the question templates whose cluster compactness is greater than the compactness threshold. For example, a cluster compactness threshold k3 is defined for templates and serves as the basis for evaluating the candidate templates produced in the preceding steps; templates whose cluster compactness is greater than the threshold are selected. This process evaluates the quality of the generated templates along the semantic dimension, yielding high-quality question templates with higher accuracy. Preferably, the initial value of the threshold k3 can be set by randomly sampling some templates from the original template library, calculating the cluster compactness of each, and averaging the results.
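The threshold initialization and screening described in step 208 can be sketched as follows. The sketch assumes that compactness values have already been computed for each template and are held in dictionaries keyed by template name; the function names, the `seed` parameter, and the sample data are hypothetical. The comparison direction (greater than k3) follows the text as written.

```python
import random

def initial_threshold(library_compactness, sample_size, seed=0):
    """Average the compactness of randomly sampled templates from the existing library."""
    rng = random.Random(seed)
    sampled = rng.sample(sorted(library_compactness), sample_size)
    return sum(library_compactness[name] for name in sampled) / sample_size

def screen_templates(candidate_compactness, k3):
    """Keep candidates whose compactness is greater than k3, as stated in step 208."""
    return [name for name, cp in candidate_compactness.items() if cp > k3]

# Hypothetical compactness values for templates already in the library.
library = {"t1": 0.2, "t2": 0.4, "t3": 0.6}
k3 = initial_threshold(library, sample_size=3)
selected = screen_templates({"cand_a": 0.5, "cand_b": 0.3}, k3)
print(selected)
```

Sampling only part of the library keeps the initialization cheap; the threshold can later be tuned empirically.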
It should be noted that step 208 above is not limited to the manner of operation described; any feasible manner or method in the prior art may be used, and the embodiment of the present invention does not specifically limit it.
209. Look up the selected question template in the template library and compare; if the selected question template does not exist in the template library, save it to the template library.
210. Add the answer corresponding to the selected question template so that, together with the selected question template, it forms a complete question template question-answer pair, and save the pair to the template library.
It should be noted that, in performing steps 209 to 210 above, any feasible process or manner in the prior art may be used without departing from the inventive concept of these steps of the embodiment of the present invention; the embodiment of the present invention does not specifically limit it.
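Steps 209 and 210 amount to a lookup followed by a conditional save. The following minimal sketch assumes that the template library is an in-memory dictionary keyed by the template's symbol sequence; the function name `save_template` and the sample answer text are hypothetical, and a real system would presumably use persistent storage.

```python
def save_template(library, template, answer=None):
    """Store a template (with its answer) only if the library does not already contain it."""
    key = tuple(template)
    if key in library:
        return False  # already present: nothing to save
    library[key] = answer  # template and answer form a question-answer pair
    return True

library = {}
template = ["disease", "question_feasible", "apply_v"]
print(save_template(library, template, "Placeholder answer text"))  # True
print(save_template(library, template, "Placeholder answer text"))  # False
```

Keying the library by the symbol sequence makes the lookup in step 209 a constant-time membership test.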
Embodiment 3
Fig. 3 is a schematic structural diagram of the automatic question template generation device provided in Embodiment 3 of the present invention. As shown in Fig. 3, the automatic question template generation device provided in this embodiment of the present invention includes:
A preparation module 1, configured to prepare the question log corpus. Specifically, the preparation module 1 includes an acquisition module and a preprocessing module. The acquisition module is configured to obtain the question log corpus, and the preprocessing module is configured to preprocess the question log corpus, including removal of punctuation marks, removal of illegal symbols, and conversion of letter case.
A word segmentation and part-of-speech tagging module 2, configured to perform word segmentation and part-of-speech tagging. Specifically, different types of industry dictionaries can be created as needed or for different industries. Particularly for a specific vertical industry, the word segmentation and part-of-speech tagging module 2 segments the corresponding log corpus using a segmentation method combined with an industry dictionary, which yields better segmentation results.
A named entity recognition module 3, configured to perform named entity recognition and replacement. Specifically, the named entity recognition module 3 is configured to: perform named entity recognition on general entities appearing in the question log corpus, including times, numbers, and/or place names, and replace each general entity with its corresponding entity tag.
A semantic replacement module 4, configured to perform semantic replacement. The semantic replacement module 4 is configured to: look up, through a semantic network, the words obtained by segmenting the questions in the question log corpus; abstract words with the same or similar meanings into a unified label according to their definitions; perform the corresponding replacement; and generate a symbol label sequence composed of named entities and the semantic concepts produced by semantic replacement.
A frequent itemset mining module 5, configured to mine frequent itemsets and generate question templates. Specifically, the frequent itemset mining module 5 is configured to: according to a preset frequency threshold range and a preset itemset length threshold range, use a predetermined association rule algorithm to select frequent itemsets from the symbol label sequences, and generate a question template from the sequence formed by arranging the items in their default order.
Preferably, the above automatic question template generation device further includes:
A sentence vector representation module 6, configured to use a preset sentence vector model to represent the questions under the clustered question templates as sentence vectors; preferably, the preset sentence vector model is the deep learning encoder model Skip-Thoughts.
A cluster compactness calculation module 7, configured to calculate the cluster compactness of the question template using the following formula:
$$\overline{CP}_j = \frac{1}{|\Omega_j|} \sum_{x_i \in \Omega_j} \lVert x_i - W_j \rVert$$
In the formula, CP_j is the calculated cluster compactness of the j-th question template; x_i is the sentence vector of the i-th question under the j-th question template; W_j is the mean of all vectors in the cluster corresponding to the j-th question template; Ω_j is the set of all vectors in the cluster corresponding to the j-th question template, so that |Ω_j| is the number of vectors in that cluster; and i and j are integers greater than or equal to 1.
A screening module 8, configured to select, according to a preset template cluster compactness threshold, the question templates whose cluster compactness is greater than the compactness threshold;
A determining and saving module 9, configured to look up the selected question template in the template library and compare, and, if the selected question template does not exist in the template library, save it to the template library;
Further preferably, the above automatic question template generation device further includes:
An answer adding module 10, configured to add the answer corresponding to the selected question template so that, together with the selected question template, it forms a complete question template question-answer pair, and to save the pair to the template library.
It should be noted that when the automatic question template generation device provided in the above embodiment performs automatic question template generation, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the automatic question template generation device provided in the above embodiment and the embodiments of the automatic question template generation method belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not repeated here.
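How modules 1, 3, and 4 might hand data to one another can be illustrated with a toy pipeline. This sketch makes several simplifying assumptions that are not in the original: whitespace splitting stands in for the dictionary-based segmentation of module 2, only numeric entities are recognized, the entity tag name `number` is hypothetical, and the small `SEMANTIC_LABELS` dictionary is a hypothetical stand-in for the semantic network lookup. The labels `disease`, `question_feasible`, and `apply_v` are taken from the worked example in the description.

```python
import re

NON_WORD = re.compile(r"[^\w\s]")  # punctuation and other non-word symbols

# Hypothetical word-to-label table standing in for the semantic network.
SEMANTIC_LABELS = {
    "can": "question_feasible",
    "buy": "apply_v",
    "insure": "apply_v",
    "hyperthyroidism": "disease",
    "diabetes": "disease",
}

def preprocess(question):
    """Module 1: strip symbols, unify letter case, and split into tokens."""
    return NON_WORD.sub(" ", question).lower().split()

def to_symbol_sequence(question, labels=SEMANTIC_LABELS):
    """Modules 3 and 4: replace numbers with an entity tag and words with semantic labels."""
    sequence = []
    for token in preprocess(question):
        if token.isdigit():
            sequence.append("number")  # entity tag for a numeric entity
        else:
            sequence.append(labels.get(token, token))  # unmatched tokens are kept
    return sequence

print(to_symbol_sequence("Can hyperthyroidism buy?"))
# ['question_feasible', 'disease', 'apply_v']
```

The resulting symbol label sequences are the input that the frequent itemset mining module 5 operates on.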
In conclusion question sentence template automatic generation method provided in an embodiment of the present invention and device, relative to the prior art With following beneficial aspects:
1st, by semantic replacement step, the word of more justice of word one is carried out by abstract unification according to paraphrase, so as to add semanteme Generalization ability;
2nd, by carrying out frequent item set mining, frequent item set is looked for from candidate, question sentence template is generated, improves question sentence mould The efficiency of plate generation, substantial saving in manual resource;
3rd, by according to predetermined sequence frequency threshold range and predetermined sequence length threshold scope, being screened from frequent item set Go out satisfactory item collection, to generate question sentence template, the sentence with similar structure and public word sequence can be clustered out, obtain The question sentence template of better quality;
4th, default sentence vector model sentence vector characterization, calculating cluster compactness and the satisfactory question sentence mould of screening are utilized Plate, can realize semantic dimension to generate template quality evaluation, so as to obtain the high quality question sentence template of accuracy higher;
5th, the question sentence template filtered out is subjected to lookup contrast in template library, if the question sentence filtered out is not present in template library Template, the question sentence template filtered out is preserved to template library, easy to generate effective renewal of the template in template library;
6th, increase answer corresponding with the question sentence template filtered out, complete question sentence mould is formed with the question sentence template filtered out Plate question and answer pair, preserve to template library, it is ensured that the integrality of template question and answer pair in template library, realization automatically generate answering for question sentence template Case matches.
In general, question sentence template automatic generation method provided in an embodiment of the present invention and device, can not only be efficiently Question sentence template is automatically generated, and quality evaluation, autonomous lasting extension or renewal question sentence can be carried out to the question sentence template of generation Template library, lifts the quality of intelligent Answer System knowledge base, in intelligent answer etc. the technical field of customer service can be needed to carry out extensive Popularization and application.
One of ordinary skill in the art will appreciate that hardware can be passed through by realizing all or part of step of above-described embodiment To complete, relevant hardware can also be instructed to complete by program, the program can be stored in a kind of computer-readable In storage medium, storage medium mentioned above can be read-only storage, disk or CD etc..
Above-mentioned all optional technical solutions, can use any combination to form the alternative embodiment of the present invention, herein no longer Repeat one by one.
The foregoing is merely presently preferred embodiments of the present invention, is not intended to limit the invention, it is all the present invention spirit and Within principle, any modification, equivalent replacement, improvement and so on, should all be included in the protection scope of the present invention.

Claims (18)

  1. An automatic question template generation method, characterized in that the method comprises:
    preparing a question log corpus;
    performing word segmentation and part-of-speech tagging on the log corpus;
    performing named entity recognition and replacement;
    performing semantic replacement;
    mining frequent itemsets and generating question templates.
  2. The method according to claim 1, characterized in that preparing the question log corpus comprises:
    obtaining the question log corpus and preprocessing the question log corpus, including removal of punctuation marks, removal of illegal symbols, and conversion of letter case.
  3. The method according to claim 1, characterized in that performing word segmentation and part-of-speech tagging on the log corpus comprises:
    segmenting the log corpus using a segmentation method combined with an industry dictionary.
  4. The method according to claim 1, characterized in that performing named entity recognition and replacement comprises:
    performing named entity recognition on general entities appearing in the question log corpus, including times, numbers, and/or place names, and replacing each general entity with its corresponding entity tag.
  5. The method according to claim 1, characterized in that performing semantic replacement comprises:
    looking up, through a semantic network, the words obtained by segmenting the questions in the question log corpus; abstracting words with the same or similar meanings into a unified label according to their definitions; performing the corresponding replacement; and generating a symbol label sequence composed of named entities and the semantic concepts produced by semantic replacement.
  6. The method according to claim 5, characterized in that mining frequent itemsets and generating question templates comprises:
    obtaining frequent itemsets from the candidate itemsets of the question log corpus by setting threshold ranges, and generating question templates.
  7. The method according to claim 6, characterized in that mining frequent itemsets and generating question templates comprises:
    according to a preset frequency threshold range and a preset itemset length threshold range, using a predetermined association rule algorithm to select frequent itemsets from the symbol label sequences, and generating a question template from the sequence formed by arranging the items in their default order.
  8. The method according to claim 6 or 7, characterized in that the method further comprises:
    using a preset sentence vector model to represent the questions of the selected question templates as sentence vectors;
    calculating the cluster compactness of the question template using the following formula:
    $$\overline{CP}_j = \frac{1}{|\Omega_j|} \sum_{x_i \in \Omega_j} \lVert x_i - W_j \rVert$$
    selecting, according to a preset template cluster compactness threshold, the question templates whose cluster compactness is greater than the compactness threshold;
    looking up the selected question template in the template library and comparing, and, if the selected question template does not exist in the template library, saving it to the template library;
    wherein, in the formula, CP_j is the calculated cluster compactness of the j-th question template; x_i is the sentence vector of the i-th question under the j-th question template; W_j is the mean of all vectors in the cluster corresponding to the j-th question template; Ω_j is the set of all vectors in the cluster corresponding to the j-th question template, so that |Ω_j| is the number of vectors in that cluster; and i and j are integers greater than or equal to 1.
  9. The method according to claim 8, characterized in that the preset sentence vector model is the deep learning encoder model Skip-Thoughts.
  10. The method according to claim 8, characterized in that the method further comprises:
    adding the answer corresponding to the selected question template so that, together with the selected question template, it forms a complete question template question-answer pair, and saving the pair to the template library.
  11. An automatic question template generation device, characterized in that it comprises:
    a preparation module, configured to prepare a question log corpus;
    a word segmentation and part-of-speech tagging module, configured to perform word segmentation and part-of-speech tagging;
    a named entity recognition module, configured to perform named entity recognition and replacement;
    a semantic replacement module, configured to perform semantic replacement;
    a frequent itemset mining module, configured to mine frequent itemsets and generate question templates.
  12. The device according to claim 11, characterized in that the preparation module comprises an acquisition module and a preprocessing module, the acquisition module being configured to obtain the question log corpus and the preprocessing module being configured to preprocess the question log corpus, including removal of punctuation marks, removal of illegal symbols, and conversion of letter case.
  13. The device according to claim 11, characterized in that the named entity recognition module is configured to: perform named entity recognition on general entities appearing in the question log corpus, including times, numbers, and/or place names, and replace each general entity with its corresponding entity tag.
  14. The device according to claim 11, characterized in that the semantic replacement module is configured to: look up, through a semantic network, the words obtained by segmenting the questions in the question log corpus; abstract words with the same or similar meanings into a unified label according to their definitions; perform the corresponding replacement; and generate a symbol label sequence composed of named entities and the semantic concepts produced by semantic replacement.
  15. The device according to claim 14, characterized in that the frequent itemset mining module is configured to: according to a preset frequency threshold range and a preset itemset length threshold range, use a predetermined association rule algorithm to select frequent itemsets from the symbol label sequences, and generate a question template from the sequence formed by arranging the items in their default order.
  16. The device according to claim 14 or 15, characterized in that the device further comprises:
    a sentence vector representation module, configured to use a preset sentence vector model to represent the questions of the selected question templates as sentence vectors;
    a cluster compactness calculation module, configured to calculate the cluster compactness of the question template using the following formula:
    $$\overline{CP}_j = \frac{1}{|\Omega_j|} \sum_{x_i \in \Omega_j} \lVert x_i - W_j \rVert$$
    a screening module, configured to select, according to a preset template cluster compactness threshold, the question templates whose cluster compactness is greater than the compactness threshold;
    a determining and saving module, configured to look up the selected question template in the template library and compare, and, if the selected question template does not exist in the template library, save it to the template library;
    wherein, in the formula, CP_j is the calculated cluster compactness of the j-th question template; x_i is the sentence vector of the i-th question under the j-th question template; W_j is the mean of all vectors in the cluster corresponding to the j-th question template; Ω_j is the set of all vectors in the cluster corresponding to the j-th question template, so that |Ω_j| is the number of vectors in that cluster; and i and j are integers greater than or equal to 1.
  17. The device according to claim 16, characterized in that the preset sentence vector model is the deep learning encoder model Skip-Thoughts.
  18. The device according to claim 16, characterized in that the device further comprises:
    an answer adding module, configured to add the answer corresponding to the selected question template so that, together with the selected question template, it forms a complete question template question-answer pair, and to save the pair to the template library.
CN201711436114.0A 2017-12-26 2017-12-26 Automatic question template generating method and device Active CN108038234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711436114.0A CN108038234B (en) 2017-12-26 2017-12-26 Automatic question template generating method and device

Publications (2)

Publication Number Publication Date
CN108038234A true CN108038234A (en) 2018-05-15
CN108038234B CN108038234B (en) 2021-06-15

Family

ID=62101304

Country Status (1)

Country Link
CN (1) CN108038234B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060004561A1 (en) * 2004-06-30 2006-01-05 Microsoft Corporation Method and system for clustering using generalized sentence patterns
US20090106019A1 (en) * 2004-08-31 2009-04-23 Microsoft Corporation Method and system for prioritizing communications based on sentence classifications
CN105868313A (en) * 2016-03-25 2016-08-17 浙江大学 Mapping knowledge domain questioning and answering system and method based on template matching technique
CN106649612A (en) * 2016-11-29 2017-05-10 中国银联股份有限公司 Method and device for matching automatic question and answer template

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
向鑫: "基于语义逻辑推理的地理试题解答方法研究", 《中国优秀硕士学位论文全文数据库 哲学与人文科学辑》 *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776677A (en) * 2018-05-28 2018-11-09 深圳前海微众银行股份有限公司 Creation method, equipment and the computer readable storage medium of parallel statement library
CN108776677B (en) * 2018-05-28 2021-11-12 深圳前海微众银行股份有限公司 Parallel sentence library creating method and device and computer readable storage medium
CN108829680A (en) * 2018-06-22 2018-11-16 北京百悟科技有限公司 A kind of violation publicity detection method and device, computer readable storage medium
CN110738033A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN110738033B (en) * 2018-07-03 2023-09-19 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN109241251A (en) * 2018-07-27 2019-01-18 众安信息技术服务有限公司 A kind of session interaction method
CN109241251B (en) * 2018-07-27 2022-05-27 众安信息技术服务有限公司 Conversation interaction method
CN109033390A (en) * 2018-07-27 2018-12-18 深圳追科技有限公司 The method and apparatus for automatically generating similar question sentence
CN109461039A (en) * 2018-08-28 2019-03-12 厦门快商通信息技术有限公司 A kind of text handling method and intelligent customer service method
CN109522534A (en) * 2018-10-12 2019-03-26 北京来也网络科技有限公司 Task creating method and device for corpus processing
CN109522534B (en) * 2018-10-12 2022-12-13 北京来也网络科技有限公司 Task generation method and device for corpus processing
CN109408821A (en) * 2018-10-22 2019-03-01 腾讯科技(深圳)有限公司 Corpus generation method and device, computing equipment and storage medium
CN109408821B (en) * 2018-10-22 2020-09-04 腾讯科技(深圳)有限公司 Corpus generation method and device, computing equipment and storage medium
CN109271492A (en) * 2018-11-16 2019-01-25 广东小天才科技有限公司 Method and system for automatically generating corpus regular expressions
CN113169931A (en) * 2018-11-16 2021-07-23 利维帕尔森有限公司 Script-based automated robot program creation
CN109597873A (en) * 2018-11-21 2019-04-09 腾讯科技(深圳)有限公司 Corpus data processing method and device, computer readable medium and electronic equipment
CN109597873B (en) * 2018-11-21 2022-02-08 腾讯科技(深圳)有限公司 Corpus data processing method and device, computer readable medium and electronic equipment
CN110196897A (en) * 2019-05-23 2019-09-03 竹间智能科技(上海)有限公司 Case recognition method based on question and answer template
CN110362803A (en) * 2019-07-19 2019-10-22 北京邮电大学 Text template generation method based on the combination of domain features morphology
CN110727780A (en) * 2019-10-17 2020-01-24 福建天晴数码有限公司 System and method for automatically expanding acquaintance text
CN111597322B (en) * 2019-12-28 2023-04-21 华南理工大学 Automatic template mining system and method based on frequent item sets
CN111552862B (en) * 2019-12-28 2023-04-21 华南理工大学 Automatic template mining system and method based on cross support evaluation
CN111552862A (en) * 2019-12-28 2020-08-18 华南理工大学 Automatic template mining system and method based on cross support degree evaluation
CN111597322A (en) * 2019-12-28 2020-08-28 华南理工大学 Automatic template mining system and method based on frequent item set
CN113127610A (en) * 2019-12-31 2021-07-16 北京猎户星空科技有限公司 Data processing method, device, equipment and medium
CN113127610B (en) * 2019-12-31 2024-04-19 北京猎户星空科技有限公司 Data processing method, device, equipment and medium
CN111309858A (en) * 2020-01-20 2020-06-19 腾讯科技(深圳)有限公司 Information identification method, device, equipment and medium
CN111309858B (en) * 2020-01-20 2023-03-07 腾讯科技(深圳)有限公司 Information identification method, device, equipment and medium
CN111274361A (en) * 2020-01-21 2020-06-12 北京明略软件系统有限公司 Industry new word discovery method and device, storage medium and electronic equipment
CN111382256B (en) * 2020-03-20 2024-04-09 北京百度网讯科技有限公司 Information recommendation method and device
CN111382256A (en) * 2020-03-20 2020-07-07 北京百度网讯科技有限公司 Information recommendation method and device
CN112948561A (en) * 2021-03-29 2021-06-11 建信金融科技有限责任公司 Method and device for automatically expanding question-answer knowledge base
CN117130791B (en) * 2023-10-26 2023-12-26 南通话时代信息科技有限公司 Computing power resource allocation method and system of cloud customer service platform
CN117130791A (en) * 2023-10-26 2023-11-28 南通话时代信息科技有限公司 Computing power resource allocation method and system of cloud customer service platform

Also Published As

Publication number Publication date
CN108038234B (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN108038234A (en) A kind of question sentence template automatic generation method and device
CN106484664B (en) Method for calculating similarity between short texts
Bang et al. Explaining a black-box by using a deep variational information bottleneck approach
CN107368468A (en) Method and system for generating an O&M knowledge graph
CN108509519A (en) World knowledge graph enhanced question-answering interactive system and method based on deep learning
CN105868184A (en) Chinese name recognition method based on recurrent neural network
CN106844346A (en) Short-text semantic similarity discrimination method and system based on deep learning model Word2Vec
CN112507136B (en) Knowledge-driven business operation map construction method
DE112013004082T5 (en) Search system of the emotion entity for the microblog
CN106503148B (en) Table entity linking method based on multiple knowledge bases
CN107633005A (en) Knowledge graph construction and comparison system and method based on classroom teaching content
CN106570708A (en) Management method and management system of intelligent customer service knowledge base
CN108197294A (en) Automatic text generation method based on deep learning
CN107301170A (en) Artificial-intelligence-based sentence segmentation method and apparatus
CN112100397A (en) Electric power plan knowledge graph construction method and system based on bidirectional gated recurrent unit
CN106886576A (en) Short-text keyword extraction method and system based on pre-classification
CN106683667A (en) Automatic rhythm extracting method, system and application thereof in natural language processing
CN110162631A (en) TRIZ inventive principle-oriented Chinese patent classification method, system and storage medium
Khetarpal et al. Spatial terms across languages support near-optimal communication: Evidence from Peruvian Amazonia, and computational analyses
CN110188359A (en) Text entity extraction method
CN109670042A (en) Examination question classification and difficulty grading method based on recurrent neural network
CN111008215B (en) Expert recommendation method combining label construction and community relation avoidance
CN116010581A (en) Knowledge graph question-answering method and system based on power grid hidden trouble shooting scene
CN114265937A (en) Intelligent classification analysis method and system of scientific and technological information, storage medium and server
CN102063497A (en) Open type knowledge sharing platform and entry processing method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240306

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240415

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China