CN110209780A - Question template generation method, device, server and storage medium - Google Patents

Publication number
CN110209780A
Authority
CN
China
Prior art keywords
candidate
question template
attribute information
seed pattern
template
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810890730.1A
Other languages
Chinese (zh)
Other versions
CN110209780B (en)
Inventor
高航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810890730.1A
Publication of CN110209780A
Application granted
Publication of CN110209780B
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a question template generation method, device, server and storage medium. Seed templates are expanded into multiple candidate question templates. For each candidate question template, all attribute information carried by the seed templates from which it was expanded is taken as the candidate attribute information corresponding to that candidate question template. Then, based on the likelihood that the candidate question template has each piece of candidate attribute information, one piece of candidate attribute information is selected as the attribute information of the candidate question template, so that question templates are generated automatically. Compared with the traditional approach of manually configuring each question template, this both saves labor cost and improves the efficiency of question template generation.

Description

Question template generation method, device, server and storage medium
Technical field
The present invention relates to the technical field of template mining, and in particular to a question template generation method, device, server and storage medium.
Background
To provide users with better search results, a search server is usually configured with multiple question templates. After obtaining a search condition, the search server can expand it based on the preset question templates to obtain multiple retrieval questions related to the search condition. Retrieving with these retrieval questions yields search results for both the search condition and each expanded retrieval question, which gives the user more search results to choose from and improves the search hit rate.
In the traditional approach, question templates are configured manually for each attribute in an entity set, where an entity set consists of multiple entities and each entity has at least one attribute. For example, for the film attribute in the entity set, the question template "Which films has @ acted in" is configured manually. However, manual configuration is inefficient because it depends on the working state of the people involved, and the number of templates that can be configured manually is limited; configuring a large number of question templates requires more labor cost.
Therefore, how to provide a question template generation method that reduces the low configuration efficiency and high labor cost caused by manually configuring question templates is a problem to be solved.
Summary of the invention
In view of this, embodiments of the present invention provide a question template generation method, device, server and storage medium, so as to generate question templates while reducing labor cost and improving generation efficiency.
To achieve the above object, embodiments of the present invention provide the following technical solutions:
A question template generation method, comprising:
obtaining multiple seed templates, each seed template carrying attribute information;
expanding the seed templates to generate at least one candidate question template;
for each candidate question template, determining at least one target seed template from which the candidate question template was expanded, and determining all attribute information carried by the at least one target seed template as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determining the similarity between the candidate question template and at least one seed template carrying that candidate attribute information, the similarity reflecting the likelihood that the candidate question template has that candidate attribute information;
for each candidate question template, selecting, according to the likelihood that the candidate question template has each of the different pieces of candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
A question template generation device, comprising:
a seed template acquiring unit, for obtaining multiple seed templates, each seed template carrying attribute information;
a candidate question template expanding unit, for expanding the seed templates to generate at least one candidate question template;
a target attribute information determination unit, for determining, for each candidate question template, at least one target seed template from which the candidate question template was expanded, and determining all attribute information carried by the at least one target seed template as the candidate attribute information corresponding to the candidate question template;
a similarity determining unit, for determining, for each piece of candidate attribute information corresponding to each candidate question template, the similarity between the candidate question template and at least one seed template carrying that candidate attribute information, the similarity reflecting the likelihood that the candidate question template has that candidate attribute information;
a question template generation unit, for selecting, for each candidate question template and according to the likelihood that the candidate question template has each of the different pieces of candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
A server, including at least one memory and at least one processor; the memory stores a program, the processor calls the program stored in the memory, and the program is used for:
obtaining multiple seed templates, each seed template carrying attribute information;
expanding the seed templates to generate at least one candidate question template;
for each candidate question template, determining at least one target seed template from which the candidate question template was expanded, and determining all attribute information carried by the at least one target seed template as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determining the similarity between the candidate question template and at least one seed template carrying that candidate attribute information, the similarity reflecting the likelihood that the candidate question template has that candidate attribute information;
for each candidate question template, selecting, according to the likelihood that the candidate question template has each of the different pieces of candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
A storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the question template generation method.
It can be seen from the above technical solutions that the user only needs to preset some seed templates carrying attribute information. The present invention expands the seed templates into multiple candidate question templates; for each candidate question template, takes all attribute information carried by the seed templates from which it was expanded as its candidate attribute information; and then, based on the likelihood that the candidate question template has each piece of candidate attribute information, selects one piece of candidate attribute information as the attribute information of the candidate question template, thereby generating question templates automatically. Moreover, because each seed template can be expanded into multiple candidate question templates used to generate question templates, the number of question templates generated automatically from seed templates is considerable. Compared with the traditional approach of manually configuring each question template, this both saves labor cost and improves the efficiency of question template generation.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic diagram of the composition of an application scenario of a question template generation method of the present application;
Fig. 2 is a hardware block diagram of a server provided by an embodiment of the present application;
Fig. 3 is a flowchart of a question template generation method provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a question template generation method provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of a search result provided by an embodiment of the present application;
Fig. 6 is a schematic diagram, provided by an embodiment of the present application, of calculating the similarity between a candidate question template and at least one seed template carrying attribute information;
Fig. 7 is a flowchart of another question template generation method provided by an embodiment of the present application;
Fig. 8 is a flowchart of a method, provided by an embodiment of the present application, for selecting target attribute information with a high similarity to a candidate question template as the attribute information of the candidate question template, so as to generate a question template;
Fig. 9 is a flowchart of a method, provided by an embodiment of the present application, for expanding seed templates to generate at least one candidate question template;
Fig. 10 is a flowchart of a method, provided by an embodiment of the present application, for retrieving with a seed question as the query condition to obtain at least one retrieval question related to the seed question;
Fig. 11 is a structural schematic diagram of a question template generation device provided by an embodiment of the present application.
Detailed description of the embodiments
To facilitate understanding of the scheme of the present application, some terms used in the embodiments are explained first.
Entity: from the perspective of data processing, an entity can be an objective thing in the real world that is distinguishable, identifiable and tangible (for example, a person such as a teacher or a student, or an object such as a book or a warehouse); it can also be an abstract event (for example, a performance or a football match).
Attribute: an attribute is a piece of descriptive information about an entity. The attribute name is the name of the attribute; the attribute value is the value corresponding to the attribute name and can be a number, text, or another entity. For example, if the entity is Zhang San, the entity can have at least one piece of attribute information, such as "film and TV star", "age", "gender", "film" and "song".
Entity set: an entity set is the combination of several entities, the attributes of the entities themselves, and the relation attributes between entities. An entity set can take the form of a knowledge base; this is only a preferred form of the entity set provided by the embodiments of the present application and is not limiting.
As an illustration of a relation attribute between entities: if the entity set includes the entity "Zhang San" and the entity "Song A", the entity "Song A" can be associated with the "song" attribute information of the entity "Zhang San".
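The terms above can be pictured with a small data structure. The following is only an illustrative sketch: the class name Entity, the field names and the helper entities_with_attribute are assumptions made for this example and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Entity:
    name: str
    # attribute name -> attribute values (numbers, text, or other Entity objects)
    attributes: Dict[str, List[Any]] = field(default_factory=dict)


# Hypothetical entity set: "Song A" is linked to the "song" attribute of "Zhang San".
song_a = Entity("Song A")
zhang_san = Entity("Zhang San", {"song": [song_a], "film": []})
entity_set: Dict[str, Entity] = {e.name: e for e in (zhang_san, song_a)}


def entities_with_attribute(entities: Dict[str, Entity], attribute: str) -> List[str]:
    """Return the names of all entities that have the given attribute name."""
    return [e.name for e in entities.values() if attribute in e.attributes]
```

Here the relation attribute between entities is represented simply by storing one Entity object as an attribute value of another; a real knowledge base would likely use identifiers instead.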
Seed template: a question template that has been manually reviewed or manually configured for an attribute in the entity set. For example, for the film attribute in the entity set, the seed template "Which films has @ acted in" can be configured.
Retrieval question: a question similar to a seed question that a search engine returns when the seed question is submitted to it as a query.
Candidate question template: a retrieval question becomes a candidate question template after its named entity is replaced.
Named entity recognition: named entity recognition (NER), also called "proper name recognition", refers to identifying entities with specific meaning in text, mainly including person names, place names, organization names and proper nouns.
With reference to the terms defined above, a question template generation method provided by the embodiments of the present application is described in detail below.
Question templates are mainly used in search servers. After receiving a search condition, a search server configured with question templates can expand the search condition based on the question templates to obtain multiple retrieval questions related to the search condition. It can then return the search results of the search condition and of each retrieval question to the user, providing more search results to choose from, improving the search hit rate, and in turn improving user stickiness.
The configuration of question templates is therefore crucial for a search engine and directly affects whether the search hit rate can be improved. For this reason, methods for generating question templates are receiving increasing attention and have become one of the main research directions in the retrieval field.
The traditional approach of manually configuring question templates often suffers from low generation efficiency and a small number of generated templates, for reasons such as the number of staff and their working state, and cannot meet the search server's demand for question templates. Increasing the number of generated templates requires more labor, which in turn leads to high labor cost.
To solve the low generation efficiency and high labor cost of traditional manual configuration, the inventor found that a back-labeling method can be used to generate question templates automatically.
The back-labeling method is implemented as follows: multiple triples {S, P, O} are set, where S is the subject, P is the attribute and O is the object; seed templates are expanded to obtain multiple retrieval question-answer pairs (a retrieval question-answer pair includes a retrieval question similar to the seed template and the search result of that retrieval question); for each retrieval question-answer pair, the target triple matching the pair is determined, and the retrieval question in the pair is labeled with the attribute P of the target triple, so as to generate a question template. Here, a retrieval question-answer pair matches a target triple when the retrieval question in the pair matches the subject S of the target triple and the search result in the pair matches the object O of the target triple.
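The matching step of the back-labeling method can be sketched as follows. This is a minimal illustration that assumes "matching" means plain substring containment; the text does not specify the matching rule, and the function name back_label and the sample triple are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject S, attribute P, object O)


def back_label(question: str, answer: str, triples: List[Triple]) -> Optional[str]:
    """Return the attribute P of the first triple whose subject S appears in the
    question and whose object O appears in the answer, or None if no triple matches.

    Substring containment is an assumed stand-in for the unspecified matching rule.
    """
    for subject, attribute, obj in triples:
        # Two matching passes per question-answer pair: S against the question,
        # O against the answer.
        if subject in question and obj in answer:
            return attribute
    return None


triples = [("Zhang San", "film", "Film A")]
label = back_label("Which films has Zhang San acted in", "He starred in Film A", triples)
```

The two containment checks per pair correspond to the two information matching processes that make this approach comparatively slow.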
However, although the back-labeling method can generate question templates automatically from seed templates, it must perform two information matching processes for each question-answer pair: matching the retrieval question of the pair against the subjects S of the triples, and matching the search result of the pair against the objects O of the triples. This often makes question template generation inefficient. In addition, because the search results in retrieval question-answer pairs are provided by users and are often non-standard, the generated question templates are often inaccurate.
To improve both the efficiency and the accuracy of question template generation, the inventor further improved the back-labeling method and thus proposes a question template generation method.
For ease of understanding the question template generation method provided by the embodiments of the present application, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
The composition of a system in an application scenario to which the question template generation method of the embodiments is applicable is introduced first. As shown in Fig. 1, the application scenario may include a server 11 and a searching platform 12.
Server 11 can be a question template generation server. It can be a service device on the network side that provides services for users, and can be a server cluster composed of multiple servers or a single server.
Searching platform 12 can be a server, a terminal, or the like, which is not limited here.
Server 11 is used to: obtain multiple seed templates, each carrying attribute information; generate at least one seed question related to a seed template; generate a URL link corresponding to the seed question, with the seed question as the query condition; send the URL link to searching platform 12; receive the search result returned by searching platform 12 after retrieving based on the URL link; extract at least one retrieval question related to the seed question from the search result; replace the entity in the retrieval question with a specific character to generate a candidate question template; for each candidate question template, determine at least one target seed template from which the candidate question template was expanded, and determine all attribute information carried by the at least one target seed template as the candidate attribute information corresponding to the candidate question template; for each piece of candidate attribute information corresponding to each candidate question template, determine the similarity between the candidate question template and at least one seed template carrying that candidate attribute information, the similarity reflecting the likelihood that the candidate question template has that candidate attribute information; and, for each candidate question template, select, according to the likelihood that the candidate question template has each of the different pieces of candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
Searching platform 12 is used to receive the URL link sent by server 11, retrieve based on the URL link, and return the search result to server 11.
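The step in which server 11 turns a seed question into a URL link can be sketched with the Python standard library. The base URL https://search.example.com/search and the query parameter name q are placeholders chosen for this example; the application does not name a search endpoint.

```python
from urllib.parse import urlencode


def build_search_url(seed_question: str,
                     base_url: str = "https://search.example.com/search",
                     param: str = "q") -> str:
    """Build a URL link that carries the seed question as the query condition.

    base_url and param are hypothetical; urlencode percent-encodes the question
    and encodes spaces as '+'.
    """
    return f"{base_url}?{urlencode({param: seed_question})}"


url = build_search_url("films Zhang San acted in")
```

Sending the link and parsing the returned result page depend on the searching platform and are left out of the sketch.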
The question template generation method provided by the embodiments of the present application is applied to server 11. Optionally, Fig. 2 shows a hardware block diagram of the server. Referring to Fig. 2, the hardware structure of the server may include a processor 21, a communication interface 22, a memory 23 and a communication bus 24;
In the embodiments of the present invention, there can be at least one each of the processor 21, communication interface 22, memory 23 and communication bus 24, and the processor 21, communication interface 22 and memory 23 communicate with one another through the communication bus 24;
Processor 21 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention;
Memory 23 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory;
The memory stores a program that the processor can call, and the program is used for:
obtaining multiple seed templates, each seed template carrying attribute information;
expanding the seed templates to generate at least one candidate question template;
for each candidate question template, determining at least one target seed template from which the candidate question template was expanded, and determining all attribute information carried by the at least one target seed template as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determining the similarity between the candidate question template and at least one seed template carrying that candidate attribute information, the similarity reflecting the likelihood that the candidate question template has that candidate attribute information;
for each candidate question template, selecting, according to the likelihood that the candidate question template has each of the different pieces of candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
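The last three steps (collecting candidate attribute information, scoring it by similarity, and selecting one attribute) can be sketched as below. The text up to this point does not say how the similarity is computed, so this sketch substitutes difflib.SequenceMatcher from the Python standard library and takes the maximum over the seed templates that carry an attribute; both choices are assumptions, as are the function names and sample templates.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; not the measure used in the application."""
    return SequenceMatcher(None, a, b).ratio()


def choose_attribute(candidate: str,
                     target_seeds: List[str],
                     seed_attributes: Dict[str, str]) -> Tuple[str, Dict[str, float]]:
    """Pick one attribute for a candidate question template.

    target_seeds: seed templates from which the candidate was expanded.
    seed_attributes: seed template -> attribute information it carries.
    """
    # Candidate attribute information: every attribute carried by a target seed template.
    seeds_by_attr: Dict[str, List[str]] = {}
    for seed in target_seeds:
        seeds_by_attr.setdefault(seed_attributes[seed], []).append(seed)
    # Score each candidate attribute by the best similarity to a seed carrying it
    # (taking the maximum is an assumed aggregation).
    scores = {attr: max(similarity(candidate, s) for s in seeds)
              for attr, seeds in seeds_by_attr.items()}
    return max(scores, key=scores.get), scores


seed_attributes = {"films @ has acted in": "film and TV star",
                   "songs @ has sung": "singer"}
best, scores = choose_attribute("which films @ has acted in",
                                list(seed_attributes), seed_attributes)
```

Because each candidate is compared only with the seed templates it was expanded from, no matching against triples or answer text is needed.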
Optionally, refinements and extensions of the program's functions are described below.
A question template generation method provided by the embodiments of the present application is described in detail below. Fig. 3 is a flowchart of a question template generation method provided by an embodiment of the present application.
As shown in Fig. 3, the method includes:
S301: obtain multiple seed templates, each seed template carrying attribute information;
Multiple seed templates are obtained, each carrying attribute information; the attribute information carried by different seed templates may be the same or different.
For example, a seed template may include a specific character. The seed template can be "films @ has acted in", in which case the attribute information it carries can be "film and TV star"; the seed template can also be "songs @ has sung", in which case the attribute information it carries can be "singer". Here @ is the specific character in the seed template.
Part 1 of the question template generation method schematic diagram in Fig. 4 shows 3 seed templates, namely seed template 1, seed template 2 and seed template 3, where seed template 1 carries attribute information 1, seed template 2 carries attribute information 2, and seed template 3 carries attribute information 1.
For example, the multiple seed templates obtained can be seed template 1, seed template 2 and seed template 3 shown in Fig. 4.
S302: expand the seed templates to generate at least one candidate question template;
For each seed template, each entity in the entity set that has the attribute information carried by the seed template is determined; for each determined entity, the specific character in the seed template is replaced with the entity to obtain a seed question.
For example, suppose the seed template is "films @ has acted in" and the attribute information it carries is "film and TV star". If 3 entities in the entity set have the attribute information "film and TV star", namely "Zhang San", "Li Si" and "Wang Wu", then replacing the specific character @ in the seed template with the entity "Zhang San" gives the seed question "films Zhang San has acted in"; replacing it with the entity "Li Si" gives the seed question "films Li Si has acted in"; and replacing it with the entity "Wang Wu" gives the seed question "films Wang Wu has acted in". Thus 3 seed questions are generated for the seed template "films @ has acted in": "films Zhang San has acted in", "films Li Si has acted in" and "films Wang Wu has acted in".
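The substitution described above can be sketched in a few lines. The function name generate_seed_questions and the dictionary layout of the entity set are assumptions made for this illustration.

```python
from typing import Dict, List, Set


def generate_seed_questions(seed_template: str,
                            attribute: str,
                            entity_attributes: Dict[str, Set[str]],
                            placeholder: str = "@") -> List[str]:
    """Replace the specific character in the seed template with every entity
    that has the attribute information carried by the seed template."""
    return [seed_template.replace(placeholder, name)
            for name, attrs in entity_attributes.items()
            if attribute in attrs]


# Hypothetical entity set: entity name -> attribute information it has.
entity_attributes = {
    "Zhang San": {"film and TV star", "singer"},
    "Li Si": {"film and TV star"},
    "Wang Wu": {"film and TV star"},
    "Song A": set(),
}
seed_questions = generate_seed_questions("films @ has acted in",
                                         "film and TV star", entity_attributes)
```

Entities without the attribute ("Song A" here) produce no seed question, so one seed template yields exactly one seed question per qualifying entity.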
For each kind of subproblem, multiple candidate question templates can be expanded.Optionally, for each kind of subproblem, It can be retrieved using this kind of subproblem as querying condition, obtain search result, search result is parsed to obtain and be planted At least one relevant search problem of subproblem.
It can be and retrieved by search engine using kind of subproblem as querying condition, obtain search result.In search result Search problem relevant to kind of subproblem can be search engine fed back in the search result page it is similar to kind of subproblem The problem of.
Diversity is usually compared in the statement of usually search problem, and stating can between the different search problems of same searching request To think similar.Such as search problem " may I ask the film that Zhang San drilled? ", search problem " which the film that Zhang San drilled has ", Search problem " may I ask which film that Zhang San drilled has ", search problem " may I ask which Zhang San's film drilled in 2012 has " institute The searching request of characterization is identical, is all the film that request search Zhang San drills, therefore, these search problems may be considered similar 's.
Optionally, search problem can be according to the actual situation as kind of subproblem or search problem;For example, search problem When " may I ask which the film that Zhang San drilled has " is kind of subproblem, include in obtained search result is related to this kind of subproblem Search problem may include search problem " may I ask the film that Zhang San drilled? ", search problem " which the film that Zhang San drilled has A bit ", problem " may I ask which Zhang San's film drilled in 2012 has " is searched for.
It is search result schematic diagram provided by the embodiments of the present application referring to Fig. 5.Search result as shown in Figure 5 is with seed Problem " film that Zhang San drilled " (referring to the mark 1 in Fig. 5) is the search result that querying condition is retrieved, the retrieval It as a result further include the querying condition " electricity that Zhang San acts the leading role in addition to the search result including kind of subproblem " film that Zhang San drilled " in Depending on play " (referring to the mark 2 in Fig. 5) search result and querying condition " which ancient costume TV play Zhang San drilled " (referring to Fig. 5 In mark 3) search result;Wherein, can regarding and plant subproblem as, " Zhang San drills querying condition " TV play that Zhang San acts the leading role " The relevant search problem of the film crossed ", querying condition " which ancient costume TV play Zhang San drilled " can be regarded as and be asked with seed Inscribe " film that Zhang San drilled " relevant search problem.
Taking one seed question as an example: the seed question is used as the query condition to obtain a search result, the search result is parsed to obtain each retrieved question related to the seed question, and each retrieved question is then processed to obtain the candidate question template corresponding to that retrieved question.
A retrieved question may be processed into a candidate question template as follows: the entities in the retrieved question are identified using a named entity recognition method, and each identified entity is replaced with a specific character to generate the candidate question template.
For example, if the retrieved question is "May I ask the TV series starring Zhang San, thanks", the entity identified in it is "Zhang San", and replacing that entity with the specific character yields the candidate question template. If the specific character is "@", the resulting candidate question template is "May I ask the TV series starring @, thanks".
Further, because retrieved questions are phrased in diverse ways, processing a retrieved question into a candidate question template in the question template generation method provided by the embodiments of the present application may also include denoising the retrieved question. Optionally, the denoising may be performed either after or before the entities in the retrieved question are replaced with the specific character.
In the embodiments of the present application, denoising a retrieved question preferably includes denoising the prefix of the retrieved question and/or denoising the suffix of the retrieved question.
Denoising the prefix of a retrieved question may refer to deleting a meaningless prefix from it; for example, "may I ask", "who knows" and "please answer" can all be regarded as meaningless prefixes. Denoising the suffix of a retrieved question may refer to deleting a meaningless suffix from it; for example, "thanks" and the like can be regarded as meaningless suffixes.
For example, if the retrieved question is "May I ask the TV series starring Zhang San, thanks", the result obtained after denoising may be "the TV series starring Zhang San". On this basis, if the specific character is "@", denoising this retrieved question and replacing its entity with the specific character yields the candidate question template "the TV series starring @".
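The denoising and entity replacement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the noise lists, the function names, and the idea of passing in an already-recognized entity list (instead of running a named entity recognizer) are all hypothetical and do not come from the application.

```python
# Hypothetical noise lists; the application names only a few example phrases.
NOISE_PREFIXES = ("may I ask", "who knows", "please answer")
NOISE_SUFFIXES = ("thanks",)
PLACEHOLDER = "@"  # the "specific character" used in the examples


def denoise(question: str) -> str:
    """Strip meaningless prefixes and suffixes from a retrieved question."""
    text = question.strip().rstrip("?!. ").strip()
    changed = True
    while changed:  # repeat until no prefix or suffix matches any more
        changed = False
        for prefix in NOISE_PREFIXES:
            if text.lower().startswith(prefix.lower()):
                text = text[len(prefix):].lstrip(" ,")
                changed = True
        for suffix in NOISE_SUFFIXES:
            if text.lower().endswith(suffix.lower()):
                text = text[:-len(suffix)].rstrip(" ,")
                changed = True
    return text


def to_candidate_template(question: str, entities: list[str]) -> str:
    """Denoise a question, then replace each given entity with the placeholder.

    `entities` stands in for the output of a named entity recognizer.
    """
    text = denoise(question)
    for entity in entities:
        text = text.replace(entity, PLACEHOLDER)
    return text


template = to_candidate_template(
    "May I ask the TV series starring Zhang San, thanks", ["Zhang San"]
)
print(template)
```

Because the two steps are independent string operations here, swapping their order gives the same template in this toy case, which matches the statement that either order may be used.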
Part1 of the question template generation schematic shown in Fig. 4 illustrates the seed questions generated from three seed patterns: seed pattern 1 generates three seed questions, namely seed questions 1, 2 and 3; seed pattern 2 generates two seed questions, namely seed questions 4 and 5; and seed pattern 3 generates three seed questions, namely seed questions 6, 7 and 8.
Further, for each seed question, retrieval is performed with that seed question as the query condition to obtain at least one retrieved question related to it. As shown in Fig. 4, the following are obtained: retrieved questions 1a and 1b, related to seed question 1; retrieved question 2, related to seed question 2; retrieved question 3, related to seed question 3; retrieved question 4, related to seed question 4; retrieved questions 5a and 5b, related to seed question 5; retrieved question 6, related to seed question 6; retrieved question 7, related to seed question 7; and retrieved question 8, related to seed question 8.
Further, each retrieved question may be denoised and have its entities replaced with the specific character, so as to generate the candidate question template corresponding to that retrieved question. In Fig. 4, the candidate question template that a retrieved question points to is the candidate question template generated for that retrieved question.
Referring to Fig. 4, the candidate question templates generated for retrieved questions 1a, 1b, 2, 3, 5a, 7 and 8 are identical, and are all candidate question template 1; the candidate question templates generated for retrieved questions 4, 5b and 6 are identical, and are all candidate question template 2.
S303, for each candidate question template, determine at least one target seed pattern from which that candidate question template was expanded, and determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to that candidate question template;
Optionally, for a candidate question template, the at least one target seed pattern from which it was expanded may be determined as follows: determine each retrieved question corresponding to the candidate question template, obtain the seed questions that produced the determined retrieved questions, determine the seed patterns that generated the obtained seed questions, and take the determined seed patterns as the target seed patterns.
Referring to Fig. 4, part1 of Fig. 4 yields two candidate question templates, namely candidate question template 1 and candidate question template 2. Tracing part1 backwards shows the following. Candidate question template 1 corresponds to seed questions 1, 2, 3, 5, 7 and 8; seed questions 1, 2 and 3 correspond to seed pattern 1, seed questions 7 and 8 correspond to seed pattern 3, and seed question 5 corresponds to seed pattern 2, so the seed patterns from which candidate question template 1 was expanded are seed patterns 1, 2 and 3. Candidate question template 2 corresponds to seed questions 4, 5 and 6; seed questions 4 and 5 correspond to seed pattern 2 and seed question 6 corresponds to seed pattern 3, so the seed patterns from which candidate question template 2 was expanded are seed patterns 2 and 3.
For ease of understanding, see part2 of Fig. 4. Part2 shows the seed questions corresponding to candidate question template 1 together with the seed patterns corresponding to those seed questions (which may be called the target seed patterns of candidate question template 1), and the seed questions corresponding to candidate question template 2 together with the seed patterns corresponding to those seed questions (which may be called the target seed patterns of candidate question template 2).
Further, after the target seed patterns from which a candidate question template was expanded have been determined, the attribute information carried by those target seed patterns can also be determined, and the determined attribute information is taken as the candidate attribute information corresponding to the candidate question template.
Referring to part2 of Fig. 4: after the target seed patterns of candidate question template 1 are determined to be seed patterns 1, 3 and 2, it is determined that seed patterns 1 and 3 carry attribute information 1 and seed pattern 2 carries attribute information 2; candidate question template 1 therefore has two pieces of candidate attribute information, namely attribute information 1 and attribute information 2. After the target seed patterns of candidate question template 2 are determined to be seed patterns 2 and 3, it is determined that seed pattern 2 carries attribute information 2 and seed pattern 3 carries attribute information 1; candidate question template 2 therefore also has two pieces of candidate attribute information, namely attribute information 1 and attribute information 2.
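The backward tracing of Fig. 4 can be illustrated with a small sketch. The relationships below are transcribed by hand from the description of Fig. 4; the data layout and function name are assumptions made only for this example.

```python
# Seed question -> seed pattern that generated it (Fig. 4, part1).
SEED_QUESTION_TO_PATTERN = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3}
# Seed pattern -> attribute information it carries (Fig. 4, part2).
PATTERN_TO_ATTRIBUTE = {1: "attribute 1", 2: "attribute 2", 3: "attribute 1"}
# One (seed question, candidate template) pair per retrieved question:
# 1a, 1b, 2, 3, 4, 5a, 5b, 6, 7, 8.
EXPANSIONS = [(1, 1), (1, 1), (2, 1), (3, 1), (4, 2),
              (5, 1), (5, 2), (6, 2), (7, 1), (8, 1)]


def trace_back(expansions, question_to_pattern, pattern_to_attribute):
    """Return (target seed patterns, candidate attributes) per candidate template."""
    targets = {}
    for seed_question, template in expansions:
        pattern = question_to_pattern[seed_question]
        targets.setdefault(template, set()).add(pattern)
    attributes = {
        template: {pattern_to_attribute[p] for p in patterns}
        for template, patterns in targets.items()
    }
    return targets, attributes


targets, attributes = trace_back(
    EXPANSIONS, SEED_QUESTION_TO_PATTERN, PATTERN_TO_ATTRIBUTE
)
print(targets)     # target seed patterns per candidate question template
print(attributes)  # candidate attribute information per candidate question template
```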
S304, for each kind of candidate attribute information corresponding to each candidate question template, determine the similarity between that candidate question template and the at least one seed pattern carrying that candidate attribute information, where the similarity reflects the likelihood that the candidate question template has that candidate attribute information;
After the candidate attribute information corresponding to a candidate question template has been determined, the similarity between the candidate question template and the at least one seed pattern carrying each kind of candidate attribute information can be determined for each kind of candidate attribute information corresponding to the candidate question template; the similarity reflects the likelihood that the candidate question template has that candidate attribute information.
Referring to Fig. 4, once it has been determined that the candidate attribute information corresponding to candidate question template 1 is attribute information 1 and attribute information 2, and that the candidate attribute information corresponding to candidate question template 2 is also attribute information 1 and attribute information 2, the following can be calculated. For candidate question template 1: its similarity to the at least one seed pattern carrying attribute information 1, and its similarity to the at least one seed pattern carrying attribute information 2. For candidate question template 2: its similarity to the at least one seed pattern carrying attribute information 1, and its similarity to the at least one seed pattern carrying attribute information 2. Here, the at least one seed pattern carrying attribute information 1 includes seed patterns 1 and 3, and the at least one seed pattern carrying attribute information 2 includes seed pattern 2.
Fig. 6 is a schematic diagram, provided by an embodiment of the present application, of calculating the similarity between a candidate question template and at least one seed pattern carrying attribute information. Optionally, Fig. 6 builds on Fig. 4, in which it was determined that the candidate attribute information of candidate question template 1 is attribute information 1 and attribute information 2, that the candidate attribute information of candidate question template 2 is attribute information 1 and attribute information 2, that the seed patterns carrying attribute information 1 include seed patterns 1 and 3, and that the seed patterns carrying attribute information 2 include seed pattern 2. Therefore, referring to Fig. 6, the following are calculated: the similarity of candidate question template 1 to seed patterns 1 and 3, which carry attribute information 1; the similarity of candidate question template 1 to seed pattern 2, which carries attribute information 2; the similarity of candidate question template 2 to seed patterns 1 and 3, which carry attribute information 1; and the similarity of candidate question template 2 to seed pattern 2, which carries attribute information 2. In Fig. 6, an arrow from a candidate question template to a piece of attribute information indicates that the similarity between that candidate question template and each seed pattern included under that attribute needs to be calculated.
S305, for each candidate question template, according to the likelihood that the candidate question template has each of the different candidate attribute information, select one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
For example, each candidate question template corresponds to multiple pieces of candidate attribute information. The likelihood that the candidate question template has each corresponding piece of candidate attribute information is determined, and based on the determined likelihoods, one piece of candidate attribute information is selected from the candidate attribute information corresponding to the candidate question template as its attribute information, so as to generate a question template.
For example, with reference to Fig. 4, the candidate attribute information corresponding to candidate question template 1 is attribute information 1 and attribute information 2. According to the likelihood that candidate question template 1 has attribute information 1 and the likelihood that it has attribute information 2, one of the two is selected as the attribute information of candidate question template 1, so as to generate a question template.
For ease of understanding, a flowchart of another question template generation method is now provided; see Fig. 7.
As shown in fig. 7, this method comprises:
S701, obtain each seed pattern in the seed pattern set, each seed pattern carrying attribute information;
S702, expand the seed patterns to generate at least one candidate question template;
S703, for each candidate question template, determine at least one target seed pattern from which that candidate question template was expanded, and determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to that candidate question template;
S704, for each kind of candidate attribute information corresponding to each candidate question template, determine the similarity between that candidate question template and the at least one seed pattern carrying that candidate attribute information, where the similarity reflects the likelihood that the candidate question template has that candidate attribute information;
S705, for each candidate question template, according to the likelihood that the candidate question template has each of the different candidate attribute information, select one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template;
S706, store the question template in the seed pattern set as a seed pattern, and return to step S701.
In the embodiments of the present application, previously generated question templates are used as seed patterns for the subsequent generation of new question templates, so the number of seed patterns keeps growing. As a result, each later run of the question template generation process can generate more question templates. Moreover, as the number of seed patterns increases, the calculated similarity between a candidate question template and the seed patterns carrying attribute information can become more accurate, which further improves the accuracy of the generated question templates.
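The feedback loop of steps S701 to S706 can be sketched as follows. This is a minimal illustration, not the application's implementation: `grow_seed_set`, the two callables it takes, the `rounds` limit, and the toy expansion rule are all hypothetical stand-ins for the steps described above.

```python
def grow_seed_set(seed_set, expand, assign_attribute, rounds):
    """Repeatedly expand seed patterns and feed new templates back as seeds.

    seed_set: dict mapping seed pattern text -> attribute information.
    expand: callable(seeds) -> iterable of candidate templates (S702).
    assign_attribute: callable(template, seeds) -> attribute (S703 to S705).
    rounds: number of passes; a bound added here so the sketch terminates.
    """
    seeds = dict(seed_set)  # S701: start from the current seed pattern set
    for _ in range(rounds):
        for template in list(expand(seeds)):
            if template in seeds:
                continue  # already stored in an earlier round
            # S706: store the new question template as a seed pattern
            seeds[template] = assign_attribute(template, seeds)
    return seeds


# Toy stand-ins: "expansion" just prepends a question word, and the
# attribute is copied from the seed pattern the candidate came from.
def toy_expand(seeds):
    return {"which " + t for t in seeds if not t.startswith("which ")}


def toy_assign(template, seeds):
    return seeds[template[len("which "):]]


result = grow_seed_set({"films starring @": "film"}, toy_expand, toy_assign, rounds=2)
print(result)
```

In the second round the enlarged seed set is the input, which is the point of step S706: templates produced earlier take part in later expansion and similarity calculation.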
To help those skilled in the art understand the question template generation method provided by the embodiments of the present application, the manner of determining the similarity between a candidate question template and the at least one seed pattern carrying candidate attribute information is now described in detail.
The similarity between a candidate question template and the at least one seed pattern carrying the candidate attribute information may be determined as follows: over multiple preset dimensions, determine the similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in each of those dimensions.
For example, multiple different dimensions can be set, and for each dimension, the similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in that dimension is determined.
The similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in one dimension may be determined as follows: calculate the edit distance between the candidate question template and each seed pattern carrying the candidate attribute information; then select the maximum of these edit distances as the similarity, in that dimension, between the candidate question template and the at least one seed pattern carrying the candidate attribute information. For ease of description, this similarity is referred to here as the edit distance between the candidate question template and the at least one seed pattern carrying the candidate attribute information.
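As an illustration of this dimension, the sketch below computes a character-level Levenshtein distance and then takes the maximum over the seed patterns, as the paragraph above states. The choice of Levenshtein distance at character level and the function names are assumptions; the application does not specify which edit distance variant is used.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def edit_distance_feature(candidate: str, seed_patterns: list[str]) -> int:
    """Maximum edit distance between the candidate and the given seed patterns."""
    return max(edit_distance(candidate, pattern) for pattern in seed_patterns)


print(edit_distance_feature("films starring @", ["films of @", "films starring @"]))
```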
Alternatively, the similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in one dimension may be determined as follows: concatenate the texts of the seed patterns carrying the candidate attribute information to obtain a spliced text; then take the TF-IDF (term frequency-inverse document frequency) similarity between the text of the candidate question template and the spliced text as the similarity, in that dimension, between the candidate question template and the at least one seed pattern carrying the candidate attribute information. For ease of description, this similarity is referred to here as the TF-IDF similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information.
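One possible reading of the TF-IDF similarity is sketched below: both texts are turned into TF-IDF vectors and compared by cosine. Several details are assumptions made for the example only, since the application does not specify them: whitespace tokenization, a smoothed IDF formula, and a two-document corpus made of the candidate text and the spliced text.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)  # smoothed IDF
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def tfidf_similarity(candidate, seed_patterns):
    """TF-IDF similarity between the candidate and the spliced seed pattern text."""
    spliced = " ".join(seed_patterns)  # splice the seed pattern texts together
    candidate_vec, spliced_vec = tfidf_vectors([candidate, spliced])
    return cosine(candidate_vec, spliced_vec)


print(tfidf_similarity("films starring @", ["films of @", "TV series starring @"]))
```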
Alternatively, the similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in one dimension may be determined as follows: calculate the cosine similarity between the candidate question template and each seed pattern carrying the candidate attribute information; then select the maximum of these cosine similarities as the similarity, in that dimension, between the candidate question template and the at least one seed pattern carrying the candidate attribute information. For ease of description, this similarity is referred to here as the cosine similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information.
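A minimal sketch of this dimension follows. How templates are turned into vectors is not stated in the application; representing each template as a bag of whitespace-separated words is an assumption made only for this example.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts using bag-of-words count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def cosine_feature(candidate: str, seed_patterns: list[str]) -> float:
    """Maximum cosine similarity between the candidate and the seed patterns."""
    return max(cosine_similarity(candidate, pattern) for pattern in seed_patterns)


print(cosine_feature("films starring @", ["films of @", "TV series starring @"]))
```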
Alternatively, the similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information in one dimension may be determined as follows: determine a first quantity, namely the quantity of this candidate question template among the templates expanded from the seed patterns carrying the candidate attribute information; calculate a second quantity, namely the quantity of all candidate question templates expanded from the seed patterns carrying the candidate attribute information; determine the ratio of the first quantity to the second quantity; and take the first quantity and the ratio as the similarity, in that dimension, between the candidate question template and the at least one seed pattern carrying the candidate attribute information. For ease of description, this similarity is referred to here as the statistical similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information.
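The statistical similarity can be illustrated with the Fig. 4 data. The sketch assumes that quantities are counted once per retrieved question (so a template produced by several retrieved questions is counted several times); the application does not say whether occurrences or distinct templates are counted, and the names below are hypothetical.

```python
# Seed pattern -> attribute information it carries (Fig. 4).
PATTERN_TO_ATTRIBUTE = {1: "attribute 1", 2: "attribute 2", 3: "attribute 1"}
# One (seed pattern, candidate template) pair per retrieved question in Fig. 4:
# 1a, 1b, 2, 3 come from seed pattern 1; 4, 5a, 5b from 2; 6, 7, 8 from 3.
EXPANSIONS = [(1, 1), (1, 1), (1, 1), (1, 1),
              (2, 2), (2, 1), (2, 2),
              (3, 2), (3, 1), (3, 1)]


def statistical_feature(candidate, attribute, expansions, pattern_to_attribute):
    """Return (first quantity, ratio) for one candidate template and attribute."""
    # All templates expanded from seed patterns that carry this attribute.
    from_attribute = [
        template for pattern, template in expansions
        if pattern_to_attribute[pattern] == attribute
    ]
    first_quantity = sum(1 for template in from_attribute if template == candidate)
    second_quantity = len(from_attribute)
    ratio = first_quantity / second_quantity if second_quantity else 0.0
    return first_quantity, ratio


for attribute in ("attribute 1", "attribute 2"):
    print(attribute, statistical_feature(1, attribute, EXPANSIONS, PATTERN_TO_ATTRIBUTE))
```

Under this counting, candidate question template 1 accounts for 6 of the 7 expansions from seed patterns carrying attribute information 1, and 1 of the 3 expansions from the seed pattern carrying attribute information 2.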
Optionally, determining the similarity between a candidate question template and the at least one seed pattern carrying candidate attribute information may include determining one or more of the following: the edit distance, the TF-IDF similarity, the cosine similarity, and the statistical similarity between the candidate question template and the at least one seed pattern carrying the candidate attribute information.
The above are only preferred ways, provided by the embodiments of the present application, of determining the similarity between a candidate question template and the at least one seed pattern carrying candidate attribute information; the specific manner can be set freely by the inventor as required and is not limited here.
Optionally, the similarity in each dimension between the candidate question template and the at least one seed pattern carrying the candidate attribute information is represented by a feature value. For example, for the similarities mentioned in the above embodiments: the edit distance is represented by a feature value (which is the edit distance itself); the cosine similarity is represented by a feature value (which may be the cosine similarity); the TF-IDF similarity is represented by a feature value (which may be the TF-IDF similarity); and the statistical similarity is represented by a feature value (which may be the first quantity and the ratio).
Based on the manner, provided by the above embodiments of the present application, of determining the similarity between a candidate question template and the at least one seed pattern carrying the candidate attribute information, the embodiments of the present application further provide a method of selecting, according to the likelihood that a candidate question template has each of the different candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as its attribute information; see Fig. 8.
As shown in figure 8, this method comprises:
S801, input the feature values, in each dimension, of the candidate question template with respect to the different candidate attribute information into a random forest prediction model, to obtain the probability that the candidate question template belongs to each kind of candidate attribute information; the random forest prediction model is obtained by training a random forest classifier on seed patterns carrying attribute information;
For example, for a candidate question template, each piece of candidate attribute information corresponding to it is determined, and the feature values, in each dimension, of the candidate question template with respect to the different candidate attribute information are input into the random forest prediction model to obtain its output. The output of the random forest prediction model includes the probability that the candidate question template belongs to each kind of corresponding candidate attribute information.
For example, suppose two different dimensions are set, dimension 1 and dimension 2, and the candidate attribute information corresponding to candidate question template 1 is attribute information 1 and attribute information 2. Feature value 1 in dimension 1 and feature value 2 in dimension 2 are determined for candidate question template 1 and the at least one seed pattern carrying attribute information 1; feature value 3 in dimension 1 and feature value 4 in dimension 2 are determined for candidate question template 1 and the at least one seed pattern carrying attribute information 2. Feature values 1, 2, 3 and 4 are then input into the random forest prediction model to obtain the probability that candidate question template 1 belongs to attribute information 1 and the probability that it belongs to attribute information 2.
Here, inputting the feature values into the random forest prediction model means converting them into a format that the random forest prediction model can recognize and process, and then inputting them into the model.
S802, take the candidate attribute information with the highest probability as the attribute information of the candidate question template, so as to generate a question template.
Optionally, from the probabilities obtained in step S801 that the candidate question template belongs to each kind of corresponding candidate attribute information, the candidate attribute information with the highest probability is selected as the attribute information of the candidate question template, so as to generate a question template.
For example, if inputting the above feature values 1, 2, 3 and 4 into the random forest prediction model gives a probability of 30% that candidate question template 1 belongs to attribute information 1 and a probability of 70% that it belongs to attribute information 2, then attribute information 2 can be taken as the attribute information of candidate question template 1, so as to generate a question template.
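The two steps around the model, assembling the feature values into one input vector and picking the most probable attribute, can be sketched as below. The trained random forest itself is not shown; the probabilities are the ones from the example above, and the function names and the ordering of features are assumptions for illustration.

```python
def build_feature_vector(features_by_attribute, attribute_order):
    """Flatten per-attribute feature values into one model input vector.

    features_by_attribute: dict attribute -> list of feature values,
    one value per dimension (for example edit distance, TF-IDF similarity).
    attribute_order: fixed order so every input has the same layout.
    """
    vector = []
    for attribute in attribute_order:
        vector.extend(features_by_attribute[attribute])
    return vector


def choose_attribute(probabilities):
    """Return the candidate attribute information with the highest probability (S802)."""
    return max(probabilities, key=probabilities.get)


# Feature values 1 to 4 from the example, as placeholders.
vector = build_feature_vector(
    {"attribute information 1": ["feature value 1", "feature value 2"],
     "attribute information 2": ["feature value 3", "feature value 4"]},
    ["attribute information 1", "attribute information 2"],
)
# Probabilities as they might be returned by the prediction model.
chosen = choose_attribute(
    {"attribute information 1": 0.30, "attribute information 2": 0.70}
)
print(vector)
print(chosen)
```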
In the embodiments of the present application, preferably, identical attribute information may be regarded as the same kind of attribute information, and different attribute information may be regarded as different kinds of attribute information.
In the embodiments of the present application, preferably, before any question template has been generated, a random forest classifier can be trained on the existing seed patterns carrying attribute information to produce the random forest prediction model. To further improve the accuracy of the question templates generated by the question template generation method provided by the embodiments of the present application, the random forest prediction model can be trained further on the generated question templates after they are generated, so as to optimize the model and make its output more accurate.
To facilitate a detailed understanding of the question template generation method provided by the embodiments of the present application, the method of expanding seed patterns to generate at least one candidate question template is now described in detail.
The embodiments of the present application provide a flowchart, shown in Fig. 9, of a method of expanding seed patterns to generate at least one candidate question template. This flowchart helps the reader understand more clearly the logic of expanding seed patterns to generate at least one candidate question template.
As shown in figure 9, this method comprises:
S901, generate at least one seed question related to a seed pattern;
S902, perform retrieval with the seed question as the query condition, and extract from the search result at least one retrieved question related to the seed question;
S903, replace the entities in the retrieved question with a specific character to generate a candidate question template.
Optionally, for the specific implementation of steps S901 to S903, refer to the detailed description in the above embodiments; it is not repeated here.
Further, the method of expanding seed patterns to generate at least one candidate question template provided by the embodiments of the present application may also include denoising the retrieved question, where the denoising refers to denoising the prefix of the retrieved question and/or denoising the suffix of the retrieved question. Correspondingly, step S903 is specifically: replace the entities in the denoised retrieved question with the specific character to generate the candidate question template.
It should be understood that the embodiments of the present application do not limit the execution order of the process of denoising the retrieved question and the process of replacing the entities in the retrieved question with the specific character. The entities in the retrieved question may first be replaced with the specific character, and the retrieved question after replacement then denoised to generate the candidate question template; alternatively, the retrieved question may first be denoised, and the entities in the denoised retrieved question then replaced with the specific character.
In the embodiments of the present application, to improve the efficiency of obtaining retrieved questions, the retrieval that uses a seed question as the query condition to obtain at least one retrieved question related to the seed question can be implemented in the following manner.
With reference to Fig. 1 and Fig. 10 (Fig. 10 is a flowchart, provided by an embodiment of the present application, of a method of performing retrieval with a seed question as the query condition to obtain at least one retrieved question related to the seed question), the method of performing retrieval with a seed question as the query condition to obtain at least one retrieved question related to the seed question includes:
S1001, generate a URL link, corresponding to the seed question, that uses the seed question as the query condition;
In the embodiments of the present application, preferably, the server to which the question template generation method shown in Fig. 2 of the above embodiments is applied can generate the URL link, corresponding to the seed question, that uses the seed question as the query condition; correspondingly, after generating the URL link, that server can send the URL link to the search platform.
S1002, send the URL link to the search platform, and receive the search result that the search platform returns after retrieving on the basis of the URL link;
S1003, parse the search result to obtain at least one retrieved question in the search result that is related to the seed question.
Server is used to plant URL link of the subproblem as querying condition by generating in the embodiment of the present application, and will URL link is sent to searching platform, in a manner of obtaining search result from searching platform, can not only mitigate the money of server itself Source occupancy, and a large amount of search result can also be directly grabbed from searching platform based on URL link, effectively increase the application What embodiment provided retrieves using kind of subproblem as querying condition, obtains at least one retrieval relevant to kind subproblem and asks The efficiency of topic, and then improve a kind of efficiency of question template generation method provided by the embodiments of the present application.
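As a non-limiting illustration, steps S1001 and S1003 may be sketched as follows. The endpoint `https://search.example.com/search`, the query parameter `q`, and the `<a class="question">` markup are assumptions made only for this sketch; the embodiment does not name a search platform or a result format. The network call of S1002 is omitted and could be made, for example, with `urllib.request.urlopen`.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical endpoint: the embodiment does not name a search platform.
SEARCH_ENDPOINT = "https://search.example.com/search"


def build_query_url(seed_question: str, endpoint: str = SEARCH_ENDPOINT) -> str:
    """S1001: encode the seed question as the query condition of a URL link."""
    return f"{endpoint}?{urlencode({'q': seed_question})}"


class QuestionExtractor(HTMLParser):
    """S1003: collect the text of every <a class="question"> element (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.questions = []
        self._in_question = False

    def handle_starttag(self, tag, attrs):
        if tag == "a" and ("class", "question") in attrs:
            self._in_question = True
            self.questions.append("")

    def handle_data(self, data):
        if self._in_question:
            self.questions[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_question = False


def parse_search_result(html: str) -> list:
    """Return the non-empty question strings found in a search result page."""
    parser = QuestionExtractor()
    parser.feed(html)
    parser.close()
    return [q.strip() for q in parser.questions if q.strip()]
```

Because the server only builds a link and parses the returned page, the retrieval work itself stays on the search platform, which is the resource saving described above.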
The embodiment of the present application provides a question template generation method and server that can generate question templates without using the search results in retrieved question-answer pairs, and without matching, one by one, the retrieved questions in those pairs against the search conditions in triples. Therefore, compared with the return mark method, the present invention effectively improves both the efficiency and the accuracy of question template generation.
The question template generation device provided by the embodiment of the present invention is introduced below. The question template generation device and server described below may be regarded as the program modules that need to be set up to implement the question template generation method provided by the embodiment of the present invention. The description of the question template generation device below and the description of the question template generation method above may be cross-referenced.
Fig. 11 is a schematic structural diagram of a question template generation device provided by the embodiment of the present application.
As shown in Fig. 11, the device includes:
a seed pattern acquisition unit 111, configured to obtain multiple seed patterns, each seed pattern carrying attribute information;
a candidate question template extension unit 112, configured to extend the seed patterns to generate at least one candidate question template;
a target attribute information determination unit 113, configured to determine, for each candidate question template, at least one target seed pattern from which the candidate question template was expanded, and to determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to the candidate question template;
a similarity determination unit 114, configured to determine, for each piece of candidate attribute information corresponding to each candidate question template, the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information, the similarity reflecting the possibility that the candidate question template has the candidate attribute information;
a question template generation unit 115, configured to select, for each candidate question template and according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
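As a non-limiting illustration, the determination of candidate attribute information may be sketched as follows: each candidate question template receives the union of the attribute information carried by every seed pattern from which it was expanded. The function name `candidate_attributes`, the dictionary layout, and the placeholder `E` standing for the replaced entity are assumptions made only for this sketch.

```python
from collections import defaultdict


def candidate_attributes(expansions, seed_attributes):
    """Map each candidate template to the attributes of its target seed patterns.

    expansions: dict mapping a seed pattern to the candidate templates expanded from it.
    seed_attributes: dict mapping a seed pattern to the attribute information it carries.
    """
    result = defaultdict(set)
    for seed, candidates in expansions.items():
        for candidate in candidates:
            # Every seed pattern that produced this candidate is a target seed pattern.
            result[candidate].add(seed_attributes[seed])
    return dict(result)


# Hypothetical data: "E" stands for the specific character that replaces the entity.
seed_attributes = {
    "who is the wife of E": "spouse",
    "who did E marry": "spouse",
    "where was E born": "birthplace",
}
expansions = {
    "who is the wife of E": ["who is E married to", "tell me about E"],
    "who did E marry": ["who is E married to"],
    "where was E born": ["tell me about E"],
}
attributes = candidate_attributes(expansions, seed_attributes)
```

In this example the candidate "tell me about E" is reachable from seed patterns with two different attributes, so the later similarity step has to decide between them.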
Optionally, the similarity determination unit is specifically configured to: determine, in each of multiple set dimensions, the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information.
Optionally, the similarity, in each dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information is represented by a feature value; the question template generation unit includes:
a model prediction unit, configured to input the feature values, in each dimension, of the candidate question template with respect to the different candidate attribute information into a random forest prediction model, and obtain the probability that the candidate question template belongs to each piece of candidate attribute information, where the random forest prediction model is obtained by training a random forest classifier on the seed patterns carrying attribute information;
a question template generation subunit, configured to take the candidate attribute information with the highest probability as the attribute information of the candidate question template, so as to generate a question template.
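As a non-limiting illustration, the selection of the attribute information with the highest probability may be sketched as follows. The trained random forest prediction model is represented here by a hypothetical callable `predict_proba` that maps one feature vector to one probability; the stand-in used in the example is not a random forest, and the feature values are invented for the sketch.

```python
from typing import Callable, Dict, List

FeatureVector = List[float]


def choose_attribute(
    features_by_attribute: Dict[str, FeatureVector],
    predict_proba: Callable[[FeatureVector], float],
) -> str:
    """Return the candidate attribute whose feature vector scores the highest probability."""
    probabilities = {
        attribute: predict_proba(vector)
        for attribute, vector in features_by_attribute.items()
    }
    return max(probabilities, key=probabilities.get)


def mean_score(vector: FeatureVector) -> float:
    """Stand-in for the trained model (assumption): the mean of the feature values."""
    return sum(vector) / len(vector)


# One feature value per similarity dimension, for each candidate attribute.
features = {
    "spouse": [0.9, 0.8, 0.7],
    "birthplace": [0.2, 0.1, 0.3],
}
chosen = choose_attribute(features, mean_score)
```

Keeping the model behind a callable lets the same selection logic be used with whatever classifier is actually trained on the seed patterns.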
Optionally, the candidate question template extension unit includes:
a seed question generation unit, configured to generate at least one seed question related to the seed pattern;
a retrieved question generation unit, configured to retrieve with the seed question as the query condition, and to extract from the search result at least one retrieved question related to the seed question;
a candidate question template extension subunit, configured to replace the entity in the retrieved question with a specific character to generate a candidate question template.
Further, the candidate question template extension unit provided by the embodiment of the present application further includes a denoising unit, configured to denoise the retrieved question, where the denoising indicates denoising the prefix of the retrieved question and/or denoising the suffix of the retrieved question.
Correspondingly, the candidate question template extension subunit is specifically configured to replace the entity in the denoised retrieved question with the specific character to generate a candidate question template.
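As a non-limiting illustration, prefix and suffix denoising followed by entity replacement may be sketched as follows. The noise lists, the placeholder `E`, and the function names are assumptions made only for this sketch; the embodiment does not fix the noise patterns or the specific character, and it allows the two steps in either order.

```python
# Hypothetical noise patterns; a real deployment would supply its own lists.
NOISE_PREFIXES = ("excuse me, ", "may i ask, ")
NOISE_SUFFIXES = (" thanks", " - forum")
PLACEHOLDER = "E"  # assumed specific character standing for the entity


def denoise(question, prefixes=NOISE_PREFIXES, suffixes=NOISE_SUFFIXES):
    """Strip at most one known noise prefix and one known noise suffix."""
    text = question.strip()
    for prefix in prefixes:
        if text.lower().startswith(prefix):
            text = text[len(prefix):]
            break
    for suffix in suffixes:
        if text.lower().endswith(suffix):
            text = text[: -len(suffix)]
            break
    return text.strip()


def to_candidate_template(question, entity, placeholder=PLACEHOLDER):
    """Replace every occurrence of the entity with the placeholder character."""
    return question.replace(entity, placeholder)


template = to_candidate_template(
    denoise("Excuse me, who is the wife of Liu Dehua? thanks"), "Liu Dehua"
)
```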
Optionally, the retrieved question generation unit includes:
a URL link generation unit, configured to generate a URL link that corresponds to the seed question and uses the seed question as the query condition;
a search result acquisition unit, configured to send the URL link to the search platform, and to receive the search result that the search platform returns after retrieving based on the URL link;
a retrieved question generation subunit, configured to parse the search result and to extract, from the parsed search result, at least one retrieved question related to the seed question.
Optionally, the similarity determination unit is specifically configured to: calculate the edit distance between the candidate question template and each seed pattern carrying the candidate attribute information; and select the maximum edit distance among the calculated edit distances as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
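As a non-limiting illustration, the edit distance feature may be sketched as follows using the Levenshtein distance, which is assumed here because the text does not name a particular edit distance. The sketch follows the text in taking the maximum of the distances; since a larger edit distance indicates less similar strings, the interpretation of this feature value is left to the downstream prediction model.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def edit_distance_feature(candidate: str, seeds: list) -> int:
    """Maximum edit distance between the candidate and the seed patterns of one attribute."""
    return max(edit_distance(candidate, seed) for seed in seeds)
```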
Optionally, the similarity determination unit is specifically configured to: splice the texts of the seed patterns carrying the candidate attribute information to obtain a spliced text; and take the term frequency-inverse document frequency (TF-IDF) similarity between the text of the candidate question template and the spliced text as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
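As a non-limiting illustration, the TF-IDF similarity may be sketched as follows. Several choices here are assumptions not stated in the text: tokens are obtained by whitespace splitting (Chinese text would need word segmentation instead), the document collection used for the inverse document frequency consists of the candidate text plus one spliced text per candidate attribute, the IDF is smoothed, and the two TF-IDF vectors are compared by cosine.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (assumption) keeps weights positive for shared terms.
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine of two sparse vectors; 0.0 when either vector is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def tfidf_feature(candidate, seeds_by_attribute):
    """TF-IDF similarity between the candidate and the spliced seed text of each attribute."""
    attributes = list(seeds_by_attribute)
    spliced = [" ".join(seeds_by_attribute[a]) for a in attributes]
    vectors = tfidf_vectors([candidate] + spliced)
    return {a: cosine(vectors[0], vectors[i + 1]) for i, a in enumerate(attributes)}


scores = tfidf_feature(
    "who is E married to",
    {
        "spouse": ["who is the wife of E", "who did E marry"],
        "birthplace": ["where was E born"],
    },
)
```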
Optionally, the similarity determination unit is specifically configured to: calculate the cosine similarity between the candidate question template and each seed pattern carrying the candidate attribute information; and select the maximum cosine similarity among the calculated cosine similarities as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
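As a non-limiting illustration, the cosine similarity feature may be sketched as follows. The vector representation is an assumption: each text is turned into a bag-of-words count vector by whitespace splitting, whereas an actual implementation might use word embeddings or another representation.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts represented as bag-of-words count vectors."""
    vec_a, vec_b = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm = math.sqrt(sum(c * c for c in vec_a.values())) * math.sqrt(
        sum(c * c for c in vec_b.values())
    )
    return dot / norm if norm else 0.0


def cosine_feature(candidate: str, seeds: list) -> float:
    """Maximum cosine similarity between the candidate and the seed patterns of one attribute."""
    return max(cosine_similarity(candidate, seed) for seed in seeds)
```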
Further, the question template generation device provided by the embodiment of the present application further includes a return unit, configured to take the question template as a seed pattern and return to the "extend the seed pattern to generate at least one candidate question template" process.
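As a non-limiting illustration, feeding generated question templates back in as seed patterns may be sketched as follows. The callable `generate_round`, which stands for one full pass of expansion and attribute selection, the round limit, and the stopping rule are assumptions made only for this sketch.

```python
def bootstrap(seed_attributes, generate_round, rounds=2):
    """Repeatedly treat newly generated question templates as seed patterns.

    seed_attributes: dict mapping a seed pattern to its attribute information.
    generate_round: hypothetical callable that takes the current dict and returns
        a dict of generated question templates with their chosen attributes.
    """
    known = dict(seed_attributes)
    for _ in range(rounds):
        generated = generate_round(known)
        new_templates = {t: a for t, a in generated.items() if t not in known}
        if not new_templates:
            break  # nothing new was produced, so further rounds would not help
        known.update(new_templates)
    return known


def fake_round(known):
    """Stand-in generator (assumption): appends '?' to templates that lack it."""
    return {t + "?": a for t, a in known.items() if not t.endswith("?")}


templates = bootstrap({"who is the wife of E": "spouse"}, fake_round, rounds=3)
```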
Further, the embodiment of the present invention also provides a storage medium that may store a program suitable for execution by a processor, the program being used to:
obtain multiple seed patterns, each seed pattern carrying attribute information;
extend the seed patterns to generate at least one candidate question template;
for each candidate question template, determine at least one target seed pattern from which the candidate question template was expanded, and determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determine the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information, the similarity reflecting the possibility that the candidate question template has the candidate attribute information;
for each candidate question template, select, according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
Optionally, for the refined and extended functions of the program, refer to the description above.
The embodiment of the present application provides a question template generation method, device, server and storage medium. The user only needs to set some seed patterns carrying attribute information in advance, and the present invention extends the seed patterns into multiple candidate question templates. For each candidate question template, all attribute information carried by the seed patterns from which the candidate question template was expanded is determined as the candidate attribute information corresponding to the candidate question template. Then, based on the possibility that the candidate question template has each piece of its candidate attribute information, one piece of candidate attribute information is selected from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so that a question template is generated automatically. Moreover, because each seed pattern can be extended into multiple candidate question templates used to generate question templates, the number of question templates generated automatically from the seed patterns is considerable. Compared with the traditional approach of manually configuring each question template, this both saves labor cost and improves the efficiency of question template generation.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A question template generation method, characterized by comprising:
obtaining multiple seed patterns, each seed pattern carrying attribute information;
extending the seed patterns to generate at least one candidate question template;
for each candidate question template, determining at least one target seed pattern from which the candidate question template was expanded, and determining all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determining the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information, the similarity reflecting the possibility that the candidate question template has the candidate attribute information;
for each candidate question template, selecting, according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
2. The method according to claim 1, characterized in that determining the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information comprises:
determining, in each of multiple set dimensions, the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information.
3. The method according to claim 2, characterized in that the similarity, in each dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information is represented by a feature value;
selecting, according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template comprises:
inputting the feature values, in each dimension, of the candidate question template with respect to the different candidate attribute information into a random forest prediction model, and obtaining the probability that the candidate question template belongs to each piece of candidate attribute information, the random forest prediction model being obtained by training a random forest classifier on the seed patterns carrying attribute information;
taking the candidate attribute information with the highest probability as the attribute information of the candidate question template.
4. The method according to claim 1, characterized in that extending the seed patterns to generate at least one candidate question template comprises:
generating at least one seed question related to the seed pattern;
retrieving with the seed question as the query condition, and extracting from the search result at least one retrieved question related to the seed question;
replacing the entity in the retrieved question with a specific character to generate a candidate question template.
5. The method according to claim 4, characterized by further comprising: denoising the retrieved question, the denoising indicating denoising the prefix of the retrieved question and/or denoising the suffix of the retrieved question;
replacing the entity in the retrieved question with a specific character to generate a candidate question template comprises: replacing the entity in the denoised retrieved question with the specific character to generate a candidate question template.
6. The method according to claim 4, characterized in that retrieving with the seed question as the query condition, and extracting from the search result at least one retrieved question related to the seed question, comprises:
generating a URL link that corresponds to the seed question and uses the seed question as the query condition;
sending the URL link to a search platform, and receiving the search result that the search platform returns after retrieving based on the URL link;
parsing the search result, and extracting from the parsed search result at least one retrieved question related to the seed question.
7. The method according to claim 2, characterized in that determining the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information comprises:
calculating the edit distance between the candidate question template and each seed pattern carrying the candidate attribute information;
selecting the maximum edit distance among the calculated edit distances as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
8. The method according to claim 2, characterized in that determining the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information comprises:
splicing the texts of the seed patterns carrying the candidate attribute information to obtain a spliced text;
taking the term frequency-inverse document frequency (TF-IDF) similarity between the text of the candidate question template and the spliced text as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
9. The method according to claim 2, characterized in that determining the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information comprises:
calculating the cosine similarity between the candidate question template and each seed pattern carrying the candidate attribute information;
selecting the maximum cosine similarity among the calculated cosine similarities as the similarity, in one dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information.
10. The method according to claim 1, characterized by further comprising:
taking the question template as a seed pattern, and returning to the "extend the seed patterns to generate at least one candidate question template" process.
11. A question template generation device, characterized by comprising:
a seed pattern acquisition unit, configured to obtain multiple seed patterns, each seed pattern carrying attribute information;
a candidate question template extension unit, configured to extend the seed patterns to generate at least one candidate question template;
a target attribute information determination unit, configured to determine, for each candidate question template, at least one target seed pattern from which the candidate question template was expanded, and to determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to the candidate question template;
a similarity determination unit, configured to determine, for each piece of candidate attribute information corresponding to each candidate question template, the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information, the similarity reflecting the possibility that the candidate question template has the candidate attribute information;
a question template generation unit, configured to select, for each candidate question template and according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
12. The device according to claim 11, characterized in that the similarity determination unit is specifically configured to: determine, in each of multiple set dimensions, the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information.
13. The device according to claim 12, characterized in that the similarity, in each dimension, between the candidate question template and at least one seed pattern carrying the candidate attribute information is represented by a feature value; the question template generation unit comprises:
a model prediction unit, configured to input the feature values, in each dimension, of the candidate question template with respect to the different candidate attribute information into a random forest prediction model, and obtain the probability that the candidate question template belongs to each piece of candidate attribute information, the random forest prediction model being obtained by training a random forest classifier on the seed patterns carrying attribute information;
a question template generation subunit, configured to take the candidate attribute information with the highest probability as the attribute information of the candidate question template, so as to generate a question template.
14. A server, characterized by comprising at least one memory and at least one processor; the memory stores a program, the processor calls the program stored in the memory, and the program is used to:
obtain multiple seed patterns, each seed pattern carrying attribute information;
extend the seed patterns to generate at least one candidate question template;
for each candidate question template, determine at least one target seed pattern from which the candidate question template was expanded, and determine all attribute information carried by the at least one target seed pattern as the candidate attribute information corresponding to the candidate question template;
for each piece of candidate attribute information corresponding to each candidate question template, determine the similarity between the candidate question template and at least one seed pattern carrying the candidate attribute information, the similarity reflecting the possibility that the candidate question template has the candidate attribute information;
for each candidate question template, select, according to the possibilities that the candidate question template has the respective candidate attribute information, one piece of candidate attribute information from the candidate attribute information corresponding to the candidate question template as the attribute information of the candidate question template, so as to generate a question template.
15. A storage medium, characterized in that computer-executable instructions are stored in the storage medium, the computer-executable instructions being used to perform the question template generation method according to any one of claims 1 to 10.
CN201810890730.1A 2018-08-07 2018-08-07 Question template generation method and device, server and storage medium Active CN110209780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810890730.1A CN110209780B (en) 2018-08-07 2018-08-07 Question template generation method and device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810890730.1A CN110209780B (en) 2018-08-07 2018-08-07 Question template generation method and device, server and storage medium

Publications (2)

Publication Number Publication Date
CN110209780A true CN110209780A (en) 2019-09-06
CN110209780B CN110209780B (en) 2023-03-10

Family

ID=67779879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810890730.1A Active CN110209780B (en) 2018-08-07 2018-08-07 Question template generation method and device, server and storage medium

Country Status (1)

Country Link
CN (1) CN110209780B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324715A (en) * 2020-02-18 2020-06-23 北京百度网讯科技有限公司 Method and device for generating question-answering robot
CN113408271A (en) * 2021-06-16 2021-09-17 北京来也网络科技有限公司 Information extraction method, device, equipment and medium based on RPA and AI
WO2023134087A1 (en) * 2022-01-11 2023-07-20 平安科技(深圳)有限公司 Method and apparatus for generating inquiry template, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100299139A1 (en) * 2009-04-23 2010-11-25 International Business Machines Corporation Method for processing natural language questions and apparatus thereof
CN103136221A (en) * 2011-11-24 2013-06-05 北京百度网讯科技有限公司 Method capable of generating requirement template and requirement identification method and device
CN108153876A (en) * 2017-12-26 2018-06-12 爱因互动科技发展(北京)有限公司 Intelligent answer method and system

Also Published As

Publication number Publication date
CN110209780B (en) 2023-03-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant