CN110196901A - Construction method, apparatus, computer device and storage medium for a conversational system

Info

Publication number
CN110196901A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910578623.XA
Other languages
Chinese (zh)
Other versions
CN110196901B (en)
Inventor
焦振宇
孙叔琦
李婷婷
孙珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910578623.XA
Publication of CN110196901A
Application granted
Publication of CN110196901B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Abstract

The application proposes a construction method, apparatus, computer device and storage medium for a conversational system. The method includes: receiving candidate documents sent by a developer; analyzing the candidate documents to generate a keyword set corresponding to the candidate documents, where the keyword set includes multiple keywords extracted from the candidate documents; receiving a query statement from a user and obtaining the keywords that match the query statement; and extracting a query result from the candidate documents corresponding to the matched keywords according to the keywords that match the query statement, and feeding the query result back to the user. With this method, a developer building a conversational system only needs to supply candidate documents and does not have to invest a large amount of work, so the time cost is low and construction is convenient. The method is also applicable to building conversational systems for a variety of scenarios.

Description

Construction method, apparatus, computer device and storage medium for a conversational system
Technical field
This application relates to the field of computer technology, and in particular to a construction method, apparatus, computer device and storage medium for a conversational system.
Background
Building a conversational system means establishing a machine dialogue system that can interact with people. It is a development trend in human-computer interaction and a basic technology for the intelligent robots that currently receive much attention.
At present, conversational systems are mainly built in one of two ways. In the first, the developer designs in advance the intents and word slots of the domain and the dialog logic for different situations (dialog mode for short). In the second, the developer compiles question-answer pairs for the domain, and during a dialogue the robot selects the answer to the question most similar to the query entered by the current user (question-and-answer mode for short).
However, a dialog-mode conversational system requires the developer to understand the target domain thoroughly and also to have a good grasp of how conversational systems work. Collecting word-slot vocabularies and defining intents and dialog logic involves a very large workload and a high time cost. In a question-and-answer-mode conversational system, the developer prepares question-answer pairs in advance; if neither the user's question nor a similar question is among those pairs, the system cannot answer, and multi-turn interaction is difficult to support.
Summary of the invention
The application proposes a construction method for a conversational system, to solve the problems of existing construction methods: heavy workload, high time cost, and difficulty in supporting multi-turn interaction.
An embodiment of one aspect of the application proposes a construction method for a conversational system, comprising:
receiving candidate documents sent by a developer;
analyzing the candidate documents to generate a keyword set corresponding to the candidate documents, where the keyword set includes multiple keywords extracted from the candidate documents;
receiving a query statement from a user, and obtaining the keywords that match the query statement; and
extracting a query result from the candidate documents corresponding to the matched keywords according to the keywords that match the query statement, and feeding the query result back to the user.
In the construction method of the embodiments of the application, candidate documents sent by a developer are received and analyzed to generate the corresponding keyword set, which includes multiple keywords extracted from the candidate documents. A query statement is then received from a user, the keywords matching the query statement are obtained, a query result is extracted from the candidate documents corresponding to the matched keywords, and the query result is fed back to the user. In this embodiment, because the keyword set is derived by analyzing the documents the developer sends, and the query result is extracted from those documents through the keywords that match the user's query statement, the developer only needs to supply candidate documents when building a conversational system. No large investment of work is required, the time cost is low, construction is convenient, and the method applies to conversational systems for a variety of scenarios.
An embodiment of another aspect of the application proposes a construction apparatus for a conversational system, comprising:
a receiving module, configured to receive candidate documents sent by a developer;
a generation module, configured to analyze the candidate documents to generate a keyword set corresponding to the candidate documents, where the keyword set includes multiple keywords extracted from the candidate documents;
a first obtaining module, configured to receive a query statement from a user and obtain the keywords that match the query statement; and
a query module, configured to extract a query result from the candidate documents corresponding to the matched keywords according to the keywords that match the query statement, and to feed the query result back to the user.
In the construction apparatus of the embodiments of the application, candidate documents sent by a developer are received and analyzed to generate the corresponding keyword set, which includes multiple keywords extracted from the candidate documents. A query statement is then received from a user, the keywords matching the query statement are obtained, a query result is extracted from the candidate documents corresponding to the matched keywords, and the query result is fed back to the user. As with the method, the developer only needs to supply candidate documents when building a conversational system. No large investment of work is required, the time cost is low, construction is convenient, and the apparatus applies to conversational systems for a variety of scenarios.
An embodiment of another aspect of the application proposes a computer device, including a processor and a memory.
The processor reads the executable program code stored in the memory and runs the program corresponding to that code, so as to implement the construction method of the conversational system described in the embodiment of the above aspect.
An embodiment of another aspect of the application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the construction method of the conversational system described in the embodiment of the above aspect.
Additional aspects and advantages of the application will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the application.
Brief description of the drawings
The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a construction method of a conversational system provided by an embodiment of the application;
Fig. 2 is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application;
Fig. 3 is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application;
Fig. 4 is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application;
Fig. 5 is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application;
Fig. 6 is a schematic diagram of a semantics-based method for ranking candidate answers provided by an embodiment of the application;
Fig. 7 is an overall schematic diagram of a construction method of a conversational system provided by an embodiment of the application;
Fig. 8 is a schematic structural diagram of a construction apparatus of a conversational system provided by an embodiment of the application;
Fig. 9 shows a block diagram of an exemplary computer device suitable for implementing embodiments of the application.
Detailed description of the embodiments
Embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the application and should not be construed as limiting it.
The construction method, apparatus, computer device and storage medium for a conversational system according to the embodiments of the application are described below with reference to the drawings.
The embodiments of the application propose a construction method for a conversational system to address two problems in the related art: the heavy developer workload and high time cost of dialog-mode construction, and the difficulty question-and-answer-mode systems have in supporting multi-turn interaction.
In the construction method of the embodiments of the application, the candidate documents sent by the developer are analyzed to obtain the corresponding keyword set. When a user submits a query, the keywords matching the query statement entered by the user are determined, and the query result is extracted from the corresponding candidate documents through the matched keywords. As a result, when building a conversational system the developer only needs to supply candidate documents, without investing a large amount of work; the time cost is low, construction is convenient, and the method applies to conversational systems for a variety of scenarios.
Fig. 1 is a schematic flowchart of a construction method of a conversational system provided by an embodiment of the application.
The construction method of the embodiments of the application is executed by a construction apparatus for a conversational system, which can be deployed in a computer device to build the conversational system.
As shown in Fig. 1, the construction method of the conversational system includes:
Step 101, the candidate documents sent by the developer are received.
In this embodiment, when building a conversational system, the developer can obtain documents related to the application scenario of the system and send them to the computer device, so that the construction apparatus receives the candidate documents sent by the developer. For example, when building a conversational system for an airport, documents about that airport, such as the boarding process, the baggage check-in process, and the locations of the airport's security checkpoints, are sent to the computer device.
In a specific implementation, the interface of the conversational system has a document upload button. The developer clicks the button to upload documents, and the computer device thereby receives the candidate documents uploaded by the developer.
The candidate documents sent by the developer can be unstructured documents or FAQ-type documents. Unstructured documents let the developer upload document information about the target domain directly. FAQ-type documents let the user upload question-answer pairs prepared in advance, which can serve as a means of batch intervention. In addition, the developer can send one document or several, of the same type or of different types. For example, to build a conversational system for a tourist attraction, the developer can upload Word documents, PPT documents and the like that introduce the attraction.
In practical applications, a conversational system can cover a single domain, that is, one specific scenario, or multiple domains. In this embodiment, when a multi-domain conversational system is to be built, the documents related to each domain are sent.
To make the conversational system more intelligent, the developer can also send some dictionaries to the computer device along with the candidate documents, for example an index dictionary, a greeting dictionary, and dictionaries related to lexical analysis.
The index dictionary supports selections such as "the first one" or "the second one", so it is a dictionary that maps text to index numbers. The greeting dictionary contains common greetings and replies, such as "hello", "Hi", "thanks" and "goodbye". The dictionaries related to lexical analysis are used when performing lexical analysis on the query statement.
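The role of the index and greeting dictionaries can be illustrated with a minimal Python sketch. The dictionary entries, the replies and the function name `classify_query` are assumptions made for illustration; the application does not specify their contents or format.

```python
# Hypothetical dictionary contents; a developer would upload their own.
INDEX_DICT = {"the first one": 1, "first": 1, "the second one": 2, "second": 2}
GREETING_DICT = {
    "hello": "Hello!",
    "hi": "Hi!",
    "thanks": "You're welcome.",
    "goodbye": "Goodbye!",
}


def classify_query(query):
    """Return (query_type, payload) using the index and greeting dictionaries."""
    text = query.strip().lower().rstrip("!?.")
    if text in GREETING_DICT:
        return "greeting", GREETING_DICT[text]
    if text in INDEX_DICT:
        return "selection", INDEX_DICT[text]
    # Anything else is treated as an ordinary question for keyword matching.
    return "general", query


print(classify_query("Hi"))
print(classify_query("The second one"))
print(classify_query("Where is the security check?"))
```

Queries that hit neither dictionary fall through to the keyword-matching steps described below.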
Step 102, the candidate documents are analyzed to generate the keyword set corresponding to the candidate documents.
Specifically, lexical analysis can be performed on the candidate documents to obtain the keyword set. More specifically, the candidate documents can be split into sentences at punctuation marks to obtain multiple semantic segments, and word segmentation, part-of-speech tagging and similar processing are then applied to the segments to obtain basic word fragments. The importance of each word fragment in its document is then calculated, and the keywords of the candidate documents are determined according to that importance. These keywords form the keyword set, which therefore includes multiple keywords extracted from the candidate documents.
In this embodiment, term frequency-inverse document frequency (TF-IDF) can be used to calculate the importance of each word fragment in the candidate document that contains it.
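Step 102 can be sketched as follows. This is a simplified illustration, not the implementation of the application: the regular-expression tokenizer stands in for real word segmentation and part-of-speech tagging, the plain `tf * log(N / df)` formula is one common TF-IDF variant, and the sample documents and the name `build_keyword_sets` are invented for the example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Cut a document into semantic segments at sentence punctuation."""
    return [s.strip() for s in re.split(r"[.!?;。！？；]+", text) if s.strip()]


def tokenize(segment):
    """Toy word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", segment.lower())


def build_keyword_sets(documents, top_n=3):
    """Map each document id to its top_n (keyword, tf-idf) pairs."""
    tokens_per_doc = {
        doc_id: [tok for seg in split_sentences(text) for tok in tokenize(seg)]
        for doc_id, text in documents.items()
    }
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokens_per_doc.values():
        doc_freq.update(set(tokens))

    n_docs = len(documents)
    keyword_sets = {}
    for doc_id, tokens in tokens_per_doc.items():
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        keyword_sets[doc_id] = ranked[:top_n]
    return keyword_sets


documents = {
    "boarding": "Boarding starts at the gate. Boarding needs a boarding pass.",
    "luggage": "Luggage is checked at the counter. Luggage over the limit needs a fee.",
}
keyword_sets = build_keyword_sets(documents)
print([kw for kw, _ in keyword_sets["boarding"]])
```

Words that occur in every document, such as "the" in this toy corpus, receive a score of zero and drop out of the keyword set. With a single document this variant scores every term as zero, so a smoothed inverse document frequency would be needed in that case.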
Step 103, the query statement of the user is received, and the keywords that match the query statement are obtained.
In this embodiment, a keyword that matches the query statement can be understood as a keyword that matches a keyword in the query statement. A matched keyword can, for example, be a synonym of a keyword in the query statement.
The user can enter the query statement in several ways, such as voice input or text input. When the query statement is obtained, the keywords in it can be determined, and the keywords matching them are then obtained from the keyword set corresponding to the candidate documents.
For example, if the user enters the question "How much is an admission ticket for sea floor world?", the keywords "sea floor world" and "admission ticket" are extracted from the question, and the keywords matching "sea floor world" and "admission ticket" are then obtained from the keyword set of the candidate documents.
In this embodiment, after the query statement is received, a classifier can first be used to determine its type. If the query statement does not belong to a special type, such as an imperative query statement or a greeting query statement, keywords are extracted from it automatically through lexical analysis.
Specifically, the words can be screened by part of speech: words with no real meaning of their own, such as function words, are removed, and content words, especially named entities, are given higher weights. Statistically identified stop words are then filtered out using a stop-word list. In addition, words that occur widely across many documents are given lower weights; these are usually the most common words of the domain and contribute little to distinguishing between questions within the domain. After screening, the keywords can be sorted by weight, and a preset number of the highest-weighted keywords are selected as the keywords of the query statement.
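The screening described above can be sketched in Python as follows. The part-of-speech labels, the numeric weights, the down-weighting formula and the name `select_query_keywords` are all assumptions for illustration; the application gives no concrete values, and the input is assumed to be already segmented and tagged.

```python
# Hypothetical part-of-speech classes and weights.
FUNCTION_POS = {"particle", "preposition", "conjunction", "auxiliary"}
POS_WEIGHT = {"named_entity": 2.0, "noun": 1.0, "verb": 1.0, "adjective": 0.8}


def select_query_keywords(tagged_tokens, stop_words, doc_freq, n_docs, top_n=2):
    """Return the top_n (keyword, weight) pairs of a tagged query statement."""
    weighted = {}
    for word, pos in tagged_tokens:
        # Drop function words and statistically identified stop words.
        if pos in FUNCTION_POS or word in stop_words:
            continue
        weight = POS_WEIGHT.get(pos, 0.5)
        # Down-weight words that occur in a large share of the documents.
        share = doc_freq.get(word, 0) / n_docs if n_docs else 0.0
        weight *= 1.0 - 0.5 * share
        weighted[word] = max(weight, weighted.get(word, 0.0))
    ranked = sorted(weighted.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


tagged_query = [
    ("how much", "adverb"),
    ("is", "auxiliary"),
    ("admission ticket", "noun"),
    ("sea floor world", "named_entity"),
]
query_keywords = select_query_keywords(
    tagged_query,
    stop_words={"how much"},
    doc_freq={"admission ticket": 4, "sea floor world": 1},
    n_docs=4,
)
print(query_keywords)
```

In this toy query the named entity ranks first, while "admission ticket", which occurs in every document, is kept but with a reduced weight.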
In this embodiment, when the keywords matching the query statement are obtained, the keywords of the query statement can be determined first. The similarity between each keyword of the query statement and each keyword in the keyword set is then calculated, and the keywords in the keyword set with high similarity to the keywords of the query statement are chosen as the matched keywords. For example, the keywords chosen from the keyword set of the candidate documents are those identical to a keyword of the query statement, together with those that are not identical but whose similarity exceeds 90%.
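A minimal sketch of this thresholded matching is shown below. The application does not say how similarity is computed; here the character-level ratio from Python's standard `difflib.SequenceMatcher` is swapped in purely as a stand-in, and the function name `match_keywords` is hypothetical.

```python
from difflib import SequenceMatcher


def match_keywords(query_keywords, keyword_set, threshold=0.9):
    """Return keywords identical to a query keyword or more similar than threshold."""
    matched = set()
    for query_kw in query_keywords:
        for kw in keyword_set:
            # Stand-in similarity: character-level matching ratio in [0, 1].
            similarity = SequenceMatcher(None, query_kw, kw).ratio()
            if kw == query_kw or similarity > threshold:
                matched.add(kw)
    return sorted(matched)


keyword_set = ["admission ticket", "admission tickets", "ticket office", "boarding"]
matched = match_keywords(["admission ticket"], keyword_set)
print(matched)
```

A semantic measure, such as the term-vector matching described later in this document, would catch synonyms that share no characters, which this character-level stand-in cannot.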
Step 104, a query result is extracted from the candidate documents corresponding to the matched keywords according to the keywords that match the query statement, and the query result is fed back to the user.
After the keywords matching the query statement are obtained, the candidate documents containing the matched keywords can be determined from all of the matched keywords, and the query result is extracted from those candidate documents. Specifically, when exactly one keyword matches the query statement, the sentence containing the matched keyword can be selected from the corresponding candidate document and fed back to the user as the query result, for example by displaying it on a human-computer interaction interface.
In this embodiment, when multiple keywords match the query statement, the matched keywords are sorted by their degree of match with the query statement, and the query result is extracted from the candidate document corresponding to the most relevant matched keyword.
Alternatively, for each candidate document sent by the developer, a score is obtained as a weighted sum over the keywords of the query statement and the keywords matching them, combining each keyword's weight in the query statement with its weight in the candidate document. The candidate document with the highest score is chosen, and the query result is selected from the segments of that document.
For example, suppose the query statement contains two keywords, A and B, and the keyword set of the candidate documents contains the synonyms A_1 and A_2 of A and the synonym B_1 of B. All candidate documents are then scored and ranked according to the keywords A and B and the matched keywords A_1, A_2 and B_1, using their base scores and their weights in the query statement and in the candidate documents.
For example, suppose the weight of A (and of A_1 and A_2) in the query statement is 0.6, the weight of B (and of B_1) in the query statement is 0.4, and the base score of each keyword is 1. Document 1 contains A_1 and B, with weights 0.1 and 0.05 in the document, so its score is 0.6*0.1+0.4*0.05+2*1=2.08. Document 2 contains A_2 and B_1, with weights 0.04 and 0.1 in the document, so its score is 0.6*0.04+0.4*0.1+2*1=2.064. Document 3 contains A and A_1, with weights 0.1 and 0.2 in the document, so its score is 0.6*0.1+0.6*0.2+1=1.18. Document 4 contains B, with weight 0.2 in the document, so its score is 0.4*0.2+1=1.08.
In the above example, when a document contains two or more keywords that match one another, the sum of the base scores of those keywords is still 1. In other words, the base score corresponds to the number of different keywords contained in the document.
In this embodiment, in order to recall documents that contain as many matched keywords as possible, a candidate document is given a fixed base score for each different keyword it contains. This avoids favoring a document merely because it repeats one particular keyword many times.
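The worked example above can be reproduced with a short Python sketch. The data structures and the name `score_document` are assumptions made for illustration; the weights and base score are the ones given in the example.

```python
# Weights of the query keywords in the query statement.
QUERY_WEIGHTS = {"A": 0.6, "B": 0.4}
# Each matched keyword maps to the query keyword it is a synonym of.
SYNONYM_OF = {"A": "A", "A_1": "A", "A_2": "A", "B": "B", "B_1": "B"}
BASE_SCORE = 1.0
# Weight of each keyword inside each candidate document.
DOC_WEIGHTS = {
    "document 1": {"A_1": 0.1, "B": 0.05},
    "document 2": {"A_2": 0.04, "B_1": 0.1},
    "document 3": {"A": 0.1, "A_1": 0.2},
    "document 4": {"B": 0.2},
}


def score_document(doc_weights):
    """Weighted sum of keyword weights plus one base score per distinct query keyword."""
    weighted_sum = 0.0
    matched_groups = set()
    for keyword, doc_weight in doc_weights.items():
        group = SYNONYM_OF.get(keyword)
        if group is None:
            continue  # keyword is unrelated to the query
        weighted_sum += QUERY_WEIGHTS[group] * doc_weight
        # Synonyms of the same query keyword share a single base score.
        matched_groups.add(group)
    return weighted_sum + BASE_SCORE * len(matched_groups)


scores = {doc: round(score_document(w), 3) for doc, w in DOC_WEIGHTS.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

Document 3 contains two keywords but both are forms of A, so it earns only one base score and ranks below documents 1 and 2, which cover both A and B.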
Further, when the current user query is associated with the previous round, documents in the current round that also appeared among the previous round's candidate documents can be given higher priority, that is, a higher score. Specifically, whether the current query is associated with the previous round can be determined from the degree of association between the query statement entered this time and the previous round's query statement, for example from whether the system's answer in the previous round was a guiding prompt and whether the input in the current round is a selection (such as "the second one").
In addition, to make the conversational system more intelligent, after the query result is fed back, the user can be shown a question asking whether the query result is satisfactory and be asked to enter "yes" or "no" as feedback. If the user enters "yes", the query result is remembered; if the user enters "no", it is not.
As one application scenario, the construction method of the embodiments of the application can be applied to an airport. Although airports already provide various documents for passengers' reference, so that passengers can learn in advance which articles may be carried, which must be declared, how early to board and so on, in practice large numbers of passengers still ask airport staff about all kinds of matters. To lighten the burden on airport staff, an intelligent robot that can converse with passengers becomes an important option. With the construction method of the embodiments of the application, an intelligent airport service conversational system can be built, and the developer only needs to send the documents related to the airport.
In the embodiments of the application, the candidate documents sent by the developer are analyzed to obtain the corresponding keyword set. When a user submits a query, the keywords matching the query statement entered by the user are determined, and the query result is extracted from the corresponding candidate documents through the matched keywords. As a result, when building a conversational system the developer only needs to supply candidate documents, without investing a large amount of work; the time cost is low, construction is convenient, and the method applies to conversational systems for a variety of scenarios.
In one embodiment of the application, the keyword set further includes the keyword positions corresponding to the multiple keywords, and the query result can be extracted according to the keyword positions. This is described below with reference to Fig. 2, which is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application.
As shown in Fig. 2, extracting the query result from the candidate documents corresponding to the matched keywords according to the keywords that match the query statement includes:
Step 201, the keyword position of the matched keyword in the candidate documents is obtained.
In this embodiment, the keyword position of a keyword is the position of that keyword in the candidate documents, determined when the candidate documents are analyzed. A keyword position can include two pieces of location information: the candidate document that contains the keyword, and the position of the keyword within that document.
For example, if the developer sent only one candidate document, the keyword position is the position of the keyword in that document. If the developer sent multiple candidate documents, the keyword position comprises the candidate document that contains the keyword and the position within that document.
The position of a keyword within a candidate document can be expressed as the page number on which the keyword appears and its position on that page. For example, the keyword "boarding process" is at characters 10 to 13 on page 3 of the candidate document.
Step 202, the corresponding candidate document is determined according to the keyword position of the matched keyword, and the query result is extracted from that candidate document.
Specifically, the candidate document containing the matched keyword and the keyword's position within it can be determined from the keyword position. The sentence containing the matched keyword is then located from that position and fed back to the user as the query result.
In the embodiments of the application, determining the corresponding candidate document from the position of the matched keyword allows the query result to be extracted quickly and accurately.
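Steps 201 and 202 can be sketched with a simple position index. As an assumption for illustration, a position is recorded here as (document id, sentence number, character offset) rather than the page number and in-page position mentioned above; the sample text and the names `build_position_index` and `extract_answers` are invented.

```python
import re


def split_sentences(text):
    """Split text into sentences, keeping the closing punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_position_index(documents, keywords):
    """Map each keyword to a list of (doc_id, sentence_no, char_offset) positions."""
    index = {}
    for doc_id, text in documents.items():
        for sentence_no, sentence in enumerate(split_sentences(text)):
            lowered = sentence.lower()
            for keyword in keywords:
                offset = lowered.find(keyword.lower())
                if offset != -1:
                    index.setdefault(keyword, []).append((doc_id, sentence_no, offset))
    return index


def extract_answers(index, documents, keyword):
    """Return the sentences that contain the matched keyword."""
    return [
        split_sentences(documents[doc_id])[sentence_no]
        for doc_id, sentence_no, _offset in index.get(keyword, [])
    ]


documents = {
    "airport": "Check in at counter B. The boarding process starts 40 minutes "
               "before departure. Keep your pass ready.",
}
position_index = build_position_index(documents, ["boarding process"])
print(position_index["boarding process"])
print(extract_answers(position_index, documents, "boarding process"))
```

Because the index is built once when the documents are analyzed, answering a query only requires a dictionary lookup followed by fetching the recorded sentence.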
The accuracy of the extracted query result can be improved by improving the accuracy with which matched keywords are determined. In one embodiment of the application, the keyword set can further include keyword term vectors for the multiple keywords, and the keywords matching the query statement are determined according to these term vectors. This is explained below with reference to Fig. 3, which is a schematic flowchart of another construction method of a conversational system provided by an embodiment of the application.
As shown in Fig. 3, obtaining the keywords that match the query statement includes:
Step 301, the first query statement term vector of the query statement is obtained.
In this embodiment, after the candidate documents sent by the developer are analyzed and the corresponding keywords are obtained, the term vectors of the keywords can be obtained; in an implementation, the term vector representation of each keyword can be obtained with word2vec. A term vector is a distributed representation of a word: it expresses the word as a combination of multidimensional features, which makes it convenient to measure how similar or different two words are. If the query statement is a single word, its term vector representation, that is, the first query statement term vector, can likewise be obtained with word2vec.
In this embodiment, keywords are extracted from the query statement, their term vector representations are obtained with word2vec, and the term vector representations of the keywords of the query statement serve as the first query statement term vector.
In practical applications, 128-dimensional term vector representations can be trained on a large-scale corpus with the Skip-Gram model of word2vec, so that most of the words in user questions and answers can be represented by 128-dimensional term vectors. To allow the similarity between two words to be measured with a basic notion of distance, each vector can be converted into a unit vector together with its norm. The unit vector better captures the semantics of a word, while the norm of the vector can, to some extent, express the degree of the word. For example, "like" and "love deeply" are close in meaning but differ considerably in degree.
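The decomposition of a term vector into a unit vector and its norm can be sketched as follows. The two-dimensional vectors are toy values chosen so that the arithmetic is easy to follow; real vectors would be the 128-dimensional word2vec outputs, and the name `to_unit_and_norm` is hypothetical.

```python
import math


def to_unit_and_norm(vector):
    """Split a vector into its direction (unit vector) and its length (norm)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector), 0.0
    return [x / norm for x in vector], norm


# Toy vectors: same direction (similar meaning), different length (degree).
like_unit, like_norm = to_unit_and_norm([3.0, 4.0])
love_unit, love_norm = to_unit_and_norm([6.0, 8.0])
print(like_unit, like_norm)
print(love_unit, love_norm)
```

In this toy case the two words share the same unit vector, so a distance computed on unit vectors treats them as semantically identical, while the norms still record the difference in degree.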
Step 302, it determines and the first matched keyword term vector of query statement term vector institute.
Wherein, keyword corresponding with the matched keyword term vector of the first query statement term vector is and query statement The keyword matched.
After obtaining the first query statement term vector, the first query statement vector key corresponding with candidate documents can be calculated Matching degree is more than the key of preset matching degree threshold value by the matching degree in set of words between the keyword term vector of each keyword Term vector, as with the first matched keyword term vector of query statement term vector institute, then with the first query statement term vector The corresponding keyword of the matched keyword term vector of institute, as with the matched keyword of query statement.It is understood that when full When the keyword term vector of sufficient condition has multiple, then also having with the matched keyword of query statement multiple.
In the embodiment of the present application, the keyword matched with the query statement is determined according to the first query statement term vector of the query statement and the keyword term vectors in the keyword set, which can improve the accuracy of the determined matched keyword.
In practical applications, users and documents often express the same meaning differently, so analyzing only the keywords in the query statement input by the user is far from sufficient. In order to recall more keywords, in an embodiment of the present application, near-synonym expansion can be performed on the query statement, and matched keywords can be determined according to the near-synonyms. This is described below with reference to Fig. 4, which is a flow diagram of another construction method of a conversational system provided by the embodiments of the present application.
As shown in Fig. 4, the construction method of the conversational system further includes:
Step 401: perform near-synonym expansion on the keywords in the query statement to generate near-synonyms of the keywords in the query statement.
In this embodiment, the keywords in the query statement can be expanded based on a dictionary, where the dictionary used to look up near-synonyms can be obtained from near-synonym tables mined and collected from the network; this covers some common variations in expression. Alternatively, the near-synonyms of the keywords of the query statement can be obtained based on query rewriting technology. Specifically, the query statement is expanded by mining features in query logs, and the keywords in the user's query statement are expanded through alignment at word granularity.
Alternatively, the near-synonyms of the keywords in the query statement can be obtained based on term vector representations. For example, a k-nearest-neighbor (k-NearestNeighbor, KNN) classifier is trained according to the term vectors, with each word assigned its own class. Subsequently, when a new word is input, its term vector can be obtained first, and the KNN classifier can then be used to obtain the words that are semantically closest to it according to the term vector. In this way, the KNN classifier is used to find, in all the candidate documents sent by the developer, a preset number of keywords that are closest to a keyword in the query statement; these are taken as the near-synonyms of the keyword in the query statement, and the near-synonyms are used in place of the original keyword of the query statement.
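When every word is its own class, the KNN lookup reduces to a nearest-neighbor search over the keyword term vectors. The sketch below uses a brute-force search with Euclidean distance as a stand-in for the trained classifier; the distance measure, the value of k, and the toy vectors are assumptions, not details given in the text.

```python
import heapq
import math

def nearest_keywords(word_vec, keyword_vectors, k=2):
    """Return the k keywords whose term vectors are closest to word_vec."""
    return heapq.nsmallest(
        k,
        keyword_vectors,  # iterating a dict yields its keys (the keywords)
        key=lambda kw: math.dist(word_vec, keyword_vectors[kw]),
    )

# Hypothetical keyword term vectors extracted from the candidate documents.
keyword_vectors = {
    "refund": [0.9, 0.1],
    "reimbursement": [1.0, 0.0],
    "weather": [0.0, 1.0],
}

# Term vector of a query keyword that does not occur in the documents (made-up values).
query_keyword_vec = [1.0, 0.05]
near_synonyms = nearest_keywords(query_keyword_vec, keyword_vectors, k=2)
```

The returned keywords can then stand in for the original query keyword when matching against the candidate documents.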
Thus, through near-synonym expansion, synonyms or near-synonymous expressions of the keywords in the query statement can be obtained, which expands the number of keywords corresponding to the query statement so that the query result is determined from as many candidate documents as possible. Moreover, for a keyword in the query statement that never appears in the candidate documents, its near-synonyms obtained by the above method can be used to score the candidate documents.
Step 402: obtain the second query statement term vector of the near-synonyms of the keywords in the query statement.
In this embodiment, the second query statement term vector of the near-synonyms of the keywords in the query statement can be obtained by means of word2vec. If a keyword in the query statement has multiple near-synonyms, a second query statement term vector is obtained for each near-synonym.
Step 403: determine the keyword term vector matched with the second query statement term vector.
In this embodiment, the method for determining the keyword term vector matched with the second query statement term vector is similar to the method for determining the keyword term vector matched with the first query statement term vector in the above embodiment, and is therefore not repeated here.
In the embodiment of the present application, near-synonym expansion is performed on the keywords in the query statement to generate near-synonyms of those keywords, matched keywords are determined according to the near-synonyms, and the candidate documents are scored according to both the keywords in the query statement and their near-synonyms, which can improve the accuracy of the query result.
In practical applications, a keyword may appear in multiple places in the same document. In order to improve the accuracy of the query result, in an embodiment of the present application, the segments of the candidate documents that contain keywords can be determined first, and the query result can then be obtained from those segments. This is described below with reference to Fig. 5, which is a flow diagram of yet another construction method of a conversational system provided by the embodiments of the present application.
As shown in Fig. 5, the above determining of the corresponding candidate document according to the keyword position of the matched keyword in the candidate documents, and extracting of the query result from the candidate document, includes:
Step 501: according to the keyword position of the matched keyword, determine candidate segments from among the segments of the candidate documents corresponding to the matched keyword, as first candidate answers.
In this embodiment, when the matched keywords occur in multiple candidate documents, a first preset number of the highest-scoring candidate documents can be selected, for example the 3 candidate documents with the highest scores. For the specific method, reference may be made to the method of scoring and ranking candidate documents described in the above embodiment, which is not repeated here.
For each segment of these candidate documents, the number of matched keywords contained in the segment is determined according to the positions of the matched keywords in the candidate document, and the importance of these keywords is calculated. Each segment is scored on this basis, and the segment with the highest score is selected as the candidate segment. Here, a segment refers to a sentence or half a sentence.
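One way to read the segment scoring above is sketched below. The text does not give the scoring formula; summing the importance weights of the matched keywords found in a segment, whitespace tokenization, and the weight values are all assumptions for illustration.

```python
def score_segments(segments, keyword_weights):
    """Return (score, number of matched keywords, segment index) for each segment."""
    scored = []
    for index, segment in enumerate(segments):
        tokens = set(segment.lower().split())
        hits = [kw for kw in keyword_weights if kw in tokens]
        score = sum(keyword_weights[kw] for kw in hits)
        scored.append((score, len(hits), index))
    return scored

def best_segment(segments, keyword_weights):
    """Pick the highest-scoring segment; ties go to more hits, then earlier position."""
    score, _, index = max(
        score_segments(segments, keyword_weights),
        key=lambda item: (item[0], item[1], -item[2]),
    )
    return segments[index], score

# Hypothetical segments of one candidate document and keyword importances.
segments = [
    "private trips cannot be reimbursed",
    "fuel for a private car can be reimbursed",
    "the office opens at nine",
]
keyword_weights = {"private": 0.5, "reimbursed": 0.7, "car": 0.3}
candidate_segment, candidate_score = best_segment(segments, keyword_weights)
```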
Alternatively, when the matched keywords occur in multiple candidate documents, for each segment of each candidate document, the segments containing keywords can be determined according to the keyword positions of the matched keywords, and each such segment can be scored according to the number and importance of the matched keywords it contains. The method is similar to the method of scoring candidate documents described above and is therefore not repeated here. A second preset number of the highest-scoring segments are then selected from all segments as candidate segments; for example, the 10 highest-scoring segments are selected. For ease of distinction, the candidate segments are referred to herein as first candidate answers.
Step 502: rank and post-process the first candidate answers to obtain second candidate answers.
In this embodiment, after the candidate segments containing keywords have been ranked based on the keywords in the query statement and their synonyms and the first candidate answers have been selected, the first candidate answers can further be re-ranked based on semantics.
In this embodiment, the first candidate answers can be ranked by a question-answering paragraph ranking technique. The main function of question-answering paragraph ranking is, given a question and candidate paragraphs that may contain the answer, to calculate by a learning-to-rank method the probability that each paragraph is the correct answer.
Fig. 6 is a schematic diagram of a semantics-based candidate answer ranking method provided by the embodiments of the present application. As shown in Fig. 6, the method is composed of multiple models such as question understanding, semantic matching, type features, and ranking aggregation. These models can be trained using data resources such as web page data, search logs, and knowledge graphs. In use, the query statement and the first candidate answers can be input into these models to obtain the scores of the first candidate answers, and all the first candidate answers are finally ranked according to their scores.
Wherein, question understanding refers to structured analysis of the query statement input by the user, including question identification, question classification, answer type identification, and the like; this model is obtained by training on the massive query logs accumulated in a search engine. Semantic matching refers to calculating the relevance between the question and a candidate paragraph from the perspective of text semantics; this model can use the Chinese pre-trained text representation model ERNIE and is trained, by means of transfer learning, on the massive click logs accumulated in a search engine. The type feature module refines features for different question categories and improves the question-answering effect for different types of questions.
When ranking the first candidate answers, for the case where one complete sentence contains multiple segments, not only the similarity between the query statement and the segment is considered, but also the similarity between the query statement and the title, and the similarity between the previous round's query statement and the current segment.
Because the same sentence may contain multiple segments, the multiple first candidate answers need to be post-processed in order to improve the accuracy of the query result. Post-processing mainly includes two operations. The first is removing answers with a high degree of repetition from the candidate answers; this situation arises mainly because, for the user's question, multiple answers can be found in the user's documents whose substantive content is the same. The second is merging adjacent answers in the same document, since such adjacent answers often together constitute the complete reply to the same question.
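The two post-processing operations can be sketched as follows. The text does not say how repetition is measured or how merged answers are scored; token-level Jaccard overlap, the 0.8 threshold, keeping the higher score on merge, and the field names `doc_id`, `seg_index`, `text` and `score` are all assumptions.

```python
def postprocess(candidates, overlap_threshold=0.8):
    """Deduplicate near-identical answers, then merge adjacent segments of one document.

    candidates: list of dicts with doc_id, seg_index, text, score, sorted best first.
    """
    # Operation 1: drop answers that largely repeat a better-ranked answer.
    kept = []
    for cand in candidates:
        tokens = set(cand["text"].lower().split())
        duplicate = False
        for other in kept:
            other_tokens = set(other["text"].lower().split())
            union = tokens | other_tokens
            if union and len(tokens & other_tokens) / len(union) >= overlap_threshold:
                duplicate = True
                break
        if not duplicate:
            kept.append(cand)

    # Operation 2: merge answers that are adjacent segments of the same document.
    kept.sort(key=lambda c: (c["doc_id"], c["seg_index"]))
    merged = []
    for cand in kept:
        last = merged[-1] if merged else None
        if (last is not None and last["doc_id"] == cand["doc_id"]
                and cand["seg_index"] == last["end_index"] + 1):
            last["text"] += " " + cand["text"]
            last["end_index"] = cand["seg_index"]
            last["score"] = max(last["score"], cand["score"])
        else:
            merged.append({**cand, "end_index": cand["seg_index"]})
    return sorted(merged, key=lambda c: c["score"], reverse=True)

first_candidates = [
    {"doc_id": "a", "seg_index": 3, "text": "Bring a valid ID card", "score": 0.9},
    {"doc_id": "b", "seg_index": 0, "text": "bring a valid id card", "score": 0.8},
    {"doc_id": "a", "seg_index": 4, "text": "Tickets are checked at the gate", "score": 0.6},
    {"doc_id": "c", "seg_index": 7, "text": "Parking is free", "score": 0.5},
]
second_candidates = postprocess(first_candidates)
```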
Step 503: generate the query result according to the number of second candidate answers.
After the first candidate answers are ranked and post-processed, the second candidate answers are obtained. There may be a single candidate answer, or one whose confidence is clearly higher than that of the other candidate answers; there may be multiple candidate answers with high confidence; or there may be no candidate answer, or all candidate answers may have low confidence. In the case of a single answer, the second candidate answer can be fed back directly to the user. However, candidate answers extracted directly from the candidate documents are often rather stiff, so in order to make the answer more accurate and closer to a human answer, the second candidate answer can be processed to generate a more human-like answer. Specifically, an ordinary answer sentence or a yes/no-type query result can be generated according to the type of the query statement.
If the query statement input by the user is not a yes/no-type question, a machine reading comprehension model is used to obtain an answer closer to a human answer. Machine reading comprehension refers to having a machine read text and then answer questions related to the content read; this technology gives a computer the ability to acquire knowledge from text data and answer questions.
As an example, the end-to-end multi-document reading comprehension model V-NET can be used to generate the query result. This model is an end-to-end neural network model based on a long short-term memory network (Long Short-Term Memory, LSTM) and a bidirectional attention mechanism, to which three prediction modules are added: an answer boundary prediction module, an answer content prediction module, and a multi-document answer verification module, which predict the answer from three aspects.
In practical applications, for some questions, simply cutting a segment out of the determined candidate answer as the reply does not give a proper response. For example, if the question is "Is an ID card needed to go to XXX?" and the candidate answer found is "Please carry a valid identity document with you", the appropriate reply would be "Needed", and no segment cut out of the original answer is an appropriate reply.
For such questions, in this embodiment, a yes/no answer generation algorithm is used to obtain the final query result, which is fed back to the user. First, the question and the second candidate answer are input into a classifier based on the pre-trained ERNIE model to obtain a classification of the second candidate answer, with respect to the question, as "affirmative", "negative", or "no opinion". In the affirmative case, the corresponding query result is further generated by a rule generator based on parts of speech and feature words. In the negative case, a negative answer is further generated on the basis of the generated affirmative answer, for example "not needed" is generated from "needed", and "do not go" from "go".
In practical applications, after the first candidate answers are ranked and post-processed, there may be several candidate answers with relatively high confidence, that is, there are multiple second candidate answers. For this multiple-answer case, in this embodiment, the query result can be fed back to the user by means of a guidance script, which helps the user understand how to reply and improves the completeness of the dialogue.
Wherein, a guidance script is used when the conversational system finds multiple answers relevant to the query statement and cannot tell which answer the user's question is after; in that case the user can be guided.
For example, the query statement input by the user is "Can a private trip be reimbursed by the company?". Candidate answer 1: fuel costs for driving a private car can be reimbursed by the company. Candidate answer 2: trips made for private reasons cannot be reimbursed. In this case the guidance script "Which of the following do you want to ask about? (1) driving a private car; (2) traveling for private reasons" can be generated.
When generating the guidance script, guide words can first be extracted from each candidate answer obtained by post-processing; one or more guide words may be extracted from a single candidate answer. In order to generate suitable guide words, candidate guide words are generated from the second candidate answers by a word importance analysis technique, and the quality of the guide words is then further optimized by certain strategies.
The guide word optimization strategies can specifically include: (1) filtering out function words and stop words according to lexical information; (2) aligning the granularity of guide words based on a large number of mined query statements, so that each guide word expresses a complete meaning; (3) deleting guide words that appear in multiple second candidate answers, and up-weighting key information that is closely related to the query statement and contained only in a particular candidate answer, so that the guide words are distinctive; (4) removing candidate guide words whose relevance to the query is below a preset threshold, and giving higher weight to candidate guide words that contain more of the keyword semantics of the query statement input by the user, so that the guide words are relevant; (5) where there are multiple candidate guide words, giving priority to combinations that have co-occurred in user search requests in the search logs, so that the guide words are meaningful in combination; (6) for special syntactic structures, appropriately adjusting the weight of specific words in the structure, for example giving lower weight to multiple coordinated words in a parallel structure.
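Strategies (1) and (3) above lend themselves to a short sketch: filter stop words, then drop candidate guide words (introducers) that occur in more than one candidate answer. The stop-word list and whitespace tokenization are assumptions, and the weighting, granularity alignment, and log-based strategies are deliberately left out.

```python
from collections import Counter

# Hypothetical stop-word list; a real system would use lexical analysis instead.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "can", "be", "for", "because"}

def distinctive_guide_words(answers):
    """For each answer, keep non-stop-word tokens that occur in no other answer."""
    token_lists = [
        [tok for tok in answer.lower().split() if tok not in STOP_WORDS]
        for answer in answers
    ]
    # Count in how many answers each token appears.
    answer_count = Counter(tok for tokens in token_lists for tok in set(tokens))
    return [[tok for tok in tokens if answer_count[tok] == 1] for tokens in token_lists]

answers = [
    "driving a private car fuel can be reimbursed",
    "private trips cannot be reimbursed",
]
guide_word_candidates = distinctive_guide_words(answers)
```

Words shared by both answers ("private", "reimbursed") are removed because they cannot tell the two answers apart.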
In summary, the following four principles can be followed when generating guide words: a) distinctiveness: guide words can distinguish different candidate answers; b) importance: guide words are the main words in the candidate reply; c) relevance: guide words should have a certain relevance to the keywords in the query statement; d) conciseness: guide words should be condensed and have strong coverage, rather than being overly specific.
For example, the user inputs the query statement "How big is Disney?". The candidate answers are: (1) Shanghai Disney is the largest Disney theme park in Asia, with an area of 1234 hectares; (2) Hong Kong Disneyland covers about 3000 mu. The generated guidance script is: "Which of the following do you want to ask about? (1) the area of Shanghai Disney; (2) the land occupied by Hong Kong Disneyland". In the two options, the guide words "Shanghai" and "Hong Kong" are distinctive, "Disney" is a word with importance, "Disney" and "area" are relevant, and using "area" and "land occupied" rather than specific numbers is concise.
In this embodiment, when there are multiple second candidate answers, guide words can be generated from the second candidate answers, a guidance script can be generated based on the guide words, and the guidance script can be fed back to the user as the query result. The conversational system can thus interact with the user through multiple rounds of guidance, helping the user find the target answer when the initial question is ambiguous.
For the case where the number of second candidate answers obtained after post-processing is zero, it can be considered that the answer to the current query statement is not in the candidate documents provided by the developer. In this case, a sentence such as "I don't know how to answer that" or "Sorry, I can't answer that" can be returned to the user.
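The three cases of step 503 (no answer, a single answer, multiple answers) can be sketched as a simple dispatch. The guide phrases are taken as given inputs here, the reply wording is illustrative, and the further polishing of a single answer (reading comprehension or yes/no generation) is omitted.

```python
FALLBACK_REPLY = "Sorry, I can't answer that."

def build_query_result(second_candidates):
    """second_candidates: list of (guide phrase, answer text) pairs."""
    if not second_candidates:
        # Zero answers: the answer is assumed not to be in the candidate documents.
        return FALLBACK_REPLY
    if len(second_candidates) == 1:
        # Single answer: return it directly.
        return second_candidates[0][1]
    # Multiple answers: build a guidance prompt from the guide phrases.
    options = "; ".join(
        f"({number}) {guide}"
        for number, (guide, _) in enumerate(second_candidates, start=1)
    )
    return f"Which of the following do you want to ask about? {options}"

candidates = [
    ("driving a private car", "Fuel costs for driving a private car can be reimbursed."),
    ("traveling for private reasons", "Trips made for private reasons cannot be reimbursed."),
]
guidance = build_query_result(candidates)
```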
In the embodiment of the present application, candidate segments are determined, according to the positions of the matched keywords in the candidate documents, from among the segments of the candidate documents corresponding to the matched keywords, and are taken as first candidate answers. The first candidate answers are then ranked and post-processed to obtain second candidate answers, and the query result is generated according to the number of second candidate answers. The query result can thus be generated from the candidate answers finally selected, which improves the accuracy of the query result.
In practical applications, the conversational system may include a test mode and a tuning mode. In test mode, the conversational system can determine query results in the manner described above. The biggest difference between tuning mode and test mode is that in tuning mode the conversational system replies each time by providing multiple candidate items for the user to choose from, each presented as a complete statement; the user can then select from the candidate items the answer considered appropriate for the question, and this tuning result takes effect.
In order to improve the accuracy of the constructed conversational system, in an embodiment of the present application, the question-answer history information of each round of the conversational system can be recorded in a list, for example the query statement of each round, the keywords matched with the query statement, the candidate documents corresponding to the matched keywords, the query result, and so on.
In addition, through a user's dialogue with a robot equipped with the conversational system, a considerable amount of knowledge is in effect obtained from the user. This knowledge can be solidified so that other users can use it directly; therefore, the knowledge obtained during the dialogue can be saved. Here, knowledge can be understood as a query statement input by a user and the answer corresponding to that query statement.
In this embodiment, the conversational system has memory in both test mode and tuning mode. The difference is that memory in test mode takes effect only for the current user session, that is, short-term memory, whereas memory in tuning mode takes effect for all users in test mode and tuning mode, that is, long-term memory. This memory design both enables the developer to effectively intervene in the results generated by the system and have that intervention take effect for all of the developer's users, and allows users themselves to enjoy the advantages of memory within a single dialogue.
In this embodiment, the developer can intervene in the query results fed back by the conversational system in tuning mode, and the intervention results take effect on the model incrementally. When a subsequent user asks a question that is close in semantics or wording, the conversational system feeds back the query result to the user in accordance with the developer's intervention result. When a user interacts with the conversational system, the conversational system can also use an attention mechanism to directly give the answer to a question synonymous with one the user asked earlier.
The construction method of the conversational system of the embodiments of the present application is described below with reference to Fig. 7. Fig. 7 is an overall schematic diagram of a construction method of a conversational system provided by the embodiments of the present application.
As shown in Fig. 7, there are mainly two stages: a training stage and a test stage.
Wherein, the training stage can be understood as the document-side offline processing stage, which includes basic processing, obtaining the keywords and inverted index of the documents, and constructing the term-vector-based KNN classifier.
Basic processing mainly includes transcoding the documents, removing special characters irrelevant to semantics, splitting the documents into sentences (that is, cutting out complete semantic segments), performing lexical analysis on each semantic segment, and so on. Through basic processing, the conversational system gains a basic understanding of the documents uploaded by the user. At this stage, the candidate documents sent by the developer are analyzed, including lexical processing such as sentence splitting, word segmentation, and part-of-speech tagging, to obtain the keywords in the candidate documents and the keyword term vectors, from which the keyword set corresponding to the candidate documents can be obtained.
Obtaining the keywords and inverted index of the documents mainly involves comprehensively measuring the importance of words in the documents based on multiple algorithms such as TF-IDF; the result is mainly used for coarse localization of the candidate documents and answer passages corresponding to matched keywords.
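A minimal sketch of TF-IDF weighting together with a positional inverted index is given below. The text names TF-IDF only as one of several importance measures and does not give a formula; the variant tf × log(N/df), the pre-tokenized input, and the function name are assumptions.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """docs: dict mapping doc_id to a list of tokens.

    Returns (tfidf, inverted) where tfidf[doc_id][word] is the word's weight and
    inverted[word] is a list of (doc_id, position) occurrences.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document

    tfidf = {}
    inverted = defaultdict(list)
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        tfidf[doc_id] = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        for position, word in enumerate(tokens):
            inverted[word].append((doc_id, position))
    return tfidf, dict(inverted)

docs = {
    "d1": ["private", "trip", "reimburse"],
    "d2": ["private", "car", "fuel"],
}
tfidf, inverted = build_index(docs)
```

A word that occurs in every document ("private") gets weight zero under this variant, while the inverted index still records where it occurs, which is what the coarse localization step needs.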
Constructing the term-vector-based KNN classifier: in practical applications, 128-dimensional term vector representations can be trained on a large-scale corpus using the Skip-Gram model in word2vec, so that most words in user questions and answers can be represented by 128-dimensional term vectors. To make it easy to measure the similarity between two words with a basic notion of distance, each vector representation can be converted into a unit vector together with its norm. The unit vector better captures the semantics of a word, while the norm of the vector can, to some extent, indicate the degree of the word. For example, the words "like" and "deeply love" are semantically close but differ considerably in degree.
After the term vectors of the words in the candidate documents are obtained, a KNN classifier is constructed according to the term vectors, with each word assigned its own class. Subsequently, when a new word is input, its term vector can be obtained first, and the KNN classifier can then be used to obtain the word that is semantically closest to it according to the term vector.
The test stage includes data loading, dictionary import, query statement processing, knowledge saving, history maintenance, answer display, and so on.
Data loading includes loading the document data and loading the knowledge previously trained by users. Loading the document data refers to loading the lexical analysis results, keywords, keyword term vectors, and so on obtained by the document-side offline processing.
Loading the knowledge previously trained by users refers to the fact that users may previously have "tuned" the robot to some extent, and this knowledge also needs to be imported. It includes direct question-answer mappings as well as adjustments to the keywords of documents and sentences and their weights. Adjusting weights involves an overall weight adjustment problem. Overall weight adjustment means the following: suppose that before adjustment the average weight of the words in a document is x (for example, 0.1); after a certain word is up-weighted, the average weight of the words becomes greater than x, and the average weight then needs to be brought back to x (equivalent to reducing the weight of every word by a certain proportion). This is to avoid unbounded growth of the weights.
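The overall weight adjustment can be sketched as follows: after one word is up-weighted, every weight is multiplied by the ratio of the old average to the new average, so that the average returns to x. The function name, the multiplicative boost factor, and the sample weights are assumptions for illustration.

```python
def boost_and_renormalize(weights, word, factor):
    """Multiply one word's weight by factor, then rescale all weights so the
    average weight equals its value before the boost."""
    old_mean = sum(weights.values()) / len(weights)
    boosted = dict(weights)
    boosted[word] *= factor
    new_mean = sum(boosted.values()) / len(boosted)
    if new_mean == 0:
        return boosted  # all weights are zero; nothing to rescale
    scale = old_mean / new_mean
    return {w: value * scale for w, value in boosted.items()}

# Average weight x = 0.1 before adjustment, as in the example in the text.
weights = {"refund": 0.1, "trip": 0.1, "car": 0.1}
adjusted = boost_and_renormalize(weights, "refund", 2.0)
adjusted_mean = sum(adjusted.values()) / len(adjusted)
```

With these values the boosted word ends at about 0.15 and the others at about 0.075, so the average stays at 0.1 while the relative importance of the boosted word is preserved.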
Dictionary import includes importing the index-expression dictionary, importing the greeting-expression dictionary, and so on.
Dialogue logic processing: the query statement input by the user is judged and the corresponding logical action is taken. Specifically, this includes: judging and processing control-flow command-type query statements, that is, handling command-type query statements such as exit (end and exit the server, saving the results of this training session) and clear (clear the history); lexical analysis and keyword acquisition for the query statement, that is, performing lexical analysis on the input query statement and extracting its keywords, where keyword extraction is mainly based on word importance, with weights further adjusted using part of speech; judging the correlation between the current round of question and answer and the previous round; processing special types of query; and the answering flow for general query statements.
The processing of special types of query statement covers greeting-type input, index-type input, previously trained input, and questions asking whether the user is satisfied with the query result. For index-type input, such as "the first one" or "number two", the candidate item with the corresponding number is extracted from the previous round's answer as the answer. It should be noted that the candidate item is completed at this point, that is, the content previously omitted for brevity is filled in. For example, the previous round's question is "Can a private trip be reimbursed by the company?" and the corresponding query result is "Which of the following do you want to ask about? (1) driving a private car; (2) traveling for private reasons". If the user's input in the current round is "the second one", the corresponding query result is "Trips made for private reasons cannot be reimbursed".
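Resolving an index-type input against the previous round's candidates can be sketched as follows. The small ordinal table and the digit fallback are assumptions standing in for the index-expression dictionary mentioned in the text; the candidates are stored as complete answers so that the reply is not abbreviated.

```python
import re

# Hypothetical ordinal table standing in for the index-expression dictionary.
ORDINALS = {"first": 1, "second": 2, "third": 3}

def resolve_index_input(user_input, last_candidates):
    """Return the full candidate answer the user refers to, or None if unresolved."""
    text = user_input.lower()
    index = None
    for word, value in ORDINALS.items():
        if re.search(rf"\b{word}\b", text):
            index = value
            break
    if index is None:
        digits = re.search(r"\d+", text)
        if digits:
            index = int(digits.group())
    if index is None or not 1 <= index <= len(last_candidates):
        return None
    return last_candidates[index - 1]

# Full answers behind the options shown in the previous round.
last_candidates = [
    "Fuel costs for driving a private car can be reimbursed by the company.",
    "Trips made for private reasons cannot be reimbursed.",
]
reply = resolve_index_input("the second one", last_candidates)
```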
For greeting-type input, a reply can be given by simple matching. For previously trained input, a reply to the previously trained question can be given directly by dictionary lookup with the keywords as the key.
The answering flow for a general query statement is as follows. The query statement input by the user is analyzed to extract keywords, and the keywords are then expanded, that is, synonyms or near-synonyms of the keywords in the query statement are obtained so as to expand the keywords corresponding to the query statement. The candidate documents corresponding to the keywords matched with the query statement are then ranked according to the keywords of the query statement, answers are selected from the top preset number of candidate documents and ranked, and the ranking results are post-processed. A precise reply is generated according to the post-processed results, and the answer is then displayed, that is, the query result is shown to the user. In order to improve the performance of the conversational system, history maintenance and knowledge saving can also be performed, for example by recording the question-answer history information of each round of the conversational system in a list and saving the knowledge obtained during the dialogue.
In practical applications, templates for the various types of query statement can be preset in the conversational system; for example, command-type query statements, greeting-type query statements, and so on each have corresponding templates. After the user's query statement is obtained, it can be compared with the templates to determine its type, and the corresponding processing can then be performed on it.
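Template-based type detection can be sketched with a small ordered list of patterns. The use of regular expressions as templates, the specific patterns, and the type names are assumptions; only the commands exit and clear come from the text.

```python
import re

# Hypothetical templates, checked in order; anything unmatched is a general query.
TEMPLATES = [
    ("command", re.compile(r"^(exit|clear)$")),
    ("greeting", re.compile(r"^(hi|hello|good (morning|evening))\b")),
    ("index", re.compile(r"^(the )?(first|second|third|number \d+)( one)?$")),
]

def classify_query(query):
    """Return the type of the first template that matches the normalized query."""
    text = query.strip().lower()
    for query_type, pattern in TEMPLATES:
        if pattern.search(text):
            return query_type
    return "general"

query_types = [
    classify_query(q)
    for q in ["exit", "Hello there", "the second one", "Can private trips be reimbursed?"]
]
```

Queries classified as "general" would go through the keyword extraction and answer flow, while the other types are handled by their dedicated logic.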
In order to implement the above embodiments, the embodiments of the present application further propose a construction device of a conversational system. Fig. 8 is a structural schematic diagram of a construction device of a conversational system provided by the embodiments of the present application.
As shown in Fig. 8, the construction device of the conversational system includes:
Receiving module 610, for receiving the candidate documents of developer's transmission;
Generation module 620, for being analyzed candidate documents to generate the corresponding keyword set of candidate documents, In, keyword set includes the multiple keywords extracted from candidate documents;
First obtaining module 630, for receiving the query statement of a user and obtaining the keyword matched with the query statement; and
Enquiry module 640, for extracting, according to the keyword matched with the query statement, a query result from the candidate document corresponding to the matched keyword, and feeding the query result back to the user.
In a possible implementation of the embodiment of the present application, the above keyword set includes the keyword positions corresponding to the multiple keywords, and the above enquiry module 640 includes:
Acquiring unit, for obtaining keyword position of the matched keyword in candidate documents;And
Extraction unit, for determining corresponding candidate text according to keyword position of the matched keyword in candidate documents Shelves, and query result is extracted from candidate documents.
In a possible implementation of the embodiment of the present application, the above keyword set further includes the keyword term vectors of the multiple keywords, and the above first obtaining module is specifically used for:
Obtain the first query statement term vector of query statement;
Determine the keyword term vector matched with the first query statement term vector, wherein the keyword corresponding to the keyword term vector matched with the first query statement term vector is the keyword matched with the query statement.
In a kind of possible implementation of the embodiment of the present application, the device further include:
Enlargement module expands for carrying out near synonym to the keyword in query statement to generate keyword in query statement Near synonym;
Second obtains module, for obtaining the second query statement vector of the near synonym of keyword in query statement;
Determining module, for determining and the second matched keyword term vector of query statement term vector institute, wherein with second The corresponding keyword of the matched keyword term vector of query statement term vector be and the matched keyword of query statement.
In a possible implementation of the embodiment of the present application, the expansion module is specifically configured to:
Obtain the near-synonyms of the keywords in the query statement according to a classifier trained in advance.
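The near-synonym expansion can be sketched as follows. The pre-trained classifier is represented here by a hypothetical callable `is_synonym(a, b)`; the fixed lookup table that stands in for it, the function name `expand_with_synonyms`, and the sample vocabulary are assumptions for illustration and say nothing about how the actual classifier is built.

```python
def expand_with_synonyms(query_keywords, vocabulary, is_synonym):
    """Collect vocabulary words that the classifier judges to be near-synonyms
    of any keyword in the query statement, preserving first-seen order."""
    expanded = []
    for keyword in query_keywords:
        for candidate in vocabulary:
            if candidate == keyword or candidate in expanded:
                continue
            if is_synonym(keyword, candidate):
                expanded.append(candidate)
    return expanded


# Stand-in for the pre-trained classifier: a fixed table of synonym pairs.
SYNONYM_PAIRS = {
    frozenset(("refund", "reimbursement")),
    frozenset(("cost", "price")),
}


def toy_classifier(a, b):
    """Return True when the pair (a, b) is listed as near-synonyms."""
    return frozenset((a, b)) in SYNONYM_PAIRS


vocabulary = ["reimbursement", "price", "delivery"]
expanded = expand_with_synonyms(["refund", "cost"], vocabulary, toy_classifier)
print(expanded)
```

The expanded words would then be vectorized and matched against the keyword term vectors in the same way as the original query keywords.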
In a possible implementation of the embodiment of the present application, the extraction unit is specifically configured to:
Determine, according to the keyword position of the matched keyword, candidate segments from the segments of the candidate document corresponding to the matched keyword, as first candidate answers;
Extend each candidate segment into a complete sentence to serve as a candidate answer;
Rank and screen the first candidate answers to obtain second candidate answers; and
Generate the query result according to the number of second candidate answers.
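The extraction steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the method of the application: sentences are split naively on `.`, `!` and `?`, ranking counts how many query terms a sentence contains, and screening keeps the top `top_k` sentences with a non-zero score. All function names and the sample text are hypothetical.

```python
import re


def sentence_spans(text):
    """Return (start, end) character spans of sentences (naive split after . ! ?)."""
    spans, start = [], 0
    for match in re.finditer(r"[.!?]", text):
        spans.append((start, match.end()))
        start = match.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def first_candidates(text, positions):
    """Extend each keyword position to the complete sentence that contains it."""
    candidates = []
    for start, end in sentence_spans(text):
        if any(start <= pos < end for pos in positions):
            sentence = text[start:end].strip()
            if sentence and sentence not in candidates:
                candidates.append(sentence)
    return candidates


def second_candidates(candidates, query_terms, top_k=2):
    """Rank by number of query terms present, drop zero scores, keep top_k."""
    def score(sentence):
        lowered = sentence.lower()
        return sum(term.lower() in lowered for term in query_terms)

    ranked = sorted(candidates, key=score, reverse=True)
    return [s for s in ranked if score(s) > 0][:top_k]


text = ("Refunds are issued within 7 days. Shipping takes 3 days. "
        "Refunds need a receipt.")
positions = [m.start() for m in re.finditer("Refunds", text)]
first = first_candidates(text, positions)
second = second_candidates(first, ["refunds", "receipt"])
print(second)
```

The naive splitter would break on decimals such as "3.5"; a production system would need a proper sentence segmenter.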
In a possible implementation of the embodiment of the present application, the extraction unit is further configured to:
If the number of second candidate answers is one, generate the query result according to the type of the query statement; and
If the number of second candidate answers is more than one, extract an introducer from each second candidate answer and generate the query result based on the introducers.
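The branch on the number of second candidate answers can be sketched as below. This passage does not define the introducer or the query statement types, so the sketch makes loud assumptions: the introducer is taken to be the leading phrase of each candidate (the text before the first comma), multiple candidates produce a clarifying question built from those phrases, and the query types and templates in `TEMPLATES` are invented for the example.

```python
# Hypothetical query types and answer templates (not from the application).
TEMPLATES = {
    "when": "Timing: {answer}",
    "how": "Steps: {answer}",
}


def extract_introducer(candidate):
    """Assumption: the introducer is the phrase before the first comma."""
    return candidate.split(",")[0].strip()


def generate_result(candidates, query_type):
    """Build the query result from the second candidate answers."""
    if not candidates:
        return "No answer was found."
    if len(candidates) == 1:
        # Single answer: format it according to the type of the query statement.
        template = TEMPLATES.get(query_type, "{answer}")
        return template.format(answer=candidates[0])
    # Several answers: ask the user to choose, using each answer's introducer.
    introducers = [extract_introducer(c) for c in candidates]
    return "Which case do you mean: " + " or ".join(introducers) + "?"


print(generate_result(
    ["For domestic orders, shipping takes 3 days.",
     "For international orders, shipping takes 10 days."],
    "when",
))
```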
In a possible implementation of the embodiment of the present application, the device further includes:
A saving module, configured to record the question-and-answer history of each round of the conversational system in a list, and to save the knowledge obtained during the dialogue.
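A list-based history of this kind can be sketched as follows. The class name `DialogueMemory`, its fields, and the dictionary used for the saved knowledge are assumptions for illustration; the application only states that each round is recorded in a list and that knowledge obtained during the dialogue is saved.

```python
class DialogueMemory:
    """Keeps one list entry per dialogue round plus facts learned along the way."""

    def __init__(self):
        self.history = []    # question-and-answer record, one dict per round
        self.knowledge = {}  # knowledge obtained during the dialogue

    def record_round(self, question, answer, learned=None):
        """Append the current round and merge any newly learned facts."""
        self.history.append({
            "round": len(self.history) + 1,
            "question": question,
            "answer": answer,
        })
        if learned:
            self.knowledge.update(learned)


memory = DialogueMemory()
memory.record_round("What is the refund window?", "7 days",
                    learned={"refund_window": "7 days"})
memory.record_round("Do I need a receipt?", "Yes")
print(len(memory.history), memory.knowledge)
```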
It should be noted that the foregoing explanation of the embodiments of the construction method of the conversational system also applies to the construction device of the conversational system of this embodiment, and is therefore not repeated here.
With the construction device of the conversational system of the embodiment of the present application, candidate documents sent by a developer are received and analyzed to generate a keyword set corresponding to the candidate documents, wherein the keyword set includes multiple keywords extracted from the candidate documents; a query statement of a user is received and a keyword matching the query statement is obtained; and, according to the keyword matching the query statement, a query result is extracted from the candidate document corresponding to the matched keyword and fed back to the user. In this embodiment, the keyword set is obtained by analyzing the candidate documents sent by the developer. When a user submits a query, the keyword matching the query statement entered by the user is determined, and the query result is extracted from the candidate document corresponding to that keyword. As a result, to construct a conversational system the developer only needs to provide candidate documents, without investing a large amount of work. The time cost is low, construction is convenient, and the approach can be applied to building conversational systems for a variety of scenarios, so it is widely applicable.
To implement the above embodiments, an embodiment of the present application further provides a computer device, including a processor and a memory;
Wherein the processor reads the executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the construction method of the conversational system, or the prediction method of the language model, described in the above embodiments.
Fig. 9 shows a block diagram of an exemplary computer device suitable for implementing embodiments of the present application. The computer device 12 shown in Fig. 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 9, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (hereinafter: ISA) bus, the Micro Channel Architecture (hereinafter: MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (hereinafter: VESA) local bus, and the Peripheral Component Interconnect (hereinafter: PCI) bus.
The computer device 12 typically includes a variety of computer system readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as a random access memory (hereinafter: RAM) 30 and/or a cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 9, commonly referred to as a "hard disk drive"). Although not shown in Fig. 9, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory (hereinafter: CD-ROM), a Digital Video Disc Read-Only Memory (hereinafter: DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules that are configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
The computer device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, and the like), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (such as a network card, a modem, and the like) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The computer device 12 may also communicate with one or more networks (such as a local area network (hereinafter: LAN), a wide area network (hereinafter: WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the computer device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the methods described in the foregoing embodiments.
To implement the above embodiments, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the construction method of the conversational system described in the above embodiments is implemented.
In the description of this specification, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.
It should be understood that each part of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present application.

Claims (18)

1. A construction method of a conversational system, characterized by comprising:
Receiving candidate documents sent by a developer;
Analyzing the candidate documents to generate a keyword set corresponding to the candidate documents, wherein the keyword set includes multiple keywords extracted from the candidate documents;
Receiving a query statement of a user, and obtaining a keyword matching the query statement; and
Extracting, according to the keyword matching the query statement, a query result from the candidate document corresponding to the matched keyword, and feeding the query result back to the user.
2. The construction method of a conversational system according to claim 1, characterized in that the keyword set includes keyword positions corresponding to the multiple keywords, and wherein extracting, according to the keyword matching the query statement, a query result from the candidate document corresponding to the matched keyword comprises:
Obtaining the keyword position of the matched keyword in the candidate documents; and
Determining the corresponding candidate document according to the keyword position of the matched keyword in the candidate documents, and extracting the query result from that candidate document.
3. The construction method of a conversational system according to claim 2, characterized in that the keyword set further includes keyword term vectors of the multiple keywords, and wherein obtaining a keyword matching the query statement comprises:
Obtaining a first query statement term vector of the query statement; and
Determining the keyword term vector matched by the first query statement term vector, wherein the keyword corresponding to the keyword term vector that matches the first query statement term vector is the keyword matching the query statement.
4. The construction method of a conversational system according to claim 3, characterized by further comprising:
Performing near-synonym expansion on the keywords in the query statement to generate near-synonyms of the keywords in the query statement;
Obtaining a second query statement term vector of the near-synonyms of the keywords in the query statement; and
Determining the keyword term vector matched by the second query statement term vector, wherein the keyword corresponding to the keyword term vector that matches the second query statement term vector is the keyword matching the query statement.
5. The construction method of a conversational system according to claim 4, characterized in that performing near-synonym expansion on the keywords in the query statement to generate near-synonyms of the keywords in the query statement comprises:
Obtaining the near-synonyms of the keywords in the query statement according to a classifier trained in advance.
6. The construction method of a conversational system according to claim 2, characterized in that extracting, according to the keyword matching the query statement, a query result from the candidate document corresponding to the matched keyword comprises:
Determining, according to the keyword position of the matched keyword, candidate segments from the segments of the candidate document corresponding to the matched keyword, as first candidate answers;
Ranking and post-processing the first candidate answers to obtain second candidate answers; and
Generating the query result according to the number of second candidate answers.
7. The construction method of a conversational system according to claim 6, characterized in that generating the query result according to the number of second candidate answers comprises:
If the number of second candidate answers is one, generating the query result according to the type of the query statement; and
If the number of second candidate answers is more than one, extracting an introducer from each second candidate answer and generating the query result based on the introducers.
8. The construction method of a conversational system according to claim 1, characterized by further comprising:
Recording the question-and-answer history of each round of the conversational system in a list, and saving the knowledge obtained during the dialogue.
9. A construction device of a conversational system, characterized by comprising:
A receiving module, configured to receive candidate documents sent by a developer;
A generation module, configured to analyze the candidate documents to generate a keyword set corresponding to the candidate documents, wherein the keyword set includes multiple keywords extracted from the candidate documents;
A first acquisition module, configured to receive a query statement of a user and to obtain a keyword matching the query statement; and
A query module, configured to extract, according to the keyword matching the query statement, a query result from the candidate document corresponding to the matched keyword, and to feed the query result back to the user.
10. The construction device of a conversational system according to claim 9, characterized in that the keyword set includes keyword positions corresponding to the multiple keywords, and the query module comprises:
An acquiring unit, configured to obtain the keyword position of the matched keyword in the candidate documents; and
An extraction unit, configured to determine the corresponding candidate document according to the keyword position of the matched keyword in the candidate documents, and to extract the query result from that candidate document.
11. The construction device of a conversational system according to claim 10, characterized in that the keyword set further includes keyword term vectors of the multiple keywords, and the first acquisition module is specifically configured to:
Obtain a first query statement term vector of the query statement; and
Determine the keyword term vector matched by the first query statement term vector, wherein the keyword corresponding to the keyword term vector that matches the first query statement term vector is the keyword matching the query statement.
12. The construction device of a conversational system according to claim 11, characterized by further comprising:
An expansion module, configured to perform near-synonym expansion on the keywords in the query statement to generate near-synonyms of the keywords in the query statement;
A second acquisition module, configured to obtain a second query statement term vector of the near-synonyms of the keywords in the query statement; and
A determining module, configured to determine the keyword term vector matched by the second query statement term vector, wherein the keyword corresponding to the keyword term vector that matches the second query statement term vector is the keyword matching the query statement.
13. The construction device of a conversational system according to claim 12, characterized in that the expansion module is specifically configured to:
Obtain the near-synonyms of the keywords in the query statement according to a classifier trained in advance.
14. The construction device of a conversational system according to claim 10, characterized in that the extraction unit is specifically configured to:
Determine, according to the keyword position of the matched keyword, candidate segments from the segments of the candidate document corresponding to the matched keyword, as first candidate answers;
Rank and post-process the first candidate answers to obtain second candidate answers; and
Generate the query result according to the number of second candidate answers.
15. The construction device of a conversational system according to claim 14, characterized in that the extraction unit is further configured to:
If the number of second candidate answers is one, generate the query result according to the type of the query statement; and
If the number of second candidate answers is more than one, extract an introducer from each second candidate answer and generate the query result based on the introducers.
16. The construction device of a conversational system according to any one of claims 9-15, characterized by further comprising:
A saving module, configured to record the question-and-answer history of each round of the conversational system in a list, and to save the knowledge obtained during the dialogue.
17. A computer device, characterized by comprising a processor and a memory;
Wherein the processor reads the executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the construction method of a conversational system according to any one of claims 1-8.
18. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the construction method of a conversational system according to any one of claims 1-8 is implemented.
CN201910578623.XA 2019-06-28 2019-06-28 Method and device for constructing dialog system, computer equipment and storage medium Active CN110196901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910578623.XA CN110196901B (en) 2019-06-28 2019-06-28 Method and device for constructing dialog system, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910578623.XA CN110196901B (en) 2019-06-28 2019-06-28 Method and device for constructing dialog system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110196901A true CN110196901A (en) 2019-09-03
CN110196901B CN110196901B (en) 2022-02-11

Family

ID=67755527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910578623.XA Active CN110196901B (en) 2019-06-28 2019-06-28 Method and device for constructing dialog system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110196901B (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598078A (en) * 2019-09-11 2019-12-20 京东数字科技控股有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
CN110717029A (en) * 2019-10-15 2020-01-21 支付宝(杭州)信息技术有限公司 Information processing method and system
CN110795549A (en) * 2019-10-31 2020-02-14 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN110955767A (en) * 2019-12-04 2020-04-03 中国太平洋保险(集团)股份有限公司 Algorithm and device for generating intention candidate set list set in robot dialogue system
CN111061839A (en) * 2019-12-19 2020-04-24 过群 Combined keyword generation method and system based on semantics and knowledge graph
CN111104476A (en) * 2019-12-19 2020-05-05 用友网络科技股份有限公司 Archive data generation method, archive data generation device, and readable storage medium
CN111177367A (en) * 2019-11-11 2020-05-19 腾讯科技(深圳)有限公司 Case classification method, classification model training method and related products
CN111177339A (en) * 2019-12-06 2020-05-19 百度在线网络技术(北京)有限公司 Dialog generation method and device, electronic equipment and storage medium
CN111198940A (en) * 2019-12-27 2020-05-26 北京百度网讯科技有限公司 FAQ method, question-answer search system, electronic device, and storage medium
CN111339278A (en) * 2020-02-28 2020-06-26 支付宝(杭州)信息技术有限公司 Method and device for generating training speech generating model and method and device for generating answer speech
CN111382256A (en) * 2020-03-20 2020-07-07 北京百度网讯科技有限公司 Information recommendation method and device
CN111460092A (en) * 2020-03-11 2020-07-28 中国电子科技集团公司第二十八研究所 Multi-document-based automatic complex problem solving method
CN111522938A (en) * 2020-04-27 2020-08-11 广东电网有限责任公司培训与评价中心 Method, device and equipment for screening talent performance documents
CN111523304A (en) * 2020-04-27 2020-08-11 华东师范大学 Automatic generation method of product description text based on pre-training model
CN111767386A (en) * 2020-07-31 2020-10-13 腾讯科技(深圳)有限公司 Conversation processing method and device, electronic equipment and computer readable storage medium
CN111966869A (en) * 2020-07-07 2020-11-20 北京三快在线科技有限公司 Phrase extraction method and device, electronic equipment and storage medium
CN111988479A (en) * 2020-08-20 2020-11-24 浙江企蜂信息技术有限公司 Call information processing method and device, computer equipment and storage medium
CN112507068A (en) * 2020-11-30 2021-03-16 北京百度网讯科技有限公司 Document query method and device, electronic equipment and storage medium
WO2021068352A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Automatic construction method and apparatus for faq question-answer pair, and computer device and storage medium
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN112818167A (en) * 2021-01-28 2021-05-18 北京百度网讯科技有限公司 Entity retrieval method, entity retrieval device, electronic equipment and computer-readable storage medium
CN112925889A (en) * 2021-02-26 2021-06-08 北京声智科技有限公司 Natural language processing method, device, electronic equipment and storage medium
CN113077185A (en) * 2021-04-27 2021-07-06 平安普惠企业管理有限公司 Workload evaluation method and device, computer equipment and storage medium
CN113378539A (en) * 2021-06-29 2021-09-10 华南理工大学 Template recommendation method for standard document compiling
WO2021174829A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Crowdsourced task inspection method, apparatus, computer device, and storage medium
CN113515938A (en) * 2021-05-12 2021-10-19 平安国际智慧城市科技股份有限公司 Language model training method, device, equipment and computer readable storage medium
CN113821616A (en) * 2021-08-09 2021-12-21 北京交通大学 Domain-adaptive slot filling method, device, equipment and storage medium
CN114281944A (en) * 2021-12-27 2022-04-05 北京中科闻歌科技股份有限公司 Document matching model construction method and device, electronic equipment and storage medium
WO2022166621A1 (en) * 2021-02-02 2022-08-11 北京有竹居网络技术有限公司 Dialog attribution recognition method and apparatus, readable medium and electronic device
CN116431838A (en) * 2023-06-15 2023-07-14 北京墨丘科技有限公司 Document retrieval method, device, system and storage medium
CN112925889B (en) * 2021-02-26 2024-04-30 北京声智科技有限公司 Natural language processing method, device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090144255A1 (en) * 2007-11-29 2009-06-04 Palo Alto Research Center Incorporated Augmenting privacy policies with inference detection
WO2009090498A2 (en) * 2007-10-30 2009-07-23 Transformer Software, Ltd. Key semantic relations for text processing
CN102023989A (en) * 2009-09-23 2011-04-20 阿里巴巴集团控股有限公司 Information retrieval method and system thereof
CN102163229A (en) * 2011-04-13 2011-08-24 北京百度网讯科技有限公司 Method and equipment for generating abstracts of searching results
CN103902652A (en) * 2014-02-27 2014-07-02 深圳市智搜信息技术有限公司 Automatic question-answering system
CN104753765A (en) * 2013-12-31 2015-07-01 华为技术有限公司 Automatic short message reply method and device
US20160125291A1 (en) * 2014-11-05 2016-05-05 International Business Machines Corporation Answer interactions in a question-answering environment
CN108763529A (en) * 2018-05-31 2018-11-06 苏州大学 A kind of intelligent search method, device and computer readable storage medium
CN109241243A (en) * 2018-08-30 2019-01-18 清华大学 Candidate documents sort method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YANG JIE: "Keyword Extraction Multi-Document Based on Joint Weight", 《JOURNAL OF CHINESE INFORMATION PROCESSING》 *
张德阳: "基于主题的关键词提取对微博情感倾向的研究", 《燕山大学学报》 *

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598078A (en) * 2019-09-11 2019-12-20 京东数字科技控股有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
CN110598078B (en) * 2019-09-11 2022-09-30 京东科技控股股份有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
WO2021068352A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Automatic construction method and apparatus for faq question-answer pair, and computer device and storage medium
CN110717029A (en) * 2019-10-15 2020-01-21 支付宝(杭州)信息技术有限公司 Information processing method and system
CN110795549A (en) * 2019-10-31 2020-02-14 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN110795549B (en) * 2019-10-31 2023-03-17 腾讯科技(深圳)有限公司 Short text conversation method, device, equipment and storage medium
CN111177367A (en) * 2019-11-11 2020-05-19 腾讯科技(深圳)有限公司 Case classification method, classification model training method and related products
CN110955767A (en) * 2019-12-04 2020-04-03 中国太平洋保险(集团)股份有限公司 Algorithm and device for generating intention candidate set list set in robot dialogue system
CN111177339A (en) * 2019-12-06 2020-05-19 百度在线网络技术(北京)有限公司 Dialog generation method and device, electronic equipment and storage medium
CN111104476A (en) * 2019-12-19 2020-05-05 用友网络科技股份有限公司 Archive data generation method, archive data generation device, and readable storage medium
CN111061839B (en) * 2019-12-19 2024-01-23 过群 Keyword joint generation method and system based on semantics and knowledge graph
CN111104476B (en) * 2019-12-19 2023-06-20 用友网络科技股份有限公司 Archive data generation method, archive data generation device, and readable storage medium
CN111061839A (en) * 2019-12-19 2020-04-24 过群 Combined keyword generation method and system based on semantics and knowledge graph
CN111198940A (en) * 2019-12-27 2020-05-26 北京百度网讯科技有限公司 FAQ method, question-answer search system, electronic device, and storage medium
CN111198940B (en) * 2019-12-27 2023-01-31 北京百度网讯科技有限公司 FAQ method, question-answer search system, electronic device, and storage medium
CN111339278A (en) * 2020-02-28 2020-06-26 支付宝(杭州)信息技术有限公司 Method and device for generating training speech generating model and method and device for generating answer speech
WO2021174829A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Crowdsourced task inspection method, apparatus, computer device, and storage medium
CN111460092A (en) * 2020-03-11 2020-07-28 中国电子科技集团公司第二十八研究所 Multi-document-based automatic complex problem solving method
CN111460092B (en) * 2020-03-11 2022-11-29 中国电子科技集团公司第二十八研究所 Multi-document-based automatic complex problem solving method
CN111382256A (en) * 2020-03-20 2020-07-07 北京百度网讯科技有限公司 Information recommendation method and device
CN111382256B (en) * 2020-03-20 2024-04-09 北京百度网讯科技有限公司 Information recommendation method and device
CN111522938B (en) * 2020-04-27 2023-03-24 广东电网有限责任公司培训与评价中心 Method, device and equipment for screening talent performance documents
CN111522938A (en) * 2020-04-27 2020-08-11 广东电网有限责任公司培训与评价中心 Method, device and equipment for screening talent performance documents
CN111523304A (en) * 2020-04-27 2020-08-11 华东师范大学 Automatic generation method of product description text based on pre-training model
CN111523304B (en) * 2020-04-27 2022-08-02 华东师范大学 Automatic generation method of product description text based on pre-training model
CN111966869A (en) * 2020-07-07 2020-11-20 北京三快在线科技有限公司 Phrase extraction method and device, electronic equipment and storage medium
CN111767386B (en) * 2020-07-31 2023-11-17 腾讯科技(深圳)有限公司 Dialogue processing method, device, electronic equipment and computer readable storage medium
CN111767386A (en) * 2020-07-31 2020-10-13 腾讯科技(深圳)有限公司 Conversation processing method and device, electronic equipment and computer readable storage medium
CN111988479B (en) * 2020-08-20 2021-04-20 浙江企蜂信息技术有限公司 Call information processing method and device, computer equipment and storage medium
CN111988479A (en) * 2020-08-20 2020-11-24 浙江企蜂信息技术有限公司 Call information processing method and device, computer equipment and storage medium
CN112507068B (en) * 2020-11-30 2023-11-14 北京百度网讯科技有限公司 Document query method, device, electronic equipment and storage medium
CN112507068A (en) * 2020-11-30 2021-03-16 北京百度网讯科技有限公司 Document query method and device, electronic equipment and storage medium
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN112818167A (en) * 2021-01-28 2021-05-18 北京百度网讯科技有限公司 Entity retrieval method, entity retrieval device, electronic equipment and computer-readable storage medium
CN112818167B (en) * 2021-01-28 2024-03-22 北京百度网讯科技有限公司 Entity retrieval method, entity retrieval device, electronic equipment and computer readable storage medium
WO2022166621A1 (en) * 2021-02-02 2022-08-11 北京有竹居网络技术有限公司 Dialog attribution recognition method and apparatus, readable medium and electronic device
CN112925889B (en) * 2021-02-26 2024-04-30 北京声智科技有限公司 Natural language processing method, device, electronic equipment and storage medium
CN112925889A (en) * 2021-02-26 2021-06-08 北京声智科技有限公司 Natural language processing method, device, electronic equipment and storage medium
CN113077185B (en) * 2021-04-27 2022-10-25 平安普惠企业管理有限公司 Workload evaluation method, workload evaluation device, computer equipment and storage medium
CN113077185A (en) * 2021-04-27 2021-07-06 平安普惠企业管理有限公司 Workload evaluation method and device, computer equipment and storage medium
CN113515938A (en) * 2021-05-12 2021-10-19 平安国际智慧城市科技股份有限公司 Language model training method, device, equipment and computer readable storage medium
CN113515938B (en) * 2021-05-12 2023-10-20 平安国际智慧城市科技股份有限公司 Language model training method, device, equipment and computer readable storage medium
CN113378539A (en) * 2021-06-29 2021-09-10 华南理工大学 Template recommendation method for standard document compiling
CN113821616B (en) * 2021-08-09 2023-11-14 北京交通大学 Domain-adaptive slot filling method, device, equipment and storage medium
CN113821616A (en) * 2021-08-09 2021-12-21 北京交通大学 Domain-adaptive slot filling method, device, equipment and storage medium
CN114281944A (en) * 2021-12-27 2022-04-05 北京中科闻歌科技股份有限公司 Document matching model construction method and device, electronic equipment and storage medium
CN116431838B (en) * 2023-06-15 2024-01-30 北京墨丘科技有限公司 Document retrieval method, device, system and storage medium
CN116431838A (en) * 2023-06-15 2023-07-14 北京墨丘科技有限公司 Document retrieval method, device, system and storage medium

Also Published As

Publication number Publication date
CN110196901B (en) 2022-02-11

Similar Documents

Publication Publication Date Title
CN110196901A (en) Construction method, device, computer equipment and the storage medium of conversational system
CN106997382B (en) Innovative creative tag automatic labeling method and system based on big data
US20180341871A1 (en) Utilizing deep learning with an information retrieval mechanism to provide question answering in restricted domains
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
Robertson et al. The TREC 2002 Filtering Track Report.
Naseri et al. Ceqe: Contextualized embeddings for query expansion
CN110442777B (en) BERT-based pseudo-correlation feedback model information retrieval method and system
CN112667794A (en) Intelligent question-answer matching method and system based on twin network BERT model
CN106776532B (en) Knowledge question-answering method and device
JP7111154B2 (en) Answer selection device, answer selection method, answer selection program
CN106156365A (en) Knowledge graph generation method and device
KR20160026892A (en) Non-factoid question-and-answer system and method
CN110895559A (en) Model training method, text processing method, device and equipment
CN112487140A (en) Question-answer dialogue evaluating method, device, equipment and storage medium
CN112115716A (en) Service discovery method, system and equipment based on multi-dimensional word vector context matching
CN112256845A (en) Intention recognition method, device, electronic equipment and computer readable storage medium
CN109829045A (en) Question answering method and device
CN111026840A (en) Text processing method, device, server and storage medium
CN113590778A (en) Intelligent customer service intention understanding method, device, equipment and storage medium
Celikyilmaz et al. A graph-based semi-supervised learning for question-answering
CN101853298A (en) Event-oriented query expansion method
Sergienko et al. A comparative study of text preprocessing approaches for topic detection of user utterances
KR20210038260A (en) Korean Customer Service Associate Assist System based on Machine Learning
CN107818078B (en) Semantic association and matching method for Chinese natural language dialogue
CN115269961A (en) Content search method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant