WO2021159670A1 - Method and apparatus for processing unknown question in intelligent questions and answers, computer device, and medium - Google Patents

Method and apparatus for processing unknown question in intelligent questions and answers, computer device, and medium

Info

Publication number
WO2021159670A1
WO2021159670A1 PCT/CN2020/105089 CN2020105089W WO2021159670A1 WO 2021159670 A1 WO2021159670 A1 WO 2021159670A1 CN 2020105089 W CN2020105089 W CN 2020105089W WO 2021159670 A1 WO2021159670 A1 WO 2021159670A1
Authority
WO
WIPO (PCT)
Prior art keywords
question
current
unknown
convergence
word segmentation
Prior art date
Application number
PCT/CN2020/105089
Other languages
French (fr)
Chinese (zh)
Inventor
范广
Original Assignee
深圳壹账通智能科技有限公司
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021159670A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, computer equipment and medium for processing unknown questions in intelligent question answering.
  • FAQ stands for Frequently Asked Questions.
  • The most common FAQ system on the market is a question-and-answer FAQ system.
  • The user can ask a question directly in a chat dialog box.
  • The system provides an answer to a similar question based on keywords or the latest similarity algorithm.
  • Question-and-answer man-machine dialogue can provide users with a better experience. However, the inventor realizes that a single round of similarity judgment does not necessarily match the question accurately (or, because the matching accuracy required by the system settings is high, no satisfactory answer is found). User questions without matched answers therefore accumulate in the background. If many users use the FAQ system, the set of unmatched questions becomes very large, which puts great pressure on the operations staff.
  • a method, device, computer equipment, and medium for processing unknown questions in intelligent question answering are provided.
  • a method for handling unknown questions in intelligent question answering including:
  • the word segments are input into the pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each word segment;
  • a device for handling unknown questions in intelligent question answering including:
  • the first receiving module is configured to receive the current question input by the user sent by the user terminal, and query the stored current service label corresponding to the previous question;
  • the word segmentation module is used to obtain the thesaurus corresponding to the current service label, perform word segmentation on the current question to obtain several word segments, match the word segments with the keywords in the thesaurus, and obtain the number of successfully matched word segments;
  • the slot word extraction module is used to input the word segments into a pre-trained slot word extraction model when the number of successfully matched word segments is less than a preset value, so as to obtain the word type corresponding to each word segment through the slot word extraction model;
  • the first relevance calculation module is used to obtain the standard text corresponding to the word type, match the standard text with the keywords in the thesaurus, and determine the degree of relevance of the current question to the previous question according to the number of successfully matched standard texts;
  • the correlation output module is used to determine whether the current question is an unknown question according to preset rules when the degree of relevance is lower than a preset value and, when the current question is an unknown question, to output the unknown question in association with the current service label;
  • a clustering module for clustering the output unknown questions;
  • the sending module is used to send the clustered unknown problem to the operation terminal corresponding to the service label.
  • a computer device including a memory and one or more processors, the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the one or more processors execute the following steps: receiving the current question entered by the user from the user terminal, and querying the stored current service label corresponding to the previous question;
  • the word segments are input into the pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each word segment;
  • One or more computer-readable storage media storing computer-readable instructions.
  • the one or more processors perform the following steps:
  • the word segments are input into the pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each word segment;
  • with the above method, device, computer equipment, and medium for handling unknown questions in intelligent question answering, after the current question entered by the user is received from the user terminal, the degree of relevance between the current question and the previous question is first calculated according to the current business label of the previous question. If the degree of relevance is lower than the threshold, the question is converged through the convergence question library. If the question is still an unknown question after convergence, the unknown question is output in association with the current service label. In other words, current questions without matched answers, that is, unknown questions, are classified and sorted under their respective business label systems, turning a large number of individual questions into a small number of question categories. The operation terminal only needs to re-categorize the classification results or delete invalid questions, thereby improving processing efficiency.
  • Fig. 1 is an application scenario diagram of a method for processing unknown questions in intelligent question answering according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a method for processing unknown questions in intelligent question answering according to one or more embodiments.
  • Fig. 3 is a flowchart of the steps of calculating the degree of association according to one or more embodiments.
  • Fig. 4 is a structural block diagram of an unknown question processing device in intelligent question answering according to one or more embodiments.
  • Figure 5 is a block diagram of a computer device according to one or more embodiments.
  • both the user terminal 102 and the operation terminal 106 can communicate with the server 104 through the network.
  • the user terminal 102 can receive the current question entered by the user and send it to the server 104. The server 104 queries the stored current service label corresponding to the previous question, obtains the thesaurus corresponding to the current service label, performs word segmentation on the current question to obtain several word segments, matches the word segments with the keywords in the thesaurus, and obtains the number of successfully matched word segments.
  • the server 104 determines whether the current question is an unknown question according to the preset rule. If the server determines that the current question is an unknown question, it outputs the unknown question in association with the current service tag.
  • the server 104 clusters the unknown questions according to the service tag, and sends the clustered unknown questions to the operation terminal. In this way, unknown questions are classified and sorted under their respective business label systems, and a large number of individual questions are turned into a small number of question categories. The operations staff only need to re-categorize the classification results or delete invalid questions to ensure that the classification is correct.
  • the user terminal 102 and the operation terminal 106 can be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for processing unknown questions in intelligent question answering is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • S202 Receive the current question input by the user sent by the user terminal, and query the stored current service label corresponding to the previous question.
  • the user terminal first receives the question and, after the session is created, sends the question to the server.
  • the server determines whether the user terminal is creating a session. If so, the server creates the session accordingly. Then, each time a frequently asked question (FAQ) is asked in the session, the FAQ is given a business label that marks which business the FAQ concerns.
  • the server saves, in real time, the current service label corresponding to the most recent FAQ.
  • the server can obtain the most recently saved current service label, that is, the service label of the last FAQ.
  • S204 Obtain the thesaurus corresponding to the current business label, perform word segmentation processing on the current question to obtain a number of word segmentation, match the word segmentation with the keywords in the thesaurus, and obtain the number of successfully matched word segmentation.
  • When the server receives the current question, it determines the business field corresponding to the question according to the current business label, and then calculates the degree of relevance between the current question and the questions already asked in that business field. That is, the server only needs to compare the current question with the questions already asked in that business field, which narrows the scope of matching before matching is performed and thus improves matching efficiency. There are two ways to calculate the degree of relevance.
  • The first is the thesaurus matching method: each business tag corresponds to a thesaurus, and the server calculates the degree of relevance between the current question and the words in the thesaurus. The second is the trained-model method: when thesaurus matching is unsuccessful, that is, when the degree of relevance is 0 or lower than the threshold, the server inputs the current question into the trained model to obtain the slot words, and then matches the slot words against the thesaurus corresponding to the current business label to obtain the degree of relevance.
  • the keyword lexicon is a lexicon created and managed by the operators according to their own business characteristics. Therefore, the lexicon corresponds to a business tag, and a business tag corresponds to a lexicon, and the lexicon is pre-configured in the server background. The server can select the corresponding vocabulary according to the current business tag.
  • the word segmentation process can use a traditional word segmentation algorithm, such as the NLP algorithm.
  • the number of successfully matched tokens may refer to the number of successfully matched tokens under fuzzy matching.
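The thesaurus matching of step S204 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the thesaurus contents, the whitespace split standing in for a real word segmenter, and the use of `difflib.SequenceMatcher` with a 0.8 cutoff as the "fuzzy matching" rule are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical thesaurus: one keyword list per service label.
THESAURUS = {"insurance": ["premium", "policy", "claim", "deductible"]}


def count_matched_segments(segments, keywords, min_ratio=0.8):
    """Count the segments that fuzzily match at least one thesaurus keyword."""
    matched = 0
    for seg in segments:
        # A segment counts once, however many keywords it resembles.
        if any(SequenceMatcher(None, seg, kw).ratio() >= min_ratio
               for kw in keywords):
            matched += 1
    return matched


# Whitespace splitting is only a stand-in for real word segmentation.
segments = "how do I file a claim on my policy".split()
matched = count_matched_segments(segments, THESAURUS["insurance"])
```

The returned count is what the method compares against the preset value to decide whether the slot word extraction model is needed.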
  • the slot extraction scheme requires pre-training of the model, that is, the association between words and word types obtained through training.
  • the word types can include insurance types, insurance companies, and intended products.
  • the server segments the current question, and then inputs the word segments into the model to obtain the word type corresponding to each word segment.
  • S208 Obtain the standard text corresponding to the word type, match the standard text with the keywords in the lexicon, and determine the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts.
  • the standard text corresponding to the word type is obtained, the standard text is matched with the keywords in the thesaurus, and the degree of relevance is obtained according to the number of successfully matched keywords. For example, the degree of relevance can be obtained as the ratio of the number of successful matches to the total number of word segments.
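The slot-word fallback of step S208 might look like the sketch below. The pre-trained slot word extraction model is replaced here by a plain dictionary lookup, and the word types, standard texts, and keywords are invented for illustration; only the overall flow (segment to word type, word type to standard text, standard text matched against the thesaurus, ratio of matches to total segments) follows the description.

```python
def slot_relevance(segments, predict_type, standard_text, keywords):
    """Relevance = matched standard texts / total number of segments."""
    if not segments:
        return 0.0
    # Step 1: the model assigns a word type to each segment (None if no slot).
    word_types = [predict_type(seg) for seg in segments]
    # Step 2: each recognised word type is replaced by its standard text.
    texts = [standard_text[t] for t in word_types if t in standard_text]
    # Step 3: standard texts are matched against the thesaurus keywords.
    hits = sum(1 for text in texts if text in keywords)
    return hits / len(segments)


# Toy stand-in for the pre-trained model (hypothetical data).
TOY_MODEL = {"acme": "insurance_company", "car": "insurance_type"}
STANDARD_TEXT = {"insurance_company": "insurer",
                 "insurance_type": "auto insurance"}
KEYWORDS = {"insurer", "policy"}

relevance = slot_relevance(["acme", "car", "price"], TOY_MODEL.get,
                           STANDARD_TEXT, KEYWORDS)
```

In this toy run only "insurer" is a thesaurus keyword, so one of three segments matches.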
  • the server may continue to ask the user according to the preset rule to converge the current question, thereby judging whether the current question is an unknown question.
  • the current problem can be converged through the convergence problem library.
  • the unknown question is output in association with the current service label. Specifically, if the server still cannot confirm the business area of the current question after converging it according to the convergence question library, that is, the question is not stored in the corresponding business area, the server can mark the question as an unknown question and output it in association with the business tag, so that the unknown questions under a specific business tag can be obtained. In this way, the unknown questions are associated with the current business tags and clustered, so that the operating terminal sees the questions one category at a time rather than as a disordered mass.
  • the clustering method here can adopt a traditional clustering method, for example, using similarity matching, that is, calculating the similarity of all questions, and taking the questions with high similarity as one category.
  • the server clusters the unknown problems according to the current business tags, and divides the unknown problems with the same current business tags into one category.
  • the server may count the unknown questions after clustering and send them to the operation terminal only after a preset number is reached, so that the operations staff can process them in one pass.
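Grouping unknown questions by service label and holding each group back until it reaches a preset size can be sketched as below. The function names and the sample data are hypothetical, and the similarity-based clustering mentioned above is not shown; this covers only the per-label grouping and the batching threshold.

```python
from collections import defaultdict


def cluster_by_label(unknown_items):
    """Group (question, service_label) pairs into one list per label."""
    clusters = defaultdict(list)
    for question, label in unknown_items:
        clusters[label].append(question)
    return dict(clusters)


def ready_batches(clusters, min_count):
    """Keep only the clusters large enough to be sent to the operation terminal."""
    return {label: questions for label, questions in clusters.items()
            if len(questions) >= min_count}


items = [
    ("can I insure a drone", "insurance"),
    ("is hail damage covered", "insurance"),
    ("card stuck in ATM abroad", "credit_card"),
]
clusters = cluster_by_label(items)
to_send = ready_batches(clusters, min_count=2)
```

With a threshold of 2, only the "insurance" group is sent; the single credit card question waits for more questions under the same label.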
  • in the above embodiment, after the current question entered by the user is received from the user terminal, the degree of relevance between the current question and the previous question is first calculated according to the current service label of the previous question. If the degree of relevance is lower than the threshold, the unknown question is output in association with the current business label. In other words, current questions without matched answers, that is, unknown questions, are classified and sorted under their respective business label systems, and a large number of individual questions are turned into a small number of question categories. The operating terminal only needs to re-categorize the classification results or delete invalid questions, thereby improving processing efficiency.
  • the above method further includes: when the number of successfully matched word segmentation is greater than or equal to a preset value, determining the degree of relevance between the current question and the question that has been asked according to the data of the successfully matched word segmentation.
  • the word segmentation is matched with the keywords in the thesaurus, and the degree of relevance between the current question and the question that has been asked is determined according to the data of the successfully matched word segmentation.
  • the server matches the obtained word segmentation with the keywords in the thesaurus, and obtains the relevance degree according to the number of successfully matched keywords.
  • the degree of relevance can be obtained as the ratio of the number of successfully matched word segments to the total number of word segments.
  • the degree of relevance is calculated in a variety of ways, which can improve the accuracy of calculating the degree of relevance.
  • determining whether the current problem is an unknown problem according to preset rules includes:
  • S302 Extract the current convergence question from the convergence question database corresponding to the current service label, and send the current convergence question to the user terminal for display.
  • the convergence question library corresponds to the service label, and the stored convergence question sentence is used to converge the question asked by the user to a certain type of question in a certain service area.
  • the server can first extract the current keywords from the current question, determine the meaning of the current keywords, then extract the convergence question from the convergence question database according to that meaning, and send the convergence question to the user terminal for display, so that the user sees the convergence question promptly and answers it.
  • the convergence question can be "Hello, confirm, you are still consulting [%s] (variables, fill in keywords or intentions)".
  • the first is: if the keywords or intentions recorded in the previous round and the keywords or intentions extracted from the next round of user questions belong to two businesses, the keywords or intentions recorded in the previous round are directly discarded without convergence. For example, the last round is still about insurance, and the next round will directly ask the question about the credit card. Then, just use the next round of questions and go to the FAQ database for similarity analysis.
  • the third is: if there is a discrepancy between the intention of the next round of question and the intention of the previous round (the situation in this embodiment), the user is guided to clarify the question by bringing the vocabulary or slot into the question sentence. For example, "Hello, confirm, you are still consulting [%s] (variables, fill in keywords or intentions)".
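The two cases quoted above can be illustrated with a short sketch. The mapping from intention to business, the English wording of the template, and the choice to fill the placeholder with the previous round's intention are assumptions; the text only says the keyword or intention is brought into the question sentence.

```python
# Hypothetical English rendering of the convergence template.
CONVERGENCE_TEMPLATE = "Hello, please confirm: are you still consulting about [%s]?"


def next_prompt(prev_intent, cur_intent, business_of):
    """Return a convergence question, or None when no convergence is needed."""
    # First case: the two rounds belong to different businesses, so the
    # previous round is discarded and the new question goes to FAQ matching.
    if business_of.get(prev_intent) != business_of.get(cur_intent):
        return None
    # Third case: same business but a different intention, so ask the user.
    if prev_intent != cur_intent:
        return CONVERGENCE_TEMPLATE % prev_intent
    return None


BUSINESS_OF = {
    "car insurance": "insurance",
    "life insurance": "insurance",
    "credit card": "banking",
}
prompt = next_prompt("car insurance", "life insurance", BUSINESS_OF)
```

A `None` result means the current question is handled without a convergence round.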
  • S304 Receive a confirmation reply corresponding to the current convergence question returned by the user terminal, and match the business question corresponding to the service label according to the confirmation reply.
  • the server can match the business questions of the service label according to the keywords in the confirmation reply or the keywords in the convergent question sentence.
  • if no business question corresponding to the service label is matched, the server can mark the question as an unknown question and output it in association with the business label, so that the unknown questions under a specific business label can be obtained.
  • the unknown questions are associated with the current business tags and clustered, so that the operating terminal sees the questions one category at a time rather than as a disordered mass.
  • the current problem is converged through the convergence problem library to determine whether the current problem is an unknown problem, and the processing efficiency is improved.
  • extracting the current convergence question sentence from the convergence question database corresponding to the current business label includes: segmenting the current question to obtain representative words and the appearance order of the representative words; from the convergence question database corresponding to the current business label Select the initial question sentence corresponding to the representative word; select the initial question sentence whose vocabulary appearance order is consistent with the appearance order of the representative word as the current convergent question sentence.
  • the server can perform word segmentation on the current question, using the same word segmentation method as above, and obtain the order in which the representative words appear, so that the server can select the initial question sentences corresponding to the representative words.
  • for example, if the representative words appear first as the company, then the product type, and finally the intended product, the server can match in this order to obtain the initial question sentence; matching in order improves matching efficiency.
  • regarding the slot extraction algorithm: if no keyword in the thesaurus is hit by the user's question, the slot extraction algorithm is used to identify the user's intention, and after the slots are extracted, a similarity judgment is made against the user's question.
  • the convergent question sentence is determined according to the appearance order of the keywords, which can improve the efficiency of determining the convergent question sentence.
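Order-based selection of the convergent question sentence could be sketched as follows. The library layout (each entry listing the word types it mentions, in order) and the sample sentences are hypothetical; the two filtering steps mirror the description: first keep the sentences that correspond to the representative words, then keep the one whose vocabulary order agrees with the order in the user's question.

```python
def select_convergence_question(rep_types, library):
    """Pick the sentence whose word-type order matches the representative words."""
    rep_types = list(rep_types)
    # Step 1: initial sentences that mention every representative word type.
    candidates = [q for q in library if set(rep_types) <= set(q["order"])]
    # Step 2: keep the first candidate whose relative order agrees.
    for q in candidates:
        in_question = [t for t in q["order"] if t in rep_types]
        if in_question == rep_types:
            return q["text"]
    return None


LIBRARY = [
    {"text": "Are you asking about [product type] offered by [company]?",
     "order": ["product_type", "company"]},
    {"text": "Are you asking about [company]'s [product type], such as [product]?",
     "order": ["company", "product_type", "intended_product"]},
]
chosen = select_convergence_question(["company", "product_type"], LIBRARY)
```

Here the first sentence mentions both types but in the opposite order, so the second sentence is selected.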
  • the method further includes: receiving the denial reply corresponding to the current convergent question returned by the user terminal, and judging whether secondary representative words can be extracted from the denial reply.
  • if so, the secondary representative words are used as the current keywords, a convergence question is obtained from the convergence question library according to the current keywords, and the convergence question is sent to the user terminal for display. This repeats until the number of convergence questions reaches the preset value or the user stops the question and answer, at which point the current question is treated as an unknown question and output in association with the current business label.
  • the unknown questions are classified according to the replies of the users, so as to facilitate the subsequent processing of the terminal operation.
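The denial-handling loop can be sketched as a bounded dialogue. Everything concrete here is an assumption: the callbacks `ask`, `extract_keyword`, and `question_for` are hypothetical stand-ins for the terminal round trip, the secondary-word extraction, and the library lookup, and a reply beginning with "yes" is treated as confirmation purely for illustration.

```python
def run_convergence(keyword, question_for, fallback_questions, ask,
                    extract_keyword, max_questions=3):
    """Return "confirmed" if the user confirms, otherwise "unknown"."""
    fallbacks = iter(fallback_questions)
    question = question_for(keyword)
    for _ in range(max_questions):          # preset number of convergence questions
        if question is None:
            break                           # library exhausted
        reply = ask(question)
        if reply is None:
            break                           # user stopped or did not respond
        if reply.lower().startswith("yes"):
            return "confirmed"
        # Denial: try the secondary representative word, else the next
        # convergence question from the library.
        secondary = extract_keyword(reply)
        question = question_for(secondary) if secondary else next(fallbacks, None)
    return "unknown"


def make_ask(replies):
    """Build a scripted `ask` callback that records the questions it was sent."""
    asked, pending = [], iter(replies)

    def ask(question):
        asked.append(question)
        return next(pending, None)
    return ask, asked


ask, asked = make_ask(["no, I mean travel", "no", "no"])
result = run_convergence(
    "car",
    question_for=lambda k: "Are you asking about [%s]?" % k,
    fallback_questions=["Is this about claims?"],
    ask=ask,
    extract_keyword=lambda reply: "travel" if "travel" in reply else None,
)
```

An "unknown" result is the point at which the question would be output in association with the current business label.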
  • after the clustered unknown question is sent to the operating terminal corresponding to the service tag, the method includes: receiving the standard answer corresponding to the unknown question returned by the operating terminal, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association.
  • after receiving the current question entered by the user from the user terminal and querying the stored current service label corresponding to the previous question, the method also includes: matching the current question with the stored unknown questions according to the current service label; if the matching is successful, obtaining the standard answer corresponding to the unknown question for output; otherwise, continuing to calculate the degree of relevance between the current question and the questions already asked according to the current business label.
  • the operating terminal can also add the standard answers, unknown questions, and business tags corresponding to the unknown questions to the corresponding library, that is, import the classified questions into the existing question library in batches.
  • the processing efficiency can be greatly improved. That is, after the current question is received, it is first matched against the question library corresponding to the business label; if the matching is successful, the answer is output directly; otherwise the degree of relevance between the current question and the questions already asked is calculated according to the current business label.
  • a question library of the unknown questions is established. The question library is still classified by business tag, so that after the current question is received, the answers in the question library corresponding to the business tag can be queried directly, which improves efficiency.
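A per-label library of answered unknown questions might be kept as in the sketch below. The class name, the normalisation step, and exact matching after normalisation are assumptions chosen to keep the example short; the description does not specify how the current question is matched against stored unknown questions.

```python
class UnknownQuestionStore:
    """Stores (service label, unknown question, standard answer) in association."""

    def __init__(self):
        self._by_label = {}

    @staticmethod
    def _normalize(question):
        # Illustrative normalisation: case-fold and collapse whitespace.
        return " ".join(question.lower().split())

    def add(self, label, question, answer):
        """Record the standard answer returned by the operation terminal."""
        self._by_label.setdefault(label, {})[self._normalize(question)] = answer

    def lookup(self, label, question):
        """Return the stored answer under this label, or None if not found."""
        return self._by_label.get(label, {}).get(self._normalize(question))


store = UnknownQuestionStore()
store.add("insurance", "Is hail damage covered?", "Yes, under comprehensive cover.")

answer = store.lookup("insurance", "is  hail damage covered?")
```

When `lookup` returns `None`, processing falls through to the relevance calculation under the current business label.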
  • a device for processing unknown questions in intelligent question answering including: a first receiving module 100, a word segmentation module 200, a slot word extraction module 300, a first relevance calculation module 400, a first judgment module 500, a correlation output module 600, a clustering module 700, and a sending module 800, wherein:
  • the first receiving module 100 is configured to receive the current question input by the user sent by the user terminal, and query the stored current service label corresponding to the previous question;
  • the word segmentation module 200 is used to obtain the vocabulary corresponding to the current business tag, perform word segmentation processing on the current question to obtain a number of word segmentation, match the word segmentation with keywords in the thesaurus, and obtain the number of successfully matched word segments;
  • the slot word extraction module 300 is used to input the word segmentation into the pre-trained slot word extraction model when the number of successfully matched word segmentation is less than the preset value, so as to obtain each word segmentation through the slot word extraction model Corresponding word type;
  • the first relevance calculation module 400 is used to obtain the standard text corresponding to the word type, match the standard text with the keywords in the lexicon, and determine the degree of relevance of the current question to the previous question according to the number of successfully matched standard texts;
  • the first judgment module 500 is used for judging whether the current problem is an unknown problem according to the preset rule when the degree of association is lower than the preset value;
  • the correlation output module 600 is used to correlate and output the unknown question with the current business label when the current question is an unknown question
  • the clustering module 700 is used for clustering the output unknown problems
  • the sending module 800 is configured to send the clustered unknown problem to the operation terminal corresponding to the service label.
  • the device further includes:
  • the second degree of relevance calculation module is used to determine the degree of relevance between the current question and the question that has been asked according to the data of the successfully matched word segmentation when the number of successfully matched word segmentation is greater than or equal to the preset value.
  • the correlation output module 600 may include:
  • the display unit is configured to extract the current convergence question sentence from the convergence question library corresponding to the current service label, and send the current convergence question sentence to the user terminal for display.
  • the receiving unit is configured to receive the confirmation reply corresponding to the current convergence question returned by the user terminal, and match the business question corresponding to the service label according to the confirmation reply.
  • the output unit is used to mark the current question as an unknown question when the service question corresponding to the service label is not matched.
  • the above-mentioned display unit may include:
  • the sequence acquisition unit is used to segment the current question to obtain the representative words and the appearance order of the representative words.
  • the initial question selection unit is used to select the initial question corresponding to the representative word from the convergence question library corresponding to the current business label.
  • the convergent question selection unit is used to select the initial question sentence whose vocabulary appearance order is consistent with the appearance order of the representative word as the current convergent question sentence.
  • the device for handling unknown questions in intelligent question answering may further include:
  • the third receiving module is used to receive the denial reply corresponding to the current convergent question returned by the user terminal, and determine whether the secondary representative words can be extracted from the denial reply.
  • the extraction module is used to extract the next convergence question from the convergence question database as the current convergence question when the secondary representative word cannot be extracted from the denial reply, and continue to send the current convergence question to the user terminal for display.
  • the loop module is used to determine that the current question is an unknown question once the number of convergent questions sent to the user terminal reaches a preset value, or once no response corresponding to the current convergent question is received from the user terminal within a preset time period, and to output the unknown question in association with the current business label.
  • the device for handling unknown questions in intelligent question answering may further include:
  • the fourth receiving module is used to receive the standard answer corresponding to the unknown question returned by the operation terminal, and store the standard answer, the unknown question, and the service label corresponding to the unknown question in association.
  • the matching module is used to match the current question with the stored unknown question according to the current business tag.
  • the output module is used to obtain the standard response corresponding to the unknown question for output if the matching is successful; otherwise, continue to calculate the correlation degree between the current question and the question that has been asked according to the current service label.
  • Each module in the device for handling unknown questions in the above intelligent question answering can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 5.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile or volatile storage medium and internal memory.
  • the non-volatile or volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer device is used to store data such as the convergence question library and business labels.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection. When the computer readable instruction is executed by the processor, a method for processing unknown questions in intelligent question answering is realized.
  • FIG. 5 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
  • a computer device including a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • the one or more processors perform the following steps: receiving the current question entered by the user, and querying the stored current business label corresponding to the previous question; obtaining the lexicon corresponding to the current business label, performing word segmentation on the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words; when the number of successfully matched segmented words is less than the preset value, inputting the segmented words into the pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word; obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts; and when the degree of relevance is lower than the preset value, judging whether the current question is an unknown question.
  • the processor further implements the following step when executing the computer-readable instructions: when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked question according to the successfully matched segmented words.
  • the process of judging whether the current question is an unknown question according to preset rules includes: extracting the current convergence question from the convergence question library corresponding to the current business label, and sending the current convergence question to the user terminal for display; receiving the confirmation reply corresponding to the current convergence question returned by the user terminal, and matching the business question corresponding to the business label according to the confirmation reply; and when no business question corresponding to the business label is matched, marking the current question as an unknown question.
  • extracting the current convergence question from the convergence question library corresponding to the current business label includes: segmenting the current question to obtain the representative words and the order in which they appear; selecting the initial question sentences corresponding to the representative words from the convergence question library corresponding to the current business label; and selecting, as the current convergence question, the initial question sentence whose word order is consistent with the order of the representative words.
  • the method further includes: receiving a denial reply corresponding to the current convergence question returned by the user terminal, and determining whether a secondary representative word can be extracted from the denial reply; when no secondary representative word can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and when the number of convergence questions sent to the user terminal reaches a preset value, or no reply corresponding to the current convergence question is received from the user terminal within a preset time period, determining that the current question is an unknown question, and outputting the unknown question in association with the current business label.
  • the method includes: receiving a standard answer corresponding to the unknown question returned by the operation terminal, and storing the standard answer, the unknown question, and the business label corresponding to the unknown question in association; after receiving the current question sent by the user terminal and querying the stored current business label corresponding to the previous question, the processor executing the computer-readable instructions further performs: matching the current question against the stored unknown questions according to the current business label; and if the matching succeeds, obtaining and outputting the standard answer corresponding to the unknown question; otherwise, continuing to calculate the degree of relevance between the current question and the previously asked question according to the current business label.
  • One or more computer-readable storage media storing computer-readable instructions.
  • the one or more processors perform the following steps: receiving the current question that is input by the user and sent by a user terminal, and querying the stored current business label corresponding to the previous question; obtaining the lexicon corresponding to the current business label, performing word segmentation on the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words; when the number of successfully matched segmented words is less than the preset value, inputting the segmented words into the pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word; obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts; when the degree of relevance is lower than the preset value, judging whether the current question is an unknown question according to the preset rules; and when the current question is an unknown question, outputting the unknown question in association with the current business label.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the following step is also implemented: when the number of successfully matched segmented words is greater than or equal to a preset value, determining the degree of relevance between the current question and the previously asked question according to the successfully matched segmented words.
  • judging whether the current question is an unknown question according to preset rules includes: extracting the current convergence question from the convergence question library corresponding to the current business label, and sending the current convergence question to the user terminal for display; receiving the confirmation reply corresponding to the current convergence question returned by the user terminal, and matching the business question corresponding to the business label according to the confirmation reply; and when no business question corresponding to the business label is matched, marking the current question as an unknown question.
  • the extraction of the current convergence question from the convergence question library corresponding to the current business label, performed when the computer-readable instructions are executed by the processor, includes: segmenting the current question to obtain the representative words and the order in which they appear; selecting the initial question sentences corresponding to the representative words from the convergence question library corresponding to the current business label; and selecting, as the current convergence question, the initial question sentence whose word order is consistent with the order of the representative words.
  • the method further includes: receiving a denial reply corresponding to the current convergence question returned by the user terminal, and determining whether a secondary representative word can be extracted from the denial reply; when no secondary representative word can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and when the number of convergence questions sent to the user terminal reaches the preset value, or no reply corresponding to the current convergence question is received from the user terminal within the preset time period, determining that the current question is an unknown question, and outputting the unknown question in association with the current business label.
  • when the computer-readable instructions are executed by the processor, after the clustered unknown questions are sent to the operation terminal corresponding to the business label, the steps further include: receiving the standard answer corresponding to the unknown question returned by the operation terminal, and storing the standard answer, the unknown question, and the business label corresponding to the unknown question in association; and after receiving the current question that is input by the user and sent by the user terminal and querying the stored current business label corresponding to the previous question, the steps further include: matching the current question against the stored unknown questions according to the current business label; and if the matching succeeds, obtaining and outputting the standard answer corresponding to the unknown question; otherwise, continuing to calculate the degree of relevance between the current question and the previously asked question according to the current business label.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided are a method and apparatus for processing an unknown question in intelligent questions and answers, a computer device, and a medium, which fall within the field of big data processing. The method comprises: receiving the current question input by a user and sent by a user terminal, and querying the current service tag corresponding to a stored previous question (S202); firstly, performing matching by means of a word library corresponding to the service tag; if the matching fails, continuing to perform matching by means of a slot word extraction model so as to determine whether the current question is an unknown question; when the current question is an unknown question, outputting the unknown question and the current service tag in an associated manner (S212); performing clustering processing on the output unknown question (S214); and sending the clustered unknown question to an operation terminal corresponding to the service tag (S216).

Description

Method, apparatus, computer device, and medium for processing unknown questions in intelligent question answering
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 2020100872142, filed with the Chinese Patent Office on February 11, 2020 and entitled "Method, apparatus, computer device, and medium for processing unknown questions in intelligent question answering", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a method, apparatus, computer device, and medium for processing unknown questions in intelligent question answering.
Background
A Frequently Asked Questions (FAQ) system is standard in most products. It helps users find answers on their own, which lowers the cost of human customer service. The FAQ systems commonly found on the market are question-and-answer systems: the user asks a question directly in a chat dialog box, and the system returns the answer to a similar question based on keywords or a more recent similarity algorithm.
Question-and-answer human-machine dialogue gives users a better experience. However, the inventor realized that a single round of similarity judgment does not always match questions precisely (or, when the system's precision is set high, no satisfactory answer is found). User questions that fail to match an answer therefore accumulate in the back end. If the FAQ system has many users, the set of unmatched questions becomes very large and puts heavy pressure on operations work.
Summary
According to various embodiments disclosed in this application, a method, apparatus, computer device, and medium for processing unknown questions in intelligent question answering are provided.
A method for processing unknown questions in intelligent question answering includes:
receiving a current question that is input by a user and sent by a user terminal, and querying the stored current business label corresponding to the previous question;
obtaining the lexicon corresponding to the current business label, performing word segmentation on the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words;
when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word;
obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
when the degree of relevance is lower than a preset value, judging according to preset rules whether the current question is an unknown question; when the current question is an unknown question, outputting the unknown question in association with the current business label;
clustering the output unknown questions; and
sending the clustered unknown questions to the operation terminal corresponding to the business label.
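The last two steps of the method (clustering the output unknown questions and routing them by business label) can be illustrated with a minimal sketch. The source does not name a clustering algorithm here, so the greedy Jaccard-overlap grouping, the whitespace tokenizer, the function names, and the 0.5 threshold below are all assumptions chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_unknown(
    records: List[Tuple[str, str]], threshold: float = 0.5
) -> Dict[str, List[List[str]]]:
    """Group (business_label, question) pairs into clusters per label.

    Hypothetical helper: a question joins the first cluster under its label
    whose representative token set overlaps enough, else starts a new cluster.
    """
    clusters = defaultdict(list)  # label -> list of (representative tokens, members)
    for label, question in records:
        tokens = set(question.lower().split())  # stand-in for real word segmentation
        for representative, members in clusters[label]:
            if jaccard(tokens, representative) >= threshold:
                members.append(question)
                break
        else:
            clusters[label].append((tokens, [question]))
    return {
        label: [members for _, members in groups]
        for label, groups in clusters.items()
    }


records = [
    ("loan", "how to repay loan early"),
    ("loan", "how to repay my loan early"),
    ("card", "lost my card"),
    ("loan", "interest rate today"),
]
clustered = cluster_unknown(records)
```

Each key of `clustered` corresponds to one business label, so each value could be sent to the operation terminal responsible for that label.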
An apparatus for processing unknown questions in intelligent question answering includes:
a first receiving module, configured to receive a current question that is input by a user and sent by a user terminal, and to query the stored current business label corresponding to the previous question;
a word segmentation module, configured to obtain the lexicon corresponding to the current business label, perform word segmentation on the current question to obtain several segmented words, match the segmented words against the keywords in the lexicon, and obtain the number of successfully matched segmented words;
a slot word extraction module, configured to input the segmented words into a pre-trained slot word extraction model when the number of successfully matched segmented words is less than a preset value, so as to obtain, through the slot word extraction model, the word type corresponding to each segmented word;
a first relevance calculation module, configured to obtain the standard text corresponding to the word type, match the standard text against the keywords in the lexicon, and determine the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
an associated output module, configured to judge according to preset rules whether the current question is an unknown question when the degree of relevance is lower than a preset value, and to output the unknown question in association with the current business label when the current question is an unknown question;
a clustering module, configured to cluster the output unknown questions; and
a sending module, configured to send the clustered unknown questions to the operation terminal corresponding to the business label.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving a current question that is input by a user and sent by a user terminal, and querying the stored current business label corresponding to the previous question;
obtaining the lexicon corresponding to the current business label, performing word segmentation on the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words;
when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word;
obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
when the degree of relevance is lower than a preset value, judging according to preset rules whether the current question is an unknown question; when the current question is an unknown question, outputting the unknown question in association with the current business label;
clustering the output unknown questions; and
sending the clustered unknown questions to the operation terminal corresponding to the business label.
One or more computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
receiving a current question that is input by a user and sent by a user terminal, and querying the stored current business label corresponding to the previous question;
obtaining the lexicon corresponding to the current business label, performing word segmentation on the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words;
when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word;
obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
when the degree of relevance is lower than a preset value, judging according to preset rules whether the current question is an unknown question; when the current question is an unknown question, outputting the unknown question in association with the current business label;
clustering the output unknown questions; and
sending the clustered unknown questions to the operation terminal corresponding to the business label.
With the above method, apparatus, computer device, and medium, after the current question that is input by the user and sent by the user terminal is received, the degree of relevance between the current question and the previous question is first calculated according to the current business label of the previous question. If the degree of relevance is below the threshold, the question is narrowed down through the convergence question library. If the question is still unknown after convergence, it is output in association with the current business label. In other words, current questions that match no answer (unknown questions) are classified and organized under their respective current business label systems, which turns a large number of individual questions into a smaller number of question categories. The operation terminal only needs to reclassify the classification results or delete invalid questions, which improves processing efficiency.
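The convergence step can be sketched as a bounded loop. This is a simplified illustration under stated assumptions, not the patented procedure: the function name `converge`, the `ask` callback, the `"yes"` reply convention, and the default of three rounds are hypothetical, and the extraction of secondary representative words from a denial reply is omitted.

```python
from typing import Callable, List, Optional


def converge(
    candidates: List[str],
    ask: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the convergence question the user confirmed, or None if the
    current question should be treated as unknown.

    `ask` sends one convergence question to the user terminal and returns the
    reply; returning None models "no reply within the preset time period".
    """
    for sent, question in enumerate(candidates):
        if sent >= max_rounds:
            break  # preset number of convergence questions reached
        reply = ask(question)
        if reply is None:
            return None  # timed out: treat the current question as unknown
        if reply == "yes":
            return question  # confirmation reply: continue with business matching
        # Any other reply is treated as a denial; try the next candidate.
    return None


# Scripted replies stand in for a real user terminal.
replies = iter(["no", "yes"])
confirmed = converge(
    ["Do you mean A?", "Do you mean B?", "Do you mean C?"],
    lambda q: next(replies),
)
```

A `None` result is the point at which the question would be output together with the current business label for later clustering.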
The details of one or more embodiments of this application are set forth in the drawings and description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a method for processing unknown questions in intelligent question answering according to one or more embodiments.
FIG. 2 is a schematic flowchart of a method for processing unknown questions in intelligent question answering according to one or more embodiments.
FIG. 3 is a flowchart of the relevance calculation step according to one or more embodiments.
FIG. 4 is a structural block diagram of an apparatus for processing unknown questions in intelligent question answering according to one or more embodiments.
FIG. 5 is a block diagram of a computer device according to one or more embodiments.
Detailed Description
To make the technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
The method for processing unknown questions in intelligent question answering provided by this application can be applied in the application environment shown in FIG. 1. Both the user terminal 102 and the operation terminal 106 can communicate with the server 104 over a network. The user terminal 102 receives the current question input by the user and sends it to the server 104, and the server 104 queries the stored current business label corresponding to the previous question. The server 104 then obtains the lexicon corresponding to the current business label, performs word segmentation on the current question to obtain several segmented words, matches the segmented words against the keywords in the lexicon, and obtains the number of successfully matched segmented words. When that number is less than a preset value, the segmented words are input into a pre-trained slot word extraction model to obtain the word type corresponding to each segmented word; the standard text corresponding to the word type is obtained and matched against the keywords in the lexicon, and the degree of relevance between the current question and the previous question is determined according to the number of successfully matched standard texts. When the calculated degree of relevance is lower than a preset value, the server 104 judges according to preset rules whether the current question is an unknown question. If it is, the unknown question is output in association with the current business label, and the server 104 clusters the unknown questions according to business label and sends the clustered unknown questions to the operation terminal. Unknown questions are thus classified and organized under their respective business label systems, which turns a large number of individual questions into a smaller number of question categories; operations staff only need to reclassify the classification results or delete invalid questions to ensure that the classification is correct. The user terminal 102 and the operation terminal 106 may be, but are not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. The server 104 may be implemented as a standalone server or as a server cluster composed of multiple servers.
In one of the embodiments, as shown in FIG. 2, a method for processing unknown questions in intelligent question answering is provided. The method is described by taking its application to the server in FIG. 1 as an example, and includes the following steps:
S202: Receive the current question input by the user and sent by the user terminal, and query the stored current service tag corresponding to the previous question.
Specifically, when the user terminal first receives a question, that is, after a session is created, it sends the question to the server. The server determines whether the user terminal has created a new session; if so, the server creates the session accordingly. When each FAQ entry is created, it must be given a service tag that marks which business the FAQ entry addresses.
Thus, when the user enters questions through the terminal, the server saves in real time the current service tag corresponding to the most recent FAQ entry.
Therefore, when the terminal receives a new question from the user, the server can retrieve the most recently saved current service tag, that is, the service tag of the last FAQ entry that was saved.
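The bookkeeping in S202 can be pictured with a minimal sketch. The class name `SessionTagStore`, the use of a session identifier as the key, and the in-memory dictionary are illustrative assumptions; the text only states that the server saves the service tag of the most recent FAQ entry and reads it back when the next question arrives.

```python
from typing import Dict, Optional


class SessionTagStore:
    """Remembers, per session, the service tag of the most recently answered FAQ entry.

    Hypothetical helper: the source does not say how the tag is stored.
    """

    def __init__(self) -> None:
        self._current_tag: Dict[str, str] = {}

    def save(self, session_id: str, service_tag: str) -> None:
        # Overwrite the previous value so that only the latest tag is kept.
        self._current_tag[session_id] = service_tag

    def current_tag(self, session_id: str) -> Optional[str]:
        # None means no question has been answered in this session yet.
        return self._current_tag.get(session_id)
```

When a new question arrives, the server would call `current_tag` with the session identifier to decide which thesaurus to load.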
S204: Obtain the thesaurus corresponding to the current service tag, perform word segmentation on the current question to obtain several word segments, match the word segments against the keywords in the thesaurus, and obtain the number of successfully matched word segments.
When the server receives the current question, it determines the business field of the question according to the current service tag and then calculates, within that business field, the degree of relevance between the current question and the questions already asked. In other words, the server only needs to calculate the relevance between the current question and the questions already asked in that business field, which narrows the matching scope before matching and improves matching efficiency. The degree of relevance can be calculated in two ways. The first is thesaurus matching: each service tag corresponds to one thesaurus, and the server calculates the relevance between the current question and the words in that thesaurus. The second uses a trained model: when thesaurus matching fails, that is, when the relevance is 0 or below the threshold, the server inputs the current question into the trained model to obtain slot words and then matches the slot words against the thesaurus corresponding to the current service tag to obtain the degree of relevance.
Specifically, the keyword thesaurus is created and managed by operations staff according to the characteristics of their own business. The thesaurus therefore corresponds to a service tag, with one thesaurus per service tag, and is pre-configured in the server back end. The server can select the corresponding thesaurus according to the current service tag.
Specifically, word segmentation can use a conventional word segmentation algorithm, such as an NLP algorithm. The number of successfully matched word segments may refer to the number of word segments matched under fuzzy matching.
S206: When the number of successfully matched word segments is less than the preset value, input the word segments into the pre-trained slot word extraction model to obtain the word type corresponding to each word segment through the slot word extraction model.
Specifically, the slot extraction scheme requires a model to be trained in advance, that is, trained to learn the association between words and word types. Word types can include insurance type, insurance company, intended product, and so on. The server segments the current question and then inputs the word segments into the model to obtain the word type corresponding to each word segment.
S208: Obtain the standard text corresponding to the word type, match the standard text against the keywords in the thesaurus, and determine the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts.
Specifically, the standard text corresponding to the word type is obtained and matched against the keywords in the thesaurus, and the degree of relevance is obtained from the number of successfully matched keywords. For example, the degree of relevance can be the ratio of the number of successful matches to the total number of word segments.
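Steps S204 to S208 can be sketched as a two-stage calculation. This is only an illustration under stated assumptions: exact set membership stands in for the fuzzy matching mentioned above, `extract_type` is a hypothetical stand-in for the trained slot word extraction model, and the `standard_text` mapping from word type to standard text is invented for the example.

```python
from typing import Callable, Dict, List, Optional, Set


def count_matches(items: List[str], keywords: Set[str]) -> int:
    """Count items found in the thesaurus (exact match stands in for fuzzy matching)."""
    return sum(1 for item in items if item in keywords)


def relevance(
    segments: List[str],
    keywords: Set[str],
    min_matches: int,
    extract_type: Callable[[str], Optional[str]],  # hypothetical slot model
    standard_text: Dict[str, str],                 # hypothetical type -> standard text
) -> float:
    """Ratio of successful matches to the total number of word segments."""
    if not segments:
        return 0.0
    # Stage 1 (S204): match the word segments directly against the thesaurus.
    matched = count_matches(segments, keywords)
    if matched < min_matches:
        # Stage 2 (S206, S208): map each segment to a word type, take the
        # standard text of that type, and match the standard texts instead.
        types = [extract_type(segment) for segment in segments]
        texts = [standard_text[t] for t in types if t in standard_text]
        matched = count_matches(texts, keywords)
    return matched / len(segments)
```

The result is then compared with the preset value to decide whether the convergence step is needed.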
In a practical application, consider for example the following dialogue:
User: Do you have the product 少儿无忧 (Children's Worry-Free)?
System: Hello, we have this product. // Assume that 少儿无忧 is the name of a specific company product and has been configured in the thesaurus as a keyword. Because the user's question mentions this keyword, the system records the keyword directly and calculates the corresponding degree of relevance.
Slot extraction scheme (the second question is handled in the same way): if the user's question does not hit the keyword thesaurus, the slot extraction algorithm is used to identify the user's intent, and the extracted slots are then used together with the user's question for the similarity judgment.
Example:
User: I have Ping An life insurance. Can I use it for a mortgage loan?
System: The qualification requirements for a personal mortgage loan are as follows: xxxxxxxx. // Assuming that no keywords are configured, the slot extraction algorithm can extract the following information: "insurance type: life insurance", "insurance company: Ping An", "intended product: mortgage loan". The corresponding degree of relevance is then calculated from the extracted keywords.
In the above embodiment, the degree of relevance is calculated in multiple ways, which can improve the accuracy of the calculation.
S210: When the degree of relevance is lower than the preset value, determine whether the current question is an unknown question according to the preset rule.
Specifically, when the degree of relevance is lower than the preset value, the server can continue to ask the user questions according to the preset rule in order to converge the current question and thereby determine whether it is an unknown question, for example by converging the current question through the convergence question library.
S212: When the current question is an unknown question, output the unknown question in association with the current service tag.
Specifically, if the server still cannot confirm the business field of the current question after converging it using the convergence question library, that is, the question is not stored in the corresponding business field, the server can mark the question as an unknown question and output it in association with the service tag, so that the unknown questions under a specific service tag can be obtained. Associating unknown questions with the current service tag in this way clusters them, so that the operation terminal sees the questions category by category rather than as one disorganized collection.
S214: Perform clustering on the output unknown questions.
Specifically, a conventional clustering method can be used here, such as similarity matching, in which similarity is calculated across all questions and questions with high similarity are treated as one category. The server clusters the unknown questions according to the current service tag, that is, unknown questions with the same current service tag are placed in one category.
S216: Send the clustered unknown questions to the operation terminal corresponding to the service tag.
Specifically, clustering groups the different unknown questions within one business field. For example, in the wealth management field, clustering may yield different categories such as repayment questions and purchase questions. The server returns these categories to the operation terminal, so that the operation terminal can handle the questions of one category at a time instead of handling all unknown questions uniformly, which can improve the accuracy of processing.
Optionally, the server can count the clustered unknown questions and send them to operations only after a preset number is reached, which lets operations handle them in one pass.
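Steps S214 and S216, together with the optional batching, can be sketched as follows. The text only says that a conventional clustering method such as similarity matching may be used; the whitespace-token Jaccard similarity, the greedy grouping, and all function names here are assumptions chosen to keep the example short.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_tag(unknown: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Put (service_tag, question) pairs with the same service tag in one group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag, question in unknown:
        groups[tag].append(question)
    return dict(groups)


def jaccard(a: str, b: str) -> float:
    """Token-set similarity; a simple stand-in for 'similarity matching'."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cluster_similar(questions: List[str], threshold: float) -> List[List[str]]:
    """Greedy clustering: a question joins the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for question in questions:
        for cluster in clusters:
            if jaccard(question, cluster[0]) >= threshold:
                cluster.append(question)
                break
        else:
            clusters.append([question])
    return clusters


def ready_batches(groups: Dict[str, List[str]], min_count: int) -> Dict[str, List[str]]:
    """Keep only the groups that have reached the preset number of questions."""
    return {tag: qs for tag, qs in groups.items() if len(qs) >= min_count}
```

A server following this sketch would call `group_by_tag`, run `cluster_similar` inside each group, and forward only the groups returned by `ready_batches` to the matching operation terminal.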
With the above method for processing unknown questions in intelligent question answering, after the current question input by the user and sent by the user terminal is received, the degree of relevance between the current question and the previous question is first calculated according to the current service tag of the previous question. If the degree of relevance is lower than the threshold, the unknown question is output in association with the current service tag. In other words, current questions that could not be matched to an answer, namely unknown questions, are classified and sorted under their respective current service tag systems, turning a large number of individual questions into a small number of question categories. The operation terminal only needs to reclassify the classification results or delete invalid questions, which improves processing efficiency.
In one of the embodiments, the above method further includes: when the number of successfully matched word segments is greater than or equal to the preset value, determining the degree of relevance between the current question and the questions already asked according to the number of successfully matched word segments.
Specifically, the word segments are matched against the keywords in the thesaurus, and the degree of relevance between the current question and the questions already asked is determined according to the number of successfully matched word segments.
Specifically, the server matches the obtained word segments against the keywords in the thesaurus and obtains the degree of relevance from the number of successfully matched keywords. For example, the degree of relevance can be the ratio of the number of successful matches to the total number of word segments.
In the above embodiment, the degree of relevance is calculated in multiple ways, which can improve the accuracy of the calculation.
In one of the embodiments, determining whether the current question is an unknown question according to the preset rule includes:
S302: Extract the current convergence question from the convergence question library corresponding to the current service tag, and send the current convergence question to the user terminal for display.
Specifically, the convergence question library corresponds to a service tag, and the convergence questions stored in it are used to converge the user's question onto a certain category of questions within a certain business field. The library can contain multiple convergence questions. The server can first extract the current keyword from the current question, determine its meaning, extract a convergence question from the convergence question library according to that meaning, and send the convergence question to the user terminal for display so that the user notices it promptly and answers it. The convergence question can be "Hello, just to confirm, are you still asking about [%s]?", where [%s] is a variable filled with a keyword or intent.
There are three cases when judging according to the degree of relevance:
The first case: if the keyword or intent recorded in the previous round and the keyword or intent extracted from the user's question in the next round belong to two entirely different businesses, the keyword or intent recorded in the previous round is discarded directly and no convergence is performed. For example, if the previous round asked about insurance and the next round asks directly about credit cards, the next-round question alone is used for similarity analysis against the FAQ library.
The second case: if the intent of the next-round question is unclear, a randomly selected question with no specific direction can be used to guide the user to provide more information, such as "Hello, I did not catch that. Could you say it again?" or "Hello, I did not understand your question. Could you describe it more clearly?"
The third case: if the intent of the next-round question differs from that of the previous round (the case in this embodiment), the user is guided to clarify the question by inserting the thesaurus keyword or slot into a question sentence, for example "Hello, just to confirm, are you still asking about [%s]?", where [%s] is a variable filled with a keyword or intent.
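The three cases above can be read as a small decision function. The sketch below is one possible encoding, not the method as claimed: representing an unclear intent as a missing tag or keyword, comparing tags by equality, and the names `Action`, `choose_action`, and `convergence_prompt` are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    SEARCH_FAQ_ALONE = "search the FAQ library with the new question only"
    ASK_GENERIC_CLARIFICATION = "ask a question with no specific direction"
    ASK_CONVERGENCE_QUESTION = "ask a templated convergence question"


# English rendering of the template quoted in the text.
CONVERGENCE_TEMPLATE = "Hello, just to confirm: are you still asking about %s?"


def choose_action(
    previous_tag: str,
    current_tag: Optional[str],
    current_keyword: Optional[str],
) -> Action:
    """Pick one of the three cases for a low-relevance question."""
    if current_tag is None or current_keyword is None:
        # Second case: the intent of the new question is unclear.
        return Action.ASK_GENERIC_CLARIFICATION
    if current_tag != previous_tag:
        # First case: a different business, so the previous round is discarded.
        return Action.SEARCH_FAQ_ALONE
    # Third case: same business, but the intent differs from the previous round.
    return Action.ASK_CONVERGENCE_QUESTION


def convergence_prompt(keyword_or_intent: str) -> str:
    """Fill the variable in the convergence template."""
    return CONVERGENCE_TEMPLATE % keyword_or_intent
```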
S304: Receive the confirmation reply corresponding to the current convergence question returned by the user terminal, and match a business question corresponding to the service tag according to the confirmation reply.
Specifically, after seeing the convergence question returned by the server, the user can confirm or deny it. When the user gives a confirmation reply, the server can match the keywords in the confirmation reply, or the keywords in the convergence question, against the business questions of the service tag.
S306: When no business question corresponding to the service tag is matched, mark the current question as an unknown question.
Specifically, if the server does not match a corresponding business question, that is, the question is not stored in the corresponding business field, the server can mark the question as an unknown question and output it in association with the service tag, so that the unknown questions under a specific service tag can be obtained. Associating unknown questions with the current service tag in this way clusters them, so that the operation terminal sees the questions category by category rather than as one disorganized collection.
In the above embodiment, the current question is converged through the convergence question library to determine whether it is an unknown question, which improves processing efficiency.
In one of the embodiments, extracting the current convergence question from the convergence question library corresponding to the current service tag includes: performing word segmentation on the current question to obtain representative words and the order in which the representative words appear; selecting initial questions corresponding to the representative words from the convergence question library corresponding to the current service tag; and selecting, as the current convergence question, the initial question in which the order of appearance of the words is consistent with the order of appearance of the representative words.
Specifically, the server can perform word segmentation on the current question, in the same way as described above, and obtain the order of the resulting representative words. The server can then select the initial questions corresponding to the representative words from the convergence question library corresponding to the current service tag. For example, if the representative words appear in the order company, product type, intended product, the server can match in that order to obtain the initial question. Matching in order in this way can improve matching efficiency.
For example, User: Do you have the product 少儿无忧 (Children's Worry-Free)?
System: Hello, we have this product. // Assume that 少儿无忧 is the name of a specific company product and has been configured in the thesaurus as a keyword. Because the user's question mentions this keyword, the system records the keyword directly.
User: Does it cover children under 3 years old?
// At this point, using this question alone for a similarity search in the answer library would most likely find no answer, or find one only with very low probability, because the intent is unclear. The keyword recorded in the previous round can therefore be used to supplement the search, which makes the similarity with the matched question much higher.
Slot extraction scheme: if the user's question does not hit the keyword thesaurus, the slot extraction algorithm is used to identify the user's intent, and the extracted slots are then used together with the user's question for the similarity judgment.
Example:
User: I have Ping An life insurance. Can I use it for a mortgage loan?
System: The qualification requirements for a personal mortgage loan are as follows: xxxxxxxx. // Assuming that no keywords are configured, the slot extraction algorithm can extract the following information: "insurance company: Ping An", "insurance type: life insurance", "intended product: mortgage loan".
User: Does my insurance count as the qualification you require? // At this point, the slot information extracted in the previous round is added to the user's intent, and both are used together to query the answer library for the answer to the related question.
In the above embodiment, the convergence question is determined according to the order in which the keywords appear, which can improve the efficiency of determining the convergence question.
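The order-based selection in this embodiment can be sketched with plain substring search. The interpretation used here is an assumption: an initial question "corresponds" to the representative words if it contains at least one of them, and its order is "consistent" if the words it contains occur in the same order as in the user's question. The function names and the sample library are hypothetical.

```python
from typing import List, Optional


def in_same_order(words: List[str], sentence: str) -> bool:
    """True if every word occurs in the sentence, each after the previous one."""
    position = 0
    for word in words:
        index = sentence.find(word, position)
        if index < 0:
            return False
        position = index + len(word)
    return True


def pick_convergence_question(rep_words: List[str], library: List[str]) -> Optional[str]:
    """Select the first initial question whose word order matches the representative words."""
    # Initial questions: those that mention at least one representative word.
    initial = [q for q in library if any(word in q for word in rep_words)]
    for question in initial:
        present = [word for word in rep_words if word in question]
        if in_same_order(present, question):
            return question
    return None
```

With representative words in the order company, product type, intended product, a library entry that mentions the intended product first would be skipped in favour of one that follows the same order.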
In one of the embodiments, after the current convergence question is sent to the user terminal for display, the method further includes: receiving a denial reply corresponding to the current convergence question returned by the user terminal, and determining whether a secondary representative word can be extracted from the denial reply; when no secondary representative word can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question and continuing to send the current convergence question to the user terminal for display; and, once the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal has not received a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question and outputting the unknown question in association with the current service tag.
Specifically, the denial reply corresponding to the current convergence question returned by the user terminal is received, and it is determined whether a secondary representative word can be extracted from the denial reply. When a secondary representative word can be extracted from the denial reply, the secondary keyword is used as the current keyword, a convergence question is obtained from the convergence question library according to the current keyword, and the convergence question is again sent to the terminal for display. This continues until the number of convergence questions reaches the preset value or the user stops the question-and-answer exchange, at which point the current question is output as an unknown question in association with the current service tag.
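The loop described in this embodiment can be sketched as follows. Everything outside the control flow is an assumption: `ask` is a hypothetical callback that sends a question and returns the reply or `None` when no reply arrives in the preset period, `extract_secondary` stands in for secondary representative word extraction, a reply starting with "yes" is treated as a confirmation, and `fallback_keywords` stands in for taking the next convergence question from the library.

```python
from typing import Callable, List, Optional

TEMPLATE = "Hello, just to confirm: are you still asking about %s?"


def converge(
    keyword: str,
    fallback_keywords: List[str],
    ask: Callable[[str], Optional[str]],
    extract_secondary: Callable[[str], Optional[str]],
    max_questions: int,
) -> Optional[str]:
    """Return the confirmed keyword, or None if the question should be marked unknown."""
    fallbacks = iter(fallback_keywords)
    current = keyword
    for _ in range(max_questions):
        reply = ask(TEMPLATE % current)
        if reply is None:
            # No reply within the preset time period.
            return None
        if reply.strip().lower().startswith("yes"):
            # Confirmation reply: convergence succeeded.
            return current
        # Denial reply: prefer a secondary representative word from the reply,
        # otherwise move on to the next convergence question in the library.
        secondary = extract_secondary(reply)
        if secondary is None:
            secondary = next(fallbacks, None)
            if secondary is None:
                return None
        current = secondary
    # The preset number of convergence questions has been reached.
    return None
```

A `None` result corresponds to outputting the current question as an unknown question in association with the current service tag.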
In the above embodiment, unknown questions are classified according to the user's replies to facilitate subsequent processing by the operation terminal.
In one of the embodiments, after the clustered unknown questions are sent to the operation terminal corresponding to the service tag, the method includes: receiving the standard reply corresponding to an unknown question returned by the operation terminal, and storing the standard reply, the unknown question, and the service tag corresponding to the unknown question in association with one another. After the current question input by the user and sent by the user terminal is received and the stored current service tag corresponding to the previous question is queried, the method further includes: matching the current question against the stored unknown questions according to the current service tag; if the match succeeds, obtaining the standard reply corresponding to the unknown question and outputting it; otherwise, continuing to calculate the degree of relevance between the current question and the questions already asked according to the current service tag.
Specifically, after processing the clustered unknown questions, the operation terminal can also add the standard reply, the unknown question, and the service tag corresponding to the unknown question to the corresponding library. That is, the classified questions are imported in batches into the existing question library as similar questions, or a new standard question library is created as a whole, which greatly improves processing efficiency. After the current question is received, it is first matched against the question library corresponding to the service tag. If the match succeeds, the answer is output directly; otherwise, the degree of relevance between the current question and the questions already asked is calculated according to the current service tag.
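The lookup-then-fallback order in this embodiment can be sketched as below. The class name, the nested dictionary keyed by service tag, and the exact-match comparison after trimming and lowercasing are assumptions; the text does not say how a current question is matched against stored unknown questions.

```python
from typing import Callable, Dict, Optional


class AnsweredQuestionLibrary:
    """Stores standard replies for formerly unknown questions, grouped by service tag."""

    def __init__(self) -> None:
        self._replies: Dict[str, Dict[str, str]] = {}

    @staticmethod
    def _normalize(question: str) -> str:
        # Simplistic normalization; a real system might use similarity matching.
        return question.strip().lower()

    def store(self, service_tag: str, question: str, standard_reply: str) -> None:
        self._replies.setdefault(service_tag, {})[self._normalize(question)] = standard_reply

    def lookup(self, service_tag: str, question: str) -> Optional[str]:
        return self._replies.get(service_tag, {}).get(self._normalize(question))


def answer(
    library: AnsweredQuestionLibrary,
    service_tag: str,
    question: str,
    relevance_flow: Callable[[str, str], str],  # hypothetical: the S204 onward flow
) -> str:
    """Try the stored replies first; otherwise continue with the relevance calculation."""
    reply = library.lookup(service_tag, question)
    if reply is not None:
        return reply
    return relevance_flow(service_tag, question)
```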
In the above embodiment, after operations has processed the unknown questions, a question library of the unknown questions is established. This question library is still organized by service tag, so that after the current question is received, the reply in the question library corresponding to the service tag can be queried directly, which improves efficiency.
It should be understood that although the steps in the flowcharts of FIGS. 2 and 3 are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be performed in other orders. Moreover, at least some of the steps in FIGS. 2 and 3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but can be performed at different times, and they are not necessarily performed sequentially but can be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one of the embodiments, as shown in FIG. 4, an apparatus for processing unknown questions in intelligent question answering is provided, including: a first receiving module 100, a word segmentation module 200, a slot word extraction module 300, a first relevance calculation module 400, a first judgment module 500, an association output module 600, a clustering module 700, and a sending module 800, wherein:
the first receiving module 100 is configured to receive the current question input by the user and sent by the user terminal, and to query the stored current service tag corresponding to the previous question;
the word segmentation module 200 is configured to obtain the thesaurus corresponding to the current service tag, perform word segmentation on the current question to obtain several word segments, match the word segments against the keywords in the thesaurus, and obtain the number of successfully matched word segments;
the slot word extraction module 300 is configured to, when the number of successfully matched word segments is less than the preset value, input the word segments into the pre-trained slot word extraction model to obtain the word type corresponding to each word segment through the slot word extraction model;
the first relevance calculation module 400 is configured to obtain the standard text corresponding to the word type, match the standard text against the keywords in the thesaurus, and determine the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
the first judgment module 500 is configured to, when the degree of relevance is lower than the preset value, determine whether the current question is an unknown question according to the preset rule;
the association output module 600 is configured to, when the current question is an unknown question, output the unknown question in association with the current service tag;
the clustering module 700 is configured to perform clustering on the output unknown questions;
the sending module 800 is configured to send the clustered unknown questions to the operation terminal corresponding to the service tag.
在其中一个实施例中,装置还包括:In one of the embodiments, the device further includes:
第二关联度计算模块,用于当匹配成功的分词的个数大于等于预设值时,则根据匹配成功的分词的个数据确定当前问题与已提问的问题的关联度。The second degree of relevance calculation module is used to determine the degree of relevance between the current question and the question that has been asked according to the data of the successfully matched word segmentation when the number of successfully matched word segmentation is greater than or equal to the preset value.
在其中一个实施例中,关联输出模块600可以包括:In one of the embodiments, the correlation output module 600 may include:
显示单元,用于从当前业务标签对应的收敛问题库中提取当前收敛问句,并将当前收敛问句发送至用户终端进行显示。The display unit is configured to extract the current convergence question sentence from the convergence question library corresponding to the current service label, and send the current convergence question sentence to the user terminal for display.
接收单元,用于接收用户终端返回的与当前收敛问句对应的确认答复,并根据确认答复匹配与业务标签对应的业务问题。The receiving unit is configured to receive the confirmation reply corresponding to the current convergence question returned by the user terminal, and match the business question corresponding to the service label according to the confirmation reply.
输出单元,用于当未匹配到与业务标签对应的业务问题时,则将当前问题标记为未知问题。The output unit is used to mark the current question as an unknown question when the service question corresponding to the service label is not matched.
在其中一个实施例中,上述显示单元可以包括:In one of the embodiments, the above-mentioned display unit may include:
顺序获取单元,用于对当前问题进行分词得到代表词以及代表词的出现顺序。The sequence acquisition unit is used to segment the current question to obtain the representative words and the appearance order of the representative words.
初始问句选取单元,用于从当前业务标签对应的收敛问题库中选取与代表词对应的初始问句。The initial question selection unit is used to select the initial question corresponding to the representative word from the convergence question library corresponding to the current business label.
收敛问句选取单元,用于选取词汇出现顺序与代表词的出现顺序相一致的初始问句作为当前收敛问句。The convergent question selection unit is used to select the initial question sentence whose vocabulary appearance order is consistent with the appearance order of the representative word as the current convergent question sentence.
In one embodiment, the apparatus for processing unknown questions in intelligent question answering may further include:
A third receiving module, configured to receive a denial reply, returned by the user terminal, that corresponds to the current convergence question, and to determine whether secondary representative words can be extracted from the denial reply.
An extraction module, configured to, when no secondary representative words can be extracted from the denial reply, extract the next convergence question from the convergence question library as the current convergence question and continue to send the current convergence question to the user terminal for display.
A loop module, configured to repeat the above until the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal does not receive a reply corresponding to the current convergence question within a preset time period, and then to determine that the current question is an unknown question and output the unknown question in association with the current service label.
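The loop formed by these three modules can be sketched as follows. It is a minimal sketch, not the applicant's implementation: `ask` is a hypothetical callback that sends one convergence question and returns the user's denial reply, or `None` when no reply arrives within the preset time period; `extract_secondary_words` is a hypothetical stand-in for whatever extraction logic is used. Treating an exhausted library as "unknown" is also an assumption, and what happens after secondary representative words are found is not covered by this passage, so the sketch simply returns them.

```python
from typing import Callable, Iterable, Optional


def run_convergence_dialog(
    questions: Iterable[str],
    ask: Callable[[str], Optional[str]],
    extract_secondary_words: Callable[[str], list[str]],
    max_questions: int = 3,
) -> dict:
    """Send convergence questions until one helps, the limit is hit, or a timeout occurs."""
    sent = 0
    for question in questions:
        if sent >= max_questions:
            break  # preset number of convergence questions reached
        reply = ask(question)  # None models "no reply within the preset time period"
        sent += 1
        if reply is None:
            return {"status": "unknown", "reason": "timeout", "sent": sent}
        secondary = extract_secondary_words(reply)
        if secondary:
            # Secondary representative words were found; follow-up handling
            # is outside the scope of this sketch.
            return {"status": "refined", "secondary_words": secondary, "sent": sent}
    reason = "limit_reached" if sent >= max_questions else "library_exhausted"
    return {"status": "unknown", "reason": reason, "sent": sent}


# Example: the user keeps answering "no", so the limit of three questions is reached.
result = run_convergence_dialog(
    ["q1", "q2", "q3", "q4"],
    ask=lambda q: "no",
    extract_secondary_words=lambda reply: [],
    max_questions=3,
)
```

A result with `status == "unknown"` corresponds to the point at which the current question would be output together with the current service label.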
In one embodiment, the apparatus for processing unknown questions in intelligent question answering may further include:
A fourth receiving module, configured to receive a standard answer, returned by the operation terminal, that corresponds to the unknown question, and to store the standard answer, the unknown question, and the service label corresponding to the unknown question in association with one another.
A matching module, configured to match the current question against the stored unknown questions according to the current service label.
An output module, configured to, if the match succeeds, obtain and output the standard answer corresponding to the unknown question, and otherwise to continue calculating the degree of relevance between the current question and the previously asked questions according to the current service label.
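The store-then-match behavior of these modules could look like the sketch below. The matching criterion is not specified in this passage, so the sketch assumes an exact match after lowercasing and whitespace normalization; the class name `AnsweredUnknownStore` and its methods are hypothetical.

```python
from typing import Optional


class AnsweredUnknownStore:
    """Keeps standard answers for previously unknown questions, keyed by service label."""

    def __init__(self) -> None:
        # (service_label, normalized_question) -> standard answer
        self._answers: dict[tuple[str, str], str] = {}

    @staticmethod
    def _normalize(question: str) -> str:
        # Assumption: questions match when equal after case/whitespace normalization.
        return " ".join(question.lower().split())

    def save(self, service_label: str, question: str, answer: str) -> None:
        """Store the answer, the unknown question, and its service label in association."""
        self._answers[(service_label, self._normalize(question))] = answer

    def lookup(self, service_label: str, question: str) -> Optional[str]:
        """Return the stored standard answer, or None if nothing matches."""
        return self._answers.get((service_label, self._normalize(question)))


store = AnsweredUnknownStore()
store.save("insurance", "How do I  freeze my policy?", "Use the self-service page.")

hit = store.lookup("insurance", "how do i freeze my policy?")
miss = store.lookup("loans", "how do i freeze my policy?")
```

When `lookup` returns `None`, the flow would fall through to the relevance calculation, as the output module describes.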
For the specific limitations of the apparatus for processing unknown questions in intelligent question answering, refer to the limitations of the method for processing unknown questions in intelligent question answering given above, which are not repeated here. Each module of the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile or volatile storage medium and an internal memory. The non-volatile or volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device is used to store data such as the convergence question library and the service labels. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a method for processing unknown questions in intelligent question answering.
Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving a current question input by a user and sent by a user terminal, and querying the stored current service label corresponding to the previous question; obtaining the lexicon corresponding to the current service label, segmenting the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words; when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word; obtaining the standard text corresponding to each word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts; when the degree of relevance is lower than a preset value, determining whether the current question is an unknown question according to a preset rule; when the current question is an unknown question, outputting the unknown question in association with the current service label; clustering the output unknown questions; and sending the clustered unknown questions to the operation terminal corresponding to the service label.
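The two-stage relevance calculation in these steps (direct keyword matching, then a fallback through the slot word extraction model) can be sketched as below. Several things here are assumptions made for illustration only: the degree of relevance is taken to be the raw match count, the pre-trained slot word extraction model is replaced by a plain callable that maps a segmented word to a word type (or `None`), and the standard text for each word type comes from a dictionary. The names `count_matches` and `relevance` are hypothetical.

```python
from typing import Callable, Optional


def count_matches(words: list[str], keywords: set[str]) -> int:
    """Number of words that appear among the lexicon keywords."""
    return sum(1 for w in words if w in keywords)


def relevance(
    segmented_words: list[str],
    keywords: set[str],
    slot_model: Callable[[str], Optional[str]],
    standard_text: dict[str, str],
    preset_value: int = 2,
) -> int:
    """Relevance as a match count, with a slot-based fallback for sparse matches."""
    direct = count_matches(segmented_words, keywords)
    if direct >= preset_value:
        return direct  # enough direct matches; use their number
    # Fallback: word -> word type -> standard text, then match the standard text.
    word_types = [slot_model(w) for w in segmented_words]
    normalized = [standard_text[t] for t in word_types if t in standard_text]
    return count_matches(normalized, keywords)


# Toy stand-ins for the lexicon, the slot model, and the standard texts.
keywords = {"credit card", "repayment", "limit"}
slot_types = {"visa": "CARD", "payback": "REPAY"}
standard_text = {"CARD": "credit card", "REPAY": "repayment"}

score = relevance(["my", "visa", "payback"], keywords, slot_types.get, standard_text)
```

Here none of the user's words match the lexicon directly, but after mapping "visa" and "payback" to their standard texts, two matches are found, which is the case the fallback is meant to recover.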
In one embodiment, the processor further implements the following step when executing the computer-readable instructions: when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
In one embodiment, determining whether the current question is an unknown question according to a preset rule, as implemented when the processor executes the computer-readable instructions, includes: extracting a current convergence question from the convergence question library corresponding to the current service label, and sending the current convergence question to the user terminal for display; receiving a confirmation reply, returned by the user terminal, that corresponds to the current convergence question, and matching a service question corresponding to the service label according to the confirmation reply; and when no service question corresponding to the service label is matched, marking the current question as an unknown question.
In one embodiment, extracting the current convergence question from the convergence question library corresponding to the current service label, as implemented when the processor executes the computer-readable instructions, includes: segmenting the current question to obtain representative words and the order in which the representative words appear; selecting initial questions corresponding to the representative words from the convergence question library corresponding to the current service label; and selecting, as the current convergence question, an initial question whose word order is consistent with the order in which the representative words appear.
In one embodiment, after sending the current convergence question to the user terminal for display, the steps implemented when the processor executes the computer-readable instructions further include: receiving a denial reply, returned by the user terminal, that corresponds to the current convergence question, and determining whether secondary representative words can be extracted from the denial reply; when no secondary representative words can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and, once the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal does not receive a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question and outputting the unknown question in association with the current service label.
In one embodiment, after sending the clustered unknown questions to the operation terminal corresponding to the service label, the steps implemented when the processor executes the computer-readable instructions include: receiving a standard answer, returned by the operation terminal, that corresponds to the unknown question, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association with one another. After receiving the current question input by the user and sent by the user terminal and querying the stored current service label corresponding to the previous question, the steps further include: matching the current question against the stored unknown questions according to the current service label; and if the match succeeds, obtaining and outputting the standard answer corresponding to the unknown question, and otherwise continuing to calculate the degree of relevance between the current question and the previously asked questions according to the current service label.
One or more computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving a current question input by a user and sent by a user terminal, and querying the stored current service label corresponding to the previous question; obtaining the lexicon corresponding to the current service label, segmenting the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words; when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each segmented word; obtaining the standard text corresponding to each word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts; when the degree of relevance is lower than a preset value, determining whether the current question is an unknown question according to a preset rule; when the current question is an unknown question, outputting the unknown question in association with the current service label; clustering the output unknown questions; and sending the clustered unknown questions to the operation terminal corresponding to the service label.
The computer-readable storage medium may be non-volatile or volatile.
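The final two steps recited above, clustering the output unknown questions and grouping them per service label for the operation terminal, can be illustrated with a small sketch. The passage does not name a clustering algorithm, so the greedy grouping by Jaccard similarity of word sets used here is purely an assumption chosen for brevity; `jaccard`, `cluster_unknown_questions`, and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two word sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_unknown_questions(
    items: list[tuple[str, str]],
    threshold: float = 0.5,
) -> dict[str, list[list[str]]]:
    """Group (service_label, question) pairs into clusters, separately per label."""
    clusters: dict[str, list[dict]] = defaultdict(list)
    for label, question in items:
        words = set(question.lower().split())
        for cluster in clusters[label]:
            # Compare against the first question of the cluster (its representative).
            if jaccard(words, cluster["words"]) >= threshold:
                cluster["questions"].append(question)
                break
        else:
            clusters[label].append({"words": words, "questions": [question]})
    return {label: [c["questions"] for c in cs] for label, cs in clusters.items()}


unknown = [
    ("cards", "how to freeze card"),
    ("cards", "how to freeze my card"),
    ("cards", "change billing date"),
    ("loans", "how to freeze card"),
]
by_label = cluster_unknown_questions(unknown)
```

Each key of `by_label` corresponds to one service label, so its clusters could be sent to the operation terminal responsible for that label.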
In one embodiment, the following step is further implemented when the computer-readable instructions are executed by the processor: when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
In one embodiment, determining whether the current question is an unknown question according to a preset rule, as implemented when the computer-readable instructions are executed by the processor, includes: extracting a current convergence question from the convergence question library corresponding to the current service label, and sending the current convergence question to the user terminal for display; receiving a confirmation reply, returned by the user terminal, that corresponds to the current convergence question, and matching a service question corresponding to the service label according to the confirmation reply; and when no service question corresponding to the service label is matched, marking the current question as an unknown question.
In one embodiment, extracting the current convergence question from the convergence question library corresponding to the current service label, as implemented when the computer-readable instructions are executed by the processor, includes: segmenting the current question to obtain representative words and the order in which the representative words appear; selecting initial questions corresponding to the representative words from the convergence question library corresponding to the current service label; and selecting, as the current convergence question, an initial question whose word order is consistent with the order in which the representative words appear.
In one embodiment, after sending the current convergence question to the user terminal for display, the steps implemented when the computer-readable instructions are executed by the processor further include: receiving a denial reply, returned by the user terminal, that corresponds to the current convergence question, and determining whether secondary representative words can be extracted from the denial reply; when no secondary representative words can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and, once the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal does not receive a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question and outputting the unknown question in association with the current service label.
In one embodiment, after sending the clustered unknown questions to the operation terminal corresponding to the service label, the steps implemented when the computer-readable instructions are executed by the processor include: receiving a standard answer, returned by the operation terminal, that corresponds to the unknown question, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association with one another. After receiving the current question input by the user and sent by the user terminal and querying the stored current service label corresponding to the previous question, the steps further include: matching the current question against the stored unknown questions according to the current service label; and if the match succeeds, obtaining and outputting the standard answer corresponding to the unknown question, and otherwise continuing to calculate the degree of relevance between the current question and the previously asked questions according to the current service label.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by computer-readable instructions that direct the relevant hardware. The computer-readable instructions may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For conciseness, not every possible combination of these technical features is described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art may make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. A method for processing unknown questions in intelligent question answering, comprising:
    receiving a current question input by a user and sent by a user terminal, and querying the stored current service label corresponding to the previous question;
    obtaining the lexicon corresponding to the current service label, segmenting the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words;
    when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each of the segmented words;
    obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
    when the degree of relevance is lower than a preset value, determining whether the current question is an unknown question according to a preset rule; when the current question is an unknown question, outputting the unknown question in association with the current service label;
    clustering the output unknown questions; and
    sending the clustered unknown questions to the operation terminal corresponding to the service label.
  2. The method according to claim 1, wherein the method further comprises:
    when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
  3. The method according to claim 2, wherein determining whether the current question is an unknown question according to a preset rule comprises:
    extracting a current convergence question from the convergence question library corresponding to the current service label, and sending the current convergence question to the user terminal for display;
    receiving a confirmation reply, returned by the user terminal, that corresponds to the current convergence question, and matching a service question corresponding to the service label according to the confirmation reply; and
    when no service question corresponding to the service label is matched, marking the current question as an unknown question.
  4. The method according to claim 3, wherein extracting the current convergence question from the convergence question library corresponding to the current service label comprises:
    segmenting the current question to obtain representative words and the order in which the representative words appear;
    selecting initial questions corresponding to the representative words from the convergence question library corresponding to the current service label; and
    selecting, as the current convergence question, an initial question whose word order is consistent with the order in which the representative words appear.
  5. The method according to claim 4, wherein after sending the current convergence question to the user terminal for display, the method further comprises:
    receiving a denial reply, returned by the user terminal, that corresponds to the current convergence question, and determining whether secondary representative words can be extracted from the denial reply;
    when no secondary representative words can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and
    once the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal does not receive a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question, and outputting the unknown question in association with the current service label.
  6. The method according to any one of claims 1 to 5, wherein after sending the clustered unknown questions to the operation terminal corresponding to the service label, the method further comprises:
    receiving a standard answer, returned by the operation terminal, that corresponds to the unknown question, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association with one another;
    and wherein after receiving the current question input by the user and sent by the user terminal and querying the stored current service label corresponding to the previous question, the method further comprises:
    matching the current question against the stored unknown questions according to the current service label; and
    if the match succeeds, obtaining and outputting the standard answer corresponding to the unknown question, and otherwise continuing to calculate the degree of relevance between the current question and the previously asked questions according to the current service label.
  7. An apparatus for processing unknown questions in intelligent question answering, comprising:
    a first receiving module, configured to receive a current question input by a user and sent by a user terminal, and to query the stored current service label corresponding to the previous question;
    a word segmentation module, configured to obtain the lexicon corresponding to the current service label, segment the current question to obtain several segmented words, match the segmented words against the keywords in the lexicon, and obtain the number of successfully matched segmented words;
    a slot word extraction module, configured to, when the number of successfully matched segmented words is less than a preset value, input the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each of the segmented words;
    a first relevance calculation module, configured to obtain the standard text corresponding to the word type, match the standard text against the keywords in the lexicon, and determine the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
    an association output module, configured to, when the degree of relevance is lower than a preset value, determine whether the current question is an unknown question according to a preset rule, and, when the current question is an unknown question, output the unknown question in association with the current service label;
    a clustering module, configured to cluster the output unknown questions; and
    a sending module, configured to send the clustered unknown questions to the operation terminal corresponding to the service label.
  8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a second relevance calculation module, configured to determine, when the number of successfully matched segmented words is greater than or equal to the preset value, the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
  9. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving a current question input by a user and sent by a user terminal, and querying the stored current service label corresponding to the previous question;
    obtaining the lexicon corresponding to the current service label, segmenting the current question to obtain several segmented words, matching the segmented words against the keywords in the lexicon, and obtaining the number of successfully matched segmented words;
    when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model to obtain, through the slot word extraction model, the word type corresponding to each of the segmented words;
    obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the lexicon, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
    when the degree of relevance is lower than a preset value, determining whether the current question is an unknown question according to a preset rule; when the current question is an unknown question, outputting the unknown question in association with the current service label;
    clustering the output unknown questions; and
    sending the clustered unknown questions to the operation terminal corresponding to the service label.
  10. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following step:
    when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
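The two relevance paths recited above (direct keyword matching, with a fallback through the slot word extraction model when too few words match) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the `min_matches` default, and the representation of the slot word extraction model as a plain callable are all hypothetical, and the question is assumed to be segmented already.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set


def count_matches(items: Iterable[str], keywords: Set[str]) -> int:
    """Count the distinct items that appear among the thesaurus keywords."""
    return len(set(items) & keywords)


def relevance(
    segmented_words: List[str],
    keywords: Set[str],
    slot_model: Callable[[str], Optional[str]],  # hypothetical: word -> word type
    standard_text: Dict[str, str],               # hypothetical: word type -> standard text
    min_matches: int = 2,                        # the "preset value"; 2 is an assumption
) -> int:
    """Return a relevance score equal to the number of successful matches."""
    direct = count_matches(segmented_words, keywords)
    if direct >= min_matches:
        # Enough direct matches: relevance follows from the match count.
        return direct
    # Too few direct matches: map each word to its word type, then to the
    # standard text for that type, and match the standard texts instead.
    normalized = set()
    for word in segmented_words:
        word_type = slot_model(word)
        if word_type in standard_text:
            normalized.add(standard_text[word_type])
    return count_matches(normalized, keywords)
```

A caller would compare the returned score against a second preset value and, if it is lower, proceed to the unknown-question check.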
  11. The computer device according to claim 10, wherein the determining, according to a preset rule, whether the current question is an unknown question, as implemented when the processor executes the computer-readable instructions, comprises:
    extracting a current convergence question from the convergence question library corresponding to the current service label, and sending the current convergence question to the user terminal for display;
    receiving a confirmation reply, returned by the user terminal, corresponding to the current convergence question, and matching, according to the confirmation reply, a business question corresponding to the service label; and
    when no business question corresponding to the service label is matched, marking the current question as an unknown question.
  12. The computer device according to claim 11, wherein the extracting of the current convergence question from the convergence question library corresponding to the current service label, as implemented when the processor executes the computer-readable instructions, comprises:
    performing word segmentation on the current question to obtain representative words and the order in which the representative words appear;
    selecting, from the convergence question library corresponding to the current service label, initial questions corresponding to the representative words; and
    selecting, as the current convergence question, the initial question whose word order is consistent with the order in which the representative words appear.
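One way to read the order-consistency test above is as a subsequence check: the representative words must occur in the candidate question in the same relative order. The sketch below assumes whitespace-tokenized questions and hypothetical function names; the claim itself does not prescribe how order consistency is evaluated.

```python
from typing import List, Optional


def tokenize(sentence: str) -> List[str]:
    """Lower-case and split on whitespace (a stand-in for real word segmentation)."""
    return sentence.lower().rstrip("?").split()


def in_same_order(words: List[str], tokens: List[str]) -> bool:
    """True if `words` occur in `tokens` in the same relative order."""
    remaining = iter(tokens)
    # `w in remaining` consumes the iterator up to the first occurrence of w,
    # so each later word must be found after the earlier ones.
    return all(w in remaining for w in words)


def pick_convergence_question(
    representative_words: List[str], library: List[str]
) -> Optional[str]:
    """Pick the first library question containing all words in matching order."""
    # Initial questions: those that contain every representative word.
    initial = [
        q for q in library
        if all(w in tokenize(q) for w in representative_words)
    ]
    for question in initial:
        if in_same_order(representative_words, tokenize(question)):
            return question
    return None
```

For example, with the representative words `["repay", "loan"]`, a library question that mentions "loan" before "repay" is filtered out by the order check even though it contains both words.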
  13. The computer device according to claim 12, wherein, after the sending of the current convergence question to the user terminal for display, as implemented when the processor executes the computer-readable instructions, the method further comprises:
    receiving a denial reply, returned by the user terminal, corresponding to the current convergence question, and determining whether secondary representative words can be extracted from the denial reply;
    when no secondary representative words can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and
    when the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal has not received a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question, and outputting the unknown question in association with the current service label.
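The confirm/deny loop with its two stopping conditions can be sketched as a small driver function. Everything here is an illustrative assumption: the callback signatures, the `max_questions` default, modelling a timeout as a `None` reply, and the `"refine"` outcome (the claim does not state what happens once secondary representative words are found, so the sketch simply hands control back to the caller).

```python
from typing import Callable, Iterable, List, Optional


def run_convergence(
    questions: Iterable[str],
    ask: Callable[[str], Optional[str]],           # returns the reply, or None on timeout
    is_confirmation: Callable[[str], bool],
    extract_secondary: Callable[[str], List[str]],
    max_questions: int = 3,                        # the "preset value"; 3 is an assumption
) -> str:
    """Return "confirmed", "refine", or "unknown"."""
    sent = 0
    for question in questions:
        if sent >= max_questions:
            break  # preset number of convergence questions reached
        reply = ask(question)
        sent += 1
        if reply is None:
            return "unknown"    # no reply within the preset time period
        if is_confirmation(reply):
            return "confirmed"  # caller then matches a business question
        if extract_secondary(reply):
            return "refine"     # caller can use the secondary representative words
        # Denial without usable words: fall through to the next question.
    # Limit reached (or library exhausted, an added assumption): unknown question.
    return "unknown"
```

On `"unknown"`, the caller would output the current question together with the current service label.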
  14. The computer device according to any one of claims 9 to 13, wherein, after the sending of the clustered unknown questions to the operation terminal corresponding to the service label, as implemented when the processor executes the computer-readable instructions, the steps further comprise:
    receiving the standard answer corresponding to the unknown question returned by the operation terminal, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association;
    and wherein, after the receiving of the current question input by the user and sent by the user terminal and the querying of the stored current service label corresponding to the previous question, as implemented when the processor executes the computer-readable instructions, the steps further comprise:
    matching the current question against the stored unknown questions according to the current service label; and
    if the match succeeds, obtaining the standard answer corresponding to the unknown question and outputting it; otherwise, continuing to calculate, according to the current service label, the degree of relevance between the current question and the previously asked questions.
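The store-then-reuse behaviour described above amounts to a lookup keyed by service label and question. A minimal sketch follows, assuming exact matching after whitespace and case normalization; the claim does not specify the matching method, and the class and method names are hypothetical.

```python
from typing import Dict, Optional, Tuple


class AnsweredUnknownStore:
    """Keeps standard answers for unknown questions, keyed by service label."""

    def __init__(self) -> None:
        self._answers: Dict[Tuple[str, str], str] = {}

    @staticmethod
    def _normalize(question: str) -> str:
        # Collapse whitespace and ignore case (a simplifying assumption).
        return " ".join(question.lower().split())

    def save(self, label: str, question: str, answer: str) -> None:
        """Store the answer in association with the question and its label."""
        self._answers[(label, self._normalize(question))] = answer

    def lookup(self, label: str, question: str) -> Optional[str]:
        """Return the stored answer, or None so the caller continues
        with the relevance calculation."""
        return self._answers.get((label, self._normalize(question)))
```

Keying on the label as well as the question means an answer supplied by one business line's operation terminal is not returned for a question asked under a different service label.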
  15. One or more computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving a current question input by a user and sent by a user terminal, and querying the stored current service label corresponding to the previous question;
    obtaining the thesaurus corresponding to the current service label, performing word segmentation on the current question to obtain a number of segmented words, matching the segmented words against the keywords in the thesaurus, and obtaining the number of successfully matched segmented words;
    when the number of successfully matched segmented words is less than a preset value, inputting the segmented words into a pre-trained slot word extraction model, so as to obtain, through the slot word extraction model, the word type corresponding to each of the segmented words;
    obtaining the standard text corresponding to the word type, matching the standard text against the keywords in the thesaurus, and determining the degree of relevance between the current question and the previous question according to the number of successfully matched standard texts;
    when the degree of relevance is lower than a preset value, determining, according to a preset rule, whether the current question is an unknown question; and, when the current question is an unknown question, outputting the unknown question in association with the current service label;
    clustering the output unknown questions; and
    sending the clustered unknown questions to the operation terminal corresponding to the service label.
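The claims do not name a clustering algorithm for the unknown questions. As one possible illustration only, the sketch below groups questions per service label with a greedy single-pass scheme based on Jaccard similarity of word sets; the technique, the `threshold` default, and the function names are assumptions swapped in for demonstration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_unknown(
    items: List[Tuple[str, str]],  # (service label, unknown question)
    threshold: float = 0.5,        # similarity cut-off; 0.5 is an assumption
) -> Dict[str, List[List[str]]]:
    """Group unknown questions per label; each cluster is a list of questions."""
    clusters: Dict[str, List[List[str]]] = defaultdict(list)
    for label, question in items:
        words = set(question.lower().split())
        for cluster in clusters[label]:
            # Compare against the first question of the cluster (its seed).
            seed_words = set(cluster[0].lower().split())
            if jaccard(words, seed_words) >= threshold:
                cluster.append(question)
                break
        else:
            # No sufficiently similar cluster under this label: start a new one.
            clusters[label].append([question])
    return dict(clusters)
```

Each label's clusters could then be sent to the operation terminal registered for that label, so operators see near-duplicate questions once rather than many times.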
  16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further cause the following step to be performed:
    when the number of successfully matched segmented words is greater than or equal to the preset value, determining the degree of relevance between the current question and the previously asked questions according to the number of successfully matched segmented words.
  17. The storage medium according to claim 16, wherein the determining, according to a preset rule, whether the current question is an unknown question, as implemented when the computer-readable instructions are executed by the processor, comprises:
    extracting a current convergence question from the convergence question library corresponding to the current service label, and sending the current convergence question to the user terminal for display;
    receiving a confirmation reply, returned by the user terminal, corresponding to the current convergence question, and matching, according to the confirmation reply, a business question corresponding to the service label; and
    when no business question corresponding to the service label is matched, marking the current question as an unknown question.
  18. The storage medium according to claim 17, wherein the extracting of the current convergence question from the convergence question library corresponding to the current service label, as implemented when the computer-readable instructions are executed by the processor, comprises:
    performing word segmentation on the current question to obtain representative words and the order in which the representative words appear;
    selecting, from the convergence question library corresponding to the current service label, initial questions corresponding to the representative words; and
    selecting, as the current convergence question, the initial question whose word order is consistent with the order in which the representative words appear.
  19. The storage medium according to claim 18, wherein, after the sending of the current convergence question to the user terminal for display, as implemented when the computer-readable instructions are executed by the processor, the method further comprises:
    receiving a denial reply, returned by the user terminal, corresponding to the current convergence question, and determining whether secondary representative words can be extracted from the denial reply;
    when no secondary representative words can be extracted from the denial reply, extracting the next convergence question from the convergence question library as the current convergence question, and continuing to send the current convergence question to the user terminal for display; and
    when the number of convergence questions sent to the user terminal reaches a preset value, or the user terminal has not received a reply corresponding to the current convergence question within a preset time period, determining that the current question is an unknown question, and outputting the unknown question in association with the current service label.
  20. The storage medium according to any one of claims 15 to 19, wherein, after the sending of the clustered unknown questions to the operation terminal corresponding to the service label, as implemented when the computer-readable instructions are executed by the processor, the steps further comprise:
    receiving the standard answer corresponding to the unknown question returned by the operation terminal, and storing the standard answer, the unknown question, and the service label corresponding to the unknown question in association;
    and wherein, after the receiving of the current question input by the user and sent by the user terminal and the querying of the stored current service label corresponding to the previous question, as implemented when the computer-readable instructions are executed by the processor, the steps further comprise:
    matching the current question against the stored unknown questions according to the current service label; and
    if the match succeeds, obtaining the standard answer corresponding to the unknown question and outputting it; otherwise, continuing to calculate, according to the current service label, the degree of relevance between the current question and the previously asked questions.
PCT/CN2020/105089 2020-02-11 2020-07-28 Method and apparatus for processing unknown question in intelligent questions and answers, computer device, and medium WO2021159670A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010087214.2 2020-02-11
CN202010087214.2A CN111309881A (en) 2020-02-11 2020-02-11 Method and device for processing unknown questions in intelligent question answering, computer equipment and medium

Publications (1)

Publication Number Publication Date
WO2021159670A1 (en) 2021-08-19

Family

ID=71145419

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105089 WO2021159670A1 (en) 2020-02-11 2020-07-28 Method and apparatus for processing unknown question in intelligent questions and answers, computer device, and medium

Country Status (2)

Country Link
CN (1) CN111309881A (en)
WO (1) WO2021159670A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111309881A (en) * 2020-02-11 2020-06-19 深圳壹账通智能科技有限公司 Method and device for processing unknown questions in intelligent question answering, computer equipment and medium
CN113779050A (en) * 2020-06-23 2021-12-10 北京沃东天骏信息技术有限公司 Method and device for managing knowledge base of customer service robot
CN111813915A (en) * 2020-07-21 2020-10-23 腾讯科技(深圳)有限公司 Message interaction method, device, equipment and computer readable storage medium
CN112131876A (en) * 2020-09-04 2020-12-25 交通银行股份有限公司太平洋信用卡中心 Method and system for determining standard problem based on similarity
CN113076431B (en) * 2021-04-28 2022-09-02 平安科技(深圳)有限公司 Question and answer method and device for machine reading understanding, computer equipment and storage medium
CN116932911B (en) * 2023-07-24 2023-12-15 山东翰林科技有限公司 Electric power knowledge question-answering assistant construction method based on ChatGPT

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776832A (en) * 2016-11-25 2017-05-31 上海智臻智能网络科技股份有限公司 Processing method, apparatus and system for question and answer interactive log
CN107656948A (en) * 2016-11-14 2018-02-02 平安科技(深圳)有限公司 The problem of in automatically request-answering system clustering processing method and device
US20180060421A1 (en) * 2016-08-26 2018-03-01 International Business Machines Corporation Query expansion
CN111309881A (en) * 2020-02-11 2020-06-19 深圳壹账通智能科技有限公司 Method and device for processing unknown questions in intelligent question answering, computer equipment and medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180060421A1 (en) * 2016-08-26 2018-03-01 International Business Machines Corporation Query expansion
CN107656948A (en) * 2016-11-14 2018-02-02 平安科技(深圳)有限公司 The problem of in automatically request-answering system clustering processing method and device
CN106776832A (en) * 2016-11-25 2017-05-31 上海智臻智能网络科技股份有限公司 Processing method, apparatus and system for question and answer interactive log
CN111309881A (en) * 2020-02-11 2020-06-19 深圳壹账通智能科技有限公司 Method and device for processing unknown questions in intelligent question answering, computer equipment and medium

Also Published As

Publication number Publication date
CN111309881A (en) 2020-06-19

Similar Documents

Publication Publication Date Title
WO2021159670A1 (en) Method and apparatus for processing unknown question in intelligent questions and answers, computer device, and medium
CN109446302B (en) Question-answer data processing method and device based on machine learning and computer equipment
WO2021068321A1 (en) Information pushing method and apparatus based on human-computer interaction, and computer device
WO2020077896A1 (en) Method and apparatus for generating question data, computer device, and storage medium
CN109960725B (en) Text classification processing method and device based on emotion and computer equipment
WO2020057022A1 (en) Associative recommendation method and apparatus, computer device, and storage medium
WO2019153522A1 (en) Intelligent interaction method, electronic device, and storage medium
WO2022142006A1 (en) Semantic recognition-based verbal skill recommendation method and apparatus, device, and storage medium
WO2020233131A1 (en) Question-and-answer processing method and apparatus, computer device and storage medium
WO2022227162A1 (en) Question and answer data processing method and apparatus, and computer device and storage medium
US20180225591A1 (en) Classifying unstructured computer text for complaint-specific interactions using rules-based and machine learning modeling
TW201917601A (en) User intention recognition method and device capable of recognizing user intention by acquiring dialogue text from a user
CN111324713B (en) Automatic replying method and device for conversation, storage medium and computer equipment
CN110135888B (en) Product information pushing method, device, computer equipment and storage medium
CN111783471B (en) Semantic recognition method, device, equipment and storage medium for natural language
CN112651236B (en) Method and device for extracting text information, computer equipment and storage medium
CN112925898B (en) Question-answering method and device based on artificial intelligence, server and storage medium
CN112632258A (en) Text data processing method and device, computer equipment and storage medium
WO2021164171A1 (en) Method and apparatus for processing data in knowledge base, and computer device and storage medium
CN116738476A (en) Safe interaction method and device based on large language model
CN110377618B (en) Method, device, computer equipment and storage medium for analyzing decision result
CN109460541A (en) Lexical relation mask method, device, computer equipment and storage medium
CN116628163A (en) Customer service processing method, customer service processing device, customer service processing equipment and storage medium
CN114493902A (en) Multi-mode information anomaly monitoring method and device, computer equipment and storage medium
CN112988704A (en) AI consultation database cluster building method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20918654

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 09/12/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20918654

Country of ref document: EP

Kind code of ref document: A1