WO2019153607A1

WO2019153607A1 - Intelligent response method, electronic device and storage medium

Info

Publication number: WO2019153607A1
Application number: PCT/CN2018/089882
Authority: WO
Inventors: 于凤英; 王健宗; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-02-09
Filing date: 2018-06-05
Publication date: 2019-08-15
Also published as: CN108345672A

Abstract

An intelligent response method, an electronic device and a storage medium, the method comprising: preprocessing a consultation question after acquiring the consultation question, and then constructing an inverted index for a question and answer knowledge base; querying a candidate question set related to the consultation question from the question and answer knowledge base by means of an inverted index query mode; calculating a question similarity between each candidate question in the candidate question set and the consultation question, the question similarity being obtained by means of linear weighting of a text similarity, a semantic similarity, a subject matter similarity and a syntax similarity between the consultation question and the corresponding candidate question; and finally, selecting the candidate question corresponding to the highest question similarity obtained by calculation, and querying an answer associated with the selected candidate question in the question and answer knowledge base to serve as a target answer to be outputted. Thus, the accuracy and response efficiency of intelligent responses may be increased, and the quality of service may be improved.

Description

Intelligent response method, electronic device and storage medium

This application claims priority to Chinese Patent Application No. 201101134579.9, entitled "Intelligent Response Method, Electronic Device, and Storage Medium", filed on February 9, 2018, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of computer technologies, and in particular, to an intelligent response method, an electronic device, and a storage medium.

Background

With the development of technology, artificial intelligence (AI) is gradually changing the way we live. Intelligent question answering is one example: when a customer makes an inquiry online by text or voice, an online intelligent customer service agent can answer automatically. Intelligent question answering can effectively reduce the time customers spend waiting for customer service and improve service quality, so it has broad prospects.

At present, intelligent response methods usually extract keywords from the customer's question, find an answer matching the keywords in a question-and-answer knowledge base, and output it to the customer. However, because natural language is rich and varied, it is difficult to accurately identify the true intention of the customer's question from keywords alone. Current intelligent response methods are therefore insufficiently accurate and usually require manual responses as a supplement, which to some extent consumes human resources and lowers service efficiency.

Summary of the invention

In view of the above reasons, it is necessary to provide an intelligent response method, an electronic device and a storage medium, which can improve the accuracy and response efficiency of the intelligent response, save human resources, and improve service quality.

To achieve the above objective, the present application provides an intelligent response method, which includes the following steps: an obtaining step: obtaining an input consultation question and pre-processing the consultation question, the pre-processing including performing word segmentation to obtain terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question; a constructing step: performing the pre-processing on each question and answer in a question-and-answer knowledge base, mapping each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base, and querying, by means of an inverted index query, a candidate question set related to the consultation question from the question-and-answer knowledge base, the question-and-answer knowledge base including a plurality of questions collated in advance and one or more answers associated with each question; a calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linear weighting of the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity; and a selecting step: selecting the candidate question corresponding to the highest calculated question similarity, querying the one or more answers associated with the selected candidate question in the question-and-answer knowledge base, and outputting, as the target answer, the associated answer that has been output most frequently within a preset time period.

Optionally, the method for calculating the text similarity between the consultation question and the corresponding candidate question comprises: counting a plurality of specified features between the consultation question and the candidate question, and performing a linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question; wherein the plurality of specified features include: the number a1 of keywords common to the consultation question and the candidate question; the length a2 of the keywords common to the consultation question and the candidate question; the number a3 of terms common to the consultation question and the candidate question; the length a4 of the terms common to the consultation question and the candidate question; the length a5 of the consultation question; and the length a6 of the candidate question.

To achieve the above objective, the present application further provides an electronic device including a memory and a processor, wherein the memory includes an intelligent response program which, when executed by the processor, implements the following steps: an obtaining step: obtaining an input consultation question and pre-processing the consultation question, the pre-processing including performing word segmentation to obtain terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question; a constructing step: performing the pre-processing on each question and answer in a question-and-answer knowledge base, mapping each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base, and querying, by means of an inverted index query, a candidate question set related to the consultation question from the question-and-answer knowledge base, the question-and-answer knowledge base including a plurality of questions collated in advance and one or more answers associated with each question; a calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linear weighting of the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity; and a selecting step: selecting the candidate question corresponding to the highest calculated question similarity, querying the one or more answers associated with the selected candidate question in the question-and-answer knowledge base, and outputting, as the target answer, the associated answer that has been output most frequently within a preset time period.

In addition, to achieve the above objective, the present application further provides a computer-readable storage medium including an intelligent response program which, when executed by a processor, implements any step of the intelligent response method described above.

In the intelligent response method proposed by the present application, after a consultation question is obtained, the consultation question is first pre-processed; an inverted index is then constructed for the question-and-answer knowledge base, and a candidate question set related to the consultation question is queried from the question-and-answer knowledge base by means of an inverted index query. For each candidate question in the candidate question set, a question similarity between the consultation question and the candidate question is calculated, the question similarity being obtained by linear weighting of the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question. Finally, the candidate question corresponding to the highest calculated question similarity is selected, the one or more answers associated with the selected candidate question are queried in the question-and-answer knowledge base, and the associated answer that has been output most frequently within a preset time period is output as the target answer. This can improve the accuracy and response efficiency of intelligent responses, save human resources, and improve service quality.

Brief description of the drawings

FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application;

FIG. 2 is a schematic diagram of interaction between an electronic device and a client according to a preferred embodiment of the present application;

FIG. 3 is a flowchart of a preferred embodiment of the intelligent response method of the present application;

FIG. 4 is a program module diagram of the intelligent response program of FIG. 1.

The implementation, functional features and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed description

The principles and spirit of the present application are described below with reference to a number of specific embodiments. It is understood that the specific embodiments described herein are merely illustrative of the application and are not intended to be limiting.

Those skilled in the art will appreciate that embodiments of the present application can be implemented as a method, apparatus, device, system, or computer program product. Accordingly, the application can be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or in a combination of hardware and software.

According to an embodiment of the present application, an intelligent response method, an electronic device, and a storage medium are proposed.

FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application.

The electronic device 1 may be a terminal device having a storage and computing function such as a server, a portable computer, or a desktop computer.

The electronic device 1 includes a memory 11, a processor 12, a network interface 13, and a communication bus 14. The network interface 13 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The communication bus 14 is used to implement connection communication between the above components.

The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory, or the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.

In the present embodiment, the readable storage medium of the memory 11 is generally used to store the smart response program 10, the question and answer knowledge base 4, and the like installed in the electronic device 1. The memory 11 can also be used to temporarily store data that has been output or is about to be output.

The processor 12, in some embodiments, may be a central processing unit (CPU), a microprocessor or another data processing chip, used to run program code or process data stored in the memory 11, for example to execute the intelligent response program 10.

FIG. 1 shows only the electronic device 1 having the components 11-14 and the intelligent response program 10, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

Optionally, the electronic device 1 may further include a user interface, and the user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or headphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 1 may further include a display, which may also be referred to as a display screen or a display unit. In some embodiments, it may be an LED display, a liquid crystal display, a touch liquid crystal display, or an organic light-emitting diode (OLED) display. The display is used to display information processed in the electronic device 1 and to display a visual user interface.

Optionally, the electronic device 1 further comprises a touch sensor. The area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area. Further, the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor but also a proximity type touch sensor or the like. Furthermore, the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. The user can activate the smart answering program 10 by touching the touch area.

In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, a display is stacked with the touch sensor to form a touch display. The device detects a user-triggered touch operation based on a touch screen display.

The electronic device 1 may further include a radio frequency (RF) circuit, a sensor, an audio circuit, and the like, and details are not described herein.

Referring to FIG. 2, it is a schematic diagram of the interaction between the electronic device 1 and the client 2 according to a preferred embodiment of the present application. The intelligent response program 10 runs in the electronic device 1. In FIG. 2, the preferred embodiment of the electronic device 1 is a server. The electronic device 1 is communicatively connected to the client 2 via a network 3. The client 2 can run on various types of terminal devices, such as smartphones and portable computers. After logging in to the electronic device 1 through the client 2, a user can input a consultation question to the intelligent response program 10; the intelligent response program 10 processes the consultation question using the intelligent response method, finds the corresponding target answer in the question-and-answer knowledge base 4, and returns the target answer to the client 2 via the network 3.

Referring to FIG. 3, it is a flowchart of a preferred embodiment of the intelligent response method of the present application. When the processor 12 of the electronic device 1 executes the intelligent response program 10 stored in the memory 11, the following steps of the intelligent response method are implemented:

Step S1: obtain an input consultation question and pre-process the consultation question, the pre-processing including performing word segmentation to obtain terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question.

Specifically, the consultation question may be, for example, "whether the office-model carbon crystal geothermal pad in the New Year gift package can start quickly". After word segmentation of the consultation question, the obtained terms are "New Year", "gift package", "in", "of", "office", "model", "carbon crystal", "geothermal pad", "whether", "can", "quickly" and "start". Longer terms, such as "gift package" and "geothermal pad", express the meaning of the consultation question better than shorter terms, such as "in" and "of". Therefore, step S1 may take the terms whose length after word segmentation is greater than a first threshold (for example, 4 characters) as long cut words, and perform part-of-speech tagging only on the long cut words. Taking the above consultation question as an example, the part-of-speech tagging results for the long cut words are, for example, "New Year gift package/adjective + noun" and "quick start/adverb + verb".

Step S1 may perform named entity recognition on the long cut words using a hidden Markov model, thereby identifying proper nouns with a specific meaning in the consultation question, such as "Microsoft Technology Group". Such proper nouns include, for example, person names, place names and organization names. In addition, step S1 may extract keywords from the long cut words using the TF-IDF (term frequency-inverse document frequency) algorithm. The main idea of the TF-IDF algorithm is that if a word or phrase appears frequently in one document but rarely in other documents, it is considered to have good class-distinguishing ability and is suitable for use as a keyword.
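The TF-IDF idea described above can be illustrated with a minimal sketch. The function name `tfidf_keywords`, the smoothed IDF formula, the `top_k` parameter and the toy corpus are assumptions made for illustration only and are not specified in this application.

```python
import math
from collections import Counter


def tfidf_keywords(doc_terms, corpus, top_k=2):
    """Rank the terms of one document by TF-IDF against a corpus.

    doc_terms: list of terms of the document to extract keywords from.
    corpus:    list of term lists, one per document (assumed to include doc_terms).
    """
    n_docs = len(corpus)
    tf = Counter(doc_terms)
    scores = {}
    for term, count in tf.items():
        # Document frequency: number of documents that contain the term.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed inverse document frequency (an assumed variant).
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = (count / len(doc_terms)) * idf
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


corpus = [
    ["gift", "pack", "start"],
    ["gift", "pad"],
    ["gift", "refund"],
]
print(tfidf_keywords(corpus[0], corpus))
```

In this toy corpus, "gift" occurs in every document and therefore scores lower than "pack" and "start", which occur in only one.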

Step S1 may perform sentence error correction on the consultation question using an N-gram language model and edit distance. The N-gram language model uses the collocation information between adjacent words in the context to calculate the most probable phrase collocations, thereby identifying wrong collocations in the sentence and providing several possible correct alternatives. The edit distance algorithm then calculates the editing cost of each alternative, so that the alternative with the lowest editing cost can be determined and adopted, thereby correcting the consultation question. For example, for "I want to find the song Hetang Moonlight as background music", sentence error correction can identify that "Hetang" should be changed to "荷塘".
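The combination of an N-gram model and edit distance can be sketched as follows, under simplifying assumptions: a bigram model built from raw counts, only the left-hand neighbour used as context, and alternatives drawn from a fixed vocabulary. The names `build_bigrams` and `correct_sentence` and the toy sentences are hypothetical and do not come from this application.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def build_bigrams(sentences):
    """Count adjacent word pairs in a list of tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        counts.update(zip(tokens, tokens[1:]))
    return counts


def correct_sentence(tokens, bigrams, vocab):
    """Replace a word whose collocation with the previous word was never
    observed by the lowest-edit-cost alternative that forms a known collocation."""
    out = list(tokens)
    for i in range(1, len(out)):
        if bigrams[(out[i - 1], out[i])] > 0:
            continue  # collocation is known; nothing to correct
        alternatives = [w for w in vocab if bigrams[(out[i - 1], w)] > 0]
        if alternatives:
            wrong = out[i]
            out[i] = min(alternatives,
                         key=lambda w: (edit_distance(wrong, w), w))
    return out


training = [["lotus", "pond", "moonlight"], ["play", "lotus", "pond"]]
bigrams = build_bigrams(training)
vocab = {w for sentence in training for w in sentence}
print(correct_sentence(["lotus", "pound", "moonlight"], bigrams, vocab))
```

A production system would normally use smoothed N-gram probabilities and both left and right context; the raw-count check here only keeps the sketch short.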

Step S2: perform the pre-processing on each question and answer in the question-and-answer knowledge base 4, and map each pre-processed question and answer into the inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base 4; and query, by means of an inverted index query, a candidate question set related to the consultation question from the question-and-answer knowledge base 4, the question-and-answer knowledge base 4 including a plurality of questions collated in advance and one or more answers associated with each question.

In step S2, the same pre-processing is performed on each question and answer in the question-and-answer knowledge base 4, yielding text feature information such as the terms, part-of-speech tags, named entities and keywords obtained by segmenting each question and answer. According to the text feature information, each question and answer is mapped into a preset inverted record table, with all questions and answers that share an entry mapped to that entry, thereby constructing the inverted index for the question-and-answer knowledge base 4. According to the text feature information obtained by pre-processing the consultation question, the candidate question set related to the consultation question can then be queried from the question-and-answer knowledge base 4 by means of an inverted index query. The candidate question set includes at least one candidate question, and because an inverted index query is used, each candidate question has a certain degree of association with the consultation question.
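The construction and querying of the inverted index described above can be sketched as follows. This is a minimal illustration that indexes questions by their terms only; the function names, the integer question identifiers and the toy knowledge base are assumptions and not part of this application.

```python
from collections import defaultdict


def build_inverted_index(questions):
    """Map each term (entry) to the set of question ids that contain it.

    questions: dict of question id -> list of pre-processed terms.
    """
    index = defaultdict(set)
    for qid, terms in questions.items():
        for term in terms:
            index[term].add(qid)
    return index


def query_candidates(index, query_terms):
    """Return ids of all questions sharing at least one term with the query."""
    hits = set()
    for term in query_terms:
        hits |= index.get(term, set())
    return hits


knowledge_base = {
    1: ["geothermal", "pad", "start"],
    2: ["gift", "package", "refund"],
    3: ["pad", "price"],
}
index = build_inverted_index(knowledge_base)
print(sorted(query_candidates(index, ["pad", "start"])))
```

Because candidates are found by looking up only the entries that occur in the consultation question, the remaining questions in the knowledge base are never scanned, which is what makes the later similarity calculation affordable.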

Step S3: for each candidate question in the candidate question set, calculate a question similarity between the consultation question and the candidate question, the question similarity being obtained by linear weighting of the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question. Because the comparison between the consultation question and a candidate question usually focuses more on textual and semantic similarity, the weights of the text similarity and the semantic similarity may be set greater than the weights of the topic similarity and the syntactic similarity.

The method for calculating the text similarity between the consulting question and the corresponding candidate question may include the following steps:

A plurality of specified features between the consulting question and the candidate question are counted, and the plurality of specified features are linearly weighted to obtain a text similarity between the consulting question and the corresponding candidate question.

Wherein the plurality of specified features include:

a1: the number of keywords common to the consultation question and the candidate question;

a2: the length of the keywords common to the consultation question and the candidate question;

a3: the number of terms common to the consultation question and the candidate question;

a4: the length of the terms common to the consultation question and the candidate question;

a5: the length of the consultation question;

a6: the length of the candidate question.

The linear weighting calculation on the plurality of specified features may be implemented using a multivariate logistic regression model. Specifically, the weight of each specified feature is first calculated using an inverse document frequency algorithm, whose idea is to calculate the importance of each specified feature in a preset large-scale corpus and thereby determine its weight. The multivariate logistic regression model is then used to perform a weighted regression fit on the plurality of specified features, giving the text similarity g(z) between the consultation question and the candidate question according to the following formula:

g(z) = 1 / (1 + e^(-z)), where e is the base of the natural logarithm;

z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6;

where x1, x2, ..., x6 are the weights of a1, a2, ..., a6, respectively.
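The text similarity calculation can be sketched as follows, using the standard logistic function g(z) = 1 / (1 + e^(-z)). How "length" is measured is not stated in this application; the sketch assumes it is the total number of characters, and the helper names `specified_features` and `text_similarity`, the weights and the toy inputs are likewise illustrative assumptions.

```python
import math


def specified_features(q_terms, c_terms, q_keywords, c_keywords):
    """Compute the six features a1..a6 for a consultation/candidate pair.

    Lengths are measured in characters (an assumption).
    """
    common_kw = set(q_keywords) & set(c_keywords)
    common_terms = set(q_terms) & set(c_terms)
    a1 = len(common_kw)                       # number of common keywords
    a2 = sum(len(k) for k in common_kw)       # length of common keywords
    a3 = len(common_terms)                    # number of common terms
    a4 = sum(len(t) for t in common_terms)    # length of common terms
    a5 = sum(len(t) for t in q_terms)         # length of consultation question
    a6 = sum(len(t) for t in c_terms)         # length of candidate question
    return [a1, a2, a3, a4, a5, a6]


def text_similarity(features, weights):
    """Logistic function applied to the weighted sum z = sum(a_i * x_i)."""
    z = sum(a * x for a, x in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


features = specified_features(
    q_terms=["floor", "pad", "start"],
    c_terms=["floor", "pad", "price"],
    q_keywords=["pad"],
    c_keywords=["pad", "price"],
)
weights = [0.5, 0.1, 0.3, 0.05, -0.02, -0.02]  # hypothetical fitted weights
print(features, text_similarity(features, weights))
```

The logistic function maps any weighted sum into the interval (0, 1), so the result can be combined directly with the cosine-based similarities used for the other three measures.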

In addition, the semantic similarity between the consultation question and the corresponding candidate question may be calculated as follows: use the word2vec algorithm to represent each term obtained by segmenting the consultation question as a word vector, and average the word vectors of the consultation question to obtain the sentence vector of the consultation question; use the word2vec algorithm to represent each term obtained by segmenting the candidate question as a word vector, and average the word vectors of the candidate question to obtain the sentence vector of the candidate question; and calculate the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question to obtain the semantic similarity between the consultation question and the candidate question.
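The averaging and cosine steps can be sketched as follows. To keep the example self-contained, a small hand-written embedding table stands in for vectors trained with word2vec; the table, its two-dimensional values and the function names are assumptions for illustration only.

```python
import math


def sentence_vector(terms, embeddings):
    """Average the word vectors of the terms that have an embedding.

    Returns an empty list if no term is covered by the embedding table.
    """
    vectors = [embeddings[t] for t in terms if t in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 2-dimensional word vectors standing in for word2vec output.
embeddings = {
    "pad": [1.0, 0.0],
    "start": [0.0, 1.0],
    "launch": [0.1, 0.9],
}
question = sentence_vector(["pad", "start"], embeddings)
candidate = sentence_vector(["pad", "launch"], embeddings)
print(cosine_similarity(question, candidate))
```

The same `cosine_similarity` helper would apply unchanged to the topic vectors and syntax vectors described for the other two similarity measures.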

The topic similarity between the consultation question and the corresponding candidate question may be calculated as follows: use the topic representation method of LDA linear discriminant analysis to construct a topic vector for the consultation question and a topic vector for the candidate question; and calculate the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question to obtain the topic similarity between the consultation question and the candidate question.

The syntactic similarity between the consultation question and the corresponding candidate question may be calculated as follows: use the LTP (Language Technology Platform) to analyze the syntax of the consultation question and of the candidate question to obtain a syntax vector for each; and calculate the cosine similarity between the syntax vector of the consultation question and the syntax vector of the candidate question to obtain the syntactic similarity between the consultation question and the candidate question.

To calculate the question similarity between the consultation question and the candidate question, step S3 may linearly weight the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question. Because the comparison between the consultation question and a candidate question usually focuses more on textual and semantic similarity, the weights of the text similarity and the semantic similarity are set greater than the weights of the topic similarity and the syntactic similarity when the four similarities are linearly weighted; for example, the weights of the text similarity and the semantic similarity may each be set to 3 times the weights of the topic similarity and the syntactic similarity.
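The linear weighting with the example 3:1 ratio can be sketched as follows. Normalizing the four weights so that they sum to 1 is an assumption made here so that the result stays in the same range as its inputs; the function name `question_similarity` and its `ratio` parameter are hypothetical.

```python
def question_similarity(text, semantic, topic, syntactic, ratio=3.0):
    """Linearly weight the four similarities.

    Text and semantic similarity each receive `ratio` times the weight of
    topic and syntactic similarity; the weights are normalized to sum to 1.
    """
    w_low = 1.0 / (2.0 * ratio + 2.0)   # weight of topic and syntactic
    w_high = ratio * w_low              # weight of text and semantic
    return w_high * (text + semantic) + w_low * (topic + syntactic)


# With ratio=3 the weights are 0.375, 0.375, 0.125 and 0.125.
print(question_similarity(text=0.8, semantic=0.9, topic=0.4, syntactic=0.5))
```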

Step S4: select the candidate question corresponding to the highest calculated question similarity, query the one or more answers associated with the selected candidate question in the question-and-answer knowledge base 4, and output, as the target answer, the associated answer that has been output most frequently within a preset time period.

The candidate question corresponding to the highest question similarity can be regarded as the candidate question most similar to the consultation question. Therefore, the answer associated with that candidate question in the question-and-answer knowledge base 4 can be used as the answer to the consultation question. If the selected candidate question has a plurality of associated answers, one of them may be selected as the target answer. Specifically, the associated answer with the highest output frequency within the preset time period may be selected as the target answer.
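The selection in step S4 can be sketched as follows. The output log of (timestamp, answer) pairs, the numeric `window` representing the preset time period, and the function name `select_target_answer` are assumptions for illustration; this application does not specify how output frequency is recorded.

```python
from collections import Counter


def select_target_answer(scores, answers, output_log, now, window):
    """Pick the best candidate question, then its most frequently output answer.

    scores:     dict of candidate question id -> question similarity.
    answers:    dict of candidate question id -> list of associated answers.
    output_log: list of (timestamp, answer) pairs for previously output answers.
    now/window: only log entries with now - timestamp <= window are counted.
    """
    best_question = max(scores, key=scores.get)
    candidates = answers[best_question]
    recent = Counter(
        answer for timestamp, answer in output_log
        if now - timestamp <= window and answer in candidates
    )
    # On a tie (or with no recent outputs) the first listed answer is used.
    return max(candidates, key=lambda answer: recent[answer])


scores = {"q1": 0.4, "q2": 0.9}
answers = {"q1": ["A"], "q2": ["B", "C"]}
log = [(95, "C"), (99, "C"), (98, "B"), (10, "B"), (11, "B")]
print(select_target_answer(scores, answers, log, now=100, window=10))
```

With a window of 10 time units only the three most recent log entries are counted, so answer "C" is chosen even though "B" was output more often overall.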

To make the output target answer more personable and give the customer a better experience, step S4 may also apply humanized retouching to the target answer before outputting it. Specifically, the pre-processing of the consultation question in step S1 further includes:

Comparing each term obtained by segmenting the consultation question with a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question includes positive vocabulary or negative vocabulary. Positive vocabulary is, for example, "happy wife", and negative vocabulary is, for example, "I want to complain".

The humanized retouching of the target answer in step S4 includes:

If the consultation question includes only positive vocabulary, the preset greeting corresponding to positive vocabulary, for example "I wish you happy", is obtained and combined with the target answer;

If the consultation question includes only negative vocabulary, the preset greeting corresponding to negative vocabulary, for example "very sorry", is obtained and combined with the target answer;

If the consultation question includes both positive vocabulary and negative vocabulary, or includes neither, the preset greeting corresponding to neutral vocabulary, for example "thanks for support", is obtained and combined with the target answer.
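The three cases above can be sketched as follows. The vocabularies, the way the greeting is combined with the answer (here, simply prefixed) and the function name `retouch_answer` are assumptions made for illustration; the greetings are the examples given above.

```python
def retouch_answer(terms, target_answer, positive, negative, greetings):
    """Prefix the target answer with a greeting chosen by the tone of the question.

    terms:     terms obtained by segmenting the consultation question.
    positive:  set of preset positive vocabulary.
    negative:  set of preset negative vocabulary.
    greetings: dict with keys "positive", "negative" and "neutral".
    """
    has_positive = any(t in positive for t in terms)
    has_negative = any(t in negative for t in terms)
    if has_positive and not has_negative:
        tone = "positive"
    elif has_negative and not has_positive:
        tone = "negative"
    else:
        tone = "neutral"  # both kinds present, or neither
    return f"{greetings[tone]}. {target_answer}"


greetings = {
    "positive": "I wish you happy",
    "negative": "Very sorry",
    "neutral": "Thanks for support",
}
positive = {"happy"}      # hypothetical positive vocabulary
negative = {"complain"}   # hypothetical negative vocabulary
print(retouch_answer(["complain", "pad"], "The pad starts within one minute.",
                     positive, negative, greetings))
```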

According to the intelligent response method provided by this embodiment, after a consultation question is obtained, the consultation question is first pre-processed; an inverted index is then constructed for the question-and-answer knowledge base 4, and a candidate question set related to the consultation question is queried from the question-and-answer knowledge base 4 by means of an inverted index query. For each candidate question in the candidate question set, a question similarity between the consultation question and the candidate question is then calculated, the question similarity being obtained by linear weighting of the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question. Finally, the candidate question corresponding to the highest calculated question similarity is selected, the one or more answers associated with the selected candidate question are queried in the question-and-answer knowledge base 4, and the associated answer that has been output most frequently within a preset time period is output as the target answer. This can improve the accuracy and response efficiency of intelligent responses, save human resources, and improve service quality.

Referring to FIG. 4, it is a program module diagram of the intelligent response program 10 in FIG. 1. In the present embodiment, the intelligent response program 10 is divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 12 to implement the present application. A module as referred to in this application is a series of computer program instructions capable of performing a particular function.

The intelligent response program 10 can be divided into: an acquisition module 110, a construction module 120, a calculation module 130, and a selection module 140.

The obtaining module 110 is configured to obtain an input consulting question, and perform pre-processing on the consulting question, where the pre-processing includes word segmentation to obtain each term, and each term is subjected to part-of-speech tagging and named entity identification, from each term. Extracting keywords and performing statement correction on the consulting question.

Specifically, the pre-processing of the consulting question by the obtaining module 110 may include: obtaining a term for the consulting question segmentation, using the term after the segmentation term length is greater than the first threshold as the long-cut word, and performing the long-cut word for the long-cut word Part-of-speech tagging, naming entity recognition of the long-cut word by hidden Markov model to identify proper nouns, extracting keywords from the long-cut words by TF-IDF algorithm, adopting N-gram language model and editing The distance is corrected for the query problem.

The construction module 120 is configured to perform the pre-processing on each question and answer in the question-and-answer knowledge base 4, map each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base 4, and query a candidate question set related to the consultation question from the question-and-answer knowledge base 4 by means of an inverted index query. The question-and-answer knowledge base 4 includes a plurality of questions organized in advance and one or more answers associated with each question.

Specifically, by performing the pre-processing on each question and answer in the question-and-answer knowledge base 4, the construction module 120 can obtain text feature information such as the terms, part-of-speech tags, named entities and keywords obtained by segmenting each question and answer. According to the text feature information, each question and answer is mapped into a preset inverted record table, with all questions and answers that share an entry mapped to that entry, thereby constructing the inverted index for the question-and-answer knowledge base 4. According to the text feature information obtained by pre-processing the consultation question, the candidate question set related to the consultation question may then be queried from the question-and-answer knowledge base 4 by means of an inverted index query.
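The construction and querying of the inverted index can be sketched as follows. This is a simplified illustration that indexes questions by their terms only; the function names and the sample data are hypothetical, and a real inverted record table would also carry the other text feature information mentioned above.

```python
from collections import defaultdict


def build_inverted_index(questions: dict[int, list[str]]) -> dict[str, set[int]]:
    """Map each term (entry) to the ids of all questions that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for question_id, terms in questions.items():
        for term in terms:
            index[term].add(question_id)
    return dict(index)


def query_candidates(index: dict[str, set[int]], query_terms: list[str]) -> set[int]:
    """Return ids of questions sharing at least one term with the query."""
    candidates: set[int] = set()
    for term in query_terms:
        candidates |= index.get(term, set())
    return candidates


# Hypothetical knowledge-base questions, already segmented into terms.
kb = {1: ["reset", "password"], 2: ["change", "password"], 3: ["open", "account"]}
index = build_inverted_index(kb)
print(query_candidates(index, ["forgot", "password"]))
```

Because only the entries that occur in the consultation question are looked up, the candidate set is obtained without scanning every question in the knowledge base.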

The calculation module 130 is configured to calculate, for each candidate question in the candidate question set, a question similarity between the consultation question and the candidate question. The question similarity is obtained by linearly weighting the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity.

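The calculation module 130 may obtain the text similarity between the consultation question and the corresponding candidate question by counting a plurality of specified features between the two and performing a linear weighting calculation on those features. The plurality of specified features includes:

The number a1 of keywords common to the consultation question and the candidate question;

The length a2 of the keywords common to the consultation question and the candidate question;

The number a3 of terms common to the consultation question and the candidate question;

The length a4 of the terms common to the consultation question and the candidate question;

The length a5 of the consultation question;

The length a6 of the candidate question.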

The linear weighting calculation on the plurality of specified features may be implemented with a multiple logistic regression model. Specifically, the calculation module 130 may calculate the weight of each specified feature with an inverse document frequency algorithm, the idea of which is to determine the weight of each specified feature from its degree of importance in a preset large-scale corpus. The calculation module 130 then performs a weighted regression fit on the plurality of specified features with the multiple logistic regression model to obtain the text similarity g(z) between the consultation question and the candidate question, according to the following formula:

g(z) = 1/(1 + e^(-z)), where e is the natural constant;

z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6;

where x1, x2, ..., x6 are the weights of a1, a2, ..., a6, respectively.
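The text similarity calculation can be sketched as below, assuming the standard logistic (sigmoid) form 1/(1 + e^(-z)), so that a larger weighted sum z gives a higher similarity. The function name and the sample feature and weight values are hypothetical; in the described method the weights x1 to x6 would come from the inverse document frequency calculation and the regression fit.

```python
import math


def text_similarity(features: list[float], weights: list[float]) -> float:
    """Logistic text similarity g(z) with z = a1*x1 + ... + a6*x6.

    features: a1..a6 (counts and lengths of common keywords and terms,
              and the lengths of the two questions)
    weights:  x1..x6 (assumed to be already fitted)
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(a * x for a, x in zip(features, weights))
    # Numerically stable evaluation of 1 / (1 + e^(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical feature values a1..a6 and weights x1..x6.
a = [2, 8, 3, 11, 14, 16]
x = [0.4, 0.05, 0.2, 0.02, -0.03, -0.03]
print(round(text_similarity(a, x), 4))
```

The logistic function maps any weighted sum into the interval (0, 1), which makes the result directly comparable with the cosine-based similarities used for the other three measures.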

In addition, the calculation module 130 may calculate the semantic similarity between the consultation question and the corresponding candidate question as follows: using the word2vec algorithm, each term obtained by segmenting the consultation question is represented as a word vector, and the word vectors of the consultation question are averaged to obtain a sentence vector of the consultation question; using the word2vec algorithm, each term obtained by segmenting the candidate question is represented as a word vector, and the word vectors of the candidate question are averaged to obtain a sentence vector of the candidate question; the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question is then calculated to obtain the semantic similarity between the consultation question and the candidate question.
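The averaging and cosine steps can be sketched as follows. The word vectors here are a small hand-written table standing in for vectors that would be produced by a trained word2vec model; the function names and the sample values are assumptions for illustration.

```python
import math


def sentence_vector(terms: list[str], word_vectors: dict[str, list[float]]) -> list[float]:
    """Average the word vectors of the terms; unknown terms are skipped."""
    vectors = [word_vectors[t] for t in terms if t in word_vectors]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is empty or zero."""
    if not u or not v:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy two-dimensional vectors standing in for word2vec output.
wv = {"reset": [1.0, 0.0], "change": [0.8, 0.2], "password": [0.0, 1.0]}
query = sentence_vector(["reset", "password"], wv)
candidate = sentence_vector(["change", "password"], wv)
print(cosine_similarity(query, candidate))
```

The same `cosine_similarity` helper would apply unchanged to the topic vectors and syntax vectors described in the following paragraphs, since those similarities are also defined as cosine similarities.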

The calculation module 130 may calculate the topic similarity between the consultation question and the corresponding candidate question as follows: using the topic expression method of LDA linear discriminant analysis, a topic vector of the consultation question and a topic vector of the candidate question are constructed; the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question is then calculated to obtain the topic similarity between the consultation question and the candidate question.

The calculation module 130 may calculate the syntactic similarity between the consultation question and the corresponding candidate question as follows: using the LTP language technology platform, the syntax of the consultation question and of the candidate question is analyzed to obtain a syntax vector of the consultation question and a syntax vector of the candidate question; the cosine similarity between the two syntax vectors is then calculated to obtain the syntactic similarity between the consultation question and the candidate question.

The calculation module 130 may obtain the question similarity between the consultation question and the candidate question by linearly weighting the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question. Since the comparison between a consultation question and a candidate question usually depends more on textual and semantic similarity, the weights of the text similarity and the semantic similarity are set greater than the weights of the topic similarity and the syntactic similarity when the four similarities are linearly weighted.
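The final linear weighting can be sketched as below. The default weight values are purely hypothetical; the application only requires that the text and semantic weights be greater than the topic and syntactic weights, which the sketch checks explicitly.

```python
def question_similarity(text: float, semantic: float, topic: float, syntactic: float,
                        weights: tuple[float, float, float, float] = (0.35, 0.35, 0.15, 0.15)) -> float:
    """Linearly weight the four similarities into one question similarity.

    weights = (w_text, w_semantic, w_topic, w_syntactic); the default values
    are illustrative assumptions, not values taken from the application.
    """
    w_text, w_semantic, w_topic, w_syntactic = weights
    if min(w_text, w_semantic) <= max(w_topic, w_syntactic):
        raise ValueError("text and semantic weights must exceed topic and syntactic weights")
    return (w_text * text + w_semantic * semantic
            + w_topic * topic + w_syntactic * syntactic)


print(question_similarity(0.8, 0.6, 0.4, 0.2))
```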

The selection module 140 is configured to select the candidate question with the highest calculated question similarity, query the question-and-answer knowledge base 4 for one or more associated answers of the selected candidate question, and output, as the target answer, the associated answer with the highest output frequency within a preset time period among the one or more associated answers.
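The selection logic can be sketched as follows. The data structures (a score per candidate question, a mapping from questions to answers, and a log of previously output answers with timestamps) and all names are assumptions chosen for illustration; the application does not specify how output frequencies are recorded.

```python
from collections import Counter
from datetime import datetime, timedelta


def select_target_answer(scores: dict[str, float],
                         answers_by_question: dict[str, list[str]],
                         output_log: list[tuple[str, datetime]],
                         now: datetime,
                         window: timedelta) -> str:
    """Pick the best-scoring candidate question, then return whichever of its
    associated answers was output most often within the preset time window."""
    best_question = max(scores, key=scores.get)
    associated = answers_by_question[best_question]
    since = now - window
    counts = Counter(answer for answer, when in output_log
                     if answer in associated and since <= when <= now)
    # On a tie (or with no history) the first associated answer is returned.
    return max(associated, key=lambda answer: counts[answer])


now = datetime(2018, 2, 9, 12, 0)
log = [("A", now - timedelta(hours=1)),
       ("B", now - timedelta(hours=2)),
       ("B", now - timedelta(hours=3)),
       ("A", now - timedelta(days=10))]
print(select_target_answer({"q1": 0.9, "q2": 0.4},
                           {"q1": ["A", "B"], "q2": ["C"]},
                           log, now, timedelta(days=1)))
```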

To make the output target answer more personable and give the customer a better experience, the selection module 140 may first polish the target answer before outputting it. Specifically, the pre-processing of the consultation question by the acquisition module 110 further includes: comparing each term obtained by segmenting the consultation question with a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question contains positive words or negative words.

Then, before the target answer is output, if the consultation question contains only positive words, the selection module 140 obtains the preset greeting corresponding to the positive vocabulary and combines it with the target answer;

If the consultation question contains only negative words, the selection module 140 obtains the preset greeting corresponding to the negative vocabulary and combines it with the target answer;

If the consultation question contains both positive words and negative words, or contains neither, the selection module 140 obtains the preset greeting corresponding to the neutral vocabulary and combines it with the target answer.
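The three cases above can be sketched as follows. The vocabularies, the greeting texts, and the choice to prepend the greeting to the answer are hypothetical; the application only states that the preset greeting is combined with the target answer.

```python
def choose_greeting(terms: list[str], positive_words: set[str],
                    negative_words: set[str], greetings: dict[str, str]) -> str:
    """Select the preset greeting according to the three cases described above."""
    has_positive = any(t in positive_words for t in terms)
    has_negative = any(t in negative_words for t in terms)
    if has_positive and not has_negative:
        tone = "positive"
    elif has_negative and not has_positive:
        tone = "negative"
    else:  # both kinds of words present, or neither
        tone = "neutral"
    return greetings[tone]


def polish_answer(greeting: str, target_answer: str) -> str:
    """Combine greeting and answer; prepending is an assumption of this sketch."""
    return f"{greeting} {target_answer}"


# Hypothetical vocabularies and preset greetings.
POSITIVE = {"thanks", "great"}
NEGATIVE = {"angry", "terrible"}
GREETINGS = {"positive": "Glad to help!",
             "negative": "We are sorry for the trouble.",
             "neutral": "Hello."}

greeting = choose_greeting(["terrible", "service"], POSITIVE, NEGATIVE, GREETINGS)
print(polish_answer(greeting, "Please restart the application."))
```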

In the operating environment diagram of the preferred embodiment of the electronic device 1 shown in FIG. 1, the memory 11, which includes a readable storage medium, may contain an operating system, the intelligent response program 10, and the question-and-answer knowledge base 4. When the processor 12 executes the intelligent response program 10 stored in the memory 11, the following steps are implemented:

Obtaining step: obtaining an input consultation question and pre-processing it, the pre-processing including segmenting the question into terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question;

Build step: performing the pre-processing on each question and answer in the question-and-answer knowledge base 4, mapping each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base 4, and querying a candidate question set related to the consultation question from the question-and-answer knowledge base 4 by means of an inverted index query, the question-and-answer knowledge base 4 including a plurality of questions organized in advance and one or more answers associated with each question;

Calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linearly weighting the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity;

Selecting step: selecting the candidate question with the highest calculated question similarity, querying the question-and-answer knowledge base 4 for one or more associated answers of the selected candidate question, and outputting, as the target answer, the associated answer with the highest output frequency within a preset time period among the one or more associated answers.

The pre-processing includes: segmenting the question into terms; taking each term whose length after segmentation is greater than a first threshold as a long segmented word; performing part-of-speech tagging on the long segmented words; performing named entity recognition on the long segmented words with a hidden Markov model to identify proper nouns; extracting keywords from the long segmented words with the TF-IDF algorithm; and performing sentence error correction on the consultation question using an N-gram language model and edit distance.
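The TF-IDF keyword extraction named above can be sketched as follows. The smoothed IDF formula, the `top_k` parameter and the sample corpus are assumptions of this sketch; the application does not state which TF-IDF variant is used.

```python
import math
from collections import Counter


def tfidf_keywords(terms: list[str], corpus: list[list[str]], top_k: int = 3) -> list[str]:
    """Rank the terms of one question by TF-IDF against a reference corpus.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one common
    variant and an assumption here.
    """
    term_counts = Counter(terms)
    n_docs = len(corpus)
    scores = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[term] = (count / len(terms)) * idf
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical corpus of already segmented questions.
corpus = [["how", "reset", "password"],
          ["how", "open", "account"],
          ["how", "change", "password"]]
print(tfidf_keywords(["how", "reset", "password"], corpus, top_k=2))
```

Terms that occur in every question of the corpus, such as "how" in this toy example, receive the lowest score and are therefore the last to be chosen as keywords.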

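The method for calculating the text similarity between the consultation question and the corresponding candidate question includes:

Counting a plurality of specified features between the consultation question and the candidate question, and performing a linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question;

Wherein the plurality of specified features includes:

The number a1 of keywords common to the consultation question and the candidate question;

The length a2 of the keywords common to the consultation question and the candidate question;

The number a3 of terms common to the consultation question and the candidate question;

The length a4 of the terms common to the consultation question and the candidate question;

The length a5 of the consultation question;

The length a6 of the candidate question.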

Performing the linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question includes:

Calculating the weight of each specified feature with an inverse document frequency algorithm, and performing a weighted regression fit on the plurality of specified features with a multiple logistic regression model to obtain the text similarity g(z) between the consultation question and the candidate question, according to the following formula:

g(z) = 1/(1 + e^(-z)), where e is the natural constant;

z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6;

where x1, x2, ..., x6 are the weights of a1, a2, ..., a6, respectively.

Furthermore, the method for calculating the semantic similarity between the consultation question and the corresponding candidate question includes:

Using the word2vec algorithm to represent each term obtained by segmenting the consultation question as a word vector, and averaging the word vectors of the consultation question to obtain a sentence vector of the consultation question;

Using the word2vec algorithm to represent each term obtained by segmenting the candidate question as a word vector, and averaging the word vectors of the candidate question to obtain a sentence vector of the candidate question;

Calculating the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question to obtain the semantic similarity between the consultation question and the candidate question;

The method for calculating the topic similarity between the consultation question and the corresponding candidate question includes:

Using the topic expression method of LDA linear discriminant analysis to construct a topic vector of the consultation question and a topic vector of the candidate question;

Calculating the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question to obtain the topic similarity between the consultation question and the candidate question;

The method for calculating the syntactic similarity between the consultation question and the corresponding candidate question includes:

Using the LTP language technology platform to analyze the syntax of the consultation question and of the candidate question to obtain a syntax vector of the consultation question and a syntax vector of the candidate question;

Calculating the cosine similarity between the syntax vector of the consultation question and the syntax vector of the candidate question to obtain the syntactic similarity between the consultation question and the candidate question.

In an embodiment, the pre-processing further includes:

Comparing each term obtained by segmenting the consultation question with a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question contains positive words or negative words;

Before the target answer is output, the steps further include:

If the consultation question contains only positive words, obtaining the preset greeting corresponding to the positive vocabulary and combining it with the target answer;

If the consultation question contains only negative words, obtaining the preset greeting corresponding to the negative vocabulary and combining it with the target answer;

If the consultation question contains both positive words and negative words, or contains neither, obtaining the preset greeting corresponding to the neutral vocabulary and combining it with the target answer.

For the specific principle, please refer to the program module diagram of the intelligent response program 10 in FIG. 4 and the flowchart of the preferred embodiment of the intelligent response method in FIG.

In addition, an embodiment of the present application further provides a computer readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like. The computer readable storage medium includes the question-and-answer knowledge base 4, the intelligent response program 10, and the like. When the intelligent response program 10 is executed by the processor 12, the obtaining step, build step, calculating step and selecting step described above are implemented, including the text similarity calculation over the specified features a1 to a6 with g(z) = 1/(1 + e^(-z)) and z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6, and, in an embodiment, the comparison with the positive and negative vocabularies during pre-processing and the combination of a preset greeting with the target answer before output.

The specific implementation manner of the computer readable storage medium of the present application is substantially the same as the above-mentioned intelligent response method and the specific embodiment of the electronic device 1, and details are not described herein again.

It is to be understood that the terms "comprises", "comprising", or any other variants thereof are intended to cover a non-exclusive inclusion, such that a process, device, article, or method that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that comprises the element.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such understanding, the part of the technical solution of the present application that is essential, or that contributes to the prior art, may be embodied in the form of a software product, which is stored in a storage medium as described above and includes a number of instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present application.

The above are only preferred embodiments of the present application and are not intended to limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims

An intelligent response method, characterized in that the method comprises the following steps:

Obtaining step: obtaining an input consultation question and pre-processing it, the pre-processing including segmenting the question into terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question;

Build step: performing the pre-processing on each question and answer in a question-and-answer knowledge base, mapping each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base, and querying a candidate question set related to the consultation question from the question-and-answer knowledge base by means of an inverted index query, the question-and-answer knowledge base including a plurality of questions organized in advance and one or more answers associated with each question;

Calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linearly weighting the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity;

Selecting step: selecting the candidate question with the highest calculated question similarity, querying the question-and-answer knowledge base for one or more associated answers of the selected candidate question, and outputting, as the target answer, the associated answer with the highest output frequency within a preset time period among the one or more associated answers.
The intelligent response method according to claim 1, wherein the method for calculating the text similarity between the consultation question and the corresponding candidate question comprises:

Counting a plurality of specified features between the consultation question and the candidate question, and performing a linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question;

Wherein the plurality of specified features includes:

The number a1 of keywords common to the consultation question and the candidate question;

The length a2 of the keywords common to the consultation question and the candidate question;

The number a3 of terms common to the consultation question and the candidate question;

The length a4 of the terms common to the consultation question and the candidate question;

The length a5 of the consultation question;

The length a6 of the candidate question.
The intelligent response method according to claim 2, wherein performing the linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question comprises:

Calculating the weight of each specified feature with an inverse document frequency algorithm, and performing a weighted regression fit on the plurality of specified features with a multiple logistic regression model to obtain the text similarity g(z) between the consultation question and the candidate question, according to the following formula:

g(z) = 1/(1 + e^(-z)), where e is the natural constant;

z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6;

where x1, x2, ..., x6 are the weights of a1, a2, ..., a6, respectively.
The intelligent response method according to claim 1, wherein the method for calculating the semantic similarity between the consultation question and the corresponding candidate question comprises:

Using the word2vec algorithm to represent each term obtained by segmenting the consultation question as a word vector, and averaging the word vectors of the consultation question to obtain a sentence vector of the consultation question;

Using the word2vec algorithm to represent each term obtained by segmenting the candidate question as a word vector, and averaging the word vectors of the candidate question to obtain a sentence vector of the candidate question;

Calculating the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question to obtain the semantic similarity between the consultation question and the candidate question;

The method for calculating the topic similarity between the consultation question and the corresponding candidate question comprises:

Using the topic expression method of LDA linear discriminant analysis to construct a topic vector of the consultation question and a topic vector of the candidate question;

Calculating the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question to obtain the topic similarity between the consultation question and the candidate question;

The method for calculating the syntactic similarity between the consultation question and the corresponding candidate question comprises:

Using the LTP language technology platform to analyze the syntax of the consultation question and of the candidate question to obtain a syntax vector of the consultation question and a syntax vector of the candidate question;

Calculating the cosine similarity between the syntax vector of the consultation question and the syntax vector of the candidate question to obtain the syntactic similarity between the consultation question and the candidate question.
The intelligent response method according to claim 1, wherein the pre-processing comprises: segmenting the question into terms; taking each term whose length after segmentation is greater than a first threshold as a long segmented word; performing part-of-speech tagging on the long segmented words; performing named entity recognition on the long segmented words with a hidden Markov model to identify proper nouns; extracting keywords from the long segmented words with the TF-IDF algorithm; and performing sentence error correction on the consultation question using an N-gram language model and edit distance.
The intelligent response method according to claim 1, wherein the pre-processing further comprises:

Comparing each term obtained by segmenting the consultation question with a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question contains positive words or negative words;

Before the target answer is output, the method further comprises:

If the consultation question contains only positive words, obtaining the preset greeting corresponding to the positive vocabulary and combining it with the target answer;

If the consultation question contains only negative words, obtaining the preset greeting corresponding to the negative vocabulary and combining it with the target answer;

If the consultation question contains both positive words and negative words, or contains neither, obtaining the preset greeting corresponding to the neutral vocabulary and combining it with the target answer.
The intelligent response method according to any one of claims 2 to 5, wherein the pre-processing further comprises:

Comparing each term obtained by segmenting the consultation question with a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question contains positive words or negative words;

Before the target answer is output, the method further comprises:

If the consultation question contains only positive words, obtaining the preset greeting corresponding to the positive vocabulary and combining it with the target answer;

If the consultation question contains only negative words, obtaining the preset greeting corresponding to the negative vocabulary and combining it with the target answer;

If the consultation question contains both positive words and negative words, or contains neither, obtaining the preset greeting corresponding to the neutral vocabulary and combining it with the target answer.
An electronic device comprising a memory and a processor, wherein the memory includes an intelligent response program which, when executed by the processor, implements the following steps:

Obtaining step: obtaining an input consultation question and pre-processing it, the pre-processing including segmenting the question into terms, performing part-of-speech tagging and named entity recognition on each term, extracting keywords from the terms, and performing sentence error correction on the consultation question;

Build step: performing the pre-processing on each question and answer in a question-and-answer knowledge base, mapping each pre-processed question and answer into an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base, and querying a candidate question set related to the consultation question from the question-and-answer knowledge base by means of an inverted index query, the question-and-answer knowledge base including a plurality of questions organized in advance and one or more answers associated with each question;

Calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linearly weighting the text similarity, semantic similarity, topic similarity and syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity;

Selecting step: selecting the candidate question with the highest calculated question similarity, querying the question-and-answer knowledge base for one or more associated answers of the selected candidate question, and outputting, as the target answer, the associated answer with the highest output frequency within a preset time period among the one or more associated answers.
The electronic device according to claim 8, wherein the method for calculating the text similarity between the consultation question and the corresponding candidate question comprises:

Counting a plurality of specified features between the consultation question and the candidate question, and performing a linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question;

Wherein the plurality of specified features includes:

The number a1 of keywords common to the consultation question and the candidate question;

The length a2 of the keywords common to the consultation question and the candidate question;

The number a3 of terms common to the consultation question and the candidate question;

The length a4 of the terms common to the consultation question and the candidate question;

The length a5 of the consultation question;

The length a6 of the candidate question.
The electronic device according to claim 9, wherein performing the linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question comprises:

Calculating the weight of each specified feature with an inverse document frequency algorithm, and performing a weighted regression fit on the plurality of specified features with a multiple logistic regression model to obtain the text similarity g(z) between the consultation question and the candidate question, according to the following formula:

g(z) = 1/(1 + e^(-z)), where e is the natural constant;

z = a1*x1 + a2*x2 + a3*x3 + a4*x4 + a5*x5 + a6*x6;

where x1, x2, ..., x6 are the weights of a1, a2, ..., a6, respectively.
The electronic device according to claim 8, wherein calculating the semantic similarity between the consultation question and the corresponding candidate question comprises:

Representing each term obtained by segmenting the consultation question as a word vector using the word2vec algorithm, and averaging the word vectors of the consultation question to obtain a sentence vector of the consultation question;

Representing each term obtained by segmenting the candidate question as a word vector using the word2vec algorithm, and averaging the word vectors of the candidate question to obtain a sentence vector of the candidate question;

Calculating the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question to obtain the semantic similarity between the consultation question and the candidate question;

Calculating the topic similarity between the consultation question and the corresponding candidate question comprises:

Constructing a topic vector of the consultation question and a topic vector of the candidate question using the LDA (latent Dirichlet allocation) topic representation method;

Calculating the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question to obtain the topic similarity between the consultation question and the candidate question;

Calculating the syntactic similarity between the consultation question and the corresponding candidate question comprises:

Analyzing the syntax of the consultation question and of the candidate question using LTP (Language Technology Platform) to obtain a syntax vector of the consultation question and a syntax vector of the candidate question;

Calculating the cosine similarity between the syntax vector of the consultation question and the syntax vector of the candidate question to obtain the syntactic similarity between the consultation question and the candidate question.
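As a rough illustration of the semantic-similarity steps above, the following sketch averages word vectors into a sentence vector and compares two sentence vectors by cosine similarity. The three-dimensional `EMBEDDINGS` table is a hypothetical stand-in for vectors that a trained word2vec model would supply; the helper names are likewise assumptions.

```python
import math

# Toy word vectors (hypothetical); a trained word2vec model would supply these.
EMBEDDINGS = {
    "reset": [0.9, 0.1, 0.0],
    "change": [0.8, 0.2, 0.1],
    "password": [0.1, 0.9, 0.2],
    "weather": [0.0, 0.1, 0.9],
}


def sentence_vector(terms, embeddings=EMBEDDINGS):
    """Average the vectors of the known terms; None if no term is known."""
    known = [embeddings[t] for t in terms if t in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


query = sentence_vector(["reset", "password"])
close = sentence_vector(["change", "password"])
far = sentence_vector(["weather"])
semantic_close = cosine_similarity(query, close)
semantic_far = cosine_similarity(query, far)
```

The same `cosine_similarity` helper would apply unchanged to topic vectors and syntax vectors, since the claim compares all three kinds of vector in the same way.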
The electronic device according to claim 8, wherein the preprocessing comprises: performing word segmentation to obtain individual terms; taking each term whose length after segmentation is greater than a first threshold as a long segmented word; performing part-of-speech tagging on the long segmented words; performing named entity recognition on the long segmented words using a hidden Markov model to identify proper nouns; extracting keywords from the long segmented words using the TF-IDF algorithm; and performing sentence error correction on the consultation question using an N-gram language model and edit distance.
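The edit-distance part of the error correction described above can be sketched as follows. This is a simplified illustration only: it omits the N-gram language model that the claim also recites, and the names `edit_distance`, `correct_term`, and the `max_distance` cutoff are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_term(term, vocabulary, max_distance=2):
    """Return the closest vocabulary word within max_distance, else the term."""
    if not vocabulary or term in vocabulary:
        return term
    # Break ties alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda w: (edit_distance(term, w), w))
    return best if edit_distance(term, best) <= max_distance else term


vocabulary = {"password", "premium", "policy"}
fixed = correct_term("pasword", vocabulary)
```

In a fuller pipeline, an N-gram model would rank several nearby candidates by how well each fits the surrounding words instead of taking the single closest word.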
The electronic device according to claim 8, wherein the preprocessing further comprises:

Comparing each term obtained by segmenting the consultation question against a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question includes positive or negative vocabulary;

Before the target answer is output, the following is also performed:

If the consultation question includes only positive vocabulary, a preset greeting corresponding to positive vocabulary is obtained and combined with the target answer;

If the consultation question includes only negative vocabulary, a preset greeting corresponding to negative vocabulary is obtained and combined with the target answer;

If the consultation question includes both positive and negative vocabulary, or includes neither, a preset greeting corresponding to neutral vocabulary is obtained and combined with the target answer.
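The three-way branching in the claim above can be sketched as below. The word lists, greeting strings, and function names are all hypothetical placeholders; the application does not specify them.

```python
# Hypothetical vocabularies and greetings, for illustration only.
POSITIVE = {"thanks", "great", "happy"}
NEGATIVE = {"angry", "terrible", "slow"}
GREETINGS = {
    "positive": "Glad to help!",
    "negative": "Sorry for the trouble.",
    "neutral": "Hello.",
}


def classify_tone(terms):
    """Return 'positive', 'negative', or 'neutral' for a segmented question."""
    has_positive = any(t in POSITIVE for t in terms)
    has_negative = any(t in NEGATIVE for t in terms)
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    # Both kinds present, or neither: fall back to the neutral greeting.
    return "neutral"


def compose_reply(terms, target_answer):
    """Prefix the target answer with the greeting that matches the tone."""
    return f"{GREETINGS[classify_tone(terms)]} {target_answer}"


reply = compose_reply(["thanks", "reset", "password"], "Go to Settings.")
```

Treating mixed and empty cases alike keeps the reply from guessing a mood when the signals conflict.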
The electronic device according to any one of claims 9 to 12, wherein the preprocessing further comprises:

Comparing each term obtained by segmenting the consultation question against a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question includes positive or negative vocabulary;

Before the target answer is output, the following is also performed:

If the consultation question includes only positive vocabulary, a preset greeting corresponding to positive vocabulary is obtained and combined with the target answer;

If the consultation question includes only negative vocabulary, a preset greeting corresponding to negative vocabulary is obtained and combined with the target answer;

If the consultation question includes both positive and negative vocabulary, or includes neither, a preset greeting corresponding to neutral vocabulary is obtained and combined with the target answer.
A computer-readable storage medium, characterized in that the computer-readable storage medium comprises an intelligent response program which, when executed by a processor, implements the following steps:

Obtaining step: obtaining an input consultation question and preprocessing it, the preprocessing including word segmentation to obtain individual terms, part-of-speech tagging and named entity recognition on each term, keyword extraction from the terms, and sentence error correction on the consultation question;

Construction step: preprocessing each question and answer in the question-and-answer knowledge base and mapping each preprocessed question and answer to an inverted record table, thereby constructing an inverted index for the question-and-answer knowledge base, and querying, by means of an inverted-index query, a candidate question set related to the consultation question from the question-and-answer knowledge base, the question-and-answer knowledge base including a plurality of pre-organized questions and one or more answers associated with each question;

Calculating step: for each candidate question in the candidate question set, calculating a question similarity between the consultation question and the candidate question, the question similarity being obtained by linear weighting of the text similarity, the semantic similarity, the topic similarity, and the syntactic similarity between the consultation question and the corresponding candidate question, wherein the weights of the text similarity and the semantic similarity are greater than the weights of the topic similarity and the syntactic similarity;

Selecting step: selecting the candidate question with the highest calculated question similarity, querying the question-and-answer knowledge base for the one or more answers associated with the selected candidate question, and outputting, as the target answer, the associated answer with the highest output frequency within a preset time period.
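The inverted-index lookup and the frequency-based answer selection in the claim above can be sketched as follows. The data layout (question ids mapped to term lists, an `output_log` list already restricted to the preset time period) and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(questions):
    """Map each term to the set of question ids whose text contains it."""
    index = defaultdict(set)
    for qid, terms in questions.items():
        for term in terms:
            index[term].add(qid)
    return index


def query_candidates(index, query_terms):
    """Return ids of all questions sharing at least one term with the query."""
    candidates = set()
    for term in query_terms:
        candidates |= index.get(term, set())
    return candidates


def most_frequent_answer(output_log, associated_answers):
    """Pick the associated answer that appears most often in output_log.

    output_log is assumed to hold only answers output within the preset
    time period; ties go to the earlier entry in associated_answers.
    """
    counts = Counter(a for a in output_log if a in associated_answers)
    return max(associated_answers, key=lambda a: counts[a])


# Hypothetical knowledge-base questions, already segmented into terms.
questions = {
    1: ["reset", "password"],
    2: ["change", "phone", "number"],
    3: ["forgot", "password"],
}
index = build_inverted_index(questions)
candidates = query_candidates(index, ["password", "reset"])
target = most_frequent_answer(["A", "B", "B", "C"], ["A", "B"])
```

Only the candidates returned by the index lookup need to be scored for similarity, which is what keeps the later calculating step cheap on a large knowledge base.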
The computer-readable storage medium according to claim 15, wherein calculating the text similarity between the consultation question and the corresponding candidate question comprises:

Counting a plurality of specified features between the consultation question and the candidate question, and performing a linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question;

wherein the plurality of specified features include:

The number a1 of keywords common to the consultation question and the candidate question;

The length a2 of the keywords common to the consultation question and the candidate question;

The number a3 of terms common to the consultation question and the candidate question;

The length a4 of the terms common to the consultation question and the candidate question;

The length a5 of the consultation question;

The length a6 of the candidate question.
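A minimal sketch of how the six features listed above might be counted from segmented input. It assumes that "length" means a character count and that each shared keyword or term is counted once; neither detail is stated in the claim, and the function name `overlap_features` is hypothetical.

```python
def overlap_features(q_terms, q_keywords, c_terms, c_keywords):
    """Compute a1..a6 for a consultation question (q) and a candidate (c).

    Assumptions: lengths are character counts, and each shared keyword
    or term contributes once regardless of how often it occurs.
    """
    common_keywords = set(q_keywords) & set(c_keywords)
    common_terms = set(q_terms) & set(c_terms)
    return {
        "a1": len(common_keywords),                  # shared keyword count
        "a2": sum(len(k) for k in common_keywords),  # shared keyword length
        "a3": len(common_terms),                     # shared term count
        "a4": sum(len(t) for t in common_terms),     # shared term length
        "a5": sum(len(t) for t in q_terms),          # consultation length
        "a6": sum(len(t) for t in c_terms),          # candidate length
    }


feats = overlap_features(
    q_terms=["how", "reset", "password"],
    q_keywords=["reset", "password"],
    c_terms=["reset", "my", "password"],
    c_keywords=["reset", "password"],
)
```

The resulting values can be passed, in the order a1 to a6, to whatever weighted model produces the final text similarity.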
The computer-readable storage medium according to claim 16, wherein performing the linear weighting calculation on the plurality of specified features to obtain the text similarity between the consultation question and the corresponding candidate question comprises:

Calculating the weight of each specified feature using an inverse document frequency algorithm, and performing a weighted regression fit on the plurality of specified features using a multiple logistic regression model, to obtain the text similarity g(z) between the consultation question and the candidate question according to the following formulas:

g(z) = 1/(1 + e^{-z}), where e is the natural constant;

z=a1*x1+a2*x2+a3*x3+a4*x4+a5*x5+a6*x6;

where x1, x2, ..., x6 are the weights of the features a1, a2, ..., a6, respectively.
The computer-readable storage medium according to claim 15, wherein calculating the semantic similarity between the consultation question and the corresponding candidate question comprises:

Representing each term obtained by segmenting the consultation question as a word vector using the word2vec algorithm, and averaging the word vectors of the consultation question to obtain a sentence vector of the consultation question;

Representing each term obtained by segmenting the candidate question as a word vector using the word2vec algorithm, and averaging the word vectors of the candidate question to obtain a sentence vector of the candidate question;

Calculating the cosine similarity between the sentence vector of the consultation question and the sentence vector of the candidate question to obtain the semantic similarity between the consultation question and the candidate question;

Calculating the topic similarity between the consultation question and the corresponding candidate question comprises:

Constructing a topic vector of the consultation question and a topic vector of the candidate question using the LDA (latent Dirichlet allocation) topic representation method;

Calculating the cosine similarity between the topic vector of the consultation question and the topic vector of the candidate question to obtain the topic similarity between the consultation question and the candidate question;

Calculating the syntactic similarity between the consultation question and the corresponding candidate question comprises:

Analyzing the syntax of the consultation question and of the candidate question using LTP (Language Technology Platform) to obtain a syntax vector of the consultation question and a syntax vector of the candidate question;

Calculating the cosine similarity between the syntax vector of the consultation question and the syntax vector of the candidate question to obtain the syntactic similarity between the consultation question and the candidate question.
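Once the text, semantic, topic, and syntactic similarities are available, the linear weighting recited in claim 15 can be sketched as below. The default weights are hypothetical; the only constraint taken from the claims is that the text and semantic weights exceed the topic and syntactic weights.

```python
def question_similarity(text, semantic, topic, syntactic,
                        weights=(0.35, 0.35, 0.15, 0.15)):
    """Linearly combine the four similarities into one question similarity.

    The default weights are illustrative. The check enforces the claimed
    ordering: text and semantic weights exceed topic and syntactic weights.
    """
    w_text, w_sem, w_topic, w_syn = weights
    if min(w_text, w_sem) <= max(w_topic, w_syn):
        raise ValueError("text/semantic weights must exceed topic/syntactic")
    return (w_text * text + w_sem * semantic
            + w_topic * topic + w_syn * syntactic)


def best_candidate(scores):
    """Return the candidate whose combined question similarity is highest."""
    return max(scores, key=lambda q: question_similarity(*scores[q]))


# Hypothetical per-candidate (text, semantic, topic, syntactic) scores.
scores = {
    "How do I reset my password?": (0.9, 0.8, 0.6, 0.5),
    "What is the weather today?": (0.1, 0.2, 0.3, 0.4),
}
winner = best_candidate(scores)
```

Weighting text and semantic similarity more heavily means that surface overlap and meaning dominate the ranking, while topic and syntax act mainly as tie-breakers.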
The computer-readable storage medium according to claim 15, wherein the preprocessing comprises: performing word segmentation to obtain individual terms; taking each term whose length after segmentation is greater than a first threshold as a long segmented word; performing part-of-speech tagging on the long segmented words; performing named entity recognition on the long segmented words using a hidden Markov model to identify proper nouns; extracting keywords from the long segmented words using the TF-IDF algorithm; and performing sentence error correction on the consultation question using an N-gram language model and edit distance.
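The TF-IDF keyword extraction mentioned in the claim above can be illustrated with a small self-contained sketch. The smoothed IDF formula, the `top_k` parameter, and the toy corpus are assumptions; the application does not specify which TF-IDF variant is used.

```python
import math
from collections import Counter


def tfidf_keywords(doc_terms, corpus, top_k=2):
    """Return the top_k terms of doc_terms ranked by TF-IDF score.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one common
    variant; ties are broken alphabetically for a deterministic result.
    """
    term_counts = Counter(doc_terms)
    n_docs = len(corpus)
    scores = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[term] = (count / len(doc_terms)) * idf
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


# Hypothetical segmented questions standing in for the knowledge base.
corpus = [
    ["how", "reset", "password"],
    ["how", "change", "phone"],
    ["how", "cancel", "policy"],
]
keywords = tfidf_keywords(["how", "reset", "password"], corpus)
```

A term such as "how" that appears in every question receives the lowest score, so the extracted keywords are the terms that best distinguish one question from the others.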
The computer-readable storage medium according to claim 15, wherein the preprocessing further comprises:

Comparing each term obtained by segmenting the consultation question against a preset positive vocabulary and a preset negative vocabulary to determine whether the consultation question includes positive or negative vocabulary;

Before the target answer is output, the following is also performed:

If the consultation question includes only positive vocabulary, a preset greeting corresponding to positive vocabulary is obtained and combined with the target answer;

If the consultation question includes only negative vocabulary, a preset greeting corresponding to negative vocabulary is obtained and combined with the target answer;

If the consultation question includes both positive and negative vocabulary, or includes neither, a preset greeting corresponding to neutral vocabulary is obtained and combined with the target answer.