CN108829757B - Intelligent service method, server and storage medium for chat robot - Google Patents

Info

Publication number
CN108829757B
CN108829757B (application CN201810520636.7A)
Authority
CN
China
Prior art keywords
natural language
language information
result
information
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810520636.7A
Other languages
Chinese (zh)
Other versions
CN108829757A (en)
Inventor
蒋健波
兰俊杰
Current Assignee
Guangzhou I Mybest Network Technology Co ltd
Original Assignee
Guangzhou I Mybest Network Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou I Mybest Network Technology Co ltd filed Critical Guangzhou I Mybest Network Technology Co ltd
Priority to CN201810520636.7A priority Critical patent/CN108829757B/en
Publication of CN108829757A publication Critical patent/CN108829757A/en
Application granted granted Critical
Publication of CN108829757B publication Critical patent/CN108829757B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an intelligent service method, a server and a storage medium for a chat robot. Natural language information input by a user is received; information retrieval matching is performed on the natural language information in a preset knowledge base to obtain a first candidate result; semantic parsing is performed on the natural language information; a second candidate result is generated by a natural language generation (NLG) seq2seq model based on the semantic parsing result and the first candidate result; the first and second candidate results are prioritized, and a final semantic reply is output according to the priority ranking result. The natural language input by the user can thus be accurately parsed, its true meaning identified, and an accurate and appropriate reply given, improving user satisfaction.

Description

Intelligent service method, server and storage medium for chat robot
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an intelligent service method, a server and a storage medium of a chat robot.
Background
With the continuous development of artificial intelligence technology, the demand for convenient, fast, efficient and accurate intelligent services keeps growing. Natural language, as the most convenient and natural way for humans to express their thoughts, has gradually become the mainstream mode of human-computer interaction in the field of intelligent services. Taking customer service as an example, when people want to activate a data package through a short-message platform, they tend to write messages in natural language, such as "I want to activate a 5 GB data package", rather than memorize a long string of complex character codes. Because natural language is open-ended and casual, with many ways of expressing the same thing, parsing its semantics so as to identify its true meaning and give an accurate and appropriate reply is crucial for providing intelligent services.
Disclosure of Invention
Based on this, there is a need for an intelligent service method for a chat robot, which includes:
receiving natural language information input by a user;
performing information retrieval matching on the natural language information in a preset knowledge base to obtain a first candidate result;
performing semantic analysis on the natural language information;
generating a second candidate result by using a natural language generation (NLG) seq2seq model, based on the result of the semantic parsing and the first candidate result;
and performing priority ranking on the first candidate result and the second candidate result, and outputting a final semantic reply according to a priority ranking result.
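Although the invention does not prescribe code, the five steps above can be sketched as a minimal pipeline. Every name here (the one-entry toy knowledge base, `retrieve_candidates`, `generate_candidate`, and the trivial ranking rule) is a hypothetical illustration, not the patented implementation:

```python
# Toy sketch of the claimed five-step pipeline; every name and the one-entry
# knowledge base below are hypothetical illustrations.
KNOWLEDGE_BASE = {
    "how do i open a data package": "Reply DATA5 to subscribe to the 5 GB package.",
}

def retrieve_candidates(query: str) -> list:
    """Step 2: information retrieval matching against the preset knowledge base."""
    q = set(query.lower().split())
    return [a for question, a in KNOWLEDGE_BASE.items()
            if q & set(question.split())]

def parse_semantics(query: str) -> dict:
    """Step 3: a trivial stand-in for real semantic parsing."""
    return {"tokens": query.lower().split()}

def generate_candidate(parse: dict, retrieved: list) -> str:
    """Step 4: a stand-in for the NLG seq2seq generator."""
    return retrieved[0] if retrieved else "Sorry, could you rephrase that?"

def answer(query: str) -> str:
    retrieved = retrieve_candidates(query)            # first candidate result
    generated = generate_candidate(parse_semantics(query), retrieved)
    ranked = retrieved + [generated]                  # step 5: retrieved first
    return ranked[0]

print(answer("I want to open a data package"))
```

The only design point carried over from the claims is the ordering: both candidate sources feed one ranked list, and the reply is read off the top of that list.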
In one embodiment, the natural language information includes voice information and text information;
when the natural language information is voice information, after receiving the natural language information input by the user, the method further includes:
and recognizing the voice information as text information.
In one embodiment, the retrieving and matching the natural language information in a preset knowledge base to obtain a first candidate result includes:
determining a task type of the natural language information;
when the task type is a question-answer type, the natural language information is subjected to information retrieval matching in a preset knowledge base;
and generating a first candidate result corresponding to the natural language information according to the information retrieval matching result.
In one embodiment, the determining the task type of the natural language information includes:
performing relevance analysis on the natural language information and each task type by adopting a machine learning method to obtain relevance values of the natural language information and each task type;
and determining one or more task types with the highest correlation value with the natural language information as the task types of the natural language information.
In one embodiment, the semantic parsing the natural language information includes:
preprocessing the natural language information;
performing semantic analysis on the preprocessed natural language information under various preset semantic scenes to obtain a plurality of semantic analysis results;
ranking the plurality of semantic parsing results according to a ranking model obtained by pre-training;
and selecting a semantic parsing result meeting a preset condition from the ranking as the final semantic parsing result of the natural language information.
In one embodiment, the generating a second candidate result using a natural language generation (NLG) seq2seq model based on the result of the semantic parsing and the first candidate result comprises:
performing intention identification, entity extraction and intention ranking on the preprocessed natural language information by using an RCNN algorithm;
and generating a second candidate result in combination with the first candidate result, context understanding and inference.
In one embodiment, the prioritizing the first candidate result and the second candidate result and outputting the final semantic reply according to the prioritized result includes:
selecting a preset number of candidate results from the ranking as a final semantic reply output of the natural language information; or selecting a candidate result with a ranking score larger than a preset threshold value from the ranking as a final semantic reply output of the natural language information.
In one embodiment, if the output final semantic answer is the second candidate result, the natural language information and the corresponding second candidate result are saved in a preset knowledge base.
The present invention also provides a server comprising a memory, a processor, and an intelligent service program of the chat robot stored on the memory and executable on the processor, the program being configured to implement the steps of the intelligent service method of the chat robot described above.
The invention also provides a storage medium storing an intelligent service program of the chat robot, which, when executed by a processor, implements the steps of the intelligent service method of the chat robot described above.
The invention provides an intelligent service method, a server and a storage medium for a chat robot. Natural language information input by a user is received; the natural language information is matched by information retrieval in a preset knowledge base to obtain a first candidate result; semantic parsing is performed on the natural language information; a second candidate result is generated by a natural language generation (NLG) seq2seq model based on the semantic parsing result and the first candidate result; the first and second candidate results are prioritized, and a final semantic reply is output according to the priority ranking result. The natural language input by the user can thus be accurately parsed, its true meaning identified, and an accurate and appropriate reply given, improving user satisfaction.
Drawings
FIG. 1 is a schematic diagram of a server architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating an embodiment of an intelligent service method of a chat robot according to the present invention;
FIG. 3 is a schematic view of a sub-division flow of step S20 in FIG. 2 according to an embodiment of the present invention;
FIG. 4 is a schematic view of a subdivision flow of step S201 in FIG. 3 according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a subdivision process of step S30 in FIG. 2 according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart diagram illustrating an intelligent service method for a chat robot in accordance with another embodiment of the present invention;
FIG. 7 is a schematic diagram of the bidirectional cycle architecture of the RCNN in accordance with an embodiment of the present invention;
FIG. 8 is a block diagram illustrating an entity identification and extraction process based on BiLSTM-CRF according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Unless the context clearly dictates otherwise, the elements and components of the present invention may be present in single or multiple forms, and the invention is not limited in this regard. Although the steps are labelled with reference numbers, their order is not limited; the relative order of the steps may be adjusted unless an order is explicitly stated or a step requires the prior execution of another step. The term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1, fig. 1 is a schematic diagram of a server structure of a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the server may include: a processor 1001 (e.g. a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication between these components. The user interface 1003 may include a display screen and, optionally, a standard wired interface and a wireless interface; in the present invention, the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory (e.g., a magnetic disk memory); alternatively, the memory 1005 may be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the architecture shown in FIG. 1 does not constitute a limitation of a server, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and an intelligent service program of the chat robot.
In the server shown in fig. 1, the network interface 1004 is mainly used for connecting each network device and performing data communication with each network device; the user interface 1003 is mainly used for connecting peripheral equipment; the server calls the intelligent service program of the chat robot stored in the memory 1005 through the processor 1001, and performs the following operations:
receiving natural language information input by a user;
performing information retrieval matching on the natural language information in a preset knowledge base to obtain a first candidate result;
performing semantic analysis on the natural language information;
generating a second candidate result by using a natural language generation (NLG) seq2seq model based on the result of the semantic parsing and the first candidate result;
Here seq2seq is a network with an Encoder-Decoder structure: its input is a sequence and its output is also a sequence. The Encoder converts a variable-length input sequence into a fixed-length vector representation, and the Decoder converts that fixed-length vector into a variable-length target sequence.
The most important property of this structure is that the lengths of the input and output sequences can differ, so it can be used for translation, chat robots, syntactic analysis, text summarization and the like.
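As a toy numeric illustration of this Encoder-Decoder idea (no learned parameters; the position-sum encoder and repeating decoder below are invented stand-ins for real recurrent networks):

```python
# Toy illustration of the Encoder-Decoder idea behind seq2seq (no learning):
# encode() maps an any-length input to a fixed-length vector, and decode()
# unrolls that vector into an output of any requested length.

def encode(sequence, dim=4):
    """Variable-length sequence -> fixed-length vector (position-wise sums)."""
    vector = [0.0] * dim
    for i, x in enumerate(sequence):
        vector[i % dim] += x
    return vector

def decode(vector, out_len):
    """Fixed-length vector -> variable-length output sequence."""
    return [vector[t % len(vector)] for t in range(out_len)]

hidden = encode([1.0, 2.0, 3.0, 4.0, 5.0])  # 5 inputs -> 4-dim vector
print(hidden, decode(hidden, 3))            # 4-dim vector -> 3 outputs
```

The point the sketch makes is purely structural: whatever the input length, the intermediate representation has a fixed size, and the output length is chosen independently of the input length.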
And performing priority ranking on the first candidate result and the second candidate result, and outputting a final semantic reply according to a priority ranking result.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
the natural language information comprises voice information and text information;
when the natural language information is voice information, after receiving the natural language information input by the user, the method further includes:
and recognizing the voice information as text information.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
determining a task type of the natural language information;
when the task type is a question-answer type, the natural language information is subjected to information retrieval matching in a preset knowledge base;
and generating a first candidate result corresponding to the natural language information according to the information retrieval matching result.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
performing relevance analysis on the natural language information and each task type by adopting a machine learning method to obtain relevance values of the natural language information and each task type;
and determining one or more task types with the highest correlation value with the natural language information as the task types of the natural language information.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
preprocessing the natural language information;
performing semantic analysis on the preprocessed natural language information under various preset semantic scenes to obtain a plurality of semantic analysis results;
ranking the plurality of semantic parsing results according to a ranking model obtained by pre-training;
and selecting a semantic parsing result meeting a preset condition from the ranking as the final semantic parsing result of the natural language information.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
performing intention identification, entity extraction and intention ranking on the preprocessed natural language information by using an RCNN algorithm;
and generating a second candidate result in combination with the first candidate result, context understanding and inference.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
selecting a preset number of candidate results from the ranking as a final semantic reply output of the natural language information; or selecting a candidate result with a ranking score larger than a preset threshold value from the ranking as a final semantic reply output of the natural language information.
Further, processor 1001 may invoke the intelligent service program of the chat robot stored in memory 1005, and also perform the following operations:
and if the output final semantic answer is a second candidate result, storing the natural language information and the corresponding second candidate result into a preset knowledge base.
According to this embodiment, natural language information input by a user is received; information retrieval matching is performed on it in a preset knowledge base to obtain a first candidate result; semantic parsing is performed on it; a second candidate result is generated by a natural language generation (NLG) seq2seq model based on the semantic parsing result and the first candidate result; the two candidate results are prioritized and a final semantic reply is output according to the priority ranking result. The natural language input by the user can thus be accurately parsed, its true meaning identified, and an accurate and appropriate reply given, improving user satisfaction.
Based on the hardware structure, the embodiment of the intelligent service method of the chat robot is provided.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a first embodiment of an intelligent service method of a chat robot according to the present invention, where the intelligent service method of the chat robot includes the following steps:
step S10, receiving natural language information input by a user;
in one embodiment, the natural language information includes voice information and text information;
when the natural language information is voice information, after receiving the natural language information input by the user, the method further includes:
and recognizing the voice information as text information.
It should be understood that natural language information is information expressed in a form that conforms to natural human speaking habits, such as "I want to know the weather forecast for the next five days in Guangzhou". The natural language information may be voice information or text information input by the user. When it is voice information, after receiving it the method must further include a step of recognizing the voice as text, to facilitate the subsequent recognition and analysis of the natural language information; when it is text information, the corresponding recognition and analysis can be performed directly on the text. The text information may be entered by the user through input-method software installed on the electronic device.
Step S20, carrying out information retrieval matching on the natural language information in a preset knowledge base to obtain a first candidate result;
in one embodiment, as shown in fig. 3, the step S20 specifically includes:
step S201, determining the task type of the natural language information;
in one embodiment, as shown in fig. 4, the step S201 specifically includes:
step S2011, performing relevance analysis on the natural language information and each task type by adopting a machine learning method to obtain relevance values of the natural language information and each task type;
the process in step 2011 may be to identify all the decomposed words in the natural language information according to a word segmentation method, determine whether the identified words configured for each task type are included in the decomposed words, and further calculate the correlation between the natural language information and the task type according to the number of the identified words of the task type appearing in the decomposed words and the number of all the identified words of the task type. For example, considering two task types of weather query and tv program query, if the natural language information input by the user is "i want to watch a central tv show", the natural language information is decomposed into "i want to watch a central tv show" by a word segmentation method, and since these decomposed words do not include weather-related identification words, and include identification words of tv programs such as "central tv show" and "tv show", the system determines the task type of the natural language information as the tv program query.
In this step, the relevance analysis may be performed by using a machine learning method to evaluate and score the natural language information against the knowledge of each task. Equivalently, machine learning techniques compute the n task categories to which the natural language information most likely belongs.
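The identification-word scoring just described might look as follows; the task vocabularies and the query are invented for the demonstration, and a real system would use a trained classifier rather than raw word overlap:

```python
# Hedged sketch of the identification-word relevance score: the proportion of
# a task type's configured identification words that occur in the segmented
# query. The task vocabularies are invented for the demo.

TASK_WORDS = {
    "weather query": {"weather", "rain", "temperature", "forecast"},
    "tv program query": {"cctv", "tv", "series", "channel", "show"},
}

def relevance(tokens, task):
    words = TASK_WORDS[task]
    return len(tokens & words) / len(words)

def classify(query, top_n=1):
    """Return the top_n task types most relevant to the query."""
    tokens = set(query.lower().split())
    ranked = sorted(TASK_WORDS, key=lambda t: relevance(tokens, t), reverse=True)
    return ranked[:top_n]

print(classify("i want to watch a cctv tv series"))
```

Raising `top_n` yields the "one or more task types with the highest relevance values" behaviour of step S2012.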
Step S2012, determining one or more task types with the highest relevance value to the natural language information as the task type of the natural language information.
Step S2012 may be performed according to preset configuration information; for example, the configuration information may specify that the three task types with the highest relevance values are selected as the task-type set of the natural language information. Of course, the configuration may instead select the one, or the five, task types with the highest relevance values; this can be configured according to the user's needs.
Task classification of natural language information can be formalized as a time-series classification problem, defined as follows: given a set of data samples, each sample comprises an input time series X_i = (x_i(1), x_i(2), ...) and its discrete class label C_i, where each x_i(t) ∈ R^n is an n-dimensional vector and C_i ∈ {1, 2, ..., N_C}; the goal is to predict the class label of a new time series. Time-series classification is harder than ordinary classification mainly because the series to be classified are of unequal length, so ordinary classification algorithms cannot be applied directly. Even for series of equal length, the values of different series at the same position are not directly comparable, so ordinary classification algorithms still do not apply directly.
The overall idea of the task classification method of this embodiment has two steps. Step 1: convert the data into equal-length vectors; a key issue here is how to retain as much temporal and ordering information as possible, and this embodiment uses a model-based clustering method for the conversion. Step 2: classify the converted equal-length data set with an ordinary classification algorithm, such as k-nearest-neighbour search, a decision tree, maximum likelihood, or an SVM (support vector machine). Without loss of generality, when the equal-length vectors have length 1, classifying the text sequence reduces to bag-of-words text classification.
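A minimal sketch of this two-step idea, using bag-of-words as the equal-length conversion and 1-nearest-neighbour as the ordinary classifier (the vocabulary and training samples are invented; the patent's preferred conversion is model-based clustering, not shown here):

```python
# Step 1: bag-of-words turns variable-length token sequences into equal-length
# vectors; Step 2: any ordinary classifier then applies (1-NN here).
# Vocabulary and training samples are invented for the demo.

VOCAB = ["weather", "rain", "tv", "series", "open", "package"]

def bag_of_words(tokens):
    """Variable-length sequence -> fixed-length count vector over VOCAB."""
    return [tokens.count(w) for w in VOCAB]

def nearest_label(vector, labelled):
    """1-nearest-neighbour by squared Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labelled, key=lambda item: dist(vector, item[0]))[1]

train = [
    (bag_of_words(["weather", "rain"]), "weather query"),
    (bag_of_words(["tv", "series"]), "tv program query"),
]
print(nearest_label(bag_of_words(["rain", "weather", "weather"]), train))
```

Note what the conversion discards: word order. That is exactly the trade-off the text raises in Step 1 about retaining temporal and ordering information.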
Step S202, when the task type is a question-answer type, the natural language information is subjected to information retrieval matching in a preset knowledge base;
step S203, according to the information retrieval matching result, generating a first candidate result corresponding to the natural language information.
In an embodiment of the present invention, when the text information input by the user is of the question-and-answer type, the chat robot may, based on an information retrieval (IR) model, search and match against indexes defined in advance in a preset knowledge base to find candidate answers. That is, dialogue corpora (QA question-answer pairs) may be collected from the internet in advance, a search engine built on this data, and retrieval matching then performed according to text similarity. These dialogues may cover common knowledge or knowledge in particular technical fields, such as the healthcare industry; the user may configure them as needed, and the invention does not limit this.
In one embodiment, the chat robot performs text preprocessing on the text information input by the user (such as word segmentation, keyword extraction and error-correction analysis), then matches it against a trained model (usually rule-based: regular expressions, keyword similarity, and the like) to obtain candidate answers from the QA corpus (possibly more than one), then ranks the answers by weight, and finally outputs answers in ranked order to obtain the first candidate result.
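A hedged sketch of this retrieval step, with an invented two-entry QA corpus and Jaccard word overlap standing in for the rule-based similarity measures the embodiment mentions:

```python
# Sketch of the retrieval step: match the (preprocessed) user text against a
# pre-built QA corpus by a simple text similarity, then rank the hits.
# The QA pairs and the Jaccard measure are illustrative stand-ins.

QA_CORPUS = [
    ("what are the symptoms of a cold", "Typical cold symptoms include ..."),
    ("how long does a cold usually last", "A common cold usually lasts ..."),
]

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def first_candidates(query, top_k=2):
    """Return up to top_k answers ranked by similarity to the query."""
    q = set(query.lower().split())
    scored = [(jaccard(q, set(question.split())), answer)
              for question, answer in QA_CORPUS]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [answer for score, answer in scored[:top_k] if score > 0]

print(first_candidates("cold symptoms"))
```

A production system would index the corpus in a search engine rather than scan it linearly, but the rank-then-cut shape of the output is the same.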
Step S30, semantic analysis is carried out on the natural language information;
in one embodiment, as shown in fig. 5, the step S30 specifically includes:
step S301, preprocessing the natural language information; the preprocessing comprises word segmentation, keyword extraction, error correction analysis and the like.
Step S302, semantic parsing is respectively carried out on the preprocessed natural language information under various preset semantic scenes to obtain a plurality of semantic parsing results;
in the embodiment of the invention, a plurality of semantic scene types can be preset so as to carry out omnibearing semantic analysis on natural language information input by a user under a plurality of semantic scenes, thereby preventing semantic analysis errors caused by wrong judgment of the semantic scene types. Specifically, the natural language information input by the user can be subjected to semantic analysis under each preset semantic scene, so that a plurality of semantic analysis results can be obtained. At least one semantic analysis result can be obtained in each semantic scene, and the semantic analysis result can be a semantic analysis result with successful matching or a semantic analysis result with failed matching.
Because there are many semantic parsing approaches, a different approach may be adopted in each semantic scene. The correspondence between semantic scenes and parsing approaches may be preset, for example in a mapping table: under a given semantic scene, one or more pre-selected parsing approaches are used. Alternatively, and preferably, no parsing approach is fixed in advance; before parsing the natural language information, the parsing approach for each semantic scene is determined, so that the approach can be chosen as needed. If the parsing approach for each scene is preset, determining it simply means invoking the configured approach; that is, the process of determining the parsing approach in each scene amounts to calling it. The semantic parsing approach may be one or more of parsing based on syntactic and semantic analysis, parsing based on a grammar-rule network, sensitive-word matching, or any other approach; the embodiments of the present invention are not limited in this respect.
Taking parsing with a grammar-rule network as an example: a specific semantic scene may correspond to several grammar-rule networks, i.e. different sentence forms correspond to different networks. The grammar-rule network is preferably a WFST (weighted finite-state transducer) network compiled from ABNF (augmented Backus-Naur form) grammar rules. Concretely, semantic parsing may perform path-matching analysis of the natural language information over the grammar-rule network with a dynamic programming algorithm, and obtain the corresponding semantic information by backtracking the matched path. When several paths match simultaneously, the path with the highest score is taken as the semantic parsing result.
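As a loose stand-in for such rule-network matching (regular expressions instead of a real ABNF-compiled WFST, and matched-span length as a crude path score; all patterns and scenes are invented):

```python
# Loose stand-in for rule-network parsing: regular expressions per semantic
# scene instead of an ABNF-compiled WFST, with matched-span length as a crude
# path score. All patterns and scenes are invented.
import re

RULES = {
    "weather": [r"weather .* in (?P<city>\w+)", r"(?P<city>\w+) weather"],
    "music": [r"play (?P<song>.+)"],
}

def parse(utterance):
    """Return the best-scoring (scene, slots) match, or None if nothing fits."""
    results = []
    for scene, patterns in RULES.items():
        for pattern in patterns:
            m = re.search(pattern, utterance.lower())
            if m:
                # longer matched span -> higher "path" score
                results.append((len(m.group(0)), scene, m.groupdict()))
    if not results:
        return None
    _, scene, slots = max(results, key=lambda r: r[0])
    return {"scene": scene, "slots": slots}

print(parse("what is the weather like in guangzhou"))
```

The named capture groups play the role of the key information slots discussed later; a WFST would additionally weight each arc rather than score whole matches.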
Step S303, ranking the multiple semantic parsing results according to a ranking model obtained by pre-training;
the ranking model can be obtained by off-line training by adopting a large-scale training corpus which contains a large number of user actual descriptions and covers various intentions and parameter combinations. The large-scale corpus can truly reflect the semantic features of the user's utterance, and specifically, the intention of each sentence of the corpus can be labeled manually, for example, "read with English" is labeled as translation, "see the picture of Beijing Imperial palace" is labeled as picture, and "air of Guangzhou" is labeled as air quality.
After semantic parsing of the natural language information, feature information can be extracted from the parsing result: which key information slots of the scene were filled, how many there are, and the popularity of the content extracted into each slot. For example, suppose the user inputs "a Sichuan restaurant near Jiuting"; in the restaurant scene, semantic parsing yields two key information slots, the geographic location "Jiuting" and the restaurant category "Sichuan restaurant". The slot count is two, and the popularity features are the popularity of "Jiuting" as a geographic location and of "Sichuan restaurant" as a restaurant category.
The feature information extracted from the semantic parsing results is combined with traditional features such as user history, scene identification words, and scene identification sentence patterns to form the feature set used for training and prediction of the ranking model; a suitable ranking trainer, such as Ranking SVM, RankNet, or ListNet, is then selected to train the ranking model.
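A pairwise trainer in the spirit of Ranking SVM can be sketched as follows. The three features used here (slot count, mean slot popularity, scene-keyword hit) are illustrative assumptions, not the patent's exact feature set, and the perceptron-style hinge update is a simplification of a real SVM solver:

```python
# A schematic pairwise ranker: train on (better, worse) feature-vector pairs
# so that w.better > w.worse, using a hinge-style update on difference vectors.
import numpy as np

def train_pairwise(pairs, dim, epochs=50, lr=0.1):
    w = np.zeros(dim)
    for _ in range(epochs):
        for better, worse in pairs:
            diff = np.asarray(better, float) - np.asarray(worse, float)
            if w @ diff <= 1.0:          # margin violated: nudge w toward diff
                w += lr * diff
    return w

# Each candidate parse -> [number of filled slots, mean slot popularity, scene-word hit]
pairs = [([2, 0.9, 1], [1, 0.2, 0]),
         ([2, 0.7, 1], [0, 0.1, 1])]
w = train_pairwise(pairs, dim=3)
score = lambda x: float(w @ np.asarray(x, float))
assert score([2, 0.9, 1]) > score([1, 0.2, 0])   # ranker orders the training pair
```

At prediction time, the candidate parses are simply sorted by `score`, which matches the listwise use of the trained model in step S303.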
In the embodiment of the invention, the plurality of semantic parsing results are input into the pre-trained ranking model; the model automatically extracts feature information from each parsing result, matches it against the model's learned features, and ranks the parsing results according to the matching results.
Step S304, selecting a semantic parsing result meeting a preset condition from the sequence as a final semantic parsing result of the natural language information.
The preset condition may be set as required; for example, it may be a preset number or a preset threshold. Specifically, a preset number of semantic parsing results may be selected from the ranking as the final semantic parsing result of the natural language information, preferably the top-ranked preset number of results, where the preset number may be set as required, for example three, five, or ten. Alternatively, the semantic parsing results may be ranked and scored by the ranking model, the ranking score compared with a confidence threshold (the preset threshold) established in advance through extensive experiments or empirical values, and the one or more semantic parsing results whose ranking score exceeds the confidence threshold taken as the final semantic parsing result of the natural language information.
Because natural language has many ways of expressing the same meaning, the final semantic parsing result can be normalized for the convenience of subsequent data statistics or data processing; preferably, related information such as time and place is extracted from the final semantic parsing result and normalized. Here, normalization means adjusting the relevant information or parameters into a uniform format or expression for data statistics or information identification. For example, if the current date is January 1, 2014 and the user wants to query tomorrow's weather, possible expressions include "tomorrow's weather", "the weather on January 2, 2014", or "the weather on the 2nd"; the time parameter in the semantic information of all three expressions is normalized uniformly to 2014-01-02 to facilitate subsequent data processing or statistics.
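The time-normalization example can be sketched directly. The handled phrase patterns are illustrative assumptions (a real normalizer would parse many more surface forms); the current date is fixed to 2014-01-01 as in the example above:

```python
# A minimal sketch of normalizing the time parameter to a uniform YYYY-MM-DD form.
from datetime import date, timedelta

def normalize_time(expr, today=date(2014, 1, 1)):
    # hypothetical surface forms; a production system would use a full parser
    if expr == "tomorrow":
        return (today + timedelta(days=1)).isoformat()
    if expr == "the day after tomorrow":
        return (today + timedelta(days=2)).isoformat()
    if expr == "January 2, 2014":                # already-explicit form
        return date(2014, 1, 2).isoformat()
    raise ValueError("unrecognized time expression: " + expr)

# Different surface forms collapse to the same normalized value:
assert normalize_time("tomorrow") == "2014-01-02"
assert normalize_time("January 2, 2014") == "2014-01-02"
```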
Step S40, generating a second candidate result by using a seq2seq model of natural language generation (NLG, Natural Language Generation) based on the result of the semantic parsing and the first candidate result;
wherein seq2seq is a network with an Encoder-Decoder structure whose input and output are both sequences: the Encoder transforms a variable-length input sequence into a fixed-length vector representation, and the Decoder transforms that fixed-length vector into a variable-length target sequence.
The key property of this structure is that the lengths of the input and output sequences are variable, so it can be used for translation, chat robots, syntactic analysis, text summarization, and the like.
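The data flow of the Encoder-Decoder structure can be shown at the shape level. This is only a sketch of the idea with random, untrained weights and a plain tanh-RNN in place of an LSTM; the dimensions are arbitrary assumptions, and real decoders additionally feed back the previous output at each step:

```python
# A shape-level sketch of seq2seq: a recurrent encoder compresses a
# variable-length input into one fixed-length vector c; a recurrent decoder
# unrolls c into a target-length output sequence.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, d_out = 4, 8, 5                     # input dim, hidden dim, output dim
Wx, Wh = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
Uy, Uh = rng.normal(size=(d_out, d_h)), rng.normal(size=(d_h, d_h))

def encode(xs):                                # xs: list of vectors, any length
    h = np.zeros(d_h)
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h)
    return h                                   # fixed-length context vector c

def decode(c, steps):                          # unroll c for `steps` outputs
    h, out = c, []
    for _ in range(steps):
        h = np.tanh(Uh @ h)                    # simplified: no output feedback
        out.append(Uy @ h)
    return out

c = encode([rng.normal(size=d_in) for _ in range(7)])   # length-7 input ...
ys = decode(c, steps=3)                                  # ... length-3 output
assert c.shape == (d_h,) and len(ys) == 3 and ys[0].shape == (d_out,)
```

The point of the sketch is that the input length (7) and output length (3) are independent, which is exactly what makes the structure suitable for translation and dialogue.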
In one embodiment, the generating a second candidate result based on the result of the semantic parsing and the first candidate result by using a seq2seq model of NLG generated in natural language specifically includes:
performing intention identification, entity extraction and intention sequencing on the preprocessed natural language information by adopting a target detection RCNN algorithm;
a second candidate result is generated in combination with the first candidate result, context understanding, and inference.
And step S50, performing priority ranking on the first candidate result and the second candidate result, and outputting a final semantic reply according to the priority ranking result.
In one embodiment, the prioritizing the first candidate result and the second candidate result and outputting the final semantic reply according to the prioritized result specifically includes:
selecting a preset number of candidate results from the ranking as a final semantic reply output of the natural language information; or selecting a candidate result with a ranking score larger than a preset threshold value from the ranking as a final semantic reply output of the natural language information.
In one embodiment of the present invention, as shown in fig. 6, for the text input by the user (query + context), information retrieval (IR) is first performed in the preset knowledge base to obtain a first candidate result. Meanwhile, the core NLU algorithms are run within a GM (Generation Model) framework: intention classification and entity attribute extraction are performed on the user's text, and the first candidate result is fed as input to a Bi-LSTM Seq2Seq model to generate a second candidate result. The first and second candidate results are then prioritized and judged against a threshold T: if the ranking score is greater than the threshold T, the corresponding candidate is output as the final semantic answer; otherwise, control returns to the GM module to continue generating.
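The control flow of this pipeline can be sketched as follows. The retrieval, generation, and scoring functions here are stand-ins for the patent's IR module, GM module, and ranking model, and the bounded retry count is an added assumption to keep the sketch terminating:

```python
# A control-flow sketch of the Fig. 6 loop: retrieve an IR candidate, generate
# a GM candidate conditioned on it, rank both, check against threshold T, and
# fall back to regeneration when neither candidate clears T.

def answer(query, retrieve, generate, score, T, max_rounds=3):
    first = retrieve(query)                      # IR candidate from knowledge base
    for _ in range(max_rounds):
        second = generate(query, first)          # GM candidate, conditioned on IR
        ranked = sorted([first, second], key=score, reverse=True)
        if score(ranked[0]) > T:                 # threshold judgment
            return ranked[0]
    return ranked[0]                             # best effort after max_rounds

reply = answer(
    "query",
    retrieve=lambda q: "kb answer",
    generate=lambda q, f: "generated answer",
    score=lambda a: 0.9 if a == "kb answer" else 0.4,   # hypothetical scores
    T=0.5,
)
assert reply == "kb answer"
```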
Compared with the traditional SVM model based on word-vector feature extraction, the character-level model based on the RCNN (RNN + CNN) hybrid helps to overcome both the low NLU recognition accuracy caused by inaccurate word segmentation in text preprocessing and the high time complexity of SVM training.
In one embodiment, if the final semantic answer output is the second candidate result, the natural language information and the corresponding second candidate result are stored in the preset knowledge base, updating the answer corresponding to query + context so that subsequent retrieval performs better.
In the embodiment of the present invention, different methods are adopted for different intention-category scenarios of the natural language information to achieve more accurate answers, for example:
1. QA_Chat (question-answer class): e.g., "how many points do I have", which can be handled with rule-based and retrieval methods.
2. Task_Chat (task class): e.g., "I want to buy a calcium-supplement health product", which can be handled with the Intent_Clf + Entity_Extrac scheme.
3. Rec_Chat (shopping-guide recommendation class): e.g., "what supplement is better for elderly people with high blood pressure", which can be handled with the domain knowledge base of the health-product and pharmaceutical industry, i.e., the preset knowledge base.
An answer is then generated with the NLG Seq2Seq model, re-ranked together with the answers from IR retrieval for threshold judgment, and the final answer is output.
The present invention introduces the RCNN to capture context information as fully as possible when learning word representations, which can greatly reduce noise compared to traditional window-based neural networks.
The RCNN is in fact a hybrid of an RNN and a CNN. A recurrent neural network (RNN) is a biased model in which later words carry more weight than earlier ones; when it is used to capture the semantics of an entire document this reduces effectiveness, because the key components may appear anywhere in the document, not only at the end. To address the bias problem, a convolutional neural network (CNN) offers an unbiased model for NLP tasks and, with a max-pooling layer, can identify the discriminative phrases in a text; thus a CNN may capture text semantics better than a recurrent network. However, a CNN's window size is difficult to set: a small window may lose key information, while a large window leads to an enormous parameter space that is hard to train. This raises the question of how to learn more contextual information than traditional window-based neural networks while representing the semantics of the text accurately; the RCNN was created to address this limitation. First, the invention adopts a bidirectional recurrent structure, which introduces far less noise than a traditional window-based network and therefore captures contextual information to the greatest extent; the model also preserves a wider range of word order when learning the text representation. Second, the invention uses a max-pooling layer that automatically determines which features play a key role in text classification, capturing the key components in the text. The model of the invention thus combines the recurrent structure with a max-pooling layer, exploiting the advantages of both the recurrent and the convolutional neural models.
At the same time, a character-level text representation is used, which can to some extent reduce the loss of model discrimination accuracy caused by domain-specific Chinese word-segmentation errors.
In one embodiment, as shown in fig. 7, the input to the network is a text D, which is a sequence of characters w1, w2, ..., wn; the output of the network is a category. The present invention uses p(k | D, θ), where θ denotes the network parameters, to represent the probability that the document belongs to class k.
First, the convolutional layer of the CNN is constructed; it is essentially a BiRNN model that builds the left and right contexts of each word through forward and backward passes. The invention defines c_l(w_i) as the context to the left of character (word) w_i and c_r(w_i) as the context to its right. e(w_{i-1}) is the embedding of character (word) w_{i-1}, a real-valued vector of length |e|; c_l(w_{i-1}) is the left context of the previous word w_{i-1}, and the left context of the first character (word) shares the same parameter c_l(w_1). W^(l) is the matrix that transforms the left-context hidden layer into the next hidden layer, W^(sl) is the matrix that combines the semantics of the current word with the left context of the next word (W^(r) and W^(sr) play the symmetric roles on the right), and f is a non-linear activation function:
c_l(w_i) = f(W^(l) c_l(w_{i-1}) + W^(sl) e(w_{i-1}))
c_r(w_i) = f(W^(r) c_r(w_{i+1}) + W^(sr) e(w_{i+1}))
After the context representations of a character (word) are obtained, its overall representation x_i (the text representation) is formed by concatenation as follows:
x_i = [c_l(w_i); e(w_i); c_r(w_i)]
by representing characters in this manner, the model of the present invention can better disambiguate the meaning of the word w_i than traditional neural models that use only a fixed window.
The recurrent structure obtains all c_l in a forward scan of the text and all c_r in a backward scan, with time complexity O(n). After the representation x_i of the character (word) w_i is obtained, a linear transformation together with the tanh activation function is applied to x_i and the result is passed to the next layer:
y_i^(2) = tanh(W^(2) x_i + b^(2))
where y_i^(2) is a latent semantic vector, in which each semantic factor will be analyzed to determine the factor most useful for representing the text.
Max-pooling is then performed: the maximum value in each dimension is taken over the latent semantic vectors of all the words just obtained, forming a new vector. Max-pooling extracts the strongest features in the vectors, thereby capturing the information of the whole text:
y^(3) = max_{i=1..n} y_i^(2)
The max function operates element-wise: the k-th element of y^(3) is the maximum of the k-th elements of all the vectors y_i^(2). The pooling layer converts texts of different lengths into a fixed-length vector. The time complexity of the pooling step is also O(n), so the time complexity of the entire model is O(n).
After the text feature vector is obtained, classification is performed at the output layer:
y^(4) = W^(4) y^(3) + b^(4)
finally, the softmax function is applied to y^(4) to convert the outputs into probabilities:
p_i = exp(y_i^(4)) / Σ_k exp(y_k^(4))
this approach solves both the bias problem of a single RNN model and the fixed convolution-window-size problem of a single CNN.
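The RCNN forward pass described by the equations above can be sketched end to end in numpy. The dimensions and random weights are illustrative only; a real model would learn them by backpropagation:

```python
# A numpy sketch of the RCNN forward pass: bidirectional recurrent contexts
# c_l / c_r, concatenated representation x_i, latent vectors y_i^(2),
# element-wise max-pooling to y^(3), and a softmax over the output y^(4).
import numpy as np

rng = np.random.default_rng(1)
n, d_e, d_c, d_h, K = 6, 5, 4, 7, 3     # seq len, embed, context, latent, classes
E  = [rng.normal(size=d_e) for _ in range(n)]                 # e(w_i)
Wl, Wsl = rng.normal(size=(d_c, d_c)), rng.normal(size=(d_c, d_e))
Wr, Wsr = rng.normal(size=(d_c, d_c)), rng.normal(size=(d_c, d_e))
W2, b2  = rng.normal(size=(d_h, 2 * d_c + d_e)), rng.normal(size=d_h)
W4, b4  = rng.normal(size=(K, d_h)), rng.normal(size=K)

f = np.tanh
cl = [np.zeros(d_c)]                     # forward scan: c_l(w_i) from the left
for i in range(1, n):
    cl.append(f(Wl @ cl[-1] + Wsl @ E[i - 1]))
cr = [np.zeros(d_c)]                     # backward scan: c_r(w_i) from the right
for i in range(n - 2, -1, -1):
    cr.insert(0, f(Wr @ cr[0] + Wsr @ E[i + 1]))

X  = [np.concatenate([cl[i], E[i], cr[i]]) for i in range(n)]  # x_i
Y2 = np.stack([f(W2 @ x + b2) for x in X])                     # y_i^(2)
y3 = Y2.max(axis=0)                      # element-wise max-pooling over positions
y4 = W4 @ y3 + b4                        # output layer
p  = np.exp(y4) / np.exp(y4).sum()       # softmax class probabilities
assert p.shape == (K,) and abs(p.sum() - 1.0) < 1e-9
```

Note that both scans and the pooling are single passes over the n positions, matching the O(n) complexity claimed above.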
In one embodiment of the present invention, as shown in FIG. 8, entity identification and extraction (Entity Extractor) is based on BI_LSTM_CRF, a typical NER (Named Entity Recognition) problem; information such as products, symptoms, diseases, brands, and target populations is extracted according to the business rules. Assuming the entities to be extracted are region (LOC) and product (Prod), the invention uses the BIO form of the standard NER sequence-labeling format in the NLP field, namely:
B-LOC (w_0)
I-LOC (w_1)
B-Prod (w_2)
I-Prod (w_3)
O (w_4)
therefore, any text X can be represented with the above five labels. For example, "China is very big" is labeled B-LOC I-LOC O.
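The BIO labeling scheme can be sketched as a small helper that turns entity spans into per-token labels. The tokenizations and spans below are illustrative:

```python
# A sketch of BIO sequence labeling for the two entity types above (LOC, Prod):
# the first token of an entity gets B-<type>, later tokens get I-<type>,
# and tokens outside any entity get O.

def bio_labels(tokens, spans):
    """spans: list of (start, end, type) with end exclusive."""
    labels = ["O"] * len(tokens)
    for start, end, etype in spans:
        labels[start] = "B-" + etype
        for i in range(start + 1, end):
            labels[i] = "I-" + etype
    return labels

# "China is very big" with a two-token location, as in the example above:
print(bio_labels(["Zhong", "Guo", "hen", "da"], [(0, 2, "LOC")]))
# a hypothetical product mention:
print(bio_labels(["buy", "protein", "powder"], [(1, 3, "Prod")]))
```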
BI_LSTM_CRF:
1. First, each character (word) in X is given a vector representation, i.e., x_1, x_2, x_3, x_4 in fig. 8 (the text representation of conventional NLP classification feature engineering). This is usually word-level Word embedding or character-level Character embedding, as introduced for the RCNN. Character embeddings are initialized randomly, while word embeddings usually come from a pre-trained word-embedding file; all embeddings are fine-tuned during training.
2. The vector sequence X (x_1, x_2, ..., x_n) is input into the BI_LSTM (bidirectional recurrent neural network), i.e., a forward and a backward LSTM; the model outputs the label vectors p_1, p_2, p_3, p_4 in fig. 8. Taking p_1 as an example, its values (e.g., 1.5, 0.8, 0.2, 0.03) correspond to the scores of the labels B-LOC, I-LOC, O for "China is very big".
3. The CRF layer adds constraints to the final predicted labels to ensure that they are valid. The label vector output by the LSTM alone (for example, p_1 = (0.3, 0.8, 0.2, 0.03) instead of the values exemplified above) may not yield a valid label sequence if taken directly; the CRF is introduced to solve this problem. These constraints are learned automatically by the CRF layer from the training dataset during training, and the CRF outputs the label sequence with the maximum probability.
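The kind of validity constraint the CRF enforces can be made concrete with a small checker. In a real CRF these constraints emerge as learned transition scores; hard-coding them here is a deliberate simplification:

```python
# A sketch of a BIO validity constraint: an I-<type> tag is only valid
# immediately after a B-<type> or I-<type> tag of the same entity type.

def valid_sequence(labels):
    prev = "O"
    for lab in labels:
        if lab.startswith("I-"):
            etype = lab[2:]
            if prev not in ("B-" + etype, "I-" + etype):
                return False
        prev = lab
    return True

assert valid_sequence(["B-LOC", "I-LOC", "O"])       # well-formed
assert not valid_sequence(["O", "I-LOC", "O"])       # I-LOC cannot follow O
assert not valid_sequence(["B-Prod", "I-LOC"])       # entity-type mismatch
```

A greedy per-token argmax over the LSTM scores can produce sequences that fail this check, which is precisely why the CRF decoding layer is added on top.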
Intent Sort: LR + Softmax. A feature-engineering-based classification algorithm (logistic regression) determines whether the current intention continues the preceding intention, and Softmax then orders and outputs the confidence (probability) of each intention class. For example:
U (customer): How much is the protein powder?
A (customer service): 218. In two days, for the Double Eleven promotion, it drops to 198.
U (customer): And the VB?
Analysis: without considering the context, the intention of the bare "VB" utterance cannot be determined. To solve this problem, an intention-ordering module is designed. For the text input by the user, on one hand, intent_clf + entity_extract is first used for intention judgment and entity extraction; on the other hand, the preceding intention is directly inherited and attribute extraction is carried out under that intention. The two results are passed through feature-extraction engineering into an LR classifier, which judges whether the intention is continued; if so, the inherited intention and its attribute-extraction result can be output as the final result; otherwise, the results are output according to the confidence of the softmax multi-class classifier.
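The decision logic of the intention-ordering module can be sketched as follows. The continuation features, LR weights, and intent scores are hypothetical stand-ins for the trained classifier and feature engineering described above:

```python
# A control-flow sketch of Intent Sort: a logistic-regression score decides
# whether the new utterance continues the previous intention; otherwise the
# fresh intents are ranked by softmax confidence.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    s = sum(exps.values())
    return {k: v / s for k, v in exps.items()}

def resolve_intent(continuation_features, w, prev_intent, intent_scores, thresh=0.5):
    z = sum(wi * xi for wi, xi in zip(w, continuation_features))
    if sigmoid(z) > thresh:                  # LR says: continuation of the above
        return prev_intent
    probs = softmax(intent_scores)           # otherwise rank fresh intents
    return max(probs, key=probs.get)

# "And the VB?" right after a price question: strong continuation cues (hypothetical)
intent = resolve_intent([1.0, 1.0], w=[2.0, 1.5], prev_intent="ask_price",
                        intent_scores={"ask_price": 0.2, "chitchat": 0.1})
assert intent == "ask_price"
```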
Answer generation: Sequence to Sequence. The answer is generated by an Encoder-Decoder model implemented with LSTMs: the Encoder converts the input sequence into a fixed-length vector, and the Decoder converts that fixed vector into the output sequence.
The basic idea is to use two LSTMs, one as the encoder and the other as the decoder. The encoder compresses the input sequence into a vector of a specified length, which can be regarded as the semantics of the sequence; this process is called encoding. The simplest way to obtain the semantic vector is to use the hidden state of the last input directly as the semantic vector C; alternatively, the last hidden state can be transformed to obtain the semantic vector, or all the hidden states of the input sequence can be transformed to obtain the semantic variable.
The invention also provides a storage medium, the storage medium is stored with an intelligent service program of the chat robot, and when the intelligent service program of the chat robot is executed by a processor, the following steps are executed:
receiving natural language information input by a user;
performing information retrieval matching on the natural language information in a preset knowledge base to obtain a first candidate result;
performing semantic analysis on the natural language information;
generating a second candidate result by using a seq2seq model of the natural language generation NLG based on the result of the semantic parsing and the first candidate result;
and performing priority ranking on the first candidate result and the second candidate result, and outputting a final semantic reply according to a priority ranking result.
In one embodiment, the natural language information includes voice information and text information;
when the natural language information is voice information, after receiving the natural language information input by the user, the method further includes:
and recognizing the voice information as text information.
In one embodiment, the retrieving and matching the natural language information in a preset knowledge base to obtain a first candidate result includes:
determining a task type of the natural language information;
when the task type is a question-answer type, the natural language information is subjected to information retrieval matching in a preset knowledge base;
and generating a first candidate result corresponding to the natural language information according to the information retrieval matching result.
In one embodiment, the determining the task type of the natural language information includes:
performing relevance analysis on the natural language information and each task type by adopting a machine learning method to obtain relevance values of the natural language information and each task type;
and determining one or more task types with the highest correlation value with the natural language information as the task types of the natural language information.
In one embodiment, the semantic parsing the natural language information includes:
preprocessing the natural language information;
performing semantic analysis on the preprocessed natural language information under various preset semantic scenes to obtain a plurality of semantic analysis results;
sequencing the plurality of semantic analysis results according to a sequencing model obtained by pre-training;
and selecting a semantic parsing result meeting a preset condition from the sequence as a final semantic parsing result of the natural language information.
In one embodiment, the generating a second candidate result using a seq2seq model of natural language NLG based on the result of the semantic parsing and the first candidate result comprises:
performing intention identification, entity extraction and intention sequencing on the preprocessed natural language information by adopting a target detection RCNN algorithm;
a second candidate result is generated in conjunction with the first candidate result, the context understanding, and the inferring.
In one embodiment, the prioritizing the first candidate result and the second candidate result and outputting the final semantic reply according to the prioritized result includes:
selecting a preset number of candidate results from the ranking as a final semantic reply output of the natural language information; or selecting a candidate result with a ranking score larger than a preset threshold value from the ranking as a final semantic reply output of the natural language information.
In one embodiment, if the output final semantic answer is the second candidate result, the natural language information and the corresponding second candidate result are saved in a preset knowledge base.
According to the intelligent service method, server, and storage medium of the chat robot, natural language information input by a user is received; information retrieval matching is performed on it in a preset knowledge base to obtain a first candidate result; semantic parsing is performed on the natural language information; a second candidate result is generated by a natural-language-generation (NLG) seq2seq model based on the semantic parsing result and the first candidate result; the two candidate results are prioritized; and the final semantic answer is output according to the priority ranking. Candidate answers are first obtained through traditional IR retrieval; a new round of candidate answers is then generated with Seq2Seq; the two sets of answers are re-ranked to obtain a priority ordering and judged against a threshold: if the score exceeds the set value the answer is output, otherwise the GM module is invoked again to generate new answers. This Ensemble (hybrid model) overcomes the lack of semantics and the corpus limitations of traditional IR, while also improving the training efficiency of the GM (training IR alone cannot cover what the GM recognizes). The natural language input by the user can thus be semantically parsed more accurately and quickly, its true meaning recognized, and an accurate and appropriate reply given, improving user satisfaction.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An intelligent service method of a chat robot is characterized by comprising the following steps:
receiving natural language information input by a user;
performing information retrieval matching on the natural language information in a preset knowledge base to obtain a first candidate result;
performing semantic analysis on the natural language information;
generating a second candidate result by using a seq2seq model of the natural language generation NLG based on the result of the semantic parsing and the first candidate result;
and performing priority ranking on the first candidate result and the second candidate result, and outputting a final semantic reply according to a priority ranking result.
2. The intelligent service method of a chat robot according to claim 1, wherein the natural language information includes voice information and text information;
when the natural language information is voice information, after receiving the natural language information input by the user, the method further includes:
and recognizing the voice information as text information.
3. The intelligent service method of a chat robot as claimed in claim 1, wherein the retrieving and matching the natural language information in a predetermined knowledge base to obtain a first candidate result comprises:
determining a task type of the natural language information;
when the task type is a question-answer type, the natural language information is subjected to information retrieval matching in a preset knowledge base;
and generating a first candidate result corresponding to the natural language information according to the information retrieval matching result.
4. The intelligent service method for chat robots according to claim 3, wherein the determining the task type of the natural language information comprises:
performing relevance analysis on the natural language information and each task type by adopting a machine learning method to obtain relevance values of the natural language information and each task type;
and determining one or more task types with the highest correlation value with the natural language information as the task types of the natural language information.
5. The intelligent service method for chat robots according to claim 1, wherein the semantic parsing of the natural language information comprises:
preprocessing the natural language information;
performing semantic analysis on the preprocessed natural language information under various preset semantic scenes to obtain a plurality of semantic analysis results;
sequencing the plurality of semantic analysis results according to a sequencing model obtained by pre-training;
and selecting a semantic parsing result meeting a preset condition from the sequence as a final semantic parsing result of the natural language information.
6. The intelligent service method for chat robots according to claim 5, wherein the generating a second candidate result using a seq2seq model of natural language NLG based on the result of semantic parsing and the first candidate result comprises:
performing intention identification, entity extraction and intention sequencing on the preprocessed natural language information by adopting a target detection RCNN algorithm;
a second candidate result is generated in conjunction with the first candidate result, the context understanding, and the inferring.
7. The intelligent service method for chat robots according to claim 1, wherein the prioritizing the first candidate result and the second candidate result and outputting the final semantic response according to the prioritized result comprises:
selecting a preset number of candidate results from the ranking as a final semantic reply output of the natural language information; or selecting a candidate result with a ranking score larger than a preset threshold value from the ranking as a final semantic reply output of the natural language information.
8. The intelligent service method of a chat robot as claimed in claim 7, wherein if the final semantic response is the second candidate result, the natural language information and the corresponding second candidate result are saved in a predetermined knowledge base.
9. A server, characterized in that the server comprises: a memory, a processor and an intelligent service program of a chat robot stored on the memory and operable on the processor, the intelligent service program of the chat robot being configured to implement the steps of the intelligent service method of a chat robot according to any of claims 1 to 8.
10. A storage medium having stored thereon an intelligent service program of a chat robot, the intelligent service program of the chat robot implementing the steps of the intelligent service method of a chat robot according to any one of claims 1 to 8 when executed by a processor.
CN201810520636.7A 2018-05-28 2018-05-28 Intelligent service method, server and storage medium for chat robot Active CN108829757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810520636.7A CN108829757B (en) 2018-05-28 2018-05-28 Intelligent service method, server and storage medium for chat robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810520636.7A CN108829757B (en) 2018-05-28 2018-05-28 Intelligent service method, server and storage medium for chat robot

Publications (2)

Publication Number Publication Date
CN108829757A CN108829757A (en) 2018-11-16
CN108829757B true CN108829757B (en) 2022-01-28

Family

ID=64145841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810520636.7A Active CN108829757B (en) 2018-05-28 2018-05-28 Intelligent service method, server and storage medium for chat robot

Country Status (1)

Country Link
CN (1) CN108829757B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741751A (en) * 2018-12-11 2019-05-10 上海交通大学 Intension recognizing method and device towards intelligent sound control
CN109829039B (en) * 2018-12-13 2023-06-09 平安科技(深圳)有限公司 Intelligent chat method, intelligent chat device, computer equipment and storage medium
CN110287294A (en) * 2018-12-27 2019-09-27 厦门智融合科技有限公司 Intellectual property concept answers method and system automatically
CN109686360A (en) * 2019-01-08 2019-04-26 哈尔滨理工大学 A kind of voice is made a reservation robot
CN109886310B (en) * 2019-01-25 2020-06-09 北京三快在线科技有限公司 Picture sorting method and device, electronic equipment and readable storage medium
CN109977202A (en) * 2019-03-06 2019-07-05 北京西屋信维科技发展有限公司 A kind of intelligent customer service system and its control method
CN110032630B (en) * 2019-03-12 2023-04-18 创新先进技术有限公司 Dialectical recommendation device and method and model training device
US10719736B1 (en) * 2019-04-02 2020-07-21 Accenture Global Solutions Limited Feature submission de-duplication engine
CN110096570B (en) * 2019-04-09 2021-03-30 苏宁易购集团股份有限公司 Intention identification method and device applied to intelligent customer service robot
CN110008314B (en) * 2019-04-12 2022-07-26 广东小天才科技有限公司 Intention analysis method and device
US11206229B2 (en) * 2019-04-26 2021-12-21 Oracle International Corporation Directed acyclic graph based framework for training models
CN110188353B (en) * 2019-05-28 2021-02-05 百度在线网络技术(北京)有限公司 Text error correction method and device
CN112148959A (en) * 2019-06-27 2020-12-29 百度在线网络技术(北京)有限公司 Information recommendation method and device
CN110377712B (en) * 2019-07-12 2023-05-16 腾讯科技(深圳)有限公司 Intelligent session switching method, device, equipment and storage medium
CN110334201B (en) * 2019-07-18 2021-09-21 中国工商银行股份有限公司 Intention identification method, device and system
CN111881266B (en) * 2019-07-19 2024-06-07 马上消费金融股份有限公司 Response method and device
CN110569341B (en) * 2019-07-25 2023-04-07 深圳壹账通智能科技有限公司 Method and device for configuring chat robot, computer equipment and storage medium
CN110659208A (en) * 2019-09-17 2020-01-07 北京声智科技有限公司 Test data set updating method and device
CN110659361B (en) * 2019-10-11 2023-01-17 卢卡(北京)智能科技有限公司 Conversation method, device, equipment and medium
CN111124350B (en) * 2019-12-20 2023-10-27 科大讯飞股份有限公司 Skill determination method and related equipment
CN111694939B (en) * 2020-04-28 2023-09-19 平安科技(深圳)有限公司 Method, device, equipment and storage medium for intelligent robot calling
CN112597287B (en) * 2020-12-15 2024-03-15 深圳市优必选科技股份有限公司 Statement processing method, statement processing device and intelligent equipment
CN113935309A (en) * 2021-09-13 2022-01-14 惠州市德赛西威汽车电子股份有限公司 Skill optimization processing method and system based on semantic platform
CN113938755A (en) * 2021-09-18 2022-01-14 海信视像科技股份有限公司 Server, terminal device and resource recommendation method
CN114021576A (en) * 2021-10-27 2022-02-08 四川启睿克科技有限公司 Text-based natural language understanding decision method
WO2023086050A1 (en) * 2021-11-10 2023-05-19 Dogus Bilgi Islem Ve Teknoloji Hiz. A.S. A language analysis system
CN117828065B (en) * 2024-03-06 2024-05-03 深圳荣灿大数据技术有限公司 Digital person customer service method, system, device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102810117A (en) * 2012-06-29 2012-12-05 北京百度网讯科技有限公司 Method and equipment for supplying search result
CN102819601A (en) * 2012-08-15 2012-12-12 中国联合网络通信集团有限公司 Information retrieval method and information retrieval equipment
CN103309846A (en) * 2013-06-26 2013-09-18 北京云知声信息技术有限公司 Method and device for processing natural language information
CN104199810A (en) * 2014-08-29 2014-12-10 科大讯飞股份有限公司 Intelligent service method and system based on natural language interaction
CN104471568A (en) * 2012-07-02 2015-03-25 微软公司 Learning-based processing of natural language questions
CN105096942A (en) * 2014-05-21 2015-11-25 清华大学 Semantic analysis method and semantic analysis device
CN106970909A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Semantic analysis method based on secondary matching
CN107832286A (en) * 2017-09-11 2018-03-23 远光软件股份有限公司 Intelligent interactive method, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8442972B2 (en) * 2006-10-11 2013-05-14 Collarity, Inc. Negative associations for search results ranking and refinement
US8782069B2 (en) * 2009-06-11 2014-07-15 Chacha Search, Inc Method and system of providing a search tool
US10459970B2 (en) * 2016-06-07 2019-10-29 Baidu Usa Llc Method and system for evaluating and ranking images with content based on similarity scores in response to a search query

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Wu Fei et al.; Temporal Interaction and Causal Influence in Community-Based Question Answering; IEEE Transactions on Knowledge and Data Engineering; 2017 *
Zhang Xiao; Research on Chinese Semantic Parsing Technology Based on LSTM Neural Networks; China Master's Theses Full-text Database, Information Science and Technology Series; 2018-04-15; I138-3731 *
Wang Zhenghua; Research and Implementation of an Automatic Question Answering System; China Master's Theses Full-text Database, Information Science and Technology Series; 2016 *

Also Published As

Publication number Publication date
CN108829757A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108829757B (en) Intelligent service method, server and storage medium for chat robot
CN109840287B (en) Cross-modal information retrieval method and device based on neural network
CN107609101B (en) Intelligent interaction method, equipment and storage medium
CN108287822B (en) Chinese similarity problem generation system and method
CN110121706B (en) Providing responses in a conversation
WO2020114429A1 (en) Keyword extraction model training method, keyword extraction method, and computer device
CN108875074B (en) Answer selection method and device based on cross attention neural network and electronic equipment
WO2021253904A1 (en) Test case set generation method, apparatus and device, and computer readable storage medium
WO2023060795A1 (en) Automatic keyword extraction method and apparatus, and device and storage medium
CN117033608A (en) Knowledge graph generation type question-answering method and system based on large language model
CN107943784B (en) Relationship extraction method based on generation of countermeasure network
CN108829719A (en) Non-factoid question answer selection method and system
CN110334179B (en) Question-answer processing method, device, computer equipment and storage medium
CN110795552A (en) Training sample generation method and device, electronic equipment and storage medium
CN111353306B (en) Entity relationship and dependency Tree-LSTM-based combined event extraction method
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
WO2022048194A1 (en) Method, apparatus and device for optimizing event subject identification model, and readable storage medium
CN110096572B (en) Sample generation method, device and computer readable medium
CN113220890A (en) Deep learning method combining news headlines and news long text contents based on pre-training
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN113408287B (en) Entity identification method and device, electronic equipment and storage medium
CN113392265A (en) Multimedia processing method, device and equipment
CN112632258A (en) Text data processing method and device, computer equipment and storage medium
CN112307182A (en) Question-answering system-based pseudo-correlation feedback extended query method
TWI734085B (en) Dialogue system using intention detection ensemble learning and method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant