CN113849634B - Method for improving interpretability of depth model recommendation scheme - Google Patents

Method for improving interpretability of depth model recommendation scheme

Info

Publication number
CN113849634B
CN113849634B (application CN202110225889.3A)
Authority
CN
China
Prior art keywords: text, return visit, user, digest, user return
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110225889.3A
Other languages
Chinese (zh)
Other versions
CN113849634A (en)
Inventor
曹靖城
张继东
王培才
仇东平
王猛德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Shilian Technology Co., Ltd.
Original Assignee
Tianyi Shilian Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Shilian Technology Co., Ltd.
Priority to CN202110225889.3A
Publication of CN113849634A
Application granted
Publication of CN113849634B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G06F 16/33 - Querying
    • G06F 16/3331 - Query processing
    • G06F 16/3332 - Query translation
    • G06F 16/3335 - Syntactic pre-processing, e.g. stopword elimination, stemming
    • G06F 16/334 - Query execution
    • G06F 16/3346 - Query execution using probabilistic model
    • G06F 16/34 - Browsing; Visualisation therefor
    • G06F 16/345 - Summarisation for human users
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 - Classification based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus false rejection rate
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G06N 3/047 - Probabilistic or stochastic networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for improving the interpretability of a deep-model recommendation scheme. The method may include: preprocessing user return visit data; training a summary classifier; extracting features from the return visit data; generating a summary with a summary decoder and calculating a summary generation loss; calculating a summary classification loss based on the summary classifier; calculating a return visit text classification loss; updating the model parameters with the aim of minimizing the sum of the summary generation loss, the summary classification loss, and the return visit text classification loss; and obtaining a service recommendation model from the trained parameters. The invention further provides a method for generating a service recommendation scheme and a user return visit summary from user return visit text, in which the generated summary can be corrected using a sequence copy mechanism. The invention can significantly improve the accuracy and interpretability of user service demand prediction.

Description

Method for improving interpretability of depth model recommendation scheme
Technical Field
The present invention relates to natural language processing, and more particularly, to a method for improving the interpretability of deep-model recommendation schemes.
Background
The task of predicting user service demand can be abstracted as a text classification task in natural language processing, and related algorithms can replace manual work with automatic intention recognition. Text classification means that, for a given piece of unstructured text, a classification algorithm or model determines the category to which the text belongs. Traditional machine learning algorithms extract text features through manual feature engineering, which limits the accuracy and robustness of classification. Deep learning algorithms based on conventional recurrent and convolutional neural networks place high demands on the quality of training data, so a more accurate and effective classification algorithm needs to be developed for recognizing user intentions.
In addition, the interpretability of deep learning models is a problem under continuing discussion and study in industry, including in the field of natural language processing. Existing studies that attempt to make models interpretable typically explain the model output or the link between output and input. Conventional user service recommendation methods, however, often ignore the user's return visit summary for the recommended service. This discards a great deal of fine-grained information (such as the textual explanation of a label), and the system cannot produce a human-readable explanation after recommending a service. The return visit summary is typically text produced by customer service personnel after distilling the user's service requirements. If the user's evaluation content can be accurately predicted and generated, the effectiveness of the classifier can be improved, and using the generated content as an explanation of the classification result improves the interpretability and robustness of the classifier.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In view of the above-described drawbacks of the prior art, an object of the present invention is to improve the accuracy and interpretability of user service demand prediction and to improve the quality of recommended services.
According to a first aspect of the present invention, there is provided a method for training a service recommendation model including a summary classifier, a feature encoder, a summary decoder, and a return visit text classifier, each implemented using an artificial neural network. The method may include: step (1): obtaining user return visit text together with the user return visit summary and service recommendation scheme corresponding to that text, and preprocessing the text and summary to obtain a user return visit dataset; step (2): training the summary classifier using the user return visit summaries in the dataset, with the corresponding service recommendation schemes as labels; step (3): extracting features from the user return visit text in the dataset using the feature encoder to obtain a return visit text hidden state vector; step (4): inputting the hidden state vector into the summary decoder to obtain a generated summary and calculating a summary generation loss; step (5): calculating a summary classification loss based on the trained summary classifier; step (6): calculating a return visit text classification loss based on the return visit text classifier; step (7): training and updating the parameters of the feature encoder, summary decoder, and return visit text classifier with the aim of minimizing the sum of the summary generation loss, the summary classification loss, and the return visit text classification loss; and repeating steps (3)-(7) until the parameters converge, thereby completing the training of the service recommendation model.
Optionally, the user return visit dataset may include a training set, a validation set, and a test set, divided in the ratio 6:2:2.
Optionally, the summary classifier may be trained using a text convolutional neural network (TextCNN) model.
Optionally, step (3) may further include: performing feature extraction on the user return visit text in the user return visit dataset using a bidirectional long short-term memory (Bi-LSTM) feature encoder based on an attention mechanism; and obtaining, from the extracted features, the encoded return visit text hidden state vector of the user return visit text.
Optionally, step (4) may further include: inputting the encoded return visit text hidden state vector of the user return visit text into a summary decoder employing a long short-term memory (LSTM) network to obtain a generated summary; calculating a bilingual evaluation understudy (BLEU) score from the generated summary and a label summary, where the label summary is the user return visit summary corresponding to the user return visit text in the user return visit dataset; and determining the summary generation loss based on the BLEU score.
Optionally, step (5) may further include: inputting the label summary and the generated summary into the summary classifier trained in step (2) to obtain, for each, a probability distribution over service recommendation schemes; taking the probability assigned to the true service recommendation scheme under the label summary as a first probability; taking the probability assigned to the true service recommendation scheme under the generated summary as a second probability; and calculating the absolute value of the difference between the first probability and the second probability as the summary classification loss.
Optionally, step (6) may further include: inputting the encoded return visit text hidden state vector of the user return visit text into a return visit text classifier employing a text convolutional neural network (TextCNN) model to obtain a probability distribution over service recommendation schemes; taking the probability assigned to the true service recommendation scheme as a third probability; and calculating the absolute value of the difference between the second probability and the third probability as the return visit text classification loss.
Optionally, updating of the parameters of the feature encoder, the summary decoder, and the return visit text classifier is stopped when the return visit text classification loss falls below a threshold.
According to a second aspect of the present invention, there is provided a method for generating a user return visit summary and a service recommendation scheme based on user return visit text. The method may include: obtaining a service recommendation model trained by the method of the invention; obtaining user return visit text and preprocessing it; and inputting the preprocessed user return visit text into the service recommendation model to generate the corresponding user return visit summary and service recommendation scheme.
Optionally, the method may further include: correcting the generated user return visit summary; and outputting the corrected user return visit summary.
By adopting the technical scheme provided by the invention, the accuracy and interpretability of user service demand prediction can be significantly improved.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
Fig. 1 illustrates a schematic diagram of a service recommendation model according to one embodiment of the invention.
Fig. 2 illustrates a flowchart of a method for training a service recommendation model according to one embodiment of the invention.
Fig. 3 illustrates a flowchart of a method for obtaining a user return visit summary and a service recommendation scheme from user return visit text using a trained service recommendation model, according to one embodiment of the invention.
Fig. 4 illustrates a block diagram of an apparatus for implementing a method according to the invention, according to one embodiment of the invention.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in describing the embodiments are briefly introduced below. The drawings in the following description are evidently only some examples or embodiments of the present invention, and those of ordinary skill in the art may apply the present invention to other similar situations according to these drawings without inventive effort. Unless otherwise apparent from the context or otherwise specified, like reference numerals in the figures refer to like structures or operations.
As used in the specification and in the claims, the terms "a," "an," and/or "the" do not denote the singular and may include the plural unless the context clearly dictates otherwise. Generally, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Flowcharts are used in the present invention to describe the operations performed by methods according to embodiments of the present invention. It should be understood that the operations are not necessarily performed precisely in the order shown. Rather, various steps may be processed in reverse order or simultaneously, and other operations may be added to or removed from these processes.
In natural language processing, text classification and text generation are traditionally two relatively independent subtasks. The invention trains the two subtask models simultaneously, unifying summary generation with service recommendation generation. Moreover, a conventional deep learning network generally predicts the text category directly in an end-to-end manner; it is a black-box model with no interpretability. The invention uses a text generation technique to produce a distilled return visit summary, so that the recommendation scheme generated by the model can be understood manually through the summary, while the recommendation accuracy of the model is also improved.
Fig. 1 illustrates a schematic diagram of a service recommendation model 100 according to one embodiment of the invention. In the present invention, the service recommendation model 100 refers to a model trained using artificial neural networks that can generate a corresponding user return visit summary (a text generation task) and a service recommendation scheme (a text classification task) from user return visit text. User return visit text is the content stated by the user during a conversation with customer service personnel, which can be obtained by speech recognition and converted into text. A user return visit summary is a condensed description of the return visit text, which may be text produced by customer service personnel after distilling the user's service requirements. A service recommendation scheme recommends a corresponding type of service to the user based on the user's requirements (e.g., based on the content of the return visit text and/or the return visit summary). The return visit text, the return visit summary, and the service recommendation scheme are therefore associated with one another. For example, the return visit text may be "the monthly data allowance in my mobile plan is insufficient, which incurs a lot of extra cost", the corresponding summary may be "insufficient data allowance in mobile plan", and the associated recommendation scheme may be "upgrade the mobile plan". In another example, the return visit text may be "the internet at home is slow, and movies and games stutter frequently", the corresponding summary may be "internet speed does not meet demand", and the associated recommendation scheme may be "upgrade the bandwidth". By adopting the service recommendation model 100, the return visit summary and the recommendation scheme can be generated automatically from the return visit text, providing a recommendation to the user while also giving staff directly readable summary text to aid service analysis.
In one embodiment of the invention, the service recommendation model 100 may include a plurality of components, including but not limited to a summary classifier, a feature encoder, a summary decoder, and a return visit text classifier, each implemented using an artificial neural network. The summary classifier generates a service recommendation scheme from a user return visit summary (i.e., a text classification task over the summary). The feature encoder performs feature extraction on the input return visit text to obtain a return visit text hidden state vector. As is known to those skilled in the art, for text generation tasks (e.g., sequence-to-sequence, Seq2Seq), the encoder-decoder architecture is commonly used: the encoder receives an input sequence and produces a hidden state vector encoding it, and the decoder generates an output sequence from that vector. The return visit text hidden state vector in the present invention is thus the hidden state vector produced by the feature encoder from the user return visit text. The summary decoder generates the user return visit summary from this hidden state vector (i.e., a text generation task over the return visit text). The return visit text classifier generates a service recommendation scheme from the return visit text (e.g., from its hidden state vector), i.e., a text classification task over the return visit text.
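Purely for illustration, the four components might be wired together as in the following PyTorch sketch. The patent publishes no code, so every class name, method signature, and the exact tensor passed to the return visit text classifier are assumptions:

```python
import torch.nn as nn

class ServiceRecommendationModel(nn.Module):
    """Sketch of model 100: one encoder feeding both a summary decoder (text
    generation) and a return visit text classifier (text classification)."""
    def __init__(self, encoder, summary_decoder, text_classifier):
        super().__init__()
        self.encoder = encoder                  # Bi-LSTM + attention (block 230 below)
        self.summary_decoder = summary_decoder  # LSTM decoder (block 240 below)
        self.text_classifier = text_classifier  # TextCNN-style module over hidden states

    def forward(self, text_ids):
        states, pooled = self.encoder(text_ids)       # return visit text hidden states
        summary_ids = self.summary_decoder(states)    # generated user return visit summary
        scheme_logits = self.text_classifier(states)  # service recommendation prediction
        return summary_ids, scheme_logits
```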
Fig. 2 illustrates a flowchart of a method 200 for training a service recommendation model according to one embodiment of the invention. In some examples, the method 200 may be performed by the apparatus 400 illustrated in Fig. 4, or by any other suitable device or means for performing the functions or algorithms described below. As described above, the service recommendation model may include a summary classifier, a feature encoder, a summary decoder, and a return visit text classifier, each implemented using an artificial neural network. Training the service recommendation model means training the parameters of these artificial neural networks, so as to obtain a model that can accurately generate a user return visit summary and a service recommendation scheme from user return visit text.
The method 200 may begin at block 210 (i.e., step (1)), in which user return visit text is obtained together with the user return visit summary and service recommendation scheme corresponding to that text, and the text and summary are preprocessed to obtain a user return visit dataset. As described above, the return visit text, return visit summary, and recommendation scheme are associated with one another, and historical or existing triples may be stored in a database. In one embodiment, such data may therefore be obtained from a database for training the service recommendation model. In another embodiment, the data may also be obtained from a computer, storage device, server, or the like.
In one embodiment, the data preprocessing may include building word lists of common function words and mood words, and using regular expressions to replace and remove the function words and mood words in the return visit text and return visit summary, so as to obtain valid user return visit data. The valid data may be divided proportionally into a training set, a validation set, and a test set and stored in a file format to obtain the user return visit dataset, where the ratio of training set to validation set to test set may be 6:2:2. The validation and test sets may be used to check the training behavior and generalization of the model.
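As a minimal sketch of this preprocessing and the 6:2:2 split, assuming a hypothetical function-word/mood-word list and simple (text, summary, scheme) records (none of these names come from the patent):

```python
import random
import re

# Hypothetical stand-in for the built word lists of function words and mood words.
FUNCTION_AND_MOOD_WORDS = ["的", "了", "吧", "呢", "啊", "嗯"]

def preprocess(text: str) -> str:
    """Remove function/mood words with a regular expression and collapse whitespace."""
    pattern = "|".join(map(re.escape, FUNCTION_AND_MOOD_WORDS))
    return re.sub(r"\s+", " ", re.sub(pattern, "", text)).strip()

def split_dataset(records, seed=42):
    """Shuffle (text, summary, scheme) records and split them 6:2:2."""
    rng = random.Random(seed)
    records = records[:]          # avoid mutating the caller's list
    rng.shuffle(records)
    n = len(records)
    return records[:int(0.6 * n)], records[int(0.6 * n):int(0.8 * n)], records[int(0.8 * n):]
```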
At block 220 (i.e., step (2)), the method 200 may include training the summary classifier using the user return visit summaries in the dataset, with the corresponding service recommendation schemes as labels. Once trained, the summary classifier can produce a probability distribution over service recommendation schemes from a summary (i.e., perform the text classification task). In one embodiment, the operations of block 220 may further include: replacing the user return visit summaries with word vectors (e.g., vectors pre-trained with the global vectors for word representation (GloVe) model or the word2vec model) to obtain a basic embedded representation of each summary, and then training a text convolutional neural network (TextCNN) classifier as the summary classifier with the corresponding service recommendation schemes as labels. Because classification is performed on short summary text, the TextCNN model performs well, reaching a classification accuracy above 95% in practice, so the summary classifier does not participate in subsequent model parameter updates.
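A minimal PyTorch sketch of such a TextCNN summary classifier; filter sizes, filter counts, and embedding dimension are illustrative assumptions rather than values given in the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """TextCNN over a token-id sequence; the embedding can be initialized
    from pre-trained GloVe or word2vec vectors as described above."""
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 kernel_sizes=(2, 3, 4), num_filters=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                        # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))         # logits over recommendation schemes
```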
At block 230 (i.e., step (3)), the method 200 may include extracting features from the user return visit text in the dataset using the feature encoder to obtain the return visit text hidden state vector. In one embodiment, the operations of block 230 may further include: replacing the return visit text with word vectors (e.g., vectors pre-trained with the GloVe or word2vec model) to obtain a basic embedded representation of the text; and then encoding it with a bidirectional long short-term memory (Bi-LSTM) feature encoder based on an attention mechanism to obtain the encoded return visit text hidden state vector.
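One plausible shape for such an encoder, as a hedged PyTorch sketch (the hidden sizes and the simple additive attention form are assumptions):

```python
import torch
import torch.nn as nn

class AttentiveBiLSTMEncoder(nn.Module):
    """Bi-LSTM encoder with a simple attention pooling layer."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, token_ids):                            # (batch, seq_len)
        states, _ = self.bilstm(self.embedding(token_ids))   # (batch, seq_len, 2*hidden_dim)
        weights = torch.softmax(self.attn(states).squeeze(-1), dim=1)
        pooled = torch.bmm(weights.unsqueeze(1), states).squeeze(1)
        return states, pooled   # per-token hidden states and a pooled hidden state vector
```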
At block 240 (i.e., step (4)), the method 200 may include inputting the return visit text hidden state vector into the summary decoder to obtain a generated summary and calculating the summary generation loss. In the present invention, the summary generation loss evaluates the quality of generation, i.e., the degree of difference between the summary produced by the decoder and the label summary. In one embodiment, the operations of block 240 may further include: inputting the encoded hidden state vector into a summary decoder employing a long short-term memory (LSTM) network to obtain a generated summary; calculating a bilingual evaluation understudy (BLEU) score from the generated summary and the label summary, where the label summary is the user return visit summary corresponding to the return visit text in the dataset; and determining the summary generation loss from the BLEU score. The BLEU score evaluates generated text in natural language processing tasks as an index of the difference between a candidate sentence and a reference sentence, ranging from 0.0 to 1.0: a perfect match scores 1.0, and a complete mismatch scores 0.0. In one example, the BLEU score may be calculated bigram-wise (Bi-gram) or in any other known manner. In another example, the summary generation loss may be inversely related to the BLEU score: the higher the BLEU score, the smaller the loss, and vice versa.
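A sketch of one reading of this loss, using NLTK's bigram BLEU. Note that BLEU over discrete tokens carries no gradient, so in practice this term would act as a score or reward rather than being back-propagated directly; the patent does not specify the mechanism:

```python
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

def summary_generation_loss(generated_tokens, label_tokens):
    """Loss inversely related to the Bi-gram BLEU score: a perfect match
    (BLEU = 1.0) gives zero loss; a complete mismatch gives loss 1.0."""
    bleu = sentence_bleu([label_tokens], generated_tokens,
                         weights=(0.5, 0.5),  # bigram BLEU
                         smoothing_function=SmoothingFunction().method1)
    return 1.0 - bleu
```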
At block 250 (i.e., step (5)), the method 200 may include calculating the summary classification loss based on the trained summary classifier. In the present invention, the summary classification loss evaluates how well the generated summary serves the recommendation task when fed to the summary classifier. In one embodiment, the operations of block 250 may further include: inputting the label summary and the generated summary into the summary classifier trained at block 220 to obtain, for each, a probability distribution over service recommendation schemes; taking the probability assigned to the true recommendation scheme under the label summary as a first probability; taking the probability assigned to the true recommendation scheme under the generated summary as a second probability; and calculating the absolute value of the difference between the first and second probabilities as the summary classification loss. Here, the true recommendation scheme is the service recommendation scheme corresponding to the label summary obtained at block 210.
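A compact sketch of the loss just described, assuming the summary classifier follows the TextCNN interface above and batches are token-id tensors (the batching details are assumptions):

```python
import torch

def summary_classification_loss(summary_classifier, label_ids, generated_ids, true_scheme):
    """|first probability - second probability| for the true recommendation scheme.
    The summary classifier is frozen: its parameters are simply excluded
    from the optimizer in step (7)."""
    p_label = torch.softmax(summary_classifier(label_ids), dim=-1)[:, true_scheme]
    p_generated = torch.softmax(summary_classifier(generated_ids), dim=-1)[:, true_scheme]
    return (p_label - p_generated).abs().mean()
```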
At block 260 (i.e., step (6)), the method 200 may include calculating the return visit text classification loss based on the return visit text classifier. In the present invention, this loss evaluates the effect of obtaining a recommendation scheme directly from the return visit text. In one embodiment, the operations of block 260 may further include: inputting the encoded return visit text hidden state vector into a return visit text classifier employing a TextCNN model to obtain a probability distribution over service recommendation schemes; taking the probability assigned to the true recommendation scheme as a third probability; and calculating the absolute value of the difference between the second and third probabilities as the return visit text classification loss. Here, the true recommendation scheme is the service recommendation scheme corresponding to the return visit text obtained at block 210.
At block 270 (i.e., step (7)), the method 200 may include training and updating the parameters of the feature encoder, summary decoder, and return visit text classifier with the aim of minimizing the sum of the summary generation loss, the summary classification loss, and the return visit text classification loss. In one embodiment, the sum of the summary classification loss and the return visit text classification loss may be called the EF factor, and a model parameter optimizer may update the feature encoder, summary decoder, and return visit text classifier to minimize the sum of the EF factor and the summary generation loss. In one embodiment, the optimizer may be an adaptive moment estimation (Adam) optimizer; in another embodiment, it may be any other suitable optimizer, such as an adaptive gradient (AdaGrad) optimizer or an RMSProp optimizer.
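As a schematic sketch of this update step, reusing the module names from the earlier sketches (the learning rate and the treatment of the non-differentiable BLEU term are assumptions):

```python
import itertools
import torch

# Only the encoder, summary decoder, and return visit text classifier are optimized;
# the trained summary classifier stays frozen.
optimizer = torch.optim.Adam(
    itertools.chain(encoder.parameters(),
                    summary_decoder.parameters(),
                    text_classifier.parameters()),
    lr=1e-3)

def joint_update(loss_generation, loss_summary_cls, loss_text_cls):
    ef_factor = loss_summary_cls + loss_text_cls           # the "EF factor"
    total = torch.as_tensor(loss_generation) + ef_factor   # BLEU term enters as a constant here
    optimizer.zero_grad()
    total.backward()    # gradients flow through the two classification terms
    optimizer.step()
    return float(total)
```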
At block 280, the method 200 may include determining whether the model parameters have converged. If they have (e.g., the sum of the three losses cannot be reduced further, or only by a very small amount), the method 200 ends and training of the service recommendation model is complete; otherwise, the operations of blocks 230-270 are repeated. In one embodiment, when the return visit text classification loss falls below a threshold, updating of the parameters of the feature encoder, summary decoder, and return visit text classifier is stopped, to prevent problems such as over-shortened generated summaries or overfitting.
Fig. 3 illustrates a flowchart of a method 300 for obtaining a user return visit summary and a service recommendation scheme from user return visit text using a trained service recommendation model, according to one embodiment of the invention.
In some examples, the method 300 may be performed by the apparatus 400 illustrated in Fig. 4, or by any other suitable device or means for performing the functions or algorithms described below.
At block 310, the method 300 may include obtaining a service recommendation model trained by the method 200.
At block 320, the method 300 may include obtaining user return visit text and preprocessing it. In one embodiment, the return visit text may be obtained during the conversation with the user by recognizing the user's speech and converting it into text. In one embodiment, the data preprocessing may include building word lists of common function words and mood words, and using regular expressions to replace and remove those words in the return visit text so as to obtain valid return visit data.
At block 330, the method 300 may include inputting the preprocessed user return visit text into the service recommendation model to generate the corresponding user return visit summary and service recommendation scheme.
Optionally, the method 300 may further include correcting the generated user return visit summary and outputting the corrected summary. Conventional text generation models typically discard low-frequency, domain-specific vocabulary in order to obtain higher text generation scores: words of the input sequence that are not in the vocabulary are usually replaced with an <UNK> tag, and <UNK> tags may likewise appear in the output sequence. In one embodiment, a summary corrector based on the Soft-Attention mechanism may be applied to the generated summary; it uses a sequence copy mechanism to replace each <UNK> tag in the generated summary, solving the problem of low-frequency vocabulary being ignored by the summary generator.
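The patent does not detail the corrector, but the copy idea can be sketched as follows: at each decoding step where the model emitted <UNK>, copy the source token with the highest attention weight (all function and variable names are illustrative):

```python
def correct_summary(generated_ids, attention, source_tokens, unk_id, id2word):
    """Replace each <UNK> in the generated summary with the input token that
    received the highest attention weight at that decoding step."""
    corrected = []
    for step, token_id in enumerate(generated_ids):
        if token_id == unk_id:
            best = max(range(len(source_tokens)), key=lambda j: attention[step][j])
            corrected.append(source_tokens[best])  # copy the low-frequency word verbatim
        else:
            corrected.append(id2word[token_id])
    return corrected
```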
Comparing the method with conventional service recommendation model algorithms, introducing the user return visit summary through the EF factor and the sequence copy mechanism strengthens the feature representation capability of the model, produces distilled text that service personnel can read, and improves the interpretability of the model, which is of real significance for service recommendation by service personnel; clear gains also appear on multiple indices. Model evaluation uses accuracy, recall, AUC, and log loss: accuracy is the number of correct recommendations predicted by the model divided by the total number of recommendations; recall is the number of correctly predicted and successfully marketed recommendations divided by the number of recommendation marketing attempts; AUC measures the model's ability to rank positive samples above negative ones (the larger the AUC, the higher the success rate of samples predicted positive); and log loss evaluates the fitting ability of the model (the smaller the log loss, the better the fit). A rough computation sketch follows Table 1.
Specific indices are shown in Table 1 below:

Method              Recall   Accuracy   AUC      Log loss
Present invention   0.2364   0.3563     0.2523   0.3824
MLP                 0.1934   0.3021     0.1967   0.3921
RNNs                0.2039   0.3114     0.1992   0.3974
CNNs                0.2014   0.3301     0.2143   0.3945
FM                  0.1984   0.2945     0.1967   0.4245
GBDT+LR             0.1934   0.3209     0.1932   0.4394

Table 1: Comparison of the method of the present invention with conventional service recommendation model algorithms
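For reference, the four indices might be computed roughly as follows, assuming binary accepted/rejected outcomes per recommendation (the binary framing and the scikit-learn usage are assumptions, not part of the patent):

```python
from sklearn.metrics import log_loss, roc_auc_score

def evaluate(y_true, y_score, y_pred):
    """y_true: 1 if the recommendation was correct/accepted; y_score: predicted
    probability; y_pred: hard prediction. Mirrors the index definitions above."""
    accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
    recall = sum(p == t == 1 for p, t in zip(y_pred, y_true)) / max(1, sum(y_true))
    return {"accuracy": accuracy,
            "recall": recall,
            "auc": roc_auc_score(y_true, y_score),
            "log_loss": log_loss(y_true, y_score)}
```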
Fig. 4 illustrates a block diagram of an example hardware implementation of an apparatus 400 for carrying out methods according to one embodiment of the invention. The apparatus 400 may be implemented with a processing system 414 that includes one or more processors 404. Examples of the processor 404 include microprocessors, microcontrollers, digital signal processors (DSPs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described throughout this disclosure. In various examples, the apparatus 400 may be configured to perform any one or more of the functions described herein. That is, the processor 404 as utilized in the apparatus 400 may be used to implement the method 200 described above with reference to Fig. 2 and/or the method 300 described with reference to Fig. 3.
In this example, the processing system 414 may be implemented with a bus architecture, represented generally by the bus 402. The bus 402 may include any number of interconnecting buses and bridges depending on the specific application of the processing system 414 and the overall design constraints. The bus 402 communicatively couples together various circuits including one or more processors (represented generally by the processor 404), a memory 405, and computer-readable media (represented generally by the computer-readable medium 406). The bus 402 may also link various other circuits such as timing sources, peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further. A bus interface 408 provides an interface between the bus 402 and a transceiver 410. The transceiver 410 provides a communication interface or means for communicating with various other apparatus over a transmission medium. Depending on the nature of the apparatus, a user interface 412 (e.g., keypad, display, speaker, microphone, joystick) may also be provided. Such a user interface 412 is optional and may be omitted in some examples.
In some aspects, the processor 404 may be configured to: obtain user return visit text together with the corresponding user return visit summary and service recommendation scheme, and preprocess the text and summary to obtain a user return visit dataset; train a summary classifier using the summaries in the dataset with the corresponding recommendation schemes as labels; extract features from the return visit text using a feature encoder to obtain a return visit text hidden state vector; input the hidden state vector into a summary decoder to obtain a generated summary and calculate a summary generation loss; calculate a summary classification loss based on the trained summary classifier; calculate a return visit text classification loss based on the return visit text classifier; and train and update the parameters of the feature encoder, summary decoder, and return visit text classifier with the aim of minimizing the sum of the three losses, until the parameters converge, thereby completing the training of the service recommendation model.
In other aspects, the processor 404 may be configured to: obtain a service recommendation model trained by the method of the invention; obtain user return visit text and preprocess it; and input the preprocessed text into the service recommendation model to generate the corresponding user return visit summary and service recommendation scheme.
The processor 404 is responsible for managing the bus 402 and general-purpose processing, including the execution of software stored on the computer-readable medium 406. The software, when executed by the processor 404, causes the processing system 414 to perform the various functions described for any particular apparatus. Computer-readable medium 406 and memory 405 may also be used for storing data that is manipulated by processor 404 when executing software.
One or more processors 404 in the processing system may execute software. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. The software may reside on the computer-readable medium 406, which may be a non-transitory computer-readable medium. By way of example, non-transitory computer-readable media include magnetic storage devices (e.g., hard disk, floppy disk, magnetic strip), optical disks (e.g., compact disc (CD) or digital versatile disc (DVD)), smart cards, flash memory devices (e.g., card, stick, or key drive), random access memory (RAM), read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), registers, removable disks, and any other suitable medium for storing software and/or instructions that may be accessed and read by a computer. The computer-readable medium 406 may reside in the processing system 414, external to the processing system 414, or distributed across multiple entities including the processing system 414. The computer-readable medium 406 may be embodied in a computer program product, which may, by way of example, include a computer-readable medium in packaging material. Those skilled in the art will recognize how best to implement the described functionality depending on the particular application and the overall design constraints imposed on the overall system.
In one or more examples, the computer-readable storage medium 406 may include software configured for various functions, including, for example, functions for training a service recommendation model and/or functions for generating user return visit summaries and service recommendation schemes from user return visit text using a trained service recommendation model. The software may include instructions that configure the processing system 414 to perform one or more of the functions described with reference to Fig. 2 and/or Fig. 3.
In the description of the present invention, it should be understood that the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It will be appreciated by one of ordinary skill in the art that various embodiments of the present invention may be provided as a method, apparatus, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-executable program code stored therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems and computer program products according to embodiments of the invention. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although aspects of the present invention have been described so far with reference to the accompanying drawings, the above-described methods, systems and apparatuses are merely examples, and the scope of the present invention is not limited to these aspects but is limited only by the appended claims and equivalents thereof. Various components may be omitted or replaced with equivalent components. In addition, the steps may also be implemented in a different order than described in the present invention. Furthermore, the various components may be combined in various ways. It is also important that as technology advances, many of the described components can be replaced by equivalent components that appear later. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for training a service recommendation model comprising a summary classifier, a feature encoder, a summary decoder, and a return visit text classifier, each implemented using an artificial neural network, the method comprising:
step (1): obtaining user return visit text together with a user return visit summary and a service recommendation scheme corresponding to the user return visit text, and preprocessing the user return visit text and the user return visit summary to obtain a user return visit dataset;
step (2): training the summary classifier using the user return visit summaries in the user return visit dataset, with the service recommendation schemes corresponding to the user return visit summaries as labels;
step (3): extracting features from the user return visit text in the user return visit dataset using the feature encoder to obtain a return visit text hidden state vector;
step (4): inputting the return visit text hidden state vector into the summary decoder to obtain a generated summary and calculating a summary generation loss;
step (5): calculating a summary classification loss based on the trained summary classifier;
step (6): calculating a return visit text classification loss based on the return visit text classifier;
step (7): training and updating parameters of the feature encoder, the summary decoder, and the return visit text classifier with the aim of minimizing the sum of the summary generation loss, the summary classification loss, and the return visit text classification loss; and
repeating steps (3)-(7) until the parameters converge, thereby completing the training of the service recommendation model.
2. The method of claim 1, wherein the user return visit dataset comprises a training set, a validation set, and a test set in the ratio 6:2:2.
3. The method of claim 1, wherein the summary classifier is trained using a text convolutional neural network model.
4. The method of claim 1, wherein step (3) further comprises:
extracting features from the user return visit text in the user return visit dataset using a bidirectional long short-term memory (Bi-LSTM) feature encoder based on an attention mechanism; and
obtaining the encoded return visit text hidden state vector of the user return visit text from the extracted features.
5. The method of claim 4, wherein step (4) further comprises:
inputting the encoded return visit text hidden state vector of the user return visit text into a summary decoder employing a long short-term memory (LSTM) network to obtain a generated summary;
calculating a bilingual evaluation understudy (BLEU) score based on the generated summary and a label summary, wherein the label summary is the user return visit summary corresponding to the user return visit text in the user return visit dataset; and
determining the summary generation loss based on the BLEU score.
6. The method of claim 5, wherein step (5) further comprises:
inputting the label summary and the generated summary into the summary classifier trained in step (2) to obtain probability distributions over service recommendation schemes for the label summary and the generated summary respectively;
taking the probability corresponding to the true service recommendation scheme from the probability distribution associated with the label summary as a first probability;
taking the probability corresponding to the true service recommendation scheme from the probability distribution associated with the generated summary as a second probability; and
calculating the absolute value of the difference between the first probability and the second probability as the summary classification loss.
7. The method of claim 6, wherein step (6) further comprises:
inputting the encoded return visit text hidden state vector of the user return visit text into a return visit text classifier employing a text convolutional neural network model to obtain a probability distribution over service recommendation schemes;
taking the probability corresponding to the true service recommendation scheme from the probability distribution as a third probability; and
calculating the absolute value of the difference between the second probability and the third probability as the return visit text classification loss.
8. The method of claim 1, wherein updating of the parameters is stopped when the return visit text classification loss falls below a threshold.
9. A method for generating a user return visit summary and a service recommendation scheme based on user return visit text, the method comprising:
obtaining a service recommendation model trained by the method of any one of claims 1-8;
obtaining user return visit text and preprocessing the user return visit text; and
inputting the preprocessed user return visit text into the service recommendation model to generate a corresponding user return visit summary and service recommendation scheme.
10. The method of claim 9, further comprising:
correcting the generated user return visit summary; and
outputting the corrected user return visit summary.
CN202110225889.3A 2021-03-01 2021-03-01 Method for improving interpretability of depth model recommendation scheme Active CN113849634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110225889.3A CN113849634B (en) 2021-03-01 2021-03-01 Method for improving interpretability of depth model recommendation scheme

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110225889.3A CN113849634B (en) 2021-03-01 2021-03-01 Method for improving interpretability of depth model recommendation scheme

Publications (2)

Publication Number Publication Date
CN113849634A CN113849634A (en) 2021-12-28
CN113849634B true CN113849634B (en) 2024-04-16

Family

ID=78972833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110225889.3A Active CN113849634B (en) 2021-03-01 2021-03-01 Method for improving interpretability of depth model recommendation scheme

Country Status (1)

Country Link
CN (1) CN113849634B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114970552B (en) * 2022-07-27 2022-10-11 成都乐超人科技有限公司 User return visit information analysis method, device, equipment and medium based on micro-service


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11748613B2 (en) * 2019-05-10 2023-09-05 Baidu Usa Llc Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109947931A (en) * 2019-03-20 2019-06-28 华南理工大学 Text automatic abstracting method, system, equipment and medium based on unsupervised learning
CN110929030A (en) * 2019-11-07 2020-03-27 电子科技大学 Text abstract and emotion classification combined training method
CN111639176A (en) * 2020-05-29 2020-09-08 厦门大学 Real-time event summarization method based on consistency monitoring

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Automatic Classification of Customer Service Work Orders Based on a Pre-trained BERT Model; Ren Ying; Yunnan Electric Power Technology; 2020-02-15 (No. 01); full text *

Also Published As

Publication number Publication date
CN113849634A (en) 2021-12-28

Similar Documents

Publication Publication Date Title
CN108710704B (en) Method and device for determining conversation state, electronic equipment and storage medium
CN115310425B (en) Policy text analysis method based on policy text classification and key information identification
CN111783993A (en) Intelligent labeling method and device, intelligent platform and storage medium
CN115357719B (en) Power audit text classification method and device based on improved BERT model
CN115599901B (en) Machine question-answering method, device, equipment and storage medium based on semantic prompt
CN113791757A (en) Software requirement and code mapping method and system
CN111950295A (en) Method and system for training natural language processing model
CN114091466A (en) Multi-modal emotion analysis method and system based on Transformer and multi-task learning
CN112349294A (en) Voice processing method and device, computer readable medium and electronic equipment
CN115359321A (en) Model training method and device, electronic equipment and storage medium
CN113849634B (en) Method for improving interpretability of depth model recommendation scheme
CN115293794A (en) Software cost evaluation method and system based on intelligent scale recognition
CN115099310A (en) Method and device for training model and classifying enterprises
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
CN114817467A (en) Intention recognition response method, device, equipment and storage medium
CN113486174A (en) Model training, reading understanding method and device, electronic equipment and storage medium
CN113705207A (en) Grammar error recognition method and device
CN116842263A (en) Training processing method and device for intelligent question-answering financial advisor model
CN115936003A (en) Software function point duplicate checking method, device, equipment and medium based on neural network
CN113297385B (en) Multi-label text classification system and method based on improved GraphRNN
CN115617959A (en) Question answering method and device
CN112860843A (en) News long text sentiment analysis method and device
CN113570455A (en) Stock recommendation method and device, computer equipment and storage medium
CN113837910B (en) Test question recommending method and device, electronic equipment and storage medium
CN116414965B (en) Initial dialogue content generation method, device, medium and computing equipment

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
TA01: Transfer of patent application right
  Effective date of registration: 2022-02-08
  Address after: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072
  Applicant after: Tianyi Digital Life Technology Co.,Ltd.
  Address before: 201702 3rd floor, 158 Shuanglian Road, Qingpu District, Shanghai
  Applicant before: Tianyi Smart Family Technology Co.,Ltd.
TA01: Transfer of patent application right
  Effective date of registration: 2024-03-15
  Address after: Unit 1, Building 1, China Telecom Zhejiang Innovation Park, No. 8 Xiqin Street, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province, 311100
  Applicant after: Tianyi Shilian Technology Co.,Ltd., China
  Address before: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072
  Applicant before: Tianyi Digital Life Technology Co.,Ltd., China
GR01: Patent grant