CN111813915A - Message interaction method, device, equipment and computer readable storage medium - Google Patents

Info

Publication number
CN111813915A
Authority
CN
China
Prior art keywords
message
query message
query
user
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010705545.8A
Other languages
Chinese (zh)
Inventor
高波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010705545.8A priority Critical patent/CN111813915A/en
Publication of CN111813915A publication Critical patent/CN111813915A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Disclosed is a message interaction method, including: receiving a query message from a user; matching the query message in a database, wherein the database has one or more first question sets and corresponding first response message sets generated by a first clustering mode, and the first clustering mode uses a machine learning-based model; and, upon determining that no response message to the query message exists in the database: acquiring a response message for the query message; updating the first clustering mode and the database using the query message and the response message for the query message; and pushing the response message for the query message to the user. A corresponding apparatus, device, and computer-readable storage medium are also disclosed.

Description

Message interaction method, device, equipment and computer readable storage medium
Technical Field
The present application relates to message interaction, and more particularly, to a message interaction method, apparatus, device, and computer-readable storage medium.
Background
Traditional customer inquiry service is usually provided by a call center, which requires dedicated software and hardware systems as well as human customer service staff, so the cost is high. With the development and spread of the Internet in recent years, a new generation of Internet-based customer service systems has been applied in many fields. Such Internet-based customer service systems fall into two categories: manual customer service systems and automatic customer service systems.
An Internet-based manual customer service system uses the Internet as its communication infrastructure. Internet-based service software establishes a conversation connection between the inquiring user and a human customer service agent, over which the two can communicate directly by text, voice, or video. Answers to user queries still have to be produced manually, so the cost remains high, but manual answers are highly accurate and customer satisfaction is high.
An Internet-based automatic customer service system uses machine learning technology to analyze and understand the query questions sent by users and, supported by a database (also called a knowledge base), gives answers automatically without the participation of human customer service agents, which greatly reduces labor cost. For an automatic customer service system in a specific field, the Frequently Asked Questions (FAQ) of that field are often compiled into the database that supports automatic customer service. Such a system has advantages such as low cost and real-time online availability, but its drawbacks are that the range of questions it can answer, its answer rate, and its user experience are all limited.
Disclosure of Invention
The embodiment of the invention provides a message interaction method, a message interaction device, message interaction equipment and a computer-readable storage medium.
According to a first aspect of the present invention, there is provided a message interaction method, including: receiving a query message from a user; matching the query message in a database, wherein the database has one or more first question sets and corresponding first response message sets generated by a first clustering mode, the first clustering mode uses a machine learning-based model, each first question set has one or more similar questions, and each corresponding first response message set has one or more response messages for the one or more similar questions of that first question set; and, upon determining that no response message to the query message exists in the database: acquiring a response message for the query message; updating the first clustering mode and the database using the query message and the response message for the query message; and pushing the response message for the query message to the user.
In one embodiment, there are a plurality of query messages, and the method further includes: clustering, by a second clustering mode, all query messages for which no corresponding response message exists in the database, to form one or more second question sets, each second question set having one or more similar questions, so that response messages for these query messages can be acquired in batches, wherein the second clustering mode uses a machine learning-based model.
In one embodiment, the method further includes: refining the query message for which no response message exists in the database by using the basic attributes of the query message and/or the user portrait, so as to acquire a response message for the refined query message.
In one embodiment, the method further includes: for the refined query messages in each second question set, extracting their keywords, determining a predetermined number of keywords with the highest occurrence frequency, and presenting these keywords together with the corresponding refined query messages.
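As a non-limiting illustration, the keyword step above might be sketched as follows. The whitespace tokenizer, the stop-word list, and the function names `top_keywords` and `present` are assumptions made for this sketch and are not part of the disclosure; Chinese text would need a word segmenter instead of `split()`.

```python
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "of", "to", "in", "how", "is", "i", "do", "where"}


def top_keywords(queries, k=3):
    """Return the k most frequent non-stop-word tokens across the queries."""
    counts = Counter()
    for query in queries:
        for token in query.lower().split():
            token = token.strip("?.,!")  # drop trailing punctuation
            if token and token not in STOP_WORDS:
                counts[token] += 1
    return [word for word, _ in counts.most_common(k)]


def present(queries, k=3):
    """Bundle the frequent keywords with the queries they were drawn from."""
    return {"keywords": top_keywords(queries, k), "queries": list(queries)}


example_set = [
    "melee weapon in training ground",
    "training ground melee weapon?",
    "where is melee weapon",
]
view = present(example_set, k=2)
```

An operator reviewing a second question set would then see the keywords next to the full queries, which is the presentation the embodiment describes.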
In one embodiment, the method further includes: after receiving the query message of the user and before matching the query message in the database, filtering the query message to remove any one or more of: a query message containing only numbers; a query message containing only symbols; a query message whose word count exceeds a threshold number; a query message that repeats a previous message; a query message outside a time range; a query message in which the same character exceeds a threshold proportion; and a query message in which symbols, letters, or numbers exceed a threshold proportion.
In one embodiment, the method further includes: after receiving the query message of the user and before matching the query message in the database, filtering the query message to remove any one or more of: a query message containing only numbers; a query message containing only symbols; a query message whose word count exceeds a threshold number; a query message that repeats a previous message; and a query message outside a time range.
In one embodiment, the filtering also removes: the repeated characters from a query message in which the same character exceeds a threshold proportion; and the symbols, letters, or numbers from a query message in which symbols, letters, or numbers exceed a threshold proportion.
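A minimal sketch of such a pre-matching filter is given below, assuming that a filtered message is simply dropped. The threshold values and the function name `should_drop` are illustrative assumptions. The repeat-message and time-range rules are left out, and the letter-proportion rule is omitted because it only makes sense for text that is not written in Latin letters.

```python
import re
from collections import Counter

# All thresholds are assumed values for illustration only.
MAX_WORDS = 50
MAX_SAME_CHAR_RATIO = 0.6


def should_drop(text, max_words=MAX_WORDS, max_same_ratio=MAX_SAME_CHAR_RATIO):
    """Return True if the query message should be filtered out before matching."""
    stripped = text.strip()
    if not stripped:
        return True
    compact = re.sub(r"\s+", "", stripped)  # ignore whitespace in character checks
    if compact.isdigit():
        return True  # contains only numbers
    if all(not ch.isalnum() for ch in compact):
        return True  # contains only symbols
    if len(stripped.split()) > max_words:
        return True  # word count exceeds the threshold
    most_common_count = Counter(compact).most_common(1)[0][1]
    if most_common_count / len(compact) > max_same_ratio:
        return True  # one character dominates the message
    return False
```

For instance, `should_drop("12345")` and `should_drop("?!...")` are both true, while an ordinary question passes through to the matching step.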
In one embodiment, filtering the query message to remove a query message that repeats a previous message includes: calculating an edit distance, as a similarity measure, between the previous message and the query message; and removing the query message when the edit distance is below a preset edit distance threshold.
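The repeat check can be illustrated with the classic Levenshtein edit distance. This is a sketch under the assumption that the distance is computed character by character; the threshold value of 3 and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def is_repeat(previous_message, query_message, threshold=3):
    """Treat the query as a repeat when its edit distance to the previous message is below the threshold."""
    return edit_distance(previous_message, query_message) < threshold
```

With these definitions, a query that differs from the previous message by a single character is removed as a repeat, while an unrelated query is kept.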
In one embodiment, the first clustering mode performs clustering based on any one or more of: similarity distance, edit distance, and semantic distance.
In one embodiment, the first clustering mode is the same as the second clustering mode, and clustering is performed based on any one or more of: similarity distance, edit distance, and semantic distance.
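As a rough illustration of clustering by similarity distance, the sketch below groups queries greedily around the first member of each cluster. It substitutes the string similarity ratio from Python's standard `difflib` module for the machine learning-based model of the embodiments; the 0.6 threshold and the function names are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a learned similarity model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_queries(queries, threshold=0.6):
    """Greedy clustering: each query joins the first cluster whose representative
    (its first member) is similar enough; otherwise it starts a new cluster."""
    clusters = []
    for query in queries:
        for cluster in clusters:
            if similarity(query, cluster[0]) >= threshold:
                cluster.append(query)
                break
        else:
            clusters.append([query])
    return clusters


unanswered = [
    "training ground melee weapon",
    "training ground melee weapons",
    "how to recharge",
]
question_sets = cluster_queries(unanswered)
```

Each resulting list plays the role of one question set of similar questions, so an operator could supply a single response for the whole group.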
In one embodiment, after the matching, the method further includes: replying to the user based on the matching result, either with the matched response message or with a notification that the matching failed.
In one embodiment, it is determined that no response message to the query message exists in the database when any of the following occurs: the matching fails; similar messages input by the user continue to be received; or a dissatisfied evaluation given by the user for the matched response message is received.
In one embodiment, the method further includes: after the matching, receiving from the user a query message similar to the query message, and, based on receiving the similar query message, determining that no response message to the query message exists in the database.
In one embodiment, after the pushing, the method further includes: receiving and recording the user's evaluation.
According to a second aspect of the present invention, there is provided a message interaction apparatus, including: a receiving module configured to receive a query message from a user; a matching module configured to match the query message in a database, the database having one or more first question sets and respectively corresponding first response message sets generated by a first clustering mode that uses a machine learning-based model, each first question set having one or more similar questions, and each corresponding first response message set having one or more response messages for the one or more similar questions of that first question set; and a patching module configured to, upon determining that no response message to the query message exists in the database: acquire a response message for the query message; update the first clustering mode and the database using the query message and the response message for the query message; and push the response message for the query message to the user.
According to a third aspect of the present invention, there is provided a message interaction device, comprising: a processor; and a memory configured to have stored thereon computer-executable instructions that, when executed in the processor, cause the method according to the first aspect of the invention to be carried out.
According to a fourth aspect of the present invention there is provided a computer readable storage medium having stored thereon instructions which, when run on a computer, cause the computer to carry out the method according to the first aspect of the present invention.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a basic block diagram illustrating a message interaction method according to an embodiment of the present invention.
Fig. 2a illustrates a flow chart of a message interaction method according to an embodiment of the present invention.
Fig. 2b illustrates a further flow diagram of a message interaction method according to an embodiment of the present invention.
Fig. 2c illustrates another flowchart of a message interaction method according to an embodiment of the present invention.
FIG. 3a illustrates an interface for interacting with a database in accordance with an embodiment of the present invention.
FIG. 3b illustrates an interface for message interaction according to an embodiment of the present invention.
FIG. 3c illustrates another interface for message interaction according to an embodiment of the present invention.
FIG. 3d illustrates yet another interface for message interaction in accordance with an embodiment of the present invention.
FIG. 3e illustrates yet another interface for message interaction according to an embodiment of the present invention.
Fig. 3f illustrates a flow diagram of message interaction according to an embodiment of the present invention.
Fig. 3g illustrates a message to be filtered according to an embodiment of the present invention.
Fig. 3h illustrates a further message to be filtered according to an embodiment of the present invention.
Fig. 4 illustrates a block diagram of a message interaction device according to an embodiment of the present invention.
FIG. 5 illustrates a diagram of a hardware environment related to message interaction, according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Fig. 1 is a basic architecture diagram of a message interaction method according to an embodiment of the present invention. Referring to Fig. 1, the implementation environment may include at least one terminal 101 and at least one server 102, connected via a network 103. The at least one terminal 101 may be any terminal capable of sending a message, including but not limited to a smart phone, a desktop computer, a laptop computer, a tablet computer, a game console, an in-vehicle computer, a smart speaker, and other devices with integrated smart functions. The messages include query messages sent by the terminal 101 and response messages sent by the server 102, and include but are not limited to text, image, audio, and video messages. Generally, after logging in on the terminal, a user can send a query message to the server 102; the server 102 matches the query message in a database to generate a corresponding response message and sends it to the terminal 101. A man-machine conversation is carried out by repeating these steps.
Note that the data in the database includes, but is not limited to, text messages, image messages, audio messages, and video messages, and that the query messages and response messages may be of the same or different types. Messages of different types may also be converted into one another.
Note that "user" herein should be understood broadly: not only a person, but any functional entity capable of issuing a query message, for example an application that generates a query message based on an analysis of a person's behavior.
In the above process, the user may send a message using dialog processing logic built into the terminal 101; for example, an intelligent assistant is built into the terminal 101, and the terminal sends the message directly through the intelligent assistant. The user may also send the message through an application client on the terminal. The application client may be any client that supports man-machine dialog, such as an instant messaging client, a game client, a shopping client, or a ride-hailing client configured with an intelligent assistant; it may also be a chat robot, a dialog robot, or the like. Although the following embodiments mostly take message interaction in a game scenario as an example, the embodiments of the present invention are not limited to game scenarios and may be applied to any suitable scenario, such as shopping or ride hailing.
The server 102 may be any computer device capable of providing a machine response service. When there are multiple servers 102, they may respond to different users' messages collaboratively on a cloud basis, or each may be independently responsible for responding to different users' messages. Users may be divided among the servers according to the IP addresses of the terminals on which they log in, according to the type of client that sends the messages, or by real-time centralized scheduling. When the server 102 receives a message from any of the at least one terminal 101, it can send a response message to that terminal.
When the server 102 receives a message from any terminal 101, it matches the message in its database and sends the matched response message to the terminal 101. However, there may be no response message that serves the user: for example, the matching in the database fails, or the user is dissatisfied with the received response message and either gives a dissatisfied evaluation or keeps sending similar messages. In such cases, the messages need to be submitted to a back-office operator, who manually edits response messages for them, and the database is updated with the editing result.
In a man-machine conversation, many query messages cannot currently be answered by the robot; the robot can answer them only after their response messages have been obtained through subsequent continuous expansion of the database. However, the user usually has to send the query message again to obtain the updated response message, which makes for a poor user experience.
Fig. 2a illustrates a flow chart of a message interaction method according to an embodiment of the present invention. The method is performed by one or more servers. The allocation and addressing of servers are not elaborated below, but this does not exclude the presence of such steps, which a skilled person may implement using centralized scheduling, distributed management, or the like.
First, in step 201, a query message of a user is received. As mentioned before, the query message may be sent by the user via any possible terminal and/or any possible client, etc.
In step 202, matching is performed in a database. The database has one or more first question sets generated by a first clustering mode and respectively corresponding first response message sets; each first question set has one or more similar questions, and each corresponding first response message set has one or more response messages for the one or more similar questions of that first question set. In one example, the database has L first question sets and L first response message sets. Each first question set contains M questions, namely one standard question and possibly one or more similar questions, and each first response message set contains N response messages, namely one standard response message and possibly one or more similar response messages. Here L, M, and N are natural numbers; L is generally relatively large, such as thousands, tens of thousands, or more, and M and N may or may not be equal. The first clustering mode uses a machine learning-based model and performs clustering based on any one or more of the following: similarity distance, edit distance, semantic distance, and so on.
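One possible in-memory layout for the paired question sets and response message sets described above is sketched below. The class and field names are hypothetical and chosen for illustration; the disclosure does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionSet:
    """One first question set: a standard question plus similar questions (M in total)."""
    standard_question: str
    similar_questions: list = field(default_factory=list)

    def all_questions(self):
        return [self.standard_question] + self.similar_questions


@dataclass
class ResponseSet:
    """One first response message set: a standard response plus similar responses (N in total)."""
    standard_response: str
    similar_responses: list = field(default_factory=list)


@dataclass
class KnowledgeBase:
    """L question sets and L response sets, paired by index."""
    question_sets: list = field(default_factory=list)
    response_sets: list = field(default_factory=list)

    def add(self, question_set, response_set):
        """Append a paired question set and response set; return their shared index."""
        self.question_sets.append(question_set)
        self.response_sets.append(response_set)
        return len(self.question_sets) - 1


kb = KnowledgeBase()
idx = kb.add(
    QuestionSet("how to recharge", ["where can I recharge"]),
    ResponseSet("Open the shop and choose a recharge option."),
)
```

Keeping the two lists index-aligned reflects the one-to-one correspondence between each question set and its response message set.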
There are generally two approaches to matching in a question-answering system. The first is similar-question matching: the similarity between the user's query message and the questions in the existing database is calculated, and the most accurate answer corresponding to the user's query message is returned. The second is question-response matching: the matching degree between the query message sent by the user and the response messages in the database is calculated, and the most accurate answer corresponding to the user's query message is returned.
The retrieval mode of the question-answering system depends on a database holding massive data, which is written and accumulated by operators. The data of the database can be seen in the interface shown in Fig. 3a, through which operators can edit the questions and answers in the database. As mentioned above, the questions in the database are stored by question set. The "standard question phrasing" shown in the interface refers to the "standard question" mentioned above. Under it, the operator can click "+" in the left column and edit a plurality of answers in the middle column. An example is the answer given in the figure: "The 'spring warm flower' aircraft is the first aircraft in Peace Elite. It is a decoration for the dive stage of a special soldier's airborne jump from the plane, similar to tail smoke but more beautiful and novel." The right column of the interface lists two tabs, basic attributes and user portrait, and shows various basic attributes. Different answers can be defined for different basic attributes and user portraits, which refines the answers. The basic attributes belong to the response message and are basic information related to how the response message takes effect, such as: the channel on which the response message takes effect (for example an applet, PC, APP, inline (H5), meaning inside the game client, please APP, a service platform provided for game players, official account, SDK, meaning a man-machine conversation plug-in, QQ group chat, and so on); the users to whom the response message applies (for example newly promoted, not newly promoted, VIP1, VIP2, VIP3, or non-VIP users); the time at which the response message takes effect (for example immediately, or when triggered by a condition); the recurring time (for example taking effect in the morning); the enabled status; and so on.
The user portrait is the result of an aggregate analysis of the user's behavior attributes and the user's basic attributes. In a game scenario, a user portrait includes, for example, the modes the user commonly plays (such as first-person single, duo, and four-player modes, or third-person single, duo, and four-player modes), the maps the user commonly plays, and the entertainment modes the user commonly plays (such as battle royale mode, desert battle royale mode, and training ground mode). Note that the basic attributes referred to herein are basic attributes of the response message, which are consistent with the basic attributes of the query message at which it is aimed; for example, a query message received from a particular entry point and the response message for that query message agree on the entry-point attribute. Basic attributes of the query message include, for example, the channel through which the query message is delivered (such as an applet, PC, APP, inline (H5), meaning inside the game client, happy APP, a service platform provided for game players, official account, SDK, meaning a man-machine conversation plug-in, QQ group chat, and so on) and the user category from which the query message originates (such as newly promoted, not newly promoted, VIP1, VIP2, VIP3, or non-VIP users). These basic attributes should not be confused with the basic attributes of the user that go into the user portrait, such as the user's age or the game player's character. Sometimes, however, a basic attribute of the response message corresponds to a basic attribute of the user; for example, the response message applies to newly promoted users, and being a newly promoted user is also a basic attribute of the user.
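The refinement of answers by basic attributes and user portrait can be illustrated by a simple rule: among the candidate answers whose conditions all hold for the query, choose the most specific one. This selection rule, the attribute keys, and the function name `select_response` are assumptions made for the sketch, not details given in the disclosure.

```python
def select_response(candidates, query_attrs):
    """Pick the answer whose conditions all match the query's attributes.

    candidates: list of (conditions, answer) pairs, where conditions is a dict
    of attribute name -> required value. An empty dict acts as the default answer.
    Returns None when no candidate matches.
    """
    best_answer, best_specificity = None, -1
    for conditions, answer in candidates:
        matches = all(query_attrs.get(key) == value for key, value in conditions.items())
        if matches and len(conditions) > best_specificity:
            best_answer, best_specificity = answer, len(conditions)
    return best_answer


# Hypothetical attribute names and answers for illustration.
candidates = [
    ({}, "general answer"),
    ({"channel": "app"}, "answer for APP users"),
    ({"channel": "app", "tier": "VIP1"}, "answer for VIP1 users on the APP"),
]
```

A query arriving with `{"channel": "app", "tier": "VIP1"}` would receive the VIP1 answer, while a query from an unlisted channel falls back to the general answer.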
The following description will be made in further detail by taking a similar problem matching method as an example. Of course, the invention is not so limited and, similarly, the manner of question-answer matching may be used.
Firstly, an index needs to be built in advance for the massive number of first question sets in the database generated by the first clustering mode, where each first question set includes one or more similar questions. In one example, the Lucene engine, an open source full-text search engine toolkit from Apache, is used to build a word-level inverted index over the first question sets. An inverted index puts the term in front of the "file" (here, a question in a first question set); that is, it maps each term to the questions that contain it. For example, the entry "weapon 1[2], 2[1]; 2, 5, 2" indicates that "weapon" appears twice in question 1, at positions 2 and 5, and once in question 2, at position 2. Such an index can greatly improve retrieval efficiency.
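To make the posting-list example concrete, the following sketch builds a word-level positional inverted index in plain Python. It does not use or imitate the Lucene API; the function names and the sample questions are assumptions, with the sample questions chosen so that "weapon" lands at the positions given in the example above.

```python
from collections import defaultdict


def build_inverted_index(questions):
    """Map each term to {question_id: [1-based word positions]}.

    questions: dict of question_id -> question text.
    """
    index = defaultdict(dict)
    for question_id, text in questions.items():
        for position, term in enumerate(text.lower().split(), start=1):
            index[term].setdefault(question_id, []).append(position)
    return index


def candidate_questions(index, query):
    """Return the ids of all questions sharing at least one term with the query."""
    ids = set()
    for term in query.lower().split():
        ids.update(index.get(term, {}))
    return ids


questions = {
    1: "melee weapon or ranged weapon",
    2: "which weapon is best",
    3: "how to recharge",
}
index = build_inverted_index(questions)
# index["weapon"] is {1: [2, 5], 2: [2]}: twice in question 1, once in question 2.
```

Looking up the terms of a query in such an index yields candidate questions without scanning every question in the database.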
A message input by the user has to be matched against the massive data in the database, so matching is carried out in batches. When a message is received from the user, first question sets in the database are first recalled (a step also called coarse ranking).
For the recalled first question sets, the matching degree between the user's message and the questions in each first question set is calculated. The matching degree may be obtained using a network structure with learned parameters, such as an MLP (Multi-Layer Perceptron) network or a CNTN (Convolutional Neural Tensor Network).
Then, further ranking is performed based on the matching degree. The ranking may use a network structure with learned parameters whose input includes the matching scores obtained above. Different types of loss functions are constructed according to actual requirements, such as pointwise loss, pairwise loss, and listwise loss. The output ranking scores serve as the final ranking basis, and a preset number of first question sets are taken in descending order of score.
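The three families of ranking loss named above can be illustrated with one common instance of each. The specific choices here (binary cross-entropy for pointwise, a margin hinge for pairwise, and a ListNet-style softmax cross-entropy for listwise) are assumptions for the sketch; the disclosure does not state which formulations are used.

```python
import math


def pointwise_loss(score, label):
    """Binary cross-entropy for one (query, question) pair; label is 0 or 1.
    Intended for moderate scores; extreme scores would need numerical clamping."""
    p = 1.0 / (1.0 + math.exp(-score))
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def pairwise_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss: the relevant item should outscore the irrelevant one by a margin."""
    return max(0.0, margin - (pos_score - neg_score))


def _softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, relevances):
    """Cross-entropy between softmax(relevances) and softmax(scores) over a whole list."""
    target = _softmax(relevances)
    predicted = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

Pointwise loss judges each candidate in isolation, pairwise loss compares two candidates, and listwise loss evaluates the ordering of the full candidate list, which is why the choice depends on the actual requirements.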
After the above-mentioned matching degree calculation and sorting are performed on all the first question sets, one or more question messages that best match the message input by the user are obtained, and thus corresponding reply messages in the database are obtained.
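The recall, matching, and ranking stages described above can be put together in a small sketch. Token-overlap (Jaccard) similarity stands in for the learned matching and ranking networks, and the function names, the `top_k` value, and the 0.3 threshold are all assumptions for illustration.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(text.lower().replace("?", " ").split())


def match_score(query, question):
    """Jaccard overlap of tokens; a stand-in for a learned matching degree."""
    a, b = tokenize(query), tokenize(question)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query, knowledge_base, top_k=3, threshold=0.3):
    """knowledge_base: list of (questions, responses) pairs, one per question set.

    Returns a response from the best-matching set, or None when matching fails.
    """
    query_tokens = tokenize(query)
    # Recall: keep only question sets sharing at least one token with the query.
    recalled = [
        (questions, responses)
        for questions, responses in knowledge_base
        if any(query_tokens & tokenize(q) for q in questions)
    ]
    # Match: score each recalled set by its best-matching question.
    scored = [
        (max(match_score(query, q) for q in questions), responses)
        for questions, responses in recalled
    ]
    # Rank: highest score first, keep a preset number of sets.
    scored.sort(key=lambda item: item[0], reverse=True)
    top = scored[:top_k]
    if not top or top[0][0] < threshold:
        return None
    return top[0][1][0]


knowledge_base = [
    (["how to play with the spring aircraft", "spring aircraft gameplay"], ["Answer A"]),
    (["how to recharge"], ["Answer B"]),
]
```

With this data, `best_match("how to use the spring aircraft", knowledge_base)` returns "Answer A", and a query with no shared tokens returns `None`, which corresponds to a matching failure.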
For example, the user's input "how to return to work with the spring warmer" can be matched to the question "how to play with the spring warmer" in the database, and the response message for the question "how to play with the spring warmer" is given.
Optionally, in step 203, a reply is made to the user based on the matching result. In one example, no threshold is set for the matching, so a first question set can always be matched for the message input by the user, and any response message in the first response message set corresponding to that first question set is returned to the user. If the user is not satisfied, the user may input a similar question again or give feedback expressing dissatisfaction with the reply. In another example, a threshold is set for the matching, so the matching may fail, in which case a statement indicating "not known" may be returned to the user. Even if the matching succeeds, the user may still be dissatisfied, and a similar question input again by the user, or feedback expressing dissatisfaction with the reply, may be received. In the above examples, whether the user is dissatisfied or the matching fails, it may be determined that no response message to the user's query message exists in the database. It is also possible not to reply to the user when the matching fails.
At step 204, it is determined whether a response message to the user's query message exists in the database. In one example, it may be inferred that no response message to the user's query message exists in the database when any of the following occurs: the matching in step 202 fails; similar messages input by the user continue to be received after the matching; or a dissatisfied evaluation given by the user for the matched response message is received. The embodiments of the present invention are not limited thereto. For example, when the matching succeeds but the matching degree is low, it may also be determined that no response message to the user's query message exists in the database, even though the user has neither given a dissatisfied evaluation nor continued to input similar messages.
Fig. 3b shows a man-machine dialog interface of a game application client, with the robot on the left and the user on the right. In response to the robot's replies, the user asks the same question repeatedly in varying forms: "close arms of the training arena", "training arena close arms", "with close arms in the training arena", and "I ask you for your training ground weapon of close combat". Alternatively or additionally, the user clicks among the candidates that the robot gives below its reply ("can receive friend invitation at training ground", "Stab introduction", "Stab killer") to look for an answer. From this it can be determined that no response message to the user's query message exists in the database. Conversely, if the matching succeeds, or if the user, after receiving a reply, does not input a similar question or gives a satisfied evaluation, it can be inferred that a response message to the user's query message exists in the database.
Note that, in this document, a response message to a query message (also expressed as "a response message to a query message of a user", "a response message to the query message", "a response to a query of a user", "a response to a question", "a response message to a question", or the like) refers to a response message that achieves the purpose of the query or solves the queried question. It appears, for example, as a response message obtained when the matching succeeds or the matching degree is high, or as a response message with which the user is satisfied.
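The signals that step 204 uses to infer a missing response message can be combined as in the sketch below. The `MatchOutcome` record, its field names and the optional `min_degree` parameter are hypothetical and serve only to illustrate the inference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchOutcome:
    matched: bool                  # did the matching in step 202 succeed?
    degree: float = 1.0            # matching degree of the matched set
    similar_followups: int = 0     # similar questions re-entered after the reply
    negative_rating: bool = False  # dissatisfaction evaluation received


def answer_missing(outcome: MatchOutcome, min_degree: Optional[float] = None) -> bool:
    """Infer that the database holds no response message for the query."""
    if not outcome.matched:
        return True
    if outcome.similar_followups > 0 or outcome.negative_rating:
        return True
    # Optional variant: a successful but weak match also counts as missing.
    return min_degree is not None and outcome.degree < min_degree
```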
When it is determined in step 204 that no response message to the query message exists in the database, a response message to the query message is acquired in step 205. In one example, the acquiring includes obtaining the response message through manual input by an operator; for a more detailed description of this example, reference may be made to the description of fig. 2b. Of course, the embodiments of the present invention are not limited thereto, and the response message may also, for example, be identified in other databases by a trained machine-learning-based model. In step 206, the database and the first clustering mode are updated using the query message input by the user and the response message to the query message. Updating the first clustering mode means adjusting it, using the query message input by the user and the response message to that query message, so that the response message can subsequently be obtained from the query message input by the user. In one example, the first clustering mode employs a machine-learning-based model, and the updating includes further training the model using the query messages input by users and the response messages to those query messages. In step 207, the response message to the query message input by the user is pushed to the user, thereby completing the closed-loop push. The closed-loop push may be designed to be initiated proactively and addressed personally and precisely to the user, to state the time of the query and the specific query message, and optionally to use one wording chosen at random from several. For example, in the uppermost interface in fig. 3f, the robot customer service "Little Tangerine" pushes a message along the lines of "Dear XX, after some study, Little Tangerine now knows that the answer to the XX you asked about at XX time is XX; Little Tangerine will keep studying hard", and thereby gives the response message to the query message.
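Steps 205 to 207 form a small closed loop that can be sketched as below. The sketch assumes a plain dictionary as the database and passes the acquisition, retraining and push operations in as callables; all of these names are hypothetical and stand in for components that the description leaves open.

```python
from typing import Callable, Dict


def close_loop(query: str, user_id: str, database: Dict[str, str],
               acquire_answer: Callable[[str], str],
               retrain: Callable[[str, str], None],
               push: Callable[[str, str], None]) -> str:
    """Acquire a missing response, learn from it, and push it to the asker."""
    answer = acquire_answer(query)   # step 205: e.g. manual input by an operator
    database[query] = answer         # step 206: update the database
    retrain(query, answer)           # step 206: update the first clustering mode
    push(user_id, answer)            # step 207: closed-loop push to the user
    return answer
```

Keeping the three operations as parameters reflects that the description allows several ways of acquiring the response (operator input or a trained model) without changing the loop itself.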
By the embodiment of the invention, the database can be expanded in a targeted manner by utilizing the query message input by the user, and the usability of the database is improved. By responding positively to the user who sent the message, and not having to wait until the user presents similar problems again, the user experience is greatly improved.
Fig. 2b illustrates a further flow diagram of a message interaction method according to an embodiment of the present invention, which further illustrates one example of step 205. Over a period of time, a large number of query messages are received from one or more different users, and for many of them the matching fails, the matching degree is low, or the user is dissatisfied; that is, for many query messages no response message exists in the database. Within a day, for example, a large number of such query messages may accumulate. For example, fig. 3c shows the accumulated query messages for which no response message exists in the database, as presented to the operator. For each query message, the interface indicates the query time, the original text (since the query message may be a text, audio, image, or video message, the original text here is the text recognized from the query message), the source (e.g., from APP or inline), and the available actions (operations the operator can perform, such as adding the message as a standard question to form a new set, associating it as a similar question with an existing standard question, or deleting it). However, such manual operation is inefficient when a large number of query messages are waiting to be answered. Therefore, in step 2051, the query messages for which no response message exists in the database are clustered by a second clustering method to form a plurality of second question sets, each having one or more similar questions, so that they can be presented separately as different second question sets. In one example, the second clustering method is the same as the first clustering method: it also employs a machine-learning-based model and clusters based on any one or more of similarity distance, edit distance, semantic distance, and the like.
Of course, the present invention is not limited thereto, but may include an example in which the second clustering manner is different from the first clustering manner.
A set with one or more similar problems is quickly formed through clustering, and is presented to operators, so that the operators can give response messages in batches, and the answering efficiency of the problems to be answered is greatly improved.
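The grouping of unanswered query messages in step 2051 can be illustrated with a very simple stand-in for the second clustering method. The sketch below is not the machine-learning-based model of the disclosure: it is a greedy single-pass clustering that uses `difflib.SequenceMatcher` as the similarity measure, and the function name and threshold value are assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def cluster_queries(queries: List[str], threshold: float = 0.6) -> List[List[str]]:
    """Greedy clustering: a query joins the first set whose representative
    (its first member) is similar enough, otherwise it starts a new set."""
    clusters: List[List[str]] = []
    for query in queries:
        for cluster in clusters:
            if SequenceMatcher(None, query, cluster[0]).ratio() >= threshold:
                cluster.append(query)
                break
        else:
            # No existing set was close enough.
            clusters.append([query])
    return clusters
```

Each returned list plays the role of one second question set that can be shown to the operator as a unit.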
Optionally, at step 2052, the query message may be refined using the basic attributes of the query message and/or the user portrait. For example, the basic attributes of a query message may include the entry from which it was sent, so that "why hasn't what I just bought arrived?" can be refined into "why hasn't what I just bought in the mall arrived?" (items that a user buys through different entries arrive at different times). As another example, the user portrait may include the user's commonly used weapon, so that "how many times have I won with my most commonly used gun?" can be refined into "how many times have I won with my most commonly used M416?". The conditions used for refinement are not limited to one kind of data. In one example, the refinement step 2052 occurs after the clustering step 2051 and further divides each large clustered set into subsets; however, the invention is not so limited and also includes schemes in which the refinement step 2052 occurs before the clustering step 2051.
Refining the query message using its basic attributes and/or the user portrait allows the operator to input a response message tailored to the specific user, so that the response message meets the user's needs more accurately and the user experience is improved.
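When refinement follows clustering, each clustered set is divided further by an attribute of the query or of the user. The sketch below illustrates that subdivision; the `PendingQuery` record and its fields `entry` and `favorite_weapon` are hypothetical examples of a basic attribute and a user-portrait item.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List


@dataclass(frozen=True)
class PendingQuery:
    text: str
    entry: str            # basic attribute, e.g. "mall" (assumed field)
    favorite_weapon: str  # user-portrait item, e.g. "M416" (assumed field)


def partition(cluster: List[PendingQuery],
              key: Callable[[PendingQuery], Hashable]) -> Dict[Hashable, List[PendingQuery]]:
    """Split one clustered set into subsets that share the same attribute value."""
    subsets: Dict[Hashable, List[PendingQuery]] = defaultdict(list)
    for query in cluster:
        subsets[key(query)].append(query)
    return dict(subsets)
```

For instance, `partition(cluster, key=lambda q: q.entry)` groups purchase questions by the entry they came from, so that the operator can answer the "mall" subset separately from the others.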
At step 2053, keywords of the messages in each second question set are extracted. The keywords are sorted from high to low by their occurrence frequency in the second question set, and a predetermined number of top-ranked keywords, i.e., those with the highest occurrence frequency, are determined. In one example, the original text messages are used as the messages in the second question set from which keywords are extracted. In another example, the refined query messages are used instead, which helps break the query questions down more finely and present the refined query messages to the operator hierarchically.
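The frequency ranking of step 2053 can be sketched with `collections.Counter`. The whitespace tokenization and the small stop-word list are simplifying assumptions; text in a language without word spacing would need a proper word segmenter instead.

```python
from collections import Counter
from typing import Iterable, List

# Illustrative stop words only; a real list would be language specific.
STOP_WORDS = {"the", "a", "in", "of", "is", "are", "there", "any"}


def top_keywords(messages: Iterable[str], n: int = 3) -> List[str]:
    """Return the n most frequent keywords across one question set."""
    counts = Counter(
        word
        for message in messages
        for word in message.lower().split()
        if word not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(n)]
```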
At step 2054, for each second question set generated by the second clustering method, the predetermined number of top-ranked keywords are presented together with their corresponding messages. Preferably, the messages are refined messages. As shown in fig. 3d, six second question sets are displayed in six boxes. Taking the question set in the first row and first column as an example, the three most frequent keywords extracted from the set are "training ground", "weapon" and "close combat", and the query messages containing these three keywords are presented. In addition, the user's message "which weapon suits me best" can, in combination with the user portrait "Aries", be refined into "which weapon suits an Aries best".
At step 2055, operator input is received. Because the query messages are clustered and presented separately according to the sets generated by clustering, the operator can operate on similar queries in batches. As shown in fig. 3e, one or more queries to be operated on are selected. The existing questions in the database are first searched using the keywords, yielding several questions that contain the keywords. If the operator considers that one of them can be associated with the selected queries (i.e., it is a similar question), the selected queries are associated with that question, and the response message for that question in the database is used as the response message for the one or more selected queries. If no question can be associated after the search, or if no search is performed, the response message for the one or more selected queries can be input manually.
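The two operator paths of step 2055 (associating the selected queries with an existing question, or entering a response manually) can be sketched as follows. The dictionary-based database and the function names `search_existing` and `answer_batch` are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional


def search_existing(database: Dict[str, str], keywords: Iterable[str]) -> List[str]:
    """Find existing questions that contain at least one of the keywords."""
    lowered = [k.lower() for k in keywords]
    return [q for q in database if any(k in q.lower() for k in lowered)]


def answer_batch(database: Dict[str, str], pending: List[str],
                 associate_with: Optional[str] = None,
                 manual_answer: Optional[str] = None) -> str:
    """Give every selected pending query the same response message."""
    if associate_with is not None:
        answer = database[associate_with]   # reuse the existing question's answer
    elif manual_answer is not None:
        answer = manual_answer              # operator typed the answer
    else:
        raise ValueError("an associated question or a manual answer is required")
    for query in pending:
        database[query] = answer
    return answer
```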
Fig. 2c illustrates yet another flowchart of a message interaction method according to an embodiment of the present invention, which adds steps 208-209 to steps 201-207 of fig. 2a. For steps 201-207, reference is made to the description of fig. 2a above, which is not repeated here. The query messages are filtered. In one example, the filtering removes any one or more of: a query message containing only numbers; a query message containing only symbols; a query message whose number of words exceeds a threshold number; a query message that repeats a previous message; a query message outside a time range; a query message in which identical characters exceed a threshold proportion; and a query message in which symbols, letters or numbers exceed a threshold proportion. In another example, the filtering removes any one or more of: a query message containing only numbers; a query message containing only symbols; a query message whose number of words exceeds a threshold number; a query message that repeats a previous message; a query message outside a time range; and a query message in which identical characters exceed a threshold proportion. Optionally, the filtering also removes the repeated characters from a query message in which identical characters exceed a threshold proportion, and/or the symbols, letters or numbers from a query message in which symbols, letters or numbers exceed a threshold proportion (in both cases the query message itself is retained).
A query message consisting purely of numbers or purely of symbols, or one in which identical characters, or symbols or numbers, exceed a threshold proportion, is usually produced by a user hitting the keyboard at random (or is otherwise random noise without any logic). It has no meaning by itself, nor does it gain any meaning after clustering. Fig. 3g shows the result of clustering query messages that are pure numbers, pure symbols, or more than 50% symbols or letters, and fig. 3h shows the result of clustering query messages in which more than 50% of the characters are identical. Such query messages do not need to be replied to.
It is noted that filtering out, for example, query messages consisting purely of letters or containing more than 50% letters is a compromise: it is intended to remove randomly typed letters, but it may err, for example by also removing query messages written purely in English. Whether to enable this item may be decided according to the language environment supported by the terminal that issues the query message.
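The noise rules discussed above can be expressed as a single predicate, as in the sketch below. The 50% default follows the figures mentioned in the text, while the function name, the treatment of whitespace, and the choice to count ASCII letters together with digits and symbols are assumptions. As the text notes, the last rule would also reject purely English queries, so it suits only some language environments.

```python
from collections import Counter


def is_noise(text: str, max_ratio: float = 0.5) -> bool:
    """Return True for query messages that look like random keyboard input."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return True                                  # nothing left to answer
    if all(c.isdigit() for c in chars):
        return True                                  # only numbers
    if all(not c.isalnum() for c in chars):
        return True                                  # only symbols
    most_common_count = Counter(chars).most_common(1)[0][1]
    if most_common_count / len(chars) > max_ratio:
        return True                                  # one character dominates
    # Symbols, digits and ASCII letters together; other scripts are not counted.
    junk = sum(1 for c in chars if not c.isalpha() or c.isascii())
    return junk / len(chars) > max_ratio
```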
Repeated messages need to be processed only once, but the server can record all users who sent them so that the response message can be pushed to each of these users once it is obtained. In one example, a similarity based on the edit distance between query messages is used to determine duplicate query messages. The edit distance, also known as the Levenshtein distance, is the minimum number of editing operations required to convert one string into another. The algorithm was first proposed by the Russian scientist Levenshtein, hence the name. The similarity is equal to the reciprocal of "edit distance + 1".
Let the character strings be a[0..n] and b[0..m]. (1) When a[i] = b[j], no editing operation is required at this position and the edit distance is unchanged, i.e., f(i, j) = f(i-1, j-1). (2) When a[i] != b[j], one of three editing operations can be applied. The delete and insert operations affect only one of the indices i and j. As in table 1 below, at the current match position (t1, t2), if the deletion of 'g' is adopted, only the index of t1 changes.
[Table 1: image]
The replacement operation affects both indices. As in table 2 below, at the current match position (t1, t2), if 'g' is replaced by 'm', the next comparison is performed at (t1+1, t2+1).
[Table 2: image]
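The following recursion formula can be derived, where f(i, j) denotes the edit distance between the prefix of a ending at a[i] and the prefix of b ending at b[j]:

$$
f(i, j) =
\begin{cases}
f(i-1, j-1), & a[i] = b[j] \\
\min\bigl(f(i-1, j),\ f(i, j-1),\ f(i-1, j-1)\bigr) + 1, & a[i] \neq b[j]
\end{cases}
$$

If one of the two prefixes is empty, the edit distance equals the length of the other prefix.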
The edit distance between strings a and b can be obtained recursively or by dynamic programming, and the similarity is calculated from it. A similarity of 1 indicates that the two strings are identical, and the closer the similarity is to 1, the more similar the two strings are. The similarity edit distances between the respective query messages are calculated, and when at least one of the calculated similarity edit distances is lower than a predetermined threshold similarity edit distance, one of the corresponding query messages is removed.
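The dynamic-programming computation of the edit distance, the reciprocal similarity, and the removal of repeated messages can be sketched as below. The deduplication rule is an assumption: the sketch treats two messages as repeats when their edit distance is at or below a hypothetical `max_distance`, which is one possible reading of the threshold comparison described above.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))           # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1])      # characters agree: no operation
            else:
                # delete, insert, or replace, whichever is cheapest
                cur.append(1 + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Reciprocal of (edit distance + 1); equals 1.0 for identical strings."""
    return 1 / (edit_distance(a, b) + 1)


def drop_repeats(messages: List[str], max_distance: int = 1) -> List[str]:
    """Keep a message only if it is not a near-copy of one already kept."""
    kept: List[str] = []
    for message in messages:
        if all(edit_distance(message, k) > max_distance for k in kept):
            kept.append(message)
    return kept
```

Only two rows of the table are held in memory at a time, so the memory use grows with the length of the shorter dimension rather than with the product of both lengths.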
Query messages whose length exceeds a threshold number, e.g., query messages of more than 20 characters, usually have an ambiguous intent or include multiple intents and may be discarded. Query messages outside the time range may also be discarded because they are no longer timely.
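The length and timeliness rules can be sketched as a small predicate. The 20-character limit follows the example in the text; the one-day age limit and the function name are assumptions.

```python
from datetime import datetime, timedelta


def should_discard(text: str, sent_at: datetime, now: datetime,
                   max_chars: int = 20,
                   max_age: timedelta = timedelta(days=1)) -> bool:
    """Discard query messages that are too long or no longer timely."""
    too_long = len(text) > max_chars      # likely ambiguous or multi-intent
    too_old = now - sent_at > max_age     # outside the accepted time range
    return too_long or too_old
```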
After the query message is filtered, for the subsequent steps 202 and 208 reference is made to the description above in connection with fig. 2a; it is noted that the subsequent steps process the filtered query message. At step 207, the robot customer service also provides two rating buttons, such as "thank you, i's answer" and "no match you would like", prompting the user to give a rating so that statistics on adoption and satisfaction can be collected and the user's potential needs further mined. At step 208, the user's rating is received. For example, as shown in fig. 3f, when the user selects the positive evaluation "thank you, which is an answer to me", the robot customer service may send the user a message expressing thanks, for example "do you can you up you, little orange may try you up you get you up you again"; when the user selects the negative evaluation "no you want you up you again", the robot customer service may send the user a message expressing apology, for example "do nothing, little pleasure you may continue trying you up again". Other topics may then be opened based on system prompts and user selections.
Fig. 4 illustrates a block diagram of a message interaction device according to an embodiment of the present invention. The message interaction apparatus 400 includes a receiving module 401, a matching module 402, and a patching module 403. Wherein the receiving module is configured to receive a query message of a user, i.e. to perform step 201 in fig. 2a or 2c, and the matching module 402 is configured to match in the database for the query message, i.e. to perform step 202 in fig. 2a or 2 c. The database is provided with one or more first question sets and corresponding first response message sets, wherein the one or more first question sets and the corresponding first response message sets are generated through a first clustering mode, the first clustering mode adopts a model based on machine learning, each first question set is provided with one or more similar questions, and each corresponding first response message set is provided with one or more response messages aiming at the one or more similar questions of the first question set. The patching module is configured to, upon determining that no reply message to the query message exists in the database: obtaining a response message for the query message, updating the first clustering method and the database by using the query message and the response message for the query message, and pushing the response message for the query message to the user, i.e., executing steps 205, 206, and 207 in fig. 2a or 2 c. For more details of these modules, reference may be made to the description of steps 201, 202, 205, 206, and 207 in fig. 2a or 2c, which are not described herein again.
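The division of the apparatus into receiving, matching and patching modules can be sketched as three small classes wired together. This is an illustrative skeleton under simplifying assumptions: the database is a plain dictionary, matching is an exact lookup rather than the machine-learning-based model, retraining of the first clustering mode is omitted, and all class and method names are hypothetical.

```python
from typing import Callable, Dict, Optional


class ReceivingModule:
    """Corresponds to module 401: receives the user's query message."""
    def receive(self, raw: str) -> str:
        return raw.strip()


class MatchingModule:
    """Corresponds to module 402: matches the query message in the database."""
    def __init__(self, database: Dict[str, str]) -> None:
        self.database = database

    def match(self, query: str) -> Optional[str]:
        return self.database.get(query)   # exact lookup stands in for the model


class PatchingModule:
    """Corresponds to module 403: acquires, stores and pushes a missing answer."""
    def __init__(self, database: Dict[str, str],
                 acquire: Callable[[str], str],
                 push: Callable[[str, str], None]) -> None:
        self.database, self.acquire, self.push = database, acquire, push

    def patch(self, query: str, user_id: str) -> str:
        answer = self.acquire(query)
        self.database[query] = answer     # model retraining omitted in this sketch
        self.push(user_id, answer)
        return answer


class MessageInteractionApparatus:
    def __init__(self, database: Dict[str, str],
                 acquire: Callable[[str], str],
                 push: Callable[[str, str], None]) -> None:
        self.receiver = ReceivingModule()
        self.matcher = MatchingModule(database)
        self.patcher = PatchingModule(database, acquire, push)

    def handle(self, raw: str, user_id: str) -> str:
        query = self.receiver.receive(raw)
        answer = self.matcher.match(query)
        if answer is None:                # no response message in the database
            answer = self.patcher.patch(query, user_id)
        return answer
```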
Referring to fig. 5, in an embodiment of the present invention, a message interaction device 500 includes a processor 504, which includes a hardware master 510. Processor 504 includes, for example, one or more processors such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. The term "processor," as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for message interaction, or incorporated in combined hardware and/or software modules. Also, the techniques may be fully implemented in one or more circuits or logic elements. The methods in this disclosure may be implemented in various components, modules, or units, but need not be implemented by different hardware units. Rather, as noted above, the various components, modules or units may be combined or provided by a collection of interoperative hardware units (including one or more processors as noted above) in combination with appropriate software and/or firmware.
In one or more examples, the content described above in connection with fig. 1-4 may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on computer-readable medium 506 or transmitted over computer-readable medium 506 as one or more instructions or code and executed by a hardware-based processor. Computer-readable media 506 may include computer-readable storage media corresponding to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, such as according to a communication protocol. In this manner, the computer-readable medium 506 may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium such as a signal or carrier wave. The data storage medium can be any available medium that can be read by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this disclosure. The computer program product may include a computer-readable medium 506.
By way of example, and not limitation, such computer-readable storage media can comprise memory such as RAM, ROM, EEPROM, CD-ROM or other optical disk, magnetic disk memory or other magnetic storage, flash memory or any other memory 512 that can be used to store desired program code in the form of instructions or data structures and that can be read by a computer. Also, any connection is properly termed a computer-readable medium 506. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media 506.
The message interaction device 500 may also include an I/O interface for transferring data, as well as other functionality 514. The message interaction device 500 may be included in different terminals such as a computer 516, a mobile device 518, and other terminals 520, among others. Each of these configurations includes devices that may have generally different constructs and capabilities, and thus the message interaction device 500 may be configured according to one or more of the different device classes. The techniques of this disclosure may also be implemented, in whole or in part, on the "cloud" 522 through the use of a distributed system, such as through a platform 524 as described below.
Cloud 522 includes and/or is representative of a platform 524 for resources 526. The platform 524 abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud 522. The resources 526 may include applications and/or data that may be used when executing computer processes on servers remote from the computing device 502. The resources 526 may also include services provided over the internet and/or over a subscriber network such as a cellular or Wi-Fi network.
Platform 524 may abstract resources and functions to connect computing device 502 with other computing devices. The platform 524 may also be used to abstract the hierarchy of resources to provide a corresponding level of hierarchy encountered for the demand for resources 526 implemented via the platform 524. Thus, in interconnected device embodiments, implementation of the functionality described herein may be distributed throughout the device 500. For example, the functionality may be implemented in part on the computing device 502 and by the platform 524 that abstracts the functionality of the cloud 522.
According to the embodiment of the invention, the database can be expanded in a targeted manner by utilizing the query message which is concerned by the user and is missing, and the method is more effective compared with the method of manually adding the query message depending on the operation experience. By responding positively to the user who sent the message, and not having to wait until the user presents similar problems again, the user experience is greatly improved. By filtering the query message, the query efficiency can be improved. The method helps to obtain the response messages of the messages in batch by clustering the non-answer query messages. By refining the query message based on the basic attribute of the query message and/or the user portrait, the accurate query result can be obtained, and the user experience can be further improved.
It should be noted that the terms "first," "second," and the like in this disclosure are not intended to indicate any importance or order, but are merely used for distinguishing. Unless otherwise specified or constrained by a prerequisite (i.e., the execution of one step is premised on the execution result of another step), the order in which the method steps are described does not represent their order of execution, and the described method steps can be executed in any possible and reasonable order.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (15)

1. A message interaction method, comprising:
receiving a query message of a user;
matching the query message in a database, wherein the database has one or more first question sets and corresponding first response message sets, the one or more first question sets and the corresponding first response message sets are generated through a first clustering mode, the first clustering mode adopts a model based on machine learning, each first question set has one or more similar questions, and each corresponding first response message set has one or more response messages aiming at the one or more similar questions of the first question set; and
upon determining that no response message to the query message exists in the database:
acquiring a response message aiming at the query message;
updating the first clustering mode and the database by utilizing the query message and a response message aiming at the query message; and
pushing a response message aiming at the query message to the user.
2. The method of claim 1, wherein there are a plurality of query messages, the method further comprising: for all the query messages for which no corresponding response message exists in the database, clustering them in a second clustering mode to form one or more second question sets, wherein each second question set has one or more similar questions, so that the response messages of the query messages for which no corresponding response message exists in the database can be conveniently obtained in batches, and the second clustering mode adopts a machine learning-based model.
3. The method of claim 2, further comprising: refining, by utilizing the basic attribute of the query message and/or the user portrait, the query message for which no response message exists in the database, so as to obtain the response message for the refined query message.
4. The method of claim 3, further comprising: for each refined query message in the second question set, extracting the keywords thereof, determining a predetermined number of keywords with higher occurrence frequency, and presenting the keywords and the corresponding refined query message at the same time.
5. The method of claim 1, further comprising: after receiving a query message from a user and before said matching in a database for the query message, filtering the query message to remove any one or more of: a query message containing only numbers, a query message containing only symbols, a query message with a number of words exceeding a threshold number, a query message repeated with a previous message, a query message exceeding a time range, a query message with the same characters existing exceeding a threshold proportion, a query message with the symbols, letters or numbers existing exceeding a threshold proportion.
6. The method of claim 1, further comprising: after the receiving a query message of a user and before the matching in a database for the query message, filtering the query message to remove any one or more of the following: a query message containing only numbers, a query message containing only symbols, a query message with a number of words exceeding a threshold number, a query message that repeats a previous message, a query message that exceeds a time range, the same characters in a query message in which the same characters exceed a threshold proportion, and the symbols, letters or numbers in a query message in which symbols, letters or numbers exceed a threshold proportion.
7. The method of claim 5 or 6, wherein filtering the query message to remove duplicate query messages from previous messages comprises:
calculating a similarity edit distance of the previous message and the query message,
and when the similarity edit distance is lower than a preset threshold value of the similarity edit distance, removing the query message.
8. The method of claim 1, wherein the first clustering means clusters based on any one or more of: similarity distance, edit distance, semantic distance.
9. The method of claim 2, wherein the first clustering means and the second clustering means are the same and are clustered based on any one or more of: similarity distance, edit distance, semantic distance.
10. The method of claim 1, after the matching, further comprising:
replying to the user based on the matching result, either by providing the matched response message or by notifying the user that the matching failed.
11. The method of claim 10, wherein a determination is made that no response message to the query message exists in the database when either of the following occurs:
the matching fails; or a dissatisfaction evaluation given by the user for the matched response message is received.
12. The method of claim 1, further comprising, after the matching, receiving a similar query message of a user that is similar to the query message, and based on receiving the similar query message, determining that no reply message to the query message exists in the database.
13. A message interaction apparatus, comprising:
a receiving module configured to receive a query message of a user;
a matching module configured to match the query message in a database having one or more first question sets generated by a first clustering method using a machine learning-based model and respectively corresponding first response message sets, each first question set having one or more similar questions, each corresponding first response message set having one or more response messages to the one or more similar questions of the first question set; and
a patching module configured to, upon determining that no response message to the query message exists in the database:
acquiring a response message aiming at the query message;
updating the first clustering mode and the database by utilizing the query message and a response message aiming at the query message; and
pushing a response message aiming at the query message to the user.
14. A message interaction device, comprising:
a processor; and
a memory configured to have stored thereon computer-executable instructions that, when executed in the processor, cause performance of the method of any one of claims 1-12.
15. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to implement the method of any one of claims 1-12.
CN202010705545.8A 2020-07-21 2020-07-21 Message interaction method, device, equipment and computer readable storage medium Pending CN111813915A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010705545.8A CN111813915A (en) 2020-07-21 2020-07-21 Message interaction method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111813915A true CN111813915A (en) 2020-10-23

Family

ID=72861785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010705545.8A Pending CN111813915A (en) 2020-07-21 2020-07-21 Message interaction method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111813915A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595695A (en) * 2018-05-08 2018-09-28 和美(深圳)信息技术股份有限公司 Data processing method, device, computer equipment and storage medium
CN111309881A (en) * 2020-02-11 2020-06-19 深圳壹账通智能科技有限公司 Method and device for processing unknown questions in intelligent question answering, computer equipment and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113515700A (en) * 2021-07-01 2021-10-19 深圳追一科技有限公司 Information pushing method and device, electronic equipment and storage medium
CN113515700B (en) * 2021-07-01 2024-02-20 深圳追一科技有限公司 Information pushing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20210209109A1 (en) Method, apparatus, device, and storage medium for intention recommendation
US20200301954A1 (en) Reply information obtaining method and apparatus
US20220245200A1 (en) Constructing imaginary discourse trees to improve answering convergent questions
CN107943998B (en) Man-machine conversation control system and method based on knowledge graph
CN108710647B (en) Data processing method and device for chat robot
CN112800170A (en) Question matching method and device and question reply method and device
CN107992513B (en) Information processing system and method for realizing information processing
CN108452526B (en) Game fault reason query method and device, storage medium and electronic device
CN107623621B (en) Chat corpus collection method and device
CN112035599B (en) Query method and device based on vertical search, computer equipment and storage medium
EP3822814A2 (en) Human-machine interaction method and apparatus based on neural network
CN108304424B (en) Text keyword extraction method and text keyword extraction device
US20200167613A1 (en) Image analysis enhanced related item decision
CN107239450B (en) Method for processing natural language based on interactive context
CN112579733B (en) Rule matching method, rule matching device, storage medium and electronic equipment
CN104794145A (en) Connecting people based on content and relational distance
CN110019713A (en) Based on the data retrieval method and device, equipment and storage medium for being intended to understand
CN112163081A (en) Label determination method, device, medium and electronic equipment
CN108306813B (en) Session message processing method, server and client
CN103902599A (en) Fuzzy search method and fuzzy search device
CN112685550A (en) Intelligent question answering method, device, server and computer readable storage medium
CN112287082A (en) Data processing method, device, equipment and storage medium combining RPA and AI
US11200264B2 (en) Systems and methods for identifying dynamic types in voice queries
CN111813915A (en) Message interaction method, device, equipment and computer readable storage medium
CN110019714A (en) More intent query method, apparatus, equipment and storage medium based on historical results

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40030652
Country of ref document: HK
Country of ref document: HK