CN118013981A

CN118013981A - Method, server and system for answering questions based on text

Info

Publication number: CN118013981A
Application number: CN202410411719.8A
Authority: CN
Inventors: 穆瑞斌
Original assignee: Lazas Network Technology Shanghai Co Ltd; Zhejiang Koubei Network Technology Co Ltd
Current assignee: Lazas Network Technology Shanghai Co Ltd; Zhejiang Koubei Network Technology Co Ltd
Priority date: 2024-04-07
Filing date: 2024-04-07
Publication date: 2024-05-10
Anticipated expiration: 2044-04-07
Also published as: CN118013981B

Abstract

The application provides a method, a server and a system for answering questions based on texts. The method analyzes the target dialogue text to determine the question intention of the target user, wherein the target dialogue text comprises questions to be answered which are presented by the target user, so that the process can identify the personal requirements of the target user; determining whether there is a target functional component matching the question intent based on the question intent and the functional descriptions of the plurality of functional components; and further generating an answer result based on the target function component, the target dialogue text and the user characteristics of the target user when the target function component exists. Compared with the prior art, the method is characterized in that the target function component is determined based on the function descriptions and the questioning intents of the plurality of function components through the recall capability of the function component, and the user information of the target user is considered, so that the questions to be answered of the target user are answered. Therefore, the question answering process in the method is intelligent, and the personalized requirements and satisfaction of the target user can be met.

Description

Method, server and system for answering questions based on text

Technical Field

The present application relates to the field of electronic terminal technologies, and in particular to a method, a server, and a system for answering questions based on text.

Background

With the development of computer technology, there is a growing need for computers to take over some human activities. For example, a computer can be made to read text as a human would and then answer questions based on that text.

Currently, most of the questions are answered by comparing the questions to be answered with a plurality of preset questions, and determining the answer result of the question which is most matched with the questions to be answered as the answer result of the questions to be answered. Obviously, the process of answering the questions is not intelligent and cannot meet the personalized needs of the user.

Therefore, a method for answering questions based on text is needed to answer questions to be answered intelligently based on the personalized needs of users, so as to improve the satisfaction of users.

Disclosure of Invention

The application provides a method, a server and a system for answering questions based on text, which can intelligently answer questions to be answered based on the personalized requirements of target users, thereby meeting those requirements and improving the satisfaction of the target users.

In a first aspect, there is provided a method of answering a question based on text, the method comprising: responding to a target dialogue text, analyzing the target dialogue text, and determining the question intention of a target user, wherein the target dialogue text comprises questions to be answered which are presented by the target user; determining whether a target function component matching the question intention exists in the plurality of function components based on the function descriptions of the plurality of function components and the question intention; and generating an answer result of the to-be-answered question based on the target function component, the target dialogue text and the user characteristics of the target user in the presence of the target function component.

In the embodiment of the application, the target dialogue text is analyzed to determine the question intention of the target user, wherein the target dialogue text comprises the questions to be answered which are presented by the target user, so that the process can identify the personal requirements of the target user; determining whether a target function component matching the question intention exists in the plurality of function components based on the question intention of the target user and the function descriptions of the plurality of function components; and further generating an answer result of the to-be-answered question based on the target function component, the target dialogue text and the user characteristics of the target user when the target function component exists. That is, the method determines a target function component based on function descriptions of a plurality of function components and question intents of the target user, through recall capabilities of the function components, and considers user information (user characteristics) of the target user, thereby answering questions to be answered of the target user, with respect to a conventional question answering process. Therefore, the question answering process in the method is intelligent, and the personalized requirements and satisfaction of the target user can be met.

With reference to the first aspect, in some possible implementations, analyzing the target dialog text to determine a question intention of the target user includes: performing embedded coding on the target dialogue text to obtain an embedded vector of the target dialogue text; encoding the embedded vector through an attention mechanism to obtain text semantic features of the target dialogue text; and carrying out intention recognition on the target dialogue text based on the text semantic features to obtain the question intention of the target user.

In the embodiment of the application, the target dialogue text is embedded and encoded to obtain the embedded vector of the target dialogue text. In this way, a dialogue text can be represented by an embedded vector with a fixed length, and the positions of words in the target dialogue text can be obtained from the embedded vector, so that the dialogue text is convenient for computer processing. Next, the embedded vector is encoded by an attention mechanism, generating text semantic features of the target dialog text. This helps to focus on important information and reduce the impact of irrelevant information, enabling more accurate text semantic features. Finally, intention recognition is performed on the target dialogue text based on the text semantic features. This can accurately identify the user's question intent and provide a basis for subsequent question processing and answers.
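The three steps above (embedding, attention encoding, intention recognition) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the application: the toy embedding table `EMBED`, the intent prototypes `INTENTS`, the use of single-head self-attention with identity projections, and mean pooling to a fixed-length feature. A real system would use a trained model for each step.

```python
import math

# Hypothetical toy embedding table and intent prototypes (not from the application).
EMBED = {
    "recommend": [1.0, 0.0, 0.0],
    "milk": [0.8, 0.2, 0.0],
    "tea": [0.8, 0.2, 0.0],
    "translate": [0.0, 1.0, 0.0],
    "code": [0.0, 0.9, 0.1],
}
UNKNOWN = [0.0, 0.0, 0.1]
INTENTS = {"recommend_goods": [1.0, 0.0, 0.0], "translate_code": [0.0, 1.0, 0.0]}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed(text):
    """Step 1: map each whitespace-separated token to its embedded vector."""
    return [EMBED.get(token, UNKNOWN) for token in text.lower().split()]


def attention_encode(vectors):
    """Step 2: scaled dot-product self-attention, then mean pooling."""
    dim = len(vectors[0])
    attended = []
    for query in vectors:
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in vectors])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]


def recognize_intent(text):
    """Step 3: pick the intent prototype closest to the text semantic feature."""
    vectors = embed(text)
    if not vectors:
        return None  # empty input carries no intention
    feature = attention_encode(vectors)
    return max(INTENTS, key=lambda name: dot(feature, INTENTS[name]))
```

For example, `recognize_intent("recommend milk tea")` selects the `recommend_goods` prototype, because every token vector in that text is dominated by its first component.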

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, determining, based on the function descriptions of the plurality of functional components and the question intention, whether there is a target functional component matching the question intention among the plurality of functional components includes: comparing the question intent with the functional descriptions of the plurality of functional components; determining that a target functional component matched with the questioning intention exists in the plurality of functional components when any functional description in the functional descriptions of the plurality of functional components is matched with the questioning intention, wherein the target functional component is the functional component corresponding to the functional description; in the event that none of the functional descriptions of the plurality of functional components match the question intent, it is determined that there is no target functional component of the plurality of functional components that matches the question intent.

In an embodiment of the application, the function descriptions of the plurality of function components describe the use of each function component in question answering, and the question intention clearly indicates what answer the target user needs for the question to be answered. The method therefore compares the question intention with the function descriptions of the plurality of function components to determine which function description best matches the question intention. The comparison may find a function description whose function component can solve the question to be answered of the target user, or it may find none. When a target function description that best matches the question intention is determined from the function descriptions of the plurality of function components, the function component corresponding to that target function description is determined as the target function component. By comparing the descriptions one by one, the method can accurately determine whether a target function component exists among the plurality of function components and, when one exists, provides a sound basis for the subsequent question answering.
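The one-by-one comparison can be sketched as follows. The application does not state how a match is scored, so this sketch assumes a simple token-overlap (Jaccard) score and a hypothetical `threshold` below which no component is considered a match; the component names and descriptions are likewise invented for illustration.

```python
def tokenize(text):
    return set(text.lower().split())


def find_target_component(intent, components, threshold=0.3):
    """Compare the question intention with every functional description.

    `components` maps a component name to its functional description.
    Returns the best-matching component name, or None when no description
    reaches the (assumed) matching threshold.
    """
    intent_tokens = tokenize(intent)
    best_name, best_score = None, 0.0
    for name, description in components.items():
        description_tokens = tokenize(description)
        union = intent_tokens | description_tokens
        # Jaccard similarity: shared tokens divided by all distinct tokens.
        score = len(intent_tokens & description_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical component registry.
COMPONENTS = {
    "goods_recommender": "recommend goods to the user",
    "code_translator": "translate code between languages",
}
```

With this registry, the intention "recommend goods" resolves to `goods_recommender`, while an intention with no overlapping description, such as "book a flight", yields `None`, which corresponds to the case where no target function component exists.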

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the method for determining a user characteristic of the target user includes: determining attribute information of the target user and historical behavior information of the target user before generating the target dialogue text; the user characteristic is determined based on the user attribute information and the historical behavior information.

In an embodiment of the application, the user characteristics of a user are strongly related to the user's own attributes and the user's historical behavior. Therefore, the method can accurately determine the user characteristics of the target user based on the attribute information of the target user and the historical behavior information of the target user before the target dialogue text is generated.

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the attribute information includes dynamic attribute information and natural attribute information, where the dynamic attribute information is used to describe attribute information corresponding to a text input behavior of the target user, and determining the user feature based on the user attribute information and the historical behavior information includes: determining a first user characteristic of the target user in a first period of time based on the time and the position of the text input behavior generated by the target user in the dynamic attribute information and the historical behavior information of the first period of time in the historical behavior information; and determining a second user characteristic of the target user in a second period based on the natural attribute information of the target user and the historical behavior information of the second period in the historical behavior information, wherein the time difference between the earliest time in the second period and the current time is larger than the time difference between the earliest time in the first period and the current time.

In the embodiment of the application, the attribute information of the user comprises natural attribute information (static attribute information) and dynamic attribute information. The static attributes are the basis for outlining the user portrait (gender, age, education, role, income, region, marital status and the like); the dynamic attributes refer to the internet surfing behavior, entertainment preferences, social habits, travel mode and knowledge acquisition mode of the user. Since the target dialogue text includes text input by the target user, the text input behavior of the target user occurs before the target dialogue text is generated, and the time at which the text input behavior occurs is recent. The method can therefore accurately determine the short-term (recent) characteristics (first user characteristics) of the target user from the time and the location at which the text input behavior occurs, together with the short-term historical behavior information. Since the static attributes are relatively fixed and not easily changed, the method can accurately determine the long-term characteristics (second user characteristics) of the target user through the natural attribute information and the long-term historical behavior information of the target user.

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, determining, based on a time and a location where the text input behavior is generated by the target user in the dynamic attribute information and historical behavior information of a first period in the historical behavior information, a first user feature of the target user in the first period includes: determining whether the time is a working day and/or a time period corresponding to the time and/or weather conditions corresponding to the time, and determining a target city corresponding to the position; and determining the behavior characteristics, the time characteristics, the environment characteristics and the position characteristics of the target user in the first time period based on the historical behavior information of the first time period, whether the time is a working day and/or a time period corresponding to the time and/or weather conditions corresponding to the time and the target city corresponding to the position.

In an embodiment of the present application, the short-term characteristics of the target user are discussed further. Specifically, when the target user performs a text input action, the action has a corresponding generation time (time) and generation position (position). Based on the time, it can be determined whether the time is a workday and/or a time period corresponding to the time and/or a weather condition corresponding to the time, that is, whether the target user is on a workday, the time period in which the target user is located (the time feature of the target user in the first time period), and the weather condition in which the target user is located (the environmental feature of the target user in the first time period); based on the position, a target city corresponding to the position, that is, the target city in which the target user is located (the location feature of the target user in the first time period), can be determined; further, based on the historical behavior information of the first time period, the behavior characteristic of the target user within the first time period can be determined. That is, the method can acquire the user characteristics from the three sources described above: the time, the position and the historical behavior information.
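A minimal sketch of deriving these four short-term features is given below. It rests on several assumptions that the application does not specify: the hour boundaries in `time_period`, treating Monday to Friday as workdays (public holidays are ignored), and the hypothetical `city_lookup` and `weather_lookup` dictionaries standing in for whatever geocoding and weather services a real server would query.

```python
from datetime import datetime


def time_period(hour):
    """Map an hour of the day to a coarse period (boundaries are assumed)."""
    if 5 <= hour < 11:
        return "morning"
    if 11 <= hour < 14:
        return "noon"
    if 14 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def short_term_features(when, position, recent_behaviors, city_lookup, weather_lookup):
    """Build time, environment, location and behavior features.

    `city_lookup` maps a position to a city and `weather_lookup` maps
    (city, date) to a weather condition; both are hypothetical stand-ins
    for external services.
    """
    city = city_lookup.get(position, "unknown")
    return {
        # weekday() returns 0 for Monday through 6 for Sunday.
        "time": {"is_workday": when.weekday() < 5, "period": time_period(when.hour)},
        "environment": {"weather": weather_lookup.get((city, when.date()), "unknown")},
        "location": {"city": city},
        "behavior": {"recent_items": list(recent_behaviors[-3:])},
    }
```

The returned dictionary groups the features under the same four headings used in the paragraph above, so that a downstream step can read them independently.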

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, in a case where the target function component exists, generating an answer result of the to-be-answered question based on the target function component, the target dialog text and a user feature of the target user includes: taking the target dialogue text as a parameter in the target function component under the condition that the target function component exists, and determining a reference answer result of the to-be-answered question through the target function component and the target dialogue text; and adjusting the reference answer result based on the user characteristics and the reference answer result to obtain the answer result.

In the embodiment of the application, after the target dialogue text is used as the parameter in the target function component, the reference answer result of the to-be-answered question is obtained by calling the target function component with that parameter. That is, the method obtains the reference answer result of the question to be answered faster through the recall function of the target function component. After the reference answer result is obtained, the method adjusts the reference answer result through the user characteristics to obtain the answer result of the to-be-answered question. That is, the method also considers the user characteristics of the target user, so that the method not only obtains the answer result quickly and intelligently, but also meets the personalized requirements of the target user and improves the satisfaction of the target user.
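The two stages described above (recall a reference answer through the component, then adjust it with the user characteristics) can be sketched as follows. The component `milk_tea_component`, its catalog, the price parsing, and the `preferred_sugar` feature are all hypothetical; in particular, the adjustment here is a simple re-ranking used as a stand-in and is not the model-based adjustment the application describes elsewhere.

```python
import re


def milk_tea_component(dialogue_text):
    """Hypothetical function component: recall candidates from a toy catalog."""
    catalog = [
        {"name": "Brand A classic", "price": 15, "sugar": "full"},
        {"name": "Brand A light", "price": 18, "sugar": "low"},
        {"name": "Brand A deluxe", "price": 26, "sugar": "full"},
    ]
    # Toy parsing: treat "<number> yuan" in the dialogue text as a price ceiling.
    match = re.search(r"(\d+)\s*yuan", dialogue_text)
    limit = int(match.group(1)) if match else float("inf")
    return [item for item in catalog if item["price"] < limit]


def adjust(reference_answer, user_features):
    """Move items matching the user's preference to the front.

    sorted() is stable, so items with equal keys keep their recalled order.
    """
    preferred = user_features.get("preferred_sugar")
    return sorted(reference_answer, key=lambda item: item["sugar"] != preferred)


def answer(component, dialogue_text, user_features):
    reference_answer = component(dialogue_text)  # dialogue text is the parameter
    return adjust(reference_answer, user_features)
```

Keeping the recall step and the adjustment step as separate functions mirrors the structure of the paragraph: the component can be swapped without touching the personalization logic.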

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, adjusting the reference answer result based on the user feature and the reference answer result to obtain the answer result includes: encoding the reference answer result and the user characteristic through an attention mechanism to obtain a first word vector and a second word vector; fusing the first word vector, the second word vector and the embedded vector of the target dialogue text to obtain a target word vector; mapping the target word vector into a vector with logarithmic probability, and carrying out character prediction; determining the generation probability of the character through an objective function; and determining the answer result based on the character under the condition that the generation probability of the character is larger than the preset probability.

In the embodiment of the application, the reference answer result and the user characteristic are encoded through an attention mechanism to obtain a first word vector and a second word vector; and the first word vector, the second word vector and the embedded vector of the target dialogue text are fused to obtain a target word vector. In this way, the target word vector can include preference information of the target user, the reference answer result, and information of the question to be answered, and the final answer result can be accurately determined based on the target word vector. The target word vector is mapped into a vector of logarithmic probabilities and character prediction is carried out; the generation probability of the character is determined through an objective function; and the final answer result is determined based on the corresponding character when the generation probability is greater than the preset probability. That is, the plurality of characters in the answer result of the question to be answered are decoded one by one using the user characteristic and the reference answer result.
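One decoding step of this kind can be sketched as below. The application does not give the fusion operator, the mapping to logarithmic probabilities, or the objective function, so this sketch assumes element-wise averaging for fusion, a linear projection followed by log-softmax for the mapping, and the softmax probability of the best character as its generation probability. The names `projection`, `vocabulary` and `min_probability` are illustrative.

```python
import math


def fuse(first_vector, second_vector, dialogue_vector):
    """Assumed fusion: element-wise mean of the three vectors."""
    return [
        (a + b + c) / 3
        for a, b, c in zip(first_vector, second_vector, dialogue_vector)
    ]


def log_softmax(logits):
    peak = max(logits)  # log-sum-exp trick for numerical stability
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def predict_character(target_vector, projection, vocabulary, min_probability):
    """Predict one character and keep it only if it is probable enough.

    `projection` holds one weight row per vocabulary entry.
    Returns (character or None, generation probability).
    """
    logits = [sum(w * x for w, x in zip(row, target_vector)) for row in projection]
    log_probs = log_softmax(logits)
    best = max(range(len(vocabulary)), key=lambda i: log_probs[i])
    probability = math.exp(log_probs[best])
    if probability > min_probability:
        return vocabulary[best], probability
    return None, probability
```

Repeating `predict_character` with an updated target word vector at each step would decode the answer result one character at a time; how the vector is updated between steps is outside the scope of this sketch.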

With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, in a case where the target function component exists, generating an answer result of the to-be-answered question based on the target function component, the target dialog text and a user feature of the target user includes: extracting key text matching the question intention from the target dialogue text based on the question intention of the target user in the case that the target function component exists and the target dialogue text includes multiple rounds of dialogue; and generating an answer result of the to-be-answered question based on the target functional component, the key text and the user characteristics of the target user.

In the embodiment of the application, the server stores and transmits the text in the text processing process, however, the storage capacity and the transmission capacity of the server are limited. In order to significantly reduce the size of the space storing text in the server and to increase the transmission rate of the text, the method performs text compression on target dialog text comprising multiple rounds of dialog. Specifically, by the question intention of the target user, the key text matched with the question intention is extracted from the target dialogue text, and the length of the target dialogue text is shortened to reduce the occupation of the storage space of the server and improve the transmission rate.
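The text compression described above can be sketched as a filter over the dialogue rounds. The representation of a round as a `(speaker, text)` pair, the use of keyword overlap to decide whether a round matches the question intention, and the `"; "` separator are assumptions made for illustration only.

```python
def extract_key_text(turns, intent_keywords):
    """Keep only the user turns that mention a keyword of the question intention.

    `turns` is a list of (speaker, text) pairs; `intent_keywords` is an
    iterable of words assumed to characterize the question intention.
    """
    keywords = {word.lower() for word in intent_keywords}
    kept = [
        text
        for speaker, text in turns
        if speaker == "user" and keywords & set(text.lower().split())
    ]
    return "; ".join(kept)


# Example multi-round dialogue modeled on the milk tea scenario.
TURNS = [
    ("user", "milk tea recommended for brand A"),
    ("server", "What is the price of the milk tea?"),
    ("user", "the price is lower than 20 yuan"),
    ("user", "thanks a lot"),
]
```

Here the server prompts and the unrelated closing remark are dropped, so the key text is shorter than the full dialogue while still carrying the brand and price requirements.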

In a second aspect, there is provided a server comprising: the planning module is used for decomposing a to-be-answered question proposed by a target user included in the target dialogue text into a plurality of sub-questions; analyzing the plurality of sub-questions to determine question intents of the plurality of sub-questions; and making business logic for solving the to-be-answered questions; a tool use module for respectively calling target function components matched with the question intents of the plurality of sub-questions from the plurality of function components based on the function descriptions of the plurality of function components and the question intents of the plurality of sub-questions; a memory module for providing the server with long-term or short-term storage of the user characteristics of the target user; and the execution module is used for generating an answer result of the to-be-answered question based on the business logic, the target function component, the target dialogue text and the user characteristics of the target user.

In a third aspect, a system for answering questions based on text is provided, the system comprising: an electronic device and a server as in the second aspect, the electronic device being configured to obtain and display the answer result generated by the server.

In a fourth aspect, a computer readable storage medium is provided, the computer readable storage medium storing executable program code which, when run on a computer, causes the computer to perform the method of the first aspect or any one of the possible implementations of the first aspect.

In a fifth aspect, there is provided a computer program product comprising: executable program code which, when run on a computer, causes the computer to perform the method of the first aspect or any one of the possible implementations of the first aspect.

Drawings

FIG. 1 is a schematic diagram of an interface for recommending goods according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of a method for answering questions based on text provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of an interface for answering questions based on text, according to an embodiment of the present application;

FIG. 4 is a flow chart for answering questions based on text provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a server for answering questions based on text according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a system for answering questions based on text according to an embodiment of the present application.

Detailed Description

The technical scheme of the application will be clearly and thoroughly described below with reference to the accompanying drawings. In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association relation between associated objects and indicates that three relations may exist; for example, A and/or B may indicate three cases: A exists alone, A and B exist together, and B exists alone. Furthermore, in the description of the embodiments of the present application, "plural" means two or more.

The terms "first," "second," and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features.

In the prior art, most of the questions are answered by comparing the questions to be answered with a plurality of preset questions, and determining the answer result of the question which is most matched with the questions to be answered as the answer result of the questions to be answered. Obviously, the process of answering the questions is not intelligent and cannot meet the personalized needs of the user.

For ease of understanding, a process of answering a question (a process of recommending goods) will be specifically described below with reference to fig. 1, taking a first application installed on a mobile phone as an example.

Fig. 1 is an interface schematic diagram of a recommended commodity according to an embodiment of the present application.

Taking milk tea as an example of the commodity, fig. 1 (a) illustrates an interface 101 displayed by a first application in a mobile phone in a home page state, and a search box is displayed on the interface 101. As shown in fig. 1 (a), in response to the user entering the milk tea they want to drink in the search box, the prior art ranks all the milk tea according to cost performance, and the ranked milk tea is then displayed in order on the interface 102 of the first application (see fig. 1 (b)) for the user to view.

It should be appreciated that the user may directly populate the search box in fig. 1 (a) with categories of merchandise such that the first application displays purchasable items. In addition, as shown in (a) of fig. 1, a plurality of classification controls including a food control, a vegetable control, a medicine control, a drink control, a fruit control, a flower control, a western-style food control and a hot pot control are displayed below the search box. The first application responds to clicking operation of any item of classification control by a user, displays an interface comprising a search box, and responds to filling commodity subcategories corresponding to the classification control in the search box by the user. In some embodiments, after the user clicks on the vegetable control, the user may fill in the vegetable within the search box of the displayed page. As shown in fig. 1 (a), a list of items generated based on the search behavior of the user history is also displayed under the multiple item classification control.

It should also be appreciated that as shown in fig. 1 (b), information of the ranked milk tea that is generated based on cost performance (corresponding to the comprehensive ranking on the page) is displayed on the interface 102. Of course, the user may adjust the ranked plurality of milk tea by clicking on "sales preferred" or "speed preferred" to obtain a ranked plurality of milk tea generated by sales, and a ranked plurality of milk tea generated by delivery time. Wherein, sales of the milk tea is positively correlated with the rank order of the milk tea, and delivery time of the milk tea is negatively correlated with the rank order of the milk tea. In addition, the user can adjust the ranked milk tea by clicking on "red pack" or "first visit" or "happy and moist". Specifically, the ranking order of the milky tea with the consumption red package is the front, the ranking order of the milky tea with the full-reduction activity at the time of first purchase is the front, and the praise rate of the milky tea is positively correlated with the ranking order of the milky tea.

The information of the milk tea comprises the name of a shop to which the milk tea belongs, the score of the milk tea, the sales of the milk tea in months and the distance between the shop to which the milk tea belongs and the current user.

The statement that "sales of the milk tea is positively correlated with the rank order of the milk tea" in the above scheme can be specifically understood as: the higher the sales, the further forward the ranking order; the lower the sales, the further back the ranking order. The statement that "delivery time of the milk tea is negatively correlated with the rank order of the milk tea" can be specifically understood as: the shorter the delivery time, the further forward the ranking order; the longer the delivery time, the further back the ranking order. The statement that "the praise rate of the milk tea is positively correlated with the rank order of the milk tea" can be specifically understood as: the higher the score, the further forward the ranking order; the lower the score, the further back the ranking order.
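The three orderings can be illustrated with a short sorting sketch. The shop records, field names and mode names below are invented for illustration; only the direction of each ordering (descending sales, ascending delivery time, descending score) follows the text.

```python
# Hypothetical shop records.
SHOPS = [
    {"shop": "Shop X", "monthly_sales": 300, "delivery_minutes": 40, "score": 4.6},
    {"shop": "Shop Y", "monthly_sales": 900, "delivery_minutes": 55, "score": 4.2},
    {"shop": "Shop Z", "monthly_sales": 500, "delivery_minutes": 25, "score": 4.9},
]

# One sort key per ranking mode; negating a value sorts it in descending order.
SORT_KEYS = {
    "sales_preferred": lambda shop: -shop["monthly_sales"],
    "speed_preferred": lambda shop: shop["delivery_minutes"],
    "praise_preferred": lambda shop: -shop["score"],
}


def rank(shops, mode):
    """Return shop names in display order for the chosen ranking mode."""
    return [shop["shop"] for shop in sorted(shops, key=SORT_KEYS[mode])]
```

Note that each mode considers a single attribute and nothing about the individual user, which is the limitation of the prior-art scheme that the following paragraph points out.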

However, in general, users have special demands on brands, prices, tastes, packaging, store locations, delivery times, etc. of milk tea. That is, in the process of recommending milk tea, the prior art scheme is not very intelligent, and the recommendation is performed based on a matching mode, wherein the special requirement (user characteristic) of the user is not considered, so that the recommendation expectation of the user is not reached when the milk tea is recommended to the user, and the user has poor experience.

Therefore, in view of the above problems, the present application provides a method for answering a question based on text, which can intelligently answer a question to be answered based on the personalized needs of a target user, thereby meeting those needs and improving the satisfaction of the target user.

The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.

Fig. 2 is a schematic flow chart of a method for answering questions based on text provided by an embodiment of the present application.

It should be understood that the method for answering questions based on text provided by the embodiment of the application can be applied to a background server (hereinafter referred to as a server for short) corresponding to a target application. In some embodiments, the target application is for text-based answering of questions, which may be searching or recommending goods, translating code, etc. It should be understood that the questions in this method (hereinafter referred to as questions to be answered) refer to open questions.

Illustratively, as shown in FIG. 2, the method 200 includes:

S201, the server responds to the target dialogue text, analyzes the target dialogue text and determines the question intention of the target user, and the target dialogue text comprises questions to be answered by the target user.

It should be understood that the "target dialog text" in S201 described above has two forms, the first form being a form including only questions and the second form being a form including questions and answers. The first form is equivalent to that the target user clearly expresses own question requirements in the target dialogue text, and the server can answer the questions to be answered more clearly based on the questions to be answered included in the target dialogue text. The second form corresponds to that the server further needs to output at least one question, and after the target user answers the at least one question, the server can answer the question to be answered more clearly based on the answer result of the at least one question and the question to be answered.

It will also be appreciated that, corresponding to the first form described above, after the target user enters text including questions to be answered in the target application, the method generates target dialog text based on the text. Corresponding to the second form described above, after the target user inputs a text including a question to be answered in the target application, the method presents at least one question based on the text, and after the target user answers the at least one question, the method generates a target dialog text based on the text and the answer result of the at least one question.

It should also be understood that the target dialogue text in S201 above can reflect the question to be answered by the target user, where the target dialogue text includes the question to be answered by the target user.

Taking the recommended milk tea as an example, the first form of target dialogue text is as follows:

First case: in this first case, the target dialog text includes A1, A2, and A3.

(User a) A1: milk tea recommended for brand A

(User a) A2: the price is lower than 20 yuan

(User a) A3: the distance between the milk tea shop and the current location is less than 3km

Second case: in this second case, the target dialog text includes A4.

(User a) A4: milk tea recommended for brand A and having a price below 20 yuan, and the distance between the store of milk tea and the current location is less than 3km

The second form of target dialog text is where the target dialog text includes A5, B1, A6, B2 and A7.

(User a) A5: milk tea recommended for brand A

(Server) B1: What is the price of the milk tea?

(User a) A6: the price is lower than 20 yuan

(Server) B2: What is the requirement for the order distance?

(User a) A7: the distance between the milk tea shop and the current location is less than 3km

It will be appreciated that, for the first form, the target user may enter the question requirements in the target application either once or over several inputs. For the second form, the method allows the target dialog text to include one or more rounds of dialog.
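The two forms of target dialog text can be sketched as follows. This is a minimal illustration only: the `Turn` type, the speaker labels and the helper names are assumptions introduced here and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # "user" or "server" (labels assumed for illustration)
    text: str


def build_target_dialog_text(turns: List[Turn]) -> str:
    """Join all turns, in order, into one target dialog text."""
    return "\n".join(f"({t.speaker}) {t.text}" for t in turns)


def dialog_form(turns: List[Turn]) -> str:
    """Return 'first' if only the user speaks, otherwise 'second'."""
    return "first" if all(t.speaker == "user" for t in turns) else "second"


# First form: the user states every requirement without being asked.
first_form = [
    Turn("user", "milk tea recommended for brand A"),
    Turn("user", "the price is lower than 20 yuan"),
]

# Second form: the server asks a question and the user answers it.
second_form = [
    Turn("user", "milk tea recommended for brand A"),
    Turn("server", "What is the price of the milk tea?"),
    Turn("user", "the price is lower than 20 yuan"),
]

print(dialog_form(first_form), dialog_form(second_form))
print(build_target_dialog_text(second_form))
```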

The process of analyzing the target dialog text and determining the question intention of the target user is described in detail as follows.

In a possible implementation manner, the server in S201 analyzes the target dialogue text to determine the question intention of the target user, including: the server performs embedded coding on the target dialogue text through a large language model to obtain an embedded vector of the target dialogue text; the server encodes the embedded vector through an attention mechanism in the large language model to obtain text semantic features of the target dialogue text; and the server carries out intention recognition on the target dialogue text based on the text semantic features through the large language model to obtain the question intention of the target user.

It will be appreciated that the above scheme describes the encoding stage of a large language model, that is, how the target dialog text is processed. Each character in the target dialog text, together with the position of that character, is encoded by means of embedded coding, so that the target dialog text is converted into a vector (embedded vector) representation that also retains the position information of the individual characters. The embedded vector is then encoded through an attention mechanism to obtain the text semantic features of the target dialogue text, so that the generated text semantic features focus on important character information, the semantics of the target dialogue text are captured better, and the question intention of the target user can be identified accurately.

It should also be understood that the target dialog text includes a plurality of characters, and there are different degrees of association between the plurality of characters, and the "attention mechanism" in the above solution is to find out the characters with high degrees of association to describe the text semantic features of the target dialog text.

It should be appreciated that the "large language model" in the above scheme may be referred to simply as an LLM (Large Language Model). An LLM is a neural network model with a very large number of parameters. Frameworks for building such neural networks include PyTorch, TensorFlow, Caffe and the like. The Transformer model is currently the best-known architecture for large language models. In the field of natural language processing, it can be used for text classification, text summarization, machine translation, question answering, article writing and the like; in the field of speech, for speech recognition, speech synthesis, voiceprint recognition and the like; and in the field of computer vision, for image classification, object detection, image generation and the like.

It should also be appreciated that the Transformer model consists of two parts, an encoder and a decoder. The encoder converts an input sequence (e.g., a dialog text including the question to be answered) into a series of context representation vectors and consists of multiple identical layers. Each layer consists of two sublayers, namely an attention layer and a feed-forward fully connected layer. Specifically, the attention layer lets each position in the input sequence interact with all other positions to calculate a context representation vector for each position. The feed-forward fully connected layer maps the context representation vector of each position to another vector space to capture higher-level features. The decoder takes as input the output of the encoder and the target sequence (e.g., the predicted answer result) and generates a probability distribution for each character in the target sequence. The decoder also consists of multiple identical layers, each consisting of three sublayers: an attention layer, an encoder-decoder attention layer and a feed-forward fully connected layer. The attention layer and the feed-forward fully connected layer function as they do in the encoder, and the encoder-decoder attention layer lets the input at the current position of the decoder interact with all positions of the encoder to obtain information about the target sequence. In the Transformer model, the attention mechanism is a key component. It can model the relationship between any two positions in the input sequence and can automatically learn the interdependence between different positions according to the content of the input sequence.

The calculation of the attention mechanism includes four steps: multiplying the embedded vector by the query weight, the key weight and the value weight to obtain a query vector, a key vector and a value vector; determining an attention score from the product of the query vector and the key vector; normalizing the attention score to obtain a probability value; and multiplying the probability value by the value vector to obtain the text semantic feature of the target dialogue text.

In some embodiments, the large language model is an E-GPT model, which is an improvement and optimization based on the Transformer model. The E-GPT model adopts a more complex architecture and training method, and can perform natural language processing tasks such as text generation and question answering with higher quality. The E-GPT model works autoregressively: it predicts one character at a time, takes the predicted result as input to generate the next character, and so on until the answer result of the question to be answered, i.e. an output text comprising a plurality of characters, is generated.
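The character-by-character generation loop described above can be sketched as follows. This is only an illustration of autoregressive decoding: `toy_next_char` is a hypothetical stand-in that replays a canned answer, not the E-GPT model, and the end marker and length limit are assumptions.

```python
from typing import Callable


def generate(prompt: str, next_char: Callable[[str], str],
             end: str = "\n", max_len: int = 50) -> str:
    """Predict one character at a time, feeding each prediction back as input."""
    output = ""
    while len(output) < max_len:
        # The model sees the prompt plus everything generated so far.
        ch = next_char(prompt + output)
        if ch == end:  # assumed end-of-answer marker
            break
        output += ch
    return output


PROMPT = "recommend milk tea: "
CANNED = "brand A milk tea\n"  # hypothetical answer replayed by the toy model


def toy_next_char(context: str) -> str:
    """Stand-in for a language model: returns the next canned character."""
    produced = len(context) - len(PROMPT)
    return CANNED[produced]


answer = generate(PROMPT, toy_next_char)
print(answer)
```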

In the embodiment of the application, the target dialogue text is embedded and encoded through a large language model, so that the embedded vector of the target dialogue text is obtained. In this way, a dialogue text can be represented by an embedded vector with a fixed length, and the positions of words in the target dialogue text can be obtained from the embedded vector, so that the dialogue text is convenient for computer processing. Next, the embedded vectors are encoded by an attention mechanism in the large language model, generating text semantic features of the target dialog text. This helps to focus on important information and reduce the impact of irrelevant information, enabling more accurate text semantic features. Finally, through the large language model, intention recognition is performed on the target dialogue text based on the text semantic features. This can accurately identify the user's question intent and provide a basis for subsequent question processing and answers.

In some embodiments, the server encodes the embedded vector through an attention mechanism in the large language model to obtain text semantic features of the target dialog text, including: the server multiplies the embedded vector by the query weight, the key weight and the value weight in the large language model respectively to obtain a query vector, a key vector and a value vector corresponding to the target dialogue text; the server determines the product between the query vector and the key vector as the attention score of the target dialogue text; the server normalizes the attention score to obtain a probability value; the server multiplies the probability value by the value vector to obtain the text semantic feature of the target dialogue text.
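The four steps above can be sketched in plain Python. This is a minimal single-head illustration under assumptions: the weights are toy identity matrices, normalization is taken to be a softmax, and the helper names are introduced here. Common Transformer implementations also divide the score by the square root of the key dimension, which the text above does not mention and which is therefore omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Normalize one row of scores into probabilities."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(x: Matrix, wq: Matrix, wk: Matrix, wv: Matrix) -> Matrix:
    # Step 1: multiply the embedded vectors by the query, key and value weights.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    # Step 2: the attention score is the product of query and key vectors.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Step 3: normalize the scores into probability values (softmax assumed).
    probs = [softmax(row) for row in scores]
    # Step 4: multiply the probabilities by the value vectors.
    return matmul(probs, v)


identity = [[1.0, 0.0], [0.0, 1.0]]   # toy weights, for illustration only
embedded = [[1.0, 0.0], [0.0, 1.0]]   # two characters, two dimensions
features = attention(embedded, identity, identity, identity)
print(features)
```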

It should be appreciated that in the above scheme, the embedded vector is converted into three vectors in order to capture different aspects of the information of the input object (the target dialog text). First, the query vector indicates the target of the attention calculation and is itself a representation of the input object. In the field of machine translation, a query vector is typically a word vector in a target-language sentence that represents the aspect currently requiring attention, such as a word to be translated. In a recommendation system, a query vector is typically a word vector in the target text that represents the aspect currently requiring attention, such as an item to be recommended, as characterized by a character. Second, a key vector represents a particular aspect of an input object and can be viewed as a representation of the input object in a particular dimension. For example, in the field of machine translation, the input object may be a word vector in a source-language sentence, and the key vector may represent that word in a particular dimension, such as its semantic information. In a recommendation system, the input object may be a word vector in the target text, and the key vector may represent a character in a particular dimension, such as its semantic information. Finally, a value vector is the vector to which the attention weight is applied; it corresponds to the specific value of the aspect represented by the key vector and can be regarded as the actual content of the input object. In machine translation, a value vector may be the specific word vector of a word in the source-language sentence. In a recommendation system, the value vector may be the specific word vector of a character in the target dialog text.

It should also be appreciated that the "target dialog text" in S201 above may be regarded as a prompt provided to the large language model to direct it to generate a particular type of text or answer. The prompt may be a sentence, a question, an article or a topic that guides the large language model in generating information or answers related to the prompt. For example, when given the question "what is artificial intelligence", a large language model can generate an article or answer about artificial intelligence. In addition, the large language model has a corresponding memory module, which is used for storing information perceived by the large language model. The memory module can be built on a non-relational database. In some embodiments, the non-relational database is a Redis database.

In some embodiments, the server performs intent recognition on the target dialog text based on the text semantic features through the large language model to obtain a question intent of the target user, including: the server decodes the text semantic features of the target dialogue text through the large language model to obtain decoding features of all characters in the target dialogue text; the server decodes a plurality of words based on the decoding feature through the large language model to obtain a question intent of the target user.

S202, the server determines whether target function components matched with the questioning intention exist in the plurality of function components based on the function descriptions of the plurality of function components and the questioning intention.

It should be understood that the "functional component" in S202 above may be specifically understood as a tool or a function for generating an answer result of a question to be answered. The plurality of functional components comprise a code interpreter component, a searching component, a recommending component and the like, wherein the code interpreter component is used for automatically inquiring and searching a database to generate various reports and visual analysis charts; the searching component is used for acquiring relevant information of the latest time from the Internet; the recommendation component is used to recommend related goods or images, etc.

It should also be understood that the "question intention" in the above-described scheme is used to indicate the purpose of the target user to present the question to be answered and the thought tendency of the question-former (target user), and can be simply understood as what question the target user wants the server to answer.

In addition, for the target function component in the above solution, the method 200 also proposes some usage rules: the target functional component is specifically a functional component; parameters in the target function component may be modified as desired.

In a possible implementation manner, S202 includes: the server compares the question intent with the function descriptions of the plurality of function components; if any one of the function descriptions of the plurality of function components is matched with the questioning intention, the server determines that a target function component matched with the questioning intention exists in the plurality of function components, wherein the target function component is a function component corresponding to the function description; in the event that none of the functional descriptions of the plurality of functional components match the question intent, the server determines that there is no target functional component of the plurality of functional components that matches the question intent.

In some embodiments, the server compares the questioning intent with the functional descriptions of the plurality of functional components, including: the server determines a plurality of similarities between feature vectors corresponding to the question intents and feature vectors corresponding to the function descriptions of the plurality of functional components, respectively.

In some embodiments, in the event that there are multiple functional descriptions matching the questioning intent, the server determines a maximum similarity from the multiple similarities; and the server determines the function description of the function component corresponding to the maximum similarity as the target function description.
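The comparison and selection in S202 can be sketched as follows. The text only speaks of "similarities" between feature vectors; the use of cosine similarity, the matching threshold of 0.8, the toy vectors and the component names are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def select_component(intent_vec: List[float],
                     description_vecs: Dict[str, List[float]],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching component name, or None if nothing matches."""
    sims = {name: cosine(intent_vec, vec) for name, vec in description_vecs.items()}
    matched = {name: s for name, s in sims.items() if s >= threshold}
    if not matched:
        return None  # no target functional component exists
    # Several descriptions may match: keep the one with maximum similarity.
    return max(matched, key=matched.get)


# Hypothetical feature vectors of three function descriptions.
components = {
    "code_interpreter": [1.0, 0.0, 0.0],
    "search": [0.0, 1.0, 0.0],
    "recommend": [0.1, 0.2, 0.97],
}

print(select_component([0.0, 0.1, 1.0], components))
print(select_component([1.0, 1.0, 1.0], components))
```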

S203, in the case that the target function component exists, the server generates an answer result of the to-be-answered question based on the target function component, the target dialogue text and the user characteristics of the target user.

It should be understood that the "target function component" in S203 may be specifically understood as a tool or a function, where corresponding parameters exist. The "target dialogue text" in S203 described above may be used as a parameter in the target function component. Further, the "answer result of question to answer" in S203 described above takes into consideration the user characteristics of the target user, that is, the personal preference of the target user.

In a possible implementation manner, the method for determining the user characteristics of the target user in S203 includes: the server determines attribute information of the target user and historical behavior information of the target user before generating the target dialogue text; the server determines the user characteristic based on the user attribute information and the historical behavior information.

It should be appreciated that the "user features" in the above scheme are user tags, abstracted from concrete user roles, that are used to distinguish users. The attribute information of the target user includes natural attribute information (static attribute information) of the target user and dynamic attribute information of the target user. The static attributes are the basis for outlining a user portrait (gender, age, education, role, income, region, marital status, etc.); the dynamic attributes refer to the user's internet behavior, entertainment preferences, social habits, travel modes and ways of acquiring knowledge. Internet behavior includes clicking, searching, browsing, bookmarking and the like.

It should also be appreciated that the target dialog text is generated by the text entered by the target user (or by the text entered by the target user, and at least one question entered by the server and the answer result of the target user to the at least one question). The target user enters text into the target application through text entry behavior. Thus, the text input behavior of the target user occurs before the target dialog text is generated.

In some embodiments, the historical behavior information of the target user is "2023-09-03 11:45:30, noodles, not clicked, not ordered" or "2023-09-03 11:46:01, mixed rice, clicked, not ordered" or "2023-09-03 11:46:55, topped noodles, clicked, ordered".
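A record in the format shown above (timestamp, item, click status, order status) could be parsed as follows. This is a sketch: the `BehaviorRecord` type and its field names are assumptions, and the sample item "milk tea" is a hypothetical value chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BehaviorRecord:
    time: datetime
    item: str
    clicked: bool
    ordered: bool


def parse_record(line: str) -> BehaviorRecord:
    """Split a comma-separated behavior record into typed fields."""
    stamp, item, click, order = [part.strip() for part in line.split(",")]
    return BehaviorRecord(
        time=datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
        item=item,
        clicked=(click == "clicked"),   # "not clicked" maps to False
        ordered=(order == "ordered"),   # "not ordered" maps to False
    )


record = parse_record("2023-09-03 11:46:01, milk tea, clicked, not ordered")
print(record)
```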

It should be understood that the attribute information of the target user and the historical behavior information of the target user acquired in the method are licensed by the target user.

The determination process of the "attribute information of the target user and the historical behavior information of the target user" is described in detail as follows.

In some embodiments, the server determining attribute information of the target user and historical behavior information of the target user prior to generating the target dialog text includes: the server correspondingly searches attribute information of the target user from the target server based on account information of the target user; the server determining a target time for generating the target dialog text; the server searches target log information before the target time from log information of various applications based on the account information; the server determines the historical behavior information based on the target log information.

It should be understood that the "target server" in the above-described scheme refers to a server for managing user information of a plurality of users. In some embodiments, account information of the target user in the above scheme is information capable of uniquely identifying the target user, and the account information is at least one of a mobile phone number and an identity account of the target user. In some embodiments, the plurality of applications in the above scheme include a target application, and the plurality of applications are applications in a target device, where the target device includes the target application.
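The lookup steps above can be sketched with in-memory stand-ins. The dictionaries replace the target server and the per-application logs, and the account value, log layout and function names are assumptions made for illustration.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# (account, time, event); this layout is an assumption.
LogEntry = Tuple[str, datetime, str]


def find_attributes(user_store: Dict[str, dict], account: str) -> dict:
    """Look up attribute information by account on a stand-in target server."""
    return user_store.get(account, {})


def find_target_logs(app_logs: Dict[str, List[LogEntry]], account: str,
                     target_time: datetime) -> List[LogEntry]:
    """Collect this account's log entries from all applications before target_time."""
    hits = [entry for entries in app_logs.values() for entry in entries
            if entry[0] == account and entry[1] < target_time]
    return sorted(hits, key=lambda entry: entry[1])


user_store = {"user-001": {"age": 28, "city": "Shanghai"}}
app_logs = {
    "target_app": [
        ("user-001", datetime(2023, 9, 3, 11, 45, 30), "viewed item"),
        ("user-001", datetime(2023, 9, 3, 12, 30, 0), "ordered item"),
    ],
    "other_app": [("user-002", datetime(2023, 9, 3, 11, 0, 0), "searched tea")],
}

target_time = datetime(2023, 9, 3, 12, 0, 0)  # time the dialog text was generated
attributes = find_attributes(user_store, "user-001")
logs = find_target_logs(app_logs, "user-001", target_time)
print(attributes, logs)
```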

In a possible implementation manner, the attribute information includes dynamic attribute information and natural attribute information, the dynamic attribute information is used for describing attribute information corresponding to text input behavior of the target user, and the server determines the user feature based on the user attribute information and the historical behavior information, and includes: the server determines a first user characteristic of the target user in a first period of time based on the time and the position of the text input behavior generated by the target user in the dynamic attribute information and the historical behavior information of the first period of time in the historical behavior information; the server determines a second user characteristic of the target user in a second period based on natural attribute information of the target user and historical behavior information of the second period in the historical behavior information, wherein a time difference between an earliest time in the second period and a current time is larger than a time difference between the earliest time in the first period and the current time.

It should be appreciated that the text input actions of the target user occur prior to generating the target dialog text. The method analyzes the target dialogue text in time after generating the target dialogue text, and the time difference between the time of generating the text input action of the target user and the time of generating the target dialogue text is smaller, so that the time difference between the time of generating the text input action and the current time can be considered to be smaller to a certain extent. In addition, the time difference between the earliest time and the current time in the second period in the above scheme is greater than the time difference between the earliest time and the current time in the first period, so the first period is short or near term and the second period is long term. Thus, a recent first user characteristic (recent preference) of the target user may be determined based on the time and location in the dynamic attribute information at which the text input behavior was generated by the target user, and the historical behavior information for the first period in the historical behavior information; a long-term second user characteristic (long-term preference) of the target user may be determined based on the natural attribute information of the target user and the historical behavior information of the second period of time in the historical behavior information.
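The distinction between the first (recent) period and the second (long-term) period can be sketched as two look-back windows ending at the current time. The window lengths of 7 and 180 days are assumptions, as is the choice to let the second period contain the first; the text only requires that the second period start earlier.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_periods(event_times: List[datetime], now: datetime,
                  first_window: timedelta = timedelta(days=7),
                  second_window: timedelta = timedelta(days=180)
                  ) -> Tuple[List[datetime], List[datetime]]:
    """Return the events inside the recent window and inside the long-term window."""
    first = [t for t in event_times if now - first_window <= t <= now]
    second = [t for t in event_times if now - second_window <= t <= now]
    return first, second


now = datetime(2023, 9, 3, 12, 0, 0)
event_times = [datetime(2023, 9, 2), datetime(2023, 6, 1), datetime(2022, 1, 1)]
first_period, second_period = split_periods(event_times, now)
print(len(first_period), len(second_period))
```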

In an embodiment of the present application, since the target dialog text includes text input by the target user, the text input behavior of the target user occurs before the target dialog text is generated, and the time at which the text input behavior occurs can be regarded as recent. The method can therefore accurately determine the short-term (recent) characteristics (first user characteristics) of the target user from the time and the location at which the text input behavior occurs together with the short-term historical behavior information. Since the static attributes are relatively fixed and not easily changed, the method can accurately determine the long-term characteristics (second user characteristics) of the target user from the natural attribute information and the long-term historical behavior information of the target user.

First: first user feature

In a possible implementation manner, the server determines a first user characteristic of the target user in a first period based on the time and the location where the text input behavior is generated by the target user in the dynamic attribute information and the historical behavior information of the first period in the historical behavior information, and includes: the server determines whether the time is a working day and/or a time period corresponding to the time and/or weather conditions corresponding to the time, and determines a target city corresponding to the position; the server determines behavior characteristics, time characteristics, environment characteristics and position characteristics of the target user in the first time period based on the historical behavior information of the first time period, whether the time is a working day and/or a time period corresponding to the time and/or weather conditions corresponding to the time and a target city corresponding to the position.

In an embodiment of the present application, short-term characteristics of the target user are discussed further. Specifically, when the target user generates a text input action, there are corresponding generation time (time) and generation position (position). Based on the time, it can be determined whether the time is a workday and/or a time period corresponding to the time and/or a weather condition corresponding to the time, that is, whether the target user is on a workday, a time period in which the target user is located (a time feature in which the target user is located in the first time period), and a weather condition in which the target user is located (an environmental feature in which the target user is located in the first time period); based on the location, a target city corresponding to the location, that is, a target city in which the target user is located (a location feature in which the target user is located in the first period) can be determined; further, based on the historical behavior information of the first period, a behavior characteristic of the target user within the first period can be determined. That is, the method can acquire the user characteristics of the user in the near term (short term) in multiple ways by the three ways.

In some embodiments, the time period is any one of morning, afternoon, and evening.

In some embodiments, where the question to be answered is a request for a food recommendation, the time period is a meal period. In some embodiments, the method 200 further comprises: the server determines whether the time falls on a holiday and determines the target business district to which the location belongs.

In some embodiments, the server determines, based on the historical behavior information of the first period, whether the time is a weekday and/or a time period corresponding to the time and/or a weather condition corresponding to the time, and a target city corresponding to the location, a behavior feature, a time feature, an environmental feature, and a location feature of the target user in the first period, including: in the case that the historical behavior information of the first period indicates that the number of times that the target user clicks the target page is greater than a first preset number of times, the server determines that the behavior feature in the first period includes a function that the target user prefers to use in the target page; in the case that the time is a working day, the server determines that the time feature includes that the target user has an idle time less than a first preset duration, and the target user is in the target period; in the case that the weather condition is a target weather condition, the server determines that the target user is in the target weather condition; the server determines, based on the target city, that the location feature includes the target user being in the target city.

In other embodiments, the question to be answered is a request for a food recommendation, and the server determining, based on the historical behavior information of the first period, whether the time is a working day and/or a time period corresponding to the time and/or a weather condition corresponding to the time, and a target city corresponding to the location, a behavior feature, a time feature, an environmental feature and a location feature of the target user in the first period includes: in the case that the historical behavior information of the first period indicates that the number of times the target user has ordered food of a preset taste is greater than the first preset number of times, the server determines that the behavior feature in the first period includes that the target user prefers to order food of the preset taste; in the case that the time is a working day, the server determines that the time feature includes that the target user accepts a delivery waiting time of less than a second preset duration and that the target user orders the corresponding food in the target period; in the case that the weather condition is the target weather condition, the server determines that the environmental feature includes that the target user orders the corresponding food under the target weather condition; the server determines, based on the target city, that the location feature includes the target user ordering the corresponding food around the target city.
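The rule-based derivation of the first user features for a food recommendation can be sketched as follows. The thresholds, period boundaries, feature wording and function names are assumptions, and treating Monday to Friday as working days ignores public holidays.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List


def time_period(when: datetime) -> str:
    """Map a time to morning, afternoon or evening (boundaries assumed)."""
    if when.hour < 12:
        return "morning"
    return "afternoon" if when.hour < 18 else "evening"


def first_user_features(ordered_tastes: List[str], when: datetime, weather: str,
                        city: str, first_preset_number: int = 3) -> Dict[str, str]:
    """Derive behavior, time, environment and location features for the first period."""
    features = {"environment": weather, "location": city}
    if ordered_tastes:
        taste, count = Counter(ordered_tastes).most_common(1)[0]
        if count > first_preset_number:
            features["behavior"] = f"prefers {taste} food"
    if when.weekday() < 5:  # Monday to Friday; public holidays are ignored here
        features["time"] = f"workday {time_period(when)}, short delivery wait"
    return features


features = first_user_features(["spicy"] * 4 + ["sweet"],
                               datetime(2023, 9, 4, 11, 45), "rainy", "Shanghai")
print(features)
```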

Second: second user feature

In some embodiments, the server determines a second user characteristic of the target user during a second period of time based on natural attribute information of the target user and historical behavior information of the second period of time, including: the server determines the behavior characteristic, age characteristic, education characteristic and income characteristic of the target user in the second period based on the historical behavior information of the second period and the age, education and income in the natural attribute information.

In an embodiment of the present application, the long-term characteristics of the target user are discussed further. Specifically, when the natural attribute information includes age, education and income, each has a corresponding user feature. Based on the age, the education and the income in the natural attribute information, the age characteristic, the education characteristic and the income characteristic of the target user are determined respectively; further, based on the historical behavior information of the second period, a behavior characteristic of the target user within the second period can be determined. That is, the method can obtain the long-term user characteristics of the user from multiple aspects through these two sources.

In some embodiments, the server determines a behavior characteristic, an age characteristic, an education characteristic and an income characteristic of the target user during the second period based on the historical behavior information of the second period and the age, the education and the income in the natural attribute information, including: in the case that the historical behavior information of the second period indicates that the number of times the target user clicks the target page is greater than a second preset number of times, the server determines that the behavior feature in the second period includes a function that the target user prefers to use in the target page; in the case that the age of the target user belongs to a preset age group, the server determines that the age characteristic includes that the target user belongs to the population of the preset age group; in the case that the education level of the target user belongs to a preset education level, the server determines that the education characteristic includes that the target user has a cultural level corresponding to the preset education level; in the case that the income of the target user falls within a preset income range, the server determines that the income characteristic includes that the target user has an income level corresponding to the preset income range.
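The four rules above can be sketched as simple range checks. All preset values (age group, education level, income range, click count) and the feature wording are assumptions chosen for illustration.

```python
from typing import Dict, Tuple


def second_user_features(age: int, education: str, income: float, page_clicks: int,
                         age_group: Tuple[int, int] = (18, 30),
                         preset_education: str = "bachelor",
                         income_range: Tuple[float, float] = (5000.0, 10000.0),
                         second_preset_number: int = 10) -> Dict[str, str]:
    """Derive long-term behavior, age, education and income features."""
    features: Dict[str, str] = {}
    if page_clicks > second_preset_number:
        features["behavior"] = "prefers functions on the target page"
    if age_group[0] <= age <= age_group[1]:
        features["age"] = f"age group {age_group[0]}-{age_group[1]}"
    if education == preset_education:
        features["education"] = preset_education
    if income_range[0] <= income <= income_range[1]:
        features["income"] = f"income {income_range[0]:.0f}-{income_range[1]:.0f}"
    return features


long_term = second_user_features(age=25, education="bachelor",
                                 income=8000.0, page_clicks=12)
print(long_term)
```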

In a possible implementation manner, S203 includes: in the case where the target function component exists and the target dialog text includes multiple rounds of dialog, the server extracts key text matching the question intention from the target dialog text based on the question intention of the target user; and the server generates an answer result of the to-be-answered question based on the target functional component, the key text and the user characteristics of the target user through the large language model.

It will be appreciated that the target application limits the length of text entered, which may reduce the occupation of memory space of the server (for processing text entered into the target application) and increase the transmission rate of text to some extent. In some embodiments, the length is 8k. In general, the byte length of the text of the multi-turn dialog is longer, and the occupied storage space is larger, so that the target dialog text is compressed when the target dialog text comprises the multi-turn dialog. Of course, it may also be determined whether to compress the target dialog text based on the byte length of the target dialog text. That is, in case that the byte length of the target dialog text is greater than the preset length, the server compresses the target dialog text.
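The byte-length check can be sketched as follows. Interpreting "8k" as 8 × 1024 bytes and measuring the text in UTF-8 are assumptions; the application states neither the unit nor the encoding.

```python
def exceeds_preset_length(text: str, preset_length: int = 8 * 1024) -> bool:
    """Return True if the UTF-8 byte length of the text exceeds the preset length."""
    return len(text.encode("utf-8")) > preset_length


short_text = "milk tea recommended for brand A"
# 4000 characters, but each takes 3 bytes in UTF-8, so 12000 bytes in total.
long_text = "奶茶" * 2000

print(exceeds_preset_length(short_text), exceeds_preset_length(long_text))
```

Counting bytes rather than characters matters for Chinese text: the second sample is under 8k characters but over 8k bytes.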

In some embodiments, the server extracting key text matching the question intent from the target dialog text based on the question intent of the target user includes: the server scores, through the large language model and in front-to-back order, the importance of each character in the target dialog text based on the question intent to obtain an importance score of each character; the server takes, through the large language model, the characters whose importance scores are greater than a preset score as keywords; the server generates the key text based on the keywords through the large language model.

In some embodiments, the server scoring the importance of each character in the target dialog text through the large language model based on the question intent in front-to-back order, to obtain the importance score of each character, includes: the server scores, through the large language model and in front-to-back order, the importance of each character using a PageRank function based on the question intent, so as to obtain the PageRank score of each character.
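
The score, threshold, and assemble steps above can be illustrated with the sketch below. The scoring function here is a toy stand-in (a character scores 1.0 if it appears in the intent terms, else 0.0); it is neither the large language model nor the PageRank function of the application, and the names `extract_key_text` and `make_toy_scorer` are hypothetical.

```python
from typing import Callable


def extract_key_text(dialog_text: str,
                     score_char: Callable[[str], float],
                     preset_score: float) -> str:
    """Keep, in front-to-back order, the characters scoring above the preset score."""
    keywords = []
    for ch in dialog_text:                 # front-to-back order
        if score_char(ch) > preset_score:  # strictly greater than the preset score
            keywords.append(ch)
    return "".join(keywords)


def make_toy_scorer(intent_terms: str) -> Callable[[str], float]:
    """Toy stand-in scorer: 1.0 for characters found in the intent terms, else 0.0."""
    wanted = set(intent_terms)
    return lambda ch: 1.0 if ch in wanted else 0.0
```

For example, with the intent terms "奶茶" (milk tea) and a preset score of 0.5, the dialog text "今天想喝奶茶" would be reduced to the key text "奶茶".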

In a possible implementation manner, S203 includes: in the case that the target function component exists, the server takes the target dialogue text as a parameter in the target function component, and determines a reference answer result of the to-be-answered question through the target function component and the target dialogue text; and the server adjusts the reference answer result based on the user characteristics and the reference answer result through the large language model to obtain the answer result.

It should be understood that the "reference answer result of the question to be answered" in the above-described scheme is an answer result recalled by the method through the target function component containing the parameters (target dialog text), which does not take into account the recent preference (also called short-term preference) and long-term preference of the target user.

In the embodiment of the application, after the target dialogue text is used as the parameter in the target function component, the reference answer result of the to-be-answered question is recalled through the target function component containing the parameter. That is, the method obtains the reference answer result of the question to be answered more quickly through the recall function of the target function component. After the reference answer result is obtained, the large language model adjusts the reference answer result according to the user characteristics to obtain the answer result of the to-be-answered question. That is, the method also considers the user characteristics of the target user, so that it not only obtains the answer result quickly and intelligently, but also meets the personalized requirements of the target user, improving the satisfaction of the target user.
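
The two-stage flow described above (recall a reference answer result through the component, then adjust it with the user characteristics) can be sketched as follows. The catalogue contents, the `item_search` body, and the reordering rule in `adjust` are hypothetical; in particular, `adjust` is a simple stand-in for the adjustment that the application performs through the large language model.

```python
from typing import Callable


def item_search(dialog_text: str) -> list[str]:
    """Hypothetical search component; the dialogue text is passed in as its parameter."""
    catalog = {"milk tea": ["milk tea A", "milk tea B", "milk tea C"]}
    results = []
    for keyword, items in catalog.items():
        if keyword in dialog_text:
            results.extend(items)
    return results


def adjust(reference: list[str], preferred: set[str]) -> list[str]:
    """Stand-in for the model-based adjustment: move preferred items to the front.

    sorted() is stable, so the original order is otherwise preserved.
    """
    return sorted(reference, key=lambda item: item not in preferred)


def answer(dialog_text: str,
           component: Callable[[str], list[str]],
           preferred: set[str]) -> list[str]:
    reference = component(dialog_text)   # reference answer result, preference-agnostic
    return adjust(reference, preferred)  # answer result, personalised
```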

In a possible implementation manner, the server adjusts the reference answer result based on the user feature and the reference answer result through the large language model to obtain the answer result, and includes: the server encodes the reference answer result and the user characteristic through an attention mechanism in the large language model to obtain a first word vector and a second word vector; the server fuses the first word vector, the second word vector and the embedded vector of the target dialogue text through the large language model to obtain a target word vector; the server maps the target word vector into a vector with logarithmic probability through the large language model, and predicts characters; the server determines the generation probability of the character based on the objective function through the large language model; and under the condition that the generation probability of the characters is larger than the preset probability, the server determines the answer result based on the characters through the large language model.

It should be appreciated that in the above scheme, the large language model may fuse the first word vector, the second word vector and the embedded vector by concatenation, element-wise multiplication, element-wise addition, or outer product, so as to obtain the target word vector. In addition, in the above scheme, "mapping the target word vector to a vector of logarithmic probability" refers to mapping the target word vector to a vector ten thousand cells in length, in which the score of each cell corresponds to a particular character.
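
The fusion options named above can be illustrated on plain Python lists. This is a minimal sketch under the assumption that the vectors are one-dimensional lists of floats; the function names `fuse` and `outer` are hypothetical, and the outer product is shown for a pair of vectors only.

```python
import math


def fuse(vectors: list[list[float]], mode: str) -> list[float]:
    """Fuse several vectors by concatenation, element-wise addition or multiplication."""
    if mode == "concat":
        return [x for vec in vectors for x in vec]
    if len({len(vec) for vec in vectors}) != 1:
        raise ValueError("element-wise fusion needs equal-length vectors")
    if mode == "add":
        return [sum(column) for column in zip(*vectors)]
    if mode == "mul":
        return [math.prod(column) for column in zip(*vectors)]
    raise ValueError(f"unknown mode: {mode}")


def outer(u: list[float], v: list[float]) -> list[list[float]]:
    """Outer product of two vectors, returned as a len(u) x len(v) matrix."""
    return [[a * b for b in v] for a in u]
```

Concatenation preserves every component at the cost of a longer vector, while the element-wise variants keep the dimension fixed but require the three vectors to have the same length.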

It should also be appreciated that the above scheme describes a process of determining an answer result of a question to be answered by a decoding stage in a large language model, specifically, encoding the reference answer result and the user feature by an attention mechanism in the large language model to obtain a first word vector and a second word vector; fusing the first word vector, the second word vector and the embedded vector of the target dialogue text based on the large language model to obtain a target word vector; mapping the target word vector into a vector with logarithmic probability through the large language model, and carrying out character prediction on the vector with logarithmic probability; further, when the generation probability of the character is large, the answer result of the question to be answered is determined based on the character through the large language model. The method encodes the reference answer result and the user characteristic through the attention mechanism in the large language model, so that the generated first word vector and second word vector pay attention to important character information, and the important information and the user characteristic in the reference answer result can be captured better. The probability of generation of the character is judged, and the final answer result is determined based on the character when the probability of generation is large, which can improve the accuracy of the answer result generated.

In the embodiment of the application, the reference answer result and the user characteristic are encoded through the attention mechanism in the large language model to obtain a first word vector and a second word vector, and the first word vector, the second word vector and the embedded vector of the target dialogue text are fused through the large language model to obtain a target word vector. In this way, the target word vector can include the preference information of the target user, the reference answer result, and the information of the question to be answered, so that the final answer result can be accurately determined based on the target word vector. The target word vector is then mapped into a vector of logarithmic probability through the large language model and character prediction is carried out; the generation probability of the character is determined through the large language model based on the objective function; and the final answer result is determined based on the corresponding character when the generation probability is high. That is, the plurality of characters in the answer result of the question to be answered are decoded one by one based on the user characteristic and the reference answer result.

Optionally, the objective function is a Softmax function for converting real values into a probability distribution.

Optionally, in a case that the generation probability of the character is greater than the preset probability, the server determining the answer result based on the character through the large language model includes: the server integrates, through the large language model, the plurality of characters based on the generation time of each character to obtain the answer result.
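
The last three decoding steps (Softmax over the score vector, comparison with the preset probability, and integration of characters by generation time) can be sketched as follows. The vocabulary, the threshold value used in the example, and the function names are hypothetical; picking the highest-probability character is an assumption, since the application does not state how the candidate character is chosen.

```python
import math
from typing import Optional


def softmax(logits: list[float]) -> list[float]:
    """Convert real-valued scores into a probability distribution."""
    peak = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_character(logits: list[float], vocabulary: list[str],
                      preset_probability: float) -> Optional[str]:
    """Return the most likely character if its probability exceeds the preset one."""
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] > preset_probability:
        return vocabulary[best]
    return None


def assemble(timed_characters: list[tuple[int, str]]) -> str:
    """Integrate characters into the answer result in order of generation time."""
    return "".join(ch for _, ch in sorted(timed_characters))
```

With the vocabulary ["奶", "茶", "水"] and a preset probability of 0.5, the scores [5.0, 0.0, 0.0] yield "奶", whereas the uniform scores [0.0, 0.0, 0.0] yield no character because each probability is only one third.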

It should be appreciated that the server in the method 200 may be regarded as an intelligent agent (Agent), which may be understood as a computing entity capable of autonomous understanding, planning and decision-making, and performing complex tasks, and which features autonomy, responsiveness, sociality, initiative, and the like. The method 200 combines the Agent with a large language model; specifically, it constructs an intelligent Agent based on an LLM, that is, it uses the LLM as the center of the Agent to understand the question intention of the target user and then decide whether to call a target function component, considers the long-term preference and short-term preference of the target user, and finally generates an answer result of the to-be-answered question based on the target function component, the text (target dialogue text) comprising the to-be-answered question, and the long-term preference and short-term preference.

In addition, the method 200 also proposes some usage rules: if the target dialogue text directly indicates that the target user has input a store name or a category name, the server determines that the question intention of the target user is a recommendation intention and calls a store search function component or a commodity search function component to obtain the answer result of the to-be-answered question; when answering the to-be-answered question included in the target dialogue text, the server answers autonomously as far as possible, without using any function component; if the target user has no obvious user characteristics, commodities around the target user may be recommended to the target user during commodity recommendation, which can be regarded as generalized recommendation; if the text input by the target user for generating the target dialogue text contains errors, the server needs to infer the intended characters and confirm them with the target user before carrying out subsequent tasks, which amounts to confirming the intention of the target user and then completing the answer; if the target user inputs an unreasonable question, the server may refuse to answer it and give a polite answer result, which can be regarded as fallback guidance.

It should also be appreciated that the target application in the method 200 is also capable of answering objective questions. In that case, the server uses the reference answer result as the final answer result of the question to be answered. That is, after determining the reference answer result, the method 200 may determine, based on the question to be answered, whether to adjust the reference answer result based on the user characteristics of the target user.

Fig. 3 is a schematic diagram of an interface for answering questions based on text according to an embodiment of the present application.

Illustratively, as shown in interface 301 of fig. 3 (a), the target user inputs the text "recommend milk tea", i.e., the target dialog text, on the current page of the target application through the keyboard area. The server corresponding to the target application analyzes the target dialogue text and determines the question intention of the target user; determines, based on the function descriptions of the plurality of function components and the question intention, whether a target function component matching the question intention exists in the plurality of function components, wherein the target function component is a search function component (item_search) for searching for milk tea; and, based on the search function component, the target dialog text, and the user characteristics (personal preferences) of the target user, generates an interface 302 as shown in fig. 3 (b), in which the answer result of the server for "recommend milk tea" is displayed. The answer result is "Two milk teas recommended for you: milk tea A, corresponding to link 303; milk tea B, corresponding to link 304", and the target user may select milk tea A or milk tea B by clicking on link 303 or link 304.

From the interfaces 101 and 102 (prior art) and the interfaces 301 and 302 (the present method) shown in the present application, it can be seen that the method for answering questions based on text in the present application differs from the prior art, particularly when used on a user interaction interface.

Fig. 4 is a flowchart of answering questions based on text provided by an embodiment of the application.

Illustratively, the process of answering questions based on text is described by taking a commodity recommendation scenario as an example. As shown in fig. 4, when the target user asks a question, the server responds to the target dialogue text, analyzes the target dialogue text, and determines (that is, decides) the question intention of the target user; the server determines attribute information of the target user and historical behavior information of the target user before the target dialogue text is generated; the server determines the user characteristics of the target user based on the attribute information and the historical behavior information; the server determines whether a target function component matching the question intention exists in the plurality of function components based on the function descriptions of the plurality of function components and the question intention; if the text input by the target user for generating the target dialogue text contains errors, the server needs to infer the intended characters and confirm them with the target user before carrying out the subsequent tasks to complete the answer; if the target user inputs an unreasonable question, the server may refuse to answer it and give a fallback answer result.

In the case that the target function component exists, the server determines whether the target dialogue text includes a store name or a category name. In the case that it does, the server determines that the question intention of the target user is a precise recommendation intention, and may call the store search function component or the commodity search function component to generate the answer result of the question based on the user characteristics of the target user. In the case that it does not, that is, in the case of a generalized recommendation, the server recommends commodities around the target user to the target user without using the user characteristics of the target user, and then generates the answer result. Finally, the server returns the answer result to the target user.
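
The branching of fig. 4 described above can be summarised as a small decision function. This is a sketch under stated assumptions: the `Request` flags, the returned action labels, and the order in which the input-error and unreasonable-question checks are applied are hypothetical, and the branch for a missing component follows the usage rule that the server answers autonomously where possible.

```python
from dataclasses import dataclass


@dataclass
class Request:
    has_input_error: bool = False          # text contains wrong characters
    is_unreasonable: bool = False          # question should be refused
    component_exists: bool = False         # a matching function component was found
    names_store_or_category: bool = False  # text includes a store or category name


def decide(request: Request) -> str:
    """Return a label for the branch of the flow that handles the request."""
    if request.has_input_error:
        return "confirm_corrected_text_with_user"
    if request.is_unreasonable:
        return "refuse_with_fallback_answer"
    if not request.component_exists:
        return "answer_without_component"
    if request.names_store_or_category:
        return "precise_recommendation_with_user_features"
    return "generalized_recommendation_of_nearby_goods"
```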

Fig. 5 is a schematic structural diagram of a server for answering questions based on text according to an embodiment of the present application.

Illustratively, as shown in FIG. 5, the server 500 includes:

a planning module 501, configured to decompose the question to be answered, which is posed by the target user and included in the target dialog text, into a plurality of sub-questions; analyze the plurality of sub-questions to determine question intents of the plurality of sub-questions; and formulate business logic for solving the question to be answered;

a tool usage module 502, configured to call, from the plurality of function components, target function components respectively matching the question intents of the plurality of sub-questions, based on the function descriptions of the plurality of function components and the question intents of the plurality of sub-questions;

a memory module 503, configured to provide the server 500 with long-term or short-term storage of the user characteristics of the target user; and

an execution module 504, configured to generate an answer result of the question to be answered based on the business logic, the target function component, the target dialogue text and the user characteristics of the target user.
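
The cooperation of the four modules listed above can be sketched as follows. Everything in the sketch is a toy stand-in: the decomposition rule (splitting on " and "), the keyword-based intent rule, the substring match between intent and function description, and the answer strings are hypothetical, and the business logic formulated by the planning module is reduced to the order of the sub-questions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Component = Callable[[str], str]


class PlanningModule:
    def plan(self, question: str) -> list[tuple[str, str]]:
        """Return (sub-question, intent) pairs using toy rules."""
        pairs = []
        for part in question.split(" and "):
            sub_question = part.strip()
            intent = "recommendation" if "recommend" in sub_question else "chat"
            pairs.append((sub_question, intent))
        return pairs


@dataclass
class ToolUseModule:
    # Maps a function description to the component it describes.
    components: dict[str, Component]

    def select(self, intent: str) -> Optional[Component]:
        for description, component in self.components.items():
            if intent in description:   # toy match between intent and description
                return component
        return None


@dataclass
class MemoryModule:
    long_term: list[str] = field(default_factory=list)
    short_term: list[str] = field(default_factory=list)

    def user_features(self) -> list[str]:
        return self.long_term + self.short_term


class ExecutionModule:
    def run(self, plan: list[tuple[str, str]], tools: ToolUseModule,
            memory: MemoryModule) -> list[str]:
        answers = []
        for sub_question, intent in plan:
            component = tools.select(intent)
            if component is None:
                answers.append(f"direct answer to: {sub_question}")
            else:
                features = ", ".join(memory.user_features())
                answers.append(f"{component(sub_question)} [tailored to: {features}]")
        return answers
```

Keeping the component lookup keyed by function description mirrors the idea that components are selected by comparing their descriptions with the question intent, rather than by a fixed name.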

It should be appreciated that the business logic of planning module 501 indicated in FIG. 5 may be understood as some of the usage rules set forth in method 200. The long-term or short-term stored user characteristics of the target user in the memory module 503 in fig. 5 may be considered as a second user characteristic (long-term preference) and a first user characteristic (recent preference), respectively, in the method 200.

It should also be appreciated that the step in which the planning module 501 in FIG. 5 analyzes the plurality of sub-questions to determine their question intents corresponds to the step of deciding the question intention in FIG. 4. The step in which the tool usage module 502 in FIG. 5 calls, from the plurality of function components, the target function components respectively matching the question intents of the plurality of sub-questions corresponds to the branch in FIG. 4 in which the target function component exists. The user characteristics of the target user that the memory module 503 in FIG. 5 stores on a long-term or short-term basis for the server 500 correspond to the user characteristics of the target user in FIG. 4. The step in which the execution module 504 in FIG. 5 generates the answer result of the question to be answered based on the business logic, the target function component, the target dialog text, and the user characteristics corresponds to the precise recommendation followed by generation of the answer result in FIG. 4.

Illustratively, as shown in FIG. 6, the system 600 includes: an electronic device 601 and the server 500 shown in FIG. 5, wherein the electronic device 601 is configured to obtain and display the answer result generated by the server 500.

It should be appreciated that the electronic device 601 includes a target application.

In this embodiment, the server may be divided into functional modules according to the above method example; for example, each function may correspond to one functional module, or two or more functions may be integrated into one processing module, and the integrated module may be implemented in hardware form. It should be noted that, in this embodiment, the division of the modules is schematic and is merely a division by logical function; other division manners are possible in actual implementation.

In the case where the functional modules are divided according to their respective functions, the server may include a planning module, a tool use module, a memory module, an execution module, and the like. It should be noted that all relevant content of the above method embodiments may be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated herein.

In addition, the embodiment of the application also protects a device, which can comprise a memory and a processor, wherein executable program codes are stored in the memory, and the processor is used for calling and executing the executable program codes to execute the method for answering the questions based on the text.

It should be understood that the apparatus provided in this embodiment is used to perform a method for answering questions based on text, so that the same effects as those of the implementation method can be achieved.

In the case of an integrated unit, the apparatus may comprise a processing module and a storage module. When the apparatus is applied to a server, the processing module may be used to control and manage the actions of the server. The storage module may be used to support the server in storing program code and the like.

The processing module may be a processor or a controller that may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor; the storage module may be a memory.

In addition, the device provided by the embodiment of the application can be a chip, a component or a module, wherein the chip can comprise a processor and a memory which are connected; the memory is used for storing instructions, and when the processor calls and executes the instructions, the chip can be caused to execute the method for answering the questions based on the text provided by the embodiment.

The present embodiment also provides a computer-readable storage medium having stored therein executable program code which, when run on a computer, causes the computer to perform the above-described related method steps to implement a method for answering questions based on text provided in the above-described embodiments.

The present embodiment also provides a computer program product comprising: executable program code which, when run on a computer, causes the computer to perform the above-described related steps to implement a method for text-based answering questions provided by the above-described embodiments.

The server, the apparatus, the computer readable storage medium, the computer program product, or the chip provided in this embodiment are used to execute the corresponding method provided above, and therefore, the advantages achieved by the method can refer to the advantages in the corresponding method provided above, and will not be described herein.

It will be appreciated by those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The foregoing is merely illustrative of the present application, and the protection scope of the present application is not limited thereto; any variation or substitution that a person skilled in the art could readily conceive of shall fall within the protection scope of the present application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.

Claims

1. A method for answering questions based on text, the method comprising:

responding to a target dialogue text, analyzing the target dialogue text, and determining a question intention of a target user, wherein the target dialogue text comprises questions to be answered which are presented by the target user;

determining whether a target function component matching the question intention exists in the plurality of function components based on the function descriptions of the plurality of function components and the question intention;

and generating an answer result of the to-be-answered question based on the target functional component, the target dialogue text and the user characteristics of the target user when the target functional component exists.

2. The method of claim 1, wherein analyzing the target dialog text to determine a question intent of a target user comprises:

performing embedded coding on the target dialogue text to obtain an embedded vector of the target dialogue text; encoding the embedded vector through an attention mechanism to obtain text semantic features of the target dialogue text;

and carrying out intention recognition on the target dialogue text based on the text semantic features to obtain the question intention of the target user.

3. The method of claim 1, wherein the determining whether a target functional component of the plurality of functional components that matches the question intent exists based on the functional description of the plurality of functional components and the question intent comprises:

comparing the questioning intent with the functional descriptions of the plurality of functional components;

determining that a target functional component matched with the questioning intention exists in the functional components when any functional description in the functional descriptions of the functional components is matched with the questioning intention, wherein the target functional component is the functional component corresponding to the functional description;

in the event that none of the functional descriptions of the plurality of functional components match the question intent, it is determined that there is no target functional component of the plurality of functional components that matches the question intent.

4. The method of claim 1, wherein the method of determining the user characteristics of the target user comprises:

determining attribute information of the target user and historical behavior information of the target user before generating the target dialogue text;

the user characteristics are determined based on the user attribute information and the historical behavior information.

5. The method of claim 4, wherein the attribute information includes dynamic attribute information and natural attribute information, the dynamic attribute information describing attribute information corresponding to text input behavior of the target user, the determining the user characteristic based on the user attribute information and the historical behavior information, comprising:

determining a first user characteristic of the target user in a first period of time based on the time and the position of the text input behavior generated by the target user in the dynamic attribute information and the historical behavior information of the first period of time in the historical behavior information;

and determining a second user characteristic of the target user in a second period based on the natural attribute information of the target user and the historical behavior information of the second period in the historical behavior information, wherein the time difference between the earliest time in the second period and the current time is larger than the time difference between the earliest time in the first period and the current time.

6. The method of claim 5, wherein the determining the first user characteristic of the target user during the first period based on the time and the location in the dynamic attribute information at which the text input behavior was generated by the target user and the historical behavior information for the first period in the historical behavior information comprises:

determining whether the time is a working day and/or a time period corresponding to the time and/or a weather condition corresponding to the time, and determining a target city corresponding to the position;

and determining behavior characteristics, time characteristics, environment characteristics and position characteristics of the target user in the first time period based on the historical behavior information of the first time period, whether the time is a working day and/or a time period corresponding to the time and/or weather conditions corresponding to the time and the target city corresponding to the position.

7. The method of claim 1, wherein the generating the answer result to the question to be answered based on the target function component, the target dialog text, and the user characteristics of the target user in the presence of the target function component comprises:

when the target functional component exists, the target dialogue text is used as a parameter in the target functional component, and a reference answer result of the to-be-answered question is determined through the target functional component and the target dialogue text;

and adjusting the reference answer result based on the user characteristics and the reference answer result to obtain the answer result.

8. The method of claim 7, wherein adjusting the reference answer result based on the user characteristic and the reference answer result to obtain the answer result comprises:

encoding the reference answer result and the user characteristics through an attention mechanism to obtain a first word vector and a second word vector;

fusing the first word vector, the second word vector and the embedded vector of the target dialogue text to obtain a target word vector;

mapping the target word vector into a vector with logarithmic probability, and carrying out character prediction;

determining the generation probability of the character through an objective function;

and determining the answer result based on the character under the condition that the generation probability of the character is larger than the preset probability.

9. The method according to any one of claims 1-8, wherein the generating an answer result to the question to be answered based on the target functional component, the target dialog text and the user characteristics of the target user in the presence of the target functional component comprises:

extracting key text matching the question intention from the target dialogue text based on the question intention of the target user in the case that the target function component exists and the target dialogue text comprises multiple rounds of dialogue;

and generating an answer result of the to-be-answered question based on the target functional component, the key text and the user characteristics of the target user.

10. A server for answering questions based on text, the server comprising:

a planning module, used for decomposing a to-be-answered question proposed by a target user included in the target dialogue text into a plurality of sub-questions; analyzing the plurality of sub-questions to determine question intents of the plurality of sub-questions; and making business logic for solving the to-be-answered questions;

a tool use module for respectively calling target function components matched with the question intents of the plurality of sub-questions from the plurality of function components based on the function descriptions of the plurality of function components and the question intents of the plurality of sub-questions;

a memory module for providing long-term or short-term storage of user characteristics of the target user for the server;

and an execution module, used for generating an answer result of the to-be-answered question based on the business logic, the target function component, the target dialogue text and the user characteristics of the target user.

11. A system for answering questions based on text, the system comprising: an electronic device and the server of claim 10, wherein the electronic device is configured to obtain and display an answer result generated by the server.

12. A computer readable storage medium, characterized in that the computer readable storage medium stores executable program code, which when run on a computer causes the computer to perform the method of any one of claims 1 to 9.

13. A computer program product, characterized in that the computer program product, when run on a computer, causes the computer to perform the method of any of claims 1 to 9.